89 nips-2010-Factorized Latent Spaces with Structured Sparsity


Source: pdf

Author: Yangqing Jia, Mathieu Salzmann, Trevor Darrell

Abstract: Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multi-view learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow having latent dimensions shared between any subset of the views instead of between all the views only. We show that our approach outperforms state-of-the-art methods on the task of human pose estimation.
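
The idea that structured sparsity decides which latent dimensions are shared and which are private can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a group-lasso style penalty on each view's dictionary column for each latent dimension, and applies block soft-thresholding, the standard proximal operator of that penalty. The names `group_soft_threshold`, `sharing_pattern`, `toy_dictionaries`, and the value of `lam` are hypothetical and chosen only for illustration.

```python
import math


def group_soft_threshold(block, lam):
    """Proximal operator of lam * ||block||_2 (block soft-thresholding).

    The whole block is set to zero when its Euclidean norm is at most lam;
    otherwise every entry is shrunk by the same factor.
    """
    norm = math.sqrt(sum(x * x for x in block))
    if norm <= lam:
        return [0.0] * len(block)
    scale = 1.0 - lam / norm
    return [scale * x for x in block]


def sharing_pattern(dictionaries, lam):
    """Map each latent dimension to the list of views that still use it.

    dictionaries[v][k] is the (hypothetical) column of view v's dictionary
    associated with latent dimension k. A dimension used by one view only is
    private to that view; a dimension used by several views is shared by them.
    """
    num_dims = len(dictionaries[0])
    pattern = {}
    for k in range(num_dims):
        active = []
        for v, columns in enumerate(dictionaries):
            shrunk = group_soft_threshold(columns[k], lam)
            if any(x != 0.0 for x in shrunk):
                active.append(v)
        pattern[k] = active
    return pattern


# Toy data (assumed, not from the paper): two views, three latent dimensions.
toy_dictionaries = [
    [[3.0, 4.0], [0.1, 0.1], [2.0, 0.0]],            # view 0
    [[1.0, 2.0, 2.0], [0.0, 3.0, 4.0], [0.05, 0.0, 0.0]],  # view 1
]

pattern = sharing_pattern(toy_dictionaries, lam=0.5)
print(pattern)
```

In this toy setting, dimension 0 survives in both views and is therefore shared, while dimensions 1 and 2 each survive in a single view and are private. With more than two views, the same test would also reveal dimensions shared by only a subset of the views.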


reference text

[1] C. Archambeau and F. Bach. Sparse probabilistic projections. In Neural Information Processing Systems, 2008.

[2] F. Bach, G. Lanckriet, and M. Jordan. Multiple kernel learning, conic duality, and the SMO algorithm. In International Conference on Machine Learning. ACM, New York, NY, USA, 2004.

[3] F. R. Bach and M. I. Jordan. A probabilistic interpretation of canonical correlation analysis. Technical Report 688, Department of Statistics, University of California, Berkeley, 2005.

[4] S. Bengio, F. Pereira, Y. Singer, and D. Strelow. Group sparse coding. In Neural Information Processing Systems, 2009.

[5] A. Bosch, A. Zisserman, and X. Munoz. Image classification using random forests and ferns. In International Conference on Computer Vision, 2007.

[6] C. H. Ek, P. Torr, and N. Lawrence. Gaussian process latent variable models for human pose estimation. In Joint Workshop on Machine Learning and Multimodal Interaction, 2007.

[7] C. H. Ek, P. Torr, and N. Lawrence. Ambiguity modeling in latent spaces. In Joint Workshop on Machine Learning and Multimodal Interaction, 2008.

[8] A. Geiger, R. Urtasun, and T. Darrell. Rank priors for continuous non-linear dimensionality reduction. In Conference on Computer Vision and Pattern Recognition, 2009.

[9] R. Jenatton, G. Obozinski, and F. Bach. Structured sparse principal component analysis. In International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, May 2010.

[10] A. Kanaujia, C. Sminchisescu, and D. N. Metaxas. Semi-supervised hierarchical models for 3d human pose reconstruction. In Conference on Computer Vision and Pattern Recognition, 2007.

[11] A. Klami and S. Kaski. Probabilistic approach to detecting dependencies between data sets. Neurocomputing, 72:39–46, 2008.

[12] M. Kuss and T. Graepel. The geometry of kernel canonical correlation analysis. Technical Report TR-108, Max Planck Institute for Biological Cybernetics, Tübingen, Germany, 2003.

[13] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In Neural Information Processing Systems, 2006.

[14] G. Leen. Context assisted information extraction. PhD thesis, University of the West of Scotland, High Street, Paisley PA1 2BE, Scotland, 2008.

[15] R. Navaratnam, A. Fitzgibbon, and R. Cipolla. The Joint Manifold Model for Semi-supervised Multivalued Regression. In International Conference on Computer Vision, Rio, Brazil, October 2007.

[16] B. Olshausen and D. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607–609, 1996.

[17] A. Quattoni, X. Carreras, M. Collins, and T. Darrell. An efficient projection for ℓ1,∞ regularization. In International Conference on Machine Learning, 2009.

[18] A. Quattoni, M. Collins, and T. Darrell. Transfer learning for image classification with sparse prototype representations. In Conference on Computer Vision and Pattern Recognition, 2008.

[19] G. Rogez, J. Rihan, S. Ramalingam, C. Orrite, and P. Torr. Randomized Trees for Human Pose Detection. In Conference on Computer Vision and Pattern Recognition, 2008.

[20] M. Salzmann, C.-H. Ek, R. Urtasun, and T. Darrell. Factorized orthogonal latent spaces. In International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, May 2010.

[21] A. P. Shon, K. Grochow, A. Hertzmann, and R. P. N. Rao. Learning shared latent structure for image synthesis and robotic imitation. In Neural Information Processing Systems, pages 1233–1240, 2006.

[22] L. Sigal and M. J. Black. HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion. Technical Report CS-06-08, Brown University, 2006.

[23] L. Sigal, R. Memisevic, and D. J. Fleet. Shared kernel information embedding for discriminative inference. In Conference on Computer Vision and Pattern Recognition, 2009.

[24] S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large scale multiple kernel learning. The Journal of Machine Learning Research, 7:1531–1565, 2006.

[25] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58:267–288, 1996.

[26] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68:49–67, 2006.