nips nips2012 nips2012-54 nips2012-54-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lei Shi
Abstract: For modeling data matrices, this paper introduces the Probabilistic Co-Subspace Addition (PCSA) model, which simultaneously captures the dependence structures among both rows and columns. Briefly, PCSA assumes that each entry of a matrix is generated by additively combining linear mappings of two low-dimensional features, which lie in the row-wise and column-wise latent subspaces respectively. Consequently, PCSA captures intricate dependencies among entries and is able to handle non-Gaussian and heteroscedastic densities. By formulating posterior updating as the task of solving Sylvester equations, we propose an efficient variational inference algorithm. Furthermore, PCSA is extended to handle and fill in missing values, to adapt model sparseness, and to model tensor data. In comparison with several state-of-the-art methods, experiments demonstrate the effectiveness and efficiency of Bayesian (sparse) PCSA on modeling matrix (tensor) data and filling in missing values.
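The abstract's key computational step is casting the variational posterior update as a Sylvester equation of the form P M + M Q = C. The sketch below illustrates this under stated assumptions: the generative form X = A Y + (B Z)^T + E and every variable name are illustrative guesses inferred from the abstract, not the paper's notation, and SciPy's solve_sylvester stands in for the paper's own solver (e.g., the Krylov-subspace methods of reference [10]).

    # Minimal sketch, assuming a generative form X = A Y + (B Z)^T + E.
    # All names (A, B, Y, Z, P, Qm, C) are hypothetical, not the paper's.
    import numpy as np
    from scipy.linalg import solve_sylvester

    rng = np.random.default_rng(0)
    m, n = 50, 40        # matrix size
    p, q = 5, 3          # row-wise / column-wise latent dimensions

    # Column-wise latents Y (one p-vector per column) mapped by A, and
    # row-wise latents Z (one q-vector per row) mapped by B; each entry
    # of X is the additive combination of the two mappings plus noise.
    A = rng.standard_normal((m, p))
    Y = rng.standard_normal((p, n))
    B = rng.standard_normal((n, q))
    Z = rng.standard_normal((q, m))
    X = A @ Y + (B @ Z).T + 0.1 * rng.standard_normal((m, n))

    # A posterior-mean update of the form P M + M Qm = C is a Sylvester
    # equation; SciPy solves it directly via Bartels-Stewart.
    P = A.T @ A + np.eye(p)        # illustrative row-side coefficient
    Qm = 0.5 * np.eye(n)           # illustrative column-side coefficient
    C = A.T @ X                    # illustrative right-hand side
    M = solve_sylvester(P, Qm, C)  # solves P @ M + M @ Qm = C
    print(M.shape)                 # (p, n)

Here M plays the role of a posterior mean for the column-wise latent features; the paper's actual update couples both latent subspaces, which this toy system omits.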
[1] A. Agovic, A. Banerjee, and S. Chatterjee. Probabilistic matrix addition. In Proc. ICML, pages 1025–1032, 2011.
[2] C. M. Bishop. Training with noise is equivalent to Tikhonov regularization. Neural Computation, 7(1):108–116, 1995.
[3] C. M. Bishop. Variational principal components. In Proc. ICANN’1999, volume 1, pages 509–514, 1999.
[4] M. A. T. Figueiredo. Adaptive sparseness using Jeffreys prior. In Advances in NIPS, volume 14, pages 697–704. MIT Press, Cambridge, MA, 2002.
[5] A. E. Gelfand and S. Banerjee. Multivariate spatial process models. In A. E. Gelfand, P. Diggle, P. Guttorp, and M. Fuentes, editors, Handbook of Spatial Statistics. CRC Press, 2010.
[6] X. Geng, K. Smith-Miles, Z.-H. Zhou, and L. Wang. Face image modeling by multilinear subspace analysis with missing values. IEEE Trans. Syst., Man, Cybern. B, Cybern., 41(3):881–892, 2011.
[7] Z. Ghahramani and G. Hinton. The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, Department of Computer Science, University of Toronto, Toronto, Canada, 1997.
[8] K. Goldberg, T. Roeder, D. Gupta, and C. Perkins. Eigentaste: A constant time collaborative filtering algorithm. Information Retrieval, 4(2):133–151, 2001.
[9] Y. Guan and J. Dy. Sparse probabilistic principal component analysis. In Proc. AISTATS'2009, JMLR W&CP, volume 5, pages 185–192, 2009.
[10] D. Y. Hu and L. Reichel. Krylov-subspace methods for the Sylvester equation. Linear Algebra and Its Applications, 172:283–313, 1992.
[11] M. I. Jordan, editor. Learning in Graphical Models. MIT Press, Cambridge, MA, 1999.
[12] B. Lakshminarayanan, G. Bouchard, and C. Archambeau. Robust Bayesian matrix factorisation. In Proc. AISTATS'2011, JMLR W&CP, volume 15, pages 425–433, 2011.
[13] N. D. Lawrence. Gaussian process latent variable models for visualisation of high dimensional data. In Advances in NIPS, volume 16, pages 329–336. MIT Press, Cambridge, MA, 2003.
[14] R. M. Neal. Bayesian Learning for Neural Networks. Springer-Verlag, New York, 1996.
[15] S. Roweis, L. K. Saul, and G. Hinton. Global coordination of local linear models. In Advances in NIPS, volume 14, pages 889–896. MIT Press, Cambridge, MA, 2002.
[16] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In Advances in NIPS, volume 20, pages 1257–1264. MIT Press, Cambridge, MA, 2008.
[17] F. Samaria and A. Harter. Parameterisation of a stochastic model for human face identification. In Proc. 2nd IEEE Workshop on Applications of Computer Vision, pages 138–142, 1994.
[18] T. Sim, S. Baker, and M. Bsat. The CMU pose, illumination, and expression database. IEEE Trans. Pattern Anal. Mach. Intell., 25(12):1615–1618, 2003.
[19] R. Tibshirani. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B, 58(1):267–288, 1996.
[20] M. E. Tipping and C. M. Bishop. Mixtures of probabilistic principal component analyzers. Neural Computation, 11(2):443–482, 1999.
[21] M. Titsias and N. Lawrence. Bayesian Gaussian process latent variable model. In Proc. AISTATS'2010, JMLR W&CP, volume 9, pages 844–851, 2010.
[22] K. Trohidis, G. Tsoumakas, G. Kalliris, and I. Vlahavas. Multilabel classification of music into emotions. In Proc. Intl. Conf. on Music Information Retrieval (ISMIR), pages 325–330, 2008.
[23] D. Turnbull, L. Barrington, D. Torres, and G. Lanckriet. Semantic annotation and retrieval of music and sound effects. IEEE Trans. Audio, Speech and Lang. Process., 16(2):467–476, 2008.
[24] S. Virtanen, A. Klami, S. A. Khan, and S. Kaski. Bayesian group factor analysis. In Proc. AISTATS'2012, JMLR W&CP, volume 22, pages 1269–1277, 2012.
[25] Z. Xu, K. Kersting, and V. Tresp. Multi-relational learning with Gaussian processes. In Proc. IJCAI'2009, pages 1309–1314, 2009.