nips nips2013 nips2013-180 nips2013-180-reference knowledge-graph by maker-knowledge-mining

180 nips-2013-Low-rank matrix reconstruction and clustering via approximate message passing

Source: pdf

Author: Ryosuke Matsushita, Toshiyuki Tanaka

Abstract: We study the problem of reconstructing low-rank matrices from their noisy observations. We formulate the problem in the Bayesian framework, which allows us to exploit structural properties of matrices in addition to low-rankedness, such as sparsity. We propose an efﬁcient approximate message passing algorithm, derived from the belief propagation algorithm, to perform the Bayesian inference for matrix reconstruction. We have also successfully applied the proposed algorithm to a clustering problem, by reformulating it as a low-rank matrix reconstruction problem with an additional structural property. Numerical experiments show that the proposed algorithm outperforms Lloyd’s K-means algorithm. 1

reference text

[1] P. Paatero, “Least squares formulation of robust non-negative factor analysis,” Chemometrics and Intelligent Laboratory Systems, vol. 37, no. 1, pp. 23–35, May 1997.

[2] P. O. Hoyer, “Non-negative matrix factorization with sparseness constraints,” The Journal of Machine Learning Research, vol. 5, pp. 1457–1469, Dec. 2004.

[3] R. Salakhutdinov and A. Mnih, “Bayesian probabilistic matrix factorization using Markov chain Monte Carlo,” in Proceedings of the 25th International Conference on Machine Learning, New York, NY, Jul. 5– Aug. 9, 2008, pp. 880–887.

[4] Y. J. Lim and Y. W. Teh, “Variational Bayesian approach to movie rating prediction,” in Proceedings of KDD Cup and Workshop, San Jose, CA, Aug. 12, 2007.

[5] T. Raiko, A. Ilin, and J. Karhunen, “Principal component analysis for large scale problems with lots of missing values,” in Machine Learning: ECML 2007, ser. Lecture Notes in Computer Science, J. N. Kok, J. Koronacki, R. L. de Mantaras, S. Matwin, D. Mladeniˇ , and A. Skowron, Eds. Springer Berlin c Heidelberg, 2007, vol. 4701, pp. 691–698.

[6] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proceedings of the National Academy of Sciences USA, vol. 106, no. 45, pp. 18 914–18 919, Nov. 2009.

[7] S. Rangan, “Generalized approximate message passing for estimation with random linear mixing,” in Proceedings of 2011 IEEE International Symposium on Information Theory, St. Petersburg, Russia, Jul. 31– Aug. 5, 2011, pp. 2168–2172.

[8] S. Rangan and A. K. Fletcher, “Iterative estimation of constrained rank-one matrices in noise,” in Proceedings of 2012 IEEE International Symposium on Information Theory, Cambridge, MA, Jul. 1–6, 2012, pp. 1246–1250.

[9] R. Matsushita and T. Tanaka, “Approximate message passing algorithm for low-rank matrix reconstruction,” in Proceedings of the 35th Symposium on Information Theory and its Applications, Oita, Japan, Dec. 11–14, 2012, pp. 314–319.

[10] W. Xu, X. Liu, and Y. Gong, “Document clustering based on non-negative matrix factorization,” in Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval, Toronto, Canada, Jul. 28–Aug. 1, 2003, pp. 267–273.

[11] C. Ding, T. Li, and M. Jordan, “Convex and semi-nonnegative matrix factorizations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 45–55, Jan. 2010.

[12] S. P. Lloyd, “Least squares quantization in PCM,” IEEE Transactions on Information Theory, vol. IT-28, no. 2, pp. 129–137, Mar. 1982.

[13] F. Krzakala, M. M´ zard, and L. Zdeborov´ , “Phase diagram and approximate message passing for blind e a calibration and dictionary learning,” preprint, Jan. 2013, arXiv:1301.5898v1 [cs.IT].

[14] J. T. Parker, P. Schniter, and V. Cevher, “Bilinear generalized approximate message passing,” preprint, Oct. 2013, arXiv:1310.2632v1 [cs.IT].

[15] S. Nakajima and M. Sugiyama, “Theoretical analysis of Bayesian matrix factorization,” Journal of Machine Learning Research, vol. 12, pp. 2583–2648, Sep. 2011.

[16] D. Arthur and S. Vassilvitskii, “k-means++: the advantages of careful seeding,” in SODA ’07 Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, Louisiana, Jan. 7–9, 2007, pp. 1027–1035.

[17] J. S. Yedidia, W. T. Freeman, and Y. Weiss, “Constructing free-energy approximations and generalized belief propagation algorithms,” IEEE Transactions on Information Theory, vol. 51, no. 7, pp. 2282–2312, Jul. 2005.

[18] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Transactions on Information Theory, vol. 57, no. 2, pp. 764–785, Feb. 2011.

[19] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly, vol. 2, no. 1–2, pp. 83–97, Mar. 1955.

[20] F. S. Samaria and A. C. Harter, “Parameterisation of a stochastic model for human face identiﬁcation,” in Proceedings of 2nd IEEE Workshop on Applications of Computer Vision, Sarasota FL, Dec. 1994, pp. 138–142. [Online]. Available: http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html 9