
Generalized Nonnegative Matrix Approximations with Bregman Divergences (NIPS 2005, paper 86)



Author: Suvrit Sra, Inderjit S. Dhillon

Abstract: Nonnegative matrix approximation (NNMA) is a recent technique for dimensionality reduction and data analysis that yields a parts-based, sparse nonnegative representation for nonnegative input data. NNMA has found a wide variety of applications, including text analysis, document clustering, face/image recognition, language modeling, and speech processing. Despite these numerous applications, algorithmic development for computing the NNMA factors has been relatively deficient. This paper makes algorithmic progress by modeling and solving (using multiplicative updates) new generalized NNMA problems that minimize Bregman divergences between the input matrix and its low-rank approximation. The multiplicative update formulae in the pioneering work by Lee and Seung [11] arise as a special case of our algorithms. In addition, the paper shows how to use penalty functions to incorporate constraints other than nonnegativity into the problem. Further, some interesting extensions to the use of “link” functions for modeling nonlinear relationships are also discussed.
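As a concrete illustration of the multiplicative-update approach the abstract refers to, below is a minimal NumPy sketch of the classical Lee–Seung updates [11] for the squared Frobenius norm, which is the Bregman divergence generated by φ(x) = x²/2 and thus the simplest special case of the generalized framework. The function name `nnma_frobenius` and the parameters `n_iter`, `eps`, and `seed` are illustrative choices, not from the paper.

```python
import numpy as np

def nnma_frobenius(V, k, n_iter=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates minimizing ||V - W H||_F^2.

    The squared Frobenius norm is the Bregman divergence generated by
    phi(x) = x^2 / 2; the generalized NNMA algorithms reduce to these
    updates for that choice of phi.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))  # nonnegative random initialization
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Multiplicative updates preserve nonnegativity: each factor is
        # rescaled elementwise by a ratio of nonnegative matrices.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Tiny usage example on random nonnegative data.
V = np.random.default_rng(1).random((20, 12))
W, H = nnma_frobenius(V, k=4)
print(np.linalg.norm(V - W @ H))  # approximation error shrinks as n_iter grows
```

The `eps` guard in the denominators is a common numerical safeguard against division by zero; it is not part of the updates as originally stated.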


References

[1] A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman Divergences. In SIAM International Conf. on Data Mining, Lake Buena Vista, Florida, April 2004. SIAM.

[2] Y. Censor and S. A. Zenios. Parallel Optimization: Theory, Algorithms, and Applications. Numerical Mathematics and Scientific Computation. Oxford University Press, 1997.

[3] H. Cho, I. S. Dhillon, Y. Guan, and S. Sra. Minimum Sum Squared Residue based Co-clustering of Gene Expression data. In Proc. 4th SIAM International Conference on Data Mining (SDM), pages 114–125, Florida, 2004. SIAM.

[4] A. Cichocki, R. Zdunek, and S. Amari. Csiszár’s Divergences for Non-Negative Matrix Factorization: Family of New Algorithms. In 6th Int. Conf. ICA & BSS, USA, March 2006.

[5] M. Collins, R. Schapire, and Y. Singer. Logistic regression, AdaBoost and Bregman distances. In Thirteenth Annual Conference on Computational Learning Theory (COLT), 2000.

[6] M. Collins, S. Dasgupta, and R. E. Schapire. A Generalization of Principal Components Analysis to the Exponential Family. In NIPS, 2001.

[7] I. S. Dhillon and S. Sra. Generalized nonnegative matrix approximations. Technical report, Computer Sciences, University of Texas at Austin, 2005.

[8] T. Feng, S. Z. Li, H-Y. Shum, and H. Zhang. Local nonnegative matrix factorization as a visual representation. In Proceedings of the 2nd International Conference on Development and Learning, pages 178–193, Cambridge, MA, June 2002.

[9] D. Guillamet, M. Bressan, and J. Vitrià. A weighted nonnegative matrix factorization for local representations. In CVPR. IEEE, 2001.

[10] P. O. Hoyer. Non-negative sparse coding. In Proc. IEEE Workshop on Neural Networks for Signal Processing, pages 557–565, 2002.

[11] D. D. Lee and H. S. Seung. Algorithms for nonnegative matrix factorization. In NIPS, pages 556–562, 2000.

[12] D. D. Lee and H. S. Seung. Learning the parts of objects by nonnegative matrix factorization. Nature, 401:788–791, October 1999.

[13] P. Paatero and U. Tapper. Positive matrix factorization: A nonnegative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2):111–126, 1994.

[14] R. T. Rockafellar. Convex Analysis. Princeton Univ. Press, 1970.

[15] N. Srebro and T. Jaakkola. Weighted low-rank approximations. In Proc. of the 20th ICML, 2003.

[16] J. A. Tropp. Topics in Sparse Approximation. PhD thesis, The Univ. of Texas at Austin, 2004.