nips nips2013 nips2013-186 nips2013-186-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Martin Slawski, Matthias Hein, Pavlo Lutsik
Abstract: Motivated by an application in computational biology, we consider low-rank matrix factorization with {0, 1}-constraints on one of the factors and optionally convex constraints on the second one. In addition to the non-convexity shared with other matrix factorization schemes, our problem is further complicated by a combinatorial constraint set of size 2m·r , where m is the dimension of the data points and r the rank of the factorization. Despite apparent intractability, we provide − in the line of recent work on non-negative matrix factorization by Arora et al. (2012)− an algorithm that provably recovers the underlying factorization in the exact case with O(mr2r + mnr + r2 n) operations for n datapoints. To obtain this result, we use theory around the Littlewood-Offord lemma from combinatorics.
[1] P. Paatero and U. Tapper. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5:111–126, 1994.
[2] D. Lee and H. Seung. Learning the parts of objects by nonnegative matrix factorization. Nature, 401:788– 791, 1999.
[3] J. Ramsay and B. Silverman. Functional Data Analysis. Springer, New York, 2006.
[4] F. Bach, J. Mairal, and J. Ponce. Convex Sparse Matrix Factorization. Technical report, ENS, Paris, 2008.
[5] D. Witten, R. Tibshirani, and T. Hastie. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10:515–534, 2009.
[6] A-J. van der Veen. Analytical Method for Blind Binary Signal Separation. IEEE Signal Processing, 45:1078–1082, 1997.
[7] J. Liao, R. Boscolo, Y. Yang, L. Tran, C. Sabatti, and V. Roychowdhury. Network component analysis: reconstruction of regulatory signals in biological systems. PNAS, 100(26):15522–15527, 2003.
[8] S. Tu, R. Chen, and L. Xu. Transcription Network Analysis by a Sparse Binary Factor Analysis Algorithm. Journal of Integrative Bioinformatics, 9:198, 2012.
[9] E. Houseman et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics, 13:86, 2012.
[10] A. Banerjee, C. Krumpelman, J. Ghosh, S. Basu, and R. Mooney. Model-based overlapping clustering. In KDD, 2005.
[11] E. Segal, A. Battle, and D. Koller. Decomposing gene expression into cellular processes. In Proceedings of the 8th Pacific Symposium on Biocomputing, 2003.
[12] A. Schein, L. Saul, and L. Ungar. A generalized linear model for principal component analysis of binary data. In AISTATS, 2003.
[13] A. Kaban and E. Bingham. Factorisation and denoising of 0-1 data: a variational approach. Neurocomputing, 71:2291–2308, 2008.
[14] E. Meeds, Z. Gharamani, R. Neal, and S. Roweis. Modeling dyadic data with binary latent factors. In NIPS, 2007.
[15] Z. Zhang, C. Ding, T. Li, and X. Zhang. Binary matrix factorization with applications. In IEEE ICDM, 2007.
[16] P. Miettinen and T. Mielik¨ inen and A. Gionis and G. Das and H. Mannila. The discrete basis problem. a In PKDD, 2006.
[17] S. Arora, R. Ge, R. Kannan, and A. Moitra. Computing a nonnegative matrix factorization – provably. STOC, 2012.
[18] V. Bittdorf, B. Recht, C. Re, and J. Tropp. Factoring nonnegative matrices with linear programs. In NIPS, 2012.
[19] D. Donoho and V. Stodden. When does non-negative matrix factorization give a correct decomposition into parts? In NIPS, 2003.
[20] P. Erd¨ s. On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc, 51:898–902, 1951. o
[21] M. Gu and S. Eisenstat. Efficient algorithms for computing a strong rank-revealing QR factorization. SIAM Journal on Scientific Computing, 17:848–869, 1996.
[22] G. Golub and C. Van Loan. Matrix Computations. Johns Hopkins University Press, 1996.
[23] A. Odlyzko. On Subspaces Spanned by Random Selections of ±1 vectors. Journal of Combinatorial Theory A, 47:124–133, 1988.
[24] J. Kahn, J. Komlos, and E. Szemeredi. On the Probability that a ±1 matrix is singular. Journal of the American Mathematical Society, 8:223–240, 1995.
[25] H. Nguyen and V. Vu. Small ball probability, Inverse theorems, and applications. arXiv:1301.0019.
[26] T. Tao and V. Vu. The Littlewoord-Offord problem in high-dimensions and a conjecture of Frankl and F¨ redi. Combinatorica, 32:363–372, 2012. u
[27] C.-J. Lin. Projected gradient methods for non-negative matrix factorization. Neural Computation, 19:2756–2779, 2007.
[28] P. Tao and L. An. Convex analysis approach to D.C. programming: theory, algorithms and applications. Acta Mathematica Vietnamica, pages 289–355, 1997.
[29] https://sites.google.com/site/nicolasgillis/publications. 9