jmlr jmlr2013 jmlr2013-116 jmlr2013-116-reference knowledge-graph by maker-knowledge-mining

116 jmlr-2013-Truncated Power Method for Sparse Eigenvalue Problems

Source: pdf

Author: Xiao-Tong Yuan, Tong Zhang

Abstract: This paper considers the sparse eigenvalue problem, which is to extract dominant (largest) sparse eigenvectors with at most k non-zero components. We propose a simple yet effective solution called truncated power method that can approximately solve the underlying nonconvex optimization problem. A strong sparse recovery result is proved for the truncated power method, and this theory is our key motivation for developing the new algorithm. The proposed method is tested on applications such as sparse principal component analysis and the densest k-subgraph problem. Extensive experiments on several synthetic and real-world data sets demonstrate the competitive empirical performance of our method. Keywords: sparse eigenvalue, power method, sparse principal component analysis, densest k-subgraph

reference text

A. Alizadeh, M. Eisen, R. Davis, C. Ma, I. Lossos, and A. Rosenwald. Distinct types of diffuse large b-cell lymphoma identiﬁed by gene expression proﬁling. Nature, 403:503–511, 2000. 922 T RUNCATED P OWER M ETHOD FOR S PARSE E IGENVALUE P ROBLEMS A. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Cell Biology, 96:6745–6750, 1999. A. A. Amini and M. J. Wainwright. High-dimensional analysis of semideﬁnite relaxiation for sparse principal components. Annals of Statistics, 37:2877–2921, 2009. Y. Asahiro, R. Hassin, and K. Iwama. Complexity of ﬁnidng dense subgraphs. Discrete Applied Mathematics, 211(1-3):15–26, 2002. A. Billionnet and F. Roupin. A deterministic algorithm for the densest k-subgraph problem using linear programming. Technical report, Technical Report, No. 486, CEDRIC, CNAM-IIE, Paris, 2004. T. Cai, Z. Ma, and Y. Wu. Sparse pca: Optimal rates and adaptive estimation. 2012. URL arxiv. org/pdf/1211.1309v1.pdf. E. J. Candes and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51:4203–4215, 2005. D. G. Corneil and Y. Perl. Clustering and domination in perfect graphs. Discrete Applied Mathematics, 9:27–39, 1984. A. d’Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. G. Lanckriet. A direct formulation for sparse pca using semideﬁnite programming. SIAM Review, 49:434–448, 2007. A. d’Aspremont, F. Bach, and L. El Ghaoui. Optimal solutions for sparse principal component analysis. Journal of Machine Learning Research, 9:1269–1294, 2008. U. Feige, G. Kortsarz, and D. Peleg. The dense k-subgraph problem. Algorithmica, 29(3):410–421, 2001. X. Geng, T. Liu, T. Qin, and H. Li. Feature selection for ranking. In Proceedings of the 30th Annual International ACM SIGIR Conference (SIGIR’07), 2007. D. Gibson, R. Kumar, and A. Tomkins. Discovering large dense subgraphs in massive graphs. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB’05), pages 721–732, 2005. G. H. Golub and C.F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, MD, third edition, 1996. J. Jeffers. Two case studies in the application of principal components. Applied Statistics, 16(3): 225–236, 1967. P. Jiang, J. Peng, M. Heath, and R. Yang. Finding densest k-subgraph via 1-mean clustering and low-dimension approximation. Technical report, 2010. I. M. Johnstone. On the distribution of the largest eigenvalue in principal components analysis. Annals of Statistics, 29:295–327, 2001. 923 Y UAN AND Z HANG I. T. Jolliffe, N. T. Trendaﬁlov, and M. Uddin. A modiﬁed principal component technique based on the lasso. Journal of Computational and Graphical Statistics, 12(3):531–547, 2003. M. Journ´ e, Y. Nesterov, P. Richt´ rik, and Rodolphe Sepulchre. Generalized power method for e a sparse principal component analysis. Journal of Machine Learning Research, 11:517–553, 2010. S. Khot. Ruling out ptas for graph min-bisection, dense k-subgraph, and bipartite clique. SIAM Journal on Computing, 36(4):1025–1071, 2006. S. Khuller and B. Saha. On ﬁnding dense subgraphs. In Proceedings of the 36th International Colloquium on Automata, Languages and Programming (ICALP’09), pages 597–608, 2009. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Trawling the web for emerging cybercommunities. In Proceedings of the 8th World Wide Web Conference (WWW’99), pages 403–410, 1999. M. Liazi, I. Milis, and V. Zissimopoulos. A constant approximation algorithm for the densest ksubgraph problem on chordal graphs. Information Processing Letters, 108(1):29–32, 2008. Z. Ma. Sparse principal component analysis and iterative thresholding. Annals of Statistics, to appear, 2013. L. Mackey. Deﬂation methods for sparse pca. In Proceedings of the 22nd Annual Conference on Neural Information Processing Systems (NIPS’08), 2008. J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11:10–60, 2010. B. Moghaddam, Y. Weiss, and S. Avidan. Generalized spectral bounds for sparse lda. In Proceedings of the 23rd International Conference on Machine Learning (ICML’06), pages 641–648, 2006. D. Paul and I.M. Johnstone. Augmented sparse principal component analysis for high dimensional data. 2012. URL arxiv.org/pdf/1202.1242v1.pdf. S. S. Ravi, D. J. Rosenkrantz, and G. K. Tayi. Heuristic and special case algorithms for dispersion problems. Operations Research, 42:299–310, 1994. D. Shen, H. Shen, and J.S. Marron. Consistency of sparse pca in high dimension, low sample size contexts. Journal of Multivariate Analysis, 115:317–333, 2013. H. Shen and J. Z. Huang. Sparse principal component analysis via regularized low rank matrix approximation. Journal of Multivariate Analysis, 99(6):1015–1034, 2008. A. Srivastav and K. Wolf. Finding dense subgraphs with semideﬁnite programming. In Proceedings of International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX’98), pages 181–191, 1998. R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B, 58(1):267–288, 1996. 924 T RUNCATED P OWER M ETHOD FOR S PARSE E IGENVALUE P ROBLEMS D. M. Witten, R. Tibshirani, and T. Hastie. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3):515–534, 2009. R. Yang. New approximation methods for solving binary quadratic programming problem. Technical report, Master Thesis, Department of Industrial and Enterprise Systems Engineering, University of Illnois at Urbana-Champaign, 2010. Y. Y. Ye and J. W. Zhang. Approximation of dense-n/2-subgraph and the complement of minbisection. Journal of Global Optimization, 25:55–73, 2003. H. Zou, T. Hastie, and R. Tibshirani. Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2):265–286, 2006. 925