nips nips2008 nips2008-57 nips2008-57-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lester W. Mackey
Abstract: In analogy to the PCA setting, the sparse PCA problem is often solved by iteratively alternating between two subtasks: cardinality-constrained rank-one variance maximization and matrix deflation. While the former has received a great deal of attention in the literature, the latter is seldom analyzed and is typically borrowed without justification from the PCA context. In this work, we demonstrate that the standard PCA deflation procedure is seldom appropriate for the sparse PCA setting. To rectify the situation, we first develop several deflation alternatives better suited to the cardinality-constrained context. We then reformulate the sparse PCA optimization problem to explicitly reflect the maximum additional variance objective on each round. The result is a generalized deflation procedure that typically outperforms more standard techniques on real-world datasets.
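To make the deflation subtask concrete, here is a minimal NumPy sketch (not taken from the paper) contrasting Hotelling's deflation, the standard PCA update the abstract refers to, with projection deflation, one positive-semidefiniteness-preserving alternative of the kind the paper develops. The sparse_pc placeholder and all names are illustrative, not the paper's method.

    import numpy as np

    def hotelling_deflation(A, x):
        # Standard PCA deflation: A' = A - (x^T A x) x x^T.
        # For a sparse pseudo-eigenvector x (not a true eigenvector of A),
        # the deflated matrix need not remain positive semidefinite.
        x = x / np.linalg.norm(x)
        return A - (x @ A @ x) * np.outer(x, x)

    def projection_deflation(A, x):
        # Projection deflation: A' = (I - x x^T) A (I - x x^T).
        # Remains positive semidefinite for any unit vector x.
        x = x / np.linalg.norm(x)
        P = np.eye(A.shape[0]) - np.outer(x, x)
        return P @ A @ P

    def sparse_pc(A, k):
        # Crude stand-in for a cardinality-constrained rank-one variance
        # maximizer: truncate the leading eigenvector of A to its k
        # largest-magnitude entries and renormalize.
        _, V = np.linalg.eigh(A)
        v = V[:, -1].copy()
        v[np.argsort(np.abs(v))[:-k]] = 0.0
        return v / np.linalg.norm(v)

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 10))
    A = X.T @ X / 50  # sample covariance matrix
    for t in range(3):
        x = sparse_pc(A, k=4)
        A = projection_deflation(A, x)  # swap in hotelling_deflation to compare
        print(f"round {t}: min eigenvalue = {np.linalg.eigvalsh(A).min():.2e}")

With a sparse x, Hotelling's deflation can drive the minimum eigenvalue of the deflated matrix negative, so it no longer behaves like a covariance matrix on later rounds; projection deflation avoids this, which is one instance of the mismatch the abstract describes.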
[1] A. d’Aspremont, L. El Ghaoui, M. I. Jordan, and G. R. G. Lanckriet. A Direct Formulation for Sparse PCA using Semidefinite Programming. In Advances in Neural Information Processing Systems (NIPS), Vancouver, BC, December 2004.
[2] A. d’Aspremont, F. R. Bach, and L. El Ghaoui. Full regularization path for sparse principal component analysis. In Proceedings of the 24th International Conference on Machine Learning (ICML ’07), pp. 177-184. ACM, New York, NY, 2007.
[3] J. Cadima and I. Jolliffe. Loadings and correlations in the interpretation of principal components. Applied Statistics, 22:203-214, 1995.
[4] C.C. Fowlkes, C.L. Luengo Hendriks, S.V.E. Keränen, G.H. Weber, O. Rübel, M.-Y. Huang, S. Chatoor, A.H. DePace, L. Simirenko, C. Henriquez, et al. A quantitative spatiotemporal atlas of gene expression in the Drosophila blastoderm. Cell, 133:364-374, 2008.
[5] J. Jeffers. Two case studies in the application of principal components. Applied Statistics, 16:225-236, 1967.
[6] I.T. Jolliffe and M. Uddin. A Modified Principal Component Technique based on the Lasso. Journal of Computational and Graphical Statistics, 12:531-547, 2003.
[7] I.T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.
[8] I.T. Jolliffe. Rotation of principal components: choice of normalization constraints. Journal of Applied Statistics, 22:29-35, 1995.
[9] B. Moghaddam, Y. Weiss, and S. Avidan. Spectral bounds for sparse PCA: Exact and greedy algorithms. Advances in Neural Information Processing Systems, 18, 2006.
[10] B. Moghaddam, Y. Weiss, and S. Avidan. Generalized spectral bounds for sparse LDA. In Proceedings of the 23rd International Conference on Machine Learning, 2006.
[11] Y. Saad. Projection and deflation methods for partial pole assignment in linear state feedback. IEEE Trans. Automat. Contr., vol. 33, pp. 290-297, Mar. 1988.
[12] B.K. Sriperumbudur, D.A. Torres, and G.R.G. Lanckriet. Sparse eigen methods by DC programming. In Proceedings of the 24th International Conference on Machine Learning, pp. 831-838, 2007.
[13] D. Torres, B.K. Sriperumbudur, and G. Lanckriet. Finding Musically Meaningful Words by Sparse CCA. In Neural Information Processing Systems (NIPS) Workshop on Music, the Brain and Cognition, 2007.
[14] P. White. The Computation of Eigenvalues and Eigenvectors of a Matrix. Journal of the Society for Industrial and Applied Mathematics, Vol. 6, No. 4, pp. 393-437, Dec., 1958.
[15] F. Zhang (Ed.). The Schur Complement and Its Applications. Springer, New York, 2005.
[16] Z. Zhang, H. Zha, and H. Simon, Low-rank approximations with sparse factors I: Basic algorithms and error analysis. SIAM J. Matrix Anal. Appl., 23 (2002), pp. 706-727.
[17] Z. Zhang, H. Zha, and H. Simon, Low-rank approximations with sparse factors II: Penalized methods with discrete Newton-like iterations. SIAM J. Matrix Anal. Appl., 25 (2004), pp. 901-920.
[18] H. Zou, T. Hastie, and R. Tibshirani. Sparse Principal Component Analysis. Technical Report, Statistics Department, Stanford University, 2004.