NIPS 2013, paper nips2013-314: reference list (knowledge-graph by maker-knowledge-mining)
Source: pdf
Authors: Raman Arora, Andy Cotter, Nati Srebro
Abstract: We study PCA as a stochastic optimization problem and propose a novel stochastic approximation algorithm, which we refer to as “Matrix Stochastic Gradient” (MSG), as well as a practical variant, Capped MSG. We study the method both theoretically and empirically.
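Since the abstract names MSG without describing its update, a minimal sketch of a matrix stochastic-gradient step for PCA may help orient readers. It assumes the standard formulation of stochastic PCA as maximizing E[x^T M x] over symmetric matrices M with eigenvalues in [0, 1] and trace k; the function name msg_update, the fixed step size eta, and the bisection-based projection below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def msg_update(M, x, eta, k):
    """One simplified matrix stochastic-gradient step for k-dimensional PCA.

    Sketch only: ascends E[x^T M x] with the rank-one stochastic gradient
    x x^T, then projects back onto {M : 0 <= eig(M) <= 1, trace(M) = k}
    by shifting all eigenvalues by a common scalar and clipping to [0, 1]
    (bisecting on the shift until the trace equals k).
    """
    M = M + eta * np.outer(x, x)          # stochastic gradient ascent step
    w, V = np.linalg.eigh(M)              # eigendecompose the iterate
    lo, hi = w.min() - 1.0, w.max()       # bracket the eigenvalue shift
    for _ in range(60):                   # bisection tolerance is arbitrary
        s = 0.5 * (lo + hi)
        if np.clip(w - s, 0.0, 1.0).sum() > k:
            lo = s                        # trace too large: shift further down
        else:
            hi = s                        # trace too small: shift less
    w = np.clip(w - 0.5 * (lo + hi), 0.0, 1.0)
    return (V * w) @ V.T                  # reassemble V diag(w) V^T

# Hypothetical usage on a stream of observations:
rng = np.random.default_rng(0)
d, k, eta = 20, 3, 0.1
M = np.eye(d) * (k / d)                   # feasible starting point
for _ in range(1000):
    x = rng.standard_normal(d)            # one sample per iteration
    M = msg_update(M, x, eta, k)
```

A k-dimensional subspace can then be read off from the leading eigenvectors of the final iterate M; how best to round the fractional iterate to a rank-k projection is exactly the kind of question the paper's Capped MSG variant addresses.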
References:
Arora, Raman, Cotter, Andrew, Livescu, Karen, and Srebro, Nathan. Stochastic optimization for PCA and PLS. In 50th Annual Allerton Conference on Communication, Control, and Computing, 2012.
Balzano, Laura, Nowak, Robert, and Recht, Benjamin. Online identification and tracking of subspaces from highly incomplete information. In 48th Annual Allerton Conference on Communication, Control, and Computing, 2010.
Beck, A. and Teboulle, M. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters, 31(3):167–175, 2003.
Bottou, Léon and Bousquet, Olivier. The tradeoffs of large scale learning. In NIPS'07, pp. 161–168, 2007.
Boyd, Stephen and Vandenberghe, Lieven. Convex Optimization. Cambridge University Press, 2004.
Brand, Matthew. Incremental singular value decomposition of uncertain data with missing values. In ECCV, 2002.
Collins, Michael, Globerson, Amir, Koo, Terry, Carreras, Xavier, and Bartlett, Peter L. Exponentiated gradient algorithms for conditional random fields and max-margin Markov networks. Journal of Machine Learning Research, 9:1775–1822, June 2008.
Duchi, John, Shalev-Shwartz, Shai, Singer, Yoram, and Chandra, Tushar. Efficient projections onto the l1-ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, ICML'08, pp. 272–279, New York, NY, USA, 2008. ACM.
Nemirovski, Arkadi and Yudin, David. Problem Complexity and Method Efficiency in Optimization. John Wiley & Sons Ltd, 1983.
Nemirovski, Arkadi, Juditsky, Anatoli, Lan, Guanghui, and Shapiro, Alexander. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574–1609, January 2009.
Oja, Erkki and Karhunen, Juha. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. Journal of Mathematical Analysis and Applications, 106:69–84, 1985.
Sanger, Terence D. Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks, 2:459–473, 1989.
Shalev-Shwartz, Shai and Srebro, Nathan. SVM optimization: Inverse dependence on training set size. In ICML'08, pp. 928–935, 2008.
Shalev-Shwartz, Shai and Tewari, Ambuj. Stochastic methods for l1-regularized loss minimization. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML'09, pp. 929–936, New York, NY, USA, 2009. ACM.
Shalev-Shwartz, Shai, Singer, Yoram, and Srebro, Nathan. Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. In ICML'07, pp. 807–814, 2007.
Srebro, N., Sridharan, K., and Tewari, A. On the universality of online mirror descent. In Advances in Neural Information Processing Systems 24, 2011.
Warmuth, Manfred K. and Kuzmin, Dima. Randomized online PCA algorithms with regret bounds that are logarithmic in the dimension. Journal of Machine Learning Research, 9:2287–2320, 2008.