nips nips2013 nips2013-297 nips2013-297-reference knowledge-graph by maker-knowledge-mining

297 nips-2013-Sketching Structured Matrices for Faster Nonlinear Regression

Source: pdf

Author: Haim Avron, Vikas Sindhwani, David Woodruff

Abstract: Motivated by the desire to extend fast randomized techniques to nonlinear lp regression, we consider a class of structured regression problems. These problems involve Vandermonde matrices which arise naturally in various statistical modeling settings, including classical polynomial ﬁtting problems, additive models and approximations to recently developed randomized techniques for scalable kernel methods. We show that this structure can be exploited to further accelerate the solution of the regression problem, achieving running times that are faster than “input sparsity”. We present empirical results conﬁrming both the practical value of our modeling framework, as well as speedup beneﬁts of randomized regression. 1

reference text

[1] C. Boutsidis and A. Gittens. Improved matrix algorithms via the Subsampled Randomized Hadamard Transform. ArXiv e-prints, Mar. 2012. To appear in the SIAM Journal on Matrix Analysis and Applications.

[2] P. Buhlmann and S. v. d. Geer. Statistics for High-dimensional Data. Springer, 2011.

[3] Y. Chang, C. Hsieh, K. Chang, M. Ringgaard, and C. Lin. Low-degree polynomial mapping of data for svm. JMLR, 11, 2010.

[4] M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1):3 – 15, 2004. ¡ce:title¿Automata, Languages and Programming¡/ce:title¿.

[5] K. L. Clarkson, P. Drineas, M. Magdon-Ismail, M. W. Mahoney, X. Meng, and D. P. Woodruff. The Fast Cauchy Transform and faster faster robust regression. CoRR, abs/1207.4684, 2012. Also in SODA 2013.

[6] K. L. Clarkson and D. P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st annual ACM Symposium on Theory of Computing, STOC ’09, pages 205–214, New York, NY, USA, 2009. ACM.

[7] K. L. Clarkson and D. P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th annual ACM Symposium on Theory of Computing, STOC ’13, pages 81–90, New York, NY, USA, 2013. ACM.

[8] A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. Mahoney. Sampling algorithms and coresets for p regression. SIAM Journal on Computing, 38(5):2060–2078, 2009.

[9] A. Gilbert and P. Indyk. Sparse recovery using sparse matrices. Proceedings of the IEEE, 98(6):937–947, 2010.

[10] N. Halko, P. G. Martinsson, and J. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217– 288, 2011.

[11] Q. Le, T. Sarl´ s, and A. Smola. Fastfood computing hilbert space expansions in loglinear o time. In Proceedings of International Conference on Machine Learning, ICML ’13, 2013.

[12] M. W. Mahoney. Randomized algorithms for matrices and data. Foundations and Trends in Machine Learning, 3(2):123–224, 2011.

[13] M. W. Mahoney, P. Drineas, M. Magdon-Ismail, and D. P. Woodruff. Fast approximation of matrix coherence and statistical leverage. In Proceedings of the 29th International Conference on Machine Learning, ICML ’12, 2012.

[14] X. Meng and M. W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th annual ACM Symposium on Theory of Computing, STOC ’13, pages 91–100, New York, NY, USA, 2013. ACM.

[15] J. Nelson and H. L. Nguyen. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. CoRR, abs/1211.1002, 2012.

[16] R. Rahimi and B. Recht. Random features for large-scale kernel machines. In Proceedings of Neural Information Processing Systems, NIPS ’07, 2007.

[17] V. Rokhlin and M. Tygert. A fast randomized algorithm for overdetermined linear least-squares regression. Proceedings of the National Academy of Sciences, 105(36):13212, 2008.

[18] W. Rudin. Fourier Analysis on Groups. Wiley Classics Library. Wiley-Interscience, New York, 1994.

[19] T. Sarl´ s. Improved approximation algorithms for large matrices via random projections. In o Proceeding of IEEE Symposium on Foundations of Computer Science, FOCS ’06, pages 143– 152, 2006.

[20] C. Sohler and D. P. Woodruff. Subspace embeddings for the l1-norm with applications. In Proceedings of the 43rd annual ACM Symposium on Theory of Computing, STOC ’11, pages 755–764, 2011.

[21] D. P. Woodruff and Q. Zhang. Subspace embeddings and lp regression using exponential random variables. In COLT, 2013. 9