Author: Martin Slawski, Matthias Hein
Abstract: Non-negative data are commonly encountered in numerous fields, making non-negative least squares regression (NNLS) a frequently used tool. At least relative to its simplicity, it often performs rather well in practice. Serious doubts about its usefulness arise for modern high-dimensional linear models. Even in this setting, contrary to what first intuition may suggest, we show that for a broad class of designs NNLS is resistant to overfitting and works excellently for sparse recovery when combined with thresholding, experimentally even outperforming ℓ1 regularization. Since NNLS also circumvents the delicate choice of a regularization parameter, our findings suggest that NNLS may be the method of choice.
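For concreteness, the recipe described in the abstract (plain NNLS followed by hard thresholding of the coefficient vector) can be sketched as follows. This is a minimal illustration, assuming a non-negative random design and using scipy.optimize.nnls; the design, noise level, and threshold tau are illustrative assumptions, not the paper's experimental setup or its data-driven threshold selection.

# Minimal sketch: sparse recovery via non-negative least squares (NNLS)
# followed by hard thresholding. Design, noise level, and threshold are
# illustrative assumptions only.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n, p, s = 100, 300, 10                       # samples, variables, sparsity
X = np.abs(rng.standard_normal((n, p)))      # a non-negative random design
beta = np.zeros(p)
beta[rng.choice(p, size=s, replace=False)] = rng.uniform(1.0, 2.0, size=s)
y = X @ beta + 0.1 * rng.standard_normal(n)  # noisy observations

beta_hat, _ = nnls(X, y)                     # solve min_{b >= 0} ||y - X b||_2

tau = 0.5                                    # hard-threshold level (hypothetical choice)
support_hat = np.flatnonzero(beta_hat > tau)
print("true support:     ", np.sort(np.flatnonzero(beta)))
print("estimated support:", np.sort(support_hat))

In this p > n regime an unconstrained least-squares fit would interpolate the noise, whereas the abstract argues that for a broad class of designs the non-negativity constraint alone already supplies the needed regularization, so that a simple threshold suffices to recover the support.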
[1] Y. Lin, D. Lee, and L. Saul. Nonnegative deconvolution for time of arrival estimation. In ICASSP, 2004.
[2] J. Bardsley and J. Nagy. Covariance-preconditioned iterative methods for nonnegatively constrained astronomical imaging. SIAM Journal on Matrix Analysis and Applications, 27:1184–1198, 2006.
[3] A. Szlam, Z. Guo, and S. Osher. A split Bregman method for non-negative sparsity penalized least squares with applications to hyperspectral demixing. In IEEE International Conference on Image Processing, 2010.
[4] L. Li and T. Speed. Parametric deconvolution of positive spike trains. The Annals of Statistics, 28:1279–1301, 2000.
[5] M. Slawski and M. Hein. Sparse recovery for Protein Mass Spectrometry data. In NIPS workshop on practical applications of sparse modelling, 2010.
[6] D. Donoho, I. Johnstone, J. Hoch, and A. Stern. Maximum entropy and the nearly black object. Journal of the Royal Statistical Society Series B, 54:41–81, 1992.
[7] D. Chen and R. Plemmons. Nonnegativity constraints in numerical analysis. In Symposium on the Birth of Numerical Analysis, 2007.
[8] A. Bruckstein, M. Elad, and M. Zibulevsky. On the uniqueness of nonnegative sparse solutions to underdetermined systems of equations. IEEE Transactions on Information Theory, 54:4813–4820, 2008.
[9] D. Donoho and J. Tanner. Counting the faces of randomly-projected hypercubes and orthants, with applications. Discrete and Computational Geometry, 43:522–541, 2010.
[10] M. Wang and A. Tang. Conditions for a Unique Non-negative Solution to an Underdetermined System. In Proceedings of Allerton Conference on Communication, Control, and Computing, 2009.
[11] M. Wang, W. Xu, and A. Tang. A unique nonnegative solution to an underdetermined system: from vectors to matrices. IEEE Transactions on Signal Processing, 59:1007–1016, 2011.
[12] C. Liew. Inequality Constrained Least-Squares Estimation. Journal of the American Statistical Association, 71:746–751, 1976.
[13] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B, 58:267–288, 1996.
[14] S. van de Geer and P. Bühlmann. On the conditions used to prove oracle results for the Lasso. Electronic Journal of Statistics, 3:1360–1392, 2009.
[15] P. Zhao and B. Yu. On model selection consistency of the lasso. Journal of Machine Learning Research, 7:2541–2563, 2006.
[16] M. Wainwright. Sharp thresholds for noisy and high-dimensional recovery of sparsity using ℓ1-constrained quadratic programming (Lasso). IEEE Transactions on Information Theory, 55:2183–2202, 2009.
[17] N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data. The Annals of Statistics, 37:246–270, 2009.
[18] T. Zhang. Some Sharp Performance Bounds for Least Squares Regression with L1 Regularization. The Annals of Statistics, 37:2109–2144, 2009.
[19] S. Zhou. Thresholding procedures for high dimensional variable selection and statistical estimation. In NIPS, 2009.
[20] E. Greenshtein and Y. Ritov. Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli, 6:971–988, 2004.
[21] D. Donoho and J. Tanner. Sparse nonnegative solution of underdetermined linear equations by linear programming. Proceedings of the National Academy of Sciences, 102:9446–9451, 2005.
[22] J. Tropp. Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50:2231–2242, 2004.
[23] T. Zhang. On the Consistency of Feature Selection using Greedy Least Squares Regression. Journal of Machine Learning Research, 10:555–568, 2009.
[24] H. Rue and L. Held. Gaussian Markov Random Fields. Chapman and Hall/CRC, Boca Raton, 2001.
[25] C. Genovese, J. Jin, and L. Wasserman. Revisiting Marginal Regression. Technical report, Carnegie Mellon University, 2009. http://arxiv.org/abs/0911.4080.
[26] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least Angle Regression. The Annals of Statistics, 32:407–499, 2004.