
On Model Selection Consistency of Lasso


Source: pdf

Author: Peng Zhao, Bin Yu

Abstract: Sparsity or parsimony of statistical models is crucial for their proper interpretation, as in the sciences and social sciences. Model selection is a commonly used method for finding such models, but it usually involves a computationally heavy combinatorial search. Lasso (Tibshirani, 1996) is now being used as a computationally feasible alternative to model selection, so it is important to study Lasso for model selection purposes. In this paper, we prove that a single condition, which we call the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model, both in the classical fixed-p setting and in the large-p setting, as the sample size n gets large. Based on these results, sufficient conditions that are verifiable in practice are given to relate this work to previous results and to aid applications of Lasso to feature selection and sparse representation. The Irrepresentable Condition, which depends mainly on the covariance of the predictor variables, states that Lasso selects the true model consistently if and (almost) only if the predictors that are not in the true model are "irrepresentable" (in a sense to be clarified) by the predictors that are in the true model. Furthermore, simulations are carried out to provide insight into and understanding of this result.

Keywords: Lasso, regularization, sparsity, model selection, consistency
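
The Irrepresentable Condition is concrete enough to check numerically for a given design. Below is a minimal sketch in Python (not the authors' code): writing C = X'X/n for the empirical covariance, partitioned into C11 (over the true support) and C21 (cross-covariance between the remaining predictors and the support), the Strong Irrepresentable Condition requires |C21 C11^{-1} sign(beta1)| <= 1 - eta elementwise for some eta > 0. The design matrix, support, and sign vector in the toy example are illustrative assumptions, not taken from the paper.

import numpy as np

def irrepresentable_condition(X, support, signs, eta=0.0):
    """Check the elementwise bound |C21 @ inv(C11) @ signs| <= 1 - eta,
    where C = X.T @ X / n is partitioned over the true support and its
    complement. Returns (condition_holds, largest_entry)."""
    n, p = X.shape
    C = X.T @ X / n
    S = np.asarray(support)
    N = np.setdiff1d(np.arange(p), S)
    C11 = C[np.ix_(S, S)]   # covariance among the true predictors
    C21 = C[np.ix_(N, S)]   # cross-covariance with the irrelevant ones
    vals = np.abs(C21 @ np.linalg.solve(C11, np.asarray(signs, dtype=float)))
    return bool(np.all(vals <= 1.0 - eta)), float(vals.max())

# Toy design (an assumption for illustration): predictors 0-2 form the
# true model; predictor 3 is nearly a linear combination of 0 and 1.
rng = np.random.default_rng(0)
n, p = 1000, 5
X = rng.standard_normal((n, p))
X[:, 3] = 0.6 * X[:, 0] + 0.6 * X[:, 1] + 0.1 * rng.standard_normal(n)
holds, worst = irrepresentable_condition(X, support=[0, 1, 2], signs=[1, 1, -1])
print(f"condition holds: {holds}, max |C21 C11^-1 s| entry = {worst:.3f}")

In this toy design the extra correlated predictor yields an entry near 1.2 > 1, so the condition fails: predictor 3 is "representable" by the true predictors, and Lasso cannot be expected to select the true model consistently. Shrinking the 0.6 coefficients to, say, 0.3 makes the condition hold.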


References

Z. D. Bai. Methodologies in spectral analysis of large dimensional random matrices: A review. Statistica Sinica, 9:611–677, 1999.
D. Donoho, M. Elad, and V. Temlyakov. Stable recovery of sparse overcomplete representations in the presence of noise. Preprint, 2004.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. Annals of Statistics, 32:407–499, 2004.
A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12:55–67, 1970.
K. Knight and W. J. Fu. Asymptotics for Lasso-type estimators. Annals of Statistics, 28:1356–1378, 2000.
C. Leng, Y. Lin, and G. Wahba. A note on the lasso and related procedures in model selection. Statistica Sinica, to appear, 2004.
N. Meinshausen. Lasso with relaxation. Technical report, 2005.
N. Meinshausen and P. Bühlmann. Consistent neighbourhood selection for high-dimensional graphs with the Lasso. Annals of Statistics, 34(3), 2006.
M. R. Osborne, B. Presnell, and B. A. Turlach. Knot selection for regression splines via the Lasso. Computing Science and Statistics, 30:44–49, 1998.
M. R. Osborne, B. Presnell, and B. A. Turlach. On the lasso and its dual. Journal of Computational and Graphical Statistics, 9(2):319–337, 2000b.
S. Rosset. Tracking curved regularized optimization solution paths. In Advances in Neural Information Processing Systems, 2004.
R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996.
P. Zhao and B. Yu. Boosted Lasso. Technical report, Statistics Department, UC Berkeley, 2004.
H. Zou, T. Hastie, and R. Tibshirani. On the "degrees of freedom" of the Lasso. Submitted, 2004.