
29 jmlr-2012-Consistent Model Selection Criteria on High Dimensions


Source: pdf

Author: Yongdai Kim, Sunghoon Kwon, Hosik Choi

Abstract: Asymptotic properties of model selection criteria for high-dimensional regression models are studied where the number of covariates is much larger than the sample size. Several sufficient conditions for model selection consistency are provided. Non-Gaussian error distributions are considered, and it is shown that the maximal number of covariates for model selection consistency depends on the tail behavior of the error distribution. Sufficient conditions for model selection consistency are also given when the variance of the noise is neither known nor estimated consistently. Results of simulation studies as well as a real data analysis illustrate that the finite-sample performances of consistent model selection criteria can be quite different.

Keywords: model selection consistency, general information criteria, high dimension, regression
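
To make the setting concrete: BIC-type "general information criteria" score a candidate model by a lack-of-fit term plus a penalty that grows with model size, and the question the abstract raises is which penalty rates keep the selected model consistent when the number of covariates p far exceeds the sample size n. The sketch below is a minimal illustration of that recipe, not the authors' implementation: the criterion form GIC(M) = n log(RSS_M / n) + lambda_n |M|, the function names gic_score and select_model, the exhaustive search over small supports, and the penalty choice lambda_n = log(n) log(p) are all assumptions made for the example. BIC (Schwarz, 1978) corresponds to lambda_n = log(n), and in practice candidate supports would come from a penalized-regression solution path (lasso, SCAD) rather than enumeration.

    # Toy sketch of model selection by a general information criterion (GIC).
    # Hypothetical names and penalty choice; not the paper's code.
    import itertools
    import numpy as np

    def gic_score(X, y, support, lam):
        """n * log(RSS/n) + lam * |support| for the OLS fit on `support`."""
        n = len(y)
        if support:
            Xs = X[:, list(support)]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = float(np.sum((y - Xs @ beta) ** 2))
        else:
            rss = float(np.sum(y ** 2))  # null model: no covariates
        return n * np.log(rss / n) + lam * len(support)

    def select_model(X, y, max_size=3, lam=None):
        """Exhaustive search over supports of size <= max_size (toy scale);
        real applications would score candidates from a lasso/SCAD path."""
        n, p = X.shape
        if lam is None:
            lam = np.log(n) * np.log(p)  # diverges with p, unlike BIC's log(n)
        best, best_score = (), np.inf
        for k in range(max_size + 1):
            for support in itertools.combinations(range(p), k):
                s = gic_score(X, y, support, lam)
                if s < best_score:
                    best, best_score = support, s
        return best, best_score

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, p = 100, 20
        X = rng.standard_normal((n, p))
        beta = np.zeros(p)
        beta[[0, 3]] = [2.0, -1.5]  # true model: covariates 0 and 3
        y = X @ beta + rng.standard_normal(n)
        print(select_model(X, y))  # expected to recover support (0, 3)

The point the abstract emphasizes lives in one line of this sketch: consistency hinges on how fast the penalty lambda_n grows relative to p (and, per the paper, on the error distribution's tails), so swapping the penalty line changes which criterion the sketch implements.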


reference text

H. Akaike. Information theory and an extension of the maximum likelihood principle. In B. N. Petrov and F. Csáki, editors, Second International Symposium on Information Theory, volume 1, pages 267–281. Budapest: Akadémiai Kiadó, 1973.
K. W. Broman and T. P. Speed. A model selection approach for the identification of quantitative trait loci in experimental crosses. Journal of the Royal Statistical Society, Ser. B, 64:641–656, 2002.
G. Casella, F. J. Giron, M. L. Martinez, and E. Moreno. Consistency of Bayesian procedures for variable selection. The Annals of Statistics, 37:1207–1228, 2009.
J. Chen and Z. Chen. Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95:759–771, 2008.
A. P. Chiang, J. S. Beck, H.-J. Yen, M. K. Tayeh, T. E. Scheetz, R. Swiderski, D. Nishimura, T. A. Braun, K.-Y. Kim, J. Huang, K. Elbedour, R. Carmi, D. C. Slusarski, T. L. Casavant, E. M. Stone, and V. C. Sheffield. Homozygosity mapping with SNP arrays identifies a novel gene for Bardet-Biedl syndrome (BBS10). Proceedings of the National Academy of Sciences, 103:6287–6292, 2006.
P. Craven and G. Wahba. Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross validation. Numerische Mathematik, 31:377–403, 1979.
B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. The Annals of Statistics, 32:407–499, 2004.
J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348–1360, 2001.
D. P. Foster and E. I. George. The risk inflation criterion for multiple regression. The Annals of Statistics, 22:1947–1975, 1994.
E. Greenshtein and Y. Ritov. Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli, 10:971–988, 2004.
J. Huang, S. Ma, and C.-H. Zhang. Adaptive lasso for sparse high-dimensional regression models. Statistica Sinica, 18:1603–1618, 2008.
Y. Kim and S. Kwon. The global optimality of the smoothly clipped absolute deviation penalized estimator. Biometrika, forthcoming, 2012.
Y. Kim, H. Choi, and H. Oh. Smoothly clipped absolute deviation on high dimensions. Journal of the American Statistical Association, 103:1665–1673, 2008.
T. E. Scheetz, K.-Y. A. Kim, R. E. Swiderski, A. R. Philp, T. A. Braun, K. L. Knudtson, A. M. Dorrance, G. F. DiBona, J. Huang, T. L. Casavant, V. C. Sheffield, and E. M. Stone. Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proceedings of the National Academy of Sciences, 103:14429–14434, 2006.
G. Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6:461–464, 1978.
J. Shao. An asymptotic theory for linear model selection. Statistica Sinica, 7:221–264, 1997.
M. Stone. Cross-validatory choice and assessment of statistical predictions (with discussion). Journal of the Royal Statistical Society, Ser. B, 36:111–147, 1974.
R. J. Tibshirani. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, Ser. B, 58:267–288, 1996.
H. Wang. Forward regression for ultra-high dimensional variable screening. Journal of the American Statistical Association, 104:1512–1524, 2009.
H. Wang, B. Li, and C. Leng. Shrinkage tuning parameter selection with a diverging number of parameters. Journal of the Royal Statistical Society, Ser. B, 71:671–683, 2009.
Y. Yang. Model selection for nonparametric regression. Statistica Sinica, 9:475–499, 1999.
Y. Yang. Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika, 92:937–950, 2005.
Y. Yang and A. R. Barron. An asymptotic property of model selection criteria. IEEE Transactions on Information Theory, 44:95–116, 1998.
C.-H. Zhang. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38:894–942, 2010.
Y. Zhang and X. Shen. Model selection procedure for high-dimensional data. Statistical Analysis and Data Mining, 3:350–358, 2010.
P. Zhao and B. Yu. On model selection consistency of lasso. Journal of Machine Learning Research, 7:2541–2563, 2006.
H. Zou. The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429, 2006.
H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Ser. B, 67:301–320, 2005.
H. Zou, T. Hastie, and R. Tibshirani. On the “degrees of freedom” of the lasso. The Annals of Statistics, 35:2173–2192, 2007.