Paper: nips2010-178 (reference knowledge graph)
Authors: Han Liu, Xi Chen
Abstract: We propose a new nonparametric learning method based on multivariate dyadic regression trees (MDRTs). Unlike traditional dyadic decision trees (DDTs) or classification and regression trees (CARTs), MDRTs are constructed using penalized empirical risk minimization with a novel sparsity-inducing penalty. Theoretically, we show that MDRTs can simultaneously adapt to the unknown sparsity and smoothness of the true regression functions, and achieve nearly optimal rates of convergence (in a minimax sense) for the class of (α, C)-smooth functions. Empirically, MDRTs can simultaneously conduct function estimation and variable selection in high dimensions. To make MDRTs applicable to large-scale learning problems, we propose a greedy heuristic. The superior performance of MDRTs is demonstrated on both synthetic and real datasets.
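The abstract only outlines the estimator, so the following is a minimal sketch of a greedy dyadic regression tree in Python, not the paper's actual algorithm. The function name fit_mdrt, the stopping rule, and the penalty term lam * n are hypothetical stand-ins; the paper's sparsity-inducing penalty and penalized empirical risk minimization are not specified in this excerpt. The sketch assumes features are scaled to the unit cube, as is standard for dyadic partitions.

    import numpy as np

    def fit_mdrt(X, y, lo=None, hi=None, depth=0, max_depth=6, lam=0.01):
        """Greedy dyadic regression tree (illustrative sketch only).

        Recursively bisects the current cell at its midpoint along the
        coordinate that most reduces empirical squared error, stopping
        when the reduction does not beat a simple complexity penalty
        (lam * n here is a stand-in for the paper's penalty).
        Assumes X lies in the unit cube [0, 1]^d.
        """
        n, d = X.shape
        if lo is None:
            lo, hi = np.zeros(d), np.ones(d)
        mean = y.mean()
        err = np.sum((y - mean) ** 2)
        if depth >= max_depth or n < 2:
            return {"leaf": True, "value": mean}
        best = None
        for j in range(d):
            mid = 0.5 * (lo[j] + hi[j])  # dyadic midpoint split on coordinate j
            left = X[:, j] <= mid
            if left.all() or (~left).all():
                continue  # split leaves one side empty; skip it
            e = (np.sum((y[left] - y[left].mean()) ** 2)
                 + np.sum((y[~left] - y[~left].mean()) ** 2))
            if best is None or e < best[0]:
                best = (e, j, mid, left)
        # Stop if no valid split, or the error reduction is below the penalty.
        if best is None or err - best[0] <= lam * n:
            return {"leaf": True, "value": mean}
        e, j, mid, left = best
        hi_l, lo_r = hi.copy(), lo.copy()
        hi_l[j], lo_r[j] = mid, mid
        return {"leaf": False, "dim": j, "split": mid,
                "left": fit_mdrt(X[left], y[left], lo, hi_l,
                                 depth + 1, max_depth, lam),
                "right": fit_mdrt(X[~left], y[~left], lo_r, hi,
                                  depth + 1, max_depth, lam)}

    def predict(tree, x):
        while not tree["leaf"]:
            tree = tree["left"] if x[tree["dim"]] <= tree["split"] else tree["right"]
        return tree["value"]

In this toy version, variable selection can be read off the fitted tree: coordinates that are never split on play no role in the prediction, which loosely mirrors how a sparsity-inducing penalty discourages splits along irrelevant dimensions.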
[1] G. Blanchard, C. Schäfer, Y. Rozenholc, and K.-R. Müller. Optimal dyadic decision trees. Machine Learning Journal, 66(2-3):209–241, 2007.
[2] Leo Breiman, Jerome Friedman, Charles J. Stone, and R. A. Olshen. Classification and regression trees. Wadsworth Publishing Co., Inc., 1984.
[3] Leo Breiman and Jerome H. Friedman. Predicting multivariate responses in multiple linear regression. J. Roy. Statist. Soc. B, 59(1):3–54, 1997.
[4] R. Castro, R. Willett, and R. Nowak. Fast rates in regression via active learning. In NIPS, 2005.
[5] Xi Chen, Weike Pan, James T. Kwok, and Jaime G. Carbonell. Accelerated gradient method for multi-task sparse learning problem. In ICDM, 2009.
[6] Hugh A. Chipman, Edward I. George, and Robert E. McCulloch. BART: Bayesian additive regression trees. Technical report, Department of Mathematics and Statistics, Acadia University, Canada, 2006.
[7] Jerome H. Friedman. Multivariate adaptive regression splines. The Annals of Statistics, 19:1–67, 1991.
[8] S. Gey and E. Nedelec. Model selection for CART regression trees. IEEE Transactions on Information Theory, 51(2):658–670, 2005.
[9] László Györfi, Michael Kohler, Adam Krzyżak, and Harro Walk. A Distribution-Free Theory of Nonparametric Regression. Springer-Verlag, 2002.
[10] Han Liu, John Lafferty, and Larry Wasserman. Nonparametric regression and classification with joint sparsity constraints. In NIPS. MIT Press, 2008.
[11] Han Liu and Jian Zhang. On the estimation consistency of the group lasso and its applications. In AISTATS, pages 376–383, 2009.
[12] Wendy L. Martinez and Angel R. Martinez. Computational Statistics Handbook with MATLAB. Chapman & Hall/CRC, 2nd edition, 2008.
[13] G. Obozinski, M. J. Wainwright, and M. I. Jordan. High-dimensional union support recovery in multivariate regression. In NIPS. MIT Press, 2009.
[14] Pradeep Ravikumar, Han Liu, John Lafferty, and Larry Wasserman. SpAM: Sparse additive models. In NIPS. MIT Press, 2007.
[15] C. Scott and R. D. Nowak. Minimax-optimal classification with dyadic decision trees. IEEE Transactions on Information Theory, 52(4):1335–1353, 2006.
[16] B. A. Turlach, W. N. Venables, and S. J. Wright. Simultaneous variable selection. Technometrics, 47(3):349–363, 2005.