nips2004-137
Source: pdf
Author: Clayton Scott, Robert Nowak
Abstract: Decision trees are surprisingly adaptive in three important respects: they automatically (1) adapt to favorable conditions near the Bayes decision boundary; (2) focus on data distributed on lower-dimensional manifolds; (3) reject irrelevant features. In this paper we examine a decision tree based on dyadic splits that adapts to each of these conditions to achieve minimax optimal rates of convergence. The proposed classifier is the first known to achieve these optimal rates while being practical and implementable.
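The abstract's key algorithmic idea is the dyadic split: cells of the unit cube are divided only at coordinate midpoints, and the final tree is chosen by a complexity-penalized criterion (see [9]-[11]). The following is a minimal sketch of that partitioning idea in Python, not the authors' algorithm; the fixed depth limit, the round-robin choice of split coordinate, and the omission of the penalized selection step are all simplifying assumptions made here for illustration.

# Minimal sketch of a dyadic decision tree (illustrative only, not the
# paper's method): every split is at the midpoint of one coordinate of the
# current cell, and each leaf predicts the majority label of the training
# points falling in it. Depth limit and round-robin split coordinate are
# assumptions; the paper instead selects the tree by penalized empirical
# risk minimization, which is omitted here.
import numpy as np

def grow(X, y, cell, depth, max_depth):
    """Recursively build a dyadic tree over the hyperrectangle `cell`.

    cell: list of (lo, hi) intervals, one per feature.
    Returns ('leaf', label) or ('split', dim, mid, left_subtree, right_subtree).
    """
    majority = int(round(y.mean())) if len(y) > 0 else 0
    if depth == max_depth or len(y) == 0 or y.min() == y.max():
        return ('leaf', majority)
    dim = depth % X.shape[1]            # cycle through coordinates (assumption)
    lo, hi = cell[dim]
    mid = 0.5 * (lo + hi)               # dyadic split: always at the midpoint
    left_mask = X[:, dim] <= mid
    left_cell = [c if i != dim else (lo, mid) for i, c in enumerate(cell)]
    right_cell = [c if i != dim else (mid, hi) for i, c in enumerate(cell)]
    return ('split', dim, mid,
            grow(X[left_mask], y[left_mask], left_cell, depth + 1, max_depth),
            grow(X[~left_mask], y[~left_mask], right_cell, depth + 1, max_depth))

def predict(tree, x):
    # Descend from the root, following midpoint comparisons, until a leaf.
    while tree[0] == 'split':
        _, dim, mid, left, right = tree
        tree = left if x[dim] <= mid else right
    return tree[1]

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    X = rng.random((500, 2))
    y = (X[:, 0] + X[:, 1] > 1).astype(int)   # toy linear decision boundary
    tree = grow(X, y, [(0.0, 1.0), (0.0, 1.0)], 0, max_depth=6)
    acc = np.mean([predict(tree, x) == t for x, t in zip(X, y)])
    print(f"training accuracy: {acc:.3f}")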
[1] A. B. Tsybakov, “Optimal aggregation of classifiers in statistical learning,” Ann. Stat., vol. 32, no. 1, pp. 135–166, 2004.
[2] Y. Mansour and D. McAllester, “Generalization bounds for decision trees,” in Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, N. Cesa-Bianchi and S. Goldman, Eds., Palo Alto, CA, 2000, pp. 69–74.
[3] Y. Yang, “Minimax nonparametric classification–Part I: Rates of convergence,” IEEE Trans. Inform. Theory, vol. 45, no. 7, pp. 2271–2284, 1999.
[4] E. Mammen and A. B. Tsybakov, “Smooth discrimination analysis,” Ann. Stat., vol. 27, pp. 1808–1829, 1999.
[5] A. B. Tsybakov and S. A. van de Geer, “Square root penalty: adaptation to the margin in classification and in edge estimation,” 2004, preprint.
[6] P. Bartlett, M. Jordan, and J. McAuliffe, “Convexity, classification, and risk bounds,” Department of Statistics, U.C. Berkeley, Tech. Rep. 638, 2003, to appear in Journal of the American Statistical Association.
[7] G. Blanchard, G. Lugosi, and N. Vayatis, “On the rate of convergence of regularized boosting classifiers,” J. Machine Learning Research, vol. 4, pp. 861–894, 2003.
[8] J. C. Scovel and I. Steinwart, “Fast rates for support vector machines,” Los Alamos National Laboratory, Tech. Rep. LA-UR 03-9117, 2004.
[9] G. Blanchard, C. Schäfer, and Y. Rozenholc, “Oracle bounds and exact algorithm for dyadic classification trees,” in Learning Theory: 17th Annual Conference on Learning Theory, COLT 2004, J. Shawe-Taylor and Y. Singer, Eds. Heidelberg: Springer-Verlag, 2004, pp. 378–392.
[10] C. Scott and R. Nowak, “Near-minimax optimal classification with dyadic classification trees,” in Advances in Neural Information Processing Systems 16, S. Thrun, L. Saul, and B. Schölkopf, Eds. Cambridge, MA: MIT Press, 2004.
[11] ——, “Minimax optimal classification with dyadic decision trees,” Rice University, Tech. Rep. TREE0403, 2004. [Online]. Available: http://www.stat.rice.edu/~cscott
[12] A. Nobel, “Analysis of a complexity based pruning scheme for classification trees,” IEEE Trans. Inform. Theory, vol. 48, no. 8, pp. 2362–2368, 2002.
[13] J.-Y. Audibert, “PAC-Bayesian statistical learning theory,” Ph.D. dissertation, Université Paris 6, June 2004.