
3 jmlr-2008-A Moment Bound for Multi-hinge Classifiers


Source: pdf

Author: Bernadetta Tarigan, Sara A. van de Geer

Abstract: The success of support vector machines in binary classification relies on the fact that the hinge loss employed in the risk minimization targets the Bayes rule. Recent research explores some extensions of this large margin based method to the multicategory case. We show a moment bound for the so-called multi-hinge loss minimizers based on two kinds of complexity constraints: entropy with bracketing and empirical entropy. Obtaining such a result based on the latter is harder than finding one based on the former. We obtain fast rates of convergence that adapt to the unknown margin.

Keywords: multi-hinge classification, all-at-once, moment bound, fast rate, entropy


reference text

Peter L. Bartlett and Marten H. Wegkamp. Classification with a reject option using a hinge loss. Technical report, U.C. Berkeley, 2006.

Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. Convexity, classification and risk bounds. Journal of the American Statistical Association, 101(473):138–156, 2006.

Stéphane Boucheron, Olivier Bousquet, and Gábor Lugosi. Theory of classification: a survey of some recent advances. ESAIM: Probability and Statistics, 9:323–375, 2005.

Koby Crammer and Yoram Singer. On the learnability and design of output codes for multiclass problems. In Proceedings of the 13th Annual Conference on Computational Learning Theory, pages 35–46. Morgan Kaufmann, 2000.

Koby Crammer and Yoram Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2:265–292, 2001.

Eustasio del Barrio, Paul Deheuvels, and Sara A. van de Geer. Lectures on Empirical Processes. EMS Series of Lectures in Mathematics. European Mathematical Society, 2007.

Kaibo Duan and S. Sathiya Keerthi. Which is the best multiclass SVM method? An empirical study. In Multiple Classifier Systems, number 3541 in Lecture Notes in Computer Science, pages 278–285. Springer Berlin/Heidelberg, 2005.

Theodoros Evgeniou, Massimiliano Pontil, and Tomaso Poggio. Regularization networks and support vector machines. Advances in Computational Mathematics, 13:1–50, 2000.

Yann Guermeur. Combining discriminant models with new multiclass SVMs. Pattern Analysis & Applications, 5:168–179, 2002.

Godfrey H. Hardy, John E. Littlewood, and George Pólya. Inequalities. Cambridge University Press, second edition, 1988.

Chih-Wei Hsu and Chih-Jen Lin. A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2):415–425, 2002.

Yoonkyung Lee. Multicategory Support Vector Machines, Theory and Application to the Classification of Microarray Data and Satellite Radiance Data. PhD thesis, University of Wisconsin-Madison, Department of Statistics, 2002.

Yoonkyung Lee and Zhenhuan Cui. Characterizing the solution path of multicategory support vector machines. Statistica Sinica, 16(2):391–409, 2006.

Yoonkyung Lee, Yi Lin, and Grace Wahba. Multicategory support vector machines: Theory and application to the classification of microarray data and satellite radiance data. Journal of the American Statistical Association, 99(465):67–81, 2004.

Yi Lin. Support vector machines and the Bayes rule in classification. Data Mining and Knowledge Discovery, 6(3):259–275, 2002.

Enno Mammen and Alexandre B. Tsybakov. Smooth discrimination analysis. Annals of Statistics, 27(6):1808–1829, 1999.

David Pollard. Convergence of Stochastic Processes. Springer-Verlag New York Inc., 1984.

Shuguang Song and Jon A. Wellner. An upper bound for uniform entropy numbers. Technical report, Department of Statistics, University of Washington, 2002. URL www.stat.washington.edu/www/research/reports/#2002/tr409.ps.

Ingo Steinwart and Clint Scovel. Fast rates for support vector machines. In P. Auer and R. Meir, editors, COLT, volume 3559 of Lecture Notes in Computer Science, pages 279–294, 2005.

Bernadetta Tarigan and Sara A. van de Geer. Classifiers of support vector machine type with l1 complexity regularization. Bernoulli, 12(6):1045–1076, 2006.

Ambuj Tewari and Peter L. Bartlett. On the consistency of multiclass classification methods. In P. Auer and R. Meir, editors, COLT, volume 3559 of Lecture Notes in Computer Science, pages 143–157, 2005.

Alexandre B. Tsybakov. Optimal aggregation of classifiers in statistical learning. Annals of Statistics, 32:135–166, 2004.

Sara A. van de Geer. Empirical Processes in M-estimation. Cambridge University Press, 2000.

Aad W. van der Vaart and Jon A. Wellner. Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer-Verlag, New York, 1996.

Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, New York, 2000.

Lifeng Wang and Xiaotong Shen. On l1-norm multiclass support vector machines: Methodology and theory. Journal of the American Statistical Association, 102(478):583–594, 2007.

Jason Weston and Chris Watkins. Multi-class support vector machines. In Proceedings of ESANN99, 1999.

Tong Zhang. Statistical behaviour and consistency of classification methods based on convex risk minimization. Annals of Statistics, 32(1):56–134, 2004a. With discussion.

Tong Zhang. Statistical analysis of some multi-category large margin classification methods. Journal of Machine Learning Research, 5:1225–1251, 2004b.

Tong Zhang. Statistical analysis of some multi-category large margin classification methods. Journal of Machine Learning Research, 5:1225–1251, 2004c.

Hui Zou, Ji Zhu, and Trevor Hastie. The margin vector, admissible loss and multi-class margin-based classifiers. Technical report, Statistics Department, Stanford University, 2006.