
282 nips-2010-Variable margin losses for classifier design


Source: pdf

Authors: Hamed Masnadi-Shirazi, Nuno Vasconcelos

Abstract: The problem of controlling the margin of a classifier is studied. A detailed analytical study is presented on how properties of the classification risk, such as its optimal link and minimum risk functions, are related to the shape of the loss and its margin-enforcing properties. It is shown that for a class of risks, denoted canonical risks, asymptotic Bayes consistency is compatible with simple analytical relationships between these functions. These enable a precise characterization of the loss for a popular class of link functions. It is shown that, when the risk is in canonical form and the link is inverse sigmoidal, the margin properties of the loss are determined by a single parameter. Novel families of Bayes-consistent loss functions, of variable margin, are derived. These families are then used to design boosting-style algorithms with explicit control of the classification margin. The new algorithms generalize well-established approaches, such as LogitBoost. Experimental results show that the proposed variable-margin losses outperform the fixed-margin counterparts used by existing algorithms. Finally, it is shown that best performance can be achieved by cross-validating the margin parameter.
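
To make the abstract's recipe concrete: in the canonical-logistic case the loss phi(v) = log(1 + exp(-v)) and the inverse link f^{-1}(v) = 1/(1 + exp(-v)) satisfy phi'(v) = f^{-1}(v) - 1, so scaling the sigmoidal link to f^{-1}(v) = 1/(1 + exp(-a*v)) yields the one-parameter family phi_a(v) = (1/a) * log(1 + exp(-a*v)), whose margin behavior is governed by a. The sketch below shows a LogitBoost-style gradient-boosting loop over this family with a cross-validated margin parameter. It is a minimal illustration assuming this particular loss family; the grid and all helper names are choices made here for concreteness, not the authors' exact algorithm.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import StratifiedKFold

# Variable-margin logistic-type loss (an assumed illustrative family):
#   phi_a(v) = (1/a) * log(1 + exp(-a*v)),  a > 0 controls the margin.

def neg_gradient(y, v, a):
    # -d/dv phi_a(y*v) = y * sigmoid(-a*y*v), for labels y in {-1, +1}.
    # (np.exp may overflow to inf for large a*y*v; the resulting zero
    # gradient is the correct limit.)
    return y / (1.0 + np.exp(a * y * v))

def boost(X, y, a, n_rounds=100, lr=0.1):
    # Gradient boosting with regression stumps as weak learners.
    v, learners = np.zeros(len(y)), []
    for _ in range(n_rounds):
        r = neg_gradient(y, v, a)                       # pseudo-residuals
        h = DecisionTreeRegressor(max_depth=1).fit(X, r)
        v += lr * h.predict(X)
        learners.append(h)
    return learners

def predict(learners, X, lr=0.1):
    v = sum(lr * h.predict(X) for h in learners)
    return np.sign(v)

def cv_margin(X, y, grid=(0.2, 0.5, 1.0, 2.0, 5.0), k=5):
    # Pick the margin parameter by k-fold cross-validation, as the
    # abstract recommends; the grid is an arbitrary choice.
    best_a, best_acc = grid[0], -1.0
    for a in grid:
        folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
        acc = np.mean([
            np.mean(predict(boost(X[tr], y[tr], a), X[te]) == y[te])
            for tr, te in folds.split(X, y)
        ])
        if acc > best_acc:
            best_a, best_acc = a, acc
    return best_a

Note the trade-off the cross-validated parameter resolves: as a grows, phi_a(v) approaches the hinge-like max(0, -v), while small a flattens the loss and enforces the margin less.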


reference text

[1] V. N. Vapnik, Statistical Learning Theory. John Wiley & Sons, Inc., 1998.

[2] J. Friedman, T. Hastie, and R. Tibshirani, “Additive logistic regression: A statistical view of boosting,” Annals of Statistics, vol. 28, no. 2, pp. 337–407, 2000.

[3] H. Masnadi-Shirazi and N. Vasconcelos, “On the design of loss functions for classification: theory, robustness to outliers, and SavageBoost,” in NIPS, 2008, pp. 1049–1056.

[4] L. J. Savage, “The elicitation of personal probabilities and expectations,” Journal of the American Statistical Association, vol. 66, pp. 783–801, 1971.

[5] C. Leistner, A. Saffari, P. M. Roth, and H. Bischof, “On robustness of on-line boosting - a competitive study,” in IEEE ICCV Workshop on On-line Learning for Computer Vision, 2009.

[6] A. Buja, W. Stuetzle, and Y. Shen, “Loss functions for binary class probability estimation and classification: Structure and applications,” 2006.

[7] T. Zhang, “Statistical behavior and consistency of classification methods based on convex risk minimization,” Annals of Statistics, vol. 32, no. 1, pp. 56–85, 2004.

[8] J. H. Friedman, “Greedy function approximation: A gradient boosting machine,” The Annals of Statistics, vol. 29, no. 5, pp. 1189–1232, 2001.

[9] J. Demšar, “Statistical comparisons of classifiers over multiple data sets,” Journal of Machine Learning Research, vol. 7, pp. 1–30, 2006.