nips nips2008 nips2008-228 nips2008-228-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yves Grandvalet, Alain Rakotomamonjy, Joseph Keshet, Stéphane Canu
Abstract: We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow’s rule, is defined by two thresholds on posterior probabilities. From simple desiderata, namely the consistency and the sparsity of the classifier, we derive the double hinge loss function that focuses on estimating conditional probabilities only in the vicinity of the threshold points of the optimal decision rule. We show that, for suitable kernel machines, our approach is universally consistent. We cast the problem of minimizing the double hinge loss as a quadratic program akin to the standard SVM optimization problem and propose an active set method to solve it efficiently. We finally provide preliminary experimental results illustrating the interest of our constructive approach to devising loss functions.
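The two-threshold structure of Chow's rule mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest symmetric setting, which is an assumption made here and not necessarily the cost model used in the paper: a misclassification costs 1, a rejection costs `reject_cost` (with 0 <= `reject_cost` < 1/2), and the posterior probability `p_pos` = P(Y = +1 | x) is known. Under these assumptions the expected cost of predicting +1 is 1 - `p_pos`, that of predicting -1 is `p_pos`, and that of abstaining is `reject_cost`, which gives the thresholds `reject_cost` and 1 - `reject_cost`. The function name and its parameters are hypothetical.

```python
from typing import Optional


def chow_rule(p_pos: float, reject_cost: float) -> Optional[int]:
    """Symmetric Chow's rule (illustrative sketch, not the paper's code).

    p_pos       -- assumed-known posterior probability P(Y = +1 | x)
    reject_cost -- assumed cost of abstaining, in [0, 0.5);
                   a misclassification is assumed to cost 1

    Returns +1 or -1 for a classification, or None for an abstention.
    """
    if not 0.0 <= p_pos <= 1.0:
        raise ValueError("p_pos must lie in [0, 1]")
    if not 0.0 <= reject_cost < 0.5:
        raise ValueError("reject_cost must lie in [0, 0.5)")

    lower = reject_cost        # below this threshold, predict -1
    upper = 1.0 - reject_cost  # above this threshold, predict +1

    if p_pos > upper:
        return 1
    if p_pos < lower:
        return -1
    return None  # posterior is too ambiguous: abstain
```

For instance, with `reject_cost = 0.2` the rule classifies only when the posterior lies outside the interval [0.2, 0.8] and abstains otherwise. This is why a loss function only needs accurate probability estimates near the two thresholds, which is the observation the abstract builds on.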
References:
Bartlett, P. L., & Tewari, A. (2007). Sparseness vs estimating conditional probabilities: Some asymptotic results. Journal of Machine Learning Research, 8, 775–790.
Bartlett, P. L., & Wegkamp, M. H. (2008). Classification with a reject option using a hinge loss. Journal of Machine Learning Research, 9, 1823–1840.
Chow, C. K. (1970). On optimum recognition error and reject tradeoff. IEEE Trans. on Info. Theory, 16, 41–46.
Fumera, G., & Roli, F. (2002). Support vector machines with embedded reject option. Pattern Recognition with Support Vector Machines: First International Workshop (pp. 68–82). Springer.
Grandvalet, Y., Mariéthoz, J., & Bengio, S. (2006). A probabilistic interpretation of SVMs with an application to unbalanced classification. NIPS 18 (pp. 467–474). MIT Press.
Herbei, R., & Wegkamp, M. H. (2006). Classification with reject option. The Canadian Journal of Statistics, 34, 709–721.
Kwok, J. T. (1999). Moderating the outputs of support vector machine classifiers. IEEE Trans. on Neural Networks, 10, 1018–1031.
Steinwart, I. (2005). Consistency of support vector machine and other regularized kernel classifiers. IEEE Trans. on Info. Theory, 51, 128–142.
Vapnik, V. N. (1995). The nature of statistical learning theory. Springer Series in Statistics. Springer.
Vishwanathan, S. V. N., Smola, A., & Murty, N. (2003). SimpleSVM. Proceedings of the Twentieth International Conference on Machine Learning (pp. 68–82). AAAI.