nips nips2011 nips2011-33 nips2011-33-reference knowledge-graph by maker-knowledge-mining

33 nips-2011-An Exact Algorithm for F-Measure Maximization

Source: pdf

Author: Krzysztof J. Dembczynski, Willem Waegeman, Weiwei Cheng, Eyke Hüllermeier

Abstract: The F-measure, originally introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classiﬁcation, multi-label classiﬁcation, and structured output prediction. Optimizing this measure remains a statistically and computationally challenging problem, since no closed-form maximizer exists. Current algorithms are approximate and typically rely on additional assumptions regarding the statistical distribution of the binary response variables. In this paper, we present an algorithm which is not only computationally efﬁcient but also exact, regardless of the underlying distribution. The algorithm requires only a quadratic number of parameters of the joint distribution (with respect to the number of binary responses). We illustrate its practical performance by means of experimental results for multi-label classiﬁcation. 1

reference text

[1] C. J. van Rijsbergen. Foundation of evaluation. Journal of Documentation, 30(4):365–373, 1974.

[2] David R. Musicant, Vipin Kumar, and Aysel Ozgur. Optimizing F-measure with support vector machines. In FLAIRS-16, 2003, pages 356–360, 2003.

[3] Thorsten Joachims. A support vector method for multivariate performance measures. In ICML 2005, pages 377–384, 2005.

[4] Martin Jansche. Maximum expected F-measure training of logistic regression models. In HLT/EMNLP 2005, pages 736–743, 2005.

[5] Sathiya Keerthi, Vikas Sindhwani, and Olivier Chapelle. An efﬁcient method for gradientbased adaptation of hyperparameters in SVM models. In Advances in Neural Information Processing Systems 19, 2007.

[6] Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, and Yasemin Altun. Large margin methods for structured and interdependent output variables. J. Mach. Learn. Res., 6:1453–1484, 2005.

[7] Jun Suzuki, Erik McDermott, and Hideki Isozaki. Training conditional random ﬁelds with multivariate evaluation measures. In ACL, pages 217–224, 2006.

[8] Hal Daum´ III, John Langford, and Daniel Marcu. Search-based structured prediction. Mae chine Learning, 75:297–325, 2009.

[9] Rong-En Fan and Chih-Jen Lin. A study on threshold selection for multi-label classiﬁcation. Technical report, Department of Computer Science, National Taiwan University, 2007.

[10] Xinhua Zhang, Thore Graepel, and Ralf Herbrich. Bayesian online learning for multi-label and multi-variate performance measures. In AISTATS 2010, pages 956–963, 2010.

[11] James Petterson and Tiberio Caetano. Reverse multi-label learning. In Advances in Neural Information Processing Systems 23, pages 1912–1920, 2010.

[12] Martin Jansche. A maximum expected utility framework for binary sequence labeling. In ACL 2007, pages 736–743, 2007.

[13] David Lewis. Evaluating and optimizing autonomous text classiﬁcation systems. In SIGIR 1995, pages 246–254, 1995.

[14] Juan Jose del Coz, Jorge Diez, and Antonio Bahamonde. Learning nondeterministic classiﬁers. J. Mach. Learn. Res., 10:2273–2293, 2009.

[15] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, 2nd edition. MIT Press, 2001.

[16] Don Coppersmith and Shmuel Winograd. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, 3(9):251–280, 1990.

[17] John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional random ﬁelds: Probabilistic models for segmenting and labeling sequence data. In ICML 2001, pages 282–289, 2001.

[18] Nadia Ghamrawi and Andrew McCallum. Collective multi-label classiﬁcation. In CIKM 2005, pages 195–200, 2005.

[19] Krzysztof Dembczy´ ski, Weiwei Cheng, and Eyke H¨ llermeier. Bayes optimal multilabel n u classiﬁcation via probabilistic classiﬁer chains. In ICML 2010, pages 279–286, 2010.

[20] Piyush Rai and Hal Daum´ III. Multi-label prediction via sparse inﬁnite CCA. In Advances in e Neural Information Processing Systems 22, pages 1518–1526, 2009.

[21] Matthew R. Boutell, Jiebo Luo, Xipeng Shen, and Christopher M. Brown. Learning multi-label scene classiﬁcation. Pattern Recognition, 37(9):1757–1771, 2004. 9