nips2012-280: reference knowledge graph
Source: pdf
Author: Jesús Cid-Sueiro
Abstract: This paper discusses the problem of calibrating posterior class probabilities from partially labelled data. Each instance is assumed to be labelled as belonging to one of several candidate categories, at most one of them being true. We generalize the concept of a proper loss to this scenario, establish a necessary and sufficient condition for a loss function to be proper, and show a direct procedure to construct a proper loss for partial labels from a conventional proper loss. The problem can be characterized by the mixing probability matrix relating the true class of the data to the observed labels. Full knowledge of this matrix is not required, and losses can be constructed that are proper for a wide set of mixing probability matrices.
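The construction described in the abstract can be illustrated numerically: given a mixing matrix M with entries M[z, y] = P(observed label pattern z | true class y), a loss on partial labels can be built as a linear combination of a conventional proper loss, so that its expectation over observed labels equals the expectation of the original loss. The Python sketch below is a minimal illustration, not the paper's exact algorithm; the names M, Y, and partial_loss are assumptions, as is the choice of log loss as the conventional proper loss.

```python
# Minimal sketch: turning a conventional proper loss (log loss) into a loss on
# partial labels, assuming a known mixing matrix M with
# M[z, y] = P(observe label pattern z | true class y). Names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

c = 3  # number of true classes
d = 4  # number of observable (partial) label patterns

# Example mixing matrix: each column sums to 1 and M is assumed to have
# full column rank.
M = np.array([[0.7, 0.1, 0.0],
              [0.2, 0.6, 0.1],
              [0.1, 0.2, 0.2],
              [0.0, 0.1, 0.7]])

# Find Y with M.T @ Y = I_c; the Moore-Penrose pseudoinverse gives one choice.
Y = np.linalg.pinv(M.T)  # shape (d, c)
assert np.allclose(M.T @ Y, np.eye(c))

def log_loss_vector(p):
    """Conventional proper loss l(y, p) = -log p_y, evaluated for every y."""
    return -np.log(p)

def partial_loss(z, p):
    """Loss for observed pattern z: a linear combination of per-class losses."""
    return Y[z] @ log_loss_vector(p)

# Check: for any posterior estimate p and any true class y, the expected
# partial loss over z ~ M[:, y] recovers the original proper loss -log p_y.
p = rng.dirichlet(np.ones(c))
for y in range(c):
    expected = sum(M[z, y] * partial_loss(z, p) for z in range(d))
    assert np.isclose(expected, -np.log(p[y]))
print("Expected partial loss matches the conventional proper loss.")
```

Any Y satisfying MᵀY = I preserves the expected loss in this way; the pseudoinverse is just one convenient choice when M has full column rank, which connects to the abstract's point that full knowledge of the mixing matrix is not required.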
[1] T. Cour, B. Sapp, and B. Taskar, “Learning from partial labels,” Journal of Machine Learning Research, vol. 12, pp. 1225–1261, 2011.
[2] V. C. Raykar, S. Yu, L. H. Zhao, G. H. Valadez, C. Florin, L. Bogoni, and L. Moy, “Learning from crowds,” Journal of Machine Learning Research, vol. 11, pp. 1297–1322, August 2010.
[3] V. S. Sheng, F. Provost, and P. G. Ipeirotis, “Get another label? Improving data quality and data mining using multiple, noisy labelers,” in Proc. of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’08. New York, NY, USA: ACM, 2008, pp. 614–622.
[4] E. Côme, L. Oukhellou, T. Denoeux, and P. Aknin, “Mixture model estimation with soft labels,” in Soft Methods for Handling Variability and Imprecision, ser. Advances in Soft Computing, D. Dubois, M. Lubiano, H. Prade, M. Gil, P. Grzegorzewski, and O. Hryniewicz, Eds. Springer Berlin / Heidelberg, 2008, vol. 48, pp. 165–174.
[5] P. Liang, M. Jordan, and D. Klein, “Learning from measurements in exponential families,” in Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009, pp. 641–648.
[6] R. Jin and Z. Ghahramani, “Learning with multiple labels,” Advances in Neural Information Processing Systems, vol. 15, pp. 897–904, 2002.
[7] C. Ambroise, T. Denoeux, G. Govaert, and P. Smets, “Learning from an imprecise teacher: probabilistic and evidential approaches,” in Applied Stochastic Models and Data Analysis, 2001, vol. 1, pp. 100–105.
[8] Y. Grandvalet and Y. Bengio, “Semi-supervised learning by entropy minimization,” Advances in Neural Information Processing Systems, vol. 17, pp. 529–536, 2005.
[9] M. Reid and B. Williamson, “Information, divergence and risk for binary experiments,” Journal of Machine Learning Research, vol. 12, pp. 731–817, 2011.
[10] H. Masnadi-Shirazi and N. Vasconcelos, “Risk minimization, probability elicitation, and cost-sensitive SVMs,” in Proceedings of the International Conference on Machine Learning, 2010, pp. 204–213.
[11] L. Savage, “Elicitation of personal probabilities and expectations,” Journal of the American Statistical Association, vol. 66, no. 336, pp. 783–801, 1971.
[12] T. Gneiting and A. Raftery, “Strictly proper scoring rules, prediction, and estimation,” Journal of the American Statistical Association, vol. 102, no. 477, pp. 359–378, 2007.