
58 jmlr-2011-Learning from Partial Labels

Source: pdf

Author: Timothee Cour, Ben Sapp, Ben Taskar

Abstract: We address the problem of partially-labeled multiclass classification, where instead of a single label per instance, the algorithm is given a candidate set of labels, only one of which is correct. Our setting is motivated by a common scenario in many image and video collections, where only partial access to labels is available. The goal is to learn a classifier that can disambiguate the partially-labeled training instances, and generalize to unseen data. We define an intuitive property of the data distribution that sharply characterizes the ability to learn in this setting and show that effective learning is possible even when all the data is only partially labeled. Exploiting this property of the data, we propose a convex learning formulation based on minimization of a loss function appropriate for the partial label setting. We analyze the conditions under which our loss function is asymptotically consistent, as well as its generalization and transductive performance. We apply our framework to identifying faces culled from web news sources and to naming characters in TV series and movies; in particular, we annotated and experimented on a very large video data set and achieve 6% error for character naming on 16 episodes of the TV series Lost.

Keywords: weakly supervised learning, multiclass classification, convex learning, generalization bounds, names and faces

reference text

C. Ambroise, T. Denoeux, G. Govaert, and P. Smets. Learning from an imprecise teacher: Probabilistic and evidential approaches. In Applied Stochastic Models and Data Analysis, volume 1, pages 100–105, 2001.
S. Andrews and T. Hofmann. Multiple instance learning via disjunctive programming boosting. In Advances in Neural Information Processing Systems, 2004.
A. Asuncion and D.J. Newman. UCI machine learning repository, 2007.
K. Barnard, P. Duygulu, D.A. Forsyth, N. de Freitas, D.M. Blei, and M.I. Jordan. Matching words and pictures. Journal of Machine Learning Research, 3:1107–1135, 2003.
P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.
T.L. Berg, A.C. Berg, J. Edwards, M. Maire, R. White, Y.W. Teh, E.G. Learned-Miller, and D.A. Forsyth. Names and faces in the news. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, pages 848–854, 2004.
M.R. Boutell, J. Luo, X. Shen, and C.M. Brown. Learning multi-label scene classification. Pattern Recognition, 37(9):1757–1771, 2004.
P. E. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263–311, 1993.
O. Chapelle, B. Schölkopf, and A. Zien. Semi-Supervised Learning. The MIT Press, 2006.
E. Côme, L. Oukhellou, T. Denœux, and P. Aknin. Mixture model estimation with soft labels. International Conference on Soft Methods in Probability and Statistics, 2008.
T. Cour, C. Jordan, E. Miltsakaki, and B. Taskar. Movie/script: Alignment and parsing of video and text transcription. In Proc. European Conference on Computer Vision, 2008.
T. Cour, B. Sapp, C. Jordan, and B. Taskar. Learning from ambiguously labeled images. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2009.
K. Crammer and Y. Singer. On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2:265–292, 2002.
T.G. Dietterich, R.H. Lathrop, and T. Lozano-Pérez. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1-2):31–71, 1997.
P. Duygulu, K. Barnard, J.F.G. de Freitas, and D.A. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In Proc. European Conference on Computer Vision, pages 97–112, 2002.
M. Everingham, J. Sivic, and A. Zisserman. Hello! My name is... Buffy – automatic naming of characters in TV video. In British Machine Vision Conference, 2006.
R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, 2008.
J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28:337–407, 2000.
A.C. Gallagher and T. Chen. Using group prior to identify people in consumer images. In CVPR Workshop on Semantic Learning Applications in Multimedia, 2007.
Y. Grandvalet and Y. Bengio. Learning from partial labels with minimum entropy. Centre interuniversitaire de recherche en analyse des organisations (CIRANO), 2004.
G.B. Huang, V. Jain, and E. Learned-Miller. Unsupervised joint alignment of complex images. In Proc. International Conference on Computer Vision, 2007a.
G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, 2007b.
E. Hullermeier and J. Beringer. Learning from ambiguously labeled examples. Intelligent Data Analysis, 10(5):419–439, 2006.
R. Jin and Z. Ghahramani. Learning with multiple labels. In Advances in Neural Information Processing Systems, pages 897–904, 2002.
H. Kuck and N. de Freitas. Learning about individuals from group statistics. In Uncertainty in Artificial Intelligence, 2005.
I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld. Learning realistic human actions from movies. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2008.
J. Luo and F. Orabona. Learning from candidate labeling sets. In Advances in Neural Information Processing Systems, 2010.
P. Mermelstein. Distance measures for speech recognition, psychological and instrumental. Pattern Recognition and Artificial Intelligence, pages 374–388, 1976.
P.J. Moreno, C. Joerg, J.M.V. Thong, and O. Glickman. A recursive algorithm for the forced alignment of very long audio segments. In International Conference on Spoken Language Processing, 1998.
J.G. Proakis and D.G. Manolakis. Digital signal processing: principles, algorithms, and applications. Prentice Hall, 1996.
N. Quadrianto, A.J. Smola, T.S. Caetano, and Q.V. Le. Estimating labels from label proportions. Journal of Machine Learning Research, 10:2349–2374, 2009. ISSN 1532-4435.
D. Ramanan, S. Baker, and S. Kakade. Leveraging archival video for building face datasets. In Proc. International Conference on Computer Vision, 2007.
R. Rifkin and A. Klautau. In defense of one-vs-all classification. Journal of Machine Learning Research, 5:101–141, 2004.
S. Satoh, Y. Nakamura, and T. Kanade. Name-it: Naming and detecting faces in news videos. IEEE MultiMedia, 6(1):22–35, 1999.
K. Sjölander. An HMM-based system for automatic segmentation and alignment of speech. In Fonetik, pages 93–96, 2003.
D. Talkin. A robust algorithm for pitch tracking (RAPT). Speech Coding and Synthesis, pages 495–518, 1995.
A. Tewari and P. L. Bartlett. On the consistency of multiclass classification methods. In International Conference on Learning Theory, volume 3559, pages 143–157, 2005.
G. Tsoumakas, I. Katakis, and I. Vlahavas. Mining multi-label data. Data Mining and Knowledge Discovery Handbook, pages 667–685, 2010.
P. Vannoorenberghe and P. Smets. Partially supervised learning by a credal EM approach. In European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty, pages 956–967, 2005.
P. Viola, J. Platt, and C. Zhang. Multiple instance boosting for object detection. Advances in Neural Information Processing Systems, 18:1417, 2006.
R. Yan, J. Zhang, J. Yang, and A.G. Hauptmann. A discriminative learning framework with pairwise constraints for video object classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):578–593, 2006.
T. Zhang. Statistical analysis of some multi-category large margin classification methods. Journal of Machine Learning Research, 5:1225–1251, 2004. ISSN 1533-7928.
Z.H. Zhou and M.L. Zhang. Multi-instance multi-label learning with application to scene classification. Advances in Neural Information Processing Systems, 19:1609, 2007.
X. Zhu and A.B. Goldberg. Introduction to semi-supervised learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 3(1):1–130, 2009.