NIPS 2010, paper 275 (references): knowledge graph by maker-knowledge-mining
Source: pdf
Author: Andrew Goldberg, Ben Recht, Junming Xu, Robert Nowak, Xiaojin Zhu
Abstract: We pose transductive classification as a matrix completion problem. By assuming the underlying matrix has low rank, our formulation handles three problems simultaneously: (i) multi-label learning, where each item has more than one label; (ii) transduction, where most of these labels are unspecified; and (iii) missing data, where a large number of features are missing. We obtained satisfactory results on several real-world tasks, suggesting that the low-rank assumption may not be as restrictive as it seems. Our method allows different loss functions to be applied to the feature and label entries of the matrix. The resulting nuclear norm minimization problem is solved with a modified fixed-point continuation method that is guaranteed to find the global optimum.
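To make the abstract's approach concrete, here is a minimal sketch of nuclear-norm matrix completion via a fixed-point (forward-backward) iteration with singular-value shrinkage. This is an assumption-laden illustration, not the paper's algorithm: the function names, the single squared loss on observed entries, and all parameter values are hypothetical, whereas the paper's method additionally uses separate losses for the feature and label entries.

```python
# Hypothetical sketch: nuclear-norm matrix completion by
# fixed-point continuation with singular-value shrinkage.
# Not the paper's exact method (which mixes feature/label losses).
import numpy as np

def svd_shrink(Z, tau):
    """Singular-value soft-thresholding: the prox of tau * ||.||_* ."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s = np.maximum(s - tau, 0.0)   # shrink each singular value by tau
    return (U * s) @ Vt

def complete(M, mask, mu=0.05, step=1.0, iters=500):
    """Approximately minimize mu*||X||_* + 0.5*||mask*(X - M)||_F^2."""
    X = np.zeros_like(M)
    for _ in range(iters):
        grad = mask * (X - M)              # gradient of the data-fit term
        X = svd_shrink(X - step * grad, mu * step)
    return X

rng = np.random.default_rng(0)
# Rank-2 ground truth with roughly half the entries observed.
A = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 20))
mask = (rng.random(A.shape) < 0.5).astype(float)
X = complete(A * mask, mask)
```

The shrinkage step is what encourages low rank: each iteration soft-thresholds the singular values, so small ones vanish, while the gradient step keeps the observed entries close to their values.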
[1] Andreas Argyriou, Charles A. Micchelli, and Massimiliano Pontil. On spectral learning. Journal of Machine Learning Research, 11:935–953, 2010.
[2] Emmanuel J. Candès and Benjamin Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9:717–772, 2009.
[3] Emmanuel J. Candès and Terence Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56:2053–2080, 2010.
[4] Olivier Chapelle, Alexander Zien, and Bernhard Schölkopf, editors. Semi-supervised learning. MIT Press, 2006.
[5] André Elisseeff and Jason Weston. A kernel method for multi-labelled classification. In Thomas G. Dietterich, Suzanna Becker, and Zoubin Ghahramani, editors, NIPS, pages 681–687. MIT Press, 2001.
[6] Zoubin Ghahramani and Michael I. Jordan. Supervised learning from incomplete data via an EM approach. In Advances in Neural Information Processing Systems 6, pages 120–127. Morgan Kaufmann, 1994.
[7] Daniel Hsu, Sham Kakade, John Langford, and Tong Zhang. Multi-label prediction via compressed sensing. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 772–780. 2009.
[8] Shuiwang Ji, Lei Tang, Shipeng Yu, and Jieping Ye. Extracting shared subspace for multi-label classification. In KDD ’08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 381–389, New York, NY, USA, 2008. ACM.
[9] Roderick J. A. Little and Donald B. Rubin. Statistical Analysis with Missing Data. Wiley-Interscience, 2nd edition, September 2002.
[10] Shiqian Ma, Donald Goldfarb, and Lifeng Chen. Fixed point and Bregman iterative methods for matrix rank minimization. Mathematical Programming Series A, to appear (published online September 23, 2009).
[11] Guillaume Obozinski, Ben Taskar, and Michael I. Jordan. Joint covariate selection and joint subspace selection for multiple classification problems. Statistics and Computing, 20(2):231–252, 2010.
[12] Piyush Rai and Hal Daumé III. Multi-label prediction via sparse infinite CCA. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 1518–1526. 2009.
[13] Nathan Srebro and Adi Shraibman. Rank, trace-norm and max-norm. In Proceedings of the 18th Annual Conference on Learning Theory, pages 545–560. Springer-Verlag, 2005.
[14] K. Trohidis, G. Tsoumakas, G. Kalliris, and I. Vlahavas. Multilabel classification of music into emotions. In Proc. 9th International Conference on Music Information Retrieval (ISMIR 2008), Philadelphia, PA, USA, 2008.
[15] G. Tsoumakas, I. Katakis, and I. Vlahavas. Mining multi-label data. In Data Mining and Knowledge Discovery Handbook. Springer, 2nd edition, 2010.
[16] Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi-Supervised Learning. Morgan & Claypool, 2009.