jmlr jmlr2008 jmlr2008-58 jmlr2008-58-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Gal Chechik, Geremy Heitz, Gal Elidan, Pieter Abbeel, Daphne Koller
Abstract: We consider the problem of learning classifiers in structured domains, where some objects have a subset of features that are inherently absent due to complex relationships between the features. Unlike the case where a feature exists but its value is not observed, here we focus on the case where a feature may not even exist (structurally absent) for some of the samples. The common approach for handling missing features in discriminative models is to first complete their unknown values, and then use a standard classification procedure over the completed data. This paper focuses on features that are known to be non-existing, rather than have an unknown value. We show how incomplete data can be classified directly without any completion of the missing features using a max-margin learning framework. We formulate an objective function, based on the geometric interpretation of the margin, that aims to maximize the margin of each sample in its own relevant subspace. In this formulation, the linearly separable case can be transformed into a binary search over a series of second order cone programs (SOCP), a convex problem that can be solved efficiently. We also describe two approaches for optimizing the general case: an approximation that can be solved as a standard quadratic program (QP) and an iterative approach for solving the exact problem. By avoiding the pre-processing phase in which the data is completed, both of these approaches could offer considerable computational savings. More importantly, we show that the elegant handling of missing values by our approach allows it to both outperform other methods when the missing values have non-trivial structure, and be competitive with other methods when the values are missing at random. We demonstrate our results on several standard benchmarks and two real-world problems: edge prediction in metabolic pathways, and automobile detection in natural images. Keywords: max margin, missing features, network reconstruction, metabolic pa
A. Berg, T. Berg, and J. Malik. Shape matching and object recognition using low distortion correspondence. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2005. G. Elidan, G. Heitz, and D. Koller. Learning object shape: From drawings to images. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2006. 19 C HECHIK , H EITZ , E LIDAN , A BBEEL AND KOLLER R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 264–271, 2003. J. Forster, I. Famili, P. Fu, B.. Palsson, and J. Nielsen. Genome-scale reconstruction of the saccharomyces cerevisiae metabolic network. Genome Research, 13(2):244–253, February 2003. Z. Ghahramani and M. I. Jordan. Supervised learning from incomplete data via an EM approach. In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, Advances in Neural Information Processing Systems, volume 6, pages 120–127. Morgan Kaufmann Publishers, Inc., 1994. K. Grauman and T. Darrell. Pyramid match kernels: Discriminative classification with sets of image features. In International Conference on Computer Vision, October 2005. W.K. Huh, J.V. Falvo, L.C. Gerke, A.S. Carroll, R.W. Howson, J.S. Weissman, and E.K. O’Shea. Global analysis of protein localization in budding yeast. Nature, 425:686–691, 2003. N. Hulo, A. Bairoch, V. Bulliard, L. Cerutti, E. De Castro, P.S. Langendijk-Genevaux, M. Pagni, and C.J.A. Sigrist. The prosite database. Nucleic Acids Res., 34:D227–D230, 2006. J. Ihmels, R. Levy, and N. Barkai. Principles of transcriptional control in the metabolic network of saccharomyces cerevisiae. Nature Biotechnology, 22:86–92, 2003. A. Jaimovich, G. Elidan, H. Margalit, and N. Friedman. Towards an integrated protein-protein interaction network. In RECOMB 2005, 2005. D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In 19th international confernce on Machine learning (ICML 2002), 2002. A. Kapoor. Learning Discriminative Models with Incomplete Data. PhD thesis, MIT Media Lab, Feb 2006 2006. P. Kharchenko, D. Vitkup, and GM Church. Filling gaps in a metabolic network using expression information. Bioinformatics, 2003. P. Kharchenko, GM Church, and D. Vitkup. Expression dynamics of a cellular metabolic network. Molecular Systems Biology, 2005. Y. LeCun. MNIST Database of Handwritten Digits. http://yann.lecun.com/exdb/mnist/. NEC Research, 1998. URL R.J.A. Little and D.B. Rubin. Statistical Analysis with Missing Data. NY wiley, 1987. M. Sousa Lobo, L. Vandenbergheb, S. Boyd, and H. Lebret. Applications of second-order cone programming. Linear Algebra and its Applications, 284(1-3):193–228, 1998. S. K. Pannagadatta, C. Bhattacharyya, and A.J. Smola. Second order cone programming approaches for handling missing and uncertain data. Journal of Machine Learning Research, 7:1283–1314, 2006. 20 M AX - MARGIN C LASSIFICATION OF DATA WITH A BSENT F EATURES A. Quattoni, M. Collins, and T. Darrell. Conditional random fields for object recognition. In Lawrence K. Saul, Yair Weiss, and L´ on Bottou, editors, Advances in Neural Information Proe cessing Systems 17, pages 1097–1104, Cambridge, MA, 2005. MIT Press. P. Roth. Missing data: A conceptual review for applied psychologists. Personnel Psychology, 47 (3):537–560, 1994. J. L. Schafer. Analysis of Incomplete Multivariate Data. Chapman and Hall, 1997. B. Sch¨ lkopf and A.J. Smola. Learning with Kernels: Support Vector Machines, Regularization o Optimization and Beyond. MIT Press, Cambridge, MA, 2002. V.N. Vapnik. The Nature of Statistical Learning Theory. SpringerVerlag, 1995. J. P. Vert and Y. Yamanishi. Supervised graph inference. In L. K. Saul, Y. Weiss, and L. Bottou, editors, Advances in Neural Information Processing Systems 17, pages 1433–1440, Cambridge, MA, 2004. MIT Press. 21