
130 nips-2006-Max-margin classification of incomplete data


Source: pdf

Author: Gal Chechik, Geremy Heitz, Gal Elidan, Pieter Abbeel, Daphne Koller

Abstract: We consider the problem of learning classifiers for structurally incomplete data, where some objects have a subset of features inherently absent due to complex relationships between the features. The common approach for handling missing features is to begin with a preprocessing phase that completes the missing features, and then use a standard classification procedure. In this paper we show how incomplete data can be classified directly without any completion of the missing features using a max-margin learning framework. We formulate this task using a geometrically-inspired objective function, and discuss two optimization approaches: the linearly separable case is written as a set of convex feasibility problems, and the non-separable case has a non-convex objective that we optimize iteratively. By avoiding the pre-processing phase in which the data is completed, these approaches offer considerable computational savings. More importantly, we show that by elegantly handling complex patterns of missing values, our approach is both competitive with other methods when the values are missing at random and outperforms them when the missing values have non-trivial structure. We demonstrate our results on two real-world problems: edge prediction in metabolic pathways, and automobile detection in natural images.
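The geometric idea in the abstract can be sketched numerically: each sample's margin is measured only in the subspace of its observed features, so no imputation is required. The following is a minimal illustrative sketch (hypothetical NumPy code, not the authors' implementation); the function and variable names are assumptions for illustration, with NaN marking a missing feature.

```python
import numpy as np

def instance_margins(w, b, X, y):
    """Per-instance geometric margin restricted to each sample's
    observed features: margin_i = y_i (w_o . x_o + b) / ||w_o||,
    where the subscript o selects the non-NaN entries of x_i."""
    margins = np.empty(len(y))
    for i, (x, yi) in enumerate(zip(X, y)):
        obs = ~np.isnan(x)            # mask of observed features
        w_o = w[obs]                  # weight vector in the observed subspace
        score = x[obs] @ w_o + b
        margins[i] = yi * score / np.linalg.norm(w_o)
    return margins

# Toy data: the second sample is missing its second feature.
X = np.array([[1.0, 2.0],
              [-1.0, np.nan],
              [2.0, -1.0]])
y = np.array([1, -1, 1])
w = np.array([1.0, 0.5])
print(instance_margins(w, 0.0, X, y))
```

Note that the per-instance normalization by the subspace norm of w is what makes margins of samples with different missing-value patterns comparable; maximizing the minimum of these margins is what yields the non-convex objective mentioned above.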


reference text

[1] A. Berg, T. Berg, and J. Malik. Shape matching and object recognition using low distortion correspondence. In CVPR, 2005.

[2] P. H. Calamai and J. J. Moré. Projected gradient methods for linearly constrained problems. Mathematical Programming, 39(1):93–116, 1987.

[3] J. Forster, I. Famili, P. Fu, B. Palsson, and J. Nielsen. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Research, 13(2):244–253, February 2003.

[4] Z. Ghahramani and M. I. Jordan. Supervised learning from incomplete data via an EM approach. In J. D. Cowan, G. Tesauro, and J. Alspector, editors, NIPS, volume 6, pages 120–127, 1994.

[5] K. Grauman and T. Darrell. Pyramid match kernels: Discriminative classification with sets of image features. In ICCV, 2005.

[6] W.K. Huh, J.V. Falvo, L.C. Gerke, A.S. Carroll, R.W. Howson, J.S. Weissman, and E.K. O’Shea. Global analysis of protein localization in budding yeast. Nature, 425:686–691, 2003.

[7] J. Ihmels, R. Levy, and N. Barkai. Principles of transcriptional control in the metabolic network of Saccharomyces cerevisiae. Nature Biotechnology, 22:86–92, 2003.

[8] D. Jensen and J. Neville. Linkage and autocorrelation cause feature selection bias in relational learning. In ICML, 2002.

[9] P. Kharchenko, D. Vitkup, and G. M. Church. Filling gaps in a metabolic network using expression information. Bioinformatics, 20:i178–i185, 2003.

[10] R. J. A. Little and D. B. Rubin. Statistical Analysis with Missing Data. Wiley, New York, 1987.

[11] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret. Applications of second-order cone programming. Linear Algebra and its Applications, 284:193–228, 1998.

[12] A. Quattoni, M. Collins, and T. Darrell. Conditional random fields for object recognition. In L. K. Saul, Y. Weiss, and L. Bottou, editors, NIPS 17, pages 1097–1104, 2005.

[13] B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.

[14] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, 1995.

[15] J. P. Vert and Y. Yamanishi. Supervised graph inference. In L. K. Saul, Y. Weiss, and L. Bottou, editors, NIPS 17, pages 1433–1440, 2004.