nips nips2011 nips2011-254 nips2011-254-reference knowledge-graph by maker-knowledge-mining

254 nips-2011-Similarity-based Learning via Data Driven Embeddings


Source: pdf

Author: Purushottam Kar, Prateek Jain

Abstract: We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [1] and [2]. An attractive feature of our framework is its adaptability to data - we do not promote a fixed notion of goodness but rather let data dictate it. We show, by giving theoretical guarantees that the goodness criterion best suited to a problem can itself be learned which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method as well as the landmark selection heuristic on a variety of similarity-based learning datasets and benchmark UCI datasets on which our method consistently outperforms existing approaches by a significant margin. 1


reference text

[1] Maria-Florina Balcan and Avrim Blum. On a Theory of Learning with Similarity Functions. In International Conference on Machine Learning, pages 73–80, 2006.

[2] Liwei Wang, Cheng Yang, and Jufu Feng. On Learning with Dissimilarity Functions. In International Conference on Machine Learning, pages 991–998, 2007.

[3] Piotr Indyk and Nitin Thaper. Fast Image Retrieval via Embeddings. In International Workshop Statistical and Computational Theories of Vision, 2003.

[4] El˙ bieta Pekalska and Robert P. W. Duin. On Combining Dissimilarity Representations. In Multiple z ¸ Classifier Systems, pages 359–368, 2001.

[5] Yihua Chen, Eric K. Garcia, Maya R. Gupta, Ali Rahimi, and Luca Cazzanti. Similarity-based Classification: Concepts and Algorithms. Journal of Machine Learning Research, 10:747–776, 2009.

[6] Cheng Soon Ong, Xavier Mary, St´ phane Canu, and Alexander J. Smola. Learning with non-positive e Kernels. In International Conference on Machine Learning, 2004.

[7] Bernard Haasdonk. Feature Space Interpretation of SVMs with Indefinite Kernels. IEEE Transactions on Pattern Analysis and Machince Intelligence, 27(4):482–492, 2005.

[8] Thore Graepel, Ralf Herbrich, Peter Bollmann-Sdorra, and Klaus Obermayer. Classification on Pairwise Proximity Data. In Neural Information Processing Systems, page 438444, 1998.

[9] Maria-Florina Balcan, Avrim Blum, and Nathan Srebro. Improved Guarantees for Learning via Similarity Functions. In 21st Annual Conference on Computational Learning Theory, pages 287–298, 2008.

[10] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research, 9:1871–1874, 2008.

[11] Manik Varma and Bodla Rakesh Babu. More Generality in Efficient Multiple Kernel Learning. In 26th Annual International Conference on Machine Learning, pages 1065–1072, 2009.

[12] Prateek Jain, Brian Kulis, Jason V. Davis, and Inderjit S. Dhillon. Metric and Kernel Learning using a Linear Transformation. To appear, Journal of Machine Learning (JMLR), 2011.

[13] Maria-Florina Balcan, Avrim Blum, and Santosh Vempala. Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings. Machine Learning, 65(1):79–94, 2006.

[14] Nathan Srebro. How Good Is a Kernel When Used as a Similarity Measure? In 20th Annual Conference on Computational Learning Theory, pages 323–335, 2007.

[15] M. R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the theory of NP-Completeness. Freeman, San Francisco, 1979.

[16] Sanjeev Arora, L´ szl´ Babai, Jacques Stern, and Z. Sweedyk. The Hardness of Approximate Optima in a o Lattices, Codes, and Systems of Linear Equations. Journal of Computer and System Sciences, 54(2):317– 331, April 1997.

[17] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27, 2011.

[18] Krithika Venkataramani and B. V. K. Vijaya Kumar. Designing classifiers for fusion-based biometric verification. In Plataniotis Boulgouris and Micheli-Tzankou, editors, Biometrics: Theory, Methods and Applications. Springer, 2009.

[19] A. Frank and Arthur Asuncion. UCI Machine Learning Repository. http://archive.ics.uci. edu/ml, 2010. University of California, Irvine, School of Information and Computer Sciences. 9