jmlr jmlr2010 jmlr2010-59 jmlr2010-59-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Gal Chechik, Varun Sharma, Uri Shalit, Samy Bengio
Abstract: Learning a measure of similarity between pairs of objects is an important generic problem in machine learning. It is particularly useful in large scale applications like searching for an image that is similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are not only visually similar but also semantically related to a given object. Unfortunately, the approaches that exist today for learning such semantic similarity do not scale to large data sets. This is both because typically their CPU and storage requirements grow quadratically with the sample size, and because many methods impose complex positivity constraints on the space of learned similarity functions. The current paper presents OASIS, an Online Algorithm for Scalable Image Similarity learning that learns a bilinear similarity measure over sparse representations. OASIS is an online dual approach using the passive-aggressive family of learning algorithms with a large margin criterion and an efficient hinge loss cost. Our experiments show that OASIS is both fast and accurate at a wide range of scales: for a data set with thousands of images, it achieves better results than existing state-of-the-art methods, while being an order of magnitude faster. For large, web scale, data sets, OASIS can be trained on more than two million images from 150K text queries within 3 days on a single CPU. On this large scale data set, human evaluations showed that 35% of the ten nearest neighbors of a given test image, as found by OASIS, were semantically relevant to that image. This suggests that query independent similarity could be accurately learned even for large scale data sets that could not be handled before. Keywords: large scale, metric learning, image similarity, online learning ∗. Varun Sharma and Uri Shalit contributed equally to this work. †. Also at ICNC, The Hebrew University of Jerusalem, 91904, Israel. c 2010 Gal Chechik, Varun Sharma, Uri Shalit
A. Bar-Hillel, T. Hertz, N. Shental, and D. Weinshall. Learning distance functions using equivalence relations. In Proc. of 20th International Conference on Machine Learning (ICML), page 11, 2003. L. Bottou. Large-scale machine learning and stochastic algorithms. In NIPS 2008 Workshop on Optimization for Machine Learning, 2008. Y. Chen, E.K. Garcia, M.R. Gupta, A. Rahimi, and L. Cazzanti. Similarity-based classification: Concepts and algorithms. The Journal of Machine Learning Research, 10:747–776, 2009. K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research (JMLR), 7:551–585, 2006. J.V. Davis, B. Kulis, P. Jain, S. Sra, and I.S. Dhillon. Information-theoretic metric learning. In Proceedings of the 24th international conference on Machine learning, pages 209–216. ACM Press New York, NY, USA, 2007. 1133 C HECHIK , S HARMA , S HALIT AND B ENGIO P. Duygulu, K. Barnard, N. de Freitas, and D. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In European Conference on Computer Vision (ECCV), pages 97–112, 2002. S.L. Feng, R. Manmatha, and V. Lavrenko. Multiple Bernoulli relevance models for image and video annotation. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2004. A. Frome, Y. Singer, F. Sha, and J. Malik. Learning globally-consistent local distance functions for shape-based image retrieval and classification. In International Conference on Computer Vision, pages 1–8, 2007. A. Globerson and S. Roweis. Metric learning by collapsing classes. Advances in Neural Information Processing Systems, 18:451, 2006. D. Grangier and S. Bengio. A discriminative kernel-based model to rank images from text queries. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 30(8):1371–1384, 2008. D. Grangier, F. Monay, and S. Bengio. Learning to retrieve images from text queries with a discriminative model. In International Conference on Adaptive Multimedia Retrieval (AMR), 2006. G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. Technical Report 7694, California Institute of Technology, 2007. R. Hadsell, S. Chopra, and Y. LeCun. Dimensionality reduction by learning an invariant mapping. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, 2006. P. Jain, B. Kulis, I. Dhillon, and K. Grauman. Online metric learning and fast similarity search. In Advances in Neural Information Processing Systems, volume 22, 2008a. P. Jain, B. Kulis, and K. Grauman. Fast image search for learned metrics. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2008b. J. Jeon and R. Manmatha. Using maximum entropy for automatic image annotation. In International Conference on Image and Video Retrieval, pages 24–32, 2004. B. Kulis, M.A. Sustik, and I.S. Dhillon. Low-rank kernel learning with bregman matrix divergences. Journal of Machine Learning Research, 10:341–376, 2009. G.R.G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, and M.I. Jordan. Learning the kernel matrix with semidefinite programming. Journal of Machine Learning Research (JMLR), 5:27– 72, 2004. D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV), 60(2):91–110, 2004. W.S. Noble. Multi-kernel learning for biology. In NIPS 2008 workshop on kernel learning, 2008. 1134 L ARGE S CALE O NLINE L EARNING OF I MAGE S IMILARITY T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 24(7):971–987, 2002. P. Quelhas, F. Monay, J. M. Odobez, D. Gatica-Perez, T. Tuytelaars, and L. J. Van Gool. Modeling scenes with local descriptors and latent aspects. In International Conference on Computer Vision, pages 883–890, 2005. N. Rasiwasia and N. Vasconcelos. A study of query by semantic example. In 3rd International Workshop on Semantic Learning and Applications in Multimedia, 2008. R. Rosales and G. Fung. Learning sparse metrics via linear programming. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 367–373. ACM New York, NY, USA, 2006. M. Schultz and T. Joachims. Learning a distance metric from relative comparisons. In Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference. Bradford Book, 2004. V. Takala, T. Ahonen, and M. Pietikainen. Block-based methods for image retrieval using local binary patterns. In Scandinavian Conference on Image Analysis (SCIA), 2005. K. Tieu and P. Viola. Boosting image retrieval. International Journal of Computer Vision (IJCV), 56(1):17 – 36, 2004. A. Torralba, R. Fergus, and W. T. Freeman. Tiny images. Technical Report MIT-CSAIL-TR-2007024, Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 2007. URL http://dspace.mit.edu/handle/1721.1/37291. A. Tversky. Features of similarity. Psychological Review, 84(4):327–352, 1977. K. Weinberger, J. Blitzer, and L. Saul. Distance metric learning for large margin nearest neighbor classification. Advances in Neural Information Processing Systems, 18:1473, 2006. K.Q. Weinberger and L.K. Saul. Fast solvers and efficient implementations for distance metric learning. In ICML25, pages 1160–1167, 2008. E.P. Xing, A.Y. Ng, M.I. Jordan, and S. Russell. Distance metric learning with application to clustering with side-information. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 521–528, Cambridge, MA, 2003. MIT Press. L. Yang. Distance metric learning: A comprehensive survey. Technical report, Michigan State University, 2006. 1135