
Image Retrieval and Classification Using Local Distance Functions (NIPS 2006)

Source: pdf

Author: Andrea Frome, Yoram Singer, Jitendra Malik

Abstract: In this paper we introduce and experiment with a framework for learning local perceptual distance functions for visual recognition. We learn a distance function for each training image as a combination of elementary distances between patch-based visual features. We apply these combined local distance functions to the tasks of image retrieval and classification of novel images. On the Caltech 101 object recognition benchmark, we achieve 60.3% mean recognition across classes using 15 training images per class, which is better than the best published performance by Zhang et al.
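The idea of a per-image distance built from elementary patch distances can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the elementary distance is taken to be the Euclidean distance from each patch feature of the focal (training) image to its closest patch feature in the candidate image, the combination is a non-negative weighted sum, and the names `elementary_distances`, `local_distance`, and `rank_by_focal` are hypothetical. How the weights are learned is not shown.

```python
import math


def elementary_distances(focal_patches, candidate_patches):
    """For each focal patch, the distance to its closest candidate patch.

    Assumption: Euclidean distance between patch feature vectors.
    """
    return [
        min(math.dist(p, q) for q in candidate_patches)
        for p in focal_patches
    ]


def local_distance(weights, focal_patches, candidate_patches):
    """Weighted sum of elementary distances for one focal image.

    `weights` holds one (assumed non-negative) weight per focal patch.
    """
    dists = elementary_distances(focal_patches, candidate_patches)
    if len(weights) != len(dists):
        raise ValueError("need exactly one weight per focal patch")
    return sum(w * d for w, d in zip(weights, dists))


def rank_by_focal(weights, focal_patches, candidates):
    """Candidate names sorted by increasing distance from the focal image."""
    return sorted(
        candidates,
        key=lambda name: local_distance(weights, focal_patches, candidates[name]),
    )
```

Because each training image carries its own weight vector, the resulting distance is specific to that image and is not symmetric: the distance from image A to image B generally differs from the distance from B to A.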

reference text

[1] I. Biederman, “Recognition-by-components: A theory of human image understanding,” Psychological Review, vol. 94, no. 2, pp. 115–147, 1987.

[2] C. Schmid and R. Mohr, “Combining greyvalue invariants with local constraints for object recognition,” in CVPR, 1996.

[3] D. Lowe, “Object recognition from local scale-invariant features,” in ICCV, pp. 1150–1157, Sep 1999.

[4] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” PAMI, vol. 24, pp. 509–522, April 2002.

[5] A. Berg and J. Malik, “Geometric blur for template matching,” in CVPR, pp. 607–614, 2001.

[6] E. Xing, A. Ng, and M. Jordan, “Distance metric learning with application to clustering with side-information,” in NIPS, 2002.

[7] M. Schultz and T. Joachims, “Learning a distance metric from relative comparisons,” in NIPS, 2003.

[8] S. Shalev-Shwartz, Y. Singer, and A. Ng, “Online and batch learning of pseudo-metrics,” in ICML, 2004.

[9] K. Q. Weinberger, J. Blitzer, and L. K. Saul, “Distance metric learning for large margin nearest neighbor classification,” in NIPS, 2005.

[10] A. Globerson and S. Roweis, “Metric learning by collapsing classes,” in NIPS, 2005.

[11] H. Zhang, A. Berg, M. Maire, and J. Malik, “SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition,” in CVPR, 2006.

[12] Y. Censor and S. A. Zenios, Parallel Optimization: Theory, Algorithms, and Applications. Oxford University Press, 1998.

[13] A. Berg, T. Berg, and J. Malik, “Shape matching and object recognition using low distortion correspondence,” in CVPR, 2005.

[14] S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories,” in CVPR, 2006.

[15] K. Grauman and T. Darrell, “Pyramid match kernels: Discriminative classification with sets of image features (version 2),” Tech. Rep. MIT CSAIL TR 2006-020, MIT, March 2006.

[16] J. Mutch and D. G. Lowe, “Multiclass object recognition with sparse, localized features,” in CVPR, 2006.

[17] E. L. Allwein, R. E. Schapire, and Y. Singer, “Reducing multiclass to binary: A unifying approach for margin classifiers,” JMLR, vol. 1, pp. 113–141, 2000.

[18] L. Fei-Fei, R. Fergus, and P. Perona, “Learning generative visual models from few training examples: an incremental Bayesian approach testing on 101 object categories,” in Workshop on Generative-Model Based Vision, CVPR, 2004.

[19] G. Wang, Y. Zhang, and L. Fei-Fei, “Using dependent regions for object categorization in a generative framework,” in CVPR, 2006.

[20] A. D. Holub, M. Welling, and P. Perona, “Combining generative models and Fisher kernels for object recognition,” in ICCV, 2005.

[21] T. Serre, L. Wolf, and T. Poggio, “Object recognition with features inspired by visual cortex,” in CVPR, 2005.

To further speed up comparisons, in place of an exact nearest neighbor computation, we could use approximate nearest neighbor algorithms such as locality-sensitive hashing or spill trees.
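As a rough illustration of the locality-sensitive hashing idea mentioned above, the following sketch hashes feature vectors by the signs of their projections onto random hyperplanes and searches only the query's bucket. This is one common LSH family (random-hyperplane hashing, which is geared toward angular similarity), chosen here as an assumption; the source does not say which scheme would be used. The class name `HyperplaneLSH` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy single-table LSH index using random-hyperplane sign bits."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = random.Random(seed)
        # One random Gaussian hyperplane normal per hash bit.
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)
        ]
        self.buckets = defaultdict(list)

    def _key(self, vec):
        # Each bit records which side of a hyperplane the vector falls on.
        return tuple(
            sum(a * b for a, b in zip(plane, vec)) >= 0.0
            for plane in self.planes
        )

    def add(self, label, vec):
        self.buckets[self._key(vec)].append((label, vec))

    def query(self, vec):
        """Label of the closest stored vector in the query's bucket, or None.

        Only one bucket is scanned, so the true nearest neighbor can be missed.
        """
        bucket = self.buckets.get(self._key(vec), [])
        if not bucket:
            return None
        return min(
            bucket,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[1], vec)),
        )[0]
```

A practical index would typically use several hash tables to reduce the chance of missing a close neighbor; the single table here keeps the sketch short.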