nips nips2011 nips2011-112 nips2011-112-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yangqing Jia, Trevor Darrell
Abstract: Many applications in computer vision measure the similarity between images or image patches based on some statistics such as oriented gradients. These are often modeled implicitly or explicitly with a Gaussian noise assumption, leading to the use of the Euclidean distance when comparing image descriptors. In this paper, we show that the statistics of gradient-based image descriptors often follow a heavy-tailed distribution, which undermines any principled motivation for the use of Euclidean distances. We advocate for the use of a distance measure based on the likelihood ratio test with appropriate probabilistic models that fit the empirical data distribution. We instantiate this similarity measure with the Gamma-compound-Laplace distribution, and show significant improvement over existing distance measures in the application of SIFT feature matching, at relatively low computational cost.
[1] A Agarwal and H Daume III. Generative kernels for exponential families. In AISTATS, 2011.
[2] A Banerjee, S Merugu, I Dhillon, and J Ghosh. Clustering with Bregman divergences. JMLR, 6:1705–1749, 2005.
[3] JT Barron and J Malik. High-frequency shape and albedo from shading using natural image statistics. In CVPR, 2011.
[4] O Boiman, E Shechtman, and M Irani. In defense of nearest-neighbor based image classification. In CVPR, 2008.
[5] P Chen, Y Chen, and M Rao. Metrics defined by Bregman divergences. Communications in Mathematical Sciences, 6(4):915–926, 2008.
[6] N Dalal and B Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.
[7] AC Davison. Statistical models. Cambridge Univ Press, 2003.
[8] J Huang, AB Lee, and D Mumford. Statistics of range images. In CVPR, 2000.
[9] S Ji, Y Xue, and L Carin. Bayesian compressive sensing. IEEE Trans. Signal Processing, 56(6):2346–2356, 2008.
[10] Y Jia, M Salzmann, and T Darrell. Factorized latent spaces with structured sparsity. In NIPS, 2010.
[11] D Koller and N Friedman. Probabilistic graphical models. MIT Press, 2009.
[12] B Kulis and T Darrell. Learning to hash with binary reconstructive embeddings. In NIPS, 2009.
[13] S Lazebnik, C Schmid, and J Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.
[14] D Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.
[15] J Mairal, F Bach, J Ponce, and G Sapiro. Online learning for matrix factorization and sparse coding. JMLR, 11:19–60, 2010.
[16] AW Moore. The anchors hierarchy: using the triangle inequality to survive high dimensional data. In UAI, 2000.
[17] A Oliva and A Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(3):145–175, 2001.
[18] BA Olshausen and DJ Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607–609, 1996.
[19] M Ozuysal and P Fua. Fast keypoint recognition in ten lines of code. In CVPR, 2007.
[20] J Portilla, V Strela, MJ Wainwright, and EP Simoncelli. Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Trans. Image Processing, 12(11):1338–1351, 2003.
[21] M Riesenhuber and T Poggio. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2:1019–1025, 1999.
[22] G Shakhnarovich, P Viola, and T Darrell. Fast pose estimation with parameter-sensitive hashing. In ICCV, 2003.
[23] EP Simoncelli. Statistical models for images: compression, restoration and synthesis. In Asilomar Conference on Signals, Systems & Computers, 1997.
[24] Y Weiss and WT Freeman. What makes a good model of natural images? In CVPR, 2007.
[25] S Winder and M Brown. Learning local image descriptors. In CVPR, 2007.
[26] L Wolf, T Hassner, and Y Taigman. The one-shot similarity kernel. In ICCV, 2009.