cvpr cvpr2013 cvpr2013-189 cvpr2013-189-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Song Cao, Noah Snavely
Abstract: Recognizing the location of a query image by matching it to a database is an important problem in computer vision, and one for which the representation of the database is a key issue. We explore new ways for exploiting the structure of a database by representing it as a graph, and show how the rich information embedded in a graph can improve a bag-of-words-based location recognition method. In particular, starting from a graph on a set of images based on visual connectivity, we propose a method for selecting a set of subgraphs and learning a local distance function for each using discriminative techniques. For a query image, each database image is ranked according to these local distance functions in order to place the image in the right part of the graph. In addition, we propose a probabilistic method for increasing the diversity of these ranked database images, again based on the structure of the image graph. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.
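The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes each selected subgraph comes with an already-learned linear scoring function (a weight vector and bias over bag-of-words histograms) and that a database image takes the best score among the subgraphs it belongs to. The function name `rank_database_images`, the linear form of the local distance function, and the max-over-subgraphs rule are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# One subgraph: (member image ids, weight vector, bias). All hypothetical.
Neighborhood = Tuple[List[int], Vector, float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_database_images(
    query: Vector, neighborhoods: List[Neighborhood], num_images: int
) -> List[int]:
    """Rank database image ids by their best per-subgraph score (assumed rule)."""
    best = [float("-inf")] * num_images
    for members, weights, bias in neighborhoods:
        # Assumed linear local scoring function for this subgraph.
        score = dot(weights, query) + bias
        for image_id in members:
            best[image_id] = max(best[image_id], score)
    # Higher score means the query more likely belongs near that image.
    return sorted(range(num_images), key=lambda i: best[i], reverse=True)


# Toy data: a 3-word vocabulary, 4 database images, 2 subgraphs.
toy_neighborhoods: List[Neighborhood] = [
    ([0, 1], [1.0, 0.0, -1.0], 0.0),
    ([2, 3], [-1.0, 0.0, 1.0], 0.0),
]
print(rank_database_images([0.9, 0.1, 0.0], toy_neighborhoods, 4))
```

In this toy setup, a query dominated by the first visual word scores highest under the first subgraph, so images 0 and 1 are ranked ahead of images 2 and 3. The diversity step mentioned in the abstract is not covered by this sketch.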
[1] S. Agarwal, N. Snavely, I. Simon, S. Seitz, and R. Szeliski. Building Rome in a day. In ICCV, 2009.
[2] R. Arandjelovic and A. Zisserman. Three things everyone should know to improve object retrieval. In CVPR, 2012.
[3] S. Cao and N. Snavely. Learning to match images in large-scale collections. In ECCV Workshop on Web-scale Vision and Social Media, 2012.
[4] O. Chum and J. Matas. Large-scale discovery of spatially related images. PAMI, 2010.
[5] O. Chum and J. Matas. Unsupervised discovery of co-occurrence in sparse high dimensional data. In CVPR, 2010.
[6] C. Doersch, S. Singh, A. Gupta, J. Sivic, and A. A. Efros. What makes Paris look like Paris? SIGGRAPH, 2012.
[7] J.-M. Frahm et al. Building Rome on a cloudless day. In ECCV, 2010.
[8] A. Frome, Y. Singer, F. Sha, and J. Malik. Learning globally-consistent local distance functions for shape-based image retrieval and classification. In ICCV, 2007.
[9] S. Guha and S. Khuller. Approximation algorithms for connected dominating sets. Algorithmica, 1998.
[10] M. Havlena, A. Torii, and T. Pajdla. Efficient structure from motion by graph optimization. In ECCV, 2010.
[11] J. Hays and A. Efros. Im2gps: estimating geographic information from a single image. In CVPR, 2008.
[12] A. Irschara, C. Zach, J. Frahm, and H. Bischof. From structure-from-motion point clouds to fast location recognition. In CVPR, 2009.
[13] E. Johns and G. Yang. From images to scenes: Compressing an image cluster into a single scene model for place recognition. In ICCV, 2011.
[14] J. Knopp, J. Sivic, and T. Pajdla. Avoiding confusing features in place recognition. In ECCV, 2010.
[15] X. Li, C. Wu, C. Zach, S. Lazebnik, and J. Frahm. Modeling and recognition of landmark image collections using iconic scene graphs. In ECCV, 2008.
[16] Y. Li, D. Crandall, and D. Huttenlocher. Landmark classification in large-scale image collections. In ICCV, 2009.
[17] Y. Li, N. Snavely, and D. Huttenlocher. Location recognition using prioritized feature matching. In ECCV, 2010.
[18] Y. Li, N. Snavely, D. Huttenlocher, and P. Fua. Worldwide pose estimation using 3d point clouds. In ECCV, 2012.
[19] T. Malisiewicz and A. Efros. Beyond categories: The visual memex model for reasoning about object relationships. NIPS, 2009.
[20] T. Malisiewicz, A. Gupta, and A. A. Efros. Ensemble of exemplar-SVMs for object detection and beyond. In ICCV, 2011.
[21] A. Mikulik, M. Perdoch, O. Chum, and J. Matas. Learning a fine vocabulary. In ECCV, 2010.
[22] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, 2007.
[23] J. Platt et al. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 1999.
[24] T. Sattler, B. Leibe, and L. Kobbelt. Fast image-based localization using direct 2D-to-3D matching. In ICCV, 2011.
[25] T. Sattler, B. Leibe, and L. Kobbelt. Improving image-based localization by active correspondence search. In ECCV, 2012.
[26] T. Sattler, T. Weyand, B. Leibe, and L. Kobbelt. Image retrieval for image-based localization revisited. In BMVC, 2012.
[27] G. Schindler, M. Brown, and R. Szeliski. City-scale location recognition. In CVPR, 2007.
[28] A. Shrivastava, T. Malisiewicz, A. Gupta, and A. A. Efros. Data-driven visual similarity for cross-domain image matching. SIGGRAPH ASIA, 2011.
[29] J. Sivic and A. Zisserman. Video google: A text retrieval approach to object matching in videos. In ICCV, 2003.
[30] A. Torii, J. Sivic, and T. Pajdla. Visual localization by linear combination of image descriptors. In ICCV Workshops, 2011.
[31] P. Turcot and D. Lowe. Better matching with fewer features: The selection of useful features in large database recognition problems. In Workshop on Emergent Issues in Large Amounts of Visual Data, ICCV, 2009.
[32] Y. Yue and C. Guestrin. Linear submodular bandits and their application to diversified retrieval. In NIPS, 2011.
[33] Y.-T. Zheng et al. Tour the world: building a web-scale landmark recognition engine. In CVPR, 2009.