CVPR 2013, paper 456: reference list
Source: pdf
Author: Akihiko Torii, Josef Sivic, Tomáš Pajdla, Masatoshi Okutomi
Abstract: Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval. It is based on robust detection of repeated image structures and a simple modification of weights in the bag-of-visual-words model. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco, demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline and the more recently proposed burstiness weighting.
[1] B. Aguera y Arcas. Augmented reality using Bing maps, 2010. Talk at TED 2010.
[2] R. Arandjelović and A. Zisserman. Three things everyone should know to improve object retrieval. In CVPR, 2012.
[3] D. Chen, G. Baatz, et al. City-scale landmark identification on mobile devices. In CVPR, 2011.
[4] O. Chum, A. Mikulik, M. Perdoch, and J. Matas. Total recall II: Query expansion revisited. In CVPR, 2011.
[5] O. Chum, M. Perdoch, and J. Matas. Geometric minhashing: Finding a (thick) needle in a haystack. In CVPR, 2009.
[6] O. Chum, J. Philbin, J. Sivic, M. Isard, and A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In ICCV, 2007.
[7] M. Cummins and P. Newman. Highly scalable appearance-only SLAM - FAB-MAP 2.0. In Proceedings of Robotics: Science and Systems, Seattle, USA, June 2009.
[8] P. Doubek, J. Matas, M. Perdoch, and O. Chum. Image matching and retrieval by repetitive patterns. In ICPR, 2010.
[9] D. Hauagge and N. Snavely. Image matching using local symmetry features. In CVPR, 2012.
[10] J. Hays, M. Leordeanu, A. Efros, and Y. Liu. Discovering texture regularity as a higher-order correspondence problem. In ECCV, 2006.
[11] A. Irschara, C. Zach, J. Frahm, and H. Bischof. From structure-from-motion point clouds to fast location recognition. In CVPR, 2009.
[12] H. Jégou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large-scale image search. In ECCV, 2008.
[13] H. Jégou, M. Douze, and C. Schmid. On the burstiness of visual elements. In CVPR, 2009.
[14] H. Jégou, M. Douze, and C. Schmid. Product quantization for nearest neighbor search. PAMI, 33(1): 117–128, 2011.
[15] H. Jégou, H. Harzallah, and C. Schmid. A contextual dissimilarity measure for accurate and efficient image search. In CVPR, 2007.
[16] H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez, and C. Schmid. Aggregating local image descriptors into compact codes. PAMI, 34(9): 1704–1716, 2012.
[17] S. Katz. Distribution of content words and phrases in text and language modelling. Natural Language Engineering, 2(1): 15–59, 1996.
[18] J. Knopp, J. Sivic, and T. Pajdla. Avoiding confusing features in place recognition. In ECCV, 2010.
[19] T. Leung and J. Malik. Detecting, localizing and grouping repeated scene elements from an image. In ECCV, 1996.
[20] Y. Li, N. Snavely, and D. Huttenlocher. Location recognition using prioritized feature matching. In ECCV, 2010.
[21] A. Mikulik, M. Perdoch, O. Chum, and J. Matas. Learning a fine vocabulary. In ECCV, 2010.
[22] M. Muja and D. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP, 2009.
[23] P. Muller, G. Zeng, P. Wonka, and L. Van Gool. Image-based procedural modeling of facades. ACM TOG, 26(3):85, 2007.
[24] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In CVPR, 2006.
[25] M. Park, K. Brocklehurst, R. Collins, and Y. Liu. Deformed lattice detection in real-world images using mean-shift belief propagation. PAMI, 31(10): 1804–1816, 2009.
[26] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, 2007.
[27] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.
[28] J. Philbin, M. Isard, J. Sivic, and A. Zisserman. Descriptor learning for efficient retrieval. In ECCV, 2010.
[29] J. Philbin, J. Sivic, and A. Zisserman. Geometric latent dirichlet allocation on a matching graph for large-scale image datasets. IJCV, 2010.
[30] A. Pothen and C.-J. Fan. Computing the block triangular form of a sparse matrix. ACM Transactions on Mathematical Software, 16(4):303–324, 1990.
[31] T. Quack, B. Leibe, and L. Van Gool. World-scale mining of objects and events from community photo collections. In Proc. CIVR, 2008.
[32] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5), 1988.
[33] T. Sattler, B. Leibe, and L. Kobbelt. SCRAMSAC: Improving RANSAC’s efficiency with a spatial consistency filter. In ICCV, 2009.
[34] T. Sattler, T. Weyand, B. Leibe, and L. Kobbelt. Image retrieval for image-based localization revisited. In BMVC, 2012.
[35] F. Schaffalitzky and A. Zisserman. Geometric grouping of repeated elements within images. In BMVC, 1998.
[36] F. Schaffalitzky and A. Zisserman. Automated location matching in movies. CVIU, 92:236–264, 2003.
[37] G. Schindler, M. Brown, and R. Szeliski. City-scale location recognition. In CVPR, 2007.
[38] G. Schindler, P. Krishnamurthy, R. Lublinerman, Y. Liu, and F. Dellaert. Detecting and matching repeated patterns for automatic geo-tagging in urban environments. In CVPR, 2008.
[39] C. Schmid and R. Mohr. Local greyvalue invariants for image retrieval. PAMI, 19(5):530–534, 1997.
[40] J. Sivic and A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In ICCV, 2003.
[41] O. Teboul, L. Simon, P. Koutsourakis, and N. Paragios. Segmentation of building facades using procedural shape priors. In CVPR, 2010.
[42] A. Torii, J. Sivic, and T. Pajdla. Visual localization by linear combination of image descriptors. In Proceedings of the 2nd IEEE Workshop on Mobile Vision, with ICCV, 2011.
[43] P. Turcot and D. Lowe. Better matching with fewer features: The selection of useful features in large database recognition problem. In WS-LAVD, ICCV, 2009.
[44] J. C. van Gemert, C. J. Veenman, A. W. Smeulders, and J. M. Geusebroek. Visual word ambiguity. PAMI, 32(7): 1271–1283, 2010.
[45] C. Wu, J. Frahm, and M. Pollefeys. Detecting large repetitive structures with salient boundaries. In ECCV, 2010.
[46] C. Wu, J.-M. Frahm, and M. Pollefeys. Repetition-based dense single-view reconstruction. In CVPR, 2011.
[47] A. Zamir and M. Shah. Accurate image localization based on google maps street view. In ECCV, 2010.
[48] Y. Zhang, Z. Jia, and T. Chen. Image retrieval with geometry-preserving visual phrases. In CVPR, 2011.