cvpr cvpr2013 cvpr2013-275 cvpr2013-275-reference knowledge-graph by maker-knowledge-mining

275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search


Source: pdf

Author: Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian

Abstract: The Inverse Document Frequency (IDF) is prevalently utilized in Bag-of-Words based image search. The basic idea is to assign less weight to terms with high frequency, and vice versa. However, the estimation of visual word frequency is coarse and heuristic. Therefore, the effectiveness of the conventional IDF routine is marginal, and far from optimal. To tackle this problem, this paper introduces a novel IDF expression by the use of the Lp-norm pooling technique. Carefully designed, the proposed IDF takes into account the term frequency, document frequency, the complexity of images, as well as the codebook information. Optimizing the IDF function towards an optimal balance between TF and pIDF weights yields the so-called Lp-norm IDF (pIDF). We show that the conventional IDF is a special case of our generalized version, and two novel IDFs, i.e. the average IDF and the max IDF, can also be derived from our formula. Further, by accounting for the term frequency in each image, the proposed Lp-norm IDF helps to alleviate the visual word burstiness phenomenon. Our method is evaluated through extensive experiments on three benchmark datasets (Oxford 5K, Paris 6K and Flickr 1M). We report a performance improvement of as large as 27.1% over the baseline approach. Moreover, since the Lp-norm IDF is computed offline, no extra computation or memory cost is introduced to the system at all.
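The abstract describes pooling per-image term frequencies with an Lp norm so that the resulting IDF weight reflects burstiness as well as document frequency. The exact pIDF expression (which also folds in image complexity and codebook information) is not reproduced in this listing, so the snippet below is only a minimal sketch of the pooling idea: the function names (conventional_idf, lp_pooled_idf), the toy term-frequency matrix, and the particular way the Lp norm replaces the binary document count are illustrative assumptions, not the authors' formula.

import numpy as np

def conventional_idf(tf, eps=1e-12):
    """Conventional IDF: log(N / n_w), where n_w is the number of
    images (documents) in which visual word w appears at least once.

    tf: (N_images, K_words) array of per-image visual-word counts.
    Returns a (K_words,) array of IDF weights.
    """
    n_images = tf.shape[0]
    doc_freq = np.count_nonzero(tf > 0, axis=0)  # n_w for each visual word
    return np.log(n_images / (doc_freq + eps))

def lp_pooled_idf(tf, p=2.0, eps=1e-12):
    """Illustrative Lp-norm pooled IDF (an assumption, not the paper's
    exact pIDF expression): the binary occurrence indicator used by the
    conventional IDF is replaced by an Lp pooling of each visual word's
    per-image term frequencies, so a word that bursts in a few images
    is down-weighted more strongly.
    """
    n_images = tf.shape[0]
    pooled = np.power(np.power(tf.astype(float), p).sum(axis=0), 1.0 / p)
    return np.log(n_images / (pooled + eps))

# Toy usage: 4 images, 3 visual words.
tf = np.array([[5, 1, 0],
               [0, 1, 0],
               [0, 1, 2],
               [0, 0, 2]])
print(conventional_idf(tf))    # bursty word 0 gets the highest IDF here
print(lp_pooled_idf(tf, p=2))  # Lp pooling penalizes word 0's burst

In this sketch, setting p = 1 and binarizing the counts makes the pooled value equal to the document frequency, recovering the conventional IDF; this illustrates (under the stated assumptions) the sense in which the conventional IDF can be viewed as a special case of an Lp-pooled weight.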


reference text

[1] X. Boix, G. Roig, and L. Van Gool. Nested sparse quantization for efficient feature coding. In ECCV, 2012.

[2] Y. Cai, W. Tong, L. Yang, and A. Hauptmann. Constrained keypoint quantization: towards better bag-of-words model for large-scale multimedia retrieval. In ICMR, 2012.

[3] J. Feng, B. Ni, Q. Tian, and S. Yan. Geometric Lp-norm feature pooling for image classification. In CVPR, 2011.

[4] H. Jégou, M. Douze, and C. Schmid. Hamming embedding and weak geometric consistency for large scale image search. In ECCV, 2008.

[5] H. Jégou, M. Douze, and C. Schmid. On the burstiness of visual elements. In CVPR, 2009.

[6] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.

[7] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 2004.

[8] D. Nistér and H. Stewenius. Scalable recognition with a vocabulary tree. In CVPR, 2006.

[9] M. Perd’och, O. Chum, and J. Matas. Efficient representation of local geometry for large scale object retrieval. In CVPR, 2009.

[10] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman. Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR, 2008.

[11] J. Philbin, O. Chum, M. Isard, and A. Zisserman. Object retrieval with large vocabularies and fast spatial matching. In CVPR, 2007.

[12] S. Robertson and S. Walker. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR, 1994.

[13] K. Simonyan, A. Vedaldi, and A. Zisserman. Descriptor learning using convex optimisation. In ECCV, 2012.

[14] A. Singhal. Modern information retrieval: A brief overview. IEEE Data Eng. Bull., 2001.

[15] J. Sivic and A. Zisserman. Video Google: a text retrieval approach to object matching in videos. In ICCV, 2003.

[16] J. C. van Gemert, C. J. Veenman, and A. W. M. Smeulders. Visual word ambiguity. PAMI, 2010.

[17] X. Wang, M. Yang, T. Cour, S. Zhu, K. Yu, and T. X. Han. Contextual weighting for vocabulary tree based image retrieval. In ICCV, 2011.

[18] Y. Yang, F. Nie, D. Xu, J. Luo, Y. Zhuang, and Y. Pan. A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. PAMI, 2012.

[19] S. Zhang, Q. Huang, G. Hua, S. Jiang, W. Gao, and Q. Tian. Building contextual visual vocabulary for large-scale image applications. In ACM MM, 2010.

[20] S. Zhang, Q. Tian, G. Hua, Q. Huang, and S. Li. Descriptive visual words and visual phrases for image applications. In ACM MM, 2009.

[21] X. Zhang, Z. Li, L. Zhang, W. Ma, and H.-Y. Shum. Efficient indexing for large scale visual search. In ICCV, 2009.

[22] L. Zheng and S. Wang. Visual phraselet: Refining spatial constraints for large scale image search. IEEE Signal Processing Letters, 20(4):391–394, 2013.

[23] W. Zhou, Y. Lu, H. Li, Y. Song, and Q. Tian. Spatial coding for large scale partial-duplicate web image search. In ACM MM, 2010.

[24] W. Zhou, Y. Lu, H. Li, and Q. Tian. Scalar quantization for large scale image search. In ACM MM, 2012.