293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?

Source: pdf

Author: Mohammad Rastegari, Ali Diba, Devi Parikh, Ali Farhadi

Abstract: Users often have very specific visual content in mind that they are searching for. The most natural way to communicate this content to an image search engine is to use keywords that specify various properties or attributes of the content. A naive way of dealing with such multi-attribute queries is the following: train a classifier for each attribute independently, and then combine their scores on images to judge their fit to the query. We argue that this may not be the most effective or efficient approach. Conjunctions of attributes often correspond to very characteristic appearances. It would thus be beneficial to train classifiers that detect these conjunctions as a whole. But not all conjunctions result in such tight appearance clusters. So given a multi-attribute query, which conjunctions should we model? An exhaustive evaluation of all possible conjunctions would be time-consuming. Hence we propose an optimization approach that identifies beneficial conjunctions without explicitly training the corresponding classifier. It reasons about geometric quantities that capture notions similar to intra- and inter-class variances. We exploit a discriminative binary space to compute these geometric quantities efficiently. Experimental results on two challenging datasets of objects and birds show that our proposed approach can improve performance significantly over several strong baselines, while being an order of magnitude faster than exhaustively searching through all possible conjunctions.
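
The naive baseline described in the abstract, and the idea of merging attributes into a conjunction, can be illustrated with a minimal Python sketch. It ranks images by summing independent per-attribute classifier scores, builds the labels a merged (conjunction) classifier would be trained on, and enumerates the conjunctions an exhaustive search would have to consider. The function names, the toy scores and labels, and the use of a plain sum as the combination rule are assumptions made for illustration; the paper's optimization for choosing which conjunctions to merge is not reproduced here.

```python
from itertools import combinations


def rank_by_combined_scores(scores, query):
    """Rank image indices by the sum of independent per-attribute scores.

    scores: dict mapping attribute name -> list of scores, one per image.
    query:  list of attribute names that make up the query.
    """
    n_images = len(scores[query[0]])
    combined = [sum(scores[a][i] for a in query) for i in range(n_images)]
    # Highest combined score first; ties are broken by image index.
    return sorted(range(n_images), key=lambda i: (-combined[i], i))


def conjunction_labels(labels, attrs):
    """Label an image positive only if every attribute in attrs is present."""
    n_images = len(labels[attrs[0]])
    return [int(all(labels[a][i] for a in attrs)) for i in range(n_images)]


def candidate_conjunctions(query):
    """All attribute subsets of size >= 2 that an exhaustive search would train."""
    return [c for r in range(2, len(query) + 1) for c in combinations(query, r)]


# Toy data (hypothetical): four images, two attributes.
scores = {"white": [0.9, 0.2, 0.6, 0.1], "furry": [0.3, 0.8, 0.7, 0.2]}
labels = {"white": [1, 0, 1, 0], "furry": [0, 1, 1, 0]}

ranking = rank_by_combined_scores(scores, ["white", "furry"])
merged = conjunction_labels(labels, ["white", "furry"])
candidates = candidate_conjunctions(["white", "furry", "small"])

print(ranking)          # [2, 0, 1, 3]
print(merged)           # [0, 0, 1, 0]
print(len(candidates))  # 4
```

For a query with n attributes there are 2^n - n - 1 such candidate conjunctions, which is why the abstract calls exhaustive evaluation time-consuming.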

reference text

[1] M. Douze, A. Ramisa, and C. Schmid. Combining attributes and Fisher vectors for efficient image retrieval. In CVPR, 2011. 2

[2] K. Duan, D. Parikh, D. J. Crandall, and K. Grauman. Discovering localized attributes for fine-grained recognition. In CVPR, 2012. 4

[3] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth. Describing Objects by their Attributes. In CVPR, 2009. 1, 4

[4] A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999. 2, 5

[5] Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, 2011. 2, 5

[6] M. P. Kumar, B. Packer, and D. Koller. Self-Paced Learning for Latent Variable Models. In NIPS, 2010. 5

[7] C. H. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In CVPR, 2009. 1

[8] C. Li, D. Parikh, and T. Chen. Automatic discovery of groups of objects for scene understanding. In CVPR, 2012. 2

[9] M. Naphade, J. Smith, J. Tesic, S. Chang, W. Hsu, L. Kennedy, A. Hauptmann, and J. Curtis. Large-scale concept ontology for multimedia. IEEE Multimedia, 13(3), 2006. 2

[10] N. Rasiwasia, P. Moreno, and N. Vasconcelos. Bridging the gap: Query by semantic example. IEEE Trans. Multimedia, 9(5), 2007. 2

[11] M. Rastegari, A. Farhadi, and D. A. Forsyth. Attribute discovery via predictable discriminative binary codes. In ECCV (6), 2012. 2, 5

[12] M. A. Sadeghi and A. Farhadi. Recognition Using Visual Phrases. In CVPR, 2011. 2

[13] R. Salakhutdinov and G. Hinton. Semantic hashing. Int. J. Approx. Reasoning, 2009. 2

[14] B. Saleh, A. Farhadi, and A. Elgammal. Object-centric anomaly detection by attribute-based reasoning. In CVPR, 2013. 2

[15] W. Scheirer, N. Kumar, P. N. Belhumeur, and T. E. Boult. Multi-attribute spaces: Calibration for attribute fusion and similarity search. In CVPR, 2012. 2

[16] B. Siddiquie, R. S. Feris, and L. S. Davis. Image Ranking and Retrieval based on Multi-Attribute Queries. In CVPR, 2011. 2

[17] J. Smith, M. Naphade, and A. Natsev. Multimedia semantic indexing using model vectors. In ICME, 2003. 2

[18] A. Wagner, J. Wright, A. Ganesh, Z. Zhou, H. Mobahi, and Y. Ma. Towards a Practical Face Recognition System: Robust Alignment and Illumination by Sparse Representation. IEEE PAMI, 2011. 4

[19] X. Wang, K. Liu, and X. Tang. Query-specific visual semantic spaces for web image re-ranking. In CVPR, 2011. 2

[20] Y. Weiss, R. Fergus, and A. Torralba. Multidimensional spectral hashing. In ECCV (5), 2012. 2

[21] P. Welinder, S. Branson, T. Mita, C. Wah, F. Schroff, S. Belongie, and P. Perona. Caltech-UCSD Birds 200. Technical report, California Institute of Technology, 2010. 4

[22] E. Zavesky and S.-F. Chang. Cuzero: Embracing the frontier of interactive visual search for informed users. In ACM MIR, 2008. 2