iccv iccv2013 iccv2013-53 iccv2013-53-reference knowledge-graph by maker-knowledge-mining
Author: Naman Turakhia, Devi Parikh
Abstract: When we look at an image, some properties or attributes of the image stand out more than others. When describing an image, people are likely to describe these dominant attributes first. Attribute dominance is a result of a complex interplay between the various properties present or absent in the image. Which attributes in an image are more dominant than others reveals rich information about the content of the image. In this paper we tap into this information by modeling attribute dominance. We show that this helps improve the performance of vision systems on a variety of human-centric applications such as zero-shot learning, image search and generating textual descriptions of images.
[1] A. Berg, T. Berg, H. Daume, J. Dodge, A. Goyal, X. Han, A. Mensch, M. Mitchell, A. Sood, K. Stratos, et al. Understanding and predicting importance in images. In CVPR, 2012.
[2] T. Berg, A. Berg, and J. Shih. Automatic attribute discovery and characterization from noisy web data. In ECCV, 2010.
[3] A. Biswas and D. Parikh. Simultaneous active learning of classifiers & attributes via relative feedback. In CVPR, 2013.
[4] S. Branson, C. Wah, B. Babenko, F. Schroff, P. Welinder, P. Perona, and S. Belongie. Visual recognition with humans in the loop. In ECCV, 2010.
[5] M. Douze, A. Ramisa, and C. Schmid. Combining attributes and fisher vectors for efficient image retrieval. In CVPR, 2011.
[6] L. Elazary and L. Itti. Interesting objects are visually salient. J. of Vision, 8(3), 2008.
[7] A. Farhadi, I. Endres, and D. Hoiem. Attribute-centric recognition for cross-category generalization. In CVPR, 2010.
[8] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth. Describing objects by their attributes. In CVPR, 2009.
[9] A. Farhadi, M. Hejrati, A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth. Every picture tells a story: Generating sentences for images. In ECCV, 2010.
[10] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. PAMI, 2010.
[11] V. Ferrari and A. Zisserman. Learning visual attributes. In NIPS, 2007.
[12] S. Hwang and K. Grauman. Learning the relative importance of objects from tagged images for retrieval and cross-modal search. IJCV, 2011.
[13] S. J. Hwang and K. Grauman. Reading between the lines: Object localization using implicit cues from image tags. PAMI, 2012.
[14] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. PAMI, 1998.
[15] T. Judd, K. Ehinger, F. Durand, and A. Torralba. Learning to predict where humans look. In ICCV, 2009.
[16] A. Kovashka, D. Parikh, and K. Grauman. Whittlesearch: Image search with relative attribute feedback. In CVPR, 2012.
[17] G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg. Baby talk: Understanding and generating simple image descriptions. In CVPR, 2011.
[18] N. Kumar, P. Belhumeur, and S. Nayar. Facetracer: A search engine for large collections of images with faces. In ECCV, 2010.
[19] N. Kumar, A. Berg, P. Belhumeur, and S. Nayar. Attribute and simile classifiers for face verification. In ICCV, 2009.
[20] C. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In CVPR, 2009.
[21] M. Naphade, J. Smith, J. Tesic, S. Chang, W. Hsu, L. Kennedy, A. Hauptmann, and J. Curtis. Large-scale concept ontology for multimedia. IEEE Multimedia, 2006.
[22] V. Ordonez, G. Kulkarni, and T. Berg. Im2text: Describing images using 1 million captioned photographs. In NIPS, 2011.
[23] D. Parikh and K. Grauman. Interactively building a discriminative vocabulary of nameable attributes. In CVPR, 2011.
[24] D. Parikh and K. Grauman. Relative attributes. In ICCV, 2011.
[25] A. Parkash and D. Parikh. Attributes for classifier feedback. In ECCV, 2012.
[26] N. Rasiwasia, P. Moreno, and N. Vasconcelos. Bridging the gap: Query by semantic example. IEEE Trans. on Multimedia, 2007.
[27] A. Sadovnik, A. C. Gallagher, D. Parikh, and T. Chen. Spoken attributes: Mixing binary and relative attributes to say the right thing. In ICCV, 2013.
[28] B. Siddiquie, R. S. Feris, and L. S. Davis. Image ranking and retrieval based on multi-attribute queries. In CVPR, 2011.
[29] J. Smith, M. Naphade, and A. Natsev. Multimedia semantic indexing using model vectors. In ICME, 2003.
[30] M. Spain and P. Perona. Measuring and predicting object importance. IJCV, 91(1), 2011.
[31] G. Wang and D. Forsyth. Joint learning of visual attributes, object classes and visual saliency. In ICCV, 2009.
[32] G. Wang, D. Forsyth, and D. Hoiem. Comparative object similarity for improved recognition with few or no examples. In CVPR, 2010.
[33] J. Wang, K. Markert, and M. Everingham. Learning models for object recognition from natural language descriptions. In BMVC, 2009.
[34] X. Wang, K. Liu, and X. Tang. Query-specific visual semantic spaces for web image re-ranking. In CVPR, 2011.
[35] Y. Yang, C. Teo, H. Daumé III, and Y. Aloimonos. Corpus-guided sentence generation of natural images. In EMNLP, 2011.
[36] E. Zavesky and S.-F. Chang. Cuzero: Embracing the frontier of interactive visual search for informed users. In Proceedings of ACM Multimedia Information Retrieval, 2008.