241 cvpr-2013-Label-Embedding for Attribute-Based Classification

Source: pdf

Author: Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid

Abstract: Attributes are an intermediate representation, which enables parameter sharing between classes, a must when training data is scarce. We propose to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors. We introduce a function which measures the compatibility between an image and a label embedding. The parameters of this function are learned on a training set of labeled samples to ensure that, given an image, the correct classes rank higher than the incorrect ones. Results on the Animals With Attributes and Caltech-UCSD-Birds datasets show that the proposed framework outperforms the standard Direct Attribute Prediction baseline in a zero-shot learning scenario. The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.
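The compatibility idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so the bilinear score x^T W phi(y) used here is an assumption, as are the names `compatibility`, `predict`, and `ranking_hinge_loss`, the toy feature vector, and the made-up attribute values. The hinge-style ranking loss is a generic stand-in for "correct classes rank higher than incorrect ones", not necessarily the objective used by the authors.

```python
from typing import Dict, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def compatibility(x: Vector, W: Matrix, phi: Vector) -> float:
    """Assumed bilinear score x^T W phi between image features x and a class embedding phi."""
    return sum(
        x[i] * W[i][j] * phi[j]
        for i in range(len(x))
        for j in range(len(phi))
    )


def predict(x: Vector, W: Matrix, class_embeddings: Dict[str, Vector]) -> str:
    """Return the class whose attribute embedding is most compatible with x."""
    return max(
        class_embeddings,
        key=lambda c: compatibility(x, W, class_embeddings[c]),
    )


def ranking_hinge_loss(
    x: Vector,
    W: Matrix,
    class_embeddings: Dict[str, Vector],
    true_class: str,
    margin: float = 1.0,
) -> float:
    """Sum of margin violations over all incorrect classes.

    The loss is zero when the correct class outscores every other class
    by at least `margin`.
    """
    s_true = compatibility(x, W, class_embeddings[true_class])
    return sum(
        max(0.0, margin + compatibility(x, W, phi) - s_true)
        for c, phi in class_embeddings.items()
        if c != true_class
    )


# Hypothetical data: 2-D image features, 3 binary attributes per class.
class_embeddings = {
    "zebra": [1.0, 0.0, 1.0],
    "whale": [0.0, 1.0, 0.0],
}
W = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
x = [2.0, 0.5]

print(predict(x, W, class_embeddings))
print(ranking_hinge_loss(x, W, class_embeddings, "zebra"))
```

Because a class enters the score only through its attribute vector, a class with no training images can still be scored at test time by adding its attribute vector to `class_embeddings`, which is the zero-shot setting the abstract refers to.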

reference text

[1] Y. Amit, M. Fink, N. Srebro, and S. Ullman. Uncovering shared structures in multiclass classification. In ICML, 2007. 2, 3

[2] S. Bengio, J. Weston, and D. Grangier. Label embedding trees for large multi-class tasks. In NIPS, 2010. 2, 3

[3] S. Branson, C. Wah, B. Babenko, F. Schroff, P. Welinder, P. Perona, and S. Belongie. Visual recognition with humans in the loop. In ECCV, 2010. 6

[4] K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman. The devil is in the details: an evaluation of recent feature encoding methods. In BMVC, 2011. 2, 5

[5] S. Clinchant, G. Csurka, F. Perronnin, and J.-M. Renders. XRCE's participation to ImageEval. In ImageEval Workshop at CVIR, 2007. 5

[6] T. Deselaers and V. Ferrari. Visual and semantic similarity in ImageNet. In CVPR, 2011. 2

[7] M. Douze, A. Ramisa, and C. Schmid. Combining attributes and Fisher vectors for efficient image retrieval. In CVPR, 2011. 2

[8] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth. Describing objects by their attributes. In CVPR, 2009. 1, 2, 3

[9] V. Ferrari and A. Zisserman. Learning visual attributes. In NIPS, 2007. 2

[10] B. Geng, L. Yang, C. Xu, and X.-S. Hua. Ranking model adaptation for domain-specific search. IEEE TKDE, 2012. 4

[11] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning (2nd Ed.). Springer Series in Statistics. Springer, 2008. 3

[12] D. Hsu, S. Kakade, J. Langford, and T. Zhang. Multi-label prediction via compressed sensing. In NIPS, 2009. 2

[13] H. Jégou, M. Douze, and C. Schmid. Product quantization for nearest neighbor search. IEEE TPAMI, 2011. 2, 5

[14] G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. Berg, and T. Berg. Baby talk: understanding and generating simple image descriptions. In CVPR, 2011. 2

[15] N. Kumar, P. Belhumeur, and S. Nayar. FaceTracer: A search engine for large collections of images with faces. In ECCV, 2008. 2

[16] C. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In CVPR, 2009. 1, 2, 3, 4, 5, 6

[17] H. Larochelle, D. Erhan, and Y. Bengio. Zero-data learning of new tasks. In AAAI, 2008. 1, 2, 3

[18] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60:91–110, 2004. 5

[19] D. Mahajan, S. Sellamanickam, and V. Nair. A joint learning framework for attribute models and object descriptions. In ICCV, 2011. 2

[20] S. Maji and A. Berg. Max-margin additive classifiers for detection. In ICCV, 2009. 2

[21] T. Mensink, J. Verbeek, and G. Csurka. Tree-structured CRF models for interactive image labeling. IEEE TPAMI, 2012. 2

[22] T. Mensink, J. Verbeek, F. Perronnin, and G. Csurka. Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In ECCV, 2012. 2

[23] V. Ordonez, G. Kulkarni, and T. Berg. Im2Text: Describing images using 1 million captioned photographs. In NIPS, 2011. 2

[24] D. Osherson, J. Stern, O. Wilkie, M. Stob, and E. Smith. Default probability. Cognitive Science, 1991. 4

[25] M. Palatucci, D. Pomerleau, G. Hinton, and T. Mitchell. Zero-shot learning with semantic output codes. In NIPS, 2009. 1, 2, 3, 5

[26] F. Perronnin, J. Sánchez, and T. Mensink. Improving the Fisher kernel for large-scale image classification. In ECCV, 2010. 2, 5

[27] M. Rohrbach, M. Stark, and B. Schiele. Evaluating knowledge transfer and zero-shot learning in a large-scale setting. In CVPR, 2011. 2

[28] M. Rohrbach, M. Stark, G. Szarvas, I. Gurevych, and B. Schiele. What helps where – and why? Semantic relatedness for knowledge transfer. In CVPR, 2010. 2

[29] M. Saerens, F. Fouss, L. Yen, and P. Dupont. The principal components analysis of a graph, and its relationships to spectral clustering. In ECML, 2004. 4

[30] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image classification. In CVPR, 2011. 2

[31] V. Sharmanska, N. Quadrianto, and C. H. Lampert. Augmented attribute representations. In ECCV, 2012. 2

[32] J. Shawe-Taylor and N. Cristianini. Kernel Methods for Pattern Analysis. Cambridge Univ. Press, 2004. 2

[33] B. Siddiquie, R. Feris, and L. Davis. Image ranking and retrieval based on multi-attribute queries. In CVPR, 2011. 2

[34] I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, 2005. 3, 4

[35] A. Vedaldi and A. Zisserman. Efficient additive kernels via explicit feature maps. In CVPR, 2010. 2

[36] A. Vedaldi and A. Zisserman. Sparse kernel approximations for efficient classification and detection. In CVPR, 2012. 2

[37] C. Wah, S. Branson, P. Perona, and S. Belongie. Multiclass recognition and part localization with humans in the loop. In ICCV, 2011. 2, 4, 6

[38] G. Wang and D. Forsyth. Joint learning of visual attributes, object classes and visual saliency. In ICCV, 2009. 2

[39] Y. Wang and G. Mori. A discriminative latent model of object classes and attributes. In ECCV, 2010. 2

[40] K. Weinberger and O. Chapelle. Large margin taxonomy embedding for document categorization. In NIPS, 2008. 2, 3

[41] J. Weston, S. Bengio, and N. Usunier. Large scale image annotation: Learning to rank with joint word-image embeddings. In ECML, 2010. 2, 3, 4, 7

[42] J. Weston, O. Chapelle, A. Elisseeff, B. Schölkopf, and V. Vapnik. Kernel dependency estimation. In NIPS, 2002. 2

[43] X. Yu and Y. Aloimonos. Attribute-based transfer learning for object categorization with zero or one training example. In ECCV, 2010. 2