
306 nips-2012-Semantic Kernel Forests from Multiple Taxonomies


Author: Sung J. Hwang, Kristen Grauman, Fei Sha

Abstract: When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient. While an object taxonomy specifying the categories’ semantic relationships could bolster the learning process, not all relationships are relevant to a given visual classification task, nor does a single taxonomy capture all ties that are relevant. In light of these issues, we propose a discriminative feature learning approach that leverages multiple hierarchical taxonomies representing different semantic views of the object categories (e.g., for animal classes, one taxonomy could reflect their phylogenic ties, while another could reflect their habitats). For each taxonomy, we first learn a tree of semantic kernels, where each node has a Mahalanobis kernel optimized to distinguish between the classes in its children nodes. Then, using the resulting semantic kernel forest, we learn class-specific kernel combinations to select only those relationships relevant to recognize each object class. To learn the weights, we introduce a novel hierarchical regularization term that further exploits the taxonomies’ structure. We demonstrate our method on challenging object recognition datasets, and show that interleaving multiple taxonomic views yields significant accuracy improvements.
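
The two building blocks named in the abstract, a Mahalanobis kernel per taxonomy node and a weighted combination of those kernels, can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF-style kernel form, the `gamma` parameter, the function names, and the hand-picked metrics and weights are all assumptions made for illustration. In the paper the metrics and the class-specific weights are learned, the latter with a hierarchical regularizer that is not shown here.

```python
import numpy as np

def mahalanobis_rbf_kernel(X, Y, M, gamma=1.0):
    """Assumed kernel form: k(x, y) = exp(-gamma * (x - y)^T M (x - y)).

    X has shape (n, d), Y has shape (m, d), and M is a (d, d) symmetric
    positive semidefinite matrix. Returns an (n, m) kernel matrix.
    """
    diff = X[:, None, :] - Y[None, :, :]  # pairwise differences, shape (n, m, d)
    sq_dist = np.einsum("nmd,de,nme->nm", diff, M, diff)
    return np.exp(-gamma * sq_dist)

def combine_kernels(kernels, weights):
    """Non-negative weighted sum of precomputed kernel matrices."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data: 5 samples with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

# Two hypothetical node metrics, standing in for metrics learned at
# two nodes of the taxonomies. A @ A.T is symmetric PSD by construction.
A = rng.normal(size=(3, 3))
node_metrics = [np.eye(3), A @ A.T]

kernels = [mahalanobis_rbf_kernel(X, X, M) for M in node_metrics]

# Hypothetical weights for one object class; the paper learns these.
K = combine_kernels(kernels, [0.7, 0.3])
```

Because each node kernel is positive semidefinite and the weights are non-negative, the combined matrix `K` is itself a valid kernel and could be passed to any classifier that accepts a precomputed kernel.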

reference text

[1] N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. In CVPR, 2005.

[2] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-Constrained Linear Coding for Image Classification. In CVPR, 2010.

[3] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, May 1998.

[4] A. Zweig and D. Weinshall. Exploiting Object Hierarchy: Combining Models from Different Category Levels. In ICCV, 2007.

[5] M. Marszalek and C. Schmid. Semantic hierarchies for visual object recognition. In CVPR, 2007.

[6] A. Torralba, R. Fergus, and W. T. Freeman. 80 million Tiny Images: a Large Dataset for Non-Parametric Object and Scene Recognition. PAMI, 30(11):1958–1970, 2008.

[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A Large-Scale Hierarchical Image Database. In CVPR, 2009.

[8] R. Fergus, H. Bernal, Y. Weiss, and A. Torralba. Semantic label sharing for learning with many categories. In ECCV, 2010.

[9] J. Deng, A. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? In ECCV, 2010.

[10] S. J. Hwang, K. Grauman, and F. Sha. Learning a tree of metrics with disjoint visual features. In NIPS, 2011.

[11] N. Verma, D. Mahajan, S. Sellamanickam, and V. Nair. Learning hierarchical similarity metrics. In CVPR, 2012.

[12] S. Bengio, J. Weston, and D. Grangier. Label Embedding Trees for Large Multi-Class Tasks. In NIPS, 2010.

[13] J. Deng, S. Satheesh, A. Berg, and L. Fei-Fei. Fast and balanced: Efficient label tree learning for large scale object recognition. In NIPS, 2011.

[14] C. Lampert, H. Nickisch, and S. Harmeling. Learning to Detect Unseen Object Classes by Between-Class Attribute Transfer. In CVPR, 2009.

[15] M. Marszalek and C. Schmid. Constructing category hierarchies for visual recognition. In ECCV, 2008.

[16] G. Griffin and P. Perona. Learning and using taxonomies for fast visual categorization. In CVPR, 2008.

[17] T. Gao and D. Koller. Discriminative learning of relaxed hierarchy for large-scale visual recognition. In ICCV, 2011.

[18] J. Sivic, B. Russell, A. Zisserman, W. Freeman, and A. Efros. Unsupervised discovery of visual object class hierarchies. In CVPR, 2008.

[19] E. Bart, I. Porteous, P. Perona, and M. Welling. Unsupervised learning of visual taxonomies. In CVPR, 2008.

[20] L.-J. Li, C. Wang, Y. Lim, D. Blei, and L. Fei-Fei. Building and using a semantivisual image hierarchy. In CVPR, 2010.

[21] S. Kim and E. Xing. Tree-guided group lasso for multi-task regression with structured sparsity. In ICML, 2010.

[22] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor. Canonical Correlation Analysis: An Overview with Application to Learning Methods. Neural Computation, 16(12), 2004.

[23] A. Blum and T. Mitchell. Combining Labeled and Unlabeled Data with Co-training. In COLT: Proceedings of the Workshop on Computational Learning Theory, 1998.

[24] C. Christoudias, K. Saenko, L. Morency, and T. Darrell. Co-adaptation of audio-visual speech and gesture classifiers. In International Conference on Multimodal Interaction, 2006.

[25] I. Dhillon, S. Mallela, and R. Kumar. A divisive information-theoretic feature clustering algorithm for text classification. Journal of Machine Learning Research, 3:1265–1287, 2003.

[26] A. Gupta and S. Dasgupta. Hybrid hierarchical clustering: Forming a tree from multiple views. In Workshop on Learning With Multiple Views, 2005.

[27] A. Argyriou, T. Evgeniou, and M. Pontil. Multi-task feature learning. In NIPS, 2006.

[28] N. Loeff and A. Farhadi. Scene Discovery by Matrix Factorization. In ECCV, 2008.

[29] S. J. Hwang, F. Sha, and K. Grauman. Sharing features between objects and their attributes. In CVPR, 2011.

[30] F. Bach, G. Lanckriet, and M. Jordan. Multiple Kernel Learning, Conic Duality, and the SMO Algorithm. In ICML, 2004.

[31] M. Varma and D. Ray. Learning the discriminative power-invariance trade-off. In ICCV, 2007.

[32] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.

[33] K. Weinberger, J. Blitzer, and L. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. In NIPS, 2006.

[34] F. Bach. Exploring large feature spaces with hierarchical multiple kernel learning. In NIPS, 2008.

[35] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.

[36] S. Boyd and A. Mutapcic. Subgradient methods. 2007.

[37] O. Russakovsky and L. Fei-Fei. Attribute learning in large-scale datasets. In ECCV, 2010.