nips nips2012 nips2012-306 nips2012-306-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sung J. Hwang, Kristen Grauman, Fei Sha
Abstract: When learning features for complex visual recognition problems, labeled image exemplars alone can be insufficient. While an object taxonomy specifying the categories’ semantic relationships could bolster the learning process, not all relationships are relevant to a given visual classification task, nor does a single taxonomy capture all ties that are relevant. In light of these issues, we propose a discriminative feature learning approach that leverages multiple hierarchical taxonomies representing different semantic views of the object categories (e.g., for animal classes, one taxonomy could reflect their phylogenic ties, while another could reflect their habitats). For each taxonomy, we first learn a tree of semantic kernels, where each node has a Mahalanobis kernel optimized to distinguish between the classes in its children nodes. Then, using the resulting semantic kernel forest, we learn class-specific kernel combinations to select only those relationships relevant to recognizing each object class. To learn the weights, we introduce a novel hierarchical regularization term that further exploits the taxonomies’ structure. We demonstrate our method on challenging object recognition datasets, and show that interleaving multiple taxonomic views yields significant accuracy improvements.
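The two ingredients named in the abstract, a per-node Mahalanobis kernel and a class-specific combination of those kernels, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an inner-product kernel of the form K(x, y) = x^T M y with a positive semidefinite M, uses randomly generated matrices in place of learned metrics, and fixes the combination weights by hand instead of learning them with the hierarchical regularizer. The names `mahalanobis_kernel`, `combine_kernels`, `node_metrics`, and `beta` are hypothetical.

```python
import numpy as np


def mahalanobis_kernel(X, Y, M):
    """Assumed kernel form: K(x, y) = x^T M y for a PSD matrix M."""
    return X @ M @ Y.T


def combine_kernels(kernels, weights):
    """Nonnegative weighted sum of kernel matrices for a single class."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be nonnegative")
    return sum(w * K for w, K in zip(weights, kernels))


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))  # 5 toy samples with 3 features each

# Hypothetical stand-ins for learned per-node metrics.
# M = L^T L is positive semidefinite by construction.
node_metrics = []
for _ in range(4):
    L = rng.normal(size=(3, 3))
    node_metrics.append(L.T @ L)

# One kernel matrix per taxonomy node (the "forest" flattened into a list).
kernels = [mahalanobis_kernel(X, X, M) for M in node_metrics]

# Hand-picked weights for one class; zeros drop nodes judged irrelevant.
beta = [0.5, 0.0, 1.5, 0.0]
K_class = combine_kernels(kernels, beta)
```

Because each node kernel is positive semidefinite and the weights are nonnegative, the combined matrix `K_class` is again a valid kernel, so it could be passed to any kernel classifier that accepts a precomputed Gram matrix.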
[1] N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. In CVPR, 2005.
[2] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-Constrained Linear Coding for Image Classification. In CVPR, 2010.
[3] C. Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, May 1998.
[4] A. Zweig and D. Weinshall. Exploiting Object Hierarchy: Combining Models from Different Category Levels. In ICCV, 2007.
[5] M. Marszalek and C. Schmid. Semantic hierarchies for visual object recognition. In CVPR, 2007.
[6] A. Torralba, R. Fergus, and W. T. Freeman. 80 million Tiny Images: a Large Dataset for Non-Parametric Object and Scene Recognition. PAMI, 30(11):1958–1970, 2008.
[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Imagenet: A Large-Scale Hierarchical Image Database. In CVPR, 2009.
[8] R. Fergus, H. Bernal, Y. Weiss, and A. Torralba. Semantic label sharing for learning with many categories. In ECCV, 2010.
[9] J. Deng, A. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? In ECCV, 2010.
[10] S. J. Hwang, K. Grauman, and F. Sha. Learning a tree of metrics with disjoint visual features. In NIPS, 2011.
[11] N. Verma, D. Mahajan, S. Sellamanickam, and V. Nair. Learning hierarchical similarity metrics. In CVPR, 2012.
[12] S. Bengio, J. Weston, and D. Grangier. Label Embedding Trees for Large Multi-Class Tasks. In NIPS, 2010.
[13] J. Deng, S. Satheesh, A. Berg, and L. Fei-Fei. Fast and balanced: Efficient label tree learning for large scale object recognition. In NIPS, 2011.
[14] C. Lampert, H. Nickisch, and S. Harmeling. Learning to Detect Unseen Object Classes by Between-Class Attribute Transfer. In CVPR, 2009.
[15] M. Marszalek and C. Schmid. Constructing category hierarchies for visual recognition. In ECCV, 2008.
[16] G. Griffin and P. Perona. Learning and using taxonomies for fast visual categorization. In CVPR, 2008.
[17] T. Gao and D. Koller. Discriminative learning of relaxed hierarchy for large-scale visual recognition. In ICCV, 2011.
[18] J. Sivic, B. Russell, A. Zisserman, W. Freeman, and A. Efros. Unsupervised discovery of visual object class hierarchies. In CVPR, 2008.
[19] E. Bart, I. Porteous, P. Perona, and M. Welling. Unsupervised learning of visual taxonomies. In CVPR, 2008.
[20] L.-J. Li, C. Wang, Y. Lim, D. Blei, and L. Fei-Fei. Building and using a semantivisual image hierarchy. In CVPR, 2010.
[21] S. Kim and E. Xing. Tree-guided group lasso for multi-task regression with structured sparsity. In ICML, 2010.
[22] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor. Canonical Correlation Analysis: An Overview with Application to Learning Methods. Neural Computation, 16(12), 2004.
[23] A. Blum and T. Mitchell. Combining Labeled and Unlabeled Data with Co-training. In COLT: Proceedings of the Workshop on Computational Learning Theory, 1998.
[24] C. Christoudias, K. Saenko, L. Morency, and T. Darrell. Co-adaptation of audio-visual speech and gesture classifiers. In International Conference on Multimodal Interaction, 2006.
[25] I. Dhillon, S. Mallela, and R. Kumar. A divisive information-theoretic feature clustering algorithm for text classification. Journal of Machine Learning Research, 3:1265–1287, 2003.
[26] A. Gupta and S. Dasgupta. Hybrid hierarchical clustering: Forming a tree from multiple views. In Workshop on Learning With Multiple Views, 2005.
[27] A. Argyriou, T. Evgeniou, and M. Pontil. Multi-task feature learning. In NIPS, 2006.
[28] N. Loeff and A. Farhadi. Scene Discovery by Matrix Factorization. In ECCV, 2008.
[29] S. J. Hwang, F. Sha, and K. Grauman. Sharing features between objects and their attributes. In CVPR, 2011.
[30] F. Bach, G. Lanckriet, and M. Jordan. Multiple Kernel Learning, Conic Duality, and the SMO Algorithm. In ICML, 2004.
[31] M. Varma and D. Ray. Learning the discriminative power-invariance trade-off. In ICCV, 2007.
[32] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.
[33] K. Weinberger, J. Blitzer, and L. Saul. Distance Metric Learning for Large Margin Nearest Neighbor Classification. In NIPS, 2006.
[34] F. Bach. Exploring large feature spaces with hierarchical multiple kernel learning. In NIPS, 2008.
[35] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.
[36] S. Boyd and A. Mutapcic. Subgradient methods. 2007.
[37] O. Russakovsky and L. Fei-Fei. Attribute learning in large-scale datasets. In ECCV, 2010.