nips nips2011 nips2011-151 nips2011-151-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kristen Grauman, Fei Sha, Sung J. Hwang
Abstract: We introduce an approach to learn discriminative visual representations while exploiting external semantic knowledge about object category relationships. Given a hierarchical taxonomy that captures semantic similarity between the objects, we learn a corresponding tree of metrics (ToM). In this tree, we have one metric for each non-leaf node of the object hierarchy, and each metric is responsible for discriminating among its immediate subcategory children. Specifically, a Mahalanobis metric learned for a given node must satisfy the appropriate (dis)similarity constraints generated only among its subtree members’ training instances. To further exploit the semantics, we introduce a novel regularizer coupling the metrics that prefers a sparse disjoint set of features to be selected for each metric relative to its ancestor (supercategory) nodes’ metrics. Intuitively, this reflects that visual cues most useful to distinguish the generic classes (e.g., feline vs. canine) should be different than those cues most useful to distinguish their component fine-grained classes (e.g., Persian cat vs. Siamese cat). We validate our approach with multiple image datasets using the WordNet taxonomy, show its advantages over alternative metric learning approaches, and analyze the meaning of attribute features selected by our algorithm.
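The tree-of-metrics idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy taxonomy, the hand-set metrics, the per-child prototypes, and the nearest-prototype decision rule are all assumptions made for illustration, and `diagonal_overlap` is only a simple stand-in showing what "disjoint features relative to an ancestor" means, not the paper's regularizer. Each non-leaf node carries its own Mahalanobis metric, and a query descends the tree by choosing the closest child under the current node's metric.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a square matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


# Hypothetical toy taxonomy. Each non-leaf node stores its own metric and one
# prototype per immediate child (prototypes stand in for training instances).
TREE = {
    "animal": {
        "metric": [[1.0, 0.0], [0.0, 0.0]],  # weights feature 0 only
        "children": {"feline": [0.0, 0.0], "canine": [4.0, 0.0]},
    },
    "feline": {
        "metric": [[0.0, 0.0], [0.0, 1.0]],  # weights feature 1 only
        "children": {"persian": [0.0, 0.0], "siamese": [0.0, 4.0]},
    },
}


def classify(x, root="animal"):
    """Descend the tree, picking the nearest child under each node's metric."""
    node = root
    while node in TREE:  # stop once we reach a node with no metric of its own
        metric = TREE[node]["metric"]
        children = TREE[node]["children"]
        node = min(children, key=lambda c: mahalanobis_sq(x, children[c], metric))
    return node


def diagonal_overlap(M_node, M_ancestor):
    """Illustrative score: 0 when the two metrics weight disjoint features."""
    return sum(M_node[i][i] * M_ancestor[i][i] for i in range(len(M_node)))
```

In this toy setup the root metric looks only at feature 0 to separate feline from canine, while the feline metric looks only at feature 1 to separate Persian from Siamese, so the overlap between the two metrics is zero.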
[1] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009.
[2] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.
[3] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In CVPR, 2009.
[4] L.-J. Li, H. Su, E. Xing, and L. Fei-Fei. Object bank: A high-level image representation for scene classification and semantic feature sparsification. In NIPS, 2010.
[5] Y. Jia, M. Salzmann, and T. Darrell. Factorized latent spaces with structured sparsity. In NIPS, 2010.
[6] A. Frome, Y. Singer, and J. Malik. Image retrieval and classification using local distance functions. In NIPS, 2006.
[7] P. Kumar, P. Torr, and A. Zisserman. An invariant large margin nearest neighbour classifier. In ICCV, 2007.
[8] P. Jain, B. Kulis, and K. Grauman. Fast image search for learned metrics. In CVPR, 2008.
[9] D. Ramanan and S. Baker. Local distance functions: A taxonomy, new algorithms, and an evaluation. PAMI, 2011.
[10] Z. Wang, Y. Hu, and L.-T. Chia. Image-to-class distance metric learning for image classification. In ECCV, 2010.
[11] K. Q. Weinberger and L. K. Saul. Distance metric learning for large margin nearest neighbor classification. JMLR, 10:207–244, June 2009.
[12] M. Marszalek and C. Schmid. Constructing category hierarchies for visual recognition. In ECCV, 2008.
[13] G. Griffin and P. Perona. Learning and using taxonomies for fast visual category recognition. In CVPR, 2008.
[14] D. Koller and M. Sahami. Hierarchically classifying documents using very few words. In ICML, 1997.
[15] A. McCallum, R. Rosenfeld, T. Mitchell, and A. Ng. Improving text classification by shrinkage in a hierarchy of classes. In ICML, 1998.
[16] L. Cai and T. Hofmann. Hierarchical document categorization with support vector machines. In CIKM, 2004.
[17] C. Lampert, H. Nickisch, and S. Harmeling. Learning to detect unseen object classes by between-class attribute transfer. In CVPR, 2009.
[18] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[19] A. Torralba and K. Murphy. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5), 2007.
[20] G. Shakhnarovich. Learning Task-Specific Similarity. PhD thesis, MIT, 2006.
[21] B. Babenko, S. Branson, and S. Belongie. Similarity functions for categorization: from monolithic to category specific. In ICCV, 2009.
[22] A. Globerson and S. Roweis. Metric learning by collapsing classes. In NIPS, pages 451–458, 2006.
[23] J. Davis, B. Kulis, P. Jain, S. Sra, and I. Dhillon. Information-theoretic metric learning. In ICML, 2007.
[24] K. Weinberger and L. Saul. Fast solvers and efficient implementations for distance metric learning. In ICML, 2008.
[25] Q. Chen and S. Sun. Hierarchical large margin nearest neighbor classification. In ICPR, 2010.
[26] S. Parameswaran and K. Weinberger. Large margin multi-task metric learning. In NIPS, 2010.
[27] Y. Ying, K. Huang, and C. Campbell. Sparse metric learning via smooth optimization. In NIPS, 2009.
[28] D. Zhou, L. Xiao, and M. Wu. Hierarchical classification via orthogonal transfer. In ICML, 2011.
[29] J. Sivic, B. Russell, A. Zisserman, W. Freeman, and A. Efros. Unsupervised discovery of visual object class hierarchies. In CVPR, 2008.
[30] E. Bart, I. Porteous, P. Perona, and M. Welling. Unsupervised learning of visual taxonomies. In CVPR, 2008.
[31] A. Zweig and D. Weinshall. Exploiting object hierarchy: Combining models from different category levels. In ICCV, 2007.
[32] J. Deng, A. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? In ECCV, 2010.
[33] Y. Wang and G. Mori. A discriminative latent model of object classes and attributes. In ECCV, 2010.
[34] R. Tibshirani. Regression shrinkage and selection via the lasso. J. Roy. Statistical Society, 58:267–288, 1994.
[35] S. Dumais and H. Chen. Hierarchical classification of web content. In Research and Development in Information Retrieval, 2000.
[36] R. Fergus, H. Bernal, Y. Weiss, and A. Torralba. Semantic label sharing for learning with many categories. In ECCV, 2010.
[37] K. Barnard, Q. Fan, R. Swaminathan, A. Hoogs, R. Collins, P. Rondot, and J. Kaufhold. Evaluation of localized semantics: data, methodology, and experiments. Technical report, University of Arizona, 2005.