Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization (CVPR 2013, paper 296)

Source: pdf

Author: Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang

Abstract: For the task of visual categorization, the learning model is expected to be endowed with discriminative visual feature representation and flexibilities in processing many categories. Many existing approaches are designed based on a flat category structure, or rely on a set of pre-computed visual features, hence may not be appreciated for dealing with large numbers of categories. In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. Experiments on ImageNet object data subset and SUN397 scene dataset demonstrate that our approach achieves promising performance on data with large numbers of classes compared with some state-of-the-art methods, and is more efficient in processing large numbers of categories.

reference text

[1] E. Bart, I. Porteous, P. Perona, and M. Welling. Unsupervised learning of visual taxonomies. In CVPR, 2008.

[2] Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In CVPR, 2010.

[3] N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Incremental algorithms for hierarchical classification. JMLR, 7:31–54, 2006.

[4] M. Choi, A. Torralba, and A. Willsky. A tree-based context model for object recognition. TPAMI, 34(2):240–252, 2012.

[5] O. Dekel, J. Keshet, and Y. Singer. Large margin hierarchical classification. In ICML, 2004.

[6] J. Deng, A. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? In ECCV, 2010.

[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.

[8] T. Deselaers and V. Ferrari. Visual and semantic similarity in ImageNet. In CVPR, 2011.

[9] L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In CVPR, 2005.

[10] S. Fidler, M. Boben, and A. Leonardis. Similarity-based cross-layered hierarchical representation for object categorization. In CVPR, 2008.

[11] G. Griffin and P. Perona. Learning and using taxonomies for fast visual categorization. In CVPR, 2008.

[12] R. Jenatton, J. Mairal, G. Obozinski, and F. Bach. Proximal methods for sparse hierarchical dictionary learning. In ICML, 2010.

[13] T. Joachims, T. Finley, and C. Yu. Cutting plane training of structural SVMs. Machine Learning, 77(1):27–59, 2009.

[14] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bag of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.

[15] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In ICML, 2009.

[16] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Unsupervised learning of hierarchical representations with convolutional deep belief networks. Commun. ACM, 54:95–103, 2011.

[17] J. Mairal, F. Bach, and J. Ponce. Task-driven dictionary learning. TPAMI, 34(4):791–804, 2012.

[18] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. JMLR, pages 19–60, 2010.

[19] M. Marszalek and C. Schmid. Semantic hierarchies for visual object recognition. In CVPR, 2007.

[20] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In CVPR, 2006.

[21] B. Olshausen and D. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311–3325, 1997.

[22] R. Salakhutdinov, A. Torralba, and J. Tenenbaum. Learning to share visual appearance for multiclass object detection. In CVPR, 2011.

[23] L. Shen, S. Jiang, S. Wang, and Q. Huang. Learning-to-share based on finding groups for large scale image classification. In PCM, 2011.

[24] J. Sivic, B. Russell, A. Efros, A. Zisserman, and W. Freeman. Discovering object categories in image collections. In ICCV, 2005.

[25] A. Torralba, K. Murphy, and W. Freeman. Sharing visual features for multiclass and multiview object detection. TPAMI, 29(5):854–869, 2007.

[26] J. Wang, J. Yang, K. Yu, and F. Lv. Locality-constrained linear coding for image classification. In CVPR, 2010.

[27] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN database: Large-scale scene recognition from abbey to zoo. In CVPR, 2010.

[28] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In CVPR, 2009.

[29] M. Yang, L. Zhang, X. Feng, and D. Zhang. Fisher discrimination dictionary learning for sparse representation. In ICCV, 2011.

[30] K. Yu, Y. Lin, and J. Lafferty. Learning image representations from the pixel level via hierarchical sparse coding. In CVPR, 2011.

[31] K. Yu, T. Zhang, and Y. Gong. Nonlinear learning using local coordinate coding. In NIPS, 2009.

[32] B. Zhao, L. Fei-Fei, and E. Xing. Large-scale category structure aware image categorization. In NIPS, 2011.

[33] D. Zhou, L. Xiao, and M. Wu. Hierarchical classification via orthogonal transfer. In ICML, 2011.

[34] N. Zhou, Y. Shen, J. Peng, and J. Fan. Learning inter-related visual dictionary for object recognition. In CVPR, 2012.

[35] A. Zweig and D. Weinshall. Exploiting object hierarchy: Combining models from different category levels. In ICCV, 2007.