iccv iccv2013 iccv2013-198 iccv2013-198-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lingxi Xie, Qi Tian, Richang Hong, Shuicheng Yan, Bo Zhang
Abstract: As a special topic in computer vision, fine-grained visual categorization (FGVC) has been attracting growing attention in recent years. Unlike traditional image classification tasks, in which objects have large inter-class variation, the visual concepts in fine-grained datasets, such as hundreds of bird species, often have very similar semantics. Due to the large inter-class similarity, it is very difficult to classify the objects without locating truly discriminative features; it therefore becomes more important for the algorithm to make full use of the part information in order to train a robust model. In this paper, we propose a powerful flowchart named Hierarchical Part Matching (HPM) to cope with fine-grained classification tasks. We extend the Bag-of-Features (BoF) model by introducing several novel modules into the image representation, including foreground inference and segmentation, Hierarchical Structure Learning (HSL), and Geometric Phrase Pooling (GPP). We verify in experiments that our algorithm achieves state-of-the-art classification accuracy on the Caltech-UCSD Birds-200-2011 dataset by making full use of the ground-truth part annotations.
[1] P. Arbeláez, M. Maire, C. Fowlkes, and J. Malik. From Contours to Regions: An Empirical Evaluation. CVPR, 2009.
[2] T. Berg and P. Belhumeur. POOF: Part-Based One-vs-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation. CVPR, 2013.
[3] A. Bosch, A. Zisserman, and X. Muñoz. Image Classification using Random Forests and Ferns. ICCV, 2007.
[4] Y. Boureau, F. Bach, Y. LeCun, J. Ponce, et al. Learning Mid-Level Features for Recognition. CVPR, 2010.
[5] Y. Boykov, O. Veksler, and R. Zabih. Fast Approximate Energy Minimization via Graph Cuts. PAMI, 2001.
[6] Q. Chen, Z. Song, Y. Hua, Z. Huang, and S. Yan. Hierarchical Matching with Side Information for Image Classification. CVPR, 2012.
[7] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual Categorization with Bags of Keypoints. Workshop on Statistical Learning in Computer Vision, ECCV, 2004.
[8] K. Duan, D. Parikh, D. Crandall, and K. Grauman. Discovering Localized Attributes for Fine-Grained Recognition. CVPR, 2012.
[9] R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin. LIBLINEAR: A Library for Large Linear Classification. JMLR, 2008.
[10] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object Detection with Discriminatively Trained Part-Based Models. PAMI, 2010.
[11] J. Feng, B. Ni, Q. Tian, and S. Yan. Geometric Lp-Norm Feature Pooling for Image Classification. CVPR, 2011.
[12] A. Khosla, N. Jayadevaprakash, B. Yao, and F. Li. Novel Dataset for Fine-Grained Image Categorization. First Workshop on FGVC, CVPR, 2011.
[13] S. Lazebnik, C. Schmid, and J. Ponce. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR, 2006.
[14] D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. IJCV, 2004.
[15] C. Rother, V. Kolmogorov, and A. Blake. GrabCut: Interactive Foreground Extraction Using Iterated Graph Cuts. ACM Transactions on Graphics, 2004.
[16] K. Van De Sande, T. Gevers, and C. Snoek. Evaluating Color Descriptors for Object and Scene Recognition. PAMI, 2010.
[17] A. Vedaldi and B. Fulkerson. VLFeat: An Open and Portable Library of Computer Vision Algorithms. ACM Multimedia, 2010.
[18] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. The Caltech-UCSD Birds-200-2011 Dataset. Technical Report, 2011.
[19] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-Constrained Linear Coding for Image Classification. CVPR, 2010.
[20] L. Xie, Q. Tian, and B. Zhang. Spatial Pooling of Heterogeneous Features for Image Applications. ACM Multimedia, 2012.
[21] L. Xie, Q. Tian, and B. Zhang. Feature Normalization for Part-Based Image Classification. ICIP, 2013.
[22] S. Yang, L. Bo, J. Wang, and L. Shapiro. Unsupervised Template Learning for Fine-Grained Object Recognition. NIPS, 2012.
[23] J. Yuan, M. Yang, and Y. Wu. Mining Discriminative Co-occurrence Patterns for Visual Recognition. CVPR, 2011.
[24] N. Zhang, R. Farrell, and T. Darrell. Pose Pooling Kernels for Sub-Category Recognition. CVPR, 2012.
[25] S. Zhang, Q. Tian, G. Hua, Q. Huang, and S. Li. Descriptive Visual Words and Visual Phrases for Image Applications. ACM Multimedia, 2009.