
357 nips-2012-Unsupervised Template Learning for Fine-Grained Object Recognition


Source: pdf

Author: Shulin Yang, Liefeng Bo, Jue Wang, Linda G. Shapiro

Abstract: Fine-grained recognition refers to a subordinate level of recognition, such as recognizing different species of animals and plants. It differs from recognition of basic categories, such as humans, tables, and computers, in that there are global similarities in shape and structure shared across different categories, and the differences are in the details of object parts. We suggest that the key to identifying the fine-grained differences lies in finding the right alignment of image regions that contain the same object parts. We propose a template model for this purpose, which captures common shape patterns of object parts, as well as the co-occurrence relation of the shape patterns. Once the image regions are aligned, extracted features are used for classification. Learning of the template model is efficient, and the recognition results we achieve significantly outperform the state-of-the-art algorithms.


reference text

[1] Farrell, R., Oza, O., Zhang, N., Morariu, V., Darrell, T., Davis, L.: Birdlets: subordinate categorization using volumetric primitives and pose-normalized appearance. ICCV (2011)

[2] Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. CVPR (2006)

[3] Bo, L., Ren, X., Fox, D.: Kernel Descriptors for Visual Recognition. NIPS (2010)

[4] Bo, L., Sminchisescu, C.: Efficient match kernel between sets of features for visual recognition. NIPS (2009)

[5] Branson, S., Wah, C., Babenko, B., Schroff, F., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. ECCV (2010)

[6] Khan, F., van de Weijer, J., Bagdanov, A., Vanrell, M.: Portmanteau vocabularies for multi-cue image representations. NIPS (2011)

[7] Wah, C., Branson, S., Perona, P., Belongie, S.: Interactive localization and recognition of fine-grained visual categories. ICCV (2011)

[8] Welinder, P., Branson, S., Mita, T., Wah, C., Schroff, F., Belongie, S., Perona, P.: Caltech-UCSD Birds 200. Technical Report CNS-TR-201, Caltech (2010)

[9] Yao, B., Khosla, A., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. CVPR (2011)

[10] Yao, B., Bradski, G., Fei-Fei, L.: A codebook-free and annotation-free approach for fine-grained image categorization. CVPR (2012)

[11] Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. CVPR (2012)

[12] Zhang, N., Farrell, R., Darrell, T.: Pose pooling kernels for sub-category recognition. CVPR (2012)

[13] Bourdev, L., Malik, J.: Poselets: body part detectors trained using 3D human pose annotations. ICCV (2009)

[14] Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010)

[15] Parkhi, O., Vedaldi, A., Zisserman, A., Jawahar, C.: Cats and dogs. CVPR (2012)

[16] Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV 60 (2004)

[17] Lee, H., Battle, A., Raina, R., Ng, A.: Efficient sparse coding algorithms. NIPS (2007)

[18] Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. CVPR (2009)

[19] Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Guo, Y.: Locality-constrained linear coding for image classification. CVPR (2010)

[20] Boureau, Y., Bach, F., LeCun, Y., Ponce, J.: Learning mid-level features for recognition. CVPR (2010)

[21] Coates, A., Ng, A.: The importance of encoding versus training with sparse coding and vector quantization. ICML (2011)

[22] Yu, K., Lin, Y., Lafferty, J.: Learning image representations from the pixel level via hierarchical sparse coding. CVPR (2011)

[23] Boureau, Y., Ponce, J.: A theoretical analysis of feature pooling in visual recognition. ICML (2010)

[24] Chang, C., Lin, C.: LIBSVM: a library for support vector machines. (2001)

[25] Maire, M., Arbelaez, P., Fowlkes, C., Malik, J.: Using contours to detect and localize junctions in natural images. CVPR (2008)

[26] Schmidt, M., Fung, G., Rosales, R.: Optimization methods for L1-regularization. UBC Technical Report (2009)

[27] Khosla, A., Jayadevaprakash, N., Yao, B., Fei-Fei, L.: Novel dataset for fine-grained image categorization. First Workshop on Fine-Grained Visual Categorization, CVPR (2011)