174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition

Source: pdf

Author: Jia Deng, Jonathan Krause, Li Fei-Fei

Abstract: Fine-grained recognition concerns categorization at subordinate levels, where the distinction between object classes is highly local. Compared to basic-level recognition, fine-grained categorization can be more challenging because there is generally less data and there are fewer discriminative features. This necessitates the use of stronger priors for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. The player’s goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. With proper setup, the game generates discriminative bubbles with assured quality. We next propose the “BubbleBank” algorithm that uses the human-selected bubbles to improve machine recognition performance. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.
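
The game mechanic described in the abstract (a heavily blurred image in which the player pays a penalty to reveal circular regions at full detail) can be illustrated with a short sketch. This is not the authors' implementation: the box blur, the function names `box_blur`, `reveal_bubble`, and `score`, and the linear penalty scheme are all assumptions made for illustration, since the abstract does not specify the blur, the bubble size, or the penalty.

```python
import numpy as np

def box_blur(image, radius):
    """Blur a 2-D grayscale image with a (2*radius+1)-wide box filter.

    Assumption: a simple box filter stands in for the game's heavy blur.
    """
    padded = np.pad(image.astype(float), radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    size = 2 * radius + 1
    # Sum every shifted copy of the image inside the filter window.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)

def reveal_bubble(blurred, sharp, center, radius):
    """Return a copy of `blurred` with full detail restored inside one
    circular bubble, together with the boolean mask of that bubble."""
    rows, cols = np.ogrid[:sharp.shape[0], :sharp.shape[1]]
    mask = (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2
    out = blurred.copy()
    out[mask] = sharp[mask]
    return out, mask

def score(correct, n_bubbles, reward=100.0, penalty_per_bubble=10.0):
    """Hypothetical scoring rule: a fixed reward for a correct answer,
    reduced linearly by the number of bubbles the player revealed."""
    if not correct:
        return 0.0
    return max(reward - penalty_per_bubble * n_bubbles, 0.0)

# Usage: blur a random image, then reveal one bubble at its center.
rng = np.random.default_rng(0)
sharp = rng.random((32, 32))
blurred = box_blur(sharp, 4)
view, mask = reveal_bubble(blurred, sharp, center=(16, 16), radius=5)
```

Under such a rule, a player has an incentive to reveal only the regions needed to tell the categories apart, which is what would make the collected bubbles informative about discriminative features.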

reference text

[1] http://www.cs.washington.edu/robotics/projects/kdes/.

[2] L. Bo, X. Ren, and D. Fox. Kernel descriptors for visual recognition. NIPS, 7, 2010.

[3] L. Bourdev and J. Malik. Poselets: Body part detectors trained using 3d human pose annotations. In CVPR, 2009.

[4] S. Branson, P. Perona, and S. Belongie. Strong supervision from weak annotation: Interactive training of deformable part models. In ICCV, Barcelona, 2011.

[5] S. Branson, C. Wah, F. Schroff, B. Babenko, P. Welinder, P. Perona, and S. Belongie. Visual recognition with humans in the loop. ECCV, 2010.

[6] Y. Chai, E. Rahtu, V. Lempitsky, L. Van Gool, and A. Zisserman. Tricos: A tri-level class-discriminative cosegmentation method for image classification. In ECCV, 2012.

[7] Q. Chen, Z. Song, Y. Hua, Z. Huang, and S. Yan. Hierarchical matching with side information for image classification. In CVPR, 2012.

[8] J. Deng, A. Berg, K. Li, and L. Fei-Fei. What does classifying more than 10,000 image categories tell us? ECCV, 2010.

[9] J. Donahue and K. Grauman. Annotator rationales for visual recognition. In ICCV, 2011.

[10] G. Druck, B. Settles, and A. McCallum. Active learning by labeling features. In EMNLP, 2009.

[11] K. Duan, D. Parikh, D. Crandall, and K. Grauman. Discovering localized attributes for fine-grained recognition. In CVPR, 2012.

[12] R. Fan, K. Chang, C. Hsieh, X. Wang, and C. Lin. Liblinear: A library for large linear classification. The Journal of Machine Learning Research, 9:1871–1874, 2008.

Figure 9. Test images and their top bubbles that contribute most to the classification decision. All bubbles are normalized to the same size for viewing. Top and middle row: Correctly classified test examples. Bottom row: Incorrectly classified test examples.

[13] R. Farrell, O. Oza, N. Zhang, V. Morariu, T. Darrell, and L. Davis. Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance. In ICCV, 2011.

[14] F. Gosselin and P. Schyns. Bubbles: a technique to reveal the use of information in recognition tasks. Vision research, 41(17):2261–2271, 2001.

[15] F. Khan, J. van de Weijer, A. Bagdanov, and M. Vanrell. Portmanteau vocabularies for multi-cue image representation. NIPS, 2011.

[16] N. Kumar, P. Belhumeur, A. Biswas, D. Jacobs, W. Kress, I. Lopez, and J. Soares. Leafsnap: A computer vision system for automatic plant species identification. ECCV, 2012.

[17] E. Law, B. Settles, A. Snook, H. Surana, L. von Ahn, and T. Mitchell. Human computation for attribute and attribute value acquisition. In Proceedings of the First Workshop on Fine-Grained Visual Categorization (FGVC), 2011.

[18] L. Li, H. Su, E. Xing, and L. Fei-Fei. Object bank: A high-level image representation for scene classification and semantic feature sparsification. NIPS, 24, 2010.

[19] J. Liu, A. Kanazawa, D. Jacobs, and P. Belhumeur. Dog breed classification using part localization. ECCV, 2012.

[20] S. Maji. Discovering a lexicon of parts and attributes. In Second International Workshop on Parts and Attributes, ECCV, 2012.

[21] S. Maji and G. Shakhnarovich. Part annotations via pairwise correspondence. In Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012.

[22] D. Parikh and K. Grauman. Interactively building a discriminative vocabulary of nameable attributes. In CVPR, 2011.

[23] D. Parikh and C. Zitnick. Human-debugging of machines. In Second Workshop on Computational Social Science and the Wisdom of Crowds, NIPS, volume 11, 2011.

[24] A. Parkash and D. Parikh. Attributes for classifier feedback. In ECCV, 2012.

[25] O. Parkhi, A. Vedaldi, A. Zisserman, and C. Jawahar. Cats and dogs. In CVPR, 2012.

[26] A. Sorokin and D. Forsyth. Utility data annotation with amazon mechanical turk. In CVPRW, 2008.

[27] K. E. A. van de Sande, T. Gevers, and C. G. M. Snoek. Evaluating color descriptors for object and scene recognition. TPAMI, 32(9):1582–1596, 2010.

[28] J. Van De Weijer, C. Schmid, and J. Verbeek. Learning color names from real-world images. In CVPR, 2007.

[29] S. Vijayanarasimhan and K. Grauman. Large-scale live active learning: Training object detectors with crawled data and crowds. In CVPR, 2011.

[30] L. Von Ahn and L. Dabbish. Labeling images with a computer game. In CHI, pages 319–326. ACM, 2004.

[31] L. Von Ahn, R. Liu, and M. Blum. Peekaboom: a game for locating objects in images. In CHI. ACM, 2006.

[32] C. Wah, S. Branson, P. Perona, and S. Belongie. Multiclass recognition and part localization with humans in the loop. In ICCV, Barcelona, 2011.

[33] C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, and C. Wah. Caltech-ucsd birds-200-2011, 2011.

[34] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In CVPR, 2010.

[35] P. Welinder, S. Branson, T. Mita, C. Wah, F. Schroff, S. Belongie, and P. Perona. Caltech-ucsd birds 200. 2010.

[36] B. Yao, G. Bradski, and L. Fei-Fei. A codebook-free and annotation-free approach for fine-grained image categorization. In CVPR, 2012.

[37] B. Yao, A. Khosla, and L. Fei-Fei. Combining randomization and discrimination for fine-grained image categorization. In CVPR, 2011.

[38] N. Zhang, R. Farrell, and T. Darrell. Pose pooling kernels for sub-category recognition. In CVPR, 2012.