
132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations

Source: pdf

Author: Payman Yadollahpour, Dhruv Batra, Gregory Shakhnarovich

Abstract: This paper introduces a two-stage approach to semantic image segmentation. In the first stage a probabilistic model generates a set of diverse plausible segmentations. In the second stage, a discriminatively trained re-ranking model selects the best segmentation from this set. The re-ranking stage can use much more complex features than what could be tractably used in the probabilistic model, allowing a better exploration of the solution space than possible by simply producing the most probable solution from the probabilistic model. While our proposed approach already achieves state-of-the-art results (48.1%) on the challenging VOC 2012 dataset, our machine and human analyses suggest that even larger gains are possible with such an approach.
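The second stage described in the abstract can be illustrated with a minimal sketch. It assumes a linear scoring function over per-candidate feature vectors; the abstract does not specify the form of the re-ranker, so the function names, the feature layout, and the weight values below are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Score one candidate as a dot product (assumed linear re-ranker)."""
    return sum(w * f for w, f in zip(weights, features))


def rerank(weights: Sequence[float], candidates: List[Sequence[float]]) -> int:
    """Return the index of the highest-scoring candidate segmentation."""
    if not candidates:
        raise ValueError("need at least one candidate segmentation")
    scores = [score(weights, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Hypothetical features per candidate: [stage-one model score, richer cue].
# Candidate 0 is the most probable solution under the stage-one model.
candidates = [
    [0.9, 0.1],
    [0.7, 0.8],
    [0.5, 0.4],
]

# Hypothetical learned weights; in practice these would come from
# discriminative training on annotated images.
weights = [1.0, 2.0]

best = rerank(weights, candidates)
```

With these made-up numbers the re-ranker prefers a candidate other than the most probable one, which is the situation the two-stage design is meant to exploit: features too expensive for the probabilistic model can still influence the final choice.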

reference text

[1] P. Arbelaez, B. Hariharan, C. Gu, S. Gupta, L. Bourdev, and J. Malik. Semantic segmentation using regions and parts. In CVPR, pages 3378–3385, June 2012. 1

[2] D. Batra, P. Yadollahpour, A. Guzman-Rivera, and G. Shakhnarovich. Diverse M-Best Solutions in Markov Random Fields. In ECCV, 2012. 2, 3, 4, 5

[3] X. Boix, J. M. Gonfaus, J. van de Weijer, A. D. Bagdanov, J. S. Gual, and J. Gonzàlez. Harmony potentials - fusing global and local scale for semantic image segmentation. IJCV, 96(1):83–102, 2012. 1

[4] J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. Semantic segmentation with second-order pooling. In ECCV, pages 430–443, 2012. 2, 5

[5] J. Carreira, F. Li, and C. Sminchisescu. Object recognition by sequential figure-ground ranking. IJCV, 98(3):243–262, 2012. 1

[6] J. Carreira and C. Sminchisescu. Constrained parametric min-cuts for automatic object segmentation. In CVPR, 2010. 2, 5

[7] Y.-L. Chow and R. Schwartz. The n-best algorithm: an efficient procedure for finding top n sentence hypotheses. In Proceedings of the Workshop on Speech and Natural Lang., pages 199–202, 1989. 2

[8] M. Collins. Discriminative reranking for natural language parsing. In ICML, pages 175–182, 2000. 3

[9] M. Collins. Discriminative syntactic language modeling for speech recognition. In Proc. of the Annual Meeting of the Association for Computational Linguistics (ACL), pages 507–514, 2005. 3

[10] M. Dinarelli, A. Moschitti, and G. Riccardi. Discriminative reranking for spoken language understanding. Trans. Audio, Speech and Lang. Proc., 20(2):526–539, Feb. 2012. 3

[11] I. Endres and D. Hoiem. Category independent object proposals. In ECCV, 2010. 2

[12] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes (VOC) challenge. IJCV, 88(2):303–338, June 2010. 2, 3

[13] M. Fromer and A. Globerson. An LP view of the m-best MAP problem. In NIPS, 2009. 3

[14] C. Gu, J. J. Lim, P. Arbelaez, and J. Malik. Recognition using regions. In CVPR, pages 1030–1037, 2009. 1

[15] T. Joachims, T. Finley, and C.-N. Yu. Cutting-plane training of structural SVMs. Machine Learning, 77(1):27–59, 2009. 4

[16] J. Kim and K. Grauman. Shape sharing for object segmentation. In ECCV, pages 444–458, 2012. 2

[17] P. Kohli, L. Ladicky, and P. H. S. Torr. Robust higher order potentials for enforcing label consistency. In CVPR, 2008. 1

[18] L. Ladický, C. Russell, P. Kohli, and P. H. S. Torr. Associative hierarchical CRFs for object class image segmentation. In ICCV, 2009. 1, 2, 5

[19] L. Ladicky, C. Russell, P. Kohli, and P. H. S. Torr. Graph cut based inference with co-occurrence statistics. In ECCV, 2010. 5

[20] L. Ladicky and P. H. Torr. The automatic labelling environment. http://cms.brookes.ac.uk/staff/PhilipTorr/ale.htm. 2

[21] R. Mottaghi. Augmenting deformable part models with irregular-shaped object patches. In CVPR, 2012. 6

[22] D. Nilsson. An efficient algorithm for finding the m most probable configurations in probabilistic expert systems. Statistics and Computing, 8:159–173, 1998. 10.1023/A:1008990218483. 3

[23] B. C. Russell, A. A. Efros, J. Sivic, W. T. Freeman, and A. Zisserman. Using multiple segmentations to discover objects and their extent in image collections. In CVPR, 2006. 2

[24] B. Sapp, A. Toshev, and B. Taskar. Cascaded models for articulated pose estimation. In ECCV, 2010. 2

[25] L. Shen, A. Sarkar, and F. J. Och. Discriminative reranking for machine translation. In HLT-NAACL, pages 177–184, 2004. 3

[26] I. Tsochantaridis, T. Joachims, T. Hofmann, and Y. Altun. Large margin methods for structured and interdependent output variables. JMLR, 6:1453–1484, 2005. 2, 4

[27] J. Uijlings, K. van de Sande, A. Smeulders, T. Gevers, N. Sebe, and C. Snoek. The most telling window for image classification. In ICCV Pascal VOC Workshop, 2011. 6

[28] P. Viola and M. J. Jones. Robust real-time face detection. Int. J. Comput. Vision, 57(2):137–154, May 2004. 2

[29] D. Weiss, B. Sapp, and B. Taskar. Sidestepping intractable inference with structured ensemble cascades. In NIPS, 2010. 2

[30] C. Yanover and Y. Weiss. Finding the m most probable configurations using loopy belief propagation. In NIPS, 2003. 3