nips nips2009 nips2009-211 nips2009-211-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bryan Russell, Alyosha Efros, Josef Sivic, Bill Freeman, Andrew Zisserman
Abstract: In this paper, we investigate how, given an image, similar images sharing the same global description can help with unsupervised scene segmentation. In contrast to recent work in semantic alignment of scenes, we allow an input image to be explained by partial matches of similar scenes. This allows for a better explanation of the input scenes. We perform MRF-based segmentation that optimizes over matches, while respecting boundary information. The recovered segments are then used to re-query a large database of images to retrieve better matches for the target regions. We show improved performance in detecting the principal occluding and contact boundaries for the scene over previous methods on data gathered from the LabelMe database.
[1] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin, and M. Cohen. Interactive digital photomontage. In SIGGRAPH, 2004.
[2] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.
[3] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. IEEE Trans. on Pattern Analysis and Machine Intelligence, 23(11), 2001.
[4] J. F. Canny. A computational approach to edge detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986.
[5] S. K. Divvala, A. A. Efros, and M. Hebert. Can similar scenes help surface layout estimation? In IEEE Workshop on Internet Vision, associated with CVPR, 2008.
[6] J. Hays and A. Efros. Scene completion using millions of photographs. In SIGGRAPH, 2007.
[7] J. Hays and A. A. Efros. IM2GPS: estimating geographic information from a single image. In CVPR, 2008.
[8] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning, 43:177–196, 2001.
[9] M. K. Johnson, K. Dale, S. Avidan, H. Pfister, W. T. Freeman, and W. Matusik. CG2Real: Improving the realism of computer-generated images using a large collection of photographs. Technical Report 2009-034, MIT CSAIL, 2009.
[10] H. Kang, A. A. Efros, M. Hebert, and T. Kanade. Image composition for object pop-out. In IEEE Workshop on 3D Representation for Recognition (3dRR-09), in assoc. with CVPR, 2009.
[11] C. Liu, J. Yuen, and A. Torralba. Nonparametric scene parsing: label transfer via dense scene alignment. In CVPR, 2009.
[12] C. Liu, J. Yuen, A. Torralba, J. Sivic, and W. T. Freeman. SIFT flow: dense correspondence across different scenes. In ECCV, 2008.
[13] D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Trans. on Pattern Analysis and Machine Intelligence, 26(5):530–549, 2004.
[14] A. Oliva and A. Torralba. Modeling the shape of the scene: a holistic representation of the spatial envelope. IJCV, 42(3):145–175, 2001.
[15] T. Quack, B. Leibe, and L. Van Gool. World-scale mining of objects and events from community photo collections. In CIVR, 2008.
[16] C. Rother, V. Kolmogorov, T. Minka, and A. Blake. Cosegmentation of image pairs by histogram matching - incorporating a global constraint into MRFs. In CVPR, 2006.
[17] B. C. Russell and A. Torralba. Building a database of 3D scenes from user annotations. In CVPR, 2009.
[18] B. C. Russell, A. Torralba, C. Liu, R. Fergus, and W. T. Freeman. Object recognition by scene alignment. In Advances in Neural Info. Proc. Systems, 2007.
[19] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman. LabelMe: a database and web-based tool for image annotation. IJCV, 77(1-3):157–173, 2008.
[20] E. Shechtman and M. Irani. Matching local self-similarities across images and videos. In CVPR, 2007.
[21] J. Sivic, B. Kaneva, A. Torralba, S. Avidan, and W. T. Freeman. Creating and exploring a large photorealistic virtual space. In First IEEE Workshop on Internet Vision, associated with CVPR, 2008.
[22] A. Torralba, R. Fergus, and W. T. Freeman. 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 30(11):1958–1970, 2008.
[23] O. Whyte, J. Sivic, and A. Zisserman. Get out of my picture! Internet-based inpainting. In British Machine Vision Conference, 2009.