nips nips2006 nips2006-52 nips2006-52-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Anitha Kannan, John Winn, Carsten Rother
Abstract: Patch-based appearance models are used in a wide range of computer vision applications. To learn such models it has previously been necessary to specify a suitable set of patch sizes and shapes by hand. In the jigsaw model presented here, the shape, size and appearance of patches are learned automatically from the repeated structures in a set of training images. By learning such irregularly shaped ‘jigsaw pieces’, we are able to discover both the shape and the appearance of object parts without supervision. When applied to face images, for example, the learned jigsaw pieces are surprisingly strongly associated with face parts of different shapes and scales such as eyes, noses, eyebrows and cheeks, to name a few. We conclude that learning the shape of the patch not only improves the accuracy of appearance-based part detection but also allows for shape-based part detection. This enables parts of similar appearance but different shapes to be distinguished; for example, while foreheads and cheeks are both skin colored, they have markedly different shapes.
[1] N. Jojic, B. Frey, and A. Kannan. Epitomic analysis of appearance and shape. In ICCV, 2003.
[2] W. Freeman, E. Pasztor, and O. Carmichael. Learning low-level vision. IJCV, 40(1), 2000.
[3] S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In Proceedings of IEEE CVPR, 2005.
[4] A. Efros and W. Freeman. Image quilting for texture synthesis and transfer. In ACM Transactions on Graphics (SIGGRAPH), 2001.
[5] C. Rother, S. Kumar, V. Kolmogorov, and A. Blake. Digital tapestry. In Proc. Conf. Computer Vision and Pattern Recognition, 2005.
[6] R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In CVPR, volume 2, pages 264–271, June 2003.
[7] B. Leibe and B. Schiele. Interleaved object categorization and segmentation. In BMVC, 2003.
[8] E. Borenstein, E. Sharon, and S. Ullman. Combining top-down and bottom-up segmentation. In Proceedings of the IEEE Workshop on Perceptual Organization in Computer Vision, CVPR, 2004.
[9] E. Borenstein and S. Ullman. Class-specific, top-down segmentation. In Proceedings of ECCV, 2003.
[10] D. Huttenlocher, D. Crandall, and P. Felzenszwalb. Spatial priors for part-based recognition using statistical models. In Proceedings of IEEE CVPR, 2005.
[11] Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. PAMI, 23(11), 2001.
[12] J. Winn and J. Shotton. The layout consistent random field for recognizing and segmenting partially occluded objects. In Proceedings of IEEE CVPR, 2006.