
Occlusive Components Analysis (NIPS 2009, paper 175)



Author: Jörg Lücke, Richard Turner, Maneesh Sahani, Marc Henniges

Abstract: We study unsupervised learning in a probabilistic generative model for occlusion. The model uses two types of latent variables: one indicates which objects are present in the image, and the other how they are ordered in depth. This depth order then determines how the positions and appearances of the objects present, specified in the model parameters, combine to form the image. We show that the object parameters can be learnt from an unlabelled set of images in which objects occlude one another. Exact maximum-likelihood learning is intractable. However, we show that tractable approximations to Expectation Maximization (EM) can be found if the training images each contain only a small number of objects on average. In numerical experiments it is shown that these approximations recover the correct set of object parameters. Experiments on a novel version of the bars test using colored bars, and experiments on more realistic data, show that the algorithm performs well in extracting the generating causes. Experiments based on the standard bars benchmark test for object learning show that the algorithm performs well in comparison to other recent component extraction approaches. The model and the learning algorithm thus connect research on occlusion with the research field of multiple-causes component extraction methods.
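The generative process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`render_occlusive`, `sample_image`), the Bernoulli presence probability `pi`, the pixel-list mask representation, and the absence of observation noise are all assumptions made for illustration. It shows only how a presence vector and a depth order jointly determine the image, with nearer objects overwriting farther ones.

```python
import random

def render_occlusive(objects, present, depth_order, height, width, background=0):
    """Paint the present objects back to front on a blank grid.

    objects     : list of (mask, colour), mask being a list of (row, col) pixels
                  (hypothetical representation, chosen for this sketch)
    present     : list of booleans, one per object (first latent variable)
    depth_order : object indices from farthest to nearest (second latent variable)
    """
    image = [[background] * width for _ in range(height)]
    for idx in depth_order:
        if not present[idx]:
            continue
        mask, colour = objects[idx]
        for r, c in mask:
            # A nearer object overwrites whatever was painted before it.
            image[r][c] = colour
    return image

def sample_image(objects, pi, height, width, rng):
    """Draw presence and depth order at random, then render (no noise; assumption)."""
    present = [rng.random() < pi for _ in objects]
    depth_order = list(range(len(objects)))
    rng.shuffle(depth_order)
    image = render_occlusive(objects, present, depth_order, height, width)
    return image, present, depth_order

# Two coloured bars crossing on a 3x3 grid, in the spirit of the bars test.
horizontal = ([(1, 0), (1, 1), (1, 2)], 1)
vertical = ([(0, 1), (1, 1), (2, 1)], 2)
objects = [horizontal, vertical]

# Vertical bar in front: the crossing pixel takes colour 2.
front_vertical = render_occlusive(objects, [True, True], [0, 1], 3, 3)
# Horizontal bar in front: the crossing pixel takes colour 1.
front_horizontal = render_occlusive(objects, [True, True], [1, 0], 3, 3)
```

The two rendered images differ only at the crossing pixel, which is why the depth order has to be treated as a latent variable alongside object presence.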


reference text

[1] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607–609, 1996.

[2] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755):788–91, 1999.

[3] N. Jojic and B. Frey. Learning flexible sprites in video layers. Conf. on Computer Vision and Pattern Recognition, 1:199–206, 2001.

[4] C. K. I. Williams and M. K. Titsias. Greedy learning of multiple objects in images using robust statistics and factorial learning. Neural Computation, 16(5):1039–1062, 2004.

[5] K. Fukushima. Restoring partly occluded patterns: a neural network model. Neural Networks, 18(1):33–43, 2005.

[6] C. Eckes, J. Triesch, and C. von der Malsburg. Analysis of cluttered scenes using an elastic matching approach for stereo images. Neural Computation, 18(6):1441–1471, 2006.

[7] R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan, editor, Learning in Graphical Models. Kluwer, 1998.

[8] J. Lücke and M. Sahani. Maximal causes for non-linear component extraction. Journal of Machine Learning Research, 9:1227–1267, 2008.

[9] N. Ueda and R. Nakano. Deterministic annealing EM algorithm. Neural Networks, 11(2):271–282, 1998.

[10] M. Sahani. Latent variable models for neural data analysis, 1999. PhD Thesis, Caltech.

[11] P. Földiák. Forming sparse representations by local anti-Hebbian learning. Biol Cybern, 64:165–170, 1990.

[12] M. W. Spratling. Learning image components for object recognition. Journal of Machine Learning Research, 7:793–815, 2006.

[13] S. Hochreiter and J. Schmidhuber. Feature extraction through LOCOCODE. Neural Computation, 11:679–714, 1999.

[14] P. O. Hoyer. Non-negative matrix factorization with sparseness constraints. Journal of Machine Learning Research, 5:1457–1469, 2004.

[15] S. A. Nene, S. K. Nayar, and H. Murase. Columbia object image library (COIL-100). Technical report, cucs-006-96, 1996.

[16] H. Wersing and E. Körner. Learning optimized features for hierarchical models of invariant object recognition. Neural Computation, 15(7):1559–1588, 2003.

[17] U. Köster, J. T. Lindgren, M. Gutmann, and A. Hyvärinen. Learning natural image structure with a horizontal product model. In Int. Conf. on Independent Component Analysis and Signal Separation (ICA), pages 507–514, 2009.

[18] P. Wolfrum, C. Wolff, J. Lücke, and C. von der Malsburg. A recurrent dynamic model for correspondence-based face recognition. Journal of Vision, 8(7):1–18, 2008.

[19] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

[20] J. Eggert, H. Wersing, and E. Körner. Transformation-invariant representation and NMF. In Int. J. Conf. on Neural Networks (IJCNN), pages 2535–2539, 2004.