nips nips2004 nips2004-73 nips2004-73-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: John Winn, Andrew Blake
Abstract: We present an extension to the Jojic and Frey (2001) layered sprite model which allows layers to undergo affine transformations. This extension allows affine object pose to be inferred whilst simultaneously learning the object shape and appearance. Learning is carried out by applying an augmented variational inference algorithm which includes a global search over a discretised transform space followed by a local optimisation. To aid correct convergence, we use bottom-up cues to restrict the space of possible affine transformations. We present results on a number of video sequences and show how the model can be extended to track an object whose appearance changes throughout the sequence.
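The abstract's "global search over a discretised transform space followed by a local optimisation" can be illustrated with a small, generic sketch. This is not the authors' variational algorithm: it only shows the coarse-then-fine search pattern on a toy point-matching problem. The function names, the squared-distance cost, the grid values, and the restriction to rotation, scale, and translation (rather than a full six-parameter affine map) are all assumptions made for illustration.

```python
import math
from itertools import product


def apply_affine(params, pts):
    """Apply rotation, uniform scale, and translation to 2-D points."""
    theta, scale, tx, ty = params
    c, s = scale * math.cos(theta), scale * math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]


def cost(params, src, dst):
    """Sum of squared distances between transformed source and target points."""
    moved = apply_affine(params, src)
    return sum((u - a) ** 2 + (v - b) ** 2 for (u, v), (a, b) in zip(moved, dst))


def global_search(src, dst, grids):
    """Score every combination of the discretised parameter values."""
    return min(product(*grids), key=lambda p: cost(p, src, dst))


def local_refine(params, src, dst, steps, shrink=0.5, iters=200):
    """Coordinate-wise pattern search that only accepts improving moves."""
    best = list(params)
    best_cost = cost(best, src, dst)
    steps = list(steps)
    for _ in range(iters):
        improved = False
        for i in range(len(best)):
            for delta in (steps[i], -steps[i]):
                trial = best[:]
                trial[i] += delta
                trial_cost = cost(trial, src, dst)
                if trial_cost < best_cost:
                    best, best_cost, improved = trial, trial_cost, True
        if not improved:
            # No move helped at this resolution, so search more finely.
            steps = [s * shrink for s in steps]
    return best, best_cost


# Hypothetical toy data: a unit square and its transformed copy.
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = apply_affine((0.3, 1.2, 2.0, -1.0), src)

# Hypothetical discretisation of the transform space.
grids = [
    [i * math.pi / 8 for i in range(-4, 5)],  # rotation angle
    [0.5, 1.0, 1.5, 2.0],                     # scale
    [-4, -2, 0, 2, 4],                        # x translation
    [-4, -2, 0, 2, 4],                        # y translation
]

coarse = global_search(src, dst, grids)
fine, fine_cost = local_refine(coarse, src, dst, steps=[math.pi / 16, 0.25, 1.0, 1.0])
print("coarse:", coarse, "cost:", cost(coarse, src, dst))
print("fine:", fine, "cost:", fine_cost)
```

The grid stage guards against poor local optima by evaluating the whole discretised space, and the refinement stage then recovers precision that the grid spacing cannot provide. Because the refinement accepts only improving moves, its result is never worse than the best grid cell.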
[1] J. Y. A. Wang and E. H. Adelson. Representing moving images with layers. IEEE Transactions on Image Processing, volume 3, pages 625–638, 1994.
[2] N. Jojic and B. Frey. Learning flexible sprites in video layers. In Proc. of IEEE Conf. on Computer Vision and Pattern Recognition, 2001.
[3] B. Frey and N. Jojic. Fast, large-scale transformation-invariant clustering. In Advances in Neural Information Processing Systems 14, 2001.
[4] M. K. Titsias and C. K. I. Williams. Fast unsupervised greedy learning of multiple objects and parts from video. 2004. To appear in Proc. Generative-Model Based Vision Workshop, Washington DC, USA.
[5] C. K. I. Williams and M. K. Titsias. Greedy learning of multiple objects in images using robust statistics and factorial learning. Neural Computation, 16(5):1039–1062, 2004.
[6] M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul. An introduction to variational methods for graphical models. In M. I. Jordan, editor, Learning in Graphical Models, pages 105–162. Kluwer, 1998.
[7] C. M. Bishop, J. M. Winn, and D. Spiegelhalter. VIBES: A variational inference engine for Bayesian networks. In Advances in Neural Information Processing Systems, volume 15, 2002.
[8] J. M. Winn and C. M. Bishop. Variational Message Passing. 2004. To appear in Journal of Machine Learning Research. Available from http://johnwinn.org.
[9] A. Jepson, D. Fleet, and T. El-Maraghi. Robust online appearance models for visual tracking. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, volume I, pages 415–422, 2001.