
246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives

Source: pdf

Author: Taehwan Kim, Gregory Shakhnarovich, Raquel Urtasun

Abstract: Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulate the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretability, and smoothness constraints that are inherent to human motion. We demonstrate the effectiveness of our approach to learn interpretable representations of human motion from motion capture data, and show that our approach outperforms recently developed matching pursuit and sparse coding algorithms.

reference text

[1] S. Bengio, F. Pereira, Y. Singer, and D. Strelow. Group sparse coding. In NIPS, 2009.

[2] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, Massachusetts, 1999.

[3] A. d'Avella and E. Bizzi. Shared and specific muscle synergies in natural motor behaviors. PNAS, 102(8):3076–3081, 2005.

[4] M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. on Image Processing, 15(12):3736–3745, 2006.

[5] Z. Ghahramani. Building blocks of movement. Nature, 407:682–683, 2000.

[6] R. Jenatton, G. Obozinski, and F. Bach. Structured sparse principal component analysis. In Proc. AISTATS10, 2010.

[7] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In NIPS, 2007.

[8] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In ICML, 2009.

[9] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Non-local sparse models for image restoration. In ICCV, 2009.

[10] J. Mairal, G. Sapiro, and M. Elad. Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Modeling and Simulation, 7(1):214–241, 2008.

[11] S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Proc., 41:3397–3415, 1993.

[12] C. R. Mason, J. E. Gomez, and T. J. Ebner. Hand synergies during reach to grasp. J. of Neurophysiology, 86:2896–2910, 2001.

[13] F. A. Mussa-Ivaldi and E. Bizzi. Motor learning: the combination of primitives. Phil. Trans. Royal Society London, Series B, 355:1755–1769, 2000.

[14] F. A. Mussa-Ivaldi and S. Solla. Neural primitives for motion control. IEEE Journal of Oceanic Engineering, 29(3):640–650, 2004.

[15] E. Todorov and Z. Ghahramani. Analysis of the synergies underlying complex hand manipulation. In Proceedings of Conf. of the IEEE Engineering in Medicine and Biology Society, pages 4637–4640, 2004.

[16] R. Urtasun, D. J. Fleet, A. Geiger, J. Popovic, T. Darrell, and N. D. Lawrence. Topologically-constrained latent variable models. In ICML, 2008.