Learning Nonparametric Models for Probabilistic Imitation (NIPS 2006, paper 112)

Source: pdf

Author: David B. Grimes, Daniel R. Rashid, Rajesh P. Rao

Abstract: Learning by imitation represents an important mechanism for rapid acquisition of new behaviors in humans and robots. A critical requirement for learning by imitation is the ability to handle uncertainty arising from the observation process as well as from the imitator’s own dynamics and interactions with the environment. In this paper, we present a new probabilistic method for inferring imitative actions that takes into account both the observations of the teacher and the imitator’s dynamics. Our key contribution is a nonparametric learning method which generalizes to systems with very different dynamics. Rather than relying on a known forward model of the dynamics, our approach learns a nonparametric forward model via exploration. Leveraging advances in approximate inference in graphical models, we show how the learned forward model can be directly used to plan an imitating sequence. We provide experimental results for two systems: a biomechanical model of the human arm and a 25-degrees-of-freedom humanoid robot. We demonstrate that the proposed method can be used to learn appropriate motor inputs to the model arm which imitates the desired movements. A second set of results demonstrates dynamically stable full-body imitation of a human teacher by the humanoid robot.

reference text

[1] P. Abbeel and A. Y. Ng. Exploration and apprenticeship learning in reinforcement learning. In Proceedings of the Twenty-first International Conference on Machine Learning, 2005.

[2] C. Atkeson and S. Schaal. Robot learning from demonstration. pages 12–20, 1997.

[3] A. Billard and M. Mataric. Learning human arm movements by imitation: Evaluation of a biologically-inspired connectionist architecture. Robotics and Autonomous Systems, (941), 2001.

[4] M. A. Carreira-Perpinan. Mode-finding for mixtures of Gaussian distributions. IEEE Trans. Pattern Anal. Mach. Intell., 22(11):1318–1323, 2000.

[5] J. Demiris and G. Hayes. A robot controller using learning by imitation, 1994.

[6] D. B. Grimes, R. Chalodhorn, and R. P. N. Rao. Dynamic imitation in a humanoid robot through nonparametric probabilistic inference. In Proceedings of Robotics: Science and Systems (RSS’06), Cambridge, MA, 2006. MIT Press.

[7] A. T. Ihler, E. B. Sudderth, W. T. Freeman, and A. S. Willsky. Efficient multiscale sampling from products of Gaussian mixtures. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.

[8] A. J. Ijspeert, J. Nakanishi, and S. Schaal. Trajectory formation for imitation with nonlinear dynamical systems. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 752–757, 2001.

[9] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2):498–519, 2001.

[10] W. Li and E. Todorov. Iterative linear-quadratic regulator design for nonlinear biological movement systems. In Proceedings of the 1st Int. Conf. on Informatics in Control, Automation and Robotics, volume 1, pages 222–229, 2004.

[11] A. N. Meltzoff. Elements of a developmental theory of imitation. pages 19–41, 2002.

[12] A. Y. Ng and S. Russell. Algorithms for inverse reinforcement learning. In Proc. 17th International Conf. on Machine Learning, pages 663–670, 2000.

[13] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

[14] S. Schaal, A. Ijspeert, and A. Billard. Computational approaches to motor learning by imitation. 1431:199–218, 2004.

[15] D. Scott and W. Szewczyk. From kernels to mixtures. Technometrics, 43(3):323–335.

[16] E. B. Sudderth, A. T. Ihler, W. T. Freeman, and A. S. Willsky. Nonparametric belief propagation. In CVPR (1), pages 605–612, 2003.

[17] H.-G. Sung. Gaussian Mixture Regression and Classification. PhD thesis, Rice University, 2004.

[18] Y. Weiss. Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1):1–41, 2000.

[19] M. Y. Kuniyoshi and H. Inoue. Learning by watching: Extracting reusable task knowledge from visual observation of human performance. IEEE Transactions on Robotics and Automation, 10(6):799–822, Dec. 1994.