nips nips2002 nips2002-172 nips2002-172-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Leonid Taycher, John W. Fisher III, Trevor Darrell
Abstract: Accurate representation of articulated motion is a challenging problem for machine perception. Several successful tracking algorithms have been developed that model the human body as an articulated tree. We propose a learning-based method for creating such articulated models from observations of multiple rigid motions. This paper is concerned with recovering the topology of the articulated model when the rigid motion of the constituent segments is known. Our approach is based on finding the maximum likelihood tree-shaped factorization of the joint probability density function (PDF) of the rigid segment motions. The topology of the graphical model formed from this factorization corresponds to the topology of the underlying articulated body. We demonstrate the performance of our algorithm on both synthetic and real motion capture data.
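The tree-shaped factorization described in the abstract can be illustrated with a minimal sketch. It assumes the classic Chow and Liu construction [1]: given pairwise mutual information estimates between segment motions, the maximum likelihood tree is a maximum-weight spanning tree of the mutual information graph. The function name `max_spanning_tree` and the numeric values in `mi` are hypothetical, and the sketch does not cover how mutual information is estimated from rigid motion data.

```python
from itertools import combinations


def max_spanning_tree(weights):
    """Return the edges of a maximum-weight spanning tree (Kruskal's algorithm).

    `weights` is a symmetric square matrix (list of lists) in which
    weights[i][j] is the estimated mutual information between segments i and j.
    """
    n = len(weights)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Consider candidate edges from highest to lowest weight.
    candidates = sorted(
        combinations(range(n), 2),
        key=lambda e: weights[e[0]][e[1]],
        reverse=True,
    )
    tree = []
    for i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:
                break
    return tree


# Hypothetical mutual information values for a three-segment chain S1 - S2 - S3.
mi = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.8],
    [0.2, 0.8, 0.0],
]
edges = max_spanning_tree(mi)
print(edges)
```

In this toy input the weak S1 to S3 dependence is dropped, so the recovered edges link S1 to S2 and S2 to S3, matching a simple kinematic chain.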
[1] C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, IT-14(3):462–467, May 1968.
[2] Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, MA, 1990.
[3] João Paulo Costeira and Takeo Kanade. A multibody factorization method for independently moving objects. International Journal of Computer Vision, 29(3):159–179, 1998.
[4] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., New York, 1991.
[5] Luc Devroye. A Course in Density Estimation, volume 14 of Progress in Probability and Statistics. Birkhäuser, Boston, 1987.
Figure 6.1: Simple kinematic chain topology recovery. The first row shows 3 sample frames from a 100 frame synthetic sequence. The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. The vertices in the graph correspond to the rigid segments labeled in (a). (e) is the recovered articulated topology.
Figure 6.2: Humanoid torso synthetic test. The sample frames from a randomly generated 150 frame sequence are shown in (a), (b), and (c). The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. The vertices in the graph correspond to the rigid segments labeled in (a). (e) is the recovered articulated topology.
Figure 6.3: Motion Capture based test. (a), (b), and (c) are the sample frames from a 220 frame sequence. The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. The vertices in the graph correspond to the rigid segments labeled in (a). (e) is the recovered articulated topology.
[6] David C. Hogg. Model-based vision: A program to see a walking person. Image and Vision Computing, 1(1):5–20, 1983.
[7] Yi-Ping Hung, Cheng-Yuan Tang, Sheng-Wen Shih, Zen Chen, and Wei-Song Lin. A 3d feature-based tracker for tracking multiple moving objects with a controlled binocular head. Technical report, Academia Sinica Institute of Information Science, 1995.
[8] Finn Jensen. An Introduction to Bayesian Networks. Springer, 1996.
[9] N. Jojic and B.J. Frey. Learning flexible sprites in video layers. In Computer Vision and Pattern Recognition, pages I:199–206, 2001.
[10] Ioannis A. Kakadiaris and Dimitri Metaxas. 3d human body acquisition from multiple views. In Proc. Fifth International Conference on Computer Vision, pages 618–623, 1995.
[11] Marina Meila. Learning Mixtures of Trees. PhD thesis, MIT, 1998.
[12] Ivana Mikic, Mohan Trivedi, Edward Hunter, and Pamela Cosman. Articulated body posture estimation from multi-camera voxel data. In Computer Vision and Pattern Recognition, 2001.
[13] Richard M. Murray, Zexiang Li, and S. Shankar Sastry. A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.
[14] J. O’Brien, R. E. Bodenheimer, G. Brostow, and J. K. Hodgins. Automatic joint parameter estimation from magnetic motion capture data. In Graphics Interface’2000, pages 53–60, 2000.
[15] James M. Rehg and Daniel D. Morris. Singularities in articulated object tracking with 2-d and 3-d models. Technical report, Digital Equipment Corporation, 1997.
[16] Hedvig Sidenbladh, Michael J. Black, and David J. Fleet. Stochastic tracking of 3d human figures using 2d image motion. In Proc. European Conference on Computer Vision, 2000.
[17] Yang Song, Luis Goncalves, Enrico Di Bernardo, and Pietro Perona. Monocular perception of biological motion - detection and labeling. In Proc. International Conference on Computer Vision, pages 805–812, 1999.
[18] Ying Wu, Zhengyou Zhang, Thomas S. Huang, and John Y. Lin. Multibody grouping via orthogonal subspace decomposition. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2001.