
143 iccv-2013-Estimating Human Pose with Flowing Puppets

Source: pdf

Author: Silvia Zuffi, Javier Romero, Cordelia Schmid, Michael J. Black

Abstract: We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose over a video sequence is advantageous because poses of people in adjacent frames exhibit properties of smooth variation due to the nature of human and camera motion. To exploit this, previous methods have used prior knowledge about distinctive actions or generic temporal priors combined with static image likelihoods to track people in motion. Here we take a different approach based on a simple observation: Information about how a person moves from frame to frame is present in the optical flow field. We develop an approach for tracking articulated motions that “links” articulated shape models of people in adjacent frames through the dense optical flow. Key to this approach is a 2D shape model of the body that we use to compute how the body moves over time. The resulting “flowing puppets” provide a way of integrating image evidence across frames to improve pose inference. We apply our method on a challenging dataset of TV video sequences and show state-of-the-art performance.
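As a rough illustration of what "linking" a 2D shape model across adjacent frames through dense optical flow could look like, the sketch below moves the contour points of one body part by the flow vectors found at their locations. It is not the authors' implementation: the function name `propagate_points`, the `flow[row][col] = (dx, dy)` layout, and the nearest-pixel lookup are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]               # (x, y) in pixel coordinates
Flow = List[List[Tuple[float, float]]]    # flow[row][col] = (dx, dy), assumed layout


def propagate_points(points: List[Point], flow: Flow) -> List[Point]:
    """Move each point by the flow vector at its nearest pixel.

    Hypothetical helper: nearest-pixel sampling is an assumption made
    for brevity; interpolating the flow would give smoother results.
    """
    height, width = len(flow), len(flow[0])
    moved = []
    for x, y in points:
        # Clamp the lookup position so points outside the image
        # reuse the flow at the closest border pixel.
        col = min(max(int(round(x)), 0), width - 1)
        row = min(max(int(round(y)), 0), height - 1)
        dx, dy = flow[row][col]
        moved.append((x + dx, y + dy))
    return moved


if __name__ == "__main__":
    # Uniform flow: every pixel moves 2 px right and 1 px up.
    uniform_flow = [[(2.0, -1.0)] * 4 for _ in range(4)]
    part_contour = [(1.0, 1.0), (2.0, 3.0)]
    print(propagate_points(part_contour, uniform_flow))
```

The propagated points give a predicted position for the part in the next frame, which could then be compared against a pose hypothesis in that frame.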

reference text

[1] M. Andriluka, S. Roth, and B. Schiele. Pictorial structures revisited: People detection and articulated pose estimation. CVPR, pp. 1014–1021, 2009. 2, 3

[2] S. Baker, D. Scharstein, J. Lewis, S. Roth, M. Black, and R. Szeliski. A database and evaluation methodology for optical flow. IJCV, 92(1):1–31, 2011. 7

[3] P. Buehler, M. Everingham, D. Huttenlocher, and A. Zisserman. Upper body detection and tracking in extended signing. IJCV, 95(2):180–197, 2011. 2

[4] T.-J. Cham and J. Rehg. A multiple hypothesis approach to figure tracking. CVPR, pp. 239–245, 1999. 2, 3

[5] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. CVPR, pp. 886–893, 2005. 4

[6] M. Eichner and V. Ferrari. Better appearance models for pictorial structures. BMVC, pp. 1–11, 2009. 3

[7] M. Eichner, M. Marin-Jimenez, A. Zisserman, and V. Ferrari. 2D articulated human pose estimation and retrieval in (almost) unconstrained still images. IJCV, 99(2):190–214, 2012. 2, 3

[8] P. Felzenszwalb and D. Huttenlocher. Pictorial structures for object recognition. IJCV, 61(1):55–79, 2005. 2, 3

[9] V. Ferrari, M. Marin-Jimenez, and A. Zisserman. Progressive search space reduction for human pose estimation. CVPR, pp. 1–8, 2008. 3, 4

[10] K. Fragkiadaki, H. Hu, and J. Shi. Pose from flow and flow from pose estimation. CVPR, pp. 2059–2066, 2013. 3, 6

[11] O. Freifeld, A. Weiss, S. Zuffi, and M. Black. Contour people: A parametrized model of 2D articulated human shape. CVPR, pp. 639–646, 2010. 3

[12] P. Guan, O. Freifeld, and M. Black. A 2D human body model dressed in eigen clothing. ECCV, pp. I:285–298, 2010. 3

[13] Š. Ivekovič, E. Trucco, and Y. Petillot. Human body pose estimation with particle swarm optimisation. Evol. Comput., 16(4):509–528, 2008. 2, 5

[14] V. John, E. Trucco, and Š. Ivekovič. Markerless human articulated tracking using hierarchical particle swarm optimisation. Image Vision Comput., 28(11):1530–1547, 2010. 5

[15] S. X. Ju, M. J. Black, and Y. Yacoob. Cardboard people: A parameterized model of articulated motion. IEEE Face and Gesture Recog., pp. 38–44, 1996. 2, 3

[16] A. Mittal, A. Zisserman, and P. H. S. Torr. Hand detection using multiple proposals. BMVC, pp. 1–11, 2011. 6

[17] D. Ramanan, D. A. Forsyth, and A. Zisserman. Strike a pose: Tracking people by finding stylized poses. CVPR, pp. 1:271–278, 2005. 2

[18] D. Ramanan, D. A. Forsyth, and A. Zisserman. Tracking people by learning their appearance. PAMI, 29(1):65–81, 2007. 2, 3

[19] B. Sapp, C. Jordan, and B. Taskar. Adaptive pose priors for pictorial structures. CVPR, pp. 422–429, 2010. 2

[20] B. Sapp, D. Weiss, and B. Taskar. Parsing human motion with stretchable models. CVPR, pp. 1281–1288, 2011. 2, 3, 5, 6, 7

[21] C. Sminchisescu and B. Triggs. Estimating articulated human motion with covariance scaled sampling. Int. J. Robot. Res., 22(6):371–391, 2003. 3

[22] L. Xu, J. Jia, and Y. Matsushita. Motion detail preserving optical flow estimation. PAMI, 34(9):1744–1757, 2012. 6

[23] Y. Yang and D. Ramanan. Articulated pose estimation using flexible mixtures of parts. CVPR, pp. 1385–1392, 2011. 2, 6, 7

[24] S. Zuffi, O. Freifeld, and M. Black. From pictorial structures to deformable structures. CVPR, pp. 3546–3553, 2012. 1, 2, 3, 4