iccv iccv2013 iccv2013-225 knowledge-graph by maker-knowledge-mining

225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos


Source: pdf

Author: Taegyu Lim, Seunghoon Hong, Bohyung Han, Joon Hee Han

Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate the pose of the human from videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve the individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization; the segmentation step generates a foreground mask for human pose tracking, and the human pose tracking step provides a foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and present its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate the pose of the human from videos captured by moving cameras. [sent-5, score-0.853]

2 We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and present its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation. [sent-9, score-0.656]

3 Introduction: Foreground/background segmentation and human pose estimation have been studied intensively in recent years, and significant performance improvements have been achieved. [sent-11, score-0.719]

4 Although foreground/background segmentation and human pose estimation are potentially related and complementary, the majority of algorithms attempt to solve the two problems separately and the investigation of a joint estimation technique is not active yet. [sent-14, score-0.818]

5 We introduce an algorithm to address foreground/background segmentation and human pose tracking simultaneously in a video captured by a moving camera. [sent-15, score-0.861]

6 human body area in each frame, and the latter estimates temporally coherent human body configurations. [sent-22, score-0.65]

7 Foreground/background segmentation problem in moving camera environment has been studied actively these days. [sent-24, score-0.381]

8 [21] proposes an algorithm to construct foreground and background appearance models for pixelwise labeling based on a sparse set of motion trajectories. [sent-26, score-0.457]

9 Similarly, a matrix factorization is employed in [6] to decompose a dense set of trajectories into foreground or background via low rank and group sparsity constraints. [sent-27, score-0.388]

10 These labeled trajectories are used to compute pixel-wise motion and appearance models for both foreground and background. [sent-30, score-0.391]

11 In [14, 15], each frame is divided into regular grid blocks, and each block motion is estimated to propagate foreground and background models in a recursive manner. [sent-31, score-0.665]

12 On the other hand, [17, 16] study techniques to extract human body areas from videos through the combination of various algorithms. [sent-32, score-0.376]

13 Several interesting approaches for pose estimation and tracking have been proposed. [sent-33, score-0.512]

14 Pictorial structure [10] is a widely adopted model for human pose estimation, which computes the Maximum A Posteriori (MAP) estimate of body configuration efficiently by dynamic programming. [sent-34, score-0.696]

15 A variation of the pictorial structure model is introduced in [19], where part-specific appearance models are learned and the pose of an articulated body is estimated by message passing in a Conditional Random Field (CRF). [sent-35, score-0.727]

16 [1] present a discriminatively trained appearance model and a flexible kinematic tree prior on the configurations of body parts. [sent-38, score-0.416]

17 Most of these approaches assume that foreground/background segmentation is given, or do not utilize the segmentation labels at all. [sent-40, score-0.564]

18 There are several prior studies related to the joint formulation of foreground/background segmentation and pose estimation. [sent-41, score-0.668]

19 ObjCut [13] tackles a joint problem of segmentation and pose estimation in an image, where shape model is obtained from pose and segmentation is estimated given the shape model, and a similar approach is proposed by [5]. [sent-42, score-1.424]

20 [12] introduce PoseCut for simultaneous 3D pose tracking and segmentation. [sent-44, score-0.454]

21 However, its results may be sensitive to initial pose estimation and background clutter due to weak foreground/background appearance models. [sent-45, score-0.517]

22 [4] couple pose estimation and contour extraction problems in a multi-camera environment. [sent-47, score-0.371]

23 In [22], the multi-level inference framework for pose estimation and segmentation from a single image is proposed. [sent-48, score-0.637]

24 Recently, a technique for segmentation and pose estimation of human is presented in [8], where foreground area is first separated by grab-cut [20] given a bounding box, and human pose is estimated based on the foreground region. [sent-49, score-1.697]

25 We propose a unified probabilistic framework for foreground/background segmentation and pose tracking in videos captured by moving cameras. [sent-51, score-0.86]

26 Our algorithm is an iterative approach that combines foreground/background segmentation and pose tracking. [sent-52, score-0.626]

27 In each iteration of our algorithm, segmentation module propagates foreground and background models, and provides pose tracking module with foreground mask using the estimated labels. [sent-53, score-1.539]

28 In pose tracking module, the configuration of each body part is estimated by multiple part detectors with label constraint, and gives shape prior represented by probabilistic foreground response map back to segmentation module. [sent-54, score-1.766]

29 The refined segmentation result and the estimated pose configuration are utilized to update foreground/background motion and shape prior in the next iteration, respectively. [sent-55, score-0.856]

30 Our joint human segmentation and pose tracking algorithm has the following contributions and characteristics: • We formulate a probabilistic framework of a joint and iterative optimization procedure for foreground/background segmentation and pose tracking. [sent-57, score-1.54]

31 We propose an online algorithm based on a recursive foreground/background appearance modeling and sequential Bayesian filtering for pose tracking. [sent-58, score-0.366]

32 Our algorithm is applied to natural videos and improves both segmentation and pose tracking performance significantly. [sent-59, score-0.771]

33 Foreground/background segmentation in a moving camera environment is discussed in Section 3, and our pose estimation technique is presented in Section 4. [sent-62, score-0.752]

34 Objective and Main Formulation: Our goal is to perform foreground/background segmentation and human pose tracking jointly and sequentially in a video captured by a moving camera. [sent-66, score-0.861]

35 , xm,t} is composed of a set of pose parameters for individual body parts, where m is the number of parts1 . [sent-72, score-0.583]

36 The pose of each body part, xi,t, is represented by location, orientation and scale information. [sent-73, score-0.556]

37 Figure 2: Overview of our algorithm, which is composed of two modules: foreground/background segmentation and pose tracking. [sent-85, score-0.579]

38 Segmentation is refined with shape information given by pose tracking while pose tracking utilizes segmentation mask. [sent-86, score-1.238]

39 mentation and pose tracking, and solve the following energy minimization problem: $\min_{L_t, X_t, Y_t} E_{seg}(L_t, Y_t, I_t) + E_{pose}(X_t, L_t, I_t)$ (2), where $E_{seg}(L_t, Y_t, I_t)$ and $E_{pose}(X_t, L_t, I_t)$ denote energy functions for segmentation and pose tracking, respectively. [sent-88, score-0.892]

40 Yt denotes a foreground response map for human body area generated from the pose variable Xt. [sent-89, score-1.102]

41 To this end, we employ a slightly modified version of [15]; the spatial model composition step is removed, but pose tracking feedback is taken into account for the joint formulation. [sent-98, score-0.52]

42 The segmentation labels, Lt∗, are obtained efficiently by graph-cut algorithm [2, 3] based on p(Lt |Yt, It), which depends on the two terms—observation likelihood and segmentation prior given pose. [sent-100, score-0.667]

43 Estimation of Observation Likelihood: We obtain the observation likelihood p(It |Lt) from the probabilistic models of foreground and background appearances. [sent-105, score-0.52]

44 We divide a frame into N regular grid blocks2 and construct foreground and background models in each block by kernel density estimation. [sent-106, score-0.499]

45 Suppose that foreground model ϕfk,t−1 and background model ϕbk,t−1 for the k-th block Btk−1 at time t − 1 are already given, where {y1ξ,t−1 , . [sent-107, score-0.458]

46 The foreground and background likelihoods of an observed pixel zt−1 are respectively given by ξ p(zt−1 |ϕkf,t−1) = αU(zt−1) +1 −nf αi? [sent-111, score-0.428]
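The visible part of the likelihood above mixes a uniform term U(z), weighted by α, with a kernel density estimate over the samples stored in a block model. A minimal sketch of that mixture follows, assuming scalar pixel values and a Gaussian kernel; the function name, bandwidth, α value, and value range are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Sequence


def kde_likelihood(z: float, samples: Sequence[float], alpha: float = 0.1,
                   bandwidth: float = 10.0, value_range: float = 256.0) -> float:
    """Likelihood of pixel value z under one block model (illustrative).

    Mixture of a uniform density (weight alpha) and a Gaussian kernel
    density estimate over the stored samples (weight 1 - alpha).
    """
    uniform = 1.0 / value_range
    if not samples:
        # No samples stored yet: fall back to the uniform density alone.
        return uniform
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    kernel_sum = sum(norm * math.exp(-0.5 * ((z - y) / bandwidth) ** 2)
                     for y in samples)
    return alpha * uniform + (1.0 - alpha) * kernel_sum / len(samples)
```

The uniform term keeps the likelihood strictly positive for values far from every stored sample, which is one common reason for including such a term.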

47 To construct foreground and background models at time t based on the earlier ones, we compute foreground and background motion vectors in each block, and propagate models from the previous frame using the block motions. [sent-114, score-0.962]

48 If the motion observation in a block is insufficient due to occlusion, background block motion is estimated by the average motion of adjacent blocks and foreground block motion is set to zero. [sent-117, score-1.059]
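The fallback rule for blocks with insufficient motion observations can be sketched as follows. This is an illustration only: motion vectors are stored in a 2D grid with `None` marking an insufficient observation, and "adjacent blocks" is taken to mean the four-connected block neighbors, which is an assumption of this sketch.

```python
from typing import List, Optional, Tuple

Vec = Tuple[float, float]


def fill_background_motion(motion: List[List[Optional[Vec]]]) -> List[List[Vec]]:
    """Replace missing background block motions by the neighbor average."""
    rows, cols = len(motion), len(motion[0])
    filled: List[List[Vec]] = []
    for r in range(rows):
        out_row: List[Vec] = []
        for c in range(cols):
            v = motion[r][c]
            if v is None:
                # Collect observed motions of the four-connected neighbors.
                nbrs = [motion[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < rows and 0 <= cc < cols
                        and motion[rr][cc] is not None]
                if nbrs:
                    v = (sum(n[0] for n in nbrs) / len(nbrs),
                         sum(n[1] for n in nbrs) / len(nbrs))
                else:
                    v = (0.0, 0.0)  # no observed neighbor: assume no motion
            out_row.append(v)
        filled.append(out_row)
    return filled


def fill_foreground_motion(motion: List[List[Optional[Vec]]]) -> List[List[Vec]]:
    """Foreground block motion is set to zero where it is missing."""
    return [[v if v is not None else (0.0, 0.0) for v in row] for row in motion]
```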

49 Note that χξk,t depends on the segmentation labels and is updated in each iteration due to potential label changes of the pixels within the block. [sent-118, score-0.39]

50 Through the iterative model propagation with respect to backward block motion Vξk,t, the likelihood of an observed (Footnote 2: The size of each block is 24 × 24 in our experiment.) [sent-119, score-0.433]

51 Prior of Segmentation Given Pose: For the alternating procedure between segmentation and pose tracking, feedback from pose tracking needs to be incorporated for label estimation. [sent-139, score-1.083]

52 In this work, pixel-wise foreground response map Yt plays this role, and the prior of segmentation given human pose introduced in Eq. [sent-140, score-1.138]

53 y where spatial smoothness term defines the relationship between adjacent pixels and pose consistency term corresponds to the coherency between the labels from segmentation and pose tracking. [sent-169, score-0.956]

54 On the other hand, the pixel-wise foreground response map, Yt, is estimated based on the response maps of individual (Footnote 3: We used the standard four-neighborhood system.) [sent-187, score-0.58]

55 (b) Detector response of each body part (head, torso, upper and lower legs, upper and lower arms). [sent-190, score-0.473]

56 (d) Foreground response map is generated by the sum of the marginalized response and the segmentation mask in the previous iteration. [sent-192, score-0.66]

57 body parts obtained from pose tracking as well as the segmentation mask inferred in the previous iteration. [sent-193, score-1.061]

58 To construct the foreground response map, we marginalize the responses of all body parts in a 2D image space, and then normalize the marginalized responses with the maximum value. [sent-194, score-0.822]
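The construction described above, combining the part responses in image space and normalizing by the maximum value, can be sketched as below. The sketch assumes each body part already has a 2D response map (orientation and scale marginalized out) and combines them by a pixel-wise sum; the function name and the list-of-lists representation are illustrative assumptions.

```python
from typing import List

Map2D = List[List[float]]


def foreground_response(part_maps: List[Map2D]) -> Map2D:
    """Sum per-part response maps pixel-wise and normalize by the peak."""
    height, width = len(part_maps[0]), len(part_maps[0][0])
    total = [[sum(m[r][c] for m in part_maps) for c in range(width)]
             for r in range(height)]
    peak = max(max(row) for row in total)
    if peak <= 0.0:
        # No part responded anywhere: return an all-zero map.
        return [[0.0] * width for _ in range(height)]
    return [[v / peak for v in row] for row in total]
```

After normalization the strongest pixel has response 1.0, so the map can be read as a soft foreground indicator in [0, 1] when the inputs are non-negative.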

59 Also, the label likelihood given the foreground response, p(? [sent-195, score-0.384]

60 Figure 3 visualizes the construction of the foreground response map. [sent-201, score-0.393]

61 Pose Tracking: Pose tracking sequentially estimates p(Xt |Lt, I1:t), the posterior distribution over the current human body configuration Xt at time t given the segmentation result Lt and all image evidence I1:t. [sent-203, score-0.826]

62 The posterior probability is factorized by Bayesian filtering as p(Xt |Lt, I1:t) ∝ p(Lt, It |Xt)p(Xt |I1:t−1), (15) where p(Lt, It |Xt) is the likelihood of image evidence and segmentation given a particular body part configurations, and p(Xt |I1:t−1) is the prior of Xt. [sent-204, score-0.735]
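The factorization in Eq. (15) is the usual Bayesian filtering update: the posterior is proportional to the likelihood times the prediction. A toy sketch over a discrete set of candidate configurations follows; the discretization and the function name are assumptions made only for illustration.

```python
from typing import List, Sequence


def bayes_update(prediction: Sequence[float],
                 likelihood: Sequence[float]) -> List[float]:
    """Posterior over discrete states: normalize(likelihood * prediction)."""
    unnormalized = [p * l for p, l in zip(prediction, likelihood)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("likelihood and prediction have no common support")
    return [u / total for u in unnormalized]
```

For example, with a flat prediction over two candidate configurations, the posterior simply follows the likelihood after normalization.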

63 The configurations of body parts are estimated by tracking-by-detection paradigm, in which the responses of part detectors serve as observation for tracking. [sent-205, score-0.501]

64 We restrict search space for each body part to the intersection of foreground area and predicted region corresponding to each part area. [sent-206, score-0.665]
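Restricting the search space to the intersection of the foreground area and the predicted region can be sketched as a simple masking of the detector response. The boolean-mask representation and the function name are assumptions of this sketch.

```python
from typing import List


def restrict_response(response: List[List[float]],
                      foreground: List[List[bool]],
                      predicted: List[List[bool]]) -> List[List[float]]:
    """Keep detector responses only where both masks are set."""
    return [[value if (fg and pr) else 0.0
             for value, fg, pr in zip(resp_row, fg_row, pr_row)]
            for resp_row, fg_row, pr_row in zip(response, foreground, predicted)]
```

A strong response on background clutter is zeroed out whenever the pixel lies outside the foreground mask, which is how the restriction suppresses false positives.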

65 The restricted search region significantly reduces ambiguities and false positive detections caused by the features similar to human body part in background clutter. [sent-207, score-0.511]

66 (a) Result of pose estimation at frame 125 in Skating sequence. [sent-210, score-0.412]

67 The shape likelihood of the part xi,t is computed by the convolution of shape model filters and local foreground edge map with respect to xi,t, which is given by p(It|xi,t,Lt)−( ∝1 e −xp ηi()−? [sent-227, score-0.578]

68 (17) and (19), are used to construct foreground response map Yt. [sent-237, score-0.429]

69 Pose Prediction The prediction p(Xt |I1:t−1) is composed of a spatial prior on the relative position between parts and a temporal prior of an individual body part as p(Xt|I1:t−1) ∝ ? [sent-240, score-0.462]

70 upper arms must be connected to torso), p(xi,t |xi,t−1) denotes the temporal prior for an individual part, and p(Xt−1 |I1:t−1) is the posterior of the pose parameters at the previous time step t − 1. [sent-245, score-0.537]

71 The spatial prior on body part configurations is based on a tree structure and represents the kinematic dependencies between body parts. [sent-246, score-0.688]

72 To deal with various changes of pose and appearance in image, we adopt the spatial prior model by discrete binning [19], which is given by (i? [sent-247, score-0.42]

73 Algorithm 1: Joint segmentation and pose tracking. Input: It, ϕξk,t−1, Xt−1, ρis,t−1. Output: Lt, ϕξk,t, Xt, ρis,t. iterate Model Update: FG/BG model update (Eq. [sent-252, score-0.72]

74 Model Updates: After the iterative procedure at each frame, we obtain foreground/background labels and human body configuration. [sent-255, score-0.404]

75 To propagate the labels and pose parameters accurately, foreground/background models and specific shape model should be updated in each frame based on the converged segmentation labels. [sent-256, score-0.809]

76 The foreground and background models are recursively updated using the propagated models from the previous time step t − 1 and the observations in the current time step t, which are given by p(zt| ϕˆξk,t) = τseg· p(zt| ϕ˜kξ,t) +1 −nξ τsegi? [sent-257, score-0.397]

77 The specific shape model of each body part is also updated incrementally based on the local foreground edge map at the current time step t, which is given by ρˆsi,t = τpose · ˆρ is,t−1 + (1 − τpose) · f(It; xi,t, Lt), (24) where τpose is a forgetting factor for specific shape model. [sent-259, score-0.865]
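Eq. (24) is an exponential forgetting update: the previous shape model is weighted by τpose and the current observation f(It; xi,t, Lt) by 1 − τpose. A minimal sketch follows, with the shape model flattened to a list of floats and a default forgetting factor that is an illustrative assumption, not a value from the paper.

```python
from typing import List, Sequence


def update_shape_model(previous: Sequence[float], observed: Sequence[float],
                       tau_pose: float = 0.9) -> List[float]:
    """Blend the previous shape model with the current edge-map observation."""
    if len(previous) != len(observed):
        raise ValueError("model and observation must have the same size")
    return [tau_pose * p + (1.0 - tau_pose) * o
            for p, o in zip(previous, observed)]
```

A forgetting factor close to 1 makes the model change slowly, while a smaller value lets it follow the current frame more closely.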

78 The pseudo code of overall joint segmentation and pose tracking algorithm is described in Algorithm 1. [sent-260, score-0.761]
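The overall alternation between the two modules can be sketched as a generic loop. This is not the paper's Algorithm 1: the callables, the toy 1D stand-ins, and the stopping rule (labels unchanged between two iterations) are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Labels = List[int]
Response = List[float]


def alternate_until_stable(segment: Callable[[Response], Labels],
                           track: Callable[[Labels], float],
                           make_response: Callable[[float, Labels], Response],
                           init_response: Response,
                           max_iters: int = 10) -> Tuple[Labels, float, int]:
    """Alternate segmentation and tracking until the labels stop changing."""
    response = init_response
    prev_labels: Optional[Labels] = None
    pose = 0.0
    for iteration in range(1, max_iters + 1):
        labels = segment(response)             # segmentation uses pose feedback
        pose = track(labels)                   # tracking uses the foreground mask
        response = make_response(pose, labels)
        if labels == prev_labels:
            return labels, pose, iteration
        prev_labels = labels
    return prev_labels or [], pose, max_iters


# Toy 1D stand-ins (hypothetical) used only to exercise the loop.
def toy_segment(response: Response) -> Labels:
    return [1 if r > 0.5 else 0 for r in response]


def toy_track(labels: Labels) -> float:
    idx = [i for i, label in enumerate(labels) if label]
    return sum(idx) / len(idx) if idx else 0.0


def toy_response(pose: float, labels: Labels) -> Response:
    return [1.0 if abs(i - pose) <= 1 else 0.0 for i in range(len(labels))]
```

Passing the modules as callables keeps the loop independent of how segmentation and tracking are implemented.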

79 Our results are compared with existing methods for foreground/background segmentation and pose estimation such as [15, 8, 19]. [sent-263, score-0.637]

80 Since the proposed technique combines segmentation and pose estimation, the two subproblems are evaluated separately. [sent-264, score-0.62]

81 All the sequences involve various pose changes and substantial camera motions. [sent-268, score-0.374]

82 On the other hand, pose estimation is evaluated by the Percentage of Correctly estimated body Parts (PCP) [7]. [sent-271, score-0.653]

83 We compute PCP values for individual body parts, and the performance of entire human body is estimated based on the average of the PCP measures of all body parts. [sent-273, score-0.877]
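A sketch of a PCP computation is given below. It assumes the commonly used endpoint criterion, in which a part counts as correct when both of its predicted endpoints lie within a fraction (the PCP threshold) of the ground-truth part length from the corresponding ground-truth endpoints. Whether the paper uses exactly this variant is not stated in the text above, and the names are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Part = Tuple[Point, Point]  # a body part as a pair of endpoints


def part_is_correct(pred: Part, gt: Part, threshold: float = 0.5) -> bool:
    """Endpoint criterion: both endpoints within threshold * part length."""
    limit = threshold * math.dist(gt[0], gt[1])
    return (math.dist(pred[0], gt[0]) <= limit
            and math.dist(pred[1], gt[1]) <= limit)


def pcp(preds: Sequence[Part], gts: Sequence[Part],
        threshold: float = 0.5) -> float:
    """Fraction of parts whose prediction satisfies the endpoint criterion."""
    if not gts:
        raise ValueError("no ground-truth parts given")
    correct = sum(part_is_correct(p, g, threshold) for p, g in zip(preds, gts))
    return correct / len(gts)
```

Sweeping `threshold` over a range of values and plotting the resulting scores gives a PCP curve.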

84 Results: We present our foreground/background segmentation and human pose tracking results in Figure 5. [sent-277, score-0.802]

85 The experimental results illustrate that our algorithm produces accurate and robust outputs in the presence of background clutter, significant pose variations, fast camera motions, occlusions, scale changes, and so on. [sent-278, score-0.432]

86 To demonstrate the effectiveness of our joint estimation algorithm, we first compare our foreground/background segmentation algorithm with other methods such as progressive pruning [8], motion segmentation, and our algorithm with segmentation only. [sent-280, score-0.699]

87 Then, our pose tracking algorithm is also compared with progressive pruning [8], iterative learning algorithm with our foreground/background segmentation [19], and our algorithm with pose tracking only. [sent-281, score-1.25]

88 As illustrated in Figure 6, our joint estimation algorithm performs better than all other methods significantly and is robust to the background features similar to human body parts. [sent-282, score-0.515]

89 The quantitative performance of the foreground/background segmentation algorithms is summarized in Figure 7, where our algorithm is compared with a simple motion segmentation, our algorithm with segmentation only, and an existing technique [8]. [sent-283, score-0.628]

90 The quantitative performance of human pose estimation is evaluated based on the PCP-curves, which are presented in Figure 8. [sent-285, score-0.481]

91 Conclusion: We presented a unified probabilistic framework to perform foreground/background segmentation and human pose tracking jointly in an on-line manner. [sent-304, score-0.832]

92 The proposed algorithm presents outstanding segmentation and pose estimation performance through mutual interactions between the two complementary subsystems; they alternate with each other and improve the quality of the solution in each iteration. [sent-305, score-0.662]

93 We showed the robustness of our foreground/background segmentation and pose tracking algorithms to background clutter, pose changes, object scale changes, and illumination variations through qualitative and quantitative validation in real videos. [sent-306, score-1.152]

94 [Figure 8 panels: PCP evaluation on the (a) Skating, (b) Pitching, (c) Jumping, and (d) Dunk sequences; x-axis: PCP threshold.] [sent-404, score-0.376]

95 [Figure 8 panel: PCP evaluation on the (e) Javelin sequence.] Figure 8: Quantitative performance evaluation results of pose estimation by PCP curves. [sent-411, score-0.439]

96 Hybrid body representation for integrated pose recognition, localization and segmentation. [sent-442, score-0.556]

97 2d articulated human pose estimation and retrieval in (almost) unconstrained still images. [sent-462, score-0.498]

98 Simultaneous segmentation and pose estimation of humans using dynamic graph cuts. [sent-488, score-0.637]

99 Modeling and segmentation of floating foreground and background in videos. [sent-510, score-0.629]

100 Multi-level inference by relaxed dual decomposition for human pose segmentation. [sent-550, score-0.395]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('lt', 0.33), ('pose', 0.313), ('foreground', 0.272), ('segmentation', 0.266), ('yt', 0.252), ('body', 0.243), ('pcp', 0.235), ('xt', 0.198), ('zt', 0.185), ('eseg', 0.159), ('skating', 0.143), ('tracking', 0.141), ('dunk', 0.132), ('epose', 0.132), ('response', 0.121), ('pitching', 0.117), ('javelin', 0.109), ('btk', 0.106), ('block', 0.095), ('background', 0.091), ('likelihood', 0.087), ('jumping', 0.083), ('human', 0.082), ('eichner', 0.072), ('sequence', 0.068), ('motion', 0.068), ('shape', 0.064), ('pictorial', 0.061), ('kinematic', 0.059), ('moving', 0.059), ('marginalized', 0.059), ('configuration', 0.058), ('estimation', 0.058), ('mask', 0.057), ('part', 0.055), ('riegion', 0.053), ('vti', 0.053), ('videos', 0.051), ('samsung', 0.051), ('arms', 0.051), ('prior', 0.048), ('iterative', 0.047), ('korea', 0.047), ('btj', 0.047), ('postech', 0.047), ('articulated', 0.045), ('module', 0.044), ('iti', 0.043), ('forgetting', 0.043), ('responses', 0.043), ('frame', 0.041), ('joint', 0.041), ('parts', 0.041), ('subproblems', 0.041), ('backward', 0.041), ('observation', 0.04), ('configurations', 0.04), ('region', 0.04), ('likelihoods', 0.039), ('estimated', 0.039), ('han', 0.039), ('torso', 0.039), ('electronics', 0.037), ('subtraction', 0.037), ('map', 0.036), ('seg', 0.036), ('lti', 0.036), ('posterior', 0.036), ('denotes', 0.035), ('updated', 0.034), ('changes', 0.033), ('modules', 0.032), ('labels', 0.032), ('adjacent', 0.032), ('propagate', 0.032), ('niebles', 0.031), ('mixture', 0.031), ('probabilistic', 0.03), ('clutter', 0.029), ('progressive', 0.029), ('environment', 0.028), ('quantitative', 0.028), ('blocks', 0.028), ('camera', 0.028), ('sheikh', 0.027), ('upper', 0.027), ('individual', 0.027), ('specific', 0.027), ('recursive', 0.027), ('andriluka', 0.027), ('pixel', 0.026), ('qualitatively', 0.026), ('appearance', 0.026), ('threshold', 0.026), ('legs', 0.025), ('trajectories', 0.025), ('feedback', 0.025), ('ig', 0.025), ('alternate', 0.025), ('label', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

Author: Taegyu Lim, Seunghoon Hong, Bohyung Han, Joon Hee Han

Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate the pose of the human from videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve the individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization; the segmentation step generates a foreground mask for human pose tracking, and the human pose tracking step provides a foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and present its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation.

2 0.26789236 143 iccv-2013-Estimating Human Pose with Flowing Puppets

Author: Silvia Zuffi, Javier Romero, Cordelia Schmid, Michael J. Black

Abstract: We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose over a video sequence is advantageous because poses of people in adjacent frames exhibit properties of smooth variation due to the nature of human and camera motion. To exploit this, previous methods have used prior knowledge about distinctive actions or generic temporal priors combined with static image likelihoods to track people in motion. Here we take a different approach based on a simple observation: Information about how a person moves from frame to frame is present in the optical flow field. We develop an approach for tracking articulated motions that “links” articulated shape models of people in adjacent frames through the dense optical flow. Key to this approach is a 2D shape model of the body that we use to compute how the body moves over time. The resulting “flowing puppets” provide a way of integrating image evidence across frames to improve pose inference. We apply our method on a challenging dataset of TV video sequences and show state-of-the-art performance.

3 0.24317393 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation

Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele

Abstract: Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the “Leeds Sports Poses” and “Parse” benchmarks.

4 0.22321771 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

Author: Stefan Duffner, Christophe Garcia

Abstract: In this paper, we present a novel algorithm for fast tracking of generic objects in videos. The algorithm uses two components: a detector that makes use of the generalised Hough transform with pixel-based descriptors, and a probabilistic segmentation method based on global models for foreground and background. These components are used for tracking in a combined way, and they adapt each other in a co-training manner. Through effective model adaptation and segmentation, the algorithm is able to track objects that undergo rigid and non-rigid deformations and considerable shape and appearance variations. The proposed tracking method has been thoroughly evaluated on challenging standard videos, and outperforms state-of-the-art tracking methods designed for the same task. Finally, the proposed models allow for an extremely efficient implementation, and thus tracking is very fast.

5 0.19640414 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies

Author: Karteek Alahari, Guillaume Seguin, Josef Sivic, Ivan Laptev

Abstract: We seek to obtain a pixel-wise segmentation and pose estimation of multiple people in a stereoscopic video. This involves challenges such as dealing with unconstrained stereoscopic video, non-stationary cameras, and complex indoor and outdoor dynamic scenes. The contributions of our work are two-fold: First, we develop a segmentation model incorporating person detection, pose estimation, as well as colour, motion, and disparity cues. Our new model explicitly represents depth ordering and occlusion. Second, we introduce a stereoscopic dataset with frames extracted from feature-length movies “StreetDance 3D” and “Pina”. The dataset contains 2727 realistic stereo pairs and includes annotation of human poses, person bounding boxes, and pixel-wise segmentations for hundreds of people. The dataset is composed of indoor and outdoor scenes depicting multiple people with frequent occlusions. We demonstrate results on our new challenging dataset, as well as on the H2view dataset from (Sheasby et al. ACCV 2012).

6 0.19057643 78 iccv-2013-Coherent Motion Segmentation in Moving Camera Videos Using Optical Flow Orientations

7 0.18172282 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data

8 0.17930457 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

9 0.17408872 273 iccv-2013-Monocular Image 3D Human Pose Estimation under Self-Occlusion

10 0.16817485 341 iccv-2013-Real-Time Body Tracking with One Depth Camera and Inertial Sensors

11 0.16542986 218 iccv-2013-Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data

12 0.15996356 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency

13 0.15977986 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

14 0.15713474 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?

15 0.15651159 282 iccv-2013-Multi-view Object Segmentation in Space and Time

16 0.15123172 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

17 0.15037893 429 iccv-2013-Tree Shape Priors with Connectivity Constraints Using Convex Relaxation on General Graphs

18 0.14935492 24 iccv-2013-A Non-parametric Bayesian Network Prior of Human Pose

19 0.14616328 386 iccv-2013-Sequential Bayesian Model Update under Structured Scene Prior for Semantic Road Scenes Labeling

20 0.14592196 379 iccv-2013-Semantic Segmentation without Annotating Segments


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.281), (1, -0.088), (2, 0.08), (3, 0.104), (4, 0.11), (5, -0.083), (6, -0.097), (7, 0.185), (8, -0.03), (9, 0.143), (10, 0.021), (11, 0.035), (12, -0.103), (13, -0.104), (14, -0.146), (15, 0.141), (16, -0.013), (17, -0.129), (18, -0.046), (19, -0.011), (20, 0.217), (21, -0.019), (22, 0.07), (23, -0.004), (24, -0.041), (25, 0.06), (26, 0.009), (27, 0.01), (28, 0.081), (29, 0.014), (30, -0.009), (31, 0.035), (32, 0.113), (33, -0.049), (34, 0.016), (35, 0.028), (36, -0.078), (37, 0.092), (38, 0.006), (39, 0.071), (40, -0.008), (41, 0.045), (42, -0.021), (43, 0.059), (44, 0.052), (45, 0.007), (46, -0.032), (47, -0.037), (48, 0.05), (49, -0.081)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9881072 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

Author: Taegyu Lim, Seunghoon Hong, Bohyung Han, Joon Hee Han

Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate the pose of the human from videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve the individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization; the segmentation step generates a foreground mask for human pose tracking, and the human pose tracking step provides a foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and present its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation.

2 0.78420961 143 iccv-2013-Estimating Human Pose with Flowing Puppets

Author: Silvia Zuffi, Javier Romero, Cordelia Schmid, Michael J. Black

Abstract: We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose over a video sequence is advantageous because poses of people in adjacent frames exhibit properties of smooth variation due to the nature of human and camera motion. To exploit this, previous methods have used prior knowledge about distinctive actions or generic temporal priors combined with static image likelihoods to track people in motion. Here we take a different approach based on a simple observation: Information about how a person moves from frame to frame is present in the optical flow field. We develop an approach for tracking articulated motions that “links” articulated shape models of people in adjacent frames through the dense optical flow. Key to this approach is a 2D shape model of the body that we use to compute how the body moves over time. The resulting “flowing puppets” provide a way of integrating image evidence across frames to improve pose inference. We apply our method on a challenging dataset of TV video sequences and show state-of-the-art performance.

3 0.72395438 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation

Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele

Abstract: Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state-of-the-art in articulated pose estimation in two ways. First we explore various types of appearance representations aiming to substantially improve the body part hypotheses. And second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structure spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the “Leeds Sports Poses” and “Parse” benchmarks.

4 0.70539534 273 iccv-2013-Monocular Image 3D Human Pose Estimation under Self-Occlusion

Author: Ibrahim Radwan, Abhinav Dhall, Roland Goecke

Abstract: In this paper, an automatic approach for 3D pose reconstruction from a single image is proposed. The presence of human body articulation, hallucinated parts and cluttered background leads to ambiguity during the pose inference, which makes the problem non-trivial. Researchers have explored various methods based on motion and shading in order to reduce the ambiguity and reconstruct the 3D pose. The key idea of our algorithm is to impose both kinematic and orientation constraints. The former is imposed by projecting a 3D model onto the input image and pruning the parts, which are incompatible with the anthropomorphism. The latter is applied by creating synthetic views via regressing the input view to multiple oriented views. After applying the constraints, the 3D model is projected onto the initial and synthetic views, which further reduces the ambiguity. Finally, we borrow the direction of the unambiguous parts from the synthetic views to the initial one, which results in the 3D pose. Quantitative experiments are performed on the HumanEva-I dataset and qualitatively on unconstrained images from the Image Parse dataset. The results show the robustness of the proposed approach to accurately reconstruct the 3D pose from a single image.

5 0.70182645 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?

Author: Elisabeta Marinoiu, Dragos Papava, Cristian Sminchisescu

Abstract: Human motion analysis in images and video is a central computer vision problem. Yet, there are no studies that reveal how humans perceive other people in images and how accurate they are. In this paper we aim to unveil some of the processing–as well as the levels of accuracy–involved in the 3D perception of people from images by assessing the human performance. Our contributions are: (1) the construction of an experimental apparatus that relates perception and measurement, in particular the visual and kinematic performance with respect to 3D ground truth when the human subject is presented an image of a person in a given pose; (2) the creation of a dataset containing images, articulated 2D and 3D pose ground truth, as well as synchronized eye movement recordings of human subjects, shown a variety of human body configurations, both easy and difficult, as well as their ‘re-enacted’ 3D poses; (3) quantitative analysis revealing the human performance in 3D pose reenactment tasks, the degree of stability in the visual fixation patterns of human subjects, and the way it correlates with different poses. We also discuss the implications of our findings for the construction of visual human sensing systems.

6 0.68652904 322 iccv-2013-Pose Estimation and Segmentation of People in 3D Movies

7 0.67292166 330 iccv-2013-Proportion Priors for Image Sequence Segmentation

8 0.66023648 218 iccv-2013-Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data

9 0.65991896 8 iccv-2013-A Deformable Mixture Parsing Model with Parselets

10 0.6457845 118 iccv-2013-Discovering Object Functionality

11 0.64490801 291 iccv-2013-No Matter Where You Are: Flexible Graph-Guided Multi-task Learning for Multi-view Head Pose Classification under Target Motion

12 0.6424346 318 iccv-2013-PixelTrack: A Fast Adaptive Algorithm for Tracking Non-rigid Objects

13 0.64029509 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses

14 0.6351462 24 iccv-2013-A Non-parametric Bayesian Network Prior of Human Pose

15 0.6135025 344 iccv-2013-Recognising Human-Object Interaction via Exemplar Based Modelling

16 0.6129998 341 iccv-2013-Real-Time Body Tracking with One Depth Camera and Inertial Sensors

17 0.61158025 46 iccv-2013-Allocentric Pose Estimation

18 0.60928935 63 iccv-2013-Bounded Labeling Function for Global Segmentation of Multi-part Objects with Geometric Constraints

19 0.59825093 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency

20 0.59781873 366 iccv-2013-STAR3D: Simultaneous Tracking and Reconstruction of 3D Objects Using RGB-D Data


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.055), (7, 0.02), (13, 0.01), (26, 0.126), (31, 0.034), (35, 0.026), (42, 0.092), (44, 0.237), (64, 0.079), (73, 0.041), (89, 0.182), (98, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.83840597 225 iccv-2013-Joint Segmentation and Pose Tracking of Human in Natural Videos

Author: Taegyu Lim, Seunghoon Hong, Bohyung Han, Joon Hee Han

Abstract: We propose an on-line algorithm to extract a human by foreground/background segmentation and estimate pose of the human from the videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization; segmentation step generates foreground mask for human pose tracking, and human pose tracking step provides foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively in real videos involving various challenges, and present its outstanding performance compared to the state-of-the-art techniques for segmentation and pose estimation.

2 0.83290231 303 iccv-2013-Orderless Tracking through Model-Averaged Posterior Estimation

Author: Seunghoon Hong, Suha Kwak, Bohyung Han

Abstract: We propose a novel offline tracking algorithm based on model-averaged posterior estimation through patch matching across frames. Contrary to existing online and offline tracking methods, our algorithm is not based on temporally ordered estimates of target state but attempts to select easy-to-track frames first out of the remaining ones without exploiting temporal coherency of target. The posterior of the selected frame is estimated by propagating densities from the already tracked frames in a recursive manner. The density propagation across frames is implemented by an efficient patch matching technique, which is useful for our algorithm since it does not require motion smoothness assumption. Also, we present a hierarchical approach, where a small set of key frames are tracked first and non-key frames are handled by local key frames. Our tracking algorithm is conceptually well-suited for the sequences with abrupt motion, shot changes, and occlusion. We compare our tracking algorithm with existing techniques in real videos with such challenges and illustrate its superior performance qualitatively and quantitatively.

3 0.8079769 416 iccv-2013-The Interestingness of Images

Author: Michael Gygli, Helmut Grabner, Hayko Riemenschneider, Fabian Nater, Luc Van Gool

Abstract: We investigate human interest in photos. Based on our own and others’ psychological experiments, we identify various cues for “interestingness”, namely aesthetics, unusualness and general preferences. For the ranking of retrieved images, interestingness is more appropriate than cues proposed earlier. Interestingness is, for example, correlated with what people believe they will remember. This is opposed to actual memorability, which is uncorrelated to both of them. We introduce a set of features computationally capturing the three main aspects of visual interestingness that we propose and build an interestingness predictor from them. Its performance is shown on three datasets with varying context, reflecting diverse levels of prior knowledge of the viewers.

4 0.80206859 86 iccv-2013-Concurrent Action Detection with Structural Prediction

Author: Ping Wei, Nanning Zheng, Yibiao Zhao, Song-Chun Zhu

Abstract: Action recognition has often been posed as a classification problem, which assumes that a video sequence only has one action class label and different actions are independent. However, a single human body can perform multiple concurrent actions at the same time, and different actions interact with each other. This paper proposes a concurrent action detection model where the action detection is formulated as a structural prediction problem. In this model, an interval in a video sequence can be described by multiple action labels. A detected action interval is determined both by the unary local detector and the relations with other actions. We use a wavelet feature to represent the action sequence, and design a composite temporal logic descriptor to describe the action relations. The model parameters are trained by structural SVM learning. Given a long video sequence, a sequential decision window search algorithm is designed to detect the actions. Experiments on our newly collected concurrent action dataset demonstrate the strength of our method.

5 0.75513983 447 iccv-2013-Volumetric Semantic Segmentation Using Pyramid Context Features

Author: Jonathan T. Barron, Mark D. Biggin, Pablo Arbeláez, David W. Knowles, Soile V.E. Keranen, Jitendra Malik

Abstract: We present an algorithm for the per-voxel semantic segmentation of a three-dimensional volume. At the core of our algorithm is a novel “pyramid context” feature, a descriptive representation designed such that exact per-voxel linear classification can be made extremely efficient. This feature not only allows for efficient semantic segmentation but enables other aspects of our algorithm, such as novel learned features and a stacked architecture that can reason about self-consistency. We demonstrate our technique on 3D fluorescence microscopy data of Drosophila embryos for which we are able to produce extremely accurate semantic segmentations in a matter of minutes, and for which other algorithms fail due to the size and high-dimensionality of the data, or due to the difficulty of the task.

6 0.74261487 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects

7 0.73784536 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning

8 0.7290473 414 iccv-2013-Temporally Consistent Superpixels

9 0.72333038 150 iccv-2013-Exemplar Cut

10 0.72006494 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning

11 0.71966302 379 iccv-2013-Semantic Segmentation without Annotating Segments

12 0.71742165 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

13 0.71738005 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation

14 0.71727222 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments

15 0.71686018 8 iccv-2013-A Deformable Mixture Parsing Model with Parselets

16 0.7165696 420 iccv-2013-Topology-Constrained Layered Tracking with Latent Flow

17 0.71589363 102 iccv-2013-Data-Driven 3D Primitives for Single Image Understanding

18 0.71554655 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

19 0.71470523 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties

20 0.71359891 330 iccv-2013-Proportion Priors for Image Sequence Segmentation