nips nips2002 nips2002-172 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Leonid Taycher, John W. Fisher III, Trevor Darrell
Abstract: Accurate representation of articulated motion is a challenging problem for machine perception. Several successful tracking algorithms have been developed that model the human body as an articulated tree. We propose a learning-based method for creating such articulated models from observations of multiple rigid motions. This paper is concerned with recovering the topology of the articulated model when the rigid motion of the constituent segments is known. Our approach is based on finding the maximum-likelihood tree-shaped factorization of the joint probability density function (PDF) of rigid segment motions. The topology of the graphical model formed from this factorization corresponds to the topology of the underlying articulated body. We demonstrate the performance of our algorithm on both synthetic and real motion capture data.
Reference: text
sentIndex sentText sentNum sentScore
1 Accurate representation of articulated motion is a challenging problem for machine perception. [sent-4, score-0.759]
2 Several successful tracking algorithms have been developed that model the human body as an articulated tree. [sent-5, score-0.78]
3 We propose a learning-based method for creating such articulated models from observations of multiple rigid motions. [sent-6, score-0.828]
4 This paper is concerned with recovering the topology of the articulated model when the rigid motion of the constituent segments is known. [sent-7, score-1.543]
5 Our approach is based on finding the maximum-likelihood tree-shaped factorization of the joint probability density function (PDF) of rigid segment motions. [sent-8, score-0.706]
6 The topology of the graphical model formed from this factorization corresponds to the topology of the underlying articulated body. [sent-9, score-1.027]
7 We demonstrate the performance of our algorithm on both synthetic and real motion capture data. [sent-10, score-0.369]
8 1 Introduction Tracking human motion is an integral part of many proposed human-computer interfaces, surveillance and identification systems, as well as animation and virtual reality systems. [sent-11, score-0.316]
9 A common approach to this task is to model the body as a kinematic tree, and reformulate the problem as articulated body tracking [6]. [sent-12, score-0.966]
10 Most of the state-of-the-art systems rely on predefined kinematic models [16]. [sent-13, score-0.098]
11 We are interested in a principled way to recover articulated models from observations. [sent-15, score-0.514]
12 The recovered models may then be used for further tracking and/or recognition. [sent-16, score-0.131]
13 In the first stage the rigidly moving segments are tracked independently; at the second stage, the topology of the body (the connectivity between the segments) is recovered. [sent-18, score-0.651]
14 After the topology is determined, the joint parameters may be determined. [sent-19, score-0.252]
15 In this paper we concentrate on the second stage of this task, estimating the underlying topology of the observed articulated body, when the motion of the constituent rigid bodies is known. [sent-20, score-1.531]
16 If we assume that the body may be modeled as a kinematic tree, and motion of a particular rigid segment is known, then the motions of the rigid segments that are connected through that segment are independent of each other. [sent-22, score-1.773]
17 That is, we can model a probability distribution of the full-body pose as a tree-structured graphical model, where each node corresponds to the pose of a rigid segment. [sent-23, score-0.56]
18 This observation allows us to formulate the problem of recovering topology of an articulated body as finding the tree-shaped graphical model that best (in the Maximum Likelihood sense) describes the observations. [sent-24, score-0.982]
19 2 Prior Work While state-of-the-art tracking algorithms [16] do not address either model creation or model initialization, the necessity of automating these two steps has been long recognized. [sent-25, score-0.072]
20 The approach in [10] required a subject to follow a set of predefined movements, and recovered the descriptions of body parts and body topology from deformations of apparent contours. [sent-26, score-0.671]
21 Various heuristics were used in [12] to adapt an articulated model of known topology to 3D observations. [sent-27, score-0.7]
22 Analysis of magnetic motion capture data was used by [14] to recover limb lengths and joint locations for known topology; it also suggested similar analysis for topology extraction. [sent-28, score-0.643]
23 A learning based approach for decomposing a set of observed marker positions and velocities into sets corresponding to various body parts was described in [17]. [sent-29, score-0.273]
24 Our work builds on the latter two approaches in estimating the topology of the articulated tree model underlying the observed motion. [sent-30, score-0.893]
25 Several methods have been used to recover multiple rigid motions from video, such as factorization [3, 18], RANSAC [7], and learning based methods [9]. [sent-31, score-0.59]
26 In this work we assume that the 3-D rigid motions have been recovered and are represented using a 2-D Scaled Prismatic Model (SPM). [sent-32, score-0.572]
27 3 Representing Pose and Motion A 2-D Scaled Prismatic Model (SPM) was proposed by [15] and is useful for representing image motion of projections of elongated 3-D objects. [sent-33, score-0.301]
28 It is obtained by orthographically “projecting” the major axis of the object to the image plane. [sent-34, score-0.042]
29 3-D rigid motion of an object may be simulated by SPM transformations, using in-plane translation for rigid translation, and rotation and uniform scaling for plane-parallel and out-of-plane rotations respectively. [sent-36, score-1.203]
30 SPM motion (or pose) may be expressed as a linear transformation in projective space as $M = \begin{pmatrix} a & -b & e \\ b & a & f \\ 0 & 0 & 1 \end{pmatrix}$ (1). Following [13] we have chosen to use exponential coordinates, derived from constant velocity equations, to parameterize motion. [sent-37, score-0.451]
31 An SPM transformation may be represented as an exponential map $M = e^{\hat{\xi}}$, $\hat{\xi} = \theta \begin{pmatrix} c & -\omega & v_x \\ \omega & c & v_y \\ 0 & 0 & 0 \end{pmatrix}$, $\xi = \theta\,(v_x, v_y, \omega, c)^T$ (2). In this representation $v_x$ is a horizontal velocity, $v_y$ a vertical velocity, $\omega$ an angular velocity, and $c$ a rate of scale change. [sent-38, score-0.697]
32 Note that there is an inherent scale ambiguity, since $\theta$ and $(v_x, v_y, \omega, c)^T$ may be chosen arbitrarily, as long as $e^{\hat{\xi}} = M$. [sent-40, score-0.133]
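As a rough illustration of the exponential-map representation, the following sketch builds the 3x3 twist matrix from $(\theta, v_x, v_y, \omega, c)$ and exponentiates it with a truncated Taylor series. The function names (`spm_twist_matrix`, `expm3`, `spm_from_twist`) and the choice of a Taylor series are assumptions made for this example; they are not taken from the paper.

```python
import math


def spm_twist_matrix(theta, vx, vy, omega, c):
    """Return theta * [[c, -omega, vx], [omega, c, vy], [0, 0, 0]] as nested lists."""
    return [
        [theta * c, -theta * omega, theta * vx],
        [theta * omega, theta * c, theta * vy],
        [0.0, 0.0, 0.0],
    ]


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def expm3(x, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small norms)."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # After this step, term holds x**k / k!
        term = [[v / k for v in row] for row in matmul3(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result


def spm_from_twist(theta, vx, vy, omega, c):
    """Map twist coordinates to an SPM transformation matrix M = exp(xi_hat)."""
    return expm3(spm_twist_matrix(theta, vx, vy, omega, c))
```

Calling `spm_from_twist(1.0, 0.2, 0.1, 0.6, 0.8)` and `spm_from_twist(2.0, 0.1, 0.05, 0.3, 0.4)` should give the same matrix up to rounding, which is the scale ambiguity between $\theta$ and the velocity vector mentioned above.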
33 It can be shown ([13]) that if the SPM transformation is a combination of scaling and rotation, it may be expressed by the sum of two twists, with coincident centers $(u_x, u_y)^T$ of rotation and expansion. [sent-41, score-0.398]
34 $\xi = \omega\,(u_y, -u_x, 1, 0)^T + c\,(-u_x, -u_y, 0, 1)^T = (\omega u_y - c u_x,\; -\omega u_x - c u_y,\; \omega,\; c)^T$ (3). While “pure” translation, rotation or scale have intuitive representations with twists, the combination of rotation and scale does not. [sent-42, score-0.756]
35 We propose a scaled twist representation, that preserves the intuitiveness of representation for all possible SPM motions. [sent-43, score-0.082]
36 We want to separate the “direction” of motion (the direction of translation or the relative amounts of rotation and scale) from the amount of motion. [sent-44, score-0.527]
37 If the transformation involves rotation and/or scale, then we choose $\theta$ so that $\|(\omega, c)\|_2 = 1$, and then use eq. [sent-45, score-0.197]
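38 The computation may be expressed as a linear transformation: $\tau = (\theta, u_x, u_y, \omega, c)^T = \begin{pmatrix} \sqrt{\tilde{\omega}^2 + \tilde{c}^2} & 0 & 0 & 0 & 0 \\ 0 & -\frac{\tilde{c}}{\tilde{\omega}^2 + \tilde{c}^2} & -\frac{\tilde{\omega}}{\tilde{\omega}^2 + \tilde{c}^2} & 0 & 0 \\ 0 & \frac{\tilde{\omega}}{\tilde{\omega}^2 + \tilde{c}^2} & -\frac{\tilde{c}}{\tilde{\omega}^2 + \tilde{c}^2} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{\sqrt{\tilde{\omega}^2 + \tilde{c}^2}} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{\sqrt{\tilde{\omega}^2 + \tilde{c}^2}} \end{pmatrix} (1, \tilde{v}_x, \tilde{v}_y, \tilde{\omega}, \tilde{c})^T$ (4), where $\xi = (\tilde{v}_x, \tilde{v}_y, \tilde{\omega}, \tilde{c})^T$. [sent-47, score-0.288]
39 The pure translational motion ($\omega = c = 0$) may be regarded as an infinitely small rotation about a point at infinity, e. [sent-48, score-0.716]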
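The scaled twist for the rotation/scale case can be sketched as follows, assuming the twist of a combined rotation (rate $\omega$) and scaling (rate $c$) about a center $(u_x, u_y)$ has velocity components $v_x = \omega u_y - c u_x$ and $v_y = -\omega u_x - c u_y$. The helper names are hypothetical. The pure-translation case is deliberately left unhandled because its treatment is truncated in the text above.

```python
import math


def twist_from_center(ux, uy, omega, c):
    """Unnormalised twist (vx, vy, omega, c) for rotation/scale about (ux, uy)."""
    vx = omega * uy - c * ux
    vy = -omega * ux - c * uy
    return (vx, vy, omega, c)


def scaled_twist(vx, vy, omega, c):
    """Return (theta, ux, uy, omega_n, c_n) with (omega_n, c_n) of unit 2-norm.

    Assumption for this sketch: only motions with rotation and/or scale
    are supported; pure translations raise an error.
    """
    d = omega * omega + c * c
    if d == 0.0:
        raise ValueError("pure translation is not handled in this sketch")
    theta = math.sqrt(d)                 # amount of motion
    ux = (-c * vx - omega * vy) / d      # center of rotation/expansion
    uy = (omega * vx - c * vy) / d
    return (theta, ux, uy, omega / theta, c / theta)
```

For example, `scaled_twist(*twist_from_center(2.0, 3.0, 0.3, 0.4))` recovers the center `(2, 3)` together with `theta = 0.5` and the unit direction `(0.6, 0.8)`, up to floating-point rounding.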
40 4 Learning Articulated Topology We wish to infer the underlying topology of an articulated body from noisy observations of a set of rigid body motions. [sent-52, score-1.498]
41 As a practical matter, one must make choices regarding density models; we discuss one such choice although other choices are also suitable. [sent-54, score-0.092]
42 We denote the set of observed motions of $N$ rigid bodies at time $t$, $1 \le t \le F$, as a set $\{M_s^t \mid 1 \le s \le N\}$. [sent-55, score-0.579]
43 Variables $M_i$ with observations $\{M_i^t \mid 1 \le t \le F\}$ are assigned to the vertices of a graph, while edges between nodes indicate dependency. [sent-58, score-0.072]
44 We shall denote presence or absence of an edge between two variables, Mi and Mj by an index variable Eij , equal to one if an edge is present and zero otherwise. [sent-59, score-0.078]
45 Furthermore, if the corresponding graphical model is a spanning tree, it can be expressed as a product of conditional densities (e. [sent-60, score-0.12]
46 $P(M_1, \ldots, M_N) = \prod_{M_s} P_{M_s \mid \mathrm{pa}(M_s)}(M_s \mid \mathrm{pa}(M_s))$ (7), where $\mathrm{pa}(M_s)$ is the parent of $M_s$. [sent-65, score-0.035]
47 While multiple nodes may have the same parent, each individual node has only one parent node. [sent-66, score-0.1]
48 Furthermore, in any decomposition one node (the root node) has no parent. [sent-67, score-0.043]
49 Any node (variable) in the model can serve as the root node [8]. [sent-68, score-0.086]
50 Of the possible tree models (choices of E), we wish to choose the maximum likelihood tree which is equivalent to the minimum entropy tree [4]. [sent-70, score-0.495]
51 The entropy of a tree model can be written $H(M) = \sum_s H(M_s) - \sum_{E_{ij}=1} I(M_i; M_j)$ (8), where $H(M_s)$ is the marginal entropy of each variable and $I(M_i; M_j)$ is the mutual information between nodes $M_i$ and $M_j$ and quantifies their statistical dependence. [sent-71, score-0.352]
52 Consequently, since the marginal entropies do not depend on the tree, the minimum entropy tree corresponds to the choice of E which maximizes the sum of the pairwise mutual informations over the tree edges [1]. [sent-72, score-0.349]
53 The tree denoted by E can be found via the maximum spanning tree algorithm [2] using I(Mi ; Mj ) for all i, j as the edge weights. [sent-73, score-0.371]
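The tree search can be illustrated with a small Kruskal-style maximum spanning tree over pairwise mutual-information weights. This is a minimal sketch under the assumption that the weights are already estimated and stored in a dictionary keyed by segment pairs; the function names and data layout are illustrative, and the paper does not specify which spanning-tree implementation was used. Because the marginal entropies are the same for every tree, picking the edges with the largest total mutual information gives the smallest tree entropy.

```python
def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm on a complete graph with n nodes.

    weights maps (i, j) pairs to a mutual-information estimate.
    Returns the list of selected edges (a spanning tree if the graph is connected).
    """
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    # Visit edges from the largest weight to the smallest.
    for (i, j), _w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:          # the edge joins two components, so keep it
            parent[ri] = rj
            edges.append((i, j))
    return edges


def tree_entropy(marginal_entropies, weights, edges):
    """Entropy of a tree model: sum of marginals minus sum of edge mutual informations."""
    return sum(marginal_entropies) - sum(weights[e] for e in edges)
```

With four segments connected in a chain 0-1-2-3, weights such as `{(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.7, (0, 2): 0.3, (1, 3): 0.2, (0, 3): 0.1}` lead to the edges `(0, 1)`, `(1, 2)` and `(2, 3)` being selected.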
54 Our conjecture is that, if our data are sampled from a variety of motions, the topology of the estimated density model is likely to be the same as the topology of the articulated body model. [sent-74, score-1.348]
55 It follows from the intuition that when considering only pairwise relationships, the relative motions of physically connected bodies will be most strongly related. [sent-75, score-0.278]
56 1 Estimation of Mutual Information Computing the minimum entropy spanning tree requires estimating the pairwise mutual informations between rigid motions Mi and Mj for all i, j pairs. [sent-77, score-0.946]
57 In order to do so we must make a choice regarding the parameterization of motion and a probability density over that parameterization; to estimate articulated topology it is sufficient to use the Scaled Prismatic Model with the twist parameterization described in Section 3. [sent-78, score-1.132]
58 2 Estimating Motion Entropy We parameterize rigid motion, $M_i^t$, by the vector of quantities $\xi_i^t$ (cf. [sent-80, score-0.377]
59 t Mit Mj|i ) is (11) We wish to use scaled twists (Section 3) to compute the entropies involved. [sent-84, score-0.182]
60 4 and 5), the entropies are related, $H(\xi) = H(\tau) - E[\log \det(A)]$ (12), where $E[\log \det(A)]$ may be estimated using Equation 6. [sent-86, score-0.06]
61 3 Estimating the Motion Kernel In order to estimate the entropy of motion, we need to estimate the probability density based on the available samples. [sent-88, score-0.096]
62 Since the functional form of the underlying density is not known, we have chosen to use a kernel-based density estimator, $\hat{p}(\tau) = \alpha \sum_i K(\tau; \tau_i)$. [sent-89, score-0.119]
63 If τ1 and τ2 do not represent pure translational motions, then they should be considered to be close if their centers of rotation are close. [sent-92, score-0.254]
64 If τ1 and τ2 are pure translations, then they should be considered close if their directions are close. [sent-94, score-0.056]
65 If τ1 and τ2 represent different types of motion (i. [sent-96, score-0.281]
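To make the kernel-based entropy and mutual-information estimates concrete, here is a minimal sketch that swaps in an isotropic Gaussian kernel with a fixed bandwidth. This is an assumption made purely for illustration: the kernel described above is motion-specific (it compares centers of rotation or translation directions), and its exact form is not recoverable from this text. The estimator is a plain leave-one-out resubstitution estimate, and the function names are hypothetical.

```python
import math


def kde_entropy(samples, bandwidth):
    """Leave-one-out resubstitution estimate of differential entropy (in nats).

    samples is a list of equal-length tuples; an isotropic Gaussian kernel
    with the given bandwidth is assumed (not the paper's motion kernel).
    """
    n = len(samples)
    d = len(samples[0])
    norm = (2.0 * math.pi * bandwidth ** 2) ** (-0.5 * d)
    total = 0.0
    for i, x in enumerate(samples):
        acc = 0.0
        for j, y in enumerate(samples):
            if i == j:
                continue  # leave the evaluated sample out of its own estimate
            sq = sum((a - b) ** 2 for a, b in zip(x, y))
            acc += math.exp(-0.5 * sq / bandwidth ** 2)
        total += math.log(norm * acc / (n - 1))
    return -total / n


def mutual_information(xs, ys, bandwidth):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), each term estimated with kde_entropy."""
    joint = [x + y for x, y in zip(xs, ys)]  # concatenate the paired tuples
    return (kde_entropy(xs, bandwidth) + kde_entropy(ys, bandwidth)
            - kde_entropy(joint, bandwidth))
```

In this sketch a pair of strongly coupled variables should receive a larger estimate than a pair whose samples are scattered with respect to each other, which is the property the spanning-tree step relies on. The bandwidth is a free parameter here and would need to be chosen for real data.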
66 5 Implementation The input to our algorithm is a set of SPM poses (Section 3) $\{P_s^t \mid 1 \le s \le S, 1 \le t \le T\}$, where $S$ is the number of tracked rigid segments and $T$ is the number of frames. [sent-103, score-0.578]
67 … $(P_s^{t_1})^{-1}$ … (16) The parameter vectors $\tau_{s_2}$ and $\tau_{s_2 \mid s_1}$ are then extracted from the transformation matrices $M_{s_2}$ and $M_{s_2 \mid s_1}$ (cf. [sent-105, score-0.089]
68 Section 3), and the mutual information is estimated as described in Section 4. [sent-106, score-0.114]
69 6 Results We have tested our algorithm both on synthetic and motion capture data. [sent-108, score-0.369]
70 Two synthetic sequences were generated with the following steps. [sent-109, score-0.049]
71 First, the rigid segments were positioned by randomly perturbing parameters of the corresponding kinematic tree structure. [sent-110, score-0.72]
72 At each time step point positions were computed based on the corresponding segment pose, and perturbed with Gaussian noise with zero mean and standard deviation of 1 pixel. [sent-112, score-0.127]
73 The inputs to the algorithm were the segment poses re-estimated from the feature point coordinates. [sent-113, score-0.143]
74 In the motion capture-based experiment, the segment poses were estimated from the marker positions. [sent-114, score-0.498]
75 The first experiment involved a simple kinematic chain with 3 segments in order to demonstrate the operation of the algorithm. [sent-119, score-0.232]
76 The system has a rotational joint between $S_1$ and $S_2$ and a prismatic joint between $S_2$ and $S_3$. [sent-120, score-0.205]
77 The sample configurations of the articulated body are shown in the first row of the Figures 6. [sent-121, score-0.673]
78 2 and the corresponding maximum spanning tree are in Figures 6. [sent-124, score-0.194]
79 The second experiment involved a humanoid torso-like synthetic model containing 5 rigid segments. [sent-126, score-0.437]
80 For the human motion experiment, we have used motion capture data of a dance sequence (Figure 6. [sent-130, score-0.671]
81 The rigid segment motion was extracted from the positions of the markers tracked across 220 frames (the marker correspondence to the body locations was known). [sent-132, score-1.131]
82 The algorithm was able to correctly recover the articulated body topology (Compare Figures 6. [sent-133, score-0.931]
83 The dance is a highly structured activity, so not all degrees of freedom were explored in this sequence, and mutual information between some unconnected segments (e. [sent-136, score-0.284]
84 7 Conclusions We have presented a novel general technique for recovering the underlying articulated structure from information about rigid segment motion. [sent-139, score-1.006]
85 Our method relies on only a very weak assumption, that this structure may be represented by a tree with unknown topology. [sent-140, score-0.138]
86 While the results presented in this paper were obtained using the Scaled Prismatic Model and a non-parametric density estimator, our methodology does not rely on either modeling assumption. [sent-141, score-0.046]
87 The first row shows 3 sample frames from a 100 frame synthetic sequence. [sent-171, score-0.143]
88 The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. [sent-172, score-0.274]
89 The vertices in the graph correspond to the rigid segments labeled in (a). [sent-173, score-0.583]
90 The sample frames from a randomly generated 150 frame sequence are shown in (a), (b), and (c). [sent-177, score-0.094]
91 The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. [sent-178, score-0.274]
92 The vertices in the graph correspond to the rigid segments labeled in (a). [sent-179, score-0.583]
93 (a), (b), and (c) are the sample frames from a 220 frame sequence. [sent-183, score-0.094]
94 The adjacency matrix of the mutual information graph is shown in (d), with intensities corresponding to edge weights. [sent-184, score-0.274]
95 The vertices in the graph correspond to the rigid segments labeled in (a). [sent-185, score-0.583]
96 A 3D feature-based tracker for tracking multiple moving objects with a controlled binocular head. [sent-192, score-0.094]
97 Automatic joint parameter estimation from magnetic motion capture data. [sent-226, score-0.385]
98 Singularities in articulated object tracking with 2-d and 3-d models. [sent-231, score-0.572]
99 Stochastic tracking of 3d human figures using 2d image motion. [sent-236, score-0.127]
100 Monocular perception of biological motion - detection and labeling. [sent-240, score-0.281]
wordName wordTfidf (topN-words)
[('articulated', 0.478), ('rigid', 0.35), ('motion', 0.281), ('topology', 0.222), ('mj', 0.212), ('spm', 0.198), ('body', 0.195), ('motions', 0.163), ('uy', 0.154), ('ux', 0.138), ('tree', 0.138), ('mi', 0.137), ('segments', 0.134), ('rotation', 0.129), ('prismatic', 0.11), ('vy', 0.107), ('segment', 0.101), ('kinematic', 0.098), ('vx', 0.094), ('translation', 0.093), ('mutual', 0.092), ('tracking', 0.072), ('mt', 0.07), ('ms', 0.07), ('transformation', 0.068), ('bodies', 0.066), ('twists', 0.066), ('pt', 0.065), ('pose', 0.065), ('recovered', 0.059), ('vision', 0.057), ('spanning', 0.056), ('pure', 0.056), ('frames', 0.053), ('marker', 0.052), ('adjacency', 0.052), ('tracked', 0.052), ('recovering', 0.05), ('vertices', 0.05), ('entropy', 0.05), ('translational', 0.049), ('graph', 0.049), ('synthetic', 0.049), ('velocity', 0.048), ('scaled', 0.047), ('density', 0.046), ('informations', 0.044), ('node', 0.043), ('det', 0.042), ('intensities', 0.042), ('poses', 0.042), ('factorization', 0.041), ('frame', 0.041), ('capture', 0.039), ('edge', 0.039), ('prede', 0.039), ('eij', 0.038), ('entropies', 0.038), ('humanoid', 0.038), ('multibody', 0.038), ('graphical', 0.037), ('figures', 0.036), ('recover', 0.036), ('parameterization', 0.035), ('parent', 0.035), ('dance', 0.035), ('twist', 0.035), ('trevor', 0.035), ('rotational', 0.035), ('magnetic', 0.035), ('human', 0.035), ('kr', 0.033), ('wish', 0.031), ('joint', 0.03), ('kt', 0.029), ('estimating', 0.028), ('constituent', 0.028), ('expressed', 0.027), ('underlying', 0.027), ('parameterize', 0.027), ('positions', 0.026), ('stage', 0.026), ('scale', 0.026), ('concentrate', 0.025), ('pairwise', 0.025), ('condition', 0.024), ('relative', 0.024), ('pa', 0.024), ('freedom', 0.023), ('video', 0.023), ('choices', 0.023), ('nodes', 0.022), ('moving', 0.022), ('estimated', 0.022), ('object', 0.022), ('john', 0.022), ('extracted', 0.021), ('image', 0.02), ('centers', 0.02), ('identity', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999997 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
2 0.15231745 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach
Author: Jakob Heinzle, Alan Stocker
Abstract: We report a system that classifies and can learn to classify patterns of visual motion on-line. The complete system is described by the dynamics of its physical network architectures. The combination of the following properties makes the system novel: Firstly, the front-end of the system consists of an aVLSI optical flow chip that collectively computes 2-D global visual motion in real-time [1]. Secondly, the complexity of the classification task is significantly reduced by mapping the continuous motion trajectories to sequences of ’motion events’. And thirdly, all the network structures are simple and with the exception of the optical flow chip based on a Winner-Take-All (WTA) architecture. We demonstrate the application of the proposed generic system for a contactless man-machine interface that allows to write letters by visual motion. Regarding the low complexity of the system, its robustness and the already existing front-end, a complete aVLSI system-on-chip implementation is realistic, allowing various applications in mobile electronic devices.
Author: Matthias O. Franz, Javaan S. Chahl
Abstract: The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution of the environment, and about the noise and self-motion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates turn out to be less reliable. 1
4 0.10564883 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
Author: Robert A. Jacobs, Melissa Dominguez
Abstract: We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were “developmental models” in the sense that the nature of their input changed during the course of training. They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (i.e. it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model’s superior performance. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
5 0.090890348 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
Author: Dan Pelleg, Andrew W. Moore
Abstract: We focus on the problem of efficient learning of dependency trees. It is well-known that given the pairwise mutual information coefficients, a minimum-weight spanning tree algorithm solves this problem exactly and in polynomial time. However, for large data-sets it is the construction of the correlation matrix that dominates the running time. We have developed a new spanning-tree algorithm which is capable of exploiting partial knowledge about edge weights. The partial knowledge we maintain is a probabilistic confidence interval on the coefficients, which we derive by examining just a small sample of the data. The algorithm is able to flag the need to shrink an interval, which translates to inspection of more data for the particular attribute pair. Experimental results show running time that is near-constant in the number of records, without significant loss in accuracy of the generated trees. Interestingly, our spanning-tree algorithm is based solely on Tarjan’s red-edge rule, which is generally considered a guaranteed recipe for bad performance. 1
6 0.078479946 137 nips-2002-Location Estimation with a Differential Update Network
7 0.073465154 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
8 0.070805848 87 nips-2002-Fast Transformation-Invariant Factor Analysis
9 0.068651564 153 nips-2002-Neural Decoding of Cursor Motion Using a Kalman Filter
10 0.065646462 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
11 0.060253695 124 nips-2002-Learning Graphical Models with Mercer Kernels
12 0.056950256 147 nips-2002-Monaural Speech Separation
13 0.05672301 39 nips-2002-Bayesian Image Super-Resolution
14 0.052109633 16 nips-2002-A Prototype for Automatic Recognition of Spontaneous Facial Actions
15 0.048343517 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
16 0.048047088 85 nips-2002-Fast Kernels for String and Tree Matching
17 0.04613604 35 nips-2002-Automatic Acquisition and Efficient Representation of Syntactic Structures
18 0.041781612 47 nips-2002-Branching Law for Axons
19 0.040486284 114 nips-2002-Information Regularization with Partially Labeled Data
20 0.04015772 10 nips-2002-A Model for Learning Variance Components of Natural Images
topicId topicWeight
[(0, -0.139), (1, 0.012), (2, -0.016), (3, 0.086), (4, -0.027), (5, 0.044), (6, 0.022), (7, 0.006), (8, 0.077), (9, 0.066), (10, 0.023), (11, 0.119), (12, -0.001), (13, -0.037), (14, -0.125), (15, -0.023), (16, 0.131), (17, 0.035), (18, -0.046), (19, -0.108), (20, 0.0), (21, 0.252), (22, 0.042), (23, -0.019), (24, -0.002), (25, 0.066), (26, -0.126), (27, 0.04), (28, -0.009), (29, 0.066), (30, 0.063), (31, -0.184), (32, -0.002), (33, -0.033), (34, -0.07), (35, -0.048), (36, -0.071), (37, -0.059), (38, -0.04), (39, -0.109), (40, 0.023), (41, 0.216), (42, 0.01), (43, -0.197), (44, -0.03), (45, -0.109), (46, -0.012), (47, -0.035), (48, -0.078), (49, 0.154)]
simIndex simValue paperId paperTitle
same-paper 1 0.96842289 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
2 0.60458738 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach
Author: Jakob Heinzle, Alan Stocker
Abstract: We report a system that classifies and can learn to classify patterns of visual motion on-line. The complete system is described by the dynamics of its physical network architectures. The combination of the following properties makes the system novel: Firstly, the front-end of the system consists of an aVLSI optical flow chip that collectively computes 2-D global visual motion in real-time [1]. Secondly, the complexity of the classification task is significantly reduced by mapping the continuous motion trajectories to sequences of ’motion events’. And thirdly, all the network structures are simple and with the exception of the optical flow chip based on a Winner-Take-All (WTA) architecture. We demonstrate the application of the proposed generic system for a contactless man-machine interface that allows to write letters by visual motion. Regarding the low complexity of the system, its robustness and the already existing front-end, a complete aVLSI system-on-chip implementation is realistic, allowing various applications in mobile electronic devices.
Author: Matthias O. Franz, Javaan S. Chahl
Abstract: The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion. In this study, we examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution of the environment, and about the noise and self-motion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates turn out to be less reliable. 1
4 0.41475108 100 nips-2002-Half-Lives of EigenFlows for Spectral Clustering
Author: Chakra Chennubhotla, Allan D. Jepson
Abstract: Using a Markov chain perspective of spectral clustering we present an algorithm to automatically find the number of stable clusters in a dataset. The Markov chain’s behaviour is characterized by the spectral properties of the matrix of transition probabilities, from which we derive eigenflows along with their halflives. An eigenflow describes the flow of probability mass due to the Markov chain, and it is characterized by its eigenvalue, or equivalently, by the halflife of its decay as the Markov chain is iterated. A ideal stable cluster is one with zero eigenflow and infinite half-life. The key insight in this paper is that bottlenecks between weakly coupled clusters can be identified by computing the sensitivity of the eigenflow’s halflife to variations in the edge weights. We propose a novel E IGEN C UTS algorithm to perform clustering that removes these identified bottlenecks in an iterative fashion.
5 0.40449211 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
Author: Robert A. Jacobs, Melissa Dominguez
Abstract: We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were “developmental models” in the sense that the nature of their input changed during the course of training. They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (i.e. it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model’s superior performance. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
6 0.34978613 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
7 0.31210992 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
8 0.30769235 124 nips-2002-Learning Graphical Models with Mercer Kernels
9 0.29910669 137 nips-2002-Location Estimation with a Differential Update Network
10 0.27739385 29 nips-2002-Analysis of Information in Speech Based on MANOVA
11 0.27238467 35 nips-2002-Automatic Acquisition and Efficient Representation of Syntactic Structures
12 0.26323685 153 nips-2002-Neural Decoding of Cursor Motion Using a Kalman Filter
13 0.25933301 114 nips-2002-Information Regularization with Partially Labeled Data
14 0.24533863 87 nips-2002-Fast Transformation-Invariant Factor Analysis
15 0.24335305 179 nips-2002-Scaling of Probability-Based Optimization Algorithms
16 0.24255019 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
17 0.23826459 84 nips-2002-Fast Exact Inference with a Factored Model for Natural Language Parsing
18 0.2374045 67 nips-2002-Discriminative Binaural Sound Localization
19 0.23475005 47 nips-2002-Branching Law for Axons
20 0.22728521 85 nips-2002-Fast Kernels for String and Tree Matching
topicId topicWeight
[(1, 0.046), (11, 0.018), (23, 0.035), (41, 0.344), (42, 0.048), (54, 0.116), (55, 0.037), (57, 0.017), (68, 0.018), (74, 0.113), (92, 0.025), (98, 0.082)]
simIndex simValue paperId paperTitle
1 0.89358515 134 nips-2002-Learning to Take Concurrent Actions
Author: Khashayar Rohanimanesh, Sridhar Mahadevan
Abstract: We investigate a general semi-Markov Decision Process (SMDP) framework for modeling concurrent decision making, where agents learn optimal plans over concurrent temporally extended actions. We introduce three types of parallel termination schemes – all, any and continue – and theoretically and experimentally compare them.
2 0.88781559 54 nips-2002-Combining Dimensions and Features in Similarity-Based Representations
Author: Daniel J. Navarro, Michael D. Lee
Abstract: unknown-abstract
same-paper 3 0.8158865 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
Author: Leonid Taycher, John W. Fisher III, Trevor Darrell
Abstract: Accurate representation of articulated motion is a challenging problem for machine perception. Several successful tracking algorithms have been developed that model the human body as an articulated tree. We propose a learning-based method for creating such articulated models from observations of multiple rigid motions. This paper is concerned with recovering the topology of the articulated model when the rigid motion of the constituent segments is known. Our approach is based on finding the Maximum Likelihood tree-shaped factorization of the joint probability density function (PDF) of rigid segment motions. The topology of the graphical model formed from this factorization corresponds to the topology of the underlying articulated body. We demonstrate the performance of our algorithm on both synthetic and real motion capture data.
4 0.80967289 84 nips-2002-Fast Exact Inference with a Factored Model for Natural Language Parsing
Author: Dan Klein, Christopher D. Manning
Abstract: We present a novel generative model for natural language tree structures in which semantic (lexical dependency) and syntactic (PCFG) structures are scored with separate models. This factorization provides conceptual simplicity, straightforward opportunities for separately improving the component models, and a level of performance comparable to similar, non-factored models. Most importantly, unlike other modern parsing models, the factored model admits an extremely effective A* parsing algorithm, which enables efficient, exact inference.
5 0.7458629 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
Author: Patrick J. Wolfe, Simon J. Godsill
Abstract: The Bayesian paradigm provides a natural and effective means of exploiting prior knowledge concerning the time-frequency structure of sound signals such as speech and music—something which has often been overlooked in traditional audio signal processing approaches. Here, after constructing a Bayesian model and prior distributions capable of taking into account the time-frequency characteristics of typical audio waveforms, we apply Markov chain Monte Carlo methods in order to sample from the resultant posterior distribution of interest. We present speech enhancement results which compare favourably in objective terms with standard time-varying filtering techniques (and in several cases yield superior performance, both objectively and subjectively); moreover, in contrast to such methods, our results are obtained without an assumption of prior knowledge of the noise power.
6 0.51327378 183 nips-2002-Source Separation with a Sensor Array using Graphical Models and Subband Filtering
7 0.50580549 16 nips-2002-A Prototype for Automatic Recognition of Spontaneous Facial Actions
8 0.49931493 31 nips-2002-Application of Variational Bayesian Approach to Speech Recognition
9 0.49034673 147 nips-2002-Monaural Speech Separation
10 0.48524457 137 nips-2002-Location Estimation with a Differential Update Network
11 0.47942391 14 nips-2002-A Probabilistic Approach to Single Channel Blind Signal Separation
12 0.47766492 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
13 0.47747514 2 nips-2002-A Bilinear Model for Sparse Coding
14 0.47605324 52 nips-2002-Cluster Kernels for Semi-Supervised Learning
15 0.47552323 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
16 0.47512445 87 nips-2002-Fast Transformation-Invariant Factor Analysis
17 0.47467104 124 nips-2002-Learning Graphical Models with Mercer Kernels
18 0.47456282 39 nips-2002-Bayesian Image Super-Resolution
19 0.4725205 70 nips-2002-Distance Metric Learning with Application to Clustering with Side-Information
20 0.47251242 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games