nips nips2000 nips2000-82 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Dirk Ormoneit, Hedvig Sidenbladh, Michael J. Black, Trevor Hastie
Abstract: We present methods for learning and tracking human motion in video. We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into
Reference: text
sentIndex sentText sentNum sentScore
1 We present methods for learning and tracking human motion in video. [sent-17, score-0.916]
2 We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into "cycles". [sent-18, score-0.939]
3 Then the mean and the principal components of the cycles are computed using a new algorithm that accounts for missing information and enforces smooth transitions between cycles. [sent-19, score-0.601]
4 The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking human subjects in complex monocular video sequences and recovering their 3D motion. [sent-20, score-1.238]
5 1 Introduction The modeling and tracking of human motion in video is important for problems as varied as animation, video database search, sports medicine, and human-computer interaction. [sent-21, score-1.085]
6 Technically, the human body can be approximated by a collection of articulated limbs and its motion can be thought of as a collection of time-series describing the joint angles as they evolve over time. [sent-22, score-1.012]
7 A key challenge in modeling these joint angles involves decomposing the time-series into suitable temporal primitives. [sent-23, score-0.309]
8 For example, in the case of repetitive human motion such as walking, motion sequences decompose naturally into a sequence of "motion cycles". [sent-24, score-1.311]
9 In this work, we present a new set of tools that carry out this segmentation automatically using the signal-to-noise ratio of the data in an aligned reference domain. [sent-25, score-0.284]
10 To deal with these problems, we develop a new iterative method for functional Principal Component Analysis (PCA). [sent-27, score-0.11]
11 The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking. [sent-28, score-0.462]
12 The details of this tracking framework are described in [7] and are briefly summarized here. [sent-29, score-0.305]
13 Specifically, the posterior distribution of the unknown motion parameters is represented using a discrete set of samples and is propagated over time using particle filtering [3, 7]. [sent-30, score-0.647]
14 Here the prior distribution based on the PCA representation improves the efficiency of the particle filter by constraining the samples to the most likely regions of the parameter space. [sent-31, score-0.127]
15 The resulting algorithm is able to track human subjects in monocular video sequences and to recover their 3D motion under changes in their pose and against complex unknown backgrounds. [sent-32, score-0.957]
16 Previous work on modeling human motion has focused on the recognition of activities using Hidden Markov Models (HMM's), linear dynamical models, or vector quantization (see [7, 5] for a summary of related work). [sent-33, score-0.793]
17 Alternatively, explicit temporal curves corresponding to joint motion may be derived from biometric studies or learned from 3D motion-capture data. [sent-35, score-0.65]
18 In previous work on principal component analysis of motion data, the 3D motion curves corresponding to particular activities had typically to be hand-segmented and aligned [1, 7, 8]. [sent-36, score-1.24]
19 We focus here on cyclic motions which are a particularly simple but important class of human activities [6]. [sent-38, score-0.45]
20 While Bayesian methods for tracking 3D human motion have been suggested previously [2, 4], the prior information obtained from the functional PCA proves particularly effective for determining a low-dimensional representation of the possible human body positions [8, 7]. [sent-39, score-1.383]
21 2 Learning Training data provided by a commercial motion capture system describe the evolution of m = 19 relative joint angles over a period of about 500 to 5000 frames. [sent-40, score-0.635]
22 Here $T_i$ denotes the length of sequence i and a = 1, . [sent-48, score-0.101]
23 Altogether, there are n = 20 motion sequences in our training set. [sent-52, score-0.559]
24 Note that missing observations occur frequently as body markers are often occluded during motion capture. [sent-53, score-0.794]
25 , $T_i$} | $z_{a,i}(t)$ is not missing} indicates the positions of valid data. [sent-57, score-0.102]
26 1 Sequence Alignment Periodic motion is composed of repetitive "cycles" which constitute a natural unit of statistical modeling and which must be identified in the training data prior to building a model. [sent-59, score-0.608]
27 To avoid error-prone manual segmentation we present alignment procedures that segment the data automatically by separately estimating the cycle length and a relative offset parameter for each sequence. [sent-60, score-0.414]
28 The cycle length is computed by searching for the value p that maximizes the "signal-to-noise ratio": . [sent-61, score-0.174]
29 Figure 1: Left: Signal-to-noise ratio of a representative set of angles as a function of the candidate period length. [sent-93, score-0.175]
30 where $\mathrm{noise}_{i,a}(p)$ is the variation in the data that is not explained by the mean cycle, z, and $\mathrm{signal}_{i,a}(p)$ measures the signal intensity. [sent-95, score-0.087]
31 1 In Figure 1 we show the individual signal-to-noise ratios for a subset of the angles as well as the accumulated signal-to-noise ratio as functions of p in the range {50, 51, . [sent-96, score-0.224]
32 Note the peak of these values around the optimal cycle length p = 126. [sent-100, score-0.174]
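The period search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exact signal-to-noise formula is missing from this extract, so the sketch assumes that "signal" is the variance of the folded mean cycle and "noise" is the mean squared residual around it. The function names `signal_to_noise` and `estimate_cycle_length` are hypothetical, and missing observations are assumed to be encoded as NaN.

```python
import numpy as np

def signal_to_noise(z, p):
    """Assumed SNR of a 1-D series z for candidate period p (NaN = missing)."""
    z = np.asarray(z, dtype=float)
    phase = np.arange(len(z)) % p
    valid = ~np.isnan(z)
    # "Fold" the series into {0, ..., p-1} and average the valid samples.
    sums = np.bincount(phase[valid], weights=z[valid], minlength=p)
    counts = np.bincount(phase[valid], minlength=p)
    mean_cycle = sums / np.maximum(counts, 1)
    fitted = mean_cycle[phase[valid]]
    noise = np.mean((z[valid] - fitted) ** 2)   # variation not explained by the mean cycle
    signal = np.var(fitted)                     # intensity of the mean cycle (assumption)
    return signal / (noise + 1e-12)

def estimate_cycle_length(z, candidates):
    """Return the candidate period with the largest assumed SNR, plus all scores."""
    scores = np.array([signal_to_noise(z, p) for p in candidates])
    return candidates[int(np.argmax(scores))], scores
```

For several joint angles one would presumably accumulate the per-angle ratios before taking the maximum, as the text does for Figure 1. Note that this simplified ratio tends to grow with p on pure noise, so it should not be expected to reproduce the white-noise behaviour reported by the authors.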
33 Note also that the signal-to-noise ratio of the white noise series in the first row is approximately constant, which supports the unbiasedness of our approach. [sent-101, score-0.045]
34 Next, we estimate the offset parameters, o, to align multiple motion sequences in a common domain. [sent-102, score-0.751]
35 , o(n) so that the shifted motion sequences minimize the deviation from a common prototype model by analogy to the signal-to-noise-criterion (1). [sent-106, score-0.559]
36 An exhaustive search for the optimal offset combination is computationally infeasible. [sent-107, score-0.157]
37 Instead, we suggest the following iterative procedure: We initialize the offset values to zero in Step 1, and we define a reference signal $r_a$ in Step 2 so as to minimize the deviation with respect to the aligned data. [sent-108, score-0.415]
38 This reference signal is a periodically constrained regression spline that ensures smooth transitions at the boundaries between cycles. [sent-109, score-0.252]
39 Next, we choose the offsets of all sequences so that they minimize the prediction error with respect to the reference signal (Step 3). [sent-110, score-0.238]
40 In contrast to the exhaustive search, this operation requires $O(\sum_{i=1}^{n} p(i))$ comparisons. [sent-111, score-0.039]
41 Because the solution of the first iteration may be suboptimal, we construct an improved reference signal using the current offset estimates, and use this signal in turn to improve the offset estimates. [sent-112, score-0.411]
42 Repeating these steps, we obtain an iterative optimization algorithm that is terminated if the improvement falls below a given threshold. [sent-113, score-0.09]
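The alternating alignment procedure can be sketched as below. This is a simplified illustration under stated assumptions: all sequences share one period p, there is a single joint angle and no missing data, and the reference signal is a plain average of the folded sequences rather than the periodically constrained regression spline used in the paper. The names `fold` and `align_offsets` are hypothetical.

```python
import numpy as np

def fold(z, p, offset):
    """Mean cycle of z after shifting by `offset`, folded into {0, ..., p-1}."""
    phase = (np.arange(len(z)) + offset) % p
    sums = np.bincount(phase, weights=z, minlength=p)
    counts = np.bincount(phase, minlength=p)
    return sums / np.maximum(counts, 1)

def align_offsets(seqs, p, max_iter=20, tol=1e-9):
    """Alternate between fitting a reference cycle and re-estimating offsets."""
    offsets = np.zeros(len(seqs), dtype=int)            # Step 1: zero offsets
    prev_err = np.inf
    for _ in range(max_iter):
        # Step 2: reference signal from the currently aligned data
        # (simple average here; the paper uses a periodic spline).
        ref = np.mean([fold(z, p, o) for z, o in zip(seqs, offsets)], axis=0)
        # Step 3: each sequence tries its p candidate offsets independently.
        err = 0.0
        for i, z in enumerate(seqs):
            t = np.arange(len(z))
            errs = [np.mean((z - ref[(t + o) % p]) ** 2) for o in range(p)]
            offsets[i] = int(np.argmin(errs))
            err += min(errs)
        if prev_err - err < tol:                         # improvement below threshold
            break
        prev_err = err
    return offsets, ref
```

Each pass evaluates p candidate offsets per sequence, which is where the linear comparison count comes from, as opposed to the product over sequences that an exhaustive joint search would need.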
43 Figure 1 (right) shows eight joint angles of a walking motion, aligned using this procedure. [sent-115, score-0.406]
44 2 Functional PCA The above alignment procedures segment the training data into a collection of cycle-data called "slices". [sent-117, score-0.122]
45 Next, we compute the principal components of these slices, which can be interpreted as the major sources of variation in the data. [sent-118, score-0.239]
46 The algorithm is as follows. Footnote 1: The mean cycle is obtained by "folding" the original sequence into the domain {1, . [sent-119, score-0.189]
47 , n: (a) Dissect $z_{i,a}$ into $K_i$ cycles of length p(i), marking missing values at both ends. [sent-131, score-0.392]
48 (b) Compute functional estimates in the domain [0,1]. [sent-137, score-0.131]
49 (c) Resample the data in the reference domain, imputing missing observations. [sent-138, score-0.328]
50 zk 2) obtained from all sequences row-wise into a ,a 2:: . [sent-145, score-0.148]
51 First, even though the individual motion sequences are aligned in Figure 1, they are still sampled at different frequencies in the reference domain due to the different alignment parameters. [sent-166, score-0.967]
52 This problem is accommodated in Step 1c by resampling after computing a functional estimate in continuous time in Step 1b. [sent-167, score-0.091]
53 Second, missing data in the design matrix X means we cannot simply use the Singular Value Decomposition (SVD) of X(1) to obtain the principal components. [sent-168, score-0.319]
54 Instead we use an iterative approximation scheme [9] in which we alternate between an SVD step (4 through 7) and a data imputation step (8), where each update is designed so as to decrease the matrix distance between X and its reconstruction, X(4). [sent-169, score-0.173]
55 Finally, we need to ensure that the mean estimates and the principal components produce a smooth motion when recombined into a new sequence. [sent-170, score-0.615]
56 Specifically, the approximation of an individual cycle must be periodic in the sense that its first two derivatives match at the left and the right endpoint. [sent-171, score-0.273]
57 This is achieved by translating the cycles into a Fourier domain and by truncating high-frequency coefficients (Step 4). [sent-172, score-0.242]
58 Then we compute the SVD in the Fourier domain in Step 5, and we reconstruct the design matrix using a rank-q approximation in Steps 6 and 7, respectively. [sent-173, score-0.191]
59 In Step 8 we use the reconstructed values as improved estimates for the missing data in X, and then we repeat Steps 4 through 7 using these improved estimates. [sent-174, score-0.181]
60 This iterative process is continued until the performance improvement falls below a given threshold. [sent-175, score-0.09]
61 As its output, the algorithm generates the imputed design matrix, X, as well as its principal components. [sent-176, score-0.138]
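The alternation between a smoothed low-rank reconstruction and imputation can be sketched as follows. This is a rough illustration, not the authors' algorithm as published: it assumes a single joint angle, a design matrix whose rows are resampled cycles with NaN marking missing values, and a plain column-mean initialisation. The function names and the parameters `q` and `n_freq` are hypothetical.

```python
import numpy as np

def smooth_rank_q(X, q, n_freq):
    """Steps 4-7 (sketch): Fourier truncation, SVD, rank-q reconstruction."""
    T = X.shape[1]
    # Integer frequency index of each FFT bin; keep only the low frequencies.
    keep = np.abs(np.fft.fftfreq(T, d=1.0 / T)) <= n_freq
    F = np.fft.fft(X, axis=1) * keep                      # Step 4
    F_mean = F.mean(axis=0)                               # mean cycle in Fourier domain
    U, s, Vh = np.linalg.svd(F - F_mean, full_matrices=False)   # Step 5
    F_q = (U[:, :q] * s[:q]) @ Vh[:q] + F_mean            # Step 6
    return np.real(np.fft.ifft(F_q, axis=1))              # Step 7

def impute_pca(X, q=2, n_freq=5, max_iter=200, tol=1e-10):
    """Alternate low-rank reconstruction and imputation of missing entries."""
    missing = np.isnan(X)
    X_hat = np.where(missing, np.nanmean(X, axis=0), X)   # crude initial fill
    prev = np.inf
    for _ in range(max_iter):
        recon = smooth_rank_q(X_hat, q, n_freq)
        err = np.mean((X_hat[~missing] - recon[~missing]) ** 2)
        X_hat[missing] = recon[missing]                   # Step 8
        if prev - err < tol:
            break
        prev = err
    return X_hat, recon
```

Because every reconstructed row is a finite sum of low-frequency sinusoids, each cycle is periodic and smooth at its endpoints by construction, which is the property the text asks of the mean and the principal components.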
62 3 Bayesian Tracking In tracking, our goal is to calculate the posterior probability distribution over 3D human poses given a sequence of image measurements, $I_t$. [sent-177, score-0.659]
63 The high dimensionality of the body model makes this calculation computationally demanding. [sent-178, score-0.11]
64 Hence, we use the learned model above to constrain the body motions to valid walking motions. [sent-179, score-0.367]
65 Towards that end , we use the SVD of X(2) to formulate a prior distribution for Bayesian tracking. [sent-180, score-0.052]
66 , m) be a random vector of the relative joint angles at time t; i. [sent-184, score-0.216]
67 , the value of a motion sequence, Zi(t), at time t is interpreted as the i-th realization of O(t). [sent-186, score-0.487]
68 where $V_k$ is the Fourier inverse of the k-th column of V, rearranged as a $T \times m$ matrix; similarly, $\tilde{\mu}$ denotes the rearranged mean vector $\mu$. [sent-189, score-0.1]
69 $\psi_t \in \{0, \ldots, T-1\}$ maps absolute time onto relative cycle positions or phases, and $\rho_t$ denotes the speed of the motion such that $\psi_{t+1} = (\psi_t + \rho_t) \bmod T$. [sent-195, score-0.675]
70 Given representation (2), body positions are characterized entirely by the low-dimensional state-vector $\phi_t = (c_t, \psi_t, \ldots)$. [sent-197, score-0.167]
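As a small illustration of how a low-dimensional state could be turned into a pose, the sketch below advances the phase and evaluates a linear eigen-curve model at that phase. Equation (2) itself is missing from this extract, so the form used here (rearranged mean plus a coefficient-weighted sum of the rearranged components) is an assumption, and the names `advance_phase` and `pose_at_phase`, the array layouts, and the nearest-sample phase lookup are all hypothetical.

```python
import numpy as np

def advance_phase(psi, rho, T):
    """Phase update: move forward by the speed rho, wrapping at the cycle length T."""
    return (psi + rho) % T

def pose_at_phase(mu, V, c, psi):
    """Assumed form of representation (2).

    mu : (T, m) rearranged mean cycle
    V  : (q, T, m) rearranged principal components
    c  : (q,) linear coefficients
    psi: phase in [0, T); rounded to the nearest sample in this sketch
    """
    row = int(np.round(psi)) % mu.shape[0]
    return mu[row] + np.tensordot(c, V[:, row, :], axes=1)
```

With m = 19 joint angles and a handful of components, the tracker then only has to search over c, the phase, the speed, and whatever global pose parameters the state-vector contains.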
71 Hence the problem is to calculate the posterior distribution of $\phi_t$ given images up to time t. [sent-202, score-0.085]
72 Due to the Markovian structure underlying $\phi_t$, this posterior distribution is given recursively by: $p(\phi_t \mid I_t) \propto p(I_t \mid \phi_t) \int p(\phi_t \mid \phi_{t-1})\, p(\phi_{t-1} \mid I_{t-1})\, d\phi_{t-1}$ (3). Here $p(I_t \mid \phi_t)$ is the likelihood of observing the image $I_t$ given the parameters and $p(\phi_{t-1} \mid I_{t-1})$ is the posterior probability from the previous instant. [sent-203, score-0.229]
73 $p(\phi_t \mid \phi_{t-1})$ is a temporal prior probability distribution that encodes how the parameters $\phi_t$ change over time. [sent-204, score-0.126]
74 Let $M(I_t, \phi_t)$ be a function that takes image texture at time t and, given the model parameters, maps it onto the surfaces of the 3D model using the camera model. [sent-207, score-0.2]
75 The temporal prior, $p(\phi_t \mid \phi_{t-1})$, models how the parameters describing the body configuration are expected to vary over time. [sent-211, score-0.267]
76 Given the generative model above we can compare the image at time t - 1 to the image $I_t$ at t. [sent-215, score-0.248]
77 Specifically, we compute this likelihood term separately for each limb. [sent-216, score-0.077]
78 To avoid numerical integration over image regions, we generate $n_s$ pixel locations stochastically. [sent-217, score-0.087]
79 Denoting the ith sample for limb j as $x_{j,i}$, we obtain the following measure of discrepancy: $E \equiv \sum_{i=1}^{n_s} \left( I_t(x_{j,i}) - M^{-1}(M(I_{t-1}, \phi_{t-1}), \phi_t)(x_{j,i}) \right)^2$ (4). [sent-218, score-0.204]
80 As an approximate likelihood term we use $p(I_t \mid \phi_t) = \prod_j \left[ \frac{q(\alpha_j)}{\sqrt{2\pi}\,\sigma(\alpha_j)} \exp\left(-E / (2\sigma(\alpha_j)^2 n_s)\right) + (1 - q(\alpha_j))\, p_{\mathrm{occluded}} \right]$, [sent-219, score-0.04]
81 Upper rows: frames 0, 10, 20, 30, 40, 50 with the projection of the expected model configuration overlaid. [sent-221, score-0.049]
82 Lower row: expected 3D configuration in the same frames. [sent-222, score-0.049]
83 where $p_{\mathrm{occluded}}$ is a constant probability that a limb is occluded, $\alpha_j$ is the angle between the principal axis of limb j and the image plane of the camera, $\sigma(\alpha_j)$ is a function that increases with narrow viewing angles, and $q(\alpha_j) = \cos(\alpha_j)$ if limb j is non-occluded, or 0 if limb j is occluded. [sent-223, score-1.119]
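The per-limb likelihood described above can be sketched as a mixture of a Gaussian-shaped match term and a constant occlusion term. Several pieces are assumptions rather than details taken from the source: the Gaussian normalising constant, the specific choice sigma(alpha) = sigma0 / cos(alpha) for a function that grows at narrow viewing angles, the default values, and the idea that the back-projected previous image is already available as an array `predicted_t`. All function names are hypothetical.

```python
import numpy as np

def discrepancy(image_t, predicted_t, pixels):
    """Sum of squared differences at sampled pixel locations (rows, cols)."""
    rows, cols = pixels[:, 0], pixels[:, 1]
    diff = image_t[rows, cols] - predicted_t[rows, cols]
    return float(np.sum(diff ** 2))

def limb_likelihood(E, alpha, occluded, n_s, p_occluded=0.1, sigma0=1.0):
    """Mixture of a match term and a constant occlusion term for one limb."""
    q = 0.0 if occluded else float(np.cos(alpha))
    # Assumed sigma(alpha): larger when the limb points towards the camera.
    sigma = sigma0 / max(float(np.cos(alpha)), 1e-3)
    gauss = np.exp(-E / (2.0 * sigma ** 2 * n_s)) / (np.sqrt(2.0 * np.pi) * sigma)
    return q * gauss + (1.0 - q) * p_occluded

def image_likelihood(limb_terms):
    """Product over limbs of the per-limb likelihood values."""
    return float(np.prod(limb_terms))
```

The occlusion term keeps the product from collapsing to zero when one limb matches poorly, which matters because a single self-occluded limb would otherwise veto an otherwise plausible pose.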
84 As is typical for tracking problems, the posterior distribution may well be multi-modal due to the nonlinearity of the likelihood function. [sent-225, score-0.363]
85 Hence, we use a particle filter for inference where the posterior is represented as a weighted set of state samples, $\phi_i$, which are propagated in time. [sent-226, score-0.16]
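A generic particle filter step of the kind referred to above can be sketched as follows. This is a textbook resample, propagate, reweight loop rather than the tracker of [7]; the callables `propagate` and `likelihood` stand in for the temporal prior and the image likelihood and are hypothetical, as are the function names.

```python
import numpy as np

def particle_filter_step(particles, weights, propagate, likelihood, rng):
    """One update of a weighted sample set approximating the posterior."""
    n = len(particles)
    # Resample according to the previous posterior weights.
    idx = rng.choice(n, size=n, p=weights)
    # Propagate through the temporal prior p(phi_t | phi_{t-1}).
    particles = propagate(particles[idx], rng)
    # Reweight by the likelihood p(I_t | phi_t) and normalise.
    w = np.array([likelihood(s) for s in particles], dtype=float)
    w = w / w.sum()
    return particles, w

def posterior_mean(particles, weights):
    """Weighted average of the samples, used to display an expected state."""
    return weights @ particles
```

A plain weighted average is only meaningful for non-circular state components; a phase variable that wraps at T would need a circular mean instead.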
86 4 Experiment To illustrate the method we show an example of tracking a walking person in a cluttered scene in Figure 2. [sent-230, score-0.471]
87 The 3D motion is recovered from a monocular sequence using only the motion between frames. [sent-231, score-1.08]
88 To visualize the posterior distribution we display the projection of the 3D model corresponding to the expected value of the model parameters: $\sum_i p_i \phi_i$, where $p_i$ is the likelihood of sample $\phi_i$. [sent-232, score-0.091]
89 All parameters were initialized manually with a Gaussian prior at time t = 0. [sent-233, score-0.086]
90 The learned model is able to generalize to the subject in the sequence who was not part of the training set. [sent-234, score-0.098]
91 5 Conclusions We described an automated method for learning periodic human motions from training data using statistical methods for detecting the length of the periods in the data, segmenting it into cycles, and optimally aligning the cycles. [sent-235, score-0.597]
92 We also presented a PCA method for building a statistical eigen-model of the motion curves that copes with missing data and enforces smoothness between the beginning and ending of a motion cycle. [sent-236, score-1.155]
93 The learned eigen-curves are used as a prior probability distribution in a Bayesian tracking framework. [sent-237, score-0.364]
94 Tracking in monocular image sequences was performed using a particle filtering technique and results were shown for a cluttered image sequence. [sent-238, score-0.434]
95 Fleet for many discussions on human motion and Bayesian estimation. [sent-243, score-0.644]
96 Bayesian estimation of 3-d human motion from an image sequence. [sent-267, score-0.731]
97 Hastie, Learning and tracking human motion using functional analysis, submitted: IEEE Workshop on Human Modeling, Analysis and Synthesis, 2000. [sent-274, score-0.973]
98 Stochastic tracking of 3D human figures using 2D image motion. [sent-289, score-0.55]
99 Parameterized modeling and recognition of activities in temporal surfaces. [sent-294, score-0.223]
100 "Imputing missing data for gene expression arrays," 2000, Working Paper, Department of Statistics, Stanford University. [sent-304, score-0.181]
wordName wordTfidf (topN-words)
[('motion', 0.453), ('cpt', 0.275), ('tracking', 0.272), ('limb', 0.204), ('human', 0.191), ('missing', 0.181), ('cycles', 0.168), ('cycle', 0.131), ('angles', 0.13), ('offset', 0.118), ('ctj', 0.116), ('monocular', 0.116), ('walking', 0.112), ('aligned', 0.112), ('body', 0.11), ('sequences', 0.106), ('motions', 0.105), ('activities', 0.096), ('principal', 0.095), ('periodic', 0.093), ('reference', 0.089), ('sidenbladh', 0.087), ('image', 0.087), ('alignment', 0.084), ('stanford', 0.081), ('fourier', 0.078), ('svd', 0.075), ('particle', 0.075), ('temporal', 0.074), ('domain', 0.074), ('segmenting', 0.068), ('ct', 0.068), ('bayesian', 0.067), ('smooth', 0.067), ('hastie', 0.063), ('slices', 0.063), ('pt', 0.06), ('step', 0.06), ('vk', 0.059), ('video', 0.058), ('aligning', 0.058), ('cyclic', 0.058), ('enforcing', 0.058), ('imputing', 0.058), ('poccluded', 0.058), ('sequence', 0.058), ('functional', 0.057), ('positions', 0.057), ('modeling', 0.053), ('pca', 0.053), ('transitions', 0.053), ('iterative', 0.053), ('joint', 0.052), ('prior', 0.052), ('posterior', 0.051), ('occluded', 0.05), ('repetitive', 0.05), ('cluttered', 0.05), ('rearranged', 0.05), ('individual', 0.049), ('configuration', 0.049), ('ratio', 0.045), ('camera', 0.045), ('za', 0.045), ('slice', 0.045), ('variation', 0.044), ('signal', 0.043), ('design', 0.043), ('length', 0.043), ('aj', 0.042), ('angle', 0.042), ('black', 0.042), ('zk', 0.042), ('specifically', 0.041), ('learned', 0.04), ('likelihood', 0.04), ('generative', 0.04), ('exhaustive', 0.039), ('automated', 0.039), ('cvpr', 0.039), ('automatically', 0.038), ('collection', 0.038), ('person', 0.037), ('reconstruct', 0.037), ('enforces', 0.037), ('falls', 0.037), ('viewing', 0.037), ('ki', 0.037), ('compute', 0.037), ('iiii', 0.036), ('propagated', 0.034), ('texture', 0.034), ('time', 0.034), ('details', 0.033), ('subjects', 0.033), ('steps', 0.032), ('singular', 0.031), ('zi', 0.031), ('curves', 0.031), ('brown', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999994 82 nips-2000-Learning and Tracking Cyclic Human Motion
Author: Dirk Ormoneit, Hedvig Sidenbladh, Michael J. Black, Trevor Hastie
Abstract: We present methods for learning and tracking human motion in video. We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into
2 0.21603552 83 nips-2000-Machine Learning for Video-Based Rendering
Author: Arno Schödl, Irfan A. Essa
Abstract: We present techniques for rendering and animation of realistic scenes by analyzing and training on short video sequences. This work extends the new paradigm for computer animation, video textures, which uses recorded video to generate novel animations by replaying the video samples in a new order. Here we concentrate on video sprites, which are a special type of video texture. In video sprites, instead of storing whole images, the object of interest is separated from the background and the video samples are stored as a sequence of alpha-matted sprites with associated velocity information. They can be rendered anywhere on the screen to create a novel animation of the object. We present methods to create such animations by finding a sequence of sprite samples that is both visually smooth and follows a desired path. To estimate visual smoothness, we train a linear classifier to estimate visual similarity between video samples. If the motion path is known in advance, we use beam search to find a good sample sequence. We can specify the motion interactively by precomputing the sequence cost function using Q-learning.
3 0.15261358 80 nips-2000-Learning Switching Linear Models of Human Motion
Author: Vladimir Pavlovic, James M. Rehg, John MacCormick
Abstract: The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. Effective models of human dynamics can be learned from motion capture data using switching linear dynamic system (SLDS) models. We present results for human motion synthesis, classification, and visual tracking using learned SLDS models. Since exact inference in SLDS is intractable, we present three approximate inference algorithms and compare their performance. In particular, a new variational inference algorithm is obtained by casting the SLDS model as a Dynamic Bayesian Network. Classification experiments show the superiority of SLDS over conventional HMM's for our problem domain.
4 0.12938567 72 nips-2000-Keeping Flexible Active Contours on Track using Metropolis Updates
Author: Trausti T. Kristjansson, Brendan J. Frey
Abstract: Condensation, a form of likelihood-weighted particle filtering, has been successfully used to infer the shapes of highly constrained
5 0.12179638 53 nips-2000-Feature Correspondence: A Markov Chain Monte Carlo Approach
Author: Frank Dellaert, Steven M. Seitz, Sebastian Thrun, Charles E. Thorpe
Abstract: When trying to recover 3D structure from a set of images, the most difficult problem is establishing the correspondence between the measurements. Most existing approaches assume that features can be tracked across frames, whereas methods that exploit rigidity constraints to facilitate matching do so only under restricted camera motion. In this paper we propose a Bayesian approach that avoids the brittleness associated with singling out one
6 0.093233928 78 nips-2000-Learning Joint Statistical Models for Audio-Visual Fusion and Segregation
7 0.087115265 45 nips-2000-Emergence of Movement Sensitive Neurons' Properties by Learning a Sparse Code for Natural Moving Images
8 0.086384237 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
9 0.082836516 98 nips-2000-Partially Observable SDE Models for Image Sequence Recognition Tasks
10 0.081899442 61 nips-2000-Generalizable Singular Value Decomposition for Ill-posed Datasets
11 0.075084537 27 nips-2000-Automatic Choice of Dimensionality for PCA
12 0.073446214 103 nips-2000-Probabilistic Semantic Video Indexing
13 0.073005632 99 nips-2000-Periodic Component Analysis: An Eigenvalue Method for Representing Periodic Structure in Speech
14 0.071396485 135 nips-2000-The Manhattan World Assumption: Regularities in Scene Statistics which Enable Bayesian Inference
15 0.07056684 57 nips-2000-Four-legged Walking Gait Control Using a Neuromorphic Chip Interfaced to a Support Vector Learning Algorithm
16 0.070412792 137 nips-2000-The Unscented Particle Filter
17 0.069464236 106 nips-2000-Propagation Algorithms for Variational Bayesian Learning
18 0.068941206 121 nips-2000-Sparse Kernel Principal Component Analysis
19 0.068325952 107 nips-2000-Rate-coded Restricted Boltzmann Machines for Face Recognition
20 0.067649573 31 nips-2000-Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models
topicId topicWeight
[(0, 0.236), (1, -0.094), (2, 0.142), (3, 0.13), (4, -0.029), (5, 0.021), (6, 0.161), (7, 0.133), (8, -0.139), (9, -0.093), (10, 0.047), (11, 0.063), (12, 0.14), (13, -0.023), (14, -0.165), (15, -0.138), (16, 0.093), (17, -0.015), (18, 0.061), (19, -0.072), (20, 0.046), (21, 0.04), (22, 0.056), (23, -0.204), (24, -0.212), (25, 0.037), (26, -0.052), (27, -0.082), (28, -0.137), (29, -0.047), (30, 0.06), (31, -0.038), (32, -0.023), (33, -0.025), (34, -0.123), (35, 0.029), (36, -0.077), (37, 0.002), (38, -0.173), (39, 0.054), (40, 0.088), (41, 0.027), (42, 0.014), (43, -0.016), (44, 0.118), (45, -0.003), (46, -0.1), (47, -0.043), (48, -0.026), (49, 0.075)]
simIndex simValue paperId paperTitle
same-paper 1 0.96021456 82 nips-2000-Learning and Tracking Cyclic Human Motion
Author: Dirk Ormoneit, Hedvig Sidenbladh, Michael J. Black, Trevor Hastie
Abstract: We present methods for learning and tracking human motion in video. We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into
2 0.74721515 83 nips-2000-Machine Learning for Video-Based Rendering
Author: Arno Schödl, Irfan A. Essa
Abstract: We present techniques for rendering and animation of realistic scenes by analyzing and training on short video sequences. This work extends the new paradigm for computer animation, video textures, which uses recorded video to generate novel animations by replaying the video samples in a new order. Here we concentrate on video sprites, which are a special type of video texture. In video sprites, instead of storing whole images, the object of interest is separated from the background and the video samples are stored as a sequence of alpha-matted sprites with associated velocity information. They can be rendered anywhere on the screen to create a novel animation of the object. We present methods to create such animations by finding a sequence of sprite samples that is both visually smooth and follows a desired path. To estimate visual smoothness, we train a linear classifier to estimate visual similarity between video samples. If the motion path is known in advance, we use beam search to find a good sample sequence. We can specify the motion interactively by precomputing the sequence cost function using Q-learning.
3 0.6660533 80 nips-2000-Learning Switching Linear Models of Human Motion
Author: Vladimir Pavlovic, James M. Rehg, John MacCormick
Abstract: The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. Effective models of human dynamics can be learned from motion capture data using switching linear dynamic system (SLDS) models. We present results for human motion synthesis, classification, and visual tracking using learned SLDS models. Since exact inference in SLDS is intractable, we present three approximate inference algorithms and compare their performance. In particular, a new variational inference algorithm is obtained by casting the SLDS model as a Dynamic Bayesian Network. Classification experiments show the superiority of SLDS over conventional HMM's for our problem domain.
4 0.50138533 53 nips-2000-Feature Correspondence: A Markov Chain Monte Carlo Approach
Author: Frank Dellaert, Steven M. Seitz, Sebastian Thrun, Charles E. Thorpe
Abstract: When trying to recover 3D structure from a set of images, the most difficult problem is establishing the correspondence between the measurements. Most existing approaches assume that features can be tracked across frames, whereas methods that exploit rigidity constraints to facilitate matching do so only under restricted camera motion. In this paper we propose a Bayesian approach that avoids the brittleness associated with singling out one
5 0.43541604 61 nips-2000-Generalizable Singular Value Decomposition for Ill-posed Datasets
Author: Ulrik Kjems, Lars Kai Hansen, Stephen C. Strother
Abstract: We demonstrate that statistical analysis of ill-posed data sets is subject to a bias, which can be observed when projecting independent test set examples onto a basis defined by the training examples. Because the training examples in an ill-posed data set do not fully span the signal space the observed training set variances in each basis vector will be too high compared to the average variance of the test set projections onto the same basis vectors. On the basis of this understanding we introduce the Generalizable Singular Value Decomposition (GenSVD) as a means to reduce this bias by re-estimation of the singular values obtained in a conventional Singular Value Decomposition, allowing for a generalization performance increase of a subsequent statistical model. We demonstrate that the algorithm successfully corrects bias in a data set from a functional PET activation study of the human brain.
6 0.40210396 72 nips-2000-Keeping Flexible Active Contours on Track using Metropolis Updates
7 0.39312458 135 nips-2000-The Manhattan World Assumption: Regularities in Scene Statistics which Enable Bayesian Inference
9 0.35581604 125 nips-2000-Stability and Noise in Biochemical Switches
10 0.35235384 45 nips-2000-Emergence of Movement Sensitive Neurons' Properties by Learning a Sparse Code for Natural Moving Images
11 0.34328103 30 nips-2000-Bayesian Video Shot Segmentation
12 0.32642195 103 nips-2000-Probabilistic Semantic Video Indexing
13 0.32216939 27 nips-2000-Automatic Choice of Dimensionality for PCA
15 0.30947423 99 nips-2000-Periodic Component Analysis: An Eigenvalue Method for Representing Periodic Structure in Speech
16 0.30752894 93 nips-2000-On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems
17 0.28895786 3 nips-2000-A Gradient-Based Boosting Algorithm for Regression Problems
18 0.28643328 98 nips-2000-Partially Observable SDE Models for Image Sequence Recognition Tasks
19 0.27891585 73 nips-2000-Kernel-Based Reinforcement Learning in Average-Cost Problems: An Application to Optimal Portfolio Choice
20 0.25958648 64 nips-2000-High-temperature Expansions for Learning Models of Nonnegative Data
topicId topicWeight
[(10, 0.036), (17, 0.153), (32, 0.032), (33, 0.033), (54, 0.017), (55, 0.024), (62, 0.04), (65, 0.024), (67, 0.048), (75, 0.025), (76, 0.055), (79, 0.025), (81, 0.038), (90, 0.028), (94, 0.307), (97, 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.83796483 82 nips-2000-Learning and Tracking Cyclic Human Motion
Author: Dirk Ormoneit, Hedvig Sidenbladh, Michael J. Black, Trevor Hastie
Abstract: We present methods for learning and tracking human motion in video. We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into
2 0.77581394 98 nips-2000-Partially Observable SDE Models for Image Sequence Recognition Tasks
Author: Javier R. Movellan, Paul Mineiro, Ruth J. Williams
Abstract: This paper explores a framework for recognition of image sequences using partially observable stochastic differential equation (SDE) models. Monte-Carlo importance sampling techniques are used for efficient estimation of sequence likelihoods and sequence likelihood gradients. Once the network dynamics are learned, we apply the SDE models to sequence recognition tasks in a manner similar to the way Hidden Markov models (HMMs) are commonly applied. The potential advantage of SDEs over HMMs is the use of continuous state dynamics. We present encouraging results for a video sequence recognition task in which SDE models provided excellent performance when compared to hidden Markov models.
3 0.55824548 80 nips-2000-Learning Switching Linear Models of Human Motion
Author: Vladimir Pavlovic, James M. Rehg, John MacCormick
Abstract: The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. Effective models of human dynamics can be learned from motion capture data using switching linear dynamic system (SLDS) models. We present results for human motion synthesis, classification, and visual tracking using learned SLDS models. Since exact inference in SLDS is intractable, we present three approximate inference algorithms and compare their performance. In particular, a new variational inference algorithm is obtained by casting the SLDS model as a Dynamic Bayesian Network. Classification experiments show the superiority of SLDS over conventional HMM's for our problem domain.
4 0.51395077 122 nips-2000-Sparse Representation for Gaussian Process Models
Author: Lehel Csató, Manfred Opper
Abstract: We develop an approach for a sparse representation for Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm together with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world data sets indicate the efficiency of the approach.
5 0.49975967 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
Author: Michael S. Gray, Terrence J. Sejnowski, Javier R. Movellan
Abstract: We examine eight different techniques for developing visual representations in machine vision tasks. In particular we compare different versions of principal component and independent component analysis in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance.
6 0.49871039 79 nips-2000-Learning Segmentation by Random Walks
7 0.49748635 74 nips-2000-Kernel Expansions with Unlabeled Examples
8 0.49726635 95 nips-2000-On a Connection between Kernel PCA and Metric Multidimensional Scaling
9 0.49593201 107 nips-2000-Rate-coded Restricted Boltzmann Machines for Face Recognition
10 0.49468279 83 nips-2000-Machine Learning for Video-Based Rendering
11 0.49462473 4 nips-2000-A Linear Programming Approach to Novelty Detection
12 0.49405357 130 nips-2000-Text Classification using String Kernels
13 0.4915204 133 nips-2000-The Kernel Gibbs Sampler
14 0.49073595 106 nips-2000-Propagation Algorithms for Variational Bayesian Learning
15 0.49043736 60 nips-2000-Gaussianization
16 0.48972592 37 nips-2000-Convergence of Large Margin Separable Linear Classification
17 0.48418757 71 nips-2000-Interactive Parts Model: An Application to Recognition of On-line Cursive Script
18 0.48050526 51 nips-2000-Factored Semi-Tied Covariance Matrices
19 0.47974384 10 nips-2000-A Productive, Systematic Framework for the Representation of Visual Structure
20 0.47884023 146 nips-2000-What Can a Single Neuron Compute?