
134 nips-2006-Modeling Human Motion Using Binary Latent Variables


Source: pdf

Author: Graham W. Taylor, Geoffrey E. Hinton, Sam T. Roweis

Abstract: We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued “visible” variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes on-line inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. Website: http://www.cs.toronto.edu/∼gwtaylor/publications/nips2006mhmublv/

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued “visible” variables that represent joint angles. [sent-6, score-0.748]

2 The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. [sent-7, score-1.17]

3 We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. [sent-10, score-0.862]

4 1 Introduction Recent advances in motion capture technology have fueled interest in the analysis and synthesis of complex human motion for animation and tracking. [sent-14, score-0.877]

5 Models based on the physics of masses and springs have produced some impressive results by using sophisticated “energy-based” learning methods [1] to estimate physical parameters from motion capture data [2]. [sent-15, score-0.407]

6 The simplest way to generate new motion sequences based on data is to concatenate parts of training sequences [3]. [sent-17, score-0.721]

7 Another method is to transform motion in the training data to new sequences by learning to adjust its style or other characteristics [4, 5, 6]. [sent-18, score-0.614]

8 Data from modern motion capture systems is high-dimensional and contains complex non-linear relationships between the components of the observation vector, which usually represent joint angles with respect to some skeletal structure. [sent-20, score-0.686]

9 To model N bits of information about the past history they require $2^N$ hidden states. [sent-22, score-0.349]

10 componential) hidden state that has a representational capacity which is linear in the number of components. [sent-25, score-0.271]
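
The difference in representational capacity can be made concrete with a little arithmetic. The sketch below assumes that the models needing an exponential number of states are those with a single discrete hidden state variable, and that each binary unit of a distributed state stores one bit. The function names are illustrative only.

```python
def states_needed_single_discrete(n_bits: int) -> int:
    """States one discrete hidden variable needs to carry n_bits of history."""
    return 2 ** n_bits


def units_needed_distributed(n_bits: int) -> int:
    """Binary hidden units needed when each unit carries one bit."""
    return n_bits


# Print both counts side by side for a few history sizes.
for n in (4, 8, 16):
    print(n, states_needed_single_discrete(n), units_needed_distributed(n))
```

The first count grows exponentially with the amount of history while the second grows linearly.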

11 2 An energy-based model for vectors of real-values In general, using distributed binary representations for hidden state in directed models of time series makes inference difficult. [sent-27, score-0.477]

12 Typically, RBMs use binary logistic units for both the visible data and hidden variables, but in our application the data (comprised of joint angles) is continuous. [sent-29, score-1.111]

13 The graphical model has a layer of visible units v and a layer of hidden units h; there are undirected connections between layers but no connections within a layer. [sent-31, score-1.795]

14 The main advantage of using this undirected, “energy-based” model rather than a directed “belief net” is that inference is very easy because the hidden units become conditionally independent when the states of the visible units are observed. [sent-35, score-1.436]
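
Because the hidden units are conditionally independent given the visible units, inference reduces to one logistic function per hidden unit. A minimal sketch, assuming weights `W[i][j]` between visible unit i and hidden unit j and hidden biases `b`; these names and the example values are illustrative and not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def hidden_probabilities(v, W, b):
    """Return p(h_j = 1 | v) for every hidden unit j.

    v: visible values, W[i][j]: weight between visible i and hidden j,
    b: hidden biases.
    """
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


# Three real-valued visible units, two binary hidden units.
v = [0.5, -1.0, 2.0]
W = [[0.1, -0.2], [0.0, 0.3], [0.4, 0.1]]
b = [0.0, 0.1]
p = hidden_probabilities(v, W, b)
```

Each entry of `p` is computed without reference to the other hidden units, which is what makes a single parallel update of the whole hidden layer possible.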

15 The learning rule is: $\Delta w_{ij} \propto \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}}$ (4), where the first expectation (over hidden unit activations) is with respect to the data distribution and the second expectation is with respect to the distribution of “reconstructed” data. [sent-38, score-1.005]

16 The reconstructions are generated by starting a Markov chain at the data distribution, updating all the hidden units in parallel by sampling (Eq. [sent-39, score-0.607]

17 2) and then updating all the visible units in parallel by sampling (Eq. [sent-40, score-0.711]

18 For both expectations, the states of the hidden units are conditional on the states of the visible units, not vice versa. [sent-42, score-1.085]
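
The reconstruction-based learning rule can be sketched for a single training vector as follows. This is a sketch under stated assumptions rather than the authors' implementation: the visible units are taken to be unit-variance Gaussians with mean `c[i] + sum_j h[j] * W[i][j]`, the reconstruction uses that mean instead of a sample, and all function and parameter names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cd1_weight_gradient(v_data, W, b, c, rng):
    """One-sample estimate of <v_i h_j>_data - <v_i h_j>_recon."""
    n_vis, n_hid = len(c), len(b)

    def hidden_probs(v):
        return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(n_vis)))
                for j in range(n_hid)]

    # First term: binary hidden states sampled given the data.
    h_data = [1.0 if rng.random() < p else 0.0 for p in hidden_probs(v_data)]
    # One-step reconstruction of the visible units (Gaussian mean, no noise).
    v_recon = [c[i] + sum(h_data[j] * W[i][j] for j in range(n_hid))
               for i in range(n_vis)]
    # Second term: hidden probabilities given the reconstruction.
    p_recon = hidden_probs(v_recon)
    return [[v_data[i] * h_data[j] - v_recon[i] * p_recon[j]
             for j in range(n_hid)]
            for i in range(n_vis)]


def apply_update(W, grad, learning_rate):
    """Return new weights after one gradient step."""
    return [[W[i][j] + learning_rate * grad[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]


rng = random.Random(0)
W = [[0.1, -0.2], [0.3, 0.0]]
grad = cd1_weight_gradient([0.5, -1.0], W, [0.0, 0.0], [0.0, 0.0], rng)
W_new = apply_update(W, grad, learning_rate=0.01)
```

In both terms the hidden states are computed from the visible states, never the other way round, which matches the description above.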

19 The learning rule for the hidden biases is just a simplified version of Eq. [sent-43, score-0.342]

20 1 The conditional RBM model The RBM we have described above models static frames of data, but does not incorporate any temporal information. [sent-46, score-0.294]

21 We can model temporal dependencies by treating the visible variables in the previous time slice as additional fixed inputs [10]. [sent-47, score-0.51]

22 We add two types of directed connections (Figure 2): autoregressive connections from the past n configurations (time steps) of the visible units to the current visible configuration, and connections from the past m visibles to the current hidden configuration. [sent-49, score-2.126]

23 The addition of these directed connections turns the RBM into a conditional RBM (CRBM). [sent-50, score-0.326]

24 For any setting of the parameters, the gradient of the quadratic log likelihood will always overwhelm the gradient due to the weighted input from the binary hidden units provided the value $v_i$ of a visible unit is far enough from its bias, $c_i$. [sent-54, score-1.164]

25 Figure 1: In a trained model, probabilities of each feature being “on” conditional on the data at the visible units. [sent-55, score-0.552]

26 , t − n, the hidden units at time t are conditionally independent. [sent-62, score-0.497]

27 The only change is that when we update the visible and hidden units, we implement the directed connections by treating data from previous time steps as a dynamically changing bias. [sent-64, score-1.014]
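
Treating past frames as a dynamically changing bias amounts to adding a history-dependent term to each static bias before the usual RBM updates. A minimal sketch, assuming for simplicity that the same number of past frames feeds both the visible and the hidden units; the array layout and names (`A`, `D`, `history`) are illustrative choices.

```python
def dynamic_biases(history, A, D, b, c):
    """Return (c_hat, b_hat), the effective visible and hidden biases.

    history[q]: visible vector q + 1 steps in the past (most recent first)
    A[q][k][i]: weight from past visible k to current visible i
    D[q][k][j]: weight from past visible k to current hidden j
    b, c:       static hidden and visible biases
    """
    c_hat = list(c)
    b_hat = list(b)
    for q, v_past in enumerate(history):
        for k, v_k in enumerate(v_past):
            for i in range(len(c)):
                c_hat[i] += A[q][k][i] * v_k
            for j in range(len(b)):
                b_hat[j] += D[q][k][j] * v_k
    return c_hat, b_hat


# One past frame, two visible units, one hidden unit.
history = [[1.0, 2.0]]
A = [[[0.5, 0.0], [0.0, 0.5]]]
D = [[[1.0], [-1.0]]]
c_hat, b_hat = dynamic_biases(history, A, D, b=[0.0], c=[0.1, 0.2])
```

Once `c_hat` and `b_hat` are computed, the remaining updates are those of an ordinary RBM with these biases substituted for the static ones.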

28 The contrastive divergence learning rule for hidden biases is given in Eq. [sent-65, score-0.423]

29 5 and the equivalent learning rule for the temporal connections that determine the dynamically changing hidden unit biases is: $\Delta d_{ij}^{(t-q)} \propto v_i^{t-q} \left( \langle h_j^t \rangle_{\text{data}} - \langle h_j^t \rangle_{\text{recon}} \right)$ (6), [sent-66, score-0.91]

30 where $d_{ij}^{(t-q)}$ is the log-linear parameter (weight) connecting visible unit i at time t − q to hidden unit j for q = 1. [sent-67, score-0.912]

31 Similarly, the learning rule for the autoregressive connections that determine the dynamically changing visible unit biases is: $\Delta a_{ki}^{(t-q)} \propto v_k^{t-q} \left( v_i^t - \langle v_i^t \rangle_{\text{recon}} \right)$ (7), where $a_{ki}^{(t-q)}$ is the weight from visible unit k at time t − q to visible unit i. [sent-70, score-0.961]

32 The autoregressive weights can model short-term temporal structure very well, leaving the hidden units to model longer-term, higher level structure. [sent-72, score-1.636]
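
The updates for the two kinds of directed weights can be sketched for one time lag as outer products between the past visible vector and a difference of current-step statistics. The hidden-side update follows the rule for the temporal connections described above. The autoregressive update is written by analogy (past input times data value minus reconstruction) and should be read as an assumption of this sketch. All names are hypothetical.

```python
def directed_weight_gradients(v_past, v_now, v_recon, h_data, h_recon):
    """Gradient estimates for one lag q.

    v_past:  visible vector at time t - q
    v_now:   observed visible vector at time t
    v_recon: reconstructed visible vector at time t
    h_data:  hidden states (or probabilities) given the data
    h_recon: hidden probabilities given the reconstruction
    Returns (grad_A, grad_D) with grad_A[k][i] and grad_D[i][j].
    """
    grad_D = [[v_i * (hd - hr) for hd, hr in zip(h_data, h_recon)]
              for v_i in v_past]
    grad_A = [[v_k * (vn - vr) for vn, vr in zip(v_now, v_recon)]
              for v_k in v_past]
    return grad_A, grad_D


grad_A, grad_D = directed_weight_gradients(
    v_past=[1.0, 2.0],
    v_now=[0.5, 0.0],
    v_recon=[0.25, 0.5],
    h_data=[1.0],
    h_recon=[0.5],
)
```

Both gradients vanish when the reconstruction statistics match the data statistics, as expected of a contrastive update.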

33 During training, the states of the hidden units are determined by both the input they receive from the observed data and the input they receive from the previous time slices. [sent-73, score-0.667]

34 The learning rule for W remains the same as a standard RBM, but has a different effect because the states of the hidden units are now influenced by the previous visible units. [sent-74, score-1.006]

35 Figure 2: Architecture of our model (in our experiments, we use three previous time steps). While learning a model of motion, we do not need to proceed sequentially through the training data sequences. [sent-76, score-0.282]

36 As long as we isolate “chunks” of frames (the size depending on the order of the directed connections), these can be mixed and formed into mini-batches. [sent-78, score-0.234]
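
The chunking step can be sketched as follows: every training case is a run of consecutive frames long enough to hold the history plus the current frame, and the cases are shuffled before being grouped into mini-batches. The helper names and the batch size are illustrative assumptions, and integers stand in for frame vectors in the example.

```python
import random


def make_chunks(sequences, order):
    """Return every run of order + 1 consecutive frames from each sequence."""
    chunks = []
    for seq in sequences:
        for t in range(order, len(seq)):
            chunks.append(seq[t - order:t + 1])
    return chunks


def minibatches(chunks, batch_size, rng):
    """Shuffle the chunks and split them into mini-batches."""
    shuffled = chunks[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


# Two toy "sequences"; each integer stands in for one frame.
sequences = [list(range(10)), list(range(100, 105))]
chunks = make_chunks(sequences, order=3)
batches = minibatches(chunks, batch_size=4, rng=random.Random(0))
```

Chunks never straddle a sequence boundary, so frames from different recordings are not mistaken for each other's history.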

37 7 by its expected value and we also used the expected value of vi when computing the probability of activation of the hidden units. [sent-86, score-0.348]

38 However, to compute the one-step reconstructions of the data, we used stochastically chosen binary values of the hidden units. [sent-87, score-0.382]

39 This prevents the hidden activities from transmitting an unbounded amount of information from the data to the reconstruction [11]. [sent-88, score-0.342]

40 4) we used the stochastically chosen binary values of the hidden units in the first term (under the data), but replaced hj by its expected value in the second term (under the reconstruction). [sent-91, score-0.743]

41 We took this approach because the reconstruction of the data depends on the binary choices made when selecting hidden state. [sent-92, score-0.378]

42 Thus when we infer the hiddens from the reconstructed data, the probabilities are highly correlated with the binary hidden states inferred from the data. [sent-93, score-0.385]

43 The inference step, conditional on past visible states, is approximate because it ignores the future (it does not do smoothing). [sent-96, score-0.549]

44 Because of the directed connections, exact inference within the model should include both a forward and backward pass through each sequence (we currently perform only a forward pass). [sent-97, score-0.31]

45 We have avoided a backward pass because missing values create problems in undirected models, so it is hard to perform learning efficiently using the full posterior. [sent-98, score-0.257]

46 The processed data consists of 3D joint angles derived from 30 (CMU) or 17 (MIT) markers plus a root (coccyx, near the base of the back) orientation and displacement. [sent-101, score-0.278]

47 Each of the remaining joint angles had between one and three degrees of freedom. [sent-104, score-0.238]

48 All of the joint angles and the root orientation were converted from Euler angles to the “exponential map” parameterization [13]. [sent-105, score-0.391]

49 The autoregressive connections in our model can be thought of as doing a kind of “whitening” of the data. [sent-118, score-0.302]

50 Perhaps the most direct demonstration, which exploits the fact that it is a probability density model of sequences, is to use the model to generate de-novo a number of synthetic motion sequences. [sent-123, score-0.505]

51 Video files of these sequences are available on the website mentioned in the abstract; these motions have not been retouched by hand in any motion editing software. [sent-124, score-0.687]

52 Note that we also do not have to keep a reservoir of training data sequences around for generation - we only need the weights of the model and a few valid frames for initialization. [sent-125, score-0.427]

53 The visible units at the last few time steps determine the effective biases of the visible and hidden units at the current time step. [sent-127, score-1.654]

54 We always keep the previous visible states fixed and perform alternating Gibbs sampling to obtain a joint sample from the conditional RBM. [sent-128, score-0.609]

55 This picks new hidden and visible states that are compatible with each other and with the recent (visible) history. [sent-129, score-0.715]

56 Generation requires initialization with n time steps of the visible units, which implicitly determine the “mode” of motion in which the synthetic sequence will start. [sent-130, score-0.803]

57 We used randomly drawn consecutive frames from the training data as an initial configuration. [sent-131, score-0.213]

58 1 Generation of walking and running sequences from a single model In our first demonstration, we train a single model on data containing both walking and running motions; we then use the learned model to generate both types of motion, depending on how it is initialized. [sent-133, score-0.837]

59 We trained on 23 sequences of walking and 10 sequences of jogging (from subject 35 in the CMU database). [sent-134, score-0.454]

60 Figure 3: After training, the same model can generate walking (top) and running (bottom) motion (see videos on the website). [sent-136, score-0.69]

61 Figure 3 shows a walking sequence and a running sequence generated by the same model, using alternating Gibbs sampling (with the probability of hidden units being “on” conditional on the current and previous three visible vectors). [sent-138, score-1.268]

62 Since the training data does not contain any transitions between walking and running (and vice-versa), the model will continue to generate walking or running motions depending on where it is initialized. [sent-139, score-0.834]

63 The order of the sequences was randomly permuted such that walking and running sequences were distributed throughout the training data. [sent-143, score-0.523]

64 We trained on 9 sequences (from the MIT database, file Jog1 M) containing long examples of running and jogging, as well as a few transitions between the two styles. [sent-145, score-0.332]

65 Training was done as before, except that after the model was trained, an identical 200 hidden-unit model was trained on top of the first model (see Sec. [sent-147, score-0.214]

66 A video available on the website demonstrates our model’s ability to stochastically transition between various motion styles during a single generated sequence. [sent-150, score-0.701]

67 3 Introducing transitions using noise In our third demonstration, we show how transitions between motion styles can be generated even when such transitions are absent in the data. [sent-152, score-0.77]

68 1, where we have learned on separate sequences of walking and running. [sent-155, score-0.293]

69 To generate, we use the same sampling procedure as before, except that at each time we stochastically choose the hidden states (given the current and previous three visible vectors) we add a small amount of Gaussian noise to the hidden state biases. [sent-156, score-1.03]

70 This encourages the model to explore more of the hidden state space without deviating too far from the current motion. [sent-157, score-0.292]
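
The noisy sampling step differs from ordinary hidden-unit sampling only in that Gaussian noise is added to each hidden bias before the logistic function is applied. A minimal sketch, assuming independent zero-mean noise per unit with a hypothetical standard deviation `noise_std`, and effective biases `b_hat` that already include the contribution of the previous frames.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden_noisy(v, W, b_hat, noise_std, rng):
    """Sample binary hidden states with Gaussian noise on the biases."""
    h = []
    for j in range(len(b_hat)):
        noisy_bias = b_hat[j] + rng.gauss(0.0, noise_std)
        p = sigmoid(noisy_bias + sum(v[i] * W[i][j] for i in range(len(v))))
        h.append(1 if rng.random() < p else 0)
    return h


rng = random.Random(0)
h = sample_hidden_noisy([0.2, -0.4], [[0.5, -0.5], [0.1, 0.3]],
                        [0.0, 0.0], noise_std=0.1, rng=rng)
```

Setting `noise_std` to zero recovers the ordinary sampling procedure, so the amount of exploration is controlled by a single parameter.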

71 Applying this “noisy” sampling approach, we see that the generated motion occasionally transitions between learned styles. [sent-158, score-0.46]

72 4 Filling in missing data Due to the nature of the motion capture process, which can be adversely affected by lighting and environmental effects, as well as noise during recording, motion capture data often contains missing or unusable data. [sent-161, score-1.164]

73 The majority of motion editing software packages contain interpolation methods to fill in missing data, but this leaves the data unnaturally smooth. [sent-163, score-0.639]

74 These methods also rely on the starting and end points of the missing data, so if a marker goes missing until the end of a sequence, naïve interpolation will not work. [sent-164, score-0.396]

75 Such methods often only use the past and future data from the single missing marker to fill in that marker’s missing values, but since joint angles are highly correlated, substantial information about the placement of one marker could be gained from the others. [sent-165, score-0.719]

76 Our trained model has the ability to easily fill in such missing data, regardless of where the dropouts occur in a sequence. [sent-166, score-0.241]

77 Due to its approximate inference method which does not rely on a backward pass through the sequence, it also has the ability to fill in such missing data on-line. [sent-167, score-0.278]

78 Filling in missing data with our model is very similar to generation. [sent-168, score-0.229]

79 We simply clamp the known data to the visible units, initialize the missing data to something reasonable (for example, the value at the previous frame), and alternate between stochastically updating the hidden and visible units, with the known visible states held fixed. [sent-169, score-1.872]
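
The clamping procedure can be sketched for a single frame as follows. Known visible values are never overwritten, missing ones start at the previous frame's value, and only the missing entries are resampled. The sketch assumes Gaussian visible units and takes the effective biases `b_hat` and `c_hat` (already including the history terms) as inputs; all names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fill_in_frame(v_observed, missing, prev_frame, W, b_hat, c_hat,
                  n_gibbs=30, sigma=1.0, rng=None):
    """Fill the entries of v_observed whose indices are in `missing`."""
    rng = rng or random.Random()
    n_vis, n_hid = len(c_hat), len(b_hat)
    # Clamp known values; initialise missing ones from the previous frame.
    v = [prev_frame[i] if i in missing else v_observed[i]
         for i in range(n_vis)]
    for _ in range(n_gibbs):
        h = [1.0 if rng.random() < sigmoid(
            b_hat[j] + sum(v[i] * W[i][j] for i in range(n_vis))) else 0.0
            for j in range(n_hid)]
        # Resample only the missing visible units.
        for i in missing:
            mean = c_hat[i] + sum(h[j] * W[i][j] for j in range(n_hid))
            v[i] = rng.gauss(mean, sigma)
    return v
```

Because the procedure uses only past frames and the current frame's known values, it can be applied frame by frame as data arrives.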

80 1 except that one walking and one running sequence were left out of the training data to be used as test data. [sent-172, score-0.371]

81 For each of these walking and running test sequences, we erased two different sets of joint angles, starting halfway through the test sequence. [sent-173, score-0.324]

82 For the missing leg, mean squared reconstruction error per joint using our model was 8. [sent-180, score-0.338]

83 78, measured in normalized joint angle space, and summed over the 62 frames of interest. [sent-181, score-0.249]

84 For the missing upper body, mean squared reconstruction error per joint using our model was 20. [sent-184, score-0.338]

85 5 20 40 60 80 Frame 100 120 140 −2 0 20 40 60 80 100 120 140 Frame Figure 4: The model successfully fills in missing data using only the previous values of the joint angles (through the temporal connections) and the current angles of other joints (through the RBM connections). [sent-196, score-0.715]

86 Shown are two of the three angles of rotation for the left hip joint (the plot of the third is similar to the first). [sent-197, score-0.269]

87 The original data is shown on a solid line, the model’s prediction is shown on a dashed line, and the results of nearest neighbor interpolation are shown on a dotted line (see a video on the website). [sent-198, score-0.225]

88 The previous layer CRBM is kept, and the sequence of hidden state vectors, while driven by the data, is treated as a new kind of “fully observed” data. [sent-200, score-0.352]

89 If we keep only the visible layer, and its n-th order directed connections, we have a standard AR(n) model with Gaussian noise. [sent-206, score-0.584]
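
The reduced model can be sketched directly: with the hidden layer removed, the next frame is a linear function of the previous n frames plus Gaussian noise. The weight layout `A[q][k][i]`, the bias `c`, and the noise level `sigma` are illustrative assumptions.

```python
import random


def ar_predict_mean(history, A, c):
    """Mean of the next frame given history (most recent frame first)."""
    n_vis = len(c)
    return [c[i] + sum(A[q][k][i] * history[q][k]
                       for q in range(len(history)) for k in range(n_vis))
            for i in range(n_vis)]


def ar_sample(history, A, c, sigma, rng):
    """Draw the next frame from the AR(n) model with Gaussian noise."""
    return [rng.gauss(m, sigma) for m in ar_predict_mean(history, A, c)]


# One-dimensional AR(2): x_t = 1 + 0.5 * x_{t-1} + 0.25 * x_{t-2} + noise
A = [[[0.5]], [[0.25]]]
mean = ar_predict_mean([[2.0], [4.0]], A, c=[1.0])
sample = ar_sample([[2.0], [4.0]], A, [1.0], sigma=0.1, rng=random.Random(0))
```

The hidden units are what separate the full model from this linear baseline, since they contribute a data-dependent, non-linear term to the visible means.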

90 6 Discussion We have introduced a generative model for human motion based on the idea that local constraints and global dynamics can be learned efficiently by a conditional Restricted Boltzmann Machine. [sent-208, score-0.497]

91 The model has been designed with human motion in mind, but should lend itself well to other high-dimensional time series. [sent-210, score-0.45]

92 It would be possible to preserve the global phase information by using a much higher order model, but for higher dimensional data such as full body motion capture this is unnecessary because the whole configuration of joint angles and angular velocities never has any phase ambiguity. [sent-212, score-0.685]

93 Models with more hidden layers are able to implicitly model longer-term temporal information, and thus will mitigate this effect. [sent-214, score-0.376]

94 We have demonstrated that our model can effectively learn different styles of motion, as well as the transitions between these styles. [sent-215, score-0.258]

95 The ability of the model to transition smoothly, however, is dependent on having sufficient examples of such transitions in the training data. [sent-217, score-0.216]

96 We plan to train on larger datasets encompassing such transitions between various styles of motion. [sent-218, score-0.204]

97 If we augment the data with some static skeletal and identity parameters (in essence mapping a person’s unique identity to a set of features), we should be able to use the same generative model for many different people, and generalize individual characteristics from one type of motion to another. [sent-219, score-0.522]

98 For Matlab playback of motion and generation of videos, we have used Neil Lawrence’s motion capture toolbox (http://www. [sent-231, score-0.807]

99 Popovic, “Learning physics-based motion style with nonlinear inverse optimization,” ACM Trans. [sent-247, score-0.404]

100 Shum, “Motion texture: a two-level statistical model for character motion synthesis,” in Proc. [sent-270, score-0.408]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('visible', 0.413), ('motion', 0.354), ('units', 0.259), ('hidden', 0.238), ('rbm', 0.182), ('walking', 0.179), ('connections', 0.162), ('angles', 0.153), ('crbm', 0.14), ('missing', 0.135), ('hj', 0.133), ('website', 0.121), ('directed', 0.117), ('frames', 0.117), ('sequences', 0.114), ('vi', 0.11), ('transitions', 0.106), ('styles', 0.098), ('recon', 0.093), ('autoregressive', 0.086), ('joint', 0.085), ('hinton', 0.082), ('contrastive', 0.081), ('cmu', 0.081), ('layer', 0.078), ('stochastically', 0.077), ('unit', 0.076), ('chunks', 0.074), ('synthesis', 0.074), ('biases', 0.072), ('interpolation', 0.069), ('states', 0.064), ('reconstruction', 0.064), ('hertzmann', 0.061), ('wij', 0.06), ('running', 0.06), ('past', 0.057), ('marker', 0.057), ('motions', 0.057), ('training', 0.056), ('model', 0.054), ('capture', 0.053), ('trained', 0.052), ('joints', 0.052), ('undirected', 0.051), ('frame', 0.051), ('video', 0.051), ('demonstration', 0.05), ('style', 0.05), ('ll', 0.048), ('conditional', 0.047), ('angle', 0.047), ('aki', 0.047), ('gravitational', 0.047), ('hiddens', 0.047), ('jogging', 0.047), ('urtasun', 0.047), ('toronto', 0.046), ('generation', 0.046), ('dynamically', 0.044), ('exponential', 0.044), ('temporal', 0.043), ('generate', 0.043), ('human', 0.042), ('layers', 0.041), ('siggraph', 0.041), ('downsampling', 0.041), ('brand', 0.041), ('editing', 0.041), ('skeletal', 0.041), ('filling', 0.041), ('data', 0.04), ('boltzmann', 0.04), ('updating', 0.039), ('person', 0.037), ('sequence', 0.036), ('backward', 0.036), ('binary', 0.036), ('neighbor', 0.035), ('pass', 0.035), ('incremental', 0.034), ('bj', 0.033), ('static', 0.033), ('belief', 0.033), ('receive', 0.033), ('representational', 0.033), ('latent', 0.032), ('rule', 0.032), ('ci', 0.032), ('inference', 0.032), ('rotation', 0.031), ('leg', 0.031), ('dij', 0.031), ('lling', 0.031), ('reconstructions', 0.031), ('architecture', 0.031), ('database', 0.03), ('nearest', 0.03), ('smoothing', 0.03), ('cyclic', 
0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999803 134 nips-2006-Modeling Human Motion Using Binary Latent Variables

Author: Graham W. Taylor, Geoffrey E. Hinton, Sam T. Roweis

Abstract: We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued “visible” variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes on-line inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. Website: http://www.cs.toronto.edu/∼gwtaylor/publications/nips2006mhmublv/

2 0.32339579 111 nips-2006-Learning Motion Style Synthesis from Perceptual Observations

Author: Lorenzo Torresani, Peggy Hackney, Christoph Bregler

Abstract: This paper presents an algorithm for synthesis of human motion in specified styles. We use a theory of movement observation (Laban Movement Analysis) to describe movement styles as points in a multi-dimensional perceptual space. We cast the task of learning to synthesize desired movement styles as a regression problem: sequences generated via space-time interpolation of motion capture data are used to learn a nonlinear mapping between animation parameters and movement styles in perceptual space. We demonstrate that the learned model can apply a variety of motion styles to pre-recorded motion sequences and it can extrapolate styles not originally included in the training data.

3 0.29507089 88 nips-2006-Greedy Layer-Wise Training of Deep Networks

Author: Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle

Abstract: Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

4 0.15668929 31 nips-2006-Analysis of Contour Motions

Author: Ce Liu, William T. Freeman, Edward H. Adelson

Abstract: A reliable motion estimation algorithm must function under a wide range of conditions. One regime, which we consider here, is the case of moving objects with contours but no visible texture. Tracking distinctive features such as corners can disambiguate the motion of contours, but spurious features such as T-junctions can be badly misleading. It is difficult to determine the reliability of motion from local measurements, since a full rank covariance matrix can result from both real and spurious features. We propose a novel approach that avoids these points altogether, and derives global motion estimates by utilizing information from three levels of contour analysis: edgelets, boundary fragments and contours. Boundary fragments are chains of orientated edgelets, for which we derive motion estimates from local evidence. The uncertainties of the local estimates are disambiguated after the boundary fragments are properly grouped into contours. The grouping is done by constructing a graphical model and marginalizing it using importance sampling. We propose two equivalent representations in this graphical model, reversible switch variables attached to the ends of fragments and fragment chains, to capture both local and global statistics of boundaries. Our system is successfully applied to both synthetic and real video sequences containing high-contrast boundaries and textureless regions. The system produces good motion estimates along with properly grouped and completed contours.

5 0.13817711 167 nips-2006-Recursive ICA

Author: Honghao Shan, Lingyun Zhang, Garrison W. Cottrell

Abstract: Independent Component Analysis (ICA) is a popular method for extracting independent features from visual data. However, as a fundamentally linear technique, there is always nonlinear residual redundancy that is not captured by ICA. Hence there have been many attempts to try to create a hierarchical version of ICA, but so far none of the approaches have a natural way to apply them more than once. Here we show that there is a relatively simple technique that transforms the absolute values of the outputs of a previous application of ICA into a normal distribution, to which ICA may be applied again. This results in a recursive ICA algorithm that may be applied any number of times in order to extract higher order structure from previous layers.

6 0.094906606 130 nips-2006-Max-margin classification of incomplete data

7 0.089310117 66 nips-2006-Detecting Humans via Their Pose

8 0.084733747 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model

9 0.083578438 108 nips-2006-Large Scale Hidden Semi-Markov SVMs

10 0.082486175 54 nips-2006-Comparative Gene Prediction using Conditional Random Fields

11 0.082482733 153 nips-2006-Online Clustering of Moving Hyperplanes

12 0.081940658 132 nips-2006-Modeling Dyadic Data with Binary Latent Factors

13 0.081523158 112 nips-2006-Learning Nonparametric Models for Probabilistic Imitation

14 0.079534218 147 nips-2006-Non-rigid point set registration: Coherent Point Drift

15 0.074583456 113 nips-2006-Learning Structural Equation Models for fMRI

16 0.072886631 46 nips-2006-Blind source separation for over-determined delayed mixtures

17 0.072564542 120 nips-2006-Learning to Traverse Image Manifolds

18 0.064192943 74 nips-2006-Efficient Structure Learning of Markov Networks using $L 1$-Regularization

19 0.063932948 15 nips-2006-A Switched Gaussian Process for Estimating Disparity and Segmentation in Binocular Stereo

20 0.063924998 175 nips-2006-Simplifying Mixture Models through Function Approximation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.221), (1, -0.028), (2, 0.12), (3, -0.116), (4, 0.049), (5, -0.11), (6, 0.01), (7, -0.142), (8, -0.068), (9, 0.266), (10, -0.117), (11, -0.146), (12, -0.298), (13, 0.12), (14, -0.165), (15, -0.064), (16, -0.313), (17, 0.092), (18, -0.059), (19, -0.106), (20, -0.121), (21, 0.048), (22, 0.08), (23, 0.089), (24, 0.003), (25, -0.048), (26, 0.009), (27, 0.07), (28, -0.019), (29, -0.033), (30, 0.041), (31, 0.19), (32, 0.016), (33, 0.069), (34, -0.036), (35, -0.0), (36, 0.011), (37, -0.024), (38, -0.004), (39, -0.02), (40, 0.001), (41, 0.043), (42, 0.013), (43, -0.013), (44, 0.043), (45, -0.023), (46, -0.061), (47, 0.033), (48, 0.084), (49, -0.002)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96415079 134 nips-2006-Modeling Human Motion Using Binary Latent Variables

Author: Graham W. Taylor, Geoffrey E. Hinton, Sam T. Roweis

Abstract: We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued “visible” variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes on-line inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. Website: http://www.cs.toronto.edu/∼gwtaylor/publications/nips2006mhmublv/

2 0.77011263 111 nips-2006-Learning Motion Style Synthesis from Perceptual Observations

Author: Lorenzo Torresani, Peggy Hackney, Christoph Bregler

Abstract: This paper presents an algorithm for synthesis of human motion in specified styles. We use a theory of movement observation (Laban Movement Analysis) to describe movement styles as points in a multi-dimensional perceptual space. We cast the task of learning to synthesize desired movement styles as a regression problem: sequences generated via space-time interpolation of motion capture data are used to learn a nonlinear mapping between animation parameters and movement styles in perceptual space. We demonstrate that the learned model can apply a variety of motion styles to pre-recorded motion sequences and it can extrapolate styles not originally included in the training data.

3 0.68129712 88 nips-2006-Greedy Layer-Wise Training of Deep Networks

Author: Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle

Abstract: Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

4 0.63187134 31 nips-2006-Analysis of Contour Motions

Author: Ce Liu, William T. Freeman, Edward H. Adelson

Abstract: A reliable motion estimation algorithm must function under a wide range of conditions. One regime, which we consider here, is the case of moving objects with contours but no visible texture. Tracking distinctive features such as corners can disambiguate the motion of contours, but spurious features such as T-junctions can be badly misleading. It is difficult to determine the reliability of motion from local measurements, since a full rank covariance matrix can result from both real and spurious features. We propose a novel approach that avoids these points altogether, and derives global motion estimates by utilizing information from three levels of contour analysis: edgelets, boundary fragments and contours. Boundary fragments are chains of orientated edgelets, for which we derive motion estimates from local evidence. The uncertainties of the local estimates are disambiguated after the boundary fragments are properly grouped into contours. The grouping is done by constructing a graphical model and marginalizing it using importance sampling. We propose two equivalent representations in this graphical model, reversible switch variables attached to the ends of fragments and fragment chains, to capture both local and global statistics of boundaries. Our system is successfully applied to both synthetic and real video sequences containing high-contrast boundaries and textureless regions. The system produces good motion estimates along with properly grouped and completed contours.

5 0.4357754 167 nips-2006-Recursive ICA

Author: Honghao Shan, Lingyun Zhang, Garrison W. Cottrell

Abstract: Independent Component Analysis (ICA) is a popular method for extracting independent features from visual data. However, as a fundamentally linear technique, there is always nonlinear residual redundancy that is not captured by ICA. Hence there have been many attempts to try to create a hierarchical version of ICA, but so far none of the approaches have a natural way to apply them more than once. Here we show that there is a relatively simple technique that transforms the absolute values of the outputs of a previous application of ICA into a normal distribution, to which ICA may be applied again. This results in a recursive ICA algorithm that may be applied any number of times in order to extract higher order structure from previous layers.

6 0.43291771 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model

7 0.39943779 112 nips-2006-Learning Nonparametric Models for Probabilistic Imitation

8 0.39133361 108 nips-2006-Large Scale Hidden Semi-Markov SVMs

9 0.29339916 54 nips-2006-Comparative Gene Prediction using Conditional Random Fields

10 0.2910811 113 nips-2006-Learning Structural Equation Models for fMRI

11 0.28722629 153 nips-2006-Online Clustering of Moving Hyperplanes

12 0.28339529 147 nips-2006-Non-rigid point set registration: Coherent Point Drift

13 0.28337342 120 nips-2006-Learning to Traverse Image Manifolds

14 0.28307396 40 nips-2006-Bayesian Detection of Infrequent Differences in Sets of Time Series with Shared Structure

15 0.27306661 139 nips-2006-Multi-dynamic Bayesian Networks

16 0.26949266 66 nips-2006-Detecting Humans via Their Pose

17 0.26906136 148 nips-2006-Nonlinear physically-based models for decoding motor-cortical population activity

18 0.25909117 189 nips-2006-Temporal dynamics of information content carried by neurons in the primary visual cortex

19 0.25420877 13 nips-2006-A Scalable Machine Learning Approach to Go

20 0.25166604 90 nips-2006-Hidden Markov Dirichlet Process: Modeling Genetic Recombination in Open Ancestral Space


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.07), (3, 0.04), (7, 0.065), (8, 0.258), (9, 0.029), (12, 0.011), (13, 0.012), (20, 0.042), (22, 0.048), (44, 0.086), (57, 0.089), (65, 0.029), (69, 0.14), (71, 0.015)]
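
The (topicId, topicWeight) pairs above form a sparse topic vector for this paper. A minimal sketch of how two such vectors could be compared follows, assuming cosine similarity. This page does not state which measure produces simValue, and the same-paper score of 0.83888918 (rather than 1) suggests the actual computation differs, so treat this only as an illustration. The function name `cosine_similarity` and the second paper's weights are hypothetical.

```python
import math

# Sparse LDA topic weights for this paper, copied from the list above.
paper_topics = [(1, 0.07), (3, 0.04), (7, 0.065), (8, 0.258), (9, 0.029),
                (12, 0.011), (13, 0.012), (20, 0.042), (22, 0.048),
                (44, 0.086), (57, 0.089), (65, 0.029), (69, 0.14), (71, 0.015)]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse (topicId, topicWeight) lists.

    Assumption: this is an illustrative measure, not necessarily the one
    used to compute simValue on this page.
    """
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb[t] for t, w in wa.items() if t in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical topic weights for another paper, made up for illustration.
other_topics = [(8, 0.3), (44, 0.1), (50, 0.2)]

print(cosine_similarity(paper_topics, other_topics))
```

Storing the weights as dictionaries keeps the comparison proportional to the number of non-zero topics rather than the total number of topics in the model.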

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.83888918 134 nips-2006-Modeling Human Motion Using Binary Latent Variables

Author: Graham W. Taylor, Geoffrey E. Hinton, Sam T. Roweis

Abstract: We propose a non-linear generative model for human motion data that uses an undirected model with binary latent variables and real-valued “visible” variables that represent joint angles. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. Such an architecture makes on-line inference efficient and allows us to use a simple approximate learning procedure. After training, the model finds a single set of parameters that simultaneously capture several different kinds of motion. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. Website: http://www.cs.toronto.edu/∼gwtaylor/publications/nips2006mhmublv/

2 0.81981194 202 nips-2006-iLSTD: Eligibility Traces and Convergence Analysis

Author: Alborz Geramifard, Michael Bowling, Martin Zinkevich, Richard S. Sutton

Abstract: We present new theoretical and empirical results with the iLSTD algorithm for policy evaluation in reinforcement learning with linear function approximation. iLSTD is an incremental method for achieving results similar to LSTD, the data-efficient, least-squares version of temporal difference learning, without incurring the full cost of the LSTD computation. LSTD is O(n^2), where n is the number of parameters in the linear function approximator, while iLSTD is O(n). In this paper, we generalize the previous iLSTD algorithm and present three new results: (1) the first convergence proof for an iLSTD algorithm; (2) an extension to incorporate eligibility traces without changing the asymptotic computational complexity; and (3) the first empirical results with an iLSTD algorithm for a problem (mountain car) with feature vectors large enough (n = 10,000) to show substantial computational advantages over LSTD.

3 0.78899944 153 nips-2006-Online Clustering of Moving Hyperplanes

Author: René Vidal

Abstract: We propose a recursive algorithm for clustering trajectories lying in multiple moving hyperplanes. Starting from a given or random initial condition, we use normalized gradient descent to update the coefficients of a time varying polynomial whose degree is the number of hyperplanes and whose derivatives at a trajectory give an estimate of the vector normal to the hyperplane containing that trajectory. As time proceeds, the estimates of the hyperplane normals are shown to track their true values in a stable fashion. The segmentation of the trajectories is then obtained by clustering their associated normal vectors. The final result is a simple recursive algorithm for segmenting a variable number of moving hyperplanes. We test our algorithm on the segmentation of dynamic scenes containing rigid motions and dynamic textures, e.g., a bird floating on water. Our method not only segments the bird motion from the surrounding water motion, but also determines patterns of motion in the scene (e.g., periodic motion) directly from the temporal evolution of the estimated polynomial coefficients. Our experiments also show that our method can deal with appearing and disappearing motions in the scene.

4 0.6315124 88 nips-2006-Greedy Layer-Wise Training of Deep Networks

Author: Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle

Abstract: Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

5 0.6290437 201 nips-2006-Using Combinatorial Optimization within Max-Product Belief Propagation

Author: Daniel Tarlow, Gal Elidan, Daphne Koller, John C. Duchi

Abstract: In general, the problem of computing a maximum a posteriori (MAP) assignment in a Markov random field (MRF) is computationally intractable. However, in certain subclasses of MRF, an optimal or close-to-optimal assignment can be found very efficiently using combinatorial optimization algorithms: certain MRFs with mutual exclusion constraints can be solved using bipartite matching, and MRFs with regular potentials can be solved using minimum cut methods. However, these solutions do not apply to the many MRFs that contain such tractable components as sub-networks, but also other non-complying potentials. In this paper, we present a new method, called COMPOSE, for exploiting combinatorial optimization for sub-networks within the context of a max-product belief propagation algorithm. COMPOSE uses combinatorial optimization for computing exact max-marginals for an entire sub-network; these can then be used for inference in the context of the network as a whole. We describe highly efficient methods for computing max-marginals for subnetworks corresponding both to bipartite matchings and to regular networks. We present results on both synthetic and real networks encoding correspondence problems between images, which involve both matching constraints and pairwise geometric constraints. We compare to a range of current methods, showing that the ability of COMPOSE to transmit information globally across the network leads to improved convergence, decreased running time, and higher-scoring assignments.

6 0.62439996 176 nips-2006-Single Channel Speech Separation Using Factorial Dynamics

7 0.61408126 147 nips-2006-Non-rigid point set registration: Coherent Point Drift

8 0.60964859 93 nips-2006-Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms

9 0.57736921 45 nips-2006-Blind Motion Deblurring Using Image Statistics

10 0.56857061 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation

11 0.56757593 160 nips-2006-Part-based Probabilistic Point Matching using Equivalence Constraints

12 0.56594145 15 nips-2006-A Switched Gaussian Process for Estimating Disparity and Segmentation in Binocular Stereo

13 0.56485736 167 nips-2006-Recursive ICA

14 0.56074321 34 nips-2006-Approximate Correspondences in High Dimensions

15 0.55709553 43 nips-2006-Bayesian Model Scoring in Markov Random Fields

16 0.55647278 111 nips-2006-Learning Motion Style Synthesis from Perceptual Observations

17 0.55485225 31 nips-2006-Analysis of Contour Motions

18 0.55202866 97 nips-2006-Inducing Metric Violations in Human Similarity Judgements

19 0.55185968 112 nips-2006-Learning Nonparametric Models for Probabilistic Imitation

20 0.54753345 174 nips-2006-Similarity by Composition