nips nips2007 nips2007-4 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Byron Boots, Geoffrey J. Gordon, Sajid M. Siddiqi
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein [1, 2], with positive results in terms of accuracy, quality of simulated sequences, and efficiency. 1
Reference: text
sentIndex sentText sentNum sentScore
1 We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. [sent-10, score-0.605]
2 Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. [sent-11, score-0.403]
3 We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. [sent-12, score-0.492]
4 The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. [sent-13, score-0.577]
5 In the case where the state is real-valued and the noise terms are assumed to be Gaussian, the resulting model is called a linear dynamical system (LDS), also known as a Kalman Filter [3]. [sent-17, score-0.238]
6 An LDS with dynamics matrix A is stable if all of A’s eigenvalues have magnitude at most 1, i.e., if the spectral radius of A is at most 1. [sent-20, score-0.449]
7 However, when learning from finite data samples, the least squares solution may be unstable even if the system is stable [6]. [sent-25, score-0.816]
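As a concrete illustration, here is a minimal sketch (not the authors' code) of the least-squares dynamics estimate and the eigenvalue-based stability test; the state matrix X is assumed given, e.g., from the subspace step described below.

```python
# Minimal sketch: least-squares dynamics estimate and a stability test.
import numpy as np

def least_squares_dynamics(X):
    # X is n x tau; columns are the states x_1, ..., x_tau.
    X0, X1 = X[:, :-1], X[:, 1:]
    # Solve min_A ||A X0 - X1||_F^2, i.e., A = X1 X0^+ (pseudoinverse).
    return X1 @ np.linalg.pinv(X0)

def is_stable(A, tol=1e-10):
    # Stable iff every eigenvalue of A has magnitude at most 1.
    return np.max(np.abs(np.linalg.eigvals(A))) <= 1 + tol

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 7))   # short noisy sequence: the estimate is often unstable
print(is_stable(least_squares_dynamics(X)))
```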
8 The drawback of ignoring stability is most apparent when simulating long sequences from the system in order to generate representative data or infer stretches of missing values. [sent-26, score-0.305]
9 We propose a convex optimization algorithm for learning the dynamics matrix while guaranteeing stability. [sent-27, score-0.204]
10 An estimate of the underlying state sequence is first obtained using subspace identification. [sent-28, score-0.28]
11 We then formulate the least-squares problem for the dynamics matrix as a quadratic program (QP) [7], initially without constraints. [sent-29, score-0.294]
12 However, any unstable solution allows us to derive a linear constraint which we then add to our original QP and re-solve. [sent-31, score-0.522]
13 The above two steps are iterated until we reach a stable solution, which is then refined by a simple interpolation to obtain the best possible stable estimate. [sent-32, score-0.592]
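Putting these steps together, a hedged high-level sketch of the loop follows; solve_qp and linear_constraint_from are hypothetical helpers (a QP over vec(A) with the accumulated linear inequalities, and the cut derived from an unstable iterate, which is sketched further below).

```python
# High-level sketch of the constraint generation loop (assumed helpers, see lead-in).
import numpy as np

def is_stable(A, tol=1e-10):
    return np.max(np.abs(np.linalg.eigvals(A))) <= 1 + tol

def learn_stable_dynamics(X0, X1, solve_qp, linear_constraint_from):
    constraints = []                      # accumulated linear constraints on vec(A)
    A = solve_qp(X0, X1, constraints)     # start from unconstrained least squares
    prev = None
    while not is_stable(A):               # early stopping: test stability, not feasibility
        constraints.append(linear_constraint_from(A))
        prev, A = A, solve_qp(X0, X1, constraints)
    return A, prev                        # prev is the last unstable iterate, used for refinement
```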
14 Our method can be viewed as constraint generation for an underlying convex program with a feasible set of all matrices with singular values at most 1, similar to work in control systems [1]. [sent-33, score-0.989]
15 However, we terminate before reaching feasibility in the convex program, by checking for matrix stability after each new constraint. [sent-34, score-0.268]
16 This makes our algorithm less conservative than previous methods for enforcing stability since it chooses the best of a larger set of stable dynamics matrices. [sent-35, score-0.538]
17 The difference in the resulting stable systems is noticeable when simulating data. [sent-36, score-0.296]
18 The constraint generation approach also achieves much greater efficiency than previous methods in our experiments. [sent-37, score-0.51]
19 One application of LDSs in computer vision is learning dynamic textures from video data [8]. [sent-38, score-0.346]
20 An advantage of learning dynamic textures is the ability to play back a realistic-looking generated sequence of any desired duration. [sent-39, score-0.367]
21 In practice, however, videos synthesized from dynamic texture models can quickly degenerate because of instability in the underlying LDS. [sent-40, score-0.457]
22 In contrast, sequences generated from dynamic textures learned by our method remain “sane” even after arbitrarily long durations. [sent-41, score-0.419]
23 We also apply our algorithm to learning baseline dynamic models of over-the-counter (OTC) drug sales for biosurveillance, and sunspot numbers from the UCR archive [9]. [sent-42, score-0.611]
24 Within subspace methods, techniques have been developed to enforce stability by augmenting the extended observability matrix with zeros [6] or adding a regularization term to the least squares objective [11]. [sent-47, score-0.655]
25 Lacy and Bernstein [1] formulate the problem as a semidefinite program (SDP) whose objective minimizes the state sequence reconstruction error, and whose constraint bounds the largest singular value by 1. [sent-49, score-0.697]
26 This convex constraint is obtained by rewriting the nonlinear matrix inequality In − AAᵀ ⪰ 0 as a linear matrix inequality [12], where In is the n × n identity matrix. [sent-50, score-0.414]
27 The existence of this constraint also proves the convexity of the σ1 ≤ 1 region. [sent-52, score-0.235]
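For concreteness, here is a sketch of an LB-1-style program (minimize the state-reconstruction error subject to σ1(A) ≤ 1), written in cvxpy as a stand-in for the CVX/SeDuMi setup used in the experiments; the data are synthetic.

```python
# Sketch of the sigma_1 <= 1 convex program (the feasible set S_sigma).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 30))
X0, X1 = X[:, :-1], X[:, 1:]

A = cp.Variable((4, 4))
prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ X0 - X1)),
                  [cp.sigma_max(A) <= 1])          # equivalent to I - A A^T >= 0
prob.solve()
print(np.max(np.abs(np.linalg.eigvals(A.value))))  # at most 1, since |lambda_i| <= sigma_1
```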
28 In our experiments, this causes LB-2 to perform worse than LB-1 (for any δ) in terms of the state sequence reconstruction error, even while obtaining solutions outside the feasible region of LB-1. [sent-56, score-0.424]
29 To summarize the distinction between constraint generation, LB-1 and LB-2: it is hard to have both the right objective function (reconstruction error) and the right feasible region (the set of stable matrices). [sent-59, score-0.764]
30 LB-1 optimizes the right objective but over the wrong feasible region (the set of matrices with σ1 ≤ 1). [sent-60, score-0.374]
31 LB-2 has a feasible region close to the right one, but at the cost of distorting its objective function to an extent that it fares worse than LB-1 in nearly all cases. [sent-61, score-0.233]
32 In contrast, our method optimizes the right objective over a less conservative feasible region than that of any previous algorithm with the right objective, and this combination is shown to work the best in practice. [sent-62, score-0.233]
33 3 Linear Dynamical Systems: The evolution of a linear dynamical system can be described by the following two equations: xt+1 = Axt + wt, yt = Cxt + vt. (1) Time is indexed by the discrete variable t. [sent-63, score-0.267]
34 The parameters of the system are the dynamics matrix A ∈ Rn×n , the observation model C ∈ Rm×n , and the noise covariance matrices Q and R. [sent-76, score-0.359]
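A minimal simulation of Eq. (1), with illustrative parameter values (not taken from the paper):

```python
# Sketch: simulate x_{t+1} = A x_t + w_t,  y_t = C x_t + v_t.
import numpy as np

def simulate_lds(A, C, Q, R, x0, T, rng):
    n, m = A.shape[0], C.shape[0]
    X, Y = np.zeros((n, T)), np.zeros((m, T))
    x = x0
    for t in range(T):
        X[:, t] = x
        Y[:, t] = C @ x + rng.multivariate_normal(np.zeros(m), R)
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
    return X, Y

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [-0.2, 0.9]])   # eigenvalue magnitudes ~0.92: stable
C = np.array([[1.0, 0.0]])
X, Y = simulate_lds(A, C, 0.01 * np.eye(2), 0.1 * np.eye(1), np.zeros(2), 100, rng)
```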
35 Note that we are learning uncontrolled linear dynamical systems, though, as in previous work, control inputs can easily be incorporated into the objective function and convex program. [sent-77, score-0.29]
36 We follow the subspace identification literature in estimating all parameters other than the dynamics matrix. [sent-80, score-0.23]
37 We use subspace identification methods in our experiments for uniformity with previous work we are building on (in the control systems literature) and with work we are comparing to ([8] on the dynamic textures data). [sent-83, score-0.482]
38 X is referred to as the extended observability matrix in the control systems literature; the t-th column of X̂ represents an estimate of the state of our LDS at time t. [sent-103, score-0.213]
39 D = [ y1 y2 · · · yτ ; y2 y3 · · · yτ+1 ; · · · ; yd yd+1 yd+2 · · · yd+τ−1 ] ∈ Rmd×τ. A matrix of this form, with each block of rows equal to the previous block but shifted by a constant number of columns, is called a block Hankel matrix [4]. [sent-125, score-0.212]
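A sketch of this subspace step, under the common convention D = UΣVᵀ with X̂ = ΣVᵀ (scaling and other details vary across subspace identification variants):

```python
# Sketch: block Hankel matrix of d stacked observations, then a rank-n SVD.
import numpy as np

def block_hankel(Y, d):
    m, T = Y.shape
    tau = T - d + 1
    return np.vstack([Y[:, i:i + tau] for i in range(d)])   # (m*d) x tau

def subspace_states(Y, n, d=1):
    D = block_hankel(Y, d)
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    X_hat = np.diag(s[:n]) @ Vt[:n, :]   # n x tau: column t estimates the state at time t
    return X_hat, U[:, :n]
```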
40 [Figure 2 plot residue removed; the plot shows the regions Sλ (stable matrices) and Sσ, a generated constraint, and the solutions Afinal and LB-1, in the space Rn² with axes α, β ∈ [−10, 10].] Figure 2: (A): Conceptual depiction of the space of n × n matrices. [sent-129, score-1.428]
41 The region of stability (Sλ ) is non-convex while the smaller region of matrices with σ1 ≤ 1 (Sσ ) is convex. [sent-130, score-0.456]
42 One iteration of constraint generation yields the constraint indicated by the line labeled ‘generated constraint’, and (in this case) leads to a stable solution A∗ . [sent-134, score-1.091]
43 (B): The actual stable and unstable regions for the space of 2×2 matrices Eα,β = [0.3 α; β 0.3]. [sent-136, score-0.674]
44 Constraint generation is able to learn a nearly optimal model from a noisy state sequence of length 7 simulated from E0,10 , with better state reconstruction error than either LB-1 or LB-2. [sent-139, score-0.63]
45 Having multiple observations per column in D is particularly helpful when the underlying dynamical system is known to have periodicity. [sent-144, score-0.205]
46 To account for stability, we first formulate the dynamics matrix learning problem as a quadratic program with a feasible set that includes the set of stable dynamics matrices. [sent-149, score-0.772]
47 Then we demonstrate how instability in its solutions can be used to generate constraints that restrict this feasible set appropriately. [sent-150, score-0.267]
48 2 Generating Constraints The quadratic objective function above is equivalent to the least squares problem of Eq. [sent-160, score-0.299]
49 When its solution yields an unstable matrix, the spectral radius of Â (i.e., the magnitude of its largest eigenvalue) exceeds 1. [sent-163, score-0.354]
50 Ideally we would like to use Â to calculate a convex constraint on the spectral radius. [sent-166, score-0.321]
51 The matrices E10,0 and E0,10 are stable with λ1 = 0.3, but their convex combination γE10,0 + (1 − γ)E0,10 is unstable (e.g., for γ = 0.5). [sent-170, score-0.437] [sent-171, score-0.288]
53 This shows that the set of stable matrices is non-convex for n = 2, and in fact this is true for all n > 1. [sent-175, score-0.437]
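The example is easy to verify numerically, assuming the Eα,β definition reconstructed above:

```python
# Numeric check of the non-convexity example.
import numpy as np

E = lambda a, b: np.array([[0.3, a], [b, 0.3]])
rho = lambda A: np.max(np.abs(np.linalg.eigvals(A)))   # spectral radius

print(rho(E(10, 0)), rho(E(0, 10)))          # 0.3 and 0.3: both stable
print(rho(0.5 * E(10, 0) + 0.5 * E(0, 10)))  # 5.3: the midpoint is unstable
```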
54 |λi(A)| ≤ σ1(A) for i = 1, . . . , n [15]. Therefore every unstable matrix has a singular value greater than one, but the converse is not necessarily true. [sent-179, score-0.383]
55 Figure 2(A) conceptually depicts the non-convex region of stability Sλ and the convex region Sσ with σ1 ≤ 1 in the space of all n × n matrices for some fixed n. [sent-181, score-0.507]
56 The stable matrices E10,0 and E0,10 reside at the edges of the figure. [sent-184, score-0.437]
57 While results for this class of matrices vary, the constraint generation algorithm described below is able to learn a nearly optimal model from a noisy state sequence of length τ = 7 simulated from E0,10, with better state reconstruction error than LB-1 and LB-2. [sent-185, score-1.006]
58 The QP is invoked repeatedly until the stable region, i.e., Sλ, is reached. [sent-199, score-0.296]
59 At each iteration, we calculate a linear constraint of the form in Eq. [sent-202, score-0.235]
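The extraction elides the equation; one linear constraint consistent with this description (an assumption here, not necessarily the paper's exact form) uses the top singular vectors of the unstable iterate: since u1ᵀ A v1 ≤ σ1(A) for unit vectors u1, v1, requiring u1ᵀ A v1 ≤ 1 is linear in A, cuts off the current iterate, and excludes nothing in Sσ.

```python
# Sketch (assumed form, see lead-in): a linear cut g^T vec(A) <= 1 from A_hat.
import numpy as np

def generated_constraint(A_hat):
    U, s, Vt = np.linalg.svd(A_hat)
    u1, v1 = U[:, 0], Vt[0, :]
    # u1^T A v1 = trace((u1 v1^T)^T A) = vec(u1 v1^T)^T vec(A), column-major vec.
    g = np.outer(u1, v1).flatten(order="F")
    return g            # add  g @ A.flatten(order="F") <= 1  to the QP
```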
60 Once a stable matrix is obtained, it is possible to refine this solution. [sent-205, score-0.36]
61 We know that the last constraint caused our solution to cross the boundary of Sλ , so we interpolate between the last solution and the previous iteration’s solution using binary search to look for a boundary of the stable region, in order to obtain a better objective value while remaining stable. [sent-206, score-0.74]
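A sketch of this refinement, binary-searching along the segment between the final stable solution and the previous (unstable) iterate:

```python
# Sketch: keep the stable point on the segment closest to the unstable iterate.
import numpy as np

def is_stable(A, tol=1e-10):
    return np.max(np.abs(np.linalg.eigvals(A))) <= 1 + tol

def refine(A_stable, A_unstable, iters=30):
    lo, hi = 0.0, 1.0          # gamma = 0 is A_stable, gamma = 1 is A_unstable
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_stable((1 - mid) * A_stable + mid * A_unstable):
            lo = mid           # still stable: move toward the better objective
        else:
            hi = mid
    return (1 - lo) * A_stable + lo * A_unstable
```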
62 An interpolation could be attempted between the least squares solution and any stable solution. [sent-207, score-0.548]
63 However, the stable region can be highly complex, and there may be several folds and boundaries of the stable region in the interpolated area. [sent-208, score-0.754]
64 5 Experiments: For learning the dynamics matrix, we implemented least squares, constraint generation (using quadprog), LB-1 [1], and LB-2 [2] (using CVX with SeDuMi) in Matlab. [sent-210, score-0.65]
65 However, least-squares LDSs trained in scarce-data scenarios are unstable for almost any domain, and some domains lead to unstable models up to the limit of available data. [sent-213, score-0.474]
66 The goals of our experiments are to: (1) examine the state evolution and simulated observations of models learned using our method, and compare them to previous work; and (2) compare the algorithms in terms of reconstruction error and efficiency. [sent-217, score-0.301]
67 We apply these algorithms to learning dynamic textures from the vision domain (Section 5.1), as well as OTC drug sales counts and sunspot numbers (Section 5.2). [sent-222, score-0.298] [sent-223, score-0.479]
69 Samples from the original steam sequence and the fountain sequence. [sent-234, score-0.504]
70 State evolution of synthesized sequences over 1000 frames (steam top, fountain bottom). [sent-236, score-0.497]
71 The least squares solutions display instability as time progresses. [sent-237, score-0.339]
72 The solutions obtained using LB-1 remain stable for the full 1000 frame image sequence. [sent-238, score-0.376]
73 The constraint generation solutions, however, yield state sequences that are stable over the full 1000 frame image sequence without significant damping. [sent-239, score-1.114]
74 Samples drawn from a least-squares synthesized sequence (top), and samples drawn from a constraint-generation synthesized sequence (bottom). [sent-241, score-1.172]
75 The constraint generation synthesized steam sequence is qualitatively better looking than the steam sequence generated by LB-1, although there is little qualitative difference between the two synthesized fountain sequences. [sent-243, score-1.637]
76 Table 1: Quantitative results on the dynamic textures data for different numbers of states n. [sent-324, score-0.337]
77 5.1 Stable Dynamic Textures: Dynamic textures in vision can intuitively be described as models for sequences of images that exhibit some form of low-dimensional structure and recurrent (though not necessarily repeating) characteristics (e.g., the steam and fountain sequences used here). [sent-330, score-0.352]
78 An LDS model of a dynamic texture may synthesize an “infinitely” long sequence of images by driving the model with zero mean Gaussian noise. [sent-336, score-0.281]
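A minimal synthesis sketch: drive the learned model with zero-mean Gaussian state noise and map states through the observation model. Here A_hat, C_hat, and Q_hat are assumed outputs of the learning step, and each resulting row would be reshaped to the image dimensions.

```python
# Sketch: synthesize frames from a learned LDS driven by Gaussian noise.
import numpy as np

def synthesize_frames(A_hat, C_hat, Q_hat, x0, T, rng):
    frames, x = [], x0
    for _ in range(T):
        frames.append(C_hat @ x)
        x = A_hat @ x + rng.multivariate_normal(np.zeros(len(x)), Q_hat)
    return np.array(frames)   # T x m; an unstable A_hat makes frame magnitudes blow up
```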
79 To better visualize the difference between image sequences generated by least-squares, LB-1, and constraint generation, the evolution of each method’s state is plotted over the course of the synthesized sequences (Figure 3(B)). [sent-338, score-0.73]
80 Sequences generated by the least squares models appear to be unstable, and this was in fact the case; both the steam and the fountain sequences resulted in unstable dynamics matrices. [sent-339, score-1.084]
81 Conversely, the constrained subspace identification algorithms all produced well-behaved sequences of states and stable dynamics matrices (Table 1), although constraint generation demonstrates the fastest runtime, best scalability, and lowest error of any stability-enforcing approach. [sent-340, score-1.331]
82 A qualitative comparison of images generated by constraint generation and least squares (Figure 3(C)) indicates the effect of instability in synthesized sequences generated from dynamic texture models. [sent-341, score-1.327]
83 While the unstable least-squares model demonstrates a dramatic increase in image contrast over time, the constraint generation model continues to generate qualitatively reasonable images. [sent-342, score-0.78]
84 Qualitative comparisons between constraint generation and LB-1 indicate that constraint generation learns models that generate more natural-looking video sequences than LB-1. [sent-343, score-1.068]
85 Table 1 demonstrates that constraint generation always has the lowest error as well as the fastest runtime. [sent-344, score-0.543]
86 The running time of constraint generation depends on the number of constraints needed to reach a stable solution. [sent-345, score-0.843]
87 Note that LB-1 is more efficient and scalable when solved using constraint generation (by adding constraints until Sσ is reached) than it is in its original SDP formulation. [sent-346, score-0.647]
88 5.2 Stable Baseline Models for Biosurveillance: We examine daily counts of OTC drug sales in pharmacies, obtained from the National Retail Data Monitor (NRDM) collection [17]. [sent-348, score-0.295]
89 We isolate a 60-day subsequence where the data dynamics remain relatively stationary, and attempt to learn LDS parameters to be able to simulate sequences of baseline values for use in detecting anomalies. [sent-352, score-0.242]
90 Figure 4(A) plots 22 different drug categories aggregated over all zipcodes, and Figure 4(B) plots a single drug category (cough/cold) in 29 different zipcodes separately. [sent-354, score-0.407]
91 In both cases, constraint generation is able to use very little training data to learn a stable model that captures the periodicity in the data, while the least squares model is unstable and its predictions diverge over time. [sent-355, score-1.281]
92 LB-1 learns a model that is stable but overconstrained, and the simulated observations quickly drift from the correct magnitudes. [sent-356, score-0.4]
93 6 Discussion We have introduced a novel method for learning stable linear dynamical systems. [sent-359, score-0.433]
94 Our constraint generation algorithm is more powerful than previous methods in the sense of optimizing over a larger set of stable matrices with a suitable objective function. [sent-360, score-1.006]
95 The constraint generation approach also has the benefit of being faster than previous methods in nearly all of our experiments. [sent-361, score-0.51]
96 One possible extension is to modify the EM algorithm for LDSs to incorporate constraint generation into the M-step in order to learn stable systems that locally maximize the observed data likelihood. [sent-362, score-0.806]
97 [Figure 4 plot residue removed; panels show OTC drug-sale counts and sunspot numbers, with columns for the training data and for sequences simulated by constraint generation, least squares, and LB-1.] Figure 4: (A): 60 days of data for 22 drug categories aggregated over all zipcodes in the city. [sent-371, score-0.32]
98 (B): 60 days of data for a single drug category (cough/cold) for all 29 zipcodes in the city. [sent-372, score-0.247]
99 The training data (top), simulated output from constraint generation, output from the unstable least squares model, and output from the over-damped LB-1 model (bottom). [sent-374, score-0.741]
100 Identification of stable models in subspace identification by using regularization. [sent-440, score-0.437]
wordName wordTfidf (topN-words)
[('stable', 0.296), ('generation', 0.275), ('steam', 0.242), ('unstable', 0.237), ('constraint', 0.235), ('fountain', 0.193), ('textures', 0.192), ('lds', 0.173), ('stability', 0.153), ('squares', 0.151), ('sunspot', 0.145), ('subspace', 0.141), ('matrices', 0.141), ('dynamical', 0.137), ('synthesized', 0.135), ('drug', 0.126), ('sequences', 0.121), ('sales', 0.121), ('zipcodes', 0.121), ('dynamic', 0.106), ('instability', 0.105), ('lacy', 0.097), ('ldss', 0.097), ('otc', 0.097), ('feasible', 0.093), ('dynamics', 0.089), ('yd', 0.084), ('identi', 0.083), ('singular', 0.082), ('region', 0.081), ('reconstruction', 0.079), ('hankel', 0.077), ('beb', 0.073), ('biosurveillance', 0.073), ('siddiqi', 0.073), ('state', 0.07), ('program', 0.069), ('sequence', 0.069), ('simulated', 0.067), ('ex', 0.067), ('texture', 0.067), ('matrix', 0.064), ('objective', 0.059), ('qp', 0.057), ('convex', 0.051), ('yt', 0.051), ('least', 0.051), ('solution', 0.05), ('atp', 0.048), ('boots', 0.048), ('mina', 0.048), ('retail', 0.048), ('sajid', 0.048), ('sms', 0.048), ('soatto', 0.048), ('ucr', 0.048), ('cg', 0.048), ('svd', 0.048), ('frame', 0.048), ('video', 0.048), ('counts', 0.048), ('evolution', 0.048), ('geoffrey', 0.046), ('vec', 0.046), ('videos', 0.044), ('pittsburgh', 0.044), ('control', 0.043), ('archive', 0.042), ('byron', 0.042), ('seth', 0.042), ('qualitative', 0.042), ('numbers', 0.039), ('quantitative', 0.039), ('images', 0.039), ('dennis', 0.038), ('quadratic', 0.038), ('constraints', 0.037), ('rm', 0.037), ('observations', 0.037), ('observability', 0.036), ('periodicity', 0.036), ('spectral', 0.035), ('aggregated', 0.034), ('ap', 0.034), ('bernstein', 0.034), ('ta', 0.034), ('observation', 0.034), ('formulate', 0.034), ('author', 0.033), ('inequalities', 0.033), ('scalable', 0.033), ('ut', 0.033), ('tr', 0.033), ('demonstrates', 0.033), ('pa', 0.033), ('interpolating', 0.032), ('radius', 0.032), ('solutions', 0.032), ('baseline', 0.032), ('system', 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999887 4 nips-2007-A Constraint Generation Approach to Learning Stable Linear Dynamical Systems
Author: Byron Boots, Geoffrey J. Gordon, Sajid M. Siddiqi
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein [1, 2], with positive results in terms of accuracy, quality of simulated sequences, and efficiency. 1
2 0.13918617 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
Author: Gwenn Englebienne, Tim Cootes, Magnus Rattray
Abstract: The present work aims to model the correspondence between facial motion and speech. The face and sound are modelled separately, with phonemes being the link between both. We propose a sequential model and evaluate its suitability for the generation of the facial animation from a sequence of phonemes, which we obtain from speech. We evaluate the results both by computing the error between generated sequences and real video, as well as with a rigorous double-blind test with human subjects. Experiments show that our model compares favourably to other existing methods and that the sequences generated are comparable to real video sequences. 1
3 0.11871229 46 nips-2007-Cluster Stability for Finite Samples
Author: Ohad Shamir, Naftali Tishby
Abstract: Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy between this conclusion and the success of stability in practice has remained an open question, which we attempt to address. Our theoretical approach is that stability, as used by cluster validation algorithms, is similar in certain respects to measures of generalization in a model-selection framework. In such cases, the model chosen governs the convergence rate of generalization bounds. By arguing that these rates are more important than the sample size, we are led to the prediction that stability-based cluster validation algorithms should not degrade with increasing sample size, despite the asymptotic universal stability. This prediction is substantiated by a theoretical analysis as well as some empirical results. We conclude that stability remains a meaningful cluster validation criterion over finite samples. 1
4 0.10784312 65 nips-2007-DIFFRAC: a discriminative and flexible framework for clustering
Author: Francis R. Bach, Zaïd Harchaoui
Abstract: We present a novel linear clustering framework (D IFFRAC) which relies on a linear discriminative cost function and a convex relaxation of a combinatorial optimization problem. The large convex optimization problem is solved through a sequence of lower dimensional singular value decompositions. This framework has several attractive properties: (1) although apparently similar to K-means, it exhibits superior clustering performance than K-means, in particular in terms of robustness to noise. (2) It can be readily extended to non linear clustering if the discriminative cost function is based on positive definite kernels, and can then be seen as an alternative to spectral clustering. (3) Prior information on the partition is easily incorporated, leading to state-of-the-art performance for semi-supervised learning, for clustering or classification. We present empirical evaluations of our algorithms on synthetic and real medium-scale datasets.
5 0.076228611 184 nips-2007-Stability Bounds for Non-i.i.d. Processes
Author: Mehryar Mohri, Afshin Rostamizadeh
Abstract: The notion of algorithmic stability has been used effectively in the past to derive tight generalization bounds. A key advantage of these bounds is that they are designed for specific learning algorithms, exploiting their particular properties. But, as in much of learning theory, existing stability analyses and bounds apply only in the scenario where the samples are independently and identically distributed (i.i.d.). In many machine learning applications, however, this assumption does not hold. The observations received by the learning algorithm often have some inherent temporal dependence, which is clear in system diagnosis or time series prediction problems. This paper studies the scenario where the observations are drawn from a stationary mixing sequence, which implies a dependence between observations that weaken over time. It proves novel stability-based generalization bounds that hold even with this more general setting. These bounds strictly generalize the bounds given in the i.i.d. case. It also illustrates their application in the case of several general classes of learning algorithms, including Support Vector Regression and Kernel Ridge Regression.
7 0.06662076 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
8 0.066027865 177 nips-2007-Simplified Rules and Theoretical Analysis for Information Bottleneck Optimization and PCA with Spiking Neurons
9 0.06418132 148 nips-2007-Online Linear Regression and Its Application to Model-Based Reinforcement Learning
10 0.064039335 84 nips-2007-Expectation Maximization and Posterior Constraints
11 0.062658437 185 nips-2007-Stable Dual Dynamic Programming
12 0.060763076 162 nips-2007-Random Sampling of States in Dynamic Programming
13 0.059429154 190 nips-2007-Support Vector Machine Classification with Indefinite Kernels
14 0.058737852 163 nips-2007-Receding Horizon Differential Dynamic Programming
15 0.057808239 134 nips-2007-Multi-Task Learning via Conic Programming
16 0.055156294 12 nips-2007-A Spectral Regularization Framework for Multi-Task Structure Learning
17 0.054310706 86 nips-2007-Exponential Family Predictive Representations of State
18 0.054191347 206 nips-2007-Topmoumoute Online Natural Gradient Algorithm
19 0.05370931 156 nips-2007-Predictive Matrix-Variate t Models
20 0.052212834 23 nips-2007-An Analysis of Convex Relaxations for MAP Estimation
topicId topicWeight
[(0, -0.199), (1, -0.007), (2, -0.003), (3, 0.005), (4, -0.018), (5, -0.015), (6, 0.016), (7, 0.067), (8, -0.123), (9, 0.036), (10, 0.076), (11, 0.022), (12, 0.058), (13, -0.04), (14, -0.099), (15, 0.029), (16, -0.1), (17, -0.052), (18, 0.052), (19, -0.017), (20, -0.055), (21, 0.022), (22, -0.033), (23, 0.02), (24, -0.062), (25, 0.082), (26, 0.118), (27, 0.025), (28, 0.018), (29, -0.099), (30, -0.003), (31, -0.098), (32, -0.098), (33, 0.125), (34, 0.031), (35, -0.032), (36, 0.072), (37, -0.179), (38, 0.042), (39, 0.079), (40, -0.047), (41, -0.06), (42, 0.023), (43, -0.009), (44, 0.003), (45, -0.011), (46, 0.14), (47, 0.024), (48, 0.057), (49, -0.107)]
simIndex simValue paperId paperTitle
same-paper 1 0.95854807 4 nips-2007-A Constraint Generation Approach to Learning Stable Linear Dynamical Systems
Author: Byron Boots, Geoffrey J. Gordon, Sajid M. Siddiqi
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein [1, 2], with positive results in terms of accuracy, quality of simulated sequences, and efficiency. 1
2 0.71804219 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
Author: Gwenn Englebienne, Tim Cootes, Magnus Rattray
Abstract: The present work aims to model the correspondence between facial motion and speech. The face and sound are modelled separately, with phonemes being the link between both. We propose a sequential model and evaluate its suitability for the generation of the facial animation from a sequence of phonemes, which we obtain from speech. We evaluate the results both by computing the error between generated sequences and real video, as well as with a rigorous double-blind test with human subjects. Experiments show that our model compares favourably to other existing methods and that the sequences generated are comparable to real video sequences. 1
3 0.57126147 28 nips-2007-Augmented Functional Time Series Representation and Forecasting with Gaussian Processes
Author: Nicolas Chapados, Yoshua Bengio
Abstract: We introduce a functional representation of time series which allows forecasts to be performed over an unspecified horizon with progressively-revealed information sets. By virtue of using Gaussian processes, a complete covariance matrix between forecasts at several time-steps is available. This information is put to use in an application to actively trade price spreads between commodity futures contracts. The approach delivers impressive out-of-sample risk-adjusted returns after transaction costs on a portfolio of 30 spreads. 1
4 0.5282535 46 nips-2007-Cluster Stability for Finite Samples
Author: Ohad Shamir, Naftali Tishby
Abstract: Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy between this conclusion and the success of stability in practice has remained an open question, which we attempt to address. Our theoretical approach is that stability, as used by cluster validation algorithms, is similar in certain respects to measures of generalization in a model-selection framework. In such cases, the model chosen governs the convergence rate of generalization bounds. By arguing that these rates are more important than the sample size, we are led to the prediction that stability-based cluster validation algorithms should not degrade with increasing sample size, despite the asymptotic universal stability. This prediction is substantiated by a theoretical analysis as well as some empirical results. We conclude that stability remains a meaningful cluster validation criterion over finite samples. 1
5 0.51085389 174 nips-2007-Selecting Observations against Adversarial Objectives
Author: Andreas Krause, Brendan Mcmahan, Carlos Guestrin, Anupam Gupta
Abstract: In many applications, one has to actively select among a set of expensive observations before making an informed decision. Often, we want to select observations which perform well when evaluated with an objective function chosen by an adversary. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for the case where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We evaluate our algorithm on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms. 1
6 0.50943601 184 nips-2007-Stability Bounds for Non-i.i.d. Processes
7 0.50465232 65 nips-2007-DIFFRAC: a discriminative and flexible framework for clustering
8 0.45581254 210 nips-2007-Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks
9 0.45135868 163 nips-2007-Receding Horizon Differential Dynamic Programming
10 0.44199377 70 nips-2007-Discriminative K-means for Clustering
11 0.43531233 12 nips-2007-A Spectral Regularization Framework for Multi-Task Structure Learning
12 0.42519981 130 nips-2007-Modeling Natural Sounds with Modulation Cascade Processes
13 0.41130269 162 nips-2007-Random Sampling of States in Dynamic Programming
15 0.40085632 156 nips-2007-Predictive Matrix-Variate t Models
16 0.38766196 112 nips-2007-Learning Monotonic Transformations for Classification
17 0.38663903 178 nips-2007-Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
18 0.38498861 96 nips-2007-Heterogeneous Component Analysis
19 0.37506366 63 nips-2007-Convex Relaxations of Latent Variable Training
20 0.37264511 80 nips-2007-Ensemble Clustering using Semidefinite Programming
topicId topicWeight
[(5, 0.057), (13, 0.043), (16, 0.024), (19, 0.014), (21, 0.038), (31, 0.017), (34, 0.024), (35, 0.05), (47, 0.081), (49, 0.02), (53, 0.294), (83, 0.116), (85, 0.049), (87, 0.019), (90, 0.077)]
simIndex simValue paperId paperTitle
1 0.7946856 163 nips-2007-Receding Horizon Differential Dynamic Programming
Author: Yuval Tassa, Tom Erez, William D. Smart
Abstract: The control of high-dimensional, continuous, non-linear dynamical systems is a key problem in reinforcement learning and control. Local, trajectory-based methods, using techniques such as Differential Dynamic Programming (DDP), are not directly subject to the curse of dimensionality, but generate only local controllers. In this paper,we introduce Receding Horizon DDP (RH-DDP), an extension to the classic DDP algorithm, which allows us to construct stable and robust controllers based on a library of local-control trajectories. We demonstrate the effectiveness of our approach on a series of high-dimensional problems using a simulated multi-link swimming robot. These experiments show that our approach effectively circumvents dimensionality issues, and is capable of dealing with problems of (at least) 24 state and 9 action dimensions. 1
same-paper 2 0.74719673 4 nips-2007-A Constraint Generation Approach to Learning Stable Linear Dynamical Systems
Author: Byron Boots, Geoffrey J. Gordon, Sajid M. Siddiqi
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein [1, 2], with positive results in terms of accuracy, quality of simulated sequences, and efficiency. 1
3 0.52837902 63 nips-2007-Convex Relaxations of Latent Variable Training
Author: Yuhong Guo, Dale Schuurmans
Abstract: We investigate a new, convex relaxation of an expectation-maximization (EM) variant that approximates a standard objective while eliminating local minima. First, a cautionary result is presented, showing that any convex relaxation of EM over hidden variables must give trivial results if any dependence on the missing values is retained. Although this appears to be a strong negative outcome, we then demonstrate how the problem can be bypassed by using equivalence relations instead of value assignments over hidden variables. In particular, we develop new algorithms for estimating exponential conditional models that only require equivalence relation information over the variable values. This reformulation leads to an exact expression for EM variants in a wide range of problems. We then develop a semidefinite relaxation that yields global training by eliminating local minima. 1
4 0.51804376 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
Author: Matthias Bethge, Philipp Berens
Abstract: Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data—the model parameters can be derived in closed form and sampling is easy. Therefore, our NearMaxEnt approach can serve as a tool for testing predictions from a pairwise maximum entropy model not only for low-dimensional marginals, but also for high dimensional measurements of more than thousand units. We demonstrate its usefulness by studying natural images with dichotomized pixel intensities. Our results indicate that the statistics of such higher-dimensional measurements exhibit additional structure that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics surprisingly well up to the limit of dimensionality where estimation of the full joint distribution is feasible. 1
5 0.51424587 134 nips-2007-Multi-Task Learning via Conic Programming
Author: Tsuyoshi Kato, Hisashi Kashima, Masashi Sugiyama, Kiyoshi Asai
Abstract: When we have several related tasks, solving them simultaneously is shown to be more effective than solving them individually. This approach is called multi-task learning (MTL) and has been studied extensively. Existing approaches to MTL often treat all the tasks as uniformly related to each other and the relatedness of the tasks is controlled globally. For this reason, the existing methods can lead to undesired solutions when some tasks are not highly related to each other, and some pairs of related tasks can have significantly different solutions. In this paper, we propose a novel MTL algorithm that can overcome these problems. Our method makes use of a task network, which describes the relation structure among tasks. This allows us to deal with intricate relation structures in a systematic way. Furthermore, we control the relatedness of the tasks locally, so all pairs of related tasks are guaranteed to have similar solutions. We apply the above idea to support vector machines (SVMs) and show that the optimization problem can be cast as a second order cone program, which is convex and can be solved efficiently. The usefulness of our approach is demonstrated through simulations with protein super-family classification and ordinal regression problems.
6 0.51224899 156 nips-2007-Predictive Matrix-Variate t Models
7 0.51056921 49 nips-2007-Colored Maximum Variance Unfolding
8 0.510355 115 nips-2007-Learning the 2-D Topology of Images
9 0.51022106 96 nips-2007-Heterogeneous Component Analysis
10 0.50937772 180 nips-2007-Sparse Feature Learning for Deep Belief Networks
11 0.50711662 7 nips-2007-A Kernel Statistical Test of Independence
12 0.50705582 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
13 0.50674337 86 nips-2007-Exponential Family Predictive Representations of State
14 0.50622839 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
15 0.50578022 112 nips-2007-Learning Monotonic Transformations for Classification
16 0.50551617 185 nips-2007-Stable Dual Dynamic Programming
17 0.50527948 16 nips-2007-A learning framework for nearest neighbor search
18 0.50454891 79 nips-2007-Efficient multiple hyperparameter learning for log-linear models
19 0.50428891 66 nips-2007-Density Estimation under Independent Similarly Distributed Sampling Assumptions
20 0.50341475 174 nips-2007-Selecting Observations against Adversarial Objectives