nips nips2006 nips2006-200 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ali Rahimi, Ben Recht
Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. [sent-6, score-0.218]
2 Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. [sent-8, score-0.37]
3 The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. [sent-9, score-1.129]
4 1 Introduction Measurements from sensor systems typically serve as a proxy for latent variables of interest. [sent-10, score-0.553]
5 To recover these latent variables, the parameters of the sensor system must first be determined. [sent-11, score-0.615]
6 When pairs of measurements and their corresponding latent variables are available, fully supervised regression techniques may be applied to learn a mapping between latent states and measurements. [sent-12, score-0.827]
7 In many applications, however, latent states cannot be observed and only a diffuse prior on them is available. [sent-13, score-0.352]
8 In such cases, marginalizing over the latent variables and searching for the model parameters using Expectation Maximization (EM) has become a popular approach [3,9,19]. [sent-14, score-0.278]
9 Our method is not susceptible to local minima and provides a guarantee on the quality of the recovered observation function. [sent-17, score-0.461]
10 Because our algorithm takes advantage of an explicit prior on the latent variables, it recovers latent variables more accurately than manifold learning algorithms when applied to similar tasks. [sent-19, score-0.71]
11 Our method may be applied to estimate the observation function in nonlinear dynamical systems by enforcing a Markovian dynamics prior over the latent states. [sent-20, score-0.438]
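For concreteness, one simple way to encode such a Markovian smoothness prior is as a zero-mean Gaussian whose inverse covariance is built from finite differences of the latent trajectory. The sketch below is only illustrative: the first-order random-walk model, the ridge term, and all names are assumptions rather than details taken from the paper; the resulting banded precision plays the role of ΩX (per latent dimension) in the cost functional derived later.

```python
import numpy as np

def smoothness_precision(N, sigma=1.0):
    # Precision (inverse covariance) of a first-order Gaussian random walk,
    # x_{t+1} = x_t + w_t with w_t ~ N(0, sigma^2), over a length-N trajectory.
    D = np.diff(np.eye(N), axis=0)       # (N-1) x N first-difference operator
    Omega0 = (D.T @ D) / sigma**2        # banded (tridiagonal) precision matrix
    return Omega0 + 1e-6 * np.eye(N)     # small ridge keeps the prior proper
```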
12 We demonstrate this approach to nonlinear system identification by learning to track a moving object in a field of completely uncalibrated sensor nodes whose measurement functions are unknown. [sent-21, score-0.751]
13 Given that the object moves smoothly over time, our algorithm learns a function that maps the raw measurements from the sensor network to the target’s location. [sent-22, score-0.695]
14 In another experiment, we learn to track Radio Frequency ID (RFID) tags given a sequence of voltage measurements induced by the tag in a set of antennae. [sent-23, score-0.579]
15 Given only these measurements and that the tag moves smoothly over time, we can recover a mapping from the voltages to the position of the tag. [sent-24, score-0.711]
16 These results are surprising because no parametric sensor model is available in either scenario. [sent-25, score-0.275]
17 We are able to recover the measurement model up to an affine transform given only raw measurement sequences and a diffuse prior on the state sequence. [sent-26, score-0.439]
18 2 A diffeomorphic warping model for unsupervised regression We assume that the set X = {xi }1···N of latent variables is drawn (not necessarily iid) from a known distribution, pX (X) = pX (x1 , · · · , xN ). [sent-27, score-0.355]
19 The set of measurements Y = {yi }1···N is the output of an unknown invertible nonlinearity applied to each latent variable, yi = f0 (xi ). [sent-28, score-0.644]
20 We assume that observations, yi ∈ RD , are higher dimensional than latent variables xi ∈ Rd . [sent-29, score-0.369]
21 Because we have assumed that f0 is invertible and that there is no observation noise, this process describes a change of variables. [sent-32, score-0.217]
22 1] and [7, chap 11]): pY(Y) = pY(Y; f0) = pX(f0^{-1}(y1), · · · , f0^{-1}(yN)) ∏_{i=1}^N det[∇f0(f0^{-1}(yi))' ∇f0(f0^{-1}(yi))]^{-1/2}. (1) [sent-35, score-0.168]
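As a hedged illustration of how the likelihood in (1) could be evaluated numerically for a candidate inverse map g (writing the volume correction through the Jacobian of g rather than of f0, which gives the same quantity when g inverts f0), consider the sketch below; the function names and interface are ours, not the paper's.

```python
import numpy as np

def change_of_variables_loglik(g, jac_g, Y, log_prior):
    # g         : candidate inverse observation function, maps R^D -> R^d
    # jac_g     : returns the d x D Jacobian of g at a given observation y
    # log_prior : evaluates log pX on the stacked latent matrix X (N x d)
    X = np.array([g(y) for y in Y])
    volume_term = 0.0
    for y in Y:
        J = jac_g(y)                                   # d x D Jacobian of g at y
        volume_term += 0.5 * np.linalg.slogdet(J @ J.T)[1]
    return log_prior(X) + volume_term
```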
23 The change of variables formula immediately yields a likelihood over f, circumventing the need for integrating over the latent variables. [sent-37, score-0.31]
24 Therefore, if the true pY follows the change of variables model (1), the recovered g^{-1} converges to the true f0 in the sense that they both describe a change of variables from the prior distribution pX to the distribution pY. [sent-47, score-0.369]
25 This way, small perturbations in y due to observation noise produce small perturbations in g(y). [sent-49, score-0.232]
26 Substituting into (2) and adding the smoothness penalty on g, we obtain: max_C −vec(KC)' ΩX vec(KC) − λ tr(C'KC) + Σ_{i=1}^N log det(C'∆(yi)∆(yi)'C), (3) where the vec(·) operator stacks up the columns of its matrix argument into a column vector. [sent-56, score-0.399]
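A minimal sketch of evaluating the three terms of (3), assuming g is a kernel expansion so that the rows of KC are the recovered latent coordinates, and assuming ∆(yi) is the matrix of kernel gradients at yi; these assumptions, the transposes, and all names are our reading of the extraction-damaged formula rather than the paper's exact notation.

```python
import numpy as np

def cost(C, K, Omega_X, Deltas, lam):
    # C       : N x d coefficient matrix; g(y_i) is the i-th row of K @ C
    # K       : N x N kernel matrix on the observations
    # Omega_X : (N*d) x (N*d) inverse covariance of the stacked latent coordinates
    # Deltas  : list of N x D kernel-gradient matrices, one per observation (assumed)
    KC = K @ C
    x = KC.flatten(order='F')                      # vec(KC): stack the columns
    prior_term = -x @ Omega_X @ x                  # Gaussian prior on latent coordinates
    smooth_term = -lam * np.trace(C.T @ K @ C)     # RKHS smoothness penalty on g
    volume_term = sum(np.linalg.slogdet(C.T @ Dm @ Dm.T @ C)[1] for Dm in Deltas)
    return prior_term + smooth_term + volume_term
```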
27 This optimization is equivalent to max_{Z⪰0} −tr(MZ) + Σ_{i=1}^N log det([tr(J^i_{kl} Z)]_{kl}), (6) where [tr(J^i_{kl} Z)]_{kl} denotes the matrix whose (k, l) entry is tr(J^i_{kl} Z), subject to the additional constraint that rank(Z) = 1. [sent-63, score-0.219]
28 The nonconcave logdet term serves to prevent the optimal solution of (2) from collapsing to g(y) = 0, since X = 0 is the most likely setting for the zero-mean Gaussian prior pX . [sent-71, score-0.171]
29 To circumvent the non-concavity of the logdet term, we replace it with constraints requiring that the sample mean and covariance of g(Y) match the expected mean and covariance of the random variables X. [sent-72, score-0.213]
30 We thus obtain the following optimization problem: min_C vec(KC)' ΩX vec(KC) + λ tr(C'KC), subject to the constraint that the empirical mean and covariance of g(y1), · · · , g(yN) match the mean and covariance of the prior pX. [sent-76, score-0.274]
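Under additional simplifying assumptions this constrained problem admits a spectral solution. The sketch below assumes the prior factors across latent dimensions (ΩX = I_d ⊗ Ω0), that the covariance constraint is the identity with zero mean, and that centering the kernel handles the mean constraint; the reduction to a generalized eigenvalue problem, the jitter term, and all names are ours rather than the paper's exact algorithm.

```python
import numpy as np
from scipy.linalg import eigh

def fit_inverse_map(K, Omega0, lam, d):
    # K      : N x N kernel matrix on the observations
    # Omega0 : N x N smoothness precision shared by each latent dimension
    N = K.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N        # centering enforces the mean constraint
    A = K @ Omega0 @ K + lam * K               # quadratic cost pulled back to the coefficients
    B = (K @ H @ K) / N + 1e-8 * np.eye(N)     # empirical covariance of g(Y), with jitter
    vals, vecs = eigh(A, B)                    # generalized eigenproblem A v = mu B v
    C = vecs[:, :d]                            # d smallest eigenvectors give the coefficients
    X_hat = K @ C                              # recovered latent coordinates, one per row
    return C, X_hat
```

The eigenvectors returned by eigh are B-orthonormal, so the covariance constraint C'BC = I holds by construction.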
31 4 Related Work Manifold learning algorithms and unsupervised nonlinear system identification algorithms solve variants of the unsupervised regression problem considered here. [sent-81, score-0.208]
32 Our method provides a statistical model that augments manifold learning algorithms with a prior on latent variables. [sent-82, score-0.432]
33 In addition to our use of dynamics, a notable difference between our method and principal manifold methods [16] is that instead of learning a mapping from states to observations, we learn mappings from observations to states. [sent-85, score-0.406]
34 As far as we are aware, in the manifold learning literature, only Jenkins and Mataric [6] explicitly take temporal coherency into account, by increasing the affinity of temporally adjacent points and applying Isomap [18]. [sent-87, score-0.21]
35 State-of-the-art nonlinear system identification techniques seek to recover all the parameters of a continuous hidden Markov chain with nonlinear state transitions and observation functions given noisy observations [3, 8, 9, 19]. [sent-88, score-0.442]
36 In addition, each iteration of coordinate ascent requires some form of nonlinear smoothing over the latent variables, which is itself both computationally costly and becomes prone to local minima when the estimated observation function becomes non-invertible during the iterations. [sent-90, score-0.528]
37 Further, because mappings from low-dimensional to high-dimensional vectors require many parameters to represent, existing approaches tend to be unsuitable for large-scale sensor network or image analysis problems. [sent-91, score-0.368]
38 Our algorithms do not have local minima and represent the more compact inverse observation function where high-dimensional observations appear only in pairwise kernel evaluations. [sent-92, score-0.25]
39 Comparisons with a semi-supervised variant of these algorithms [13] show that weak priors on the latent variables are extremely informative and that additional labeled data is often only necessary to fix the coordinate system. [sent-93, score-0.33]
40 5 Experiments The following experiments show that latent states and observation functions can be accurately and efficiently recovered up to a linear coordinate transformation given only raw measurements and a generic prior over the latent variables. [sent-94, score-1.219]
41 We compare against various manifold learning and nonlinear system identification algorithms. [sent-95, score-0.283]
42 The latent variables xt consist of the position components of st. [sent-100, score-0.388]
43 The true 2D coordinates are recovered up to a scale, a flip, and some shrinking in the lower left corner. [sent-107, score-0.32]
44 Therefore the recovered g is close to the inverse of the original mapping, up to a linear transform. [sent-108, score-0.302]
45 Figure 1(d) shows states recovered by the algorithm of Roweis and Ghahramani [3]. [sent-109, score-0.302]
46 Smoothing with the recovered function simply projects the ... Figure 1: (a) 2D ground truth trajectory. [sent-110, score-0.438]
47 (c) Latent variables are recovered up to a linear transformation and minor distortion. [sent-113, score-0.312]
48 Roweis-Ghahramani (d), Isomap (e), and Isomap+temporal coherence (f) recovered low-dimensional coordinates that exhibit folding and other artifacts that cannot be corrected by a linear transformation. [sent-114, score-0.429]
49 Isomap (Figure 1(e)) performs poorly on this data set due to the low sampling rate on the manifold and the fact that the true mapping f is not isometric. [sent-118, score-0.25]
50 Including temporal neighbors into Isomap’s neighborhood structure (as per ST-Isomap) creates some folding, and the true underlying walk is not recovered (Figure 1(f)). [sent-119, score-0.291]
51 The upper bound on the log-likelihood returned by the relaxation (6) serves as a diagnostic on the quality of our approximations. [sent-122, score-0.172]
52 5.1 Learning to track in an uncalibrated sensor network We consider an artificial distributed sensor network scenario where many sensor nodes are deployed randomly in a field in order to track a moving target (Figure 2(a)). [sent-132, score-1.356]
53 The location of the sensor nodes is unknown, and the sensors are uncalibrated, so that it is not known how the position of the target maps to the reported measurements. [sent-133, score-0.456]
54 This situation arises when it is not feasible to calibrate each sensor prior to deployment or when variations in environmental conditions affect each sensor differently. [sent-134, score-0.598]
55 Given only the raw measurements produced by the network from watching a smoothly moving target, we wish to learn a mapping from these measurements to the location of the target, even though no functional form for the measurement model is available. [sent-135, score-0.813]
56 A similar problem was considered by [11], who sought to recover the location of sensor nodes using off-the-shelf manifold learning algorithms. [sent-136, score-0.516]
57 Each latent state xt is the unknown position of the target at time t. [sent-137, score-0.438]
58 The unknown function f (xt ) gives the set of measurements yt reported by the sensor network at time t. [sent-138, score-0.557]
59 Figure 2(b) shows the time series of measurements from observing the target. [sent-139, score-0.228]
60 In this case, measurements were generated by having each sensor s report its true distance d_t^s to the target at time t and passing it through a random nonlinearity of the form αs exp(−βs d_t^s). [sent-140, score-0.537]
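A short, hedged sketch of generating this kind of synthetic data; the sensor layout, parameter ranges, and names are illustrative assumptions, and only the functional form αs exp(−βs d_t^s) comes from the text.

```python
import numpy as np

def simulate_sensor_network(traj, n_sensors=100, seed=0):
    # traj : T x 2 array of target positions over time
    rng = np.random.default_rng(seed)
    sensors = rng.uniform(0.0, 1.0, size=(n_sensors, 2))     # unknown sensor positions
    alpha = rng.uniform(0.5, 2.0, size=n_sensors)            # random per-sensor gains
    beta = rng.uniform(0.5, 5.0, size=n_sensors)             # random per-sensor decay rates
    d = np.linalg.norm(traj[:, None, :] - sensors[None, :, :], axis=-1)  # T x S distances
    return alpha * np.exp(-beta * d)                          # T x S raw measurements y_t
```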
61 This is equivalent to requiring that a memoryless mapping from measurements to positions must exist. [sent-142, score-0.281]
62 Figure 2: (a) A target followed a smooth trajectory (dotted line) in a field of 100 randomly placed uncalibrated sensors with random and unknown observation functions (circles). [sent-213, score-0.454]
63 (b) Time series of measurements produced by the sensor network in response to the target’s motion. [sent-214, score-0.558]
64 (c) The recovered trajectory given only raw sensor measurements, and no information about the observation function (other than smoothness and invertibility). [sent-215, score-0.805]
65 (d) To test the recovered mapping further, the target was made to follow a zigzag pattern. [sent-217, score-0.49]
66 The resulting trajectory is again similar to the ground truth zigzag, up to minor distortion. [sent-219, score-0.275]
67 (f) The mapping obtained by KPCA cannot recover the zigzag, because KPCA does not utilize the prior on latent states. [sent-220, score-0.44]
68 The recovered function g implicitly performs all the triangulation necessary for recovering the position of the target, even though the position or characteristics of the sensors were not known a priori. [sent-222, score-0.429]
69 The bottom row of Figure 2 tests the recovered g by applying it to a new measurement set. [sent-223, score-0.346]
70 To show that this sensor network problem is not trivial, the figure also shows the output of the mapping obtained by KPCA. [sent-224, score-0.419]
71 As an RFID tag moves along the flat surface, the strength of the RF signal induced by the RFID tag in each antenna is reported, producing a time series of 10 numbers. [sent-228, score-0.44]
72 We wish to learn a mapping from these 10 voltage measurements to the 2D position of the RFID tag. [sent-229, score-0.394]
73 Previously, such a mapping was recovered by hand, by meticulous physical modeling of this system, followed by trial-and-error to refine these mappings, a process that took about 3 months in total [10]. [sent-230, score-0.346]
74 We show that it is possible to recover this mapping automatically, up to an affine transformation, given only the raw time series of measurements generated by moving the RFID tag by hand on the Sensetable for about 5 minutes. [sent-231, score-0.714]
75 This is a challenging task because the relationship between the tag’s position and the observed measurements is highly oscillatory. [sent-232, score-0.253]
76 Once it is learned, we can use the mapping to track RFID tags. [sent-234, score-0.175]
77 This experiment serves as a real-world instantiation of the sensor network setup of the previous section in that each antenna effectively acts as an uncalibrated sensor node with an unknown and highly oscillatory measurement function. [sent-235, score-0.895]
78 Figure 3(b) shows the ground truth trajectory of the RFID tag in this data set. [sent-236, score-0.439]
79 Given only the 5 minute-long time series of raw voltage measurements, our algorithm recovered the trajectory shown in Figure 3(c). [sent-237, score-0.536]
80 These recovered coordinates are scaled down and flipped about both axes as compared to the ground truth coordinates. [sent-238, score-0.501]
81 There is also some additional shrinkage in the upper right corner, but the coordinates are otherwise recovered accurately, with an affine registration error of 1. [sent-239, score-0.451]
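The affine registration errors quoted in this section compare recovered and ground-truth coordinates after the best affine alignment. One plausible way to compute such an error (a least-squares affine fit followed by the mean residual distance; the paper's exact definition may differ) is sketched below.

```python
import numpy as np

def affine_registration_error(X_hat, X_true):
    # X_hat, X_true : N x 2 recovered and ground-truth coordinates
    N = X_hat.shape[0]
    A = np.hstack([X_hat, np.ones((N, 1))])            # homogeneous recovered coordinates
    W, *_ = np.linalg.lstsq(A, X_true, rcond=None)     # best-fitting affine map
    residual = A @ W - X_true
    return np.linalg.norm(residual, axis=1).mean()     # mean distance after alignment
```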
82 None of these algorithms recover low-dimensional coordinates that resemble the ground truth. [sent-242, score-0.235]
83 LLE, in addition to collapsing the coordinates to one dimension, exhibits severe folding, obtaining an affine registration error of ...
84 Figure 3: (a) The output of the Sensetable over a six-second period, while moving the tag from the left edge of the table to the right edge. [sent-256, score-0.22]
85 (c) The trajectory recovered by our spectral algorithm is correct up to flips about both axes, a scale change, and some shrinkage along the edge. [sent-260, score-0.383]
86 Figure 4: From left to right, the trajectories recovered by LLE (k=15), KPCA, Isomap (K=7), and ST-Isomap. [sent-261, score-0.313]
87 KPCA also exhibited folding and large holes, with an affine registration error of 7. [sent-265, score-0.273]
88 Isomap with temporal coherency performed similarly, with a best affine registration error of 3. [sent-269, score-0.18]
89 To further test the mapping recovered by our algorithm, we traced various trajectories with an RFID tag and passed the resulting voltages through the recovered g. [sent-272, score-0.872]
90 Noise in the recovered coordinates is due to measurement noise. [sent-275, score-0.409]
91 To demonstrate this, we generated 2000 random perturbations of the parameters of the inverse covariance of X used to generate the Sensetable results, and evaluated the resulting affine registration error. [sent-277, score-0.293]
92 Figure 5: Tracking RFID tags using the recovered mapping. [sent-401, score-0.293]
93 6 Conclusions and Future Work We have shown how to recover the latent variables in a dynamical system given an approximate prior on the dynamics of these variables and observations of these states through an unknown invertible nonlinearity. [sent-402, score-0.754]
94 The requirement that the observation function be invertible is similar to the requirement in manifold learning algorithms that the manifold not intersect itself. [sent-403, score-0.507]
95 Our algorithm enhances manifold learning algorithms by leveraging a prior on the latent variables. [sent-404, score-0.432]
96 Because we search for a mapping from observations to unknown states (as opposed to from states to observations), we can devise algorithms that are stable and avoid local minima. [sent-405, score-0.32]
97 We applied this methodology to learning to track objects given only raw measurements from sensors with no constraints on the observation model other than invertibility and smoothness. [sent-406, score-0.606]
98 We are currently evaluating various ways to relax the invertibility requirement on the observation function by allowing invertibility up to a linear subspace. [sent-407, score-0.28]
99 Gaussian process latent variable models for visualisation of high dimensional data. [sent-460, score-0.223]
100 Manifold learning algorithms for localization in wireless sensor networks. [sent-473, score-0.308]
wordName wordTfidf (topN-words)
[('sensor', 0.275), ('recovered', 0.257), ('rfid', 0.236), ('latent', 0.223), ('isomap', 0.206), ('px', 0.202), ('kpca', 0.195), ('measurements', 0.192), ('py', 0.182), ('tag', 0.164), ('manifold', 0.161), ('sensetable', 0.148), ('vec', 0.137), ('registration', 0.131), ('af', 0.129), ('det', 0.125), ('uncalibrated', 0.123), ('folding', 0.109), ('invertible', 0.103), ('invertibility', 0.099), ('raw', 0.097), ('trajectory', 0.094), ('ground', 0.092), ('lle', 0.091), ('yi', 0.091), ('measurement', 0.089), ('truth', 0.089), ('mapping', 0.089), ('track', 0.086), ('nonlinear', 0.085), ('observation', 0.082), ('recover', 0.08), ('perturbations', 0.075), ('jkl', 0.074), ('logdet', 0.074), ('zigzag', 0.074), ('observations', 0.073), ('kc', 0.073), ('target', 0.07), ('coordinates', 0.063), ('position', 0.061), ('id', 0.058), ('relaxation', 0.056), ('moving', 0.056), ('trajectories', 0.056), ('variables', 0.055), ('network', 0.055), ('coordinate', 0.052), ('voltage', 0.052), ('minima', 0.05), ('returned', 0.05), ('sensors', 0.05), ('antennae', 0.049), ('coherency', 0.049), ('rahimi', 0.049), ('recht', 0.049), ('trckc', 0.049), ('voltages', 0.049), ('collapsing', 0.049), ('xt', 0.049), ('prior', 0.048), ('tr', 0.047), ('states', 0.045), ('inverse', 0.045), ('identi', 0.044), ('smoothly', 0.043), ('tracking', 0.043), ('jenkins', 0.043), ('antenna', 0.043), ('chap', 0.043), ('unsupervised', 0.043), ('covariance', 0.042), ('brighter', 0.039), ('radio', 0.039), ('susceptible', 0.039), ('cm', 0.038), ('mappings', 0.038), ('concave', 0.038), ('ck', 0.038), ('system', 0.037), ('diffuse', 0.036), ('tags', 0.036), ('ascent', 0.036), ('rd', 0.036), ('series', 0.036), ('unknown', 0.035), ('walk', 0.034), ('warping', 0.034), ('determinant', 0.034), ('quality', 0.033), ('moves', 0.033), ('asymptotically', 0.033), ('search', 0.033), ('exhibited', 0.033), ('diagnostic', 0.033), ('seattle', 0.033), ('wireless', 0.033), ('spectral', 0.032), ('change', 0.032), ('em', 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999934 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification
Author: Ali Rahimi, Ben Recht
Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1
2 0.1899088 121 nips-2006-Learning to be Bayesian without Supervision
Author: Martin Raphan, Eero P. Simoncelli
Abstract: unkown-abstract
3 0.17473289 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul
Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1
4 0.15813784 120 nips-2006-Learning to Traverse Image Manifolds
Author: Piotr DollĂĄr, Vincent Rabaud, Serge J. Belongie
Abstract: We present a new algorithm, Locally Smooth Manifold Learning (LSML), that learns a warping function from a point on an manifold to its neighbors. Important characteristics of LSML include the ability to recover the structure of the manifold in sparsely populated regions and beyond the support of the provided data. Applications of our proposed technique include embedding with a natural out-of-sample extension and tasks such as tangent distance estimation, frame rate up-conversion, video compression and motion transfer. 1
5 0.13671906 42 nips-2006-Bayesian Image Super-resolution, Continued
Author: Lyndsey C. Pickup, David P. Capel, Stephen J. Roberts, Andrew Zisserman
Abstract: This paper develops a multi-frame image super-resolution approach from a Bayesian view-point by marginalizing over the unknown registration parameters relating the set of input low-resolution views. In Tipping and Bishop’s Bayesian image super-resolution approach [16], the marginalization was over the superresolution image, necessitating the use of an unfavorable image prior. By integrating over the registration parameters rather than the high-resolution image, our method allows for more realistic prior distributions, and also reduces the dimension of the integral considerably, removing the main computational bottleneck of the other algorithm. In addition to the motion model used by Tipping and Bishop, illumination components are introduced into the generative model, allowing us to handle changes in lighting as well as motion. We show results on real and synthetic datasets to illustrate the efficacy of this approach.
6 0.13339823 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians
7 0.10781903 147 nips-2006-Non-rigid point set registration: Coherent Point Drift
8 0.097284824 128 nips-2006-Manifold Denoising
9 0.09161824 77 nips-2006-Fast Computation of Graph Kernels
10 0.088085435 12 nips-2006-A Probabilistic Algorithm Integrating Source Localization and Noise Suppression of MEG and EEG data
11 0.087861657 69 nips-2006-Distributed Inference in Dynamical Systems
12 0.086423323 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights
13 0.081124708 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
14 0.080952905 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
15 0.077608004 161 nips-2006-Particle Filtering for Nonparametric Bayesian Matrix Factorization
16 0.076781206 65 nips-2006-Denoising and Dimension Reduction in Feature Space
17 0.076076858 40 nips-2006-Bayesian Detection of Infrequent Differences in Sets of Time Series with Shared Structure
18 0.072636329 15 nips-2006-A Switched Gaussian Process for Estimating Disparity and Segmentation in Binocular Stereo
19 0.070305459 113 nips-2006-Learning Structural Equation Models for fMRI
20 0.069148481 132 nips-2006-Modeling Dyadic Data with Binary Latent Factors
topicId topicWeight
[(0, -0.251), (1, 0.035), (2, 0.029), (3, 0.052), (4, 0.014), (5, -0.076), (6, 0.123), (7, -0.058), (8, -0.023), (9, 0.09), (10, 0.011), (11, -0.083), (12, 0.036), (13, 0.068), (14, 0.043), (15, -0.033), (16, -0.009), (17, -0.256), (18, -0.049), (19, 0.202), (20, 0.032), (21, -0.135), (22, 0.105), (23, -0.161), (24, -0.295), (25, 0.055), (26, -0.115), (27, -0.091), (28, -0.027), (29, 0.102), (30, 0.029), (31, 0.099), (32, 0.103), (33, -0.0), (34, 0.029), (35, -0.161), (36, 0.045), (37, 0.026), (38, 0.025), (39, 0.016), (40, 0.082), (41, -0.082), (42, -0.039), (43, 0.119), (44, 0.06), (45, -0.034), (46, 0.051), (47, 0.032), (48, -0.024), (49, -0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.95090538 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification
Author: Ali Rahimi, Ben Recht
Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1
2 0.6435138 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul
Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1
3 0.63496763 120 nips-2006-Learning to Traverse Image Manifolds
Author: Piotr DollĂĄr, Vincent Rabaud, Serge J. Belongie
Abstract: We present a new algorithm, Locally Smooth Manifold Learning (LSML), that learns a warping function from a point on an manifold to its neighbors. Important characteristics of LSML include the ability to recover the structure of the manifold in sparsely populated regions and beyond the support of the provided data. Applications of our proposed technique include embedding with a natural out-of-sample extension and tasks such as tangent distance estimation, frame rate up-conversion, video compression and motion transfer. 1
4 0.61974609 121 nips-2006-Learning to be Bayesian without Supervision
Author: Martin Raphan, Eero P. Simoncelli
Abstract: unkown-abstract
5 0.46825305 147 nips-2006-Non-rigid point set registration: Coherent Point Drift
Author: Andriy Myronenko, Xubo Song, Miguel Á. Carreira-Perpiñán
Abstract: We introduce Coherent Point Drift (CPD), a novel probabilistic method for nonrigid registration of point sets. The registration is treated as a Maximum Likelihood (ML) estimation problem with motion coherence constraint over the velocity field such that one point set moves coherently to align with the second set. We formulate the motion coherence constraint and derive a solution of regularized ML estimation through the variational approach, which leads to an elegant kernel form. We also derive the EM algorithm for the penalized ML optimization with deterministic annealing. The CPD method simultaneously finds both the non-rigid transformation and the correspondence between two point sets without making any prior assumption of the transformation model except that of motion coherence. This method can estimate complex non-linear non-rigid transformations, and is shown to be accurate on 2D and 3D examples and robust in the presence of outliers and missing points.
6 0.41438684 42 nips-2006-Bayesian Image Super-resolution, Continued
7 0.41226846 128 nips-2006-Manifold Denoising
8 0.37056723 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians
9 0.36484486 25 nips-2006-An Application of Reinforcement Learning to Aerobatic Helicopter Flight
10 0.35539427 12 nips-2006-A Probabilistic Algorithm Integrating Source Localization and Noise Suppression of MEG and EEG data
11 0.3471342 6 nips-2006-A Kernel Subspace Method by Stochastic Realization for Learning Nonlinear Dynamical Systems
12 0.3446579 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights
13 0.34211546 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
14 0.33558732 129 nips-2006-Map-Reduce for Machine Learning on Multicore
15 0.32558817 161 nips-2006-Particle Filtering for Nonparametric Bayesian Matrix Factorization
16 0.32340857 107 nips-2006-Large Margin Multi-channel Analog-to-Digital Conversion with Applications to Neural Prosthesis
17 0.31769058 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization
18 0.31727421 69 nips-2006-Distributed Inference in Dynamical Systems
19 0.31690621 104 nips-2006-Large-Scale Sparsified Manifold Regularization
20 0.30739024 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
topicId topicWeight
[(1, 0.077), (3, 0.503), (7, 0.059), (9, 0.027), (20, 0.019), (22, 0.051), (44, 0.065), (57, 0.057), (65, 0.034), (69, 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.91996211 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification
Author: Ali Rahimi, Ben Recht
Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1
2 0.8869698 120 nips-2006-Learning to Traverse Image Manifolds
Author: Piotr DollĂĄr, Vincent Rabaud, Serge J. Belongie
Abstract: We present a new algorithm, Locally Smooth Manifold Learning (LSML), that learns a warping function from a point on an manifold to its neighbors. Important characteristics of LSML include the ability to recover the structure of the manifold in sparsely populated regions and beyond the support of the provided data. Applications of our proposed technique include embedding with a natural out-of-sample extension and tasks such as tangent distance estimation, frame rate up-conversion, video compression and motion transfer. 1
3 0.85335302 17 nips-2006-A recipe for optimizing a time-histogram
Author: Hideaki Shimazaki, Shigeru Shinomoto
Abstract: The time-histogram method is a handy tool for capturing the instantaneous rate of spike occurrence. In most of the neurophysiological literature, the bin size that critically determines the goodness of the fit of the time-histogram to the underlying rate has been selected by individual researchers in an unsystematic manner. We propose an objective method for selecting the bin size of a time-histogram from the spike data, so that the time-histogram best approximates the unknown underlying rate. The resolution of the histogram increases, or the optimal bin size decreases, with the number of spike sequences sampled. It is notable that the optimal bin size diverges if only a small number of experimental trials are available from a moderately fluctuating rate process. In this case, any attempt to characterize the underlying spike rate will lead to spurious results. Given a paucity of data, our method can also suggest how many more trials are needed until the set of data can be analyzed with the required resolution. 1
4 0.77524978 104 nips-2006-Large-Scale Sparsified Manifold Regularization
Author: Ivor W. Tsang, James T. Kwok
Abstract: Semi-supervised learning is more powerful than supervised learning by using both labeled and unlabeled data. In particular, the manifold regularization framework, together with kernel methods, leads to the Laplacian SVM (LapSVM) that has demonstrated state-of-the-art performance. However, the LapSVM solution typically involves kernel expansions of all the labeled and unlabeled examples, and is slow on testing. Moreover, existing semi-supervised learning methods, including the LapSVM, can only handle a small number of unlabeled examples. In this paper, we integrate manifold regularization with the core vector machine, which has been used for large-scale supervised and unsupervised learning. By using a sparsified manifold regularizer and formulating as a center-constrained minimum enclosing ball problem, the proposed method produces sparse solutions with low time and space complexities. Experimental results show that it is much faster than the LapSVM, and can handle a million unlabeled examples on a standard PC; while the LapSVM can only handle several thousand patterns. 1
5 0.49448279 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights
Author: Zhenyue Zhang, Jing Wang
Abstract: The locally linear embedding (LLE) is improved by introducing multiple linearly independent local weight vectors for each neighborhood. We characterize the reconstruction weights and show the existence of the linearly independent weight vectors at each neighborhood. The modified locally linear embedding (MLLE) proposed in this paper is much stable. It can retrieve the ideal embedding if MLLE is applied on data points sampled from an isometric manifold. MLLE is also compared with the local tangent space alignment (LTSA). Numerical examples are given that show the improvement and efficiency of MLLE. 1
6 0.48393402 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
7 0.47732157 121 nips-2006-Learning to be Bayesian without Supervision
8 0.46779516 153 nips-2006-Online Clustering of Moving Hyperplanes
10 0.45726201 128 nips-2006-Manifold Denoising
11 0.45617038 106 nips-2006-Large Margin Hidden Markov Models for Automatic Speech Recognition
12 0.44872755 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians
13 0.44119731 15 nips-2006-A Switched Gaussian Process for Estimating Disparity and Segmentation in Binocular Stereo
14 0.4366039 48 nips-2006-Branch and Bound for Semi-Supervised Support Vector Machines
15 0.43084726 107 nips-2006-Large Margin Multi-channel Analog-to-Digital Conversion with Applications to Neural Prosthesis
16 0.42142311 162 nips-2006-Predicting spike times from subthreshold dynamics of a neuron
17 0.42112875 55 nips-2006-Computation of Similarity Measures for Sequential Data using Generalized Suffix Trees
18 0.42028299 95 nips-2006-Implicit Surfaces with Globally Regularised and Compactly Supported Basis Functions
19 0.41190699 34 nips-2006-Approximate Correspondences in High Dimensions
20 0.41154698 42 nips-2006-Bayesian Image Super-resolution, Continued