nips nips2007 nips2007-161 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a novel method for linear dimensionality reduction of manifold modeled data. [sent-7, score-0.616]
2 First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. [sent-8, score-0.793]
3 Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. [sent-9, score-0.133]
4 In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . [sent-10, score-0.473]
5 To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. [sent-11, score-0.667]
6 Data that possesses merely K “intrinsic” degrees of freedom can be assumed to lie on a K-dimensional manifold in the high-dimensional ambient space. [sent-17, score-0.774]
7 Once the manifold model is identified, any point on it can be represented using essentially K pieces of information. [sent-18, score-0.521]
8 Thus, algorithms in this vein of dimensionality reduction attempt to learn the structure of the manifold given high-dimensional training data. [sent-19, score-0.673]
9 While most conventional manifold learning algorithms are adaptive (i.e. [sent-20, score-0.601]
10 , involve construction of a nonlinear mapping), a linear, nonadaptive manifold dimensionality reduction technique has recently been introduced that employs random projections [1]. [sent-24, score-1.058]
11 Consider a K-dimensional manifold M in the ambient space RN and its projection onto a random subspace of dimension M = CK log(N ); note that K < M ≪ N . [sent-25, score-1.023]
12 The result of [1] is that the pairwise metric structure of sample points from M is preserved with high accuracy under projection from RN to RM . [sent-26, score-0.262]
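To make this claim concrete, the following sketch (not the authors' code) projects samples from a synthetic K-dimensional manifold in R^N onto M = ⌈CK log N⌉ random dimensions and measures how much the pairwise distances change. The constant C, the synthetic data, and the use of a scaled Gaussian matrix as a stand-in for an exact orthoprojector are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, C, n_points = 150, 2, 4, 200          # C is an arbitrary illustrative constant

# Sample points from a synthetic K-dimensional manifold embedded in R^N:
# map K latent parameters through fixed random sinusoids to get a smooth surface.
t = rng.uniform(0.0, 1.0, size=(n_points, K))
freqs = rng.normal(size=(K, N))
X = np.sin(t @ freqs)                        # n_points x N data matrix

M = int(np.ceil(C * K * np.log(N)))          # projection dimension, K < M << N
Phi = rng.normal(size=(M, N)) / np.sqrt(M)   # scaled Gaussian projection (stand-in for an orthoprojector)
Y = X @ Phi.T                                # projected points in R^M

def pairwise_dists(A):
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

D_orig, D_proj = pairwise_dists(X), pairwise_dists(Y)
mask = ~np.eye(n_points, dtype=bool)
distortion = np.abs(D_proj[mask] / D_orig[mask] - 1.0)
print(f"M = {M}, worst-case relative distance distortion = {distortion.max():.3f}")
```

The printed quantity is the worst relative change over all point pairs; the smaller it is, the more faithfully the metric structure of the point cloud survives the projection.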
13 (c,d) Isomap embedding learned from (c) original data in RN , and (d) a randomly projected version of the data into RM with M = 15. [sent-30, score-0.24]
14 Prototypical devices that directly and inexpensively acquire random projections of certain types of data (signals, images, etc. [sent-32, score-0.504]
15 The theory of [1] suggests that a wide variety of signal processing tasks can be performed directly on the random projections acquired by these devices, thus saving valuable sensing, storage and processing costs. [sent-34, score-0.436]
16 The advantages of random projections extend even to cases where the original data is available in the ambient space RN . [sent-35, score-0.639]
17 Collate: Each camera node transmits its respective captured image (of size N ) to a central processing unit. [sent-38, score-0.132]
18 Preprocess: The central processor estimates the intrinsic dimension K of the underlying image manifold. [sent-40, score-0.464]
19 Learn: The central processor performs a nonlinear embedding of the data points – for instance, using Isomap [6] – into a K-dimensional Euclidean space, using the estimate of K from the previous step. [sent-42, score-0.299]
20 On the one hand, to reduce the communication needs, one may perform nonlinear image compression (such as JPEG) at each node before transmitting to the central processing unit. [sent-44, score-0.183]
21 On the other hand, every camera could encode its image by computing (either directly or indirectly) a small number of random projections to communicate to the central processor. [sent-46, score-0.519]
22 These random projections are obtained by linear operations on the data, and thus are cheaply computed. [sent-47, score-0.387]
23 Clearly, in many situations it will be less expensive to store, transmit, and process such randomly projected versions of the sensed images. [sent-48, score-0.164]
24 The question now becomes: how much information about the manifold is conveyed by these random projections, and is there any advantage in analyzing such measurements from a manifold learning perspective? [sent-49, score-1.151]
25 In this paper, we provide theoretical and experimental evidence that reliable learning of a K-dimensional manifold can be performed not just in the high-dimensional ambient space RN but also in an intermediate, much lower-dimensional random projection space RM , where M = CK log(N ). [sent-50, score-0.859]
26 Second, we present a similar bound on the number of measurements M required for Isomap [6] – a popular manifold learning algorithm – to be “reliably” used to discover the nonlinear structure of the manifold. [sent-54, score-0.71]
27 This paves the way for a weakly adaptive, linear algorithm (ML-RP) for dimensionality reduction and manifold learning. [sent-57, score-0.641]
28 Section 2 recaps the manifold learning approaches we utilize. [sent-59, score-0.521]
29 Section 3 presents our main theoretical contributions, namely the bounds on M required to perform reliable dimensionality estimation and manifold learning from random projections. [sent-60, score-0.732]
30 Section 4 describes a new adaptive algorithm that estimates the minimum value of M required to provide a faithful representation of the data so that manifold learning can be performed. [sent-61, score-0.733]
31 2 Background An important input parameter for all manifold learning algorithms is the intrinsic dimension (ID) of a point cloud. [sent-64, score-0.821]
32 However, if the embedding dimension is too small, then distinct data points might be collapsed onto the same embedded point. [sent-66, score-0.336]
33 Hence a natural question to ask is: given a point cloud in N -dimensional Euclidean space, what is the dimension of the manifold that best captures the structure of this data set? [sent-67, score-0.781]
34 It then computes the scale-dependent correlation dimension of the data, defined as follows. [sent-71, score-0.194]
35 Suppose X = (x1 , . . . , xn ) is a finite dataset of underlying dimension K. [sent-76, score-0.245]
36 (2) where d1 (x, y) (respectively, d2 (x, y)) stands for the geodesic (respectively, ℓ2 ) distance between points x and y. [sent-81, score-0.141]
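As a rough illustration of the quantity the GP algorithm computes, here is a sketch of a scale-dependent correlation dimension estimate in the spirit of the Grassberger-Procaccia procedure; the paper's exact normalization, scale selection, and choice of distance may differ, and the example scales below are assumptions.

```python
import numpy as np

def correlation_dimension(X, r_min, r_max):
    """Scale-dependent correlation dimension over the range (r_min, r_max) -- a sketch.

    Counts the fraction of point pairs closer than r (the correlation sum) at the two
    scales and returns the slope of log C(r) versus log r between them.
    """
    n = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    d = D[np.triu_indices(n, k=1)]                      # all pairwise l2 distances
    def C(r):                                           # correlation sum at scale r
        return max((d < r).mean(), 1.0 / d.size)        # floor avoids log(0)
    return (np.log(C(r_max)) - np.log(C(r_min))) / (np.log(r_max) - np.log(r_min))

# Example: estimate the ID of the ambient data and of its random projection,
# shrinking the scale range by sqrt(M/N) (an assumption) for the projected copy.
# K_hat      = correlation_dimension(X, 0.1, 0.5)
# K_hat_proj = correlation_dimension(X @ Phi.T, 0.1 * np.sqrt(M / N), 0.5 * np.sqrt(M / N))
```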
37 The condition number 1/τ controls the local, as well as global, curvature of the manifold – the smaller the value of τ , the less well-conditioned the manifold and the higher its “twistedness” [1]. [sent-82, score-1.073]
38 This result has been proved by first specifying a finite high-resolution sampling on the manifold, the nature of which depends on its intrinsic properties; for instance, a planar manifold can be sampled coarsely. [sent-84, score-0.657]
39 3 Bounds on the performance of ID estimation and manifold learning algorithms under random projection We saw above that random projections essentially ensure that the metric structure of a high-dimensional input point cloud (i.e. [sent-86, score-1.229]
40 , the set of all pairwise distances between points belonging to the dataset) is preserved up to a distortion that depends on ε. [sent-88, score-0.128]
41 This immediately suggests that geometry-based ID estimation and manifold learning algorithms could be applied to the lower-dimensional, randomly projected version of the dataset. [sent-89, score-0.695]
42 The first of our main results establishes a sufficient dimension of random projection M required to maintain the fidelity of the estimated correlation dimension using the GP algorithm. [sent-90, score-0.522]
43 Theorem 3.1: Let M be a compact K-dimensional manifold in RN having volume V and condition number 1/τ . [sent-93, score-0.576]
44 Let K be the dimension estimate of the GP algorithm on X over the range (rmin , rmax ). [sent-98, score-0.277]
45 Suppose the following condition holds: rmax < τ /2. (3) Let Φ be a random orthoprojector from RN to RM with M < N and M ≥ O( K log(N V τ^−1) log(ρ^−1) / (β^2 δ^2) ). (4) [sent-101, score-0.262]
46 Let KΦ be the estimated correlation dimension on ΦX in the projected space over the range (rmin √(M/N) , rmax √(M/N) ). [sent-102, score-0.398]
47 Theorem 3.1 is a worst-case bound and serves as a sufficient condition for stable ID estimation using random projections. [sent-105, score-0.132]
48 Thus, if we choose sufficiently small values for δ and ρ, we are guaranteed estimation accuracy levels as close as desired to those obtained with ID estimation in the original signal space. [sent-106, score-0.172]
49 The number of projections required to estimate KΦ very close to K (say, within integer roundoff error) becomes higher with increasing manifold dimension K. [sent-112, score-1.137]
50 The second of our main results prescribes the minimum dimension of random projections required to maintain the residual variance produced by Isomap in the projected domain within an arbitrary additive constant of that produced by Isomap with the full data in the ambient space. [sent-113, score-1.23]
51 Theorem 3.2: Let M be a compact K-dimensional manifold in RN having volume V and condition number 1/τ . [sent-116, score-0.576]
52 Let Φ be a random orthoprojector from RN to RM with M < N . [sent-121, score-0.151]
53 Define the diameter Γ of the dataset as follows: Γ = max_{1≤i,j≤n} diso (xi , xj ), where diso (x, y) stands for the Isomap estimate of the geodesic distance between points x and y. [sent-124, score-0.328]
54 Define R and RΦ to be the residual variances obtained when Isomap generates a K-dimensional embedding of the original dataset X and projected dataset ΦX respectively. [sent-125, score-0.447]
55 Since the choice of ε is arbitrary, we can choose a large enough M (which is still only logarithmic in N ) such that the residual variance yielded by Isomap on the randomly projected version of the dataset is arbitrarily close to the variance produced with the data in the ambient space. [sent-128, score-0.682]
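A sketch of how the residual-variance comparison behind Theorem 3.2 might be evaluated numerically, using scikit-learn's Isomap. Here "residual variance" is computed as 1 − R² between the estimated geodesic distances and the distances in the low-dimensional embedding, which is one common definition; the function name and neighborhood size are our assumptions, not the paper's evaluation code.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import Isomap

def residual_variance(X, n_components, n_neighbors=10):
    """1 - R^2 between Isomap's geodesic distance estimates and embedding distances."""
    iso = Isomap(n_neighbors=n_neighbors, n_components=n_components).fit(X)
    d_geo = squareform(iso.dist_matrix_, checks=False)   # estimated geodesic distances
    d_emb = pdist(iso.embedding_)                        # distances in the K-dim embedding
    r = np.corrcoef(d_geo, d_emb)[0, 1]
    return 1.0 - r ** 2

# Comparing ambient and projected data, in the spirit of Theorem 3.2
# (X, Phi, K as in the earlier sketch):
# R_ambient   = residual_variance(X, n_components=K)
# R_projected = residual_variance(X @ Phi.T, n_components=K)
```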
56 Thus, we have proved that with only an M -dimensional projection of the data (with M ≪ N ) we can perform ID estimation and subsequently learn the structure of a K-dimensional manifold, up to accuracy levels obtained by conventional methods. [sent-135, score-0.198]
57 In Section 4, we utilize these sufficiency results to motivate an algorithm for performing practical manifold structure estimation using random projections. [sent-136, score-0.644]
58 Also, since we have no a priori information regarding the data, it is impossible to fix K and R, the outputs of GP and Isomap on the point cloud in the ambient space. [sent-139, score-0.302]
59 To circumvent this problem, we develop the following empirical procedure, which we dub ML-RP (manifold learning using random projections). [sent-141, score-0.572]
60 We initialize M to a small number, and compute M random projections of the data set X = {x1 , x2 , . . . , xn }. [sent-142, score-0.387]
61 Using the set ΦX = {Φx : x ∈ X}, we estimate the intrinsic dimension using the GP algorithm. [sent-146, score-0.333]
62 The residual variance produced by this operation is recorded. [sent-148, score-0.187]
63 The algorithm terminates when the residual variance obtained is smaller than some tolerance parameter δ. [sent-150, score-0.156]
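Putting the steps just described together, a minimal sketch of the ML-RP loop might look as follows. It reuses the correlation_dimension and residual_variance sketches given earlier; the Gaussian projection matrix, the unit increment of M, and the √(M/N) shrinkage of the GP scale range are assumptions made for illustration rather than details taken from the paper.

```python
import numpy as np

def ml_rp(X, delta, r_min, r_max, M_start=1, M_step=1, rng=None):
    """Minimal sketch of the ML-RP loop described above (not the authors' code)."""
    rng = rng or np.random.default_rng(0)
    n, N = X.shape
    M = M_start
    while True:
        Phi = rng.normal(size=(M, N)) / np.sqrt(M)      # random projection R^N -> R^M
        Y = X @ Phi.T
        scale = np.sqrt(M / N)                          # distances shrink roughly by this factor
        K_hat = correlation_dimension(Y, r_min * scale, r_max * scale)
        K_hat = int(np.clip(round(K_hat), 1, M))        # integer ID estimate fed to Isomap
        R_phi = residual_variance(Y, n_components=K_hat)
        if R_phi < delta or M >= N:                     # stop once the embedding is faithful enough
            return M, K_hat
        M += M_step                                     # otherwise acquire more projections
```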
64 A sufficient number M of random projections is determined by a nonlinear procedure (i.e. [sent-153, score-0.442]
65 , sequential computation of Isomap residual variance) so that conventional [sent-155, score-0.242]
66 Algorithm 1 ML-RP: M ← 1; Φ ← random orthoprojector of size M × N ; while residual variance ≥ δ do: run the GP algorithm on ΦX. [sent-156, score-0.156]
67 Figure 2: (a) Estimated intrinsic dimension for underlying hyperspherical manifolds of increasing dimension. [sent-161, score-0.4]
68 (b) Minimum number of projections required for GP to work with 90% accuracy as compared to GP on native data. [sent-163, score-0.414]
69 manifold learning does almost as well on the projected dataset as the original. [sent-164, score-0.699]
70 On the other hand, the random linear projections provide a faithful representation of the data in the geodesic sense. [sent-165, score-0.511]
71 ML-RP is only weakly adaptive in the sense that just the stopping criterion is determined by monitoring the nature of the projected data. [sent-168, score-0.186]
72 The existence of a certain minimum number of measurements for any chosen error value δ ensures that eventually, M in the ML-RP algorithm is going to become high enough to ensure “good” Isomap performance. [sent-170, score-0.138]
73 , just the requisite numbers of projections of points are obtained. [sent-173, score-0.372]
74 5 Experimental results This section details the results of simulations of ID estimation and subsequent manifold learning on real and synthetic datasets. [sent-174, score-0.571]
75 First, we examine the performance of the GP algorithm on random projections of K-dimensional hyperspheres embedded in an ambient space of dimension N = 150. [sent-175, score-0.873]
76 Figure 2(a) shows the variation of the dimension estimate produced by GP as a function of the number of projections M . [sent-176, score-0.591]
77 Figure 2(b) displays the minimum number of projections per sample point required to estimate the scale-dependent correlation dimension directly from the random projections, up to 10% error, when compared to GP estimation on the original data. [sent-178, score-0.82]
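For reference, one way to generate synthetic data of this kind is sketched below: points sampled uniformly from a K-dimensional hypersphere and embedded in an ambient space of dimension N = 150. The zero-pad-then-rotate embedding is an assumed construction, not necessarily the one used in the paper.

```python
import numpy as np

def hypersphere_samples(n, K, N, rng=None):
    """Sample n points from a K-dimensional hypersphere embedded in R^N (a sketch).

    Points are drawn uniformly on S^K in R^(K+1) by normalizing Gaussian vectors,
    zero-padded to N coordinates, and rotated by a random orthogonal matrix.
    """
    rng = rng or np.random.default_rng(0)
    g = rng.normal(size=(n, K + 1))
    sphere = g / np.linalg.norm(g, axis=1, keepdims=True)   # uniform on the K-sphere
    X = np.zeros((n, N))
    X[:, :K + 1] = sphere
    Q, _ = np.linalg.qr(rng.normal(size=(N, N)))            # random rotation of R^N
    return X @ Q.T

# Example matching the setup described above (sample size assumed):
# X = hypersphere_samples(n=1000, K=6, N=150)
```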
78 Figure 2(b) illustrates the variation of the minimum required projection dimension M vs. [sent-180, score-0.359]
79 K, the intrinsic dimension. Figure 3: Standard databases. [sent-181, score-0.136]
80 Ambient dimension for the face database N = 4096; ambient dimension for the hand rotation database N = 3840. [sent-182, score-0.832]
81 (right) ML-RP on the hand rotation database (N = 3840). [sent-186, score-0.182]
82 For M > 60, the Isomap variance is indistinguishable from the variance obtained in the ambient space. [sent-187, score-0.371]
83 We plot the intrinsic dimension of the dataset against the minimum number of projections required such that KΦ is within 10% of the conventional GP estimate K (this is equivalent to choosing δ = 0. [sent-189, score-0.875]
84 Finally, we turn our attention to two common datasets (Figure 3) found in the literature on dimension estimation – the face database [6], and the hand rotation database [17]. [sent-194, score-0.478]
85 The face database is a collection of 698 artificial snapshots of a face (N = 64 × 64 = 4096) varying under 3 degrees of freedom: 2 angles for pose and 1 for lighting dimension. [sent-195, score-0.182]
86 The signals are therefore believed to reside on a 3D manifold in an ambient space of dimension 4096. [sent-196, score-0.913]
87 The hand rotation database is a set of 90 images (N = 64 × 60 = 3840) of rotations of a hand holding an object. [sent-197, score-0.225]
88 Although the image appearance manifold is ostensibly one-dimensional, estimators in the literature always overestimate its ID [11]. [sent-198, score-0.575]
89 Random projections of each sample in the databases were obtained by computing the inner product of the image samples with an increasing number of rows of the random orthoprojector Φ. [sent-199, score-0.633]
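A sketch of how such nested projections can be formed: build a random orthoprojector once (here by QR-orthonormalizing a Gaussian matrix, one standard construction that the paper does not necessarily use) and take inner products with its first M rows, so that increasing M simply adds new measurements. The array name `images` is a hypothetical placeholder.

```python
import numpy as np

def random_orthoprojector(N, M_max, rng=None):
    """An M_max x N matrix with orthonormal rows (a random orthoprojector sketch)."""
    rng = rng or np.random.default_rng(0)
    G = rng.normal(size=(N, M_max))
    Q, _ = np.linalg.qr(G)      # N x M_max, orthonormal columns
    return Q.T                  # M_max x N, orthonormal rows

# Nested measurements: the first M rows give the M-dimensional projection, and
# raising M just appends new inner products of each vectorized image with rows of Phi.
# Phi = random_orthoprojector(N=4096, M_max=100)
# Y_M = images @ Phi[:M].T      # 'images' is an (n_samples x 4096) array, assumed
```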
90 We note that in the case of the face database, for M > 60, the Isomap variance on the randomly projected points closely approximates the variance obtained with full image data. [sent-200, score-0.382]
91 This behavior of convergence of the variance to the best possible value is even more sharply observed in the hand rotation database, in which the two variance curves are indistinguishable for M > 60. [sent-201, score-0.251]
92 6 Discussion Our main theoretical contributions in this paper are the explicit values for the lower bounds on the minimum number of random projections required to perform ID estimation and subsequent manifold learning using Isomap, with high guaranteed accuracy levels. [sent-203, score-1.115]
93 Experiments on simple cases, such as uniformly generated hyperspheres of varying dimension, and more complex situations, such as the image databases displayed in Figure 3, provide sufficient evidence of the nature of the bounds described above. [sent-205, score-0.144]
94 Note that we use a subsampled version of the database used in the literature, both in terms of resolution of the image and sampling of the manifold. [sent-213, score-0.128]
95 The method of random projections is thus a powerful tool for ensuring the stable embedding of low-dimensional manifolds into an intermediate space of reasonable size. [sent-214, score-0.523]
96 The motivation for developing results and algorithms that involve random measurements of high-dimensional data is significant, particularly due to the increasing attention that Compressive Sensing (CS) has received recently. [sent-215, score-0.166]
97 It is now possible to think of settings involving a huge number of low-power devices that inexpensively capture, store, and transmit a very small number of measurements of high-dimensional data. [sent-216, score-0.212]
98 In situations where the bottleneck lies in the transmission of the data to the central processing node, ML-RP provides a simple solution to the manifold learning problem and ensures that effective manifold learning can be performed with a minimal amount of transmitted information. [sent-218, score-1.248]
99 The metric structure of the projected dataset upon termination of MLRP closely resembles that of the original dataset with high probability; thus, ML-RP can be viewed as a novel adaptive algorithm for finding an efficient, reduced representation of data of very large dimension. [sent-219, score-0.345]
100 Geodesic entropic graphs for dimension and entropy estimation in manifold learning. [sent-304, score-0.735]
wordName wordTfidf (topN-words)
[('manifold', 0.521), ('projections', 0.336), ('isomap', 0.334), ('id', 0.246), ('ambient', 0.228), ('gp', 0.208), ('dimension', 0.164), ('intrinsic', 0.136), ('projected', 0.124), ('orthoprojector', 0.1), ('wakin', 0.1), ('residual', 0.099), ('rn', 0.093), ('embedding', 0.092), ('rmax', 0.08), ('geodesic', 0.08), ('rice', 0.075), ('cloud', 0.074), ('database', 0.074), ('devices', 0.067), ('rotation', 0.065), ('sensing', 0.061), ('projection', 0.059), ('measurements', 0.058), ('rm', 0.058), ('variance', 0.057), ('dimensionality', 0.056), ('exceeding', 0.056), ('minimum', 0.055), ('nonlinear', 0.055), ('face', 0.054), ('dataset', 0.054), ('image', 0.054), ('required', 0.054), ('random', 0.051), ('estimation', 0.05), ('diso', 0.05), ('hyperspheres', 0.05), ('inexpensively', 0.05), ('laska', 0.05), ('rmin', 0.05), ('central', 0.046), ('manifolds', 0.044), ('embedded', 0.044), ('ece', 0.044), ('faithful', 0.044), ('hegde', 0.044), ('hand', 0.043), ('conventional', 0.043), ('situations', 0.04), ('databases', 0.04), ('baraniuk', 0.04), ('duarte', 0.04), ('transmission', 0.04), ('compressive', 0.04), ('reduction', 0.039), ('transmit', 0.037), ('processor', 0.037), ('adaptive', 0.037), ('geometric', 0.036), ('points', 0.036), ('highdimensional', 0.035), ('pairwise', 0.035), ('preserved', 0.033), ('estimate', 0.033), ('compressed', 0.032), ('silva', 0.032), ('camera', 0.032), ('logarithmic', 0.032), ('produced', 0.031), ('cs', 0.031), ('condition', 0.031), ('metric', 0.03), ('correlation', 0.03), ('increasing', 0.029), ('fix', 0.029), ('indistinguishable', 0.029), ('attention', 0.028), ('euclidean', 0.028), ('acquisition', 0.028), ('compression', 0.028), ('theorem', 0.027), ('underlying', 0.027), ('variation', 0.027), ('storage', 0.025), ('stands', 0.025), ('weakly', 0.025), ('freedom', 0.025), ('ensures', 0.025), ('signal', 0.024), ('volume', 0.024), ('store', 0.024), ('contributions', 0.024), ('accuracy', 0.024), ('original', 0.024), ('belonging', 0.024), ('sample', 0.023), ('tion', 0.022), ('structure', 0.022), ('baron', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 161 nips-2007-Random Projections for Manifold Learning
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
2 0.41902623 107 nips-2007-Iterative Non-linear Dimensionality Reduction with Manifold Sculpting
Author: Michael Gashler, Dan Ventura, Tony Martinez
Abstract: Many algorithms have been recently developed for reducing dimensionality by projecting data onto an intrinsic non-linear manifold. Unfortunately, existing algorithms often lose significant precision in this transformation. Manifold Sculpting is a new algorithm that iteratively reduces dimensionality by simulating surface tension in local neighborhoods. We present several experiments that show Manifold Sculpting yields more accurate results than existing algorithms with both generated and natural data-sets. Manifold Sculpting is also able to benefit from both prior dimensionality reduction efforts. 1
3 0.26075992 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only the answer is yes, but that about as few as a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topologyextraction approaches and show how having the two-dimensional topology can be exploited.
4 0.17092717 116 nips-2007-Learning the structure of manifolds using random projections
Author: Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, Nakul Verma
Abstract: We present a simple variant of the k-d tree which automatically adapts to intrinsic low dimensional structure in data. 1
5 0.16731267 186 nips-2007-Statistical Analysis of Semi-Supervised Regression
Author: Larry Wasserman, John D. Lafferty
Abstract: Semi-supervised methods use unlabeled data in addition to labeled data to construct predictors. While existing semi-supervised methods have shown some promising empirical performance, their development has been based largely based on heuristics. In this paper we study semi-supervised learning from the viewpoint of minimax theory. Our first result shows that some common methods based on regularization using graph Laplacians do not lead to faster minimax rates of convergence. Thus, the estimators that use the unlabeled data do not have smaller risk than the estimators that use only labeled data. We then develop several new approaches that provably lead to improved performance. The statistical tools of minimax analysis are thus used to offer some new perspective on the problem of semi-supervised learning. 1
6 0.14597172 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
8 0.11421918 10 nips-2007-A Randomized Algorithm for Large Scale Support Vector Learning
9 0.11057498 104 nips-2007-Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes
10 0.10362621 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
11 0.096974388 170 nips-2007-Robust Regression with Twinned Gaussian Processes
12 0.084416091 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
13 0.083188884 32 nips-2007-Bayesian Co-Training
14 0.078229889 66 nips-2007-Density Estimation under Independent Similarly Distributed Sampling Assumptions
15 0.074246339 212 nips-2007-Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes
16 0.07097061 135 nips-2007-Multi-task Gaussian Process Prediction
17 0.061244499 53 nips-2007-Compressed Regression
18 0.060724668 16 nips-2007-A learning framework for nearest neighbor search
19 0.059672952 36 nips-2007-Better than least squares: comparison of objective functions for estimating linear-nonlinear models
20 0.059469804 175 nips-2007-Semi-Supervised Multitask Learning
topicId topicWeight
[(0, -0.206), (1, 0.08), (2, -0.064), (3, 0.083), (4, -0.061), (5, 0.108), (6, -0.163), (7, 0.088), (8, -0.037), (9, 0.338), (10, -0.204), (11, 0.428), (12, -0.07), (13, 0.117), (14, 0.191), (15, -0.143), (16, -0.078), (17, 0.029), (18, -0.08), (19, -0.009), (20, -0.068), (21, -0.093), (22, -0.132), (23, 0.034), (24, 0.015), (25, 0.002), (26, 0.043), (27, 0.0), (28, 0.105), (29, 0.053), (30, 0.023), (31, -0.047), (32, -0.089), (33, -0.005), (34, 0.016), (35, -0.027), (36, 0.044), (37, -0.035), (38, 0.04), (39, -0.04), (40, 0.003), (41, -0.005), (42, -0.011), (43, -0.002), (44, 0.013), (45, 0.066), (46, -0.022), (47, -0.002), (48, 0.056), (49, 0.055)]
simIndex simValue paperId paperTitle
same-paper 1 0.96846753 161 nips-2007-Random Projections for Manifold Learning
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
2 0.93866432 107 nips-2007-Iterative Non-linear Dimensionality Reduction with Manifold Sculpting
Author: Michael Gashler, Dan Ventura, Tony Martinez
Abstract: Many algorithms have been recently developed for reducing dimensionality by projecting data onto an intrinsic non-linear manifold. Unfortunately, existing algorithms often lose significant precision in this transformation. Manifold Sculpting is a new algorithm that iteratively reduces dimensionality by simulating surface tension in local neighborhoods. We present several experiments that show Manifold Sculpting yields more accurate results than existing algorithms with both generated and natural data-sets. Manifold Sculpting is also able to benefit from both prior dimensionality reduction efforts. 1
3 0.6656189 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only the answer is yes, but that about as few as a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topologyextraction approaches and show how having the two-dimensional topology can be exploited.
4 0.49158719 116 nips-2007-Learning the structure of manifolds using random projections
Author: Yoav Freund, Sanjoy Dasgupta, Mayank Kabra, Nakul Verma
Abstract: We present a simple variant of the k-d tree which automatically adapts to intrinsic low dimensional structure in data. 1
5 0.47163162 186 nips-2007-Statistical Analysis of Semi-Supervised Regression
Author: Larry Wasserman, John D. Lafferty
Abstract: Semi-supervised methods use unlabeled data in addition to labeled data to construct predictors. While existing semi-supervised methods have shown some promising empirical performance, their development has been based largely based on heuristics. In this paper we study semi-supervised learning from the viewpoint of minimax theory. Our first result shows that some common methods based on regularization using graph Laplacians do not lead to faster minimax rates of convergence. Thus, the estimators that use the unlabeled data do not have smaller risk than the estimators that use only labeled data. We then develop several new approaches that provably lead to improved performance. The statistical tools of minimax analysis are thus used to offer some new perspective on the problem of semi-supervised learning. 1
7 0.3631883 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
8 0.3207123 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
9 0.31257361 16 nips-2007-A learning framework for nearest neighbor search
10 0.30922833 49 nips-2007-Colored Maximum Variance Unfolding
11 0.28170738 10 nips-2007-A Randomized Algorithm for Large Scale Support Vector Learning
12 0.25632364 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
13 0.25126776 43 nips-2007-Catching Change-points with Lasso
14 0.24925539 104 nips-2007-Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes
15 0.24602465 53 nips-2007-Compressed Regression
16 0.23415971 170 nips-2007-Robust Regression with Twinned Gaussian Processes
17 0.22619598 8 nips-2007-A New View of Automatic Relevance Determination
18 0.21413663 32 nips-2007-Bayesian Co-Training
19 0.20669688 66 nips-2007-Density Estimation under Independent Similarly Distributed Sampling Assumptions
20 0.20104854 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)
topicId topicWeight
[(5, 0.029), (13, 0.028), (16, 0.03), (18, 0.014), (19, 0.045), (21, 0.077), (34, 0.011), (35, 0.031), (47, 0.111), (49, 0.03), (81, 0.238), (83, 0.145), (85, 0.021), (87, 0.026), (90, 0.074)]
simIndex simValue paperId paperTitle
same-paper 1 0.82261956 161 nips-2007-Random Projections for Manifold Learning
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
2 0.67866063 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only the answer is yes, but that about as few as a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topologyextraction approaches and show how having the two-dimensional topology can be exploited.
3 0.65847093 24 nips-2007-An Analysis of Inference with the Universum
Author: Olivier Chapelle, Alekh Agarwal, Fabian H. Sinz, Bernhard Schölkopf
Abstract: We study a pattern classification algorithm which has recently been proposed by Vapnik and coworkers. It builds on a new inductive principle which assumes that in addition to positive and negative data, a third class of data is available, termed the Universum. We assay the behavior of the algorithm by establishing links with Fisher discriminant analysis and oriented PCA, as well as with an SVM in a projected subspace (or, equivalently, with a data-dependent reduced kernel). We also provide experimental results. 1
4 0.65758902 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
Author: Gwenn Englebienne, Tim Cootes, Magnus Rattray
Abstract: The present work aims to model the correspondence between facial motion and speech. The face and sound are modelled separately, with phonemes being the link between both. We propose a sequential model and evaluate its suitability for the generation of the facial animation from a sequence of phonemes, which we obtain from speech. We evaluate the results both by computing the error between generated sequences and real video, as well as with a rigorous double-blind test with human subjects. Experiments show that our model compares favourably to other existing methods and that the sequences generated are comparable to real video sequences. 1
5 0.65625232 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)
Author: John Wright, Yangyu Tao, Zhouchen Lin, Yi Ma, Heung-yeung Shum
Abstract: We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We prove asymptotic optimality of this criterion for Gaussian data and analyze its relationships to classical classifiers. Theoretical results provide new insights into relationships among popular classifiers such as MAP and RDA, as well as unsupervised clustering methods based on lossy compression [13]. Minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small-sample setting. Compression also provides a uniform means of handling classes of varying dimension. This simple classification criterion and its kernel and local versions perform competitively against existing classifiers on both synthetic examples and real imagery data such as handwritten digits and human faces, without requiring domain-specific information. 1
6 0.65533215 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
7 0.65411794 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
8 0.65261793 156 nips-2007-Predictive Matrix-Variate t Models
9 0.65247935 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
10 0.65125072 63 nips-2007-Convex Relaxations of Latent Variable Training
11 0.65096569 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
12 0.6507811 154 nips-2007-Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression
13 0.65036213 180 nips-2007-Sparse Feature Learning for Deep Belief Networks
14 0.65003514 43 nips-2007-Catching Change-points with Lasso
15 0.64960128 186 nips-2007-Statistical Analysis of Semi-Supervised Regression
16 0.64925373 158 nips-2007-Probabilistic Matrix Factorization
17 0.64838189 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation
18 0.64781076 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
19 0.64749962 107 nips-2007-Iterative Non-linear Dimensionality Reduction with Manifold Sculpting
20 0.64664245 148 nips-2007-Online Linear Regression and Its Application to Model-Based Reinforcement Learning