nips nips2007 nips2007-115 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only is the answer yes, but that as few as about a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topology-extraction approaches and show how having the two-dimensional topology can be exploited.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? [sent-11, score-0.285]
2 If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? [sent-12, score-0.787]
3 For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? [sent-13, score-0.88]
4 The surprising result presented here is that not only is the answer yes, but that as few as about a thousand images are enough to approximately recover the relative locations of about a thousand pixels. [sent-14, score-0.487]
5 This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. [sent-15, score-0.794]
6 We compare different topology-extraction approaches and show how having the two-dimensional topology can be exploited. [sent-16, score-0.175]
7 1 Introduction Machine learning has been applied to a number of tasks involving an input domain with a special topology: one-dimensional for sequences, two-dimensional for images, three-dimensional for videos and for 3-D capture. [sent-17, score-0.102]
8 On the other hand, other learning algorithms successfully exploit the specific topology of their input, e.g. [sent-21, score-0.175]
9 It has been conjectured [8, 2] that the two-dimensional structure of natural images is a very strong prior that would require a huge number of bits to specify, if starting from the completely uniform prior over all possible permutations. [sent-24, score-0.231]
10 The question studied here is the following: is the two-dimensional structure of natural images a very strong prior or is it something that can be learned with a few examples? [sent-25, score-0.261]
11 If a small number of examples is enough to discover that structure, then the conjecture in [8] about the image topology was probably incorrect. [sent-26, score-0.427]
12 To answer that question we consider a hypothetical learning task involving images whose pixels have been permuted in a fixed but unknown way. [sent-27, score-0.655]
13 Could we recover the two-dimensional relations between pixels automatically? [sent-28, score-0.457]
14 The basic idea of the paper is that the two-dimensional topology of pixels can be recovered by looking for a two-dimensional manifold embedding pixels (each pixel is a point in that space), such that nearby pixels have similar distributions of intensity (and possibly color) values. [sent-31, score-2.203]
15 We explore a number of manifold techniques with this goal in mind, and explain how we have adapted these techniques in order to obtain the positive and surprising result: the two-dimensional structure of pixels can be recovered from a rather small number of training images. [sent-32, score-0.546]
16 On images we find that the first 2 dimensions are dominant, meaning that even the knowledge that 2 dimensions are most appropriate could probably be inferred from the data. [sent-33, score-0.216]
17 Here we are concerned with 2-dimensional structures, and our experiments have been performed with images of size 27 × 27 to 30 × 30, i.e. between 729 and 900 pixels. [sent-36, score-0.21]
18 This means that we have to look for the embedding of about a thousand points (the pixels) on a two-dimensional manifold. [sent-39, score-0.508]
19 Metric Multi-Dimensional Scaling (MDS) is a linear embedding technique (analogous to PCA, but starting from distances and yielding coordinates along the principal directions of maximum variance). [sent-40, score-0.748]
20 Since we found Isomap to work best to recover the pixel topology even on small sets of images, we review the basic elements of Isomap. [sent-43, score-0.473]
21 It applies the metric multidimensional scaling (MDS) algorithm to geodesic distances in the neighborhood graph. [sent-44, score-0.246]
22 The neighborhood graph is obtained by connecting the k nearest neighbors of each point. [sent-45, score-0.222]
23 Each arc of the graph is associated with a distance (the user-provided distance between the points it connects), and the geodesic distance on the manifold between two points is approximated by the length of the shortest path between them in the graph. [sent-46, score-0.434]
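To make the neighborhood-graph construction above concrete, here is a minimal NumPy/SciPy sketch of computing approximate geodesic distances from a pairwise distance matrix; the function name, the choice k = 4 (the value used in the experiments below) and the symmetrization step are our own framing, not the authors' code.

import numpy as np
from scipy.sparse.csgraph import shortest_path

def geodesic_distances(D, k=4):
    # D: dense n x n matrix of pairwise "distances" between points (here, pixels).
    # Connect each point to its k nearest neighbors; non-edges are marked with inf.
    n = D.shape[0]
    W = np.full((n, n), np.inf)
    for i in range(n):
        nn = np.argsort(D[i])[1:k + 1]   # skip the point itself
        W[i, nn] = D[i, nn]
    W = np.minimum(W, W.T)               # symmetrize the neighborhood graph
    # Geodesic distance = length of the shortest path in the graph (Dijkstra).
    return shortest_path(W, method='D', directed=False)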
24 The metric MDS algorithm then transforms these distances into d-dimensional coordinates as follows. [sent-47, score-0.311]
25 This yields the coordinates xik = vki √λk, where xik is the k-th embedding coordinate of point i, and λk and vk are the k-th largest eigenvalue of the double-centered matrix of squared distances and its associated eigenvector. [sent-53, score-0.427]
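For concreteness, a small sketch of the metric MDS step just described (double centering of the squared distances, then scaling the top eigenvectors by the square root of their eigenvalues); this follows the textbook derivation rather than the authors' implementation.

import numpy as np

def classical_mds(G, d=2):
    # G: n x n matrix of pairwise (geodesic) distances; returns n x d coordinates.
    n = G.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (G ** 2) @ J                # double-centered matrix of squared distances
    lam, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:d]            # keep the d largest eigenvalues
    lam, V = lam[idx], V[:, idx]
    return V * np.sqrt(np.maximum(lam, 0.0))   # x_ik = v_ki * sqrt(lambda_k)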
26 3 Topology-Discovery Algorithms In order to apply a manifold learning algorithm, we must generally have a notion of similarity or distance between the points to embed. [sent-54, score-0.239]
27 Here each point corresponds to a pixel, and the data we have about the pixels provide an empirical distribution of intensities for all pixels. [sent-55, score-0.467]
28 A simple and natural dependency statistic is the correlation between pixel intensities, and it works very well. [sent-57, score-0.272]
29 The empirical correlation ρij between the intensity of pixel i and pixel j is in the interval [−1, 1]. [sent-58, score-0.53]
30 However, two highly anti-correlated pixels are much more likely to be close to each other than two uncorrelated pixels (think of edges in an image). [sent-59, score-0.738]
31 If we assume them to be the values of a Gaussian kernel, |ρij| = K(xi, xj) = e^(−(1/2)||xi − xj||²), then by defining Dij = ||xi − xj|| and solving the above for Dij we obtain a “distance” formula that can be used with the manifold learning algorithms: Dij = √(−2 log |ρij|) (1). [sent-61, score-0.182]
32 Note that scaling the distances in the Gaussian kernel by a variance parameter would only scale the resulting embedding, so it is unnecessary. [sent-62, score-0.076]
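A minimal sketch of the pseudo-distance of Equation (1), computed from an n × N data matrix whose rows are images and whose columns are pixels; the clipping that guards against log 0 for (nearly) uncorrelated pixels is our addition.

import numpy as np

def correlation_distance(X, eps=1e-12):
    # Pairwise pseudo-distance D_ij = sqrt(-2 log |rho_ij|) between input variables (pixels).
    rho = np.corrcoef(X, rowvar=False)       # N x N matrix of pixel-pixel correlations
    a = np.clip(np.abs(rho), eps, 1.0)       # avoid log(0) when pixels are uncorrelated
    D = np.sqrt(-2.0 * np.log(a))
    np.fill_diagonal(D, 0.0)
    return D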
33 Many other measures of distance would probably work as well. [sent-63, score-0.095]
34 However, we found the absolute correlation to be simple and easy to understand while yielding nice embeddings. [sent-64, score-0.064]
35 3.1 Dealing With Low-Variance Pixels: A difficulty we observed when experimenting with different manifold learning algorithms on data sets such as MNIST is the influence of low-variance pixels. [sent-66, score-0.145]
36 On MNIST digit images the border pixels may have 0 or very small variance. [sent-67, score-0.55]
37 This makes them all want to be close to each other, which tends to fold the manifold on itself. [sent-68, score-0.145]
38 To handle this problem we have simply ignored pixels with very low variance. [sent-69, score-0.369]
39 In the experiments with MNIST we removed pixels with standard deviation less than 15% of the maximum standard deviation (maximum over all pixels). [sent-71, score-0.423]
40 On the NORB dataset, which has varied backgrounds, this step does not remove any of the pixels (so it is unnecessary). [sent-72, score-0.395]
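The low-variance filtering described above is straightforward; a sketch (the 15% threshold is the value quoted for MNIST):

import numpy as np

def keep_high_variance_pixels(X, rel_threshold=0.15):
    # Keep only the columns (pixels) whose standard deviation is at least
    # rel_threshold times the maximum standard deviation over all pixels.
    std = X.std(axis=0)
    keep = std >= rel_threshold * std.max()
    return X[:, keep], np.flatnonzero(keep)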
41 4 Converting Back to a Grid Image: Once we have obtained an embedding for the pixels, the next step is to transform the data vectors back into images. [sent-73, score-0.427]
42 This is done by: 1. choosing horizontal and vertical axes (since the coordinates on the manifold can be arbitrarily rotated) and rotating the embedding coordinates accordingly, and [sent-75, score-1.111]
43 2. transforming the input vector of intensity values (along with the pixel coordinates) into an ordinary discrete image on a grid. [sent-76, score-0.538]
44 This should be done so that the resulting intensity at position (i, j) is close to the intensity values associated with input pixels whose embedding coordinates are near (i, j). [sent-77, score-1.304]
45 Such a mapping of pixels to a grid has already been done in [4], where a grid topology is defined by the connections in a graphical model, which is then trained by maximizing the approximate likelihood. [sent-78, score-0.764]
46 Let pk (k = 1, . . . , N) be the embedding coordinates found by the dimensionality reduction algorithm for the k-th input variable. [sent-83, score-0.776]
47 We select the horizontal axis as the direction of smaller spread, the vertical axis being in the orthogonal direction, and perform the appropriate rotation. [sent-84, score-0.099]
48 The output image pixel intensity Mi,j at coordinates (i, j) is obtained through a convex average Mi,j = Σk wi,j,k xk (2), where the weights are non-negative and sum to one, and are chosen as follows. [sent-86, score-0.73]
49 Large values of γ correspond to using only the nearest neighbor of (i, j) among the pk's. [sent-89, score-0.122]
50 Smaller values smooth the intensities and make the output look better if the embedding is not perfect. [sent-90, score-0.572]
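Algorithms 1 and 2 below give the authors' pseudocode for these two steps; the following sketch is one plausible NumPy rendering of the same idea, except that it weights all embedded pixels by e^(−γ · L1 distance) instead of restricting the sum to an expanding L1 ball of neighbors.

import numpy as np

def grid_weights(P, H, W, gamma=1.0):
    # P: N x 2 array of (rotated) embedding coordinates, one row per input pixel.
    # Returns w[i, j, k]: convex weights mapping input pixel k to grid cell (i, j).
    weights = np.zeros((H, W, P.shape[0]))
    for i in range(H):
        for j in range(W):
            d = np.abs(P - np.array([i, j])).sum(axis=1)  # L1 distance to cell (i, j)
            v = np.exp(-gamma * d)
            weights[i, j] = v / v.sum()                   # non-negative, sum to one
    return weights

def to_grid_image(x, weights):
    # Convolve a raw input vector x (length N) into an H x W image, as in Eq. (2).
    return np.tensordot(weights, x, axes=([2], [0]))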
51 Algorithm 1 inputs: X {raw input n × N data matrix, one row per example, with elements in fixed but arbitrary order}; δ = 0.15 {relative standard-deviation threshold, cf. Section 3.1}. [sent-93, score-0.134]
52 for all grid positions (i, j) in (1 . . . H) × (1 . . . W) do: r ← 1; repeat neighbors ← {k : ||pk − (i, j)||1 < r}; r ← r + 1; until neighbors is not empty; for all k in neighbors do vk ← e^(−γ ||pk − (i, j)||1) end for; [sent-105, score-0.283]
53 wi,j,· ← 0; for all k in neighbors do wi,j,k = vi,j,k / Σk vi,j,k {compute convolution weights} end for; end for. Algorithm 2: Convolve a raw input vector into a regular grid image, using the already discovered embedding for each input variable. [sent-106, score-1.08]
54 for all grid positions (i, j) in (1 . . . H) × (1 . . . W) do Yi,j ← Σk wi,j,k xk {perform the convolution} end for. 5 Experimental Results: We performed experiments on two sets of images: the MNIST digits dataset and the NORB object classification dataset. [sent-113, score-0.109]
55 We used the “jittered objects and cluttered background” image set from NORB. [sent-114, score-0.13]
56 The MNIST images are particular in that they have a white background, whereas the NORB images have more varying backgrounds. [sent-115, score-0.362]
57 The NORB images are originally of dimension 108 × 108; we subsampled them by 4 × 4 averaging into 27 × 27 images. [sent-116, score-0.181]
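The 4 × 4 averaging used for NORB amounts to non-overlapping block averaging; a generic sketch (not the authors' preprocessing script):

import numpy as np

def block_average(img, b=4):
    # Downsample a (b*h) x (b*w) image to h x w by averaging non-overlapping b x b blocks,
    # e.g. 108 x 108 -> 27 x 27 with b = 4.
    h, w = img.shape[0] // b, img.shape[1] // b
    return img[:h * b, :w * b].reshape(h, b, w, b).mean(axis=(1, 3))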
58 The experiments have been performed with k = 4 neighbors for the Isomap embedding. [sent-117, score-0.104]
59 Smaller values of k often led to disconnected neighborhood graphs, which Isomap cannot handle. [sent-118, score-0.066]
60 Figure 1: Examples of embeddings discovered by Isomap, LLE, MDS and MVU with 250 training images from NORB; panels: (a) Isomap embedding, (b) LLE embedding, (c) MDS embedding, (d) MVU embedding. [sent-119, score-2.047]
61 Each of the original pixels is placed at the location discovered by the algorithm. [sent-120, score-0.324]
62 The size of the circle and the gray level indicate the original (true) location of the pixel. [sent-121, score-0.053]
63 In Figure 1 we compare four different manifold learning algorithms on the NORB images: Isomap, LLE, MDS and MVU. [sent-124, score-0.145]
64 On the one hand, MDS uses the pseudo-distance defined in Equation 1, whose relationship with the true distance between two pixels in the original image is linear only in a small neighborhood. [sent-126, score-0.583]
65 On the other hand, Isomap uses the geodesic distances in the neighborhood graph, whose relationship with the true distance is very close to linear. [sent-127, score-0.255]
66 (b) and (d): Geodesic distance in the neighborhood graph vs. the true distance. [sent-130, score-0.154]
67 The true distance is on the horizontal axis for all figures. [sent-132, score-0.125]
68 Figure 3 shows the embeddings obtained on the NORB data using different numbers of examples. [sent-134, score-0.068]
69 This minimization is justified because the discovered embedding could be arbitrarily rotated, isotropically scaled, and mirrored. [sent-136, score-0.517]
70 100 examples are enough to get a reasonable embedding, and with 2000 or more a very good embedding is obtained: the RMSE for 2000 examples is 1.13, [sent-137, score-0.527]
71 meaning that, in expectation, each pixel is off by slightly more than one pixel. [sent-138, score-0.21]
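The alignment underlying these RMSE numbers allows rotation, mirroring, isotropic scaling and translation of the discovered embedding; one standard way to compute it is an orthogonal Procrustes fit, sketched below (this is our reconstruction of the evaluation, not necessarily the authors' exact procedure).

import numpy as np

def aligned_rmse(P, Q):
    # P: N x 2 discovered embedding; Q: N x 2 true grid coordinates.
    # Fit the similarity transform (rotation/reflection, scale, translation) of P onto Q,
    # then return the root-mean-square Euclidean error per pixel.
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, s, Vt = np.linalg.svd(Qc.T @ Pc)     # SVD of the 2 x 2 cross-covariance
    A = Vt.T @ U.T                          # optimal orthogonal map (may include a reflection)
    scale = s.sum() / (Pc ** 2).sum()       # optimal isotropic scale
    P_aligned = scale * Pc @ A + Q.mean(axis=0)
    return np.sqrt(((P_aligned - Q) ** 2).sum(axis=1).mean())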
72 Figure 3: Embedding discovered by Isomap on the NORB dataset, with different numbers of training samples (top row). [sent-147, score-0.14]
73 The second row shows the same embeddings aligned (by a similarity transformation) on the original grid; the third row shows the residual error (RMSE) after the alignment. [sent-148, score-0.265]
74 Figure 4 shows the whole process of transforming an original image (with pixels possibly permuted) into an embedded image and finally into a reconstructed image as per algorithms 1 and 2. [sent-149, score-0.858]
75 Figure 4: Example of the process of transforming an MNIST image (top), whose pixel order is unknown (second row), into its embedding (third row), and finally reconstructing it as an image after rotation and convolution (bottom). [sent-150, score-1.052]
76 In the third row, we show the intensity associated with each original pixel by the gray level of a circle located at the pixel coordinates discovered by Isomap. [sent-151, score-0.882]
77 We also performed experiments with acoustic spectral data to see if the time-frequency topology can be recovered. [sent-152, score-0.24]
78 The acoustic data come from the first 100 blues pieces of a publicly available genre classification dataset [14]. [sent-153, score-0.176]
79 Figure 5 shows the resulting embedding when we removed the 30 coordinates of lowest standard deviation (δ = . [sent-161, score-0.694]
80 Figure 5: Embedding and spectrum decay for sequences of blues music; panels: (a) blues embedding, (b) spectrum. [sent-167, score-0.529]
81 6 Discussion Although [8] argue that learning the right permutation of pixels with a flat prior might be too difficult (either in a lifetime or through evolution), our results suggest otherwise. [sent-168, score-0.394]
82 The main element of explanation that we see is that the space of permutations of d numbers is not such a large class of functions. [sent-170, score-0.073]
83 There are approximately N = √(2πd) (d/e)^d permutations of d numbers (Stirling approximation). [sent-171, score-0.073]
84 Hence if we had a bounded criterion (say taking values in [0, 1]) to compare different permutations and we used n examples (i.e. [sent-173, score-0.123]
85 When d = 400 (the number of pixels with non-negligible variance in MNIST images), d log d − d ≈ 2000. [sent-177, score-0.369]
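This back-of-the-envelope count is easy to check; the snippet below (our own illustration) evaluates the Stirling estimate log N! ≈ d log d − d, in nats, for d = 400.

import numpy as np

d = 400
# Stirling's approximation: log(d!) ≈ d*log(d) - d (natural logarithm),
# i.e. the information needed to single out one permutation of d elements.
print(d * np.log(d) - d)   # prints roughly 1996.6, i.e. about 2000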
86 What is the selection criterion that we have used to recover the image structure? [sent-179, score-0.218]
87 Mainly we have used an additional prior which gives a preference to an order for which nearby pixels have similar distributions. [sent-180, score-0.454]
88 How specific to natural images and how strong is that prior? [sent-181, score-0.181]
89 When we are trying to compute useful functions from raw data, it is important to discover dependencies between the input random variables. [sent-183, score-0.227]
90 In the case of brains, that directly gives rise to the notion of local connectivity between neurons associated with nearby spatial locations; the same notion is exploited in convolutional neural networks. [sent-185, score-0.096]
91 The fact that nearby pixels are more correlated is true at many scales in natural images. [sent-186, score-0.429]
92 The way in which we score permutations is not the way that one would score functions in an ordinary learning experiment. [sent-191, score-0.165]
93 Indeed, by using the distributional similarity between pairs of pixels, we get not just a scalar score but d(d−1)/2 scores. [sent-192, score-0.095]
94 7 Conclusion and Future Work: We showed here that, even with a small number of examples, we are able to recover almost perfectly the 2-D topology of images. [sent-194, score-0.263]
95 This allows us to use image-specific learning algorithms without specifying any prior other than the dimensionality of the coordinates. [sent-195, score-0.069]
96 We also showed that this algorithm performed well on sound data, even though the topology might be less obvious in that case. [sent-196, score-0.204]
97 One could easily apply this algorithm to data for which the intrinsic dimensionality of the coordinates is unknown. [sent-198, score-0.272]
98 In that case, one would not convert the embedding to a grid image but rather keep it and connect only the inputs associated with close coordinates (using a k-nearest-neighbor rule, for instance). [sent-199, score-0.935]
99 It is not known if such an embedding might be useful for other types of data than the ones discussed above. [sent-200, score-0.427]
100 Unsupervised classification, segmentation and enhancement of images using ICA mixture models. [sent-268, score-0.223]
wordName wordTfidf (topN-words)
[('embedding', 0.427), ('pixels', 0.369), ('isomap', 0.284), ('coordinates', 0.24), ('pixel', 0.21), ('images', 0.181), ('norb', 0.179), ('topology', 0.175), ('mds', 0.17), ('dij', 0.153), ('manifold', 0.145), ('image', 0.13), ('grid', 0.11), ('mnist', 0.107), ('intensities', 0.098), ('lle', 0.095), ('pk', 0.094), ('discovered', 0.09), ('recover', 0.088), ('raw', 0.084), ('thousand', 0.081), ('geodesic', 0.081), ('convolution', 0.08), ('intensity', 0.079), ('input', 0.077), ('blues', 0.077), ('neighbors', 0.075), ('permutations', 0.073), ('embeddings', 0.068), ('mvu', 0.067), ('neighborhood', 0.066), ('nearby', 0.06), ('distance', 0.06), ('row', 0.057), ('permuted', 0.054), ('montreal', 0.054), ('audio', 0.053), ('default', 0.052), ('lecun', 0.051), ('examples', 0.05), ('distances', 0.048), ('transforming', 0.047), ('output', 0.047), ('dimensionnality', 0.044), ('rmse', 0.044), ('ica', 0.042), ('ordinary', 0.042), ('xti', 0.041), ('bengio', 0.04), ('ij', 0.04), ('genre', 0.038), ('formula', 0.037), ('discover', 0.037), ('distributional', 0.036), ('acoustic', 0.036), ('convolutional', 0.036), ('topographic', 0.036), ('probably', 0.035), ('axis', 0.034), ('similarity', 0.034), ('yielding', 0.033), ('position', 0.033), ('rotated', 0.033), ('dimensionality', 0.032), ('surprising', 0.032), ('dependency', 0.031), ('horizontal', 0.031), ('correlation', 0.031), ('end', 0.03), ('speech', 0.03), ('eigenvalues', 0.03), ('le', 0.03), ('performed', 0.029), ('something', 0.029), ('circle', 0.029), ('trying', 0.029), ('scaling', 0.028), ('reconstructed', 0.028), ('axes', 0.028), ('vk', 0.028), ('graph', 0.028), ('nearest', 0.028), ('bottou', 0.027), ('welling', 0.027), ('deviation', 0.027), ('remove', 0.026), ('question', 0.026), ('spectrum', 0.025), ('residual', 0.025), ('score', 0.025), ('involving', 0.025), ('dataset', 0.025), ('prior', 0.025), ('background', 0.025), ('connecting', 0.025), ('original', 0.024), ('weights', 0.024), ('locations', 0.024), ('nally', 0.023), ('metric', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only is the answer yes, but that as few as about a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topology-extraction approaches and show how having the two-dimensional topology can be exploited.
2 0.26075992 161 nips-2007-Random Projections for Manifold Learning
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
3 0.2269952 107 nips-2007-Iterative Non-linear Dimensionality Reduction with Manifold Sculpting
Author: Michael Gashler, Dan Ventura, Tony Martinez
Abstract: Many algorithms have been recently developed for reducing dimensionality by projecting data onto an intrinsic non-linear manifold. Unfortunately, existing algorithms often lose significant precision in this transformation. Manifold Sculpting is a new algorithm that iteratively reduces dimensionality by simulating surface tension in local neighborhoods. We present several experiments that show Manifold Sculpting yields more accurate results than existing algorithms with both generated and natural data-sets. Manifold Sculpting is also able to benefit from both prior dimensionality reduction efforts. 1
4 0.18189901 49 nips-2007-Colored Maximum Variance Unfolding
Author: Le Song, Arthur Gretton, Karsten M. Borgwardt, Alex J. Smola
Abstract: Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical dependence measure which aims to retain the identity of individual observations under the distancepreserving constraints. This general view allows us to design “colored” variants of MVU, which produce low-dimensional representations for a given task, e.g. subject to class labels or other side information. 1
5 0.11594935 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
Author: Francois Meyer, Greg Stephens
Abstract: Functional Magnetic Resonance Imaging (fMRI) provides dynamical access into the complex functioning of the human brain, detailing the hemodynamic activity of thousands of voxels during hundreds of sequential time points. One approach towards illuminating the connection between fMRI and cognitive function is through decoding; how do the time series of voxel activities combine to provide information about internal and external experience? Here we seek models of fMRI decoding which are balanced between the simplicity of their interpretation and the effectiveness of their prediction. We use signals from a subject immersed in virtual reality to compare global and local methods of prediction applying both linear and nonlinear techniques of dimensionality reduction. We find that the prediction of complex stimuli is remarkably low-dimensional, saturating with less than 100 features. In particular, we build effective models based on the decorrelated components of cognitive activity in the classically-defined Brodmann areas. For some of the stimuli, the top predictive areas were surprisingly transparent, including Wernicke’s area for verbal instructions, visual cortex for facial and body features, and visual-temporal regions for velocity. Direct sensory experience resulted in the most robust predictions, with the highest correlation (c ∼ 0.8) between the predicted and experienced time series of verbal instructions. Techniques based on non-linear dimensionality reduction (Laplacian eigenmaps) performed similarly. The interpretability and relative simplicity of our approach provides a conceptual basis upon which to build more sophisticated techniques for fMRI decoding and offers a window into cognitive function during dynamic, natural experience. 1
6 0.10904994 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images
7 0.093379207 143 nips-2007-Object Recognition by Scene Alignment
8 0.089716725 181 nips-2007-Sparse Overcomplete Latent Variable Decomposition of Counts Data
9 0.088023014 109 nips-2007-Kernels on Attributed Pointsets with Applications
10 0.073264785 116 nips-2007-Learning the structure of manifolds using random projections
11 0.072394706 180 nips-2007-Sparse Feature Learning for Deep Belief Networks
12 0.070634812 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection
13 0.070354022 111 nips-2007-Learning Horizontal Connections in a Sparse Coding Model of Natural Images
14 0.0690393 153 nips-2007-People Tracking with the Laplacian Eigenmaps Latent Variable Model
15 0.066459268 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
16 0.066264473 193 nips-2007-The Distribution Family of Similarity Distances
17 0.066054761 183 nips-2007-Spatial Latent Dirichlet Allocation
18 0.064552344 81 nips-2007-Estimating disparity with confidence from energy neurons
19 0.064417101 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
20 0.063620843 145 nips-2007-On Sparsity and Overcompleteness in Image Models
topicId topicWeight
[(0, -0.199), (1, 0.097), (2, -0.047), (3, -0.027), (4, -0.026), (5, 0.125), (6, -0.07), (7, 0.234), (8, 0.02), (9, 0.253), (10, -0.063), (11, 0.299), (12, -0.027), (13, 0.113), (14, 0.062), (15, -0.077), (16, -0.107), (17, 0.033), (18, -0.061), (19, 0.059), (20, -0.031), (21, 0.042), (22, -0.069), (23, -0.049), (24, -0.152), (25, -0.019), (26, 0.063), (27, 0.009), (28, 0.025), (29, 0.037), (30, -0.043), (31, -0.021), (32, 0.012), (33, -0.087), (34, 0.05), (35, 0.016), (36, 0.042), (37, 0.078), (38, 0.092), (39, 0.016), (40, -0.101), (41, 0.067), (42, -0.114), (43, 0.083), (44, -0.114), (45, -0.047), (46, -0.013), (47, 0.001), (48, -0.059), (49, 0.107)]
simIndex simValue paperId paperTitle
same-paper 1 0.95984817 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only is the answer yes, but that as few as about a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topology-extraction approaches and show how having the two-dimensional topology can be exploited.
2 0.82679081 107 nips-2007-Iterative Non-linear Dimensionality Reduction with Manifold Sculpting
Author: Michael Gashler, Dan Ventura, Tony Martinez
Abstract: Many algorithms have been recently developed for reducing dimensionality by projecting data onto an intrinsic non-linear manifold. Unfortunately, existing algorithms often lose significant precision in this transformation. Manifold Sculpting is a new algorithm that iteratively reduces dimensionality by simulating surface tension in local neighborhoods. We present several experiments that show Manifold Sculpting yields more accurate results than existing algorithms with both generated and natural data-sets. Manifold Sculpting is also able to benefit from both prior dimensionality reduction efforts. 1
3 0.71306878 161 nips-2007-Random Projections for Manifold Learning
Author: Chinmay Hegde, Michael Wakin, Richard Baraniuk
Abstract: We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we show that with a small number M of random projections of sample points in RN belonging to an unknown K-dimensional Euclidean manifold, the intrinsic dimension (ID) of the sample set can be estimated to high accuracy. Second, we rigorously prove that using only this set of random projections, we can estimate the structure of the underlying manifold. In both cases, the number of random projections required is linear in K and logarithmic in N , meaning that K < M ≪ N . To handle practical situations, we develop a greedy algorithm to estimate the smallest size of the projection space required to perform manifold learning. Our method is particularly relevant in distributed sensing systems and leads to significant potential savings in data acquisition, storage and transmission costs.
4 0.59298295 49 nips-2007-Colored Maximum Variance Unfolding
Author: Le Song, Arthur Gretton, Karsten M. Borgwardt, Alex J. Smola
Abstract: Maximum variance unfolding (MVU) is an effective heuristic for dimensionality reduction. It produces a low-dimensional representation of the data by maximizing the variance of their embeddings while preserving the local distances of the original data. We show that MVU also optimizes a statistical dependence measure which aims to retain the identity of individual observations under the distancepreserving constraints. This general view allows us to design “colored” variants of MVU, which produce low-dimensional representations for a given task, e.g. subject to class labels or other side information. 1
5 0.41701978 113 nips-2007-Learning Visual Attributes
Author: Vittorio Ferrari, Andrew Zisserman
Abstract: We present a probabilistic generative model of visual attributes, together with an efficient learning algorithm. Attributes are visual qualities of objects, such as ‘red’, ‘striped’, or ‘spotted’. The model sees attributes as patterns of image segments, repeatedly sharing some characteristic properties. These can be any combination of appearance, shape, or the layout of segments within the pattern. Moreover, attributes with general appearance are taken into account, such as the pattern of alternation of any two colors which is characteristic for stripes. To enable learning from unsegmented training images, the model is learnt discriminatively, by optimizing a likelihood ratio. As demonstrated in the experimental evaluation, our model can learn in a weakly supervised setting and encompasses a broad range of attributes. We show that attributes can be learnt starting from a text query to Google image search, and can then be used to recognize the attribute and determine its spatial extent in novel real-world images.
6 0.4126491 172 nips-2007-Scene Segmentation with CRFs Learned from Partially Labeled Images
7 0.39717594 143 nips-2007-Object Recognition by Scene Alignment
8 0.36334291 188 nips-2007-Subspace-Based Face Recognition in Analog VLSI
9 0.35966545 109 nips-2007-Kernels on Attributed Pointsets with Applications
10 0.33796626 196 nips-2007-The Infinite Gamma-Poisson Feature Model
11 0.33514118 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data
12 0.32989496 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
13 0.31862995 116 nips-2007-Learning the structure of manifolds using random projections
14 0.30674043 193 nips-2007-The Distribution Family of Similarity Distances
15 0.29966119 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
16 0.29378936 180 nips-2007-Sparse Feature Learning for Deep Belief Networks
17 0.29226178 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
18 0.28017217 56 nips-2007-Configuration Estimates Improve Pedestrian Finding
19 0.27882427 210 nips-2007-Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks
20 0.27867347 81 nips-2007-Estimating disparity with confidence from energy neurons
topicId topicWeight
[(5, 0.101), (13, 0.059), (16, 0.04), (21, 0.061), (31, 0.022), (34, 0.032), (35, 0.028), (39, 0.133), (47, 0.104), (49, 0.022), (81, 0.039), (83, 0.162), (85, 0.027), (87, 0.036), (90, 0.047)]
simIndex simValue paperId paperTitle
same-paper 1 0.90391976 115 nips-2007-Learning the 2-D Topology of Images
Author: Nicolas L. Roux, Yoshua Bengio, Pascal Lamblin, Marc Joliveau, Balázs Kégl
Abstract: We study the following question: is the two-dimensional structure of images a very strong prior or is it something that can be learned with a few examples of natural images? If someone gave us a learning task involving images for which the two-dimensional topology of pixels was not known, could we discover it automatically and exploit it? For example suppose that the pixels had been permuted in a fixed but unknown way, could we recover the relative two-dimensional location of pixels on images? The surprising result presented here is that not only is the answer yes, but that as few as about a thousand images are enough to approximately recover the relative locations of about a thousand pixels. This is achieved using a manifold learning algorithm applied to pixels associated with a measure of distributional similarity between pixel intensities. We compare different topology-extraction approaches and show how having the two-dimensional topology can be exploited.
2 0.8408767 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
Author: Matthias Bethge, Philipp Berens
Abstract: Maximum entropy analysis of binary variables provides an elegant way for studying the role of pairwise correlations in neural populations. Unfortunately, these approaches suffer from their poor scalability to high dimensions. In sensory coding, however, high-dimensional data is ubiquitous. Here, we introduce a new approach using a near-maximum entropy model, that makes this type of analysis feasible for very high-dimensional data—the model parameters can be derived in closed form and sampling is easy. Therefore, our NearMaxEnt approach can serve as a tool for testing predictions from a pairwise maximum entropy model not only for low-dimensional marginals, but also for high dimensional measurements of more than thousand units. We demonstrate its usefulness by studying natural images with dichotomized pixel intensities. Our results indicate that the statistics of such higher-dimensional measurements exhibit additional structure that are not predicted by pairwise correlations, despite the fact that pairwise correlations explain the lower-dimensional marginal statistics surprisingly well up to the limit of dimensionality where estimation of the full joint distribution is feasible. 1
3 0.82779509 180 nips-2007-Sparse Feature Learning for Deep Belief Networks
Author: Marc'aurelio Ranzato, Y-lan Boureau, Yann L. Cun
Abstract: Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the representation. We describe a novel and efficient algorithm to learn sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals, and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and by training sequentially, high-order dependencies between the input observed variables can be captured. 1
4 0.82719576 63 nips-2007-Convex Relaxations of Latent Variable Training
Author: Yuhong Guo, Dale Schuurmans
Abstract: We investigate a new, convex relaxation of an expectation-maximization (EM) variant that approximates a standard objective while eliminating local minima. First, a cautionary result is presented, showing that any convex relaxation of EM over hidden variables must give trivial results if any dependence on the missing values is retained. Although this appears to be a strong negative outcome, we then demonstrate how the problem can be bypassed by using equivalence relations instead of value assignments over hidden variables. In particular, we develop new algorithms for estimating exponential conditional models that only require equivalence relation information over the variable values. This reformulation leads to an exact expression for EM variants in a wide range of problems. We then develop a semidefinite relaxation that yields global training by eliminating local minima. 1
Author: Lars Buesing, Wolfgang Maass
Abstract: We show that under suitable assumptions (primarily linearization) a simple and perspicuous online learning rule for Information Bottleneck optimization with spiking neurons can be derived. This rule performs on common benchmark tasks as well as a rather complex rule that has previously been proposed [1]. Furthermore, the transparency of this new learning rule makes a theoretical analysis of its convergence properties feasible. A variation of this learning rule (with sign changes) provides a theoretically founded method for performing Principal Component Analysis (PCA) with spiking neurons. By applying this rule to an ensemble of neurons, different principal components of the input can be extracted. In addition, it is possible to preferentially extract those principal components from incoming signals X that are related or are not related to some additional target signal YT . In a biological interpretation, this target signal YT (also called relevance variable) could represent proprioceptive feedback, input from other sensory modalities, or top-down signals. 1
6 0.81645936 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
7 0.81538761 84 nips-2007-Expectation Maximization and Posterior Constraints
8 0.81521904 187 nips-2007-Structured Learning with Approximate Inference
9 0.81479156 86 nips-2007-Exponential Family Predictive Representations of State
10 0.8126471 7 nips-2007-A Kernel Statistical Test of Independence
11 0.81016916 73 nips-2007-Distributed Inference for Latent Dirichlet Allocation
12 0.8093608 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
13 0.80803239 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)
14 0.80713511 161 nips-2007-Random Projections for Manifold Learning
15 0.80706084 49 nips-2007-Colored Maximum Variance Unfolding
16 0.80693179 134 nips-2007-Multi-Task Learning via Conic Programming
17 0.80513871 146 nips-2007-On higher-order perceptron algorithms
18 0.80508995 148 nips-2007-Online Linear Regression and Its Application to Model-Based Reinforcement Learning
19 0.80469787 2 nips-2007-A Bayesian LDA-based model for semi-supervised part-of-speech tagging
20 0.80387551 158 nips-2007-Probabilistic Matrix Factorization