nips nips2006 nips2006-128 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Matthias Hein, Markus Maier
Abstract: We consider the problem of denoising a noisily sampled submanifold M in Rd , where the submanifold M is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm. 1
Reference: text
sentIndex sentText sentNum sentScore
1 We consider the problem of denoising a noisily sampled submanifold M in Rd , where the submanifold M is a priori unknown and we are only given a noisy point sample. [sent-4, score-0.964]
2 The presented denoising algorithm is based on a graph-based diffusion process of the point sample. [sent-5, score-0.922]
3 We analyze this diffusion process using recent results about the convergence of graph Laplacians. [sent-6, score-0.652]
4 Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm. [sent-8, score-0.499]
5 1 Introduction In recent years several new methods have been developed in the machine learning community which are based on the assumption that the data lies on a submanifold M in Rd . [sent-9, score-0.238]
6 Namely, in practice the data almost never lies exactly on the submanifold but, due to noise, is scattered around it. [sent-12, score-0.271]
7 Several of the existing algorithms, in particular graph-based methods, are quite sensitive to noise. [sent-13, score-0.229]
8 In this paper we tackle this problem by proposing a denoising method for manifold data. [sent-15, score-0.662]
9 Given noisily sampled manifold data in Rd the objective is to ’project’ the sample onto the submanifold. [sent-16, score-0.252]
10 For both methods one has to know the intrinsic dimension of the submanifold M as a parameter of the algorithm. [sent-18, score-0.308]
11 However in the presence of high-dimensional noise it is almost impossible to estimate the intrinsic dimension correctly. [sent-19, score-0.183]
12 It works well for low-dimensional submanifolds corrupted by high-dimensional noise and can deal with multiple connected components. [sent-22, score-0.184]
13 The basic principle behind our denoising method has been proposed by [13] as a surface processing method in R3 . [sent-23, score-0.52]
14 Second we provide an interpretation of the denoising algorithm which takes into account the probabilistic setting encountered in machine learning and which differs from the one usually given in the computer graphics community. [sent-26, score-0.596]
15 2 The noise model and problem statement We assume that the data lies on an abstract m-dimensional manifold M , where the dimension m can be seen as the number of independent parameters in the data. [sent-27, score-0.312]
16 Furthermore we assume that the manifold M is equipped with a probability measure PM which is absolutely continuous with respect to the natural volume element1 dV of M . [sent-34, score-0.239]
17 We consider here for convenience a Gaussian noise model, but any other reasonably concentrated isotropic noise should also work. [sent-37, score-0.146]
18 (1) Now the Gaussian measure is equivalent to the heat kernel pt (x, y) = (4πt)−d/2 exp(−‖x − y‖2 /(4t)) of the diffusion process on Rd , see e.g. [sent-39, score-0.477]
19 An alternative point of view on PX is therefore to see PX as the result of a diffusion of the density function p(θ) of PM stopped at time t = σ 2 /2. [sent-42, score-0.421]
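To make the choice of stopping time explicit (a short check added here, not in the original text): the Gaussian noise density with variance σ 2 per coordinate is (2πσ 2 )−d/2 exp(−‖x − y‖2 /(2σ 2 )), while the heat kernel at time t is (4πt)−d/2 exp(−‖x − y‖2 /(4t)); the exponents and prefactors coincide exactly when 4t = 2σ 2 , i.e. t = σ 2 /2.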
20 The basic principle behind the denoising algorithm in this paper is to reverse this diffusion process. [sent-43, score-0.908]
21 3 The denoising algorithm In practice we have only an i.i.d. sample Xi of PX . [sent-44, score-0.499]
22 However, in general it is not possible to recover the points i(θi ), i = 1, . . . , n on the submanifold M which generated the points Xi . [sent-54, score-0.217]
23 Instead the goal is to find corresponding points Zi on the submanifold M which are close to the points Xi . [sent-56, score-0.249]
24 Second as stated in the last section we would like to reverse this diffusion process which amounts to solving a PDE. [sent-59, score-0.448]
25 Instead we solve the diffusion process directly on a graph generated by the sample Xi . [sent-61, score-0.7]
26 This can be motivated by recent results in [7] where it was shown that the generator of the diffusion process, the Laplacian ∆Rd , can be approximated by the graph Laplacian of a random neighborhood graph. [sent-62, score-0.74]
27 A similar setting for the denoising of two-dimensional meshes in R3 has been proposed in the seminal work of Taubin [13]. [sent-63, score-0.499]
28 In this paper we propose a modification of this diffusion process which allows us to deal with general noisy samples of arbitrary (low-dimensional) submanifolds in Rd . [sent-65, score-0.571]
29 Moreover we give an interpretation of the algorithm, which differs from the one usually given in the computer graphics community and takes into account the probabilistic nature of the problem. [sent-67, score-0.124]
30 1 Structure on the sample-based graph We would like to define a diffusion process directly on the sample Xi . [sent-69, score-0.7]
31 To this end we need the generator of the diffusion process, the graph Laplacian. [sent-70, score-0.662]
32 With {h(Xi )}ni=1 being the k-nearest neighbor (k-NN) distances, the weights of the k-NN graph are defined as w(Xi , Xj ) = exp(−‖Xi − Xj ‖2 /(max{h(Xi ), h(Xj )})2 ) if ‖Xi − Xj ‖ ≤ max{h(Xi ), h(Xj )}, and w(Xi , Xj ) = 0 otherwise. [sent-73, score-0.282]
33 Additionally we set w(Xi , Xi ) = 0, so that the graph has no loops. [sent-74, score-0.229]
34 Further we denote by d the degree function d(Xi ) = Σnj=1 w(Xi , Xj ) of the graph. (Footnote 1: in local coordinates θ1 , . . . , θm the natural volume element is dV = √det g dθ1 · · · dθm .) [sent-75, score-0.229]
35 The difference operator ∇ : HV → HE , (∇f )(Xi , Xj ) = f (Xj ) − f (Xi ), induces the graph Laplacian (∆f )(Xi ) = f (Xi ) − (1/d(Xi )) Σnj=1 w(Xi , Xj )f (Xj ), where ∆ = ∇∗ ∇ and ∇∗ is the adjoint of ∇. [sent-85, score-0.249]
36 Defining the matrix D with the degree function on the diagonal, the graph Laplacian in matrix form is given as ∆ = I − D−1 W , see [7] for more details. [sent-86, score-0.229]
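As an illustration of the graph structure just defined, the following sketch (our own illustrative code, not the authors' implementation; the function name is made up) builds the weighted k-NN graph and the Laplacian ∆ = I − D−1 W with dense numpy arrays:

import numpy as np

def knn_graph_laplacian(X, k):
    # Illustrative sketch of the weighted k-NN graph Laplacian Delta = I - D^{-1} W
    # described above; dense numpy implementation, not the authors' code.
    n = X.shape[0]
    # pairwise Euclidean distances
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # h(X_i): distance to the k-th nearest neighbour (index 0 is the point itself)
    h = np.sort(dist, axis=1)[:, k]
    # w_ij = exp(-||X_i - X_j||^2 / max(h_i, h_j)^2) inside the neighbourhood, 0 outside
    hmax = np.maximum(h[:, None], h[None, :])
    W = np.exp(-dist ** 2 / hmax ** 2)
    W[dist > hmax] = 0.0
    np.fill_diagonal(W, 0.0)           # no self-loops
    d = W.sum(axis=1)                  # degree function d(X_i)
    return np.eye(n) - W / d[:, None]  # Delta = I - D^{-1} W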
37 2 The denoising algorithm Having defined the necessary structure on the graph it is straightforward to write down the backward diffusion process. [sent-89, score-1.112]
38 In the next section we will analyze the geometric properties of this diffusion process and show why it is directed towards the submanifold M . [sent-90, score-0.628]
39 Since the graph Laplacian is the generator of the diffusion process on the graph we can formulate the algorithm by the following differential equation on the graph: ∂t X = −γ ∆X, (2) where γ > 0 is the diffusion constant. [sent-91, score-1.402]
40 Since the points change with time, the whole graph is dynamic in our setting. [sent-92, score-0.261]
41 This is different from the diffusion processes on a fixed graph studied in semi-supervised learning. [sent-93, score-0.613]
42 In order to solve the differential equation (2) we choose an implicit Euler-scheme, that is X(t + 1) − X(t) = −δt γ ∆X(t + 1), (3) where δt is the time-step. [sent-94, score-0.119]
43 In [12] it was pointed out that there exists a connection between diffusion processes and Tikhonov regularization. [sent-103, score-0.552]
Algorithm 1 Manifold denoising
1: Choose δt, k
2: while Stopping criterion not satisfied do
3: Compute the k-NN distances h(Xi ), i = 1, . . . , n
44 4: Compute the weights w(Xi , Xj ) of the graph with w(Xi , Xi ) = 0, w(Xi , Xj ) = exp(−‖Xi − Xj ‖2 /(max{h(Xi ), h(Xj )})2 ) if ‖Xi − Xj ‖ ≤ max{h(Xi ), h(Xj )}
5: Compute the graph Laplacian ∆ = I − D−1 W
6: Solve X(t + 1) − X(t) = −δt ∆X(t + 1) ⇒ X(t + 1) = (I + δt ∆)−1 X(t). [sent-106, score-0.458]
45 7: end while [sent-107, score-0.384]
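A compact sketch of the full loop of Algorithm 1 (again our own illustrative code, reusing the knn_graph_laplacian helper from the sketch above; the simple change-threshold stopping rule used here is one of the criteria discussed in the stopping-criterion section below):

import numpy as np

def manifold_denoise(X, k=25, dt=1.0, max_iter=10, tol=1e-3):
    # Illustrative sketch of Algorithm 1; the graph is dynamic, so distances,
    # weights and the Laplacian are recomputed from the current points each step.
    Z = X.copy()
    n = Z.shape[0]
    for _ in range(max_iter):
        Delta = knn_graph_laplacian(Z, k)   # helper from the sketch above
        # implicit Euler step: (I + dt * Delta) Z_new = Z
        Z_new = np.linalg.solve(np.eye(n) + dt * Delta, Z)
        # simple change-threshold stopping rule
        if np.linalg.norm(Z_new - Z) < tol * np.linalg.norm(Z):
            return Z_new
        Z = Z_new
    return Z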
46 Each time-step of our diffusion process can therefore be seen as a regression problem, where we trade off between fitting the new points Z to the points X(t) and having a ’smooth’ point configuration Z measured with respect to the current graph built from X(t). [sent-113, score-0.716]
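To make this regression view explicit (a derivation added here, assuming ∆ is self-adjoint with respect to the inner product on HV ): setting Z ∗ = argminZ ‖Z − X(t)‖2 + δt γ ⟨Z, ∆Z⟩ and putting the gradient to zero gives 2(Z ∗ − X(t)) + 2 δt γ ∆Z ∗ = 0, i.e. (I + δt γ ∆) Z ∗ = X(t), which is exactly the implicit Euler update X(t + 1) = (I + δt γ ∆)−1 X(t) of Algorithm 1 (with γ absorbed into δt there).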
47 In the denoising algorithm we have chosen to use a weighted k-NN graph. [sent-115, score-0.499]
48 It turns out that a k-NN graph has three advantages over an h-neighborhood graph3 . [sent-116, score-0.229]
49 The first advantage is that the graph has a better connectivity. [sent-117, score-0.229]
50 Namely, points in areas of different density have quite different neighborhood scales, which for a fixed h leads to either disconnected or over-connected graphs. [sent-118, score-0.199]
51 One can deduce that the expected squared distance of the noisy submanifold sample is dominated by the noise term if 2dσ 2 > maxθ,θ′ ‖i(θ) − i(θ′ )‖2 , which is usually the case for large d. [sent-122, score-0.399]
52 In this case it is quite difficult to adjust the average number of neighbors in a graph by a fixed neighborhood size h since the distances start to concentrate around their mean value. [sent-123, score-0.381]
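To give a feeling for the numbers (an illustrative calculation with made-up values, not taken from the paper): with d = 784, the dimension of the MNIST digit vectors, and a moderate noise level σ = 0.25 one gets 2dσ 2 = 2 · 784 · 0.0625 = 98, so the noise term dominates as soon as the extrinsic diameter maxθ,θ′ ‖i(θ) − i(θ′ )‖ of the manifold is below √98 ≈ 9.9.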
53 4 Stopping criterion The problem of choosing the correct number of iterations is very difficult if one has initially high-dimensional noise, and it requires prior knowledge. [sent-126, score-0.124]
54 The first one is based on the effect that if the diffusion is done too long the data becomes disconnected and concentrates in local clusters. [sent-128, score-0.436]
55 The second one is based on prior knowledge about the intrinsic dimension of the data. [sent-130, score-0.123]
56 In this case one can stop the denoising if the estimated dimension of the sample (e.g. obtained with a dimension estimator) matches the assumed intrinsic dimension. [sent-131, score-0.64]
57 Another less well-founded but very simple way is to stop the iterations when the changes in the sample fall below some pre-defined threshold. [sent-134, score-0.131]
58 4 Large sample limit and theoretical analysis Our qualitative theoretical analysis of the denoising algorithm is based on recent results on the limit of graph Laplacians [7, 8] as the neighborhood size decreases and the sample size increases. [sent-135, score-0.998]
59 We use this result to study the continuous limit of the diffusion process. [sent-136, score-0.452]
60 The following theorem about the limit of the graph Laplacian applies to h-neighborhood graphs, whereas the denoising algorithm is based on a k-NN graph. [sent-137, score-0.776]
61 3 In an h-neighborhood graph two sample points Xi , Xj have a common edge if ‖Xi − Xj ‖ ≤ h. [sent-144, score-0.309]
62 The number of connected components of the graph is equal to the multiplicity of the first eigenvalue of the graph Laplacian. [sent-146, score-0.229]
63 1 The noise-free case We first derive in a non-rigorous way the continuum limit of our graph-based diffusion process in the noise-free case. [sent-150, score-0.76]
64 (4) Note that for the k-NN graph the neighborhood size h is a function of the local density which implies that the diffusion constant D also becomes a function of the local density D = D(p(x)). [sent-153, score-0.765]
65 Let i : M → Rd be a regular, smooth embedding of an m-dimensional manifold M ; then ∆M i = m H, where H is the mean curvature7 of M . [sent-155, score-0.184]
66 Using the equation ∆M i = mH we can establish equivalence of the continuous diffusion equation (4) to a generalized mean curvature flow. [sent-156, score-0.615]
67 ∂t i = D [ m H + (2/p) ⟨∇p, ∇i⟩ ], (5) The equivalence to the mean curvature flow ∂t i = m H is usually given in computer graphics as the reason for the denoising effect, see [13, 11]. [sent-157, score-0.729]
68 However, as we have shown, the diffusion already has an additional part if one has a non-uniform probability measure on M . [sent-158, score-0.406]
69 2 The noisy case The analysis of the noisy case is more complicated and we can only provide a rough analysis. [sent-160, score-0.135]
70 In the following analysis we will assume three things: 1) the noise level σ is small compared to the neighborhood size h, 2) the curvature of M is small compared to h, and 3) the density pM varies slowly along M . [sent-162, score-0.287]
71 In the following we try to separate this effect from the mean curvature part derived in the noise-free case. [sent-164, score-0.133]
72 Using the condition on the curvature we can approximate the diffusion step −∆X as follows: −∆X ≈ [i(θmin ) − X]I − [i(θmin ) − ∫M kh (‖i(θmin ) − i(θ)‖) i(θ) p(θ) dV (θ) / ∫M kh (‖i(θmin ) − i(θ)‖) p(θ) dV (θ)]II . The mean curvature H is the trace of the second fundamental form. [sent-166, score-0.758]
73 If M is a hypersurface in Rd the mean curvature at p is H = (1/(d − 1)) Σi=1..d−1 κi N , where N is the normal vector and κi the principal curvatures at p. [sent-167, score-0.154]
74 We conclude from this rough analysis that in the denoising procedure we always have a tradeoff between reducing the noise via the term I and smoothing of the manifold via the mean curvature term II. [sent-170, score-0.912]
75 5 Experiments In the experimental section we test the performance of the denoising algorithm on three noisy datasets. [sent-174, score-0.553]
76 Furthermore we explore the possibility of using the denoising method as a preprocessing step for semi-supervised learning. [sent-175, score-0.568]
77 Due to lack of space we cannot deal with further applications as a preprocessing method for clustering or dimensionality reduction. [sent-176, score-0.12]
78 The manifold M is given as t → [sin(2πt), 2πt], where t is sampled uniformly on [0, 1]. [sent-179, score-0.163]
79 We verify the effect of the denoising algorithm by continuously estimating the dimension over different scales (note that the dimension of a finite sample always depends on the scale at which one examines it). [sent-183, score-0.673]
80 The result of the denoising algorithm with k = 25 for the k-NN graph and 10 timesteps is given in the right part of Figure 5. [sent-185, score-0.748]
81 One can observe visually and by inspecting the dimension estimate as well as by the histogram of distances that the algorithm has reduced the noise. [sent-187, score-0.155]
82 First as discussed in the last section the diffusion process has a component which moves the manifold in the direction of the mean curvature, which leads to a smoothing of the sinusoid. [sent-189, score-0.658]
83 Second, at the boundary the sinusoid shrinks due to the missing counterparts in the local averaging done by the graph Laplacian, see (6), which results in an inward tangential component. [sent-190, score-0.276]
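For readers who want to reproduce the scale-dependent dimension check used above, a simple correlation-dimension estimate can serve as a stand-in (our own sketch, not necessarily the estimator used by the authors): it reports the slope of log C(r) against log r, where C(r) is the fraction of point pairs within distance r.

import numpy as np

def correlation_dimension(X, radii):
    # Illustrative scale-dependent dimension estimate: slope of log C(r) vs. log r,
    # where C(r) is the fraction of point pairs within distance r.  The radii must
    # be chosen large enough that C(r) > 0 at every scale.
    n = X.shape[0]
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    pairs = dist[np.triu_indices(n, k=1)]                  # all pairwise distances
    C = np.array([np.mean(pairs <= r) for r in radii])     # correlation integral
    return np.diff(np.log(C)) / np.diff(np.log(radii))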
84 In the next experiment we apply the denoising to the handwritten digit datasets USPS and MNIST. [sent-191, score-0.576]
85 In order to check whether the denoising method can also handle several manifolds at the same time, which would make the method useful for clustering and dimensionality reduction, we fed all 10 digits simultaneously into the algorithm. [sent-199, score-0.606]
86 This happens since they are outliers with respect to their digit manifold and lie closer to another digit component. [sent-204, score-0.229]
87 We expect that in such cases manifold denoising as a pre-processing step should improve the discriminative capacity of graph-based methods. [sent-213, score-0.662]
88 However the denoising algorithm does not take into account label information. [sent-214, score-0.499]
89 Therefore in the case where the cluster assumption is not fulfilled the denoising algorithm might decrease the performance. [sent-215, score-0.499]
90 Therefore we add the number of iterations of the denoising process as an additional parameter in the SSL algorithm. [sent-216, score-0.571]
91 For the evaluation of our denoising algorithm as a preprocessing step for SSL, we used the benchmark data sets from [3]. [sent-217, score-0.568]
92 f ∗ = argminf ∈HV ‖f − y‖2HV + µ ⟨f, ∆f ⟩HV , where y is the given label vector and ⟨f, ∆f ⟩HV is the smoothness functional induced by the graph Laplacian. [sent-222, score-0.249]
93 In order to be consistent with our denoising scheme we choose, instead of the normalized graph Laplacian ∆̃ = I − D−1/2 W D−1/2 suggested in [15], the graph Laplacian ∆ = I − D−1 W and the graph structure as described in Section 3. [sent-224, score-0.687]
94 As neighborhood graph for the SSL-algorithm we used a symmetric k-NN graph with the following weights: w(Xi , Xj ) = exp(−γ ‖Xi − Xj ‖2 ) if ‖Xi − Xj ‖ ≤ min{h(Xi ), h(Xj )}. [sent-226, score-0.536]
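A minimal sketch of this SSL step (our own illustrative code under the choices described above; the ±1/0 label encoding, the guard against isolated points, and solving the regularized problem through the linear system (I + µ∆)f = y, which assumes ∆ acts as a self-adjoint operator on HV , are our additions, not the authors' code):

import numpy as np

def ssl_predict(X, y, labeled, k=10, gamma=1.0, mu=1.0):
    # Illustrative sketch of the graph-based SSL step on the (denoised) sample X.
    # y: vector with +/-1 on labeled points, labeled: boolean mask of labeled points.
    n = X.shape[0]
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    h = np.sort(dist, axis=1)[:, k]               # k-NN distances h(X_i)
    hmin = np.minimum(h[:, None], h[None, :])     # symmetric (mutual) k-NN condition
    W = np.exp(-gamma * dist ** 2)
    W[dist > hmin] = 0.0
    np.fill_diagonal(W, 0.0)
    d = np.maximum(W.sum(axis=1), 1e-12)          # guard against isolated points
    Delta = np.eye(n) - W / d[:, None]            # Delta = I - D^{-1} W
    y0 = np.where(labeled, y, 0.0)                # unlabeled points enter with label 0
    # regularized solution of ||f - y||^2 + mu <f, Delta f> (self-adjointness assumed)
    f = np.linalg.solve(np.eye(n) + mu * Delta, y0)
    return np.sign(f)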
95 The number k of nearest neighbors was chosen for denoising in {5, 10, 15, 25, 50, 100, 150, 200}, and for classification in {5, 10, 20, 50, 100}. [sent-228, score-0.499]
96 Parameter values for which not all data points were classified, that is, where the graph was disconnected, were excluded. [sent-232, score-0.261]
97 In Table 1 the results are shown for the standard case, that is no manifold denoising (No MD), and with manifold denoising (MD). [sent-235, score-1.324]
98 For the datasets g241c, g241d and Text we get significantly better performance using denoising as a preprocessing step, whereas the results are essentially unchanged for the other datasets. [sent-236, score-0.591]
99 However, compared to the state-of-the-art SSL results on all the datasets reported in [3], the denoising preprocessing has led to a performance of the algorithm which is competitive uniformly over all datasets. [sent-237, score-0.591]
100 From graphs to manifolds - weak and strong pointwise consistency of graph Laplacians. [sent-378, score-0.291]
wordName wordTfidf (topN-words)
[('denoising', 0.499), ('diffusion', 0.384), ('hv', 0.284), ('rd', 0.23), ('graph', 0.229), ('submanifold', 0.185), ('xi', 0.178), ('manifold', 0.163), ('pm', 0.154), ('px', 0.15), ('xj', 0.137), ('kh', 0.129), ('laplacian', 0.127), ('dv', 0.124), ('curvature', 0.112), ('md', 0.108), ('ssl', 0.101), ('neighborhood', 0.078), ('submanifolds', 0.07), ('preprocessing', 0.069), ('graphics', 0.068), ('dimension', 0.063), ('intrinsic', 0.06), ('noise', 0.06), ('dy', 0.059), ('usps', 0.057), ('noisy', 0.054), ('distances', 0.053), ('stopping', 0.053), ('disconnected', 0.052), ('generator', 0.049), ('euler', 0.049), ('differential', 0.049), ('limit', 0.048), ('sample', 0.048), ('sinusoid', 0.047), ('hein', 0.043), ('noisily', 0.041), ('digits', 0.04), ('manifolds', 0.04), ('equation', 0.039), ('histogram', 0.039), ('process', 0.039), ('timestep', 0.037), ('audibert', 0.037), ('density', 0.037), ('absolutely', 0.034), ('laplacians', 0.034), ('mh', 0.034), ('digit', 0.033), ('iterations', 0.033), ('heat', 0.032), ('points', 0.032), ('tangent', 0.031), ('highdimensional', 0.031), ('implicit', 0.031), ('stop', 0.03), ('min', 0.03), ('connected', 0.03), ('smoothing', 0.03), ('det', 0.03), ('usually', 0.029), ('topographic', 0.028), ('rough', 0.027), ('community', 0.027), ('lemma', 0.027), ('dimensionality', 0.027), ('isotropic', 0.026), ('operator', 0.026), ('lies', 0.026), ('von', 0.026), ('zhou', 0.025), ('mnist', 0.025), ('reverse', 0.025), ('deal', 0.024), ('metric', 0.024), ('xt', 0.023), ('datasets', 0.023), ('distance', 0.023), ('changed', 0.022), ('ow', 0.022), ('measure', 0.022), ('graphs', 0.022), ('principal', 0.021), ('handwritten', 0.021), ('surface', 0.021), ('moves', 0.021), ('mean', 0.021), ('geometric', 0.02), ('continuous', 0.02), ('vertices', 0.02), ('timesteps', 0.02), ('recompute', 0.02), ('meir', 0.02), ('argminf', 0.02), ('subsample', 0.02), ('arxiv', 0.02), ('physica', 0.02), ('adjoint', 0.02), ('founded', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 128 nips-2006-Manifold Denoising
Author: Matthias Hein, Markus Maier
Abstract: We consider the problem of denoising a noisily sampled submanifold M in Rd , where the submanifold M is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm. 1
2 0.15208153 1 nips-2006-A Bayesian Approach to Diffusion Models of Decision-Making and Response Time
Author: Michael D. Lee, Ian G. Fuss, Daniel J. Navarro
Abstract: We present a computational Bayesian approach for Wiener diffusion models, which are prominent accounts of response time distributions in decision-making. We first develop a general closed-form analytic approximation to the response time distributions for one-dimensional diffusion processes, and derive the required Wiener diffusion as a special case. We use this result to undertake Bayesian modeling of benchmark data, using posterior sampling to draw inferences about the interesting psychological parameters. With the aid of the benchmark data, we show the Bayesian account has several advantages, including dealing naturally with the parameter variation needed to account for some key features of the data, and providing quantitative measures to guide decisions about model construction. 1
3 0.13960114 120 nips-2006-Learning to Traverse Image Manifolds
Author: Piotr DollĂĄr, Vincent Rabaud, Serge J. Belongie
Abstract: We present a new algorithm, Locally Smooth Manifold Learning (LSML), that learns a warping function from a point on an manifold to its neighbors. Important characteristics of LSML include the ability to recover the structure of the manifold in sparsely populated regions and beyond the support of the provided data. Applications of our proposed technique include embedding with a natural out-of-sample extension and tasks such as tangent distance estimation, frame rate up-conversion, video compression and motion transfer. 1
4 0.13784733 151 nips-2006-On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
Author: Hariharan Narayanan, Mikhail Belkin, Partha Niyogi
Abstract: One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning, is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary. keywords: Clustering, Semi-Supervised Learning 1
5 0.1261733 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
Author: Dong S. Cheng, Vittorio Murino, Mário Figueiredo
Abstract: This paper proposes a new approach to model-based clustering under prior knowledge. The proposed formulation can be interpreted from two different angles: as penalized logistic regression, where the class labels are only indirectly observed (via the probability density of each class); as finite mixture learning under a grouping prior. To estimate the parameters of the proposed model, we derive a (generalized) EM algorithm with a closed-form E-step, in contrast with other recent approaches to semi-supervised probabilistic clustering which require Gibbs sampling or suboptimal shortcuts. We show that our approach is ideally suited for image segmentation: it avoids the combinatorial nature Markov random field priors, and opens the door to more sophisticated spatial priors (e.g., wavelet-based) in a simple and computationally efficient way. Finally, we extend our formulation to work in unsupervised, semi-supervised, or discriminative modes. 1
6 0.12023971 80 nips-2006-Fundamental Limitations of Spectral Clustering
7 0.11425146 60 nips-2006-Convergence of Laplacian Eigenmaps
8 0.10701397 7 nips-2006-A Local Learning Approach for Clustering
10 0.098947078 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
11 0.097284824 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification
12 0.095253527 163 nips-2006-Prediction on a Graph with a Perceptron
13 0.090826176 65 nips-2006-Denoising and Dimension Reduction in Feature Space
14 0.08682888 35 nips-2006-Approximate inference using planar graph decomposition
15 0.08595065 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
16 0.082267962 104 nips-2006-Large-Scale Sparsified Manifold Regularization
17 0.079071023 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression
18 0.077067107 77 nips-2006-Fast Computation of Graph Kernels
19 0.076933771 130 nips-2006-Max-margin classification of incomplete data
20 0.071047366 117 nips-2006-Learning on Graph with Laplacian Regularization
topicId topicWeight
[(0, -0.222), (1, 0.091), (2, -0.002), (3, 0.175), (4, 0.045), (5, -0.069), (6, 0.015), (7, 0.059), (8, 0.022), (9, 0.019), (10, 0.139), (11, -0.165), (12, 0.054), (13, 0.049), (14, -0.03), (15, 0.06), (16, -0.025), (17, -0.082), (18, -0.108), (19, 0.013), (20, -0.008), (21, -0.05), (22, 0.104), (23, -0.181), (24, 0.027), (25, -0.041), (26, 0.005), (27, -0.033), (28, -0.188), (29, -0.12), (30, 0.057), (31, -0.054), (32, 0.011), (33, -0.016), (34, -0.018), (35, -0.065), (36, -0.014), (37, 0.051), (38, -0.003), (39, -0.076), (40, 0.068), (41, 0.085), (42, -0.144), (43, -0.024), (44, 0.169), (45, 0.116), (46, 0.084), (47, 0.033), (48, 0.026), (49, 0.002)]
simIndex simValue paperId paperTitle
same-paper 1 0.94773942 128 nips-2006-Manifold Denoising
Author: Matthias Hein, Markus Maier
Abstract: We consider the problem of denoising a noisily sampled submanifold M in Rd , where the submanifold M is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm. 1
2 0.65099674 120 nips-2006-Learning to Traverse Image Manifolds
Author: Piotr DollĂĄr, Vincent Rabaud, Serge J. Belongie
Abstract: We present a new algorithm, Locally Smooth Manifold Learning (LSML), that learns a warping function from a point on an manifold to its neighbors. Important characteristics of LSML include the ability to recover the structure of the manifold in sparsely populated regions and beyond the support of the provided data. Applications of our proposed technique include embedding with a natural out-of-sample extension and tasks such as tangent distance estimation, frame rate up-conversion, video compression and motion transfer. 1
3 0.59045088 60 nips-2006-Convergence of Laplacian Eigenmaps
Author: Mikhail Belkin, Partha Niyogi
Abstract: Geometrically based methods for various tasks of machine learning have attracted considerable attention over the last few years. In this paper we show convergence of eigenvectors of the point cloud Laplacian to the eigenfunctions of the Laplace-Beltrami operator on the underlying manifold, thus establishing the first convergence results for a spectral dimensionality reduction algorithm in the manifold setting. 1
4 0.52181911 151 nips-2006-On the Relation Between Low Density Separation, Spectral Clustering and Graph Cuts
Author: Hariharan Narayanan, Mikhail Belkin, Partha Niyogi
Abstract: One of the intuitions underlying many graph-based methods for clustering and semi-supervised learning, is that class or cluster boundaries pass through areas of low probability density. In this paper we provide some formal analysis of that notion for a probability distribution. We introduce a notion of weighted boundary volume, which measures the length of the class/cluster boundary weighted by the density of the underlying probability distribution. We show that sizes of the cuts of certain commonly used data adjacency graphs converge to this continuous weighted volume of the boundary. keywords: Clustering, Semi-Supervised Learning 1
5 0.47797284 121 nips-2006-Learning to be Bayesian without Supervision
Author: Martin Raphan, Eero P. Simoncelli
Abstract: unkown-abstract
6 0.47630292 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
7 0.45685202 1 nips-2006-A Bayesian Approach to Diffusion Models of Decision-Making and Response Time
8 0.44466418 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression
9 0.43243116 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
10 0.41779026 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification
11 0.41557613 123 nips-2006-Learning with Hypergraphs: Clustering, Classification, and Embedding
12 0.40885642 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights
13 0.39739901 104 nips-2006-Large-Scale Sparsified Manifold Regularization
14 0.38474175 77 nips-2006-Fast Computation of Graph Kernels
15 0.37805608 80 nips-2006-Fundamental Limitations of Spectral Clustering
16 0.3582429 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
17 0.3576929 35 nips-2006-Approximate inference using planar graph decomposition
18 0.33312529 95 nips-2006-Implicit Surfaces with Globally Regularised and Compactly Supported Basis Functions
19 0.3319397 169 nips-2006-Relational Learning with Gaussian Processes
20 0.33066919 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
topicId topicWeight
[(1, 0.112), (3, 0.093), (7, 0.134), (9, 0.065), (20, 0.016), (22, 0.064), (44, 0.051), (57, 0.059), (64, 0.016), (65, 0.034), (67, 0.196), (69, 0.024), (83, 0.014)]
simIndex simValue paperId paperTitle
same-paper 1 0.85865676 128 nips-2006-Manifold Denoising
Author: Matthias Hein, Markus Maier
Abstract: We consider the problem of denoising a noisily sampled submanifold M in Rd , where the submanifold M is a priori unknown and we are only given a noisy point sample. The presented denoising algorithm is based on a graph-based diffusion process of the point sample. We analyze this diffusion process using recent results about the convergence of graph Laplacians. In the experiments we show that our method is capable of dealing with non-trivial high-dimensional noise. Moreover using the denoising algorithm as pre-processing method we can improve the results of a semi-supervised learning algorithm. 1
2 0.84981579 177 nips-2006-Sparse Kernel Orthonormalized PLS for feature extraction in large data sets
Author: Jerónimo Arenas-garcía, Kaare B. Petersen, Lars K. Hansen
Abstract: In this paper we are presenting a novel multivariate analysis method. Our scheme is based on a novel kernel orthonormalized partial least squares (PLS) variant for feature extraction, imposing sparsity constrains in the solution to improve scalability. The algorithm is tested on a benchmark of UCI data sets, and on the analysis of integrated short-time music features for genre prediction. The upshot is that the method has strong expressive power even with rather few features, is clearly outperforming the ordinary kernel PLS, and therefore is an appealing method for feature extraction of labelled data. 1
3 0.71999764 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul
Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1
4 0.71924889 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds
Author: Gloria Haro, Gregory Randall, Guillermo Sapiro
Abstract: The study of point cloud data sampled from a stratification, a collection of manifolds with possible different dimensions, is pursued in this paper. We present a technique for simultaneously soft clustering and estimating the mixed dimensionality and density of such structures. The framework is based on a maximum likelihood estimation of a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning. 1
5 0.71526939 80 nips-2006-Fundamental Limitations of Spectral Clustering
Author: Boaz Nadler, Meirav Galun
Abstract: Spectral clustering methods are common graph-based approaches to clustering of data. Spectral clustering algorithms typically start from local information encoded in a weighted graph on the data and cluster according to the global eigenvectors of the corresponding (normalized) similarity matrix. One contribution of this paper is to present fundamental limitations of this general local to global approach. We show that based only on local information, the normalized cut functional is not a suitable measure for the quality of clustering. Further, even with a suitable similarity measure, we show that the first few eigenvectors of such adjacency matrices cannot successfully cluster datasets that contain structures at different scales of size and density. Based on these findings, a second contribution of this paper is a novel diffusion based measure to evaluate the coherence of individual clusters. Our measure can be used in conjunction with any bottom-up graph-based clustering method, it is scale-free and can determine coherent clusters at all scales. We present both synthetic examples and real image segmentation problems where various spectral clustering algorithms fail. In contrast, using this coherence measure finds the expected clusters at all scales. Keywords: Clustering, kernels, learning theory. 1
6 0.71409512 104 nips-2006-Large-Scale Sparsified Manifold Regularization
7 0.70203948 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights
8 0.69849807 83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning
9 0.69721925 119 nips-2006-Learning to Rank with Nonsmooth Cost Functions
10 0.69402266 17 nips-2006-A recipe for optimizing a time-histogram
11 0.69325358 48 nips-2006-Branch and Bound for Semi-Supervised Support Vector Machines
12 0.6923309 35 nips-2006-Approximate inference using planar graph decomposition
13 0.69052732 106 nips-2006-Large Margin Hidden Markov Models for Automatic Speech Recognition
14 0.68822408 120 nips-2006-Learning to Traverse Image Manifolds
15 0.68820232 121 nips-2006-Learning to be Bayesian without Supervision
16 0.68451428 117 nips-2006-Learning on Graph with Laplacian Regularization
17 0.68301415 163 nips-2006-Prediction on a Graph with a Perceptron
18 0.68188262 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
19 0.68151093 65 nips-2006-Denoising and Dimension Reduction in Feature Space
20 0.68103242 20 nips-2006-Active learning for misspecified generalized linear models