nips nips2013 nips2013-87 knowledge-graph by maker-knowledge-mining

87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap


Source: pdf

Author: Ulrike Von Luxburg, Morteza Alamgir

Abstract: Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. from some unknown density p on R^d. We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. The key insights are that local differences in link numbers can be used to estimate a local function of the gradient of p, and that integrating this function along shortest paths leads to an estimate of the underlying density.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Density estimation from unweighted k-nearest neighbor graphs: a roadmap Ulrike von Luxburg and Morteza Alamgir Department of Computer Science University of Hamburg, Germany {luxburg,alamgir}@informatik. [sent-1, score-0.389]

2 Abstract: Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. [sent-3, score-0.561]

3 We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. [sent-7, score-0.695]

4 The key insights are that local differences in link numbers can be used to estimate a local function of the gradient of p, and that integrating this function along shortest paths leads to an estimate of the underlying density. [sent-8, score-0.538]

5 Consider an unweighted k-nearest neighbor graph that has been built on a random sample X1 , . . . , Xn . [sent-10, score-0.532]

6 Is it then possible to estimate the underlying density p, just from the adjacency matrix of the unweighted graph? [sent-19, score-0.621]

7 Machine learning algorithms on graphs are abundant, ranging from graph clustering methods such as spectral clustering, through label propagation methods for semi-supervised learning, to dimensionality reduction methods and manifold algorithms. [sent-21, score-0.225]

8 Given data points X1 , . . . , Xn , we first compute pairwise similarities s(Xi , Xj ) according to some suitable similarity function and then build a k-nearest neighbor graph (kNN graph for short) based on this similarity function. [sent-25, score-0.418]

9 The intuition is that the edges in the graph encode the local information given by the similarity function, whereas the graph as a whole reveals global properties of the data distribution such as cluster properties, high- and low-density regions, or manifold structure. [sent-26, score-0.381]

10 From a computational point of view, kNN graphs are convenient because they lead to a sparse representation of the data — even more so when the graph is unweighted. [sent-27, score-0.252]

11 It is easy to see that for suitably weighted kNN graphs this is the case: the original density can be estimated from the degrees in the graph. [sent-29, score-0.29]
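
To make the weighted case concrete, here is a minimal sketch (assuming numpy/scipy) of the classical kNN density estimator p_hat(x) = k/(n · η_d · r_k(x)^d). It needs the kNN radii r_k(x), which is exactly the distance information an unweighted graph discards:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gamma

def knn_density(X, k):
    """Classical kNN density estimate: p_hat(x) = k / (n * eta_d * r_k(x)^d),
    where r_k(x) is the distance from x to its k-th nearest neighbor and
    eta_d is the volume of the d-dimensional unit ball."""
    n, d = X.shape
    eta_d = np.pi ** (d / 2) / gamma(d / 2 + 1)
    # query k+1 neighbors because the nearest neighbor of each point is itself
    r_k = cKDTree(X).query(X, k=k + 1)[0][:, -1]
    return k / (n * eta_d * r_k ** d)

X = np.random.rand(2000, 2)          # uniform on [0,1]^2, true density = 1
print(knn_density(X, k=50).mean())   # close to 1 (up to boundary bias)
```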

12 However, it is completely unclear whether the same holds true for unweighted kNN graphs. [sent-30, score-0.314]

13 The naive attempt to estimate the density from vertex degrees obviously has to fail in unweighted kNN graphs because all vertex degrees are (about) k. [sent-32, score-0.711]

14 Moreover, unweighted kNN graphs are invariant with respect to rescaling of the underlying distribution by a constant factor (e.g. [sent-33, score-0.468]

15 the unweighted kNN graph on a sample from the uniform distribution on [0, 1]^2 is indistinguishable from a kNN graph on a sample from the uniform distribution on [0, 2]^2). [sent-35, score-0.67]

16 So all we can hope for is an estimate of the density up to some multiplicative constant that cannot be determined from the kNN graph alone. [sent-36, score-0.376]

17 To see this, consider the case where the underlying density is continuous, hence approximately constant in small neighborhoods. [sent-38, score-0.245]

18 Then, if n is large and k/n is small, local neighborhoods in the kNN graph are all going to look like kNN graphs from a uniform distribution. [sent-39, score-0.253]

19 It is impossible to estimate the density in an unweighted kNN graph by local quantities alone. [sent-41, score-0.737]

20 This makes the problem very different and much harder than more standard density estimation problems. [sent-43, score-0.171]

21 We show that it is indeed possible to estimate the underlying density from an unweighted kNN graph. [sent-45, score-0.583]

22 In a first step we estimate a pointwise function of the gradient of the density, and in a second step we integrate these estimates along shortest paths in the graph to end up with an approximation of the log-density. [sent-47, score-0.553]

23 Our estimate works as long as the kNN graph is reasonably dense (k^{d+2}/(n^2 log^d n) → ∞). [sent-48, score-0.281]

24 Currently we do not know whether this is due to a suboptimal proof or whether density estimation is generally impossible in the sparse regime. [sent-52, score-0.217]

25 Let p be a continuously differentiable density on X . [sent-60, score-0.233]

26 The resulting graph is called the directed, unweighted kNN graph (in the following, we will often drop the words “directed” and “unweighted”). [sent-71, score-0.62]

27 For a rectifiable path γ : [0, 1] → X we define its p-weighted length as $\ell_p(\gamma) := \int_\gamma p^{1/d}(x)\, ds := \int_0^1 p^{1/d}(\gamma(t))\, |\gamma'(t)|\, dt$ (recall the notational convention of writing “ds” in a line integral). [sent-77, score-0.256]
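
A minimal numerical sketch of this definition, with a hypothetical density and path, and trapezoidal quadrature standing in for the line integral:

```python
import numpy as np

def p_weighted_length(gamma, p, d, num=10_000):
    """Approximate l_p(gamma) = int_0^1 p^{1/d}(gamma(t)) |gamma'(t)| dt
    with finite differences for gamma'(t) and the trapezoidal rule."""
    t = np.linspace(0.0, 1.0, num)
    pts = gamma(t)                                    # path samples, shape (num, d)
    speed = np.linalg.norm(np.gradient(pts, t, axis=0), axis=1)  # |gamma'(t)|
    f = p(pts) ** (1.0 / d) * speed
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

# hypothetical example: the diagonal of R^2 under an unnormalized Gaussian density
p = lambda x: np.exp(-0.5 * (x ** 2).sum(axis=-1))
line = lambda t: np.stack([t, t], axis=1)
print(p_weighted_length(line, p, d=2))
```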

28 As a consequence of the compactness of X , a minimizing path that realizes Dp always exists (cf. [sent-79, score-0.25]

29 Under the given assumptions on p, the D_p-shortest path between any two points x, y ∈ X_{ε0} is smooth. [sent-85, score-0.26]

30 In an unweighted graph, define the length of a path as its number of edges. [sent-86, score-0.52]

31 For two vertices x, y denote by D_sp(x, y) their shortest path distance in the graph. [sent-87, score-0.524]

32 Figure 1: Geometric argument (left: 1-dimensional case; right: 2-dimensional case). [sent-92, score-0.429]

33 The intuition to estimate the density from the directed kNN graph is the following. [sent-94, score-0.518]

34 Consider a point x in a region where the density has positive slope. [sent-95, score-0.198]

35 When the density has an increasing slope at x, there tend to be fewer sample points in [x−R, x] than in [x, x+R], so the set Right1 (x) tends to contain more sample points than the set Left1 (x). [sent-97, score-0.329]

36 This is in accordance with the intuition we mentioned above: one cannot estimate the density by just looking at a local neighborhood of x in the kNN graph. [sent-104, score-0.286]

37 To estimate the density at a particular data point Xs , we now sum the estimates p'(x)/p^2(x) over all data points x that sit between X0 and Xs . [sent-107, score-0.304]

38 This corresponds to integrating the function p'(x)/p^2(x) over the interval [X0 , Xs ] with respect to the underlying density p, which in turn corresponds to integrating the function p'(x)/p(x) over the interval [X0 , Xs ] with respect to the standard Lebesgue measure. [sent-108, score-0.297]
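
A small 1-D sanity check of this change of measure (the density p(x) = 2x is a hypothetical example; the sample average stands in for integration with respect to p):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
X = np.sqrt(rng.random(n))        # sample from p(x) = 2x on [0,1] via inverse CDF
x0, xs = 0.2, 0.8

inside = X[(X >= x0) & (X <= xs)]
# (1/n) * sum of p'(x)/p^2(x) over sample points in [x0, xs]
#   ~ int_{x0}^{xs} (p'/p^2) p dx = int_{x0}^{xs} p'/p dx = log p(xs) - log p(x0)
est = np.sum(2.0 / (2.0 * inside) ** 2) / n
print(est, np.log(2 * xs) - np.log(2 * x0))   # both ~ log 4 ~ 1.386
```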

39 Our idea is to consider an integral along a path between X0 and Xs , specifically along a path that corresponds to a shortest path in the graph Gn . [sent-114, score-1.128]

40 Our idea is to use the shortest path as reference. [sent-116, score-0.435]

41 For a point x on the shortest path between X0 and Xs , the “left points” of x should be the ones that are on or close to the subpath from X0 to x and “right points” the ones on or close to the path from x to Xs . [sent-117, score-0.668]

42 1 Gradient estimates based on link differences. Fix a point x on a simple, continuously differentiable path γ and let T (x) be its tangent vector. [sent-119, score-0.355]

43 Figure 2: Definitions of “left” and “right” in the d-dimensional case. [sent-122]

44 It is not yet the end of the story, as the quantities Left_d and Right_d cannot be evaluated based on the kNN graph alone, but it is a good starting point to develop the necessary proof concepts. [sent-125, score-0.233]

45 In this section we prove the consistency of a density estimate based on Left_d and Right_d . [sent-126, score-0.223]

46 Let γ be a differentiable, regular, simple path in Xε0 and x a sample point on this path. [sent-129, score-0.258]

47 Let T be the tangent direction of γ at x and p_T(x) the directional derivative of the density p in direction T at point x. [sent-130, score-0.313]

48 A simple geometric argument shows that if the density in a neighborhood of x is linear, then E(Right_d − Left_d) = n · (η_d r^d / 2) · r p_T(x) (n times the probability mass of the grey area in Figure 1). [sent-140, score-0.278]
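
A quick Monte-Carlo check of this expectation in one dimension, where η_1 = 2 and the prediction reads E(Right_1 − Left_1) ≈ n r^2 p'(x); the linear density below is a hypothetical example:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, x0 = 50_000, 0.03, 0.5
diffs = []
for _ in range(200):
    # rejection-sample from the linear density p(x) = 0.5 + x on [0,1], p'(x) = 1
    cand = rng.random(4 * n)
    X = cand[1.5 * rng.random(4 * n) < 0.5 + cand][:n]
    right = np.count_nonzero((X > x0) & (X <= x0 + r))
    left = np.count_nonzero((X >= x0 - r) & (X < x0))
    diffs.append(right - left)
# prediction with d = 1 (eta_1 = 2): n * (eta_1 * r / 2) * r * p'(x0) = n r^2
print(np.mean(diffs), n * r ** 2)   # both ~ 45
```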

49 A similar argument holds approximately if the density is just differentiable at x. [sent-141, score-0.263]

50 2 Integrating the gradient estimates along the shortest path. To deal with the integration part, let us recap some standard results about line integrals. [sent-146, score-0.49]

51 Proposition 2 (Line integral) Let γ : [0, 1] → R^d be a simple, continuously differentiable path from x0 = γ(0) to x1 = γ(1), parameterized by arc length. [sent-147, score-0.268]

52 For a point x = γ(t) on the path, denote by T (x) the tangent vector to γ at x, and by p_T(x) the directional derivative of p in the tangent direction T . [sent-148, score-0.202]

53 (1) Note that γ'(t) is the tangent vector T (x) of the path γ at point x = γ(t). [sent-157, score-0.293]

54 Hence, the scalar product ⟨p'(γ(t)), γ'(t)⟩ coincides with the directional derivative of p in direction T , so the right hand side of Equation (1) coincides with the left hand side of the equation in the proposition. [sent-158, score-0.27]
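
Proposition 2 is the fundamental theorem of calculus for line integrals; a quick numerical check with a hypothetical density and path:

```python
import numpy as np

# hypothetical density p(x) = exp(-|x|^2/2) (unnormalized); grad p(x) = -x p(x)
p = lambda x: np.exp(-0.5 * (x ** 2).sum(axis=-1))
grad_p = lambda x: -x * p(x)[..., None]

t = np.linspace(0.0, 1.0, 100_000)
gamma = np.stack([t, t ** 2], axis=1)               # a path from (0,0) to (1,1)
dgamma = np.gradient(gamma, t, axis=0)              # gamma'(t)
f = np.einsum('ij,ij->i', grad_p(gamma), dgamma)    # <p'(gamma(t)), gamma'(t)>
lhs = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))
print(lhs, p(gamma[-1]) - p(gamma[0]))              # both ~ exp(-1) - 1
```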

55 The goal is to approximate the integral along the continuous path γ by a sum along a path γn in the kNN graph Gn . [sent-162, score-0.693]

56 To achieve this, we need to construct a sequence of paths γn in Gn such that γn converges to some well-defined path γ in the underlying space and the lengths of γn in Gn converge to $\ell_p(\gamma)$. [sent-163, score-0.316]

57 To this end, we are going to consider paths γn which are shortest paths in the graph. [sent-164, score-0.385]

58 Adapting the proof of the convergence of shortest paths in unweighted kNN graphs (Alamgir and von Luxburg, 2012) we can derive the following statement for integrals along shortest paths. [sent-165, score-1.002]

59 Proposition 3 (Integrating a function along a shortest path) Let X and p satisfy the assumptions in Section 2. [sent-166, score-0.262]

60 Fix two sample points in Xε0 , say X0 and Xs , and let γn be a shortest path between X0 and Xs in the kNN graph Gn . [sent-167, score-0.667]

61 $(k/(n\eta_d))^{1/d} \sum_{x \in \gamma_n} g(x) \longrightarrow \int_\gamma g(x)\, p^{1/d}(x)\, ds$. Note that if g(x) p^{1/d}(x) can be written in the form ⟨F(γ(t)), γ'(t)⟩, then the same statement even holds if the shortest D_p-path is not unique, because the path integral then only depends on start and end point. [sent-173, score-0.497]

62 3 Combining everything to obtain a density estimate. Theorem 4 (Density estimate): Let X and p satisfy the assumptions in Section 2, and let X0 ∈ X_{ε0} be any fixed sample point. [sent-176, score-0.248]

63 For another sample point Xs , let γn be a shortest path between X0 and Xs in the kNN graph Gn . [sent-177, score-0.64]

64 Assume that there exists a path γ that realizes D_p(x, y) and that is completely contained in X_{ε0} . [sent-178, score-0.25]

65 According to Proposition 3, the latter can be approximated by $(k/(n\eta_d))^{1/d} \sum_{x \in \gamma_n} p_T(x)/p(x)^{(d+1)/d}$, where γn is a shortest path between X0 and Xs in the kNN graph. [sent-183, score-0.435]

66 The final d-dimensional density estimate. In this section, we finally introduce an estimate that solely uses quantities available from the kNN graph. [sent-185, score-0.302]

67 Let x be a vertex on a shortest path γn,k in the kNN graph Gn . [sent-186, score-0.619]

68 Let x_l and x_r be the predecessor and successor vertices of x on this path (in particular, x_l and x_r are sample points as well). [sent-187, score-0.774]

69 But the intuition is that whenever we find two sets on the left and right side of x that have approximately the same volume, then the difference Leftγn,k − Rightγn,k should be a function of pT (x). [sent-191, score-0.181]

70 Another insight is that the set Leftγn,k (x) counts the number of directed paths of length 2 from x to xl , and Rightγn,k (x) analogously. [sent-194, score-0.283]

71 We conjecture that the difference Rightγn,k − Leftγn,k can be used as before to construct a density estimate. [sent-195, score-0.171]

72 Specifically, if γn,k is a shortest path from the anchor point X0 to Xs , we believe that under similar conditions on k and n as before, $\frac{\eta_d}{k\nu_d} \sum_{x \in \gamma_{n,k}} \big(\mathrm{Right}_{\gamma_{n,k}}(x) - \mathrm{Left}_{\gamma_{n,k}}(x)\big)$ (∗) is a consistent estimator of the quantity log p(Xs ) − log p(X0 ). [sent-196, score-0.584]
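
A runnable sketch of the conjectured estimator (∗), assuming networkx and scipy. Left and Right are computed as the directed 2-path counts described above; the value used for ν_d (the area of the intersection of two unit disks at unit distance) and the use of the undirected view for shortest paths are our assumptions, not details fixed by the text:

```python
import numpy as np
import networkx as nx
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
n, k, d = 2000, 50, 2
X = rng.normal(size=(n, d))                       # hypothetical sample

idx = cKDTree(X).query(X, k=k + 1)[1][:, 1:]      # directed kNN out-neighbors
out = [set(row) for row in idx]
G = nx.DiGraph((i, j) for i in range(n) for j in idx[i])

eta_d = np.pi                            # area of the unit disk (d = 2)
nu_d = 2 * np.pi / 3 - np.sqrt(3) / 2    # area of the intersection of two unit
                                         # disks at unit distance (an assumption)

def log_density_ratio(s, anchor=0):
    """Evaluate the conjectured estimator (*) for log p(X_s) - log p(X_anchor).
    Left/Right of an interior path vertex x are the counts of directed 2-paths
    x -> y -> x_l and x -> y -> x_r, as described in the text."""
    path = nx.shortest_path(G.to_undirected(as_view=True), anchor, s)
    total = 0
    for xl, x, xr in zip(path, path[1:], path[2:]):
        total += sum(1 for y in out[x] if xr in out[y])   # Right(x)
        total -= sum(1 for y in out[x] if xl in out[y])   # Left(x)
    return eta_d / (k * nu_d) * total

s = int(np.argmax(np.linalg.norm(X, axis=1)))     # a low-density sample point
print(log_density_ratio(s))                       # should come out negative
```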

73 The second difficulty is related to the shortest path in the graph. [sent-202, score-0.435]

74 While it is clear that “most edges” in this path have approximately the maximal length (that is, (k/(nη_d p(x)))^{1/d} for an edge in the neighborhood of x), this is not true for all edges. [sent-203, score-0.258]

75 Intuitively it is clear that the contribution of the few violating edges will be washed out in the integral along the shortest path, but we don’t have a formal proof yet. [sent-204, score-0.35]

76 Consider a D_p-shortest path γ ⊂ R^d and a point x on this path with out-radius r_out(x). [sent-206, score-0.554]

77 Define the points x_l and x_r as the two points where the path γ enters resp. [sent-207, score-0.529]

78 leaves the ball B(x, r_out(x)), and define the sets L_{n,k} := Out(x) ∩ B(x_l, r_out(x)) and R_{n,k} := Out(x) ∩ B(x_r, r_out(x)). [sent-208, score-0.395]

79 It circumvents the problems mentioned above by using well-defined balls instead of In-sets and the continuous path γ rather than the finite-sample shortest path γn , but the quantities cannot be estimated from the kNN graph alone. [sent-211, score-0.931]

80 We draw n = 2000 points according to a couple of simple densities on R, R^2 and R^10 , then we build the directed, unweighted kNN graph with k = 50. [sent-213, score-0.521]

81 We fix a random point as anchor point X0 , compute the quantities Rightγn,k and Leftγn,k for all sample points, and then sum the differences Rightγn,k − Leftγn,k along shortest paths to X0 . [sent-214, score-0.494]

82 7 Extensions. We have seen how to estimate the density in an unweighted, directed kNN graph. [sent-223, score-0.326]

83 The current density estimate requires that we know the dimension d of the underlying space because we need to be able to compute the constants η_d (the volume of the unit ball) and ν_d (the volume of the intersection of two unit balls). [sent-227, score-0.294]

84 The dimension can be estimated from the directed, unweighted kNN graph as follows. [sent-228, score-0.494]

85 Denote by r the distance of x to its k-th nearest neighbor, and by K the number of vertices that can be reached from x by a directed shortest path of length 2. [sent-229, score-0.627]

86 If n is large enough and k small enough, the density on these balls is approximately constant, which implies K/k ≈ 2^d where d is the dimension of the underlying space. [sent-231, score-0.303]
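
A sketch of this dimension estimate, taking K to be the number of vertices within two directed hops of x, which is one plausible reading of the heuristic:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
n, k, d_true = 20_000, 100, 3
X = rng.random((n, d_true))                       # hypothetical sample

idx = cKDTree(X).query(X, k=k + 1)[1][:, 1:]      # directed kNN out-neighbors
out = [set(row) for row in idx]

def estimate_dim(x):
    reach = set(out[x])
    for y in out[x]:
        reach |= out[y]                           # vertices within 2 directed hops
    reach.discard(x)
    return np.log2(len(reach) / k)                # K/k ~ 2^d  =>  d ~ log2(K/k)

# median over a few points; boundary effects bias the estimate down somewhat
print(np.median([estimate_dim(i) for i in range(500)]))   # ~ d_true = 3
```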

87 The current estimate is based on the directed kNN graph, but many applications use undirected kNN graphs. [sent-233, score-0.188]

88 However, it is possible to recover the directed, unweighted kNN graph from the undirected, unweighted graph. [sent-234, score-0.781]

89 To estimate the set Out(x) in order to recover the directed kNN graph, we rank all points y ∈ N (x) according to |N (x) ∩ N (y)| and choose Out(x) as the first k vertices in this ranking. [sent-238, score-0.268]
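
A minimal sketch of this recovery rule (the 5-vertex undirected graph is a hypothetical toy example):

```python
def recover_out_sets(N, k):
    """Estimate the directed out-neighborhood Out(x) from an undirected,
    unweighted kNN graph: rank every y in N(x) by the overlap |N(x) & N(y)|
    and keep the top k, following the rule described in the text."""
    out = {}
    for x, nbrs in N.items():
        ranked = sorted(nbrs, key=lambda y: len(N[x] & N[y]), reverse=True)
        out[x] = set(ranked[:k])
    return out

# toy usage on a hypothetical 5-vertex undirected graph
N = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2, 4}, 4: {3}}
print(recover_out_sets(N, k=2))
```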

90 In this paper we focus on estimating the density from the unweighted kNN graph. [sent-240, score-0.485]

91 Another interesting problem is to recover an embedding of the vertices to Rd such that the kNN graph based on the embedded vertices corresponds to the given kNN graph. [sent-241, score-0.316]

92 However, we are not aware of any approach in the literature that can faithfully embed unweighted kNN graphs and comes with performance guarantees. [sent-248, score-0.386]

93 Based on our density estimate, such an embedding can now easily be constructed. [sent-249, score-0.216]

94 Given the unweighted kNN graph, we assign edge weights w(Xi , Xj ) = (p̂^{−1/d}(Xi) + p̂^{−1/d}(Xj))/2, where p̂ is the estimate of the underlying density. [sent-250, score-0.412]

95 Then the shortest paths in this weighted kNN graph converge to the Euclidean distances in the underlying space, and standard metric multidimensional scaling can be used to construct an appropriate embedding. [sent-251, score-0.554]
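
A sketch of the embedding construction, assuming scipy and scikit-learn; p̂ would come from the density estimate above, and is replaced here by the true Gaussian density for illustration:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import MDS

rng = np.random.default_rng(4)
n, k, d = 500, 20, 2
X = rng.normal(size=(n, d))
p_hat = np.exp(-0.5 * (X ** 2).sum(axis=1)) / (2 * np.pi)  # stand-in for p-hat

idx = cKDTree(X).query(X, k=k + 1)[1][:, 1:]
rows, cols = np.repeat(np.arange(n), k), idx.ravel()
w = (p_hat[rows] ** (-1 / d) + p_hat[cols] ** (-1 / d)) / 2  # w(X_i, X_j)
W = csr_matrix((w, (rows, cols)), shape=(n, n))

D = shortest_path(W, directed=False)   # ~ Euclidean distances up to a global scale
Y = MDS(n_components=d, dissimilarity='precomputed').fit_transform(D)
print(Y.shape)                         # (500, 2): the recovered embedding
```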

96 The figures plot the first dimension of the data points versus the true (black) and estimated (green) density values. [sent-294, score-0.252]

97 The right figures plot the first coordinate of the data points against the true (black) and estimated (green) density values. [sent-298, score-0.29]

98 8 Conclusions. In this paper we show how a density can be estimated from the adjacency matrix of an unweighted, directed kNN graph, provided the graph is dense enough (k^{d+2}/(n^2 log^d n) → ∞). [sent-300, score-0.568]

99 In this case, the information about the underlying density is implicitly contained in unweighted kNN graphs, and, at least in principle, accessible by machine learning algorithms. [sent-301, score-0.531]

100 For such sparse graphs, our density estimate fails because it is dominated by sampling noise that does not disappear as n → ∞. [sent-303, score-0.223]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('knn', 0.563), ('unweighted', 0.314), ('leftd', 0.268), ('rightd', 0.268), ('shortest', 0.229), ('path', 0.206), ('xs', 0.179), ('density', 0.171), ('graph', 0.153), ('xl', 0.116), ('rout', 0.115), ('directed', 0.103), ('xr', 0.099), ('dim', 0.097), ('gn', 0.096), ('pt', 0.085), ('ordinal', 0.083), ('logd', 0.076), ('graphs', 0.072), ('dp', 0.072), ('alamgir', 0.067), ('shaw', 0.067), ('paths', 0.064), ('anchor', 0.062), ('en', 0.062), ('integral', 0.062), ('tangent', 0.06), ('vertices', 0.059), ('balls', 0.058), ('burago', 0.057), ('rd', 0.055), ('points', 0.054), ('estimate', 0.052), ('doiu', 0.051), ('ball', 0.05), ('ds', 0.05), ('xn', 0.048), ('underlying', 0.046), ('embedding', 0.045), ('left', 0.045), ('realizes', 0.044), ('integrating', 0.04), ('neighbor', 0.04), ('intuition', 0.039), ('bilu', 0.038), ('dsp', 0.038), ('gutin', 0.038), ('hajiaghayi', 0.038), ('jamieson', 0.038), ('adjacency', 0.038), ('multidimensional', 0.038), ('right', 0.038), ('differentiable', 0.036), ('rescaling', 0.036), ('luxburg', 0.036), ('similarity', 0.036), ('coincides', 0.035), ('von', 0.035), ('directional', 0.034), ('demaine', 0.034), ('schultz', 0.034), ('mcfee', 0.034), ('borg', 0.034), ('undirected', 0.033), ('along', 0.033), ('vertex', 0.031), ('side', 0.031), ('pmax', 0.031), ('distance', 0.03), ('log', 0.03), ('pmin', 0.029), ('ouyang', 0.029), ('approximately', 0.028), ('going', 0.028), ('argument', 0.028), ('estimated', 0.027), ('proposition', 0.027), ('embeddings', 0.027), ('primitive', 0.027), ('hoeffding', 0.027), ('alon', 0.027), ('quantities', 0.027), ('point', 0.027), ('proof', 0.026), ('continuously', 0.026), ('intersection', 0.026), ('sample', 0.025), ('lanckriet', 0.025), ('vd', 0.025), ('neighborhood', 0.024), ('metric', 0.024), ('xj', 0.024), ('recti', 0.023), ('gures', 0.022), ('gradient', 0.022), ('derivative', 0.021), ('raises', 0.021), ('vn', 0.021), ('impossible', 0.02), ('degrees', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap

Author: Ulrike Von Luxburg, Morteza Alamgir

Abstract: Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. from some unknown density p on R^d. We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. The key insights are that local differences in link numbers can be used to estimate a local function of the gradient of p, and that integrating this function along shortest paths leads to an estimate of the underlying density.

2 0.1568778 289 nips-2013-Scalable kernels for graphs with continuous attributes

Author: Aasa Feragen, Niklas Kasenburg, Jens Petersen, Marleen de Bruijne, Karsten Borgwardt

Abstract: While graphs with continuous node attributes arise in many applications, state-of-the-art graph kernels for comparing continuous-attributed graphs suffer from a high runtime complexity. For instance, the popular shortest path kernel scales as O(n^4), where n is the number of nodes. In this paper, we present a class of graph kernels with computational complexity O(n^2 (m + log n + δ^2 + d)), where δ is the graph diameter, m is the number of edges, and d is the dimension of the node attributes. Due to the sparsity and small diameter of real-world graphs, these kernels typically scale comfortably to large graphs. In our experiments, the presented kernels outperform state-of-the-art kernels in terms of speed and accuracy on classification benchmark datasets.

3 0.15023315 227 nips-2013-Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions

Author: Yasin Abbasi, Peter Bartlett, Varun Kanade, Yevgeny Seldin, Csaba Szepesvari

Abstract: We study the problem of online learning Markov Decision Processes (MDPs) when both the transition distributions and loss functions are chosen by an adversary. We present an algorithm that, under a mixing assumption, achieves O(√(T log |Π|) + log |Π|) regret with respect to a comparison set of policies Π. The regret is independent of the size of the state and action spaces. When expectations over sample paths can be computed efficiently and the comparison set Π has polynomial size, this algorithm is efficient. We also consider the episodic adversarial online shortest path problem. Here, in each episode an adversary may choose a weighted directed acyclic graph with an identified start and finish node. The goal of the learning algorithm is to choose a path that minimizes the loss while traversing from the start to finish node. At the end of each episode the loss function (given by weights on the edges) is revealed to the learning algorithm. The goal is to minimize regret with respect to a fixed policy for selecting paths. This problem is a special case of the online MDP problem. It was shown that for randomly chosen graphs and adversarial losses, the problem can be efficiently solved. We show that it also can be efficiently solved for adversarial graphs and randomly chosen losses. When both graphs and losses are adversarially chosen, we show that designing efficient algorithms for the adversarial online shortest path problem (and hence for the adversarial MDP problem) is as hard as learning parity with noise, a notoriously difficult problem that has been used to design efficient cryptographic schemes. Finally, we present an efficient algorithm whose regret scales linearly with the number of distinct graphs.

4 0.11459081 269 nips-2013-Regression-tree Tuning in a Streaming Setting

Author: Samory Kpotufe, Francesco Orabona

Abstract: We consider the problem of maintaining the data-structures of a partition-based regression procedure in a setting where the training data arrives sequentially over time. We prove that it is possible to maintain such a structure in time O(log n) at any time step n while achieving a nearly-optimal regression rate of Õ(n^{−2/(2+d)}) in terms of the unknown metric dimension d. Finally we prove a new regression lower-bound which is independent of a given data size, and hence is more appropriate for the streaming setting.

5 0.10040281 282 nips-2013-Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching

Author: Marcelo Fiori, Pablo Sprechmann, Joshua Vogelstein, Pablo Muse, Guillermo Sapiro

Abstract: Graph matching is a challenging problem with very important applications in a wide range of fields, from image and video analysis to biological and biomedical problems. We propose a robust graph matching algorithm inspired by sparsity-related techniques. We cast the problem, resembling group or collaborative sparsity formulations, as a non-smooth convex optimization problem that can be efficiently solved using augmented Lagrangian techniques. The method can deal with weighted or unweighted graphs, as well as multimodal data, where different graphs represent different types of data. The proposed approach is also naturally integrated with collaborative graph inference techniques, solving general network inference problems where the observed variables, possibly coming from different modalities, are not in correspondence. The algorithm is tested and compared with state-of-the-art graph matching techniques in both synthetic and real graphs. We also present results on multimodal graphs and applications to collaborative inference of brain connectivity from alignment-free functional magnetic resonance imaging (fMRI) data. The code is publicly available.

6 0.099220037 63 nips-2013-Cluster Trees on Manifolds

7 0.08894904 207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

8 0.085651442 18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components

9 0.082889207 125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence

10 0.080206759 217 nips-2013-On Poisson Graphical Models

11 0.07387539 108 nips-2013-Error-Minimizing Estimates and Universal Entry-Wise Error Bounds for Low-Rank Matrix Completion

12 0.068256684 291 nips-2013-Sensor Selection in High-Dimensional Gaussian Trees with Nuisances

13 0.06350892 144 nips-2013-Inverse Density as an Inverse Problem: the Fredholm Equation Approach

14 0.057782847 281 nips-2013-Robust Low Rank Kernel Embeddings of Multivariate Distributions

15 0.057622217 135 nips-2013-Heterogeneous-Neighborhood-based Multi-Task Local Learning Algorithms

16 0.057059366 36 nips-2013-Annealing between distributions by averaging moments

17 0.055505902 35 nips-2013-Analyzing the Harmonic Structure in Graph-Based Learning

18 0.054917302 288 nips-2013-Scalable Influence Estimation in Continuous-Time Diffusion Networks

19 0.053957291 161 nips-2013-Learning Stochastic Inverses

20 0.053383701 4 nips-2013-A Comparative Framework for Preconditioned Lasso Algorithms


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.144), (1, 0.021), (2, 0.044), (3, -0.004), (4, 0.041), (5, 0.087), (6, 0.053), (7, -0.059), (8, -0.026), (9, 0.02), (10, 0.018), (11, -0.108), (12, 0.097), (13, -0.032), (14, 0.114), (15, 0.005), (16, 0.037), (17, -0.042), (18, -0.097), (19, 0.05), (20, 0.031), (21, 0.021), (22, 0.009), (23, 0.008), (24, -0.075), (25, -0.035), (26, 0.008), (27, 0.073), (28, -0.08), (29, 0.107), (30, -0.094), (31, -0.017), (32, -0.035), (33, 0.078), (34, -0.052), (35, 0.022), (36, 0.054), (37, -0.047), (38, -0.219), (39, 0.032), (40, 0.011), (41, 0.011), (42, 0.025), (43, -0.04), (44, -0.003), (45, -0.129), (46, -0.024), (47, -0.029), (48, 0.06), (49, -0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95450675 87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap

Author: Ulrike Von Luxburg, Morteza Alamgir

Abstract: Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. from some unknown density p on R^d. We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. The key insights are that local differences in link numbers can be used to estimate a local function of the gradient of p, and that integrating this function along shortest paths leads to an estimate of the underlying density.

2 0.64187515 207 nips-2013-Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

Author: James L. Sharpnack, Akshay Krishnamurthy, Aarti Singh

Abstract: The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks. Beyond its wide applicability, graph structured anomaly detection serves as a case study in the difficulty of balancing computational complexity with statistical power. In this work, we develop from first principles the generalized likelihood ratio test for determining if there is a well connected region of activation over the vertices in the graph in Gaussian noise. Because this test is computationally infeasible, we provide a relaxation, called the Lovász extended scan statistic (LESS), that uses submodularity to approximate the intractable generalized likelihood ratio. We demonstrate a connection between LESS and maximum a-posteriori inference in Markov random fields, which provides us with a poly-time algorithm for LESS. Using electrical network theory, we are able to control type 1 error for LESS and prove conditions under which LESS is risk consistent. Finally, we consider specific graph models: the torus, k-nearest neighbor graphs, and ǫ-random graphs. We show that on these graphs our results provide near-optimal performance by matching our results to known lower bounds.

3 0.604738 35 nips-2013-Analyzing the Harmonic Structure in Graph-Based Learning

Author: Xiao-Ming Wu, Zhenguo Li, Shih-Fu Chang

Abstract: We find that various well-known graph-based models exhibit a common important harmonic structure in their target functions – the value of a vertex is approximately the weighted average of the values of its adjacent neighbors. Understanding of such structure and analysis of the loss defined over such structure help reveal important properties of the target function over a graph. In this paper, we show that the variation of the target function across a cut can be upper and lower bounded by the ratio of its harmonic loss and the cut cost. We use this to develop an analytical tool and analyze five popular graph-based models: absorbing random walks, partially absorbing random walks, hitting times, pseudo-inverse of the graph Laplacian, and eigenvectors of the Laplacian matrices. Our analysis sheds new insights into several open questions related to these models, and provides theoretical justifications and guidelines for their practical use. Simulations on synthetic and real datasets confirm the potential of the proposed theory and tool.

4 0.60002327 182 nips-2013-Manifold-based Similarity Adaptation for Label Propagation

Author: Masayuki Karasuyama, Hiroshi Mamitsuka

Abstract: Label propagation is one of the state-of-the-art methods for semi-supervised learning, which estimates labels by propagating label information through a graph. Label propagation assumes that data points (nodes) connected in a graph should have similar labels. Consequently, the label estimation heavily depends on edge weights in a graph which represent similarity of each node pair. We propose a method for a graph to capture the manifold structure of input features using edge weights parameterized by a similarity function. In this approach, edge weights represent both similarity and local reconstruction weight simultaneously, both being reasonable for label propagation. For further justification, we provide analytical considerations including an interpretation as a cross-validation of a propagation model in the feature space, and an error analysis based on a low dimensional manifold model. Experimental results demonstrated the effectiveness of our approach both in synthetic and real datasets. 1

5 0.57690626 289 nips-2013-Scalable kernels for graphs with continuous attributes

Author: Aasa Feragen, Niklas Kasenburg, Jens Petersen, Marleen de Bruijne, Karsten Borgwardt

Abstract: While graphs with continuous node attributes arise in many applications, state-of-the-art graph kernels for comparing continuous-attributed graphs suffer from a high runtime complexity. For instance, the popular shortest path kernel scales as O(n^4), where n is the number of nodes. In this paper, we present a class of graph kernels with computational complexity O(n^2 (m + log n + δ^2 + d)), where δ is the graph diameter, m is the number of edges, and d is the dimension of the node attributes. Due to the sparsity and small diameter of real-world graphs, these kernels typically scale comfortably to large graphs. In our experiments, the presented kernels outperform state-of-the-art kernels in terms of speed and accuracy on classification benchmark datasets.

6 0.55452216 328 nips-2013-The Total Variation on Hypergraphs - Learning on Hypergraphs Revisited

7 0.53841734 25 nips-2013-Adaptive Anonymity via $b$-Matching

8 0.51287431 217 nips-2013-On Poisson Graphical Models

9 0.50727308 282 nips-2013-Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching

10 0.49333784 357 nips-2013-k-Prototype Learning for 3D Rigid Structures

11 0.49245664 332 nips-2013-Tracking Time-varying Graphical Structure

12 0.48137838 151 nips-2013-Learning Chordal Markov Networks by Constraint Satisfaction

13 0.47720218 125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence

14 0.45561522 329 nips-2013-Third-Order Edge Statistics: Contour Continuation, Curvature, and Cortical Connections

15 0.44439888 80 nips-2013-Data-driven Distributionally Robust Polynomial Optimization

16 0.44373828 132 nips-2013-Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation

17 0.44177425 18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components

18 0.43782219 358 nips-2013-q-OCSVM: A q-Quantile Estimator for High-Dimensional Distributions

19 0.4340612 8 nips-2013-A Graphical Transformation for Belief Propagation: Maximum Weight Matchings and Odd-Sized Cycles

20 0.42740893 227 nips-2013-Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.014), (16, 0.026), (33, 0.136), (34, 0.088), (41, 0.023), (49, 0.033), (56, 0.144), (70, 0.036), (85, 0.075), (89, 0.025), (93, 0.036), (95, 0.014), (96, 0.255), (99, 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.80938423 87 nips-2013-Density estimation from unweighted k-nearest neighbor graphs: a roadmap

Author: Ulrike Von Luxburg, Morteza Alamgir

Abstract: Consider an unweighted k-nearest neighbor graph on n points that have been sampled i.i.d. from some unknown density p on R^d. We prove how one can estimate the density p just from the unweighted adjacency matrix of the graph, without knowing the points themselves or any distance or similarity scores. The key insights are that local differences in link numbers can be used to estimate a local function of the gradient of p, and that integrating this function along shortest paths leads to an estimate of the underlying density.

2 0.80186993 208 nips-2013-Neural representation of action sequences: how far can a simple snippet-matching model take us?

Author: Cheston Tan, Jedediah M. Singer, Thomas Serre, David Sheinberg, Tomaso Poggio

Abstract: The macaque Superior Temporal Sulcus (STS) is a brain area that receives and integrates inputs from both the ventral and dorsal visual processing streams (thought to specialize in form and motion processing respectively). For the processing of articulated actions, prior work has shown that even a small population of STS neurons contains sufficient information for the decoding of actor invariant to action, action invariant to actor, as well as the specific conjunction of actor and action. This paper addresses two questions. First, what are the invariance properties of individual neural representations (rather than the population representation) in STS? Second, what are the neural encoding mechanisms that can produce such individual neural representations from streams of pixel images? We find that a simple model, one that simply computes a linear weighted sum of ventral and dorsal responses to short action “snippets”, produces surprisingly good fits to the neural data. Interestingly, even using inputs from a single stream, both actor-invariance and action-invariance can be accounted for, by having different linear weights. 1

3 0.74936634 52 nips-2013-Bayesian inference as iterated random functions with applications to sequential inference in graphical models

Author: Arash Amini, Xuanlong Nguyen

Abstract: We propose a general formalism of iterated random functions with semigroup property, under which exact and approximate Bayesian posterior updates can be viewed as specific instances. A convergence theory for iterated random functions is presented. As an application of the general theory we analyze convergence behaviors of exact and approximate message-passing algorithms that arise in a sequential change point detection problem formulated via a latent variable directed graphical model. The sequential inference algorithm and its supporting theory are illustrated by simulated examples.

4 0.70869201 311 nips-2013-Stochastic Convex Optimization with Multiple Objectives

Author: Mehrdad Mahdavi, Tianbao Yang, Rong Jin

Abstract: In this paper, we are interested in the development of efficient algorithms for convex optimization problems in the simultaneous presence of multiple objectives and stochasticity in the first-order information. We cast the stochastic multiple objective optimization problem into a constrained optimization problem by choosing one function as the objective and trying to bound the other objectives by appropriate thresholds. We first examine a two-stage exploration-exploitation based algorithm which first approximates the stochastic objectives by sampling and then solves a constrained stochastic optimization problem by the projected gradient method. This method attains a suboptimal convergence rate even under strong assumptions on the objectives. Our second approach is an efficient primal-dual stochastic algorithm. It leverages the theory of the Lagrangian method in constrained optimization and attains the optimal convergence rate of O(1/√T) in high probability for general Lipschitz continuous objectives.

5 0.66949165 282 nips-2013-Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching

Author: Marcelo Fiori, Pablo Sprechmann, Joshua Vogelstein, Pablo Muse, Guillermo Sapiro

Abstract: Graph matching is a challenging problem with very important applications in a wide range of fields, from image and video analysis to biological and biomedical problems. We propose a robust graph matching algorithm inspired by sparsity-related techniques. We cast the problem, resembling group or collaborative sparsity formulations, as a non-smooth convex optimization problem that can be efficiently solved using augmented Lagrangian techniques. The method can deal with weighted or unweighted graphs, as well as multimodal data, where different graphs represent different types of data. The proposed approach is also naturally integrated with collaborative graph inference techniques, solving general network inference problems where the observed variables, possibly coming from different modalities, are not in correspondence. The algorithm is tested and compared with state-of-the-art graph matching techniques in both synthetic and real graphs. We also present results on multimodal graphs and applications to collaborative inference of brain connectivity from alignment-free functional magnetic resonance imaging (fMRI) data. The code is publicly available.

6 0.66945457 125 nips-2013-From Bandits to Experts: A Tale of Domination and Independence

7 0.66926038 206 nips-2013-Near-Optimal Entrywise Sampling for Data Matrices

8 0.66803157 228 nips-2013-Online Learning of Dynamic Parameters in Social Networks

9 0.66670144 232 nips-2013-Online PCA for Contaminated Data

10 0.66625828 142 nips-2013-Information-theoretic lower bounds for distributed statistical estimation with communication constraints

11 0.66499805 249 nips-2013-Polar Operators for Structured Sparse Estimation

12 0.66413927 355 nips-2013-Which Space Partitioning Tree to Use for Search?

13 0.66285563 233 nips-2013-Online Robust PCA via Stochastic Optimization

14 0.66243315 224 nips-2013-On the Sample Complexity of Subspace Learning

15 0.66214925 79 nips-2013-DESPOT: Online POMDP Planning with Regularization

16 0.66185743 344 nips-2013-Using multiple samples to learn mixture models

17 0.66116059 285 nips-2013-Robust Transfer Principal Component Analysis with Rank Constraints

18 0.66114831 332 nips-2013-Tracking Time-varying Graphical Structure

19 0.66105795 292 nips-2013-Sequential Transfer in Multi-armed Bandit with Finite Set of Models

20 0.66000181 89 nips-2013-Dimension-Free Exponentiated Gradient