nips2006-87 knowledge-graph by maker-knowledge-mining

87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming


Source: pdf

Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul

Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. [sent-11, score-0.293]

2 While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. [sent-14, score-0.778]

3 In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. [sent-15, score-0.329]

4 The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. [sent-16, score-0.62]

5 The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. [sent-17, score-0.32]

6 We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. [sent-19, score-0.891]

7 1 Introduction In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. [sent-20, score-0.293]

8 Such data arises in many applications, including manifold learning [12], robot navigation [3], protein clustering [6], and sensor localization [1]. [sent-22, score-0.65]

9 In all these applications, the challenge is to compute low dimensional representations that are consistent with observed measurements of local proximity. [sent-23, score-0.277]

10 For example, in robot path mapping, the robot’s locations must be inferred from the high dimensional description of its state in terms of sensorimotor input. [sent-24, score-0.323]

11 Likewise, in sensor networks, the locations of individual nodes must be inferred from the estimated distances between nearby sensors. [sent-26, score-0.708]

12 Again, the challenge is to find a planar representation of the sensors that preserves local distances. [sent-27, score-0.282]

13 In general, it is possible to formulate these problems as simple optimizations over the low dimensional representations xi of individual instances (e.g., sensor locations). [sent-28, score-0.421]

14 The most straightforward formulations, however, lead to non-convex optimizations that are plagued by local minima. [sent-31, score-0.217]

15 Convexity is obtained by recasting the problems as optimizations over the inner product matrices Xij = xi · xj . [sent-34, score-0.38]
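
As an aside for readers implementing this reformulation (a minimal numpy sketch, not code from the paper): squared distances are linear in the entries of the inner product matrix, since ‖xi − xj‖² = Xii − 2Xij + Xjj, which is what makes the local distance constraints convex in X.

    import numpy as np

    rng = np.random.default_rng(0)
    P = rng.normal(size=(5, 2))          # five points in the plane
    X = P @ P.T                          # inner product (Gram) matrix, Xij = xi . xj

    i, j = 1, 3
    dist_sq_direct = np.sum((P[i] - P[j]) ** 2)
    dist_sq_from_X = X[i, i] - 2 * X[i, j] + X[j, j]   # linear in the entries of X
    assert np.isclose(dist_sq_direct, dist_sq_from_X)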

16 First, only low rank solutions for the inner product matrices X yield low dimensional representations for the vectors xi . [sent-37, score-0.686]

17 This work has shown that while the rank of solutions from SDPs cannot be directly constrained, low rank solutions often emerge naturally by computing maximal trace solutions that respect local distance constraints. [sent-42, score-0.717]

18 Maximizing the trace of the inner product matrix X has the effect of maximizing the variance of the low dimensional representation {xi }. [sent-43, score-0.452]

19 We show how to solve such problems by approximately factorizing the large n×n matrix X as X ≈ QYQ⊤, where Q is a pre-computed n×m rectangular matrix with m ≪ n. [sent-47, score-0.231]

20 The factorization leaves only the much smaller m × m matrix Y to be optimized with respect to local distance constraints. [sent-48, score-0.392]

21 With this factorization, and by collecting constraints using the Schur complement lemma, we show how to rewrite the original optimization over the large matrix X as a simple SDP involving the smaller matrix Y. [sent-49, score-0.326]
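
The Schur complement lemma invoked here is the standard textbook fact: for a symmetric block matrix M = [[A, B], [B⊤, C]] with A positive definite, M is positive semidefinite exactly when the Schur complement C − B⊤A⁻¹B is. A small numerical sanity check of that fact (illustrative only, not the paper's construction):

    import numpy as np

    rng = np.random.default_rng(1)
    A = np.eye(3)                                        # positive definite block
    B = rng.normal(size=(3, 2))
    C = B.T @ np.linalg.inv(A) @ B + 0.1 * np.eye(2)     # makes the Schur complement PSD

    M = np.block([[A, B], [B.T, C]])
    schur = C - B.T @ np.linalg.inv(A) @ B               # here equal to 0.1 * I

    # both the block matrix and its Schur complement have nonnegative eigenvalues
    assert np.all(np.linalg.eigvalsh(M) >= -1e-9)
    assert np.all(np.linalg.eigvalsh(schur) >= -1e-9)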

22 Moreover, if desirable, this solution can be further refined [1] by (non-convex) conjugate gradient descent in the vectors {xi }. [sent-51, score-0.236]

23 The main contribution of this paper is the matrix factorization that makes it possible to solve large problems in MVU. [sent-52, score-0.299]

24 Either implicitly or explicitly, all problems of this sort specify a graph whose nodes represent the vectors {xi } and whose edges represent local distance constraints. [sent-54, score-0.479]

25 The matrix factorization is obtained by expanding the low dimensional representation of these nodes (e.g., sensor locations). [sent-55, score-0.532]

26 The expansion is in terms of the m ≪ n bottom (smoothest) eigenvectors of the graph Laplacian. [sent-57, score-0.655]

27 Due to the local distance constraints, one expects the low dimensional representation of these nodes to vary smoothly as one traverses edges in the graph. [sent-58, score-0.463]

28 The presumption of smoothness justifies the partial orthogonal expansion in terms of the bottom eigenvectors of the graph Laplacian [5]. [sent-59, score-0.323]

29 The approach in this paper applies generally to any setting in which low dimensional representations are derived from an SDP that maximizes variance subject to local distance constraints. [sent-62, score-0.458]

30 For concreteness, we illustrate the approach on the problem of localization in large scale sensor networks, as recently described by [1]. [sent-63, score-0.482]

31 Here, we are able to solve optimizations involving tens of thousands of nodes in just a few minutes. [sent-64, score-0.424]

32 Section 2 reviews the problem of localization in large scale sensor networks and its formulation by [1] as an SDP that maximizes variance subject to local distance constraints. [sent-67, score-0.754]

33 Section 3 shows how we solve large problems of this form—by approximating the inner product matrix of sensor locations as the product of smaller matrices, by solving the smaller SDP that results from this approximation, and by refining the solution from this smaller SDP using local search. [sent-68, score-0.993]

34 2 Sensor localization via maximum variance unfolding. The problem of sensor localization is best illustrated by example; see Fig. 1. [sent-71, score-0.84]

35 Imagine that sensors are located in major cities throughout the continental US, and that nearby sensors can estimate their distances to one another (e.g., via limited-range radio transmission). [sent-73, score-0.534]

36 From only this local information, the problem of sensor localization is to compute the individual sensor locations and to identify the whole network topology. [sent-76, score-1.056]

37 In purely mathematical terms, the problem can be viewed as computing a low rank embedding in two or three dimensional Euclidean space subject to local distance constraints. [sent-77, score-0.447]

38 We assume there are n sensors distributed in the plane and formulate the problem as an optimization over their planar coordinates. (Figure 1 caption: Sensors distributed over US cities.) [sent-78, score-0.296]

39 (Sensor localization in three dimensional space can be solved in a similar way.) (Figure 1 caption, continued: distances are estimated between nearby cities within a fixed radius.) [sent-83, score-0.419]

40 We define a neighbor relation i ∼ j if the ith and jth sensors are sufficiently close to estimate their pairwise distance via limited-range radio transmission. [sent-85, score-0.242]

41 From such (noisy) estimates of local pairwise distances {dij }, the problem of sensor localization is to infer the planar coordinates {xi }. [sent-86, score-0.664]
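
To make this setup concrete, here is a small sketch (not the authors' code; the network size, radius, and noise level are placeholder values) that builds the neighbor relation from points in the plane and simulates noisy local distance estimates:

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    n, radius, noise = 200, 0.12, 0.10            # placeholder values
    P_true = rng.uniform(size=(n, 2))             # ground-truth sensor locations

    # neighbor relation i ~ j: sensors within the given radius of each other
    tree = cKDTree(P_true)
    edges = sorted(tree.query_pairs(radius))

    # noisy local distance estimates d_ij (10% multiplicative Gaussian noise)
    d = {(i, j): np.linalg.norm(P_true[i] - P_true[j]) * (1.0 + noise * rng.normal())
         for i, j in edges}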

42 min over x1,…,xn of Σ_{i∼j} ( ‖xi − xj‖² − d_ij² )²   (1). In some applications, the locations of a few sensors are also known in advance. [sent-90, score-0.369]
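
Equation (1) is the non-convex loss that the conjugate gradient refinement mentioned earlier also descends. A minimal sketch of that loss and a generic conjugate-gradient minimization with scipy (illustrative only; in the paper this kind of local search is used to refine the SDP solution, not started from scratch):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    n = 25
    P_true = rng.uniform(size=(n, 2))                       # toy ground-truth positions
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if np.linalg.norm(P_true[i] - P_true[j]) < 0.4]
    d_sq = {(i, j): np.sum((P_true[i] - P_true[j]) ** 2) for i, j in edges}

    def loss(flat_p):
        # Eq. (1): sum over i~j of (|xi - xj|^2 - d_ij^2)^2
        p = flat_p.reshape(n, 2)
        return sum((np.sum((p[i] - p[j]) ** 2) - d_sq[(i, j)]) ** 2 for i, j in edges)

    p0 = (P_true + 0.02 * rng.normal(size=P_true.shape)).ravel()   # warm start near the truth
    result = minimize(loss, p0, method="CG")                # conjugate gradient descent
    print(result.fun)                                       # small if the refinement succeeded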

43 By relaxing the constraint that the sensor locations xi lie in the two dimensional plane, we obtain a convex optimization that is much more tractable [1]. [sent-97, score-0.668]

44 The optimization in Eqs. (1–2) is rewritten in terms of the elements of the inner product matrix Xij = xi · xj. [sent-99, score-0.253]

45 The first constraint corresponds to Eq. (2), while the second constraint specifies that X is positive semidefinite, which is necessary to interpret it as an inner product matrix in Euclidean space. [sent-102, score-0.249]

46 Instead, the vectors will more generally lie in a subspace of dimensionality equal to the rank of the solution X. [sent-106, score-0.247]

47 To obtain planar coordinates, one can project these vectors into their two dimensional subspace of maximum variance, obtained from the top two eigenvectors of X. [sent-107, score-0.338]
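
A generic numpy sketch of that projection step, assuming X is the positive semidefinite matrix returned by the solver: take the top two eigenvalue/eigenvector pairs and scale the eigenvectors by the square roots of the eigenvalues.

    import numpy as np

    rng = np.random.default_rng(0)
    P = rng.normal(size=(50, 2))
    P -= P.mean(axis=0)                       # centered planar points
    X = P @ P.T                               # rank-2 PSD inner product matrix

    evals, evecs = np.linalg.eigh(X)          # eigenvalues in ascending order
    coords = evecs[:, -2:] * np.sqrt(np.maximum(evals[-2:], 0.0))   # planar coordinates

    # the recovered coordinates reproduce X, hence the geometry up to rotation/reflection
    assert np.allclose(coords @ coords.T, X, atol=1e-8)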

48 However, the rank of a matrix is not a convex function of its elements; thus it cannot be directly constrained as part of a convex optimization. [sent-110, score-0.278]

49 Mindful of this problem, the approach to sensor localization in [1] borrows an idea from recent work in unsupervised learning [12, 14]. [sent-111, score-0.482]

50 (The trace is proportional to the variance, assuming that the sensors are centered on the origin, since tr(X) = Σ_i ‖xi‖².) [sent-113, score-0.298]
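
A quick numerical check of this identity (plain numpy, not tied to the paper's code):

    import numpy as np

    rng = np.random.default_rng(0)
    P = rng.normal(size=(100, 2))
    P -= P.mean(axis=0)                               # center the sensors on the origin
    X = P @ P.T

    assert np.isclose(np.trace(X), np.sum(P ** 2))    # tr(X) = sum_i |x_i|^2
    assert np.isclose(np.trace(X), P.shape[0] * np.var(P, axis=0).sum())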

51 The extra variance term in the loss function favors low rank solutions; intuitively, it is based on the observation that a flat piece of paper has greater diameter than a crumpled one. [sent-114, score-0.322]

52 This general framework for trading off global variance versus local rigidity has come to be known as maximum variance unfolding (MVU) [9, 15, 13]. [sent-117, score-0.341]
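
For readers who want to experiment, here is a compact cvxpy sketch of an MVU-style SDP: maximize the trace subject to centering, local distance, and positive semidefiniteness constraints. This illustrative version uses hard equality constraints on the distances; the formulation in [1] and in this paper instead penalizes violations of the (noisy) distances with a weight ν, so treat it as a sketch rather than the paper's exact program.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n = 20
    P_true = rng.uniform(size=(n, 2))
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if np.linalg.norm(P_true[i] - P_true[j]) < 0.45]

    X = cp.Variable((n, n), PSD=True)                 # inner product matrix
    constraints = [cp.sum(X) == 0]                    # centering: sum_ij Xij = 0
    for i, j in edges:
        d_sq = np.sum((P_true[i] - P_true[j]) ** 2)
        constraints.append(X[i, i] - 2 * X[i, j] + X[j, j] == d_sq)

    problem = cp.Problem(cp.Maximize(cp.trace(X)), constraints)
    problem.solve()                                   # any SDP-capable solver (e.g. SCS)
    print(problem.status, np.linalg.matrix_rank(X.value, tol=1e-6))   # low rank expected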

53 Thus, for small networks, this approach to sensor localization is viable, but for large networks (n ∼ 10^4), exact solutions are prohibitively expensive. [sent-122, score-0.592]

54 3 Large-scale maximum variance unfolding. Most SDP solvers are based on interior-point methods whose time-complexity scales cubically in the matrix size and number of constraints [2]. [sent-124, score-0.334]

55 3.1 Matrix factorization. To obtain an optimization involving smaller matrices, we appeal to ideas in spectral graph theory [5]. [sent-127, score-0.411]

56 The sensor network defines a connected graph whose edges represent local pairwise connectivity. [sent-128, score-0.609]

57 Whenever two nodes share an edge in this graph, we expect the locations of these nodes to be relatively similar. [sent-129, score-0.368]

58 We can view the location of the sensors as a function that is defined over the nodes of this graph. [sent-130, score-0.265]

59 Because the edges represent local distance constraints, we expect this function to vary smoothly as we traverse edges in the graph. [sent-131, score-0.232]

60 Function approximations on graphs are most naturally derived from the eigenvectors of the graph Laplacian [5]. [sent-136, score-0.296]

61 For unweighted graphs, the graph Laplacian L computes the quadratic form f⊤Lf = Σ_{i∼j} (fi − fj)²   (5) on functions f ∈ ℝ^n defined over the nodes of the graph. [sent-137, score-0.31]
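
Spelled out as code (a generic sketch): with A the adjacency matrix and D the diagonal degree matrix, L = D − A, and the quadratic form of Eq. (5) can be checked directly.

    import numpy as np

    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]   # a small toy graph
    n = 4
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A                     # unweighted graph Laplacian

    f = np.array([0.3, -1.2, 0.7, 2.0])                # an arbitrary function on the nodes
    assert np.isclose(f @ L @ f, sum((f[i] - f[j]) ** 2 for i, j in edges))   # Eq. (5)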

62 The eigenvectors of L provide a set of basis functions over the nodes of the graph, ordered by smoothness. [sent-138, score-0.277]

63 Expanding the sensor locations xi in terms of these eigenvectors yields a compact factorization for the inner product matrix X. [sent-140, score-1.008]

64 Suppose that xi ≈ Σ_{α=1}^{m} Qiα yα, where the columns of the n×m rectangular matrix Q store the m bottom eigenvectors of the graph Laplacian (excluding the uniform eigenvector with zero eigenvalue). [sent-141, score-0.454]
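
A sketch of how such a basis Q could be computed (dense linear algebra on a stand-in path graph for clarity; for networks with tens of thousands of nodes one would use a sparse eigensolver such as scipy.sparse.linalg.eigsh instead):

    import numpy as np

    # Laplacian of a path graph on n nodes (a stand-in for the sensor connectivity graph)
    n, m = 60, 10
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A

    evals, evecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    Q = evecs[:, 1:m + 1]                   # m bottom eigenvectors, skipping the constant one
    assert np.allclose(Q.T @ Q, np.eye(m))  # orthonormal basis of smooth functions on the graph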

65 Note that in this approximation, the matrix Q can be cheaply precomputed from the unweighted connectivity graph of the sensor network, while the vectors yα play the role of unknowns that depend in a complicated way on the local distance estimates dij . [sent-142, score-0.768]

66 From the low-order approximation to the sensor locations, we obtain the matrix factorization: X ≈ QYQ⊤   (6). [sent-144, score-0.408]
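
Algebraically, this factorization is just the Gram matrix of the locations projected onto the span of Q. A self-contained sketch with a generic orthonormal Q (built here by QR of a random matrix purely to illustrate the identity; in the paper Q holds Laplacian eigenvectors):

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 50, 8
    P = rng.normal(size=(n, 2))                     # sensor locations
    P -= P.mean(axis=0)
    Q, _ = np.linalg.qr(rng.normal(size=(n, m)))    # any n x m matrix with orthonormal columns

    X = P @ P.T                                     # exact inner product matrix
    Y = Q.T @ X @ Q                                 # the much smaller m x m matrix
    X_approx = Q @ Y @ Q.T                          # Eq. (6): X ~ Q Y Q^T

    # X_approx is exactly the Gram matrix of the locations projected onto span(Q)
    P_proj = Q @ (Q.T @ P)
    assert np.allclose(X_approx, P_proj @ P_proj.T)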

67 Eq. (6) approximates the inner product matrix X as the product of much smaller matrices. [sent-146, score-0.279]

68 Using this approximation for localization in large scale networks, we can solve an optimization for the much smaller m×m matrix Y, as opposed to the original n×n matrix X. [sent-147, score-0.446]

69 Eq. (6) can alternately be viewed as a form of regularization, as it constrains neighboring sensors to have nearby locations even when the estimated local distances dij suggest otherwise (e.g., due to noise). [sent-156, score-0.476]

70 With this notation, the objective function (up to an additive constant) takes the form 2b⊤Y − Y⊤AY   (8), where A ∈ ℝ^(m²×m²) is the positive semidefinite matrix that collects all the quadratic coefficients in the objective function and b ∈ ℝ^(m²) is the vector that collects all the linear coefficients (with Y treated here as a vector of its m² elements). [sent-170, score-0.268]

71 As a result, this approach scales very well to large problems in sensor localization. [sent-188, score-0.36]

72 Expressed as SDPs, these earlier formulations of MVU involved as many constraints as edges in the underlying graph, even with the matrix factorization in Eq. (6). [sent-194, score-0.315]

73 Thus, the speed-ups obtained here over previous approaches are not merely due to graph regularization, but more precisely to its use in conjunction with quadratic penalties, all of which can be collected in a single linear matrix inequality via the Schur lemma. [sent-196, score-0.232]

74 Though this objective function is written in terms of the inner product matrix X, the hill-climbing in this step was performed in terms of the vectors xi ∈ ℝ^m. [sent-212, score-0.329]

75 While not always necessary, this first step was mainly helpful for localization in sensor networks with irregular (and particularly non-convex) boundaries. [sent-213, score-0.522]

76 It seems generally difficult to represent such boundaries in terms of the bottom eigenvectors of the graph Laplacian. [sent-214, score-0.35]

77 This second step helps to correct patches of the network where the graph regularization leads to oversmoothing and/or the rank constraint is not well modeled by MVU. [sent-217, score-0.374]

78 4 Results. We evaluated our algorithm on two simulated sensor networks of different size and topology. [sent-218, score-0.425]

79 We did not assume any prior knowledge of sensor locations (e. [sent-219, score-0.456]

80 We added white noise to each local distance measurement with a standard deviation of 10% of the true local distance. [sent-222, score-0.22]

81 Figure 2: Sensor locations inferred for the n = 1055 largest cities in the continental US. [sent-265, score-0.328]

82 On average, each sensor estimated local distances to 18 neighbors, with measurements corrupted by 10% Gaussian noise; see text. [sent-266, score-0.453]

83 Left: sensor locations obtained by solving the SDP in Eq. (9). [sent-267, score-0.456]

84 (9) using the m = 10 bottom eigenvectors of the graph Laplacian (computation time 4s). [sent-268, score-0.323]

85 Right: sensor locations after post-processing by conjugate gradient descent (additional computation time 3s). [sent-270, score-0.623]

86 1, placed nodes at scaled locations of the n = 1055 largest cities in the continental US. [sent-274, score-0.407]

87 Each node estimated the local distance to up to 18 other nodes within a radius of size r = 0. [sent-275, score-0.294]

88 (9) was solved using the m = 10 bottom eigenvectors of the graph Laplacian. [sent-278, score-0.359]

89 Each node estimated the local distance to up to 20 other nodes within a radius of size r = 0. [sent-290, score-0.294]

90 0 240 the m = 10 bottom eigenvectors of the graph Laplacian. [sent-294, score-0.323]

91 Fig. 3 illustrates the absolute positional errors of the sensor locations computed in three different ways, the first being the solution from the SDP in Eq. (9). [sent-298, score-0.497]

92 Both are plotted versus the number of eigenvectors, m, in the matrix factorization. (Figure caption fragment: the sensors were colored so that the ground truth positioning reveals the word CONVEX in the foreground.) [sent-304, score-0.219]

93 (We focused on the role of m, noting that previous studies [1, 7] have thoroughly investigated the role of parameters such as the weight constant ν, the sensor radius r, and the noise level.) [sent-310, score-0.362]

94 The approach makes use of a matrix factorization computed from the bottom eigenvectors of the graph Laplacian. [sent-322, score-0.543]

95 The factorization yields accurate approximate solutions which can be further refined by local search. [sent-323, score-0.292]

96 The power of the approach was illustrated by simulated results on sensor localization. [sent-324, score-0.385]

97 The networks in section 4 have far more nodes and edges than could be analyzed by previously formulated SDPs for these types of problems [1, 3, 6, 14]. [sent-325, score-0.235]

98 Beyond the problem of sensor localization, our approach applies quite generally to other settings where low dimensional representations are inferred from local distance constraints. [sent-326, score-0.743]

99 Semidefinite programming approaches for sensor network localization with noisy distance measurements. [sent-338, score-0.586]

100 The fastest mixing Markov process on a graph and a connection to a maximum variance unfolding problem. [sent-386, score-0.322]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('sdp', 0.452), ('sdps', 0.35), ('sensor', 0.332), ('qyq', 0.175), ('eigenvectors', 0.155), ('mvu', 0.153), ('unfolding', 0.153), ('semide', 0.15), ('localization', 0.15), ('factorization', 0.144), ('sensors', 0.143), ('optimizations', 0.139), ('locations', 0.124), ('nodes', 0.122), ('rank', 0.116), ('graph', 0.114), ('cities', 0.095), ('schur', 0.095), ('dimensional', 0.094), ('weinberger', 0.081), ('local', 0.078), ('matrix', 0.076), ('conjugate', 0.072), ('inner', 0.071), ('solutions', 0.07), ('manifold', 0.067), ('twenty', 0.067), ('continental', 0.066), ('xij', 0.065), ('distance', 0.064), ('optimization', 0.063), ('robot', 0.062), ('planar', 0.061), ('low', 0.06), ('dept', 0.057), ('anchor', 0.057), ('laplacian', 0.055), ('variance', 0.055), ('xi', 0.055), ('bottom', 0.054), ('regularization', 0.053), ('simulated', 0.053), ('descent', 0.052), ('product', 0.051), ('solve', 0.051), ('constraint', 0.051), ('constraints', 0.05), ('tr', 0.05), ('tens', 0.049), ('sha', 0.049), ('objective', 0.048), ('ij', 0.047), ('representations', 0.045), ('edges', 0.045), ('trace', 0.045), ('nearby', 0.044), ('xjj', 0.044), ('dij', 0.044), ('convex', 0.043), ('gradient', 0.043), ('inferred', 0.043), ('distances', 0.043), ('quadratic', 0.042), ('centering', 0.042), ('solution', 0.041), ('network', 0.04), ('networks', 0.04), ('protein', 0.039), ('xiao', 0.038), ('solved', 0.036), ('matrices', 0.036), ('expanding', 0.036), ('loss', 0.036), ('dimensionality', 0.035), ('penalties', 0.035), ('radio', 0.035), ('subject', 0.035), ('nite', 0.034), ('xii', 0.032), ('unweighted', 0.032), ('thousands', 0.032), ('involving', 0.031), ('bonn', 0.031), ('smaller', 0.03), ('radius', 0.03), ('plane', 0.029), ('pa', 0.029), ('spectral', 0.029), ('problems', 0.028), ('vectors', 0.028), ('ay', 0.028), ('philadelphia', 0.028), ('emerge', 0.028), ('piece', 0.028), ('graphs', 0.027), ('generally', 0.027), ('lu', 0.027), ('collects', 0.027), ('favors', 0.027), ('programs', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming

Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul

Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1

2 0.20220561 39 nips-2006-Balanced Graph Matching

Author: Timothee Cour, Praveen Srinivasan, Jianbo Shi

Abstract: Graph matching is a fundamental problem in Computer Vision and Machine Learning. We present two contributions. First, we give a new spectral relaxation technique for approximate solutions to matching problems, that naturally incorporates one-to-one or one-to-many constraints within the relaxation scheme. The second is a normalization procedure for existing graph matching scoring functions that can dramatically improve the matching accuracy. It is based on a reinterpretation of the graph matching compatibility matrix as a bipartite graph on edges for which we seek a bistochastic normalization. We evaluate our two contributions on a comprehensive test set of random graph matching problems, as well as on image correspondence problem. Our normalization procedure can be used to improve the performance of many existing graph matching algorithms, including spectral matching, graduated assignment and semidefinite programming. 1

3 0.17473289 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification

Author: Ali Rahimi, Ben Recht

Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1

4 0.15539931 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians

Author: Jason V. Davis, Inderjit S. Dhillon

Abstract: Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application. 1

5 0.11975995 12 nips-2006-A Probabilistic Algorithm Integrating Source Localization and Noise Suppression of MEG and EEG data

Author: Johanna M. Zumer, Hagai T. Attias, Kensuke Sekihara, Srikantan S. Nagarajan

Abstract: We have developed a novel algorithm for integrating source localization and noise suppression based on a probabilistic graphical model of stimulus-evoked MEG/EEG data. Our algorithm localizes multiple dipoles while suppressing noise sources with the computational complexity equivalent to a single dipole scan, and is therefore more efficient than traditional multidipole fitting procedures. In simulation, the algorithm can accurately localize and estimate the time course of several simultaneously-active dipoles, with rotating or fixed orientation, at noise levels typical for averaged MEG data. Furthermore, the algorithm is superior to beamforming techniques, which we show to be an approximation to our graphical model, in estimation of temporally correlated sources. Success of this algorithm for localizing auditory cortex in a tumor patient and for localizing an epileptic spike source are also demonstrated. 1

6 0.11041591 69 nips-2006-Distributed Inference in Dynamical Systems

7 0.10523134 35 nips-2006-Approximate inference using planar graph decomposition

8 0.096449934 80 nips-2006-Fundamental Limitations of Spectral Clustering

9 0.086184137 106 nips-2006-Large Margin Hidden Markov Models for Automatic Speech Recognition

10 0.08595065 128 nips-2006-Manifold Denoising

11 0.081892543 77 nips-2006-Fast Computation of Graph Kernels

12 0.077398114 7 nips-2006-A Local Learning Approach for Clustering

13 0.076739967 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis

14 0.075740963 74 nips-2006-Efficient Structure Learning of Markov Networks using $L 1$-Regularization

15 0.075671613 120 nips-2006-Learning to Traverse Image Manifolds

16 0.074895956 161 nips-2006-Particle Filtering for Nonparametric Bayesian Matrix Factorization

17 0.073535755 164 nips-2006-Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension

18 0.070512645 83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning

19 0.070167907 163 nips-2006-Prediction on a Graph with a Perceptron

20 0.069845065 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.24), (1, 0.075), (2, 0.016), (3, 0.106), (4, 0.05), (5, -0.1), (6, 0.097), (7, 0.093), (8, -0.009), (9, -0.046), (10, 0.038), (11, -0.143), (12, 0.02), (13, -0.001), (14, 0.04), (15, -0.137), (16, -0.009), (17, -0.093), (18, 0.138), (19, 0.132), (20, 0.17), (21, -0.091), (22, 0.096), (23, -0.048), (24, -0.171), (25, -0.012), (26, -0.089), (27, -0.092), (28, 0.088), (29, 0.053), (30, 0.051), (31, 0.135), (32, 0.133), (33, -0.001), (34, 0.076), (35, -0.037), (36, 0.064), (37, 0.036), (38, -0.096), (39, -0.009), (40, 0.05), (41, -0.004), (42, 0.005), (43, 0.016), (44, -0.03), (45, -0.092), (46, 0.075), (47, -0.06), (48, 0.124), (49, -0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9533872 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming

Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul

Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1

2 0.69012868 200 nips-2006-Unsupervised Regression with Applications to Nonlinear System Identification

Author: Ali Rahimi, Ben Recht

Abstract: We derive a cost functional for estimating the relationship between highdimensional observations and the low-dimensional process that generated them with no input-output examples. Limiting our search to invertible observation functions confers numerous benefits, including a compact representation and no suboptimal local minima. Our approximation algorithms for optimizing this cost functional are fast and give diagnostic bounds on the quality of their solution. Our method can be viewed as a manifold learning algorithm that utilizes a prior on the low-dimensional manifold coordinates. The benefits of taking advantage of such priors in manifold learning and searching for the inverse observation functions in system identification are demonstrated empirically by learning to track moving targets from raw measurements in a sensor network setting and in an RFID tracking experiment. 1

3 0.61413831 39 nips-2006-Balanced Graph Matching

Author: Timothee Cour, Praveen Srinivasan, Jianbo Shi

Abstract: Graph matching is a fundamental problem in Computer Vision and Machine Learning. We present two contributions. First, we give a new spectral relaxation technique for approximate solutions to matching problems, that naturally incorporates one-to-one or one-to-many constraints within the relaxation scheme. The second is a normalization procedure for existing graph matching scoring functions that can dramatically improve the matching accuracy. It is based on a reinterpretation of the graph matching compatibility matrix as a bipartite graph on edges for which we seek a bistochastic normalization. We evaluate our two contributions on a comprehensive test set of random graph matching problems, as well as on image correspondence problem. Our normalization procedure can be used to improve the performance of many existing graph matching algorithms, including spectral matching, graduated assignment and semidefinite programming. 1

4 0.53197688 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians

Author: Jason V. Davis, Inderjit S. Dhillon

Abstract: Gaussian data is pervasive and many learning algorithms (e.g., k-means) model their inputs as a single sample drawn from a multivariate Gaussian. However, in many real-life settings, each input object is best described by multiple samples drawn from a multivariate Gaussian. Such data can arise, for example, in a movie review database where each movie is rated by several users, or in time-series domains such as sensor networks. Here, each input can be naturally described by both a mean vector and covariance matrix which parameterize the Gaussian distribution. In this paper, we consider the problem of clustering such input objects, each represented as a multivariate Gaussian. We formulate the problem using an information theoretic approach and draw several interesting theoretical connections to Bregman divergences and also Bregman matrix divergences. We evaluate our method across several domains, including synthetic data, sensor network data, and a statistical debugging application. 1

5 0.49311528 77 nips-2006-Fast Computation of Graph Kernels

Author: Karsten M. Borgwardt, Nicol N. Schraudolph, S.v.n. Vishwanathan

Abstract: Using extensions of linear algebra concepts to Reproducing Kernel Hilbert Spaces (RKHS), we define a unifying framework for random walk kernels on graphs. Reduction to a Sylvester equation allows us to compute many of these kernels in O(n3 ) worst-case time. This includes kernels whose previous worst-case time complexity was O(n6 ), such as the geometric kernels of G¨ rtner et al. [1] and a the marginal graph kernels of Kashima et al. [2]. Our algebra in RKHS allow us to exploit sparsity in directed and undirected graphs more effectively than previous methods, yielding sub-cubic computational complexity when combined with conjugate gradient solvers or fixed-point iterations. Experiments on graphs from bioinformatics and other application domains show that our algorithms are often more than 1000 times faster than existing approaches. 1

6 0.44361481 120 nips-2006-Learning to Traverse Image Manifolds

7 0.44169667 35 nips-2006-Approximate inference using planar graph decomposition

8 0.43377662 12 nips-2006-A Probabilistic Algorithm Integrating Source Localization and Noise Suppression of MEG and EEG data

9 0.43048051 69 nips-2006-Distributed Inference in Dynamical Systems

10 0.42588434 70 nips-2006-Doubly Stochastic Normalization for Spectral Clustering

11 0.41119009 107 nips-2006-Large Margin Multi-channel Analog-to-Digital Conversion with Applications to Neural Prosthesis

12 0.39384663 129 nips-2006-Map-Reduce for Machine Learning on Multicore

13 0.39114451 201 nips-2006-Using Combinatorial Optimization within Max-Product Belief Propagation

14 0.38036954 128 nips-2006-Manifold Denoising

15 0.37356955 93 nips-2006-Hyperparameter Learning for Graph Based Semi-supervised Learning Algorithms

16 0.36468598 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization

17 0.35549122 124 nips-2006-Linearly-solvable Markov decision problems

18 0.35541373 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression

19 0.35477647 123 nips-2006-Learning with Hypergraphs: Clustering, Classification, and Embedding

20 0.34151745 164 nips-2006-Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(1, 0.114), (3, 0.078), (7, 0.087), (9, 0.053), (12, 0.013), (20, 0.019), (22, 0.082), (44, 0.092), (57, 0.06), (64, 0.016), (65, 0.053), (69, 0.039), (71, 0.015), (90, 0.011), (95, 0.195)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.90578651 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG

Author: Steven Lemm, Christin Schäfer, Gabriel Curio

Abstract: We present a method for binary on-line classification of triggered but temporally blurred events that are embedded in noisy time series in the context of on-line discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study. 1

same-paper 2 0.84887552 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming

Author: Kilian Q. Weinberger, Fei Sha, Qihui Zhu, Lawrence K. Saul

Abstract: In many areas of science and engineering, the problem arises how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes. 1

3 0.72918391 184 nips-2006-Stratification Learning: Detecting Mixed Density and Dimensionality in High Dimensional Point Clouds

Author: Gloria Haro, Gregory Randall, Guillermo Sapiro

Abstract: The study of point cloud data sampled from a stratification, a collection of manifolds with possible different dimensions, is pursued in this paper. We present a technique for simultaneously soft clustering and estimating the mixed dimensionality and density of such structures. The framework is based on a maximum likelihood estimation of a Poisson mixture model. The presentation of the approach is completed with artificial and real examples demonstrating the importance of extending manifold learning to stratification learning. 1

4 0.72088635 65 nips-2006-Denoising and Dimension Reduction in Feature Space

Author: Mikio L. Braun, Klaus-Robert Müller, Joachim M. Buhmann

Abstract: We show that the relevant information about a classification problem in feature space is contained up to negligible error in a finite number of leading kernel PCA components if the kernel matches the underlying learning problem. Thus, kernels not only transform data sets such that good generalization can be achieved even by linear discriminant functions, but this transformation is also performed in a manner which makes economic use of feature space dimensions. In the best case, kernels provide efficient implicit representations of the data to perform classification. Practically, we propose an algorithm which enables us to recover the subspace and dimensionality relevant for good classification. Our algorithm can therefore be applied (1) to analyze the interplay of data set and kernel in a geometric fashion, (2) to help in model selection, and to (3) de-noise in feature space in order to yield better classification results. 1

5 0.71945596 127 nips-2006-MLLE: Modified Locally Linear Embedding Using Multiple Weights

Author: Zhenyue Zhang, Jing Wang

Abstract: The locally linear embedding (LLE) is improved by introducing multiple linearly independent local weight vectors for each neighborhood. We characterize the reconstruction weights and show the existence of the linearly independent weight vectors at each neighborhood. The modified locally linear embedding (MLLE) proposed in this paper is much stable. It can retrieve the ideal embedding if MLLE is applied on data points sampled from an isometric manifold. MLLE is also compared with the local tangent space alignment (LTSA). Numerical examples are given that show the improvement and efficiency of MLLE. 1

6 0.71932042 20 nips-2006-Active learning for misspecified generalized linear models

7 0.71812332 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach

8 0.71754211 175 nips-2006-Simplifying Mixture Models through Function Approximation

9 0.71540439 32 nips-2006-Analysis of Empirical Bayesian Methods for Neuroelectromagnetic Source Localization

10 0.71529889 83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning

11 0.71498758 3 nips-2006-A Complexity-Distortion Approach to Joint Pattern Alignment

12 0.71222514 106 nips-2006-Large Margin Hidden Markov Models for Automatic Speech Recognition

13 0.7122196 121 nips-2006-Learning to be Bayesian without Supervision

14 0.71127254 48 nips-2006-Branch and Bound for Semi-Supervised Support Vector Machines

15 0.70913815 128 nips-2006-Manifold Denoising

16 0.70896369 117 nips-2006-Learning on Graph with Laplacian Regularization

17 0.7056585 19 nips-2006-Accelerated Variational Dirichlet Process Mixtures

18 0.70389211 67 nips-2006-Differential Entropic Clustering of Multivariate Gaussians

19 0.70192915 169 nips-2006-Relational Learning with Gaussian Processes

20 0.70163602 35 nips-2006-Approximate inference using planar graph decomposition