nips nips2005 nips2005-46 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Benjamin Van Roy, Ciamac C. Moallemi
Abstract: We propose consensus propagation, an asynchronous distributed protocol for averaging numbers across a network. We establish convergence, characterize the convergence rate for regular graphs, and demonstrate that the protocol exhibits better scaling properties than pairwise averaging, an alternative that has received much recent attention. Consensus propagation can be viewed as a special case of belief propagation, and our results contribute to the belief propagation literature. In particular, beyond singly-connected graphs, there are very few classes of relevant problems for which belief propagation is known to converge. 1
Reference: text
1 Introduction

Consider a network of n nodes in which the ith node observes a number yᵢ ∈ [0, 1] and aims to compute the average ∑_{i=1}^{n} yᵢ/n. In both wireless sensor and peer-to-peer networks, for example, there is interest in simple protocols for computing aggregate statistics (see, for example, the references in [1]), and averaging enables computation of several important ones. Further, averaging serves as a primitive in the design of more sophisticated distributed information-processing algorithms. For example, a maximum-likelihood estimate can be produced by an averaging protocol if each node's observations are linear in the variables of interest and the noise is Gaussian [2]. As another example, averaging protocols are central to policy-gradient-based methods for distributed optimization of network performance [3].
In this paper we propose and analyze a new protocol – consensus propagation – for asynchronous distributed averaging. As a baseline for comparison, we will also discuss another asynchronous distributed protocol – pairwise averaging – which has received much recent attention. In pairwise averaging, each node maintains a current estimate of the average, and each time a pair of nodes communicates, both revise their estimates to the mean of their previous estimates. Convergence of this protocol in a very general model of asynchronous computation and communication was established in [4]. Recent work [5, 6] has studied the convergence rate and its dependence on network topology and on how pairs of nodes are sampled. Here, sampling is governed by a certain doubly stochastic matrix, and the convergence rate is characterized by its second-largest eigenvalue.
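To make the baseline concrete, here is a minimal sketch of asynchronous pairwise averaging. The function name, graph representation, and uniform random edge selection are our own illustrative choices; the papers cited above analyze more general sampling schemes.

```python
import random

def pairwise_averaging(edges, y, steps=10000, seed=0):
    """Asynchronous pairwise averaging: at each step one edge is sampled,
    and its two endpoints replace their estimates with their mean."""
    rng = random.Random(seed)
    x = dict(y)  # each node's estimate starts at its own observation
    for _ in range(steps):
        i, j = rng.choice(edges)
        # averaging a pair conserves the sum, hence the network average
        x[i] = x[j] = (x[i] + x[j]) / 2.0
    return x
```

On a connected graph, every estimate converges to the network average; the rate depends on the topology and on the edge-sampling distribution, as discussed above.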
Consensus propagation is a simple algorithm with an intuitive interpretation. It can also be viewed as an asynchronous distributed version of belief propagation as applied to approximation of conditional distributions in a Gaussian Markov random field. When the network of interest is singly connected, prior results about belief propagation imply convergence of consensus propagation. However, such results do not extend to graphs with cycles: Gaussian belief propagation on a graph with cycles is not guaranteed to converge, as demonstrated by examples in [7]. In fact, there are very few relevant cases where belief propagation on a graph with cycles is known to converge. One simple case where belief propagation is guaranteed to converge is when the graph has only a single cycle [11, 12, 13]. Recent work proposes the use of belief propagation to solve maximum-weight matching problems and proves convergence in that context [14]. [15] proves convergence in the application of belief propagation to a classification problem. In the Gaussian case, [7, 16] provide sufficient conditions for convergence, but these conditions are difficult to interpret and do not capture situations that correspond to consensus propagation. With this background, let us discuss the primary contributions of this paper: (1) we propose consensus propagation, a new asynchronous distributed protocol for averaging; (2) we prove that consensus propagation converges, even when executed asynchronously.
Each node i ∈ V is assigned a number yᵢ ∈ [0, 1]. The goal is for each node to obtain an estimate of ȳ = ∑_{i∈V} yᵢ/n through an asynchronous distributed protocol in which each node carries out simple computations and communicates parsimonious messages to its neighbors. We propose consensus propagation as an approach to this problem. In this protocol, if a node i communicates to a neighbor j at time t, it transmits a message consisting of two numerical values. Let µᵗᵢⱼ ∈ R and Kᵗᵢⱼ ∈ R₊ denote the values associated with the most recently transmitted message from i to j at or before time t. At each time t, node j has stored in memory the most recent message from each neighbor: {(µᵗᵢⱼ, Kᵗᵢⱼ) | i ∈ N(j)}. The initial values in memory, before any messages have been received, are arbitrary.
Consensus propagation is parameterized by a scalar β > 0 and a non-negative matrix Q ∈ Rⁿˣⁿ with Qᵢⱼ > 0 if and only if i ≠ j and (i, j) ∈ E. For each t, denote by Uₜ ⊆ E the set of directed edges along which messages are transmitted at time t. The messages Fᵢⱼ(Kᵗ⁻¹) and Gᵢⱼ(µᵗ⁻¹, Kᵗ⁻¹) transmitted from node i to node j depend only on {(µᵗ⁻¹ᵤᵢ, Kᵗ⁻¹ᵤᵢ) | u ∈ N(i)}, which node i has stored in memory. Consensus propagation is an asynchronous protocol because only a subset of the potential messages is transmitted at each time. Our convergence analysis can also be extended to accommodate more general models of asynchronism that involve communication delays, such as those presented in [17]. In our study of convergence time, we will focus on the synchronous version of consensus propagation, defined by Kᵗ = F(Kᵗ⁻¹), µᵗ = G(µᵗ⁻¹, Kᵗ⁻¹), xᵗ = X(µᵗ⁻¹, Kᵗ⁻¹).
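As an illustration, the following is a minimal sketch of synchronous consensus propagation with uniform Qᵢⱼ = 1. The function name and the dictionary-based graph representation are our own, and the update formulas are reconstructed from the β = ∞ recursions and the attenuation K̃/(1 + K̃/β) described in the sequel, so treat this as a sketch rather than a verbatim transcription of the paper's equations.

```python
def consensus_propagation(adj, y, beta=1e9, iters=50):
    """Synchronous consensus propagation sketch (uniform Q_ij = 1).

    adj: dict node -> list of neighbors (symmetric); y: dict node -> observation.
    Returns a dict of per-node estimates x_j of the network average.
    """
    K = {(i, j): 0.0 for i in adj for j in adj[i]}    # precision messages K_ij
    mu = {(i, j): y[i] for i in adj for j in adj[i]}  # mean messages mu_ij
    for _ in range(iters):
        Kn, mun = {}, {}
        for i in adj:
            for j in adj[i]:
                Ksum = 1.0 + sum(K[(u, i)] for u in adj[i] if u != j)
                Kn[(i, j)] = Ksum / (1.0 + Ksum / beta)  # attenuated precision
                mun[(i, j)] = (y[i] + sum(K[(u, i)] * mu[(u, i)]
                                          for u in adj[i] if u != j)) / Ksum
        K, mu = Kn, mun
    # combine incoming messages into local estimates of the average
    return {j: (y[j] + sum(K[(u, j)] * mu[(u, j)] for u in adj[j]))
               / (1.0 + sum(K[(u, j)] for u in adj[j]))
            for j in adj}
```

On a singly-connected graph with very large β, the estimates become exact after a number of synchronous iterations equal to the diameter.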
In order for nodes in Sⱼᵢ to compute ȳ, they must at least be provided with the average µ*ᵢⱼ among observations at nodes in Sᵢⱼ and the cardinality K*ᵢⱼ = |Sᵢⱼ|. The messages µᵗᵢⱼ and Kᵗᵢⱼ can be viewed as estimates of these quantities. In fact, when β = ∞, µᵗᵢⱼ and Kᵗᵢⱼ converge to µ*ᵢⱼ and K*ᵢⱼ, as we will now explain. Suppose the graph is singly connected, β = ∞, and transmissions are synchronous. In this case the K-update takes the form Kᵗᵢⱼ = 1 + ∑_{u∈N(i)\j} Kᵗ⁻¹ᵤᵢ; this is a recursive characterization of |Sᵢⱼ|, and it is easy to see that it converges in a number of iterations equal to the diameter of the graph. A simple inductive argument shows that at each time t, µᵗᵢⱼ is an average among observations at Kᵗᵢⱼ nodes in Sᵢⱼ, and after a number of iterations equal to the diameter of the graph, µᵗᵢⱼ = µ*ᵢⱼ. Further, for any i ∈ V,

ȳ = (yᵢ + ∑_{u∈N(i)} K*ᵤᵢ µ*ᵤᵢ) / (1 + ∑_{u∈N(i)} K*ᵤᵢ),

so xᵗ converges to ȳ. This interpretation can be extended to the asynchronous case, where it elucidates the fact that µᵗ and Kᵗ become µ* and K* after every pair of nodes in the graph has established bilateral communication through some sequence of transmissions among adjacent nodes.
If β = ∞, then for any (i, j) ∈ E that is part of a cycle, Kᵗᵢⱼ → ∞ whether transmissions are synchronous or asynchronous, so long as messages are transmitted along each edge of the cycle an infinite number of times. A heuristic fix might be to compose the β = ∞ update with one that attenuates: K̃ᵗᵢⱼ ← 1 + ∑_{u∈N(i)\j} Kᵗ⁻¹ᵤᵢ, and Kᵗᵢⱼ ← K̃ᵗᵢⱼ / (1 + εᵢⱼ K̃ᵗᵢⱼ). The message is essentially unaffected when εᵢⱼ K̃ᵗᵢⱼ is small but becomes increasingly attenuated as K̃ᵗᵢⱼ grows. This is exactly the kind of attenuation carried out by consensus propagation when βQᵢⱼ = 1/εᵢⱼ < ∞.
2 Relation to Belief Propagation

Consensus propagation can also be viewed as a special case of belief propagation. In this context, belief propagation is used to approximate the marginal distributions of a vector x ∈ Rⁿ conditioned on the observations y ∈ Rⁿ. Note that β can be viewed as an inverse-temperature parameter; as β increases, components of x associated with adjacent nodes become increasingly correlated. In belief propagation, messages are passed along edges of a Markov random field. The message Mᵢⱼ(·) passed from node i to node j is a distribution on the variable xⱼ. Node i computes this message from the incoming messages of its other neighbors, as defined by the update equation

Mᵗᵢⱼ(xⱼ) = κ ∫ ψᵢ(xᵢ) ψᵢⱼ(xᵢ, xⱼ) ∏_{u∈N(i)\j} Mᵗ⁻¹ᵤᵢ(xᵢ) dxᵢ,

where κ is a normalizing constant. In particular, let (µᵗᵢⱼ, Kᵗᵢⱼ) ∈ R × R₊ parameterize the Gaussian message Mᵗᵢⱼ(·) according to Mᵗᵢⱼ(xⱼ) ∝ exp(−Kᵗᵢⱼ (xⱼ − µᵗᵢⱼ)²). Then this update is equivalent to the synchronous consensus propagation iterations for Kᵗ and µᵗ.
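To see how the Gaussian integration produces the attenuated precision update, one can carry out the marginalization explicitly. Writing κ = 1 + ∑_{u∈N(i)\j} Kᵗ⁻¹ᵤᵢ for the precision of the product ψᵢ(xᵢ) ∏_{u∈N(i)\j} Mᵗ⁻¹ᵤᵢ(xᵢ) and m for its mean, the standard Gaussian marginalization identity (a sketch of the computation, in our own notation) gives

```latex
\int_{-\infty}^{\infty} \exp\!\bigl(-\kappa\,(x_i - m)^2 - \beta Q_{ij}\,(x_i - x_j)^2\bigr)\,dx_i
\;\propto\; \exp\!\Bigl(-\frac{\kappa\,\beta Q_{ij}}{\kappa + \beta Q_{ij}}\,(x_j - m)^2\Bigr),
```

so the outgoing precision is Kᵗᵢⱼ = κβQᵢⱼ/(κ + βQᵢⱼ) = κ/(1 + κ/(βQᵢⱼ)), exactly the attenuated form noted in Section 1.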
The sequence of densities

pᵗⱼ(xⱼ) ∝ ψⱼ(xⱼ) ∏_{i∈N(j)} Mᵗᵢⱼ(xⱼ) ∝ exp(−(xⱼ − yⱼ)² − ∑_{i∈N(j)} Kᵗᵢⱼ (xⱼ − µᵗᵢⱼ)²)

is meant to converge to an approximation of the marginal conditional distribution of xⱼ. With this and the aforementioned correspondences, we have shown that consensus propagation is a special case of belief propagation. Readers familiar with belief propagation will notice that the derivation above uses the sum-product form of the algorithm.
3 Convergence

The following theorem is our main convergence result. This property is used to establish convergence to a unique fixed point. This allows us to establish existence of a fixed point and asynchronous convergence.
4 Convergence Time for Regular Graphs

In this section, we study the convergence time of synchronous consensus propagation.

4.1 The Case of Regular Graphs

We restrict our analysis of convergence time to cases where (V, E) is a d-regular graph, for d ≥ 2. We also make the simplifying assumptions that Qᵢⱼ = 1, µ⁰ᵢⱼ = yᵢ, and K⁰ = [k₀]ᵢⱼ for some scalar k₀ ≥ 0. Given a uniform initial condition K⁰ = [k₀]ᵢⱼ, we can study the sequence of iterates {Kᵗ} by examining the scalar sequence {kₜ}, defined by

kₜ = (1 + (d − 1)kₜ₋₁) / (1 + (1 + (d − 1)kₜ₋₁)/β).

Similarly, in this setting, the equations for the evolution of µᵗ take the special form

µᵗᵢⱼ = yᵢ / (1 + (d − 1)kₜ₋₁) + (1 − 1/(1 + (d − 1)kₜ₋₁)) · (1/(d − 1)) ∑_{u∈N(i)\j} µᵗ⁻¹ᵤᵢ.
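The scalar recursion is easy to iterate directly. A minimal sketch follows, writing the update in the attenuated form k̃/(1 + k̃/β) with k̃ₜ = 1 + (d − 1)kₜ₋₁, consistent with the attenuation described in Section 1 (the function name is our own):

```python
def scalar_precision_sequence(d, beta, k0=0.0, iters=100):
    """Iterates the scalar K-message recursion for a d-regular graph with
    uniform Q: k_t = (1 + (d-1)k_{t-1}) / (1 + (1 + (d-1)k_{t-1}) / beta).
    Starting from k0 = 0, the iterates increase monotonically to a fixed
    point k_beta."""
    ks = [k0]
    for _ in range(iters):
        kt = 1.0 + (d - 1) * ks[-1]        # unattenuated message
        ks.append(kt / (1.0 + kt / beta))  # attenuation keeps iterates bounded
    return ks
```

For finite β the attenuation bounds the iterates, in contrast to the β = ∞ case, where the precisions diverge on any graph with a cycle.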
where ŷ ∈ Rⁿᵈ is a vector with ŷᵢⱼ = yᵢ and P̂ ∈ Rⁿᵈˣⁿᵈ₊ corresponds to a Markov chain on the set of directed edges Ê. As in (3), we associate each µᵗ with an estimate xᵗ of x^β according to xᵗ = y/(1 + dk^β) + dk^β Aµᵗ/(1 + dk^β), where A ∈ Rⁿˣⁿᵈ₊ is a matrix defined by (Aµ)ⱼ = ∑_{i∈N(j)} µᵢⱼ/d. The update equation (7) suggests that the convergence of µᵗ is intimately tied to a notion of mixing time associated with P̂. Define the Cesàro mixing time τ by τ = sup_{t≥0} ‖∑_{τ=0}^{t} (P̂^τ − P̂*)‖_{2,nd}, where P̂* denotes the Cesàro limit of P̂. Note that in the case where P̂ is aperiodic, irreducible, and symmetric, τ corresponds to the traditional definition of mixing time: the inverse of the spectral gap of P̂. A time t* is said to be an ε-convergence time if the estimates xᵗ are ε-accurate for all t ≥ t*.
The following theorem, whose proof can be found in [1], establishes a bound on the convergence time of synchronous consensus propagation, given an appropriately chosen β, as a function of ε and τ. The second case of interest is when k₀ = k^β, so that kₜ = k^β for all t ≥ 0. Theorem 3 suggests that initializing with k₀ = k^β leads to an improvement in convergence time. Hence, we suspect that a convergence-time bound of O((τ/ε) log(1/ε)) also holds for the case of k₀ = 0. It is not difficult to imagine variations of Algorithm 1 that use a doubling sequence of guesses for the Cesàro mixing time τ.
5 Comparison with Pairwise Averaging

Using the results of Section 4, we can compare the performance of consensus propagation to that of pairwise averaging. Pairwise averaging is usually defined in an asynchronous setting, but there is a synchronous counterpart, which works as follows. In the case of a singly-connected graph, synchronous consensus propagation converges exactly in a number of iterations equal to the diameter of the graph. This is the best one can hope for under any algorithm, since the diameter is the minimum amount of time required for a message to travel between the two most distant nodes. On the other hand, for a fixed accuracy ε, the worst-case number of iterations required by synchronous pairwise averaging on a singly-connected graph scales at least quadratically in the diameter [18]. The rate of convergence of synchronous pairwise averaging is governed by the relation ‖xᵗ − ȳ1‖_{2,n} ≤ λ₂ᵗ, where λ₂ is the second-largest eigenvalue of P. Let τ₂ = 1/log(1/λ₂), and call it the mixing time of P.
The number of iterations required by consensus propagation is Θ(τ log τ), whereas that required by pairwise averaging is Θ(τ₂). Both mixing times depend on the size and topology of the graph. τ₂ is the mixing time of a process on nodes that transitions along edges, whereas τ is the mixing time of a process on directed edges that transitions towards nodes. As we will now illustrate through an example, it is this difference that makes τ₂ larger than τ and, therefore, pairwise averaging less efficient than consensus propagation. In the case of a cycle (d = 2) with an even number of nodes n, minimizing the mixing time over P results in τ₂ = Θ(n²) [19]. Intuitively, the improvement in mixing time arises from the fact that the edge process moves around the cycle in a single direction and therefore explores the entire graph within n iterations.
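As a numerical sanity check of the quadratic scaling of the node process, consider the lazy symmetric random walk on an n-cycle. This is our own concrete choice of doubly stochastic P; the Θ(n²) statement above concerns the optimized P, but the lazy walk exhibits the same order.

```python
import numpy as np

def cycle_node_mixing_time(n):
    """tau_2 = 1/log(1/lambda_2) for the lazy symmetric walk on an n-cycle.
    The eigenvalues are 1/2 + cos(2*pi*k/n)/2, so lambda_2 approaches 1
    quadratically in 1/n and tau_2 = Theta(n^2)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = 0.5                 # lazy self-loop
        P[i, (i + 1) % n] += 0.25     # clockwise step
        P[i, (i - 1) % n] += 0.25     # counter-clockwise step
    lam2 = np.sort(np.abs(np.linalg.eigvals(P)))[-2]  # second-largest modulus
    return 1.0 / np.log(1.0 / lam2)
```

Doubling n roughly quadruples τ₂, consistent with τ₂ = Θ(n²), whereas the directed-edge process described above covers the cycle in Θ(n) steps.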
The cycle example demonstrates a Θ(n/log n) advantage offered by consensus propagation. Comparisons of mixing times associated with other graph topologies remain an issue for future analysis.
An analysis of belief propagation on the turbo decoding graph with Gaussian densities.
On the uniqueness of loopy belief propagation fixed points.
On the convergence of iterative decoding on graphs with a single cycle.
Correctness of local probability propagation in graphical models with loops.
Correctness of belief propagation in Gaussian graphical models of arbitrary topology.
2 0.54198486 78 nips-2005-From Weighted Classification to Policy Search
Author: Doron Blatt, Alfred O. Hero
Abstract: This paper proposes an algorithm to convert a T -stage stochastic decision problem with a continuous state space to a sequence of supervised learning problems. The optimization problem associated with the trajectory tree and random trajectory methods of Kearns, Mansour, and Ng, 2000, is solved using the Gauss-Seidel method. The algorithm breaks a multistage reinforcement learning problem into a sequence of single-stage reinforcement learning subproblems, each of which is solved via an exact reduction to a weighted-classification problem that can be solved using off-the-self methods. Thus the algorithm converts a reinforcement learning problem into simpler supervised learning subproblems. It is shown that the method converges in a finite number of steps to a solution that cannot be further improved by componentwise optimization. The implication of the proposed algorithm is that a plethora of classification methods can be applied to find policies in the reinforcement learning problem. 1
3 0.54089051 96 nips-2005-Inference with Minimal Communication: a Decision-Theoretic Variational Approach
Author: O. P. Kreidl, Alan S. Willsky
Abstract: Given a directed graphical model with binary-valued hidden nodes and real-valued noisy observations, consider deciding upon the maximum a-posteriori (MAP) or the maximum posterior-marginal (MPM) assignment under the restriction that each node broadcasts only to its children exactly one single-bit message. We present a variational formulation, viewing the processing rules local to all nodes as degrees-of-freedom, that minimizes the loss in expected (MAP or MPM) performance subject to such online communication constraints. The approach leads to a novel message-passing algorithm to be executed offline, or before observations are realized, which mitigates the performance loss by iteratively coupling all rules in a manner implicitly driven by global statistics. We also provide (i) illustrative examples, (ii) assumptions that guarantee convergence and efficiency and (iii) connections to active research areas.
4 0.53749728 65 nips-2005-Estimating the wrong Markov random field: Benefits in the computation-limited setting
Author: Martin J. Wainwright
Abstract: Consider the problem of joint parameter estimation and prediction in a Markov random field: i.e., the model parameters are estimated on the basis of an initial set of data, and then the fitted model is used to perform prediction (e.g., smoothing, denoising, interpolation) on a new noisy observation. Working in the computation-limited setting, we analyze a joint method in which the same convex variational relaxation is used to construct an M-estimator for fitting parameters, and to perform approximate marginalization for the prediction step. The key result of this paper is that in the computation-limited setting, using an inconsistent parameter estimator (i.e., an estimator that returns the "wrong" model even in the infinite data limit) is provably beneficial, since the resulting errors can partially compensate for errors made by using an approximate prediction technique. En route to this result, we analyze the asymptotic properties of M-estimators based on convex variational relaxations, and establish a Lipschitz stability property that holds for a broad class of variational methods. We show that joint estimation/prediction based on the reweighted sum-product algorithm substantially outperforms a commonly used heuristic based on ordinary sum-product. Keywords: Markov random fields; variational method; message-passing algorithms; sum-product; belief propagation; parameter estimation; learning.
5 0.53192735 145 nips-2005-On Local Rewards and Scaling Distributed Reinforcement Learning
Author: Drew Bagnell, Andrew Y. Ng
Abstract: We consider the scaling of the number of examples necessary to achieve good performance in distributed, cooperative, multi-agent reinforcement learning, as a function of the number of agents n. We prove a worst-case lower bound showing that algorithms that rely solely on a global reward signal to learn policies confront a fundamental limit: They require a number of real-world examples that scales roughly linearly in the number of agents. For settings of interest with a very large number of agents, this is impractical. We demonstrate, however, that there is a class of algorithms that, by taking advantage of local reward signals in large distributed Markov Decision Processes, are able to ensure good performance with a number of samples that scales as O(log n). This makes them applicable even in settings with a very large number of agents n.
6 0.52727962 204 nips-2005-Walk-Sum Interpretation and Analysis of Gaussian Belief Propagation
7 0.52527493 154 nips-2005-Preconditioner Approximations for Probabilistic Graphical Models
8 0.52395606 142 nips-2005-Oblivious Equilibrium: A Mean Field Approximation for Large-Scale Dynamic Games
9 0.51396936 153 nips-2005-Policy-Gradient Methods for Planning
10 0.51077086 72 nips-2005-Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation
11 0.50973803 90 nips-2005-Hot Coupling: A Particle Approach to Inference and Normalization on Pairwise Undirected Graphs
12 0.50788981 144 nips-2005-Off-policy Learning with Options and Recognizers
13 0.50660133 43 nips-2005-Comparing the Effects of Different Weight Distributions on Finding Sparse Representations
14 0.50286686 108 nips-2005-Layered Dynamic Textures
15 0.50010264 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks
16 0.49903634 23 nips-2005-An Application of Markov Random Fields to Range Sensing
17 0.49765456 199 nips-2005-Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions
18 0.49755183 47 nips-2005-Consistency of one-class SVM and related algorithms
19 0.49743891 178 nips-2005-Soft Clustering on Graphs
20 0.49336502 187 nips-2005-Temporal Abstraction in Temporal-difference Networks