nips nips2002 nips2002-94 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free energy as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We consider loopy belief propagation for approximate inference in probabilistic graphical models. [sent-3, score-1.325]
2 A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. [sent-4, score-0.594]
3 To overcome this limitation, we introduce fractional belief propagation. [sent-5, score-0.786]
4 Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free energy as special cases. [sent-6, score-1.595]
5 Using the linear response correction of the clique marginals, the scale parameters can be tuned. [sent-7, score-0.347]
6 1 Introduction Probabilistic graphical models are powerful tools for learning and reasoning in domains with uncertainty. [sent-9, score-0.062]
7 Unfortunately, inference in large, complex graphical models is computationally intractable. [sent-10, score-0.108]
8 One of the methods in the latter class is Pearl’s loopy belief propagation [1]. [sent-13, score-1.152]
9 Until recently, a disadvantage of the method was its heuristic character and the absence of a convergence guarantee. [sent-15, score-0.026]
10 Often, the algorithm gives good solutions, but sometimes it fails to converge. [sent-16, score-0.06]
11 Yedidia et al. [2] showed that the fixed points of loopy belief propagation are actually stationary points of the Bethe free energy from statistical physics. [sent-18, score-1.503]
12 This not only gives the algorithm a firm theoretical basis, but also solves the convergence problem through the existence of an objective function that can be minimized directly [3]. [sent-19, score-0.095]
13 Minka’s expectation propagation [4] is a generalization that makes the method applicable to Bayesian learning. [sent-21, score-0.344]
14 [2] introduced the Kikuchi free energy in the graphical models community, which can be considered a higher-order truncation of a systematic expansion of the exact free energy using larger clusters. [sent-23, score-0.796]
15 They also developed an associated generalized belief propagation algorithm. [sent-24, score-0.78]
16 In this paper, we propose another direction, which offers possibilities to improve upon loopy belief propagation without resorting to larger clusters. [sent-25, score-0.902]
17 In section 3 we briefly review approximate inference by loopy belief propagation and discuss an inherent limitation of this method. [sent-28, score-1.403]
18 This motivates us to generalize upon loopy belief propagation. [sent-29, score-0.959]
19 We do so by formulating a new class of approximate free energies in section 4. [sent-30, score-0.374]
20 In section 5 we consider the fixed point equations and formulate the fractional belief propagation algorithm. [sent-31, score-1.184]
21 In section 6 we will use linear response estimates to tune the parameters in the method. [sent-32, score-0.179]
22 2 Inference in graphical models Our starting point is a probabilistic model on a set of discrete variables $x = (x_1, \ldots, x_n)$ in a finite domain. [sent-35, score-0.062]
23 The joint distribution is assumed to be proportional to a product of clique potentials, $P(x) \propto \prod_\alpha \Psi_\alpha(x_\alpha)$. (1) [sent-36, score-0.226]
24 where each $\alpha$ refers to a subset of the nodes in the model. [sent-37, score-0.042]
25 $P(x) \propto \exp\big(\sum_{(ij)} w_{ij} x_i x_j + \sum_i \theta_i x_i\big)$, where the sum is over connected pairs $(ij)$. [sent-39, score-0.031]
26 The right hand side can be viewed as a product of potentials $\Psi_{ij}(x_i, x_j)$, where each single-node term is spread over the set of edges that contain node $i$. [sent-40, score-0.087]
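To make the setting concrete, here is a minimal sketch of this pairwise Boltzmann machine together with a brute-force computation of an exact pair marginal; the spin convention $x_i \in \{-1,+1\}$ and all names are illustrative choices, not notation fixed by the paper.

```python
import itertools
import numpy as np

def log_unnormalized(x, W, theta):
    """log of exp(sum_{(ij)} w_ij x_i x_j + sum_i theta_i x_i), x in {-1,+1}^n.
    W is symmetric with zero diagonal, so 0.5 * x W x counts each pair once."""
    x = np.asarray(x, dtype=float)
    return 0.5 * x @ W @ x + theta @ x

def exact_pair_marginal(i, j, W, theta):
    """Brute-force P(x_i, x_j): sum over all 2^n configurations (small n only)."""
    n = len(theta)
    table = np.zeros((2, 2))  # rows/cols index x_i, x_j mapped from {-1,+1} to {0,1}
    for x in itertools.product([-1, 1], repeat=n):
        table[(x[i] + 1) // 2, (x[j] + 1) // 2] += np.exp(log_unnormalized(x, W, theta))
    return table / table.sum()
```

This exhaustive computation is exactly what becomes intractable for large graphs, which is what the approximate methods below are for.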
27 The typical task that we try to perform is to compute the marginals $P(x_i)$. [sent-41, score-0.031]
28 Basically, the computation of single node distributions requires the summation over all remaining variables. [sent-42, score-0.082]
29 In small networks, this summation can be performed explicitly. [sent-43, score-0.031]
30 In large networks, the complexity of computation depends on the underlying graphical structure of the model, and is exponential in the maximal clique size of the triangulated moralized graph [5]. [sent-44, score-0.278]
31 This may lead to intractable models, even if the clusters are small. [sent-45, score-0.025]
32 When the model is intractable, one has to resort to approximate methods. [sent-46, score-0.099]
33 3 Loopy belief propagation in Boltzmann machines A nowadays popular approximate method is loopy belief propagation. [sent-47, score-1.679]
34 In this section, we briefly review this method. [sent-48, score-0.037]
35 Next, we discuss one of its inherent limitations, which motivates us to propose a possible way to overcome it. [sent-49, score-0.092]
36 For simplicity, we restrict this section to Boltzmann machines. [sent-50, score-0.027]
37 The goal is to compute pair marginals $P(x_i, x_j)$. Loopy belief propagation computes approximating pair marginals $Q_{ij}(x_i, x_j)$ by applying the belief propagation algorithm for trees to loopy graphs, i.e. [sent-52, score-2.34]
38 it computes messages according to $m_{ij}(x_j) \propto \sum_{x_i} \Psi_{ij}(x_i, x_j)\, m_{i \setminus j}(x_i)$, (2) [sent-54, score-0.057]
39 in which $m_{i \setminus j}(x_i) = \prod_{k \in N(i) \setminus j} m_{ki}(x_i)$ (3) are the incoming messages to node $i$ except from node $j$, with single node marginals $Q_i(x_i) \propto \prod_{k \in N(i)} m_{ki}(x_i)$ (4). If the procedure converges (which is not guaranteed in loopy graphs), the resulting approximating pair marginals are $Q_{ij}(x_i, x_j) \propto \Psi_{ij}(x_i, x_j)\, m_{i \setminus j}(x_i)\, m_{j \setminus i}(x_j)$. (5) [sent-55, score-0.972]
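Equations (2)–(5) admit a short, direct implementation. The sketch below is one plausible reading of them for the binary Boltzmann machine above; the threshold-splitting convention, message container, tolerances, and lack of damping are our own choices.

```python
import numpy as np

XS = np.array([-1.0, 1.0])  # the two states of a binary unit

def pair_potential(i, j, W, theta, deg):
    """Psi_ij(x_i, x_j), with each node's threshold spread evenly over its edges."""
    return np.exp(W[i, j] * np.outer(XS, XS)
                  + (theta[i] / deg[i]) * XS[:, None]
                  + (theta[j] / deg[j]) * XS[None, :])

def loopy_bp(W, theta, max_iter=1000, tol=1e-10):
    """Sequential message passing: update (2) with the incoming products of (3)."""
    n = len(theta)
    nbrs = {i: [j for j in range(n) if j != i and W[i, j] != 0] for i in range(n)}
    deg = {i: max(len(nbrs[i]), 1) for i in range(n)}
    m = {(i, j): np.ones(2) / 2 for i in range(n) for j in nbrs[i]}
    for _ in range(max_iter):
        delta = 0.0
        for (i, j), old in list(m.items()):
            incoming = np.ones(2)
            for k in nbrs[i]:
                if k != j:
                    incoming = incoming * m[(k, i)]
            new = pair_potential(i, j, W, theta, deg).T @ incoming  # sum over x_i
            new /= new.sum()
            delta = max(delta, np.abs(new - old).max())
            m[(i, j)] = new
        if delta < tol:
            break  # convergence is not guaranteed on loopy graphs
    return m, nbrs, deg

def bp_pair_marginal(i, j, m, nbrs, deg, W, theta):
    """Approximating pair marginal Q_ij of equation (5)."""
    mi, mj = np.ones(2), np.ones(2)
    for k in nbrs[i]:
        if k != j:
            mi = mi * m[(k, i)]
    for k in nbrs[j]:
        if k != i:
            mj = mj * m[(k, j)]
    q = pair_potential(i, j, W, theta, deg) * np.outer(mi, mj)
    return q / q.sum()
```

On a tree this reproduces the exact pair marginals; on a loopy graph it yields the approximation $Q_{ij}$ of equation (5), whose limitation is discussed next.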
40 In general, the exact pair marginals will be of the form $P(x_i, x_j) \propto \exp(w^{\mathrm{eff}}_{ij} x_i x_j)\, f_i(x_i)\, f_j(x_j)$ (6), which has an effective interaction $w^{\mathrm{eff}}_{ij}$. [sent-57, score-0.371]
41 With loops in the graph, however, the loops will contribute to $w^{\mathrm{eff}}_{ij}$, and the result will in general be different from $w_{ij}$. [sent-59, score-0.118]
42 If we compare (6) with (5), we see that loopy belief propagation assumes $w^{\mathrm{eff}}_{ij} = w_{ij}$, ignoring contributions from loops. [sent-60, score-1.152]
43 Now suppose we would know $w^{\mathrm{eff}}_{ij}$ in advance; then a better approximation could be expected if we could model approximate pair marginals of the form $Q_{ij}(x_i, x_j) \propto \Psi_{ij}(x_i, x_j)^{1/c_{ij}}\, f_i(x_i)\, f_j(x_j)$. (7) [sent-61, score-0.377]
44 The factors $f_i$ in (7) are to be determined by some propagation algorithm. [sent-64, score-0.344]
45 In the next sections, we generalize upon the above idea and introduce fractional belief propagation as a family of loopy belief propagation-like algorithms parameterized by scale parameters $c_{ij}$. [sent-65, score-2.124]
46 The resulting approximating clique marginals will be of the form $Q_\alpha(x_\alpha) \propto \Psi_\alpha(x_\alpha)^{1/c_\alpha} \prod_{i \in V_\alpha} f_i(x_i)$ (8), where $V_\alpha$ is the set of nodes in clique $\alpha$. [sent-66, score-0.754]
47 The issue of how to set the parameters $c_\alpha$ is the subject of section 6. [sent-67, score-0.027]
48 4 A family of approximate free energies The new class of approximating methods will be formulated via a new class of approximating free energies. [sent-68, score-0.655]
49 The exact free energy of a model with clique potentials $\Psi_\alpha$ is $F(Q) = -\sum_\alpha \sum_{x_\alpha} Q(x_\alpha) \log \Psi_\alpha(x_\alpha) + \sum_x Q(x) \log Q(x)$. (9) [sent-69, score-0.609]
50 It is well known that the joint distribution can be recovered by minimization of the free energy (9), $P = \operatorname{argmin}_Q F(Q)$ (10), under the constraint $\sum_x Q(x) = 1$. [sent-70, score-0.408]
51 The idea is now to construct an approximate free energy and compute its minimum. [sent-71, score-0.447]
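Equation (9) and the minimization property (10) can be checked by enumeration on a tiny model. The sketch below, with illustrative naming, verifies that at the minimizer $Q = P$ the free energy equals minus the log partition function.

```python
import itertools
import numpy as np

def exact_free_energy(Q, log_psi):
    """F(Q) of equation (9): -E_Q[sum_alpha log Psi_alpha] + E_Q[log Q],
    with Q given as a dict from configurations to probabilities."""
    return sum(q * (np.log(q) - log_psi(x)) for x, q in Q.items() if q > 0)

# Tiny sanity check: at the minimizer Q = P, we have F(P) = -log Z.
W = np.array([[0.0, 0.5], [0.5, 0.0]])
theta = np.array([0.2, -0.1])
log_psi = lambda x: 0.5 * np.dot(x, W @ np.array(x)) + np.dot(theta, x)
states = list(itertools.product([-1, 1], repeat=2))
Z = sum(np.exp(log_psi(x)) for x in states)
P = {x: np.exp(log_psi(x)) / Z for x in states}
assert np.isclose(exact_free_energy(P, log_psi), -np.log(Z))
```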
52 A popular approximate free energy is based on the Bethe assumption, which basically states that $Q$ is approximately tree-like, $Q(x) \approx \prod_\alpha Q_\alpha(x_\alpha) \prod_i Q_i(x_i)^{1 - n_i}$ (11), in which $n_i$ is the number of cliques that contain node $i$. [sent-73, score-0.594]
53 This assumption is exact if the factor graph [6] of the model is a tree. [sent-74, score-0.058]
54 It can be shown that minima of the Bethe free energy are fixed points of the loopy belief propagation algorithm [2]. [sent-76, score-1.533]
55 In our proposal, we generalize upon the Bethe assumption and make the parameterized assumption $Q(x) \approx \prod_\alpha Q_\alpha(x_\alpha)^{c_\alpha} \prod_i Q_i(x_i)^{c_i}$ (13), with $c_i = 1 - \sum_{\alpha \ni i} c_\alpha$. [sent-77, score-0.208]
56 The term with single node marginals is constructed to deal with overcounted terms. [sent-79, score-0.316]
57 This class of free energies trivially contains the Bethe free energy ($c_\alpha = 1$). [sent-81, score-0.257]
58 In addition, it includes the variational mean field free energy, conventionally written as $F_{\mathrm{MF}}$, as a limiting case for $c_\alpha \to \infty$ (implying an effective interaction of strength zero). [sent-82, score-0.41]
59 If this limit is taken in (14), terms linear in $c_\alpha$ will dominate and act as a penalty term for non-factorial entropies. [sent-83, score-0.025]
60 Thirdly, it contains the recently derived free energy used to upper bound the log partition function [7]. [sent-86, score-0.351]
61 This one is recovered if, for pair-wise cliques, the $c_{ij}$'s are set to the edge appearance probabilities in the so-called spanning tree polytope of the graph. [sent-87, score-0.055]
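The extracted equations (13)–(14) are unreadable above, so the following sketch encodes one consistent reconstruction of the parameterized free energy: pair-entropy terms weighted by $c_{ij}$ and node-entropy terms weighted by $1 - \sum_j c_{ij}$. Setting $c_{ij} = 1$ on every edge recovers the Bethe free energy (11)–(12); the exact scaling in the paper's equation (14) is not recoverable from the extraction, so treat the form below as an assumption.

```python
import numpy as np

def fractional_free_energy(Q_pair, Q_node, Psi_pair, c, nbrs):
    """One plausible form of the parameterized free energy (14):
    F_c = sum_{(ij)} ( -E_{Q_ij}[log Psi_ij] + c_ij * E_{Q_ij}[log Q_ij] )
          + sum_i (1 - sum_{j in N(i)} c_ij) * E_{Q_i}[log Q_i].
    Q_pair[(i, j)] and Psi_pair[(i, j)] are 2x2 tables keyed with i < j;
    c_ij = 1 for all edges reduces this to the Bethe free energy."""
    F = 0.0
    for e, q in Q_pair.items():
        F += np.sum(q * (c[e] * np.log(q) - np.log(Psi_pair[e])))
    for i, qi in Q_node.items():
        ci = 1.0 - sum(c[tuple(sorted((i, j)))] for j in nbrs[i])
        F += ci * np.sum(qi * np.log(qi))
    return F
```

In this form, rewriting the entropy terms shows $F_c = E - \sum_i S_i + \sum_{(ij)} c_{ij} I_{ij}$ with $I_{ij}$ the pairwise mutual information, which makes the large-$c$ penalty on non-factorial entropies and the mean-field limit explicit.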
62 5 Fractional belief propagation In this section we will use the fixed point equations to generalize Pearl’s algorithm to fractional belief propagation as a heuristic to minimize $F_c$. [sent-89, score-0.911]
63 Here, we do not worry too much about guaranteed convergence. [sent-90, score-1.162]
64 If convergence is a problem, one can always resort to direct minimization of $F_c$ using, e.g., CCCP [3]. [sent-91, score-0.101]
65 If loopy belief propagation converges, its solution is guaranteed to be a local minimum of the Bethe free energy; we expect a similar situation for $F_c$. [sent-95, score-0.378]
66 We obtain fixed point equations (15), (16), and (17) from the constrained minimization of $F_c$. [sent-97, score-0.027]
67 We notice that the resulting $Q_{ij}$ indeed has the functional dependency desired in (8). [sent-99, score-0.029]
68 Inspired by Pearl’s loopy belief propagation algorithm, we use the above equations to formulate fractional belief propagation (see Algorithm 1). [sent-100, score-2.309]
69 With $c_{ij} = 1$, fractional belief propagation is equivalent to standard loopy belief propagation. Algorithm 1 Fractional Belief Propagation 1: initialize messages 2: repeat 3: for all $(ij)$ do 4: update $m_{ij}$ according to (15). [sent-105, score-1.194]
70 5: update $Q_i$, $Q_{ij}$ according to (17) using the new $m_{ij}$ and the old messages. [sent-106, score-0.042]
71 7: end for 8: until convergence criterion is met (or maximum number of iterations is exceeded, i.e., failure) 9: return $\{Q_i\}$, $\{Q_{ij}\}$ [sent-108, score-0.17]
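Since equations (15)–(17) are lost in the extraction, the sketch below implements the fractional update in its tree-reweighted form, the special case the text connects to [7]; with all $c_{ij} = 1$ it reduces to standard loopy belief propagation (node potentials kept separate rather than folded into the pair potentials), consistent with the remark above. The message parameterization, damping, and all names are our own assumptions rather than the paper's exact equations.

```python
import numpy as np

XS = np.array([-1.0, 1.0])

def fractional_bp(W, theta, c, max_iter=2000, tol=1e-10, damp=0.5):
    """Fractional updates in tree-reweighted form on a binary Boltzmann machine.
    c[(i, j)] (with i < j, one entry per edge) are the scale parameters;
    c = 1 on every edge reproduces standard loopy belief propagation."""
    n = len(theta)
    nbrs = {i: [j for j in range(n) if j != i and W[i, j] != 0] for i in range(n)}
    edge = lambda a, b: (min(a, b), max(a, b))
    phi = {i: np.exp(theta[i] * XS) for i in range(n)}     # node potentials
    psi = {e: np.exp(W[e] * np.outer(XS, XS)) for e in c}  # symmetric edge tables
    m = {(i, j): np.ones(2) / 2 for i in range(n) for j in nbrs[i]}
    for _ in range(max_iter):
        delta = 0.0
        for (i, j), old in list(m.items()):
            cij = c[edge(i, j)]
            pre = phi[i] * m[(j, i)] ** (cij - 1.0)        # reverse-message correction
            for k in nbrs[i]:
                if k != j:
                    pre = pre * m[(k, i)] ** c[edge(k, i)]
            new = (psi[edge(i, j)] ** (1.0 / cij)) @ pre   # sum over x_i; table is symmetric
            new /= new.sum()
            new = damp * old + (1.0 - damp) * new
            delta = max(delta, np.abs(new - old).max())
            m[(i, j)] = new
        if delta < tol:
            break
    Q_node = {}
    for i in range(n):
        q = phi[i].copy()
        for k in nbrs[i]:
            q = q * m[(k, i)] ** c[edge(k, i)]
        Q_node[i] = q / q.sum()
    Q_pair = {}
    for (i, j) in c:                                       # Q_ij with the 1/c_ij exponent of (8)
        q = psi[(i, j)] ** (1.0 / c[(i, j)]) \
            * np.outer(Q_node[i] / m[(j, i)], Q_node[j] / m[(i, j)])
        Q_pair[(i, j)] = q / q.sum()
    return Q_node, Q_pair
```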
72 As a theoretical footnote, we mention a different (generally more greedy) algorithm, which has the same fixed points as fractional belief propagation. [sent-109, score-0.027]
73 The minimization of the $Q_i$'s using this free energy in the mean-field limit leads to the well known mean field equations. [sent-112, score-0.029]
74 6 Tuning $c$ using linear response theory Now the question is, how do we set the parameters $c_{ij}$? [sent-113, score-0.121]
75 Unfortunately, we do not have access to the true pair marginals, but if we would have estimates that improve upon $Q_{ij}$, we can compute new parameters such that $Q_{ij}$ is closer to those estimates. [sent-115, score-0.231]
76 However, with the new parameters the estimates will be changed as well, and this procedure should be iterated. [sent-116, score-0.059]
77 In this paper, we use linear response theory [10] to improve upon $Q_{ij}$; for simplicity, we restrict ourselves to Boltzmann machines with binary units. [sent-119, score-0.241]
78 Linear response theory has been applied previously to improve upon pair marginals (or correlations) in the naive mean field approximation [11] and in loopy belief propagation [12]. [sent-122, score-1.68]
79 To iteratively compute new scaling parameters from the linear response corrections, we use a gradient descent like algorithm. [sent-123, score-0.211]
80 By iteratively computing the linear response marginals and adapting the scale parameters in the gradient descent direction, we can optimize $c$; see Algorithm 2. [sent-125, score-0.186]
81 Each linear response estimate can be computed numerically by applying fractional belief propagation to a Boltzmann machine with perturbed parameters $w$ and $\theta$. [sent-126, score-0.153]
82 Partial derivatives with respect to $c_{ij}$, required for the gradient in (22), can be computed numerically by rerunning fractional belief propagation with perturbed parameters $c$. [sent-127, score-1.134]
83 In this procedure, the computational cost of updating $c$ is on the order of $n + L$ times the cost of a single fractional belief propagation run, where $n$ is the number of nodes and $L$ is the number of edges. [sent-128, score-0.112]
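Algorithm 2 itself is not reproduced in the extraction; the sketch below shows one plausible structure for it: estimate improved correlations by finite-difference linear response ($\partial \langle x_i \rangle / \partial \theta_j = \langle x_i x_j \rangle - \langle x_i \rangle \langle x_j \rangle$ for $\pm 1$ units), then take numerical gradient steps on $c$. It reuses the hypothetical `fractional_bp` routine sketched above; the squared-error loss, step size, and perturbation sizes are our own choices.

```python
import numpy as np

def node_means(Q_node):
    """<x_i> from the single-node marginals (states ordered as [-1, +1])."""
    return np.array([Q_node[i] @ np.array([-1.0, 1.0]) for i in sorted(Q_node)])

def linear_response_corrs(W, theta, c, eps=1e-4):
    """Improved correlations <x_i x_j> via finite-difference linear response:
    chi_ij = d<x_i>/d theta_j, and <x_i x_j> ~= chi_ij + <x_i><x_j>."""
    Q_node, _ = fractional_bp(W, theta, c)
    mean = node_means(Q_node)
    corr = {}
    for j in range(len(theta)):
        th = theta.copy(); th[j] += eps
        mean_p = node_means(fractional_bp(W, th, c)[0])
        for e in c:
            if j in e:
                i = e[0] if e[1] == j else e[1]
                corr[e] = (mean_p[i] - mean[i]) / eps + mean[i] * mean[j]
    return corr

def fbp_corrs(W, theta, c):
    """Correlations read directly from the fractional BP pair marginals."""
    _, Q_pair = fractional_bp(W, theta, c)
    xx = np.outer([-1.0, 1.0], [-1.0, 1.0])
    return {e: float(np.sum(q * xx)) for e, q in Q_pair.items()}

def tune_c(W, theta, c, n_steps=50, eta=0.1, eps=1e-3):
    """Outer loop: move c so that the FBP correlations approach the (iteratively
    recomputed) linear-response targets, via a squared-error surrogate loss."""
    for _ in range(n_steps):
        target = linear_response_corrs(W, theta, c)
        loss = lambda cc: sum((fbp_corrs(W, theta, cc)[e] - target[e]) ** 2 for e in cc)
        base = loss(c)
        grad = {}
        for e in c:                      # numerical partials: rerun FBP per edge
            cp = dict(c); cp[e] = cp[e] + eps
            grad[e] = (loss(cp) - base) / eps
        c = {e: max(c[e] - eta * grad[e], 1e-3) for e in c}  # keep scales positive
    return c
```

Note that the targets are recomputed inside the loop: as the text says, the linear response estimates themselves change with the new parameters, so the procedure must be iterated.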
84 7 Numerical results We applied the method to a Boltzmann machine in which the nodes are connected according to a square grid with periodic boundary conditions. [sent-129, score-0.073]
85 Weights and thresholds were drawn at random; we generated networks and compared results of standard loopy belief propagation to results obtained by fractional belief propagation, where the scale parameters were obtained by Algorithm 2. [sent-131, score-2.29]
86 The iterations were stopped if the maximum change in $c$ was less than a small threshold, or if a preset maximum number of iterations was exceeded. [sent-135, score-0.136]
87 Throughout the procedure, fractional belief propagation was run with a convergence criterion on the maximal difference between messages in successive iterations (one iteration is one cycle over all weights). [sent-136, score-0.946]
88 In our experiment, all (fractional) belief propagation runs converged. [sent-137, score-0.78]
89 The number of updates of $c$ ranged between 20 and 80. [sent-138, score-0.027]
90 After optimization, we found (inverse) scale parameters ranging over a wide interval. [sent-139, score-0.036]
91 In the left panel, it can be seen that the procedure can lead to significant improvements. [sent-141, score-0.028]
wordName wordTfidf (topN-words)
[('belief', 0.436), ('loopy', 0.372), ('propagation', 0.344), ('fractional', 0.322), ('marginals', 0.265), ('clique', 0.19), ('bethe', 0.19), ('energy', 0.177), ('free', 0.174), ('sh', 0.152), ('boltzmann', 0.128), ('response', 0.096), ('energies', 0.083), ('pearl', 0.074), ('bp', 0.073), ('basically', 0.069), ('approximating', 0.067), ('upon', 0.066), ('approximate', 0.065), ('graphical', 0.062), ('marginalization', 0.062), ('loops', 0.059), ('messages', 0.057), ('hp', 0.055), ('exceeded', 0.054), ('cliques', 0.054), ('nijmegen', 0.054), ('dh', 0.051), ('node', 0.051), ('limitation', 0.05), ('yedidia', 0.049), ('generalize', 0.047), ('pair', 0.047), ('inference', 0.046), ('nodes', 0.042), ('update', 0.042), ('dy', 0.041), ('iterations', 0.041), ('parameterized', 0.04), ('substitution', 0.04), ('motivates', 0.038), ('convergence', 0.038), ('shortly', 0.037), ('scale', 0.036), ('potentials', 0.036), ('ix', 0.035), ('met', 0.035), ('guaranteed', 0.034), ('resort', 0.034), ('limiting', 0.032), ('numerically', 0.032), ('exact', 0.032), ('compute', 0.031), ('estimates', 0.031), ('summation', 0.031), ('ph', 0.031), ('ex', 0.031), ('connected', 0.031), ('eld', 0.03), ('algorithm', 0.03), ('minimization', 0.029), ('dependency', 0.029), ('descent', 0.029), ('procedure', 0.028), ('improve', 0.028), ('formulate', 0.028), ('overcome', 0.028), ('initialize', 0.028), ('access', 0.028), ('recovered', 0.028), ('interaction', 0.027), ('section', 0.027), ('thirdly', 0.027), ('footnote', 0.027), ('ranged', 0.027), ('polytope', 0.027), ('cccp', 0.027), ('geert', 0.027), ('grooteplein', 0.027), ('heskes', 0.027), ('kikuchi', 0.027), ('propagations', 0.027), ('snn', 0.027), ('tc', 0.027), ('wim', 0.027), ('yuille', 0.027), ('equations', 0.027), ('minimized', 0.027), ('inherent', 0.026), ('heuristic', 0.026), ('naive', 0.026), ('graph', 0.026), ('machines', 0.026), ('linear', 0.025), ('family', 0.025), ('criterion', 0.025), ('intractable', 0.025), ('tuning', 0.025), ('formulating', 0.025), ('ez', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 94 nips-2002-Fractional Belief Propagation
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free energy as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
2 0.58179438 189 nips-2002-Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy
Author: Tom Heskes
Abstract: We extend recent work on the connection between loopy belief propagation and the Bethe free energy. Constrained minimization of the Bethe free energy can be turned into an unconstrained saddle-point problem. Both converging double-loop algorithms and standard loopy belief propagation can be interpreted as attempts to solve this saddle-point problem. Stability analysis then leads us to conclude that stable fixed points of loopy belief propagation must be (local) minima of the Bethe free energy. Perhaps surprisingly, the converse need not be the case: minima can be unstable fixed points. We illustrate this with an example and discuss implications. 1
3 0.21716632 82 nips-2002-Exponential Family PCA for Belief Compression in POMDPs
Author: Nicholas Roy, Geoffrey J. Gordon
Abstract: Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. The intractability of these algorithms is due to a great extent to their generating an optimal policy over the entire belief space. However, in real POMDP problems most belief states are unlikely, and there is a structured, low-dimensional manifold of plausible beliefs embedded in the high-dimensional belief space. We introduce a new method for solving large-scale POMDPs by taking advantage of belief space sparsity. We reduce the dimensionality of the belief space by exponential family Principal Components Analysis [1], which allows us to turn the sparse, highdimensional belief space into a compact, low-dimensional representation in terms of learned features of the belief state. We then plan directly on the low-dimensional belief features. By planning in a low-dimensional space, we can find policies for POMDPs that are orders of magnitude larger than can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and also on a mobile robot navigation task.
4 0.16486575 32 nips-2002-Approximate Inference and Protein-Folding
Author: Chen Yanover, Yair Weiss
Abstract: Side-chain prediction is an important subtask in the protein-folding problem. We show that finding a minimal energy side-chain configuration is equivalent to performing inference in an undirected graphical model. The graphical model is relatively sparse yet has many cycles. We used this equivalence to assess the performance of approximate inference algorithms in a real-world setting. Specifically we compared belief propagation (BP), generalized BP (GBP) and naive mean field (MF). In cases where exact inference was possible, max-product BP always found the global minimum of the energy (except in few cases where it failed to converge), while other approximation algorithms of similar complexity did not. In the full protein data set, maxproduct BP always found a lower energy configuration than the other algorithms, including a widely used protein-folding software (SCWRL). 1
5 0.12987286 4 nips-2002-A Differential Semantics for Jointree Algorithms
Author: James D. Park, Adnan Darwiche
Abstract: A new approach to inference in belief networks has been recently proposed, which is based on an algebraic representation of belief networks using multi-linear functions. According to this approach, the key computational question is that of representing multi-linear functions compactly, since inference reduces to a simple process of evaluating and differentiating such functions. We show here that mainstream inference algorithms based on jointrees are a special case of this approach in a very precise sense. We use this result to prove new properties of jointree algorithms, and then discuss some of its practical and theoretical implications. 1
6 0.12571985 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
7 0.12531757 181 nips-2002-Self Supervised Boosting
8 0.11999314 96 nips-2002-Generalized² Linear² Models
9 0.10814803 152 nips-2002-Nash Propagation for Loopy Graphical Games
10 0.10133452 205 nips-2002-Value-Directed Compression of POMDPs
11 0.087348625 169 nips-2002-Real-Time Particle Filters
12 0.075496949 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables
13 0.065933704 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
14 0.060250863 74 nips-2002-Dynamic Structure Super-Resolution
15 0.059308287 21 nips-2002-Adaptive Classification by Variational Kalman Filtering
16 0.05832589 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
17 0.056048289 174 nips-2002-Regularized Greedy Importance Sampling
18 0.055668663 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
19 0.054659348 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks
20 0.054252621 173 nips-2002-Recovering Intrinsic Images from a Single Image
topicId topicWeight
[(0, -0.171), (1, -0.017), (2, -0.183), (3, 0.078), (4, -0.0), (5, 0.14), (6, -0.164), (7, 0.404), (8, -0.377), (9, -0.095), (10, -0.256), (11, 0.185), (12, -0.177), (13, -0.032), (14, 0.136), (15, -0.071), (16, 0.053), (17, 0.054), (18, -0.049), (19, 0.01), (20, -0.032), (21, -0.041), (22, 0.104), (23, -0.056), (24, 0.003), (25, -0.019), (26, 0.071), (27, 0.019), (28, -0.009), (29, 0.063), (30, -0.05), (31, -0.075), (32, 0.015), (33, -0.041), (34, -0.001), (35, 0.053), (36, -0.096), (37, -0.049), (38, 0.094), (39, -0.099), (40, -0.052), (41, -0.025), (42, -0.055), (43, -0.002), (44, -0.031), (45, 0.051), (46, -0.059), (47, -0.001), (48, -0.052), (49, -0.057)]
simIndex simValue paperId paperTitle
same-paper 1 0.98503911 94 nips-2002-Fractional Belief Propagation
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free energy as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
2 0.9543848 189 nips-2002-Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy
Author: Tom Heskes
Abstract: We extend recent work on the connection between loopy belief propagation and the Bethe free energy. Constrained minimization of the Bethe free energy can be turned into an unconstrained saddle-point problem. Both converging double-loop algorithms and standard loopy belief propagation can be interpreted as attempts to solve this saddle-point problem. Stability analysis then leads us to conclude that stable fixed points of loopy belief propagation must be (local) minima of the Bethe free energy. Perhaps surprisingly, the converse need not be the case: minima can be unstable fixed points. We illustrate this with an example and discuss implications. 1
3 0.51760155 4 nips-2002-A Differential Semantics for Jointree Algorithms
Author: James D. Park, Adnan Darwiche
Abstract: A new approach to inference in belief networks has been recently proposed, which is based on an algebraic representation of belief networks using multi-linear functions. According to this approach, the key computational question is that of representing multi-linear functions compactly, since inference reduces to a simple process of evaluating and differentiating such functions. We show here that mainstream inference algorithms based on jointrees are a special case of this approach in a very precise sense. We use this result to prove new properties of jointree algorithms, and then discuss some of its practical and theoretical implications. 1
4 0.51171196 32 nips-2002-Approximate Inference and Protein-Folding
Author: Chen Yanover, Yair Weiss
Abstract: Side-chain prediction is an important subtask in the protein-folding problem. We show that finding a minimal energy side-chain configuration is equivalent to performing inference in an undirected graphical model. The graphical model is relatively sparse yet has many cycles. We used this equivalence to assess the performance of approximate inference algorithms in a real-world setting. Specifically we compared belief propagation (BP), generalized BP (GBP) and naive mean field (MF). In cases where exact inference was possible, max-product BP always found the global minimum of the energy (except in few cases where it failed to converge), while other approximation algorithms of similar complexity did not. In the full protein data set, maxproduct BP always found a lower energy configuration than the other algorithms, including a widely used protein-folding software (SCWRL). 1
5 0.44307274 82 nips-2002-Exponential Family PCA for Belief Compression in POMDPs
Author: Nicholas Roy, Geoffrey J. Gordon
Abstract: Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are intractable for large models. The intractability of these algorithms is due to a great extent to their generating an optimal policy over the entire belief space. However, in real POMDP problems most belief states are unlikely, and there is a structured, low-dimensional manifold of plausible beliefs embedded in the high-dimensional belief space. We introduce a new method for solving large-scale POMDPs by taking advantage of belief space sparsity. We reduce the dimensionality of the belief space by exponential family Principal Components Analysis [1], which allows us to turn the sparse, highdimensional belief space into a compact, low-dimensional representation in terms of learned features of the belief state. We then plan directly on the low-dimensional belief features. By planning in a low-dimensional space, we can find policies for POMDPs that are orders of magnitude larger than can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and also on a mobile robot navigation task.
6 0.37323871 152 nips-2002-Nash Propagation for Loopy Graphical Games
7 0.34858978 96 nips-2002-Generalized² Linear² Models
8 0.34701541 133 nips-2002-Learning to Perceive Transparency from the Statistics of Natural Scenes
9 0.3146933 205 nips-2002-Value-Directed Compression of POMDPs
10 0.25134158 181 nips-2002-Self Supervised Boosting
11 0.22053489 174 nips-2002-Regularized Greedy Importance Sampling
12 0.21393314 80 nips-2002-Exact MAP Estimates by (Hyper)tree Agreement
13 0.21219803 169 nips-2002-Real-Time Particle Filters
14 0.20398477 47 nips-2002-Branching Law for Axons
15 0.18862325 179 nips-2002-Scaling of Probability-Based Optimization Algorithms
16 0.18790068 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
17 0.18331583 188 nips-2002-Stability-Based Model Selection
18 0.17671311 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
19 0.17516467 24 nips-2002-Adaptive Scaling for Feature Selection in SVMs
20 0.1715173 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
topicId topicWeight
[(42, 0.085), (54, 0.098), (55, 0.039), (57, 0.396), (67, 0.018), (68, 0.043), (74, 0.065), (75, 0.017), (92, 0.043), (98, 0.083)]
simIndex simValue paperId paperTitle
same-paper 1 0.86417508 94 nips-2002-Fractional Belief Propagation
Author: Wim Wiegerinck, Tom Heskes
Abstract: We consider loopy belief propagation for approximate inference in probabilistic graphical models. A limitation of the standard algorithm is that clique marginals are computed as if there were no loops in the graph. To overcome this limitation, we introduce fractional belief propagation. Fractional belief propagation is formulated in terms of a family of approximate free energies, which includes the Bethe free energy and the naive mean-field free energy as special cases. Using the linear response correction of the clique marginals, the scale parameters can be tuned. Simulation results illustrate the potential merits of the approach.
2 0.68937957 187 nips-2002-Spikernels: Embedding Spiking Neurons in Inner-Product Spaces
Author: Lavi Shpigelman, Yoram Singer, Rony Paz, Eilon Vaadia
Abstract: Inner-product operators, often referred to as kernels in statistical learning, define a mapping from some input space into a feature space. The focus of this paper is the construction of biologically-motivated kernels for cortical activities. The kernels we derive, termed Spikernels, map spike count sequences into an abstract vector space in which we can perform various prediction tasks. We discuss in detail the derivation of Spikernels and describe an efficient algorithm for computing their value on any two sequences of neural population spike counts. We demonstrate the merits of our modeling approach using the Spikernel and various standard kernels for the task of predicting hand movement velocities from cortical recordings. In all of our experiments all the kernels we tested outperform the standard scalar product used in regression with the Spikernel consistently achieving the best performance. 1
3 0.60834908 135 nips-2002-Learning with Multiple Labels
Author: Rong Jin, Zoubin Ghahramani
Abstract: In this paper, we study a special kind of learning problem in which each training instance is given a set of (or distribution over) candidate class labels and only one of the candidate labels is the correct one. Such a problem can occur, e.g., in an information retrieval setting where a set of words is associated with an image, or if class labels are organized hierarchically. We propose a novel discriminative approach for handling the ambiguity of class labels in the training examples. The experiments with the proposed approach over five different UCI datasets show that our approach is able to find the correct label among the set of candidate labels and actually achieve performance close to the case when each training instance is given a single correct label. In contrast, naive methods degrade rapidly as more ambiguity is introduced into the labels. 1
4 0.58845294 189 nips-2002-Stable Fixed Points of Loopy Belief Propagation Are Local Minima of the Bethe Free Energy
Author: Tom Heskes
Abstract: We extend recent work on the connection between loopy belief propagation and the Bethe free energy. Constrained minimization of the Bethe free energy can be turned into an unconstrained saddle-point problem. Both converging double-loop algorithms and standard loopy belief propagation can be interpreted as attempts to solve this saddle-point problem. Stability analysis then leads us to conclude that stable fixed points of loopy belief propagation must be (local) minima of the Bethe free energy. Perhaps surprisingly, the converse need not be the case: minima can be unstable fixed points. We illustrate this with an example and discuss implications. 1
5 0.47301471 4 nips-2002-A Differential Semantics for Jointree Algorithms
Author: James D. Park, Adnan Darwiche
Abstract: A new approach to inference in belief networks has been recently proposed, which is based on an algebraic representation of belief networks using multi-linear functions. According to this approach, the key computational question is that of representing multi-linear functions compactly, since inference reduces to a simple process of evaluating and differentiating such functions. We show here that mainstream inference algorithms based on jointrees are a special case of this approach in a very precise sense. We use this result to prove new properties of jointree algorithms, and then discuss some of its practical and theoretical implications. 1
6 0.4686532 82 nips-2002-Exponential Family PCA for Belief Compression in POMDPs
7 0.45512885 32 nips-2002-Approximate Inference and Protein-Folding
8 0.44987768 152 nips-2002-Nash Propagation for Loopy Graphical Games
10 0.43636021 123 nips-2002-Learning Attractor Landscapes for Learning Motor Primitives
11 0.43513364 137 nips-2002-Location Estimation with a Differential Update Network
12 0.43504491 169 nips-2002-Real-Time Particle Filters
13 0.4332068 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks
14 0.42717063 37 nips-2002-Automatic Derivation of Statistical Algorithms: The EM Family and Beyond
15 0.42565066 163 nips-2002-Prediction and Semantic Association
16 0.42302394 205 nips-2002-Value-Directed Compression of POMDPs
17 0.42192495 5 nips-2002-A Digital Antennal Lobe for Pattern Equalization: Analysis and Design
18 0.4208073 124 nips-2002-Learning Graphical Models with Mercer Kernels
19 0.41653502 203 nips-2002-Using Tarjan's Red Rule for Fast Dependency Tree Construction
20 0.4159877 179 nips-2002-Scaling of Probability-Based Optimization Algorithms