nips nips2011 nips2011-296 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yusuke Watanabe
Abstract: While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. Especially, if the interactions of random variables in a graphical model are strong, the behavior of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the type (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in situations where the strength of the interactions is “sufficiently” small in certain senses. In contrast, our condition covers arbitrarily strong interactions on the specified class of signed graphs. The result of this paper is based on a recent theoretical advance in the LBP algorithm: the connection with the graph zeta function.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the type (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. [sent-5, score-0.419]
2 In all previous works, uniqueness is guaranteed only in situations where the strength of the interactions is “sufficiently” small in certain senses. [sent-8, score-0.315]
3 In contrast, our condition covers arbitrarily strong interactions on the specified class of signed graphs. [sent-9, score-0.565]
4 The result of this paper is based on a recent theoretical advance in the LBP algorithm: the connection with the graph zeta function. [sent-10, score-0.548]
5 In this paper we propose a novel approach to the uniqueness problem of the LBP fixed point. [sent-14, score-0.234]
6 An important step toward better understanding of the algorithm has been the variational interpretation by the Bethe free energy function; the fixed points of LBP correspond to the stationary points of the Bethe free energy function [7]. [sent-16, score-0.368]
7 For the uniqueness problem of the LBP fixed point, a number of conditions have been proposed [12, 13, 14, 15]. [sent-18, score-0.259]
8 (Note that the convergence property implies uniqueness by definition. [sent-19, score-0.254]
9 ) In all previous works, the uniqueness is guaranteed only in situations where the strength of the interactions is “sufficiently” small in certain senses. [sent-20, score-0.315]
10 In this paper we propose a completely new approach to the uniqueness condition for the LBP algorithm; it should be emphasized that the strength of the interactions on the specified class of signed graphs can be arbitrarily large under this condition. [sent-21, score-0.91]
11 We utilize the connection between the Bethe free energy and the graph zeta function established in [16]; the determinant of the Hessian of the Bethe free energy equals the reciprocal of the graph zeta function up to a positive factor. [sent-29, score-1.474]
12 Combined with the index formula [16], the uniqueness problem is reduced to a positivity property of the graph zeta function. [sent-30, score-0.868]
13 2 Loopy Belief Propagation, Bethe free energy and graph zeta function  In this section, we provide basic facts on LBP and its connections with the Bethe free energy and the graph zeta function. [sent-36, score-1.419]
14 Throughout this paper, G = (V, E) is a connected undirected graph with V , the vertices, and E, the undirected edges. [sent-37, score-0.243]
15 The model has binary (i.e., xi ∈ {±1}) variables, Z is the normalization constant and ψij , ψi are positive functions called compatibility functions. [sent-40, score-0.215]
16 Without loss of generality we assume that ψij (xi , xj ) = exp(Jij xi xj ) and ψi (xi ) = exp(hi xi ). [sent-41, score-0.476]
17 In various applications, we would like to compute the marginal distributions pi(xi) := Σ_{x\{xi}} p(x) and pij(xi, xj) := Σ_{x\{xi,xj}} p(x), (2) though exact computations are often intractable due to the combinatorial complexities. [sent-43, score-0.276]
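As a concrete illustration of (1)-(2), the marginals of a small binary model can be computed by brute-force enumeration; the following Python sketch (function and variable names are our own, not from the paper) makes the exponential cost explicit.

```python
import itertools
import math

def exact_marginals(J, h):
    """Brute-force single-node marginals p_i(x_i = +1) for the binary model
    p(x) ∝ exp(Σ_ij J_ij x_i x_j + Σ_i h_i x_i).  Exponential in n, so only
    usable for tiny graphs.  J: {(i, j): coupling}, h: list of local fields."""
    n = len(h)
    Z = 0.0
    p_plus = [0.0] * n  # accumulates unnormalized P(x_i = +1)
    for x in itertools.product((-1, +1), repeat=n):
        energy = sum(v * x[i] * x[j] for (i, j), v in J.items())
        energy += sum(hi * xi for hi, xi in zip(h, x))
        w = math.exp(energy)
        Z += w
        for i in range(n):
            if x[i] == +1:
                p_plus[i] += w
    return [p / Z for p in p_plus]
```

The 2^n sum over configurations is exactly what belief propagation avoids on trees.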
18 If the graph is a tree, however, they are efficiently computed by the belief propagation algorithm [1]. [sent-44, score-0.337]
19 The update rule of messages is given by µ^{new}_{i→j}(xj) ∝ Σ_{xi} ψji(xj, xi) ψi(xi) ∏_{k∈Ni\j} µ_{k→i}(xi), (3) where Ni is the neighborhood of i ∈ V. [sent-48, score-0.246]
20 If the messages converge to some fixed point {µ∞_{i→j}(xj)}, the approximations of pi(xi) and pij(xi, xj) are calculated as bi(xi) ∝ ψi(xi) ∏_{k∈Ni} µ∞_{k→i}(xi), (4) [sent-50, score-0.65]
21 bij(xi, xj) ∝ ψij(xi, xj) ψi(xi) ψj(xj) ∏_{k∈Ni\j} µ∞_{k→i}(xi) ∏_{k∈Nj\i} µ∞_{k→j}(xj), (5) with bij(xi, xj) > 0 and normalizations Σ_{xi} bi(xi) = 1 and Σ_{xi,xj} bij(xi, xj) = 1.
22 From (3) and (5), the constraints Σ_{xj} bij(xi, xj) = bi(xi) are automatically satisfied. [sent-52, score-0.302]
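The updates (3)-(5) can be sketched in Python for the binary model with ψij = exp(Jij xi xj) and ψi = exp(hi xi); the log-odds (tanh/atanh) parameterization below is a standard equivalent rewriting of the message update, and all names are illustrative, not from the paper.

```python
import math

def lbp(J, h, iters=200):
    """Sketch of loopy BP, eqs. (3)-(5), for p(x) ∝ exp(Σ Jij xi xj + Σ hi xi).
    Each message is stored via its log-odds m_{i→j} = (1/2) log µ_{i→j}(+1)/µ_{i→j}(-1),
    for which (3) becomes m_{i→j} = atanh(tanh(Jij) tanh(hi + Σ_{k∈Ni\j} m_{k→i}))."""
    n = len(h)
    nbrs = {i: [] for i in range(n)}
    coup = {}
    for (i, j), v in J.items():
        nbrs[i].append(j)
        nbrs[j].append(i)
        coup[(i, j)] = coup[(j, i)] = v
    m = {d: 0.0 for d in coup}
    for _ in range(iters):
        m = {(i, j): math.atanh(math.tanh(coup[(i, j)]) *
                                math.tanh(h[i] + sum(m[(k, i)]
                                                     for k in nbrs[i] if k != j)))
             for (i, j) in m}
    # beliefs b_i(+1) as in eq. (4)
    return [(1 + math.tanh(h[i] + sum(m[(k, i)] for k in nbrs[i]))) / 2
            for i in range(n)]
```

On a tree this reproduces the exact marginals; on a loopy graph the fixed point, if reached, gives the Bethe approximation.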
23 2.1 The Bethe free energy  The LBP algorithm is interpreted as a variational problem of the Bethe free energy function [7]. [sent-53, score-0.368]
24 In this formulation, the domain of the function is given by L(G) = { {qi, qij} : qij(xi, xj) > 0, Σ_{xj} qij(xi, xj) = qi(xi), Σ_{xi,xj} qij(xi, xj) = 1 }, (6) and the elements of this set are called pseudomarginals. [sent-54, score-1.241]
25 The objective function, called the Bethe free energy, is defined on L(G) by F(q) := − Σ_{ij∈E} Σ_{xi,xj} qij(xi, xj) log ψij(xi, xj) − Σ_{i∈V} Σ_{xi} qi(xi) log ψi(xi) + Σ_{ij∈E} Σ_{xi,xj} qij(xi, xj) log qij(xi, xj) + Σ_{i∈V} (1 − di) Σ_{xi} qi(xi) log qi(xi), (7) where di = |Ni|. [sent-58, score-1.958]
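A direct evaluation of (7) at given pseudomarginals can be sketched as follows; the data layout (dicts keyed by ±1 configurations) is our own illustrative choice, not the paper's.

```python
import math

def bethe_free_energy(q_i, q_ij, J, h):
    """Evaluate the Bethe free energy (7) at pseudomarginals q ∈ L(G), for
    ψij = exp(Jij xi xj) and ψi = exp(hi xi).
    q_i: list of dicts {xi: prob}; q_ij: {(i, j): {(xi, xj): prob}};
    J: {(i, j): Jij}; h: list of hi."""
    degree = [0] * len(q_i)
    for (i, j) in q_ij:
        degree[i] += 1
        degree[j] += 1
    F = 0.0
    for (i, j), qij in q_ij.items():
        for (xi, xj), q in qij.items():
            F -= q * J[(i, j)] * xi * xj      # -E_q[log ψij]
            F += q * math.log(q)              # pairwise entropy term
    for i, qi in enumerate(q_i):
        for xi, q in qi.items():
            F -= q * h[i] * xi                # -E_q[log ψi]
            F += (1 - degree[i]) * q * math.log(q)
    return F
```

The stationary points of this function over L(G) are exactly the LBP fixed points, as stated above.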
26 2.2 Zeta function and Ihara’s formula  In this section, we explain the connection of LBP to the graph zeta function. [sent-63, score-0.569]
27 A closed geodesic c is prime if there is no closed geodesic d and natural number m (≥ 2) such that c = d^m. [sent-73, score-0.26]
28 An equivalence class of prime closed geodesics is called a prime cycle. [sent-77, score-0.262]
29 For given (complex or real) weights u = (ue)e∈E, Ihara’s graph zeta function [18, 19] is given by ζG(u) := ∏_p (1 − g(p))^{−1}, where the product runs over the prime cycles p of G and g(p) := u_{e1} · · · u_{ek} for p = (e1, . . . , ek). (8) [sent-79, score-0.525]
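For computation, ζG(u) need not be evaluated as an infinite product: by the standard determinant formula, ζG(u)^{-1} = det(I − UM), where M is the non-backtracking (edge adjacency) matrix over directed edges and U = diag(ue). A minimal stdlib-only Python sketch, assuming a simple graph (no loops or parallel edges); names are illustrative:

```python
def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-14:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def zeta_inverse(edges, u):
    """ζ_G(u)^{-1} = det(I - UM); edges: undirected pairs (i, j);
    u: weight per directed edge (both orientations present)."""
    darts = [d for e in edges for d in (e, e[::-1])]
    n = len(darts)
    I_UM = [[float(a == b) for b in range(n)] for a in range(n)]
    for a, (i, j) in enumerate(darts):
        for b, (k, l) in enumerate(darts):
            if k == j and l != i:  # dart b follows dart a, no backtracking
                I_UM[a][b] -= u[(i, j)]
    return det(I_UM)
```

For a single cycle the prime cycles are the two orientations, so ζ^{-1} = (1 − ∏ue)², and for a tree there are no prime cycles at all, so ζ^{-1} = 1; both are easy sanity checks.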
30 The following theorem gives the connection between the Bethe free energy and the zeta function. [sent-85, score-0.603]
31 More precisely, the theorem asserts that the determinant of the Hessian of the Bethe free energy function is the reciprocal of the zeta function up to a positive factor. [sent-86, score-0.635]
32 The following equality holds at any point of L(G): ζG(u)^{−1} = det(∇²F) · ∏_{ij∈E} ∏_{xi,xj=±1} qij(xi, xj) · ∏_{i∈V} ∏_{xi=±1} qi(xi)^{1−di} · 2^{2|V|+4|E|}, (9) where the derivatives are taken over an affine coordinate of L(G): mi = E_{qi}[xi], χij = E_{qij}[xi xj], and u_{i→j} = Cov_{qij}[xi, xj] / √(Var_{qi}[xi] Var_{qj}[xj]) = (χij − mi mj) / √((1 − mi²)(1 − mj²)). (10) [sent-88, score-0.838]
33 Since the weight (10) in Theorem 1 is symmetric with respect to the inversion of edges, the zeta function can be reduced to undirected edge weights. [sent-90, score-0.482]
34 To avoid confusion, we introduce a notation: the zeta function of undirected edge weights β = (βij )ij∈E is denoted by ZG (β). [sent-91, score-0.482]
35 The result shows a new type of approach towards uniqueness conditions. [sent-96, score-0.234]
36 1 Existing conditions on uniqueness There have been many works on the uniqueness and/or convergence of the LBP algorithm for discrete graphical models [12, 13, 14, 15] and Gaussian graphical models [21]. [sent-99, score-0.557]
37 This theorem gives the uniqueness property by bounding the strengths of the interactions, i.e., the magnitudes |Jij |. [sent-106, score-0.297]
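For contrast with the sign-based condition of this paper, a strength-based test can be sketched: build the non-backtracking matrix with entries tanh|Jij| and check that its spectral radius is below one, in the spirit of convergence conditions such as [14]. The exact form of those bounds should be checked against the cited works; the sketch below, with all names our own, is illustrative only.

```python
import math

def nonbacktracking_tanh_matrix(J):
    """Matrix over directed edges with entry tanh|J| on every allowed
    non-backtracking transition.  J: {(i, j): coupling}."""
    darts = [d for e in J for d in (e, e[::-1])]
    strength = {}
    for (i, j), v in J.items():
        strength[(i, j)] = strength[(j, i)] = math.tanh(abs(v))
    n = len(darts)
    A = [[0.0] * n for _ in range(n)]
    for a, (i, j) in enumerate(darts):
        for b, (k, l) in enumerate(darts):
            if k == j and l != i:
                A[a][b] = strength[(i, j)]
    return A

def spectral_radius(A, iters=300):
    """Perron-root estimate for a nonnegative matrix by normalized iteration."""
    n = len(A)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(A[r][c] * v[c] for c in range(n)) for r in range(n)]
        s = sum(w)
        if s == 0.0:
            return 0.0
        rho = s / sum(v)
        v = [x / s for x in w]
    return rho

def uniqueness_sufficient(J):
    """True when the spectral radius falls below 1 (sufficient, not necessary)."""
    return spectral_radius(nonbacktracking_tanh_matrix(J)) < 1.0
```

Note that such a test depends only on |Jij| and therefore cannot see the sign information exploited in this paper.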
38 In fact, the behavior of the LBP algorithm can be dramatically different if the signs of the compatibility functions are changed. [sent-113, score-0.208]
39 Note that each edge compatibility function ψij tends to force the variables xi and xj to be equal if Jij > 0 and unequal if Jij < 0; the first case is called an attractive interaction and the latter a repulsive one. [sent-114, score-0.42]
40 In contrast to the above uniqueness conditions, we pursue another approach: we use the information of signs, {sgn Jij }ij∈E , rather than the strengths. [sent-115, score-0.234]
41 In this paper, we characterize the signed graphs that guarantee the uniqueness of the solution; this result is stated in Theorem 3. [sent-116, score-0.791]
42 A signed graph, (G, s), is a graph equipped with a sign map, s, from the edges to {±1}. [sent-119, score-0.729]
43 A compatibility function defines the sign function, s, by s(ij) = sgn Jij . [sent-120, score-0.207]
44 The deletion and subgraph operations for a signed graph are defined naturally by restricting the sign function. [sent-124, score-0.791]
45 A w-reduction of a signed graph (G, s) is a signed graph that is obtained by one of the following operations: (w1) Erasure of a vertex of degree two. [sent-126, score-1.293]
46 (An edge is a bridge if its deletion increases the number of connected components. [sent-133, score-0.272]
47 A signed graph is w-reduced if no w-reduction is applicable. [sent-139, score-0.635]
48 Any signed graph reduces to a unique w-reduced signed graph, called its complete w-reduction. [sent-140, score-1.36]
49 Dn is a signed graph obtained by duplicating each edge of Cn with plus and minus signs. [sent-150, score-0.776]
50 Two signed graphs (G, s) and (G, s′) are said to be gauge equivalent if there exists a map g : V −→ {±1} such that s′(ij) = s(ij)g(i)g(j). [sent-152, score-0.673]
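Gauge equivalence can be tested by brute force over the 2^{|V|} candidate maps g; a small Python sketch with illustrative names (edges stored as ordered pairs):

```python
from itertools import product

def gauge_equivalent(vertices, s1, s2):
    """Is there g: V -> {±1} with s2(ij) = s1(ij) g(i) g(j) on every edge?
    s1, s2: dicts {(i, j): ±1} over the same edge set.  Note the product of
    signs around any cycle is a gauge invariant, which is what rules
    equivalence out when no g exists."""
    V = list(vertices)
    for vals in product((-1, 1), repeat=len(V)):
        g = dict(zip(V, vals))
        if all(s2[e] == s1[e] * g[e[0]] * g[e[1]] for e in s1):
            return True
    return False
```

For instance, flipping a single vertex of an all-plus triangle flips its two incident edges, which this test recognizes as gauge equivalent, while the all-minus triangle is not (its cycle sign product is −1 rather than +1).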
51 For a signed graph (G, s) the following conditions are equivalent. [sent-155, score-0.66]
52 (iv) (K4 , s− ) and its gauge equivalent signed graphs. [sent-160, score-0.59]
53 3 Examples and experiments In this subsection we present concrete examples of signed graphs which do or do not satisfy the condition of Theorem 3. [sent-164, score-0.618]
54 3) 2 × 2 grid graph: This graph does not satisfy the condition for any sign because its complete w-reduction is different from the signed graphs in item 2 of Theorem 3. [sent-174, score-0.862]
55 4 Proofs: conditions in terms of graph zeta function The aim of this subsection is to prove Theorem 3. [sent-189, score-0.573]
56 For the proof, Lemma 2, which is purely a result of the graph zeta function, is utilized. [sent-190, score-0.525]
57 1 Graph theoretic results We denote by G − the deletion of an undirected edge from a graph G and by G/ the contraction. [sent-194, score-0.371]
58 A minor of a graph is obtained by repeated applications of the deletion, contraction and removal of isolated vertices. [sent-195, score-0.268]
59 The deletion and contraction operations have a natural meaning in the context of the graph zeta function, as follows: Lemma 1. [sent-196, score-0.563]
60 Let ij be an edge; then ζG−ij (u) = ζG (ũ), where ũe = 0 if [e] = ij and ũe = ue otherwise. [sent-198, score-0.692]
61 Let ij be a non-loop edge; then ζG/ij (u) = ζG (ũ), where ũe = 1 if [e] = ij and ũe = ue otherwise. [sent-200, score-0.692]
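The deletion part of Lemma 1 is easy to confirm numerically using the standard determinant formula ζG(u)^{-1} = det(I − UM) for the zeta function (M the non-backtracking matrix over directed edges). The sketch below, with illustrative names, compares deleting one edge of a triangle against keeping the triangle but setting that edge's weight to zero.

```python
def det(M):
    # determinant by Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-14:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def zeta_inv(edges, u):
    # zeta^{-1} = det(I - UM) over directed edges (standard determinant formula)
    darts = [d for e in edges for d in (e, e[::-1])]
    n = len(darts)
    A = [[float(a == b) for b in range(n)] for a in range(n)]
    for a, (i, j) in enumerate(darts):
        for b, (k, l) in enumerate(darts):
            if k == j and l != i:
                A[a][b] -= u[(i, j)]
    return det(A)

# Lemma 1 (deletion): erasing edge 20 from the triangle should match
# keeping the triangle while setting the weight of edge 20 to zero.
tri = [(0, 1), (1, 2), (2, 0)]
u = {d: 0.4 for e in tri for d in (e, e[::-1])}
u0 = dict(u)
u0[(2, 0)] = u0[(0, 2)] = 0.0
path = [(0, 1), (1, 2)]
up = {d: 0.4 for e in path for d in (e, e[::-1])}
lhs, rhs = zeta_inv(path, up), zeta_inv(tri, u0)
```

Both sides come out equal to 1 here, since the remaining path has no prime cycles at all.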
62 From the prime cycle representation of zeta functions, both of the assertions are trivial. [sent-202, score-0.461]
63 Next, to prove Theorem 3, we formally define the notions of deletion, contraction and minor for signed graphs [23]. [sent-203, score-0.605]
64 For a signed graph the signed-deletion of an edge is just the deletion of the edge along with the sign on it. [sent-204, score-0.945]
65 The signed-contraction of a non-loop edge ij ∈ E is defined up to gauge equivalence as follows. [sent-205, score-0.461]
66 For any non-loop edge ij, there is a gauge equivalent signed graph that has the sign + on ij. [sent-206, score-0.892]
67 The resulting signed graph is determined up to gauge equivalence. [sent-208, score-0.751]
68 A signed minor of a signed graph is obtained by repeated applications of the signed-deletion, signed-contraction, and removal of isolated vertices. [sent-209, score-1.178]
69 For a signed graph, (G, s), the following conditions are equivalent. [sent-211, score-0.499]
70 That is, if βij ∈ Is(ij) for all ij ∈ E then ZG (β) > 0, where β = (βij )ij∈E , I+ = [0, 1) and I− = (−1, 0]. [sent-214, score-0.249]
71 That is, if βij ∈ Is(ij) for all ij ∈ E, then ZG (β) ≥ 0.
72 (iv) (K4 , s− ) and its gauge equivalent signed graphs. [sent-221, score-0.59]
73 The uniqueness condition in Theorem 3 is equivalent to all the conditions in this lemma. [sent-223, score-0.297]
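Condition (i) of Lemma 2 can also be probed numerically: evaluate ZG(β)^{-1} = det(I − UM) (the standard determinant formula, with the undirected weight βij placed on both orientations of ij) at weights drawn from the intervals Is(ij). The sketch below is illustrative; the two checks used with it (a signed cycle staying positive on its whole interval, and K4 with all-plus signs going negative at βij = 0.8) are our own numerical observations, not statements from the paper.

```python
def det(M):
    # determinant by Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    n = len(M)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-14:
            return 0.0
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def Z_inverse(edges, beta):
    """Z_G(beta)^{-1} for undirected weights: put beta_ij on both orientations
    of ij and evaluate det(I - UM) with the non-backtracking matrix M."""
    u = {}
    for e in edges:
        u[e] = u[e[::-1]] = beta[e]
    darts = [d for e in edges for d in (e, e[::-1])]
    n = len(darts)
    A = [[float(a == b) for b in range(n)] for a in range(n)]
    for a, (i, j) in enumerate(darts):
        for b, (k, l) in enumerate(darts):
            if k == j and l != i:
                A[a][b] -= u[(i, j)]
    return det(A)
```

A sign flip of Z_inverse somewhere inside the intervals Is(ij) immediately certifies that the signed graph is not U-type.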
74 Here, we remark on properties of this condition (the proofs are straightforward from the definition and Lemma 2): (1) (G, s) is U-type iff its gauge equivalents are U-type. [sent-224, score-0.217]
75 (2) If (G, s) is U-type then its signed minors are U-type. [sent-225, score-0.522]
76 If (G, s) is weakly U-type, then its signed minors are weakly U-type; this is obvious from Lemma 1. [sent-231, score-0.598]
77 However, direct computation of the zeta function of (B2 , s+ ) shows that this signed graph is not weakly U-type. [sent-232, score-1.037]
78 Note that if (G, s) does not contain (B2 , s+ ) as a signed minor, then any w-reduction of (G, s) also does not contain (B2 , s+ ) as a signed minor; we can check this property for each type of w-reduction, (w1), (w2) and (w3). [sent-236, score-1.044]
79 Therefore, it is sufficient to show that if a w-reduced signed graph (G, s) does not contain (B2 , s+ ) as a signed minor, then it is one of the five types. [sent-237, score-1.158]
80 First, if the nullity of G is less than three, it is not hard to see that the signed graph is of type (i), (ii) or (iii). [sent-239, score-0.699]
81 Secondly, we consider the case that the graph G has nullity three. [sent-240, score-0.225]
82 Note that all w-reduced signed graphs of nullity two have the signed minor (B1 , +). [sent-241, score-1.144]
83 Notice that if (G, s) satisfies (12), then its gauge equivalents, deletions and signed-contractions have the same property. [sent-255, score-0.208]
84 So far, we have proven the assertion for w-reduced graphs; we now extend the proof to arbitrary signed graphs. [sent-256, score-0.511]
85 For any signed graph, the complete w-reduction can be obtained by first applying the reductions (w1, w2) and then reducing the bridges (w3), because (w3) always makes degrees bigger and does not create a loop. [sent-257, score-0.516]
86 Let (G , s ) be a (w3)-reduction of a signed graph (G, s), i. [sent-260, score-0.635]
87 Let (G , s ) be a (w1) or (w2)-reduction of a signed graph (G, s). [sent-272, score-0.635]
88 We conclude the uniqueness of the solution from the above index sum theorem. [sent-291, score-0.256]
89 ) We can choose Jij and hi such that ∏_{ij∈E} qij(xi, xj) ∏_{i∈V} qi(xi)^{1−di} ∝ exp( Σ_{ij∈E} Jij xi xj + Σ_{i∈V} hi xi ). (14) [sent-297, score-0.506]
90 This construction implies that q corresponds to an LBP fixed point with compatibility functions {Jij , hi }. [sent-298, score-0.365]
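The construction (14) can be made concrete in the simplest case of a single edge: given a strictly positive pair pseudomarginal, the coefficients J and h are recovered by a parity (discrete Fourier) expansion of log q. A Python sketch with illustrative names:

```python
import math

def fit_edge_model(q):
    """Given a strictly positive pair pseudomarginal q[(x0, x1)] on {±1}^2,
    recover (J, h0, h1) with q(x0, x1) ∝ exp(J x0 x1 + h0 x0 + h1 x1) —
    the single-edge instance of the construction (14).  The normalization
    constant only shifts log q, so it drops out of each coefficient."""
    J = 0.25 * sum(x0 * x1 * math.log(q[(x0, x1)])
                   for x0 in (-1, 1) for x1 in (-1, 1))
    h0 = 0.25 * sum(x0 * math.log(q[(x0, x1)])
                    for x0 in (-1, 1) for x1 in (-1, 1))
    h1 = 0.25 * sum(x1 * math.log(q[(x0, x1)])
                    for x0 in (-1, 1) for x1 in (-1, 1))
    return J, h0, h1
```

On a general graph the same parity expansion applies edge by edge to the left-hand side of (14).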
91 5 Concluding remarks In this paper we have developed a new approach to the uniqueness problem of the LBP algorithm. [sent-302, score-0.234]
92 The uniqueness problem is reduced to properties of graph zeta functions (Lemma 2), using the index formula. [sent-304, score-0.759]
93 In contrast to the existing conditions, our uniqueness guarantee covers graphical models with strong interactions. [sent-305, score-0.266]
94 The uniqueness problem is reduced to the positivity of the graph zeta function on a restricted set, rather than the hypercube of size one. [sent-309, score-0.827]
95 If we can check the positivity of graph zeta functions theoretically or algorithmically, the result can be used to give better guarantees of uniqueness. [sent-310, score-0.598]
96 CCCP algorithms to minimize the Bethe and Kikuchi free energies: Convergent alternatives to belief propagation. [sent-367, score-0.42]
97 Convexity arguments for efficient minimization of the Bethe and Kikuchi free energies. [sent-383, score-0.335]
98 On the uniqueness of loopy belief propagation fixed points. [sent-387, score-0.496]
99 Graph zeta function in the bethe free energy and loopy belief propagation. [sent-412, score-0.926]
100 Loopy belief propagation, Bethe free energy and graph zeta function. [sent-438, score-0.783]
wordName wordTfidf (topN-words)
[('signed', 0.474), ('lbp', 0.387), ('zeta', 0.364), ('ij', 0.249), ('uniqueness', 0.234), ('bethe', 0.218), ('jij', 0.172), ('graph', 0.161), ('qij', 0.138), ('xj', 0.128), ('gauge', 0.116), ('xi', 0.11), ('ue', 0.097), ('deletion', 0.092), ('propagation', 0.091), ('signs', 0.088), ('energy', 0.088), ('compatibility', 0.086), ('loopy', 0.086), ('free', 0.085), ('belief', 0.085), ('graphs', 0.083), ('edge', 0.077), ('prime', 0.074), ('nullity', 0.064), ('zg', 0.064), ('sign', 0.064), ('sgn', 0.057), ('interactions', 0.053), ('geodesic', 0.053), ('det', 0.052), ('cycles', 0.05), ('minor', 0.049), ('qi', 0.048), ('ihara', 0.048), ('minors', 0.048), ('positivity', 0.046), ('bij', 0.046), ('ni', 0.045), ('directed', 0.043), ('theorem', 0.043), ('complete', 0.042), ('claim', 0.042), ('undirected', 0.041), ('hi', 0.041), ('closed', 0.04), ('watanabe', 0.039), ('ub', 0.039), ('minus', 0.038), ('contraction', 0.038), ('condition', 0.038), ('weakly', 0.038), ('proof', 0.037), ('geodesics', 0.036), ('behaviors', 0.034), ('kikuchi', 0.032), ('researches', 0.032), ('graphical', 0.032), ('dn', 0.03), ('edges', 0.03), ('unique', 0.029), ('reciprocal', 0.028), ('pseudomarginals', 0.028), ('strength', 0.028), ('determinant', 0.027), ('check', 0.027), ('ei', 0.027), ('message', 0.027), ('lemma', 0.027), ('messages', 0.026), ('bridge', 0.026), ('equivalents', 0.026), ('followings', 0.026), ('duplicating', 0.026), ('conditions', 0.025), ('kn', 0.025), ('iv', 0.025), ('mi', 0.024), ('subgraphs', 0.024), ('convergent', 0.024), ('vertex', 0.023), ('cycle', 0.023), ('connection', 0.023), ('mooij', 0.023), ('subsection', 0.023), ('index', 0.022), ('hypercube', 0.022), ('variational', 0.022), ('formula', 0.021), ('property', 0.02), ('pij', 0.02), ('bm', 0.02), ('tanh', 0.02), ('xed', 0.02), ('isolated', 0.02), ('loop', 0.019), ('hessian', 0.019), ('tokyo', 0.019), ('equivalence', 0.019), ('called', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 296 nips-2011-Uniqueness of Belief Propagation on Signed Graphs
Author: Yusuke Watanabe
Abstract: While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. Especially, if the interactions of random variables in a graphical model are strong, the behaviors of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the types (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in the situations where the strength of the interactions are “sufficiently” small in certain senses. In contrast, our condition covers arbitrary strong interactions on the specified class of signed graphs. The result of this paper is based on the recent theoretical advance in the LBP algorithm; the connection with the graph zeta function.
2 0.26425132 158 nips-2011-Learning unbelievable probabilities
Author: Xaq Pitkow, Yashar Ahmadian, Ken D. Miller
Abstract: Loopy belief propagation performs approximate inference on graphical models with loops. One might hope to compensate for the approximation by adjusting model parameters. Learning algorithms for this purpose have been explored previously, and the claim has been made that every set of locally consistent marginals can arise from belief propagation run on a graphical model. On the contrary, here we show that many probability distributions have marginals that cannot be reached by belief propagation using any set of model parameters or any learning algorithm. We call such marginals ‘unbelievable.’ This problem occurs whenever the Hessian of the Bethe free energy is not positive-definite at the target marginals. All learning algorithms for belief propagation necessarily fail in these cases, producing beliefs or sets of beliefs that may even be worse than the pre-learning approximation. We then show that averaging inaccurate beliefs, each obtained from belief propagation using model parameters perturbed about some learned mean values, can achieve the unbelievable marginals. 1
3 0.18778986 239 nips-2011-Robust Lasso with missing and grossly corrupted observations
Author: Nasser M. Nasrabadi, Trac D. Tran, Nam Nguyen
Abstract: This paper studies the problem of accurately recovering a sparse vector β from highly corrupted linear measurements y = Xβ + e + w where e is a sparse error vector whose nonzero entries may be unbounded and w is a bounded noise. We propose a so-called extended Lasso optimization which takes into consideration sparse prior information of both β and e . Our first result shows that the extended Lasso can faithfully recover both the regression and the corruption vectors. Our analysis is relied on a notion of extended restricted eigenvalue for the design matrix X. Our second set of results applies to a general class of Gaussian design matrix X with i.i.d rows N (0, Σ), for which we provide a surprising phenomenon: the extended Lasso can recover exact signed supports of both β and e from only Ω(k log p log n) observations, even the fraction of corruption is arbitrarily close to one. Our analysis also shows that this amount of observations required to achieve exact signed support is optimal. 1
4 0.092176393 117 nips-2011-High-Dimensional Graphical Model Selection: Tractable Graph Families and Necessary Conditions
Author: Animashree Anandkumar, Vincent Tan, Alan S. Willsky
Abstract: We consider the problem of Ising and Gaussian graphical model selection given n i.i.d. samples from the model. We propose an efficient threshold-based algorithm for structure estimation based on conditional mutual information thresholding. This simple local algorithm requires only loworder statistics of the data and decides whether two nodes are neighbors in the unknown graph. We identify graph families for which the proposed algorithm has low sample and computational complexities. Under some transparent assumptions, we establish that the proposed algorithm is −4 structurally consistent (or sparsistent) when the number of samples scales as n = Ω(Jmin log p), where p is the number of nodes and Jmin is the minimum edge potential. We also develop novel non-asymptotic techniques for obtaining necessary conditions for graphical model selection. Keywords: Graphical model selection, high-dimensional learning, local-separation property, necessary conditions, typical sets, Fano’s inequality.
5 0.091147155 248 nips-2011-Semi-supervised Regression via Parallel Field Regularization
Author: Binbin Lin, Chiyuan Zhang, Xiaofei He
Abstract: This paper studies the problem of semi-supervised learning from the vector field perspective. Many of the existing work use the graph Laplacian to ensure the smoothness of the prediction function on the data manifold. However, beyond smoothness, it is suggested by recent theoretical work that we should ensure second order smoothness for achieving faster rates of convergence for semisupervised regression problems. To achieve this goal, we show that the second order smoothness measures the linearity of the function, and the gradient field of a linear function has to be a parallel vector field. Consequently, we propose to find a function which minimizes the empirical error, and simultaneously requires its gradient field to be as parallel as possible. We give a continuous objective function on the manifold and discuss how to discretize it by using random points. The discretized optimization problem turns out to be a sparse linear system which can be solved very efficiently. The experimental results have demonstrated the effectiveness of our proposed approach. 1
6 0.069494277 146 nips-2011-Learning Higher-Order Graph Structure with Features by Structure Penalty
7 0.069437206 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
8 0.061084468 213 nips-2011-Phase transition in the family of p-resistances
9 0.059318364 67 nips-2011-Data Skeletonization via Reeb Graphs
10 0.057894859 157 nips-2011-Learning to Search Efficiently in High Dimensions
11 0.056912582 273 nips-2011-Structural equations and divisive normalization for energy-dependent component analysis
12 0.056298543 262 nips-2011-Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation
13 0.056135159 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods
14 0.055019788 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
15 0.054288466 199 nips-2011-On fast approximate submodular minimization
16 0.05199134 17 nips-2011-Accelerated Adaptive Markov Chain for Partition Function Computation
17 0.051365294 71 nips-2011-Directed Graph Embedding: an Algorithm based on Continuous Limits of Laplacian-type Operators
18 0.050477285 171 nips-2011-Metric Learning with Multiple Kernels
19 0.04961586 274 nips-2011-Structure Learning for Optimization
20 0.048248846 265 nips-2011-Sparse recovery by thresholded non-negative least squares
topicId topicWeight
[(0, 0.137), (1, 0.006), (2, -0.038), (3, -0.06), (4, -0.059), (5, -0.017), (6, -0.105), (7, -0.031), (8, 0.087), (9, -0.12), (10, -0.027), (11, -0.034), (12, -0.101), (13, -0.038), (14, 0.002), (15, 0.078), (16, 0.142), (17, -0.072), (18, -0.036), (19, 0.031), (20, -0.015), (21, 0.086), (22, 0.179), (23, 0.028), (24, 0.209), (25, -0.034), (26, -0.037), (27, -0.124), (28, -0.043), (29, 0.098), (30, -0.046), (31, 0.007), (32, 0.04), (33, 0.063), (34, -0.003), (35, 0.065), (36, -0.093), (37, -0.012), (38, -0.046), (39, 0.053), (40, 0.081), (41, 0.072), (42, -0.01), (43, -0.113), (44, -0.043), (45, -0.055), (46, -0.15), (47, -0.156), (48, -0.025), (49, -0.048)]
simIndex simValue paperId paperTitle
same-paper 1 0.95726228 296 nips-2011-Uniqueness of Belief Propagation on Signed Graphs
Author: Yusuke Watanabe
Abstract: While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. Especially, if the interactions of random variables in a graphical model are strong, the behaviors of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the types (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in the situations where the strength of the interactions are “sufficiently” small in certain senses. In contrast, our condition covers arbitrary strong interactions on the specified class of signed graphs. The result of this paper is based on the recent theoretical advance in the LBP algorithm; the connection with the graph zeta function.
2 0.84485722 158 nips-2011-Learning unbelievable probabilities
Author: Xaq Pitkow, Yashar Ahmadian, Ken D. Miller
Abstract: Loopy belief propagation performs approximate inference on graphical models with loops. One might hope to compensate for the approximation by adjusting model parameters. Learning algorithms for this purpose have been explored previously, and the claim has been made that every set of locally consistent marginals can arise from belief propagation run on a graphical model. On the contrary, here we show that many probability distributions have marginals that cannot be reached by belief propagation using any set of model parameters or any learning algorithm. We call such marginals ‘unbelievable.’ This problem occurs whenever the Hessian of the Bethe free energy is not positive-definite at the target marginals. All learning algorithms for belief propagation necessarily fail in these cases, producing beliefs or sets of beliefs that may even be worse than the pre-learning approximation. We then show that averaging inaccurate beliefs, each obtained from belief propagation using model parameters perturbed about some learned mean values, can achieve the unbelievable marginals. 1
3 0.72863895 31 nips-2011-An Application of Tree-Structured Expectation Propagation for Channel Decoding
Author: Pablo M. Olmos, Luis Salamanca, Juan Fuentes, Fernando Pérez-Cruz
Abstract: We show an application of a tree structure for approximate inference in graphical models using the expectation propagation algorithm. These approximations are typically used over graphs with short-range cycles. We demonstrate that these approximations also help in sparse graphs with long-range loops, as the ones used in coding theory to approach channel capacity. For asymptotically large sparse graph, the expectation propagation algorithm together with the tree structure yields a completely disconnected approximation to the graphical model but, for for finite-length practical sparse graphs, the tree structure approximation to the code graph provides accurate estimates for the marginal of each variable. Furthermore, we propose a new method for constructing the tree structure on the fly that might be more amenable for sparse graphs with general factors. 1
4 0.60698831 117 nips-2011-High-Dimensional Graphical Model Selection: Tractable Graph Families and Necessary Conditions
Author: Animashree Anandkumar, Vincent Tan, Alan S. Willsky
Abstract: We consider the problem of Ising and Gaussian graphical model selection given n i.i.d. samples from the model. We propose an efficient threshold-based algorithm for structure estimation based on conditional mutual information thresholding. This simple local algorithm requires only loworder statistics of the data and decides whether two nodes are neighbors in the unknown graph. We identify graph families for which the proposed algorithm has low sample and computational complexities. Under some transparent assumptions, we establish that the proposed algorithm is −4 structurally consistent (or sparsistent) when the number of samples scales as n = Ω(Jmin log p), where p is the number of nodes and Jmin is the minimum edge potential. We also develop novel non-asymptotic techniques for obtaining necessary conditions for graphical model selection. Keywords: Graphical model selection, high-dimensional learning, local-separation property, necessary conditions, typical sets, Fano’s inequality.
5 0.55640459 67 nips-2011-Data Skeletonization via Reeb Graphs
Author: Xiaoyin Ge, Issam I. Safa, Mikhail Belkin, Yusu Wang
Abstract: Recovering hidden structure from complex and noisy non-linear data is one of the most fundamental problems in machine learning and statistical inference. While such data is often high-dimensional, it is of interest to approximate it with a lowdimensional or even one-dimensional space, since many important aspects of data are often intrinsically low-dimensional. Furthermore, there are many scenarios where the underlying structure is graph-like, e.g, river/road networks or various trajectories. In this paper, we develop a framework to extract, as well as to simplify, a one-dimensional ”skeleton” from unorganized data using the Reeb graph. Our algorithm is very simple, does not require complex optimizations and can be easily applied to unorganized high-dimensional data such as point clouds or proximity graphs. It can also represent arbitrary graph structures in the data. We also give theoretical results to justify our method. We provide a number of experiments to demonstrate the effectiveness and generality of our algorithm, including comparisons to existing methods, such as principal curves. We believe that the simplicity and practicality of our algorithm will help to promote skeleton graphs as a data analysis tool for a broad range of applications.
6 0.54907864 274 nips-2011-Structure Learning for Optimization
7 0.49184054 146 nips-2011-Learning Higher-Order Graph Structure with Features by Structure Penalty
8 0.46333164 213 nips-2011-Phase transition in the family of p-resistances
9 0.43030453 170 nips-2011-Message-Passing for Approximate MAP Inference with Latent Variables
10 0.42739022 248 nips-2011-Semi-supervised Regression via Parallel Field Regularization
11 0.37837049 253 nips-2011-Signal Estimation Under Random Time-Warpings and Nonlinear Signal Alignment
12 0.37369198 17 nips-2011-Accelerated Adaptive Markov Chain for Partition Function Computation
13 0.36725339 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods
14 0.35840204 188 nips-2011-Non-conjugate Variational Message Passing for Multinomial and Binary Regression
15 0.35020158 150 nips-2011-Learning a Distance Metric from a Network
16 0.34977615 236 nips-2011-Regularized Laplacian Estimation and Fast Eigenvector Approximation
17 0.34854519 306 nips-2011-t-divergence Based Approximate Inference
18 0.34298846 242 nips-2011-See the Tree Through the Lines: The Shazoo Algorithm
19 0.33891526 71 nips-2011-Directed Graph Embedding: an Algorithm based on Continuous Limits of Laplacian-type Operators
20 0.32718131 137 nips-2011-Iterative Learning for Reliable Crowdsourcing Systems
topicId topicWeight
[(0, 0.023), (4, 0.051), (20, 0.05), (26, 0.021), (31, 0.062), (33, 0.01), (34, 0.036), (43, 0.081), (45, 0.064), (57, 0.028), (74, 0.071), (83, 0.069), (89, 0.295), (99, 0.05)]
simIndex simValue paperId paperTitle
1 0.85750097 136 nips-2011-Inverting Grice's Maxims to Learn Rules from Natural Language Extractions
Author: Mohammad S. Sorower, Janardhan R. Doppa, Walker Orr, Prasad Tadepalli, Thomas G. Dietterich, Xiaoli Z. Fern
Abstract: We consider the problem of learning rules from natural language text sources. These sources, such as news articles and web texts, are created by a writer to communicate information to a reader, where the writer and reader share substantial domain knowledge. Consequently, the texts tend to be concise and mention the minimum information necessary for the reader to draw the correct conclusions. We study the problem of learning domain knowledge from such concise texts, which is an instance of the general problem of learning in the presence of missing data. However, unlike standard approaches to missing data, in this setting we know that facts are more likely to be missing from the text in cases where the reader can infer them from the facts that are mentioned combined with the domain knowledge. Hence, we can explicitly model this “missingness” process and invert it via probabilistic inference to learn the underlying domain knowledge. This paper introduces a mention model that models the probability of facts being mentioned in the text based on what other facts have already been mentioned and domain knowledge in the form of Horn clause rules. Learning must simultaneously search the space of rules and learn the parameters of the mention model. We accomplish this via an application of Expectation Maximization within a Markov Logic framework. An experimental evaluation on synthetic and natural text data shows that the method can learn accurate rules and apply them to new texts to make correct inferences. Experiments also show that the method outperforms the standard EM approach that assumes mentions are missing at random.
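The "inverted missingness" reasoning can be made concrete with a toy Bayes computation, leaving out the Horn-clause search and EM machinery entirely. Assume a single deterministic rule A → B and a hypothetical parameter for how often an inferable fact is still mentioned; the numbers and names are illustrative, not from the paper.

```python
def posterior_fact_holds(p_true, p_mention_if_inferable):
    """P(B true | A mentioned, B not mentioned), under a toy mention model:
    the rule A -> B makes a true B inferable by the reader, so the writer
    omits it with probability 1 - p_mention_if_inferable; a false B is
    never mentioned. Inverting this explains away the missing mention."""
    omitted_but_true = p_true * (1 - p_mention_if_inferable)
    certainly_false = 1 - p_true
    return omitted_but_true / (omitted_but_true + certainly_false)
```

With a 0.5 prior and a 0.2 chance of mentioning an inferable fact, the posterior that B holds despite not being mentioned is 0.4/0.9 ≈ 0.44, i.e. the missing mention is only weak evidence against B, unlike under a missing-at-random assumption.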
same-paper 2 0.77600986 296 nips-2011-Uniqueness of Belief Propagation on Signed Graphs
Author: Yusuke Watanabe
Abstract: While loopy Belief Propagation (LBP) has been utilized in a wide variety of applications with empirical success, it comes with few theoretical guarantees. Especially, if the interactions of random variables in a graphical model are strong, the behaviors of the algorithm can be difficult to analyze due to underlying phase transitions. In this paper, we develop a novel approach to the uniqueness problem of the LBP fixed point; our new “necessary and sufficient” condition is stated in terms of graphs and signs, where the sign denotes the types (attractive/repulsive) of the interaction (i.e., compatibility function) on the edge. In all previous works, uniqueness is guaranteed only in the situations where the strength of the interactions are “sufficiently” small in certain senses. In contrast, our condition covers arbitrary strong interactions on the specified class of signed graphs. The result of this paper is based on the recent theoretical advance in the LBP algorithm; the connection with the graph zeta function.
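For context, standard sum-product LBP on a pairwise binary (Ising-type) model, where the sign of each coupling J_ij encodes an attractive or repulsive interaction, looks as follows. This is a generic sketch, not the paper's zeta-function uniqueness machinery; on a tree the fixed point is unique and the beliefs are exact, which the test below checks against brute-force enumeration.

```python
import numpy as np

def lbp_marginals(J, h, iters=200):
    """Sum-product LBP for p(x) ∝ exp(Σ_{i<j} J_ij x_i x_j + Σ_i h_i x_i),
    x_i ∈ {-1, +1}, with J a symmetric coupling matrix."""
    n = len(h)
    x = np.array([-1.0, 1.0])
    edges = [(i, j) for i in range(n) for j in range(n) if i < j and J[i, j] != 0]
    m = {}  # m[(i, j)]: message from i to j, length-2 array over x_j
    for (a, b) in edges:
        m[(a, b)] = np.full(2, 0.5)
        m[(b, a)] = np.full(2, 0.5)
    for _ in range(iters):
        new = {}
        for (i, j) in m:
            b = np.exp(h[i] * x)  # local evidence at i
            for (k, l) in m:      # incoming messages, excluding the one from j
                if l == i and k != j:
                    b = b * m[(k, i)]
            psi = np.exp(J[i, j] * np.outer(x, x))  # psi[x_i, x_j]
            msg = psi.T @ b
            new[(i, j)] = msg / msg.sum()
        m = new
    marg = []
    for i in range(n):
        b = np.exp(h[i] * x)
        for (k, l) in m:
            if l == i:
                b = b * m[(k, i)]
        marg.append(b / b.sum())
    return np.array(marg)  # marg[i] = [P(x_i = -1), P(x_i = +1)]
```

On a three-node chain with one attractive and one repulsive coupling, the computed marginals agree with exact enumeration to machine precision.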
3 0.69410694 217 nips-2011-Practical Variational Inference for Neural Networks
Author: Alex Graves
Abstract: Variational methods have been previously explored as a tractable approximation to Bayesian inference for neural networks. However the approaches proposed so far have only been applicable to a few simple network architectures. This paper introduces an easy-to-implement stochastic variational method (or equivalently, minimum description length loss function) that can be applied to most neural networks. Along the way it revisits several common regularisers from a variational perspective. It also provides a simple pruning heuristic that can both drastically reduce the number of network weights and lead to improved generalisation. Experimental results are provided for a hierarchical multidimensional recurrent neural network applied to the TIMIT speech corpus.
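Two ingredients of such a stochastic variational method can be sketched in isolation: the closed-form Gaussian KL divergence that plays the role of the MDL complexity term, and a reparameterised forward pass that perturbs the mean weights with posterior-scaled noise. A minimal sketch assuming a diagonal Gaussian posterior and Gaussian prior; the names are illustrative, not the paper's API.

```python
import numpy as np

def kl_gaussians(mu_q, sig_q, mu_p, sig_p):
    """KL(N(mu_q, sig_q^2) || N(mu_p, sig_p^2)): the per-weight complexity
    cost of the variational (MDL) loss, added to the expected data loss."""
    return (np.log(sig_p / sig_q)
            + (sig_q**2 + (mu_q - mu_p)**2) / (2 * sig_p**2) - 0.5)

def noisy_forward(w_mu, w_sig, x, rng):
    """Reparameterised linear layer: each forward pass samples weights
    w = mu + sig * eps, so ordinary backprop yields stochastic gradients
    of the expected loss with respect to (mu, sig)."""
    w = w_mu + w_sig * rng.standard_normal(w_mu.shape)
    return x @ w
```

The KL term is zero when posterior equals prior and grows as the posterior mean drifts or its variance shrinks, which is also what drives the pruning heuristic: weights whose posterior stays close to the prior carry little description length and can be removed.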
4 0.51658946 45 nips-2011-Beating SGD: Learning SVMs in Sublinear Time
Author: Elad Hazan, Tomer Koren, Nati Srebro
Abstract: We present an optimization approach for linear SVMs based on a stochastic primal-dual approach, where the primal step is akin to an importance-weighted SGD, and the dual step is a stochastic update on the importance weights. This yields an optimization method with a sublinear dependence on the training set size, and the first method for learning linear SVMs with runtime less than the size of the training set required for learning!
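The primal-dual interplay described here can be caricatured in a few lines. The sketch below samples examples from a multiplicative-weights distribution and takes hinge-loss SGD steps, but it recomputes every margin for the weight update, so it is not sublinear; the paper achieves sublinearity with additional sampling and sketching omitted here. All names and step sizes are illustrative.

```python
import numpy as np

def primal_dual_svm_sketch(X, y, iters=2000, lr=0.1, eta=0.05, seed=0):
    """Toy primal-dual loop: sample a training example from a
    multiplicative-weights distribution (dual), take a hinge-loss SGD
    step on it (primal), then upweight examples that still have small
    margin. The full-margin recomputation below makes this O(nd) per
    step, so it only illustrates the interplay of the two steps."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    log_p = np.zeros(n)                     # unnormalised log example weights
    for _ in range(iters):
        p = np.exp(log_p - log_p.max())
        p /= p.sum()
        i = rng.choice(n, p=p)
        if y[i] * (X[i] @ w) < 1:           # primal: hinge subgradient step
            w = w + lr * y[i] * X[i]
        losses = np.maximum(0.0, 1.0 - y * (X @ w))
        log_p += eta * losses               # dual: focus on hard examples
    return w
```

On two well-separated Gaussian clusters the returned hyperplane classifies every training point correctly.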
5 0.48447028 158 nips-2011-Learning unbelievable probabilities
Author: Xaq Pitkow, Yashar Ahmadian, Ken D. Miller
Abstract: Loopy belief propagation performs approximate inference on graphical models with loops. One might hope to compensate for the approximation by adjusting model parameters. Learning algorithms for this purpose have been explored previously, and the claim has been made that every set of locally consistent marginals can arise from belief propagation run on a graphical model. On the contrary, here we show that many probability distributions have marginals that cannot be reached by belief propagation using any set of model parameters or any learning algorithm. We call such marginals ‘unbelievable.’ This problem occurs whenever the Hessian of the Bethe free energy is not positive-definite at the target marginals. All learning algorithms for belief propagation necessarily fail in these cases, producing beliefs or sets of beliefs that may even be worse than the pre-learning approximation. We then show that averaging inaccurate beliefs, each obtained from belief propagation using model parameters perturbed about some learned mean values, can achieve the unbelievable marginals.
6 0.478037 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations
7 0.47438717 204 nips-2011-Online Learning: Stochastic, Constrained, and Smoothed Adversaries
8 0.47389632 258 nips-2011-Sparse Bayesian Multi-Task Learning
9 0.47381473 236 nips-2011-Regularized Laplacian Estimation and Fast Eigenvector Approximation
10 0.47233614 57 nips-2011-Comparative Analysis of Viterbi Training and Maximum Likelihood Estimation for HMMs
11 0.47225201 102 nips-2011-Generalised Coupled Tensor Factorisation
12 0.47189099 276 nips-2011-Structured sparse coding via lateral inhibition
13 0.46999598 186 nips-2011-Noise Thresholds for Spectral Clustering
14 0.4672201 71 nips-2011-Directed Graph Embedding: an Algorithm based on Continuous Limits of Laplacian-type Operators
15 0.46691886 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)
16 0.46661332 273 nips-2011-Structural equations and divisive normalization for energy-dependent component analysis
17 0.46524775 144 nips-2011-Learning Auto-regressive Models from Sequence and Non-sequence Data
18 0.46457633 75 nips-2011-Dynamical segmentation of single trials from population neural data
19 0.46450546 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data
20 0.46316117 257 nips-2011-SpaRCS: Recovering low-rank and sparse matrices from compressive measurements