nips nips2003 nips2003-174 knowledge-graph by maker-knowledge-mining

174 nips-2003-Semidefinite Relaxations for Approximate Inference on Graphs with Cycles


Source: pdf

Author: Michael I. Jordan, Martin J. Wainwright

Abstract: We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound on the marginal polytope. This combination leads to a log-determinant maximization problem that can be solved by efficient interior point methods [8]. As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. In contrast to Bethe/Kikuchi approaches, our variational problem is strictly convex and so has a unique global optimum. An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial margin for certain problem classes. Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3]. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Semidefinite relaxations for approximate inference on graphs with cycles Martin J. Wainwright [sent-1, score-0.236]

2 Abstract: We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound on the marginal polytope. [sent-7, score-0.92]

3 This combination leads to a log-determinant maximization problem that can be solved by efficient interior point methods [8]. [sent-8, score-0.178]

4 As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. [sent-9, score-0.202]

5 In contrast to Bethe/Kikuchi approaches, our variational problem is strictly convex and so has a unique global optimum. [sent-10, score-0.227]

6 An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. [sent-11, score-0.332]

7 In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial margin for certain problem classes. [sent-12, score-0.159]

8 Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3]. [sent-13, score-0.28]

9 Given a graphical model (e.g., Markov random field, factor graph), a key problem is the computation of marginal distributions. [sent-18, score-0.211]

10 Although highly efficient algorithms exist for trees, exact solutions are prohibitively complex for more general graphs of any substantial size. [sent-19, score-0.115]

11 This difficulty motivates the use of algorithms for computing approximations to marginal distributions, a problem to which we refer as approximate inference. [sent-20, score-0.327]

12 As shown in [12], it can be interpreted as a method for attempting to solve a variational problem wherein the exact entropy is replaced by the Bethe approximation. [sent-23, score-0.376]

13 For these methods [e.g., 6], the associated variational problems are typically not convex, thus leading to algorithmic complications and also raising the possibility of multiple local optima. [sent-28, score-0.196]

14 Unlike certain other variational methods (e.g., mean field [4]), the optimal values of Bethe-type variational problems fail to provide bounds on the log partition function. [sent-31, score-0.363]

15 This paper introduces a new class of variational problems that are convex and that provide upper bounds. [sent-33, score-0.286]

16 Our derivation relies on a Gaussian upper bound on the discrete entropy of a suitably regularized random vector, and a semidefinite outer bound on the set of valid marginal distributions. [sent-34, score-0.894]

17 The combination leads to a log-determinant maximization problem with a unique optimum that can be found by efficient interior point methods [8]. [sent-35, score-0.237]

18 As with the Bethe/Kikuchi approximations and sum-product algorithms, the optimizing arguments of this problem can be taken as approximations to the marginal distributions of the underlying graphical model. [sent-36, score-0.45]

19 Moreover, taking the “zero-temperature” limit recovers a class of well-known semidefinite programming relaxations for integer programming problems [e. [sent-37, score-0.181]

20 2 Problem set-up: We consider an undirected graph G = (V, E) with n = |V | nodes. [sent-40, score-0.108]

21 Associated with each vertex s ∈ V is a random variable xs taking values in the discrete space X = {0, 1, . . .}. [sent-41, score-0.259]

22 For some index set I, we let φ = {φα | α ∈ I} denote a collection of potential functions associated with the cliques of G, and let θ = {θα | α ∈ I} be a vector of parameters associated with these potential functions. [sent-47, score-0.244]

23 Here Φ(θ) is the log partition function that serves to normalize the distribution p(x; θ) ∝ exp{ Σα θα φα(x) }. [sent-49, score-0.157]

24 For example, one minimal representation of a binary-valued random vector on a graph with pairwise cliques is the standard Ising model, in which φ = {xs | s ∈ V } ∪ { xs xt | (s, t) ∈ E}. [sent-51, score-0.572]

25 Higher-order interactions can be captured by adding higher-order monomials (e.g., xs xt xu for a third-order interaction) to the collection of potential functions. [sent-55, score-0.434]
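
Since only fragments of the model definition survive extraction, a small brute-force sketch may help make the representation concrete. The code below is ours, not the authors' (all names are hypothetical); it instantiates the minimal Ising representation φ = {xs} ∪ {xs xt} on a toy 3-cycle and computes Φ(θ) and the mean parameters by exhaustive summation, which is feasible only for small n, precisely the limitation that motivates approximate inference.

```python
import itertools

import numpy as np

def ising_exact(theta_s, theta_st):
    """Brute-force Phi(theta) and mean parameters for a small Ising model.

    p(x; theta) is proportional to
    exp(sum_s theta_s[s]*x_s + sum_{s<t} theta_st[s,t]*x_s*x_t)
    over x in {-1, +1}^n; theta_st is symmetric with zero diagonal.
    """
    n = len(theta_s)
    X = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)  # all 2^n configs
    log_w = X @ theta_s + 0.5 * np.einsum("ks,st,kt->k", X, theta_st, X)   # log weights
    m = log_w.max()
    Phi = m + np.log(np.exp(log_w - m).sum())      # stable log partition function
    p = np.exp(log_w - Phi)                        # normalized probabilities
    mu_s = p @ X                                   # E[x_s]
    mu_st = X.T @ (p[:, None] * X)                 # E[x_s x_t]; diagonal equals 1
    return Phi, mu_s, mu_st

# toy 3-cycle with edges (0,1), (1,2), (0,2)
theta_s = np.array([0.1, -0.2, 0.3])
theta_st = np.zeros((3, 3))
for s, t in [(0, 1), (1, 2), (0, 2)]:
    theta_st[s, t] = theta_st[t, s] = 0.5
Phi, mu_s, mu_st = ising_exact(theta_s, theta_st)
```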

26 2.1 Duality and marginal polytopes: It is well known that Φ is convex in terms of θ, and strictly so for a minimal representation. [sent-58, score-0.336]

27 We refer to this set as the marginal polytope (see footnote 1) associated with the graph G and the potentials φ. [sent-70, score-0.351]

28 This relation (Eqn. (6), in which the sum ranges over x ∈ X^n) establishes that for µ in the relative interior of MARG(G; φ), the value of the conjugate dual Φ∗(µ) is given by the negative entropy of the distribution p(x; θ(µ)), where the pair θ(µ) and µ are dually coupled. [sent-73, score-0.482]

29 Moreover, we are guaranteed that the optimum of the variational problem (7) is attained uniquely at the exact marginals µ = {µα } of p(x; θ). [sent-77, score-0.236]

30 This variational formulation plays a central role in our development in the sequel. [sent-78, score-0.135]

31 2.2 Challenges with the variational formulation: There are two difficulties associated with the variational formulation (7). [sent-80, score-0.302]

32 First of all, observe that the (negative) entropy Φ∗, as a function of only the mean parameters µ, is implicitly defined; indeed, it is typically impossible to specify an explicit form for Φ∗. [sent-81, score-0.189]

33 Key exceptions are trees and hypertrees, for which the entropy is well-known to decompose into a sum of local entropies defined by local marginals on the (hyper)edges [1]. [sent-82, score-0.421]

34 Secondly, for a general graph with cycles, the marginal polytope MARG(G; φ) is defined by a number of inequalities that grows rapidly with graph size [e.g., …]. [sent-83, score-0.613]

35 Trees and hypertrees again are important exceptions: in this case, the junction tree theorem [e.g., 1] [sent-86, score-0.244]

36 provides a compact representation of the associated marginal polytopes. [sent-88, score-0.243]

37 The Bethe approach (and its generalizations) can be understood as consisting of two steps: (a) replacing the exact entropy −Φ∗ with a tree (or hypertree) approximation; and (b) replacing the marginal polytope MARG(G; φ) with constraint sets defined by tree (or hypertree) consistency conditions. [sent-89, score-0.791]

38 However, since the (hyper)tree approximations used do not bound the exact entropy, the optimal values of Bethe-type variational problems do not provide a bound on the value of the log partition function Φ(θ). [sent-90, score-0.685]

39 Bounding Φ requires both an outer bound on the marginal polytope and an upper bound on the entropy −Φ∗. [sent-91, score-0.862]

40 (Footnote 1) When φα corresponds to an indicator function, then µα is a marginal probability; otherwise, this choice entails a minor abuse of terminology. [sent-92, score-0.211]

41 3 Log-determinant relaxation: In this section, we state and prove a set of upper bounds based on the solution of a variational problem involving determinant maximization and semidefinite constraints. [sent-93, score-0.497]

42 It is also convenient to define all problems with respect to the complete graph Kn (i.e., the fully connected graph on n nodes). [sent-95, score-0.184]

43 We use the standard (minimal) Ising representation for a binary problem, in terms of the potential functions φ = {xs | s ∈ V } ∪ {xs xt | (s, t)}. [sent-98, score-0.15]

44 Of course, any problem can be embedded into the complete graph by setting to zero a subset of the {θst } parameters. [sent-100, score-0.155]

45 (In particular, for a graph G = (V, E), we simply set θst = 0 for all pairs (s, t) ∉ E.) [sent-101, score-0.108]

46 3.1 Outer bounds on the marginal polytope: We first focus on the marginal polytope MARG(Kn ) ≡ MARG(Kn ; φ) of valid dual variables {µs , µst }, as defined earlier. [sent-103, score-0.948]

47 In this section, we describe a set of semidefinite and linear constraints that any valid dual vector µ ∈ MARG(Kn ) must satisfy. [sent-105, score-0.245]

48 3.1.1 Semidefinite constraints: Given an arbitrary vector µ ∈ R^d, consider the following (n + 1) × (n + 1) moment matrix, whose first row and column are (1, µ1, . . . , µn), whose diagonal entries all equal 1, and whose remaining (s, t) entries are µst:

M1[µ] :=
[ 1     µ1     µ2    · · ·   µn   ]
[ µ1    1      µ12   · · ·   µ1n  ]
[ µ2    µ21    1     · · ·   µ2n  ]
[ ·     ·      ·     · · ·   ·    ]
[ µn    µn1    µn2   · · ·   1    ]     (8)

[sent-108, score-0.133]

49 The motivation underlying this definition is the following: suppose that the given dual vector µ actually belongs to MARG(Kn ), in which case there exists some distribution p(x; θ) such that µs = Σx p(x; θ) xs and µst = Σx p(x; θ) xs xt. [sent-138, score-0.622]

50 (Note in particular that the diagonal elements are all one, since xs^2 = 1 when xs ∈ {−1, +1}.) [sent-140, score-0.228]

51 Since any such moment matrix must be positive semidefinite (see footnote 2), we have established the following: Lemma 1 (Semidefinite outer bound). [sent-141, score-0.128]

52 The binary marginal polytope MARG(Kn ) is contained within the semidefinite constraint set SDEF1 := { µ ∈ R^d | M1[µ] ⪰ 0 } (9). This semidefinite relaxation can be further strengthened by including higher-order terms in the moment matrices [5]. [sent-142, score-0.57]
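
Lemma 1 is easy to verify numerically: if µ comes from a true distribution over {−1, +1}^n, then M1[µ] = E[(1, x)(1, x)^T] is a second-moment matrix and therefore positive semidefinite. A minimal sketch (ours; names hypothetical):

```python
import numpy as np

def moment_matrix(mu_s, mu_st):
    """Assemble the (n+1) x (n+1) moment matrix M1[mu] of Eqn. (8)."""
    n = len(mu_s)
    M = np.empty((n + 1, n + 1))
    M[0, 0] = 1.0
    M[0, 1:] = M[1:, 0] = mu_s          # first row and column: (1, mu_1, ..., mu_n)
    M[1:, 1:] = mu_st                   # off-diagonal entries: mu_st
    np.fill_diagonal(M[1:, 1:], 1.0)    # diagonal: x_s^2 = 1 for x_s in {-1, +1}
    return M

def in_SDEF1(mu_s, mu_st, tol=1e-9):
    """Membership test for the semidefinite outer bound SDEF1 of Eqn. (9)."""
    return np.linalg.eigvalsh(moment_matrix(mu_s, mu_st)).min() >= -tol
```

The exact mean parameters produced by ising_exact above pass this test, whereas an infeasible point such as µs = 0 with µst = −1 on all three edges of a triangle fails it (the smallest eigenvalue of M1[µ] is −1).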

53 3.1.2 Additional linear constraints: It is straightforward to augment these semidefinite constraints with additional linear constraints. [sent-145, score-0.202]

54 Pairwise edge constraints: It is natural to require that the subset of mean parameters associated with each pair of random variables (xs , xt ) — namely, µs , µt and µst — specify a valid pairwise marginal distribution. [sent-148, score-0.43]

55 It can be shown [11, 10] that the constraints of Eqn. (10) are necessary and sufficient to guarantee the existence of a consistent pairwise marginal. [sent-150, score-0.176]

56 By the junction tree theorem [1], this pairwise consistency guarantees that the constraints of Eqn. (10) [sent-151, score-0.371]

57 provide a complete description of the binary marginal polytope for any tree-structured graph. [sent-152, score-0.489]

58 Moreover, for a general graph with cycles, they are equivalent to the tree-consistent constraint set used in the Bethe approach [12] when applied to a binary vector x ∈ {−1, +1}n . [sent-153, score-0.185]

59 Triplet constraints: Local consistency can be extended to triplets {xs , xt , xu }, and even more generally to higher order subsets. [sent-154, score-0.154]

60 For the triplet case, consider the following set of constraints (and permutations thereof) among the pairwise mean parameters {µst , µsu , µtu }: µst + µsu + µtu ≥ −1, µst − µsu − µtu ≥ −1. (11) [sent-155, score-0.244]

61 It can be shown [11, 10] that these constraints, in conjunction with the pairwise constraints (10), are necessary and sufficient to ensure that the collection of mean parameters {µs , µt , µu , µst , µsu , µtu } uniquely determines a valid marginal over the triplet (xs , xt , xu ). [sent-156, score-0.668]

62 Once again, by applying the junction tree theorem [1], we conclude that the constraints (10) and (11) provide a complete characterization of the binary marginal polytope for hypertrees of width two. [sent-157, score-0.834]

63 It is worthwhile observing that this set of constraints is equivalent to those that are implicitly enforced by any Kikuchi approximation [12] with clusters of size three (when applied to a binary problem). [sent-158, score-0.146]
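
The display of Eqn. (10) did not survive extraction. One standard explicit form of pairwise consistency for x ∈ {−1, +1}, which we believe is what (10) states although this is our reconstruction, requires the four implied joint probabilities p(xs = a, xt = b) = (1 + a·µs + b·µt + ab·µst)/4 to be nonnegative. A sketch of both the pairwise and the triplet checks:

```python
import itertools

def pairwise_ok(mu_s, mu_t, mu_st, tol=1e-9):
    """Pairwise consistency: every entry of the implied 2x2 joint table
    p(x_s=a, x_t=b) = (1 + a*mu_s + b*mu_t + a*b*mu_st)/4 is nonnegative
    (our explicit reading of Eqn. (10))."""
    return all(1 + a * mu_s + b * mu_t + a * b * mu_st >= -tol
               for a, b in itertools.product((-1, 1), repeat=2))

def triplet_ok(mu_st, mu_su, mu_tu, tol=1e-9):
    """Triplet constraints of Eqn. (11) together with their permutations."""
    return (mu_st + mu_su + mu_tu >= -1 - tol and
            mu_st - mu_su - mu_tu >= -1 - tol and
            -mu_st + mu_su - mu_tu >= -1 - tol and
            -mu_st - mu_su + mu_tu >= -1 - tol)
```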

64 3.2 Gaussian entropy bound: We now turn to the task of upper-bounding the entropy. [sent-160, score-0.408]

65 Our starting point is the familiar interpretation of the Gaussian as the maximum entropy distribution subject to covariance constraints: Lemma 2. [sent-161, score-0.189]

66 The (differential) entropy h(x) := −∫ p(x) log p(x) dx is upper bounded by the entropy (1/2) log det cov(x) + (n/2) log(2πe) of a Gaussian with matched covariance. [sent-162, score-0.71]

67 Of interest to us is the discrete entropy of a discrete-valued random vector x ∈ {−1, +1}^n, whereas the Gaussian bound of Lemma 2 applies to the differential entropy of a continuous-valued random vector. [sent-163, score-0.599]

68 We have h(x̃) = H(x), where h and H denote the differential and discrete entropies of the regularized vector x̃ and of x, respectively. [sent-168, score-0.11]

69 By construction, the differential entropy can be decomposed as a sum of integrals over hyperboxes of unit volume, one for each configuration, over which the probability density of x̃ is constant. [sent-170, score-0.232]

70 3.3 Log-determinant relaxation: Equipped with these building blocks, we are now ready to state and prove a log-determinant relaxation for the log partition function. [sent-173, score-0.413]

71 Let x ∈ {−1, +1}n , and let OUT(Kn ) be any convex outer bound on MARG(Kn ) that is contained within SDEF1 . [sent-175, score-0.305]

72 Then there holds

Φ(θ) ≤ max_{µ∈OUT(Kn)} { ⟨θ, µ⟩ + (1/2) log det( M1[µ] + (1/3)·blkdiag[0, In] ) } + (n/2)·log(πe/2)   (12)

where blkdiag[0, In] is the (n+1)×(n+1) block-diagonal matrix with a zero in the upper-left entry and the identity In in the lower-right block. [sent-176, score-0.175]

73 From Lemma 3, we have H(x) = h(x̃); combining this equality with Lemma 2, we obtain the upper bound H(x) ≤ (1/2) log det cov(x̃) + (n/2) log(2πe). [sent-181, score-0.35]

74 Next we use the Schur complement formula [8] to express the log determinant as follows: log det cov(x̃) = log det( M1[µ] + (1/3)·blkdiag[0, In] ) + n·log(1/4). (13) [sent-183, score-0.263]

75 Combining Eqn. (13) with the Gaussian upper bound leads to the following expression: H(x) ≤ (1/2) log det( M1[µ] + (1/3)·blkdiag[0, In] ) + (n/2)·log(πe/2). Substituting this upper bound into the variational representation of Eqn. [sent-185, score-1.01]

76 (7) and using the fact that OUT(Kn ) is an outer bound on MARG(G) yields Eqn. (12). [sent-186, score-0.243]

77 By construction, the cost function is strictly convex so that the optimum is unique. [sent-188, score-0.151]

78 The optimization problem (12) is a determinant maximization problem, for which efficient interior point methods have been developed [e.g., 8]. [sent-191, score-0.247]
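
On toy instances the bound can be sanity-checked end to end: the exact mean parameters lie in OUT(Kn), so the objective of Eqn. (12) evaluated at them lower-bounds the maximum over OUT(Kn), and by the entropy bound this value should already exceed Φ(θ). A sketch reusing ising_exact and moment_matrix from above (ours, and it assumes our reconstruction of Eqn. (12) is accurate):

```python
import numpy as np

def logdet_objective(theta_s, theta_st, mu_s, mu_st):
    """Right-hand side of (our reconstruction of) Eqn. (12) at a given mu."""
    n = len(mu_s)
    inner = theta_s @ mu_s + 0.5 * np.sum(theta_st * mu_st)  # <theta, mu>; each edge counted once
    M = moment_matrix(mu_s, mu_st)
    M[1:, 1:] += np.eye(n) / 3.0                             # add (1/3)*blkdiag[0, I_n]
    _, logdet = np.linalg.slogdet(M)
    return inner + 0.5 * logdet + 0.5 * n * np.log(np.pi * np.e / 2)

# theta_s, theta_st are the toy 3-cycle parameters defined earlier
Phi, mu_s, mu_st = ising_exact(theta_s, theta_st)
assert logdet_objective(theta_s, theta_st, mu_s, mu_st) >= Phi  # upper bound holds
```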

79 4 Experimental results: The relevance of the log-determinant relaxation for applications is two-fold: it provides upper bounds on the log partition function, and the maximizing arguments µ ∈ OUT(Kn ) of Eqn. (12) [sent-194, score-0.425]

80 can be taken as approximations to the exact marginals of the distribution p(x; θ). [sent-195, score-0.23]

81 To test its performance in computing approximate marginals, we performed extensive experiments on the complete graph (fully connected) and the 2-D nearest-neighbor lattice model. [sent-196, score-0.189]

82 For a given coupling strength dcoup > 0, we investigated three possible types of coupling: (a) for repulsive interactions, θst ∼ U[−2dcoup , 0]; (b) for mixed interactions, θst ∼ U[−dcoup , +dcoup ]; (c) for attractive interactions, θst ∼ U[0, 2dcoup ]. [sent-202, score-0.277]
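
A sketch of the instance generator implied by this description (ours; the distribution θs ∼ U[−dobs, dobs] for the single-node potentials is an assumption, since the extracted text only says potential strength was controlled by a pair (dobs, dcoup)):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_theta(n, edges, d_obs, d_coup, coupling="mixed"):
    """Draw a random Ising instance with the stated coupling types."""
    lo, hi = {"repulsive": (-2 * d_coup, 0.0),
              "mixed": (-d_coup, d_coup),
              "attractive": (0.0, 2 * d_coup)}[coupling]
    theta_s = rng.uniform(-d_obs, d_obs, size=n)   # assumed single-node distribution
    theta_st = np.zeros((n, n))
    for s, t in edges:                             # non-edges stay 0 (embedding in Kn)
        theta_st[s, t] = theta_st[t, s] = rng.uniform(lo, hi)
    return theta_s, theta_st
```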

83 For each distribution p(x; θ), we performed the following computations: (a) the exact marginal probability p(xs = 1; θ) at each node; and (b) approximate marginals computed (Footnote 4: here U[a, b] denotes the uniform distribution on [a, b]) [sent-203, score-0.393]

84 from the Bethe approximation with the sum-product algorithm, or (c) log-determinant approximate marginals from Theorem 1 using the outer bound OUT(Kn ) given by the first semidefinite relaxation SDEF1 in conjunction with the pairwise linear constraints in Eqn. (10). [sent-204, score-0.677]

85 We computed the exact marginal values either by exhaustive summation (complete graph), or by the junction tree algorithm (lattices). [sent-206, score-0.382]

86 The log-determinant problem of Theorem 1 was solved using interior point methods [8]. [sent-209, score-0.115]
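
For readers who want to reproduce the LD method today, the relaxation can also be posed in an off-the-shelf convex modeling layer. The following sketch is ours (the authors used the MAXDET-type interior point methods of [8], not this library); it assumes the cvxpy package with the SCS solver installed, and takes OUT(Kn) to be SDEF1 plus the pairwise constraints, as in the experiments:

```python
import cvxpy as cp
import numpy as np

def logdet_relaxation(theta_s, theta_st):
    """Solve a sketch of Eqn. (12) with OUT(Kn) = SDEF1 + pairwise constraints."""
    n = len(theta_s)
    mu = cp.Variable(n)                       # mu_s = E[x_s]
    S = cp.Variable((n, n), symmetric=True)   # S[s, t] = mu_st = E[x_s x_t]
    M = cp.bmat([[np.ones((1, 1)), cp.reshape(mu, (1, n))],
                 [cp.reshape(mu, (n, 1)), S]])
    B = np.zeros((n + 1, n + 1))
    B[1:, 1:] = np.eye(n)                     # blkdiag[0, I_n]
    cons = [cp.diag(S) == 1, M >> 0]          # x_s^2 = 1 and M1[mu] PSD (SDEF1)
    for s in range(n):                        # pairwise constraints (Eqn. (10))
        for t in range(s + 1, n):
            for a in (-1, 1):
                for b in (-1, 1):
                    cons.append(1 + a * mu[s] + b * mu[t] + a * b * S[s, t] >= 0)
    obj = (theta_s @ mu + 0.5 * cp.sum(cp.multiply(theta_st, S))
           + 0.5 * cp.log_det(M + B / 3.0))
    prob = cp.Problem(cp.Maximize(obj), cons)
    prob.solve(solver=cp.SCS)
    bound = prob.value + 0.5 * n * np.log(np.pi * np.e / 2)  # constant term of (12)
    return bound, mu.value, S.value
```

Approximate node marginals are then read off as p(xs = 1) ≈ (1 + µs)/2.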

87 For each graph (fully connected or grid), we examined a total of 6 conditions: 2 different potential strengths (weak or strong) for each of the 3 types of coupling (attractive, mixed, and repulsive). [sent-210, score-0.231]

88 We computed the ℓ1-error (1/n) Σ_{s=1}^{n} |p(xs = 1; θ) − µs |, where µs was the approximate marginal computed either by SP or by LD. [sent-211, score-0.245]
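
The metric itself, including the conversion p(xs = 1) = (1 + E[xs])/2 from ±1 mean parameters to probabilities (our assumption about the convention being compared):

```python
import numpy as np

def l1_error(p_exact, mu_approx):
    """(1/n) * sum_s |p(x_s = 1) - p_hat(x_s = 1)| for approximate means mu_approx."""
    p_approx = (1.0 + np.asarray(mu_approx)) / 2.0
    return float(np.mean(np.abs(np.asarray(p_exact) - p_approx)))
```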

89 (Table caption) Statistics of the ℓ1-approximation error for the sum-product (SP) and log-determinant (LD) methods for the fully connected graph K16, as well as the 4-nearest-neighbor grid with 16 nodes, with varying coupling and potential strengths. [sent-309, score-0.259]

90 The potential strength is given as the pair (dobs , dcoup ); note that dobs = 0. [sent-311, score-0.23]

91 (Footnote 5) More precisely, we updated messages in the log domain as γ·log Mst^new + (1 − γ)·log Mst^old. [sent-322, score-0.449]
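
Footnote 5 describes standard geometric damping of the sum-product message updates; a one-line sketch (ours):

```python
import numpy as np

def damp(log_m_new, log_m_old, gamma=0.5):
    """Damped update in the log domain: gamma*log(m_new) + (1-gamma)*log(m_old)."""
    return gamma * np.asarray(log_m_new) + (1.0 - gamma) * np.asarray(log_m_old)
```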

92 5 Discussion: In this paper, we developed a new method for approximate inference based on the combination of a Gaussian entropy bound with semidefinite constraints on the marginal polytope. [sent-323, score-0.65]

93 The resultant log-determinant maximization problem can be solved by efficient interior point methods [8]. [sent-324, score-0.178]

94 It can be shown [11, 10] that in the zero-temperature limit, the log-determinant relaxation (12) reduces to a class of semidefinite relaxations that are widely used in combinatorial optimization. [sent-327, score-0.237]

95 One open question is whether techniques for bounding the performance of such semidefinite relaxations [e.g., …] [sent-328, score-0.153]

96 It remains to develop a deeper understanding of the interaction between the two components of these approximations (i.e., [sent-332, score-0.11]

97 the entropy bound and the outer bound on the marginal polytope), as well as how to tailor approximations to particular graph structures. [sent-334, score-0.833]

98 Finally, semidefinite constraints can be combined with entropy approximations (preferably convex) other than the Gaussian bound used in this paper, among them “convexified” Bethe/Kikuchi entropy approximations [9]. [sent-335, score-0.758]

99 A new class of upper bounds on the log partition function. [sent-397, score-0.259]

100 Semidefinite relaxations for approximate inference on graphs with cycles. [sent-412, score-0.175]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('marg', 0.417), ('semide', 0.331), ('xs', 0.228), ('kn', 0.214), ('marginal', 0.211), ('entropy', 0.189), ('polytope', 0.186), ('st', 0.158), ('variational', 0.135), ('outer', 0.128), ('relaxation', 0.128), ('sp', 0.122), ('interior', 0.115), ('bound', 0.115), ('bethe', 0.11), ('relaxations', 0.109), ('graph', 0.108), ('blkdiag', 0.104), ('dcoup', 0.104), ('ld', 0.103), ('constraints', 0.101), ('log', 0.097), ('marginals', 0.096), ('coupling', 0.085), ('approximations', 0.082), ('cov', 0.082), ('det', 0.078), ('hypertrees', 0.078), ('pairwise', 0.075), ('nite', 0.075), ('tu', 0.072), ('berkeley', 0.069), ('determinant', 0.069), ('exceptions', 0.068), ('triplet', 0.068), ('dual', 0.067), ('xt', 0.067), ('maximization', 0.063), ('rd', 0.062), ('convex', 0.062), ('tree', 0.062), ('cycles', 0.061), ('upper', 0.06), ('partition', 0.06), ('optimum', 0.059), ('wainwright', 0.059), ('xu', 0.058), ('junction', 0.057), ('uc', 0.057), ('lemma', 0.053), ('dobs', 0.052), ('repulsive', 0.052), ('exact', 0.052), ('trials', 0.05), ('su', 0.049), ('complete', 0.047), ('theorem', 0.047), ('hypertree', 0.045), ('deza', 0.045), ('binary', 0.045), ('valid', 0.045), ('bounding', 0.044), ('collection', 0.043), ('differential', 0.043), ('interactions', 0.043), ('recovers', 0.043), ('bounds', 0.042), ('median', 0.041), ('yedidia', 0.041), ('laurent', 0.041), ('ising', 0.041), ('hyper', 0.041), ('coupled', 0.039), ('potential', 0.038), ('arguments', 0.038), ('graphical', 0.037), ('relation', 0.037), ('entropies', 0.036), ('strength', 0.036), ('conjugate', 0.035), ('approximate', 0.034), ('minimal', 0.033), ('moreover', 0.033), ('graphs', 0.032), ('vector', 0.032), ('associated', 0.032), ('trees', 0.032), ('discrete', 0.031), ('substantial', 0.031), ('generalizations', 0.03), ('strictly', 0.03), ('attained', 0.029), ('cliques', 0.029), ('princeton', 0.029), ('january', 0.029), ('consistency', 0.029), ('problems', 0.029), ('ranges', 0.028), ('interaction', 0.028), ('fully', 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 174 nips-2003-Semidefinite Relaxations for Approximate Inference on Graphs with Cycles

Author: Michael I. Jordan, Martin J. Wainwright

Abstract: We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound on the marginal polytope. This combination leads to a log-determinant maximization problem that can be solved by efficient interior point methods [8]. As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. In contrast to Bethe/Kikuchi approaches, our variational problem is strictly convex and so has a unique global optimum. An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial margin for certain problem classes. Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3]. 1

2 0.2309486 171 nips-2003-Semi-Definite Programming by Perceptron Learning

Author: Thore Graepel, Ralf Herbrich, Andriy Kharechko, John S. Shawe-taylor

Abstract: We present a modified version of the perceptron learning algorithm (PLA) which solves semidefinite programs (SDPs) in polynomial time. The algorithm is based on the following three observations: (i) Semidefinite programs are linear programs with infinitely many (linear) constraints; (ii) every linear program can be solved by a sequence of constraint satisfaction problems with linear constraints; (iii) in general, the perceptron learning algorithm solves a constraint satisfaction problem with linear constraints in finitely many updates. Combining the PLA with a probabilistic rescaling algorithm (which, on average, increases the size of the feasable region) results in a probabilistic algorithm for solving SDPs that runs in polynomial time. We present preliminary results which demonstrate that the algorithm works, but is not competitive with state-of-the-art interior point methods. 1

3 0.14737134 189 nips-2003-Tree-structured Approximations by Expectation Propagation

Author: Yuan Qi, Tom Minka

Abstract: Approximation structure plays an important role in inference on loopy graphs. As a tractable structure, tree approximations have been utilized in the variational method of Ghahramani & Jordan (1997) and the sequential projection method of Frey et al. (2000). However, belief propagation represents each factor of the graph with a product of single-node messages. In this paper, belief propagation is extended to represent factors with tree approximations, by way of the expectation propagation framework. That is, each factor sends a “message” to all pairs of nodes in a tree structure. The result is more accurate inferences and more frequent convergence than ordinary belief propagation, at a lower cost than variational trees or double-loop algorithms. 1

4 0.14731492 32 nips-2003-Approximate Expectation Maximization

Author: Tom Heskes, Onno Zoeter, Wim Wiegerinck

Abstract: We discuss the integration of the expectation-maximization (EM) algorithm for maximum likelihood learning of Bayesian networks with belief propagation algorithms for approximate inference. Specifically we propose to combine the outer-loop step of convergent belief propagation algorithms with the M-step of the EM algorithm. This then yields an approximate EM algorithm that is essentially still double loop, with the important advantage of an inner loop that is guaranteed to converge. Simulations illustrate the merits of such an approach. 1

5 0.13005349 193 nips-2003-Variational Linear Response

Author: Manfred Opper, Ole Winther

Abstract: A general linear response method for deriving improved estimates of correlations in the variational Bayes framework is presented. Three applications are given and it is discussed how to use linear response as a general principle for improving mean field approximations.

6 0.12206578 96 nips-2003-Invariant Pattern Recognition by Semi-Definite Programming Machines

7 0.11308224 155 nips-2003-Perspectives on Sparse Bayesian Learning

8 0.10556773 94 nips-2003-Information Maximization in Noisy Channels : A Variational Approach

9 0.097616047 142 nips-2003-On the Concentration of Expectation and Approximate Inference in Layered Networks

10 0.089030102 64 nips-2003-Estimating Internal Variables and Paramters of a Learning Agent by a Particle Filter

11 0.088562056 177 nips-2003-Simplicial Mixtures of Markov Chains: Distributed Modelling of Dynamic User Profiles

12 0.08677955 152 nips-2003-Pairwise Clustering and Graphical Models

13 0.084868424 117 nips-2003-Linear Response for Approximate Inference

14 0.081364155 167 nips-2003-Robustness in Markov Decision Problems with Uncertain Transition Matrices

15 0.081145048 124 nips-2003-Max-Margin Markov Networks

16 0.080805793 51 nips-2003-Design of Experiments via Information Theory

17 0.078047037 169 nips-2003-Sample Propagation

18 0.077406339 107 nips-2003-Learning Spectral Clustering

19 0.073883787 30 nips-2003-Approximability of Probability Distributions

20 0.071871608 48 nips-2003-Convex Methods for Transduction


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.241), (1, -0.058), (2, -0.115), (3, 0.122), (4, 0.176), (5, -0.09), (6, 0.093), (7, 0.021), (8, -0.003), (9, -0.015), (10, -0.009), (11, -0.1), (12, 0.012), (13, 0.046), (14, -0.126), (15, 0.018), (16, -0.097), (17, 0.024), (18, -0.078), (19, -0.111), (20, -0.049), (21, -0.026), (22, -0.134), (23, -0.28), (24, 0.048), (25, -0.03), (26, 0.193), (27, 0.099), (28, -0.04), (29, 0.042), (30, -0.036), (31, 0.073), (32, 0.011), (33, 0.002), (34, -0.051), (35, 0.006), (36, -0.094), (37, 0.121), (38, -0.058), (39, 0.001), (40, -0.009), (41, -0.11), (42, 0.075), (43, -0.017), (44, -0.049), (45, 0.051), (46, 0.027), (47, 0.039), (48, 0.072), (49, -0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95970869 174 nips-2003-Semidefinite Relaxations for Approximate Inference on Graphs with Cycles

Author: Michael I. Jordan, Martin J. Wainwright

Abstract: We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound on the marginal polytope. This combination leads to a log-determinant maximization problem that can be solved by efficient interior point methods [8]. As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. In contrast to Bethe/Kikuchi approaches, our variational problem is strictly convex and so has a unique global optimum. An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial margin for certain problem classes. Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3]. 1

2 0.77643514 171 nips-2003-Semi-Definite Programming by Perceptron Learning

Author: Thore Graepel, Ralf Herbrich, Andriy Kharechko, John S. Shawe-taylor

Abstract: We present a modified version of the perceptron learning algorithm (PLA) which solves semidefinite programs (SDPs) in polynomial time. The algorithm is based on the following three observations: (i) Semidefinite programs are linear programs with infinitely many (linear) constraints; (ii) every linear program can be solved by a sequence of constraint satisfaction problems with linear constraints; (iii) in general, the perceptron learning algorithm solves a constraint satisfaction problem with linear constraints in finitely many updates. Combining the PLA with a probabilistic rescaling algorithm (which, on average, increases the size of the feasable region) results in a probabilistic algorithm for solving SDPs that runs in polynomial time. We present preliminary results which demonstrate that the algorithm works, but is not competitive with state-of-the-art interior point methods. 1

3 0.50493139 32 nips-2003-Approximate Expectation Maximization

Author: Tom Heskes, Onno Zoeter, Wim Wiegerinck

Abstract: We discuss the integration of the expectation-maximization (EM) algorithm for maximum likelihood learning of Bayesian networks with belief propagation algorithms for approximate inference. Specifically we propose to combine the outer-loop step of convergent belief propagation algorithms with the M-step of the EM algorithm. This then yields an approximate EM algorithm that is essentially still double loop, with the important advantage of an inner loop that is guaranteed to converge. Simulations illustrate the merits of such an approach. 1

4 0.50112027 189 nips-2003-Tree-structured Approximations by Expectation Propagation

Author: Yuan Qi, Tom Minka

Abstract: Approximation structure plays an important role in inference on loopy graphs. As a tractable structure, tree approximations have been utilized in the variational method of Ghahramani & Jordan (1997) and the sequential projection method of Frey et al. (2000). However, belief propagation represents each factor of the graph with a product of single-node messages. In this paper, belief propagation is extended to represent factors with tree approximations, by way of the expectation propagation framework. That is, each factor sends a “message” to all pairs of nodes in a tree structure. The result is more accurate inferences and more frequent convergence than ordinary belief propagation, at a lower cost than variational trees or double-loop algorithms. 1

5 0.49684796 96 nips-2003-Invariant Pattern Recognition by Semi-Definite Programming Machines

Author: Thore Graepel, Ralf Herbrich

Abstract: Knowledge about local invariances with respect to given pattern transformations can greatly improve the accuracy of classification. Previous approaches are either based on regularisation or on the generation of virtual (transformed) examples. We develop a new framework for learning linear classifiers under known transformations based on semidefinite programming. We present a new learning algorithm— the Semidefinite Programming Machine (SDPM)—which is able to find a maximum margin hyperplane when the training examples are polynomial trajectories instead of single points. The solution is found to be sparse in dual variables and allows to identify those points on the trajectory with minimal real-valued output as virtual support vectors. Extensions to segments of trajectories, to more than one transformation parameter, and to learning with kernels are discussed. In experiments we use a Taylor expansion to locally approximate rotational invariance in pixel images from USPS and find improvements over known methods. 1

6 0.49424246 155 nips-2003-Perspectives on Sparse Bayesian Learning

7 0.46221116 74 nips-2003-Finding the M Most Probable Configurations using Loopy Belief Propagation

8 0.44795051 48 nips-2003-Convex Methods for Transduction

9 0.43945083 94 nips-2003-Information Maximization in Noisy Channels : A Variational Approach

10 0.43815595 30 nips-2003-Approximability of Probability Distributions

11 0.38821504 193 nips-2003-Variational Linear Response

12 0.36550948 142 nips-2003-On the Concentration of Expectation and Approximate Inference in Layered Networks

13 0.36378357 116 nips-2003-Linear Program Approximations for Factored Continuous-State Markov Decision Processes

14 0.35701269 152 nips-2003-Pairwise Clustering and Graphical Models

15 0.35392493 169 nips-2003-Sample Propagation

16 0.34343952 167 nips-2003-Robustness in Markov Decision Problems with Uncertain Transition Matrices

17 0.34317562 134 nips-2003-Near-Minimax Optimal Classification with Dyadic Classification Trees

18 0.32595107 168 nips-2003-Salient Boundary Detection using Ratio Contour

19 0.31870189 177 nips-2003-Simplicial Mixtures of Markov Chains: Distributed Modelling of Dynamic User Profiles

20 0.30062905 44 nips-2003-Can We Learn to Beat the Best Stock


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.073), (11, 0.011), (29, 0.011), (30, 0.018), (35, 0.072), (53, 0.074), (69, 0.014), (71, 0.077), (76, 0.04), (85, 0.092), (91, 0.083), (99, 0.356)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.93397498 167 nips-2003-Robustness in Markov Decision Problems with Uncertain Transition Matrices

Author: Arnab Nilim, Laurent El Ghaoui

Abstract: Optimal solutions to Markov Decision Problems (MDPs) are very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of those probabilities is far from accurate. Hence, estimation errors are limiting factors in applying MDPs to realworld problems. We propose an algorithm for solving finite-state and finite-action MDPs, where the solution is guaranteed to be robust with respect to estimation errors on the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as the original Bellman recursion. Hence, robustness can be added at practically no extra computing cost.

same-paper 2 0.89077532 174 nips-2003-Semidefinite Relaxations for Approximate Inference on Graphs with Cycles

Author: Michael I. Jordan, Martin J. Wainwright

Abstract: We present a new method for calculating approximate marginals for probability distributions defined by graphs with cycles, based on a Gaussian entropy bound combined with a semidefinite outer bound on the marginal polytope. This combination leads to a log-determinant maximization problem that can be solved by efficient interior point methods [8]. As with the Bethe approximation and its generalizations [12], the optimizing arguments of this problem can be taken as approximations to the exact marginals. In contrast to Bethe/Kikuchi approaches, our variational problem is strictly convex and so has a unique global optimum. An additional desirable feature is that the value of the optimal solution is guaranteed to provide an upper bound on the log partition function. In experimental trials, the performance of the log-determinant relaxation is comparable to or better than the sum-product algorithm, and by a substantial margin for certain problem classes. Finally, the zero-temperature limit of our log-determinant relaxation recovers a class of well-known semidefinite relaxations for integer programming [e.g., 3]. 1

3 0.77613437 191 nips-2003-Unsupervised Context Sensitive Language Acquisition from a Large Corpus

Author: Zach Solan, David Horn, Eytan Ruppin, Shimon Edelman

Abstract: We describe a pattern acquisition algorithm that learns, in an unsupervised fashion, a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structured knowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentences as paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented by trees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected to a standard test of English as a Second Language (ESL) proficiency. The results are encouraging: the model attains a level of performance considered to be “intermediate” for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children. 1

4 0.6722225 158 nips-2003-Policy Search by Dynamic Programming

Author: J. A. Bagnell, Sham M. Kakade, Jeff G. Schneider, Andrew Y. Ng

Abstract: We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to visit each state), then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide non-trivial performance guarantees. We also demonstrate this algorithm on several grid-world POMDPs, a planar biped walking robot, and a double-pole balancing problem. 1

5 0.61767364 34 nips-2003-Approximate Policy Iteration with a Policy Language Bias

Author: Alan Fern, Sungwook Yoon, Robert Givan

Abstract: We explore approximate policy iteration, replacing the usual costfunction learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve. In particular, we induce high-quality domain-specific planners for classical planning domains (both deterministic and stochastic variants) by solving such domains as extremely large MDPs. 1

6 0.5885309 78 nips-2003-Gaussian Processes in Reinforcement Learning

7 0.58844733 116 nips-2003-Linear Program Approximations for Factored Continuous-State Markov Decision Processes

8 0.58411801 42 nips-2003-Bounded Finite State Controllers

9 0.57228458 121 nips-2003-Log-Linear Models for Label Ranking

10 0.57185352 189 nips-2003-Tree-structured Approximations by Expectation Propagation

11 0.56882018 101 nips-2003-Large Margin Classifiers: Convex Loss, Low Noise, and Convergence Rates

12 0.56134862 103 nips-2003-Learning Bounds for a Generalized Family of Bayesian Posterior Distributions

13 0.55591542 151 nips-2003-PAC-Bayesian Generic Chaining

14 0.54926246 30 nips-2003-Approximability of Probability Distributions

15 0.54244721 177 nips-2003-Simplicial Mixtures of Markov Chains: Distributed Modelling of Dynamic User Profiles

16 0.53851449 134 nips-2003-Near-Minimax Optimal Classification with Dyadic Classification Trees

17 0.53752476 48 nips-2003-Convex Methods for Transduction

18 0.53560525 63 nips-2003-Error Bounds for Transductive Learning via Compression and Clustering

19 0.53524172 50 nips-2003-Denoising and Untangling Graphs Using Degree Priors

20 0.53463233 146 nips-2003-Online Learning of Non-stationary Sequences