nips nips2008 nips2008-104 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Philip Torr, M. P. Kumar
Abstract: We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. For this problem, we propose an improved st-MINCUT based move making algorithm. Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. Compared to previous approaches based on the LP relaxation, e.g. interior-point algorithms or treereweighted message passing (TRW), our method is faster as it uses only the efficient st-MINCUT algorithm in its design. Furthermore, it directly provides us with a primal solution (unlike TRW and other related methods which solve the dual of the LP). We demonstrate the effectiveness of the proposed approach on both synthetic and standard real data problems. Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as α-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. We believe that further explorations in this direction would help design efficient algorithms for more complex relaxations.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. [sent-12, score-0.734]
2 Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. [sent-14, score-0.275]
3 Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as α-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. [sent-20, score-0.4]
4 The word ‘discrete’ refers to the fact that each of the random variables va ∈ v = {v0 , · · · , vn−1 } can take one label from a discrete set l = {l0 , · · · , lh−1 }. [sent-28, score-0.257]
5 An MRF defines a neighbourhood relationship (denoted by E) over the random variables such that (a, b) ∈ E if, and only if, va and vb are neighbouring random variables. [sent-30, score-0.418]
6 Given an MRF, a labelling refers to a function f such that f : {0, · · · , n − 1} −→ {0, · · · , h − 1}. [sent-31, score-0.34]
7 In other words, the function f assigns to each random variable va ∈ v a label l_{f(a)} ∈ l. [sent-32, score-0.236]
8 The probability of the labelling is given by the following Gibbs distribution: Pr(f, D|θ) = exp(−Q(f, D; θ))/Z(θ), where θ is the parameter of the MRF and Z(θ) is the normalization constant (i.e. the partition function). [sent-33, score-0.34]
9 Assuming a pairwise MRF, the Gibbs energy is given by: Q(f, D; θ) = Σ_{va∈v} θ¹a;f(a) + Σ_{(a,b)∈E} θ²ab;f(a)f(b), (1) where θ¹a;f(a) and θ²ab;f(a)f(b) are the unary and pairwise potentials respectively. [sent-36, score-0.799]
10 The superscripts ‘1’ and ‘2’ indicate that the unary potential depends on the labelling of one random variable at a time, while the pairwise potential depends on the labelling of two neighbouring random variables. [sent-37, score-1.007]
11 Clearly, the labelling f which maximizes the posterior Pr(f, D|θ) can be obtained by minimizing the Gibbs energy. [sent-38, score-0.34]
12 The problem of obtaining such a labelling f is known as maximum a posteriori (MAP) estimation. [sent-39, score-0.34]
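To make equation (1) concrete, the following is a minimal sketch of the Gibbs energy computation; the dictionary layout for the unary and pairwise potentials is a hypothetical choice for illustration, not something the paper specifies.

```python
# Minimal sketch of equation (1): the Gibbs energy of a labelling f.
# Assumed data layout: theta1[a][i] for unary terms, theta2[(a, b)][(i, j)]
# for pairwise terms; f maps each variable index a to its label index.
def gibbs_energy(f, theta1, theta2, edges):
    unary = sum(theta1[a][f[a]] for a in f)
    pairwise = sum(theta2[(a, b)][(f[a], f[b])] for (a, b) in edges)
    return unary + pairwise
```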
13 In this paper, we consider the problem of MAP estimation of random fields where the pairwise potentials are defined by truncated convex models [4]. [sent-40, score-0.754]
14 Formally speaking, the pairwise potentials are of the form θ²ab;f(a)f(b) = wab min{d(f(a) − f(b)), M}, (2) where wab ≥ 0 for all (a, b) ∈ E, d(·) is a convex function and M > 0 is the truncation factor. [sent-41, score-0.825]
15 Examples of pairwise potentials of this form include the truncated linear metric and the truncated quadratic semi-metric, i.e. [sent-45, score-1.008]
16 θ²ab;f(a)f(b) = wab min{|f(a) − f(b)|, M}, θ²ab;f(a)f(b) = wab min{(f(a) − f(b))², M}. (3) [sent-47, score-0.394]
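As a small illustration of equations (2) and (3), the two standard truncated potentials can be written directly; this is a sketch in which i and j are integer label indices.

```python
# Truncated linear metric and truncated quadratic semi-metric of equation
# (3); w_ab >= 0 is the edge weight and M > 0 the truncation factor.
def truncated_linear(i, j, w_ab, M):
    return w_ab * min(abs(i - j), M)

def truncated_quadratic(i, j, w_ab, M):
    return w_ab * min((i - j) ** 2, M)
```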
17 Before proceeding further, we would like to note here that the method presented in this paper can be trivially extended to truncated submodular models (a generalization of truncated convex models). [sent-48, score-0.712]
18 However, we will restrict our discussion to truncated convex models for two reasons: (i) it makes the analysis of our approach easier; and (ii) truncated convex pairwise potentials are commonly used in several problems such as stereo reconstruction, image denoising and inpainting [22]. [sent-49, score-1.183]
19 Given their widespread use, it is not surprising that several approximate MAP estimation algorithms have been proposed in the literature for the truncated convex model. [sent-54, score-0.443]
20 1 Related Work Given a random field with truncated convex pairwise potentials, Felzenszwalb and Huttenlocher [6] improved the efficiency of the popular max-product belief propagation (BP) algorithm [19] to obtain the MAP estimate. [sent-57, score-0.531]
21 However, for a general MRF, BP provides no bounds on the quality of the approximate MAP labelling obtained. [sent-61, score-0.379]
22 the primal solution (i.e. the labelling f) is often obtained from the dual solution in an unprincipled manner. [sent-69, score-0.373]
23 [5] showed that when using certain randomized rounding schemes on the primal solution (to get the final labelling f), the following guarantees hold true: (i) for the Potts model (i.e. [sent-77, score-0.484]
24 d(f(a) − f(b)) = |f(a) − f(b)| and M = 1), we obtain a multiplicative bound of 2; (ii) for the truncated linear metric (i.e. [sent-79, score-0.431]
25 2 Let f be the labelling obtained by an algorithm A (e.g. [sent-83, score-0.34]
26 in this case when the pairwise potentials form a Potts model). [sent-87, score-0.311]
27 The algorithm A is said to achieve a multiplicative bound of σ if, for every instance in the class of MAP estimation problems, the following holds true: E(Q(f, D; θ)/Q(f∗, D; θ)) ≤ σ, where E(·) denotes the expectation of its argument under the rounding scheme. [sent-89, score-0.241]
28 Initialization - Initialize the labelling to some function f1. [sent-90, score-0.34]
29 Iteration - Choose an interval Im = [im + 1, jm ] where (jm − im ) = L such that d(L) ≥ M . [sent-92, score-0.425]
30 - Move from current labelling fm to a new labelling fm+1 such that fm+1 (a) = fm (a) or fm+1 (a) ∈ Im , ∀va ∈ v. [sent-93, score-1.624]
31 The new labelling is obtained by solving the st-MINCUT problem on a graph described in § 2. [sent-94, score-0.407]
32 As is typical with move making methods, our approach iteratively goes from one labelling to the next by solving an st-MINCUT problem. [sent-98, score-0.501]
33 d(f(a) − f(b)) = |f(a) − f(b)| and a general M > 0), we obtain a multiplicative bound of 2 + √2; and (iii) for the truncated quadratic semi-metric (i.e. [sent-100, score-0.494]
34 Move making algorithms start with an initial labelling f0 and iteratively minimize the Gibbs energy by moving to a better labelling. [sent-104, score-0.486]
35 The recently proposed range move algorithm [23] modifies this approach such that any variable currently labelled li where i ∈ [α, β] can be assigned any label lj where j ∈ [α, β]. [sent-109, score-0.235]
36 In contrast, the α-expansion algorithm [4] (where each variable can either retain its label or get assigned the label lα at an iteration) provides a multiplicative bound of 2 for the Potts model and 2M for the truncated linear metric. [sent-116, score-0.532]
37 Gupta and Tardos [8] generalized the α-expansion algorithm for the truncated linear metric and obtained a multiplicative bound of 4. [sent-117, score-0.475]
38 Komodakis and Tziritas [14] designed a primal-dual algorithm which provides a bound of 2M for the truncated quadratic semi-metric. [sent-118, score-0.421]
39 However, all the above move making algorithms use only a single st-MINCUT at each iteration and are hence much faster than interior point algorithms, TRW, TRW-S and BP. [sent-120, score-0.265]
40 The first extension allows us to handle any truncated convex model (and not just truncated linear). [sent-123, score-0.712]
41 In other words, if in the mth iteration we move from labelling fm to fm+1, then it is possible that there exists another labelling f′m+1 such that f′m+1(a) = fm(a) or f′m+1(a) ∈ Im for all va ∈ v, and Q(f′m+1, D; θ) < Q(fm+1, D; θ). [sent-131, score-1.662]
42 We now turn our attention to designing a method of moving from labelling fm to fm+1 . [sent-133, score-0.812]
43 Our approach relies on constructing a graph such that every st-cut on the graph corresponds to a labelling f ′ of the random variables which satisfies: f ′ (a) = fm (a) or f ′ (a) ∈ Im , for all va ∈ v. [sent-134, score-1.171]
44 The new labelling fm+1 is obtained in two steps: (i) we obtain a labelling f′ which corresponds to the st-MINCUT on our graph; and (ii) we choose the new labelling as fm+1 = f′ if Q(f′, D; θ) ≤ Q(fm, D; θ), and fm+1 = fm otherwise. (4) [sent-135, score-1.492]
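A hedged sketch of the outer loop implied by the table above and by equation (4); solve_interval_move stands in for the st-MINCUT construction described below and is an assumed callable, not given here.

```python
# Outer move-making loop: sweep over intervals I_m = [im + 1, jm] with
# (jm - im) = L and d(L) >= M, keeping a move only if it does not increase
# the Gibbs energy (the acceptance rule of equation (4)).
def range_moves(f0, intervals, energy, solve_interval_move, n_sweeps=5):
    f = dict(f0)
    for _ in range(n_sweeps):
        for (im, jm) in intervals:
            f_new = solve_interval_move(f, im, jm)   # st-MINCUT step
            if energy(f_new) <= energy(f):           # equation (4)
                f = f_new
    return f
```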
45 We also have the current labelling fm for all the random variables. [sent-141, score-0.836]
46 We construct a directed weighted graph (with non-negative weights) Gm = {Vm, Em, cm(·, ·)} such that for each va ∈ v, we define vertices {a_{im+1}, a_{im+2}, · · · , a_{jm}} ∈ Vm. [sent-142, score-0.575]
47 The edge capacities (i.e. weights) cm(e) are of two types: (i) those that represent the unary potentials of a labelling corresponding to an st-cut in the graph; and (ii) those that represent the pairwise potentials of the labelling. [sent-146, score-1.166]
48 Figure 1: Part of the graph Gm containing the terminals and the vertices corresponding to the variable va . [sent-147, score-0.304]
49 The edges which represent the unary potential of the new labelling are also shown. [sent-148, score-0.593]
50 Fig. 1 shows the above edges together with their capacities for one random variable va. [sent-151, score-0.411]
51 Any st-cut with finite cost contains only one of the finite capacity edges for each random variable va. [sent-153, score-0.455]
52 This is because if an st-cut included more than one finite capacity edge, then by construction it must include at least one infinite capacity edge thereby making its cost infinite [9, 23]. [sent-154, score-0.319]
53 We interpret a finite cost st-cut as a relabelling of the random variables as follows: f′(a) = k if the st-cut includes edge (ak, ak+1) where k ∈ [im+1, jm); f′(a) = jm if the st-cut includes edge (ajm, t); and f′(a) = fm(a) if the st-cut includes edge (s, aim+1). (5) [sent-155, score-0.953]
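The per-variable part of Gm and the decoding rule (5) can be sketched as follows. The unary capacity placement is inferred from equation (5) together with the remark below that the unary sum equals the cut cost over these edges; networkx's minimum_cut is used as a stand-in for a dedicated st-MINCUT solver, which is an assumption of this sketch.

```python
import networkx as nx

INF = float('inf')

def add_unary_chain(G, a, fm_a, theta1_a, im, jm):
    # chain s -> a_{im+1} -> ... -> a_{jm} -> t; cutting exactly one finite
    # edge assigns a label, per equation (5)
    G.add_edge('s', (a, im + 1), capacity=theta1_a[fm_a])      # keep fm(a)
    for k in range(im + 1, jm):
        G.add_edge((a, k), (a, k + 1), capacity=theta1_a[k])   # f'(a) = k
        G.add_edge((a, k + 1), (a, k), capacity=INF)           # forbids double cuts
    G.add_edge((a, jm), 't', capacity=theta1_a[jm])            # f'(a) = jm

def decode_labelling(G, fm, im, jm, variables):
    _, (source_side, sink_side) = nx.minimum_cut(G, 's', 't')
    f_new = {}
    for a in variables:
        in_t = [k for k in range(im + 1, jm + 1) if (a, k) in sink_side]
        if not in_t:
            f_new[a] = jm            # edge (a_jm, t) was cut
        elif in_t[0] == im + 1:
            f_new[a] = fm[a]         # edge (s, a_{im+1}) was cut
        else:
            f_new[a] = in_t[0] - 1   # edge (a_k, a_{k+1}) was cut at k = in_t[0] - 1
    return f_new
```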
54 Note that the sum of the unary potentials for the labelling f ′ is exactly equal to the cost of the st-cut over the edges defined above. [sent-156, score-0.817]
55 However, the Gibbs energy of the labelling also includes the sum of the pairwise potentials (as shown in equation (1)). [sent-157, score-0.744]
56 Unlike the unary potentials we will not be able to model the sum of pairwise potentials exactly. [sent-158, score-0.621]
57 Representing Pairwise Potentials For all neighbouring random variables va and vb, i.e. [sent-160, score-0.321]
58 (a, b) ∈ E, we define edges (ak, bk′) ∈ Em where either one or both of k and k′ belong to the set [im+1, jm] (i.e. [sent-162, score-0.274]
59 The capacity of these edges is given by cm(ak, bk′) = (wab/2)(d(k − k′ + 1) − 2d(k − k′) + d(k − k′ − 1)). (6) [sent-165, score-0.586]
60 The above capacity is non-negative due to the fact that wab ≥ 0 and d(·) is convex. [sent-166, score-0.302]
61 Furthermore, we also add the following edges: cm(ak, ak+1) = (wab/2)(d(L − k + im) + d(k − im)) for all (a, b) ∈ E, k ∈ [im+1, jm); cm(bk′, bk′+1) = (wab/2)(d(L − k′ + im) + d(k′ − im)) for all (a, b) ∈ E, k′ ∈ [im+1, jm); and cm(ajm, t) = cm(bjm, t) = (wab/2) d(L) for all (a, b) ∈ E. (7) [sent-167, score-2.523]
62 3 Recall that the cost of an st-cut is the sum of the capacities of the edges whose starting point lies in the set of vertices containing the source s and whose ending point lies in the set of vertices containing the sink t. [sent-168, score-0.314]
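Equations (6) and (7) translate into the following hedged sketch; capacities are accumulated (the bump helper) because the chain edges of equation (7) coincide with edges that already carry unary capacities, a modelling choice of this sketch.

```python
# Convex pairwise edges of equations (6) and (7); d is the convex function
# and L = jm - im. G is the networkx digraph from the previous sketch.
def bump(G, u, v, c):
    prev = G[u][v]['capacity'] if G.has_edge(u, v) else 0.0
    G.add_edge(u, v, capacity=prev + c)

def add_convex_pairwise(G, a, b, w_ab, d, im, jm):
    L = jm - im
    for k in range(im + 1, jm + 1):       # equation (6): opposing edge pairs
        for kp in range(im + 1, jm + 1):
            c = 0.5 * w_ab * (d(k - kp + 1) - 2 * d(k - kp) + d(k - kp - 1))
            if c > 0:                     # non-negative since d is convex
                bump(G, (a, k), (b, kp), c)
                bump(G, (b, kp), (a, k), c)
    for k in range(im + 1, jm):           # equation (7): chain edges
        extra = 0.5 * w_ab * (d(L - k + im) + d(k - im))
        bump(G, (a, k), (a, k + 1), extra)
        bump(G, (b, k), (b, k + 1), extra)
    bump(G, (a, jm), 't', 0.5 * w_ab * d(L))   # equation (7): sink edges
    bump(G, (b, jm), 't', 0.5 * w_ab * d(L))
```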
63 Figure 2: (a) Edges that are used to represent the pairwise potentials of two neighbouring random variables va and vb are shown. [sent-169, score-0.632]
64 Undirected edges indicate that there are opposing edges in both directions with equal capacity (as given by equation 6). [sent-170, score-0.419]
65 Directed dashed edges, with capacities shown in equation (7), are added to ensure that the graph models the convex pairwise potentials correctly. [sent-171, score-0.525]
66 (b) An additional edge is added when fm(a) ∉ Im and fm(b) ∈ Im. [sent-172, score-0.979]
67 (c) A similar additional edge is added when fm(a) ∈ Im and fm(b) ∉ Im. [sent-174, score-0.979]
68 (d) Five edges, with capacities as shown in equation (8), are added when fm(a) ∉ Im and fm(b) ∉ Im. [sent-175, score-1.005]
69 Undirected edges indicate the presence of opposing edges with equal capacity. [sent-176, score-0.314]
70 Note that in [23] the graph obtained by the edges in equations (6) and (7) was used to find the exact MAP estimate for convex pairwise potentials. [sent-177, score-0.407]
71 A proof that the above edges exactly model convex pairwise potentials up to an additive constant κab = wab d(L) can be found in [17]. [sent-178, score-0.74]
72 However, we are concerned with the NP-hard case where the pairwise potentials are truncated. [sent-179, score-0.311]
73 • If fm(a) ∈ Im and fm(b) ∈ Im then we do not add any more edges in the graph (see Fig. 2(a)). [sent-182, score-1.157]
74 • If fm(a) ∉ Im and fm(b) ∈ Im then we add an edge (aim+1, bim+1) with capacity wab M + κab/2, where κab = wab d(L) is a constant for a given pair of neighbouring random variables (a, b) ∈ E (see Fig. 2(b)). [sent-184, score-1.647]
75 Similarly, if fm(a) ∈ Im and fm(b) ∉ Im then we add an edge (bim+1, aim+1) with capacity wab M + κab/2 (see Fig. 2(c)). [sent-186, score-1.335]
76 • If fm(a) ∉ Im and fm(b) ∉ Im, we introduce a new vertex pab. [sent-188, score-1.049]
77 Using this vertex pab, five edges are defined with the following capacities (see Fig. 2(d)): [sent-189, score-0.312]
78 cm(aim+1, pab) = cm(pab, aim+1) = cm(bim+1, pab) = cm(pab, bim+1) = wab M + κab/2, and cm(s, pab) = θ²ab;fm(a)fm(b) + κab. (8) [sent-190, score-1.316]
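The three truncation cases above can be sketched as one routine, reusing the bump helper from the previous sketch; kappa_ab = wab·d(L), and theta2_old stands for θ²ab;fm(a)fm(b), the truncated pairwise potential of the current labels.

```python
def add_truncation_edges(G, a, b, fm, w_ab, M, kappa_ab, theta2_old, im, jm):
    a_in = im + 1 <= fm[a] <= jm
    b_in = im + 1 <= fm[b] <= jm
    cap = w_ab * M + kappa_ab / 2.0
    if a_in and b_in:
        return                                  # Fig. 2(a): no extra edges
    if (not a_in) and b_in:
        bump(G, (a, im + 1), (b, im + 1), cap)  # Fig. 2(b)
    elif a_in and (not b_in):
        bump(G, (b, im + 1), (a, im + 1), cap)  # Fig. 2(c)
    else:                                       # Fig. 2(d): both outside I_m
        p = ('p', a, b)                         # auxiliary vertex p_ab
        for (u, v) in (((a, im + 1), p), (p, (a, im + 1)),
                       ((b, im + 1), p), (p, (b, im + 1))):
            bump(G, u, v, cap)                  # four edges of equation (8)
        bump(G, 's', p, theta2_old + kappa_ab)  # fifth edge of equation (8)
```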
79 Given the graph Gm we solve the st-MINCUT problem which provides us with a labelling f ′ as described in equation (5). [sent-192, score-0.426]
80 The new labelling fm+1 is obtained using equation (4). [sent-193, score-0.34]
81 Note that our graph construction is similar to that of Gupta and Tardos [8] with two notable exceptions: (i) we can handle any general truncated convex model and not just truncated linear as in the case of [8]. [sent-194, score-0.803]
82 The following properties provide such a value of L for both the truncated linear and the truncated quadratic models. [sent-197, score-0.671]
83 2 Properties of the Algorithm For the above graph construction, the following properties hold true: • The cost of the st-MINCUT provides an upper bound on the Gibbs energy of the labelling f′ and hence on the Gibbs energy of fm+1 (see section 2). [sent-200, score-0.677]
84 • For the truncated linear metric, our algorithm obtains a multiplicative bound of 2 + √2 using L = √2M (see section 3, Theorem 1, of [17]). [sent-202, score-0.472]
85 • For the truncated quadratic semi-metric, our algorithm obtains a multiplicative bound of O(√M) using L = √M (see section 3, Theorem 2, of [17]). [sent-208, score-0.517]
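A tiny sketch of the interval lengths quoted above, reading the garbled originals as L = √2·M for the truncated linear metric and L = √M for the truncated quadratic semi-metric (the rounding of L to an integer interval length is left implicit here).

```python
import math

def interval_length(model, M):
    if model == 'linear':
        return math.sqrt(2) * M      # Theorem 1 of [17]: bound 2 + sqrt(2)
    if model == 'quadratic':
        return math.sqrt(M)          # Theorem 2 of [17]: bound O(sqrt(M))
    raise ValueError('unknown model: %s' % model)
```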
86 Synthetic Data: Experimental Setup. We used 100 random fields for both the truncated linear and truncated quadratic models. [sent-214, score-0.695]
87 Specifically, the unary potentials θ¹a;i were sampled uniformly from the interval [0, 10] while the weights wab, which determine the pairwise potentials, were sampled uniformly from [0, 5]. [sent-221, score-0.615]
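The synthetic setup can be sketched as follows; the 4-connected grid neighbourhood is an assumption, since the text does not specify the topology here.

```python
import random

def grid_edges(rows, cols):
    # 4-connected grid neighbourhood (assumed)
    E = []
    for r in range(rows):
        for c in range(cols):
            a = r * cols + c
            if c + 1 < cols:
                E.append((a, a + 1))
            if r + 1 < rows:
                E.append((a, a + cols))
    return E

def random_field(n_vars, n_labels, edges, seed=0):
    # unary potentials uniform on [0, 10], edge weights uniform on [0, 5]
    rng = random.Random(seed)
    theta1 = {a: [rng.uniform(0, 10) for _ in range(n_labels)]
              for a in range(n_vars)}
    w = {e: rng.uniform(0, 5) for e in edges}
    return theta1, w
```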
88 For the above random field formulation, the unary potentials were defined as in [22] and were truncated at 15. [sent-235, score-0.647]
89 We experimented using the following truncated convex potentials: θ²ab;ij = 50 min{|i − j|, 10}, θ²ab;ij = 50 min{(i − j)², 100}. (9) [sent-238, score-0.313]
90 The above form of pairwise potentials encourages neighbouring pixels to take similar disparity values, which corresponds to our expectation of finding smooth surfaces in natural images. [sent-239, score-0.396]
91 Truncation of pairwise potentials is essential to avoid oversmoothing, as observed in [4, 23]. [sent-240, score-0.311]
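As a usage note, the stereo potentials of equation (9) are instances of the truncated_linear / truncated_quadratic helpers sketched earlier:

```python
theta2_lin = lambda i, j: truncated_linear(i, j, w_ab=50, M=10)
theta2_quad = lambda i, j: truncated_quadratic(i, j, w_ab=50, M=100)
```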
92 4 Discussion We have presented an st-MINCUT based algorithm for obtaining the approximate MAP estimate of discrete random fields with truncated convex pairwise potentials. [sent-270, score-0.531]
93 Our method improves the multiplicative bound for the truncated linear metric compared to [4, 8] and provides the best known bound for the truncated quadratic semi-metric. [sent-271, score-0.896]
94 In fact, its speed can be further improved by a large factor using clever techniques such as those described in [12] (for convex unary potentials) and/or [1] (for general unary potentials). [sent-273, score-0.322]
95 The analysis in § 2 shows that, for the truncated linear and truncated quadratic models, the bound achieved by our move making algorithm over intervals of any length L is equal to that of rounding the LP relaxation's optimal solution using the same intervals [5]. [sent-277, score-0.961]
96 A natural question would be to ask about the relationship between move making algorithms and the rounding schemes used in convex relaxations. [sent-279, score-0.376]
97 Note that despite recent efforts [14] which analyze certain move making algorithms in the context of primal-dual approaches for the LP relaxation, not many results are known about their connection with randomized rounding schemes. [sent-280, score-0.27]
98 A linear programming formulation and approximation algorithms for the metric labelling problem. [sent-317, score-0.39]
99 Conditional random fields: Probabilistic models for segmenting and labelling sequence data. [sent-392, score-0.364]
100 Graph cut based optimization for MRFs with truncated convex priors. [sent-421, score-0.399]
wordName wordTfidf (topN-words)
[('fm', 0.472), ('labelling', 0.34), ('truncated', 0.313), ('im', 0.297), ('potentials', 0.203), ('wab', 0.197), ('va', 0.18), ('ab', 0.165), ('edges', 0.146), ('cm', 0.138), ('move', 0.132), ('jm', 0.128), ('lp', 0.11), ('pairwise', 0.108), ('unary', 0.107), ('capacity', 0.105), ('ajm', 0.105), ('pab', 0.105), ('energy', 0.093), ('trw', 0.092), ('multiplicative', 0.092), ('ak', 0.091), ('convex', 0.086), ('rounding', 0.085), ('gibbs', 0.08), ('stereo', 0.074), ('mrf', 0.072), ('bp', 0.068), ('graph', 0.067), ('neighbouring', 0.064), ('relaxation', 0.063), ('map', 0.061), ('capacities', 0.061), ('elds', 0.061), ('bim', 0.06), ('gupta', 0.056), ('aim', 0.054), ('neighbourhood', 0.053), ('tardos', 0.048), ('potts', 0.045), ('quadratic', 0.045), ('bound', 0.044), ('pami', 0.044), ('kolmogorov', 0.043), ('komodakis', 0.039), ('bk', 0.037), ('passing', 0.036), ('message', 0.035), ('edge', 0.035), ('truncation', 0.034), ('veksler', 0.034), ('iteration', 0.034), ('dual', 0.033), ('primal', 0.033), ('vb', 0.032), ('label', 0.032), ('vertices', 0.031), ('chekuri', 0.03), ('ishikawa', 0.03), ('teddy', 0.03), ('wolfson', 0.03), ('synthetic', 0.03), ('making', 0.029), ('ii', 0.029), ('gm', 0.028), ('unlike', 0.028), ('range', 0.028), ('eld', 0.027), ('slower', 0.026), ('terminals', 0.026), ('pawan', 0.026), ('interior', 0.026), ('kumar', 0.026), ('reconstruction', 0.026), ('guarantees', 0.026), ('metric', 0.026), ('art', 0.025), ('algorithms', 0.024), ('labelled', 0.024), ('disparities', 0.024), ('sink', 0.024), ('random', 0.024), ('construction', 0.024), ('obtains', 0.023), ('opposing', 0.022), ('clever', 0.022), ('disparity', 0.021), ('vm', 0.021), ('variables', 0.021), ('cost', 0.021), ('felzenszwalb', 0.02), ('bounds', 0.02), ('relationship', 0.02), ('em', 0.02), ('labels', 0.02), ('faster', 0.02), ('estimation', 0.02), ('provides', 0.019), ('oxford', 0.019), ('lj', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 104 nips-2008-Improved Moves for Truncated Convex Models
Author: Philip Torr, M. P. Kumar
Abstract: We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. For this problem, we propose an improved st-MINCUT based move making algorithm. Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. Compared to previous approaches based on the LP relaxation, e.g. interior-point algorithms or treereweighted message passing (TRW), our method is faster as it uses only the efficient st-MINCUT algorithm in its design. Furthermore, it directly provides us with a primal solution (unlike TRW and other related methods which solve the dual of the LP). We demonstrate the effectiveness of the proposed approach on both synthetic and standard real data problems. Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as α-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. We believe that further explorations in this direction would help design efficient algorithms for more complex relaxations.
2 0.2242361 171 nips-2008-Online Prediction on Large Diameter Graphs
Author: Mark Herbster, Guy Lever, Massimiliano Pontil
Abstract: We continue our study of online prediction of the labelling of a graph. We show a fundamental limitation of Laplacian-based algorithms: if the graph has a large diameter then the number of mistakes made by such algorithms may be proportional to the square root of the number of vertices, even when tackling simple problems. We overcome this drawback by means of an efficient algorithm which achieves a logarithmic mistake bound. It is based on the notion of a spine, a path graph which provides a linear embedding of the original graph. In practice, graphs may exhibit cluster structure; thus in the last part, we present a modified algorithm which achieves the “best of both worlds”: it performs well locally in the presence of cluster structure, and globally on large diameter graphs. 1
3 0.15833604 214 nips-2008-Sparse Online Learning via Truncated Gradient
Author: John Langford, Lihong Li, Tong Zhang
Abstract: We propose a general method called truncated gradient to induce sparsity in the weights of online-learning algorithms with convex loss. This method has several essential properties. First, the degree of sparsity is continuous—a parameter controls the rate of sparsification from no sparsification to total sparsification. Second, the approach is theoretically motivated, and an instance of it can be regarded as an online counterpart of the popular L1 -regularization method in the batch setting. We prove small rates of sparsification result in only small additional regret with respect to typical online-learning guarantees. Finally, the approach works well empirically. We apply it to several datasets and find for datasets with large numbers of features, substantial sparsity is discoverable. 1
4 0.095943607 49 nips-2008-Clusters and Coarse Partitions in LP Relaxations
Author: David Sontag, Amir Globerson, Tommi S. Jaakkola
Abstract: We propose a new class of consistency constraints for Linear Programming (LP) relaxations for finding the most probable (MAP) configuration in graphical models. Usual cluster-based LP relaxations enforce joint consistency on the beliefs of a cluster of variables, with computational cost increasing exponentially with the size of the clusters. By partitioning the state space of a cluster and enforcing consistency only across partitions, we obtain a class of constraints which, although less tight, are computationally feasible for large clusters. We show how to solve the cluster selection and partitioning problem monotonically in the dual LP, using the current beliefs to guide these choices. We obtain a dual message passing algorithm and apply it to protein design problems where the variables have large state spaces and the usual cluster-based relaxations are very costly. The resulting method solves many of these problems exactly, and significantly faster than a method that does not use partitioning. 1
5 0.081341736 40 nips-2008-Bounds on marginal probability distributions
Author: Joris M. Mooij, Hilbert J. Kappen
Abstract: We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables. The bound is obtained by propagating local bounds (convex sets of probability distributions) over a subtree of the factor graph, rooted in the variable of interest. By construction, the method not only bounds the exact marginal probability distribution of a variable, but also its approximate Belief Propagation marginal (“belief”). Thus, apart from providing a practical means to calculate bounds on marginals, our contribution also lies in providing a better understanding of the error made by Belief Propagation. We show that our bound outperforms the state-of-the-art on some inference problems arising in medical diagnosis. 1
6 0.080575563 47 nips-2008-Clustered Multi-Task Learning: A Convex Formulation
7 0.077399351 145 nips-2008-Multi-stage Convex Relaxation for Learning with Sparse Regularization
8 0.074243002 165 nips-2008-On the Reliability of Clustering Stability in the Large Sample Regime
9 0.072974637 135 nips-2008-Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of \boldmath$\ell 1$-regularized MLE
10 0.072393805 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping
11 0.072074324 69 nips-2008-Efficient Exact Inference in Planar Ising Models
12 0.069828108 34 nips-2008-Bayesian Network Score Approximation using a Metagraph Kernel
13 0.065640293 37 nips-2008-Biasing Approximate Dynamic Programming with a Lower Discount Factor
14 0.064353488 152 nips-2008-Non-stationary dynamic Bayesian networks
15 0.063029438 201 nips-2008-Robust Near-Isometric Matching via Structured Learning of Graphical Models
16 0.059713371 175 nips-2008-PSDBoost: Matrix-Generation Linear Programming for Positive Semidefinite Matrices Learning
17 0.058515489 84 nips-2008-Fast Prediction on a Tree
18 0.05599089 53 nips-2008-Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation
19 0.055074226 129 nips-2008-MAS: a multiplicative approximation scheme for probabilistic inference
20 0.053602286 148 nips-2008-Natural Image Denoising with Convolutional Networks
topicId topicWeight
[(0, -0.156), (1, -0.026), (2, -0.045), (3, 0.038), (4, 0.032), (5, -0.077), (6, 0.018), (7, -0.098), (8, 0.006), (9, -0.181), (10, -0.146), (11, 0.063), (12, 0.033), (13, -0.032), (14, 0.015), (15, -0.06), (16, 0.0), (17, 0.048), (18, -0.117), (19, -0.028), (20, -0.008), (21, -0.067), (22, -0.023), (23, -0.014), (24, -0.066), (25, -0.006), (26, -0.0), (27, 0.056), (28, -0.099), (29, -0.044), (30, -0.102), (31, -0.033), (32, -0.07), (33, -0.07), (34, -0.09), (35, -0.012), (36, -0.142), (37, -0.018), (38, 0.084), (39, 0.077), (40, -0.057), (41, -0.056), (42, -0.081), (43, -0.107), (44, -0.086), (45, 0.047), (46, -0.074), (47, -0.031), (48, 0.043), (49, -0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.94695139 104 nips-2008-Improved Moves for Truncated Convex Models
Author: Philip Torr, M. P. Kumar
Abstract: We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. For this problem, we propose an improved st-MINCUT based move making algorithm. Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. Compared to previous approaches based on the LP relaxation, e.g. interior-point algorithms or treereweighted message passing (TRW), our method is faster as it uses only the efficient st-MINCUT algorithm in its design. Furthermore, it directly provides us with a primal solution (unlike TRW and other related methods which solve the dual of the LP). We demonstrate the effectiveness of the proposed approach on both synthetic and standard real data problems. Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as α-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. We believe that further explorations in this direction would help design efficient algorithms for more complex relaxations.
2 0.60895091 171 nips-2008-Online Prediction on Large Diameter Graphs
Author: Mark Herbster, Guy Lever, Massimiliano Pontil
Abstract: We continue our study of online prediction of the labelling of a graph. We show a fundamental limitation of Laplacian-based algorithms: if the graph has a large diameter then the number of mistakes made by such algorithms may be proportional to the square root of the number of vertices, even when tackling simple problems. We overcome this drawback by means of an efficient algorithm which achieves a logarithmic mistake bound. It is based on the notion of a spine, a path graph which provides a linear embedding of the original graph. In practice, graphs may exhibit cluster structure; thus in the last part, we present a modified algorithm which achieves the “best of both worlds”: it performs well locally in the presence of cluster structure, and globally on large diameter graphs. 1
3 0.54702568 69 nips-2008-Efficient Exact Inference in Planar Ising Models
Author: Nicol N. Schraudolph, Dmitry Kamenetsky
Abstract: We give polynomial-time algorithms for the exact computation of lowest-energy states, worst margin violators, partition functions, and marginals in certain binary undirected graphical models. Our approach provides an interesting alternative to the well-known graph cut paradigm in that it does not impose any submodularity constraints; instead we require planarity to establish a correspondence with perfect matchings in an expanded dual graph. Maximum-margin parameter estimation for a boundary detection task shows our approach to be efficient and effective. A C++ implementation is available from http://nic.schraudolph.org/isinf/. 1
4 0.54242271 53 nips-2008-Counting Solution Clusters in Graph Coloring Problems Using Belief Propagation
Author: Lukas Kroc, Ashish Sabharwal, Bart Selman
Abstract: We show that an important and computationally challenging solution space feature of the graph coloring problem (COL), namely the number of clusters of solutions, can be accurately estimated by a technique very similar to one for counting the number of solutions. This cluster counting approach can be naturally written in terms of a new factor graph derived from the factor graph representing the COL instance. Using a variant of the Belief Propagation inference framework, we can efficiently approximate cluster counts in random COL problems over a large range of graph densities. We illustrate the algorithm on instances with up to 100, 000 vertices. Moreover, we supply a methodology for computing the number of clusters exactly using advanced techniques from the knowledge compilation literature. This methodology scales up to several hundred variables. 1
5 0.5260939 115 nips-2008-Learning Bounded Treewidth Bayesian Networks
Author: Gal Elidan, Stephen Gould
Abstract: With the increased availability of data for complex domains, it is desirable to learn Bayesian network structures that are sufficiently expressive for generalization while also allowing for tractable inference. While the method of thin junction trees can, in principle, be used for this purpose, its fully greedy nature makes it prone to overfitting, particularly when data is scarce. In this work we present a novel method for learning Bayesian networks of bounded treewidth that employs global structure modifications and that is polynomial in the size of the graph and the treewidth bound. At the heart of our method is a triangulated graph that we dynamically update in a way that facilitates the addition of chain structures that increase the bound on the model’s treewidth by at most one. We demonstrate the effectiveness of our “treewidth-friendly” method on several real-life datasets. Importantly, we also show that by using global operators, we are able to achieve better generalization even when learning Bayesian networks of unbounded treewidth. 1
6 0.52352065 84 nips-2008-Fast Prediction on a Tree
7 0.43250781 175 nips-2008-PSDBoost: Matrix-Generation Linear Programming for Positive Semidefinite Matrices Learning
8 0.42222339 34 nips-2008-Bayesian Network Score Approximation using a Metagraph Kernel
9 0.40923485 49 nips-2008-Clusters and Coarse Partitions in LP Relaxations
10 0.39691934 225 nips-2008-Supervised Bipartite Graph Inference
11 0.39376974 25 nips-2008-An interior-point stochastic approximation method and an L1-regularized delta rule
12 0.38923746 50 nips-2008-Continuously-adaptive discretization for message-passing algorithms
13 0.38223103 47 nips-2008-Clustered Multi-Task Learning: A Convex Formulation
14 0.36984488 40 nips-2008-Bounds on marginal probability distributions
15 0.36276311 214 nips-2008-Sparse Online Learning via Truncated Gradient
16 0.35766384 107 nips-2008-Influence of graph construction on graph-based clustering measures
17 0.35315323 165 nips-2008-On the Reliability of Clustering Stability in the Large Sample Regime
18 0.34077471 129 nips-2008-MAS: a multiplicative approximation scheme for probabilistic inference
19 0.33235627 145 nips-2008-Multi-stage Convex Relaxation for Learning with Sparse Regularization
20 0.32737902 183 nips-2008-Predicting the Geometry of Metal Binding Sites from Protein Sequence
topicId topicWeight
[(6, 0.058), (7, 0.046), (12, 0.029), (28, 0.157), (56, 0.279), (57, 0.129), (59, 0.032), (63, 0.029), (71, 0.054), (77, 0.027), (83, 0.044)]
simIndex simValue paperId paperTitle
same-paper 1 0.72277266 104 nips-2008-Improved Moves for Truncated Convex Models
Author: Philip Torr, M. P. Kumar
Abstract: We consider the problem of obtaining the approximate maximum a posteriori estimate of a discrete random field characterized by pairwise potentials that form a truncated convex model. For this problem, we propose an improved st-MINCUT based move making algorithm. Unlike previous move making approaches, which either provide a loose bound or no bound on the quality of the solution (in terms of the corresponding Gibbs energy), our algorithm achieves the same guarantees as the standard linear programming (LP) relaxation. Compared to previous approaches based on the LP relaxation, e.g. interior-point algorithms or treereweighted message passing (TRW), our method is faster as it uses only the efficient st-MINCUT algorithm in its design. Furthermore, it directly provides us with a primal solution (unlike TRW and other related methods which solve the dual of the LP). We demonstrate the effectiveness of the proposed approach on both synthetic and standard real data problems. Our analysis also opens up an interesting question regarding the relationship between move making algorithms (such as α-expansion and the algorithms presented in this paper) and the randomized rounding schemes used with convex relaxations. We believe that further explorations in this direction would help design efficient algorithms for more complex relaxations.
2 0.61879754 27 nips-2008-Artificial Olfactory Brain for Mixture Identification
Author: Mehmet K. Muezzinoglu, Alexander Vergara, Ramon Huerta, Thomas Nowotny, Nikolai Rulkov, Henry Abarbanel, Allen Selverston, Mikhail Rabinovich
Abstract: The odor transduction process has a large time constant and is susceptible to various types of noise. Therefore, the olfactory code at the sensor/receptor level is in general a slow and highly variable indicator of the input odor in both natural and artificial situations. Insects overcome this problem by using a neuronal device in their Antennal Lobe (AL), which transforms the identity code of olfactory receptors to a spatio-temporal code. This transformation improves the decision of the Mushroom Bodies (MBs), the subsequent classifier, in both speed and accuracy. Here we propose a rate model based on two intrinsic mechanisms in the insect AL, namely integration and inhibition. Then we present a MB classifier model that resembles the sparse and random structure of insect MB. A local Hebbian learning procedure governs the plasticity in the model. These formulations not only help to understand the signal conditioning and classification methods of insect olfactory systems, but also can be leveraged in synthetic problems. Among them, we consider here the discrimination of odor mixtures from pure odors. We show on a set of records from metal-oxide gas sensors that the cascade of these two new models facilitates fast and accurate discrimination of even highly imbalanced mixtures from pure odors. 1
3 0.6097663 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction
Author: Jing Xu, Thomas L. Griffiths
Abstract: Many human interactions involve pieces of information being passed from one person to another, raising the question of how this process of information transmission is affected by the capacities of the agents involved. In the 1930s, Sir Frederic Bartlett explored the influence of memory biases in “serial reproduction” of information, in which one person’s reconstruction of a stimulus from memory becomes the stimulus seen by the next person. These experiments were done using relatively uncontrolled stimuli such as pictures and stories, but suggested that serial reproduction would transform information in a way that reflected the biases inherent in memory. We formally analyze serial reproduction using a Bayesian model of reconstruction from memory, giving a general result characterizing the effect of memory biases on information transmission. We then test the predictions of this account in two experiments using simple one-dimensional stimuli. Our results provide theoretical and empirical justification for the idea that serial reproduction reflects memory biases. 1
4 0.60808492 208 nips-2008-Shared Segmentation of Natural Scenes Using Dependent Pitman-Yor Processes
Author: Erik B. Sudderth, Michael I. Jordan
Abstract: We develop a statistical framework for the simultaneous, unsupervised segmentation and discovery of visual object categories from image databases. Examining a large set of manually segmented scenes, we show that object frequencies and segment sizes both follow power law distributions, which are well modeled by the Pitman–Yor (PY) process. This nonparametric prior distribution leads to learning algorithms which discover an unknown set of objects, and segmentation methods which automatically adapt their resolution to each image. Generalizing previous applications of PY processes, we use Gaussian processes to discover spatially contiguous segments which respect image boundaries. Using a novel family of variational approximations, our approach produces segmentations which compare favorably to state-of-the-art methods, while simultaneously discovering categories shared among natural scenes. 1
5 0.60186899 116 nips-2008-Learning Hybrid Models for Image Annotation with Partially Labeled Data
Author: Xuming He, Richard S. Zemel
Abstract: Extensive labeled data for image annotation systems, which learn to assign class labels to image regions, is difficult to obtain. We explore a hybrid model framework for utilizing partially labeled data that integrates a generative topic model for image appearance with discriminative label prediction. We propose three alternative formulations for imposing a spatial smoothness prior on the image labels. Tests of the new models and some baseline approaches on three real image datasets demonstrate the effectiveness of incorporating the latent structure. 1
6 0.59755224 197 nips-2008-Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation
7 0.59618938 118 nips-2008-Learning Transformational Invariants from Natural Movies
8 0.59387577 176 nips-2008-Partially Observed Maximum Entropy Discrimination Markov Networks
9 0.59382558 66 nips-2008-Dynamic visual attention: searching for coding length increments
10 0.59135664 80 nips-2008-Extended Grassmann Kernels for Subspace-Based Learning
11 0.59103727 30 nips-2008-Bayesian Experimental Design of Magnetic Resonance Imaging Sequences
12 0.59083682 158 nips-2008-Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks
13 0.58993131 200 nips-2008-Robust Kernel Principal Component Analysis
14 0.58927894 234 nips-2008-The Infinite Factorial Hidden Markov Model
15 0.58918279 236 nips-2008-The Mondrian Process
16 0.58906019 62 nips-2008-Differentiable Sparse Coding
17 0.58835185 92 nips-2008-Generative versus discriminative training of RBMs for classification of fMRI images
18 0.58722264 232 nips-2008-The Conjoint Effect of Divisive Normalization and Orientation Selectivity on Redundancy Reduction
19 0.58682203 134 nips-2008-Mixed Membership Stochastic Blockmodels
20 0.58646858 235 nips-2008-The Infinite Hierarchical Factor Regression Model