cvpr cvpr2013 cvpr2013-180 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Neill D.F. Campbell, Kartic Subr, Jan Kautz
Abstract: Conditional Random Fields (CRFs) are used for diverse tasks, ranging from image denoising to object recognition. For images, they are commonly defined as a graph with nodes corresponding to individual pixels and pairwise links that connect nodes to their immediate neighbors. Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. In this paper, we generalize the pairwise terms to a non-linear dissimilarity measure that is not required to be a distance metric. To this end, we propose a density estimation technique to derive conditional pairwise potentials in a nonparametric manner. We then use an efficient embedding technique to estimate an approximate Euclidean feature space for these potentials, in which the pairwise term can still be expressed as a Gaussian kernel. We demonstrate that the use of non-parametric models for the pairwise interactions, conditioned on the input data, greatly increases expressive power whilst maintaining efficient inference.
Reference: text
sentIndex sentText sentNum sentScore
1 For images, they are commonly defined as a graph with nodes corresponding to individual pixels and pairwise links that connect nodes to their immediate neighbors. [sent-4, score-0.519]
2 Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. [sent-5, score-0.403]
3 In this paper, we generalize the pairwise terms to a non-linear dissimilarity measure that is not required to be a distance metric. [sent-6, score-0.689]
4 To this end, we propose a density estimation technique to derive conditional pairwise potentials in a nonparametric manner. [sent-7, score-0.813]
5 We then use an efficient embedding technique to estimate an approximate Euclidean feature space for these potentials, in which the pairwise term can still be expressed as a Gaussian kernel. [sent-8, score-0.505]
6 We demonstrate that the use of non-parametric models for the pairwise interactions, conditioned on the input data, greatly increases expressive power whilst maintaining efficient inference. [sent-9, score-0.556]
7 The basic model consists of the combination of a set of unary terms, defined for each node individually, and a set of pairwise terms, defined as a function of two nodes that share an edge. [sent-15, score-0.489]
8 The pairwise interactions impose a smoothness cost on the final labeling. [sent-20, score-0.285]
9 Whilst good results have been obtained using only neighboring pairwise terms, they may only be used to express a limited range of priors. [sent-22, score-0.332]
10 To overcome this limitation, recent work has produced a number of approximate inference techniques making use of cross bilateral filtering. [sent-30, score-0.32]
11 In particular, the work of Krähenbühl and Koltun [10] proposed a method for performing inference in a fully-connected pairwise CRF (every node is connected to every other node) by taking a mean-field approximation to the original CRF. [sent-31, score-0.528]
12 Here, the message passing is performed as a Gaussian bilateral filtering process under the limitation that the pairwise potentials be expressed as a weighted sum of Gaussian kernels over a Euclidean feature space. [sent-32, score-0.932]
13 However, thus far, the applications have been limited by the requirement that the pairwise terms consist of a weighted sum of Gaussian kernels over a Euclidean feature space. [sent-35, score-0.406]
14 In this work we generalize the pairwise potential from a simple parametric model to a conditional non-parametric model that is learnt from training data. [sent-38, score-0.705]
15 Our learning approach is to approximate directly the conditional joint probability distributions (from the training data) in a straightforward density estimation process. [sent-39, score-0.349]
16 This probability model may be expressed as an image-specific (evaluated at test time), sparsely sampled dissimilarity measure. [sent-40, score-0.4]
17 The pairwise terms may then be expressed as Gaussian kernels in this new feature space and thus the inference procedure of [10] may proceed unaltered. [sent-42, score-0.517]
18 This allows us to generalize the pairwise terms to a general, non-linear dissimilarity measure that is not required to be a distance metric. [sent-43, score-0.689]
19 In particular we show that the use of non-parametric models for the pairwise interactions greatly increases the expressive power whilst maintaining the efficient inference of [10]. [sent-44, score-0.587]
20 In particular there has been some work on approximating more complex pairwise terms with [16] learning the parameters of a non-zero mean Gaussian mixture model in the bilateral space and [11] approximating a truncated penalty function as a mixture of exponentials. [sent-47, score-0.511]
21 The subsequent work of [17] provides a method for extending the filter based inference algorithm for models that include potentials defined over certain types of higher-order cliques. [sent-52, score-0.364]
22 Recent work has investigated extensions to pairwise CRFs under alternative inference methods, in particular the works of Nowozin et al. [sent-54, score-0.402]
23 We would refer the reader to the references contained in [6, 13] for further details of research into parameter estimation in CRF models with parametric pairwise terms. [sent-60, score-0.329]
24 Two standard non-Euclidean distance metrics over patches are used (χ2 and Earth Mover’s Distance) and an embedding into a Euclidean feature space is employed to incorporate them into the dense CRF framework. [sent-62, score-0.196]
25 In contrast, we propose to generalize away from a data-driven heuristic dissimilarity measure, instead incorporating non-parametric dissimilarities learnt from training data. [sent-63, score-0.493]
26 However, if the pairwise terms in the Gibbs energy are expressed as $\phi_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_m w^{(m)} k^{(m)}(f_i^{(m)}, f_j^{(m)})$ [sent-80, score-0.363]
27 (3) is a Gaussian kernel with precision Λm in some feature space F(m), for the mth kernel, then the message passing step consists of a low-pass filtering operation under a Gaussian kernel for which efficient approximations exist, e. [sent-91, score-0.299]
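To make the link between message passing and Gaussian filtering concrete, the following is a minimal brute-force sketch of mean-field updates in a fully-connected CRF. It is not the authors' implementation: the function name mean_field, the single kernel with weight w and isotropic bandwidth sigma, and the Potts choice for the compatibility μ are all assumptions made for illustration. The dense O(N²) kernel matrix built here is exactly the cost that the efficient filtering approximations of [10] are designed to avoid.

```python
import numpy as np

def mean_field(unary, feats, w=1.0, sigma=1.0, iters=10):
    """Naive mean-field inference with one Gaussian kernel (illustrative only).

    unary: (N, L) array of unary energies (lower is better).
    feats: (N, D) array of feature vectors f_i.
    Returns an (N, L) array of approximate marginals Q.
    """
    # Dense Gaussian kernel k(f_i, f_j); quadratic in N, for illustration.
    sq = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(K, 0.0)  # a node sends no message to itself

    def softmax(e):
        e = e - e.max(axis=1, keepdims=True)
        q = np.exp(e)
        return q / q.sum(axis=1, keepdims=True)

    Q = softmax(-unary)
    for _ in range(iters):
        msg = K @ Q  # low-pass filtering of the current marginals
        # Assumed Potts compatibility: pay for kernel mass on other labels.
        pairwise = w * (msg.sum(axis=1, keepdims=True) - msg)
        Q = softmax(-unary - pairwise)
    return Q
```

With two confidently labeled nodes and one uninformative node nearby in feature space, the uninformative node is pulled towards its neighbors' label, which is the smoothing behavior the pairwise term is meant to provide.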
28 Non-Parametric Pairwise Potentials: In order to allow for more expressive pairwise potentials we would like to relax the restriction to Gaussian parametric models [10, 11, 16] and allow for more complex, non-parametric models that may be learnt from training data and conditioned on the input. [sent-103, score-0.934]
29 In this section we describe how we overcome the limitation that the pairwise potentials be expressed as a Gaussian kernel, as in (3). [sent-104, score-0.532]
30 Firstly, we present our desired pairwise potentials as density estimates (footnote 1: here we use $[x_i \neq x_j]$ as an inequality indicator function) [sent-106, score-0.675]
31 of the conditional pairwise probability (learnt from training data, conditioned on a test image). [sent-107, score-0.641]
32 We then express these probabilities as a dissimilarity measure between nodes in the CRF. [sent-108, score-0.502]
33 Finally, we use an efficient approximate embedding technique to find a set of feature spaces that encode the dissimilarity measure as the Euclidean distance and thus the desired pairwise potential under a Gaussian kernel in this space. [sent-109, score-0.875]
34 Pairwise Conditional Probabilities: The pairwise potentials in a CRF encode conditional probabilities between pairs of nodes. [sent-112, score-0.753]
35 We make this conditional for each node at test time by first [sent-114, score-0.275]
36 looking at the local area (an image patch si) around a particular node i in the test image I and then finding similar patches in the training images T. [sent-115, score-0.293]
37 For each label l in the label space L, we want to estimate the conditional probability Pl(xj = l | xi = l, I, T) for the nodes j around node i, i. [sent-116, score-0.233]
38 Density Estimation: Any density estimation or regression technique could be used to approximate these conditional probabilities; in particular, we make use of a non-parametric approach by referring to the training data directly and performing a kernel-based density estimate. [sent-119, score-0.513]
39 We take the mean of the indicator images for the label l from the training images that contain a patch similar to si. [sent-120, score-0.25]
40 In practice this corresponds to extracting much larger patches from the training indicator images for class l, that are centered on patches similar to si, and finding the mean. [sent-122, score-0.24]
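The lookup described above can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes binary images in which the image itself serves as the indicator image for the label, exact or near-exact patch matching by Hamming distance, and square patches. The name conditional_probability and the parameters r_small, r_large, and max_hamming are hypothetical.

```python
import numpy as np

def conditional_probability(query, train_indicators, r_small=1, r_large=3,
                            max_hamming=0):
    """Mean of the large indicator patches centered on matches to `query`.

    query: (2*r_small+1, 2*r_small+1) binary patch around node i.
    train_indicators: list of 2D binary indicator images for one label.
    Returns a (2*r_large+1, 2*r_large+1) array, or None if nothing matches.
    """
    size = 2 * r_large + 1
    acc = np.zeros((size, size))
    count = 0
    for img in train_indicators:
        h, w = img.shape
        # Only visit centers whose large patch lies fully inside the image.
        for y in range(r_large, h - r_large):
            for x in range(r_large, w - r_large):
                small = img[y - r_small:y + r_small + 1,
                            x - r_small:x + r_small + 1]
                # Hamming distance between the query and candidate patch.
                if np.count_nonzero(small != query) <= max_hamming:
                    acc += img[y - r_large:y + r_large + 1,
                               x - r_large:x + r_large + 1]
                    count += 1
    return acc / count if count else None
```

For a training image containing a single vertical stroke and a query patch that also shows a vertical stroke, the estimate is one along the central column and zero elsewhere, so the resulting potential would encourage the stroke to continue above and below the query location.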
41 We place a prior that dictates the range over which we are able to infer useful information in the pairwise potential by applying a Gaussian window of size σw in pixel distance gwin(i, j) = exp(…) [sent-123, score-0.458]
42 This procedure identifies local correlations in the training data that will then be encouraged to occur in the output by means of the pairwise potentials. [sent-132, score-0.355]
43 For example, if a particular image patch always has label l above it in the training data then the indicator images will always be set to one above this patch. [sent-133, score-0.25]
44 This indicates that the pairwise term should have strong connections to the pixels above for class l. [sent-135, score-0.332]
45 Probabilities to Feature Spaces: We now have a method for determining the non-local pairwise potentials around node i for image I. [sent-138, score-0.607]
46 In order to be able to use these potentials to perform the ... [sent-139, score-0.247]
47 an embedding in a feature space) where the distance between the feature vectors of each node under the Gaussian kernel is equal to the conditional probability densities. [sent-143, score-0.418]
48 We can achieve this, in a similar fashion to [14], by creating an appropriate dissimilarity measure, based on the conditional probabilities, and finding an embedding such that the Euclidean distance in the embedded space matches this dissimilarity measure. [sent-144, score-0.977]
49 Dissimilarity Measure: If we denote the dissimilarity measure as d(i, j, I, T) then we may express our pairwise term as having the form $\phi_{ij}(x_i, x_j) = [x_i \neq x_j] \exp(-d(i, j, I, T))$. [sent-145, score-0.652]
50 (7) where this distance between landmark location i and varying j, under label l, is the conditional distribution of the label l given the training data T and the test image I. [sent-149, score-0.484]
51 Feature Space Embedding: The dissimilarity measures obtained for each label may now be embedded into a feature space F(l) to provide a set of feature vectors {fi(l)} for each node i and label l such that ... [sent-150, score-0.528]
52 Thus we have generalized the constraints on the pairwise potentials to the requirement that they be expressed as (5) where the functions dl(i, j, I, T) are dissimilarity measures which must satisfy dl(i, i, I, T) = 0. [sent-162, score-1.013]
53 Approximate Euclidean Embedding: We make use of the Landmark version of the Multidimensional Scaling (MDS) algorithm [4] to compute the feature vectors {fi(l)} from the dissimilarity measures as an embedding in p-dimensional Euclidean space R^p. [sent-167, score-0.462]
54 The landmark variant (LMDS) has the advantage over classical MDS of removing the need to store a complete pairwise dissimilarity matrix D(l)ij = dl(i, j, I, T) that would have a storage complexity of O(N²), where N = |P| is the number of pixels in the test image I. [sent-168, score-0.855]
55 Instead, the Nyström approximation of D(l)ij is used and allows us, under reasonable sampling conditions, to provide only a subset of the dissimilarity matrix. [sent-169, score-0.32]
56 The remaining points have their positions triangulated from these landmark points, requiring the dissimilarities between the landmarks and the other points. [sent-172, score-0.216]
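The two stages described above (classical MDS on the landmarks, then triangulation of the remaining points) can be sketched with NumPy as below. This follows the standard Landmark MDS recipe rather than the authors' code; the function name landmark_mds and its argument layout are assumptions, and the dissimilarities are squared as in classical MDS, which may differ from how the paper maps its dissimilarity measure onto the Gaussian kernel.

```python
import numpy as np

def landmark_mds(D_land, D_rest, dim):
    """Landmark MDS embedding (standard recipe, illustrative sketch).

    D_land: (n, n) dissimilarities between the n landmarks.
    D_rest: (n, N) dissimilarities from each landmark to all N points.
    Returns an (N, dim) array of feature vectors.
    """
    n = D_land.shape[0]
    sq_land = D_land ** 2
    # Classical MDS on the landmarks: double-center the squared distances.
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ sq_land @ H
    evals, evecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:dim]   # indices of the largest eigenvalues
    lam = evals[top]
    V = evecs[:, top]
    # Non-metric input can give non-positive eigenvalues; zero those axes.
    inv_sqrt = np.zeros_like(lam)
    positive = lam > 1e-9
    inv_sqrt[positive] = 1.0 / np.sqrt(lam[positive])
    pinv = V * inv_sqrt                   # (n, dim) triangulation operator
    # Triangulate every point from its squared distances to the landmarks.
    mean_sq = sq_land.mean(axis=0)
    return -0.5 * (D_rest ** 2 - mean_sq[:, None]).T @ pinv
```

Only the n × n landmark block and the n × N landmark-to-point block are ever formed, so the storage grows linearly in N for a fixed number of landmarks instead of quadratically.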
57 In addition to the masked test image we are also provided with a training database of images containing similar statistical properties to our test data. [sent-180, score-0.312]
58 Consider a single sample location for the foreground label (orange); we want to compute the dissimilarity to all other pixels. [sent-181, score-0.434]
59 We look at the local patch (conditional region) around the pixel, and find all the patches in our training database that match with a low Hamming distance. [sent-182, score-0.232]
60 The dissimilarities to the pixels in the wider region, determined by the window size σw, around our input patch should have the same label distribution as the regions around the training patches. [sent-183, score-0.507]
61 Therefore, we take the mean of the set of larger [figure caption fragment: the local neighborhood of a random sample is used to condition a lookup into the training data to provide a non-parametric estimate of the potential dissimilarity measure to the wider image] [sent-184, score-0.527]
62 patches from the training indicator images, for the appropriate label, as our conditional probability. [sent-185, score-0.344]
63 We then multiply by the Gaussian window function and take the negative log to obtain the required dissimilarity (7). [sent-186, score-0.403]
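The final step, turning a windowed probability into a dissimilarity and then into a pairwise potential, can be sketched in a few lines. The exact exponent of the window is not recoverable from the text here, so the isotropic form exp(-||pi - pj||² / (2 σw²)) used below is an assumption, as are the function names and the small eps that guards against taking the log of zero. The bracket in the pairwise term is read as an inequality indicator, as the footnote describes.

```python
import math

def dissimilarity(prob, pi, pj, sigma_w, eps=1e-6):
    """d = -log(g_win(i, j) * P) with an assumed isotropic Gaussian window.

    prob: estimated conditional probability for node j given node i.
    pi, pj: (row, col) pixel positions of nodes i and j.
    """
    dist_sq = (pi[0] - pj[0]) ** 2 + (pi[1] - pj[1]) ** 2
    g_win = math.exp(-dist_sq / (2.0 * sigma_w ** 2))  # assumed window form
    return -math.log(max(g_win * prob, eps))           # eps avoids log(0)

def pairwise_potential(xi, xj, d):
    """phi_ij = [xi != xj] * exp(-d): zero whenever the labels agree."""
    return math.exp(-d) if xi != xj else 0.0
```

Under these assumptions a pair at the same location with probability 0.5 has d = log 2, and for a fixed probability the dissimilarity grows with pixel distance, which is how the window limits the range of the potential.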
64 Experiments: Since our contribution is in the use of pairwise potentials, a direct and unbiased evaluation of our work is best obtained by removing the dependence of the results on any unary terms in the CRF. [sent-209, score-0.382]
65 Here, we have a task where no unary is applicable in the occluded region and we must use expressive pairwise ... [table residue: columns Method, Accuracy; first row Random Forest [13], 67.] [sent-211, score-0.505]
66 We provide results for the accuracy (as the percentage of pixels correctly labeled) for filling in the masked regions on unseen test images after training on a separate training set. [sent-218, score-0.331]
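The accuracy measure described above (percentage of pixels correctly labeled within the masked regions) can be computed as in the short sketch below. The function name masked_accuracy and the nested-list image layout are assumptions for illustration only.

```python
def masked_accuracy(pred, truth, mask):
    """Percentage of masked pixels whose predicted label matches the truth.

    pred, truth: 2D nested lists of labels with the same shape.
    mask: 2D nested list, truthy where the pixel was occluded.
    """
    total = 0
    correct = 0
    for p_row, t_row, m_row in zip(pred, truth, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:  # only pixels inside the masked region are scored
                total += 1
                correct += int(p == t)
    return 100.0 * correct / total if total else float("nan")
```

Pixels outside the mask are ignored, so clamping the unary term to the ground truth outside the occluded region cannot inflate the score.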
67 We adopt the same methodology as [13, 6] splitting the input data in a 2: 1 training to test ratio with the dimensions of the masked regions drawn from [5 . [sent-219, score-0.214]
68 The unary term is clamped to the ground truth outside the occluded region and to an uninformative uniform distribution within the occluded region. [sent-247, score-0.27]
69 Table 1 gives the quantitative results for the small oc... [Figure 3 caption:] A sample of the dissimilarity measures used for a KAIST Hanja2 example. [sent-248, score-0.32]
70 For each of the 3 × 3 landmark patches (blue) we find similar patches in the training data and then estimate the density of the appropriate class (foreground or background) centered on the patch. [sent-250, score-0.259]
71 The raw probability values are shown; we apply a Gaussian window and take the negative log to obtain the dissimilarity samples for embedding. [sent-252, score-0.403]
72 Instead, filling in plausible structure is indicative of good performance showing that the model has captured the underlying statistics of the training data and exploited the conditional dependence on the input. [sent-260, score-0.281]
73 3 we provide an example of the conditional, nonparametric pairwise potentials used for the KAIST Hanja2 database. [sent-263, score-0.574]
74 The dissimilarity measures for the foreground and background classes are observed to vary based upon the local region around the sampling locations and we can see the structure, learnt from the training data, that is promoted by the potentials. [sent-264, score-0.497]
75 We compared our non-parametric model to the cross bilateral model of [10] and an input-agnostic dissimilarity used by [14]. [sent-276, score-0.483]
76 The results obtained for [10] and [14] require the original color image in order to calculate the pairwise terms; our method makes no use of the color images during training or testing. [sent-277, score-0.355]
77 Both methods in italics (marked with an asterisk), the cross bilateral model (Gaussian kernel in color space) of [10] and the χ2 patch dissimilarity model of [14], needed the original color image to evaluate the pairwise terms over the occluded regions. [sent-288, score-0.965]
78 All results were obtained with a window size of σw = 13 pixels; the increase over the σw = 7 pixels for the Chinese characters is indicative of the differing scales of the foreground objects. [sent-293, score-0.305]
79 Discussion Timing: We compute the embedding using 80 landmark samples and 10 dimensions in around 1s. [sent-295, score-0.224]
80 The data lookup for the non-parametric, conditional potentials is less predictable. [sent-300, score-0.405]
81 Short- and Long-Range Interactions: In our experiments we used the model of (9) with uninformative unary terms over the occluded region. [sent-303, score-0.248]
82 At a window range of 2 pixels we are approaching the performance of traditional MRF and CRF models with local neighbors. [sent-306, score-0.191]
83 This is to be expected since the local conditioning of the potentials is no longer valid over large distances; in addition, at 35 pixels we are approaching the size of the characters [plot residue: Accuracy (% pixels correctly labeled) against Std.] [sent-309, score-0.429]
84 We observe a noticeable decrease in performance for small window sizes (approaching the standard 4-connected CRF) demonstrating the advantage of having a non-local pairwise potential. [sent-311, score-0.368]
85 There is also a drop-off in performance with large window sizes, suggesting that very long-range potentials act as a hindrance. [sent-312, score-0.33]
86 We observe that the weighting w of the pairwise term has relatively little impact but there is a definite optimal value for the σf term. [sent-315, score-0.285]
87 The weight w is of little importance since we have an uninformative unary term in the occluded region. [sent-318, score-0.208]
88 The σf plays a greater role due to the windowing process applied to the pairwise potentials. [sent-319, score-0.336]
89 3 leads to uninformative tails (at large distances) for all classes and the embedded approximation of the dissimilarity measure will be less accurate in these regions. [sent-321, score-0.487]
90 Changing the σf parameter to match the window helps provide a sharper drop-off in the edge potentials outside the windowed region and leads to an improved accuracy. [sent-322, score-0.33]
91 Limitations: Whilst our approach provides state-of-the-art performance and confers many benefits in the expressive power of the non-local and conditional potentials. [sent-328, score-0.375]
92 Also, we are currently neglecting cross terms: density estimation between different labels could also be performed and encoded into the update filtering at each iteration. [sent-336, score-0.243]
93 In particular there are many options for estimating the conditional potential distances for a wider variety of multidimensional complex data and to improve scaling with larger training datasets. [sent-339, score-0.328]
94 Conclusions: We have demonstrated how to condition expressive, non-local pairwise potentials on input data. [sent-341, score-0.532]
95 This embedding of the pixels in feature space leads to an efficient mean-field inference in a fully-connected CRF model, yet with a generalized underlying dissimilarity measure. [sent-343, score-0.626]
96 Our method confers state-of-the-art performance when compared to recent approaches that perform inference on similar models. [sent-344, score-0.233]
97 Efficient inference in fully connected CRFs with Gaussian edge potentials. [sent-427, score-0.199]
98 Improved initialization and Gaussian mixture pairwise terms for dense random fields with mean-field inference. [sent-486, score-0.449]
99 Filter-based mean-field inference for random fields with higher-order terms and product label-spaces. [sent-497, score-0.199]
100 Comparing the mean field method and belief propagation for approximate inference in MRFs. [sent-504, score-0.201]
wordName wordTfidf (topN-words)
[('dissimilarity', 0.32), ('pairwise', 0.285), ('potentials', 0.247), ('kaist', 0.231), ('lmds', 0.173), ('conditional', 0.158), ('crf', 0.15), ('embedding', 0.142), ('dissimilarities', 0.134), ('inference', 0.117), ('confers', 0.116), ('bilateral', 0.112), ('dtf', 0.103), ('masked', 0.102), ('expressive', 0.101), ('rtf', 0.095), ('xj', 0.092), ('mrf', 0.092), ('crfs', 0.09), ('uninformative', 0.089), ('gwin', 0.087), ('subr', 0.087), ('conditioned', 0.086), ('message', 0.085), ('whilst', 0.084), ('window', 0.083), ('gaussian', 0.082), ('landmark', 0.082), ('density', 0.081), ('dl', 0.079), ('node', 0.075), ('xi', 0.075), ('characters', 0.074), ('ahenb', 0.072), ('nodes', 0.072), ('vineet', 0.071), ('filtering', 0.071), ('training', 0.07), ('weizmann', 0.069), ('fi', 0.068), ('label', 0.066), ('mds', 0.064), ('probabilities', 0.063), ('occluded', 0.062), ('indicator', 0.062), ('approaching', 0.061), ('nowozin', 0.061), ('potts', 0.061), ('horse', 0.061), ('euclidean', 0.06), ('learnt', 0.059), ('gmrf', 0.058), ('kartic', 0.058), ('neill', 0.058), ('unary', 0.057), ('passing', 0.057), ('database', 0.056), ('wider', 0.055), ('tree', 0.054), ('patches', 0.054), ('indicative', 0.053), ('patch', 0.052), ('hl', 0.052), ('koltun', 0.051), ('meanfield', 0.051), ('windowing', 0.051), ('mpm', 0.051), ('fj', 0.051), ('cross', 0.051), ('foreground', 0.048), ('kr', 0.048), ('pixels', 0.047), ('campbell', 0.047), ('express', 0.047), ('potential', 0.045), ('dictates', 0.045), ('generalize', 0.044), ('parametric', 0.044), ('field', 0.044), ('requirement', 0.044), ('immediate', 0.043), ('london', 0.043), ('warrell', 0.043), ('kernel', 0.043), ('test', 0.042), ('nonparametric', 0.042), ('fields', 0.042), ('tails', 0.041), ('kautz', 0.041), ('terms', 0.04), ('approximate', 0.04), ('regression', 0.04), ('jancsary', 0.039), ('expressed', 0.038), ('decision', 0.038), ('neighborhood', 0.037), ('embedded', 0.037), ('kernels', 0.037), ('approximating', 0.037), ('college', 0.036)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999869 180 cvpr-2013-Fully-Connected CRFs with Non-Parametric Pairwise Potential
2 0.27261209 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters
Author: Matthieu Guillaumin, Luc Van_Gool, Vittorio Ferrari
Abstract: Pairwise discrete energies defined over graphs are ubiquitous in computer vision. Many algorithms have been proposed to minimize such energies, often concentrating on sparse graph topologies or specialized classes of pairwise potentials. However, when the graph is fully connected and the pairwise potentials are arbitrary, the complexity of even approximate minimization algorithms such as TRW-S grows quadratically both in the number of nodes and in the number of states a node can take. Moreover, recent applications are using more and more computationally expensive pairwise potentials. These factors make it very hard to employ fully connected models. In this paper we propose a novel, generic algorithm to approximately minimize any discrete pairwise energy function. Our method exploits tractable sub-energies to filter the domain of the function. The parameters of the filter are learnt from instances of the same class of energies with good candidate solutions. Compared to existing methods, it efficiently handles fully connected graphs, with many states per node, and arbitrary pairwise potentials, which might be expensive to compute. We demonstrate experimentally on two applications that our algorithm is much more efficient than other generic minimization algorithms such as TRW-S, while returning essentially identical solutions.
3 0.22340326 156 cvpr-2013-Exploring Compositional High Order Pattern Potentials for Structured Output Learning
Author: Yujia Li, Daniel Tarlow, Richard Zemel
Abstract: When modeling structured outputs such as image segmentations, prediction can be improved by accurately modeling structure present in the labels. A key challenge is developing tractable models that are able to capture complex high level structure like shape. In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs). We show that CHOPPs include the linear deviation pattern potentials of Rother et al. [26] and also Restricted Boltzmann Machines (RBMs); we also establish the near equivalence of these two models. Experimentally, we show that performance is affected significantly by the degree of variability present in the datasets, and we define a quantitative variability measure to aid in studying this. We then improve CHOPPs performance in high variability datasets with two primary contributions: (a) developing a loss-sensitive joint learning procedure, so that internal pattern parameters can be learned in conjunction with other model potentials to minimize expected loss;and (b) learning an image-dependent mapping that encourages or inhibits patterns depending on image features. We also explore varying how multiple patterns are composed, and learning convolutional patterns. Quantitative results on challenging highly variable datasets show that the joint learning and image-dependent high order potentials can improve performance.
4 0.21916112 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
Author: Roozbeh Mottaghi, Sanja Fidler, Jian Yao, Raquel Urtasun, Devi Parikh
Abstract: Recent trends in semantic image segmentation have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning. In this work, we are interested in understanding the roles of these different tasks in aiding semantic segmentation. Towards this goal, we “plug-in ” human subjects for each of the various components in a state-of-the-art conditional random field model (CRF) on the MSRC dataset. Comparisons among various hybrid human-machine CRFs give us indications of how much “head room ” there is to improve segmentation by focusing research efforts on each of the tasks. One of the interesting findings from our slew of studies was that human classification of isolated super-pixels, while being worse than current machine classifiers, provides a significant boost in performance when plugged into the CRF! Fascinated by this finding, we conducted in depth analysis of the human generated potentials. This inspired a new machine potential which significantly improves state-of-the-art performance on the MRSC dataset.
5 0.21432613 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation
Author: Pushmeet Kohli, Anton Osokin, Stefanie Jegelka
Abstract: We discuss a model for image segmentation that is able to overcome the short-boundary bias observed in standard pairwise random field based approaches. To wit, we show that a random field with multi-layered hidden units can encode boundary preserving higher order potentials such as the ones used in the cooperative cuts model of [11] while still allowing for fast and exact MAP inference. Exact inference allows our model to outperform previous image segmentation methods, and to see the true effect of coupling graph edges. Finally, our model can be easily extended to handle segmentation instances with multiple labels, for which it yields promising results.
6 0.13067779 121 cvpr-2013-Detection- and Trajectory-Level Exclusion in Multiple Object Tracking
7 0.12941945 131 cvpr-2013-Discriminative Non-blind Deblurring
8 0.12337229 340 cvpr-2013-Probabilistic Label Trees for Efficient Large Scale Image Classification
9 0.12293229 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
10 0.12151688 335 cvpr-2013-Poselet Conditioned Pictorial Structures
11 0.11936578 10 cvpr-2013-A Fully-Connected Layered Model of Foreground and Background Flow
12 0.11648107 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction
13 0.11575573 262 cvpr-2013-Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets
14 0.11235045 406 cvpr-2013-Spatial Inference Machines
15 0.10812848 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow
16 0.10619715 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation
17 0.10522189 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors
18 0.10346145 186 cvpr-2013-GeoF: Geodesic Forests for Learning Coupled Predictors
19 0.10304252 50 cvpr-2013-Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling
20 0.099264555 20 cvpr-2013-A New Model and Simple Algorithms for Multi-label Mumford-Shah Problems
topicId topicWeight
[(0, 0.267), (1, 0.022), (2, 0.017), (3, 0.012), (4, 0.13), (5, 0.06), (6, 0.049), (7, 0.041), (8, -0.093), (9, -0.059), (10, 0.104), (11, 0.048), (12, -0.098), (13, 0.068), (14, -0.073), (15, 0.134), (16, -0.053), (17, 0.009), (18, 0.154), (19, -0.103), (20, -0.001), (21, -0.009), (22, -0.029), (23, 0.054), (24, 0.043), (25, -0.093), (26, -0.024), (27, 0.117), (28, -0.083), (29, -0.04), (30, -0.035), (31, -0.085), (32, 0.1), (33, 0.046), (34, -0.013), (35, 0.028), (36, 0.005), (37, -0.079), (38, -0.04), (39, 0.082), (40, -0.037), (41, 0.084), (42, 0.069), (43, 0.014), (44, 0.004), (45, 0.001), (46, -0.081), (47, 0.026), (48, -0.034), (49, -0.058)]
simIndex simValue paperId paperTitle
same-paper 1 0.95424533 180 cvpr-2013-Fully-Connected CRFs with Non-Parametric Pairwise Potential
2 0.88900352 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation
Author: Pushmeet Kohli, Anton Osokin, Stefanie Jegelka
Abstract: We discuss a model for image segmentation that is able to overcome the short-boundary bias observed in standard pairwise random field based approaches. To wit, we show that a random field with multi-layered hidden units can encode boundary preserving higher order potentials such as the ones used in the cooperative cuts model of [11] while still allowing for fast and exact MAP inference. Exact inference allows our model to outperform previous image segmentation methods, and to see the true effect of coupling graph edges. Finally, our model can be easily extended to handle segmentation instances with multiple labels, for which it yields promising results.
3 0.8664428 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters
Author: Matthieu Guillaumin, Luc Van Gool, Vittorio Ferrari
Abstract: Pairwise discrete energies defined over graphs are ubiquitous in computer vision. Many algorithms have been proposed to minimize such energies, often concentrating on sparse graph topologies or specialized classes of pairwise potentials. However, when the graph is fully connected and the pairwise potentials are arbitrary, the complexity of even approximate minimization algorithms such as TRW-S grows quadratically both in the number of nodes and in the number of states a node can take. Moreover, recent applications are using more and more computationally expensive pairwise potentials. These factors make it very hard to employ fully connected models. In this paper we propose a novel, generic algorithm to approximately minimize any discrete pairwise energy function. Our method exploits tractable sub-energies to filter the domain of the function. The parameters of the filter are learnt from instances of the same class of energies with good candidate solutions. Compared to existing methods, it efficiently handles fully connected graphs, with many states per node, and arbitrary pairwise potentials, which might be expensive to compute. We demonstrate experimentally on two applications that our algorithm is much more efficient than other generic minimization algorithms such as TRW-S, while returning essentially identical solutions.
4 0.74521846 156 cvpr-2013-Exploring Compositional High Order Pattern Potentials for Structured Output Learning
Author: Yujia Li, Daniel Tarlow, Richard Zemel
Abstract: When modeling structured outputs such as image segmentations, prediction can be improved by accurately modeling structure present in the labels. A key challenge is developing tractable models that are able to capture complex high level structure like shape. In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs). We show that CHOPPs include the linear deviation pattern potentials of Rother et al. [26] and also Restricted Boltzmann Machines (RBMs); we also establish the near equivalence of these two models. Experimentally, we show that performance is affected significantly by the degree of variability present in the datasets, and we define a quantitative variability measure to aid in studying this. We then improve CHOPPs performance in high variability datasets with two primary contributions: (a) developing a loss-sensitive joint learning procedure, so that internal pattern parameters can be learned in conjunction with other model potentials to minimize expected loss; and (b) learning an image-dependent mapping that encourages or inhibits patterns depending on image features. We also explore varying how multiple patterns are composed, and learning convolutional patterns. Quantitative results on challenging highly variable datasets show that the joint learning and image-dependent high order potentials can improve performance.
5 0.68192577 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow
Author: Brian Potetz, Mohammadreza Hajiarbabi
Abstract: For problems over continuous random variables, MRFs with large cliques pose a challenge in probabilistic inference. Difficulties in performing optimization efficiently have limited the probabilistic models explored in computer vision and other fields. One inference technique that handles large cliques well is Expectation Propagation. EP offers run times independent of clique size, which instead depend only on the rank, or intrinsic dimensionality, of potentials. This property would be highly advantageous in computer vision. Unfortunately, for grid-shaped models common in vision, traditional Gaussian EP requires quadratic space and cubic time in the number of pixels. Here, we propose a variation of EP that exploits regularities in natural scene statistics to achieve run times that are linear in both number of pixels and clique size. We test these methods on shape from shading, and we demonstrate strong performance not only for Lambertian surfaces, but also on arbitrary surface reflectance and lighting arrangements, which requires highly non-Gaussian potentials. Finally, we use large, non-local cliques to exploit cast shadow, which is traditionally ignored in shape from shading.
6 0.67129064 308 cvpr-2013-Nonlinearly Constrained MRFs: Exploring the Intrinsic Dimensions of Higher-Order Cliques
7 0.66768682 262 cvpr-2013-Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets
8 0.65673214 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
9 0.65027791 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction
11 0.63148308 6 cvpr-2013-A Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems
12 0.62266999 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
13 0.62187362 128 cvpr-2013-Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space
14 0.61214036 448 cvpr-2013-Universality of the Local Marginal Polytope
15 0.59883898 186 cvpr-2013-GeoF: Geodesic Forests for Learning Coupled Predictors
16 0.59765792 62 cvpr-2013-Bilinear Programming for Human Activity Recognition with Unknown MRF Graphs
17 0.58428872 406 cvpr-2013-Spatial Inference Machines
18 0.58379388 193 cvpr-2013-Graph Transduction Learning with Connectivity Constraints with Application to Multiple Foreground Cosegmentation
19 0.5540846 278 cvpr-2013-Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes
20 0.55388176 25 cvpr-2013-A Sentence Is Worth a Thousand Pixels
topicId topicWeight
[(10, 0.059), (26, 0.021), (33, 0.767), (67, 0.032), (69, 0.019), (87, 0.038)]
simIndex simValue paperId paperTitle
1 0.99983704 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification
Author: Amirreza Shaban, Hamid R. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad
Abstract: Bag of words models for feature extraction have demonstrated top-notch performance in image classification. These representations are usually accompanied by a coding method. Recently, methods that code a descriptor giving regard to its nearby bases have proved efficacious. These methods take into account the nonlinear structure of descriptors, since local similarities are a good approximation of global similarities. However, they confine their usage of the global similarities to nearby bases. In this paper, we propose a coding scheme that brings into focus the manifold structure of descriptors, and devise a method to compute the global similarities of descriptors to the bases. Given a local similarity measure between bases, a global measure is computed. Exploiting the local similarity of a descriptor and its nearby bases, a global measure of association of a descriptor to all the bases is computed. Unlike the locality-based and sparse coding methods, the proposed coding varies smoothly with respect to the underlying manifold. Experiments on benchmark image classification datasets substantiate the superiority of the proposed method over its locality and sparsity based rivals.
2 0.99968249 357 cvpr-2013-Revisiting Depth Layers from Occlusions
Author: Adarsh Kowdle, Andrew Gallagher, Tsuhan Chen
Abstract: In this work, we consider images of a scene with a moving object captured by a static camera. As the object (human or otherwise) moves about the scene, it reveals pairwise depth-ordering or occlusion cues. The goal of this work is to use these sparse occlusion cues along with monocular depth occlusion cues to densely segment the scene into depth layers. We cast the problem of depth-layer segmentation as a discrete labeling problem on a spatiotemporal Markov Random Field (MRF) that uses the motion occlusion cues along with monocular cues and a smooth motion prior for the moving object. We quantitatively show that the depth ordering produced by the proposed combination of the depth cues from object motion and monocular occlusion cues is superior to using either feature independently, and to using a naïve combination of the features.
same-paper 3 0.99963862 180 cvpr-2013-Fully-Connected CRFs with Non-Parametric Pairwise Potential
Author: Neill D.F. Campbell, Kartic Subr, Jan Kautz
Abstract: Conditional Random Fields (CRFs) are used for diverse tasks, ranging from image denoising to object recognition. For images, they are commonly defined as a graph with nodes corresponding to individual pixels and pairwise links that connect nodes to their immediate neighbors. Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. In this paper, we generalize the pairwise terms to a non-linear dissimilarity measure that is not required to be a distance metric. To this end, we propose a density estimation technique to derive conditional pairwise potentials in a nonparametric manner. We then use an efficient embedding technique to estimate an approximate Euclidean feature space for these potentials, in which the pairwise term can still be expressed as a Gaussian kernel. We demonstrate that the use of non-parametric models for the pairwise interactions, conditioned on the input data, greatly increases expressive power whilst maintaining efficient inference.
4 0.99952137 93 cvpr-2013-Constraints as Features
Author: Shmuel Asafi, Daniel Cohen-Or
Abstract: In this paper, we introduce a new approach to constrained clustering which treats the constraints as features. Our method augments the original feature space with additional dimensions, each of which is derived from a given Cannot-link constraint. The specified Cannot-link pair gets extreme coordinate values, and the rest of the points get coordinate values that express their spatial influence from the specified constrained pair. After augmenting all the new features, a standard unconstrained clustering algorithm can be performed, like k-means or spectral clustering. We demonstrate the efficacy of our method for active semi-supervised learning applied to image segmentation and compare it to alternative methods. We also evaluate the performance of our method on the four most commonly evaluated datasets from the UCI machine learning repository.
5 0.99948293 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
Author: Petr Gronát, Guillaume Obozinski, Josef Sivic, Tomáš Pajdla
Abstract: The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as only few positive training examples are available for each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work.
6 0.99942946 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification
7 0.99931949 55 cvpr-2013-Background Modeling Based on Bidirectional Analysis
8 0.99899936 137 cvpr-2013-Dynamic Scene Classification: Learning Motion Descriptors with Slow Features Analysis
9 0.99888676 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning
10 0.99799275 113 cvpr-2013-Dense Variational Reconstruction of Non-rigid Surfaces from Monocular Video
11 0.99778086 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters
12 0.99746221 59 cvpr-2013-Better Exploiting Motion for Better Action Recognition
13 0.99474305 301 cvpr-2013-Multi-target Tracking by Rank-1 Tensor Approximation
14 0.99469489 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop
15 0.98877305 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition
16 0.98808134 379 cvpr-2013-Scalable Sparse Subspace Clustering
17 0.98787099 266 cvpr-2013-Learning without Human Scores for Blind Image Quality Assessment
18 0.986139 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval
19 0.98582357 306 cvpr-2013-Non-rigid Structure from Motion with Diffusion Maps Prior
20 0.9852621 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes