cvpr cvpr2013 cvpr2013-86 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper we present an inference procedure for the semantic segmentation of images. [sent-10, score-0.254]
2 Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. [sent-11, score-0.621]
3 We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. [sent-12, score-0.554]
4 The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. [sent-13, score-0.708]
5 Introduction The goal of semantic segmentation is to detect objects from different categories and identify their spatial layout simultaneously. [sent-16, score-0.272]
6 The segments are then passed to classifiers or regressors that determine to which category they belong. [sent-21, score-0.316]
7 The existence of predictions for many mutually overlapping segments poses a new inference challenge for pixel labeling. [sent-23, score-0.342]
8 Standard inference approaches in a high-order (hierarchical) CRF model [14, 15] can model both pixel/superpixel and segment-level layers with pixel/superpixel nodes and segment nodes interconnected based on overlap and compositionality. [sent-24, score-0.453]
9 However, the interactions in these models are complex and involve different types of pairwise potentials (between pixels, between pixels and segments, and between segments), which limits the range of potential functions for which tractable approximate inference is feasible. [sent-25, score-0.328]
10 Other approaches search for configurations of non-overlapping segment hypotheses [9, 13] by using non-maxima suppression and maximum clique random field models [11]. [sent-27, score-0.254]
11 In such cases, segments often occlude and cut through each other and the initial mid-level proposals may not be entirely accurate. [sent-31, score-0.216]
12 Some approaches learn to classify superpixels using class predictions from all enclosing segments as input features [1]. [sent-35, score-0.345]
13 [Figure residue: segment proposals labeled “bike” and “person” with their prediction scores.] [sent-44, score-3.216]
14 The need for an efficient inference procedure given multiple object segmentation proposals. [sent-58, score-0.226]
15 Identifying the correct object layout from the overlapping segment predictions is a nontrivial task. [sent-59, score-0.343]
16 Simply performing non-maximum suppression would discard all the person segments, which have lower scores because they all overlap the first bike segment. [sent-60, score-0.325]
17 By combining thousands of pixels that span a large segment into one segment statistic, we transfer conflicting high-order terms into a number of one-dimensional distributions, hence avoiding difficult maximum a posteriori inference in models with cyclic dependencies. [sent-63, score-0.579]
18 Our main idea is to model the segments as computable composites of statistics on superpixels that do not spatially overlap. [sent-67, score-0.412]
19 By computable, we mean there exists a mathematical formula that can output segment statistics given values of the superpixel statistics. [sent-68, score-0.478]
20 Based on such a link, we can optimize the superpixel statistics by maximizing the composite likelihood (or posterior) of the predicted segment statistics under the modeled error distribution. [sent-69, score-1.15]
21 Intuitively, the configuration of superpixels that can explain most of the predicted segment statistics will emerge as the maximum likelihood solution, as shown in fig. [sent-70, score-0.712]
22 3 and encodes the dependency of the ground truth statistic on the segments and the superpixel statistics, as well as the dependency of the observations on predicted segment statistics and a noise source. [sent-73, score-0.81]
23 Our methodology consists of a training phase and an inference phase. [sent-74, score-0.196]
24 In the training phase, regressors are estimated to predict segment statistics. [sent-75, score-0.255]
25 (Best viewed in color) The goal of our inference can be intuitively thought of as finding the superpixel configuration which best explains most of the predicted segment statistics, here spatial overlap (with the chair object). [sent-113, score-0.813]
26 Instead of finding such a superpixel configuration using a search algorithm, we formulate it as a continuous maximum composite likelihood problem with a convex relaxation, where a near-optimal solution can be found via mathematical optimization. [sent-115, score-0.692]
27 Segment statistics are generated from superpixel statistics and the segments. [sent-120, score-0.329]
28 The observations are predicted segment statistics on each category. [sent-121, score-0.411]
29 They are the maximal segment statistic for all ground truth objects in the same category, perturbed with noise. [sent-122, score-0.302]
30 During inference, we first solve for the superpixel statistics θ, then output full object segmentations given θ. [sent-124, score-0.303]
31 Given a test image, the inference phase has three main stages: • Use the trained regressors to predict segment statistics. [sent-126, score-0.414]
32 • Maximize the composite likelihood to estimate superpixel statistics. [sent-127, score-0.633]
33 We generalize the composite likelihood methodology to handle statistic estimates instead of probabilistic estimates. [sent-131, score-0.579]
34 The E-step assigns mixture weights and the M-step maximizes the composite likelihood. [sent-135, score-0.344]
35 For the last stage, we exploit the structure in the superpixel statistics in order to propose an efficient, optimal search algorithm to find the best pixel labeling given the estimated superpixel statistics. [sent-137, score-0.49]
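To make the three-stage flow concrete, here is a minimal control-flow sketch in Python. The three callables are hypothetical placeholders for the trained overlap regressors, the composite-likelihood/EM estimation, and the final search; none of them is the authors' implementation.

```python
def run_inference(image, segments, predict_overlaps, estimate_theta, extract_masks):
    """Control flow of the three inference stages described above.
    All three callables are caller-supplied placeholders, not the paper's code."""
    V_hat = predict_overlaps(image, segments)   # stage 1: predicted segment statistics
    theta = estimate_theta(segments, V_hat)     # stage 2: maximize the composite likelihood
    return extract_masks(theta, segments)       # stage 3: best output segments given theta
```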
36 A maximum composite likelihood (MCL) approach [18, 20] drops the independence assumptions typical in maximum likelihood. [sent-144, score-0.492]
37 Given a vector β ≥ 0, the composite likelihood objective is cl(θ) = Σ_{i=1}^{n} Σ_{j=1}^{k} β_j log p_θ(X^(i)_{A_j} | X^(i)_{B_j}). (1) [sent-161, score-0.476]
38 When β has stochastic components, this is called stochastic composite likelihood (SCL) [6]. [sent-164, score-0.436]
39 MCL solves for θ by maximizing the composite likelihood (1). [sent-165, score-0.474]
40 We define the maximum composite f-likelihood problem as max_θ Σ_{i=1}^{n} Σ_{j=1}^{k} β_j log p_θ(f(X^(i), A_j, B_j)). (2) [sent-170, score-0.344]
41 This new MCL problem recovers the model parameters θ from the composite f-likelihood log p_θ(f(X^(i), A_j, B_j)) for all the random variables on multiple different subsets. [sent-173, score-0.316]
42 Then a fixed-length feature vector Zij can be extracted from these segments and the distribution of fij can be modeled as pθ(fij) = N(θ⊤Zij, σ²). [sent-178, score-0.248]
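A minimal sketch of how such a composite f-likelihood could be evaluated under the linear-Gaussian component model above; the array shapes, σ, and the exact parameterization are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def composite_log_likelihood(theta, Z, f, beta, sigma=0.1):
    """Weighted sum of component log-likelihoods as in eqs. (1)-(2), assuming
    each statistic f_ij is modeled as N(theta^T Z_ij, sigma^2).
    Shapes: Z is (n, k, d), f and beta are (n, k), theta is (d,)."""
    mean = Z @ theta                                    # predicted statistic for every (i, j)
    log_p = -0.5 * ((f - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
    return float(np.sum(beta * log_p))                  # beta_ij weights each component
```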
43 To do so, we need to estimate the number of objects in each category and assign the score of each segment belonging to a particular object. [sent-191, score-0.414]
44 A discussion on how to output final segmentations given the superpixel statistics estimated from MCL is deferred to sec. [sent-198, score-0.263]
45 The category-specific overlap of a segment Ai with a category Ck is defined by Vik0 = V(Ck, Ai) = max_{Fj∈Ck} |Fj ∩ Ai| / |Fj ∪ Ai|. (3) [sent-219, score-0.262]
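Eq. (3) translates directly to code on boolean masks; the sketch below assumes the segment proposals and the ground-truth objects are given as boolean numpy arrays of the image size.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection-over-union of two boolean masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 0.0

def category_overlap(segment_mask, category_objects):
    """Vik0 of eq. (3): the best overlap of proposal A_i with any
    ground-truth object F_j belonging to category C_k."""
    return max((iou(segment_mask, obj) for obj in category_objects), default=0.0)
```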
46 The true overlap Vik0 can be estimated by training one regressor for each category Ck (for details on possible training methods one can consult e.g. [sent-220, score-0.303]
47 Since this paper deals with inference, which is only required during testing, we assume that regressors are already obtained based on a separate training set and denote their estimates in the test image I as Vˆik. Given segments A1, A2, . . . [sent-223, score-0.237]
48 . . . , Am, we find multiple intersections by dividing the image I into superpixels S1, S2, . . . [sent-226, score-0.197]
49 . . . , Sn, so that ∀i ≠ j: Si ∩ Sj = ∅; ∀k: Ak = ∪i Sk(i) (every segment Ak is the union of some superpixels); and the number of superpixels is minimal. [sent-229, score-0.335]
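One possible way to build this minimal superpixel partition is to group pixels by their membership pattern across the segment masks; the sketch below assumes the proposals are stacked into a boolean array and is not necessarily the authors' exact procedure.

```python
import numpy as np

def superpixels_from_segments(segment_masks):
    """Pixels sharing the same membership pattern across all segment proposals
    form one superpixel, so every segment is a union of superpixels and the
    number of superpixels is minimal.  `segment_masks` has shape (m, H, W),
    boolean; returns an (H, W) array of superpixel labels."""
    m, H, W = segment_masks.shape
    codes = segment_masks.reshape(m, -1).T                 # per-pixel membership pattern, (H*W, m)
    _, labels = np.unique(codes, axis=0, return_inverse=True)
    return labels.reshape(H, W)
```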
50 We consider only segments that have non-negligible predicted overlap (over a loose threshold) with at least one category. [sent-231, score-0.279]
51 Therefore, in many cases, the superpixels have finer granularity inside objects of interest (fig. [sent-232, score-0.231]
52 The Probabilistic Model We use θkj to model the percentage of pixels within a superpixel Sk that belong to object Fj. [sent-236, score-0.237]
53 Then, the overlap between a segment Ai and Fj can be computed as Vij(θ) = |Fj ∩ Ai| / |Fj ∪ Ai| = (Σ_{Sk⊆Ai} |Sk| θkj) / (|Ai| + Σ_{Sk⊄Ai} |Sk| θkj). [sent-237, score-0.331]
54 The idea is that if one parameterizes the ground truth object with θ, then its overlap with each segment can be computed (fig. [sent-241, score-0.371]
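Given θ, the overlap of any segment with an object can therefore be evaluated from superpixel areas alone; the following sketch implements the closed form above, with array names chosen only for illustration.

```python
import numpy as np

def overlap_from_theta(theta_j, areas, in_segment):
    """Overlap V_ij(theta) between segment A_i and object F_j: theta_j[k] is
    the fraction of superpixel S_k covered by F_j, areas[k] its pixel count,
    and in_segment[k] (boolean) marks S_k ⊆ A_i."""
    intersection = np.sum(areas[in_segment] * theta_j[in_segment])
    union = np.sum(areas[in_segment]) + np.sum(areas[~in_segment] * theta_j[~in_segment])
    return intersection / union if union > 0 else 0.0
```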
55 If we know the number of objects in each category and their rough locations, this can be solved by assigning each segment to one of the objects in Ck, so that the likelihood is maximized. [sent-244, score-0.329]
56 The assignment likelihood takes the form Π_{i=1}^{m} Π_{j=1}^{r} p(Vˆij | Vij(θ))^{βij}, (6) where θ is an n × r matrix and βij = 1 if segment Ai has been assigned to object Fj and 0 otherwise. [sent-248, score-0.262]
57 Note that an assignment is performed within each category, hence a segment can be assigned to many objects, but at most 1 per category. [sent-249, score-0.222]
58 We assume that the estimated overlap is generated from the true overlap Vik plus noise. [sent-253, score-0.298]
59 Also, θ and V generate a Bernoulli random variable z, which determines whether the predicted overlap would be a false positive. [sent-274, score-0.321]
60 Motivated by these observations, we introduce an additional Bernoulli random variable zij for each predicted score (fig. [sent-275, score-0.48]
61 The outcome of zij indicates whether the prediction is a false positive. [sent-277, score-0.348]
62 This assumption is in line with our observations: if zij = 0, the prediction is a false positive and the true overlap Vij should be 0. [sent-288, score-0.497]
63 If zij = 1, then Vij should be centered around the predicted overlap. [sent-290, score-0.409]
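A hedged sketch of the resulting E-step weight: a truncated-Gaussian component around the model overlap for z = 1 and, as a stand-in, a flat false-positive component on [0, 1] for z = 0. The prior, σ, and the false-positive density are illustrative choices, not the distributions fitted in the paper.

```python
import math

def trunc_gauss_pdf(x, mu, sigma, lo=0.0, hi=1.0):
    """Density of N(mu, sigma^2) truncated to [lo, hi]."""
    if not (lo <= x <= hi):
        return 0.0
    z = (x - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    cdf = lambda t: 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    return pdf / max(cdf(hi) - cdf(lo), 1e-12)

def responsibility(V_hat, V_theta, prior_z=0.8, sigma=0.15, fp_density=1.0):
    """E-step weight E[z_ij]: probability that prediction V_hat is a true
    detection of an object with model overlap V_theta, mixing a
    truncated-Gaussian component (z = 1) with a flat false-positive
    component on [0, 1] (z = 0).  All constants are assumed values."""
    p_true = prior_z * trunc_gauss_pdf(V_hat, V_theta, sigma)
    p_false = (1.0 - prior_z) * fp_density
    return p_true / (p_true + p_false + 1e-12)
```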
64 For similar categories, e.g. cat and dog, or horse and cow, a segment often has significant predicted overlaps on multiple categories, but only one of them is correct (see our technical report [16] for an example). [sent-295, score-0.347]
65 In such cases, when we have evidence from θ−j that an object in another category might exist, the probability of zij = 1 is diminished by a factor α(Vˆij) (details in [16]). [sent-296, score-0.43]
66 Each different color represents a different superpixel (black identifies the largest one). [sent-300, score-0.197]
67 Histograms of true overlap given predicted overlap across the VOC validation set. [sent-309, score-0.43]
68 The 0 mass corresponds to misclassifications, where the object does not belong to the category, but the regressor erroneously outputs nonzero predicted overlaps. [sent-311, score-0.213]
69 Also note that with higher predicted overlap there is less chance for V = 0. [sent-312, score-0.281]
70 Formally, we would like to optimize the composite likelihood with latent variables Z = [zij]: max_{θ,Z} Σ_{i=1}^{m} Σ_{j=1}^{r} βij log p(Vˆij, zij | Vij(θ)). [sent-323, score-0.436]
71 It tends to preserve the shape of segments in the superpixel potentials and proved important for practical performance. [sent-350, score-0.375]
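The EM procedure for this latent-variable objective alternates an E-step that re-weights each assigned prediction and an M-step that re-estimates θ. The skeleton below shows only this control flow; `compute_V`, `e_step`, and `m_step` are caller-supplied placeholders, and the paper's convex-relaxed M-step is not reproduced.

```python
def em_superpixel_statistics(V_hat, beta, compute_V, e_step, m_step,
                             theta_init, n_iters=20):
    """EM control flow: `compute_V(theta)` returns the model overlaps
    V_ij(theta); `e_step(V_hat, V)` returns the posterior weights E[z_ij];
    `m_step(weights)` maximizes the weighted composite likelihood over theta."""
    theta = theta_init
    for _ in range(n_iters):
        V = compute_V(theta)           # current model overlaps
        W = beta * e_step(V_hat, V)    # E-step: mixture weights for the assigned pairs
        theta = m_step(W)              # M-step: re-estimate the superpixel statistics
    return theta
```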
72 Locating Multiple Objects within Each Category To locate multiple objects in one category and to separate the estimates for each object, we combine the above EM estimation with a hypothesis-testing framework that determines the number of objects in each category in a MAP setting. [sent-361, score-0.243]
73 Namely, we solve (2) for each category Ck independently, with an additional geometric prior on the number of objects rk: p(rk = j) = (1 − q)^j q, where q > 0 is a parameter. [sent-362, score-0.215]
74 In (12), the denominator represents the maximum likelihood from any configuration, and the numerator represents the likelihood of the best explanation of the predictions by any of the current j objects. [sent-375, score-0.268]
75 Then, each segment is assigned to the object Fj that maximizes E(zij) in the final Zrk. [sent-379, score-0.29]
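A simplified sketch of the object-count selection with the geometric prior quoted above: it enumerates candidate counts and keeps the MAP one, whereas the paper uses an incremental likelihood-ratio test (eq. (12)). `fit_with_r_objects`, `max_count`, and `q` are assumed placeholders.

```python
import math

def select_object_count(fit_with_r_objects, max_count=4, q=0.3):
    """Choose the number of objects r_k for one category under the
    geometric prior p(r_k = j) = (1 - q)^j q.  `fit_with_r_objects(r)` is a
    placeholder that runs the EM estimation with r objects and returns its
    maximized composite log-likelihood."""
    best_r, best_score = 1, -math.inf
    for r in range(1, max_count + 1):
        log_prior = r * math.log(1.0 - q) + math.log(q)
        score = fit_with_r_objects(r) + log_prior
        if score > best_score:
            best_r, best_score = r, score
    return best_r
```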
76 The joint inference on all categories is subsequently performed, by treating each object as a different category with separately assigned predictions. [sent-380, score-0.408]
77 The Full Procedure The full inference procedure involves two steps: • Determining the number of objects within each category by the within-class object separation routine in Sec. [sent-383, score-0.402]
78 • Performing joint inference by iterating (9) and (10) across all categories and objects. [sent-386, score-0.215]
79 Notice that we choose to perform the within-class object separation routine before the joint inference, because within each category the enumeration of object counts is tractable. [sent-387, score-0.304]
80 If one enumerates in the joint inference phase, then hypotheses like “1 object in c1, 2 objects in c2” need to be enumerated jointly across categories, which quickly becomes intractable. [sent-388, score-0.286]
81 Figure 7: Different θ computed for the 1-bicycle/2-bicycles and 1-person/2-persons hypotheses on the same set of predicted segment overlaps. [sent-389, score-0.358]
82 In contrast, even if the within-class object separation makes mistakes, the erroneous object hypotheses can still be suppressed during the joint inference. [sent-395, score-0.218]
83 In fig. 7 we show the result of running the within-class object separation routine on the segments in fig. [sent-397, score-0.249]
84 One can see that in both the bicycle and the person categories, two objects are generated instead of one. [sent-399, score-0.272]
85 Although both categories improve the likelihood by predicting 2 objects, the second bicycle object is erroneous whereas the second person object is correct. [sent-400, score-0.485]
86 After detecting two objects for each category and running joint inference with these 4 objects, the algorithm is able to correct that mistake, as shown in fig. [sent-401, score-0.315]
87 We propose an algorithm to produce optimal segments that maximize the overlap with ground truth, without the need to re-segment. [sent-410, score-0.337]
88 Not all superpixels with non-zero potentials are in the final mask, because adding some more would be suboptimal according to the procedure in Sec. [sent-424, score-0.201]
89 It is interesting to see that the first person has his right leg correctly cut through by the bicycle, a solution that was not available in any of the initial object segmentation proposals. [sent-426, score-0.242]
90 Suppose we have A with V(Fj, A) = V0; then the overlap can be increased if and only if we add a superpixel Sk to A with θkj/(1 − θkj) > V0, because (a + c)/(b + d) > a/b iff c/d > a/b. [sent-428, score-0.197]
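Using only this monotonicity criterion, the best mask for an object is a prefix of the superpixels sorted by decreasing θ, which the sketch below scans directly; whether this matches the paper's exact search procedure is an assumption.

```python
import numpy as np

def best_segment_from_theta(theta_j, areas):
    """Greedy construction of the output segment for object F_j from the
    per-superpixel coverage fractions theta_j and superpixel areas.
    Adding S_k helps iff theta_kj / (1 - theta_kj) exceeds the current
    overlap, so the best mask is a prefix of the theta-sorted order."""
    order = np.argsort(-theta_j)
    soft_object_area = float(np.sum(areas * theta_j))   # |F_j| implied by theta
    inter, union = 0.0, soft_object_area
    best_V, best_len = 0.0, 0
    for t, k in enumerate(order, start=1):
        inter += areas[k] * theta_j[k]                  # new intersection mass
        union += areas[k] * (1.0 - theta_j[k])          # mass added to the union
        V = inter / union if union > 0 else 0.0
        if V > best_V:
            best_V, best_len = V, t
    selected = np.zeros(theta_j.shape[0], dtype=bool)
    selected[order[:best_len]] = True
    return selected, best_V
```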
91 In case the optimal segments in multiple categories conflict on some superpixels, one can run a branch-and-bound search on all the conflicting superpixels to maximize the sum of overlaps on each object. [sent-431, score-0.522]
92 For each conflicting superpixel Sk, a quality function is defined by Qkj = max_A V(Fj, A) − max_{A: Sk∉A} V(Fj, A), (14) where we perform the search in a best-first manner, with the superpixel Sk for object Fj picked first if the pair has the best quality Qkj. [sent-432, score-0.499]
93 The overlap predictions used in our system are obtained by combining the regressors from [17] and [2], with linear weights learned on the trainval set. [sent-449, score-0.333]
94 Conclusion This paper proposes a composite statistical inference approach to semantic segmentation. [sent-462, score-0.547]
95 The composite likelihood methodology is generalized to model one-dimensional error distributions of statistical estimates. [sent-463, score-0.551]
96 Based on this generalization, superpixel-level inference is performed based on a set of mutually overlapping object segmentation proposals and their predicted overlaps with object categories. [sent-464, score-0.515]
97 The generative process underlying overlap prediction is modeled using a graphical model. [sent-465, score-0.213]
98 The last image, on the right, shows a typical failure case: segments covering part of one of the horses are strongly confused and assigned to ‘cow’. [sent-519, score-0.2]
99 An EM algorithm is proposed to solve the maximum composite likelihood inference in two steps: the number of objects in each category is first determined, then a joint optimization is performed for all objects across categories. [sent-588, score-0.827]
100 A note on composite likelihood inference and model selection. [sent-733, score-0.558]
wordName wordTfidf (topN-words)
[('composite', 0.316), ('zij', 0.277), ('vij', 0.271), ('bkie', 0.246), ('mcl', 0.222), ('superpixel', 0.197), ('segment', 0.182), ('superpixels', 0.153), ('ij', 0.151), ('overlap', 0.149), ('csi', 0.148), ('kj', 0.135), ('predicted', 0.132), ('lrk', 0.131), ('segments', 0.13), ('inference', 0.122), ('aj', 0.121), ('likelihood', 0.12), ('fj', 0.12), ('bicycle', 0.116), ('category', 0.113), ('person', 0.108), ('carreira', 0.101), ('bj', 0.098), ('sk', 0.088), ('jkj', 0.074), ('jsl', 0.074), ('voc', 0.073), ('regressors', 0.073), ('statistic', 0.072), ('score', 0.071), ('semantic', 0.068), ('bike', 0.068), ('em', 0.067), ('statistics', 0.066), ('conflicting', 0.065), ('segmentation', 0.064), ('computable', 0.063), ('predictions', 0.062), ('categories', 0.061), ('fij', 0.057), ('proposals', 0.056), ('rk', 0.054), ('ck', 0.053), ('maximize', 0.05), ('dillon', 0.049), ('ffjj', 0.049), ('jlogp', 0.049), ('lebanon', 0.049), ('pottics', 0.049), ('qkj', 0.049), ('rainval', 0.049), ('svrsegm', 0.049), ('zrk', 0.049), ('objects', 0.048), ('potentials', 0.048), ('ai', 0.046), ('routine', 0.045), ('bernoulli', 0.045), ('hypotheses', 0.044), ('intersections', 0.044), ('val', 0.043), ('regressor', 0.041), ('statistical', 0.041), ('object', 0.04), ('assigned', 0.04), ('false', 0.04), ('interacting', 0.039), ('maximizing', 0.038), ('phase', 0.037), ('distributions', 0.037), ('methodology', 0.037), ('bicycles', 0.036), ('relaxation', 0.036), ('estimates', 0.034), ('separation', 0.034), ('ak', 0.034), ('formula', 0.033), ('overlaps', 0.033), ('modeled', 0.033), ('georgia', 0.033), ('suppose', 0.032), ('truncated', 0.032), ('joint', 0.032), ('observations', 0.031), ('prediction', 0.031), ('configuration', 0.031), ('layout', 0.031), ('horses', 0.03), ('rain', 0.03), ('granularity', 0.03), ('cut', 0.03), ('kn', 0.03), ('optimal', 0.03), ('interactions', 0.028), ('distribution', 0.028), ('suppressed', 0.028), ('maximum', 0.028), ('overlapping', 0.028), ('maximizes', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.
2 0.19357614 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
Author: David Weiss, Ben Taskar
Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shapepriors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.
3 0.18346822 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends ” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
4 0.1810564 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation
Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys
Abstract: Both image segmentation and dense 3D modeling from images represent an intrinsically ill-posed problem. Strong regularizers are therefore required to constrain the solutions from being ’too noisy’. Unfortunately, these priors generally yield overly smooth reconstructions and/or segmentations in certain regions whereas they fail in other areas to constrain the solution sufficiently. In this paper we argue that image segmentation and dense 3D reconstruction contribute valuable information to each other’s task. As a consequence, we propose a rigorous mathematical framework to formulate and solve a joint segmentation and dense reconstruction problem. Image segmentations provide geometric cues about which surface orientations are more likely to appear at a certain location in space whereas a dense 3D reconstruction yields a suitable regularization for the segmentation problem by lifting the labeling from 2D images to 3D space. We show how appearance-based cues and 3D surface orientation priors can be learned from training data and subsequently used for class-specific regularization. Experimental results on several real data sets highlight the advantages of our joint formulation.
5 0.1598203 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
Author: Gautam Singh, Jana Kosecka
Abstract: This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing demonstrating the superiority of the method compared to the state of art approaches.
6 0.15789954 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
7 0.1532677 29 cvpr-2013-A Video Representation Using Temporal Superpixels
8 0.15002561 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
9 0.14385107 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
10 0.13537924 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
11 0.1252407 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation
12 0.12346673 187 cvpr-2013-Geometric Context from Videos
13 0.12166391 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation
14 0.11696747 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
15 0.1163488 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics
16 0.10895583 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
17 0.10806175 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models
18 0.10761395 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration
19 0.1073142 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction
20 0.10526214 406 cvpr-2013-Spatial Inference Machines
topicId topicWeight
[(0, 0.229), (1, -0.017), (2, 0.048), (3, -0.046), (4, 0.15), (5, 0.046), (6, 0.061), (7, 0.124), (8, -0.093), (9, 0.024), (10, 0.166), (11, -0.083), (12, -0.017), (13, 0.063), (14, -0.071), (15, 0.047), (16, 0.086), (17, 0.013), (18, -0.077), (19, 0.027), (20, 0.016), (21, 0.023), (22, -0.028), (23, 0.025), (24, 0.01), (25, -0.002), (26, -0.123), (27, -0.0), (28, -0.006), (29, -0.045), (30, 0.025), (31, -0.05), (32, -0.02), (33, -0.007), (34, 0.017), (35, -0.04), (36, -0.035), (37, 0.009), (38, 0.023), (39, -0.01), (40, 0.025), (41, 0.011), (42, 0.067), (43, 0.002), (44, -0.074), (45, 0.005), (46, 0.018), (47, -0.025), (48, -0.038), (49, -0.081)]
simIndex simValue paperId paperTitle
same-paper 1 0.94721645 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.
Author: Luming Zhang, Mingli Song, Zicheng Liu, Xiao Liu, Jiajun Bu, Chun Chen
Abstract: Weakly supervised image segmentation is a challenging problem in computer vision field. In this paper, we present a new weakly supervised image segmentation algorithm by learning the distribution of spatially structured superpixel sets from image-level labels. Specifically, we first extract graphlets from each image where a graphlet is a smallsized graph consisting of superpixels as its nodes and it encapsulates the spatial structure of those superpixels. Then, a manifold embedding algorithm is proposed to transform graphlets of different sizes into equal-length feature vectors. Thereafter, we use GMM to learn the distribution of the post-embedding graphlets. Finally, we propose a novel image segmentation algorithm, called graphlet cut, that leverages the learned graphlet distribution in measuring the homogeneity of a set of spatially structured superpixels. Experimental results show that the proposed approach outperforms state-of-the-art weakly supervised image segmentation methods, and its performance is comparable to those of the fully supervised segmentation models.
3 0.77036101 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
Author: David Weiss, Ben Taskar
Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shapepriors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.
4 0.76688045 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation
Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu
Abstract: In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions. The proposed approach is motivated from the observation that superpixels belonging to an object class usually exist across multiple images and hence can be gathered via the idea of clustering. In WSDC, spectral clustering is adopted to cluster the superpixels obtained from a set of over-segmented images. At the same time, a linear transformation between features and labels as a kind of discriminative clustering is learned to select the discriminative features among different classes. The both clustering outputs should be consistent as much as possible. Besides, weakly-supervised constraints from image-level labels are imposed to restrict the labeling of superpixels. Finally, the non-convex and non-smooth objective function are efficiently optimized using an iterative CCCP procedure. Extensive experiments conducted on MSRC andLabelMe datasets demonstrate the encouraging performance of our method in comparison with some state-of-the-arts.
5 0.76000357 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration
Author: Zhile Ren, Gregory Shakhnarovich
Abstract: We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. The stages of the cascade are trained sequentially, with asymetric loss to maximize boundary recall. On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures.
6 0.75863481 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction
7 0.73615938 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
8 0.72190118 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
9 0.71574503 29 cvpr-2013-A Video Representation Using Temporal Superpixels
10 0.71422762 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
11 0.70948482 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics
12 0.6991694 26 cvpr-2013-A Statistical Model for Recreational Trails in Aerial Images
13 0.68258673 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds
14 0.65333056 406 cvpr-2013-Spatial Inference Machines
15 0.65035278 262 cvpr-2013-Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets
16 0.62203354 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
17 0.61563355 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
18 0.61034876 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation
19 0.60460871 25 cvpr-2013-A Sentence Is Worth a Thousand Pixels
20 0.59272969 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
topicId topicWeight
[(10, 0.103), (16, 0.02), (26, 0.041), (33, 0.238), (67, 0.039), (69, 0.417), (87, 0.065)]
simIndex simValue paperId paperTitle
1 0.897888 1 cvpr-2013-3D-Based Reasoning with Blocks, Support, and Stability
Author: Zhaoyin Jia, Andrew Gallagher, Ashutosh Saxena, Tsuhan Chen
Abstract: 3D volumetric reasoning is important for truly understanding a scene. Humans are able to both segment each object in an image, and perceive a rich 3D interpretation of the scene, e.g., the space an object occupies, which objects support other objects, and which objects would, if moved, cause other objects to fall. We propose a new approach for parsing RGB-D images using 3D block units for volumetric reasoning. The algorithm fits image segments with 3D blocks, and iteratively evaluates the scene based on block interaction properties. We produce a 3D representation of the scene based on jointly optimizing over segmentations, block fitting, supporting relations, and object stability. Our algorithm incorporates the intuition that a good 3D representation of the scene is the one that fits the data well, and is a stable, self-supporting (i.e., one that does not topple) arrangement of objects. We experiment on several datasets including controlled and real indoor scenarios. Results show that our stability-reasoning framework improves RGB-D segmentation and scene volumetric representation.
2 0.87292701 172 cvpr-2013-Finding Group Interactions in Social Clutter
Author: Ruonan Li, Parker Porfilio, Todd Zickler
Abstract: We consider the problem of finding distinctive social interactions involving groups of agents embedded in larger social gatherings. Given a pre-defined gallery of short exemplar interaction videos, and a long input video of a large gathering (with approximately-tracked agents), we identify within the gathering small sub-groups of agents exhibiting social interactions that resemble those in the exemplars. The participants of each detected group interaction are localized in space; the extent of their interaction is localized in time; and when the gallery ofexemplars is annotated with group-interaction categories, each detected interaction is classified into one of the pre-defined categories. Our approach represents group behaviors by dichotomous collections of descriptors for (a) individual actions, and (b) pairwise interactions; and it includes efficient algorithms for optimally distinguishing participants from by-standers in every temporal unit and for temporally localizing the extent of the group interaction. Most importantly, the method is generic and can be applied whenever numerous interacting agents can be approximately tracked over time. We evaluate the approach using three different video collections, two that involve humans and one that involves mice.
3 0.86404204 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu
Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixelbased phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.
4 0.85803163 135 cvpr-2013-Discriminative Subspace Clustering
Author: Vasileios Zografos, Liam Ellis, Rudolf Mester
Abstract: We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.
same-paper 5 0.84667444 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
Author: Fuxin Li, Joao Carreira, Guy Lebanon, Cristian Sminchisescu
Abstract: In this paper we present an inference procedure for the semantic segmentation of images. Different from many CRF approaches that rely on dependencies modeled with unary and pairwise pixel or superpixel potentials, our method is entirely based on estimates of the overlap between each of a set of mid-level object segmentation proposals and the objects present in the image. We define continuous latent variables on superpixels obtained by multiple intersections of segments, then output the optimal segments from the inferred superpixel statistics. The algorithm is capable of recombining and refining initial mid-level proposals, as well as handling multiple interacting objects, even from the same class, all in a consistent joint inference framework by maximizing the composite likelihood of the underlying statistical model using an EM algorithm. In the PASCAL VOC segmentation challenge, the proposed approach obtains high accuracy and successfully handles images of complex object interactions.
6 0.83240283 231 cvpr-2013-Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment
7 0.82617629 392 cvpr-2013-Separable Dictionary Learning
9 0.74705029 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment
10 0.71279597 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.68829799 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
12 0.68236351 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
13 0.67894387 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
14 0.67350918 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
15 0.67242378 282 cvpr-2013-Measuring Crowd Collectiveness
16 0.67055625 402 cvpr-2013-Social Role Discovery in Human Events
17 0.66953796 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
18 0.66906351 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models
19 0.66303605 364 cvpr-2013-Robust Object Co-detection
20 0.65786517 116 cvpr-2013-Designing Category-Level Attributes for Discriminative Visual Recognition