cvpr cvpr2013 cvpr2013-212 knowledge-graph by maker-knowledge-mining

212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration


Source: pdf

Author: Zhile Ren, Gregory Shakhnarovich

Abstract: We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. The stages of the cascade are trained sequentially, with asymmetric loss to maximize boundary recall. On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 com Abstract We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. [sent-2, score-0.565]

2 This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. [sent-3, score-0.309]

3 The stages of the cascade are trained sequentially, with asymmetric loss to maximize boundary recall. [sent-4, score-0.363]

4 On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. [sent-5, score-0.286]

5 Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures. [sent-6, score-0.298]

6 As long as the oversegmentation indeed does not undersegment any region (i.e. [sent-13, score-0.137]

7 , little or no “leakage” of true regions across oversegmentation boundaries), this speeds up reasoning, with little loss of accuracy for the downstream task, such as category-level segmentation [3, 7] or depth estimation [25]. [sent-15, score-0.338]

8 A user of an oversegmentation algorithm has two conflicting objectives: on the one hand, make the superpixels as large as possible, hence reducing their number; on the other hand, preserve the true boundaries in the image, hence driving the number of superpixels up and their size down. [sent-16, score-0.476]

9 For each image, the segmentation shown is at the scale optimal for that image (OIS) with each algorithm, i.e. [sent-21, score-0.16]

10 Superpixel methods provide a tuning parameter that controls this tradeoff, and the range that seems to work for a variety of applications is 400 to 1000 superpixels. [sent-24, score-0.116]

11 We pursue the agglomerative clustering approach: starting with a very fine partition into small regions, gradually merge them into larger and larger ones. [sent-26, score-0.218]

12 Our method constructs a cascade of boundary classifiers that produce increasingly coarse image partitions by merging regions preserved by previous stages. [sent-29, score-0.469]

13 We show that across six data sets and performance measures, and for a broad range of segmentation scales from fine to coarse, the performance of ISCRA is superior to that of current state-of-the-art methods. [sent-33, score-0.265]

14 To our knowledge, it is by far the most comprehensive evaluation of the leading approaches to superpixel extraction, and in general to image segmentation. [sent-34, score-0.116]

15 Background There is a rich body of work on edge or boundary detection in computer vision, with the state of the art represented by the gPb boundary detector and its variants [2]. [sent-37, score-0.2]

16 However, a boundary map may not correspond to a valid segmentation, since it may not provide closed contours. [sent-38, score-0.124]

17 For the purpose of our discussion, we will define two segmentation regimes. [sent-44, score-0.115]

18 The superpixel regime corresponds to more than 50 segments per image. [sent-45, score-0.433]

19 In this regime the purpose of oversegmentation is mostly to reduce complexity of representation, without sacrificing future segmentation accuracy. [sent-46, score-0.413]

20 Therefore, the natural notion of scale for this regime is the number of segments k; typical values of k are in the hundreds for a moderately sized image. [sent-47, score-0.317]

21 Broadly, methods that produce superpixels can be grouped into graph-based methods [18, 6, 14, 22], clustering of pixels such as SLIC [1] and MeanShift [4], and curve evolution such as Turbopixels [13] and SEEDS [21]. [sent-49, score-0.178]

22 In contrast, the large segment regime produces fewer than 100 segments. [sent-50, score-0.25]

23 The appropriate number of segments in this regime depends on the content of the image, and specifying k is not natural. [sent-51, score-0.363]

24 Instead, the precise meaning of scale and the way it is controlled varies between segmentation methods, as described below. [sent-52, score-0.202]

25 In OWT-UCM [2] the oriented watershed transform on gPb boundaries is followed by greedy merging of regions, resulting in a weighted boundary map such that thresholding it at any level produces a valid segmentation; the value of the threshold controls the scale. [sent-53, score-0.426]

26 OWT-UCM uses the same set of weights on various features throughout the merging process, and thus, despite the greedy iterative nature of the merging, it is in a sense a single-stage process. [sent-54, score-0.315]

27 The agglomerative merging segmentation algorithm in [11], like ISCRA, starts with a fine oversegmentation, learns a boundary probability model, and applies it to merge regions until the estimated probability of merging falls below a threshold. [sent-57, score-0.895]

28 The classifier is then retrained, and applied again; their implementation includes four such stages, therefore defining four segmentation scales. [sent-58, score-0.115]

29 There is a number of differences, however: while in ISCRA we use an asymmetric loss and a universal threshold of 1/2, in [11] the loss is symmetric, and the threshold is tuned in an ad-hoc fashion. [sent-59, score-0.135]

30 ISCRA also uses many more stages (60 vs. their four), producing a more gradual and accurate merging process. [sent-61, score-0.204]

31 Higher Order Correlation Clustering (HOCC) [12] also starts with fine segmentation, but instead of a greedy merging applies a “single-shot” partition over the superpixel graph. [sent-63, score-0.469]

32 The scale in HOCC is controlled by specifying explicitly the number of regions, which may be a disadvantage when the user would like a more adaptive scale definition, as in OWT-UCM or ISCRA. [sent-64, score-0.201]

33 In SCALPEL, a region is “grown” by applying a cascade of greedy merging steps to an initial over-segmentation. [sent-66, score-0.406]

34 Similarly to ISCRA, the order of merging in each step is determined by learning weights that reflect importance of features at different scales. [sent-67, score-0.204]

35 An important question about any segmentation algorithm is how it handles multiple scales. [sent-71, score-0.115]

36 Specifically, it is often desirable to produce a hierarchy, in which regions obtained at a finer scale are necessarily subregions of the regions at a coarser scale. [sent-72, score-0.229]

37 This is also not the case with superpixel algorithms like SLIC, ERS, SEEDS or Turbopixels. [sent-74, score-0.116]

38 Still, it seems clear that the objectives in partitioning an image into 1000 segments vs. [sent-76, score-0.128]

39 We are also given ground truth segmentations for a set of training images. [sent-87, score-0.188]

40 A ground truth segmentation is provided as a label map. [sent-88, score-0.188]

41 The set of ground truths for I allows us to label each region pair in N(Ri), as follows. [sent-91, score-0.141]

42 Then, if Ri,p and Ri,q are assigned to the same region, we set ypiq = 1, otherwise ypiq = 0. [sent-93, score-0.17]

43 When multiple ground truths are available, we set ypiq to the average value of the labels assigned under each of the ground truths. [sent-94, score-0.22]

44 This yields ypiq between 0 and 1, measuring the fraction of humans who thought Ri,p and Ri,q belong in the same region (and thus, intuitively, reflecting the perceptual strength of the boundary). [sent-95, score-0.174]
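The soft-label construction just described can be sketched directly. This is an illustrative fragment, not the authors' code; the `region_to_gt_labels` representation (each region mapped to the ground-truth segment label it falls into, one entry per annotator) is an assumption made for the example:

```python
def soft_pair_labels(region_to_gt_labels, pairs):
    """Soft label y_pq: the fraction of ground-truth annotations under which
    regions p and q fall inside the same ground-truth segment."""
    y = {}
    for p, q in pairs:
        # one 0/1 vote per annotator: 1 if both regions share a GT label
        same = [1.0 if gp == gq else 0.0
                for gp, gq in zip(region_to_gt_labels[p], region_to_gt_labels[q])]
        y[(p, q)] = sum(same) / len(same)
    return y
```

Averaging over annotators gives values between 0 and 1, matching the "fraction of humans" interpretation above.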

45 This induces a prediction problem: given an image I and an initial segmentation R, estimate the conditional posterior probability of grouping Pg(p, q; I, R). [sent-102, score-0.139]

46 Greedy merging of regions. While it is possible to compute the predictions for all pairs in N(Ri) at once, the resulting set of predictions may suffer from inconsistencies (lack of transitivity in the predicted labels). [sent-108, score-0.285]

47 Instead, we pursue a simpler approach: greedy merging of regions. [sent-112, score-0.271]

48 In each iteration, we merge the pair of regions with the highest Pg, update the features to reflect this merge, and repeat, until no pair of current regions has Pg > 1/2. [sent-113, score-0.259]

49 Since we started with a valid segmentation and coarsened it, we remain with a valid segmentation R′. [sent-114, score-0.32]

50 (We drop the dependence on the image index i from notation when it is clear from context.) Algorithm 1: Greedy merging MERGE(I, R, w). Given: image I, regions R, weights w. For each pair (p, q) ∈ N(R), initialize Pp,q = Pg(φpq(I, R); w); then repeatedly merge the highest-scoring pair and update the affected scores.
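A minimal, illustrative version of the greedy MERGE loop might look like this. It is a stand-in sketch, not the paper's implementation: the learned model Pg(φpq(I, R); w) is abstracted into a caller-supplied `p_group` callable, and the priority-queue bookkeeping a real implementation would use is omitted:

```python
def greedy_merge(regions, neighbors, p_group):
    """Greedy region agglomeration: repeatedly merge the neighboring pair
    with the highest grouping probability until none exceeds 1/2.

    regions:   dict id -> set of pixel ids
    neighbors: set of frozenset({p, q}) adjacency pairs
    p_group:   callable (regions, p, q) -> estimated merge probability
    """
    regions = {k: set(v) for k, v in regions.items()}
    neighbors = set(neighbors)
    while neighbors:
        scored = [(p_group(regions, *sorted(pair)), pair) for pair in neighbors]
        best_p, best = max(scored, key=lambda t: t[0])
        if best_p <= 0.5:          # stop once no pair is likely to belong together
            break
        p, q = sorted(best)
        regions[p] |= regions.pop(q)   # absorb q into p
        # rewire q's adjacencies to p; drop the merged pair and self-pairs
        neighbors = {frozenset({p if r == q else r for r in pair})
                     for pair in neighbors if pair != best}
        neighbors = {pair for pair in neighbors if len(pair) == 2}
    return regions
```

Because the loop only coarsens a valid partition, the result is itself a valid segmentation, as noted above.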

51 Given a typical image with 1000 superpixels in R, only a small fraction of the pairs in N(R) will be true negatives, since most neighboring superpixels should be grouped together. [sent-143, score-0.334]

52 Furthermore, we scale the loss value for every pair of regions by the length Li, in pixels, of the boundary between the regions. [sent-146, score-0.28]

53 This scaling reflects the higher loss from “erasing” a long true boundary than a short one: w∗(α) = argmin_w (1/n) Σᵢ … [sent-147, score-0.154]
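The objective is truncated in this summary and cannot be recovered exactly, but one plausible reading, a boundary-length-weighted logistic loss in which the "do not merge" class is reweighted by α to protect true boundaries, can be sketched as follows (the exact form of the paper's loss is an assumption here):

```python
import numpy as np

def weighted_logistic_loss(w, X, y, lengths, alpha):
    """Mean logistic loss over region pairs.

    Each pair's loss is scaled by its boundary length L_i, and the loss on
    y=0 pairs (pairs that should NOT merge) is reweighted by alpha, so that
    erroneously erasing a true boundary costs more than a missed merge.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ w)))      # predicted merge probability
    eps = 1e-12                              # numerical guard for log
    per_pair = -(y * np.log(p + eps)
                 + alpha * (1.0 - y) * np.log(1.0 - p + eps))
    return np.mean(lengths * per_pair)
```

With α > 1, the optimizer is pushed toward preserving boundaries (higher recall), matching the stated intent of the scaled asymmetric loss.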

54 Our choice for α is driven by the following intuition: we would like to preserve the boundary recall of the starting segmentation as much as possible, while merging as many regions as possible. [sent-151, score-0.542]

55 The boundary recall REC(R′, G) is defined as the fraction of boundary pixels in the ground truth G recovered by the predicted boundaries in R′. [sent-152, score-0.338]
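Boundary recall as defined here can be computed with a small tolerance on pixel distance. The Chebyshev-distance matching and the `tol` parameter below are illustrative choices for this sketch, not necessarily the benchmark's exact matching procedure:

```python
import numpy as np

def boundary_recall(pred_boundary, gt_boundary, tol=1):
    """Fraction of ground-truth boundary pixels that have a predicted
    boundary pixel within `tol` pixels (Chebyshev distance).
    Inputs are boolean boundary masks of the same shape."""
    gt = np.argwhere(gt_boundary)
    pred = np.argwhere(pred_boundary)
    if len(gt) == 0:
        return 1.0          # nothing to recover
    if len(pred) == 0:
        return 0.0
    hits = 0
    for g in gt:
        # matched if any predicted pixel is within the tolerance box
        if np.any(np.max(np.abs(pred - g), axis=1) <= tol):
            hits += 1
    return hits / len(gt)
```

This is the quantity the α selection tries to preserve: merging should proceed as far as possible without dropping recall below that of the initial segmentation.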

56 Suppose the recall of the initial segmentation is r; then, we want the lowest value of α for which the recall stays close to r. (This is not guaranteed unless Itrain = Itune; empirically, however, we always observed this behavior.) [sent-156, score-0.207]

57 Cascaded Region Agglomeration. When the model is trained with the scaled loss (1), with α optimized to limit the recall drop on the tuning set, the merging usually stops early, with many regions remaining unmerged. [sent-163, score-0.396]

58 Color and texture histograms tend to become less sparse; shape may become less convex; and features that were useless for very small regions (e.g. [sent-165, score-0.129]

59 Algorithm 2: Given image I, initial regions R0, and stage weights w1, …, wT: for t = 1 to T do Rt ← MERGE(I, Rt−1, wt); return RT. Training the cascade (Algorithm 3) is similar to the training of cascaded classifiers elsewhere, e. [sent-175, score-0.166]
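Stripped of notation, the cascade loop of Algorithm 2 is a few lines; `merge_fn` below is a stand-in for the per-stage MERGE procedure, and keeping the intermediate segmentations makes the later "backtracking" to any scale straightforward:

```python
def run_cascade(image, init_regions, stage_weights, merge_fn):
    """Apply T trained merge stages in sequence, each coarsening the
    previous stage's segmentation; record every intermediate result so
    the output scale can be chosen afterwards."""
    history = [init_regions]
    regions = init_regions
    for w_t in stage_weights:                # t = 1 .. T
        regions = merge_fn(image, regions, w_t)
        history.append(regions)
    return regions, history
```

The final element of `history` is the coarsest segmentation RT; earlier elements give the finer scales.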

60 Furthermore, at each stage these two sets are sampled independently, so that the empirical distribution of features at stage t is a more robust estimate of the distribution on new data. [sent-195, score-0.166]

61 Finally, note that after a few stages most of the boundaries from earlier stages are no longer active (due to merging of their constituent regions), and so reusing the same images carries much less risk of overfitting. [sent-196, score-0.465]

62 At each stage some of the regions are merged using the model learned for that stage, and the next stage receives the resulting coarsened segmentation as its input. [sent-198, score-0.412]

63 Once the merging stops, we can “backtrack” and report the segmentation at any point along the merging process. [sent-199, score-0.523]

64 This allows us to control the scale either by specifying the desired number of segments (appropriate for the superpixel regime) or by specifying the number of stages to run, which is the natural definition of scale for ISCRA. [sent-200, score-0.498]

65 We can also compute a boundary map that reflects the scale at which regions are merged: if there are T stages in ISCRA, then for every boundary pixel in the initial segmentation, the value of the boundary map will be t/T if it was merged after t stages. [sent-201, score-0.585]

66 Pixels that were not on the boundaries of initial superpixels will have value zero, and pixels that survived the last stage will have value 1. [sent-202, score-0.299]
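The hierarchical boundary map described in the last two sentences can be assembled from the stage at which each initial boundary pixel disappears. The `merge_stage` array (with `np.inf` marking pixels whose boundary survives all stages) is a representation assumed for this sketch:

```python
import numpy as np

def hierarchy_boundary_map(init_boundary, merge_stage, T):
    """Per-pixel boundary strength for a T-stage cascade: t/T for an
    initial boundary pixel erased at stage t, 1 for pixels surviving all
    stages, 0 for pixels never on an initial superpixel boundary."""
    strength = np.zeros(init_boundary.shape, dtype=float)
    on = init_boundary.astype(bool)
    erased = on & np.isfinite(merge_stage)
    strength[erased] = merge_stage[erased] / T
    strength[on & ~np.isfinite(merge_stage)] = 1.0   # survived every stage
    return strength
```

Thresholding such a map at any level recovers one of the hierarchy's segmentations, analogous to the UCM construction mentioned earlier.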

67 Examples of ISCRA segmentations at multiple scales, as well as the hierarchical boundary map, are illustrated in Figure 2(right); more examples are available in supplementary materials. [sent-203, score-0.128]

68 Experiments. Our experiments were aimed at two goals: (i) compare ISCRA to other methods in both the superpixel and large-region regimes; (ii) evaluate the effect of various design choices on performance. [sent-205, score-0.165]

69 This yields 3 dimensions. • Texture: the χ² difference between two segments when representing each image using 32 textons (1 dimension). [sent-211, score-0.133]

70 Examples of the cascaded merging by ISCRA on BSDS test images. [sent-215, score-0.286]

71 From left: results after merging from ≈1000 superpixels down to 250, 125, 50, 10; the segmentation with ISCRA at the optimal scale in hindsight for the image (OIS); and the boundary strength map. [sent-216, score-0.617]

72 With the addition of a constant bias term for each stage, wt for stage t has 39 weights. [sent-223, score-0.117]

73 We trained a 60-stage ISCRA on the 200 images in the BSDS300 training set, using unregularized logistic regression at each stage (see discussion in Section 4 regarding overfitting). [sent-225, score-0.216]

74 1 to prevent repeated merging involving the same region at the same stage; empirically this improves performance, since it keeps the training data distribution from changing too much within the stage. [sent-230, score-0.253]

75 As initial segmentation for every image (train or test) we took the finest scale of OWT-UCM obtained for that image. [sent-231, score-0.208]

76 Figure 3 (right): average number of superpixels surviving each stage, over all images in all test sets (with standard deviation bars). [sent-234, score-0.242]

77 As might be expected, this value decreases dramatically with the average number of segments surviving the preceding stages (shown in Figure 3(right) for the test sets). [sent-239, score-0.252]

78 VOC2012: validation set for the segmentation challenge (comp5) of Pascal VOC 2012 [5]. [sent-265, score-0.115]

79 Two measures are sensible for any segmentation scale: boundary precision vs. recall. [sent-273, score-0.257]

80 Superpixels. In the superpixel regime, we compare ISCRA to SLIC, ERS, OWT-UCM, and Hoiem et al. [sent-277, score-0.116]

81 Since individual superpixels tend to be much smaller than most ground truth regions, region-based measures make little sense in this regime. [sent-279, score-0.271]

82 Since pixel-wise errors become fairly large in this regime, the boundary-based measures relevant for the superpixel regime are no longer sensible. [sent-283, score-0.414]

83 • Segmentation covering, measuring average per-pixel overlap between segments in the ground truth (GT) and the proposed segmentation. [sent-286, score-0.19]
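Segmentation covering, an area-weighted average over ground-truth regions of the best overlap with any proposed region, can be sketched for integer label maps. The IoU-based overlap and label-map inputs below follow the standard definition of this measure, assumed here rather than taken from the paper's code:

```python
import numpy as np

def segmentation_covering(gt, seg):
    """Covering of ground-truth label map `gt` by proposed label map `seg`:
    for each GT region, take the best IoU with any proposed region, and
    average these weighted by GT region area."""
    total = 0.0
    for g in np.unique(gt):
        g_mask = gt == g
        best = 0.0
        for s in np.unique(seg[g_mask]):   # only regions intersecting g
            s_mask = seg == s
            inter = np.logical_and(g_mask, s_mask).sum()
            union = np.logical_or(g_mask, s_mask).sum()
            best = max(best, inter / union)
        total += g_mask.sum() * best
    return total / gt.size
```

A perfect segmentation covers the ground truth with value 1; merging two GT regions into one proposal lowers the score of both.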

84 Probabilistic Rand Index (PRI), measuring pairwise pixel-label agreement. (In the BSDS data sets no semantic labels are provided, and we assign each ground truth region its own label.) [sent-287, score-0.22]

85 With the exception of [12], the methods involved produce hierarchical segmentations, which can be used to extract a specific set of regions by specifying a scale value. [sent-291, score-0.193]

86 The latter is a kind of oracle, measuring the best achievable accuracy of any labeling adhering to the predicted segmentation regions. [sent-293, score-0.204]
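The ASA oracle just described can be computed by giving every proposed region the ground-truth label it overlaps most; a minimal sketch over integer label maps (the label-map representation is an assumption for this example):

```python
import numpy as np

def achievable_segmentation_accuracy(seg, gt):
    """Oracle accuracy: label each proposed region with the ground-truth
    label it overlaps most; the fraction of pixels then labeled correctly
    upper-bounds any labeling that respects the proposed regions."""
    correct = 0
    for s in np.unique(seg):
        mask = seg == s
        _, counts = np.unique(gt[mask], return_counts=True)
        correct += counts.max()   # best achievable for this region
    return correct / gt.size
```

This makes concrete why fewer, well-placed segments can match the ASA of many small ones: accuracy is lost only where a proposed region straddles a ground-truth boundary.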

87 Finally, since for some of the measures there is a trade-off between performance and number of segments, we report for each ODS/OIS value the average number of segments extracted at that scale by the method in question. [sent-294, score-0.194]

88 Since only partial ground truth boundaries are available for VOC2012 and MSRC, we omit precision/recall curves for those data sets. [sent-298, score-0.126]

89 Figure 6 shows that for most data sets, ISCRA achieves the lowest under-segmentation error for a broad range of scales, typically for 400 superpixels and fewer. [sent-299, score-0.156]

90 Results for the region measures relevant to the large-segment regime are summarized in Tables 1 through 6. [sent-301, score-0.301]

91 While ISCRA achieves better or equal results in most combinations of data set/measure/scale, it typically does so with many fewer segments than other methods (see superscripts in the ODS columns), and the number of segments it obtains is closer to that in the ground truth. [sent-303, score-0.328]

92 Across the board (with the exception of NYU) ISCRA achieves the same level of ASA with significantly fewer segments than other methods; most dramatically for ASA=0. [sent-310, score-0.168]

93 We also trained a variant of ISCRA in which α = 1 for all stages, and the threshold on Pg used to stop merging is trained. [sent-314, score-0.236]

94 Superscripts: the average number of segments at the optimal scale for each method/measure. [sent-333, score-0.152]

95 It is trained in sequence, allowing adaptation of feature weights to increasing segmentation scale; when applied on an image it produces a hierarchical segmentation, allowing the user to directly control the scale and the number of resulting regions. [sent-372, score-0.215]

96 It is also competitive in boundary-based measures in the superpixel regime, obtaining the best results for part of the range on some data sets. [sent-374, score-0.158]

97 ISCRA tends to achieve these results with fewer segments per image than other methods, making it potentially appealing for use as a preprocessing step for semantic segmentation and other high-level perception tasks. [sent-375, score-0.286]

98 For instance, in our experiments here, it cannot obtain ASA or boundary recall values above those of the finest scale of OWT-UCM. [sent-377, score-0.211]

99 Texture segmentation by multiscale aggregation of filter responses and shape elements. [sent-436, score-0.115]

100 A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. [sent-530, score-0.155]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('iscra', 0.721), ('regime', 0.21), ('merging', 0.204), ('superpixels', 0.156), ('pg', 0.152), ('itrain', 0.127), ('asa', 0.122), ('superpixel', 0.116), ('segmentation', 0.115), ('segments', 0.107), ('itune', 0.106), ('boundary', 0.1), ('merge', 0.097), ('stages', 0.093), ('oversegmentation', 0.088), ('agglomeration', 0.085), ('ypiq', 0.085), ('cascade', 0.084), ('cascaded', 0.082), ('regions', 0.081), ('hocc', 0.075), ('scalpel', 0.075), ('stage', 0.066), ('slic', 0.061), ('regimes', 0.056), ('loss', 0.054), ('boundaries', 0.053), ('surviving', 0.052), ('ers', 0.052), ('wt', 0.051), ('hoiem', 0.051), ('truths', 0.049), ('region', 0.049), ('turbopixels', 0.047), ('bsds', 0.047), ('specifying', 0.046), ('scale', 0.045), ('greedy', 0.045), ('ground', 0.043), ('coarsened', 0.042), ('controled', 0.042), ('hindsight', 0.042), ('rnmoatswe', 0.042), ('sbd', 0.042), ('measures', 0.042), ('merged', 0.042), ('recall', 0.042), ('fewer', 0.04), ('measuring', 0.04), ('gpb', 0.04), ('return', 0.039), ('nyu', 0.039), ('msrc', 0.039), ('fine', 0.039), ('ois', 0.036), ('scaled', 0.035), ('voi', 0.035), ('partition', 0.035), ('pq', 0.034), ('sets', 0.034), ('seeds', 0.033), ('trained', 0.032), ('ods', 0.032), ('superscripts', 0.031), ('truth', 0.03), ('tpami', 0.03), ('meanshift', 0.03), ('starts', 0.03), ('agnostic', 0.029), ('drop', 0.029), ('segmentations', 0.028), ('merges', 0.028), ('dominates', 0.027), ('asymmetric', 0.027), ('ri', 0.027), ('dimensions', 0.026), ('behave', 0.026), ('labeling', 0.026), ('agglomerative', 0.025), ('logistic', 0.025), ('initial', 0.024), ('finest', 0.024), ('ren', 0.024), ('valid', 0.024), ('semantic', 0.024), ('become', 0.024), ('six', 0.024), ('user', 0.023), ('rt', 0.023), ('achievable', 0.023), ('perceptually', 0.023), ('semantically', 0.023), ('coarser', 0.022), ('pursue', 0.022), ('atomic', 0.022), ('grouped', 0.022), ('longer', 0.022), ('exception', 0.021), ('objectives', 0.021), ('subsets', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999952 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

Author: Zhile Ren, Gregory Shakhnarovich

Abstract: We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. The stages of the cascade are trained sequentially, with asymmetric loss to maximize boundary recall. On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures.

2 0.21163693 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

Author: David Weiss, Ben Taskar

Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shapepriors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.

3 0.13875434 29 cvpr-2013-A Video Representation Using Temporal Superpixels

Author: Jason Chang, Donglai Wei, John W. Fisher_III

Abstract: We develop a generative probabilistic model for temporally consistent superpixels in video sequences. In contrast to supervoxel methods, object parts in different frames are tracked by the same temporal superpixel. We explicitly model flow between frames with a bilateral Gaussian process and use this information to propagate superpixels in an online fashion. We consider four novel metrics to quantify performance of a temporal superpixel representation and demonstrate superior performance when compared to supervoxel methods.

4 0.13641308 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

Author: Saurabh Gupta, Pablo Arbeláez, Jitendra Malik

Abstract: We address the problems of contour detection, bottomup grouping and semantic segmentation using RGB-D data. We focus on the challenging setting of cluttered indoor scenes, and evaluate our approach on the recently introduced NYU-Depth V2 (NYUD2) dataset [27]. We propose algorithms for object boundary detection and hierarchical segmentation that generalize the gPb − ucm approach of [se2]g mbeyn mtaatkioinng t effective use oef t dheep gthP information. Wroea schho owf that our system can label each contour with its type (depth, normal or albedo). We also propose a generic method for long-range amodal completion of surfaces and show its effectiveness in grouping. We then turn to the problem of semantic segmentation and propose a simple approach that classifies superpixels into the 40 dominant object categories in NYUD2. We use both generic and class-specific features to encode the appearance and geometry of objects. We also show how our approach can be used for scene classification, and how this contextual information in turn improves object recognition. In all of these tasks, we report significant improvements over the state-of-the-art.

5 0.13058406 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

Author: Guang Shu, Afshin Dehghan, Mubarak Shah

Abstract: We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained objects detectors are usually degraded in unconstrained video environments due to variant illuminations, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video specificfeatures like consistent colorpatterns. In our approach, we apply a Superpixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using superpixels, hence it can handle the problem of appearance variation. Most importantly, using Conditional Random Field (CRF) along with our super pixel-based BoW model, we develop and algorithm to segment the object from the background . Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach and significantly improves the state-of-art detector by 5-16% in average precision.

6 0.12739101 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

7 0.11508133 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

8 0.11117217 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

9 0.10761395 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

10 0.10663323 437 cvpr-2013-Towards Fast and Accurate Segmentation

11 0.10430831 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

12 0.10308036 72 cvpr-2013-Boundary Detection Benchmarking: Beyond F-Measures

13 0.10203088 203 cvpr-2013-Hierarchical Video Representation with Trajectory Binary Partition Tree

14 0.097223274 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

15 0.094677009 187 cvpr-2013-Geometric Context from Videos

16 0.093699425 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

17 0.087162778 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

18 0.086430684 357 cvpr-2013-Revisiting Depth Layers from Occlusions

19 0.080880754 222 cvpr-2013-Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus

20 0.079992555 281 cvpr-2013-Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.163), (1, -0.006), (2, 0.052), (3, -0.001), (4, 0.102), (5, 0.015), (6, 0.04), (7, 0.075), (8, -0.105), (9, 0.003), (10, 0.148), (11, -0.081), (12, 0.029), (13, 0.059), (14, -0.002), (15, 0.015), (16, 0.065), (17, -0.084), (18, -0.077), (19, 0.116), (20, 0.036), (21, 0.023), (22, -0.086), (23, -0.009), (24, -0.039), (25, 0.072), (26, -0.053), (27, -0.055), (28, 0.029), (29, 0.023), (30, 0.067), (31, 0.022), (32, -0.02), (33, -0.026), (34, 0.062), (35, -0.044), (36, -0.023), (37, 0.011), (38, 0.055), (39, 0.047), (40, 0.016), (41, 0.03), (42, 0.045), (43, -0.013), (44, -0.043), (45, 0.043), (46, 0.053), (47, -0.036), (48, 0.097), (49, -0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9174248 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

Author: Zhile Ren, Gregory Shakhnarovich

Abstract: We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. The stages of the cascade are trained sequentially, with asymetric loss to maximize boundary recall. On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures.

2 0.80148178 29 cvpr-2013-A Video Representation Using Temporal Superpixels

Author: Jason Chang, Donglai Wei, John W. Fisher_III

Abstract: We develop a generative probabilistic model for temporally consistent superpixels in video sequences. In contrast to supervoxel methods, object parts in different frames are tracked by the same temporal superpixel. We explicitly model flow between frames with a bilateral Gaussian process and use this information to propagate superpixels in an online fashion. We consider four novel metrics to quantify performance of a temporal superpixel representation and demonstrate superior performance when compared to supervoxel methods.

3 0.79296172 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

Author: Luming Zhang, Mingli Song, Zicheng Liu, Xiao Liu, Jiajun Bu, Chun Chen

Abstract: Weakly supervised image segmentation is a challenging problem in computer vision field. In this paper, we present a new weakly supervised image segmentation algorithm by learning the distribution of spatially structured superpixel sets from image-level labels. Specifically, we first extract graphlets from each image where a graphlet is a smallsized graph consisting of superpixels as its nodes and it encapsulates the spatial structure of those superpixels. Then, a manifold embedding algorithm is proposed to transform graphlets of different sizes into equal-length feature vectors. Thereafter, we use GMM to learn the distribution of the post-embedding graphlets. Finally, we propose a novel image segmentation algorithm, called graphlet cut, that leverages the learned graphlet distribution in measuring the homogeneity of a set of spatially structured superpixels. Experimental results show that the proposed approach outperforms state-of-the-art weakly supervised image segmentation methods, and its performance is comparable to those of the fully supervised segmentation models.

4 0.75859433 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

Author: Jeremie Papon, Alexey Abramov, Markus Schoeler, Florentin Wörgötter

Abstract: Unsupervised over-segmentation of an image into regions of perceptually similar pixels, known as superpixels, is a widely used preprocessing step in segmentation algorithms. Superpixel methods reduce the number of regions that must be considered later by more computationally expensive algorithms, with a minimal loss of information. Nevertheless, as some information is inevitably lost, it is vital that superpixels not cross object boundaries, as such errors will propagate through later steps. Existing methods make use of projected color or depth information, but do not consider three dimensional geometric relationships between observed data points which can be used to prevent superpixels from crossing regions of empty space. We propose a novel over-segmentation algorithm which uses voxel relationships to produce over-segmentations which are fully consistent with the spatial geometry of the scene in three dimensional, rather than projective, space. Enforcing the constraint that segmented regions must have spatial connectivity prevents label flow across semantic object boundaries which might otherwise be violated. Additionally, as the algorithm works directly in 3D space, observations from several calibrated RGB+D cameras can be segmented jointly. Experiments on a large data set of human annotated RGB+D images demonstrate a significant reduction in occurrence of clusters crossing object boundaries, while maintaining speeds comparable to state-of-the-art 2D methods.

5 0.75032753 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

Author: David Weiss, Ben Taskar

Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better-localized shape priors than our competitors', our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.

6 0.73726672 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

7 0.72508413 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

8 0.72288465 26 cvpr-2013-A Statistical Model for Recreational Trails in Aerial Images

9 0.68878961 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

10 0.67684799 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

11 0.64812618 281 cvpr-2013-Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation

12 0.64779538 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

13 0.62452722 468 cvpr-2013-Winding Number for Region-Boundary Consistent Salient Contour Extraction

14 0.6238668 437 cvpr-2013-Towards Fast and Accurate Segmentation

15 0.61769766 222 cvpr-2013-Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus

16 0.59922105 280 cvpr-2013-Maximum Cohesive Grid of Superpixels for Fast Object Localization

17 0.59473342 72 cvpr-2013-Boundary Detection Benchmarking: Beyond F-Measures

18 0.58981711 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations

19 0.58396775 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

20 0.50731319 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.143), (16, 0.015), (26, 0.068), (33, 0.227), (67, 0.068), (69, 0.065), (87, 0.087), (89, 0.207), (97, 0.012)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.83958316 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

Author: Zhile Ren, Gregory Shakhnarovich

Abstract: We propose a hierarchical segmentation algorithm that starts with a very fine oversegmentation and gradually merges regions using a cascade of boundary classifiers. This approach allows the weights of region and boundary features to adapt to the segmentation scale at which they are applied. The stages of the cascade are trained sequentially, with asymmetric loss to maximize boundary recall. On six segmentation data sets, our algorithm achieves best performance under most region-quality measures, and does it with fewer segments than the prior work. Our algorithm is also highly competitive in a dense oversegmentation (superpixel) regime under boundary-based measures.
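The cascaded agglomeration the abstract describes can be sketched as a sequence of merge passes, each gated by a stage-specific boundary classifier: boundaries the current stage scores as weak are dissolved, and later stages see the coarser regions. The sketch below assumes linear classifiers with fixed per-boundary features; `cascade_merge` and its arguments are illustrative assumptions (the paper's features are recomputed as regions grow, and its classifiers are richer than a dot product).

```python
import numpy as np

def cascade_merge(regions, boundaries, stage_weights, thresholds):
    """Greedy hierarchical merging with a cascade of linear boundary
    classifiers: at each stage, boundaries scoring below that stage's
    threshold are dissolved, merging the two regions they separate."""
    parent = {r: r for r in regions}          # union-find parent map

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]     # path halving
            r = parent[r]
        return r

    for w, thresh in zip(stage_weights, thresholds):
        for (a, b), feats in boundaries.items():
            ra, rb = find(a), find(b)
            if ra == rb:
                continue                      # already in the same region
            score = float(np.dot(w, feats))   # boundary "strength"
            if score < thresh:                # weak boundary -> merge
                parent[rb] = ra
    return {r: find(r) for r in regions}
```

Each stage of the cascade gets its own weight vector and threshold, which is what lets the feature weighting adapt to the segmentation scale at which it is applied.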

2 0.83304602 235 cvpr-2013-Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines

Author: Gunhee Kim, Eric P. Xing

Abstract: With an explosion of popularity of online photo sharing, we can trivially collect a huge number of photo streams for any interesting topic, such as scuba diving as an outdoor recreational activity class. Obviously, the retrieved photo streams are neither aligned nor calibrated, since they are taken from different temporal, spatial, and personal perspectives. At the same time, however, they are likely to share common storylines that consist of sequences of events and activities frequently recurring within the topic. In this paper, as a first technical step toward detecting such collective storylines, we propose an approach to jointly aligning and segmenting uncalibrated multiple photo streams. The alignment task discovers the matched images between different photo streams, and the image segmentation task parses each image into multiple meaningful regions to facilitate image understanding. We close a loop between the two tasks so that solving one task helps enhance the performance of the other in a mutually rewarding way. To this end, we design a scalable message-passing based optimization framework to jointly achieve both tasks for the whole input image set at once. With evaluation on the new Flickr dataset of 15 outdoor activities, consisting of 1.5 million images from 13 thousand photo streams, our empirical results show that the proposed algorithms are more successful than other candidate methods for both tasks.

3 0.80975264 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem

Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors' ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best existing systems, outperforming other HOG-based detectors on the more deformable categories.

4 0.80460423 202 cvpr-2013-Hierarchical Saliency Detection

Author: Qiong Yan, Li Xu, Jianping Shi, Jiaya Jia

Abstract: When dealing with objects with complex structures, saliency detection confronts a critical problem: detection accuracy can be adversely affected if salient foreground or background in an image contains small-scale high-contrast patterns. This issue is common in natural images and forms a fundamental challenge for prior methods. We tackle it from a scale point of view and propose a multi-layer approach to analyze saliency cues. The final saliency map is produced by a hierarchical model. Different from varying patch sizes or downsizing images, our scale-based region handling works by finding saliency values optimally in a tree model. Our approach improves saliency detection on many images that cannot be handled well traditionally. A new dataset is also constructed.

5 0.80210882 414 cvpr-2013-Structure Preserving Object Tracking

Author: Lu Zhang, Laurens van der Maaten

Abstract: Model-free trackers can track arbitrary objects based on a single (bounding-box) annotation of the object. Whilst the performance of model-free trackers has recently improved significantly, simultaneously tracking multiple objects with similar appearance remains very hard. In this paper, we propose a new multi-object model-free tracker (based on tracking-by-detection) that resolves this problem by incorporating spatial constraints between the objects. The spatial constraints are learned along with the object detectors using an online structured SVM algorithm. The experimental evaluation of our structure-preserving object tracker (SPOT) reveals significant performance improvements in multi-object tracking. We also show that SPOT can improve the performance of single-object trackers by simultaneously tracking different parts of the object.

6 0.79792655 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking

7 0.79783291 311 cvpr-2013-Occlusion Patterns for Object Class Detection

8 0.79769975 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

9 0.79524887 325 cvpr-2013-Part Discovery from Partial Correspondence

10 0.79504061 408 cvpr-2013-Spatiotemporal Deformable Part Models for Action Detection

11 0.79468155 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis

12 0.79452366 425 cvpr-2013-Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation

13 0.79295278 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems

14 0.79272598 314 cvpr-2013-Online Object Tracking: A Benchmark

15 0.79234099 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

16 0.79208481 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models

17 0.79148018 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments

18 0.79104418 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

19 0.79030496 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

20 0.78976721 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection