cvpr cvpr2013 cvpr2013-460 knowledge-graph by maker-knowledge-mining

460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Source: pdf

Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

Abstract: In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions. The proposed approach is motivated by the observation that superpixels belonging to an object class usually exist across multiple images and hence can be gathered via clustering. In WSDC, spectral clustering is adopted to cluster the superpixels obtained from a set of over-segmented images. At the same time, a linear transformation between features and labels, as a kind of discriminative clustering, is learned to select the discriminative features among different classes. The two clustering outputs should be as consistent as possible. In addition, weakly-supervised constraints from image-level labels are imposed to restrict the labeling of superpixels. Finally, the non-convex and non-smooth objective function is efficiently optimized using an iterative CCCP procedure. Extensive experiments conducted on the MSRC and LabelMe datasets demonstrate the encouraging performance of our method in comparison with some state-of-the-art methods.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i. [sent-11, score-0.24]

2 , collaboratively performing image segmentation and tag alignment with those regions. [sent-13, score-0.17]

3 The proposed approach is motivated from the observation that superpixels belonging to an object class usually exist across multiple images and hence can be gathered via the idea of clustering. [sent-14, score-0.311]

4 In WSDC, spectral clustering is adopted to cluster the superpixels obtained from a set of over-segmented images. [sent-15, score-0.472]

5 At the same time, a linear transformation between features and labels as a kind of discriminative clustering is learned to select the discriminative features among different classes. [sent-16, score-0.401]

6 The two clustering outputs should be as consistent as possible. [sent-17, score-0.149]

7 Besides, weakly-supervised constraints from image-level labels are imposed to restrict the labeling of superpixels. [sent-18, score-0.128]

8 Introduction Image semantic segmentation is to automatically parse images into some semantic regions. [sent-22, score-0.361]

9 This is a coherent task between image segmentation and region-level label assignment. [sent-23, score-0.213]

10 In turn, precise labeling results will boost image segmentation since the pixels with the same label can be deemed as a whole object. [sent-26, score-0.249]

11 From this view, semantic segmentation is a kind of higher-level image understanding than any individual case. [sent-27, score-0.278]

12 Recently, image semantic segmentation has become a popular research topic and some efforts contribute to the problem [3, 26]. [sent-29, score-0.24]

13 However, producing pixel-level labels is time-consuming and may be inaccurate. [sent-31, score-0.128]

14 Fortunately, many image-sharing websites provide us with plentiful user-contributed images with social tags, in which the raw correspondences between images and labels are available. [sent-32, score-0.164]

15 Thus, weakly-supervised methods [25, 26, 27] with only image-level labels available have emerged and attracted more attention. [sent-33, score-0.128]

16 , obtaining meaningful image regions and simultaneously assigning image-level labels to those regions. [sent-36, score-0.128]

17 The problem is formulated as a Weakly-Supervised Dual Clustering (WSDC) task to cluster superpixels and assign a suitable label to each cluster. [sent-37, score-0.341]

18 The first observation behind our method is that similar superpixels have a high probability of sharing the same label. [sent-38, score-0.277]

19 To mine this kind of important contextual relationship, a spectral clustering term is defined over the superpixels of all images to group the visually similar ones together. [sent-39, score-0.547]

20 We define a discriminative clustering term and require its outputs to be consistent with the outputs of spectral clustering. [sent-43, score-0.343]

21 Besides, we explicitly impose weakly-supervised constraints during the dual clustering process which can assign labels to clusters. [sent-44, score-0.373]

22 • We propose a coherent framework to jointly solve image segmentation and region-level annotation under … [sent-51, score-0.149]

23 The proposed method incorporates spectral clustering and discriminative clustering to cluster superpixels from all images into different clusters, and imposes image-level labels as a kind of weak supervision to assign labels to clusters. [sent-106, score-0.936]

24 Weakly-supervised semantic segmentation [26, 27] arose to solve this problem. [sent-121, score-0.24]

25 Label-to-region means reassigning the labels annotated at the image level to the segmented image regions rather than to the whole image [13, 12, 23]. [sent-126, score-0.128]

26 Most existing works apply only to a subgroup of images with the same foreground and are not intended to handle multiple, irregularly appearing foregrounds. [sent-138, score-0.11]

27 Besides, they did not explore any supervision like easily available image-level labels in their learning process. [sent-139, score-0.191]

28 The former problem leads to solving a bottom-up unsupervised clustering while the latter problem leads to methods designed for top-down discriminative clustering problem. [sent-142, score-0.278]

29 Let $X_i = [x_{i1}, \cdots, x_{in_i}]$, where $x_{ik} \in \mathbb{R}^d$ is the feature descriptor of the $k$-th superpixel in the $i$-th image and $n_i$ is the number of superpixels in the $i$-th image. [sent-147, score-0.385]

30 $g_{ic} \in \{0, 1\}$, with $g_{ic} = 1$ if $X_i$ belongs to the $c$-th class and 0 otherwise. [sent-158, score-0.342]

31 Spectral Clustering On the one hand, visually similar superpixels have a high probability of sharing the same label. [sent-164, score-0.277]

32 On the other hand, spectral techniques have been demonstrated to be effective to detect the cluster structure [20], which can integrate the consistency relationships of superpixels among different images. [sent-165, score-0.365]

33 In light of this, we employ spectral techniques to mine the aforementioned contextual information. [sent-166, score-0.125]

34 The interactions among superpixels are represented by an affinity matrix $S \in \mathbb{R}^{N \times N}$ defined as $S_{ij} =$ ? [sent-167, score-0.277]

35 xj) or $x_j \in N_k(x_i)$. Here $N_k(x)$ is the set of k-nearest superpixels of $x$. [sent-171, score-0.277]

36 The k-nearest superpixels are selected only from the superpixels of the same image or of images sharing common labels, because the label of a superpixel is identified from the labels of the image it belongs to. [sent-172, score-0.854]

37 In addition, to encourage spatially smooth labelings, the spatial neighbor superpixels within the same image are also connected. [sent-174, score-0.277]

38 Then the spectral clustering term is defined as minimizing the following equation: J(Y ) =21i,? [sent-175, score-0.195]
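
The role of the spectral term can be illustrated with a small sketch. The exact weight formula and objective are not recoverable from the extracted text, so everything specific here is an assumption: a symmetric k-nearest-neighbour graph with Gaussian weights, and the unnormalized smoothness cost 0.5 * sum_ij S_ij * ||y_i - y_j||^2. The names `knn_affinity` and `smoothness` are illustrative. The cost is small when neighbouring superpixels receive the same label vector.

```python
import math

def knn_affinity(features, k, sigma=1.0):
    """Symmetric k-NN affinity with Gaussian weights (assumed kernel)."""
    n = len(features)

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the k nearest neighbours of i, excluding i itself.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sqdist(features[i], features[j]))
        for j in order[:k]:
            w = math.exp(-sqdist(features[i], features[j]) / (sigma ** 2))
            S[i][j] = w
            S[j][i] = w  # connect if either point is a neighbour of the other
    return S

def smoothness(S, Y):
    """0.5 * sum_ij S_ij * ||y_i - y_j||^2 for row-wise label vectors Y."""
    n = len(S)
    total = 0.0
    for i in range(n):
        for j in range(n):
            d = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
            total += S[i][j] * d
    return 0.5 * total

# Two tight groups of toy "superpixel" descriptors.
feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
S = knn_affinity(feats, k=1)
consistent = [[1, 0], [1, 0], [0, 1], [0, 1]]    # neighbours share a label
inconsistent = [[1, 0], [0, 1], [1, 0], [0, 1]]  # neighbours disagree
print(smoothness(S, consistent), smoothness(S, inconsistent))
```

A labeling that keeps each tight group together incurs zero cost, while one that splits the groups is penalized in proportion to the affinities it cuts.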

39 Discriminative Clustering Since not all the features are important and discriminative for a certain class, a discriminative clustering strategy with l2,1-norm regularization is introduced. [sent-184, score-0.235]
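
For reference, the l2,1-norm of a matrix is the sum of the Euclidean norms of its rows. Penalizing it tends to drive whole rows to zero, which amounts to discarding the corresponding features for all classes at once. The sketch below is a minimal illustration; it assumes that rows index features and columns index classes, and the function name `l21_norm` is illustrative.

```python
import math

def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

W = [[3.0, 4.0],   # feature used by both classes, row norm 5
     [0.0, 0.0],   # feature switched off for every class, row norm 0
     [0.0, 2.0]]   # feature used by one class, row norm 2
print(l21_norm(W))  # 7.0
```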

40 Its outputs are required to be consistent with the outputs of spectral clustering. [sent-185, score-0.172]

41 Therefore, the objective function for discriminative clustering is formulated as − ? [sent-188, score-0.171]

42 In that way, the proposed method is able to handle correlated and noisy features and to evaluate the correlation between labels and features. [sent-207, score-0.128]

43 , a mapping function from visual features to labels, the discriminative feature representations for each class can be obtained. [sent-215, score-0.098]

44 Weakly-Supervised Constraint Given an image and its associated labels, it is reasonable and natural to restrict the mapping between superpixels and labels to meet the following constraints. [sent-218, score-0.405]

45 • One superpixel corresponds to at most one label. [sent-219, score-0.108]

46 • One label has at least one superpixel mapped to it. [sent-220, score-0.108]

47 There is at least one superpixel supporting this label. [sent-222, score-0.138]

48 • Superpixels should correspond to the labels of images they belong to. [sent-223, score-0.128]

49 This makes sure that there are no image superpixels supporting an invalid label. [sent-224, score-0.307]
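
The three conditions can be checked mechanically for a hard assignment. The sketch below is illustrative only: it assumes each superpixel carries a single label or `None`, and the helper name `check_weak_constraints` and its arguments are hypothetical. The method described here enforces the conditions through constraints on the matrix Y rather than through a check like this one.

```python
def check_weak_constraints(assignments, image_labels):
    """Check weak-supervision conditions on a hard labeling.

    assignments:  {image_id: [label or None for each superpixel]}
    image_labels: {image_id: set of image-level labels}
    Returns a list of violation messages (empty when all conditions hold).
    """
    violations = []
    for img, labels in assignments.items():
        allowed = image_labels[img]
        used = {l for l in labels if l is not None}
        # Condition 1 (at most one label per superpixel) holds by construction,
        # because each list entry stores a single label or None.
        # Condition 2: every image-level label is supported by some superpixel.
        for missing in sorted(allowed - used):
            violations.append(f"{img}: label {missing} has no superpixel")
        # Condition 3: no superpixel takes a label absent from its image.
        for invalid in sorted(used - allowed):
            violations.append(f"{img}: label {invalid} is not an image label")
    return violations

print(check_weak_constraints({"img1": ["cow", "grass", None]},
                             {"img1": {"cow", "grass"}}))
```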

50 To satisfy the first constraint, we impose an orthogonality constraint on Y just like [10], i. [sent-225, score-0.148]

51 To satisfy the last two conditions, we explicitly impose a weak-supervision constraint with a hyper-parameter γ: ? [sent-232, score-0.086]

52 (5) where $y_{ij}^{c}$ is the value of $Y$ corresponding to the $j$-th superpixel within the $i$-th image on label $c$. [sent-237, score-0.251]

53 $p_{ij} \in \mathbb{R}^N$ is an indicator vector whose element corresponding to the $j$-th superpixel in the $i$-th image is one and whose other elements are zero. [sent-264, score-0.139]

54 l(t) , (10) where nα is the number of superpixels with the largest label value max l(t) . [sent-297, score-0.341]

55 In our experiments it is set to be large enough to ensure that the orthogonality constraint is satisfied. [sent-356, score-0.108]

56 MSRC: It is a widely used dataset in semantic segmentation tasks. [sent-406, score-0.24]

57 It contains 591 images from 21 different classes and there are 3 labels per image on average. [sent-407, score-0.128]

58 We adopt SLIC algorithm [2] to obtain the superpixels for each image, and describe each superpixel by the typical bag-of-words representation while using SIFT [17] as the local descriptor. [sent-413, score-0.427]

59 We evaluate the performance of semantic segmentation from two views: the labeling performance and segmentation performance. [sent-415, score-0.359]

60 Because the baselines on the two datasets adopt different evaluation standards, we report different measures to match the corresponding baselines. [sent-417, score-0.136]

61 For segmentation evaluation metric we adopt the intersection-over-union score (IOU score) [7] which is a standard measure in PASCAL challenges. [sent-418, score-0.161]
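
As a reminder of how this metric is computed, the following sketch evaluates intersection-over-union for one class from two flat label maps. The function name `iou` and the list-based representation are illustrative assumptions; returning `None` for a class absent from both maps is a convention chosen for this example.

```python
def iou(pred, truth, cls):
    """Intersection-over-union of class `cls` between two equal-length label lists."""
    inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return inter / union if union else None

pred = [1, 1, 0, 0]
truth = [1, 0, 0, 0]
print(iou(pred, truth, 1))  # 0.5: one pixel in common, two pixels in the union
```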

62 μ is set to $10^8$, which is large enough to guarantee that the orthogonality constraint is satisfied. [sent-433, score-0.108]

63 The semantic segmentation performance is used to tune parameters. [sent-436, score-0.119]

64 Secondly, the accuracies peak at β = $10^3$, γ = $10^4$ and at β = $10^4$, γ = $10^6$ on the two datasets respectively; these values all lie in the middle of the range, and the accuracies do not increase monotonically as β and γ increase. [sent-441, score-0.136]

65 Experiments on MSRC dataset We compare the proposed algorithm with LAS [15], MTL-RF [25], MIM [26] and RLSIM [4] to evaluate the semantic segmentation performance. [sent-458, score-0.24]

66 ‘Full’ supervision means each pixel is labeled with a tag and ‘Weak’ supervision means only image-level labels are available. [sent-460, score-0.305]

67 ‘With’ ILP means that, during prediction, the image-level labels are available and we predict the label of each superpixel only from the labels of its image. [sent-461, score-0.533]

68 ‘Without’ ILP indicates that the labels of the images are completely unknown. [sent-462, score-0.128]

69 Secondly, whereas RLSIM 1 with ILP gets much higher accuracy than RLSIM 2 without ILP, the results of WSDC 1 and WSDC 2 are very close and both achieve high accuracies, which shows that image-level labels have a negligible effect on our algorithm’s performance during prediction. [sent-468, score-0.162]
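
The difference between the two prediction settings can be made concrete with a small sketch. It assumes per-class scores are already available for a superpixel (how they are produced is outside this example), and `predict_label` is a hypothetical helper. ‘With’ ILP restricts the arg-max to the labels of the image, whereas ‘without’ ILP searches over all classes.

```python
def predict_label(scores, image_labels=None):
    """Pick the best-scoring class, optionally restricted to image-level labels.

    scores:       {class_name: score} for one superpixel
    image_labels: set of allowed classes ('with' ILP) or None ('without' ILP)
    Raises ValueError if the restriction leaves no candidate class.
    """
    if image_labels is None:
        candidates = scores
    else:
        candidates = {c: s for c, s in scores.items() if c in image_labels}
    return max(candidates, key=candidates.get)

scores = {"cow": 0.4, "sheep": 0.5, "grass": 0.1}
print(predict_label(scores))                    # sheep
print(predict_label(scores, {"cow", "grass"}))  # cow
```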

70 In addition, requiring image-level priors to boost performance is also a weakness of many semantic segmentation methods. [sent-469, score-0.273]

71 For segmentation performance, we compare our method with [7, 8, 16]. [sent-472, score-0.119]

72 All three methods divide the images into subgroups, where images with the same label are deemed a subgroup, and they process the images of one subgroup at a time. [sent-473, score-0.207]

73 In our experiments, we report the segmentation performance under two settings: one is WSDC 3, which segments the images of one subgroup at a time; for the other, we directly report the segmentation performance of WSDC 2. [sent-474, score-0.348]

74 multiple foregrounds and backgrounds at one time is a big challenge for most segmentation methods. [sent-477, score-0.188]

75 Table 3 shows the segmentation performance on MSRC dataset. [sent-478, score-0.119]

76 This shows that the weak-supervision information can improve the segmentation performance. [sent-480, score-0.157]

77 It is worth noting that segmenting a subgroup of images that share the same foreground is itself a form of strong supervision. [sent-482, score-0.11]

78 The results of WSDC 2 will certainly be affected by the imbalanced labels and the irregularly appearing foregrounds and backgrounds. [sent-484, score-0.197]

79 This reflects that the guidance of weak supervision can boost the segmentation performance and is especially helpful for disambiguating easily confused categories, which is also a second target of our method. [sent-486, score-0.184]

80 Experiments on LabelMe dataset Fully-supervised methods [21, 24, 11] and weakly-supervised methods [27, 26] are used for comparison, and their condition settings are displayed in Table 4. [sent-489, score-0.13]

81 Semantic segmentation comparisons on LabelMe are presented in Table 5. [sent-490, score-0.119]

82 To the best of our knowledge, no work has reported the segmentation performance on the LabelMe dataset. [sent-494, score-0.119]

83 The experimental settings of our method and baselines on MSRC dataset. [sent-496, score-0.117]

84 The results of our method under different data settings on both datasets are reported in Table 6 and Table 7. [sent-509, score-0.099]

85 First, the highest and lowest accuracies on both datasets under different settings differ little, which shows that our method is relatively stable and robust. [sent-511, score-0.099]

86 Second, comparing settings 2 and 3, the obtained accuracies are comparable whether or not the test set τ2 is explored in the model learning process. [sent-512, score-0.094]

87 Perhaps owing to the simplicity of MSRC, the out-of-sample setting (setting 3) on that dataset achieves better performance than the in-sample setting (setting 2). [sent-513, score-0.09]

88 This demonstrates that our method is effective at semantically parsing images even when no labels are provided. [sent-515, score-0.161]

89 Finally, the proposed algorithm achieves the best performance with setting 5 and setting 2 on the MSRC and LabelMe datasets, respectively. [sent-516, score-0.09]

90 The reason may be that τ2 of MSRC has more images and fewer class labels than τ2 of LabelMe. [sent-517, score-0.162]

91 Conclusion In this paper, we propose a Weakly-Supervised Dual Clustering (WSDC) method to automatically segment the images into localized semantic regions. [sent-519, score-0.121]

92 We combine spectral clustering and discriminative clustering into a unified framework to integrate the contextual relationships between superpixels and discriminative features of multiple classes. [sent-520, score-0.707]

93 To fully exploit discriminative features, we impose the nonnegative constraint on the label matrix Y and l2,1-norm regularization on the linear transformation. [sent-521, score-0.278]

94 The image-level labels are imposed as weakly-supervised constraints to assign each cluster a semantic label. [sent-522, score-0.249]

95 Towards total scene understanding: Classification, annotation and segmentation in an automatic framework. [sent-565, score-0.119]

96 The experimental settings of our method and baselines on LabelMe dataset. [sent-592, score-0.117]

97 Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context. [sent-686, score-0.119]

98 Connecting modalities: Semisupervised segmentation and annotation of images using unaligned text corpora. [sent-691, score-0.119]

99 Inferring semantic concepts from community-contributed images and noisy tags. [sent-701, score-0.121]

100 Towards weakly supervised semantic segmentation by means of multiple instance and multitask learning. [sent-712, score-0.317]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('wsdc', 0.386), ('gic', 0.342), ('superpixels', 0.277), ('msrc', 0.236), ('yty', 0.171), ('labelme', 0.16), ('ilp', 0.154), ('labels', 0.128), ('cccp', 0.126), ('semantic', 0.121), ('segmentation', 0.119), ('xtw', 0.119), ('acc', 0.112), ('subgroup', 0.11), ('superpixel', 0.108), ('clustering', 0.107), ('maxl', 0.105), ('dual', 0.098), ('axxiyicj', 0.089), ('hcbuiy', 0.089), ('hct', 0.089), ('rlsim', 0.089), ('xxt', 0.088), ('vezhnevets', 0.088), ('spectral', 0.088), ('mj', 0.08), ('ytly', 0.079), ('yicj', 0.079), ('aver', 0.073), ('weaklysupervised', 0.069), ('foregrounds', 0.069), ('discriminative', 0.064), ('nonnegative', 0.064), ('label', 0.064), ('iou', 0.063), ('supervision', 0.063), ('orthogonality', 0.062), ('settings', 0.061), ('axxipitjy', 0.059), ('hctytqi', 0.059), ('hcytqi', 0.059), ('osp', 0.059), ('wsindtehaoc', 0.059), ('yijt', 0.059), ('rn', 0.059), ('liu', 0.058), ('baselines', 0.056), ('ic', 0.056), ('hc', 0.054), ('cosegmentation', 0.052), ('tag', 0.051), ('accuracies', 0.049), ('mim', 0.049), ('socher', 0.046), ('constraint', 0.046), ('weakly', 0.045), ('setting', 0.045), ('yij', 0.044), ('nlpr', 0.044), ('outputs', 0.042), ('adopt', 0.042), ('impose', 0.04), ('sij', 0.039), ('xi', 0.039), ('kind', 0.038), ('datasets', 0.038), ('mof', 0.038), ('promote', 0.038), ('lu', 0.038), ('rc', 0.038), ('nk', 0.037), ('mine', 0.037), ('ij', 0.036), ('lots', 0.036), ('ui', 0.036), ('joulin', 0.035), ('correlations', 0.035), ('gets', 0.034), ('class', 0.034), ('facts', 0.034), ('slic', 0.034), ('deemed', 0.033), ('secondly', 0.033), ('boost', 0.033), ('parsing', 0.033), ('nonparametric', 0.033), ('supervised', 0.032), ('yan', 0.032), ('confusing', 0.032), ('iterative', 0.032), ('tang', 0.032), ('besides', 0.032), ('element', 0.031), ('diagonal', 0.031), ('bi', 0.031), ('ferrari', 0.03), ('extensive', 0.03), ('coherent', 0.03), ('supporting', 0.03), ('encouraging', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

2 0.18868333 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

Author: Yan Wang, Rongrong Ji, Shih-Fu Chang

Abstract: Recent years have witnessed a growing interest in understanding the semantics of point clouds in a wide variety of applications. However, point cloud labeling remains an open problem, due to the difficulty of acquiring sufficient 3D point labels for training effective classifiers. In this paper, we overcome this challenge by utilizing the existing massive 2D semantic labeled datasets from decade-long community efforts, such as ImageNet and LabelMe, and a novel “cross-domain” label propagation approach. Our proposed method consists of two major novel components: Exemplar SVM based label propagation, which effectively addresses the cross-domain issue, and a graphical model based contextual refinement incorporating 3D constraints. Most importantly, the entire process does not require any training data from the target scenes, and it scales well to large-scale applications. We evaluate our approach on the well-known Cornell Point Cloud Dataset, achieving much greater efficiency and comparable accuracy even without any 3D training data. Our approach shows further major gains in accuracy when the training data from the target scenes is used, outperforming state-of-the-art approaches with far better efficiency.

3 0.18463598 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

Author: Gautam Singh, Jana Kosecka

Abstract: This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours, using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing, demonstrating the superiority of the method compared to state-of-the-art approaches.

4 0.17668577 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

Author: Guang Shu, Afshin Dehghan, Mubarak Shah

Abstract: We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained object detectors is usually degraded in unconstrained video environments due to varying illumination, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video-specific features like consistent color patterns. In our approach, we apply a Superpixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using superpixels, hence it can handle the problem of appearance variation. Most importantly, using a Conditional Random Field (CRF) along with our superpixel-based BoW model, we develop an algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach, which significantly improves the state-of-the-art detector by 5-16% in average precision.

5 0.15951969 29 cvpr-2013-A Video Representation Using Temporal Superpixels

Author: Jason Chang, Donglai Wei, John W. Fisher_III

Abstract: We develop a generative probabilistic model for temporally consistent superpixels in video sequences. In contrast to supervoxel methods, object parts in different frames are tracked by the same temporal superpixel. We explicitly model flow between frames with a bilateral Gaussian process and use this information to propagate superpixels in an online fashion. We consider four novel metrics to quantify performance of a temporal superpixel representation and demonstrate superior performance when compared to supervoxel methods.

6 0.15061529 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

7 0.15021476 425 cvpr-2013-Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation

8 0.15014881 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

9 0.14472355 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

10 0.14172377 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

11 0.13685614 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

12 0.13662337 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

13 0.12863249 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

14 0.12166391 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

15 0.12120759 262 cvpr-2013-Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets

16 0.11117217 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

17 0.10670295 93 cvpr-2013-Constraints as Features

18 0.097287111 294 cvpr-2013-Multi-class Video Co-segmentation with a Generative Multi-video Model

19 0.094152004 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters

20 0.087519191 222 cvpr-2013-Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.209), (1, -0.045), (2, 0.029), (3, 0.025), (4, 0.136), (5, 0.021), (6, 0.021), (7, 0.07), (8, -0.14), (9, 0.01), (10, 0.188), (11, -0.082), (12, -0.014), (13, 0.062), (14, -0.073), (15, -0.003), (16, 0.058), (17, -0.072), (18, -0.114), (19, 0.07), (20, 0.086), (21, -0.015), (22, -0.062), (23, 0.019), (24, -0.047), (25, 0.013), (26, -0.072), (27, 0.01), (28, 0.105), (29, -0.055), (30, 0.043), (31, -0.005), (32, 0.032), (33, -0.082), (34, 0.02), (35, -0.025), (36, -0.083), (37, 0.032), (38, 0.078), (39, -0.048), (40, 0.009), (41, -0.038), (42, -0.042), (43, 0.052), (44, 0.012), (45, -0.068), (46, 0.001), (47, 0.012), (48, 0.049), (49, -0.062)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92843074 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

2 0.85086346 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

Author: Luming Zhang, Mingli Song, Zicheng Liu, Xiao Liu, Jiajun Bu, Chun Chen

Abstract: Weakly supervised image segmentation is a challenging problem in the computer vision field. In this paper, we present a new weakly supervised image segmentation algorithm by learning the distribution of spatially structured superpixel sets from image-level labels. Specifically, we first extract graphlets from each image, where a graphlet is a small-sized graph consisting of superpixels as its nodes and it encapsulates the spatial structure of those superpixels. Then, a manifold embedding algorithm is proposed to transform graphlets of different sizes into equal-length feature vectors. Thereafter, we use GMM to learn the distribution of the post-embedding graphlets. Finally, we propose a novel image segmentation algorithm, called graphlet cut, that leverages the learned graphlet distribution in measuring the homogeneity of a set of spatially structured superpixels. Experimental results show that the proposed approach outperforms state-of-the-art weakly supervised image segmentation methods, and its performance is comparable to those of the fully supervised segmentation models.

3 0.82186198 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

Author: Xiaobai Liu, Liang Lin, Alan L. Yuille

Abstract: In this work, we present an efficient multi-scale low-rank representation for image segmentation. Our method begins with partitioning the input images into a set of superpixels, followed by seeking the optimal superpixel-pair affinity matrix, both of which are performed at multiple scales of the input images. Since low-level superpixel features are usually corrupted by image noise, we propose to infer the low-rank refined affinity matrix. The inference is guided by two observations on natural images. First, looking into a single image, local small-size image patterns tend to recur frequently within the same semantic region, but may not appear in semantically different regions. We call this internal image statistic the replication prior, and quantitatively justify it on real image databases. Second, the affinity matrices at different scales should be consistently solved, which leads to the cross-scale consistency constraint. We formulate these two purposes with one unified formulation and develop an efficient optimization procedure. Our experiments demonstrate that the presented method can substantially improve segmentation accuracy.

4 0.79358131 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

Author: Yan Wang, Rongrong Ji, Shih-Fu Chang

5 0.78581989 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

Author: Jeremie Papon, Alexey Abramov, Markus Schoeler, Florentin Wörgötter

Abstract: Unsupervised over-segmentation of an image into regions of perceptually similar pixels, known as superpixels, is a widely used preprocessing step in segmentation algorithms. Superpixel methods reduce the number of regions that must be considered later by more computationally expensive algorithms, with a minimal loss of information. Nevertheless, as some information is inevitably lost, it is vital that superpixels not cross object boundaries, as such errors will propagate through later steps. Existing methods make use of projected color or depth information, but do not consider three dimensional geometric relationships between observed data points which can be used to prevent superpixels from crossing regions of empty space. We propose a novel over-segmentation algorithm which uses voxel relationships to produce over-segmentations which are fully consistent with the spatial geometry of the scene in three dimensional, rather than projective, space. Enforcing the constraint that segmented regions must have spatial connectivity prevents label flow across semantic object boundaries which might otherwise be violated. Additionally, as the algorithm works directly in 3D space, observations from several calibrated RGB+D cameras can be segmented jointly. Experiments on a large data set of human annotated RGB+D images demonstrate a significant reduction in occurrence of clusters crossing object boundaries, while maintaining speeds comparable to state-of-the-art 2D methods.

6 0.77165276 29 cvpr-2013-A Video Representation Using Temporal Superpixels

7 0.74256802 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

8 0.70751858 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

9 0.68909657 26 cvpr-2013-A Statistical Model for Recreational Trails in Aerial Images

10 0.66626018 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

11 0.65524924 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

12 0.62736779 280 cvpr-2013-Maximum Cohesive Grid of Superpixels for Fast Object Localization

13 0.61561298 425 cvpr-2013-Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation

14 0.60909933 406 cvpr-2013-Spatial Inference Machines

15 0.59448195 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations

16 0.58918589 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

17 0.5843268 262 cvpr-2013-Learning for Structured Prediction Using Approximate Subgradient Descent with Working Sets

18 0.5807367 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

19 0.55791801 193 cvpr-2013-Graph Transduction Learning with Connectivity Constraints with Application to Multiple Foreground Cosegmentation

20 0.5104689 93 cvpr-2013-Constraints as Features

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.123), (16, 0.029), (22, 0.257), (26, 0.066), (28, 0.012), (33, 0.251), (67, 0.051), (69, 0.05), (80, 0.017), (87, 0.07)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.81533855 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

2 0.81405461 123 cvpr-2013-Detection of Manipulation Action Consequences (MAC)

Author: Yezhou Yang, Cornelia Fermüller, Yiannis Aloimonos

Abstract: The problem of action recognition and human activity has been an active research area in Computer Vision and Robotics. While full-body motions can be characterized by movement and change of posture, no characterization that holds invariance has yet been proposed for the description of manipulation actions. We propose that a fundamental concept in understanding such actions is the consequences of actions. There is a small set of fundamental primitive action consequences that provides a systematic high-level classification of manipulation actions. In this paper a technique is developed to recognize these action consequences. At the heart of the technique lies a novel active tracking and segmentation method that monitors the changes in appearance and topological structure of the manipulated object. These are then used in a visual semantic graph (VSG) based procedure applied to the time sequence of the monitored object to recognize the action consequence. We provide a new dataset, called Manipulation Action Consequences (MAC 1.0), which can serve as a testbed for other studies on this topic. Several experiments on this dataset demonstrate that our method can robustly track objects and detect their deformations and division during the manipulation. Quantitative tests prove the effectiveness and efficiency of the method.

3 0.80004859 143 cvpr-2013-Efficient Large-Scale Structured Learning

Author: Steve Branson, Oscar Beijbom, Serge Belongie

Abstract: unknown-abstract

4 0.77929997 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances

Author: Feng Lu, Yasuyuki Matsushita, Imari Sato, Takahiro Okabe, Yoichi Sato

Abstract: We propose an uncalibrated photometric stereo method that works with general and unknown isotropic reflectances. Our method uses a pixel intensity profile, which is a sequence of radiance intensities recorded at a pixel across multi-illuminance images. We show that for general isotropic materials, the geodesic distance between intensity profiles is linearly related to the angular difference of their surface normals, and that the intensity distribution of an intensity profile conveys information about the reflectance properties, when the intensity profile is obtained under uniformly distributed directional lightings. Based on these observations, we show that surface normals can be estimated up to a convex/concave ambiguity. A solution method based on matrix decomposition with missing data is developed for a reliable estimation. Quantitative and qualitative evaluations of our method are performed using both synthetic and real-world scenes.

5 0.77818191 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

Author: Sven Wanner, Christoph Straehle, Bastian Goldluecke

Abstract: We present the first variational framework for multi-label segmentation on the ray space of 4D light fields. For traditional segmentation of single images, features need to be extracted from the 2D projection of a three-dimensional scene. The associated loss of geometry information can cause severe problems, for example if different objects have a very similar visual appearance. In this work, we show that using a light field instead of an image not only enables to train classifiers which can overcome many of these problems, but also provides an optimal data structure for label optimization by implicitly providing scene geometry information. It is thus possible to consistently optimize label assignment over all views simultaneously. As a further contribution, we make all light fields available online with complete depth and segmentation ground truth data where available, and thus establish the first benchmark data set for light field analysis to facilitate competitive further development of algorithms.

6 0.75413036 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

7 0.75167 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

8 0.75143516 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

9 0.75107801 311 cvpr-2013-Occlusion Patterns for Object Class Detection

10 0.75048029 414 cvpr-2013-Structure Preserving Object Tracking

11 0.75047278 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

12 0.74958301 325 cvpr-2013-Part Discovery from Partial Correspondence

13 0.74916899 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking

14 0.74895895 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

15 0.7487967 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis

16 0.74836087 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models

17 0.74737352 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration

18 0.74734306 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

19 0.74734306 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

20 0.74711055 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds