309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context


Source: pdf

Author: Gautam Singh, Jana Kosecka

Abstract: This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours, using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing, demonstrating the superiority of the method over state-of-the-art approaches.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Nonparametric scene parsing with adaptive feature relevance and semantic context Gautam Singh Jana Košecká George Mason University Fairfax, VA {gsinghc,kosecka}@cs. [sent-1, score-0.729]

2 edu Abstract This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. [sent-3, score-0.676]

3 We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. [sent-4, score-0.457]

4 To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours using a novel semantic descriptor to retrieve better candidates. [sent-5, score-1.082]

5 The approach is validated by experiments on several datasets used for semantic parsing, demonstrating the superiority of the method over state-of-the-art approaches. [sent-6, score-0.542]

6 Introduction The problem of semantic labelling requires simultaneous segmentation of an image into regions and categorization of all the image pixels. [sent-8, score-0.38]

7 With the increasing complexity and size of the datasets used for evaluation of semantic segmentation, nonparametric techniques [15, 26] combined with various context driven retrieval strategies have demonstrated notable improvement in the performance. [sent-11, score-0.848]

8 These methods typically start with an oversegmentation of an image into superpixels followed by the computation of a rich set of features characterizing both appearance and local geometry at the superpixel level. [sent-12, score-0.479]

9 In the proposed work, we follow a nonparametric approach and make the following contributions: (i) We forgo the use of large superpixels and complex features and tackle the problem of semantic segmentation using local patches characterized by gradient orientation, color and location features. [sent-14, score-0.849]

10 The proposed approach is validated extensively on several semantic segmentation datasets consistently showing improved performance over the state of the art methods. [sent-16, score-0.486]

11 Related Work In recent years, a large number of approaches for semantic segmentation have been proposed. [sent-18, score-0.38]

12 Context is often captured by a retrieval set of images similar to the query and methods developed for establishing matches between image regions (at pixel or superpixel level) for labelling the image. [sent-28, score-0.951]

13 The authors of [26] work at the superpixel level and retrieve similar images using global image features, followed by superpixel-level matching using local features and a Markov random field (MRF) to incorporate neighbourhood context. [sent-30, score-0.345]

14 The work of [26] was extended by [4] by training per-superpixel, per-feature weights and also by incorporating superpixel-level semantic context. [sent-31, score-0.743]

15 A set of partially similar images is used in [31] by searching for matches for each region of the query image and then using the retrieval set for label transfer. [sent-32, score-0.538]

16 A nonparametric method which avoids the construction of a retrieval set is [8] which instead addresses the problem of semantic labelling by building a graph of patch correspondences across image sets and transfers annotations to unlabeled images using the established correspondences. [sent-33, score-1.059]

17 Our work is closely related to the work of [26, 4] in that we also pursue a nonparametric approach, but differs in the choice of elementary regions, features, feature relevance learning and the method for computing the retrieval set for k-NN classification. [sent-35, score-0.631]

18 In our case, the retrieval set is obtained in a feedback manner using a novel semantic label descriptor computed from the initial semantic segmentation. [sent-36, score-1.132]

19 Similarly to [4], we follow the observation that a single global distance metric is often not sufficient for handling the large variations within a class and propose to compute weights for individual feature channels. [sent-37, score-0.314]

20 The computation of the feature relevance we adopt falls into a broad class of distance metric learning techniques which have been shown to be beneficial for many problems like image classification [5], object segmentation [17] and image annotation [9]. [sent-39, score-0.372]

21 Approach In this section, we will describe our baseline approach, followed by the method of weight computation in Section 4 and semantic contextual retrieval in Section 5. [sent-42, score-0.709]

22 Problem Formulation We formulate the semantic segmentation of an image segmented into small superpixels. [sent-45, score-0.38]

23 The output of the semantic segmentation is a labelling L = (l1, l2, . [sent-46, score-0.367]

24 The posterior probability of a labelling L given the observed appearance feature vectors A = [a1, a2 , . [sent-57, score-0.408]

25 We estimate the labelling L as the maximum a posteriori (MAP) estimate, $\arg\max_L P(L|A) = \arg\max_L P(A|L)\,P(L)$. [sent-61, score-0.384]

26 Superpixels and features For an image, we extract superpixels using a segmentation method [29] in which superpixel boundaries are obtained as watersheds on a negative absolute Laplacian image with LoG extrema as seeds. [sent-65, score-0.479]

27 These blob-based superpixels are efficient to compute and naturally consistent with the boundaries. [sent-66, score-0.272]

28 Similarly to [18], for each superpixel, we compute a 133-dimensional feature vector ai comprising a SIFT descriptor (128 dimensions), the color mean over the pixels of the superpixel in Lab color space (3 dimensions) and the location of the superpixel centroid (2 dimensions). [sent-67, score-0.898]

29 The SIFT descriptor for a superpixel is computed at a fixed scale and orientation using publicly available code [27]. [sent-68, score-0.362]
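
The 133-dimensional layout described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `superpixel_feature` and the ordering SIFT, then color, then location inside the vector are assumptions, and the SIFT descriptor is taken as already computed.

```python
from typing import List, Sequence, Tuple


def superpixel_feature(
    sift: Sequence[float],
    lab_pixels: Sequence[Tuple[float, float, float]],
    pixel_coords: Sequence[Tuple[float, float]],
) -> List[float]:
    """Concatenate SIFT (128), mean Lab color (3) and centroid (2) into 133 dims.

    The channel ordering is an assumption made for this sketch.
    """
    if len(sift) != 128:
        raise ValueError("expected a 128-dimensional SIFT descriptor")
    if not lab_pixels or len(lab_pixels) != len(pixel_coords):
        raise ValueError("need one Lab value and one coordinate per pixel")
    n = len(lab_pixels)
    # Mean color over all pixels of the superpixel, one value per Lab channel.
    mean_lab = [sum(p[c] for p in lab_pixels) / n for c in range(3)]
    # Centroid of the superpixel: mean of its pixel coordinates.
    centroid = [sum(p[d] for p in pixel_coords) / n for d in range(2)]
    return list(sift) + mean_lab + centroid
```

In practice the location would probably be normalized by the image size so that images of different resolutions are comparable; the extract does not say whether this is done.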

30 The individual label likelihood P(ai |lj) for a superpixel si is obtained using a k-NN method. [sent-77, score-0.462]

31 Since a superpixel is uniquely represented by its feature vector, we use the symbols si and ai interchangeably. [sent-78, score-0.347]

32 We compute the normalized label likelihood score using the individual label likelihood: • P(ai|lj) =nLL(ai,lj) ? [sent-81, score-0.412]

33 A straightforward way to compute the neighbourhood Nik is to use the concatenated feature ai (Section 3. [sent-86, score-0.386]

34 2) and retrieve the k nearest points by computing distance to superpixels in G. [sent-87, score-0.367]

35 Such a retrieval can be efficiently performed by the use of approximate nearest neighbour methods like k-d trees [19]. [sent-88, score-0.422]
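
A brute-force version of this neighbourhood retrieval might look like the sketch below. It is exact rather than approximate, so it only stands in for the k-d tree search mentioned above; the helper names `k_nearest` and `neighbour_label_counts` are hypothetical, and the raw label counts are not the paper's normalized label likelihood, whose equation is not recoverable from this extract.

```python
import heapq
import math
from collections import Counter


def k_nearest(query, gallery, k):
    """Return indices of the k gallery vectors closest to `query` (Euclidean)."""
    dists = ((math.dist(query, g), idx) for idx, g in enumerate(gallery))
    # nsmallest sorts (distance, index) pairs, so ties are broken by index.
    return [idx for _, idx in heapq.nsmallest(k, dists)]


def neighbour_label_counts(query, gallery, labels, k):
    """Count how often each label occurs among the k nearest neighbours."""
    return Counter(labels[i] for i in k_nearest(query, gallery, k))
```

For large retrieval sets the linear scan in `k_nearest` would be replaced by an approximate index, which is the role the k-d trees of [19] play in the text.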

36 (2) can be rewritten in log-space and the optimal labelling $L^*$ achieved as $\arg\min_L \ldots$ [sent-93, score-0.32]

37 For example, when trying to label a seaside image, it is more helpful if we search for the nearest neighbours in images of beaches and discard views from street scenes. [sent-108, score-0.336]

38 It helps discard images which are dissimilar to the query image and provides a scene-level context which can help improve the labelling performance. [sent-110, score-0.589]

39 The retrieval subset will serve as the source of image annotations which will be used to label the query image. [sent-111, score-0.538]

40 All the images in the training set T are ranked for each individual global image feature in ascending order of the Euclidean distance from the query image. [sent-113, score-0.465]

41 Finally, we select a subset of images Tg from the training set T as the retrieval set. [sent-115, score-0.269]
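
The ranking step can be sketched as below. The extract states that images are ranked per global feature in ascending order of Euclidean distance and that a subset is then selected, but it does not say how the per-feature ranks are combined. Taking the best (minimum) rank over features is an assumption of this sketch, as are the function name `build_retrieval_set` and the dictionary layout.

```python
import math


def build_retrieval_set(query_feats, train_feats, size):
    """Pick `size` training images for the retrieval set.

    query_feats: dict mapping a global feature name to the query's vector.
    train_feats: list of dicts with the same keys, one per training image.
    Per-feature ranks are combined by taking the minimum (an assumption).
    """
    n = len(train_feats)
    best_rank = [n] * n
    for name, q in query_feats.items():
        # Rank all training images for this feature, closest first.
        order = sorted(range(n), key=lambda i: math.dist(q, train_feats[i][name]))
        for rank, i in enumerate(order):
            best_rank[i] = min(best_rank[i], rank)
    # Keep the images with the best combined rank; ties broken by index.
    return sorted(range(n), key=lambda i: (best_rank[i], i))[:size]
```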

42 In the next two sections, we describe in detail the two contributions of this work: a method for weighting different feature channels and the strategy for improving the retrieval set. [sent-120, score-0.37]

43 Weighted k-NN The baseline k-NN approach uses Euclidean distance to compute the neighbourhood around the point. [sent-122, score-0.275]

44 We propose to use a weighted k-NN method to compute the neighbourhood of a query point. [sent-123, score-0.483]

45 To compute a weighted distance between two superpixels ai and aj, we split the feature vector into three feature channels of gradient orientation, color and location and first compute distances in the individual feature spaces: $d^f_{ij} = [d^c_{ij}, d^s_{ij}, d^l_{ij}]^\top$ (7) [sent-124, score-1.004]

46 where $d^c_{ij}$, $d^s_{ij}$ and $d^l_{ij}$ are the Euclidean distances between the color, SIFT and location channels of the feature vectors ai and aj of the two superpixels, respectively. [sent-125, score-0.534]

47 We now define a weighted distance between the two superpixels as $d^w_{ij} = w^\top d^f_{ij}$ [sent-126, score-0.31]

48 (8), we can now obtain the neighbourhood Nik around a superpixel by applying it to the feature distance vector $d^f_{ij}$ between ai and aj ∈ G to compute the label likelihood scores in Eq. [sent-129, score-0.941]
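
As a rough illustration of the channel-wise weighted distance, the sketch below splits a 133-dimensional vector into its three channels, computes one Euclidean distance per channel and combines them with a weight vector. The slice positions (SIFT first, then color, then location) and the function names are assumptions; the weighted sum is the natural reading of the weighted distance defined above.

```python
import math

# Assumed layout of the 133-d vector: SIFT, then Lab mean, then centroid.
SIFT = slice(0, 128)
COLOR = slice(128, 131)
LOCATION = slice(131, 133)


def channel_distances(a_i, a_j):
    """Return the per-channel distances ordered as [color, SIFT, location]."""
    return [math.dist(a_i[s], a_j[s]) for s in (COLOR, SIFT, LOCATION)]


def weighted_distance(a_i, a_j, w):
    """Weighted sum of the three channel distances, with w in the same order."""
    return sum(wk * dk for wk, dk in zip(w, channel_distances(a_i, a_j)))
```

With equal weights this reduces to a plain sum of the channel distances, which differs from the Euclidean distance on the concatenated vector used by the baseline.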

49 Weight computation With the varying nature of the retrieval set for individual query images, we use the locally adaptive metric approach of [3] for the weight computation. [sent-133, score-0.709]

50 In our setting, the test points are the individual superpixels of the query image. [sent-135, score-0.501]

51 The goal is to estimate the relevance of a feature channel i by evaluating its ability to predict class posterior probabilities locally at a query point. [sent-136, score-0.522]

52 For the query point x0, the relevance for feature i can be computed by averaging the $r_i(z)$'s in its neighbourhood: $\bar{r}_i(x_0) = \frac{1}{|N(x_0)|} \sum_{z \in N(x_0)} r_i(z)$ [sent-140, score-0.599]

53 where m is the number of individual feature channels (three in our case), c is a parameter which determines the influence of $\bar{r}_i$ (at c = 0, all three feature channels have equal weights) and $R_i(x_0) = \max_{p=1}^{m} \{\bar{r}_p(x_0)\} - \bar{r}_i(x_0)$. [sent-145, score-0.349]
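
The weight equation itself is missing from this extract. The sketch below assumes the exponential weighting scheme commonly associated with the locally adaptive metric of [3], namely a weight proportional to exp(c R_i) normalized over the m channels. That form is an assumption here, chosen because it is consistent with the stated behaviour that c = 0 yields equal weights; the function names are hypothetical.

```python
import math


def average_relevance(r_values):
    """Average the per-neighbour relevance values r_i(z) for one channel."""
    return sum(r_values) / len(r_values)


def channel_weights(r_bar, c):
    """Turn averaged relevances into normalized channel weights.

    r_bar: one averaged value per channel.
    c: influence parameter; c = 0 gives equal weights.
    Assumed form: w_i = exp(c * R_i) / sum_l exp(c * R_l),
    with R_i = max_p(r_bar_p) - r_bar_i.
    """
    r_max = max(r_bar)
    big_r = [r_max - r for r in r_bar]
    exps = [math.exp(c * value) for value in big_r]
    total = sum(exps)
    return [e / total for e in exps]
```

Under this assumed form, the channel with the smallest averaged value receives the largest weight, and increasing c sharpens the difference between channels.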

54 Semantic Contextual Retrieval The semantic labelling of an image, even if inaccurate, provides a strong cue about the presence and absence of different categories in the image. [sent-150, score-0.653]

55 While the idea of using context to improve the labelling has been explored in the past for image superpixels [20, 4], here we examine the effectiveness of this idea in the stage of improving the entire retrieval set. [sent-151, score-0.86]

56 In order to do so, we propose a global descriptor derived from the initial labelling of the image which will be used to improve the retrieval set. [sent-152, score-0.707]

57 To summarize the semantic label information of a labeled image, we introduce the semantic label descriptor for a labelled image. [sent-153, score-1.102]

58 Our proposed descriptor helps encode the positional information of each category in the image and can be used for semantic contextual retrieval. [sent-156, score-0.515]

59 ls of the layout more precisely but be more prone to classification errors while a lower value for n would be less sensitive to errors in the labelling but does not encode the spatial position of the semantic categories as well. [sent-167, score-0.731]
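
One plausible construction of such a descriptor is sketched below: the label map is divided into an n × n grid and a normalized label histogram is computed per cell, then all histograms are concatenated. The extract only establishes that the descriptor encodes category positions and depends on a grid parameter n, so the per-cell histogram, the function name and the input format (a list of rows of integer labels) are all assumptions.

```python
def semantic_label_descriptor(label_map, num_classes, n):
    """Concatenate per-cell label histograms over an n x n grid.

    label_map: list of rows, each a list of integer labels in [0, num_classes).
    Returns a list of length n * n * num_classes.
    """
    rows, cols = len(label_map), len(label_map[0])
    descriptor = []
    for gy in range(n):
        r0, r1 = gy * rows // n, (gy + 1) * rows // n
        for gx in range(n):
            c0, c1 = gx * cols // n, (gx + 1) * cols // n
            hist = [0.0] * num_classes
            for r in range(r0, r1):
                for c in range(c0, c1):
                    hist[label_map[r][c]] += 1.0
            area = (r1 - r0) * (c1 - c0)
            # Normalize so each cell's histogram sums to one (if non-empty).
            descriptor.extend(h / area if area else 0.0 for h in hist)
    return descriptor
```

With n = 1 the descriptor is a single global label histogram with no spatial information, which matches the trade-off described for small n.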

60 This approach of computing a semantic label-based descriptor is similar to [10]. [sent-168, score-0.454]

61 Our method also differs from [4] who compute a superpixel-level semantic context descriptor as a normalized label histogram of neighbouring regions. [sent-171, score-0.734]

62 Semantic Retrieval Set Global image features (GIST, color histograms and spatial pyramid over SIFT) were used to build retrieval set Tg in Section 3. [sent-174, score-0.337]

63 We now use the semantic label descriptor fseman introduced above to help us refine the quality of the retrieval set by exploiting the semantic context. [sent-176, score-1.233]

64 Using the resultant semantic image labelling, we generate its corresponding semantic label descriptor $f^k_{seman}$. [sent-178, score-0.897]

65 Similarly, for the query view Iq, we label it using the WKNN-MRF method and compute the corresponding semantic label descriptor. [sent-179, score-0.821]

66 We generate a new set of ranking for the images in training set T based on the distance between their semantic label descriptor and that of the query image. [sent-180, score-0.854]

67 The ranking is computed in an ascending order of the semantic label descriptor distances. [sent-181, score-0.636]
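
The re-ranking can be sketched in a few lines. The distance between descriptors is not specified in this extract, so Euclidean distance is an assumption, as is the function name `semantic_retrieval_set`.

```python
import math


def semantic_retrieval_set(query_desc, train_descs, size):
    """Rank training images by descriptor distance (ascending) and keep `size`.

    Euclidean distance is assumed; ties are broken by training index.
    """
    order = sorted(
        range(len(train_descs)),
        key=lambda i: (math.dist(query_desc, train_descs[i]), i),
    )
    return order[:size]
```

Both the query descriptor and the training descriptors come from predicted labellings, so errors in the initial labelling propagate into this ranking.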

68 Using the new retrieval set Ts, we once again perform semantic labelling on the image by the process described in Section 3. [sent-184, score-0.888]

69 The WLKNN refers to a weighted k-NN using a retrieval set built using the label descriptor only. [sent-188, score-0.514]

70 We also experiment with using the semantic layout descriptor with all the other three global image features for the building of the retrieval set and denote this method WAKNN-MRF. [sent-189, score-0.768]

71 The evaluation criterion for the methods is the per pixel accuracy (percentage of pixels correctly labelled) and per class accuracy (the average of semantic category accuracies). [sent-192, score-0.369]
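
The two evaluation measures can be computed as in the following sketch, which takes flat sequences of predicted and ground-truth labels. The function names are illustrative, and averaging only over classes that occur in the ground truth is an assumption.

```python
def per_pixel_accuracy(pred, truth):
    """Percentage of pixels whose predicted label matches the ground truth."""
    correct = sum(p == t for p, t in zip(pred, truth))
    return 100.0 * correct / len(truth)


def per_class_accuracy(pred, truth):
    """Mean of the per-class accuracies, over classes present in the truth."""
    totals, hits = {}, {}
    for p, t in zip(pred, truth):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (p == t)
    return 100.0 * sum(hits[c] / totals[c] for c in totals) / len(totals)
```

The two numbers can diverge sharply when classes are imbalanced: mislabelling every pixel of a rare class barely changes the per-pixel figure but lowers the per-class average noticeably.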

72 For Stanford Background and Google Street View datasets, we selected 10% of the training images as the size of our retrieval set. [sent-193, score-0.269]

73 Computation of the feature weights required an average of four minutes for a single query image. [sent-200, score-0.304]

74 To help speed up the computation of the weights, we approximate the neighbourhood construction of [3] through k-d trees [19]. [sent-201, score-0.289]

75 For the query view, we index the individual features from the retrieval set in a k-d tree, constructing one k-d tree per feature channel. [sent-202, score-0.593]

76 The neighbourhood computation is then approximated using the set union of the k-NN from different feature channels. [sent-203, score-0.299]
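
A sketch of this approximation is given below: nearest neighbours are found separately in each feature channel and the candidate neighbourhood is the union of the resulting index sets. A brute-force scan stands in for the per-channel k-d trees, and the function name and argument layout are assumptions.

```python
import heapq
import math


def union_of_channel_knn(query_channels, gallery_channels, k):
    """Union of the k nearest neighbours found independently in each channel.

    query_channels: one query vector per channel.
    gallery_channels: for each channel, a list of vectors with one entry per
    gallery superpixel, using the same index order in every channel.
    """
    candidates = set()
    for q, gallery in zip(query_channels, gallery_channels):
        dists = ((math.dist(q, g), i) for i, g in enumerate(gallery))
        candidates.update(i for _, i in heapq.nsmallest(k, dists))
    return candidates
```

The union holds at most k candidates per channel, so with three channels it never exceeds 3k entries.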

77 (11) adaptively changing the nearest neighbours in the weighted neighbourhood space. [sent-205, score-0.373]

78 SiftFlow SiftFlow is a large dataset of 2688 images with 33 semantic categories. [sent-211, score-0.333]

79 When we incorporate semantic context to obtain a refined retrieval set, our system achieves the best performance for both per-pixel and per-class accuracies. [sent-216, score-0.644]

80 The categories which saw an increase of more than 10% after the use of semantic context include field, car, river, plant, sidewalk, bridge, door, crosswalk. [sent-217, score-0.409]

81 These are categories which do not occur very frequently but achieved improved labelling with the context. [sent-218, score-0.32]

82 For example, identifying road and highways helps label cars, sidewalk and crosswalk. [sent-219, score-0.299]

83 We also experimented with replacing the SIFT feature for the superpixel with a HOG feature [2]. [sent-224, score-0.309]

84 The individual HOG cell descriptors were averaged to compute the superpixel feature. [sent-226, score-0.358]

85 SUN09 SUN09 dataset [1] has fully labelled per-pixel ground truth for a set of 107 semantic categories. [sent-229, score-0.428]

86 Using the semantic context helped obtain an improvement of 3. [sent-232, score-0.473]

87 It was observed that the per-pixel labelling accuracy of outdoor scenes was more than 11% better than indoor scenes highlighting the challenge of labelling indoor views. [sent-235, score-0.64]

88 Examples (a)-(c) are instances of semantic context improving the labelling as trees and mountains are predicted in the initial labelling. [sent-251, score-0.772]

89 In comparison to the other methods, our performance was in the top-two for the per-pixel accuracy and for two semantic categories. [sent-261, score-0.333]

90 Stanford-Background This dataset contains 715 images with two separate label sets; semantic and geometric. [sent-262, score-0.443]

91 We conducted our experiments for predicting the semantic category only. [sent-263, score-0.333]

92 The semantic classes include seven background classes and a generic foreground class. [sent-264, score-0.333]

93 The use of semantic context leads to an improvement of only 0. [sent-267, score-0.409]

94 The lack of significant improvement with the use of semantic context here can be explained by the nature of the dataset as more than 90% of the images contain 4 or more of the 8 semantic categories. [sent-269, score-0.742]

95 Conclusions We have presented an approach for nonparametric scene parsing using a k-NN method. [sent-272, score-0.274]

96 A locally adaptive distance metric is learned at query time to compute the relevance of individual feature channels. [sent-274, score-0.645]

97 Using the initial labelling as a contextual cue for the presence or absence of objects in the scene, we proposed a semantic context descriptor which helped refine the quality of the retrieval set, a key component of nonparametric methods. [sent-275, score-1.42]

98 For future work, we would like to explore better methods for incorporating spatial information at the patch level and also explore learning semantic concepts for scene understanding. [sent-278, score-0.333]

99 Partial similarity based nonparametric scene parsing in certain environment. [sent-484, score-0.274]

100 Supervised label transfer for semantic segmentation of street scenes. [sent-490, score-0.59]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('semantic', 0.333), ('labelling', 0.32), ('lj', 0.28), ('retrieval', 0.235), ('superpixels', 0.229), ('superpixel', 0.203), ('neighbourhood', 0.199), ('query', 0.193), ('nonparametric', 0.171), ('nik', 0.128), ('relevance', 0.122), ('descriptor', 0.121), ('siftflow', 0.12), ('label', 0.11), ('parsing', 0.103), ('street', 0.1), ('labelled', 0.095), ('difj', 0.093), ('neighbour', 0.092), ('ai', 0.091), ('channels', 0.082), ('individual', 0.079), ('sidewalk', 0.077), ('context', 0.076), ('neighbours', 0.074), ('pages', 0.074), ('likelihood', 0.07), ('pyramid', 0.066), ('helped', 0.064), ('fseman', 0.062), ('superpixellevel', 0.062), ('contextual', 0.061), ('ladicky', 0.06), ('tg', 0.06), ('nl', 0.059), ('weights', 0.058), ('road', 0.057), ('trail', 0.055), ('highways', 0.055), ('eck', 0.055), ('feature', 0.053), ('ko', 0.053), ('sturgess', 0.053), ('retrieve', 0.053), ('nearest', 0.052), ('grid', 0.051), ('gould', 0.051), ('neighbouring', 0.051), ('elementary', 0.05), ('sift', 0.05), ('layout', 0.048), ('weighted', 0.048), ('segmentation', 0.047), ('computation', 0.047), ('locally', 0.046), ('aj', 0.046), ('zi', 0.044), ('river', 0.044), ('streets', 0.044), ('mrf', 0.044), ('trees', 0.043), ('compute', 0.043), ('buildings', 0.043), ('ascending', 0.042), ('adaptive', 0.042), ('validated', 0.042), ('car', 0.039), ('refine', 0.039), ('orientation', 0.038), ('channel', 0.037), ('sea', 0.037), ('door', 0.036), ('class', 0.036), ('color', 0.036), ('mountain', 0.035), ('posterior', 0.035), ('split', 0.035), ('russell', 0.034), ('vlfeat', 0.034), ('metric', 0.034), ('training', 0.034), ('datasets', 0.033), ('distance', 0.033), ('weight', 0.033), ('cell', 0.033), ('inference', 0.033), ('location', 0.033), ('tree', 0.033), ('view', 0.032), ('argmax', 0.032), ('grass', 0.032), ('google', 0.032), ('ri', 0.032), ('art', 0.031), ('plant', 0.031), ('global', 0.031), ('accuracies', 0.031), ('bridge', 0.03), ('ranking', 0.03), ('ls', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

2 0.31645396 425 cvpr-2013-Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation

Author: Heesoo Myeong, Kyoung Mu Lee

Abstract: We propose a novel nonparametric approach for semantic segmentation using high-order semantic relations. Conventional context models mainly focus on learning pairwise relationships between objects. Pairwise relations, however, are not enough to represent high-level contextual knowledge within images. In this paper, we propose semantic relation transfer, a method to transfer high-order semantic relations of objects from annotated images to unlabeled images, analogous to label transfer techniques where label information is transferred. We first define semantic tensors representing high-order relations of objects. The semantic relation transfer problem is then formulated as semi-supervised learning using a quadratic objective function of the semantic tensors. By exploiting the low-rank property of the semantic tensors and employing Kronecker sum similarity, an efficient approximation algorithm is developed. Based on the predicted high-order semantic relations, we reason semantic segmentation and evaluate the performance on several challenging datasets.

3 0.22907162 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes

Author: Julien P.C. Valentin, Sunando Sengupta, Jonathan Warrell, Ali Shahrokni, Philip H.S. Torr

Abstract: Semantic reconstruction of a scene is important for a variety of applications such as 3D modelling, object recognition and autonomous robotic navigation. However, most object labelling methods work in the image domain and fail to capture the information present in 3D space. In this work we propose a principled way to generate object labelling in 3D. Our method builds a triangulated meshed representation of the scene from multiple depth estimates. We then define a CRF over this mesh, which is able to capture the consistency of geometric properties of the objects present in the scene. In this framework, we are able to generate object hypotheses by combining information from multiple sources: geometric properties (from the 3D mesh), and appearance properties (from images). We demonstrate the robustness of our framework in both indoor and outdoor scenes. For indoor scenes we created an augmented version of the NYU indoor scene dataset (RGB-D images) with object labelled meshes for training and evaluation. For outdoor scenes, we created ground truth object labellings for the KITTI odometry dataset (stereo image sequence). We observe a significant speed-up in the inference stage by performing labelling on the mesh, and additionally achieve higher accuracies.

4 0.21698162 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

Author: Yan Wang, Rongrong Ji, Shih-Fu Chang

Abstract: Recent years have witnessed a growing interest in understanding the semantics of point clouds in a wide variety of applications. However, point cloud labeling remains an open problem, due to the difficulty in acquiring sufficient 3D point labels towards training effective classifiers. In this paper, we overcome this challenge by utilizing the existing massive 2D semantic labeled datasets from decade-long community efforts, such as ImageNet and LabelMe, and a novel “cross-domain” label propagation approach. Our proposed method consists of two major novel components, Exemplar SVM based label propagation, which effectively addresses the cross-domain issue, and a graphical model based contextual refinement incorporating 3D constraints. Most importantly, the entire process does not require any training data from the target scenes, also with good scalability towards large scale applications. We evaluate our approach on the well-known Cornell Point Cloud Dataset, achieving much greater efficiency and comparable accuracy even without any 3D training data. Our approach shows further major gains in accuracy when the training data from the target scenes is used, outperforming state-of-the-art approaches with far better efficiency.

5 0.20715035 146 cvpr-2013-Enriching Texture Analysis with Semantic Data

Author: Tim Matthews, Mark S. Nixon, Mahesan Niranjan

Abstract: We argue for the importance of explicit semantic modelling in human-centred texture analysis tasks such as retrieval, annotation, synthesis, and zero-shot learning. To this end, low-level attributes are selected and used to define a semantic space for texture. 319 texture classes varying in illumination and rotation are positioned within this semantic space using a pairwise relative comparison procedure. Low-level visual features used by existing texture descriptors are then assessed in terms of their correspondence to the semantic space. Textures with strong presence of attributes connoting randomness and complexity are shown to be poorly modelled by existing descriptors. In a retrieval experiment semantic descriptors are shown to outperform visual descriptors. Semantic modelling of texture is thus shown to provide considerable value in both feature selection and in analysis tasks.

6 0.18463598 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

7 0.18371437 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

8 0.18201074 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

9 0.18021309 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

10 0.179855 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation

11 0.17543866 29 cvpr-2013-A Video Representation Using Temporal Superpixels

12 0.1598203 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

13 0.15882164 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval

14 0.15027755 406 cvpr-2013-Spatial Inference Machines

15 0.15001766 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

16 0.14920372 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

17 0.14726496 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

18 0.14517443 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

19 0.14058991 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

20 0.13857181 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.27), (1, -0.061), (2, 0.055), (3, -0.021), (4, 0.184), (5, 0.039), (6, -0.031), (7, 0.105), (8, -0.178), (9, -0.026), (10, 0.152), (11, -0.027), (12, 0.066), (13, 0.147), (14, -0.018), (15, -0.084), (16, 0.112), (17, -0.052), (18, -0.067), (19, 0.007), (20, 0.187), (21, -0.033), (22, -0.051), (23, 0.136), (24, -0.212), (25, -0.029), (26, -0.005), (27, 0.003), (28, -0.014), (29, 0.042), (30, 0.004), (31, -0.109), (32, -0.065), (33, -0.089), (34, -0.111), (35, -0.115), (36, -0.102), (37, 0.053), (38, -0.037), (39, -0.015), (40, -0.05), (41, -0.035), (42, -0.01), (43, -0.031), (44, 0.026), (45, -0.06), (46, -0.002), (47, 0.09), (48, -0.027), (49, 0.039)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95564967 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

2 0.84348375 425 cvpr-2013-Tensor-Based High-Order Semantic Relation Transfer for Semantic Scene Segmentation

Author: Heesoo Myeong, Kyoung Mu Lee

Abstract: We propose a novel nonparametric approach for semantic segmentation using high-order semantic relations. Conventional context models mainly focus on learning pairwise relationships between objects. Pairwise relations, however, are not enough to represent high-level contextual knowledge within images. In this paper, we propose semantic relation transfer, a method to transfer high-order semantic relations of objects from annotated images to unlabeled images, analogous to label transfer techniques where label information is transferred. We first define semantic tensors representing high-order relations of objects. The semantic relation transfer problem is then formulated as semi-supervised learning using a quadratic objective function of the semantic tensors. By exploiting the low-rank property of the semantic tensors and employing Kronecker sum similarity, an efficient approximation algorithm is developed. Based on the predicted high-order semantic relations, we reason semantic segmentation and evaluate the performance on several challenging datasets.

3 0.78645104 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

Author: Yan Wang, Rongrong Ji, Shih-Fu Chang

Abstract: Recent years have witnessed a growing interest in understanding the semantics of point clouds in a wide variety of applications. However, point cloud labeling remains an open problem, due to the difficulty in acquiring sufficient 3D point labels towards training effective classifiers. In this paper, we overcome this challenge by utilizing the existing massive 2D semantic labeled datasets from decade-long community efforts, such as ImageNet and LabelMe, and a novel “cross-domain” label propagation approach. Our proposed method consists of two major novel components, Exemplar SVM based label propagation, which effectively addresses the cross-domain issue, and a graphical model based contextual refinement incorporating 3D constraints. Most importantly, the entire process does not require any training data from the target scenes, also with good scalability towards large scale applications. We evaluate our approach on the well-known Cornell Point Cloud Dataset, achieving much greater efficiency and comparable accuracy even without any 3D training data. Our approach shows further major gains in accuracy when the training data from the target scenes is used, outperforming state-of-the-art approaches with far better efficiency.

4 0.73708105 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

Author: Luming Zhang, Mingli Song, Zicheng Liu, Xiao Liu, Jiajun Bu, Chun Chen

Abstract: Weakly supervised image segmentation is a challenging problem in the computer vision field. In this paper, we present a new weakly supervised image segmentation algorithm by learning the distribution of spatially structured superpixel sets from image-level labels. Specifically, we first extract graphlets from each image where a graphlet is a small-sized graph consisting of superpixels as its nodes and it encapsulates the spatial structure of those superpixels. Then, a manifold embedding algorithm is proposed to transform graphlets of different sizes into equal-length feature vectors. Thereafter, we use GMM to learn the distribution of the post-embedding graphlets. Finally, we propose a novel image segmentation algorithm, called graphlet cut, that leverages the learned graphlet distribution in measuring the homogeneity of a set of spatially structured superpixels. Experimental results show that the proposed approach outperforms state-of-the-art weakly supervised image segmentation methods, and its performance is comparable to those of the fully supervised segmentation models.

5 0.70853472 460 cvpr-2013-Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Author: Yang Liu, Jing Liu, Zechao Li, Jinhui Tang, Hanqing Lu

Abstract: In this paper, we propose a novel Weakly-Supervised Dual Clustering (WSDC) approach for image semantic segmentation with image-level labels, i.e., collaboratively performing image segmentation and tag alignment with those regions. The proposed approach is motivated by the observation that superpixels belonging to an object class usually exist across multiple images and hence can be gathered via the idea of clustering. In WSDC, spectral clustering is adopted to cluster the superpixels obtained from a set of over-segmented images. At the same time, a linear transformation between features and labels, as a kind of discriminative clustering, is learned to select the discriminative features among different classes. The two clustering outputs should be as consistent as possible. In addition, weakly-supervised constraints from image-level labels are imposed to restrict the labeling of superpixels. Finally, the non-convex and non-smooth objective function is efficiently optimized using an iterative CCCP procedure. Extensive experiments conducted on the MSRC and LabelMe datasets demonstrate the encouraging performance of our method in comparison with some state-of-the-art methods.

6 0.66925555 406 cvpr-2013-Spatial Inference Machines

7 0.65553188 26 cvpr-2013-A Statistical Model for Recreational Trails in Aerial Images

8 0.64086097 146 cvpr-2013-Enriching Texture Analysis with Semantic Data

9 0.63908213 366 cvpr-2013-Robust Region Grouping via Internal Patch Statistics

10 0.61534458 29 cvpr-2013-A Video Representation Using Temporal Superpixels

11 0.61219877 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

12 0.61204392 458 cvpr-2013-Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds

13 0.59351665 13 cvpr-2013-A Higher-Order CRF Model for Road Network Extraction

14 0.58937138 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation

15 0.58875835 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes

16 0.58780122 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

17 0.57812124 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration

18 0.56634545 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

19 0.53316486 73 cvpr-2013-Bringing Semantics into Focus Using Visual Abstraction

20 0.53204322 99 cvpr-2013-Cross-View Image Geolocalization


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.108), (16, 0.026), (26, 0.057), (28, 0.012), (33, 0.327), (67, 0.09), (69, 0.056), (70, 0.141), (87, 0.107)]
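
The page does not say how the simValue scores in the list below are derived from these topic weights. As a minimal sketch, assuming cosine similarity over sparse (topicId, topicWeight) vectors, two papers could be compared as follows. The function name `sparse_cosine` and the `other_paper` vector are hypothetical and used only for illustration.

```python
import math

def sparse_cosine(a, b):
    """Cosine similarity between two sparse vectors given as (index, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db[i] for i, w in da.items() if i in db)
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Topic weights listed above for this paper.
this_paper = [(10, 0.108), (16, 0.026), (26, 0.057), (28, 0.012), (33, 0.327),
              (67, 0.09), (69, 0.056), (70, 0.141), (87, 0.107)]

# Hypothetical topic vector for another paper (not taken from the page).
other_paper = [(10, 0.2), (33, 0.3), (70, 0.1)]

similarity = sparse_cosine(this_paper, other_paper)
```

Under this assumption, papers sharing heavily weighted topics (here topic 33) score close to 1, and papers with no topics in common score 0.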

similar papers list:

simIndex simValue paperId paperTitle

1 0.95862806 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition

Author: Jia Deng, Jonathan Krause, Li Fei-Fei

Abstract: Fine-grained recognition concerns categorization at subordinate levels, where the distinction between object classes is highly local. Compared to basic-level recognition, fine-grained categorization can be more challenging, as there are in general less data and fewer discriminative features. This necessitates the use of a stronger prior for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. The player’s goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. With proper setup, the game generates discriminative bubbles with assured quality. We next propose the “BubbleBank” algorithm, which uses the human-selected bubbles to improve machine recognition performance. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.

2 0.95403898 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow

Author: Brian Potetz, Mohammadreza Hajiarbabi

Abstract: For problems over continuous random variables, MRFs with large cliques pose a challenge in probabilistic inference. Difficulties in performing optimization efficiently have limited the probabilistic models explored in computer vision and other fields. One inference technique that handles large cliques well is Expectation Propagation. EP offers run times independent of clique size, which instead depend only on the rank, or intrinsic dimensionality, of potentials. This property would be highly advantageous in computer vision. Unfortunately, for grid-shaped models common in vision, traditional Gaussian EP requires quadratic space and cubic time in the number of pixels. Here, we propose a variation of EP that exploits regularities in natural scene statistics to achieve run times that are linear in both number of pixels and clique size. We test these methods on shape from shading, and we demonstrate strong performance not only for Lambertian surfaces, but also on arbitrary surface reflectance and lighting arrangements, which requires highly non-Gaussian potentials. Finally, we use large, non-local cliques to exploit cast shadow, which is traditionally ignored in shape from shading.

3 0.94714212 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources

Author: Lin Chen, Lixin Duan, Dong Xu

Abstract: In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos. We formulate this task as a new multi-domain adaptation problem with heterogeneous sources, in which the samples from different source domains can be represented by different types of features with different dimensions (e.g., the SIFT features from web images and space-time (ST) features from web videos) while the target domain samples have all types of features. To effectively cope with the heterogeneous sources where some source domains are more relevant to the target domain, we propose a new method called Multi-domain Adaptation with Heterogeneous Sources (MDA-HS) to learn an optimal target classifier, in which we simultaneously seek the optimal weights for different source domains with different types of features as well as infer the labels of unlabeled target domain data based on multiple types of features. We solve our optimization problem by using the cutting-plane algorithm based on group-based multiple kernel learning. Comprehensive experiments on two datasets demonstrate the effectiveness of MDA-HS for event recognition in consumer videos.

4 0.94308531 23 cvpr-2013-A Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation

Author: Yinqiang Zheng, Shigeki Sugimoto, Masatoshi Okutomi

Abstract: Due to its simplicity, the eight-point algorithm has been widely used in fundamental matrix estimation. Unfortunately, the rank-2 constraint of a fundamental matrix is enforced via a posterior rank correction step, thus leading to non-optimal solutions to the original problem. To address this drawback, existing algorithms need to solve either a very high order polynomial or a sequence of convex relaxation problems, both of which are computationally inefficient and numerically unstable. In this work, we present a new rank-2 constrained eight-point algorithm, which directly incorporates the rank-2 constraint in the minimization process. To avoid singularities, we propose to solve seven subproblems and retrieve their globally optimal solutions by using tailored polynomial system solvers. Our proposed method is non-iterative, computationally efficient and numerically stable. Experimental results have verified its superiority over existing algebraic error based algorithms in terms of accuracy, as well as its advantages when used to initialize geometric error based algorithms.

5 0.9314605 87 cvpr-2013-Compressed Hashing

Author: Yue Lin, Rong Jin, Deng Cai, Shuicheng Yan, Xuelong Li

Abstract: Recent studies have shown that hashing methods are effective for high dimensional nearest neighbor search. A common problem shared by many existing hashing methods is that a large number of hash tables (i.e., long codewords) are required to achieve satisfactory performance. To address this challenge, in this paper we propose a novel approach called Compressed Hashing by exploring the techniques of sparse coding and compressed sensing. In particular, we introduce a sparse coding scheme, based on the approximation theory of integral operators, that generates sparse representations for high dimensional vectors. We then project sparse codes into a low dimensional space by effectively exploring the Restricted Isometry Property (RIP), a key property in compressed sensing theory. Both the theoretical analysis and the empirical studies on two large data sets show that the proposed approach is more effective than the state-of-the-art hashing algorithms.

6 0.92740595 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

same-paper 7 0.92665869 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

8 0.92576933 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

9 0.92510873 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

10 0.92496079 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

11 0.92461479 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

12 0.92459524 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

13 0.92446309 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

14 0.92235839 72 cvpr-2013-Boundary Detection Benchmarking: Beyond F-Measures

15 0.9221186 202 cvpr-2013-Hierarchical Saliency Detection

16 0.92205262 94 cvpr-2013-Context-Aware Modeling and Recognition of Activities in Video

17 0.92130935 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path

18 0.9209004 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns

19 0.92089719 14 cvpr-2013-A Joint Model for 2D and 3D Pose Estimation from a Single Image

20 0.9206844 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images