cvpr cvpr2013 cvpr2013-145 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Anelia Angelova, Shenghuo Zhu
Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance, especially for datasets which are considered particularly hard for recognition, e.g. bird species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.
Reference: text
sentIndex sentText sentNum sentScore
1 Efficient object detection and segmentation for fine-grained recognition Anelia Angelova NEC Labs America Cupertino, CA anelia@nec-labs.com [sent-1, score-0.318]
2 Abstract We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. [sent-2, score-0.26]
3 The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. [sent-3, score-0.283]
4 We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images. [sent-17, score-1.113]
5 the leaves of the flowers provide informative context; for other super-categories, e.g. [sent-30, score-0.443]
6 birds, which are mobile, different classes often share the same background, so segmenting out the background will be beneficial. [sent-32, score-0.503]
7 Another benefit of a detection and segmentation algorithm is that it can localize the object, which will be beneficial, especially if the object is not in the center of the image or is of a size different from the other objects’ sizes. [sent-34, score-0.364]
8 In this paper we propose an efficient object detection and segmentation algorithm which is effectively used to localize the object and normalize it for scale (Figure 1). [sent-35, score-0.422]
9 birds, flowers, and cats and dogs, and improves the recognition performance for fine-grained classification tasks. [sent-38, score-0.395]
10 Furthermore, the obtained segmentation is used to localize the object, normalize it for scale and discount the effects of the background. [sent-46, score-0.341]
11 The key contributions of this paper are: we propose a region-guided detection and segmentation of the object. [sent-48, score-0.227]
12 This is particularly beneficial when the object takes a small area of the image, is not in the center, or when the background is shared among different classes (as is the tree background for birds or the indoor environment for cats and dogs). [sent-54, score-0.9]
13 We combine the features extracted from the segmented image with a state-of-the-art recognition algorithm and obtain an efficient and reliable recognition pipeline which leads to large improvements in performance. [sent-57, score-0.326]
14 This is of huge importance since segmentation can now be run as part of standard recognition pipelines. [sent-60, score-0.244]
15 either flowers, or birds, or cats and dogs [8, 17, 20]. [sent-63, score-0.433]
16 These datasets are very challenging, especially the latter one, as the birds appear at different resolutions and with substantial background and object texture variability. [sent-65, score-0.564]
17 Furthermore, our team has collected a large-scale flower dataset which contains 578 different species of flowers and about 250,000 images (Figure 2). [sent-68, score-1.073]
18 Previous work Fine-grained recognition is a topic of large practical importance and many recent works have addressed such tasks including recognition of flowers [17], birds [2, 8, 26], cats and dogs [19, 20], tree-leaves [15]. [sent-74, score-1.378]
19 These approaches are either too slow or are targeted for segmentation during training. [sent-77, score-0.198]
20 Recent works have proposed object segmentation for the purposes of better classification. [sent-78, score-0.276]
21 Another work, again on cat and dog categorization [20], proposes to do segmentation prior to recognition. [sent-82, score-0.31]
22 Those methods typically work with small coherent regions on the image (called super-pixels) and feed the low-level segmentations to object detection pipelines [10]. [sent-86, score-0.292]
23 Although those methods have provided many insights and useful tools for recognition [10], they have stopped short of providing efficient algorithms for full-object segmentation for either object recognition or detection. [sent-87, score-0.4]
24 Example images from each class of the large-scale 578 flower dataset. [sent-89, score-0.237]
25 Other related work, although not doing segmentation per se, has proposed to first localize the potential object region and utilize this information during recognition [14, 22, 23]. [sent-90, score-0.392]
26 Object detection and segmentation This section describes how to detect and segment the object, or objects, in an image. [sent-92, score-0.227]
27 Finally, the segmented image (which contains the detected and segmented object, possibly cropped and resized) and input image are processed through the feature extraction and classification pipeline (Section 4) and the final classification is obtained. [sent-97, score-0.488]
28 For simplicity we use the super-pixel segmentation method by Felzenszwalb and Huttenlocher [9] to over-segment the image into small coherent regions. [sent-101, score-0.228]
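As a concrete illustration of this step, here is a minimal sketch using scikit-image's implementation of the Felzenszwalb-Huttenlocher method [9]; the parameter values and the input path are assumptions for the sketch, not values taken from the paper.

```python
# Over-segmentation into small coherent regions (super-pixels) via
# Felzenszwalb-Huttenlocher [9]. scale/sigma/min_size are assumed values.
from skimage import io
from skimage.segmentation import felzenszwalb

image = io.imread("flower.jpg")  # hypothetical input image
superpixels = felzenszwalb(image, scale=100, sigma=0.8, min_size=50)
print("number of super-pixel regions:", superpixels.max() + 1)
```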
29 Using ground truth segmentation of training images, we consider super-pixel regions with large overlap with the foreground and background ground truth areas as positive and negative examples, respectively. [sent-107, score-0.598]
30 When no ground truth is available, we start from an approximate segmentation and iteratively improve the segmentation by applying the trained model. [sent-109, score-0.521]
31 This procedure is standard in other segmentation works [3]. [sent-111, score-0.198]
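A minimal sketch of how such examples could be harvested, assuming a binary ground-truth foreground mask aligned with the super-pixel map from above; the 0.9 overlap threshold and the toy mean-color descriptor are assumptions standing in for the paper's region features, which are not specified in this extract.

```python
# Harvest positive/negative super-pixel examples by overlap with a
# ground-truth mask; threshold and descriptor are illustrative assumptions.
import numpy as np

def mean_color(image, region_mask):
    # toy stand-in for the region descriptor (not specified in this extract)
    return image[region_mask].mean(axis=0)

def harvest_examples(image, superpixels, gt_mask, min_overlap=0.9):
    feats, labels = [], []
    for sp in np.unique(superpixels):
        region = superpixels == sp
        fg_fraction = (region & gt_mask).sum() / region.sum()
        if fg_fraction >= min_overlap:
            labels.append(1)                 # mostly foreground: positive
        elif fg_fraction <= 1.0 - min_overlap:
            labels.append(0)                 # mostly background: negative
        else:
            continue                         # ambiguous region: skip
        feats.append(mean_color(image, region))
    return np.array(feats), np.array(labels)

# Without ground truth: start from an approximate mask, train a region
# classifier on the harvested examples, re-segment, and iterate.
```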
32 The birds and the cats and dogs datasets have ground truth segmentations provided, so we built a single model. [sent-112, score-1.231]
33 For the Oxford 102 flowers dataset we used the segmentation images in [17] as a seed and improved them iteratively. [sent-113, score-0.706]
34 As shown later in our experiments, we use the same algorithms for both model training and detection across flowers, birds, and cats and dogs. [sent-116, score-0.301]
35 The goal of the segmentation task is to find the label Xj for each pixel Ij, where Xj = 1 when the pixel belongs to the object and Xj = 0, otherwise. [sent-122, score-0.243]
36 Since the diffusion properties of the foreground and background of different images (and datasets) may vary, we consider separate segmentations for the detected foreground-only areas and background-only areas, respectively. [sent-156, score-0.269]
37 This is done because the segmentation may be good with respect to one but not the other; combining the foreground and background segmentations produces a more coherent result and takes advantage of their complementary roles. [sent-157, score-0.678]
38 Top: Input image and the initial regions which are classified with high score as belonging to either a flower or the background. [sent-164, score-0.277]
39 Bottom: Label propagation on this image and the final segmentation result. [sent-165, score-0.279]
40 At the same time, it gives equivalent results to the individual foreground and background segmentations which are more stable. [sent-169, score-0.22]
41 To obtain the final segmentation, Xsegm is thresholded at 0.5. [sent-170, score-0.198]
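The propagation equations themselves are not reproduced in this extract, so the sketch below should be read as an assumption-laden stand-in: it uses the standard normalized-Laplacian diffusion X = (I - alpha*S)^(-1) Y over an affinity graph with weights w_ij, run once with foreground seeds and once with background seeds as described above; the normalization, the value of alpha, and the way the two runs are combined are all assumptions.

```python
# Label propagation over an affinity graph W; run separately for the
# foreground and background seeds, then combined and thresholded.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def propagate(W, y_seed, alpha=0.99):
    """Diffuse seed labels y_seed (1 at seeds, 0 elsewhere) over graph W."""
    d = np.asarray(W.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = d_inv_sqrt @ W @ d_inv_sqrt          # symmetrically normalized W
    A = sp.identity(W.shape[0]) - alpha * S
    return spla.spsolve(A.tocsc(), y_seed)

# x_fg = propagate(W, y_fg)                  # foreground-only diffusion
# x_bg = propagate(W, y_bg)                  # background-only diffusion
# x_segm = x_fg / (x_fg + x_bg + 1e-12)      # one plausible combination
# mask = x_segm > 0.5                        # final thresholding
```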
42 Fine-grained recognition with segmentation This section describes how we use the segmented image in the final fine-grained recognition task. [sent-183, score-0.408]
43 Example segmented images from the datasets tested in this paper. [sent-190, score-0.212]
44 Although not necessarily perfect, examples of failed segmentations are shown in the bottom row: only a portion of the background is removed (this is typical of flowers since they take larger areas of the image), or parts of the object are missing. [sent-192, score-0.769]
45 Our classification pipeline uses the 1-vs-all strategy with linear SVM classification; we used the Liblinear SVM implementation [7]. [sent-196, score-0.219]
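For concreteness, a short sketch of this stage with scikit-learn's LinearSVC, which wraps the same Liblinear library [7]; the random feature matrices and the C value are placeholders, not the paper's settings.

```python
# 1-vs-all (one-vs-rest) linear SVM classification via Liblinear.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
train_x = rng.normal(size=(200, 512))      # stand-in pooled descriptors
train_y = rng.integers(0, 5, size=200)     # stand-in class labels

clf = LinearSVC(C=1.0, multi_class="ovr")  # 'ovr' realizes 1-vs-all
clf.fit(train_x, train_y)
pred = clf.predict(train_x[:5])
```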
46 The segmented image is processed through the same feature extraction pipeline as the original image, that is, new features are extracted for the segmented image. [sent-198, score-0.334]
47 Note that, because we apply HOG-type features and pooling to the segmented image, the segmentation helps both by providing the shape of the object’s contour and by ignoring background features that can act as distractors. [sent-200, score-0.566]
48 On the other hand, by keeping both sets of features from the original and the segmented image, we can avoid losing precision due to mis-segmentation, and can also include the background for cases for which it may provide useful context (e. [sent-201, score-0.209]
49 the leaves and stems of some flowers may be useful for recognition). [sent-203, score-0.473]
50 In our experiments we found that it is sufficient to keep a global pooling of the segmented image, in addition to the full set of poolings for the original image. [sent-204, score-0.234]
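A sketch of the resulting descriptor layout, assuming max-pooling of coded local features over spatial grids; the grid sizes and feature dimensions are assumptions for illustration.

```python
# Full set of spatial poolings on the original image plus a single global
# pooling on the segmented image, concatenated into one descriptor.
import numpy as np

def max_pool_grid(fmap, rows, cols):
    """Max-pool an (H, W, D) map of coded features over a rows x cols grid."""
    H, W, D = fmap.shape
    cells = []
    for r in np.array_split(np.arange(H), rows):
        for c in np.array_split(np.arange(W), cols):
            cells.append(fmap[np.ix_(r, c)].reshape(-1, D).max(axis=0))
    return np.concatenate(cells)

fmap_orig = np.random.rand(32, 32, 128)    # stand-in codes, original image
fmap_segm = np.random.rand(32, 32, 128)    # stand-in codes, segmented image

descriptor = np.concatenate([
    max_pool_grid(fmap_orig, 1, 1),        # global pool, original image
    max_pool_grid(fmap_orig, 2, 2),        # finer pools, original image
    max_pool_grid(fmap_segm, 1, 1),        # single global pool, segmented
])
```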
51 The latter is very beneficial since these datasets have variability in scale and one of the purposes of our segmentation is to be able to localize the object and normalize for its scale. [sent-208, score-0.521]
52 No cropping is done for the two flower datasets, since the flowers are assumed to take most of the image area (even for small ‘cluster’ flowers). [sent-209, score-0.75]
53 We did not do cropping for the experiment which uses ground truth bounding box information (for birds). [sent-210, score-0.277]
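A sketch of the ‘zoom in’ normalization driven by the propagated mask: crop the mask’s bounding box with a small margin and rescale to a canonical size. The margin and output size are assumptions; as noted above, this step is skipped for the flower datasets and for the ground-truth bounding box experiment.

```python
# Localize the object from the segmentation mask and normalize for scale.
import numpy as np
from skimage.transform import resize

def zoom_in(image, mask, margin=0.1, out_size=(256, 256)):
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return resize(image, out_size)     # empty mask: keep the full image
    my = int(margin * (ys.max() - ys.min() + 1))
    mx = int(margin * (xs.max() - xs.min() + 1))
    y0, y1 = max(ys.min() - my, 0), min(ys.max() + my, image.shape[0] - 1)
    x0, x1 = max(xs.min() - mx, 0), min(xs.max() + mx, image.shape[1] - 1)
    return resize(image[y0:y1 + 1, x0:x1 + 1], out_size)
```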
54 We note that the segmentation procedure is much faster than previously known segmentation methods, which take at least 30 seconds [4, 21]. [sent-214, score-0.457]
55 Furthermore, our segmentation run-time allows it to be run as a part of standard recognition pipelines at test time, which had not been possible before, and is a significant advantage. [sent-215, score-0.282]
56 Experiments In this section we show experimental results of our proposed algorithm on a number of fine-grained recognition benchmarks: Oxford 102 flowers [17], Caltech-UCSD 200 birds [2, 26], and the recent Oxford Cats and Dogs [20] datasets. [sent-217, score-0.899]
57 We compare to our baseline algorithm, because it measures how much the proposed segmentation has contributed to the improvement in classification performance. [sent-219, score-0.217]
58 In addition, we measure our performance on the large-scale 578-category flower dataset. [sent-220, score-0.237]
59 Oxford 102 flower species dataset The Oxford 102 flowers dataset is a well-known dataset for fine-grained recognition, proposed by Nilsback and Zisserman [17]. [sent-223, score-1.212]
60 The dataset contains 102 species of flowers and a total of 8189 images, each category containing between 40 and 200 images. [sent-224, score-0.832]
61 Some of the segmentation methods are designed to be very specific to the appearance of flowers [17] (with the assumption that a single flower is in the center of the image and takes most of the image), while others [3] are more general and can also be applied to other types of datasets. [sent-227, score-0.905]
62 One important thing to note is that the improvement of our algorithm over our baseline is about 4%, and the only difference between the two is the addition of the proposed segmentation algorithm and the features extracted from the segmented image. [sent-231, score-0.509]
63 Caltech-UCSD 200 birds species dataset The Caltech-UCSD-200 Birds dataset [26] is a very challenging dataset containing 200 species of birds. [sent-234, score-1.22]
64 Apart from the very fine differences between species of birds, what makes recognition hard in this dataset is the variety of poses, the large variability in scale, and the very rich backgrounds into which the birds often blend. [sent-235, score-0.85]
65 Even when using ground truth bounding boxes, provided as annotations with the dataset [26], the reported results have been around 19% [26, 27] and most recently 24.3% [3]. [sent-238, score-0.274]
66 The latter result additionally uses crude ground truth segmentation of each bird. [sent-239, score-0.391]
67 The proposed algorithm improves the performance both with and without using ground truth bounding boxes (see Tables 2 and 3). [sent-240, score-0.281]
68 This is by itself an impressive improvement over the other known algorithms for this dataset (even when not taking advantage of our baseline performance). [sent-248, score-0.263]
69 Most importantly, our algorithm shows improvement over all known prior approaches, when no ground truth bounding boxes are used. [sent-250, score-0.375]
70 Another thing to notice here is that the improvement over our baseline, when no bounding box information is known, is larger than the improvement with bounding boxes. [sent-256, score-0.419]
71 This improvement is consistent across the other datasets tested in this paper, which do not have bounding box information. [sent-257, score-0.271]
72 We attribute this to the fact that the bounding boxes have perfect object localization and scaling, and to a large extent eliminate the background. [sent-258, score-0.262]
73 This underlines the importance of our proposed automatic detection and segmentation of the object, which then allows us to ‘zoom in’ on the object, especially for large-scale datasets for which providing bounding boxes or other ground truth information is infeasible. [sent-259, score-0.627]
74 Oxford Cats and Dogs dataset Oxford Cats and Dogs [20] is a new dataset for fine-grained classification which contains 6033 images of 37 breeds of cats and dogs. [sent-262, score-0.528]
75 They apply segmentation at test time, as is done here, but their algorithm is based on GrabCut [21]. [sent-264, score-0.23]
76 Please refer to Table 3 for comparison to other baselines which additionally use ground truth bounding box information. [sent-267, score-0.268]
77 Results on the birds dataset when ground truth bounding boxes are used (the result in [3] uses crude ground truth segmentation masks in addition to bounding boxes). [sent-271, score-0.727]
78 Also, the methods proposed in [20] are specific to recognizing cat and dog breeds and utilize head and body layout information. [sent-273, score-0.192]
79 Note that [20] also reported classification when using cat and dog head annotations or ground truth segmentation during testing, whereas here our experiments do not use such information. [sent-276, score-0.543]
80 Large-scale 578 flower species dataset This dataset consists of 578 species of flowers and contains about 250,000 images and is the largest and most challenging such dataset we are aware of. [sent-279, score-1.457]
81 ) for an input flower image, and be available for general use. [sent-283, score-0.237]
82 The improvement provided by our segmentation method is more than 4%. [sent-285, score-0.261]
83 Classification performance on the large-scale 578 flowers dataset for the top returned result. [sent-299, score-0.508]
84 Note that this large-scale data has no segmentation ground truth or bounding box information (since it contains 250,000+ images and obtaining those would be prohibitive or at least very expensive). [sent-303, score-0.437]
85 Thus, here the advantage that an automatic segmentation algorithm can give in terms of improving the final classification performance is really important. [sent-304, score-0.311]
86 Another interesting fact is that here we have used the same initial region detection model that was trained on the Oxford 102 flowers dataset, which contains fewer species of flowers (102 instead of 578). [sent-305, score-1.244]
87 Naturally, the performance of the segmentation algorithm can be further improved after adapting the segmentation model to this specific dataset. [sent-307, score-0.396]
88 As seen by the improvements over the baseline, our segmentation algorithm gives an advantage in recognition performance. [sent-309, score-0.295]
89 This is true even if the segmentation may be imperfect for some examples. [sent-310, score-0.198]
90 This shows that segmenting out the object of interest during testing is of crucial importance for automatic algorithms and that it is worthwhile to explore even better segmentation algorithms. [sent-311, score-0.311]
91 Conclusions and future work We propose an algorithm which combines region-based detection of the object of interest and full-object segmentation through propagation. [sent-313, score-0.272]
92 The segmentation is applied at test time and is shown to be very useful for improving the classification performance on four challenging datasets. [sent-314, score-0.305]
93 Classification performance on the large-scale 578 flowers dataset for top K = 1, . . . [sent-317, score-0.508]
94 578-category flower dataset which is the largest collection of flower species we are aware of. [sent-321, score-0.593]
95 Our algorithm is much faster than previously used segmentation algorithms in similar scenarios, e.g. [4, 21]. [sent-323, score-0.228]
96 It is also applicable to a variety of types of categories, as shown in this paper on birds, flowers, and cats and dogs. [sent-326, score-0.272]
97 The collection of the large-scale 578-class flower dataset would not have been possible without their critical expertise in flower identification. [sent-332, score-0.539]
98 Leafsnap: A computer vision system for automatic plant species identification. [sent-437, score-0.357]
99 Automated flower classification over a large number of classes. [sent-454, score-0.314]
100 An automatic visual flora - segmentation and classification of flower images. [sent-460, score-0.548]
wordName wordTfidf (topN-words)
[('flowers', 0.443), ('birds', 0.41), ('species', 0.291), ('cats', 0.272), ('flower', 0.237), ('segmentation', 0.198), ('oxford', 0.175), ('dogs', 0.161), ('segmented', 0.118), ('segmentations', 0.11), ('methodaccuracy', 0.085), ('bounding', 0.084), ('propagation', 0.081), ('classification', 0.077), ('baseline', 0.077), ('boxes', 0.072), ('truth', 0.068), ('dataset', 0.065), ('cat', 0.065), ('localize', 0.065), ('pipeline', 0.065), ('improvement', 0.063), ('nilsback', 0.061), ('parkhi', 0.061), ('background', 0.061), ('poolings', 0.06), ('xsegm', 0.06), ('ybg', 0.06), ('yfg', 0.06), ('ground', 0.057), ('pooling', 0.056), ('thing', 0.053), ('laplacian', 0.053), ('cupertino', 0.053), ('zoom', 0.052), ('improvements', 0.051), ('beneficial', 0.051), ('grabcut', 0.05), ('foreground', 0.049), ('breeds', 0.049), ('datasets', 0.048), ('apart', 0.047), ('dog', 0.047), ('liblinear', 0.047), ('tested', 0.046), ('jawahar', 0.046), ('recognition', 0.046), ('object', 0.045), ('russakovsky', 0.044), ('nec', 0.042), ('variabilities', 0.041), ('wij', 0.04), ('regions', 0.04), ('normalize', 0.04), ('crude', 0.039), ('discount', 0.038), ('finegrained', 0.038), ('pipelines', 0.038), ('cropping', 0.038), ('schroff', 0.038), ('region', 0.038), ('backgrounds', 0.038), ('team', 0.037), ('labs', 0.037), ('automatic', 0.036), ('wah', 0.036), ('hog', 0.035), ('providing', 0.035), ('welinder', 0.034), ('branson', 0.033), ('extraction', 0.033), ('containing', 0.033), ('llc', 0.033), ('purposes', 0.033), ('segmenting', 0.032), ('done', 0.032), ('xj', 0.032), ('known', 0.031), ('head', 0.031), ('america', 0.031), ('coherent', 0.03), ('lin', 0.03), ('vedaldi', 0.03), ('plant', 0.03), ('box', 0.03), ('faster', 0.03), ('useful', 0.03), ('etc', 0.029), ('additionally', 0.029), ('detection', 0.029), ('impressive', 0.027), ('center', 0.027), ('preconditioning', 0.026), ('xfg', 0.026), ('yuanqing', 0.026), ('eiv', 0.026), ('objectcentric', 0.026), ('olga', 0.026), ('rudimentary', 0.026), ('submodels', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
Author: Anelia Angelova, Shenghuo Zhu
Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance, especially for datasets which are considered particularly hard for recognition, e.g. bird species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.
Author: Thomas Berg, Peter N. Belhumeur
Abstract: From a set of images in a particular domain, labeled with part locations and class, we present a method to automatically learn a large and diverse set of highly discriminative intermediate features that we call Part-based One-vs-One Features (POOFs). Each of these features specializes in discrimination between two particular classes based on the appearance at a particular part. We demonstrate the particular usefulness of these features for fine-grained visual categorization with new state-of-the-art results on bird species identification using the Caltech UCSD Birds (CUB) dataset and parity with the best existing results in face verification on the Labeled Faces in the Wild (LFW) dataset. Finally, we demonstrate the particular advantage of POOFs when training data is scarce.
3 0.23998752 452 cvpr-2013-Vantage Feature Frames for Fine-Grained Categorization
Author: Asma Rejeb Sfar, Nozha Boujemaa, Donald Geman
Abstract: We study fine-grained categorization, the task of distinguishing among (sub)categories of the same generic object class (e.g., birds), focusing on determining botanical species (leaves and orchids) from scanned images. The strategy is to focus attention around several vantage points, which is the approach taken by botanists, but using features dedicated to the individual categories. Our implementation of the strategy is based on vantage feature frames, a novel object representation consisting of two components: a set of coordinate systems centered at the most discriminating local viewpoints for the generic object class and a set of category-dependent features computed in these frames. The features are pooled over frames to build the classifier. Categorization then proceeds from coarse-grained (finding the frames) to fine-grained (finding the category), and hence the vantage feature frames must be both detectable and discriminating. The proposed method outperforms state-of-the-art algorithms, in particular those using more distributed representations, on standard databases of leaves.
4 0.14257382 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
5 0.11193479 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
Author: David Weiss, Ben Taskar
Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shape priors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.
6 0.10695344 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
7 0.10607858 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images
8 0.10435899 364 cvpr-2013-Robust Object Co-detection
9 0.10251489 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
10 0.095712215 130 cvpr-2013-Discriminative Color Descriptors
11 0.093071982 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
12 0.090324223 187 cvpr-2013-Geometric Context from Videos
13 0.086904548 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
14 0.085650578 67 cvpr-2013-Blocks That Shout: Distinctive Parts for Scene Classification
15 0.085625365 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors
16 0.083220556 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
17 0.082027115 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes
18 0.08079531 414 cvpr-2013-Structure Preserving Object Tracking
19 0.076716289 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
20 0.073485732 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
topicId topicWeight
[(0, 0.194), (1, -0.053), (2, 0.032), (3, -0.011), (4, 0.088), (5, 0.023), (6, 0.017), (7, 0.063), (8, -0.03), (9, 0.015), (10, 0.017), (11, -0.055), (12, 0.032), (13, -0.026), (14, 0.04), (15, -0.033), (16, 0.035), (17, -0.044), (18, -0.033), (19, 0.062), (20, -0.003), (21, 0.065), (22, 0.083), (23, -0.065), (24, 0.062), (25, 0.083), (26, 0.056), (27, 0.086), (28, -0.032), (29, -0.085), (30, 0.044), (31, 0.065), (32, -0.052), (33, 0.007), (34, 0.094), (35, -0.009), (36, 0.073), (37, -0.062), (38, -0.061), (39, 0.114), (40, -0.066), (41, 0.008), (42, -0.048), (43, -0.003), (44, -0.026), (45, -0.031), (46, 0.101), (47, 0.135), (48, -0.045), (49, -0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.91525048 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
Author: Anelia Angelova, Shenghuo Zhu
Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance, especially for datasets which are considered particularly hard for recognition, e.g. bird species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.
2 0.77891123 452 cvpr-2013-Vantage Feature Frames for Fine-Grained Categorization
Author: Asma Rejeb Sfar, Nozha Boujemaa, Donald Geman
Abstract: We study fine-grained categorization, the task of distinguishing among (sub)categories of the same generic object class (e.g., birds), focusing on determining botanical species (leaves and orchids) from scanned images. The strategy is to focus attention around several vantage points, which is the approach taken by botanists, but using features dedicated to the individual categories. Our implementation of the strategy is based on vantage feature frames, a novel object representation consisting of two components: a set of coordinate systems centered at the most discriminating local viewpoints for the generic object class and a set of category-dependent features computed in these frames. The features are pooled over frames to build the classifier. Categorization then proceeds from coarse-grained (finding the frames) to fine-grained (finding the category), and hence the vantage feature frames must be both detectable and discriminating. The proposed method outperforms state-of-the-art algorithms, in particular those using more distributed representations, on standard databases of leaves.
3 0.69957143 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
Author: Thomas Berg, Peter N. Belhumeur
Abstract: From a set of images in a particular domain, labeled with part locations and class, we present a method to automatically learn a large and diverse set of highly discriminative intermediate features that we call Part-based One-vs-One Features (POOFs). Each of these features specializes in discrimination between two particular classes based on the appearance at a particular part. We demonstrate the particular usefulness of these features for fine-grained visual categorization with new state-of-the-art results on bird species identification using the Caltech UCSD Birds (CUB) dataset and parity with the best existing results in face verification on the Labeled Faces in the Wild (LFW) dataset. Finally, we demonstrate the particular advantage of POOFs when training data is scarce.
5 0.66674542 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
Author: Payman Yadollahpour, Dhruv Batra, Gregory Shakhnarovich
Abstract: This paper introduces a two-stage approach to semantic image segmentation. In the first stage a probabilistic model generates a set of diverse plausible segmentations. In the second stage, a discriminatively trained re-ranking model selects the best segmentation from this set. The re-ranking stage can use much more complex features than what could be tractably used in the probabilistic model, allowing a better exploration of the solution space than possible by simply producing the most probable solution from the probabilistic model. While our proposed approach already achieves state-of-the-art results (48.1%) on the challenging VOC 2012 dataset, our machine and human analyses suggest that even larger gains are possible with such an approach.
6 0.65205789 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
7 0.64018142 281 cvpr-2013-Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation
8 0.63939947 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors
9 0.62124467 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
10 0.60627151 174 cvpr-2013-Fine-Grained Crowdsourcing for Fine-Grained Recognition
11 0.57996315 235 cvpr-2013-Jointly Aligning and Segmenting Multiple Web Photo Streams for the Inference of Collective Photo Storylines
12 0.57865936 263 cvpr-2013-Learning the Change for Automatic Image Cropping
13 0.5728336 364 cvpr-2013-Robust Object Co-detection
14 0.57216901 437 cvpr-2013-Towards Fast and Accurate Segmentation
15 0.56560397 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration
16 0.56331623 401 cvpr-2013-Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
17 0.56108403 201 cvpr-2013-Heterogeneous Visual Features Fusion via Sparse Multimodal Machine
18 0.55772835 327 cvpr-2013-Pattern-Driven Colorization of 3D Surfaces
19 0.55293238 83 cvpr-2013-Classification of Tumor Histology via Morphometric Context
20 0.55107301 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
topicId topicWeight
[(10, 0.107), (16, 0.02), (26, 0.045), (27, 0.225), (33, 0.299), (67, 0.085), (69, 0.058), (87, 0.062)]
simIndex simValue paperId paperTitle
1 0.88200742 451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification
Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang
Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.
2 0.87817305 164 cvpr-2013-Fast Convolutional Sparse Coding
Author: Hilton Bristow, Anders Eriksson, Simon Lucey
Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.
3 0.87744421 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning
Author: Rui Yao, Qinfeng Shi, Chunhua Shen, Yanning Zhang, Anton van_den_Hengel
Abstract: Despite many advances made in the area, deformable targets and partial occlusions continue to represent key problems in visual tracking. Structured learning has shown good results when applied to tracking whole targets, but applying this approach to a part-based target model is complicated by the need to model the relationships between parts, and to avoid lengthy initialisation processes. We thus propose a method which models the unknown parts using latent variables. In doing so we extend the online algorithm pegasos to the structured prediction case (i.e., predicting the location of the bounding boxes) with latent part variables. To better estimate the parts, and to avoid over-fitting caused by the extra model complexity/capacity introduced by the parts, we propose a two-stage training process, based on the primal rather than the dual form. We then show that the method outperforms the state-of-the-art (linear and non-linear kernel) trackers.
same-paper 4 0.86642206 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
Author: Anelia Angelova, Shenghuo Zhu
Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance, especially for datasets which are considered particularly hard for recognition, e.g. bird species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging large-scale flower dataset, containing 578 species of flowers and 250,000 images.
Author: Thomas Berg, Peter N. Belhumeur
Abstract: From a set of images in a particular domain, labeled with part locations and class, we present a method to automatically learn a large and diverse set of highly discriminative intermediate features that we call Part-based One-vs-One Features (POOFs). Each of these features specializes in discrimination between two particular classes based on the appearance at a particular part. We demonstrate the particular usefulness of these features for fine-grained visual categorization with new state-of-the-art results on bird species identification using the Caltech UCSD Birds (CUB) dataset and parity with the best existing results in face verification on the Labeled Faces in the Wild (LFW) dataset. Finally, we demonstrate the particular advantage of POOFs when training data is scarce.
6 0.83512771 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images
7 0.83077443 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
8 0.83044916 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
9 0.82903677 94 cvpr-2013-Context-Aware Modeling and Recognition of Activities in Video
10 0.82902676 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
11 0.82812703 457 cvpr-2013-Visual Tracking via Locality Sensitive Histograms
12 0.82789415 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
13 0.82778531 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
14 0.82764626 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
15 0.82742006 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
16 0.82724887 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
17 0.82718456 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
18 0.8270359 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
19 0.82660246 202 cvpr-2013-Hierarchical Saliency Detection
20 0.82633227 325 cvpr-2013-Part Discovery from Partial Correspondence