cvpr cvpr2013 cvpr2013-70 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends ” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper we are interested in how semantic segmentation can help object detection. [sent-4, score-0.295]
2 Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. [sent-5, score-0.47]
3 Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. [sent-6, score-0.928]
4 Thus our model “blends ” between the detector and segmentation models. [sent-7, score-0.279]
5 We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. [sent-9, score-0.287]
6 For example, knowing which objects are present in the scene should simplify segmentation and detection tasks. [sent-15, score-0.239]
7 Similarly, if we can detect where an object is, segmentation should be easier as only figure-ground segmentation is necessary. [sent-16, score-0.421]
8 Existing approaches typically take the output of a detector and refine the regions inside the boxes to produce image segmentations [22, 5, 1, 14]. [sent-17, score-0.278]
9 An alternative approach is to use the candidate detections produced by state-of-the-art detectors as additional features for segmentation. [sent-18, score-0.227]
10 In contrast, in this paper we are interested in exploiting semantic segmentation in order to improve object detection. [sent-20, score-0.295]
11 While bottom-up segmentation has often been believed to be inferior to top-down object detectors due to its frequent over- and under-segmentation, recent approaches [8, 1] have shown impressive results on difficult datasets such as the PASCAL VOC challenge. [sent-21, score-0.291]
12 Here, we take advantage of region-based segmentation approaches [7], which compute a set of candidate object regions by bottom-up clustering, and produce a segmentation by ranking those regions using class specific rankers. [sent-22, score-0.536]
13 Our goal is to make use of these candidate object segments to bias sliding window object detectors to agree with these regions. [sent-23, score-0.477]
14 Importantly, unlike the aforementioned holistic approaches, we reason about all possible object bounding boxes (not just candidates) to not limit the expressiveness of our model. [sent-24, score-0.372]
15 However, so far, there have not been many attempts to incorporate segmentation into DPMs. [sent-26, score-0.182]
16 In this paper we propose a novel deformable part-based model, which exploits region-based segmentation by allowing every detection hypothesis to select a segment (including void) from a small pool of segment candidates. [sent-27, score-0.991]
17 Our detector scores each box in the image using both the traditional HOG filters as well as the set of novel segmentation features. [sent-29, score-0.616]
18 Our model “blends” between the detector and the segmentation models by boosting object hypotheses on the segments. [sent-30, score-0.384]
19 Furthermore, it can recover from segmentation mistakes by exploiting a powerful appearance model. [sent-31, score-0.219]
20 Importantly, since our features can be computed very efficiently given the segments, our approach has the same computational complexity as the original DPM [14]. [sent-32, score-0.222]
21 We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] by 13% AP, and when employing parts, we outper- form the original DPM [14] by 8%. [sent-33, score-0.371]
22 In the past few years, a wide variety of segmentation algorithms that employ object detectors as top-down cues have been proposed. [sent-50, score-0.328]
23 This is typically done in the form of unary features for an MRF [19], or as candidate bounding boxes for holistic MRFs [33, 21]. [sent-51, score-0.442]
24 In [26], heads of cats and dogs are detected with a DPM, and segmentation is performed using a GrabCut-type method. [sent-53, score-0.22]
25 By combining top-down shape information from DPM parts and bottom-up color and boundary cues, [32] tackle segmentation and detection task simultaneously and provide shape and depth ordering for the detected objects. [sent-54, score-0.292]
26 [11] exploit a DPM to find a rough location for the object of interest and refine the detected bounding box according to occlusion boundaries and color information. [sent-56, score-0.494]
27 For instance, while [4] finds segmentations for people by aligning the masks obtained for each Poselet [4], [23] integrates low level segmentation cues with Poselets in a soft manner. [sent-61, score-0.268]
28 In contrast, in this paper we propose a novel deformable part-based model, which allows each detection hypothesis to select candidate segments. [sent-66, score-0.258]
29 Semantic Segmentation for Object Detection We are interested in utilizing semantic segmentation to help object detection. [sent-70, score-0.295]
30 In particular, we take advantage of region-based segmentation approaches, which compute candidate object regions by bottom-up clustering, and rank those regions to estimate a score for each class. [sent-71, score-0.389]
31 Towards this goal we frame detection as an inference problem, where each detection hypothesis can select a segment from a pool of candidates (those returned from the segmentation as well as void). [sent-72, score-0.789]
32 A Segmentation-Aware DPM Following [14], let p0 be a random variable encoding the location and scale of a bounding box in an image pyramid as well as the mixture component id. [sent-78, score-0.476]
33 Let {pi}i=1,···,P be a set of parts which encode bounding boxes at double the resolution of the root. [sent-80, score-0.295]
34 Denote with h the index over the set of candidate segments returned by the segmentation algorithm. [sent-81, score-0.451]
35 We frame the detection problem as inference in a Markov Random Field (MRF), where each root filter hypothesis can select a segment from a pool of candidates. [sent-82, score-0.613]
36 This pair of features alone could result in box hypotheses that “overshoot” the segment. [sent-93, score-0.335]
37 The purpose of the second pair of features, φback−in and φback−out, is the opposite: it tries to minimize the number of background pixels inside the box and maximize their number outside. [sent-94, score-0.351]
38 In synchrony, these features try to tightly place a box around the segment. [sent-95, score-0.422]
39 We will use S(h) to denote the segment that h indexes. [sent-97, score-0.233]
40 As in [14], we employ a HOG pyramid to compute φ(x, p0), and use double resolution to compute the part features φ(x, pi). [sent-98, score-0.195]
41 The features φ(x, h, p0) link segmentation and detection. [sent-99, score-0.227]
42 Segmentation Features Given a set of candidate segments, we would like to encode features linking segmentation and detection while remaining computationally efficient. [sent-103, score-0.366]
43 Towards this goal, we derive simple features which encourage the selected segment to agree with the object detection hypothesis. [sent-105, score-0.479]
44 Segment-In: Given a segment S(h), our first feature counts the percentage of pixels in S(h) that fall inside the bounding box defined by p0. [sent-108, score-0.791]
45 φseg−in(x,h,p0) = (1/|S(h)|) Σ_{p∈B(p0)} 1{p ∈ S(h)}, where |S(h)| is the size of the segment indexed by h, and B(p0) is the set of pixels contained in the bounding box defined by p0. [sent-110, score-0.428]
46 This feature encourages the bounding box to contain the segment. [sent-111, score-0.437]
47 Segment-Out: Our second feature counts the percentage of segment pixels that fall outside the bounding box: φseg−out(x,h,p0) = (1/|S(h)|) Σ_{p∉B(p0)} 1{p ∈ S(h)}. [sent-112, score-0.524]
48 This feature discourages boxes that do not contain all segment pixels. [sent-113, score-0.304]
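These two complementary features can be sketched in a few lines. The binary-mask representation and the inclusive box convention below are illustrative assumptions, not the authors' implementation:

```python
def seg_in_out(mask, box):
    """Fractions of segment pixels inside / outside the box B(p0).

    mask: 2D list of 0/1 indicators for the segment S(h);
    box: (r0, c0, r1, c1), inclusive pixel bounds (an assumed convention).
    """
    r0, c0, r1, c1 = box
    total = sum(v for row in mask for v in row)  # |S(h)|
    inside = sum(mask[r][c]
                 for r in range(r0, r1 + 1)
                 for c in range(c0, c1 + 1))     # segment pixels in B(p0)
    if total == 0:
        return 0.0, 0.0
    return inside / total, (total - inside) / total

# A 2x2 segment in the top-left corner of a 4x4 image.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
print(seg_in_out(mask, (0, 0, 1, 1)))  # box exactly covers the segment
```

Together the two values always sum to one, which is why the paper pairs them with the background features below rather than using them alone.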
49 Background-In: We additionally compute a feature counting the amount of background inside the bounding box as follows: φback−in(x,h,p0) = (1/(N − |S(h)|)) Σ_{p∈B(p0)} 1{p ∉ S(h)}. [sent-114, score-0.535]
50 This feature captures the statistics of how often the segments leak outside the true bounding box vs how often they are too small. [sent-116, score-0.621]
51 Background-Out: This feature counts the amount of background outside the bounding box: φback−out(x,h,p0) = (1/(N − |S(h)|)) Σ_{p∉B(p0)} 1{p ∉ S(h)}. [sent-117, score-0.533]
52 It tries to discourage bounding boxes that are too big and do not tightly fit the segments. [sent-118, score-0.378]
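A matching sketch for the two background features, taking N as the total number of image pixels (our assumption) and normalizing by N − |S(h)|:

```python
def back_in_out(mask, box):
    """Background pixels inside / outside the box, normalized by N - |S(h)|.

    mask: 2D list of 0/1 segment indicators; box: (r0, c0, r1, c1), inclusive.
    """
    rows, cols = len(mask), len(mask[0])
    r0, c0, r1, c1 = box
    seg_size = sum(v for row in mask for v in row)  # |S(h)|
    denom = rows * cols - seg_size                  # N - |S(h)|
    bg_in = sum(1 - mask[r][c]
                for r in range(rows) for c in range(cols)
                if r0 <= r <= r1 and c0 <= c <= c1)
    bg_out = sum(1 - mask[r][c]
                 for r in range(rows) for c in range(cols)
                 if not (r0 <= r <= r1 and c0 <= c <= c1))
    return bg_in / denom, bg_out / denom
```

With a learned negative weight on the first value and a positive weight on the second, a box that hugs the segment scores highest.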
53 Overlap: This feature penalizes bounding boxes which do not overlap well with the segment. [sent-119, score-0.333]
54 In particular, it computes the intersection over union between the candidate bounding box defined by p0 and the tightest bounding box around the segment S(h). [sent-120, score-1.305]
55 It is defined as follows: φoverlap(x,h,p0) = |B(p0) ∩ B(S(h))| / |B(p0) ∪ B(S(h))| − λ, with B(S(h)) the tightest bounding box around S(h), B(p0) the bounding box defined by p0, and λ a constant, which is the intersection-over-union level that defines a true positive. [sent-121, score-0.99]
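The overlap feature is plain IoU between the hypothesis box and the segment's tight box, shifted by λ. A minimal sketch; the inclusive box bounds and λ = 0.5 default (the standard VOC true-positive threshold) are assumptions:

```python
def overlap_feature(box_a, box_b, lam=0.5):
    """IoU(B(p0), B(S(h))) - lambda, boxes as (r0, c0, r1, c1) inclusive."""
    def area(b):
        return (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    r0 = max(box_a[0], box_b[0]); c0 = max(box_a[1], box_b[1])
    r1 = min(box_a[2], box_b[2]); c1 = min(box_a[3], box_b[3])
    inter = max(0, r1 - r0 + 1) * max(0, c1 - c0 + 1)  # 0 if no overlap
    union = area(box_a) + area(box_b) - inter
    return inter / union - lam

print(overlap_feature((0, 0, 1, 1), (0, 0, 1, 1)))  # identical boxes -> 1 - lambda
```

Subtracting λ makes the feature negative exactly when the box would not count as a true positive against the segment's tight box.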
56 We incorporate an additional feature to learn the bias for the background segment (h = 0). [sent-125, score-0.269]
57 This puts the scores of the HOG filters and the segmentation potentials into a common reference frame. [sent-126, score-0.277]
58 1 depicts our features computed for a specific bounding box p0 and segment S(h). [sent-130, score-0.715]
59 Note that the first two features, φseg−in and φseg−out, encourage the box to contain as many segment pixels as possible. [sent-131, score-0.513]
60 This pair of features alone could result in box hypotheses that “overshoot” the segment. [sent-132, score-0.335]
61 The purpose of the second pair of features, φback−in and φback−out, is the opposite: it tries to minimize the number of background pixels inside the box and maximize its number outside. [sent-133, score-0.351]
62 In synchrony these features would try to tightly place a box around the segment. [sent-134, score-0.422]
63 Note that the features have to be computed for each segment h, but this is not a problem as there are typically only a few segments per image. [sent-139, score-0.422]
64 Let φint(h) be the integral image for segment h, which, at each point (u, v), counts the percentage of pixels that belong to this segment and are contained inside the subimage defined by the domain [0, u] × [0, v]. [sent-142, score-0.665]
65 In Fig. 2, (φtl, φtr, φbl, φbr) indexes the integral image of segment S(h) at the four corners, [sent-147, score-0.311]
i.e., top-left, top-right, bottom-left, bottom-right, of the bounding box defined by p0. [sent-149, score-0.437]
67 The overlap feature between a hypothesis p0 and a segment S(h) can also be computed very efficiently. [sent-150, score-0.386]
68 Given that the segment bounding box B(S(h)) is fixed and the width and height of p0 at a particular level of the pyramid are fixed as well, we can quickly compute the overlap. (Figure 2: segment S(h) and its integral image φint(h).) [sent-158, score-0.981]
69 The algorithm works as follows: First, we compute max_h w_seg^T φ(x, h, p0) as well as max_{pi} (w_{i,def}^T · φ(x, p0, pi)) for each root filter hypothesis p0. [sent-173, score-0.225]
70 We then compute the score as the sum of the HOG and segment score for each mixture component at the root level. [sent-174, score-0.481]
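The two maximizations and the final sum can be sketched as follows. The flat score lists stand in for the actual HOG, segmentation, and deformation responses, which are hypothetical here:

```python
def root_score(hog_score, seg_scores, part_scores):
    """Score of one root hypothesis p0: its HOG response, plus the best
    segment (including the void segment h = 0), plus the best placement
    of each part."""
    best_segment = max(seg_scores)                 # max over h
    best_parts = sum(max(s) for s in part_scores)  # max over p_i, per part
    return hog_score + best_segment + best_parts

# Hypothetical responses: void segment plus 2 candidates, and 2 parts.
print(root_score(1.0, [0.0, 0.4, -0.2], [[0.3, 0.1], [0.2, 0.5]]))
```

Because the segment maximization factors out per root hypothesis, it adds only a small constant cost on top of the standard DPM distance-transform inference.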
71 Allowing different weights for the different segmentation features is important in order to learn how likely it is for each class to have segments that undershoot or overshoot the detection bounding box. [sent-179, score-0.723]
72 As the loss we employ the intersection over union of the root filters. [sent-180, score-0.216]
73 As in DPM [14], we initialize the model by first training only the root filters, followed by training a root mixture model. [sent-181, score-0.251]
74 Note that we expect the weights for φseg−in(x, h, p0), φback−out(x, h, p0) and φoverlap(x, h, p0) to be positive, as we would like to maximize the overlap, the amount of foreground inside the bounding box, and the amount of background outside it. [sent-183, score-0.737]
75 Similarly, the weights for φseg−out (x, h, p0) and φback−in (x, h, p0) are expected to be negative as we would like to minimize the amount of background inside the bounding box as well as the amount of foreground segment outside. [sent-184, score-0.735]
76 In particular, for most experiments we use the final segmentation output of [7]. [sent-189, score-0.182]
77 For each class, we find all connected components in the segmentation output, and remove those that do not exceed 1500 pixels. [sent-190, score-0.182]
78 Experimental Evaluation We first evaluate our detection performance on the val subset of the PASCAL VOC 2010 detection dataset, and compare it to the baselines. [sent-196, score-0.25]
79 As baselines we use the Dalal & Triggs detector [12] (which for fairness we compare to our detector when only using the root filters), the DPM [14], as well as CPMC [7] when used as a detector. [sent-199, score-0.3]
80 To compute the latter, we find all the connected components in the final segmentation output of CPMC [7], and place the tightest bounding box around each component. [sent-200, score-0.652]
81 To compute the score of the box we utilize the CPMC ranking scores for the segments. [sent-201, score-0.349]
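The baseline construction can be sketched as below, under the assumption that each connected component is already isolated as a binary mask and carries a CPMC ranking score:

```python
def tightest_box(mask):
    """Tightest bounding box (r0, c0, r1, c1) around the nonzero pixels."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

def cpmc_detections(components):
    """Turn (mask, ranking_score) pairs into scored box detections."""
    return [(tightest_box(m), s) for m, s in components]
```

This gives the CPMC-as-detector baseline its boxes and scores without any appearance model, which is exactly what the proposed blended model improves on.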
82 The main power of our approach is that it blends between DPM (appearance) and segmentation (CPMC). [sent-206, score-0.316]
83 When there is a segment, our approach is encouraged to tightly fit a box around it. [sent-208, score-0.31]
84 Note that our results clearly demonstrate the effectiveness of blending detection and segmentation models for object detection. [sent-210, score-0.296]
85 Note that our approach is able both to retrieve detections where there is no segment and to position the bounding box correctly where there is segment evidence. [sent-213, score-0.951]
86 , bounding box prediction and context-rescoring, in the top of Table 2. [sent-218, score-0.437]
87 In particular, among the 150 segments per image returned by [8], we selected a top-ranking subset for each class, so that there was an average of 5 segments per image. [sent-228, score-0.331]
88 Conclusion In this paper, we have proposed a novel deformable partbased model, which exploits region-based segmentation by allowing every detection hypothesis to select a segment (including void) from a pool of segment candidates. [sent-235, score-0.991]
89 Our detector scores each box in the image using both the HOG filters as in original DPM, as well as a set of novel segmentation features. [sent-237, score-0.616]
90 This way, our model “blends” between the detector and the segmentation model, boosting object hypotheses on the segments while recovering from segmentation mistakes by exploiting a powerful appearance model. [sent-238, score-0.421]
91 We demonstrated the effectiveness of our approach on PASCAL VOC 2010, and showed that when employing only a root filter our approach outperforms the Dalal & Triggs detector [12] by 13% AP, and when employing parts, we outperform the original DPM [14] by 8%. [sent-239, score-0.422]
92 We expect a new generation of object detectors which are able to exploit semantic segmentation yet to come. [sent-241, score-0.347]
93 plane bike bird boat bottle bus car cat chair cow table dog horse motor person plant sheep sofa train tv Avg. [sent-245, score-0.637]
94 AP performance (in %) on VOC 2010 val for our detector with parts, the DPM [14], and the CPMC-based detector [7]. [sent-310, score-0.33]
95 plane bike bird boat bottle bus car cat chair cow table dog horse motor person plant sheep sofa train tv Avg. [sent-311, score-0.614]
96 Object segmentation by alignment of poselet activations to image contours. [sent-387, score-0.23]
97 Object detection and segmentation from joint embedding of parts and pixels. [sent-512, score-0.292]
98 plane bike bird boat bottle bus car cat chair cow table dog horse motor person plant sheep sofa train tv Avg. [sent-648, score-0.637]
99 AP performance (in %) on VOC 2010 val for our detector when using more segments. [sent-671, score-0.233]
100 For example, for an image with a chair and a cat GT box, we show the top scoring box for chair and the top scoring box for cat. [sent-692, score-0.644]
wordName wordTfidf (topN-words)
[('dpm', 0.339), ('seg', 0.311), ('box', 0.242), ('cpmc', 0.233), ('segment', 0.233), ('voc', 0.199), ('bounding', 0.195), ('segmentation', 0.182), ('segments', 0.144), ('val', 0.136), ('blends', 0.134), ('segdpm', 0.134), ('void', 0.11), ('back', 0.108), ('root', 0.106), ('overshoot', 0.1), ('dpms', 0.1), ('detector', 0.097), ('hypothesis', 0.086), ('employing', 0.084), ('candidate', 0.082), ('integral', 0.078), ('pascal', 0.074), ('deformable', 0.071), ('boxes', 0.071), ('ap', 0.069), ('tightly', 0.068), ('triggs', 0.068), ('overlap', 0.067), ('buscarcatchair', 0.067), ('synchrony', 0.067), ('wsteg', 0.067), ('inside', 0.065), ('motor', 0.061), ('sheep', 0.058), ('detection', 0.057), ('object', 0.057), ('filters', 0.056), ('counts', 0.056), ('semantic', 0.056), ('bike', 0.055), ('azizpour', 0.055), ('bottle', 0.054), ('gu', 0.054), ('parts', 0.053), ('dog', 0.052), ('detectors', 0.052), ('xs', 0.051), ('pool', 0.051), ('outperform', 0.051), ('plant', 0.051), ('cow', 0.051), ('sofa', 0.051), ('dalal', 0.05), ('agree', 0.049), ('holistic', 0.049), ('boat', 0.048), ('poselet', 0.048), ('hypotheses', 0.048), ('detections', 0.048), ('hog', 0.048), ('iou', 0.047), ('inference', 0.047), ('double', 0.047), ('bird', 0.047), ('features', 0.045), ('segmentations', 0.045), ('exploits', 0.045), ('arbelaez', 0.044), ('chair', 0.044), ('tries', 0.044), ('returned', 0.043), ('tighter', 0.043), ('horse', 0.042), ('rain', 0.041), ('masks', 0.041), ('outside', 0.04), ('importantly', 0.039), ('scores', 0.039), ('carreira', 0.039), ('mixture', 0.039), ('encourage', 0.038), ('cats', 0.038), ('bourdev', 0.038), ('mistakes', 0.037), ('employ', 0.037), ('union', 0.037), ('bb', 0.037), ('ys', 0.037), ('scoring', 0.036), ('intersection', 0.036), ('bias', 0.036), ('score', 0.035), ('br', 0.034), ('desai', 0.034), ('vedaldi', 0.034), ('compute', 0.033), ('dai', 0.033), ('select', 0.033), ('bl', 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999881 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends ” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
2 0.28600773 364 cvpr-2013-Robust Object Co-detection
Author: Xin Guo, Dong Liu, Brendan Jou, Mojun Zhu, Anni Cai, Shih-Fu Chang
Abstract: Object co-detection aims at simultaneous detection of objects of the same category from a pool of related images by exploiting consistent visual patterns present in candidate objects in the images. The related image set may contain a mixture of annotated objects and candidate objects generated by automatic detectors. Co-detection differs from the conventional object detection paradigm in which detection over each test image is determined one-by-one independently without taking advantage of common patterns in the data pool. In this paper, we propose a novel, robust approach to dramatically enhance co-detection by extracting a shared low-rank representation of the object instances in multiple feature spaces. The idea is analogous to that of the well-known Robust PCA [28], but has not been explored in object co-detection so far. The representation is based on a linear reconstruction over the entire data set and the low-rank approach enables effective removal of noisy and outlier samples. The extracted low-rank representation can be used to detect the target objects by spectral clustering. Extensive experiments over diverse benchmark datasets demonstrate consistent and significant performance gains of the proposed method over the state-of-the-art object codetection method and the generic object detection methods without co-detection formulations.
3 0.28200936 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem
Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors ’ ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.
4 0.23177543 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
Author: David Weiss, Ben Taskar
Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shapepriors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.
5 0.2268821 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
Author: Byung-soo Kim, Shili Xu, Silvio Savarese
Abstract: In this paper we focus on the problem of detecting objects in 3D from RGB-D images. We propose a novel framework that explores the compatibility between segmentation hypotheses of the object in the image and the corresponding 3D map. Our framework allows to discover the optimal location of the object using a generalization of the structural latent SVM formulation in 3D as well as the definition of a new loss function defined over the 3D space in training. We evaluate our method using two existing RGB-D datasets. Extensive quantitative and qualitative experimental results show that our proposed approach outperforms state-of-theart as methods well as a number of baseline approaches for both 3D and 2D object recognition tasks.
6 0.21831279 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
7 0.20999555 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
8 0.18377168 1 cvpr-2013-3D-Based Reasoning with Blocks, Support, and Stability
9 0.18346822 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
10 0.18168354 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
11 0.17176336 325 cvpr-2013-Part Discovery from Partial Correspondence
12 0.17120984 311 cvpr-2013-Occlusion Patterns for Object Class Detection
13 0.16993427 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
14 0.15476106 154 cvpr-2013-Explicit Occlusion Modeling for 3D Object Class Representations
15 0.14925963 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
16 0.14800332 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
17 0.14257382 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
18 0.14256312 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
19 0.13601924 25 cvpr-2013-A Sentence Is Worth a Thousand Pixels
20 0.12867685 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection
topicId topicWeight
[(0, 0.263), (1, -0.087), (2, 0.078), (3, -0.1), (4, 0.145), (5, 0.039), (6, 0.16), (7, 0.186), (8, -0.059), (9, -0.016), (10, -0.057), (11, -0.141), (12, 0.071), (13, -0.107), (14, 0.009), (15, -0.015), (16, 0.085), (17, 0.021), (18, -0.121), (19, 0.1), (20, -0.05), (21, 0.012), (22, 0.18), (23, -0.003), (24, 0.147), (25, -0.039), (26, -0.044), (27, -0.005), (28, 0.006), (29, -0.092), (30, 0.021), (31, -0.051), (32, -0.018), (33, -0.018), (34, 0.018), (35, 0.032), (36, 0.106), (37, -0.073), (38, -0.039), (39, 0.098), (40, -0.029), (41, 0.047), (42, -0.005), (43, 0.041), (44, -0.098), (45, -0.09), (46, 0.012), (47, 0.036), (48, 0.033), (49, -0.056)]
simIndex simValue paperId paperTitle
same-paper 1 0.97599548 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends ” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
2 0.84902817 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
Author: Byung-soo Kim, Shili Xu, Silvio Savarese
Abstract: In this paper we focus on the problem of detecting objects in 3D from RGB-D images. We propose a novel framework that explores the compatibility between segmentation hypotheses of the object in the image and the corresponding 3D map. Our framework allows to discover the optimal location of the object using a generalization of the structural latent SVM formulation in 3D as well as the definition of a new loss function defined over the 3D space in training. We evaluate our method using two existing RGB-D datasets. Extensive quantitative and qualitative experimental results show that our proposed approach outperforms state-of-theart as methods well as a number of baseline approaches for both 3D and 2D object recognition tasks.
3 0.83351249 364 cvpr-2013-Robust Object Co-detection
Author: Xin Guo, Dong Liu, Brendan Jou, Mojun Zhu, Anni Cai, Shih-Fu Chang
Abstract: Object co-detection aims at simultaneous detection of objects of the same category from a pool of related images by exploiting consistent visual patterns present in candidate objects in the images. The related image set may contain a mixture of annotated objects and candidate objects generated by automatic detectors. Co-detection differs from the conventional object detection paradigm in which detection over each test image is determined one-by-one independently without taking advantage of common patterns in the data pool. In this paper, we propose a novel, robust approach to dramatically enhance co-detection by extracting a shared low-rank representation of the object instances in multiple feature spaces. The idea is analogous to that of the well-known Robust PCA [28], but has not been explored in object co-detection so far. The representation is based on a linear reconstruction over the entire data set and the low-rank approach enables effective removal of noisy and outlier samples. The extracted low-rank representation can be used to detect the target objects by spectral clustering. Extensive experiments over diverse benchmark datasets demonstrate consistent and significant performance gains of the proposed method over the state-of-the-art object codetection method and the generic object detection methods without co-detection formulations.
4 0.78027892 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem
Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors ’ ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.
5 0.77646554 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning
Author: David Weiss, Ben Taskar
Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shape priors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.
6 0.74780107 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition
7 0.73372889 1 cvpr-2013-3D-Based Reasoning with Blocks, Support, and Stability
8 0.70594615 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
9 0.69659674 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
10 0.69033331 247 cvpr-2013-Learning Class-to-Image Distance with Object Matchings
11 0.68507177 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
12 0.67586923 325 cvpr-2013-Part Discovery from Partial Correspondence
13 0.66953003 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
14 0.66399467 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
15 0.66041768 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection
16 0.65896314 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
17 0.65817332 43 cvpr-2013-Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs
18 0.65651155 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
19 0.6556797 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors
20 0.6508767 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
topicId topicWeight
[(10, 0.137), (16, 0.017), (26, 0.052), (33, 0.244), (39, 0.018), (67, 0.098), (69, 0.127), (80, 0.017), (87, 0.073), (88, 0.139)]
simIndex simValue paperId paperTitle
same-paper 1 0.89550078 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
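The scoring idea in this abstract, letting each detection hypothesis select a segment (or void) and add its segmentation score to the HOG score, can be sketched as follows. This is a hedged sketch, not the paper's implementation: the feature contents (e.g. box/segment overlap) and all names here are illustrative assumptions.

```python
import numpy as np

def blended_score(hog_score, seg_feats, w_seg):
    """Score a detection box by letting it latch onto the best candidate
    segment, or onto 'void' (no segment, modeled as an all-zero feature
    row), and adding that segmentation score to the HOG filter score.
    seg_feats: (num_segments, d) array of hypothetical segmentation
    features; w_seg: learned weight vector of length d."""
    cands = np.vstack([np.zeros((1, seg_feats.shape[1])), seg_feats])
    seg_scores = cands @ w_seg
    best = int(np.argmax(seg_scores))   # index 0 means 'void' was chosen
    return hog_score + seg_scores[best], best
```

Because the void row scores exactly zero, the blended score can never fall below the plain HOG score, which is one way to read the claim that the model "blends" between the detector and the segmentation cues.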
2 0.89325327 172 cvpr-2013-Finding Group Interactions in Social Clutter
Author: Ruonan Li, Parker Porfilio, Todd Zickler
Abstract: We consider the problem of finding distinctive social interactions involving groups of agents embedded in larger social gatherings. Given a pre-defined gallery of short exemplar interaction videos, and a long input video of a large gathering (with approximately-tracked agents), we identify within the gathering small sub-groups of agents exhibiting social interactions that resemble those in the exemplars. The participants of each detected group interaction are localized in space; the extent of their interaction is localized in time; and when the gallery of exemplars is annotated with group-interaction categories, each detected interaction is classified into one of the pre-defined categories. Our approach represents group behaviors by dichotomous collections of descriptors for (a) individual actions, and (b) pairwise interactions; and it includes efficient algorithms for optimally distinguishing participants from by-standers in every temporal unit and for temporally localizing the extent of the group interaction. Most importantly, the method is generic and can be applied whenever numerous interacting agents can be approximately tracked over time. We evaluate the approach using three different video collections, two that involve humans and one that involves mice.
3 0.89308494 97 cvpr-2013-Correspondence-Less Non-rigid Registration of Triangular Surface Meshes
Author: Zsolt Sánta, Zoltan Kato
Abstract: A novel correspondence-less approach is proposed to find a thin plate spline map between a pair of deformable 3D objects represented by triangular surface meshes. The proposed method works without landmark extraction and feature correspondences. The aligning transformation is found simply by solving a system of nonlinear equations. Each equation is generated by integrating a nonlinear function over the object’s domains. We derive recursive formulas for the efficient computation of these integrals. Based on a series of comparative tests on a large synthetic dataset, our triangular mesh-based algorithm outperforms state-of-the-art methods both in terms of computing time and accuracy. The applicability of the proposed approach has been demonstrated on the registration of 3D lung CT volumes.
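For context on the transformation family this abstract uses, here is the standard correspondence-based thin plate spline fit in 2-D. It is illustrative only: the paper's contribution is precisely to estimate the TPS *without* correspondences (by integrating nonlinear functions over the mesh domains), whereas this sketch assumes matched control points.

```python
import numpy as np

def tps_fit(src, dst):
    """Fit a 2-D thin plate spline mapping src control points to dst.
    Kernel U(r) = r^2 log r^2 (the factor-2 difference from r^2 log r
    is absorbed into the weights). Solves the usual linear system
    [[K, P], [P^T, 0]] [w; a] = [dst; 0]."""
    n = len(src)
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    K = np.where(d2 > 0, d2 * np.log(np.maximum(d2, 1e-12)), 0.0)
    P = np.hstack([np.ones((n, 1)), src])        # affine part: 1, x, y
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K; A[:n, n:] = P; A[n:, :n] = P.T
    b = np.zeros((n + 3, 2)); b[:n] = dst
    return np.linalg.solve(A, b)                  # (n+3, 2) coefficients

def tps_apply(coef, src, pts):
    """Evaluate the fitted spline at new points."""
    n = len(src)
    d2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    U = np.where(d2 > 0, d2 * np.log(np.maximum(d2, 1e-12)), 0.0)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ coef[:n] + P @ coef[n:]
```

The spline interpolates the control points exactly and reproduces affine transformations with zero bending energy, which makes it a natural model for smooth non-rigid deformations of surfaces.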
4 0.88369614 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem
Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors' ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best existing systems, outperforming other HOG-based detectors on the more deformable categories.
5 0.88332534 231 cvpr-2013-Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment
Author: Nicola Fioraio, Luigi Di_Stefano
Abstract: In this paper we propose a novel Semantic Bundle Adjustment framework whereby known rigid stationary objects are detected while tracking the camera and mapping the environment. The system builds on established tracking and mapping techniques to exploit incremental 3D reconstruction in order to validate hypotheses on the presence and pose of sought objects. Then, detected objects are explicitly taken into account for a global semantic optimization of both camera and object poses. Thus, unlike all systems proposed so far, our approach allows for solving jointly the detection and SLAM problems, so as to achieve object detection together with improved SLAM accuracy.
6 0.87964058 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
7 0.87902474 86 cvpr-2013-Composite Statistical Inference for Semantic Segmentation
8 0.87687302 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment
9 0.87389618 1 cvpr-2013-3D-Based Reasoning with Blocks, Support, and Stability
10 0.87202215 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.86938882 414 cvpr-2013-Structure Preserving Object Tracking
12 0.86895156 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
13 0.86855704 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
14 0.86667055 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models
15 0.86505979 325 cvpr-2013-Part Discovery from Partial Correspondence
16 0.86487955 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
17 0.8641519 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
18 0.86368817 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
19 0.86347312 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
20 0.86301339 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses