cvpr cvpr2013 cvpr2013-311 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bojan Pepikj, Michael Stark, Peter Gehler, Bernt Schiele
Abstract: Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of methods that treat occlusion as just another source of noise; instead, we include the occluder itself in the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistication. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper we leave the beaten path of methods that treat occlusion as just another source of noise; instead, we include the occluder itself in the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. [sent-2, score-1.902]
2 These patterns are then used as training data for dedicated detectors of varying sophistication. [sent-3, score-0.315]
3 In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. [sent-4, score-0.237]
4 In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge. [sent-5, score-0.773]
5 Despite these achievements towards more accurate object hypotheses, partial occlusion still poses a major challenge to state-of-the-art detectors [4, 7], as becomes apparent when analyzing the results of current benchmark datasets [6]. [sent-9, score-0.874]
6 Curiously, what is also common to these approaches is that they focus entirely on the occluded object (the occludee), without any explicit notion of the cause of occlusion. [sent-11, score-0.446]
7 (Left) True positive detections by our occluded objects detector. [sent-13, score-0.243]
8 In this paper we therefore follow a different route, by treating the occluder as a first class citizen in the occlusion problem. [sent-18, score-1.045]
9 In particular, we start from the observation that certain types of occlusions are more likely than others: consider a street scene with cars parked on either side of the road (as in Fig. [sent-19, score-0.183]
10 Clearly, the visible and occluded portions of cars tend to form patterns that repeat numerous times, providing valuable visual cues about both the presence of individual objects and the layout of the scene as a whole. [sent-21, score-0.386]
11 Based on this observation, we chose to explicitly model these occlusion patterns by leveraging fine-grained, 3D annotations of a recent data set of urban street scenes [9]. [sent-22, score-0.942]
12 In particular, we mine reoccurring spatial arrangements of objects observed from a specific viewpoint, and model their distinctive appearance by an array of specialized detectors. [sent-23, score-0.266]
13 As baselines we include a standard, state-of-the-art object class detector [7] as well as a recently proposed double-person detector [19] in the evaluation, with sometimes surprising results (Sect. [sent-25, score-0.195]
14 First, we approach the challenging problem of partial occlusions in object class recognition from a different angle than most recent attempts by treating causes of occlusions as first class citizens in the model. [sent-28, score-0.42]
15 And third, in an extensive experimental study we evaluate and compare these different techniques, providing insights that we believe to be helpful in tackling the partial occlusion challenge in a principled manner. [sent-30, score-0.782]
16 Related work Sensitivity to partial occlusion has so far mostly been considered a lack of robustness, essentially treating occlusion as “noise rather than signal”. [sent-32, score-1.424]
17 Among the most successful implementations are integrated models of detection and segmentation using structured prediction and branch-and-bound [8], latent occlusion variables in a max-margin framework [20], and boosting [21]. [sent-36, score-0.841]
18 Only recently, [19] leveraged the joint appearance of multiple people for robust people detection and tracking by training a double-person detector [7] on pairs of people rather than single humans. [sent-39, score-0.279]
19 While our evaluation includes their model as a baseline, we systematically evaluate and contrast different ways of modelling occluders as first class citizens, and propose a more expressive, hierarchical model of occluder/occludee pairs that outperforms their model in certain configurations. [sent-40, score-0.28]
20 In the realm of deformable part models, [10] has considered part-level occlusion in the form of dedicated “occlusion” candidate parts that represent generic occlusion features (such as a visible occlusion edge). [sent-41, score-2.168]
21 On the scene level, occlusion has been tackled with considerable success in terms of recognition performance by drawing evidence from partial object detections in probabilistic scene models [22, 14]. [sent-47, score-0.897]
22 While these models can reason about occluder/occludee in principle, their level of detail is limited by the chosen object class representation – in both cases standard 2D bounding box-based detectors are used [7] which clearly fail to capture interactions between objects that are not box-shaped. [sent-48, score-0.43]
23 Occlusion patterns Our approach to modelling partial occlusions is based on the notion of occlusion patterns, i. [sent-51, score-1.066]
24 Specifically, we limit ourselves to pairs of objects, giving rise to occlusion patterns on the level of single objects (occludees) and double objects (occluder-occludee pairs). [sent-55, score-1.043]
25 Mining occlusion patterns We mine occlusion patterns from training data by leveraging fine-grained annotations in the form of 3D object bounding boxes and camera projection matrices that are readily available as part of the KITTI dataset [9]. [sent-58, score-1.947]
26 We use these annotations to define a joint feature space that represents both the relative layout of two objects taking part in an occlusion and the viewpoint from which this arrangement is observed by the camera. [sent-59, score-0.863]
27 We then perform clustering on this joint feature space, resulting in an assignment of object pairs to clusters that we use as training data for the components of mixture models, as detailed in Sec. [sent-60, score-0.189]
28 We use the following properties of occlusion patterns as features in our clustering: i) occluder left/right of occludee in image space, ii) occluder and occludee orientation in 3D object coordinates, iii) occluder is/is not itself occluded, iv) degree of occlusion of occludee. [sent-63, score-2.806]
29 based on assigning the viewing angle of the occluder to one of a fixed number [sent-67, score-0.283]
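The viewpoint discretization mentioned above can be sketched as follows. This is a minimal illustration only: the function name `viewpoint_bin`, the default of 8 bins, and the equal-width binning over a full turn are assumptions, since the digest does not state how many bins the authors use or where the bin boundaries lie.

```python
import math


def viewpoint_bin(angle_rad: float, num_bins: int = 8) -> int:
    """Map a viewing angle in radians to one of num_bins equal-width bins.

    Hypothetical helper: bin count and boundaries are assumptions.
    """
    two_pi = 2.0 * math.pi
    # Wrap into [0, 2*pi) so negative angles and multiples of a full turn work.
    wrapped = angle_rad % two_pi
    # Clamp in case floating-point rounding yields wrapped == two_pi.
    return min(int(wrapped / two_pi * num_bins), num_bins - 1)
```

The resulting bin index could then serve as one discrete coordinate of the joint feature space used for clustering, alongside the left/right flag and the degree of occlusion.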
30 Occlusion patterns span a wide range of occluder-occludee arrangements: resulting appearance can be well aligned (leftmost columns), or diverging (rightmost columns) – note that occluders are sometimes themselves occluded. [sent-70, score-0.253]
31 Figure 2 visualizes a selection of occlusion patterns mined from the KITTI dataset [9]. [sent-72, score-0.884]
32 As shown by the average images over cluster members (row (2)), some occlusion patterns are quite well aligned, which is a prerequisite for learning reliable detectors from them (Sec. [sent-73, score-1.02]
33 Occlusion pattern detectors In the following, we introduce three different models for the detection of occlusion patterns, each based on the well known and tested deformable part model (DPM [7]) framework. [sent-77, score-0.973]
34 2) focuses on individual occluded objects, by dedicating distinct mixture components to different single-object occlusion patterns. [sent-80, score-0.954]
35 The DPM is a mixture of C star-shaped log-linear conditional random fields (CRFs), all of which have a root p0 and a number of latent parts pi, i = 1, . [sent-89, score-0.221]
36 , N of pairs of images I and object annotations y, consisting of bounding boxes (l_n, r_n, t_n, b_n) and coarse viewpoint estimates. [sent-119, score-0.271]
37 Single-object occlusion patterns – OC-DPM We experiment with the following extension of the DPM [7]. [sent-122, score-0.829]
38 , Cvisible that represent the appearances of instances of an object class of interest, we introduce additional mixture components dedicated to representing the distinctive appearance of occluded instances of that class. [sent-126, score-0.386]
39 In particular, we reserve a distinct mixture component for each of the occludee members of clusters resulting from our occlusion pattern mining step (Sec. [sent-127, score-1.077]
40 Double-object occlusion patterns While the single-object occlusion model of Sec. [sent-131, score-1.501]
41 2 has the potential to represent distinctive occlusion patterns in the data, modelling occluder and corresponding occludee jointly suggests a potential improvement: intuitively, the strong evidence of the occluder should provide strong cues as to where to look for the occludee. [sent-133, score-1.793]
42 In the following we capture this intuition by designing two variants of a hierarchical occlusion model based on the DPM [7] framework. [sent-134, score-0.7]
43 In these models occluder and occludee are allowed to move w. [sent-135, score-0.519]
44 1 Double-objects with joint root – Sym-DPM The first double-object occlusion pattern detector is graphically depicted in Fig. [sent-143, score-0.9]
45 The idea is to join two star-shaped CRFs, one for the occluding object p0, and one for the occluded object p0, by an extra common root part p0 = (l, r, t, b). [sent-145, score-0.342]
46 As annotation for the root part we use the tightest rectangle around the union of the two objects, see the green bounding boxes in Fig. [sent-146, score-0.317]
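The joint root annotation described above, the tightest rectangle around the union of the two objects, can be sketched as below. Boxes follow the (l, r, t, b) ordering used in the text; the function name `joint_root_box` and the assumption of image coordinates with the top edge smaller than the bottom edge are illustrative choices, not taken from the paper.

```python
from typing import Tuple

# (left, right, top, bottom) in image coordinates, with top < bottom (assumed).
Box = Tuple[float, float, float, float]


def joint_root_box(occluder: Box, occludee: Box) -> Box:
    """Return the tightest axis-aligned box enclosing both input boxes."""
    l1, r1, t1, b1 = occluder
    l2, r2, t2, b2 = occludee
    return (min(l1, l2), max(r1, r2), min(t1, t2), max(b1, b2))
```

For example, an occluder at (10, 50, 20, 60) and an occludee at (40, 90, 30, 70) would share a root annotation of (10, 90, 20, 70).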
47 The inclusion of this common root part introduces three new terms to the energy, an appearance term for the common root ? [sent-148, score-0.203]
48 2 Double-objects without joint root – Asym-DPM The second double-object model is a variation of Sym-DPM, where the common root part is omitted (Fig. [sent-162, score-0.243]
49 This relationship is asymmetric, which is why we refer to this model as Asym-DPM, and it follows the intuition that the occluder can typically be trusted more (because it provides unhampered image evidence). [sent-165, score-0.283]
50 For the models considered here, β refers to all their parameters (v, w, w, w) for all components c, y to the bounding box annotations per example (can be 1 or 2), and h to the latent part placements. [sent-187, score-0.32]
51 The latter step involves detecting high scoring bounding boxes and latent part assignments (y? [sent-191, score-0.192]
52 We use the standard intersection-over-union loss ΔVOC for a pair of bounding boxes y, y? [sent-206, score-0.204]
53 In case the model predicts a single bounding box y only (decided through the choice of the component) the loss is the intersection over union loss between Δ(yn, y) in case there is one annotation and Δ(yn, y) in case of an occlusion annotation. [sent-213, score-0.955]
54 When two bounding boxes are predicted y, y the loss is computed as either Δ(yn, y) in case there is a single annotation or as the average 0. [sent-215, score-0.214]
55 Our implementation of the loss function captures both single and double object detections simultaneously. [sent-226, score-0.185]
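The intersection-over-union loss for a single pair of boxes can be sketched as follows. This assumes the common form of one minus the overlap ratio and the (l, r, t, b) box ordering used earlier; the names `iou` and `voc_loss` are hypothetical, and how the authors combine the loss across single and double annotations is not reproduced here because the relevant sentences are truncated.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, right, top, bottom)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(min(a[1], b[1]) - max(a[0], b[0]), 0.0)
    inter_h = max(min(a[3], b[3]) - max(a[2], b[2]), 0.0)
    inter = inter_w * inter_h
    area_a = (a[1] - a[0]) * (a[3] - a[2])
    area_b = (b[1] - b[0]) * (b[3] - b[2])
    union = area_a + area_b - inter
    # Degenerate (zero-area) inputs are treated as non-overlapping.
    return inter / union if union > 0 else 0.0


def voc_loss(annotation: Box, prediction: Box) -> float:
    """Loss in [0, 1]: 0 for a perfect match, 1 for disjoint boxes (assumed form)."""
    return 1.0 - iou(annotation, prediction)
```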
56 Experimental evaluation In the following, we give a detailed analysis of the various methods based on the notion of occlusion patterns that we introduced in Sect. [sent-231, score-0.871]
57 In a series of experiments we consider both results according to classical 2D bounding box-based localization measures, as well as a closer look at specific occlusion cases. [sent-233, score-0.752]
58 We commence by confirming the ability of our models to detect occlusion patterns in isolation 5. [sent-234, score-0.954]
59 2, and then move on to the task of object class detection in an unconstrained setting, comprising both un-occluded and occluded objects of varying difficulty 5. [sent-235, score-0.323]
60 The KITTI dataset [9] is a rich source of challenging occlusion cases, as shown in Tab. [sent-251, score-0.672]
61 In all our experiments on Car (Pedestrian) we train our occlusion models with 6 (6) components for visible objects and 16 (15) components for occlusion patterns. [sent-261, score-1.496]
62 We obtain these numbers after keeping the occlusion pattern clusters which have at least 30 positive training examples. [sent-262, score-0.721]
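The cluster selection rule stated above (keep only occlusion pattern clusters with at least 30 positive training examples) can be sketched as below. The input representation, a flat list with one cluster id per training example, and the function name `keep_frequent_clusters` are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List


def keep_frequent_clusters(assignments: Iterable[int], min_examples: int = 30) -> List[int]:
    """Return the sorted ids of clusters with at least min_examples members."""
    counts = Counter(assignments)
    return sorted(cid for cid, n in counts.items() if n >= min_examples)
```

Each surviving cluster id would then correspond to one occlusion-pattern mixture component.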
63 Detecting occlusion patterns We commence by evaluating the ability of our models to reliably detect occlusion patterns in isolation, since this constitutes the basis for handling occlusion cases in a realistic detection setting (Sect. [sent-265, score-2.502]
64 We first consider the joint detection of occlusion patterns in the form of object pairs (occluder and occludee). [sent-270, score-0.989]
65 images that contain occlusion pairs, which we determine from the available fine-grained annotations (we run the occlusion pattern mining of Sect. [sent-273, score-1.503]
66 This targeted evaluation is essential in order to separate concerns, and to draw meaningful conclusions about the role of different variants of occlusion modelling from the results. [sent-275, score-0.7]
67 Based on the setup of the previous experiment we turn to evaluating our occlusion pattern detectors on the level of individual objects (this comprises both occluders and occludees from the double-object occlusion patterns). [sent-285, score-1.738]
68 To that end, we add our single-object detectors to the comparison, namely, our Asym-DPM (orange), our OC-DPM (cyan), and the deformable part model [7] baseline (green). [sent-286, score-0.251]
69 Clearly, all explicit means of modelling occlusion improve over the DPM [7] baseline (53. [sent-289, score-0.815]
70 As concerns the relative performance of the different occlusion models, we observe a different ordering compared to the double-object occlusion pattern case: the double-object baseline [19] (blue, 61% AP) performs slightly better than our double-resolution Sym-DPM (red, 57. [sent-294, score-1.449]
71 To summarize, we conclude that detecting occlusion patterns in images is in fact feasible, achieving both sufficiently high recall (over 90% for both single- and double-object occlusion patterns) and reasonable AP (up to 74% for single-object occlusion patterns). [sent-300, score-2.173]
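Average precision (AP) and recall are the figures quoted throughout this evaluation. The sketch below shows one common way to compute AP from ranked detections, as the area under the precision-recall curve after making precision monotonically non-increasing. It is an illustration only: the function name and input format are hypothetical, and the exact AP protocol used in the paper (for instance, a sampled-recall variant) is not stated in this digest.

```python
from typing import List, Tuple


def average_precision(scored_hits: List[Tuple[float, bool]], num_gt: int) -> float:
    """Area under the precision-recall curve with a monotone precision envelope.

    scored_hits holds one (score, is_true_positive) pair per detection;
    num_gt is the number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        true_positives += 1 if hit else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

The final recall value reached in the loop corresponds to the overall recall figure, that is, the fraction of ground-truth objects matched by any detection.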
72 We consider this result viable evidence that occlusion pattern detectors have the potential to aid recognition in the case of occlusion (which we examine and verify in Sect. [sent-301, score-1.608]
73 Furthermore, careful and explicit modelling of occluder and occludee characteristics helps for the joint detection of double-object patterns (our hierarchical Sym-DPM model outperforms the flat baseline [19]). [sent-304, score-0.882]
74 Occlusion patterns for object class detection In this section we apply our findings from the isolated evaluation of occlusion pattern detectors to the more realistic setting of unconstrained object class detection, again considering the KITTI dataset [9] as a testbed. [sent-308, score-1.296]
75 Since the focus is again on occlusion, we consider a series of increasingly difficult scenarios for comparing performance, corresponding to increasing levels of occlusion (which we measure based on 3D annotations and the given camera parameters). [sent-309, score-0.778]
76 8 (a)), the data set restricted to at most 20% occluded objects (Fig. [sent-311, score-0.182]
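Grouping objects by occlusion level, as in the scenarios above, can be sketched as follows. The 20-point bins match the ranges named in the figure caption of this digest ([0-20] % to [80-100] %); treating each upper boundary as inclusive and the name `occlusion_level` are assumptions, and the computation of the occlusion percentage itself from 3D annotations is not shown.

```python
from bisect import bisect_left

LEVEL_EDGES = [20.0, 40.0, 60.0, 80.0]
LEVEL_LABELS = ["0-20", "20-40", "40-60", "60-80", "80-100"]


def occlusion_level(percent_occluded: float) -> str:
    """Assign an occlusion percentage in [0, 100] to a 20-point bin.

    Upper bin boundaries are inclusive (assumed), so 20.0 maps to "0-20".
    """
    if not 0.0 <= percent_occluded <= 100.0:
        raise ValueError("percent_occluded must lie in [0, 100]")
    return LEVEL_LABELS[bisect_left(LEVEL_EDGES, percent_occluded)]
```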
77 In order to enable detection of occluded as well as unoccluded object instances, we augment our various occlusion pattern detectors by additional mixture components for unoccluded objects. [sent-318, score-1.216]
78 8 (a)) we observe that the trends from the isolated evaluation of occlusion patterns (Sect. [sent-321, score-0.853]
79 2) transfer to the more realistic object class detection setting: while the double-object occlusion pattern detectors are comparable in terms of AP (Asym-DPM, orange, 52. [sent-323, score-0.981]
80 4%), improving over the next best double-object occlusion pattern detector Sym-DPM by a significant margin of 10. [sent-326, score-0.772]
81 8% AP) beats all double-object occlusion pattern detectors, but is in turn outperformed by our OC-DPM (cyan, 64. [sent-329, score-0.721]
82 All double-object detectors have proven to be very sensitive to the non-maximum suppression scheme used and suffer from score incomparability among the double and single object components. [sent-334, score-0.197]
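To make the role of non-maximum suppression concrete, here is a minimal sketch of the widely used greedy scheme: detections are visited in order of decreasing score, and a detection is dropped if it overlaps an already kept one too strongly. This is not necessarily the scheme used by the authors; the 0.5 threshold, the (l, r, t, b) box layout, and the function names are assumptions. Because the procedure ranks all detections on one score axis, it also illustrates why scores that are not comparable between single-object and double-object components can be a problem.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, right, top, bottom)
Detection = Tuple[float, Box]            # (score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(min(a[1], b[1]) - max(a[0], b[0]), 0.0)
    inter_h = max(min(a[3], b[3]) - max(a[2], b[2]), 0.0)
    inter = inter_w * inter_h
    union = (a[1] - a[0]) * (a[3] - a[2]) + (b[1] - b[0]) * (b[3] - b[2]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections: List[Detection], iou_threshold: float = 0.5) -> List[Detection]:
    """Keep the highest-scoring detections, suppressing strongly overlapping ones."""
    kept: List[Detection] = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept
```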
83 2%), confirming the benefit of our occlusion modelling, while Sym-DPM (31. [sent-339, score-0.703]
84 We proceed by examining the results for increasing levels of occlusion (Fig. [sent-343, score-0.703]
85 First, we observe that the relative ordering among double-object and single-object occlusion pattern detectors is stable across occlusion levels: our OC-DPM (cyan) outperforms all double-object occlusion pattern detectors, namely, Sym-DPM (blue) and Asym-DPM (orange). [sent-345, score-2.233]
86 Second, the DPM [7] baseline (green) excels at low levels of occlusion (77. [sent-346, score-0.733]
87 2% AP for up to 20% occlusion, 37% AP for 20 to 40% occlusion), performing better than the double-object occlusion pattern detectors for all occlusion levels. [sent-347, score-1.512]
88 But third, the DPM [7] is outperformed by our OC-DPM for all occlusion levels above 40% by significant margins (12. [sent-348, score-0.703]
89 We conclude that occlusion pattern detectors can in fact aid detection in presence of occlusion, and the benefit increases with increasing occlusion level. [sent-360, score-1.6]
90 While, to our surprise, we found that double-object occlusion pattern detectors were not competitive with [7], our simpler, single-object occlusion pattern detector (OC-DPM) improved performance for occlusion by a significant margin. [sent-361, score-2.326]
91 From our experience, the poor performance of double-object occlusion detectors on the KITTI dataset [9] (Sect. [sent-366, score-0.791]
92 3), which is in contrast to [19]’s findings for people detection, can be explained by the distribution over occlusion patterns: it seems biased towards extremely challenging “occluded occluder” cases. [sent-368, score-0.747]
93 row of cars parked on the side of the road), where the occluder is itself occluded; these cases are not correctly represented by occluder-occludee models. [sent-371, score-0.532]
94 cases it proves less robust to combine possibly conflicting pairwise detections (Asym-DPM, Sym-DPM) into a consistent interpretation than aggregating single-object occlusion patterns (OC-DPM). [sent-375, score-0.946]
95 We also found that the KITTI dataset [9] contains a significant number of occluded objects that are not annotated, supposedly due to being in the Lidar shadow, and hence missing 3D ground truth evidence for annotation. [sent-378, score-0.238]
96 Conclusions We have considered the long-standing problem of partial occlusion by making occluders first class citizens in modelling. [sent-388, score-0.946]
97 Detection performance for class Car on (a) the full dataset, (b)-(f) increasing occlusion levels from [0 − 20] % to [80 − 100] %. [sent-437, score-0.762]
98 models for detecting distinctive, reoccurring occlusion patterns, mined from annotated training data. [sent-441, score-0.92]
99 Using these detectors we could improve over the performance of a current, state-of-the-art object class detector over an entire dataset of challenging urban street scenes, but even more so for increasingly difficult cases in terms of occlusion. [sent-442, score-0.331]
100 Our most important findings are: i) reoccurring occlusion patterns can be automatically mined and reliably detected, ii) they can aid object detection, and iii) occlusion is still challenging also in terms of dataset annotation. [sent-443, score-1.781]
wordName wordTfidf (topN-words)
[('occlusion', 0.672), ('occluder', 0.283), ('occludee', 0.211), ('kitti', 0.181), ('dpm', 0.168), ('patterns', 0.157), ('occluded', 0.133), ('detectors', 0.119), ('ap', 0.118), ('occluders', 0.096), ('car', 0.088), ('root', 0.088), ('modelling', 0.087), ('reoccurring', 0.083), ('bounding', 0.08), ('annotations', 0.075), ('oc', 0.071), ('citizens', 0.07), ('singly', 0.07), ('yn', 0.066), ('arrangements', 0.061), ('detections', 0.061), ('class', 0.059), ('occlusions', 0.059), ('evidence', 0.056), ('pedestrian', 0.055), ('mined', 0.055), ('lidar', 0.055), ('detector', 0.051), ('partial', 0.049), ('pattern', 0.049), ('objects', 0.049), ('detection', 0.048), ('pepik', 0.048), ('cars', 0.047), ('cyan', 0.047), ('dedicating', 0.047), ('occludees', 0.047), ('members', 0.047), ('orange', 0.046), ('loss', 0.046), ('annotation', 0.044), ('distinctive', 0.044), ('boxes', 0.044), ('double', 0.044), ('planck', 0.043), ('notion', 0.042), ('unoccluded', 0.042), ('commence', 0.042), ('curiously', 0.042), ('singleobject', 0.042), ('findings', 0.041), ('latent', 0.041), ('stark', 0.041), ('joint', 0.04), ('aid', 0.04), ('components', 0.039), ('dedicated', 0.039), ('parked', 0.039), ('mixture', 0.038), ('pairs', 0.038), ('street', 0.038), ('dap', 0.036), ('mining', 0.035), ('union', 0.034), ('level', 0.034), ('object', 0.034), ('people', 0.034), ('insights', 0.034), ('deformable', 0.033), ('route', 0.033), ('box', 0.033), ('confirming', 0.031), ('treating', 0.031), ('levels', 0.031), ('baseline', 0.03), ('cases', 0.03), ('mine', 0.029), ('implementations', 0.029), ('parts', 0.028), ('variants', 0.028), ('isolation', 0.027), ('tackling', 0.027), ('part', 0.027), ('reliably', 0.027), ('explicit', 0.026), ('shaped', 0.026), ('wojek', 0.026), ('pairwise', 0.026), ('concerns', 0.026), ('schiele', 0.026), ('scans', 0.026), ('structured', 0.026), ('models', 0.025), ('cluster', 0.025), ('distinct', 0.025), ('occlude', 0.024), ('proxy', 0.024), ('isolated', 0.024), ('gehler', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000019 311 cvpr-2013-Occlusion Patterns for Object Class Detection
Author: Bojan Pepikj, Michael Stark, Peter Gehler, Bernt Schiele
Abstract: Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of methods that treat occlusion as just another source of noise; instead, we include the occluder itself in the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistication. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge.
2 0.52930915 154 cvpr-2013-Explicit Occlusion Modeling for 3D Object Class Representations
Author: M. Zeeshan Zia, Michael Stark, Konrad Schindler
Abstract: Despite the success of current state-of-the-art object class detectors, severe occlusion remains a major challenge. This is particularly true for more geometrically expressive 3D object class representations. While these representations have attracted renewed interest for precise object pose estimation, the focus has mostly been on rather clean datasets, where occlusion is not an issue. In this paper, we tackle the challenge of modeling occlusion in the context of a 3D geometric object class model that is capable of fine-grained, part-level 3D object reconstruction. Following the intuition that 3D modeling should facilitate occlusion reasoning, we design an explicit representation of likely geometric occlusion patterns. Robustness is achieved by pooling image evidence from a set of fixed part detectors as well as a non-parametric representation of part configurations in the spirit of poselets. We confirm the potential of our method on cars in a newly collected data set of inner-city street scenes with varying levels of occlusion, and demonstrate superior performance in occlusion estimation and part localization, compared to baselines that are unaware of occlusions.
3 0.22416177 357 cvpr-2013-Revisiting Depth Layers from Occlusions
Author: Adarsh Kowdle, Andrew Gallagher, Tsuhan Chen
Abstract: In this work, we consider images of a scene with a moving object captured by a static camera. As the object (human or otherwise) moves about the scene, it reveals pairwise depth-ordering or occlusion cues. The goal of this work is to use these sparse occlusion cues along with monocular depth occlusion cues to densely segment the scene into depth layers. We cast the problem of depth-layer segmentation as a discrete labeling problem on a spatiotemporal Markov Random Field (MRF) that uses the motion occlusion cues along with monocular cues and a smooth motion prior for the moving object. We quantitatively show that depth ordering produced by the proposed combination of the depth cues from object motion and monocular occlusion cues is superior to using either feature independently, and to using a naïve combination of the features.
4 0.17120984 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
Author: Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun
Abstract: In this paper we are interested in how semantic segmentation can help object detection. Towards this goal, we propose a novel deformable part-based model which exploits region-based segmentation algorithms that compute candidate object regions by bottom-up clustering followed by ranking of those regions. Our approach allows every detection hypothesis to select a segment (including void), and scores each box in the image using both the traditional HOG filters as well as a set of novel segmentation features. Thus our model “blends” between the detector and segmentation models. Since our features can be computed very efficiently given the segments, we maintain the same complexity as the original DPM [14]. We demonstrate the effectiveness of our approach in PASCAL VOC 2010, and show that when employing only a root filter our approach outperforms Dalal & Triggs detector [12] on all classes, achieving 13% higher average AP. When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP. Furthermore, we outperform the previous state-of-the-art on VOC’10 test by 4%.
5 0.16831356 439 cvpr-2013-Tracking Human Pose by Tracking Symmetric Parts
Author: Varun Ramakrishna, Takeo Kanade, Yaser Sheikh
Abstract: The human body is structurally symmetric. Tracking-by-detection approaches for human pose suffer from double counting, where the same image evidence is used to explain two separate but symmetric parts, such as the left and right feet. Double counting, if left unaddressed, can critically affect subsequent processes, such as action recognition, affordance estimation, and pose reconstruction. In this work, we present an occlusion-aware algorithm for tracking human pose in an image sequence that addresses the problem of double counting. Our key insight is that tracking human pose can be cast as a multi-target tracking problem where the “targets” are related by an underlying articulated structure. The human body is modeled as a combination of singleton parts (such as the head and neck) and symmetric pairs of parts (such as the shoulders, knees, and feet). Symmetric body parts are jointly tracked with mutual exclusion constraints to prevent double counting by reasoning about occlusion. We evaluate our algorithm on an outdoor dataset with natural background clutter, a standard indoor dataset (HumanEva-I), and compare against a state of the art pose estimation algorithm.
6 0.16818497 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
7 0.15178162 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
8 0.15172337 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
9 0.13516602 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
10 0.13083829 364 cvpr-2013-Robust Object Co-detection
11 0.12522145 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
12 0.12409556 325 cvpr-2013-Part Discovery from Partial Correspondence
13 0.11721116 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
14 0.11281922 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
15 0.11025217 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation
16 0.10532873 14 cvpr-2013-A Joint Model for 2D and 3D Pose Estimation from a Single Image
17 0.10345348 330 cvpr-2013-Photometric Ambient Occlusion
18 0.097538009 453 cvpr-2013-Video Editing with Temporal, Spatial and Appearance Consistency
19 0.096227095 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
20 0.086284906 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine
topicId topicWeight
[(0, 0.218), (1, -0.023), (2, 0.038), (3, -0.101), (4, 0.083), (5, 0.004), (6, 0.158), (7, 0.12), (8, 0.054), (9, -0.001), (10, -0.131), (11, -0.06), (12, 0.059), (13, -0.091), (14, 0.05), (15, 0.015), (16, -0.02), (17, 0.066), (18, -0.069), (19, 0.031), (20, -0.007), (21, -0.041), (22, 0.125), (23, 0.027), (24, 0.086), (25, -0.125), (26, 0.028), (27, -0.078), (28, 0.003), (29, -0.003), (30, 0.154), (31, -0.099), (32, 0.037), (33, -0.077), (34, -0.063), (35, 0.019), (36, 0.003), (37, -0.022), (38, 0.188), (39, 0.012), (40, -0.147), (41, 0.028), (42, 0.021), (43, -0.047), (44, -0.168), (45, 0.107), (46, -0.165), (47, -0.07), (48, -0.013), (49, 0.142)]
simIndex simValue paperId paperTitle
same-paper 1 0.97963297 311 cvpr-2013-Occlusion Patterns for Object Class Detection
Author: Bojan Pepikj, Michael Stark, Peter Gehler, Bernt Schiele
Abstract: Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of methods that treat occlusion as just another source of noise; instead, we include the occluder itself in the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistication. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge.
2 0.90857124 154 cvpr-2013-Explicit Occlusion Modeling for 3D Object Class Representations
Author: M. Zeeshan Zia, Michael Stark, Konrad Schindler
Abstract: Despite the success of current state-of-the-art object class detectors, severe occlusion remains a major challenge. This is particularly true for more geometrically expressive 3D object class representations. While these representations have attracted renewed interest for precise object pose estimation, the focus has mostly been on rather clean datasets, where occlusion is not an issue. In this paper, we tackle the challenge of modeling occlusion in the context of a 3D geometric object class model that is capable of fine-grained, part-level 3D object reconstruction. Following the intuition that 3D modeling should facilitate occlusion reasoning, we design an explicit representation of likely geometric occlusion patterns. Robustness is achieved by pooling image evidence from a set of fixed part detectors as well as a non-parametric representation of part configurations in the spirit of poselets. We confirm the potential of our method on cars in a newly collected data set of inner-city street scenes with varying levels of occlusion, and demonstrate superior performance in occlusion estimation and part localization, compared to baselines that are unaware of occlusions.
3 0.6424796 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
Author: Tao Wang, Xuming He, Nick Barnes
Abstract: We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments. First, we extend the Hough hypothesis space to include both object location and its visibility pattern, and design a new score function that accumulates votes for object detection and occlusion prediction. In addition, we explore the correlation between objects and their environment, building a depth-encoded object-context model based on RGB-D data. Particularly, we design a layered context representation and allow image patches from both objects and backgrounds to vote for the object hypotheses. We demonstrate that using a data-driven 2.1D representation we can learn visual codebooks with better quality, and more interpretable detection results in terms of spatial relationship between objects and viewer. We test our algorithm on two challenging RGB-D datasets with significant occlusion and intraclass variation, and demonstrate the superior performance of our method.
4 0.59044164 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang
Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation: the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4% to 8% improvements on multiple benchmark datasets.
5 0.58694845 357 cvpr-2013-Revisiting Depth Layers from Occlusions
Author: Adarsh Kowdle, Andrew Gallagher, Tsuhan Chen
Abstract: In this work, we consider images of a scene with a moving object captured by a static camera. As the object (human or otherwise) moves about the scene, it reveals pairwise depth-ordering or occlusion cues. The goal of this work is to use these sparse occlusion cues along with monocular depth occlusion cues to densely segment the scene into depth layers. We cast the problem of depth-layer segmentation as a discrete labeling problem on a spatiotemporal Markov Random Field (MRF) that uses the motion occlusion cues along with monocular cues and a smooth motion prior for the moving object. We quantitatively show that the depth ordering produced by the proposed combination of the depth cues from object motion and monocular occlusion cues is superior to using either feature independently, and to using a naïve combination of the features.
6 0.55467415 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
7 0.5384689 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
8 0.53459865 364 cvpr-2013-Robust Object Co-detection
9 0.52333426 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
10 0.50836796 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
11 0.506935 330 cvpr-2013-Photometric Ambient Occlusion
12 0.49480826 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
13 0.4921307 439 cvpr-2013-Tracking Human Pose by Tracking Symmetric Parts
14 0.47184145 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection
15 0.46642351 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
16 0.46553475 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
17 0.46011564 325 cvpr-2013-Part Discovery from Partial Correspondence
18 0.45837602 136 cvpr-2013-Discriminatively Trained And-Or Tree Models for Object Detection
19 0.45378944 96 cvpr-2013-Correlation Filters for Object Alignment
20 0.45208311 440 cvpr-2013-Tracking People and Their Objects
topicId topicWeight
[(10, 0.157), (16, 0.013), (26, 0.293), (33, 0.237), (67, 0.083), (69, 0.055), (80, 0.014), (87, 0.072)]
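The page does not say how the simValue scores are derived from these topic weights. As a minimal sketch, assuming the scores are a cosine similarity over sparse topic vectors (an assumption, not something the source confirms), the comparison could look like the following. The function name `cosine_similarity` and the second paper's weights are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as {topicId: weight}."""
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Topic weights listed above for this paper (topicId -> topicWeight).
this_paper = dict([(10, 0.157), (16, 0.013), (26, 0.293), (33, 0.237),
                   (67, 0.083), (69, 0.055), (80, 0.014), (87, 0.072)])

# Hypothetical weights for another paper, invented purely for illustration.
other_paper = {26: 0.31, 33: 0.20, 45: 0.12}

score = cosine_similarity(this_paper, other_paper)
```

Storing only the non-zero topics keeps the comparison cheap when each paper touches a handful of topics, which matches the sparse list shown above.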
simIndex simValue paperId paperTitle
1 0.92294079 280 cvpr-2013-Maximum Cohesive Grid of Superpixels for Fast Object Localization
Author: Liang Li, Wei Feng, Liang Wan, Jiawan Zhang
Abstract: This paper addresses a challenging problem of regularizing arbitrary superpixels into an optimal grid structure, which may significantly extend current low-level vision algorithms by allowing them to use superpixels (SPs) as conveniently as pixels. For this purpose, we aim at constructing a maximum cohesive SP-grid, which is composed of real nodes, i.e. SPs, and dummy nodes that are meaningless in the image with only position-taking function in the grid. For a given formation of image SPs and proper number of dummy nodes, we first dynamically align them into a grid based on the centroid localities of SPs. We then define the SP-grid coherence as the sum of edge weights, with SP locality and appearance encoded, along all direct paths connecting any two nearest neighboring real nodes in the grid. We finally maximize the SP-grid coherence via cascade dynamic programming. Our approach can take the regional objectness as an optional constraint to produce more semantically reliable SP-grids. Experiments on object localization show that our approach outperforms state-of-the-art methods in terms of both detection accuracy and speed. We also find that with the same searching strategy and features, object localization at SP-level is about 100-500 times faster than pixel-level, with usually better detection accuracy.
2 0.92158055 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration
Author: Adrien Bartoli, Toby Collins
Abstract: It has been shown that a surface deforming isometrically can be reconstructed from a single image and a template 3D shape. Methods from the literature solve this problem efficiently. However, they all assume that the camera model is calibrated, which drastically limits their applicability. We propose (i) a general variational framework that applies to (calibrated and uncalibrated) general camera models and (ii) self-calibrating 3D reconstruction algorithms for the weak-perspective and full-perspective camera models. In the former case, our algorithm returns the normal field and camera's scale factor. In the latter case, our algorithm returns the normal field, depth and camera's focal length. Our algorithms are the first to achieve deformable 3D reconstruction including camera self-calibration. They apply to much more general setups than existing methods. Experimental results on simulated and real data show that our algorithms give results with the same level of accuracy as existing methods (which use the true focal length) on perspective images, and correctly find the normal field on affine images for which the existing methods fail.
3 0.91395718 440 cvpr-2013-Tracking People and Their Objects
Author: Tobias Baumgartner, Dennis Mitzel, Bastian Leibe
Abstract: Current pedestrian tracking approaches ignore important aspects of human behavior. Humans are not moving independently, but they closely interact with their environment, which includes not only other persons, but also different scene objects. Typical everyday scenarios include people moving in groups, pushing child strollers, or pulling luggage. In this paper, we propose a probabilistic approach for classifying such person-object interactions, associating objects to persons, and predicting how the interaction will most likely continue. Our approach relies on stereo depth information in order to track all scene objects in 3D, while simultaneously building up their 3D shape models. These models and their relative spatial arrangement are then fed into a probabilistic graphical model which jointly infers pairwise interactions and object classes. The inferred interactions can then be used to support tracking by recovering lost object tracks. We evaluate our approach on a novel dataset containing more than 15,000 frames of person-object interactions in 325 video sequences and demonstrate good performance in challenging real-world scenarios.
4 0.89070189 281 cvpr-2013-Measures and Meta-Measures for the Supervised Evaluation of Image Segmentation
Author: Jordi Pont-Tuset, Ferran Marques
Abstract: This paper tackles the supervised evaluation of image segmentation algorithms. First, it surveys and structures the measures used to compare the segmentation results with a ground truth database; and proposes a new measure: the precision-recall for objects and parts. To compare the goodness of these measures, it defines three quantitative meta-measures involving six state of the art segmentation methods. The meta-measures consist in assuming some plausible hypotheses about the results and assessing how well each measure reflects these hypotheses. As a conclusion, this paper proposes the precision-recall curves for boundaries and for objects-and-parts as the tool of choice for the supervised evaluation of image segmentation. We make the datasets and code of all the measures publicly available.
5 0.8887558 152 cvpr-2013-Exemplar-Based Face Parsing
Author: Brandon M. Smith, Li Zhang, Jonathan Brandt, Zhe Lin, Jianchao Yang
Abstract: In this work, we propose an exemplar-based face image segmentation algorithm. We take inspiration from previous works on image parsing for general scenes. Our approach assumes a database of exemplar face images, each of which is associated with a hand-labeled segmentation map. Given a test image, our algorithm first selects a subset of exemplar images from the database. Our algorithm then computes a nonrigid warp for each exemplar image to align it with the test image. Finally, we propagate labels from the exemplar images to the test image in a pixel-wise manner, using trained weights to modulate and combine label maps from different exemplars. We evaluate our method on two challenging datasets and compare with two face parsing algorithms and a general scene parsing algorithm. We also compare our segmentation results with contour-based face alignment results; that is, we first run the alignment algorithms to extract contour points and then derive segments from the contours. Our algorithm compares favorably with all previous works on all datasets evaluated.
same-paper 6 0.88583672 311 cvpr-2013-Occlusion Patterns for Object Class Detection
7 0.82086337 88 cvpr-2013-Compressible Motion Fields
8 0.82074535 353 cvpr-2013-Relative Hidden Markov Models for Evaluating Motion Skill
9 0.79730737 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
10 0.79462779 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
11 0.79173625 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
12 0.78946084 414 cvpr-2013-Structure Preserving Object Tracking
13 0.7881152 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
14 0.78366941 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking
15 0.78045803 465 cvpr-2013-What Object Motion Reveals about Shape with Unknown BRDF and Lighting
16 0.77967846 325 cvpr-2013-Part Discovery from Partial Correspondence
17 0.77947688 424 cvpr-2013-Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure
18 0.77832305 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
19 0.77708501 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
20 0.77662694 96 cvpr-2013-Correlation Filters for Object Alignment