iccv iccv2013 iccv2013-190 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool
Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.
Reference: text
sentIndex sentText sentNum sentScore
1 Since training classifiers is expensive, only a handful are typically trained. [sent-3, score-0.49]
2 By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. [sent-6, score-0.581]
3 While the detection quality has constantly improved over recent years, state-of-the-art methods struggle to detect pedestrians that are far away (small in the image), in unusual poses, or occluded [11]. [sent-10, score-0.51]
4 A common practice to maximize the detection of occluded objects is to train a set of occlusion-specific classifiers, one classifier for each type (e.g. [sent-14, score-0.551]
5 occlusion from the left) and for each level of occlusion. [sent-16, score-0.416]
6 Since training is costly, only a limited number (3~5) of such classifiers tend to be trained. [sent-17, score-0.49]
7 Starting from one biased classifier trained for full-body detection, we reuse training time operations to efficiently build a set of occlusion specific classifiers. [sent-20, score-0.898]
8 Figure 1: Motivation: to handle frequently occurring occlusions, we train many occlusion-specific classifiers. Panel (a): Training many occlusion-specific classifiers is costly. [sent-22, score-1.197]
9 time (one order of magnitude) enables training classifiers for all amounts of occlusion exhaustively, at a fraction of the cost of training a standard detector. [sent-23, score-0.809]
10 Our occlusion classifiers reach 97 % of the performance of a brute-force approach, while requiring only 8 % of the training time. [sent-24, score-0.932]
11 At test time, feature sharing among the occlusion-specific classifiers yields a sub-linear growth in cost. [sent-25, score-0.465]
12 Using our exhaustive set of classifiers provides better performance than using only a sparse set of individually trained occlusion-specific classifiers. [sent-26, score-0.468]
13 Overall, two common approaches exist: Training multiple classifiers. Statistics on pedestrian occlusion show that a few occlusion types (from the bottom, right, and left) cover more than 95 % of cases [7]. [sent-32, score-1.258]
14 Thus, it has been proposed to train a small set of classifiers, each one for a specific occlusion [19]. [sent-33, score-0.409]
15 In contrast, when training a classifier for each occlusion type and level the feature extraction focuses on the visible area, thus enabling improved detection. [sent-46, score-0.746]
16 We expect that, independent of object class, a detector trained for a specific occlusion will surpass “cutting down” a detector that assumes full visibility (see figure 2a). [sent-47, score-0.528]
17 We present, for the first time, experiments quantifying the performance of the Integral Channel Features detector (ChnFtrs) [6] in the presence of various amounts of occlusion (§5). [sent-59, score-0.429]
18 First we briefly describe the ChnFtrs detector (section 2), and its poor performance under occlusion (section 3). [sent-71, score-0.546]
19 Integral channel features classifier. As a base classifier we use our implementation of the Integral Channel Features (ChnFtrs) detector [6], similar in spirit to the work of Viola and Jones [16] (building upon the open source implementation of [1]). [sent-80, score-0.516]
20 The ChnFtrs detector is based on discrete Adaboost, using depth-2 decision trees as weak classifiers. [sent-83, score-0.555]
21 At each iteration of the training procedure a weak classifier must be built. [sent-85, score-0.661]
22 All our models are composed of 2 000 weak classifiers and are trained using two bootstrapping stages. [sent-92, score-0.847]
23 Classifiers for different occlusion levels. In this paper we consider the most frequent types of pedestrian occlusions: occlusions from the bottom and right/left. [sent-95, score-0.737]
24 For each type we build a set of occlusion-specific classifiers in the range of 0 % to 50 % occlusion, see figure 1a. [sent-96, score-0.442]
25 Naive approach. The simplest approach to construct a set of occlusion-specific classifiers is to train one full-body classifier, and then “cut it” for each occlusion level: removing all weak classifiers with nodes whose rectangular regions overlap the occluded area (see figure 2a). [sent-102, score-1.886]
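The “cutting” step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `WeakClassifier` container, the rectangle convention (exclusive maximum coordinates) and the helper names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x0, y0, x1, y1) in model-window pixels, x1/y1 exclusive (assumed convention)
Rect = Tuple[int, int, int, int]


@dataclass
class WeakClassifier:
    rects: List[Rect]  # regions read by the nodes of one decision tree
    weight: float      # Adaboost weight of this weak classifier


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two rectangles share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def bottom_occlusion(width: int, height: int, level: float) -> Rect:
    """Occluded area for a bottom occlusion covering `level` of the height."""
    first_occluded_row = height - int(round(height * level))
    return (0, first_occluded_row, width, height)


def cut_classifier(weak: List[WeakClassifier], occluded: Rect) -> List[WeakClassifier]:
    """Keep only weak classifiers whose node regions all avoid the occluded area."""
    return [w for w in weak if not any(overlaps(r, occluded) for r in w.rects)]
```

For a 64×128 model and 25 % bottom occlusion, every weak classifier that reads any pixel in rows 96 to 127 is dropped; the rest are kept unchanged.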
26 In figure 2b we show the detection quality for each of these “naive” classifiers (see section 5 for evaluation method). [sent-104, score-0.605]
27 It can be observed that quality drops drastically as occlusion increases (miss-rate is in log scale). [sent-105, score-0.493]
28 The performance drop can be explained by the number of weak classifiers left for a given level of occlusion. [sent-108, score-0.868]
29 The quality of the detector is correlated with the number of weak classifiers. [sent-109, score-0.57]
30 In figure 2c we present the number of weak classifiers as a function of the level of occlusion. [sent-110, score-0.828]
31 It can be observed that already at 20 % occlusion more than 50 % of the weak classifiers have been lost (see corresponding illustration 3a). [sent-111, score-1.164]
32 Scrutinising the learned models shows that the regions used by weak classifiers are well distributed across the model. [sent-112, score-0.773]
33 The exponential drop in weak classifiers indicates that most span a large part of the object height, i. [sent-113, score-0.843]
34 For each occlusion level a new classifier is trained from scratch, restricted to only use the visible part. [sent-120, score-0.651]
35 By construction, each occlusion-specific classifier will have selected the best weak classifiers for the task. [sent-121, score-0.977]
36 Training the 17 classifiers (1 full-body + 16 occlusion levels) takes more than 18 hours. [sent-124, score-0.764]
37 Fast training of occlusion-specific classifiers. We propose a fast training method for occlusion-specific classifiers. [sent-128, score-0.577]
38 Figure 3: Losses in the number of weak classifiers lead to losses in classification quality. Panel (d): Franken-classifiers. [sent-405, score-0.805]
39 A naive approach degrades rapidly in the presence of occlusion (figure 3a). [sent-406, score-0.516]
40 To cope with these issues, we propose to bias the classifier training towards a distribution of weak classifiers more suitable for generating (“cutting”) occlusion-specific classifiers. [sent-420, score-1.105]
41 This will result in the weak classifiers being more concentrated in the non-occluded areas than when training a non-constrained classifier (as in section 3. [sent-421, score-1.064]
42 The key insight of this work is that it is possible to change the spatial distribution of the regions selected by the weak classifiers without a significant quality drop. [sent-423, score-0.502]
43 In the next iteration Adaboost selects the best node with respect to the previous weak classifiers, thus selecting a slightly worse weak classifier in one stage does not necessarily imply that the final classifier will have worse performance. [sent-426, score-1.271]
44 To handle bottom occlusions we want to bias our weak classifiers upwards. [sent-427, score-1.021]
45 A single parameter β is used to trade off the weak classifiers' position bias against the quality of the resulting detector. [sent-428, score-0.946]
46 The learned biased classifier will be cut in a similar manner as the naive approach. [sent-443, score-0.585]
47 Adaboost learns a linear combination of weak classifiers; its learning procedure is sensitive to the ordering of the selected weak classifiers. [sent-444, score-0.74]
48 When removing weak classifiers based on a geometric criterion, they will be removed at arbitrary positions in the original classifier sequence. [sent-445, score-0.977]
49 To improve the quality of the remaining strong classifier, we reset the weights by applying the Adaboost algorithm over the remaining weak classifiers sequence. [sent-447, score-0.93]
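One plausible reading of “resetting the weights” is to rerun the discrete Adaboost weight computation over the surviving weak classifiers, in their original order, without searching for new ones. The sketch below assumes that reading; the function name and the input layout (precomputed ±1 outputs per weak classifier) are illustrative choices, not taken from the paper.

```python
import math
from typing import List, Sequence


def reset_weights(predictions: Sequence[Sequence[int]],
                  labels: Sequence[int]) -> List[float]:
    """Recompute discrete Adaboost weights for a fixed weak classifier sequence.

    predictions[t][i] is the +1/-1 output of the t-th remaining weak
    classifier on training sample i; labels[i] is the +1/-1 ground truth.
    """
    n = len(labels)
    sample_w = [1.0 / n] * n  # start from uniform sample weights
    alphas = []
    for h in predictions:
        # weighted error of this weak classifier under the current weights
        err = sum(w for w, hi, yi in zip(sample_w, h, labels) if hi != yi)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1.0 - err) / err)
        alphas.append(alpha)
        # emphasise the samples this weak classifier got wrong, then normalise
        sample_w = [w * math.exp(-alpha * yi * hi)
                    for w, hi, yi in zip(sample_w, h, labels)]
        total = sum(sample_w)
        sample_w = [w / total for w in sample_w]
    return alphas
```

Because no candidate nodes are generated or ranked, this pass only needs the stored responses of the kept weak classifiers, which is consistent with the negligible cost reported for the cutting and resetting step.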
50 Figure 2c shows the obtained node distribution for the biased classifier (versus the naive approach). [sent-448, score-0.63]
51 Where the weak classifiers are unconstrained, the biased classifier shows the same exponential behaviour as the naive classifier. [sent-449, score-1.393]
52 Between 0% and 50% occlusion level, the curve has a roughly linear behaviour (see also illustration 3b). [sent-450, score-0.43]
53 Costs and benefits. Training a biased classifier has essentially the same cost as the normal approach. [sent-452, score-0.452]
54 Using the bias, many more weak classifiers remain at each occlusion level (up to 4× more); this significantly boosts classification accuracy. [sent-454, score-1.189]
55 Cutting, revisiting and resetting the weights of the weak classifiers has a negligible cost in comparison to the overall training. [sent-455, score-0.902]
56 Figure 4: Comparison of the different approaches to handle occlusion. Panels: (a) bottom occlusion, (b) right occlusion; horizontal axis: occlusion level. [sent-456, score-1.554]
57 Filled-up classifiers. The quality of the ChnFtrs detector depends on the number of weak classifiers. [sent-461, score-1.09]
58 Although the biased classifier presented in the previous section presents a significant improvement over the naive approach, the number of weak classifiers still falls as the occlusion level increases. [sent-462, score-1.736]
59 To further improve the situation we propose to train a single biased classifier, “cut it” for each occlusion level, and then extend each occlusion-specific classifier by training additional weak classifiers until reaching the same number of weak classifiers as in the full-body detector. [sent-463, score-2.285]
60 Although having the same number of weak classifiers does not equate to reaching an equal quality, we use this measure as a proxy. [sent-465, score-0.812]
61 Similar to the biased case, after cutting a classifier the weights of the remaining weak classifiers need to be reset. [sent-466, score-1.241]
62 After resetting the weights, the classifier is then extended using standard Adaboost training until we reach the desired amount of weak classifiers. [sent-468, score-0.786]
63 Costs and benefits. The ChnFtrs classifier training is done in three stages (see appendix A). [sent-469, score-0.493]
64 Creating the set of candidate nodes dominates the time of this last stage, thus the cost of filling up the classifiers is roughly linear in the number of added weak classifiers. [sent-471, score-0.977]
65 A brute-force approach would require training 16 × 2 000 = 32 000 weak classifiers. [sent-473, score-0.457]
66 As we will show in section 5, the filled-up classifiers reach 97 % of the quality of the brute-force approach. [sent-477, score-0.484]
67 Franken-classifiers. The filled-up approach boosts quality, but still requires training a significant number of weak classifiers. [sent-480, score-0.418]
68 We can further decrease the training time by generating the occlusion-specific classifiers in a recursive way (see figure 3d). [sent-481, score-0.624]
69 Similar to the filled-up classifiers, we start from the full-body biased classifier and remove weak classifiers to generate the first occlusion classifier (least occluded). [sent-482, score-1.73]
70 The additional weak classifiers are learned without spatial bias. [sent-483, score-0.773]
71 Given the full classifier for the first occlusion level, we proceed to cut it using the second occlusion level. [sent-484, score-0.988]
72 This process is repeated until the last occlusion level is reached. [sent-486, score-0.416]
73 Because of the recursive training, the classifier for the last (and most drastic) occlusion level will potentially have weak classifiers originating from all previous occlusion levels (see figure 3d). [sent-487, score-1.817]
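The recursive cut-and-fill procedure can be sketched as a loop over occlusion levels. This is a simplified model, not the authors' code: each weak classifier is reduced to its vertical extent, only bottom occlusion is handled, weight resetting is omitted, and `train_more` is a hypothetical callback standing in for standard Adaboost training on the visible area.

```python
from typing import Callable, List, Tuple

# vertical extent of a weak classifier: (top_row, bottom_row), bottom exclusive
Weak = Tuple[int, int]


def franken_chain(full_body: List[Weak],
                  visible_heights: List[int],
                  target: int,
                  train_more: Callable[[int, int], List[Weak]]) -> List[List[Weak]]:
    """Build one classifier per occlusion level, each derived from the previous one.

    visible_heights lists the visible rows per level, from least to most
    occluded. train_more(visible, count) must return `count` new weak
    classifiers that only use the top `visible` rows.
    """
    chain = []
    current = list(full_body)
    for visible in visible_heights:
        # cut: drop every weak classifier that reaches into the occluded rows
        kept = [w for w in current if w[1] <= visible]
        # fill up: train only the missing weak classifiers for this level
        added = train_more(visible, max(0, target - len(kept)))
        current = kept + added
        chain.append(current)
    return chain
```

Each level inherits both the survivors of the biased full-body classifier and the weak classifiers added at earlier levels that still fit the visible area, so fewer new ones have to be trained than when every level is filled up from the full-body classifier directly.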
74 Costs and benefits. Compared to the filled-up classifier we further reduce the number of weak classifiers to be trained. [sent-490, score-0.977]
75 fiers for the brute-force approach, or 10 000 for the filled-up case; our experiments show that we only need to add about 6 000 weak classifiers to the last training stage. [sent-492, score-0.955]
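The weak classifier counts quoted in the text can be put side by side with a little arithmetic. The numbers below are the ones stated above (16 levels of 2 000 weak classifiers, about 10 000 for filled-up, about 6 000 for Franken-classifiers); the variable names are illustrative, and the ratios are counts only, since the text notes that wall time is not directly proportional to the weak classifier count.

```python
WEAK_PER_MODEL = 2_000   # weak classifiers in one full model (from the text)
OCCLUSION_LEVELS = 16    # occlusion-specific classifiers per occlusion type

brute_force = OCCLUSION_LEVELS * WEAK_PER_MODEL  # each level trained from scratch
filled_up = 10_000                               # approximate count quoted in the text
franken = 6_000                                  # approximate count quoted in the text


def share_of_brute_force(added: int) -> float:
    """Fraction of the brute-force weak classifier count that must be trained."""
    return added / brute_force


print(brute_force)                      # 32000
print(share_of_brute_force(filled_up))  # 0.3125
print(share_of_brute_force(franken))    # 0.1875
```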
76 Independent Franken-classifiers evaluation. In the previous sections we presented different methods to obtain occlusion classifiers. [sent-497, score-0.409]
77 Evaluation method. We first evaluate our occlusion-specific classifiers independently and show their performance for each occlusion type and level. [sent-503, score-0.604]
78 Most pedestrian datasets contain only few annotated occluded pedestrians, for instance, Caltech USA [7] has only 100 pedestrians in the “partially occluded” range. [sent-505, score-0.441]
79 the classifier for pedestrians occluded by 50 % from the bottom, only contains features located in the upper part of the test window. [sent-511, score-0.539]
80 Figure 4 summarizes the result of all 764 trained classifiers over 1245 evaluations on the INRIA test set. [sent-519, score-0.499]
81 For a given level of occlusion we use the closest classifier not overlapping with the occlusion. [sent-521, score-0.62]
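The selection rule above can be read as: among the trained occlusion levels, pick the smallest one that is at least as large as the actual occlusion, so that the chosen classifier never reads occluded pixels. The sketch below assumes that reading and represents levels as integer percentages; the function name is hypothetical.

```python
import bisect
from typing import Sequence


def pick_classifier_level(trained_levels: Sequence[int], occlusion: int) -> int:
    """Return the closest trained occlusion level (in percent) that covers `occlusion`.

    A classifier trained for level L ignores the occluded L percent of the
    window, so any L >= occlusion avoids the occluded area; the smallest such
    L keeps the most visible evidence.
    """
    levels = sorted(trained_levels)
    i = bisect.bisect_left(levels, occlusion)
    if i == len(levels):
        raise ValueError("no trained classifier covers this occlusion level")
    return levels[i]
```

With classifiers trained for 0 %, 25 % and 50 % occlusion, a pedestrian occluded by 30 % would be scored by the 50 % classifier, while one occluded by exactly 25 % would use the 25 % classifier.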
82 Training time computational cost. Table 1 relates the quality of the occlusion classifiers to the measured training time (wall time). [sent-527, score-1.097]
83 Due to this, the wall time is not directly proportional to the weak classifiers count. [sent-530, score-0.836]
84 Given our results, the biased classifiers should be preferred over the naive approach, as the quality for all occlusion levels is much better, while the training time remains the same. [sent-532, score-1.389]
85 Training 3 or 5 brute-force classifiers takes more time than the proposed approaches, while still having lower quality. [sent-534, score-0.43]
86 Importantly, using 17 models or 3 models, corresponds to exactly the same amount of window evaluations, and thus to the exact same evaluation cost (assuming all models have the same number of weak classifiers). [sent-542, score-0.43]
87 Joint Franken-classifiers evaluation. In the previous section we evaluated our Franken-classifiers in the scenario where occlusions are known, but the presence of a pedestrian on such occlusion boundaries is unknown. [sent-548, score-0.653]
88 As full-body classifier we use the biased classifier for the bottom occlusion. [sent-551, score-0.644]
89 Merging detections. Our merging approach is based on two principles: “detectors with higher occlusion levels have worse quality”, and “the Franken-classifiers should complement the full-body detections”. [sent-555, score-0.477]
90 Since occlusion detectors will trigger on fully visible pedestrians, for each zero occlusion detection we remove all overlapping detections, and increase its score by adding up the score of the overlapping bounding boxes. [sent-558, score-0.878]
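The merging rule for zero-occlusion detections can be sketched as follows. The overlap criterion (intersection over union with a 0.5 threshold), the tuple layout and the function names are assumptions made for the example; the text only states that overlapping detections are removed and their scores added to the full-body detection.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
Detection = Tuple[Box, float, int]       # box, score, occlusion level in percent


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_detections(detections: List[Detection],
                     min_overlap: float = 0.5) -> List[Detection]:
    """Fold overlapping occlusion-specific detections into full-body detections."""
    full = [d for d in detections if d[2] == 0]
    partial = [d for d in detections if d[2] != 0]
    absorbed = set()
    merged = []
    for box, score, level in full:
        for i, (pbox, pscore, _) in enumerate(partial):
            if i not in absorbed and iou(box, pbox) >= min_overlap:
                score += pscore  # overlapping detection reinforces the full-body one
                absorbed.add(i)  # and is removed from the output
        merged.append((box, score, level))
    # occlusion-specific detections with no full-body counterpart are kept as-is
    merged.extend(p for i, p in enumerate(partial) if i not in absorbed)
    return merged
```

Occlusion-specific detections that survive this step are the ones that complement the full-body detector, which matches the second principle stated above.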
91 Detection quality. In this section we use as base classifier the SquaresChnFtrs [2]. [sent-562, score-0.453]
92 In the supplementary material we include the additional Caltech occlusion ranges, and show that 33 classifiers improves over using only 7. [sent-571, score-0.791]
93 Similar to how we can train many occlusion models in a fraction of the time of a full model, we can also evaluate our 33 classifiers much faster than independently. [sent-577, score-0.84]
94 In our joint experiment, the total number of unique weak classifiers to evaluate sums up to ∼14 000; this is significantly less than 2 000 × 33 = 66 000. [sent-580, score-0.773]
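The saving from shared weak classifiers can be illustrated by evaluating each unique weak classifier once per window and reusing its response across models. This is a toy sketch: models are lists of `(weak_id, weight)` pairs and `evaluate_weak` is a hypothetical callback; the actual detector also uses a soft cascade, which is not modelled here.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Model = Sequence[Tuple[Hashable, float]]  # (weak classifier id, model-specific weight)


def evaluate_models(models: Sequence[Model],
                    window: object,
                    evaluate_weak: Callable[[Hashable, object], float]
                    ) -> Tuple[List[float], int]:
    """Score one window with every model, computing each weak response once."""
    cache: Dict[Hashable, float] = {}
    scores = []
    for model in models:
        total = 0.0
        for weak_id, weight in model:
            if weak_id not in cache:
                cache[weak_id] = evaluate_weak(weak_id, window)
            # weights differ per model (they were reset), responses are shared
            total += weight * cache[weak_id]
        scores.append(total)
    return scores, len(cache)
```

The second return value is the number of weak classifiers actually evaluated, which corresponds to the unique count (about 14 000 in the text) rather than the sum over all models (66 000).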
95 Given our training procedure, all models in each occlusion type share at least ~1000 features (number of remaining features at 50 % occlusion level, figure 2c). [sent-585, score-0.848]
96 When using a soft cascade over ChnFtrs [5, 1], on average as few as ∼20 weak learners are evaluated per detection window. [sent-586, score-0.557]
97 We have shown that a naive approach to handle occlusions provides poor quality, and that occlusion-specific classifiers can perform significantly better. [sent-593, score-0.717]
98 A proof of concept usage of the Franken-classifiers shows that we can reach top quality detection on challenging pedestrian datasets. [sent-595, score-0.449]
99 Figure 5: Improved detection quality when using occlusion-specific classifiers. Panels: (a) INRIA, (b) ETH reasonable, (c) Caltech USA partially occluded subset; horizontal axis: false positives per image. [sent-598, score-0.427]
100 The full classifier consists of 2 000 weak classifiers. [sent-723, score-0.574]
wordName wordTfidf (topN-words)
[('classifiers', 0.403), ('weak', 0.37), ('occlusion', 0.361), ('chnft', 0.268), ('classifier', 0.204), ('biased', 0.188), ('occluded', 0.165), ('occlusions', 0.159), ('naive', 0.155), ('pedestrians', 0.143), ('pedestrian', 0.133), ('quality', 0.132), ('rs', 0.117), ('training', 0.087), ('node', 0.083), ('reach', 0.081), ('occlusionspecific', 0.08), ('caltech', 0.077), ('detection', 0.07), ('adaboost', 0.068), ('detector', 0.068), ('cost', 0.06), ('inria', 0.059), ('detectors', 0.059), ('benenson', 0.057), ('stages', 0.056), ('nodes', 0.056), ('level', 0.055), ('bruteforce', 0.054), ('filledup', 0.054), ('frankenstein', 0.054), ('schnft', 0.054), ('mathias', 0.051), ('cutting', 0.051), ('train', 0.048), ('detections', 0.048), ('sections', 0.048), ('bottom', 0.048), ('consult', 0.044), ('resetting', 0.044), ('behaviour', 0.043), ('bootstrapping', 0.043), ('fiers', 0.041), ('bias', 0.041), ('drop', 0.04), ('stage', 0.04), ('channel', 0.04), ('achievable', 0.039), ('type', 0.039), ('integral', 0.039), ('doll', 0.039), ('reaching', 0.039), ('cut', 0.038), ('costs', 0.038), ('hz', 0.038), ('timofte', 0.038), ('evaluations', 0.038), ('handling', 0.036), ('levels', 0.036), ('wall', 0.036), ('suppression', 0.036), ('sharing', 0.035), ('candidate', 0.035), ('exhaustive', 0.034), ('usage', 0.033), ('ranking', 0.032), ('losses', 0.032), ('borders', 0.032), ('partially', 0.032), ('merging', 0.032), ('cascades', 0.031), ('trained', 0.031), ('exponential', 0.03), ('percent', 0.03), ('rounds', 0.03), ('occluding', 0.03), ('channels', 0.03), ('appendix', 0.029), ('eth', 0.029), ('exhaustively', 0.029), ('wojek', 0.028), ('positives', 0.028), ('schiele', 0.028), ('time', 0.027), ('supplementary', 0.027), ('bounding', 0.027), ('recursive', 0.027), ('test', 0.027), ('roughly', 0.026), ('weights', 0.025), ('fraction', 0.025), ('maximize', 0.025), ('proceed', 0.024), ('costly', 0.024), ('setup', 0.024), ('otrfai', 0.024), ('radu', 0.024), ('tood', 0.024), ('upfront', 0.024), 
('missrate', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 190 iccv-2013-Handling Occlusions with Franken-Classifiers
Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool
Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.
2 0.31760961 338 iccv-2013-Randomized Ensemble Tracking
Author: Qinxun Bai, Zheng Wu, Stan Sclaroff, Margrit Betke, Camille Monnier
Abstract: We propose a randomized ensemble algorithm to model the time-varying appearance of an object for visual tracking. In contrast with previous online methods for updating classifier ensembles in tracking-by-detection, the weight vector that combines weak classifiers is treated as a random variable and the posterior distribution for the weight vector is estimated in a Bayesian manner. In essence, the weight vector is treated as a distribution that reflects the confidence among the weak classifiers used to construct and adapt the classifier ensemble. The resulting formulation models the time-varying discriminative ability among weak classifiers so that the ensembled strong classifier can adapt to the varying appearance, backgrounds, and occlusions. The formulation is tested in a tracking-by-detection implementation. Experiments on 28 challenging benchmark videos demonstrate that the proposed method can achieve results comparable to and often better than those of state-of-the-art approaches.
3 0.24154803 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection
Author: Xingyu Zeng, Wanli Ouyang, Xiaogang Wang
Abstract: Cascaded classifiers have been widely used in pedestrian detection and achieved great success. These classifiers are trained sequentially without joint optimization. In this paper, we propose a new deep model that can jointly train multi-stage classifiers through several stages of backpropagation. It keeps the score map output by a classifier within a local region and uses it as contextual information to support the decision at the next stage. Through a specific design of the training strategy, this deep architecture is able to simulate the cascaded classifiers by mining hard samples to train the network stage-by-stage. Each classifier handles samples at a different difficulty level. Unsupervised pre-training and specifically designed stage-wise supervised training are used to regularize the optimization problem. Both theoretical analysis and experimental results show that the training strategy helps to avoid overfitting. Experimental results on three datasets (Caltech, ETH and TUD-Brussels) show that our approach outperforms the state-of-the-art approaches.
4 0.2413549 242 iccv-2013-Learning People Detectors for Tracking in Crowded Scenes
Author: Siyu Tang, Mykhaylo Andriluka, Anton Milan, Konrad Schindler, Stefan Roth, Bernt Schiele
Abstract: People tracking in crowded real-world scenes is challenging due to frequent and long-term occlusions. Recent tracking methods obtain the image evidence from object (people) detectors, but typically use off-the-shelf detectors and treat them as black box components. In this paper we argue that for best performance one should explicitly train people detectors on failure cases of the overall tracker instead. To that end, we first propose a novel joint people detector that combines a state-of-the-art single person detector with a detector for pairs of people, which explicitly exploits common patterns of person-person occlusions across multiple viewpoints that are a frequent failure case for tracking in crowded scenes. To explicitly address remaining failure modes of the tracker we explore two methods. First, we analyze typical failures of trackers and train a detector explicitly on these cases. And second, we train the detector with the people tracker in the loop, focusing on the most common tracker failures. We show that our joint multi-person detector significantly improves both detection accuracy as well as tracker performance, improving the state-of-the-art on standard benchmarks.
5 0.22785595 269 iccv-2013-Modeling Occlusion by Discriminative AND-OR Structures
Author: Bo Li, Wenze Hu, Tianfu Wu, Song-Chun Zhu
Abstract: Occlusion presents a challenge for detecting objects in real world applications. To address this issue, this paper models object occlusion with an AND-OR structure which (i) represents occlusion at semantic part level, and (ii) captures the regularities of different occlusion configurations (i.e., the different combinations of object part visibilities). This paper focuses on car detection on street. Since annotating part occlusion on real images is time-consuming and error-prone, we propose to learn the AND-OR structure automatically using synthetic images of CAD models placed at different relative positions. The model parameters are learned from real images under the latent structural SVM (LSSVM) framework. In inference, an efficient dynamic programming (DP) algorithm is utilized. In experiments, we test our method on both car detection and car view estimation. Experimental results show that (i) our CAD simulation strategy is capable of generating occlusion patterns for real scenarios, (ii) the proposed AND-OR structure model is effective for modeling occlusions, which outperforms the deformable part-based model (DPM) [6, 10] in car detection on both our self-collected street-parking dataset and the Pascal VOC 2007 car dataset [4], (iii) the learned model is on-par with the state-of-the-art methods on car view estimation tested on two public datasets.
6 0.18740501 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection
7 0.18033423 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve
8 0.17085725 256 iccv-2013-Locally Affine Sparse-to-Dense Matching for Motion and Occlusion Estimation
9 0.14687772 44 iccv-2013-Adapting Classification Cascades to New Domains
10 0.14543813 75 iccv-2013-CoDeL: A Human Co-detection and Labeling Framework
11 0.13761868 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
12 0.12691557 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification
13 0.12422026 220 iccv-2013-Joint Deep Learning for Pedestrian Detection
14 0.12245706 233 iccv-2013-Latent Task Adaptation with Large-Scale Hierarchies
15 0.11488511 157 iccv-2013-Fast Face Detector Training Using Tailored Views
16 0.1127236 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies
17 0.11107802 311 iccv-2013-Pedestrian Parsing via Deep Decompositional Network
18 0.10547625 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction
19 0.10351349 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
20 0.10313691 104 iccv-2013-Decomposing Bag of Words Histograms
topicId topicWeight
[(0, 0.227), (1, 0.026), (2, -0.023), (3, -0.063), (4, 0.135), (5, -0.085), (6, -0.05), (7, 0.114), (8, -0.074), (9, -0.059), (10, -0.015), (11, -0.077), (12, 0.065), (13, -0.063), (14, 0.122), (15, -0.079), (16, -0.029), (17, 0.09), (18, 0.139), (19, 0.201), (20, -0.173), (21, 0.014), (22, -0.107), (23, 0.023), (24, -0.097), (25, -0.082), (26, -0.112), (27, -0.003), (28, -0.036), (29, -0.158), (30, -0.125), (31, 0.028), (32, -0.078), (33, 0.056), (34, -0.005), (35, -0.084), (36, 0.048), (37, 0.065), (38, -0.099), (39, -0.025), (40, 0.065), (41, -0.026), (42, -0.053), (43, 0.161), (44, 0.02), (45, 0.044), (46, 0.061), (47, 0.018), (48, 0.03), (49, -0.083)]
simIndex simValue paperId paperTitle
same-paper 1 0.98665136 190 iccv-2013-Handling Occlusions with Franken-Classifiers
Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool
Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.
2 0.83115608 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve
Author: Sakrapee Paisitkriangkrai, Chunhua Shen, Anton Van Den Hengel
Abstract: Many typical applications of object detection operate within a prescribed false-positive range. In this situation the performance of a detector should be assessed on the basis of the area under the ROC curve over that range, rather than over the full curve, as the performance outside the range is irrelevant. This measure is labelled as the partial area under the ROC curve (pAUC). Effective cascade-based classification, for example, depends on training node classifiers that achieve the maximal detection rate at a moderate false positive rate, e.g., around 40% to 50%. We propose a novel ensemble learning method which achieves a maximal detection rate at a user-defined range of false positive rates by directly optimizing the partial AUC using structured learning. By optimizing for different ranges of false positive rates, the proposed method can be used to train either a single strong classifier or a node classifier forming part of a cascade classifier. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our approach, and we show that it is possible to train state-of-the-art pedestrian detectors using the proposed structured ensemble learning method.
3 0.7473622 338 iccv-2013-Randomized Ensemble Tracking
Author: Qinxun Bai, Zheng Wu, Stan Sclaroff, Margrit Betke, Camille Monnier
Abstract: We propose a randomized ensemble algorithm to model the time-varying appearance of an object for visual tracking. In contrast with previous online methods for updating classifier ensembles in tracking-by-detection, the weight vector that combines weak classifiers is treated as a random variable and the posterior distribution for the weight vector is estimated in a Bayesian manner. In essence, the weight vector is treated as a distribution that reflects the confidence among the weak classifiers used to construct and adapt the classifier ensemble. The resulting formulation models the time-varying discriminative ability among weak classifiers so that the ensembled strong classifier can adapt to the varying appearance, backgrounds, and occlusions. The formulation is tested in a tracking-by-detection implementation. Experiments on 28 challenging benchmark videos demonstrate that the proposed method can achieve results comparable to and often better than those of state-of-the-art approaches.
4 0.73944551 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection
Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe
Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.
5 0.72161227 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection
Author: Xingyu Zeng, Wanli Ouyang, Xiaogang Wang
Abstract: Cascaded classifiers have been widely used in pedestrian detection and achieved great success. These classifiers are trained sequentially without joint optimization. In this paper, we propose a new deep model that can jointly train multi-stage classifiers through several stages of backpropagation. It keeps the score map output by a classifier within a local region and uses it as contextual information to support the decision at the next stage. Through a specific design of the training strategy, this deep architecture is able to simulate the cascaded classifiers by mining hard samples to train the network stage-by-stage. Each classifier handles samples at a different difficulty level. Unsupervised pre-training and specifically designed stage-wise supervised training are used to regularize the optimization problem. Both theoretical analysis and experimental results show that the training strategy helps to avoid overfitting. Experimental results on three datasets (Caltech, ETH and TUD-Brussels) show that our approach outperforms the state-of-the-art approaches.
6 0.70244193 242 iccv-2013-Learning People Detectors for Tracking in Crowded Scenes
7 0.69831729 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection
8 0.66180784 269 iccv-2013-Modeling Occlusion by Discriminative AND-OR Structures
9 0.63887358 211 iccv-2013-Image Segmentation with Cascaded Hierarchical Models and Logistic Disjunctive Normal Networks
10 0.61928678 75 iccv-2013-CoDeL: A Human Co-detection and Labeling Framework
11 0.61775243 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
12 0.61252093 349 iccv-2013-Regionlets for Generic Object Detection
13 0.60849696 44 iccv-2013-Adapting Classification Cascades to New Domains
14 0.53772825 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification
15 0.5256139 220 iccv-2013-Joint Deep Learning for Pedestrian Detection
16 0.51325256 189 iccv-2013-HOGgles: Visualizing Object Detection Features
17 0.50693071 311 iccv-2013-Pedestrian Parsing via Deep Decompositional Network
18 0.50586581 285 iccv-2013-NEIL: Extracting Visual Knowledge from Web Data
19 0.49938768 109 iccv-2013-Detecting Avocados to Zucchinis: What Have We Done, and Where Are We Going?
20 0.49867377 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees
topicId topicWeight
[(2, 0.056), (6, 0.014), (7, 0.015), (12, 0.017), (19, 0.168), (26, 0.107), (31, 0.034), (40, 0.013), (42, 0.119), (48, 0.013), (64, 0.085), (73, 0.031), (89, 0.191), (95, 0.015), (98, 0.036)]
simIndex simValue paperId paperTitle
1 0.90955383 243 iccv-2013-Learning Slow Features for Behaviour Analysis
Author: Lazaros Zafeiriou, Mihalis A. Nicolaou, Stefanos Zafeiriou, Symeon Nikitidis, Maja Pantic
Abstract: A recently introduced latent feature learning technique for time varying dynamic phenomena analysis is the so-called Slow Feature Analysis (SFA). SFA is a deterministic component analysis technique for multi-dimensional sequences that by minimizing the variance of the first order time derivative approximation of the input signal finds uncorrelated projections that extract slowly-varying features ordered by their temporal consistency and constancy. In this paper, we propose a number of extensions in both the deterministic and the probabilistic SFA optimization frameworks. In particular, we derive a novel deterministic SFA algorithm that is able to identify linear projections that extract the common slowest varying features of two or more sequences. In addition, we propose an Expectation Maximization (EM) algorithm to perform inference in a probabilistic formulation of SFA and similarly extend it in order to handle two and more time varying data sequences. Moreover, we demonstrate that the probabilistic SFA (EMSFA) algorithm that discovers the common slowest varying latent space of multiple sequences can be combined with dynamic time warping techniques for robust sequence time-alignment. The proposed SFA algorithms were applied for facial behavior analysis demonstrating their usefulness and appropriateness for this task.
2 0.89132845 418 iccv-2013-The Way They Move: Tracking Multiple Targets with Similar Appearance
Author: Caglayan Dicle, Octavia I. Camps, Mario Sznaier
Abstract: We introduce a computationally efficient algorithm for multi-object tracking by detection that addresses four main challenges: appearance similarity among targets, missing data due to targets being out of the field of view or occluded behind other objects, crossing trajectories, and camera motion. The proposed method uses motion dynamics as a cue to distinguish targets with similar appearance, minimize target mis-identification and recover missing data. Computational efficiency is achieved by using a Generalized Linear Assignment (GLA) coupled with efficient procedures to recover missing data and estimate the complexity of the underlying dynamics. The proposed approach works with tracklets of arbitrary length and does not assume a dynamical model a priori, yet it captures the overall motion dynamics of the targets. Experiments using challenging videos show that this framework can handle complex target motions, non-stationary cameras and long occlusions, on scenarios where appearance cues are not available or poor.
3 0.87689918 371 iccv-2013-Saliency Detection via Absorbing Markov Chain
Author: Bowen Jiang, Lihe Zhang, Huchuan Lu, Chuan Yang, Ming-Hsuan Yang
Abstract: In this paper, we formulate saliency detection via absorbing Markov chain on an image graph model. We jointly consider the appearance divergence and spatial distribution of salient objects and the background. The virtual boundary nodes are chosen as the absorbing nodes in a Markov chain and the absorbed time from each transient node to boundary absorbing nodes is computed. The absorbed time of transient node measures its global similarity with all absorbing nodes, and thus salient objects can be consistently separated from the background when the absorbed time is used as a metric. Since the time from transient node to absorbing nodes relies on the weights on the path and their spatial distance, the background region on the center of image may be salient. We further exploit the equilibrium distribution in an ergodic Markov chain to reduce the absorbed time in the long-range smooth background regions. Extensive experiments on four benchmark datasets demonstrate robustness and efficiency of the proposed method against the state-of-the-art methods.
4 0.86699665 383 iccv-2013-Semi-supervised Learning for Large Scale Image Cosegmentation
Author: Zhengxiang Wang, Rujie Liu
Abstract: This paper introduces the use of semi-supervised learning for large scale image cosegmentation. Different from traditional unsupervised cosegmentation that does not use any segmentation groundtruth, semi-supervised cosegmentation exploits the similarity from both the very limited training image foregrounds, as well as the common object shared between the large number of unsegmented images. This would be a much more practical way to effectively cosegment a large number of related images simultaneously, where previous unsupervised cosegmentation works poorly due to the large variances in appearance between different images and the lack of segmentation groundtruth for guidance in cosegmentation. For semi-supervised cosegmentation in large scale, we propose an effective method by minimizing an energy function, which consists of the inter-image distance, the intra-image distance and the balance term. We also propose an iterative updating algorithm to efficiently solve this energy function, which decomposes the original energy minimization problem into sub-problems, and updates each image alternatively to reduce the number of variables in each subproblem for computation efficiency. Experiment results on iCoseg and Pascal VOC datasets show that the proposed cosegmentation method can effectively cosegment hundreds of images in less than one minute. And our semi-supervised cosegmentation is able to outperform both unsupervised cosegmentation as well as fully supervised single image segmentation, especially when the training data is limited.
5 0.85747826 314 iccv-2013-Perspective Motion Segmentation via Collaborative Clustering
Author: Zhuwen Li, Jiaming Guo, Loong-Fah Cheong, Steven Zhiying Zhou
Abstract: This paper addresses real-world challenges in the motion segmentation problem, including perspective effects, missing data, and unknown number of motions. It first formulates the 3-D motion segmentation from two perspective views as a subspace clustering problem, utilizing the epipolar constraint of an image pair. It then combines the point correspondence information across multiple image frames via a collaborative clustering step, in which tight integration is achieved via a mixed norm optimization scheme. For model selection, we propose an over-segment and merge approach, where the merging step is based on the property of the ℓ1-norm of the mutual sparse representation of two oversegmented groups. The resulting algorithm can deal with incomplete trajectories and perspective effects substantially better than state-of-the-art two-frame and multi-frame methods. Experiments on a 62-clip dataset show the significant superiority of the proposed idea in both segmentation accuracy and model selection.
same-paper 6 0.85595697 190 iccv-2013-Handling Occlusions with Franken-Classifiers
7 0.83238614 150 iccv-2013-Exemplar Cut
8 0.83096647 338 iccv-2013-Randomized Ensemble Tracking
9 0.83073127 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
10 0.83014131 442 iccv-2013-Video Segmentation by Tracking Many Figure-Ground Segments
11 0.82986736 414 iccv-2013-Temporally Consistent Superpixels
12 0.82929301 379 iccv-2013-Semantic Segmentation without Annotating Segments
13 0.82771814 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
14 0.82760268 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
15 0.8271184 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
16 0.82646996 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation
17 0.82567978 121 iccv-2013-Discriminatively Trained Templates for 3D Object Detection: A Real Time Scalable Approach
18 0.82231641 270 iccv-2013-Modeling Self-Occlusions in Dynamic Shape and Appearance Tracking
19 0.82192147 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
20 0.8218357 160 iccv-2013-Fast Object Segmentation in Unconstrained Video