
190 iccv-2013-Handling Occlusions with Franken-Classifiers

Source: pdf

Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van Gool

Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets: INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We also show that the test-time cost grows sub-linearly.

reference text

[1] R. Benenson, M. Mathias, R. Timofte, and L. Van Gool. Pedestrian detection at 100 frames per second. In CVPR, 2012.

[2] R. Benenson, M. Mathias, T. Tuytelaars, and L. Van Gool. Seeking the strongest rigid detector. In CVPR, 2013.

[3] M. Blaschko and C. Lampert. Learning to Localize Objects with Structured Output Regression. In ECCV, 2008.

[4] N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. In CVPR, 2005.

[5] P. Dollár, R. Appel, and W. Kienzle. Crosstalk Cascades for Frame-Rate Pedestrian Detection. In ECCV, 2012.

[6] P. Dollár, Z. Tu, P. Perona, and S. Belongie. Integral Channel Features. In BMVC, 2009.

[7] P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian Detection: An Evaluation of the State of the Art. TPAMI, 2011.

[8] M. Enzweiler, A. Eigenstetter, B. Schiele, and D. Gavrila. Multi-cue pedestrian classification with partial occlusion handling. In CVPR, 2010.

[9] T. Gao, B. Packer, and D. Koller. A Segmentation-aware Object Detection Model with Occlusion Handling. In CVPR, 2011.

[10] R. Girshick, P. Felzenszwalb, and D. McAllester. Object Detection with Grammar Models. In NIPS, 2011.

[11] D. Hoiem, Y. Chodpathumwan, and Q. Dai. Diagnosing Error in Object Detectors. In ECCV, 2012.

[12] M. W. Shelley. Frankenstein; or, the modern Prometheus. 1818.

[13] S. Tang, M. Andriluka, and B. Schiele. Detection and Tracking of Occluded People. In BMVC, 2012.

[14] A. Torralba, K. Murphy, and W. Freeman. Sharing features: efficient boosting procedures for multiclass object detection. In CVPR, 2004.

[15] A. Vedaldi and A. Zisserman. Structured Output Regression for Detection with Partial Truncation. In NIPS, 2009.

[16] P. Viola and M. Jones. Robust Real-Time Face Detection. In IJCV, 2004.

[17] X. Wang, X. Han, and S. Yan. HOG-LBP human detector with partial occlusion handling. In ICCV, 2009.

[18] P. Wohlhart, M. Donoser, P. Roth, and H. Bischof. Detecting Partially Occluded Objects with an Implicit Shape Model Random Field. In ACCV, 2012.

[19] C. Wojek, S. Walk, S. Roth, and B. Schiele. Monocular 3D scene understanding with explicit occlusion reasoning. In CVPR, 2011.

[20] B. Wu and R. Nevatia. Detection of Multiple, Partially Occluded Humans in a Single Image by Bayesian Combination of Edgelet Part Detectors. In ICCV, 2005.

A. Training parameters

Unless otherwise specified, we use the parameters that provided the best classification results in the original paper [6]. The nodes are constructed using a pool of 30 000 candidate regions. The full classifier consists of 2 000 weak classifiers. We train in 3 stages: the first stage randomly samples 5 000 negative samples; the second and third stages use bootstrapping to add 5 000 additional hard negatives each. To be faster and more memory efficient, we shrink the feature channels by a factor of 4 (see [6, addendum]). The model window is of size 64×128 pixels; after shrinking it has size 16×32 pixels. Training time is measured on a desktop machine with an Intel Core i7 870 CPU and an Nvidia GeForce GTX 590 GPU.
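As a quick check on the arithmetic in these training parameters, the values can be collected in a small configuration object. This is only an illustrative sketch: the class name `TrainingParams` and its field and method names are hypothetical and do not come from the authors' code. It assumes the model window is 64×128 pixels and that shrinking divides each dimension by the shrink factor.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingParams:
    """Hypothetical container for the appendix values (names are illustrative)."""
    candidate_regions: int = 30_000    # pool of candidate regions for the nodes
    weak_classifiers: int = 2_000      # weak classifiers in the full classifier
    stages: int = 3                    # one random stage + two bootstrapping stages
    negatives_per_stage: int = 5_000   # negatives sampled or added in each stage
    shrink_factor: int = 4             # shrinking applied to the feature channels
    window_width: int = 64             # model window width in pixels
    window_height: int = 128           # model window height in pixels

    def shrunk_window(self) -> Tuple[int, int]:
        # Assumption: shrinking divides both window dimensions by the factor.
        return (self.window_width // self.shrink_factor,
                self.window_height // self.shrink_factor)

    def total_negatives(self) -> int:
        # Stage 1 samples random negatives; later stages each add hard negatives.
        return self.stages * self.negatives_per_stage


params = TrainingParams()
shrunk = params.shrunk_window()
total_neg = params.total_negatives()
print("shrunk window:", shrunk)
print("negatives after all stages:", total_neg)
```

Under these assumptions the shrunk window matches the 16×32 pixels stated in the text, and the three stages accumulate 15 000 negative samples in total.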