iccv iccv2013 iccv2013-336 knowledge-graph by maker-knowledge-mining

336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection


Source: pdf

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. [sent-4, score-0.57]

2 All of these approaches are holistic, in the sense that the whole pedestrian is described by a single feature vector and is classified at once. [sent-14, score-0.256]

3 Recently, some authors have proposed successful methods for combining local detectors [13, 4, 25] and integrating the evidence from multiple local patches [14, 20, 16, 18, 11]. [sent-15, score-0.131]

4 AdaBoost is also a popular classifier for pedestrian detection, typically used in the presence of large numbers of features [24, 8, 7], or for speeding up the detection through cascaded layers of Boosting [21, 7, 27, 1]. [sent-21, score-0.416]

5 Recently, Random Forest ensembles [14, 20, 16] have been proposed as an alternative type of ensemble classifier for pedestrian detection. [sent-22, score-0.501]

6 In this paper, we propose a novel pedestrian detection approach that combines the flexibility of a part-based model with the fast execution time of a Random Forest classifier. [sent-24, score-0.325]

7 In this proposed combination, the role of the part evaluations is taken over by local expert evaluations at the nodes of the decision tree. [sent-25, score-0.243]

8 As an image window proceeds down the tree, a variable configuration of local experts is evaluated on its content, depending on the outcome of previous evaluations. [sent-26, score-0.347]

9 Thus, our proposed approach can flexibly adapt to different pedestrian viewpoints and body poses. [sent-27, score-0.256]

10 At the same time, by using an appropriate bootstrapping procedure, different trees of the forest cover different spatial configurations of parts in the object. [sent-28, score-0.421]

11 In addition to this, the decision tree structure ensures that only a small number of local experts are evaluated on each detection window, resulting in fast execution. [sent-30, score-0.482]

12 The proposed detection system was evaluated with a variety of well-known pedestrian datasets such as Caltech [9], Daimler [10], ETH [12] and INRIA [5], where it consistently ranks among the top performers. [sent-31, score-0.431]

13 Standard weak learner model. Before discussing the classifier proposed in our work, let us first introduce the basic concepts and notation of the Random Forest ensemble [3]. [sent-36, score-0.411]

14 For lack of space we restrict the explanation to only the standard weak learner model, and we refer to [3] for an in-depth description of the Random Forests (RF) classifier. [sent-37, score-0.222]

15 Given a tree of the forest, we follow the notation in [3] and denote as 푆푗 the set of samples received by the 푗-th internal or split node of this tree. [sent-38, score-0.417]

16 We denote as ℎ(푣⃗; 휃푗) ∈ {0, 1} the split function associated with this node, where 푣⃗ is a feature vector and 휃푗 is the set of parameters defining the split function. [sent-39, score-0.132]

17 The split function acts as a weak classifier that is part of the ensemble defined by the whole tree. [sent-40, score-0.278]

18 At training time, the 푗-th node receives a subset of samples 푆푗, and based on this data the classifier ℎ(⃗ 푣; 휃푗) is trained. [sent-41, score-0.392]

19 At test time, the 푗-th node receives the feature vector and this vector is passed to either the left or the right child depending on the output of ℎ(⃗ 푣; 휃푗) ∈ {0, 1}. [sent-43, score-0.206]

20 The feature selector 휙 is defined as the function 휙(푣⃗) = 푢⃗, where 푢⃗ ∈ ℝ푠 contains a subset of the components of 푣⃗ ∈ ℝ푑, with 푠 < 푑. The split function then takes the form ℎ(푣⃗; 휃) = [휓⃗ ⋅ 휙(푣⃗) < 휏], where [⋅] is the indicator function. [sent-48, score-0.207]

21 Define the parameters for node 푗 as: 휃푗 = argmax_{휃 ∈ 풯푗} 퐼(휃). [sent-56, score-0.147]
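
To make the standard weak learner concrete, here is a minimal Python sketch (an illustrative assumption, not the authors' code): the split is an axis-aligned threshold on one component picked by the feature selector, and 휃푗 is chosen by maximizing the information gain 퐼(휃) over a random candidate set 풯푗.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label vector (used as the purity measure)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p + 1e-12)))

def information_gain(labels, go_left):
    """I(theta): entropy decrease produced by a candidate split."""
    left, right = labels[go_left], labels[~go_left]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    w_left, w_right = len(left) / len(labels), len(right) / len(labels)
    return entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))

def train_standard_node(X, y, n_candidates=100, rng=None):
    """theta_j = argmax_{theta in T_j} I(theta) over random axis-aligned candidates."""
    rng = rng or np.random.default_rng(0)
    best_theta, best_gain = None, -np.inf
    for _ in range(n_candidates):
        dim = rng.integers(X.shape[1])                       # feature selector: one coordinate of v
        tau = rng.uniform(X[:, dim].min(), X[:, dim].max())  # random threshold
        gain = information_gain(y, X[:, dim] <= tau)
        if gain > best_gain:
            best_theta, best_gain = (dim, tau), gain
    return best_theta
```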

22 Proposed method. In this work we define a novel ensemble of local experts based on an averaged combination of random decision trees. [sent-57, score-0.39]

23 Afterwards we will introduce the concepts specific to pedestrian detection, and introduce our ensemble of local experts. [sent-59, score-0.44]

24 Weak learner model. The main difference with respect to the standard framework is that in each node the optimization of the parameters 휃 is not only based on a maximization of a purity measure (Eq. [sent-62, score-0.409]

25 Later on we will see that the joint use of this learner together with an appropriate feature selector 휙(푣⃗) provides the desired ensemble of local experts. [sent-68, score-0.567]

26 Keeping the discussion still under generic pattern recognition terms, the optimization process for each node is composed of the following steps: 1. [sent-69, score-0.147]

27 (b) Obtain a discriminant linear transformation 휓⃗푘 by learning a linear SVM classifier over the transformed samples 풮푗휙푘. [sent-81, score-0.307]

28 Define the split function for node 푗 as: ℎ(푣⃗; 휃푗) = [휓⃗푘∗ ⋅ 휙푘∗(푣⃗) ≤ 휏푘∗] (2). [sent-90, score-0.213]
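
The following sketch (hedged: it assumes scikit-learn's LinearSVC as the linear SVM, reuses the information_gain helper from the earlier sketch, and takes `selectors` simply as a list of index arrays produced by the feature selectors) illustrates how a node of the proposed forest could pick the most discriminant local expert among 퐾 candidates.

```python
import numpy as np
from sklearn.svm import LinearSVC   # assumed stand-in for the paper's linear SVM

def train_local_expert_node(X, y, selectors, information_gain):
    """For each candidate selector phi_k (a set of column indices), learn psi_k with a
    linear SVM on the selected components, then keep the most discriminant expert."""
    best_theta, best_gain = None, -np.inf
    for cols in selectors:
        svm = LinearSVC(C=1.0).fit(X[:, cols], y)
        psi, bias = svm.coef_.ravel(), float(svm.intercept_[0])
        scores = X[:, cols] @ psi + bias
        go_left = scores <= 0.0            # h(v; theta) = [psi_k . phi_k(v) <= tau_k], with tau_k = -bias
        gain = information_gain(y, go_left)
        if gain > best_gain:
            best_theta, best_gain = (cols, psi, -bias), gain
    return best_theta                      # (phi_k*, psi_k*, tau_k*)
```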

29 The most important difference between the proposed weak learner model and the standard one lies in the optimization of the linear transformation in step 2. [sent-91, score-0.271]

30 In our case, this is carried out by a discriminant optimizer such as the linear SVM learner. [sent-93, score-0.182]

31 This learner obtains a hyperplane that optimally separates the set of training samples at each node. [sent-94, score-0.268]

32 While the latter approach also provides a discriminant classification of the samples, there is no guarantee that the resulting hyperplane provides an optimal maximum-margin discrimination. [sent-96, score-0.214]

33 Furthermore, the use of a discriminant classifier such as the linear SVM, together with an appropriate definition of the feature selector 휙 (see section 3. [sent-97, score-0.43]

34 2) allows us to train our ensemble of local experts inside the RF framework. [sent-98, score-0.358]

35 Feature selector. We define our ensemble of local experts through the definition of an appropriate feature selector 휙(푣⃗). [sent-101, score-0.804]

36 In particular, the 푘-th feature selector 휙푘 is generated by randomly selecting the coordinates (푖, 푗) of the top-left block, and randomly generating the width 푊 and height 퐻 of the rectangular area, where 1 ≤ 푊 ≤ 퐿 and 1 ≤ 퐻 ≤ 퐿, with 퐿 the predefined maximum size. [sent-105, score-0.251]

37 Given the previous definition of the feature selector 휙푘, the 푘-th local expert is defined as 퐸푘(푣⃗) = 휓⃗푘 ⋅ 휙푘(푣⃗). [sent-106, score-0.417]
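
As an illustration of how such a selector can map a random rectangle of descriptor blocks to component indices of 푣⃗, here is a small sketch; the row-major block layout and the `feats_per_block` parameter are assumptions about how the HOG/LBP window descriptor is laid out, not details given in the text.

```python
import numpy as np

def random_block_selector(grid_h, grid_w, feats_per_block, L, rng):
    """phi_k: component indices of v covering a random H x W rectangle of blocks,
    with 1 <= W <= L and 1 <= H <= L, placed at a random top-left block (i, j).
    Assumes L <= min(grid_h, grid_w)."""
    W = int(rng.integers(1, L + 1))
    H = int(rng.integers(1, L + 1))
    i = int(rng.integers(0, grid_h - H + 1))    # top-left block row
    j = int(rng.integers(0, grid_w - W + 1))    # top-left block column
    rows, cols = np.meshgrid(np.arange(i, i + H), np.arange(j, j + W), indexing="ij")
    block_ids = (rows * grid_w + cols).ravel()  # assumes row-major block ordering in v
    return np.concatenate([b * feats_per_block + np.arange(feats_per_block)
                           for b in block_ids])

def local_expert_score(v, cols, psi):
    """E_k(v) = psi_k . phi_k(v): response of one local expert on the window descriptor v."""
    return float(v[cols] @ psi)
```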

38 1, the transformation is learned by using a discriminant learner such as linear SVM, using the transformed samples as training set. [sent-108, score-0.461]

39 This is equivalent to extracting a local block-based feature vector from the same rectangular area across the different image windows introduced into the node, and feeding them to a learner that obtains a model of this part of the window. [sent-109, score-0.259]

40 1, the 푗-th node randomly generates a fixed number 퐾 of feature selectors 휙푘, learns the corresponding local experts 퐸푘 and selects the most discriminant one according to the given data. [sent-113, score-0.63]

41 The selected local expert 퐸푘 also complements, in a classification sense, the ones selected by the other nodes in the same branch of the tree. [sent-114, score-0.231]

42 This is due to the fact that the data samples received by the node 푗 depend on the classification provided by its ancestors. [sent-115, score-0.245]

43 As a result, each tree of the forest provides an ensemble of local experts which are both discriminant and complementary. [sent-116, score-0.885]

44 Each tree of the forest receives the whole training set. [sent-123, score-0.438]

45 [Figure 1 legend: leaves where the majority of the samples are positive; leaves where the majority of the samples are negative.] [sent-128, score-0.166]

46 (a) Illustration of our feature selector, (b) tree generated by our method, (c) leaves of the forest. [sent-129, score-0.14]

47 푝푡(푐 ∣ 푣⃗) denotes the probability that the window 푣⃗ belongs to class 푐, computed by the 푡-th tree of the forest. [sent-131, score-0.178]

48 Every leaf stores the class distribution of the training samples that reach it, and then each leaf probability is set according to this distribution. [sent-133, score-0.295]
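
A compact sketch of classification, under an assumed tree representation (nested dicts holding either split parameters or a leaf probability; this is illustrative, not the paper's data structure): each window descriptor is routed down every tree and the per-tree leaf probabilities are averaged.

```python
import numpy as np

def tree_probability(tree, v):
    """p_t(c = pedestrian | v): route v down one tree and return the positive-class
    probability stored at the leaf it reaches."""
    node = tree
    while "leaf_prob" not in node:
        cols, psi, tau = node["phi"], node["psi"], node["tau"]
        node = node["left"] if float(v[cols] @ psi) <= tau else node["right"]
    return node["leaf_prob"]   # fraction of positives among training samples that reached this leaf

def forest_probability(forest, v):
    """p(c | v): average of the T per-tree probabilities."""
    return float(np.mean([tree_probability(tree, v) for tree in forest]))
```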

49 Figure 1(b) shows an example of a tree generated with the proposed method, where for each internal node we show the patch of the window selected, and for each leaf we show, overlapped, all the patches selected from the root to this leaf. [sent-136, score-0.566]

50 Bootstrapping procedure. We use bootstrapping at training time in order to select a subset of negative windows from the large pool of possible negatives. [sent-141, score-0.254]

51 (b) Use the current forest ℱ for detecting false positives in the training images. [sent-151, score-0.325]

52 Consider these false positives as negative samples and add them to the training set 풮. [sent-152, score-0.13]

53 (c) Use the new training set 풮 for updating the leaf probabilities 푝(푐 ∣ 푣⃗) for all the trees in ℱ. [sent-153, score-0.113]
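
Taken together, steps (b) and (c) above, plus an assumed initial training step (a), suggest a loop of the following shape (a sketch; grow_forest, scan_for_false_positives and update_leaf_probabilities are hypothetical helpers standing in for the operations quoted above, and only the leaf probabilities are refreshed after the first round, as the text describes).

```python
def bootstrap_training(positives, initial_negatives, training_images, n_rounds,
                       grow_forest, scan_for_false_positives, update_leaf_probabilities):
    """Iteratively enlarge the negative set with hard examples mined by the current forest."""
    negatives = list(initial_negatives)
    forest = grow_forest(positives, negatives)        # assumed step (a): train F on the initial set S
    for _ in range(n_rounds):
        hard = scan_for_false_positives(forest, training_images)    # step (b): mine false positives
        negatives.extend(hard)                                      # ... and add them to S
        update_leaf_probabilities(forest, positives, negatives)     # step (c): refresh p(c|v) at the leaves
    return forest
```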

54 Moreover, it is worth mentioning that the training time is reduced thanks to the smaller number of negative samples introduced at each iteration. [sent-156, score-0.16]

55 Soft Cascade. In order to speed up the detection of objects, we propose to use a Soft Cascade (SC) architecture [2]. [sent-159, score-0.116]

56 Let 푇 be the total number of trees in the forest, 푀 be the number of trees used in an initial layer, 휂 be the rejection threshold (see Section 5. [sent-160, score-0.148]

57 The cascade works by first gathering enough evidence for the window 푣, through the use of 푀 trees in the initial layer (step 1). [sent-167, score-0.213]

58 After this initialization, a new tree is added at each layer of the cascade (step 3. [sent-168, score-0.173]

59 This is due to the fact that a large majority of windows are rejected at early stages of the cascade, and thus there is no need to compute the probability for all the trees of the forest on these windows. [sent-173, score-0.361]
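
A minimal sketch of this Soft Cascade evaluation follows (assumptions: the rejection test compares the running average of per-tree probabilities against 휂 once at least 푀 trees have been evaluated; the paper may calibrate the stage thresholds differently).

```python
def soft_cascade_probability(forest, v, tree_probability, M, eta):
    """Evaluate the T trees one by one; reject the window early once enough
    evidence has been gathered and the running average falls below eta."""
    total = 0.0
    for t, tree in enumerate(forest, start=1):
        total += tree_probability(tree, v)     # one more tree per cascade layer
        if t >= M and total / t < eta:
            return None                        # window rejected at an early stage
    return total / len(forest)                 # survived all trees: full forest probability
```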

60 Here, the leaf nodes take up the role of visual words, and each of them stores a vote distribution for the relative position of the object center. [sent-177, score-0.148]

61 The votes from activated leaf nodes are combined in a Hough Voting space, and object locations are determined as local maxima of the vote distribution. [sent-178, score-0.157]

62 While HF is used for classifying the individual local patches, our RF is used for classifying the entire object at once. [sent-179, score-0.122]

63 For this purpose, at training time each node of the tree receives a subset of window samples containing the entire object and decides which local patch is most discriminant based on the given data. [sent-180, score-0.79]

64 As a result, each tree of the forest provides an ensemble of local experts, where each expert is specialized in a different local patch of the object. [sent-181, score-0.797]

65 Another important difference with respect to [14, 20, 16] is that in these approaches the collection of local patches is sampled beforehand from the window and introduced into the tree. [sent-182, score-0.161]

66 Therefore, each node of the tree is forced to learn each patch of the collection, regardless of whether or not this patch is discriminant for classifying the whole object. [sent-183, score-0.599]

67 In contrast, in the proposed method each node of the tree automatically selects the local patch that is found to be the most discriminant one, based on the subset of samples received. [sent-184, score-0.666]

68 Furthermore, by using the RF machinery the local patch selected by each node complements, in a discriminative sense, the local patches selected by its ancestors in the tree, obtaining a strong ensemble of local experts. [sent-185, score-0.617]

69 At the end of the process each tree of the forest has selected a different collection of discriminant local patches, increasing the robustness and generalization capability of the final classifier. [sent-186, score-0.565]

70 [17] proposed a new RF classifier based on ridge regression for obtaining discriminant linear splits at the node level. [sent-188, score-0.338]

71 This work resembles our proposal in the sense that we also make use of linear discriminant learners at split nodes. [sent-189, score-0.21]

72 This is combined with a patch-selection strategy in such a way that each split node determines a local expert of the ensemble. [sent-191, score-0.391]

73 More similar to our work is [26], where each split node selects a rectangular patch and applies a linear SVM onto it. [sent-192, score-0.376]

74 A node is no longer split if any of the following conditions occurs: a) its depth is larger than 6 levels; b) the subset of samples contains fewer than 10 samples; or c) the percentage of samples from the same class is above 99%. [sent-219, score-0.388]
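
The stopping rule quoted above maps directly onto a small predicate; the sketch below simply restates the three conditions (depth above 6 levels, fewer than 10 samples, or more than 99% of the samples from one class).

```python
import numpy as np

def stop_splitting(depth, labels, max_depth=6, min_samples=10, max_purity=0.99):
    """True if the branch should become a leaf: too deep, too few samples, or almost pure."""
    if depth > max_depth or len(labels) < min_samples:
        return True
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / counts.sum() > max_purity
```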

75 We performed the standard per-image evaluation used in pedestrian detection. (Footnote 3: in preliminary experiments we saw that the performance was no longer improving when increasing the number of levels.) [sent-227, score-0.256]

76 In order to quantify the performance we used the well-known Caltech pedestrian toolbox [9]. [sent-230, score-0.256]

77 This permits us to both accelerate the detection of pedestrians (as fewer windows are evaluated by the classifier) and remove false positives standing on non-plausible locations of the scale-space pyramid, thus improving the resulting accuracy. [sent-234, score-0.324]

78 The first one is the maximum patch size 퐿 selected by each local expert (see section 3. [sent-241, score-0.29]

79 This parameter represents the compromise between having an expert that is based on local (i. [sent-243, score-0.178]

80 Regarding the rest of the Caltech evaluation subsets, we show results for the two most difficult cases: partial occlusions and medium scale (where the size of the pedestrian is fairly small). [sent-272, score-0.288]

81 In the overall subset, our method ranks second (with miss rate 82. [sent-280, score-0.118]

82 In the near-scale, it ranks third (with miss rate 29. [sent-283, score-0.118]

83 Furthermore, we parallelized our code in order to compute several scales of the pyramid at the same time, and in order to compute several trees of the forest at the same time (this last parallelization was only performed for our baseline and it was not performed when using the SC component). [sent-290, score-0.316]

84 If we consider pedestrians with a minimum height of 96 pixels, the system operates at 4 fps with HOG, and 3 fps with HOG-LBP. [sent-292, score-0.332]

85 For this purpose, we used AVX instructions. [Figure legend residue: RandForest-HOGLBP, maximum patch size 퐿.] [sent-294, score-0.124]

86 [Figure residue: x-axis "false positives per image"; panel (a) legend "RandForest-HOGLBP: number of bootstrapping iterations".] [sent-299, score-0.243]

87 (a) Performance as a function of maximum patch size 퐿, (b) performance as a function of the number of bootstrapping rounds. [sent-305, score-0.186]

88 [Figure residue: panel titles "Caltech testing dataset: Reasonable" and "Caltech testing dataset: Partial occlusion"; method legend garbled beyond recovery.] [sent-310, score-0.168]

89 [Figure residue: x-axis "false positives per image"; panel title "Caltech testing dataset: Medium scale".] [sent-322, score-0.327]

90 5 fps without any optimization (using only the SoftCascade and excluding the CGP step) and at 4. [sent-341, score-0.116]

91 6 fps if we use both AVX instructions and the CGP step. [sent-342, score-0.159]

92 The second and third columns show, respectively, the fps when the minimum pedestrian height is 50 and 96 pixels. [sent-350, score-0.372]

93 Conclusions. We presented a novel approach for estimating ensembles of local experts through the RF framework. [sent-352, score-0.299]

94 The proposed approach works with rich block-based descriptors which are reused by the different experts of the ensemble in such a way that each expert selects the most discriminant local patch based on this descriptor. [sent-353, score-0.795]

95 Making use of the RF framework, the patches selected by each tree are both discriminant and complementary, and at the end of the process the forest estimates a diverse collection of ensembles providing both robustness and generalization capabilities. [sent-354, score-0.624]

96 As part of the work, we show how to integrate the proposed RF classifier with both a SC architecture and a simple yet effective CGP algorithm, which permits us to significantly speed up the detection, as shown in the results. [sent-355, score-0.134]

97 At the same time, we showed that the proposed architecture provides quasi real-time performance on par with some of the fastest approaches. [sent-362, score-0.112]

98 Survey of pedestrian detection for advanced driver assistance systems. [sent-457, score-0.325]

99 Fast pedestrian detection by cascaded random forest with dominant orientation templates. [sent-494, score-0.611]

100 Fast human detection using a cascade of histograms of oriented gradients. [sent-539, score-0.136]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('cgp', 0.448), ('pedestrian', 0.256), ('forest', 0.242), ('caltech', 0.215), ('selector', 0.207), ('experts', 0.203), ('learner', 0.17), ('rf', 0.164), ('node', 0.147), ('discriminant', 0.144), ('expert', 0.136), ('fps', 0.116), ('ensemble', 0.113), ('tree', 0.106), ('bootstrapping', 0.105), ('sc', 0.101), ('postivies', 0.099), ('purity', 0.092), ('miage', 0.092), ('daimler', 0.092), ('inria', 0.087), ('testnig', 0.084), ('leaf', 0.082), ('patch', 0.081), ('ranks', 0.076), ('trees', 0.074), ('window', 0.072), ('detection', 0.069), ('multiresc', 0.069), ('cascade', 0.067), ('samples', 0.067), ('split', 0.066), ('regarding', 0.063), ('pedestrians', 0.062), ('hog', 0.061), ('receives', 0.059), ('forests', 0.056), ('avx', 0.056), ('menze', 0.056), ('randforest', 0.056), ('selectors', 0.056), ('ensembles', 0.054), ('performer', 0.052), ('false', 0.052), ('weak', 0.052), ('transformation', 0.049), ('svm', 0.049), ('architecture', 0.047), ('patches', 0.047), ('classifier', 0.047), ('hf', 0.046), ('windows', 0.045), ('rectangular', 0.044), ('cascaded', 0.044), ('instructions', 0.043), ('hoglbp', 0.043), ('local', 0.042), ('miss', 0.042), ('subset', 0.041), ('machinery', 0.041), ('complements', 0.041), ('doll', 0.04), ('classifying', 0.04), ('criminisi', 0.04), ('permit', 0.04), ('hough', 0.039), ('selects', 0.038), ('reused', 0.038), ('optimizer', 0.038), ('operates', 0.038), ('enzweiler', 0.037), ('laptop', 0.036), ('provides', 0.035), ('altogether', 0.035), ('randomness', 0.035), ('sources', 0.035), ('partial', 0.034), ('reasonable', 0.034), ('leaves', 0.034), ('standing', 0.033), ('nodes', 0.033), ('stores', 0.033), ('permits', 0.033), ('negative', 0.032), ('definition', 0.032), ('rest', 0.032), ('decision', 0.032), ('descriptor', 0.031), ('training', 0.031), ('selected', 0.031), ('received', 0.031), ('type', 0.031), ('evaluated', 0.03), ('exhaustively', 0.03), ('mentioning', 0.03), ('holistic', 0.03), ('wojek', 0.03), ('fastest', 0.03), ('triggs', 0.03), ('concepts', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

2 0.22238168 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection

Author: Xingyu Zeng, Wanli Ouyang, Xiaogang Wang

Abstract: Cascaded classifiers have been widely used in pedestrian detection and achieved great success. These classifiers are trained sequentially without joint optimization. In this paper, we propose a new deep model that can jointly train multi-stage classifiers through several stages of backpropagation. It keeps the score map output by a classifier within a local region and uses it as contextual information to support the decision at the next stage. Through a specific design of the training strategy, this deep architecture is able to simulate the cascaded classifiers by mining hard samples to train the network stage-by-stage. Each classifier handles samples at a different difficulty level. Unsupervised pre-training and specifically designed stage-wise supervised training are used to regularize the optimization problem. Both theoretical analysis and experimental results show that the training strategy helps to avoid overfitting. Experimental results on three datasets (Caltech, ETH and TUD-Brussels) show that our approach outperforms the state-of-the-art approaches.

3 0.18740501 190 iccv-2013-Handling Occlusions with Franken-Classifiers

Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool

Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.

4 0.18046387 404 iccv-2013-Structured Forests for Fast Edge Detection

Author: Piotr Dollár, C. Lawrence Zitnick

Abstract: Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image patches to learn both an accurate and computationally efficient edge detector. We formulate the problem of predicting local edge masks in a structured learning framework applied to random decision forests. Our novel approach to learning decision trees robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated. The result is an approach that obtains realtime performance that is orders of magnitude faster than many competing state-of-the-art approaches, while also achieving state-of-the-art edge detection results on the BSDS500 Segmentation dataset and NYU Depth dataset. Finally, we show the potential of our approach as a general purpose edge detector by showing our learned edge models generalize well across datasets.

5 0.15416652 448 iccv-2013-Weakly Supervised Learning of Image Partitioning Using Decision Trees with Structured Split Criteria

Author: Christoph Straehle, Ullrich Koethe, Fred A. Hamprecht

Abstract: We propose a scheme that allows to partition an image into a previously unknown number of segments, using only minimal supervision in terms of a few must-link and cannotlink annotations. We make no use of regional data terms, learning instead what constitutes a likely boundary between segments. Since boundaries are only implicitly specified through cannot-link constraints, this is a hard and nonconvex latent variable problem. We address this problem in a greedy fashion using a randomized decision tree on features associated with interpixel edges. We use a structured purity criterion during tree construction and also show how a backtracking strategy can be used to prevent the greedy search from ending up in poor local optima. The proposed strategy is compared with prior art on natural images.

6 0.15194564 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve

7 0.1393567 437 iccv-2013-Unsupervised Random Forest Manifold Alignment for Lipreading

8 0.13878842 220 iccv-2013-Joint Deep Learning for Pedestrian Detection

9 0.13001858 47 iccv-2013-Alternating Regression Forests for Object Detection and Pose Estimation

10 0.12922128 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

11 0.11807696 338 iccv-2013-Randomized Ensemble Tracking

12 0.11593612 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies

13 0.11004596 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests

14 0.10676541 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees

15 0.1031967 133 iccv-2013-Efficient Hand Pose Estimation from a Single Depth Image

16 0.098455869 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction

17 0.096186116 130 iccv-2013-Dynamic Structured Model Selection

18 0.094799332 81 iccv-2013-Combining the Right Features for Complex Event Recognition

19 0.09407492 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild

20 0.093557619 44 iccv-2013-Adapting Classification Cascades to New Domains


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.214), (1, 0.036), (2, -0.022), (3, -0.062), (4, 0.093), (5, -0.04), (6, -0.016), (7, 0.08), (8, -0.052), (9, -0.098), (10, -0.035), (11, -0.043), (12, 0.02), (13, -0.032), (14, 0.131), (15, 0.028), (16, -0.038), (17, -0.041), (18, 0.093), (19, 0.204), (20, -0.125), (21, 0.016), (22, -0.055), (23, 0.131), (24, -0.168), (25, -0.02), (26, -0.071), (27, 0.084), (28, -0.035), (29, -0.155), (30, -0.033), (31, 0.086), (32, -0.09), (33, 0.062), (34, -0.003), (35, 0.031), (36, -0.03), (37, -0.042), (38, 0.003), (39, -0.054), (40, -0.009), (41, 0.045), (42, -0.112), (43, 0.044), (44, -0.032), (45, -0.056), (46, -0.055), (47, -0.051), (48, 0.078), (49, -0.013)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9447217 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

2 0.79817939 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve

Author: Sakrapee Paisitkriangkrai, Chunhua Shen, Anton Van Den Hengel

Abstract: Many typical applications of object detection operate within a prescribed false-positive range. In this situation the performance of a detector should be assessed on the basis of the area under the ROC curve over that range, rather than over the full curve, as the performance outside the range is irrelevant. This measure is labelled as the partial area under the ROC curve (pAUC). Effective cascade-based classification, for example, depends on training node classifiers that achieve the maximal detection rate at a moderate false positive rate, e.g., around 40% to 50%. We propose a novel ensemble learning method which achieves a maximal detection rate at a user-defined range of false positive rates by directly optimizing the partial AUC using structured learning. By optimizing for different ranges of false positive rates, the proposed method can be used to train either a single strong classifier or a node classifier forming part of a cascade classifier. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our approach, and we show that it is possible to train state-of-the-art pedestrian detectors using the pro- posed structured ensemble learning method.

3 0.77591926 404 iccv-2013-Structured Forests for Fast Edge Detection

Author: Piotr Dollár, C. Lawrence Zitnick

Abstract: Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image patches to learn both an accurate and computationally efficient edge detector. We formulate the problem of predicting local edge masks in a structured learning framework applied to random decision forests. Our novel approach to learning decision trees robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated. The result is an approach that obtains realtime performance that is orders of magnitude faster than many competing state-of-the-art approaches, while also achieving state-of-the-art edge detection results on the BSDS500 Segmentation dataset and NYU Depth dataset. Finally, we show the potential of our approach as a general purpose edge detector by showing our learned edge models generalize well across datasets.

4 0.74650121 47 iccv-2013-Alternating Regression Forests for Object Detection and Pose Estimation

Author: Samuel Schulter, Christian Leistner, Paul Wohlhart, Peter M. Roth, Horst Bischof

Abstract: We present Alternating Regression Forests (ARFs), a novel regression algorithm that learns a Random Forest by optimizing a global loss function over all trees. This interrelates the information of single trees during the training phase and results in more accurate predictions. ARFs can minimize any differentiable regression loss without sacrificing the appealing properties of Random Forests, like low computational complexity during both, training and testing. Inspired by recent developments for classification [19], we derive a new algorithm capable of dealing with different regression loss functions, discuss its properties and investigate the relations to other methods like Boosted Trees. We evaluate ARFs on standard machine learning benchmarks, where we observe better generalization power compared to both standard Random Forests and Boosted Trees. Moreover, we apply the proposed regressor to two computer vision applications: object detection and head pose estimation from depth images. ARFs outperform the Random Forest baselines in both tasks, illustrating the importance of optimizing a common loss function for all trees.

5 0.74332541 190 iccv-2013-Handling Occlusions with Franken-Classifiers

Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool

Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.

6 0.73837191 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

7 0.72977769 211 iccv-2013-Image Segmentation with Cascaded Hierarchical Models and Logistic Disjunctive Normal Networks

8 0.70032406 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees

9 0.67261118 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection

10 0.62772125 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification

11 0.61999583 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

12 0.59082633 448 iccv-2013-Weakly Supervised Learning of Image Partitioning Using Decision Trees with Structured Split Criteria

13 0.59026676 338 iccv-2013-Randomized Ensemble Tracking

14 0.56901759 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies

15 0.56565195 437 iccv-2013-Unsupervised Random Forest Manifold Alignment for Lipreading

16 0.54903656 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation

17 0.53036141 44 iccv-2013-Adapting Classification Cascades to New Domains

18 0.51282501 277 iccv-2013-Multi-channel Correlation Filters

19 0.49345943 220 iccv-2013-Joint Deep Learning for Pedestrian Detection

20 0.49129269 340 iccv-2013-Real-Time Articulated Hand Pose Estimation Using Semi-supervised Transductive Regression Forests


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.083), (7, 0.019), (12, 0.043), (26, 0.111), (31, 0.031), (40, 0.031), (42, 0.092), (64, 0.043), (70, 0.182), (73, 0.027), (89, 0.191), (95, 0.017), (98, 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.90561986 325 iccv-2013-Predicting Primary Gaze Behavior Using Social Saliency Fields

Author: Hyun Soo Park, Eakta Jain, Yaser Sheikh

Abstract: We present a method to predict primary gaze behavior in a social scene. Inspired by the study of electric fields, we posit “social charges ”—latent quantities that drive the primary gaze behavior of members of a social group. These charges induce a gradient field that defines the relationship between the social charges and the primary gaze direction of members in the scene. This field model is used to predict primary gaze behavior at any location or time in the scene. We present an algorithm to estimate the time-varying behavior of these charges from the primary gaze behavior of measured observers in the scene. We validate the model by evaluating its predictive precision via cross-validation in a variety of social scenes.

2 0.84633392 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

Author: Suyog Dutt Jain, Kristen Grauman

Abstract: The mode of manual annotation used in an interactive segmentation algorithm affects both its accuracy and easeof-use. For example, bounding boxes are fast to supply, yet may be too coarse to get good results on difficult images; freehand outlines are slower to supply and more specific, yet they may be overkill for simple images. Whereas existing methods assume a fixed form of input no matter the image, we propose to predict the tradeoff between accuracy and effort. Our approach learns whether a graph cuts segmentation will succeed if initialized with a given annotation mode, based on the image ’s visual separability and foreground uncertainty. Using these predictions, we optimize the mode of input requested on new images a user wants segmented. Whether given a single image that should be segmented as quickly as possible, or a batch of images that must be segmented within a specified time budget, we show how to select the easiest modality that will be sufficiently strong to yield high quality segmentations. Extensive results with real users and three datasets demonstrate the impact.

same-paper 3 0.84132367 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

4 0.82574576 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

Abstract: We propose an unsupervised detector adaptation algorithm to adapt any offline trained face detector to a specific collection of images, and hence achieve better accuracy. The core of our detector adaptation algorithm is a probabilistic elastic part (PEP) model, which is offline trained with a set of face examples. It produces a statisticallyaligned part based face representation, namely the PEP representation. To adapt a general face detector to a collection of images, we compute the PEP representations of the candidate detections from the general face detector, and then train a discriminative classifier with the top positives and negatives. Then we re-rank all the candidate detections with this classifier. This way, a face detector tailored to the statistics of the specific image collection is adapted from the original detector. We present extensive results on three datasets with two state-of-the-art face detectors. The significant improvement of detection accuracy over these state- of-the-art face detectors strongly demonstrates the efficacy of the proposed face detector adaptation algorithm.

5 0.80223262 299 iccv-2013-Online Video SEEDS for Temporal Window Objectness

Author: Michael Van_Den_Bergh, Gemma Roig, Xavier Boix, Santiago Manen, Luc Van_Gool

Abstract: Superpixel and objectness algorithms are broadly used as a pre-processing step to generate support regions and to speed-up further computations. Recently, many algorithms have been extended to video in order to exploit the temporal consistency between frames. However, most methods are computationally too expensive for real-time applications. We introduce an online, real-time video superpixel algorithm based on the recently proposed SEEDS superpixels. A new capability is incorporated which delivers multiple diverse samples (hypotheses) of superpixels in the same image or video sequence. The multiple samples are shown to provide a strong cue to efficiently measure the objectness of image windows, and we introduce the novel concept of objectness in temporal windows. Experiments show that the video superpixels achieve comparable performance to state-of-the-art offline methods while running at 30 fps on a single 2.8 GHz i7 CPU. State-of-the-art performance on objectness is also demonstrated, yet orders of magnitude faster and extended to temporal windows in video.

6 0.80052644 414 iccv-2013-Temporally Consistent Superpixels

7 0.79952925 150 iccv-2013-Exemplar Cut

8 0.79710668 404 iccv-2013-Structured Forests for Fast Edge Detection

9 0.79686677 47 iccv-2013-Alternating Regression Forests for Object Detection and Pose Estimation

10 0.79680973 160 iccv-2013-Fast Object Segmentation in Unconstrained Video

11 0.79664963 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition

12 0.79389566 426 iccv-2013-Training Deformable Part Models with Decorrelated Features

13 0.79386574 411 iccv-2013-Symbiotic Segmentation and Part Localization for Fine-Grained Categorization

14 0.79372358 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning

15 0.79363102 6 iccv-2013-A Convex Optimization Framework for Active Learning

16 0.79330111 396 iccv-2013-Space-Time Robust Representation for Action Recognition

17 0.7930305 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

18 0.7922498 258 iccv-2013-Low-Rank Sparse Coding for Image Classification

19 0.79208273 111 iccv-2013-Detecting Dynamic Objects with Multi-view Background Subtraction

20 0.79124063 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation