iccv iccv2013 iccv2013-241 knowledge-graph by maker-knowledge-mining

241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Source: pdf

Author: Tianfu Wu, Song-Chun Zhu

Abstract: Many object detectors, such as AdaBoost, SVM and deformable part-based models (DPM), compute additive scoring functions at a large number of windows scanned over image pyramid, thus computational efficiency is an important consideration beside accuracy performance. In this paper, we present a framework of learning cost-sensitive decision policy which is a sequence of two-sided thresholds to execute early rejection or early acceptance based on the accumulative scores at each step. A decision policy is said to be optimal if it minimizes an empirical global risk function that sums over the loss of false negatives (FN) and false positives (FP), and the cost of computation. While the risk function is very complex due to high-order connections among the two-sided thresholds, we find its upper bound can be optimized by dynamic programming (DP) efficiently and thus say the learned policy is near-optimal. Given the loss of FN and FP and the cost in three numbers, our method can produce a policy on-the-fly for Adaboost, SVM and DPM. In experiments, we show that our decision policy outperforms state-of-the-art cascade methods significantly in terms of speed with similar accuracy performance.

Reference: text

Summary: the most important sentenses genereted by tfidf model

sentIndex sentText sentNum sentScore

1 edu Abstract Many object detectors, such as AdaBoost, SVM and deformable part-based models (DPM), compute additive scoring functions at a large number of windows scanned over image pyramid, thus computational efficiency is an important consideration beside accuracy performance. [sent-3, score-0.219]

2 In this paper, we present a framework of learning cost-sensitive decision policy which is a sequence of two-sided thresholds to execute early rejection or early acceptance based on the accumulative scores at each step. [sent-4, score-1.199]

3 A decision policy is said to be optimal if it minimizes an empirical global risk function that sums over the loss of false negatives (FN) and false positives (FP), and the cost of computation. [sent-5, score-1.183]

4 While the risk function is very complex due to high-order connections among the two-sided thresholds, we find its upper bound can be optimized by dynamic programming (DP) efficiently and thus say the learned policy is near-optimal. [sent-6, score-0.953]

5 Given the loss of FN and FP and the cost in three numbers, our method can produce a policy on-the-fly for Adaboost, SVM and DPM. [sent-7, score-0.54]

6 In experiments, we show that our decision policy outperforms state-of-the-art cascade methods significantly in terms of speed with similar accuracy performance. [sent-8, score-0.875]

7 During offline training, those scoring functions are learned by minimizing some regularized convex surrogate functions of the 0/1 loss function; and during detection, they are evaluated at a vast number of sliding windows in an image pyramid. [sent-11, score-0.25]

8 In this paper, we present a generic empirical global risk minimization framework and a DP algorithm for learning near-optimal cost-sensitive decision policy online. [sent-13, score-0.976]

9 The policy consists of a sequence oftwo-sided thresholds to execute Figure 1. [sent-14, score-0.592]

10 Illustration of two decision policies learned for a human face AdaBoost classifier. [sent-15, score-0.35]

11 The horizontal axis is the 1, 146 boosted weak classifiers divided in 20 stages, and the vertical axis is the accumulative scores. [sent-16, score-0.235]

12 At each stage the solid histograms show the original positive (red) and negative (green) populations in training dataset, and the dotted histograms show the changed populations after early decisions by the policy. [sent-17, score-0.452]

13 early rejection or early acceptance based on the accumula- tive scores at each stage. [sent-18, score-0.279]

14 By “cost-sensitive”, it means that the decision policy accounts for two factors explicitly: • The loss of misclassification: the losses CFN and CFP fTohre a ofasslse o negative saifnicda a iofanls:e t positive respectively. [sent-19, score-0.789]

15 The number of features and the scoring functions are all learned by the offline training stage as usual. [sent-23, score-0.317]

16 Our policy Θ Θ 775533 does not change these settings and is learned “online” with very flexible design of the configuration of sequential stages (i. [sent-24, score-0.663]

17 The policy is adjustable before the online detection starts. [sent-27, score-0.515]

18 1 shows two examples of decision policies learned for the human face AdaBoost classifier based on different parameter settings: the top one is more cautious, while the bottom one is more aggressive in early rejection/acceptance. [sent-30, score-0.465]

19 Learning decision policy is formulated under the risk minimization framework with two phases. [sent-32, score-0.9]

20 (i) Off-line learning: We assume that an additive scoring Θ × function (AdBoost, SVM or DPM) consisting of T basic additive terms is trained on a dedicated training data set, and remains fixed. [sent-33, score-0.291]

21 We divide the T basic terms into N sequential stages, and compute N histograms of accumulative scores (with K bins) on the positive (see the red histograms in Fig. [sent-35, score-0.308]

22 4) projected from the trajectories so that we can count the number of FPs and FNs, and calculate the computational cost for any decision policy recursively. [sent-39, score-0.809]

23 (ii) On-line learning: We derive an upper bound of the risk to address the optimization issue caused by the highorder connections between the two-sided thresholds. [sent-40, score-0.491]

24 Then, we propose a DP algorithm to minimize the upper bound efficiently with Θ = (CFN, CFP, λ) given to produce the near-optimal decision policy. [sent-41, score-0.375]

25 So, a fast online algorithm is entailed for automatically adapting the policy to different contexts and goals. [sent-47, score-0.491]

26 This, together with the proposed global risk function, its tight upper bound and the DP solution, distinguishes our work from many others in the literature [3–5, 10, 12, 16, 18, 20, 23]. [sent-48, score-0.404]

27 In boosting cascade [26], the rejection thresholds of each stage are learned independently, and the training process needs time-consuming trial-and-error tuning. [sent-51, score-0.557]

28 Soft cascade [3] needs a sufficiently large validation set to calibrate the rejection thresholds. [sent-56, score-0.277]

29 In [7], the cascade is learned for a off-line trained DPM. [sent-57, score-0.238]

30 The rejection threshold of each stage is selected so that the probability of a misclassification occurring is bounded from above by a predefined small positive value. [sent-58, score-0.299]

31 (ii) The coarse-to-fine methods [1, 2, 9, 21] require top- down coarse-to-fine partition of the pose space of an object class, which is often done manually and then used to organize all the tests into a hierarchy (such as a decision tree). [sent-59, score-0.205]

32 This paper makes three main contributions to rapid object detection: (i) It presents a cost-sensitive formulation to learn the decision policy accounting for the loss of misclassification and the computational cost explicitly, applicable to object detectors with additive scoring functions. [sent-65, score-1.01]

33 (ii) It derives an upper bound of the empirical global risk function and presents a DP algorithm to minimize it efficiently. [sent-66, score-0.48]

34 (iii) In experiments, it shows the learned policies of AdaBoost, SVM and DPM outperform the state-of-the-art cascade methods significantly in speed with similar accuracy. [sent-68, score-0.319]

35 Definition of the decision policy Denote by f(x) = gt(x) the off-line trained additive scoring function w? [sent-72, score-0.861]

36 accumulative score a ≤t th ie ≤ ≤i-t Nh stage is defined by fi(x) = ? [sent-79, score-0.245]

37 ecision policy): A N-stage decision policy of f(x) is define? [sent-82, score-0.666]

38 In testing, ΠN makes three possible actions at stage iin terms of accumulative score fi(x) : • Reject x, if fi(x) < τi− , • Accept x, if fi(x) τi+, and • Continue to compute Gi+1 (x), otherwise. [sent-87, score-0.245]

39 The risk function of a decision policy Denote by R(ΠN; Θ) the global risk function of ΠN, R(ΠN; Θ) = L(ΠN; CFP, CFN) λ · C(ΠN), (2) ≥ + where L(·) is the expected loss and C(·) the expected computational )cio sst th. [sent-93, score-1.192]

40 Denote by c(GTi)h eth eex computational fc coosmt opf an itniodnivi Cd(uΠal stage i(which is fixed after f(x) is trained offline). [sent-101, score-0.204]

41 For a given sample x, let n(x) be the number of stages tested before a decision is made by ΠN (i. [sent-106, score-0.316]

42 =N1 pˆ(i;ΠN) · Ci (4) where pˆ(i; ΠN) is the empirical probability of making an early decision at stage i(to be calculated in Sec. [sent-111, score-0.447]

43 4), ˆp(i; ΠN) = pˆ(ΠN classifies x at stage i|x ∈ D) = #Si/S, where #Si is the number of examples in D which are classified at stage iby ΠN. [sent-112, score-0.34]

44 By substituting the three empirical probabilities into the global risk function in Eqn. [sent-113, score-0.333]

45 Given parameters Θ = (CFP, CFN, λ), the optimal decision policy Π∗N (Θ) is sought by minimizing the empirical global risk in Eqn. [sent-120, score-1.005]

46 The two-sided thresholds at a stage igenerate the three numbers (#FNi, #FPi, #Si). [sent-136, score-0.228]

47 All the thresholds are fully dependent in the sense that the positive and negative sub-populations observed at a stage iare affected by the thresholds at all previous stages. [sent-137, score-0.406]

48 High-order connections between {τi}’s in ΠN Before presenting the method of minimizing the risk in Eqn. [sent-153, score-0.325]

49 2), which are affected by all the previous stages j (j < i) as early thresholds τj ’s change the populations through their decisions. [sent-160, score-0.329]

50 structure makes the problem of minimizing the risk in Eqn. [sent-164, score-0.263]

51 By using the derived recursions, the empirical risk in Eqn. [sent-169, score-0.31]

52 (5) is divided into two parts: one is the sum of risks caused by individual stages (i. [sent-170, score-0.206]

53 , measured independently), and the other accounts for all the double-counting occurred by removing the oFifg positive (red da ×sh Ned m lines) ac rneda negative (blue ones) training examples (created for the human face AdaBoost classifier in Fig. [sent-172, score-0.227]

54 2 and only keeping the sequential chain connections, we derive an upper bound for the empirical risk which is then minimized by a DP algorithm efficiently. [sent-181, score-0.567]

55 It is an upper bound because certain populations could be counted more than once in the empirical risk. [sent-182, score-0.314]

56 We also show the tightness of the upper bound empirically (see Fig. [sent-183, score-0.252]

57 Offline learning: computing the risk We first compute two histograms of the accumulative ×× scores fi(x) at each stage ion D+ and D−respectively. [sent-189, score-0.536]

58 At stage i, the set of accumulative scores is denoted by A(i) = {fi(x) ;∀x ∈ D}. [sent-190, score-0.276]

59 The trajectories of examples over the sequence of accumulative scores remain relatively steady, as illustrated in Fig. [sent-200, score-0.213]

60 In other words, if τi−1 classified x based on fi−1 (x), its scores fi(x) in the next steps unlikely jump inside [τi− , τi+), thus the policy will not regret its decisions. [sent-203, score-0.522]

61 4) which lead to the recursions of counting (#FNi, #FPi, #Si) in the empirical global risk in Eqn. [sent-205, score-0.368]

62 Furthermore, we use the notation of a threshold itself 775566 ×× (such as τi+ or τi−) in the decision policy to denote its bin index in the K N matrix without confusion. [sent-210, score-0.714]

63 We denote the sub-population of positive examples falling into bin k at stage iby, D+(ki) = {x : bin(x, i) = k, x ∈ D+}, (9) and its cardinality by S+ (ki) = |D+ (ki) | (e. [sent-217, score-0.232]

64 Denote the entries below the rejection line of Π,i· (·f·r o,τnt part in ΠN) by, Πi= (τ1−,··· ,τi−), and the entries above the acceptance line by, (11) Πi = (τ1+, ··· ,τi+). [sent-231, score-0.206]

65 i (16) i This set of examples will be counted more than once if we compute the empirical risk as the sum of risks caused by individual stages. [sent-253, score-0.43]

66 j=1 (18) The number of examples in D which are classified at stage iby Πi is defined by, |D(τi−) ∪ D(τi+)| − S(τi−; Πi−1)] + [S(τi+) − S(τi+; S(τi) ? [sent-274, score-0.218]

67 i) All the statistics are computed once at the offline stage, and can be used to compute the decision policy online for any given Θ = (CFP, CFN, λ). [sent-276, score-0.743]

68 This figure shows the tightness of the upper bound (Eqn. [sent-278, score-0.252]

69 (5), it is divided into two parts: (i) the sum of risks caused by the assignment of an individual τi, R(τi;Θ) =S1·[λ · Ci· (S(τi−) + S(τi+)) + S+(τi−) · CFN+ S−(τi+) · CFP], (20) and (ii) the sum of risks caused by examples which are double-counted at stage iw. [sent-285, score-0.337]

70 (21) make the problem of minimizing the risk very hard in general. [sent-295, score-0.263]

71 (22), we know that we can obtain an upper bound of the risk by deriving a lower bound for R(τi , Πi−1 ; Θ) (due to the minus operator). [sent-300, score-0.502]

72 2), and only consider the sequential chain connections (black arrows) to obtain the lower bound of R(τi, Πi−1 ; Θ). [sent-302, score-0.247]

73 Proposition 2 (The upper bound of risk): By substituting R(τi, Πi−1 ; Θ) with its lower bound R(τi, τi−1 ; Θ), we obtain an upper bound of R(ΠN; Θ), R? [sent-303, score-0.438]

74 The intuitive idea is that R(τi , τi−1 ; Θ) consider the risk caused by examples double-counted at τi w. [sent-315, score-0.284]

75 5, we show tightness of the upper bound empirically on the human face AdaBoost classifier which is consistent across different settings of the loss of misclassifica- tion (we set CFP ∈ [0. [sent-320, score-0.361]

76 L(eΠt Bi [τi] be the risk of the best assignment to the ifirst ? [sent-331, score-0.234]

77 (ii) The backward step for finding the near-optimal decision policy Π∗N = (τ1∗ , · · · , τN∗), where we first take, τN∗ = arg minτN BN [τN] an,d· t·h·e n,τ trace back τi∗ = Ti+1 [τi∗+1] in the order of decreasing i= N − 1, · · · , 1. [sent-336, score-0.666]

78 Experiments In the experiments, we learn decision policies for AdaBoost, SVM and DPM, and compare with their corresponding popular cascade methods [3, 7, 26]. [sent-339, score-0.472]

79 × Settings: In all the comparison experiments, we learned the decision policies by specifying (α, β), and the corresponding (CFP, CFN) are searched. [sent-340, score-0.315]

80 4t12s6yo075P9neurtWpinfoms the cascade methods in speed sigificantly on both the classification testing dataset and the CMU+MIT accuracy performance. [sent-354, score-0.209]

81 For example, in detection, the two cascade methods and our decision policy obtain similar APs, 0. [sent-357, score-0.849]

82 809 respectively, but our decision policy saves about 10 haar feature evaluations per sliding window on average. [sent-360, score-0.712]

83 Decision Policy for AdaBoost Classifier In this experiment, we learn the decision policy for the AdaBoost classifier trained for human face detection. [sent-363, score-0.776]

84 We compare with the original “hard” cascade method [26] and the soft cascade method [3] which are the two of the most popular cascade methods for face detection. [sent-364, score-0.619]

85 We use the same 4 types ofHaar features used in [26], and adopt the decision stump as the weak classifier. [sent-371, score-0.256]

86 So, the computational cost for each weak classifier is the same and the computational cost for each stage is proportional to the number of weak classifiers included in that stage. [sent-373, score-0.456]

87 We train the cascade consisting of 20 stages for both the “hard” and soft cascade using the publicly available OpenCV package. [sent-375, score-0.512]

88 The trained “hard” cascade consists of 2497 boosted weak classifiers in total and the soft cascade consists of 2135 boosted weak classifiers (note that the soft cascade needs an additional validation dataset to tune the thresholds, and we use 5000 positives and 40,000 negative examples). [sent-376, score-0.95]

89 To learn the decision policy, we first divide the 1146 boosted weak classifier into 20 subsets (as is shown in the bottom of Fig. [sent-379, score-0.339]

90 Note that the configuration of the decision policy is not particularly chosen, but can be very flexible for different situations (which can not be easily specified in training the cascade). [sent-381, score-0.692]

91 Overall, the decision policy outperforms the two cascade methods in speed ×× × with similar accuracy performance. [sent-384, score-0.875]

92 Decision policies for SVM classifiers and DPMs We learn decision policies for linear SVM classifier and DPM trained on INRIA person dataset [6]. [sent-387, score-0.472]

93 We compare with the probably approximate admissible (PAA) threshold [7], which is used in the cascade of DPM and selects the rejection thresholds based on τi− = min{x;f(x)≥t1 ,x∈D+} fi(x), where t1is predefined to focus on high-scoring positive examples. [sent-388, score-0.42]

94 So, we h daavteas e15t,0 r ecseull sti ning t ointa 1l5 5w ×hic 5h × are wreeoigrdhetr epdar abamseetde on tShoe, statistical power/computational cost ratio [2], and then organized into 20 subsets with the size of the first 19 stages being 7 cells and the last stage being 17 cells. [sent-392, score-0.337]

95 So, the computational cost of a stage is proportional to the number of cells. [sent-395, score-0.203]

96 Our decision policy outperforms the PAA method in speed in both clas775599 Table2. [sent-407, score-0.692]

97 Conclusion This paper presents a framework of learning cost- sensitive decision policy for popular object detectors with additive scoring functions (such as AdaBoost, SVM and DPM). [sent-420, score-0.855]

98 The decision policy explicitly accounts for the loss of misclassification and the cost of computation. [sent-421, score-0.814]

99 The learning is formulated under the risk minimization framework and the upper bound of the risk function is solved by an efficient DP algorithm. [sent-422, score-0.638]

100 By comparing with the state-of-art cascade methods, our decision policy outperforms them in speed significantly with similar accuracy in experiments. [sent-423, score-0.875]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('policy', 0.461), ('cfn', 0.387), ('cfp', 0.387), ('risk', 0.234), ('decision', 0.205), ('cascade', 0.183), ('adaboost', 0.144), ('accumulative', 0.123), ('stage', 0.122), ('stages', 0.111), ('dp', 0.108), ('fni', 0.106), ('thresholds', 0.106), ('fn', 0.099), ('bound', 0.098), ('scoring', 0.096), ('rejection', 0.094), ('fpi', 0.094), ('fp', 0.084), ('policies', 0.084), ('tightness', 0.082), ('empirical', 0.076), ('dpm', 0.075), ('upper', 0.072), ('fns', 0.07), ('paa', 0.07), ('risks', 0.07), ('additive', 0.07), ('populations', 0.068), ('acceptance', 0.066), ('sequential', 0.065), ('ki', 0.063), ('connections', 0.062), ('fnr', 0.058), ('weak', 0.051), ('cost', 0.051), ('fpr', 0.048), ('bin', 0.048), ('offline', 0.047), ('misclassification', 0.046), ('classifier', 0.046), ('early', 0.044), ('positives', 0.043), ('ci', 0.042), ('iby', 0.041), ('fps', 0.041), ('fi', 0.04), ('ii', 0.037), ('boosted', 0.037), ('bi', 0.037), ('positive', 0.037), ('svm', 0.036), ('recursions', 0.035), ('soft', 0.035), ('face', 0.035), ('negative', 0.035), ('trajectories', 0.034), ('definition', 0.032), ('rejections', 0.031), ('waldboost', 0.031), ('scores', 0.031), ('si', 0.031), ('computational', 0.03), ('online', 0.03), ('classified', 0.03), ('trained', 0.029), ('minimizing', 0.029), ('false', 0.029), ('loss', 0.028), ('gt', 0.028), ('count', 0.028), ('reachable', 0.027), ('negatives', 0.027), ('recursive', 0.027), ('statistical', 0.027), ('transition', 0.026), ('speed', 0.026), ('learned', 0.026), ('cells', 0.026), ('histograms', 0.026), ('training', 0.026), ('caused', 0.025), ('examples', 0.025), ('rejects', 0.025), ('execute', 0.025), ('sliding', 0.024), ('simplified', 0.024), ('feat', 0.024), ('detection', 0.024), ('classifiers', 0.024), ('counting', 0.023), ('detectors', 0.023), ('deformable', 0.023), ('accounts', 0.023), ('probabilities', 0.023), ('accepts', 0.023), ('eex', 0.023), ('entries', 0.023), ('chain', 0.022), ('haar', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Author: Tianfu Wu, Song-Chun Zhu

2 0.21703985 44 iccv-2013-Adapting Classification Cascades to New Domains

Author: Vidit Jain, Sachin Sudhakar Farfade

Abstract: Classification cascades have been very effective for object detection. Such a cascade fails to perform well in data domains with variations in appearances that may not be captured in the training examples. This limited generalization severely restricts the domains for which they can be used effectively. A common approach to address this limitation is to train a new cascade of classifiers from scratch for each of the new domains. Building separate detectors for each of the different domains requires huge annotation and computational effort, making it not scalable to a large number of data domains. Here we present an algorithm for quickly adapting a pre-trained cascade of classifiers using a small number oflabeledpositive instancesfrom a different yet similar data domain. In our experiments with images of human babies and human-like characters from movies, we demonstrate that the adapted cascade significantly outperforms both of the original cascade and the one trained from scratch using the given training examples. –

3 0.12543465 448 iccv-2013-Weakly Supervised Learning of Image Partitioning Using Decision Trees with Structured Split Criteria

Author: Christoph Straehle, Ullrich Koethe, Fred A. Hamprecht

Abstract: We propose a scheme that allows to partition an image into a previously unknown number of segments, using only minimal supervision in terms of a few must-link and cannotlink annotations. We make no use of regional data terms, learning instead what constitutes a likely boundary between segments. Since boundaries are only implicitly specified through cannot-link constraints, this is a hard and nonconvex latent variable problem. We address this problem in a greedy fashion using a randomized decision tree on features associated with interpixel edges. We use a structured purity criterion during tree construction and also show how a backtracking strategy can be used to prevent the greedy search from ending up in poor local optima. The proposed strategy is compared with prior art on natural images.

4 0.10443181 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection

Author: Xingyu Zeng, Wanli Ouyang, Xiaogang Wang

Abstract: Cascaded classifiers1 have been widely used in pedestrian detection and achieved great success. These classifiers are trained sequentially without joint optimization. In this paper, we propose a new deep model that can jointly train multi-stage classifiers through several stages of backpropagation. It keeps the score map output by a classifier within a local region and uses it as contextual information to support the decision at the next stage. Through a specific design of the training strategy, this deep architecture is able to simulate the cascaded classifiers by mining hard samples to train the network stage-by-stage. Each classifier handles samples at a different difficulty level. Unsupervised pre-training and specifically designed stage-wise supervised training are used to regularize the optimization problem. Both theoretical analysis and experimental results show that the training strategy helps to avoid overfitting. Experimental results on three datasets (Caltech, ETH and TUD-Brussels) show that our approach outperforms the state-of-the-art approaches.

5 0.097620502 190 iccv-2013-Handling Occlusions with Franken-Classifiers

Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool

Abstract: Detecting partially occluded pedestrians is challenging. A common practice to maximize detection quality is to train a set of occlusion-specific classifiers, each for a certain amount and type of occlusion. Since training classifiers is expensive, only a handful are typically trained. We show that by using many occlusion-specific classifiers, we outperform previous approaches on three pedestrian datasets; INRIA, ETH, and Caltech USA. We present a new approach to train such classifiers. By reusing computations among different training stages, 16 occlusion-specific classifiers can be trained at only one tenth the cost of one full training. We show that also test time cost grows sub-linearly.

6 0.096705928 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve

7 0.079388864 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies

8 0.076955289 130 iccv-2013-Dynamic Structured Model Selection

9 0.074347258 426 iccv-2013-Training Deformable Part Models with Decorrelated Features

10 0.07426744 157 iccv-2013-Fast Face Detector Training Using Tailored Views

11 0.072200663 121 iccv-2013-Discriminatively Trained Templates for 3D Object Detection: A Real Time Scalable Approach

12 0.070153907 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

13 0.069097795 338 iccv-2013-Randomized Ensemble Tracking

14 0.068752356 404 iccv-2013-Structured Forests for Fast Edge Detection

15 0.06835077 274 iccv-2013-Monte Carlo Tree Search for Scheduling Activity Recognition

16 0.065400518 390 iccv-2013-Shufflets: Shared Mid-level Parts for Fast Object Detection

17 0.061229389 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification

18 0.059958413 163 iccv-2013-Feature Weighting via Optimal Thresholding for Video Analysis

19 0.059179127 107 iccv-2013-Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction

20 0.057536483 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.14), (1, 0.041), (2, -0.021), (3, -0.045), (4, 0.04), (5, -0.011), (6, 0.015), (7, 0.065), (8, -0.023), (9, -0.05), (10, -0.016), (11, -0.028), (12, -0.003), (13, -0.038), (14, 0.054), (15, 0.008), (16, -0.027), (17, 0.052), (18, 0.027), (19, 0.064), (20, -0.077), (21, -0.008), (22, -0.049), (23, 0.065), (24, -0.057), (25, -0.005), (26, -0.046), (27, -0.002), (28, 0.052), (29, -0.089), (30, -0.024), (31, 0.068), (32, -0.067), (33, 0.032), (34, 0.002), (35, -0.08), (36, -0.055), (37, 0.002), (38, 0.003), (39, -0.049), (40, 0.027), (41, 0.079), (42, 0.019), (43, 0.093), (44, -0.037), (45, -0.021), (46, -0.055), (47, -0.057), (48, -0.063), (49, 0.047)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94350189 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Author: Tianfu Wu, Song-Chun Zhu

2 0.81435382 136 iccv-2013-Efficient Pedestrian Detection by Directly Optimizing the Partial Area under the ROC Curve

Author: Sakrapee Paisitkriangkrai, Chunhua Shen, Anton Van Den Hengel

Abstract: Many typical applications of object detection operate within a prescribed false-positive range. In this situation the performance of a detector should be assessed on the basis of the area under the ROC curve over that range, rather than over the full curve, as the performance outside the range is irrelevant. This measure is labelled as the partial area under the ROC curve (pAUC). Effective cascade-based classification, for example, depends on training node classifiers that achieve the maximal detection rate at a moderate false positive rate, e.g., around 40% to 50%. We propose a novel ensemble learning method which achieves a maximal detection rate at a user-defined range of false positive rates by directly optimizing the partial AUC using structured learning. By optimizing for different ranges of false positive rates, the proposed method can be used to train either a single strong classifier or a node classifier forming part of a cascade classifier. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our approach, and we show that it is possible to train state-of-the-art pedestrian detectors using the pro- posed structured ensemble learning method.

3 0.73101807 336 iccv-2013-Random Forests of Local Experts for Pedestrian Detection

Author: Javier Marín, David Vázquez, Antonio M. López, Jaume Amores, Bastian Leibe

Abstract: Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.

4 0.71182603 44 iccv-2013-Adapting Classification Cascades to New Domains

Author: Vidit Jain, Sachin Sudhakar Farfade

5 0.68703967 190 iccv-2013-Handling Occlusions with Franken-Classifiers

Author: Markus Mathias, Rodrigo Benenson, Radu Timofte, Luc Van_Gool

6 0.67769849 211 iccv-2013-Image Segmentation with Cascaded Hierarchical Models and Logistic Disjunctive Normal Networks

7 0.64151704 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition

8 0.63174146 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection

9 0.59673887 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation

10 0.591241 404 iccv-2013-Structured Forests for Fast Edge Detection

11 0.58255047 75 iccv-2013-CoDeL: A Human Co-detection and Labeling Framework

12 0.57076514 165 iccv-2013-Find the Best Path: An Efficient and Accurate Classifier for Image Hierarchies

13 0.5644713 121 iccv-2013-Discriminatively Trained Templates for 3D Object Detection: A Real Time Scalable Approach

14 0.55053359 426 iccv-2013-Training Deformable Part Models with Decorrelated Features

15 0.54716295 47 iccv-2013-Alternating Regression Forests for Object Detection and Pose Estimation

16 0.53998011 338 iccv-2013-Randomized Ensemble Tracking

17 0.53789008 390 iccv-2013-Shufflets: Shared Mid-level Parts for Fast Object Detection

18 0.5333634 352 iccv-2013-Revisiting Example Dependent Cost-Sensitive Learning with Decision Trees

19 0.51499194 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification

20 0.51378399 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.085), (7, 0.023), (12, 0.025), (26, 0.101), (27, 0.01), (31, 0.039), (42, 0.134), (48, 0.011), (64, 0.059), (73, 0.032), (84, 0.215), (89, 0.144)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.91363847 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology

Author: Hang Chang, Yin Zhou, Paul Spellman, Bahram Parvin

Abstract: Image-based classification ofhistology sections, in terms of distinct components (e.g., tumor, stroma, normal), provides a series of indices for tumor composition. Furthermore, aggregation of these indices, from each whole slide image (WSI) in a large cohort, can provide predictive models of the clinical outcome. However, performance of the existing techniques is hindered as a result of large technical variations and biological heterogeneities that are always present in a large cohort. We propose a system that automatically learns a series of basis functions for representing the underlying spatial distribution using stacked predictive sparse decomposition (PSD). The learned representation is then fed into the spatial pyramid matching framework (SPM) with a linear SVM classifier. The system has been evaluated for classification of (a) distinct histological components for two cohorts of tumor types, and (b) colony organization of normal and malignant cell lines in 3D cell culture models. Throughput has been increased through the utility of graphical processing unit (GPU), and evalu- ation indicates a superior performance results, compared with previous research.

same-paper 2 0.84227216 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Author: Tianfu Wu, Song-Chun Zhu

3 0.81739295 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

Author: Huiying Liu, Dong Xu, Qingming Huang, Wen Li, Min Xu, Stephen Lin

Abstract: We present a method for estimating human scanpaths, which are sequences of gaze shifts that follow visual attention over an image. In this work, scanpaths are modeled based on three principal factors that influence human attention, namely low-levelfeature saliency, spatialposition, and semantic content. Low-level feature saliency is formulated as transition probabilities between different image regions based on feature differences. The effect of spatial position on gaze shifts is modeled as a Levy flight with the shifts following a 2D Cauchy distribution. To account for semantic content, we propose to use a Hidden Markov Model (HMM) with a Bag-of-Visual-Words descriptor of image regions. An HMM is well-suited for this purpose in that 1) the hidden states, obtained by unsupervised learning, can represent latent semantic concepts, 2) the prior distribution of the hidden states describes visual attraction to the semantic concepts, and 3) the transition probabilities represent human gaze shift patterns. The proposed method is applied to task-driven viewing processes. Experiments and analysis performed on human eye gaze data verify the effectiveness of this method.

4 0.77926838 60 iccv-2013-Bayesian Robust Matrix Factorization for Image and Video Processing

Author: Naiyan Wang, Dit-Yan Yeung

Abstract: Matrix factorization is a fundamental problem that is often encountered in many computer vision and machine learning tasks. In recent years, enhancing the robustness of matrix factorization methods has attracted much attention in the research community. To benefit from the strengths of full Bayesian treatment over point estimation, we propose here a full Bayesian approach to robust matrix factorization. For the generative process, the model parameters have conjugate priors and the likelihood (or noise model) takes the form of a Laplace mixture. For Bayesian inference, we devise an efficient sampling algorithm by exploiting a hierarchical view of the Laplace distribution. Besides the basic model, we also propose an extension which assumes that the outliers exhibit spatial or temporal proximity as encountered in many computer vision applications. The proposed methods give competitive experimental results when compared with several state-of-the-art methods on some benchmark image and video processing tasks.

5 0.77630889 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms

Author: Yu Pang, Haibin Ling

Abstract: Evaluating visual tracking algorithms, or “trackers ” for short, is of great importance in computer vision. However, it is hard to “fairly” compare trackers due to many parameters need to be tuned in the experimental configurations. On the other hand, when introducing a new tracker, a recent trend is to validate it by comparing it with several existing ones. Such an evaluation may have subjective biases towards the new tracker which typically performs the best. This is mainly due to the difficulty to optimally tune all its competitors and sometimes the selected testing sequences. By contrast, little subjective bias exists towards the “second best” ones1 in the contest. This observation inspires us with a novel perspective towards inhibiting subjective bias in evaluating trackers by analyzing the results between the second bests. In particular, we first collect all tracking papers published in major computer vision venues in recent years. From these papers, after filtering out potential biases in various aspects, we create a dataset containing many records of comparison results between various visual trackers. Using these records, we derive performance rank- ings of the involved trackers by four different methods. The first two methods model the dataset as a graph and then derive the rankings over the graph, one by a rank aggregation algorithm and the other by a PageRank-like solution. The other two methods take the records as generated from sports contests and adopt widely used Elo’s and Glicko ’s rating systems to derive the rankings. The experimental results are presented and may serve as a reference for related research.

6 0.75337124 133 iccv-2013-Efficient Hand Pose Estimation from a Single Depth Image

7 0.74774545 219 iccv-2013-Internet Based Morphable Model

8 0.74313992 180 iccv-2013-From Where and How to What We See

9 0.74279082 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation

10 0.73911059 253 iccv-2013-Linear Sequence Discriminant Analysis: A Model-Based Dimensionality Reduction Method for Vector Sequences

11 0.73731792 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation

12 0.73712659 44 iccv-2013-Adapting Classification Cascades to New Domains

13 0.73580909 150 iccv-2013-Exemplar Cut

14 0.73502398 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration

15 0.73435175 330 iccv-2013-Proportion Priors for Image Sequence Segmentation

16 0.73402309 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization

17 0.73375726 80 iccv-2013-Collaborative Active Learning of a Kernel Machine Ensemble for Recognition

18 0.73342133 277 iccv-2013-Multi-channel Correlation Filters

19 0.73243839 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps

20 0.73170376 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling