
254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection


Source: pdf

Author: Jianguo Li, Yimin Zhang

Abstract: This paper presents a novel learning framework for training boosting-cascade-based object detectors from large-scale datasets. The framework is derived from the well-known Viola-Jones (VJ) framework but is distinguished by three key differences. First, the proposed framework adopts multi-dimensional SURF features instead of single-dimensional Haar features to describe local patches. In this way, the number of local patches used can be reduced from hundreds of thousands to several hundred. Second, it adopts logistic regression as the weak classifier for each local patch instead of decision trees in the VJ framework. Third, we adopt AUC as a single criterion for the convergence test during cascade training rather than the two trade-off criteria (false-positive-rate and hit-rate) in the VJ framework. The benefit is that the false-positive-rate can be adaptive among different cascade stages, which yields much faster convergence of the SURF cascade. Combining these points, the proposed approach has three good properties. First, the boosting cascade can be trained very efficiently. Experiments show that the proposed approach can train object detectors from billions of negative samples within one hour even on personal computers. Second, the resulting detector is comparable to the state-of-the-art algorithm not only in accuracy but also in processing speed. Third, the resulting detector is small in model size because the cascade has few stages.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Learning SURF Cascade for Fast and Accurate Object Detection. Jianguo Li, Yimin Zhang, Intel Labs China. Abstract: This paper presents a novel learning framework for training boosting-cascade-based object detectors from large-scale datasets. [sent-1, score-0.902]

2 Second, it adopts logistic regression as weak classifier for each local patch instead of decision trees in the VJ framework. [sent-5, score-0.272]

3 Third, we adopt AUC as a single criterion for the convergence test during cascade training rather than the two trade-off criteria (false-positive-rate and hit-rate) in the VJ framework. [sent-6, score-0.777]

4 The benefit is that the false-positive-rate can be adaptive among different cascade stages, and thus yields much faster convergence speed of SURF cascade. [sent-7, score-0.698]

5 First, the boosting cascade can be trained very efficiently. [sent-9, score-0.666]

6 Experiments show that the proposed approach can train object detectors from billions of negative samples within one hour even on personal computers. [sent-10, score-0.413]

7 Third, the built detector is small in model-size due to short cascade stages. [sent-12, score-0.754]

8 As is known, training is usually required to reach a very low false-positive rate per scan-window (FPPW), such as 10^-6 [37], which means that hundreds of millions or even billions of negative samples must be processed during training. [sent-20, score-0.453]

9 Therefore, training an object detector is a very time-consuming task. [sent-21, score-0.261]

10 As some fine-tuning of parameters is usually required based on training experiments, the long training time makes for a very painful experience for researchers in the field of object detection. [sent-25, score-0.258]

11 Some may argue that only the detection speed matters, since training needs to run only once. [sent-26, score-0.206]

12 Besides the big data problem, another important factor is the convergence speed of the cascade training. [sent-32, score-0.666]

13 To the best of our knowledge, almost all existing cascade detection frameworks are trained based on two conflicting criteria (false-positive-rate and hit-rate) for the detection-error tradeoff. [sent-33, score-0.818]

14 Although some studies introduced intermediate or post tuning of cascade parameters with optimization methods [40, 3, 31, 39], they did not address the convergence speed of cascade training. [sent-34, score-1.3]

15 As a result, the final cascade usually has very long stages. [sent-35, score-0.59]

16 This paper proposes a new cascade learning framework for object detection, with an emphasis on training efficiency. [sent-36, score-0.687]

17 First, the detection window is represented by local SURF patches, which are spatial regions within the window and described by the multi-dimensional SURF descriptor [2]. [sent-37, score-0.201]

18 Third, we adopt AUC (area under the ROC curve) as a single criterion for the convergence test during cascade training instead of the two conflicting criteria (false-positive-rate and hit-rate). [sent-42, score-0.854]
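
To make the AUC criterion concrete, the following is a minimal sketch of how an area under the ROC curve could be computed from classifier scores, using the pairwise (Mann-Whitney) formulation. The function name and the sample scores are hypothetical and chosen for illustration; the paper's own implementation is not shown in this summary.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) statistic.

    Returns the fraction of (positive, negative) pairs in which the positive
    sample scores higher than the negative one; ties count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical stage scores, for illustration only.
pos = [0.9, 0.8, 0.7, 0.4]
neg = [0.6, 0.3, 0.2, 0.1]
stage_auc = auc(pos, neg)
```

A single number like this can be tracked while weak classifiers are added to a stage, instead of watching false-positive-rate and hit-rate separately.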

19 The training of SURF cascade converges much faster and generates much shorter cascade stages. [sent-44, score-1.315]

20 Experiments show that the proposed approach can build highly accurate detectors by processing billions of samples within one hour even on personal computers. [sent-45, score-0.324]

21 (2) We propose AUC as a single criterion for cascade training, which makes the training converge faster and yields a cascade model with much shorter stages. [sent-47, score-1.38]

22 (3) We show a system that can train cascade object detectors from billions of samples within one hour even on PCs. [sent-48, score-0.915]

23 Viola-Jones Framework Revisited The boosting cascade framework by Viola and Jones is a milestone work in object detection [36]. [sent-53, score-0.792]

24 Basically, there are three key ideas that make it able to build real-time object detectors: the integral image trick for efficient Haar feature extraction, the boosting algorithm for ensemble weak classifiers, and the attentional cascade structure for fast negative rejection. [sent-56, score-0.92]
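
The integral image trick mentioned here can be sketched in a few lines. This is a generic summed-area table in plain Python, not code from the paper; the helper names `integral_image` and `box_sum` are illustrative assumptions.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row/column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # sum of everything above plus the running sum of this row
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
```

Once the table is built, any rectangular sum costs four lookups regardless of the rectangle size, which is what makes box-based features cheap to evaluate at every scan window.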

25 It is closely related to HoG features in the cascade HoG framework [42]. [sent-70, score-0.59]

26 Third, the attentional cascade is trained based on two conflicting criteria: false-positive rate (FPR) f_i and hit-rate (or detection-rate) d_i. [sent-72, score-0.745]

27 The overall FPR of a T-stage cascade is F = ∏_{i=1}^{T} f_i, while the overall hit-rate is D = ∏_{i=1}^{T} d_i. [sent-73, score-0.59]
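
As a small worked example of these product relations, the sketch below multiplies per-stage rates into overall rates. The per-stage values (0.5 and 0.995 over 20 stages) are illustrative assumptions, not figures taken from this paper.

```python
def overall_rates(stage_fprs, stage_hit_rates):
    """Overall FPR and hit-rate of a cascade as products of per-stage rates."""
    F, D = 1.0, 1.0
    for f in stage_fprs:
        F *= f
    for d in stage_hit_rates:
        D *= d
    return F, D


# Illustrative assumption: 20 stages, each with f_i = 0.5 and d_i = 0.995.
T = 20
F, D = overall_rates([0.5] * T, [0.995] * T)
```

With these assumed values F is about 9.5e-7 and D is about 0.90, which shows why a fixed per-stage FPR forces many stages to reach an overall FPR near 10^-6.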

28 It is better if f_i can be adaptive among different stages, so that the overall training goal can be reached easily. [sent-80, score-0.226]

29 Inspired by [22], we introduced AUC [8] as a single criterion for cascade convergence testing. [sent-83, score-0.633]

30 This realizes an adaptive FPR among different stages, and yields fast convergence and a cascade model with much shorter stages. [sent-84, score-0.747]

31 Later works trained mixtures of deformable part models even under the cascade framework [10, 11]. [sent-87, score-0.641]

32 SURF Cascade for Object Detection The proposed approach contains four ingredients: SURF features for local patch description, logistic regression based weak classifier for each patch, boosting ensemble of weak classifiers for each stage, and AUC-based cascade learning algorithm. [sent-92, score-1.051]
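
A logistic-regression weak classifier on one patch can be sketched as below. The descriptor values, weights and bias are made-up placeholders, and the function name is hypothetical; the sketch only shows the general form of mapping a descriptor vector to a probability.

```python
import math


def logistic_weak_classifier(x, w, b):
    """Probability that descriptor x belongs to the target class.

    Computes sigmoid(w . x + b) in a numerically stable way.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical 4-dimensional descriptor and parameters, for illustration only.
x = [0.2, 0.1, 0.4, 0.3]
w = [1.0, -2.0, 0.5, 3.0]
b = -0.5
p = logistic_weak_classifier(x, w, b)
```

Because each weak classifier outputs a probability in (0, 1), the outputs of different patches are on a comparable scale when they are combined by boosting.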

33 For instance, given a 40×40 detection template, we define patches with 4 spatial cells, and allow the patch size to range from 12×12 pixels to 40×40 pixels. [sent-97, score-0.226]

34 We further allow different aspect ratios for each patch (the ratio of patch width to patch height). [sent-99, score-0.204]
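
The patch pool described above can be enumerated with a short loop. In this sketch the 4-pixel step and the set of allowed aspect ratios are assumptions made for illustration, since this summary does not state them; only the 40×40 template and the 12×12 minimum size come from the text.

```python
def enumerate_patches(win_w=40, win_h=40, min_size=12, step=4,
                      aspect_ratios=((1, 1), (1, 2), (2, 1))):
    """List candidate patches (x, y, w, h) inside a detection template.

    step and aspect_ratios are illustrative assumptions, not values
    confirmed by the source.
    """
    patches = []
    for w in range(min_size, win_w + 1, step):
        for h in range(min_size, win_h + 1, step):
            # keep only shapes whose width:height matches an allowed ratio
            if not any(w * rh == h * rw for rw, rh in aspect_ratios):
                continue
            for y in range(0, win_h - h + 1, step):
                for x in range(0, win_w - w + 1, step):
                    patches.append((x, y, w, h))
    return patches


patches = enumerate_patches()
```

Under these assumptions the pool holds a few hundred patches, far fewer than the exhaustive rectangle combinations used for Haar features.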

35 This paper derives some variant of SURF descriptor for object detection, and does not use the keypoint detector part at all. [sent-108, score-0.212]

36 The corresponding threshold at that point is the desired threshold. After one stage has converged, we continue to train another stage with false-positive samples obtained by scanning non-target images with the partially trained cascade. [sent-121, score-0.248]
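
One way to picture reading a stage threshold off the ROC curve is sketched below: pick the score threshold at which the hit-rate on positives still reaches a minimum value d_min, then measure the FPR that results. The function name, the pass rule (score >= threshold) and the sample scores are assumptions for illustration, not the paper's exact procedure.

```python
import math


def stage_threshold(pos_scores, neg_scores, d_min):
    """Choose the threshold at the positive score of rank ceil(d_min * n).

    A sample passes the stage when score >= threshold (an assumed pass rule).
    Returns (threshold, hit_rate, false_positive_rate).
    """
    ranked = sorted(pos_scores, reverse=True)
    # number of positives that must pass to reach d_min
    need = max(1, math.ceil(d_min * len(ranked)))
    theta = ranked[need - 1]
    hit = sum(s >= theta for s in pos_scores) / len(pos_scores)
    fpr = sum(s >= theta for s in neg_scores) / len(neg_scores)
    return theta, hit, fpr


# Hypothetical scores, for illustration only.
theta, hit, fpr = stage_threshold([0.9, 0.8, 0.7, 0.4],
                                  [0.75, 0.3, 0.2, 0.1], d_min=0.75)
```

The FPR obtained this way differs from stage to stage, which is the sense in which the per-stage FPR becomes adaptive rather than fixed in advance.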

37 The cascade training algorithm is given in Table 2. [sent-123, score-0.664]

38 For instance, we got an 8-stage cascade for face detection with f_i at each stage forming a vector like (0. [sent-126, score-0.9]

39 This means that AUC based cascade training can converge much faster. [sent-137, score-0.686]

40 As a byproduct, this not only makes the model much smaller (for instance, the model size of the 8-stage cascade is only 50KB), but also increases the detection speed considerably. [sent-138, score-0.722]

41 Training SURF cascade based on ROC analysis. Input: overall FPR F_t; minimum hit-rate per stage d_min; positive sample set X+; negative sample set X−. [sent-141, score-0.662]

42 If F_{i+1} > F_t, use the current cascaded detector to scan non-target images with a sliding window and put false-positive samples into X− until |X−| = |X+|. [sent-152, score-0.263]

43 Output the boosting cascade detector {H_i(x) > θ_i} and the overall training accuracy F and D. [sent-153, score-0.879]

44 Experiments We applied the proposed approach to face detection and car detection. [sent-155, score-0.299]

45 Note that the proposed approach is not limited to face detection and car detection. [sent-157, score-0.299]

46 In particular, we have made a high-quality face detection SDK freely available to the public (see footnote 3). [sent-159, score-0.238]

47 For the cascade training part, we parallelized the time-consuming weak classifier training step with OpenMP at the task level. [sent-165, score-0.866]

48 The training and detection experiments were done on a personal workstation with 3. [sent-170, score-0.209]

49 Footnote 3: Face detection and tracking based on SURF cascade are integrated in the Intel Perceptual Computing SDK, which is available at http : / /www . [sent-172, score-0.68]

50 Positive samples of frontal faces are mainly from the GENKI dataset [35], the facetracer dataset [17], the FERET dataset [29], etc. [sent-183, score-0.248]

51 We further derived 15,000 faces with mirror transform, and 15,000 samples by random perspective transformation of face images within [-10, 10] degrees. [sent-186, score-0.304]

52 We set the maximum number of weak classifiers in each stage to 128. [sent-194, score-0.214]

53 To obtain a fast detector, we required that the first 3 stages have at most 4 weak classifiers. [sent-195, score-0.239]

54 In comparison, we tried to train a face detector using the OpenCV Haar training modules on the same dataset. [sent-203, score-0.438]

55 This means that SURF cascade is at least 60X faster than Haar cascade in training. [sent-205, score-1.212]

56 Besides, we tried to replace AUC based criterion with VJ’s criteria to control SURF cascade training, which requires more than 5 hours to converge. [sent-206, score-0.669]

57 Figures 1(a) and 1(b) illustrate details of the final cascade, including the number of weak learners in each stage and the accumulated rejection rate over the cascade stages. [sent-208, score-0.871]

58 They show that the first three stages reject 98% of negative samples with only 7 weak classifiers. [sent-209, score-0.343]

59 The cascade detector contains 334 weak classifiers, and only needs to evaluate 1. [sent-210, score-0.856]

60 In contrast, the default Haar-based face detector in OpenCV contains more than 24 stages and 2,912 weak classifiers, and requires evaluating more than 28 Haar features per window [21]. [sent-212, score-0.566]

61 The final detector contains a common post-processing step, which merges cascade outputs using the disjoint-set algorithm and filters unreliable results using the non-maximum suppression algorithm. [sent-224, score-0.754]
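
The disjoint-set merging step can be sketched as follows. This is a generic union-find grouping of overlapping boxes, not the paper's code; the overlap threshold of 0.5, the use of intersection-over-union as the grouping test, and the function names are assumptions for illustration. The non-maximum suppression filter is not shown.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def merge_detections(boxes, overlap=0.5):
    """Group overlapping boxes with a disjoint-set and average each group."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if iou(boxes[i], boxes[j]) >= overlap:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # one averaged box per group, plus the group size as a crude support count
    return [(tuple(sum(c) / len(g) for c in zip(*g)), len(g))
            for g in groups.values()]


# Hypothetical raw detections: two overlapping boxes and one isolated box.
merged = merge_detections([(10, 10, 40, 40), (12, 11, 40, 40),
                           (200, 200, 40, 40)])
```

A later filtering step could, for example, drop groups whose support count is small, since isolated single detections are more likely to be false positives.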

62 Top-3 local patches picked by training procedure in the red-green-blue order (a) on the face object (b) on the car object. [sent-226, score-0.395]

63 2 Multi-view Face Detector Besides frontal face detectors, we also trained a multi-view face detector using the proposed approach. [sent-229, score-0.59]

64 The frontal view detector is the same as the previous frontal detector. [sent-231, score-0.294]

65 The detector training for each view follows the same procedure as the training of frontal view detector. [sent-234, score-0.467]

66 3 Face Detection Evaluation We evaluated SURF cascade detector on two public datasets: one is the CMU+MIT dataset, the other is the UMass FDDB dataset [15]. [sent-237, score-0.754]

67 As SURF cascade can directly output a probability score (in the range 0∼1) at any stage, it is natural to define the score for each detection window w as s(w) = p(w) + k(w), where k(w) is the number of passed stages and p(w) is the probability output at the exit stage. [sent-239, score-0.832]
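
The window score defined above can be illustrated with a small sketch. The per-stage probabilities and thresholds are hypothetical inputs, and for simplicity they are precomputed here, whereas a real cascade would evaluate stages lazily and stop at the first rejection.

```python
def window_score(stage_probs, thresholds):
    """s(w) = p(w) + k(w) for one window.

    stage_probs[i] is the (hypothetical) probability output of stage i for
    this window and thresholds[i] is that stage's rejection threshold.
    """
    k = 0      # number of stages passed so far
    p = 0.0    # probability output at the exit stage
    for prob, theta in zip(stage_probs, thresholds):
        p = prob
        if prob <= theta:   # rejected: the window exits at this stage
            break
        k += 1
    return p + k


# Hypothetical 3-stage cascade with a threshold of 0.5 at every stage.
rejected_at_third = window_score([0.9, 0.8, 0.3], [0.5, 0.5, 0.5])
passed_all = window_score([0.9, 0.8, 0.7], [0.5, 0.5, 0.5])
```

Since p(w) lies between 0 and 1, a window that survives more stages always outranks one that exits earlier, and p(w) breaks ties among windows exiting at the same stage.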

68 Comparable results are depicted for some recent works in face detection such as the VJ detector [37], polygon-feature detector [28], soft cascade detector [3] and recycling-cascade detector [4]. [sent-241, score-1.484]

69 Figure 3(a) shows that SURF cascade achieves comparable performance to the state-of-the-art soft-cascade method [3], while outperforming all the other methods. [sent-242, score-0.59]

70 Hence, the UMass face detection benchmark (FDDB) is introduced [15]. [sent-244, score-0.238]

71 Besides, it provides a systematic protocol to evaluate performance of face detection system. [sent-246, score-0.238]

72 Figure 3(b) shows the discrete-score ROC curve generated by SURF cascade in comparison to available results on the benchmark [33, 21, 25, 16]. [sent-247, score-0.63]

73 It is obvious that SURF cascade outperforms others significantly, and Gentle AdaBoost is better than Discrete AdaBoost for ensemble logit classifiers. [sent-249, score-0.677]

74 Furthermore, our multi-view detector yields significant improvement over pure frontal face detectors. [sent-250, score-0.417]

75 The supplementary material illustrates some examples of face detection results on CMU+MIT and UMass FDDB. [sent-251, score-0.238]

76 4 Detection Speed We ran the face detector on videos to collect performance data. [sent-254, score-0.216]

77 The frontal detector reaches 100 fps (frames per second) for a typical VGA video with a single face in each frame, while the multi-view detector can process this video in real time. [sent-255, score-0.604]

78 In comparison, the OpenCV default face detector can only achieve 60 fps with parallel processing turned on. [sent-256, score-0.361]

79 As is known, the OpenCV face detector is tailored optimized for (a) (b) Figure 3. [sent-257, score-0.312]

80 First, the SURF cascade detector has fewer stages (8 vs 24), fewer weak classifiers (334 vs 2,912) and a lower average number of evaluated weak classifiers per detection window (1. [sent-261, score-1.242]

81 Second, SURF cascade benefits more from optimization than Haar cascade. [sent-264, score-0.59]

82 The 8-stage SURF cascade has better workload balance among threads in parallelization than the 24-stage Haar cascade. [sent-265, score-0.618]

83 Besides, SURF cascade is much easier for SIMD optimization (i. [sent-266, score-0.612]

84 Car Detection For car detection, we collected 600 side view car samples from PASCAL VOC 2005 dataset [6, 1], containing the UIUC subset and ETHZ subset. [sent-271, score-0.203]

85 The negative images are collected similarly to the face detection task. [sent-275, score-0.281]

86 On the target 80×32 detection template, we defined the patch size range from 16×16 to 80×32, and allowed patch aspect ratios like 1:1, 1:2, 2:1, 3:1, 4:1, etc. [sent-277, score-0.226]

87 And we set the same training configuration as face detection. [sent-279, score-0.222]

88 Figures 1(c) and 1(d) illustrate the number of weak classifiers in each stage and the accumulated rejection rate over the cascade stages. [sent-283, score-0.897]

89 They show that the first three stages reject 95% of negative samples with 7 weak classifiers. [sent-284, score-0.343]

90 Conclusions This paper presents SURF cascade for fast and accurate object detectors. [sent-295, score-0.641]

91 Second, we propose AUC as the single criterion for cascade optimization. [sent-298, score-0.633]

92 Third, we show a real example that can train cascade object detector from billions of samples within one hour on personal computers. [sent-299, score-1.067]

93 We compared SURF cascade detector with existing algorithms on detection accuracy and speed. [sent-300, score-0.844]

94 Experiments show that SURF cascade can achieve results on par with state-of-the-art detectors, while beating the specifically optimized OpenCV detector in detection speed. [sent-301, score-0.844]

95 Future work will consider three points: (1) other possible SURF variants to further improve detection accuracy; (2) applying the approach to other object detection tasks such as human detection; (3) combining SURF cascade with deformable part based models. [sent-302, score-0.819]

96 Fddb: A benchmark for face detection in unconstrained settings. [sent-406, score-0.238]

97 Online domain adaptation of a pre-trained cascade of classifiers. [sent-415, score-0.59]

98 Fast training and selection of haar features during statistics in boosting-based face detection. [sent-478, score-0.422]

99 Rapid object detection using a boosted cascade of simple features. [sent-530, score-0.726]

100 Fast human detection using a cascade of histograms of oriented gradients. [sent-565, score-0.68]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('cascade', 0.59), ('surf', 0.452), ('fpr', 0.213), ('haar', 0.2), ('detector', 0.164), ('face', 0.148), ('roc', 0.127), ('stages', 0.109), ('billions', 0.107), ('frontal', 0.105), ('weak', 0.102), ('umass', 0.096), ('fddb', 0.096), ('vj', 0.094), ('detection', 0.09), ('auc', 0.09), ('opencv', 0.086), ('conflicted', 0.077), ('training', 0.074), ('stage', 0.072), ('cmu', 0.07), ('patch', 0.068), ('car', 0.061), ('hour', 0.059), ('logit', 0.057), ('simd', 0.057), ('detectors', 0.057), ('samples', 0.056), ('faces', 0.052), ('boosting', 0.051), ('patches', 0.048), ('feret', 0.048), ('rejection', 0.047), ('personal', 0.045), ('researches', 0.044), ('negative', 0.043), ('criterion', 0.043), ('genki', 0.043), ('painful', 0.043), ('sdk', 0.043), ('reach', 0.043), ('window', 0.043), ('logistic', 0.042), ('speed', 0.042), ('picked', 0.041), ('curve', 0.04), ('classifiers', 0.04), ('milestone', 0.038), ('viola', 0.037), ('accumulated', 0.036), ('criteria', 0.036), ('adaboost', 0.036), ('wtx', 0.035), ('fppw', 0.035), ('facetracer', 0.035), ('cars', 0.035), ('adopts', 0.034), ('convergence', 0.034), ('rejects', 0.033), ('mit', 0.033), ('faster', 0.032), ('hundreds', 0.032), ('vs', 0.032), ('pham', 0.032), ('vga', 0.031), ('speedup', 0.03), ('ensemble', 0.03), ('pie', 0.029), ('modules', 0.029), ('attentional', 0.029), ('shorter', 0.029), ('besides', 0.029), ('brings', 0.029), ('fast', 0.028), ('finish', 0.028), ('threads', 0.028), ('template', 0.027), ('nowadays', 0.026), ('tuned', 0.026), ('classifier', 0.026), ('deformable', 0.026), ('view', 0.025), ('descriptor', 0.025), ('trained', 0.025), ('rate', 0.024), ('transforming', 0.024), ('trick', 0.024), ('mirror', 0.024), ('revisit', 0.024), ('uiuc', 0.024), ('realize', 0.024), ('days', 0.024), ('cascades', 0.024), ('fps', 0.023), ('object', 0.023), ('boosted', 0.023), ('train', 0.023), ('easier', 0.022), ('converge', 0.022), ('sums', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection


2 0.20000911 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu

Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.

3 0.18876024 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

Author: David Weiss, Ben Taskar

Abstract: We propose SCALPEL, a flexible method for object segmentation that integrates rich region-merging cues with mid- and high-level information about object layout, class, and scale into the segmentation process. Unlike competing approaches, SCALPEL uses a cascade of bottom-up segmentation models that is capable of learning to ignore boundaries early on, yet use them as a stopping criterion once the object has been mostly segmented. Furthermore, we show how such cascades can be learned efficiently. When paired with a novel method that generates better localized shape priors than our competitors, our method leads to a concise, accurate set of segmentation proposals; these proposals are more accurate on the PASCAL VOC2010 dataset than state-of-the-art methods that use re-ranking to filter much larger bags of proposals. The code for our algorithm is available online.

4 0.12917289 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation

Author: Raphael Sznitman, Carlos Becker, François Fleuret, Pascal Fua

Abstract: Cascade-style approaches to implementing ensemble classifiers can deliver significant speed-ups at test time. While highly effective, they remain challenging to tune and their overall performance depends on the availability of large validation sets to estimate rejection thresholds. These characteristics are often prohibitive and thus limit their applicability. We introduce an alternative approach to speeding-up classifier evaluation which overcomes these limitations. It involves maintaining a probability estimate of the class label at each intermediary response and stopping when the corresponding uncertainty becomes small enough. As a result, the evaluation terminates early based on the sequence of responses observed. Furthermore, it does so independently of the type of ensemble classifier used or the way it was trained. We show through extensive experimentation that our method provides 2 to 10 fold speed-ups, over existing state-of-the-art methods, at almost no loss in accuracy on a number of object classification tasks.

5 0.12675087 131 cvpr-2013-Discriminative Non-blind Deblurring

Author: Uwe Schmidt, Carsten Rother, Sebastian Nowozin, Jeremy Jancsary, Stefan Roth

Abstract: Non-blind deblurring is an integral component of blind approaches for removing image blur due to camera shake. Even though learning-based deblurring methods exist, they have been limited to the generative case and are computationally expensive. To this date, manually-defined models are thus most widely used, though limiting the attained restoration quality. We address this gap by proposing a discriminative approach for non-blind deblurring. One key challenge is that the blur kernel in use at test time is not known in advance. To address this, we analyze existing approaches that use half-quadratic regularization. From this analysis, we derive a discriminative model cascade for image deblurring. Our cascade model consists of a Gaussian CRF at each stage, based on the recently introduced regression tree fields. We train our model by loss minimization and use synthetically generated blur kernels to generate training data. Our experiments show that the proposed approach is efficient and yields state-of-the-art restoration quality on images corrupted with synthetic and real blur.

6 0.12269121 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method

7 0.11842123 438 cvpr-2013-Towards Pose Robust Face Recognition

8 0.11581483 383 cvpr-2013-Seeking the Strongest Rigid Detector

9 0.10535206 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

10 0.10397553 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification

11 0.10017641 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

12 0.097891271 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

13 0.093862481 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

14 0.093103476 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

15 0.091341347 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

16 0.088955306 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification

17 0.088832289 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild

18 0.088821806 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification

19 0.088695072 277 cvpr-2013-MODEC: Multimodal Decomposable Models for Human Pose Estimation

20 0.087162778 212 cvpr-2013-Image Segmentation by Cascaded Region Agglomeration


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.173), (1, -0.068), (2, -0.024), (3, -0.012), (4, 0.068), (5, 0.035), (6, 0.051), (7, -0.02), (8, 0.117), (9, -0.116), (10, -0.019), (11, -0.065), (12, 0.117), (13, -0.039), (14, 0.02), (15, -0.033), (16, 0.021), (17, -0.052), (18, -0.007), (19, 0.069), (20, -0.036), (21, -0.003), (22, -0.038), (23, 0.025), (24, 0.014), (25, 0.036), (26, -0.068), (27, 0.014), (28, 0.004), (29, 0.006), (30, -0.016), (31, 0.034), (32, 0.043), (33, -0.026), (34, 0.024), (35, 0.069), (36, 0.027), (37, -0.058), (38, -0.054), (39, 0.085), (40, 0.041), (41, -0.001), (42, 0.016), (43, -0.043), (44, -0.029), (45, 0.077), (46, 0.042), (47, -0.029), (48, 0.08), (49, 0.004)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95149404 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection


2 0.72677189 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation

Author: Raphael Sznitman, Carlos Becker, François Fleuret, Pascal Fua

Abstract: Cascade-style approaches to implementing ensemble classifiers can deliver significant speed-ups at test time. While highly effective, they remain challenging to tune and their overall performance depends on the availability of large validation sets to estimate rejection thresholds. These characteristics are often prohibitive and thus limit their applicability. We introduce an alternative approach to speeding-up classifier evaluation which overcomes these limitations. It involves maintaining a probability estimate of the class label at each intermediary response and stopping when the corresponding uncertainty becomes small enough. As a result, the evaluation terminates early based on the sequence of responses observed. Furthermore, it does so independently of the type of ensemble classifier used or the way it was trained. We show through extensive experimentation that our method provides 2 to 10 fold speed-ups, over existing state-of-the-art methods, at almost no loss in accuracy on a number of object classification tasks.

3 0.71920949 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu

Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.

4 0.67947054 383 cvpr-2013-Seeking the Strongest Rigid Detector

Author: Rodrigo Benenson, Markus Mathias, Tinne Tuytelaars, Luc Van_Gool

Abstract: The current state of the art solutions for object detection describe each class by a set of models trained on discovered sub-classes (so-called “components”), with each model itself composed of collections of interrelated parts (deformable models). These detectors build upon the now classic Histogram of Oriented Gradients+linear SVM combo. In this paper we revisit some of the core assumptions in HOG+SVM and show that by properly designing the feature pooling, feature selection, preprocessing, and training methods, it is possible to reach top quality, at least for pedestrian detections, using a single rigid component. We provide experiments for a large design space, that give insights into the design of classifiers, as well as relevant information for practitioners. Our best detector is fully feed-forward, has a single unified architecture, uses only histograms of oriented gradients and colour information in monocular static images, and improves over 23 other methods on the INRIA, ETH and Caltech-USA datasets, reducing the average miss-rate over HOG+SVM by more than 30%.

5 0.65494376 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification

Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

Abstract: Pose variation remains a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part-based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds correspondence of a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state-of-the-art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin.

6 0.64303136 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition

7 0.63120788 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

8 0.6233784 323 cvpr-2013-POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation

9 0.61820912 438 cvpr-2013-Towards Pose Robust Face Recognition

10 0.61679161 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification

11 0.61676729 420 cvpr-2013-Supervised Descent Method and Its Applications to Face Alignment

12 0.609034 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification

13 0.60728258 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild

14 0.60265595 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

15 0.59783882 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns

16 0.58152246 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

17 0.57767314 359 cvpr-2013-Robust Discriminative Response Map Fitting with Constrained Local Models

18 0.57701206 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

19 0.56382954 401 cvpr-2013-Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection

20 0.56348556 389 cvpr-2013-Semi-supervised Learning with Constraints for Person Identification in Multimedia Data


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(9, 0.129), (10, 0.097), (16, 0.027), (19, 0.018), (26, 0.106), (28, 0.012), (33, 0.224), (39, 0.01), (67, 0.143), (69, 0.046), (87, 0.095)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.89251024 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

Author: Jianguo Li, Yimin Zhang

Abstract: This paper presents a novel learning framework for training a boosting-cascade-based object detector from a large-scale dataset. The framework is derived from the well-known Viola-Jones (VJ) framework but distinguished by three key differences. First, the proposed framework adopts multi-dimensional SURF features instead of single-dimensional Haar features to describe local patches. In this way, the number of used local patches can be reduced from hundreds of thousands to several hundreds. Second, it adopts logistic regression as the weak classifier for each local patch instead of decision trees in the VJ framework. Third, we adopt AUC as a single criterion for the convergence test during cascade training rather than the two trade-off criteria (false-positive-rate and hit-rate) in the VJ framework. The benefit is that the false-positive-rate can be adaptive among different cascade stages, which yields much faster convergence of the SURF cascade. Combining these points together, the proposed approach has three good properties. First, the boosting cascade can be trained very efficiently. Experiments show that the proposed approach can train object detectors from billions of negative samples within one hour even on personal computers. Second, the built detector is comparable to the state-of-the-art algorithm not only in accuracy but also in processing speed. Third, the built detector is small in model size due to short cascade stages.

2 0.87225807 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation

Author: Magnus Burenius, Josephine Sullivan, Stefan Carlsson

Abstract: We consider the problem of automatically estimating the 3D pose of humans from images, taken from multiple calibrated views. We show that it is possible and tractable to extend the pictorial structures framework, popular for 2D pose estimation, to 3D. We discuss how to use this framework to impose view, skeleton, joint angle and intersection constraints in 3D. The 3D pictorial structures are evaluated on multiple view data from a professional football game. The evaluation is focused on computational tractability, but we also demonstrate how a simple 2D part detector can be plugged into the framework.

3 0.86627114 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu

Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.

4 0.86601621 311 cvpr-2013-Occlusion Patterns for Object Class Detection

Author: Bojan Pepikj, Michael Stark, Peter Gehler, Bernt Schiele

Abstract: Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found. In this paper we leave the beaten path of methods that treat occlusion as just another source of noise; instead, we include the occluder itself into the modelling, by mining distinctive, reoccurring occlusion patterns from annotated training data. These patterns are then used as training data for dedicated detectors of varying sophistication. In particular, we evaluate and compare models that range from standard object class detectors to hierarchical, part-based representations of occluder/occludee pairs. In an extensive evaluation we derive insights that can aid further developments in tackling the occlusion challenge.

5 0.86106849 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers

Author: Georgia Gkioxari, Pablo Arbeláez, Lubomir Bourdev, Jitendra Malik

Abstract: We propose a novel approach for human pose estimation in real-world cluttered scenes, and focus on the challenging problem of predicting the pose of both arms for each person in the image. For this purpose, we build on the notion of poselets [4] and train highly discriminative classifiers to differentiate among arm configurations, which we call armlets. We propose a rich representation which, in addition to standard HOG features, integrates the information of strong contours, skin color and contextual cues in a principled manner. Unlike existing methods, we evaluate our approach on a large subset of images from the PASCAL VOC detection dataset, where critical visual phenomena, such as occlusion, truncation, multiple instances and clutter are the norm. Our approach outperforms Yang and Ramanan [26], the state-of-the-art technique, with an improvement from 29.0% to 37.5% PCP accuracy on the arm keypoint prediction task, on this new pose estimation dataset.

6 0.86101937 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification

7 0.86081052 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

8 0.85821897 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

9 0.85683 152 cvpr-2013-Exemplar-Based Face Parsing

10 0.85525602 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues

11 0.85405833 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search

12 0.85248166 440 cvpr-2013-Tracking People and Their Objects

13 0.85024971 103 cvpr-2013-Decoding Children's Social Behavior

14 0.85006487 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

15 0.84861624 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image

16 0.8483395 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

17 0.84796506 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

18 0.8469913 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

19 0.84488058 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

20 0.84405422 414 cvpr-2013-Structure Preserving Object Tracking