142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

Source: pdf

Author: Pramod Sharma, Ram Nevatia

Abstract: In this work, we present a novel and efficient detector adaptation method which improves the performance of an offline trained classifier (baseline classifier) by adapting it to new test datasets. We address two critical aspects of adaptation methods: generalizability and computational efficiency. We propose an adaptation method which can be applied to various baseline classifiers and is also computationally efficient. For a given test video, we collect online samples in an unsupervised manner and train a random fern adaptive classifier. The adaptive classifier improves the precision of the baseline classifier by validating the detection responses obtained from the baseline classifier as correct detections or false alarms. Experiments demonstrate the generalizability, computational efficiency and effectiveness of our method: we compare it with state-of-the-art approaches for the problem of human detection and show good performance with high computational efficiency on two different baseline classifiers.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 In this work, we present a novel and efficient detector adaptation method which improves the performance of an offline trained classifier (baseline classifier) by adapting it to new test datasets. [sent-2, score-0.917]

2 We propose an adaptation method which can be applied to various baseline classifiers and is also computationally efficient. [sent-4, score-0.544]

3 For a given test video, we collect online samples in an unsupervised manner and train a random fern adaptive classifier. [sent-5, score-1.291]

4 The adaptive classifier improves precision of the baseline classifier by validating the obtained detection responses from baseline classifier as correct detections or false alarms. [sent-6, score-1.808]

5 Common procedure for object detection is to train an object detector in an offline manner by using thousands of training examples. [sent-10, score-0.508]

6 However, when applied on novel test data, performance of the offline trained classifier (baseline classifier) may not be high, as the examples in test data may be very different than the ones used for the training. [sent-11, score-0.471]

7 We propose a detector adaptation method, which is independent of the baseline classifier used, hence is applicable to various baseline classifiers. [sent-14, score-1.11]

8 [21, 11, 24] use manually labeled offline training samples for adaptation, which can make the adaptation process computationally expensive, because the size of the training data could be large after combining offline and online samples. [sent-19, score-1.201]

9 However, these approaches optimize the baseline classifier using gradient descent methods, which are inherently slow in nature. [sent-21, score-0.44]

10 Supervised [7] and semi-supervised [6, 26] methods require manual labeling for online sample collection, which is difficult for new test videos. [sent-23, score-0.462]

11 Hence, unsupervised sample collection is important for adaptation methods. [sent-24, score-0.48]

12 Background subtraction based approaches [12, 13, 10, 15] have been used for unsupervised online sample collection. [sent-25, score-0.593]

13 We propose a novel generalized and computationally efficient approach for adapting a baseline classifier for a specific test video. [sent-30, score-0.576]

14 Our approach is generalized because it is independent of the type of baseline classifiers used and does not depend on specific features or kind of training algorithm used for creating the baseline classifier. [sent-31, score-0.499]

15 For a given test video, we apply the baseline classifier at a high precision setting, and track obtained detection responses using a simple position, size and appearance based tracking method. [sent-32, score-0.914]

16 Short tracks are obtained as tracking output, which are sufficient for our method, as we do not seek long tracks to collect online samples. [sent-33, score-0.682]

17 By using tracks and detection responses, positive and negative online samples are collected in an unsupervised manner. [sent-34, score-1.045]

18 Positive online samples are further divided into different categories for variations in object poses. [sent-35, score-0.647]

19 Then a computationally efficient multi-category random fern [14] classifier is trained as the adaptive classifier using online samples only. [sent-36, score-1.646]

20 The adaptive classifier improves the precision of the baseline classifier by validating the detection responses obtained from the baseline classifier as correct detections or false alarms. The rest of this paper is organized as follows: related work is presented in Section 2. [sent-37, score-1.877]

21 Our unsupervised detector adaptation approach is described in section 4. [sent-39, score-0.525]

22 Related Work In recent years, significant work has been published for detector adaptation methods. [sent-42, score-0.394]

23 Background subtraction based methods [12, 13, 10, 15] have been proposed for unsupervised online sample collection, but these methods are not applicable for datasets with complex backgrounds. [sent-44, score-0.624]

24 Many approaches [24, 4, 17, 23] have used detection output from the baseline classifier or tracking information for unsupervised online sample collection. [sent-45, score-1.155]

25 Unsupervised detector adaptation methods can be broadly categorized into three different categories: Boosting based methods, SVM based approaches and generic adaption methods. [sent-46, score-0.428]

26 [15] described a detector adaptation method in which they divide the image into several grids and train an adaptive classifier separately for each grid. [sent-48, score-0.87]

27 They collect online samples in an unsupervised manner by applying the combination of different part detectors. [sent-51, score-0.827]

28 [17] proposed an unsupervised incremental learning approach for Real Adaboost framework by using tracking information to collect the online samples automatically and extending the Real Adaboost exponential loss function to handle multiple instances of the online samples. [sent-53, score-1.312]

29 They collect missed detections and false alarms as online samples, therefore their method relies on tracking methods which can interpolate object instances missed by the baseline classifier. [sent-54, score-1.06]

30 Our proposed approach uses a simple position, size and appearance based tracking method in order to collect online samples. [sent-55, score-0.532]

31 They used motion, scene structure and geometry information to collect the online samples in unsupervised manner and combine all this information in confidence encoded SVM. [sent-60, score-0.867]

32 Their method uses offline training samples for adaptation, which may increase the computation time for training the adaptive classifier. [sent-61, score-0.557]

33 Both boosting and SVM based adaptation methods are limited to a specific kind of algorithm of baseline classifier, hence are not applicable for various baseline classifiers. [sent-62, score-0.805]

34 proposed a detector adaptation method in which they apply the baseline classifier at low precision and collect the online samples automatically. [sent-64, score-1.557]

35 Dense features are extracted from collected online samples to train a vocabulary tree based transfer classifier. [sent-65, score-0.661]

36 They showed results on two types of baseline classifiers for the pedestrian category, whereas our proposed method shows performance with different articulations in human pose in addition to the pedestrian category. [sent-66, score-0.52]

37 Overview The objective of our work is to improve the performance of a baseline classifier by adapting it to a specific test video. [sent-68, score-0.524]

38 Our approach has the following advantages over the existing detector adaptation methods: 1. [sent-70, score-0.394]

39 Generalizability: Our approach is widely applicable, as it is not limited to a specific baseline classifier or any specific features used for the training of the baseline classifiers. [sent-71, score-0.699]

40 Computationally Efficient: Training of the random fern based adaptive classifier is computationally efficient. [sent-73, score-0.826]

41 Even with thousands of online samples, adaptive classifier training takes only a couple of seconds. [sent-74, score-0.855]

42 For online sample collection, we apply baseline detector at a high precision (high threshold) setting. [sent-77, score-0.857]

43 Obtained detection responses are tracked by applying a simple tracking-by-detection method, which only considers the association of detection responses in consecutive frames based on the size, position and appearance of the object. [sent-78, score-0.394]

44 Overview of our detector adaptation method detection responses is computed. [sent-80, score-0.662]

45 Those detection responses which match with the track responses and have a high detection confidence are collected as positive online samples. [sent-81, score-1.123]

46 Positive online samples are further divided into different categories for variations in the poses for the target object and then a random fern classifier is trained as adaptive classifier. [sent-83, score-1.521]

47 Testing is done in two stages: First we apply the baseline classifier at a high recall setting (low threshold). [sent-84, score-0.481]

48 In this way, baseline classifier produces many correct detection responses in addition to many false alarms. [sent-85, score-0.763]

49 In the next stage, these detection responses from baseline classifier are provided to the learned random fern adaptive classifier, which classifies the obtained detection responses as the correct detections or the false alarms. [sent-86, score-1.631]

50 In this way our adaptation method improves the precision of the baseline classifier. [sent-87, score-0.578]
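The two-stage test procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `validate_detections`, `baseline_detect`, `adaptive_classify` and `NEGATIVE` are hypothetical names, and the toy functions stand in for a real baseline detector and a trained adaptive classifier.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Detection = Tuple[Box, float]             # (box, baseline confidence)

NEGATIVE = 0  # hypothetical label for the "false alarm" class


def validate_detections(
    frame,
    baseline_detect: Callable[[object, float], List[Detection]],
    adaptive_classify: Callable[[object, Box], int],
    low_threshold: float,
) -> List[Detection]:
    """Stage 1: run the baseline at a high-recall (low) threshold.
    Stage 2: keep only responses the adaptive classifier does not reject."""
    candidates = baseline_detect(frame, low_threshold)
    return [(box, conf) for box, conf in candidates
            if adaptive_classify(frame, box) != NEGATIVE]


# Toy stand-ins so the sketch runs end to end (assumptions, not real detectors).
def toy_baseline(frame, threshold):
    responses = [((0, 0, 10, 20), 0.9), ((50, 50, 60, 55), 0.3), ((5, 5, 8, 9), 0.1)]
    return [r for r in responses if r[1] >= threshold]


def toy_adaptive(frame, box):
    # Pretend that boxes taller than 15 pixels are true detections.
    return 1 if (box[3] - box[1]) > 15 else NEGATIVE


kept = validate_detections(None, toy_baseline, toy_adaptive, low_threshold=0.2)
```

Because the adaptive classifier only accepts or rejects candidate boxes, it needs no access to the baseline's features or parameters, which is what makes the scheme independent of the baseline type.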

51 Experiments also show that the method is highly computationally efficient and outperforms the baseline classifier and other state of the art adaptation methods. [sent-90, score-0.777]

52 Unsupervised Detector Adaptation In the following subsections, we describe the two different modules of our detector adaptation method : Online sample collection and training of the random fern based adaptive classifier. [sent-92, score-1.084]

53 Unsupervised Training Samples Collection To collect the online samples, we first apply the baseline classifier at high precision setting for each frame in the video and obtain the detection responses D = {di}. [sent-95, score-1.298]

54 A detection response di is represented as di = {xi , yi , si, ai, ti, li}. [sent-98, score-0.406]

55 The link probability between two detection responses di and dj is defined as: Pl(dj|di) = Ap(dj|di) As(dj|di) Aa(dj|di) (1), where Ap is the position affinity, As is the size affinity and Aa is the appearance affinity. [sent-100, score-0.53]
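Equation (1) is a product of three affinities. The sketch below shows one way to compute it; the specific affinity forms (Gaussian on position and relative size, Bhattacharyya coefficient on appearance histograms), the `sigma` values, and the interpretation of the fields of `Detection` are assumptions made for illustration, since the excerpt does not define them.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Detection:
    x: float                 # assumed: box center x
    y: float                 # assumed: box center y
    s: float                 # assumed: box size (e.g. height)
    a: Tuple[float, ...]     # assumed: normalized appearance histogram
    t: int                   # assumed: frame index
    l: float                 # assumed: detection confidence


def position_affinity(di: Detection, dj: Detection, sigma: float = 20.0) -> float:
    # Assumed Gaussian on the Euclidean distance between box centers.
    d2 = (di.x - dj.x) ** 2 + (di.y - dj.y) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def size_affinity(di: Detection, dj: Detection, sigma: float = 0.25) -> float:
    # Assumed Gaussian on the relative size difference.
    r = (di.s - dj.s) / max(di.s, dj.s)
    return math.exp(-(r * r) / (2.0 * sigma ** 2))


def appearance_affinity(di: Detection, dj: Detection) -> float:
    # Assumed Bhattacharyya coefficient between normalized histograms.
    return sum(math.sqrt(p * q) for p, q in zip(di.a, dj.a))


def link_probability(di: Detection, dj: Detection) -> float:
    # Product form of Eq. (1): Pl = Ap * As * Aa.
    return (position_affinity(di, dj)
            * size_affinity(di, dj)
            * appearance_affinity(di, dj))


d = Detection(10, 10, 40, (0.5, 0.5), 0, 0.9)
near = Detection(12, 11, 41, (0.5, 0.5), 1, 0.9)
far = Detection(200, 10, 40, (0.5, 0.5), 1, 0.9)
```

With these choices each affinity lies in [0, 1], so a low value in any one cue (position, size or appearance) is enough to suppress a link.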

56 A detection response di is considered a positive online sample if: O(di ∩ Tk) > θ1 and li > θ2 (3), where O is the overlap of the bounding boxes of di and Tk. [sent-108, score-0.514]

57 On the other hand, a detection response is considered as negative online sample if: O(di ∩ Tk) < θ1 ∀k = 1, . [sent-111, score-0.656]
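The positive and negative selection rules can be sketched as below. This is an illustration under assumptions: the overlap O is taken to be intersection-over-union, the threshold values are placeholders, and the negative rule uses only the part of the condition that survives in the text (overlap below θ1 for every track); the original condition may include further terms.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (an assumed definition of O)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def collect_samples(
    detections: List[Tuple[Box, float]],   # (box, confidence l_i) in one frame
    track_boxes: List[Box],                # track responses T_k in the same frame
    theta1: float = 0.5,                   # placeholder overlap threshold
    theta2: float = 0.8,                   # placeholder confidence threshold
):
    positives, negatives = [], []
    for box, conf in detections:
        overlaps = [overlap(box, t) for t in track_boxes]
        if any(o > theta1 for o in overlaps) and conf > theta2:
            positives.append(box)          # matches a track and is confident
        elif all(o < theta1 for o in overlaps):
            negatives.append(box)          # matches no track at all
    return positives, negatives


tracks = [(0, 0, 10, 10)]
dets = [((0, 0, 10, 10), 0.9), ((100, 100, 110, 110), 0.9), ((0, 0, 10, 10), 0.5)]
pos, neg = collect_samples(dets, tracks)
```

Note that a response which matches a track but has low confidence is assigned to neither set, so ambiguous samples do not contaminate the training data.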

58 Some of the collected positive and negative online samples are shown in Figure 3. [sent-118, score-0.711]

59 Hence, we divide the positive online samples into different categories. [sent-124, score-0.647]

60 For this purpose, we use the poselet [5] detector as the baseline classifier. [sent-125, score-0.449]

61 A detection response di obtained from the poselet detector is represented as di = {xi, yi, si, ai, ti, li, hi}, where hi is the distribution of th=e poselets. [sent-126, score-0.691]

62 We train a pose classifier offline, in order to divide the positive online samples into different categories. [sent-128, score-0.971]

63 We collect the training images for different variations in the human pose and compute the poselet histograms for these training images, by applying the poselet detector. [sent-129, score-0.514]

64 video, collected positive online samples are represented as P = {Pi}, where Pi = {xi, yi, si, ai, hi, li, vi}, and vi is the target category, which is determined as: … [sent-133, score-0.802]

65 In this manner we divide the positive online samples into different categories. [sent-139, score-0.702]

66 Each of these categories are considered as a separate class for adaptive classifier training. [sent-140, score-0.445]

67 proposed an efficient random fern [14] classifier, which uses binary features to classify a test sample. [sent-144, score-0.453]

68 Examples of some of the positive (first row) and negative (second row) online samples collected in unsupervised manner from Mind’s Eye [1] and CAVIAR [2] datasets. [sent-169, score-0.926]

69 Learning algorithm of random fern based adaptive classifier is described in algorithm 1. [sent-188, score-0.774]

70 For the training of the adaptive classifier, we only use online samples collected in an unsupervised manner, no manually labeled offline samples are used for the training. [sent-189, score-1.261]

71 We train a multi-class random fern adaptive classifier by considering different categories of the positive samples as different target classes; all negative online samples are considered as a single target class. [sent-190, score-1.798]
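A multi-class random fern classifier of the kind described here can be sketched in a few lines. This is a generic, simplified version written for illustration, not the authors' Algorithm 1: the binary features are assumed to be comparisons between two randomly chosen entries of an input feature vector, the class name `RandomFerns` and its parameters are hypothetical, and the toy data at the end is invented.

```python
import math
import random
from typing import List, Sequence


class RandomFerns:
    def __init__(self, n_ferns: int, n_bits: int, n_dims: int,
                 n_classes: int, seed: int = 0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        self.n_bins = 2 ** n_bits
        # Each binary feature compares two distinct input dimensions.
        self.tests = [[tuple(rng.sample(range(n_dims), 2)) for _ in range(n_bits)]
                      for _ in range(n_ferns)]
        # counts[fern][class][bin], initialized to 1 (a uniform prior).
        self.counts = [[[1] * self.n_bins for _ in range(n_classes)]
                       for _ in range(n_ferns)]

    def _bin(self, fern_tests, x: Sequence[float]) -> int:
        # Concatenate the binary test outcomes into a histogram bin index.
        idx = 0
        for i, j in fern_tests:
            idx = (idx << 1) | (1 if x[i] > x[j] else 0)
        return idx

    def fit(self, X: List[Sequence[float]], y: List[int]) -> None:
        # Training is a single pass that increments histogram counts.
        for x, c in zip(X, y):
            for f, fern_tests in enumerate(self.tests):
                self.counts[f][c][self._bin(fern_tests, x)] += 1

    def log_scores(self, x: Sequence[float]) -> List[float]:
        # Ferns are treated as independent: sum the per-fern log likelihoods.
        scores = [0.0] * self.n_classes
        for f, fern_tests in enumerate(self.tests):
            b = self._bin(fern_tests, x)
            for c in range(self.n_classes):
                scores[c] += math.log(self.counts[f][c][b] / sum(self.counts[f][c]))
        return scores

    def predict(self, x: Sequence[float]) -> int:
        s = self.log_scores(x)
        return max(range(self.n_classes), key=s.__getitem__)


# Toy data: class 0 (e.g. negative) is increasing, class 1 (e.g. a pose) is decreasing.
X = [[0, 1, 2, 3]] * 5 + [[3, 2, 1, 0]] * 5
y = [0] * 5 + [1] * 5
ferns = RandomFerns(n_ferns=30, n_bits=10, n_dims=4, n_classes=2)
ferns.fit(X, y)
```

Training only increments counters, with no iterative optimization, which is consistent with the short training times the paper reports. Additional pose categories would simply be extra class indices alongside the single negative class.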

72 For a test video, first online samples are collected from all the frames and then random fern classifier is trained. [sent-191, score-1.275]

73 vi = j - Train Random fern classifier using online samples P and N. [sent-199, score-1.148]

74 • Test: for i = 1 to F do - Apply baseline classifier at low threshold δ to obtain detection responses Df; for j = 1 to |Df| do - Apply random fern classifier to validate the detection responses as true detections or false alarms. [sent-200, score-1.394]

75 Baseline classifiers: To demonstrate the generalizability of our approach, we performed experiments with two different baseline classifiers: For CAVIAR, boosting based classifier is used as described in [8]. [sent-211, score-0.627]

76 Computation Time Performance We evaluated computational efficiency of our approach for the training of the adaptive classifier after collection of online samples. [sent-216, score-0.869]

77 We performed this experiment for online samples collected from CAVIAR dataset and trained the adaptive classifier for two target categories. [sent-217, score-1.12]

78 For the adaptive classifier training, we only use the online samples collected in unsupervised manner, no offline samples are used for the training. [sent-218, score-1.442]

79 [17] uses bags of instances, instead of single instance, hence we count all the training samples in the bag in order to count the total number of samples used for the training. [sent-220, score-0.426]

80 We can see that random fern based adaptive classifier training outperforms [17] in run time performance. [sent-222, score-0.827]

81 [17] optimizes parameters of baseline detector using gradient descent method, hence training time of incremental detector is high. [sent-223, score-0.646]

82 Whereas our random fern adaptive classifier is independent of the parameters of baseline classifier and uses simple binary features for the training, hence is computationally efficient. [sent-224, score-1.343]

83 2 seconds for training of 1000 online samples, whereas the method described in [17] takes approximately 35 seconds, which makes our method approximately 30 times faster than [17]. [sent-226, score-0.592]

84 Total training time of random fern classifier for CAVIAR1 sequence takes only 8 seconds for approximately 16000 online samples, whereas for CAVIAR2 it takes only 19 seconds with approximately 43000 online samples. [sent-227, score-1.614]
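As a quick check of what these totals imply per sample, the arithmetic can be worked out directly. The figures are the approximate ones quoted above; the helper name `ms_per_sample` is introduced only for illustration.

```python
def ms_per_sample(n_samples: int, seconds: float) -> float:
    """Average training time per online sample, in milliseconds."""
    return 1000.0 * seconds / n_samples


caviar1 = ms_per_sample(16000, 8.0)    # about 0.5 ms per sample
caviar2 = ms_per_sample(43000, 19.0)   # about 0.44 ms per sample
```

The per-sample cost is roughly the same for both sequences, which suggests that training time grows about linearly with the number of online samples.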

85 1 CAVIAR Dataset For this dataset, we use Real Adaboost based baseline classifier [8] and train it for 16 cascade layers for full body of human. [sent-239, score-0.475]

86 30 random ferns are trained for 10 binary features, for two target categories (positive and negative classes). [sent-240, score-0.398]

87 X-axis represents the number of online samples used for the classifier training, Y-axis is shown in log scale and represents runtime in seconds. [sent-243, score-0.777]

88 65, Sharma et al’s method improves the precision over baseline by 14%, whereas our method improves the precision by 22%. [sent-252, score-0.513]

89 Both our approach and Sharma et al.'s method outperform the baseline detector [8]; however, for the CAVIAR2 sequence, long tracks are not available for some of the humans, hence not enough missed detections are collected by Sharma et al.'s approach, due to which its performance is not as high. [sent-256, score-0.653]

90 We train 15 random ferns with 8 binary features for the adaptive classifier training. [sent-261, score-0.619]

91 Adaptive classifier is trained for four target categories (standing/walking, bending, digging and negative). [sent-262, score-0.478]

92 During online sample collection, not many negative samples are obtained, hence we add approximately 1100 negative online samples collected in unsupervised manner from the CAVIAR dataset in the online negative samples set for both the ME1 and ME2 sequences. [sent-266, score-2.191]

93 We compare the performance of our approach with the baseline classifier (poselet detector [5]), and show that by dividing the positive samples into different categories, we get better performance as compared to the case where we do not divide the positive samples into different categories. [sent-278, score-1.077]

94 6, we improve the precision for poselet detector by 12% with sample categorization, whereas without sample categorization improvement is 7%. [sent-285, score-0.554]

95 Also, the trained multi-category adaptive classifier can be used for pose identification (standing, bending, digging, etc.). [sent-288, score-0.585]

96 Conclusion We proposed a novel detector adaptation approach, which efficiently adapts a baseline classifier for a test video. [sent-290, score-0.869]

97 Online samples are collected in an unsupervised manner and random fern classifier is trained as the adaptive classifier. [sent-291, score-1.253]

98 In future, we plan to apply our adaptation method on other categories of objects and other baseline classifiers. [sent-296, score-0.509]

99 Examples of some of the detection results when the baseline detector is applied at a low threshold (best viewed in color). [sent-378, score-0.447]

100 Improving part based object detection by unsupervised, online boosting. [sent-466, score-0.475]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('online', 0.376), ('fern', 0.344), ('adaptation', 0.252), ('classifier', 0.234), ('caviar', 0.222), ('baseline', 0.206), ('responses', 0.169), ('samples', 0.167), ('adaptive', 0.16), ('mind', 0.147), ('detector', 0.142), ('unsupervised', 0.131), ('sharma', 0.124), ('offline', 0.124), ('di', 0.117), ('ferns', 0.116), ('generalizability', 0.116), ('eye', 0.115), ('dj', 0.114), ('poselet', 0.101), ('detection', 0.099), ('collect', 0.098), ('digging', 0.093), ('bending', 0.086), ('collected', 0.083), ('precision', 0.082), ('tracks', 0.075), ('response', 0.073), ('boosting', 0.071), ('alarms', 0.069), ('whereas', 0.067), ('incremental', 0.064), ('ci', 0.061), ('detections', 0.06), ('categorization', 0.06), ('pl', 0.059), ('tracking', 0.058), ('positive', 0.057), ('adaboost', 0.057), ('negative', 0.057), ('target', 0.057), ('dmin', 0.056), ('articulations', 0.056), ('pose', 0.055), ('manner', 0.055), ('false', 0.055), ('training', 0.053), ('variations', 0.053), ('computationally', 0.052), ('fk', 0.052), ('gt', 0.052), ('pedestrian', 0.051), ('categories', 0.051), ('sample', 0.051), ('adapting', 0.049), ('missed', 0.048), ('divide', 0.047), ('alarm', 0.046), ('collection', 0.046), ('fn', 0.044), ('trained', 0.043), ('instances', 0.042), ('grabner', 0.042), ('hi', 0.042), ('recall', 0.041), ('confidence', 0.04), ('hence', 0.039), ('category', 0.038), ('binary', 0.038), ('improves', 0.038), ('nevatia', 0.037), ('ozuysal', 0.037), ('random', 0.036), ('subtraction', 0.035), ('poselets', 0.035), ('train', 0.035), ('ai', 0.035), ('test', 0.035), ('video', 0.034), ('classifiers', 0.034), ('adaption', 0.034), ('kembhavi', 0.034), ('art', 0.033), ('sequences', 0.033), ('seconds', 0.032), ('approximately', 0.032), ('validating', 0.031), ('subsections', 0.031), ('track', 0.031), ('link', 0.031), ('applicable', 0.031), ('bin', 0.03), ('df', 0.029), ('kth', 0.029), ('army', 0.028), ('tive', 0.028), ('leistner', 0.028), ('association', 0.027), ('government', 0.027), ('vi', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999923 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

Author: Pramod Sharma, Ram Nevatia

2 0.22034512 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints

Author: Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell

Abstract: Most successful object classification and detection methods rely on classifiers trained on large labeled datasets. However, for domains where labels are limited, simply borrowing labeled data from existing datasets can hurt performance, a phenomenon known as “dataset bias.” We propose a general framework for adapting classifiers from “borrowed” data to the target domain using a combination of available labeled and unlabeled examples. Specifically, we show that imposing smoothness constraints on the classifier scores over the unlabeled data can lead to improved adaptation results. Such constraints are often available in the form of instance correspondences, e.g. when the same object or individual is observed simultaneously from multiple views, or tracked between video frames. In these cases, the object labels are unknown but can be constrained to be the same or similar. We propose techniques that build on existing domain adaptation methods by explicitly modeling these relationships, and demonstrate empirically that they improve recognition accuracy in two scenarios, multicategory image classification and object detection in video.

3 0.21774904 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

Author: Yang Yang, Guang Shu, Mubarak Shah

Abstract: We propose a novel approach to boost the performance of generic object detectors on videos by learning video-specific features using a deep neural network. The insight behind our proposed approach is that an object appearing in different frames of a video clip should share similar features, which can be learned to build better detectors. Unlike many supervised detector adaptation or detection-by-tracking methods, our method does not require any extra annotations or utilize temporal correspondence. We start with the high-confidence detections from a generic detector, then iteratively learn new video-specific features and refine the detection scores. In order to learn discriminative and compact features, we propose a new feature learning method using a deep neural network based on auto encoders. It differs from the existing unsupervised feature learning methods in two ways: first it optimizes both discriminative and generative properties of the features simultaneously, which gives our features better discriminative ability; second, our learned features are more compact, while the unsupervised feature learning methods usually learn a redundant set of over-complete features. Extensive experimental results on person and horse detection show that significant performance improvement can be achieved with our proposed method.

4 0.18977576 386 cvpr-2013-Self-Paced Learning for Long-Term Tracking

Author: unknown-author

Abstract: We address the problem of long-term object tracking, where the object may become occluded or leave-the-view. In this setting, we show that an accurate appearance model is considerably more effective than a strong motion model. We develop simple but effective algorithms that alternate between tracking and learning a good appearance model given a track. We show that it is crucial to learn from the “right” frames, and use the formalism of self-paced curriculum learning to automatically select such frames. We leverage techniques from object detection for learning accurate appearance-based templates, demonstrating the importance of using a large negative training set (typically not used for tracking). We describe both an offline algorithm (that processes frames in batch) and a linear-time online (i.e. causal) algorithm that approaches real-time performance. Our models significantly outperform prior art, reducing the average error on benchmark videos by a factor of 4.

5 0.18713024 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

Author: Guang Shu, Afshin Dehghan, Mubarak Shah

Abstract: We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained object detectors is usually degraded in unconstrained video environments due to variant illuminations, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video-specific features like consistent color patterns. In our approach, we apply a Superpixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using superpixels, hence it can handle the problem of appearance variation. Most importantly, using Conditional Random Field (CRF) along with our superpixel-based BoW model, we develop an algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach, which significantly improves the state-of-the-art detector by 5-16% in average precision.

6 0.14858888 315 cvpr-2013-Online Robust Dictionary Learning

7 0.14196236 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

8 0.1274202 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources

9 0.12454033 314 cvpr-2013-Online Object Tracking: A Benchmark

10 0.11841133 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning

11 0.11671659 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

12 0.11099661 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

13 0.10852076 383 cvpr-2013-Seeking the Strongest Rigid Detector

14 0.10764521 459 cvpr-2013-Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots

15 0.10561866 335 cvpr-2013-Poselet Conditioned Pictorial Structures

16 0.097891271 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

17 0.097096547 143 cvpr-2013-Efficient Large-Scale Structured Learning

18 0.096322224 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People

19 0.094670661 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

20 0.093965888 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.232), (1, -0.105), (2, -0.047), (3, -0.063), (4, 0.047), (5, 0.003), (6, 0.105), (7, -0.03), (8, 0.076), (9, 0.06), (10, -0.107), (11, -0.104), (12, 0.035), (13, -0.076), (14, -0.075), (15, -0.073), (16, -0.034), (17, -0.076), (18, -0.036), (19, -0.035), (20, -0.086), (21, -0.094), (22, -0.088), (23, -0.016), (24, -0.051), (25, 0.03), (26, 0.009), (27, 0.033), (28, -0.008), (29, 0.03), (30, -0.037), (31, 0.027), (32, -0.042), (33, 0.007), (34, 0.044), (35, -0.013), (36, -0.005), (37, -0.028), (38, -0.051), (39, -0.025), (40, 0.02), (41, -0.004), (42, 0.044), (43, 0.03), (44, -0.059), (45, 0.022), (46, 0.038), (47, -0.03), (48, 0.104), (49, -0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97082013 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

Author: Pramod Sharma, Ram Nevatia

2 0.77046329 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

Author: Peter Welinder, Max Welling, Pietro Perona

Abstract: How many labeled examples are needed to estimate a classifier’s performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier’s confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by reestimating the class-conditional confidence distributions.

3 0.74196571 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints

Author: Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell

4 0.71855497 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

Author: Yang Yang, Guang Shu, Mubarak Shah

5 0.70339644 383 cvpr-2013-Seeking the Strongest Rigid Detector

Author: Rodrigo Benenson, Markus Mathias, Tinne Tuytelaars, Luc Van_Gool

Abstract: The current state of the art solutions for object detection describe each class by a set of models trained on discovered sub-classes (so-called “components”), with each model itself composed of collections of interrelated parts (deformable models). These detectors build upon the now classic Histogram of Oriented Gradients+linear SVM combo. In this paper we revisit some of the core assumptions in HOG+SVM and show that by properly designing the feature pooling, feature selection, preprocessing, and training methods, it is possible to reach top quality, at least for pedestrian detections, using a single rigid component. We provide experiments for a large design space, that give insights into the design of classifiers, as well as relevant information for practitioners. Our best detector is fully feed-forward, has a single unified architecture, uses only histograms of oriented gradients and colour information in monocular static images, and improves over 23 other methods on the INRIA, ETH and Caltech-USA datasets, reducing the average miss-rate over HOG+SVM by more than 30%.

6 0.69749904 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

7 0.6852296 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning

8 0.68344694 386 cvpr-2013-Self-Paced Learning for Long-Term Tracking

9 0.67527086 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation

10 0.66356802 385 cvpr-2013-Selective Transfer Machine for Personalized Facial Action Unit Detection

11 0.66195893 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

12 0.65399331 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection

13 0.62045485 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

14 0.60907483 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition

15 0.60410148 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People

16 0.60140681 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

17 0.60079306 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

18 0.59972066 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns

19 0.59530991 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources

20 0.58035493 143 cvpr-2013-Efficient Large-Scale Structured Learning

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.065), (26, 0.027), (33, 0.186), (67, 0.522), (69, 0.025), (80, 0.024), (87, 0.064)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.8868984 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

Author: Pramod Sharma, Ram Nevatia

2 0.87106866 103 cvpr-2013-Decoding Children's Social Behavior

Author: James M. Rehg, Gregory D. Abowd, Agata Rozga, Mario Romero, Mark A. Clements, Stan Sclaroff, Irfan Essa, Opal Y. Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan C. Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye

Abstract: We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semistructured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.

3 0.83427262 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

Author: Wanli Ouyang, Xiaogang Wang

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.

4 0.77100956 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification

Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah

Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.

5 0.76758224 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers

Author: Georgia Gkioxari, Pablo Arbeláez, Lubomir Bourdev, Jitendra Malik

Abstract: We propose a novel approach for human pose estimation in real-world cluttered scenes, and focus on the challenging problem of predicting the pose of both arms for each person in the image. For this purpose, we build on the notion of poselets [4] and train highly discriminative classifiers to differentiate among arm configurations, which we call armlets. We propose a rich representation which, in addition to standard HOG features, integrates the information of strong contours, skin color and contextual cues in a principled manner. Unlike existing methods, we evaluate our approach on a large subset of images from the PASCAL VOC detection dataset, where critical visual phenomena, such as occlusion, truncation, multiple instances and clutter are the norm. Our approach outperforms Yang and Ramanan [26], the state-of-the-art technique, with an improvement from 29.0% to 37.5% PCP accuracy on the arm keypoint prediction task, on this new pose estimation dataset.

6 0.75589138 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search

7 0.74648798 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation

8 0.74153405 246 cvpr-2013-Learning Binary Codes for High-Dimensional Data Using Bilinear Projections

9 0.72646105 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking

10 0.70338303 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues

11 0.6705724 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

12 0.66197717 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection

13 0.65699857 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

14 0.63590753 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

15 0.63549995 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes

16 0.62437135 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

17 0.5971539 438 cvpr-2013-Towards Pose Robust Face Recognition

18 0.59552127 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation

19 0.59400171 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification

20 0.59212965 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors