cvpr cvpr2013 cvpr2013-288 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang
Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. [sent-8, score-0.315]
2 The difficulty is added when several pedestrians overlap in images and occlude each other. [sent-9, score-0.346]
3 We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. [sent-10, score-2.532]
4 In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. [sent-11, score-1.547]
5 The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. [sent-12, score-1.151]
6 Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. [sent-13, score-1.265]
7 Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. [sent-14, score-0.451]
8 Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets. [sent-15, score-0.651]
9 Introduction Pedestrian detection is a challenging task due to the intra-class variation of pedestrians in clothing and articulation. [sent-17, score-0.341]
10 When several pedestrians overlap in the image region, some will be occluded by others and the expected visual cues of the occluded parts are corrupted, resulting in the added difficulty in detection. [sent-18, score-0.446]
11 Pedestrians with overlaps are difficult to detect; however, we observe that these pedestrians have useful mutual visibility relationship information. [sent-21, score-1.049]
12 When pedestrians are found to overlap in the image region, there are two types of mutual visibility relationships among their parts: 1. [sent-22, score-1.029]
13 It means that the observation of one part is a positive indication of the other part. [sent-24, score-0.071]
14 (a) Mutual visibility relationship of parts among pedestrians and (b) detection results comparison of the approach without modeling mutual visibility in [25] and our approach modeling mutual visibility. [sent-29, score-1.884]
15 With mutual visibility modeled in our approach, the false positive window on the left leg is suppressed and missed pedestrian on the left is found by modeling the visibility relationship among parts. [sent-30, score-1.703]
16 left-half part and right-half part, for each pedestrian in Fig. [sent-33, score-0.377]
17 1 (a), given the prior knowledge that there are two pedestrians co-existing side by side, the right-half part of the left pedestrian is compatible with the right-half part of the right pedestrian because these two parts often co-exist in positive training examples. [sent-36, score-1.296]
18 The compatible relationship can be used for increasing the visibility confidence of mutually compatible pedestrian parts. [sent-37, score-1.147]
19 1 (b) as an example, if a pedestrian detector detects both Alice¹ on the left and Bob on the right with high false positive rate, then the visibility confidence of Alice’s right-half part increases when Bob’s right-half part is found to be visible. [sent-39, score-0.909]
20 Footnote 1: ‘Alice’ and ‘Bob’ are used as placeholder names in this paper. [sent-40, score-0.094]
21 In this example, the compatible relationship helps to detect Alice in Fig. [sent-42, score-0.197]
22 It means that the occlusion of one part indicates the visibility of the other part, and vice versa. [sent-46, score-0.574]
23 1 (b), Alice and Bob have so strong overlap that one occludes the other. [sent-48, score-0.057]
24 In this case, Alice’s right-half part and Bob’s left-half part are incompatible because they shall not be visible simultaneously. [sent-49, score-0.209]
25 If a pedestrian detector detects both Alice and Bob with high false positive rate in Fig. [sent-50, score-0.377]
26 1(b), then the visibility confidence of Alice’s right-half part increases when Bob’s left-half part is found to be invisible. [sent-51, score-0.579]
27 Therefore, the incompatible relationship helps to detect Alice in this example. [sent-53, score-0.224]
28 These observations motivate us to jointly estimate the occlusion status of co-existing pedestrians by modeling the mutual visibility relationship among their parts. [sent-54, score-1.193]
29 In this paper, we propose to learn the compatible and incompatible relationship by a discriminative deep model. [sent-55, score-0.544]
30 The main contribution of this paper is to jointly estimate the visibility statuses of multiple pedestrians and recognize co-existing pedestrians via a mutual visibility deep model. [sent-56, score-2.062]
31 Overlapping parts of co-existing pedestrians are placed at multiple layers in this deep model. [sent-57, score-0.693]
32 With this deep model, 1) overlapping parts at different layers verify the visibility of each other for multiple times; 2) the complex probabilistic connections across layers are modeled with good efficiency on both learning and inference. [sent-58, score-1.075]
33 The mutual visibility deep model effectively improves pedestrian detection performance with less than 5% extra computation in the detection process. [sent-60, score-1.317]
34 It achieves the lowest average miss rate on the Caltech-Train dataset and the ETH dataset. [sent-61, score-0.069]
35 On the more challenging PETS dataset labeled by us, including mutual visibility leads to 8% improvement on the lowest average miss rate. [sent-62, score-0.72]
36 Furthermore, our model takes part detection scores as input and it is complementary to many existing pedestrian approaches. [sent-63, score-0.48]
37 It has good flexibility to integrate with other techniques, such as more discriminative features [31], scene geometric constraints [27], richer part models [40, 38] and contextual multi-pedestrian detection information [30, 26, 36] to further improve the performance. [sent-64, score-0.099]
38 Related Work Since visibility estimation is the key to handle occlusions, many approaches were proposed for estimating visibility of parts [2, 10, 11, 32, 35, 33, 29, 22, 34, 21]. [sent-66, score-1.078]
39 [32] used the block-wise HOG+SVM scores to estimate visibility status and combined the full-body classifier and part-based classifiers by heuristics. [sent-68, score-0.614]
40 [11] estimated the visibility of different parts using motion, depth and segmentation and then computed the classification score by summing up multiple visibility weighted cues of parts. [sent-70, score-1.099]
41 Each substructure was composed of a set of part detectors. [sent-72, score-0.07]
42 And the detection confidence score of an object was determined by the existence of these substructures. [sent-73, score-0.116]
43 The And-Or graph was used in [35] to accumulate hardthresholded part detection scores. [sent-74, score-0.132]
44 Recently, the approaches in [10, 25] utilized the visibility relationship among parts for isolated pedestrians. [sent-75, score-0.773]
45 However, the part visibility relationship among co-existing pedestrians was not explored in [2, 10, 11, 32, 35]. [sent-76, score-0.966]
46 These approaches obtain the visibility status by occlusion reasoning using 2-D visibility scores in [33, 29, 22] or using segmentation results in [34, 21]. [sent-78, score-1.141]
47 They manually defined the incompatible relationship among parts of multiple pedestrians through the exclusive occupancy of segmentation region or part detection response, while our approach learns the incompatible relationship from training data. [sent-79, score-1.007]
48 In addition, the compatible relationship was not used by these approaches. [sent-80, score-0.197]
49 The articulation relationship among the parts of multiple objects, parameterized by position, scale, size, rotation, was investigated as context [39, 36, 37, 5]. [sent-81, score-0.241]
50 Nearby detection scores were considered as context in [6]. [sent-82, score-0.103]
51 But it did not consider the visibility relationship of co-existing pedestrians, which is the focus of our approach. [sent-83, score-0.598]
52 The part visibility relationship among co-existing pedestrians has not been investigated yet and is complementary to these context-based approaches. [sent-84, score-0.966]
53 [19] proposed a deep model that achieved state-of-the-art performance for object detection and recognition on the ImageNet dataset [4]. [sent-90, score-0.284]
54 model was used for pedestrian detection in [25, 24]. [sent-97, score-0.382]
55 Overview of our approach In this paper, we mainly discuss the approach for pairwise pedestrians and extend it to more pedestrians in Section 4. [sent-101, score-0.578]
56 Denote the features of detection window wnd1 by vector x1, containing both appearance and position information. [sent-103, score-0.095]
57 obtaining $p(y_1|x_1)$ for each window wnd1 in a sliding window manner for all sizes of windows. [sent-106, score-0.086]
58 We consider another detection window wnd2 with features x2 and label y2 ∈ {0, 1}. [sent-107, score-0.095]
59 $p(y_1|x_1,x_2) = \sum_{y_2=0,1} p(y_1,y_2|x_1,x_2) = p(y_1,y_2=1|x_1,x_2) + p(y_1,y_2=0|x_1,x_2)$, (1) When $y_2 = 0$, we have $p(y_1,y_2=0|x_1,x_2) = p(y_1|y_2=0,x_1)\,p(y_2=0)$, (2) where $p(y_1|y_2=0,x_1)$ is obtained from the deep model for isolated pedestrians. [sent-110, score-0.275]
60 When $y_2 = 1$, we have $p(y_1,y_2=1|x_1,x_2) \propto \phi(y; x)\,\phi_p(y; x)$, (3) $\phi(y; x)$ in (3) is used for recognizing pair-wise co-existing pedestrians from part detection scores, where $x = [x_1^T\ x_2^T]^T$, $y = 1$ if $y_1 = 1$ and $y_2 = 1$, otherwise $y = 0$. [sent-112, score-0.289]
61 The mutual visibility deep model (a) [figure panel labels: Pedestrian 1, Pedestrian 2] [sent-118, score-0.883]
62 [...] and fine tuning parameters and (b) the detailed connection and parts model for pedestrian 1. [sent-125, score-0.453]
63 classified into the kth mixture and then this pair is used by the kth deep model for learning and inference. [sent-126, score-0.232]
64 The differences between the two pedestrians in horizontal location, vertical location and size, denoted by (dx, dy, ds), are used as the random variables in the GMM distribution p(dx, dy, ds). [sent-127, score-0.317]
65 2(a) shows the deep model used at the inference stage. [sent-133, score-0.232]
66 2(b) shows the parts model used for pedestrian 1 at window wnd1 . [sent-135, score-0.473]
67 The parts model for pedestrian 2 at window wnd2 is the same. [sent-136, score-0.473]
68 2(b), there are 3 layers of parts with different sizes. [sent-138, score-0.172]
69 For each pedestrian, there are six small parts at layer 1, seven medium-sized parts at layer 2 and seven large parts at layer 3. [sent-139, score-0.679]
70 The six parts at layer 1 are left-head-shoulder, right-head-shoulder, left-torso, right-torso, left-leg and right-leg. [sent-140, score-0.282]
71 A part at an upper layer consists of its children at the lower layer. [sent-141, score-0.22]
72 The parts at the top layer are the possible occlusion statuses with gray color indicating occlusions. [sent-142, score-0.401]
73 The detection scores for L layers are denoted by $s = [s^{1T} \ldots]^T = \gamma(x)$, where $\gamma(x)$ is obtained from part detectors, $s^l$ for $l = 1, \ldots$ [...] [sent-143, score-0.322]
74 And we have $s^l = [\ldots]$, where the $P^l$ scores of the two pedestrians at layer $l$ are denoted by $s^l_1 = [s^l_{1,1}, \ldots]$ [...] [sent-149, score-0.412]
75 [...] hidden variables at layer $l$ are denoted by [...]. Since $\hat{h}^l$ is not provided at the training or testing stage, it is considered as a hidden random vector. [sent-165, score-0.252]
76 In our implementation, DPM in [14] is used for obtaining part detection scores in s. [sent-166, score-0.15]
77 The deformation among parts are arranged in the star-model with full-body being the center. [sent-167, score-0.159]
78 In order to have the top layer representing occlusion status in a more direct way, $s^3$ accumulates the detection scores that fit their possible occlusion statuses. [sent-169, score-0.437]
79 [...], $P^l$ is the detection score for the $i$th part at layer $l$, $\tilde{s}^3_{1,1}$ is the detection score for the head-shoulder part at layer 3 and $s^3_{1,2}$ is the detection score for the head-torso part at layer 3. [sent-179, score-0.837]
80 In our implementation of the detector, the head-shoulder part at the top layer has half of the resolution of HOG features compared with the head-shoulder part at the middle layer. [sent-180, score-0.245]
81 The overlap information for the six parts is: left-head-shoulder $o_{n,1}$, right-head-shoulder $o_{n,2}$, left-torso $o_{n,3}$, right-torso $o_{n,4}$, left-leg $o_{n,5}$ and right-leg $o_{n,6}$. [sent-186, score-0.188]
82 In order to obtain o, the overlap of these six parts with the pedestrian region of the other pedestrian is computed. [sent-187, score-0.887]
83 3(a), which is obtained by averaging the gradient of positive samples, two rectangles are used for approximating the pedestrian region of the other pedestrian. [sent-189, score-0.418]
84 One rectangle is used for the head region, denoted by Ah, another rectangle is used for the torso-leg region, denoted by At. [sent-190, score-0.1]
85 3(b) has the left-head-shoulder, left-torso and left-leg overlapping with the pedestrian regions of the left person. [sent-194, score-0.393]
86 Since $A_{n,i}$, $A_h$ and $A_t$ are rectangular regions, the operations area(·) and ∩ in (5) can be efficiently computed using the coordinates of rectangles instead of being computed in a pixel-wise way on the rectangular regions. [sent-195, score-0.132]
87 The overlap information o can also be obtained from segmentation. [sent-196, score-0.057]
88 Compared with segmentation, the rectangular region is an approximate but faster approach for obtaining pedestrian region and computing the overlap information o. [sent-197, score-0.496]
89 At the inference stage, the pedestrian co-existence label y is inferred from features x. [sent-198, score-0.33]
90 The part visibility probability [...] [Figure 3 caption, partial: (a) ...] pedestrian region and (b) an example with left-head-shoulder, left-torso and left-leg overlapping with the pedestrian regions of the left person. [sent-199, score-1.298]
91 $\tilde{h}^{l+1}_j = p(h^{l+1}_j = 1 \mid h^l, x) = \sigma(h^{lT} w^l_{*,j} + c^{l+1}_j + g^{l+1\,T}_j s^{l+1}_j)$, $h^l = \hat{h}^l$ if $l = L - 1$, $h^l = [\hat{h}^{lT}\ o^T]^T$ [...] [sent-201, score-0.275]
92 The learning of the deep model The following two stages are used for learning the parameters in (6) and (7). [sent-214, score-0.232]
93 The variables are arranged as a backpropagation (BP) network as shown in Fig. [sent-217, score-0.052]
94 As stated in [12], unsupervised pretraining guides the learning of the deep model towards the basins of attraction of minima that support better generalization from the training data. [sent-219, score-0.355]
95 Therefore, we adopt unsupervised pretraining of parameters at stage 1. [sent-220, score-0.11]
96 The graphical model for unsupervised pretraining is shown in Fig. [sent-221, score-0.075]
97 $W^l$, $g^l_i$ and $c^l_i$ are the parameters to be learned. [sent-247, score-0.099]
98 $W^l$ models the correlation between $h^l$ and $h^{l+1}$, $w^l_{i,*}$ is the $i$th row of $W^l$, $g^l_i$ is the weight for $s^l_i$, and $c^l_i$ is the bias term. [sent-248, score-0.398]
99 The element $w^l_{i,j}$ of $W^l$ in (8) is set to zero if there is no connection between units $h^l_i$ and $h^{l+1}_j$ in Fig. [sent-249, score-0.138]
100 Similar to the approach in [16], the parameters in (8) are trained layer by layer and two adjacent layers are considered as a Restricted Boltzmann Machine (RBM) that has the following distributions: p(hl, hl+1|x) ∝ e? [sent-253, score-0.374]
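Illustrative sketch (not part of the extracted paper text): the sentences above describe two computations that are easy to lose in the garbled extraction, namely the rectangle-based overlap of a part with the other pedestrian's region (sentences 82 to 88) and the layer-wise visibility update (sentence 91). The Python below is a minimal sketch under stated assumptions. Equation (5) does not survive in the extracted text, so the ratio form area(part ∩ (head ∪ torso-leg)) / area(part) is an assumption. The update treats each g as a scalar weight on its score, following sentence 98. All function and variable names are hypothetical.

```python
import math


def rect_area(r):
    """Area of an axis-aligned rectangle (x0, y0, x1, y1); 0 if it is empty."""
    x0, y0, x1, y1 = r
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def rect_intersect(a, b):
    """Intersection rectangle from corner coordinates only (may be empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def overlap_ratio(part, head, torso_leg):
    """Fraction of `part` covered by the union of `head` and `torso_leg`.

    ASSUMPTION: this ratio is a stand-in for Eq. (5), which is missing
    from the extracted text. The union area is obtained by
    inclusion-exclusion, so no pixel-wise computation is needed.
    """
    area = rect_area(part)
    if area == 0.0:
        return 0.0
    in_head = rect_intersect(part, head)
    in_torso = rect_intersect(part, torso_leg)
    covered = (rect_area(in_head) + rect_area(in_torso)
               - rect_area(rect_intersect(in_head, torso_leg)))
    return covered / area


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def next_layer_visibility(h, W, c, g, s):
    """One layer-wise update in the spirit of sentence 91.

    h: visibility values at layer l (list of floats)
    W: weights, W[i][j] links unit i at layer l to unit j at layer l+1
    c, g, s: bias, score weight and detection score per unit at layer l+1
    ASSUMPTION: g[j] and s[j] are scalars.
    """
    return [
        sigmoid(sum(h[i] * W[i][j] for i in range(len(h))) + c[j] + g[j] * s[j])
        for j in range(len(c))
    ]


if __name__ == "__main__":
    # A 10x10 part whose right half is covered by the other pedestrian.
    print(overlap_ratio((0, 0, 10, 10), (5, 0, 15, 5), (5, 5, 15, 10)))
    print(next_layer_visibility([1.0, 0.0], [[2.0], [5.0]], [-1.0], [0.5], [-2.0]))
```

Entries of W set to zero correspond to unit pairs with no connection, which matches the sparsity rule stated in sentence 99.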
wordName wordTfidf (topN-words)
[('visibility', 0.489), ('pedestrian', 0.33), ('pedestrians', 0.289), ('alice', 0.28), ('hl', 0.275), ('deep', 0.232), ('bob', 0.174), ('mutual', 0.162), ('layer', 0.151), ('wl', 0.117), ('hil', 0.115), ('incompatible', 0.115), ('gjl', 0.112), ('statuses', 0.112), ('relationship', 0.109), ('parts', 0.1), ('wil', 0.099), ('cil', 0.099), ('cjl', 0.099), ('compatible', 0.088), ('pretraining', 0.075), ('status', 0.074), ('layers', 0.072), ('sl', 0.072), ('thl', 0.07), ('gl', 0.067), ('overlapping', 0.063), ('overlap', 0.057), ('hilhjl', 0.056), ('hilsil', 0.056), ('hjl', 0.056), ('hltwl', 0.056), ('wlt', 0.056), ('dy', 0.053), ('detection', 0.052), ('scores', 0.051), ('cl', 0.051), ('part', 0.047), ('shenzhen', 0.043), ('confidence', 0.043), ('window', 0.043), ('isolated', 0.043), ('ds', 0.042), ('dx', 0.042), ('region', 0.039), ('miss', 0.039), ('occlusion', 0.038), ('rbm', 0.037), ('cuhk', 0.035), ('stage', 0.035), ('accumulate', 0.033), ('among', 0.032), ('correspondingly', 0.032), ('six', 0.031), ('rectangular', 0.031), ('lowest', 0.03), ('lj', 0.03), ('ah', 0.029), ('ey', 0.029), ('denoted', 0.028), ('gmm', 0.028), ('arranged', 0.027), ('hk', 0.026), ('pl', 0.025), ('backpropagation', 0.025), ('pretrain', 0.025), ('basins', 0.025), ('fromp', 0.025), ('shallower', 0.025), ('lna', 0.025), ('asb', 0.025), ('edn', 0.025), ('arto', 0.025), ('sil', 0.025), ('kfor', 0.025), ('aered', 0.025), ('substructures', 0.025), ('thore', 0.025), ('modeled', 0.025), ('rectangles', 0.025), ('ith', 0.024), ('positive', 0.024), ('connection', 0.023), ('seven', 0.023), ('substructure', 0.023), ('tera', 0.023), ('onft', 0.023), ('eyx', 0.023), ('attraction', 0.023), ('vh', 0.023), ('wanli', 0.023), ('wlouyang', 0.023), ('detects', 0.023), ('connections', 0.022), ('oist', 0.022), ('ofits', 0.022), ('iens', 0.022), ('enzweiler', 0.022), ('rectangle', 0.022), ('score', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang
Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.
2 0.37064388 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.
3 0.28329059 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li
Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).
4 0.18929546 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning
Author: Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann Lecun
Abstract: Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-the-art and competitive results on all major pedestrian datasets with a convolutional network model. The model uses a few new twists, such as multi-stage features, connections that skip layers to integrate global shape information with local distinctive motif information, and an unsupervised method based on convolutional sparse coding to pre-train the filters at each stage.
5 0.16331227 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
Author: Tao Wang, Xuming He, Nick Barnes
Abstract: We propose a structured Hough voting method for detecting objects with heavy occlusion in indoor environments. First, we extend the Hough hypothesis space to include both object location and its visibility pattern, and design a new score function that accumulates votes for object detection and occlusion prediction. In addition, we explore the correlation between objects and their environment, building a depth-encoded object-context model based on RGB-D data. Particularly, we design a layered context representation and allow image patches from both objects and backgrounds voting for the object hypotheses. We demonstrate that using a data-driven 2.1D representation we can learn visual codebooks with better quality, and more interpretable detection results in terms of spatial relationship between objects and viewer. We test our algorithm on two challenging RGB-D datasets with significant occlusion and intraclass variation, and demonstrate the superior performance of our method.
6 0.15626773 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
7 0.14398029 330 cvpr-2013-Photometric Ambient Occlusion
8 0.13170061 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns
9 0.11328567 32 cvpr-2013-Action Recognition by Hierarchical Sequence Summarization
10 0.11200304 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
11 0.11099245 158 cvpr-2013-Exploring Weak Stabilization for Motion Feature Extraction
12 0.10907337 154 cvpr-2013-Explicit Occlusion Modeling for 3D Object Class Representations
13 0.10213653 383 cvpr-2013-Seeking the Strongest Rigid Detector
14 0.097230338 105 cvpr-2013-Deep Learning Shape Priors for Object Segmentation
15 0.093432888 218 cvpr-2013-Improving the Visual Comprehension of Point Sets
16 0.092345707 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
17 0.088256113 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
18 0.083027877 264 cvpr-2013-Learning to Detect Partially Overlapping Instances
19 0.082024686 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
20 0.080296852 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
topicId topicWeight
[(0, 0.142), (1, -0.024), (2, 0.017), (3, -0.04), (4, 0.056), (5, 0.011), (6, 0.111), (7, 0.075), (8, 0.045), (9, -0.038), (10, -0.121), (11, -0.096), (12, 0.133), (13, -0.218), (14, 0.122), (15, 0.1), (16, -0.188), (17, 0.119), (18, 0.037), (19, 0.039), (20, -0.013), (21, -0.1), (22, -0.137), (23, 0.099), (24, -0.029), (25, -0.098), (26, 0.025), (27, 0.004), (28, 0.036), (29, 0.098), (30, 0.094), (31, -0.023), (32, 0.016), (33, 0.037), (34, -0.013), (35, -0.038), (36, -0.077), (37, 0.005), (38, 0.064), (39, -0.038), (40, -0.024), (41, 0.032), (42, 0.073), (43, 0.013), (44, 0.081), (45, 0.07), (46, -0.09), (47, 0.086), (48, -0.018), (49, 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.94736272 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang
Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.
2 0.79121733 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li
Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).
3 0.77642757 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.
4 0.64415145 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning
Author: Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann Lecun
Abstract: Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-the-art and competitive results on all major pedestrian datasets with a convolutional network model. The model uses a few new twists, such as multi-stage features, connections that skip layers to integrate global shape information with local distinctive motif information, and an unsupervised method based on convolutional sparse coding to pre-train the filters at each stage.
5 0.63526815 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns
Author: Dan Levi, Shai Silberstein, Aharon Bar-Hillel
Abstract: In this work we present a new part-based object detection algorithm with hundreds of parts performing real-time detection. Part-based models are currently state-of-the-art for object detection due to their ability to represent large appearance variations. However, due to their high computational demands such methods are limited to several parts only and are too slow for practical real-time implementation. Our algorithm is an accelerated version of the “Feature Synthesis” (FS) method [1], which uses multiple object parts for detection and is among state-of-the-art methods on human detection benchmarks, but also suffers from a high computational cost. The proposed Accelerated Feature Synthesis (AFS) uses several strategies for reducing the number of locations searched for each part. The first strategy uses a novel algorithm for approximate nearest neighbor search which we developed, termed “KD-Ferns”, to compare each image location to only a subset of the model parts. Candidate part locations for a specific part are further reduced using spatial inhibition, and using an object-level “coarse-to-fine” strategy. In our empirical evaluation on pedestrian detection benchmarks, AFS maintains almost fully the accuracy performance of the original FS, while running more than 4× faster than existing part-based methods which use only several parts. AFS is to our best knowledge the first part-based object detection method achieving real-time running performance: nearly 10 frames per-second on 640 × 480 images on a regular CPU.
6 0.5899173 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
7 0.57293999 383 cvpr-2013-Seeking the Strongest Rigid Detector
8 0.5666815 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence
9 0.51199329 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
10 0.47356281 311 cvpr-2013-Occlusion Patterns for Object Class Detection
11 0.41683227 154 cvpr-2013-Explicit Occlusion Modeling for 3D Object Class Representations
12 0.40878293 264 cvpr-2013-Learning to Detect Partially Overlapping Instances
13 0.40820226 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
14 0.38870117 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
15 0.37062395 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
16 0.36898682 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection
17 0.36886689 105 cvpr-2013-Deep Learning Shape Priors for Object Segmentation
18 0.36446619 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors
19 0.35657242 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
20 0.35589263 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video
topicId topicWeight
[(10, 0.108), (16, 0.035), (26, 0.025), (33, 0.195), (39, 0.011), (53, 0.233), (67, 0.158), (69, 0.097), (87, 0.034)]
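The topic list above is a sparse vector of (topicId, topicWeight) pairs. As a minimal sketch of how two such vectors could be compared, the snippet below computes a cosine similarity between them. This is an assumption made for illustration: the page does not state how simValue is computed, and since the same-paper entry scores 0.836 rather than 1, the actual measure is presumably something other than a plain cosine on these weights. The function name and the second paper's weights are hypothetical.

```python
import math

def sparse_cosine(a, b):
    """Cosine similarity between two sparse vectors given as {topicId: weight} dicts."""
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Topic weights listed above for this paper.
this_paper = dict([(10, 0.108), (16, 0.035), (26, 0.025), (33, 0.195), (39, 0.011),
                   (53, 0.233), (67, 0.158), (69, 0.097), (87, 0.034)])

# Hypothetical second paper, invented for illustration only.
other_paper = {33: 0.21, 53: 0.19, 67: 0.12, 90: 0.30}

print(sparse_cosine(this_paper, this_paper))   # self-similarity
print(sparse_cosine(this_paper, other_paper))  # similarity to the made-up paper
```

Storing only the non-zero topics keeps the comparison cheap when each paper touches a handful of topics out of many.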
simIndex simValue paperId paperTitle
same-paper 1 0.83605349 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang
Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%–8% improvements on multiple benchmark datasets.
Author: Luping Zhou, Lei Wang, Lingqiao Liu, Philip Ogunbona, Dinggang Shen
Abstract: Analyzing brain networks from neuroimages is becoming a promising approach in identifying novel connectivity-based biomarkers for Alzheimer’s disease (AD). In this regard, brain “effective connectivity” analysis, which studies the causal relationship among brain regions, is highly challenging and offers many research opportunities. Most of the existing works in this field use generative methods. Despite their success in data representation and other important merits, generative methods are not necessarily discriminative, which may cause the ignorance of subtle but critical disease-induced changes. In this paper, we propose a learning-based approach that integrates the benefits of generative and discriminative methods to recover effective connectivity. In particular, we employ Fisher kernel to bridge the generative models of sparse Bayesian networks (SBN) and the discriminative classifiers of SVMs, and convert the SBN parameter learning to Fisher kernel learning via minimizing a generalization error bound of SVMs. Our method is able to simultaneously boost the discriminative power of both the generative SBN models and the SBN-induced SVM classifiers via Fisher kernel. The proposed method is tested on analyzing brain effective connectivity for AD from ADNI data, and demonstrates significant improvements over the state-of-the-art work.
3 0.74518591 103 cvpr-2013-Decoding Children's Social Behavior
Author: James M. Rehg, Gregory D. Abowd, Agata Rozga, Mario Romero, Mark A. Clements, Stan Sclaroff, Irfan Essa, Opal Y. Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan C. Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye
Abstract: We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.
4 0.73905504 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.
5 0.73354042 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification
Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah
Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.
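The reduction described in this abstract, one ℓ1-minimization over the mean of the face track instead of one per frame, can be illustrated with a rough sketch. Everything below is an assumption rather than the authors' implementation: ISTA is used as a stand-in ℓ1 solver, the class decision uses the per-class reconstruction residual from standard sparse-representation classification, the rejection of unknown identities is omitted, and the toy data and the names `ista_l1` and `classify_track` are hypothetical.

```python
import numpy as np

def ista_l1(A, y, lam=0.01, n_iter=500):
    """Minimize 0.5*||y - A x||^2 + lam*||x||_1 with ISTA (a stand-in solver)."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth term's gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L                           # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

def classify_track(A, labels, track, lam=0.01):
    """A: (d, n) dictionary with unit-norm columns; labels: (n,); track: (d, T) frames."""
    y = track.mean(axis=1)          # one mean vector replaces the T frames
    y = y / np.linalg.norm(y)
    x = ista_l1(A, y, lam)          # a single l1 problem for the whole track
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals[int(c)] = float(np.linalg.norm(y - A @ xc))
    return min(residuals, key=residuals.get), residuals

# Toy data, invented for illustration: two identities, 20-dimensional features.
rng = np.random.default_rng(0)
protos = rng.normal(size=(2, 20))
atoms, labels = [], []
for c in range(2):
    for _ in range(5):
        atoms.append(protos[c] + 0.05 * rng.normal(size=20))
        labels.append(c)
A = np.array(atoms).T
A = A / np.linalg.norm(A, axis=0)   # unit-norm dictionary columns
labels = np.array(labels)

# A track of 8 noisy frames of identity 1.
track = protos[1][:, None] + 0.05 * rng.normal(size=(20, 8))
pred, residuals = classify_track(A, labels, track)
print(pred, residuals)
```

The cost saving comes from the solver being called once per track rather than once per frame; the dictionary and the residual test are unchanged.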
6 0.73156905 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation
7 0.73145986 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers
8 0.7274515 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search
10 0.7188524 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation
11 0.71820742 453 cvpr-2013-Video Editing with Temporal, Spatial and Appearance Consistency
12 0.71770316 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video
13 0.71740502 246 cvpr-2013-Learning Binary Codes for High-Dimensional Data Using Bilinear Projections
14 0.71643829 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
15 0.7157585 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking
16 0.71292174 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
17 0.71242535 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
18 0.71056378 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence
19 0.70759332 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
20 0.70559812 172 cvpr-2013-Finding Group Interactions in Social Clutter