
318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People


Source: pdf

Author: Sitapa Rujikietgumjorn, Robert T. Collins

Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. [sent-6, score-0.511]

2 The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. [sent-7, score-1.069]

3 The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. [sent-8, score-0.93]

4 In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors. [sent-9, score-0.28]

5 Notwithstanding these difficulties, good progress has been made on the problem of detecting individual walking pedestrians through the use of statistical machine learning methods for training pedestrian object detectors [4]. [sent-12, score-0.361]

6 As shown in Figure 1 using the PLS detector [20] with its default non maximum suppression method, correct candidate detections were present that were suppressed by stronger [sent-17, score-0.684]

7 (a) PLS detector using its default non maximum suppression method misses one person. [sent-18, score-0.357]

8 (b) Examining candidate object center locations from the PLS detector without non maximum suppression shows that candidate detections were generated for that person, but they were subsequently suppressed. [sent-19, score-0.788]

9 The aim of this paper is to show that quadratic optimization for reasoning about overlapping detections can improve the performance of a pedestrian detection system, especially when there are multiple overlapping people. [sent-21, score-0.946]

10 Loosely speaking, the unary scores reward candidates with high detector confidence, whereas the pairwise scores impose a penalty for excessive amounts of overlap between two candidates. [sent-23, score-0.909]

11 The problem is to find a binary vector that maximizes the quadratic objective function, which is a classic problem of quadratic unconstrained binary optimization (QUBO). [sent-24, score-0.782]

12 First, a large set of possible detection candidates (Figure 2(b) and 2(f)) is generated, based on shape covering of a foreground mask in the top row, or sampling from a confidence map produced by a standard pedestrian detector (with non-maximum suppression disabled) in the bottom row. [sent-27, score-1.349]

13 In both cases, the object configuration that maximizes the quadratic objective function is found and shown in Figure 2(c) and 2(g). [sent-28, score-0.339]

14 The top row uses a foreground shape covering approach to produce a large set of candidate detections. [sent-31, score-0.483]

15 The bottom row samples from confidence maps produced by a sliding window human detector to generate candidate detections. [sent-32, score-0.668]

16 In both cases, we perform the same binary quadratic optimization procedure to choose the solution set of candidates that maximize a quadratic objective function. [sent-33, score-0.792]

17 (e) sliding window detector confidence map (only one scale level shown). [sent-37, score-0.522]

18 candidate detections to optimize the tradeoff between unary confidence scores and pairwise overlap penalties. [sent-40, score-0.955]

19 Of the several approaches that have been proposed for detecting pedestrians, one common method uses a pre-trained classifier within a sliding window to scan the whole image looking for people at all locations and scales. [sent-43, score-0.254]

20 Since sliding window methods usually generate multiple overlapping detections on a person, the common final step is to apply non-maximum suppression as an attempt to remove false positive detections [9, 11]. [sent-49, score-0.687]

21 One line of work proposes to segment foreground blobs into human shapes using an MCMC-based optimization approach to determine the number and configuration of overlapping shapes [23, 24, 10]. [sent-51, score-0.55]

22 We propose to use Quadratic Unconstrained Binary Optimization (QUBO) for pedestrian detection in order to reason more directly/thoroughly about overlapping detection candidates and their associated confidence scores and overlap penalties. [sent-53, score-1.119]

23 A quadratic cost function is used to represent criteria relating object confidence, overlapping hypotheses, and spatial interactions between pairs of objects from different classes. [sent-56, score-0.332]

24 Proposed Optimized Detection Framework. Figure 3 presents a “big picture” overview of how our approach would be incorporated into a typical pedestrian detection pipeline. [sent-63, score-0.358]

25 First, an existing pedestrian detector is applied to produce a detection confidence score map, or if that is not available as output, a set of unfiltered bounding boxes with associated confidence scores. [sent-65, score-1.142]

26 A unary confidence score is computed for each candidate, to represent the quality of that proposed detection. [sent-67, score-0.512]

27 Furthermore, for candidates that overlap, a pairwise score is computed to specify a penalty that will be incurred if both candidates are kept in the final solution. [sent-68, score-0.535]

28 The purpose of this penalty is to prohibit excessive amounts of overlap while still allowing some amount of reasonable overlap to occur. [sent-69, score-0.298]
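
The pairwise penalty described above depends on how much two candidates overlap. The following is a minimal sketch of one way to measure that overlap, assuming axis-aligned boxes given as (x, y, width, height) and assuming intersection-over-union as the ratio. The text only speaks of "overlap ratios", so both the measure and the function name `overlap_ratio` are illustrative assumptions rather than the authors' definition.

```python
def overlap_ratio(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes.

    IoU is an assumed overlap measure; the source does not specify one.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (negative if disjoint).
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union
```

A ratio of 0 means the candidates are disjoint and incur no penalty; values near 1 mean they are nearly the same box, which is the case the penalty is meant to discourage.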

29 In the second step, unary and pairwise scores are grouped into a cost matrix to form the objective function for a quadratic unconstrained binary optimization (QUBO) problem. [sent-70, score-0.788]

30 In this QUBO problem, the unknown binary variables to solve for represent whether to keep or discard each pedestrian candidate from the final solution set of detections. [sent-71, score-0.549]

31 An optimization algorithm is then applied to search for an assignment of 0’s and 1’s to candidates yielding a high, ideally the maximum, objective function value. [sent-72, score-0.352]

32 1 discusses generation of candidates and objective function scores for two very different types of detection approach. [sent-74, score-0.43]

33 The first is a shape covering approach, based on finding size and placement of a set of pedestrian shapes in order to cover the pixels of a foreground mask computed by, e. [sent-75, score-0.814]

34 Overview of the Proposed Optimized Detection Framework. The second approach uses a multi-scale confidence map produced by an existing, publicly available appearance-based detector. [sent-79, score-0.343]

35 2 discusses the second step of transforming into a binary quadratic objective function and efficient methods for generating high-quality approximate solutions to the resulting QUBO problem. [sent-81, score-0.457]

36 1 Detection Candidates by Shape Covering. Previous works have considered the problem of detecting people as a “shape covering” of foreground mask data [10, 23]. [sent-86, score-0.289]

37 That is, given a foreground mask computed by background subtraction or motion analysis, a solution is sought as to the number, location, size and possibly articulation of a set of shapes to cover as many foreground pixels as possible while leaving as many background pixels as possible uncovered. [sent-87, score-0.528]

38 To avoid unnecessary proliferation of overlapping shapes, these methods augment the covering quality term of the objective function with either prior terms on the number of objects present, or with data terms that penalize object overlap. [sent-88, score-0.336]

39 To use this shape covering approach in practice, we first generate a lookup table relating location (x,y) in the image to expected height and width of a pedestrian centered at that location. [sent-91, score-0.673]

40 Given an automatically computed foreground mask, a candidate set of shapes is generated by methodically sampling midpoint locations every 10 pixels in x and y, looking up the expected width and height at each location, and computing a unary score ci for each candidate xi. [sent-95, score-0.98]
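
The candidate generation step just described (sample midpoints on a regular grid, then look up the expected size at each one) can be sketched as follows. This is an illustration under stated assumptions: `expected_size` is a hypothetical callable standing in for the lookup table, `toy_lookup` is an invented size model, and the unary score is deliberately left out because its formula is not recoverable from the extracted text.

```python
def grid_candidates(mask_width, mask_height, expected_size, step=10):
    """Sample candidate midpoints every `step` pixels and attach expected sizes.

    expected_size is a hypothetical callable (x, y) -> (width, height)
    standing in for the lookup table described in the text.
    """
    candidates = []
    for y in range(0, mask_height, step):
        for x in range(0, mask_width, step):
            w, h = expected_size(x, y)
            # Bounding box (x, y, w, h) centered on the sampled midpoint.
            candidates.append((x - w / 2, y - h / 2, w, h))
    return candidates


def toy_lookup(x, y):
    """Invented size model: pedestrians appear taller lower in the image."""
    height = 40 + 0.5 * y
    return height / 2.5, height


boxes = grid_candidates(40, 30, toy_lookup)
```

Each returned box would then be scored against the foreground mask to obtain its unary confidence.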

41 We use three common pedestrian poses shown in Figure 4 as the three shapes that can be proposed for shape covering. [sent-96, score-0.426]

42 For each candidate xi, three unary confidence scores are computed using each of these three pedestrian shapes, scaled to the size of the detection candidate bounding box. [sent-97, score-1.216]

43 2 Detection Candidates by Bounding Box Filtering. Many object detection approaches use a sliding window based detector to generate a confidence score map, and then generate a set of final detections through a process of non maximum suppression. [sent-106, score-0.89]

44 For our approach, we modify an existing Histogram of Oriented Gradients (HOG) based pedestrian detector [1], available in OpenCV, and apply it at multiple scales without non maximum suppression to generate a multi-scale detection confidence map. [sent-107, score-0.934]

45 In the experiment, a set of 500 detection candidates is then randomly sampled from each detection scale, with the likelihood of a candidate being sampled at location (x,y) being proportional to the detector confidence at that location and scale. [sent-108, score-0.876]
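
Sampling candidates with likelihood proportional to detector confidence can be sketched with the standard library alone. The following is a minimal illustration for a single scale level, assuming the confidence map is a list of rows of non-negative numbers. The function name `sample_candidates` and the toy map are hypothetical, and sampling with replacement is an assumption, since the text does not say whether repeats are allowed.

```python
import random


def sample_candidates(confidence_map, k=500, seed=0):
    """Draw k (x, y) locations with probability proportional to confidence."""
    rng = random.Random(seed)
    locations = []
    weights = []
    for y, row in enumerate(confidence_map):
        for x, conf in enumerate(row):
            if conf > 0:  # zero-confidence cells can never be drawn
                locations.append((x, y))
                weights.append(conf)
    # random.choices samples with replacement (an assumption, see lead-in).
    return rng.choices(locations, weights=weights, k=k)


# Toy single-scale confidence map with one strong peak in the middle.
toy_map = [
    [0.0, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.1, 0.0],
]
samples = sample_candidates(toy_map, k=500)
```

Most samples land on the peak, with a few on its weaker neighbours, which is what lets the later optimization choose among several slightly different placements of the same person.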

46 Figure 2(f) shows candidate samples generated from the confidence score map in Figure 2(e) (only one scale level of the map is shown). [sent-109, score-0.504]

47 For example, images taken by a camera near eye level will have a large allowable range of scales, whereas an elevated camera much farther away may not see any difference in pedestrian size across the image. [sent-113, score-0.346]

48 Expected pedestrian bounding box size learned as a regression function between y location in the image and height of a detected pedestrian at that location. [sent-116, score-0.746]

49 Some previous works have used a priori pedestrian height distributions to estimate camera calibration information from noisy pedestrian detections [13, 18]. [sent-119, score-0.847]

50 In our work, we employ a simple online learning approach that learns a regression function on expected bounding box height versus location in the image from a set of detections of pedestrians at different locations in images taken from a stationary camera view. [sent-120, score-0.479]

51 The light blue line is a quadratic regression function learned for one frame only, while the dark blue line is the height approximation found using data acquired over several frames. [sent-123, score-0.284]

52 However, rather than apply a hard threshold to filter improperly sized detections, we reduce unary confidence scores according to the dissimilarity between a candidate bounding box scale and the expected detection height at that location. [sent-125, score-0.943]

53 The detection confidence score will be greatly reduced when the detected box is much larger or smaller than the learned size estimate, reducing the chances that the candidate will be kept in the solution vector returned by QUBO. [sent-126, score-0.62]
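
The soft scale filter described above reduces a candidate's confidence as its height departs from the expected height, instead of rejecting it outright. The extracted text does not give the weighting function, so the sketch below assumes a Gaussian falloff on the relative height error. The function name, the Gaussian form and the `tolerance` parameter are all illustrative assumptions.

```python
import math


def scale_adjusted_confidence(confidence, box_height, expected_height, tolerance=0.25):
    """Down-weight a detection whose height disagrees with the expected height.

    The Gaussian form and `tolerance` are assumptions for illustration only.
    """
    relative_error = (box_height - expected_height) / expected_height
    return confidence * math.exp(-0.5 * (relative_error / tolerance) ** 2)
```

A box that matches the expected height keeps its full confidence, while one twice as tall is scaled down to nearly zero, so it is unlikely to survive the optimization without being hard-thresholded.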

54 In this work, we use the quadratic objective function = ? [sent-134, score-0.284]

55 xi ∈ {0, 1}, i = 1, ..., n are the binary variables to be solved for, ci are n unary . [sent-150, score-0.32]

56 1 Objective Function. The quadratic objective function in Eq. [sent-157, score-0.284]

57 2 is computed by combining unary and pairwise terms. [sent-158, score-0.261]

58 Each unary score ci is a measure of confidence that candidate xi represents a person, while the pairwise scores ci,j penalize excessive overlap between pairs of candidates. [sent-159, score-1.033]

59 The three elliptically shaped candidates from left to right have unary values 3425, 4412 and 3658 computed from Eq. [sent-164, score-0.371]

60 For each pair of distinct overlapping candidates xi and xj, the overlap penalty ci,j (xi, xj) is -4594 for shapes x1 and x2, -1998 for shapes x1 and x3, and -3432 for shapes x2 and x3. [sent-166, score-0.746]

61 We find that the optimal solution [1, 0, 1] specifies that candidates x1 and x3 should be kept, while candidate x2 should be discarded. [sent-170, score-0.339]

62 Note that if we applied a traditional, greedy non maximum suppression approach where the candidate with highest confidence score is chosen and overlapping candidates of lesser score are suppressed, we would have chosen to keep only the middle candidate x2, while suppressing the other two. [sent-171, score-1.27]
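
The three-candidate example can be checked by brute force. The sketch below uses the unary values and overlap penalties quoted in the text and assumes the objective is the sum of the unary scores of the kept candidates plus each pairwise penalty counted once per kept pair. That counting convention is an assumption, chosen because it reproduces the stated optimum. The `greedy_nms` helper is a simplified stand-in that treats every listed pair as overlapping enough to suppress.

```python
from itertools import product

# Values quoted in the text for candidates x1, x2, x3 (0-indexed here).
unary = [3425, 4412, 3658]
pairwise = {(0, 1): -4594, (0, 2): -1998, (1, 2): -3432}


def objective(x):
    """Unary rewards plus pairwise penalties, each pair counted once (assumed)."""
    value = sum(c * xi for c, xi in zip(unary, x))
    value += sum(p * x[i] * x[j] for (i, j), p in pairwise.items())
    return value


# Exhaustive search over all 2^3 keep/discard assignments.
best = max(product([0, 1], repeat=len(unary)), key=objective)


def greedy_nms():
    """Keep the highest-scoring candidate and suppress everything overlapping it."""
    keep = [0] * len(unary)
    suppressed = set()
    for i in sorted(range(len(unary)), key=lambda i: -unary[i]):
        if i in suppressed:
            continue
        keep[i] = 1
        for (a, b) in pairwise:
            if i in (a, b):
                suppressed.add(b if a == i else a)
    return tuple(keep)
```

Under these assumptions the exhaustive search returns (1, 0, 1) with objective 5085, while the greedy suppression returns (0, 1, 0) with objective 4412, matching the behaviour described in the text.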

63 3 can be formed as Q = w1 U − w2 P (4), where U is a diagonal unary score matrix, P is the pairwise score matrix (overlap ratios), and w1, w2 are relative [sent-173, score-0.35]
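
The matrix form Q = w1 U − w2 P can be assembled directly from a list of unary scores and a table of overlap ratios. The following is a minimal pure-Python sketch. The function names and the example weights are hypothetical, and P is assumed to be symmetric with a zero diagonal.

```python
def build_q(unary, overlap, w1=1.0, w2=1.0):
    """Return Q = w1 * U - w2 * P as a list of rows.

    unary: list of n confidences, placed on the diagonal of U.
    overlap: n x n symmetric overlap ratios with a zero diagonal (P, assumed).
    """
    n = len(unary)
    return [
        [w1 * unary[i] * (i == j) - w2 * overlap[i][j] for j in range(n)]
        for i in range(n)
    ]


def quadratic_form(q, x):
    """Evaluate x^T Q x for a binary vector x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Two candidates with an overlap ratio of 0.5 and illustrative weights.
q = build_q([0.9, 0.8], [[0.0, 0.5], [0.5, 0.0]], w1=1.0, w2=0.6)
```

Note that with a symmetric P the quadratic form counts each overlapping pair twice, once for (i, j) and once for (j, i). Whether P stores the full ratio or half of it in each entry is a convention the extracted text does not settle.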

64 Using quadratic binary optimization to find the best set of detection candidates. [sent-174, score-0.39]

65 2 by adding a second unary term that represents the score from a second detector or other additional information. [sent-178, score-0.371]

66 For example, in our experiments we have explored combining unary scores computed from foreground shapes with scores computed from a detector confidence map. [sent-179, score-0.95]

67 Weight Parameter Estimation. Even though the unary and pairwise scores are both normalized values between 0 and 1, they represent very different types of information, one being an appearance-based detection confidence and the other being an area ratio. [sent-183, score-0.665]

68 Furthermore, the amount of “acceptable” bounding box overlap for a given situation may depend on the expected density of people in the scene as well as on the camera viewpoint. [sent-184, score-0.281]

69 For this reason, it is better to weight the relative contributions of the unary and pairwise terms with weighting parameters learned from representative training data. [sent-185, score-0.261]

70 The single candidate that gives the highest unary score is first selected. [sent-197, score-0.407]

71 Then, given a set of previously selected candidates, each remaining candidate is tentatively added to the set, and the set yielding the highest new objective function value becomes the new current solution set. [sent-198, score-0.249]

72 The process stops when adding any single candidate to the solution set will reduce the objective function value. [sent-199, score-0.249]
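
The greedy forward search described in the last three sentences can be sketched as follows. This is an illustration, not the authors' code: it assumes pairwise penalties are stored per index pair and counted once, and it stops as soon as no single addition strictly increases the objective.

```python
def greedy_forward(unary, pairwise):
    """Greedy forward selection of a keep/discard vector.

    pairwise maps index pairs (i, j), i < j, to penalties counted once per pair
    (an assumed convention).
    """
    n = len(unary)

    def objective(selected):
        value = sum(unary[i] for i in selected)
        value += sum(p for (i, j), p in pairwise.items()
                     if i in selected and j in selected)
        return value

    # Start from the single candidate with the highest unary score.
    selected = {max(range(n), key=lambda i: unary[i])}
    current = objective(selected)
    while True:
        best_gain, best_idx = 0, None
        for i in range(n):
            if i in selected:
                continue
            gain = objective(selected | {i}) - current
            if gain > best_gain:
                best_gain, best_idx = gain, i
        if best_idx is None:  # no remaining candidate strictly improves the objective
            break
        selected.add(best_idx)
        current += best_gain
    return [1 if i in selected else 0 for i in range(n)]
```

On the three-candidate example from the text (unary values 3425, 4412, 3658 and penalties -4594, -1998, -3432), this sketch returns [0, 1, 1] with objective 4638, short of the 5085 achieved by [1, 0, 1]. That illustrates why the greedy method is only an approximate solver: once it commits to the strongest candidate, it cannot undo that choice.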

73 As a third approach, we relax the binary variable constraints into 0 ≤ xi ≤ 1 to transform QUBO into a continuous quadratic programming problem. [sent-200, score-0.394]

74 Experimental Results. In this section we evaluate our proposed quadratic binary optimization framework to detect overlapping pedestrians. [sent-204, score-0.416]

75 We also evaluate three methods for solving the constructed QUBO optimization problem: Tabu search, greedy algorithm, and relaxation into a continuous quadratic program. [sent-206, score-0.327]

76 We use Matlab’s trust-region method to solve the relaxed quadratic programming problem. [sent-209, score-0.27]

77 Dataset. We perform quantitative evaluation of our approach using a pedestrian dataset from EPFL called the Terrace sequences and on the PETS 2009 dataset. [sent-210, score-0.28]

78 These two datasets were chosen because of the large number of occlusions, a challenging issue in pedestrian detection. [sent-211, score-0.28]

79 (a) Comparison between Tabu search, Greedy, and Quadratic programming methods using foreground shape covering. [sent-216, score-0.254]

80 Figure 7(a) shows a quantitative comparison between Tabu Search, greedy algorithm, and quadratic programming results when candidates are generated based on foreground shape covering. [sent-219, score-0.717]

81 Both greedy algorithm and quadratic programming perform reasonably well. [sent-220, score-0.359]

82 Based on this result, we decided to use quadratic programming to evaluate our three options for generating candidates and unary confidence scores (foreground shape covering, detector confidence, confidence+foreground shape cover). [sent-225, score-1.194]

83 Two other approaches compared in those plots are OpenCV’s HOG-based human detector [1] and the PLS detector of [20], both using their default non-maximum suppression methods. [sent-230, score-0.354]

84 Approach 1, based on finding a shape covering of a foreground mask, works surprisingly well given the simplicity of the approach compared to the sophisticated appearance-based detectors it is being compared against. [sent-231, score-0.409]

85 Comparing approaches 1 and 2, using detector confidence scores yields better results on the EPFL dataset, whereas foreground shape covering gives the better result on PETS. [sent-232, score-0.773]

86 In EPFL, people are large, with clearly visible edge appearance information, whereas the small, low-resolution crowds in PETS are a situation where fitting body shapes to cover foreground blobs yields better results. [sent-234, score-0.349]

87 In approach 3 we attempt to improve results using a hybrid of detector confidence combined with foreground covering as a second unary term, but with mixed results. [sent-235, score-0.878]

88 However, it demonstrates that the QUBO framework is flexible enough to be applied to a variety of detectors or combinations of unary and pairwise information. [sent-236, score-0.261]

89 Conclusion. We have presented a framework for improving pedestrian detection performance in cases where there are multiple, overlapping objects. [sent-243, score-0.462]

90 A QUBO framework is adopted where a quadratic objective function is formed from unary confidence scores and pairwise overlap penalties. [sent-244, score-0.977]

91 The unary terms are not limited to a specific type of detector and can be applied to various types of detection confidence scores with some adjustment. [sent-245, score-0.712]

92 Solving for the binary solution vector that maximizes this quadratic objective function automatically balances the tradeoff between encouraging multiple, high-quality detections and discouraging excessive amounts of overlap. [sent-246, score-0.526]

93 Since finding exact solutions for large-scale QUBO problems is not possible, we evaluate three approximate methods: heuristic Tabu search, greedy forward search, and relaxation to a continuous quadratic program. [sent-247, score-0.399]

94 Our results show that the use of binary quadratic optimization to explicitly reason about pedestrian candidate confidences and overlaps yields a performance improvement over existing detection methods that use non-maximum suppression, in terms of lower miss rates and lower false positives. [sent-249, score-0.904]

95 sliding window detector that produces either a detection confidence map or a set of unfiltered, thresholded bounding boxes with associated confidence scores. [sent-254, score-0.926]

96 This covering approach can be generalized to use a library of realistic pedestrian shapes, such as those in [10, 23]. [sent-256, score-0.432]

97 Survey of pedestrian detection for advanced driver assistance systems. [sent-337, score-0.358]

98 Multistart tabu search strategies for the unconstrained binary quadratic optimization problem. [sent-373, score-0.807]

99 Hand tracking by binary quadratic programming and its application to retail activity recognition. [sent-405, score-0.344]

100 (c) Our approach 3: hybrid of confidence score and shape covering. [sent-417, score-0.389]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('qubo', 0.459), ('tabu', 0.37), ('pedestrian', 0.28), ('confidence', 0.248), ('quadratic', 0.204), ('unary', 0.201), ('candidates', 0.17), ('covering', 0.152), ('candidate', 0.143), ('detections', 0.142), ('foreground', 0.14), ('pls', 0.111), ('suppression', 0.111), ('detector', 0.107), ('overlapping', 0.104), ('shapes', 0.098), ('sliding', 0.093), ('greedy', 0.089), ('overlap', 0.083), ('objective', 0.08), ('height', 0.08), ('epfl', 0.08), ('non', 0.079), ('scores', 0.078), ('detection', 0.078), ('pets', 0.077), ('binary', 0.074), ('mask', 0.069), ('search', 0.068), ('programming', 0.066), ('score', 0.063), ('excessive', 0.062), ('pairwise', 0.06), ('unconstrained', 0.057), ('maximizes', 0.055), ('positives', 0.052), ('pedestrians', 0.051), ('people', 0.05), ('xi', 0.05), ('palubeckis', 0.049), ('terrace', 0.049), ('window', 0.049), ('shape', 0.048), ('june', 0.047), ('false', 0.046), ('miss', 0.045), ('penalty', 0.045), ('ci', 0.045), ('bounding', 0.045), ('variants', 0.044), ('multistart', 0.044), ('proach', 0.042), ('suppressed', 0.042), ('opencv', 0.042), ('unfiltered', 0.04), ('qx', 0.038), ('conference', 0.037), ('crowd', 0.036), ('pattern', 0.035), ('expected', 0.035), ('box', 0.035), ('optimization', 0.034), ('blobs', 0.034), ('camera', 0.033), ('boxes', 0.033), ('locations', 0.032), ('calibration', 0.032), ('fppi', 0.032), ('maximum', 0.031), ('detecting', 0.03), ('hybrid', 0.03), ('pages', 0.029), ('hog', 0.029), ('default', 0.029), ('produced', 0.028), ('heuristic', 0.028), ('articulation', 0.028), ('intelligence', 0.028), ('lookup', 0.028), ('forward', 0.027), ('kept', 0.027), ('ieee', 0.027), ('solutions', 0.027), ('cover', 0.027), ('annals', 0.027), ('dollar', 0.027), ('solution', 0.026), ('papers', 0.026), ('location', 0.026), ('keep', 0.026), ('amounts', 0.025), ('map', 0.025), ('transactions', 0.025), ('discusses', 0.024), ('approximate', 0.024), ('relating', 0.024), ('generating', 0.024), ('formed', 0.023), ('collins', 0.023), ('stochastic', 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People

Author: Sitapa Rujikietgumjorn, Robert T. Collins

Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.

2 0.23889828 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

Author: Wanli Ouyang, Xiaogang Wang

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.

3 0.21745972 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes

Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li

Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).

4 0.16041383 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

Author: Guang Shu, Afshin Dehghan, Mubarak Shah

Abstract: We propose an approach to improve the detection performance of a generic detector when it is applied to a particular video. The performance of offline-trained object detectors is usually degraded in unconstrained video environments due to varying illumination, backgrounds and camera viewpoints. Moreover, most object detectors are trained using Haar-like features or gradient features but ignore video-specific features like consistent color patterns. In our approach, we apply a Superpixel-based Bag-of-Words (BoW) model to iteratively refine the output of a generic detector. Compared to other related work, our method builds a video-specific detector using superpixels, hence it can handle the problem of appearance variation. Most importantly, using a Conditional Random Field (CRF) along with our superpixel-based BoW model, we develop an algorithm to segment the object from the background. Therefore our method generates an output of the exact object regions instead of the bounding boxes generated by most detectors. In general, our method takes detection bounding boxes of a generic detector as input and generates the detection output with higher average precision and precise object regions. The experiments on four recent datasets demonstrate the effectiveness of our approach, which improves the state-of-the-art detector by 5-16% in average precision.

5 0.15626773 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection

Author: Wanli Ouyang, Xingyu Zeng, Xiaogang Wang

Abstract: Detecting pedestrians in cluttered scenes is a challenging problem in computer vision. The difficulty is added when several pedestrians overlap in images and occlude each other. We observe, however, that the occlusion/visibility statuses of overlapping pedestrians provide useful mutual relationship for visibility estimation - the visibility estimation of one pedestrian facilitates the visibility estimation of another. In this paper, we propose a mutual visibility deep model that jointly estimates the visibility statuses of overlapping pedestrians. The visibility relationship among pedestrians is learned from the deep model for recognizing co-existing pedestrians. Experimental results show that the mutual visibility deep model effectively improves the pedestrian detection results. Compared with existing image-based pedestrian detection approaches, our approach has the lowest average miss rate on the Caltech-Train dataset, the Caltech-Test dataset and the ETH dataset. Including mutual visibility leads to 4%-8% improvements on multiple benchmark datasets.

6 0.1400113 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

7 0.13509682 158 cvpr-2013-Exploring Weak Stabilization for Motion Feature Extraction

8 0.12780517 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery

9 0.1276983 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns

10 0.12401824 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation

11 0.12243173 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision

12 0.12116648 155 cvpr-2013-Exploiting the Power of Stereo Confidences

13 0.1192423 364 cvpr-2013-Robust Object Co-detection

14 0.11547397 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

15 0.11410064 383 cvpr-2013-Seeking the Strongest Rigid Detector

16 0.11238357 247 cvpr-2013-Learning Class-to-Image Distance with Object Matchings

17 0.11161385 370 cvpr-2013-SCALPEL: Segmentation Cascades with Localized Priors and Efficient Learning

18 0.1062736 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

19 0.10474902 258 cvpr-2013-Learning Video Saliency from Human Gaze Using Candidate Selection

20 0.10411137 216 cvpr-2013-Improving Image Matting Using Comprehensive Sampling Sets


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.226), (1, -0.022), (2, 0.039), (3, -0.067), (4, 0.093), (5, 0.01), (6, 0.128), (7, 0.026), (8, 0.019), (9, 0.009), (10, -0.077), (11, -0.069), (12, 0.11), (13, -0.173), (14, 0.03), (15, 0.04), (16, -0.101), (17, 0.045), (18, 0.022), (19, -0.008), (20, -0.068), (21, 0.033), (22, -0.135), (23, 0.139), (24, -0.006), (25, -0.117), (26, -0.002), (27, 0.055), (28, -0.034), (29, -0.061), (30, 0.069), (31, 0.037), (32, -0.032), (33, -0.018), (34, -0.041), (35, 0.028), (36, -0.0), (37, -0.0), (38, -0.097), (39, -0.005), (40, 0.078), (41, 0.037), (42, 0.019), (43, 0.012), (44, 0.011), (45, -0.027), (46, -0.009), (47, 0.012), (48, 0.005), (49, -0.06)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95185161 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People

Author: Sitapa Rujikietgumjorn, Robert T. Collins

Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.

2 0.87351698 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes

Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li

Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).

3 0.85413671 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection

Author: Wanli Ouyang, Xiaogang Wang

Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.

4 0.77660507 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery

Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund

Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44%, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.

5 0.75971997 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns

Author: Dan Levi, Shai Silberstein, Aharon Bar-Hillel

Abstract: In this work we present a new part-based object detection algorithm with hundreds of parts performing real-time detection. Part-based models are currently state-of-the-art for object detection due to their ability to represent large appearance variations. However, due to their high computational demands such methods are limited to several parts only and are too slow for practical real-time implementation. Our algorithm is an accelerated version of the “Feature Synthesis” (FS) method [1], which uses multiple object parts for detection and is among state-of-the-art methods on human detection benchmarks, but also suffers from a high computational cost. The proposed Accelerated Feature Synthesis (AFS) uses several strategies for reducing the number of locations searched for each part. The first strategy uses a novel algorithm for approximate nearest neighbor search which we developed, termed “KD-Ferns”, to compare each image location to only a subset of the model parts. Candidate part locations for a specific part are further reduced using spatial inhibition, and using an object-level “coarse-to-fine” strategy. In our empirical evaluation on pedestrian detection benchmarks, AFS maintains almost fully the accuracy performance of the original FS, while running more than 4× faster than existing part-based methods which use only several parts. AFS is to our best knowledge the first part-based object detection method achieving real-time running performance: nearly 10 frames per second on 640×480 images on a regular CPU.

6 0.73142517 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection

7 0.72537833 383 cvpr-2013-Seeking the Strongest Rigid Detector

8 0.71027589 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

9 0.61060584 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels

10 0.60810113 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

11 0.59796149 264 cvpr-2013-Learning to Detect Partially Overlapping Instances

12 0.59462911 120 cvpr-2013-Detecting and Naming Actors in Movies Using Generative Appearance Models

13 0.57566112 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video

14 0.56160045 382 cvpr-2013-Scene Text Recognition Using Part-Based Tree-Structured Character Detection

15 0.55102652 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

16 0.54948378 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

17 0.53355396 144 cvpr-2013-Efficient Maximum Appearance Search for Large-Scale Object Detection

18 0.50747138 271 cvpr-2013-Locally Aligned Feature Transforms across Views

19 0.49544516 401 cvpr-2013-Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection

20 0.49418944 270 cvpr-2013-Local Fisher Discriminant Analysis for Pedestrian Re-identification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.124), (16, 0.015), (26, 0.04), (28, 0.012), (33, 0.317), (67, 0.117), (69, 0.081), (80, 0.014), (87, 0.087), (96, 0.122)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.96625668 228 cvpr-2013-Is There a Procedural Logic to Architecture?

Author: Julien Weissenberg, Hayko Riemenschneider, Mukta Prasad, Luc Van Gool

Abstract: Urban models are key to navigation, architecture and entertainment. Apart from visualizing façades, a number of tedious tasks remain largely manual (e.g. compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. To deal with the difficulty of grammatical inference, we reformulate the problem. Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. We demonstrate our automatic rule inference on datasets of two different architectural styles. Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds.

2 0.96298045 312 cvpr-2013-On a Link Between Kernel Mean Maps and Fraunhofer Diffraction, with an Application to Super-Resolution Beyond the Diffraction Limit

Author: Stefan Harmeling, Michael Hirsch, Bernhard Schölkopf

Abstract: We establish a link between Fourier optics and a recent construction from the machine learning community termed the kernel mean map. Using the Fraunhofer approximation, it identifies the kernel with the squared Fourier transform of the aperture. This allows us to use results about the invertibility of the kernel mean map to provide a statement about the invertibility of Fraunhofer diffraction, showing that imaging processes with arbitrarily small apertures can in principle be invertible, i.e., do not lose information, provided the objects to be imaged satisfy a generic condition. A real world experiment shows that we can super-resolve beyond the Rayleigh limit.

3 0.94905221 218 cvpr-2013-Improving the Visual Comprehension of Point Sets

Author: Sagi Katz, Ayellet Tal

Abstract: Point sets are the standard output of many 3D scanning systems and depth cameras. Presenting the set of points as is might “hide” the prominent features of the object from which the points are sampled. Our goal is to reduce the number of points in a point set, for improving the visual comprehension from a given viewpoint. This is done by controlling the density of the reduced point set, so as to create bright regions (low density) and dark regions (high density), producing an effect of shading. This data reduction is achieved by leveraging a limitation of a solution to the classical problem of determining visibility from a viewpoint. In addition, we introduce a new dual problem, for determining visibility of a point from infinity, and show how a limitation of its solution can be leveraged in a similar way.

same-paper 4 0.94606471 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People

Author: Sitapa Rujikietgumjorn, Robert T. Collins

Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.

5 0.94111383 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields

Author: Bastian Goldluecke, Sven Wanner

Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.

6 0.94089961 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

7 0.94006318 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

8 0.93762732 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection

9 0.93581891 94 cvpr-2013-Context-Aware Modeling and Recognition of Activities in Video

10 0.9349072 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

11 0.93478769 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation

12 0.93366325 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection

13 0.93332976 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence

14 0.93220186 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation

15 0.93179506 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision

16 0.93137622 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning

17 0.93126994 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors

18 0.9309029 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

19 0.93078792 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection

20 0.93033344 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection