cvpr cvpr2013 cvpr2013-272 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund
Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44 %, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.
Reference: text
sentIndex sentText sentNum sentScore
1 This paper presents a robust occupancy analysis system for thermal imaging. [sent-4, score-0.845]
2 Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. [sent-5, score-0.314]
3 We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. [sent-6, score-0.454]
4 In stable periods, with no activity close to the borders, people are detected and counted which contributes to a weighted histogram. [sent-7, score-0.379]
5 After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. [sent-9, score-0.427]
6 The mean error for a 30-minute period containing 3-13 people is 4.44 %. [sent-11, score-0.282]
7 Introduction Measuring the occupancy maps from people has become an essential step towards an intelligent and efficient society [21, 33]. [sent-15, score-0.439]
8 We therefore apply thermal imagery, which captures the infrared radiation instead of visible light, and creates an image whose pixel values represent temperature. [sent-22, score-1.044]
9 People cannot be identified in thermal images, thereby eliminating the privacy issues. [sent-23, score-0.683]
10 A positive side effect of thermal imaging is that detection can often be reduced to a trivial task. [sent-24, score-0.74]
11 However, thermal imaging also introduces new problems, as people are often fragmented into small parts, and reflections can be seen in the floor. [sent-25, score-0.883]
12 Moreover, the challenges of occlusions remain in thermal images, see figure 1. [sent-26, score-0.711]
13 The contribution of this work is a reliable method for occupancy analysis in thermal video. [sent-29, score-0.808]
14 The first type is the stable periods, where no people exit or enter the court. [sent-34, score-0.265]
15 In these periods, the number of people on the court must be the same, which in turn introduces a constraint on the problem. [sent-35, score-0.324]
16 Combining these two types of information to model the periods and transitions between them provides a unified framework to optimise over a long period of time. [sent-37, score-0.313]
17 This section will therefore provide information on the physical foundation of thermal radiation and cameras. [sent-41, score-0.826]
18 All objects with a temperature above the absolute zero emit infrared radiation, mainly in the mid-wavelength infrared spectrum (MWIR, 3-5 μm) and long-wavelength infrared spectrum (LWIR, 8-15 μm). [sent-42, score-0.767]
19 The intensity of the radiation from an object with temperature T is described by Planck’s Law as a function of the wavelength λ: $I(\lambda, T) = \frac{2hc^2}{\lambda^5} \cdot \frac{1}{e^{hc/(\lambda k_B T)} - 1}$ [sent-44, score-0.293]
20 The thermal radiation originates from energy in the molecules of an object. [sent-54, score-0.854]
21 The same principle applies to infrared light, with the difference that the photons contain less energy and cause transitions in the vibrational and rotational energy levels instead. [sent-57, score-0.345]
22 The electromagnetic radiation can be absorbed or emitted by the molecule, then the incident radiation causes the molecule to rise to an excited energy state, and when it falls back to ground state a photon is released. [sent-58, score-0.676]
23 If more radiation is absorbed than emitted, the temperature of the molecule will rise until equilibrium is re-established. [sent-60, score-0.478]
24 Likewise, the temperature will fall if more radiation is emitted than absorbed, until equilibrium is re-established. [sent-61, score-0.368]
25 Thermal cameras Generally two types of detectors exist for thermal cameras: photon detectors and thermal detectors. [sent-64, score-1.408]
26 Photon detectors convert the absorbed electromagnetic radiation directly into a change of the electronic energy distribution in a semiconductor by the change of the free charge carrier concentration. [sent-65, score-0.334]
27 This type of detector typically works in the MWIR spectrum, where the thermal contrast is high, making it very sensitive to small differences in the scene temperature. [sent-66, score-0.646]
28 The thermal detector converts the absorbed electromagnetic radiation into thermal energy causing a rise in the detector temperature. [sent-68, score-1.626]
29 Then, the electrical output of the thermal sensor is produced by a corresponding change in some physical property of material, e. [sent-69, score-0.675]
30 The thermal cameras can here often be a better choice than a normal visual camera. [sent-80, score-0.72]
31 The methods applied to thermal imaging span from simple thresholding and shape analysis [43, 17, 39, 15, 7] to more complex, but well-known methods such as HOG and SVM [42, 37, 41, 31, 26] as well as contour analysis [10, 9, 27, 38]. [sent-81, score-0.672]
32 Using simple methods allows for fast real-time processing, and combined with the illumination independency, the thermal sensor is very well suited for detecting humans in real-life applications. [sent-82, score-0.675]
33 An obvious application area for thermal imaging is pedestrian detection systems for vehicles, due to the cameras’ ability to ”see” during the night. [sent-83, score-0.964]
34 Using a thermal sensor with low spatial resolution, [28] builds a robust pedestrian detector by combining three different methods. [sent-88, score-0.867]
35 [19] also proposes a low resolution system for pedestrian detection from vehicles. [sent-89, score-0.297]
36 [32] proposes a pedestrian detection system that detects people based on their temperature and dimensions, and tracks them using a Kalman filter. [sent-90, score-0.621]
37 A more general interest in pedestrian detection based on thermal imaging can also be seen in surveillance or for analysis of pedestrian flow in cities. [sent-92, score-1.124]
38 A general purpose pedestrian detection system is proposed in [8]. [sent-93, score-0.297]
39 [29] uses a statistical approach for head detection as the first step in the pedestrian detection. [sent-98, score-0.291]
40 Examples of systems combining thermal and RGB cameras are given by Davis et al. [sent-101, score-0.752]
41 Other sensors like laser scanners and near-infrared cameras, have also been combined with thermal sensors [14, 35]. [sent-104, score-0.714]
42 Due to privacy issues, this work will concentrate on thermal cameras only. [sent-105, score-0.757]
43 Approach As described in the introduction, precisely counting people in single frames can be a nearly impossible task, due to occlusions and segmentation errors. [sent-109, score-0.273]
44 The idea is to automatically split a video sequence into stable periods, with no activities near the border of the court, and transition periods with activity near the border. [sent-111, score-0.508]
45 During the stable periods, the detected number of people in each frame contributes to a distribution of observations for that period. [sent-112, score-0.334]
46 For the transition periods, local tracking of the blobs in the border area is applied, in order to estimate the likelihood of crossings. [sent-113, score-0.35]
47 The remaining part of section 2 describes the details of the people detection and the monitoring of transitions. [sent-116, score-0.279]
48 People detection The first step towards detecting people is to separate foreground from background. [sent-121, score-0.279]
49 Using thermal imagery in an indoor environment simplifies this task, as the surrounding temperature is normally stable and colder than the human temperature. [sent-122, score-0.843]
50 Generally, two types of occlusions are seen: people standing behind each other, seen from the camera’s point of view (”tall blobs”) and people standing close together in a group (”wide blobs”). [sent-139, score-0.549]
51 1 Split tall blobs In order to split people that form one blob by standing behind each other, it must be detected when the blob is too tall to contain only one person. [sent-142, score-0.661]
52 Sorting people candidates In addition to occlusions, other problems like reflections from people in the floor, or one person split into many blobs can be observed. [sent-166, score-0.778]
53 The algorithm will take all the bottom points of the blobs as person location candidates, and calculate the probability for each of them being a true position. [sent-171, score-0.252]
54 A rectangle is generated from each candidate point, with a height corresponding to a given average height of people and the width being one third of the height. [sent-172, score-0.407]
55 Two parameters are used for evaluating the probability of the rectangle containing a person: the ratio of white pixels inside the rectangle and the ratio of the rectangle perimeter that is white. [sent-173, score-0.341]
56 From figure 3 it is seen that only 1 % of the true candidates have a white ratio less than 25 %, while a large part of the false candidates are found here, and no true candidates are above 70 %. [sent-212, score-0.268]
57 For the rectangle perimeter it is found that the lower the ratio of the rectangle perimeter that is white, the better is the fit of the rectangle to the person. [sent-213, score-0.324]
58 Identification of people entering and leaving During the periods with activities detected at the border of the court, it is very likely that a change will happen. [sent-225, score-0.646]
59 For these periods, the people near the border are monitored in order to detect crossings. [sent-226, score-0.344]
60 Instead, the position of each person near the border is tracked, and if the border is crossed, it is registered along with the direction. [sent-229, score-0.297]
61 Until a new stable period is observed, the number of people entering or leaving the court will contribute to the total transition in number. [sent-230, score-0.572]
62 Graph search optimisation Two types of data exist now, the number of detected persons during the stable periods, and the number of entering or leaving persons during periods with activity at the border. [sent-232, score-0.487]
63 The graph will consist of nodes, representing the number of people in the stable periods and edges, representing the change in number between two periods. [sent-234, score-0.481]
64 The weights for the nodes will be distributed according to the weighted histogram of the number of detected people in all frames during the stable period. [sent-261, score-0.304]
65 The histogram is constructed from the detected people in each frame, with a weight describing the probability of each detection being true, and a weight describing the uncertainty of the frame, caused by occlusions and clutter. [sent-262, score-0.353]
66 1 where n is the number of people, wp(i) is the probability of person i being a true detection (see equation 3), and ws is a weight that decreases with the number of splits performed (described in section 2. [sent-267, score-0.279]
67 The weighting of edges depends on the total number of crossings during the period of border activity, as well as the weighting of the individual people crossing the border. [sent-272, score-0.55]
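68 [Equations (5) and (6), defining the edge weight wb(x) and involving the weights wp(i) for i = 1, ..., m; not recoverable from the extracted text] where m is the number of people crossing the border. [sent-277, score-0.297]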
69 In the example the variance σ is high for the first period of border activity and low for the second period of border activity. [sent-279, score-0.396]
70 Experimental results Comparing our results with others is difficult, because as far as we know, only [17] has focused on occupancy analysis of thermal video. [sent-281, score-0.808]
71 Moreover, no public datasets with long thermal videos containing more than a few people exist. [sent-283, score-0.857]
72 We test on a 5-minute sequence from each of the five arenas for the evaluation of the detection algorithm and the tracking algorithm for the border areas. [sent-287, score-0.343]
73 The camera set-up used in this work consists of three thermal cameras placed at the same location, and adjusted to have adjacent fields-of-view. [sent-306, score-0.747]
74 Calibration of thermal cameras is not a trivial task, as they cannot see the contrast differences of a typical chessboard used in most applications. [sent-312, score-0.72]
75 As the cameras are fixed relative to each other and then tilted downwards when recording in arenas, the result is that people in the image are more tilted the further they get from the image centre along the x-axis. [sent-320, score-0.285]
76 Detection of people The first test evaluates the detection algorithm described in section 2. [sent-329, score-0.279]
77 The number of detected people is registered as well as the manually counted number. [sent-331, score-0.281]
78 Periods with large groupings have a higher detection error than periods with people separated from each other. [sent-337, score-0.495]
79 This is also expected, as the detection algorithm works on each frame independently, and people that are fully or mostly occluded cannot be detected. [sent-338, score-0.309]
80 Transition recognition For the five videos of five minutes, it is registered each time a person crosses a specified border in order to evaluate the tracking algorithm. [sent-343, score-0.253]
81 In table 1 our results are compared to related work, based on both thermal and RGB images. [sent-361, score-0.646]
82 Test on OSU dataset To show the generality of our framework, we tested the system on the thermal video from the OSU Color-Thermal database [11], which is dataset three from the OTCBVS Benchmark Dataset Collection. [sent-372, score-0.711]
83 Due to the low number of people in this dataset, instead of error we calculated the precision, being the number of frames with the correct number of people estimated. [sent-375, score-0.422]
84 However, it should be noted that the results of [25] are obtained by fusing the thermal and visible modalities and are intended for people tracking. [sent-377, score-0.857]
85 This method includes temporal information in the estimation by measuring the transition in numbers, and using that together with the detection of people in the global optimisation. [sent-389, score-0.327]
86 Pedestrian detection in far infrared images based on the use of probabilistic templates. [sent-436, score-0.286]
87 A modular tracking system for far infrared pedestrian recognition. [sent-446, score-0.508]
88 Layered representation for pedestrian detection and tracking in infrared imagery. [sent-468, score-0.539]
89 Pedestrian detection and tracking in infrared imagery using shape and appearance. [sent-474, score-0.377]
90 A two-stage template approach to person detection in thermal imagery. [sent-481, score-0.801]
91 Background-subtraction using contour-based fusion of thermal and visible imagery. [sent-493, score-0.646]
92 A shape-independent method for pedestrian detection with far-infrared images. [sent-511, score-0.26]
93 Shape and motionbased pedestrian detection in infrared images: a multi sensor approach. [sent-518, score-0.507]
94 Thermal-visible video fusion for moving target tracking and pedestrian classification. [sent-597, score-0.253]
95 An effective approach to pedestrian detection in thermal imagery. [sent-604, score-0.906]
96 Reinforcing the reliability of pedestrian detection in far-infrared sensing. [sent-625, score-0.26]
97 Contrast invariant features for human detection in far infrared images. [sent-636, score-0.286]
98 Pedestrian detection using infrared images and histograms of oriented gradients. [sent-681, score-0.286]
99 Pedestrian detection in infrared images based on local shape features. [sent-717, score-0.286]
100 Robust person detection using far infrared camera for image fusion. [sent-724, score-0.4]
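The graph search summarised in the extracted sentences above (nodes weighted by the per-period histogram of detected counts, edges weighted by the tracked border crossings) can be illustrated with a small sketch. This is not the authors' implementation: the Viterbi-style dynamic programme, the unnormalised Gaussian edge weight, and the names `edge_weight`, `best_occupancy_path`, `histograms`, `observed_changes` and `sigmas` are all assumptions made for illustration only.

```python
import math

EPS = 1e-12  # avoids log(0) for zero-weight nodes or edges


def edge_weight(change, observed_change, sigma):
    """Unnormalised Gaussian weight for a hypothesised change in occupancy.

    Assumed form: the source only states that edges are weighted by the
    tracked crossings and that each border period has a variance.
    """
    return math.exp(-0.5 * ((change - observed_change) / sigma) ** 2)


def best_occupancy_path(histograms, observed_changes, sigmas):
    """Return the most likely number of people for each stable period.

    histograms: one dict {count: weight} per stable period (node weights).
    observed_changes[k], sigmas[k]: tracked net change (entering minus
    leaving) and its spread between stable periods k and k + 1.
    """
    # score[c]: best log-score of any path ending with count c
    score = {c: math.log(w + EPS) for c, w in histograms[0].items()}
    backpointers = []
    for k in range(1, len(histograms)):
        new_score, pointers = {}, {}
        for count, weight in histograms[k].items():
            def total(prev):
                w_edge = edge_weight(count - prev,
                                     observed_changes[k - 1], sigmas[k - 1])
                return score[prev] + math.log(w_edge + EPS)

            best_prev = max(score, key=total)
            new_score[count] = total(best_prev) + math.log(weight + EPS)
            pointers[count] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace the best path back from the highest-scoring final node.
    count = max(score, key=score.get)
    path = [count]
    for pointers in reversed(backpointers):
        count = pointers[count]
        path.append(count)
    return path[::-1]


# Hypothetical data: three stable periods separated by two border periods.
histograms = [{5: 0.6, 6: 0.4}, {6: 0.3, 7: 0.5, 8: 0.2}, {6: 0.5, 7: 0.5}]
print(best_occupancy_path(histograms, observed_changes=[2, -1], sigmas=[0.5, 0.5]))
```

In this toy input the last period's histogram is tied between 6 and 7 people, and the tie is resolved by the transition evidence from the preceding border period, which is the point of optimising over the whole sequence instead of frame by frame.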
wordName wordTfidf (topN-words)
[('thermal', 0.646), ('infrared', 0.218), ('periods', 0.216), ('people', 0.211), ('pedestrian', 0.192), ('radiation', 0.18), ('occupancy', 0.162), ('blobs', 0.136), ('temperature', 0.113), ('court', 0.113), ('arenas', 0.109), ('border', 0.105), ('vehicles', 0.093), ('arena', 0.091), ('person', 0.087), ('absorbed', 0.084), ('cameras', 0.074), ('leykin', 0.073), ('molecule', 0.073), ('period', 0.071), ('height', 0.07), ('detection', 0.068), ('symposium', 0.067), ('intelligent', 0.066), ('candidates', 0.063), ('crossings', 0.062), ('perimeter', 0.062), ('tracking', 0.061), ('blob', 0.06), ('sports', 0.059), ('optimisation', 0.059), ('rectangle', 0.056), ('broggi', 0.054), ('gade', 0.054), ('stable', 0.054), ('initialisation', 0.051), ('wp', 0.051), ('transition', 0.048), ('defects', 0.048), ('osu', 0.048), ('emitted', 0.047), ('white', 0.047), ('standing', 0.046), ('photons', 0.045), ('activity', 0.044), ('photon', 0.042), ('electromagnetic', 0.042), ('split', 0.041), ('entering', 0.041), ('conference', 0.039), ('detected', 0.039), ('privacy', 0.037), ('system', 0.037), ('bertozzi', 0.036), ('handball', 0.036), ('kallhammer', 0.036), ('lohlein', 0.036), ('lwir', 0.036), ('microbolometer', 0.036), ('mwir', 0.036), ('oberlander', 0.036), ('olmeda', 0.036), ('rgensen', 0.036), ('uncooled', 0.036), ('occlusions', 0.035), ('crossing', 0.035), ('tall', 0.034), ('sensors', 0.034), ('leaving', 0.034), ('weighting', 0.033), ('davis', 0.033), ('escalera', 0.032), ('systems', 0.032), ('rectangles', 0.032), ('international', 0.032), ('wb', 0.032), ('ratio', 0.032), ('counted', 0.031), ('head', 0.031), ('imagery', 0.03), ('minute', 0.03), ('metres', 0.03), ('challenges', 0.03), ('frame', 0.03), ('calibration', 0.029), ('sensor', 0.029), ('june', 0.029), ('reflections', 0.029), ('calculate', 0.029), ('unstable', 0.028), ('tested', 0.028), ('equilibrium', 0.028), ('weeks', 0.028), ('monitored', 0.028), ('energy', 0.028), ('camera', 0.027), ('counting', 0.027), ('fir', 0.027), 
('imaging', 0.026), ('transitions', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund
Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44 %, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.
2 0.15971522 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
Author: Horst Possegger, Sabine Sternig, Thomas Mauthner, Peter M. Roth, Horst Bischof
Abstract: Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume exploiting the full geometry and the objects' center of mass and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local mass density scores within a particle filter based approach, constrained by a Voronoi partitioning between nearby trackers. Our method benefits from the geometric knowledge given by the occupancy volume to robustly extract features and train classifiers on-demand, when volumetric information becomes unreliable. We evaluate our approach on several challenging real-world scenarios including the public APIDIS dataset. Experimental evaluations demonstrate significant improvements compared to state-of-the-art methods, while achieving real-time performance.
3 0.14074215 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li
Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).
4 0.12780517 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
Author: Sitapa Rujikietgumjorn, Robert T. Collins
Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.
5 0.11232807 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.
7 0.1013686 100 cvpr-2013-Crossing the Line: Crowd Counting by Integer Programming with Local Features
8 0.099792272 158 cvpr-2013-Exploring Weak Stabilization for Motion Feature Extraction
9 0.097798474 441 cvpr-2013-Tracking Sports Players with Context-Conditioned Motion Models
10 0.097672239 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
11 0.097310796 440 cvpr-2013-Tracking People and Their Objects
12 0.088256113 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
13 0.085121468 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning
14 0.081042379 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
15 0.07982672 120 cvpr-2013-Detecting and Naming Actors in Movies Using Generative Appearance Models
16 0.073884353 386 cvpr-2013-Self-Paced Learning for Long-Term Tracking
17 0.073721401 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns
18 0.072922602 233 cvpr-2013-Joint Sparsity-Based Representation and Analysis of Unconstrained Activities
19 0.068705127 217 cvpr-2013-Improving an Object Detector and Extracting Regions Using Superpixels
20 0.067065924 175 cvpr-2013-First-Person Activity Recognition: What Are They Doing to Me?
topicId topicWeight
[(0, 0.163), (1, 0.025), (2, 0.014), (3, -0.072), (4, -0.003), (5, -0.028), (6, 0.06), (7, -0.024), (8, 0.047), (9, 0.076), (10, -0.034), (11, -0.055), (12, 0.073), (13, -0.085), (14, 0.037), (15, 0.028), (16, -0.055), (17, 0.111), (18, 0.007), (19, -0.06), (20, -0.001), (21, 0.112), (22, -0.145), (23, 0.082), (24, -0.056), (25, -0.063), (26, 0.042), (27, -0.023), (28, -0.016), (29, -0.013), (30, 0.015), (31, 0.04), (32, -0.005), (33, 0.011), (34, -0.022), (35, -0.002), (36, -0.01), (37, 0.037), (38, -0.052), (39, -0.064), (40, -0.026), (41, -0.012), (42, 0.055), (43, 0.019), (44, 0.044), (45, -0.061), (46, -0.001), (47, -0.021), (48, -0.025), (49, -0.006)]
simIndex simValue paperId paperTitle
same-paper 1 0.93614227 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund
Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44 %, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.
2 0.68296885 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
Author: Sitapa Rujikietgumjorn, Robert T. Collins
Abstract: We present a quadratic unconstrained binary optimization (QUBO) framework for reasoning about multiple object detections with spatial overlaps. The method maximizes an objective function composed of unary detection confidence scores and pairwise overlap constraints to determine which overlapping detections should be suppressed, and which should be kept. The framework is flexible enough to handle the problem of detecting objects as a shape covering of a foreground mask, and to handle the problem of filtering confidence weighted detections produced by a traditional sliding window object detector. In our experiments, we show that our method outperforms two existing state-of-the-art pedestrian detectors.
3 0.62429667 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
Author: Junjie Yan, Xucong Zhang, Zhen Lei, Shengcai Liao, Stan Z. Li
Abstract: The serious performance decline with decreasing resolution is the major bottleneck for current pedestrian detection techniques [14, 23]. In this paper, we take pedestrian detection in different resolutions as different but related problems, and propose a Multi-Task model to jointly consider their commonness and differences. The model contains resolution aware transformations to map pedestrians in different resolutions to a common space, where a shared detector is constructed to distinguish pedestrians from background. For model learning, we present a coordinate descent procedure to learn the resolution aware transformations and deformable part model (DPM) based detector iteratively. In traffic scenes, there are many false positives located around vehicles, therefore, we further build a context model to suppress them according to the pedestrian-vehicle relationship. The context model can be learned automatically even when the vehicle annotations are not available. Our method reduces the mean miss rate to 60% for pedestrians taller than 30 pixels on the Caltech Pedestrian Benchmark, which noticeably outperforms previous state-of-the-art (71%).
4 0.60821664 398 cvpr-2013-Single-Pedestrian Detection Aided by Multi-pedestrian Detection
Author: Wanli Ouyang, Xiaogang Wang
Abstract: In this paper, we address the challenging problem of detecting pedestrians who appear in groups and have interaction. A new approach is proposed for single-pedestrian detection aided by multi-pedestrian detection. A mixture model of multi-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby multiple pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and multi-pedestrian detectors, and to refine the single-pedestrian detection result with multi-pedestrian detection. It can integrate with any single-pedestrian detector without significantly increasing the computation load. 15 state-of-the-art single-pedestrian detection approaches are investigated on three widely used public datasets: Caltech, TUD-Brussels and ETH. Experimental results show that our framework significantly improves all these approaches. The average improvement is 9% on the Caltech-Test dataset, 11% on the TUD-Brussels dataset and 17% on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from 48% to 43% on the Caltech-Test dataset, from 55% to 50% on the TUD-Brussels dataset and from 51% to 41% on the ETH dataset.
5 0.59700888 440 cvpr-2013-Tracking People and Their Objects
Author: Tobias Baumgartner, Dennis Mitzel, Bastian Leibe
Abstract: Current pedestrian tracking approaches ignore important aspects of human behavior. Humans are not moving independently, but they closely interact with their environment, which includes not only other persons, but also different scene objects. Typical everyday scenarios include people moving in groups, pushing child strollers, or pulling luggage. In this paper, we propose a probabilistic approach for classifying such person-object interactions, associating objects to persons, and predicting how the interaction will most likely continue. Our approach relies on stereo depth information in order to track all scene objects in 3D, while simultaneously building up their 3D shape models. These models and their relative spatial arrangement are then fed into a probabilistic graphical model which jointly infers pairwise interactions and object classes. The inferred interactions can then be used to support tracking by recovering lost object tracks. We evaluate our approach on a novel dataset containing more than 15,000 frames of person-object interactions in 325 video sequences and demonstrate good performance in challenging real-world scenarios.
6 0.58390737 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
7 0.58178002 120 cvpr-2013-Detecting and Naming Actors in Movies Using Generative Appearance Models
9 0.5594179 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
10 0.55426866 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
11 0.53914255 167 cvpr-2013-Fast Multiple-Part Based Object Detection Using KD-Ferns
12 0.5246262 271 cvpr-2013-Locally Aligned Feature Transforms across Views
13 0.51747668 100 cvpr-2013-Crossing the Line: Crowd Counting by Integer Programming with Local Features
14 0.51368797 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
16 0.50248772 270 cvpr-2013-Local Fisher Discriminant Analysis for Pedestrian Re-identification
17 0.49042636 37 cvpr-2013-Adherent Raindrop Detection and Removal in Video
18 0.49000543 313 cvpr-2013-Online Dominant and Anomalous Behavior Detection in Videos
19 0.48685387 264 cvpr-2013-Learning to Detect Partially Overlapping Instances
20 0.47707558 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
topicId topicWeight
[(10, 0.078), (16, 0.044), (26, 0.031), (33, 0.222), (67, 0.078), (69, 0.043), (80, 0.321), (87, 0.077)]
simIndex simValue paperId paperTitle
same-paper 1 0.78038663 272 cvpr-2013-Long-Term Occupancy Analysis Using Graph-Based Optimisation in Thermal Imagery
Author: Rikke Gade, Anders Jørgensen, Thomas B. Moeslund
Abstract: This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, which contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44 %, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which proves the generality of the method.
2 0.74777973 183 cvpr-2013-GRASP Recurring Patterns from a Single View
Author: Jingchen Liu, Yanxi Liu
Abstract: We propose a novel unsupervised method for discovering recurring patterns from a single view. A key contribution of our approach is the formulation and validation of a joint assignment optimization problem where multiple visual words and object instances of a potential recurring pattern are considered simultaneously. The optimization is achieved by a greedy randomized adaptive search procedure (GRASP) with moves specifically designed for fast convergence. We have quantified systematically the performance of our approach under stressed conditions of the input (missing features, geometric distortions). We demonstrate that our proposed algorithm outperforms state of the art methods for recurring pattern discovery on a diverse set of 400+ real world and synthesized test images.
3 0.72755831 335 cvpr-2013-Poselet Conditioned Pictorial Structures
Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele
Abstract: In this paper we consider the challenging problem of articulated human pose estimation in still images. We observe that despite high variability of the body articulations, human motions and activities often simultaneously constrain the positions of multiple body parts. Modelling such higher order part dependencies seemingly comes at a cost of more expensive inference, which resulted in their limited use in state-of-the-art methods. In this paper we propose a model that incorporates higher order part dependencies while remaining efficient. We achieve this by defining a conditional model in which all body parts are connected a priori, but which becomes a tractable tree-structured pictorial structures model once the image observations are available. In order to derive a set of conditioning variables we rely on the poselet-based features that have been shown to be effective for people detection but have so far found limited application for articulated human pose estimation. We demonstrate the effectiveness of our approach on three publicly available pose estimation benchmarks, improving or being on par with state of the art in each case.
4 0.7149269 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images
Author: Gaurav Sharma, Frédéric Jurie, Cordelia Schmid
Abstract: We propose a new model for recognizing human attributes (e.g. wearing a suit, sitting, short hair) and actions (e.g. running, riding a horse) in still images. The proposed model relies on a collection of part templates which are learnt discriminatively to explain specific scale-space locations in the images (in human centric coordinates). It avoids the limitations of highly structured models, which consist of a few (i.e. a mixture of) ‘average’ templates. To learn our model, we propose an algorithm which automatically mines out parts and learns corresponding discriminative templates with their respective locations from a large number of candidate parts. We validate the method on recent challenging datasets: (i) Willow 7 actions [7], (ii) 27 Human Attributes (HAT) [25], and (iii) Stanford 40 actions [37]. We obtain convincing qualitative and state-of-the-art quantitative results on the three datasets.
5 0.71486485 210 cvpr-2013-Illumination Estimation Based on Bilayer Sparse Coding
Author: Bing Li, Weihua Xiong, Weiming Hu, Houwen Peng
Abstract: Computational color constancy is a very important topic in computer vision and has attracted many researchers' attention. Recently, lots of research has shown the effects of using high level visual content cues for improving illumination estimation. However, nearly all the existing methods are essentially combinational strategies in which the image's content analysis is only used to guide the combination or selection from a variety of individual illumination estimation methods. In this paper, we propose a novel bilayer sparse coding model for illumination estimation that considers image similarity in terms of both low level color distribution and high level image scene content simultaneously. For this purpose, the image's scene content information is integrated with its color distribution to obtain an optimal illumination estimation model. The experimental results on real-world image sets show that our algorithm is superior to some prevailing illumination estimation methods, even better than some combinational methods.
6 0.69749445 273 cvpr-2013-Looking Beyond the Image: Unsupervised Learning for Object Saliency and Detection
7 0.67013597 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
8 0.66420263 206 cvpr-2013-Human Pose Estimation Using Body Parts Dependent Joint Regressors
9 0.64849746 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
10 0.64622641 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors
11 0.64579648 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers
12 0.64392883 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
13 0.64286846 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation
14 0.63916165 89 cvpr-2013-Computationally Efficient Regression on a Dependency Graph for Human Pose Estimation
15 0.63905603 334 cvpr-2013-Pose from Flow and Flow from Pose
16 0.63331497 439 cvpr-2013-Tracking Human Pose by Tracking Symmetric Parts
18 0.63285047 120 cvpr-2013-Detecting and Naming Actors in Movies Using Generative Appearance Models
19 0.6317032 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation
20 0.63156754 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation