iccv iccv2013 iccv2013-157 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kristina Scherbaum, James Petterson, Rogerio S. Feris, Volker Blanz, Hans-Peter Seidel
Abstract: Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola Jones' AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels such as color features at training time. In our experiments we trained seven view-dependent face detectors and evaluate these on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset.
Reference: text
sentIndex sentText sentNum sentScore
1 This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. [sent-17, score-0.806]
2 Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. [sent-18, score-0.792]
3 Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. [sent-19, score-0.442]
4 We use the tailored imagery to train a new many-core implementation of Viola Jones' AdaBoost object detection framework. [sent-20, score-0.327]
5 In our experiments we trained seven view-dependent face detectors and evaluate these on the Face Detection Data Set and Benchmark (FDDB). [sent-22, score-0.556]
6 Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset. [sent-23, score-0.328]
7 Thus, a variety of face detection algorithms have been presented in recent years, many of them involving supervised or unsupervised machine learning methods. [sent-26, score-0.457]
8 Their goal is to learn a face classification … [Figure 2; axis label: false alarm rate] [sent-27, score-0.365]
9 Training on the synthetically tailored views of 75 distinct subjects is sufficient to outperform previously presented face detectors on the challenging Face Detection Data Set and Benchmark (FDDB) [1]. [sent-29, score-0.602]
10 This, however, requires large amounts of manually labeled face pictures during the training phase, which is not only very time-consuming but also difficult to obtain. [sent-31, score-0.701]
11 When training multiview classifiers, for example, the manually labeled viewing angles might be inaccurate and thus lead to biased results. [sent-35, score-0.498]
12 This paper addresses these shortcomings and takes a look into the automated generation of tailored training samples from a 3D morphable face model. [sent-36, score-0.944]
13 The algorithm robustly detects faces despite apparent variation of facial attributes such as increased body weight, beards, bright or dark skin color. [sent-41, score-0.804]
14 a few randomized facial samples which we gain from a 3D morphable face model [2, 3]. [sent-42, score-0.836]
15 Each 3D random face is then modulated by automatically changing a set of facial attributes such as gender, weight or age. [sent-43, score-0.592]
16 The latter method in particular can be helpful when training classifiers for cameras that are positioned at unusual viewing angles or in a very dark or artificially lit environment. [sent-45, score-0.457]
17 Using this procedure, we generate seven distinct training sets, each set corresponding to a specific range of face orientations. [sent-46, score-0.634]
18 Related Work. Face detection is often the first step in complex image processing applications, like face recognition, visual surveillance, or human-machine interaction. [sent-51, score-0.457]
19 Acquired Training Data. Data-driven face classifiers require appropriate training data. [sent-76, score-0.566]
20 Out of these, the face dataset of the MPI for Biological Cybernetics is the most related to our work. [sent-81, score-0.365]
21 However, their dataset does not cover the full spectrum of statistical data variability and does not include facial attributes such as age or body weight. [sent-82, score-0.516]
22 Automatically synthesized training data, by contrast, involves high-level face models to increase variability: In 2004 Yue-Min et. [sent-85, score-0.503]
23 [35] relit faces in training images using harmonic images they derived from a 3D face model. [sent-87, score-0.77]
24 A variety of face recognition systems [37, 38, 39, 40, 41] use 3D models to synthesize intermediate views or viewpoint invariant reference frames for the purpose of face recognition. [sent-92, score-0.769]
25 [43] use a morphable body model to generate training data for pedestrian detection. [sent-96, score-0.438]
26 They could show that even a low number of synthetic training samples with increased data variability can outperform detectors trained on large manually collected data sets. [sent-97, score-0.438]
27 [44] use a morphable model for pose invariant face recognition. [sent-100, score-0.623]
28 The model is then rendered under varying pose and illumination conditions to build a set of synthetic images, used for training a component-based face recognition system. [sent-102, score-0.757]
29 In contrast to their method, we focus on face detection, and show that state-of-the-art results can be obtained by leveraging tailored training data from a 3D morphable model. [sent-103, score-0.899]
30 We do not require initial facial input images, but randomly generate artificial faces while controlling the data variability. [sent-104, score-0.48]
31 Moreover, we introduce facial attributes such as body weight or skin tone and make use of an advanced face model [3] to render the subjects’ ages. [sent-105, score-0.898]
32 First we generate random faces by modulating existing faces from the database permitting 76. [sent-114, score-0.572]
33 By deploying a statistically driven face model for data generation, one can be sure to incorporate the full data variance (with respect to the database of the face model). [sent-119, score-0.816]
34 We use this technique in the following section to first generate randomized 3D faces using a 3D morphable face model [2, 3]. [sent-120, score-0.89]
35 In a second step, we modulate these three-dimensional random faces by applying facial attributes, which we then render for defined viewing angles and illumination parameters. [sent-121, score-0.84]
36 Finally, we compose the face renderings with natural-looking background images. [sent-122, score-0.365]
37 Modeling. To generate artificial training data we employ a 3D morphable face model [2, 3]. [sent-123, score-0.761]
38 The model's database contains m = 512 faces ranging from the age of 3 months to ≈ 40 years with an approximately equal number of female and male individuals (200 adults, 236 children aged between 7 and 16 years and 76 very young children aged between 3 and 12 months). [sent-124, score-0.653]
39 The eigenvectors of the PCA represent the variation across all faces in the database. [sent-150, score-0.359]
40 Most eigenvectors do not explicitly represent semantically meaningful facial features. [sent-151, score-0.305]
41 Thus the highest variation between all faces in the database is represented by the frontmost eigenvectors. [sent-155, score-0.305]
42 In the following analysis, we therefore only consider the first eigenvectors (j ≤ 50) for shape s_j and texture t_j to control the variance of the computed training data. [sent-156, score-0.303]
43 In a first step, we generate a set of randomized 3D faces by manipulating available 3D faces from the 3D morphable model face database. [sent-158, score-1.157]
44 We additionally require a large angle (Mahalanobis distance) between computed sample face vectors to ensure low similarity between the random faces. [sent-168, score-0.418]
45 In a final compositing step we randomly scale the size of the rendered faces to simulate typical surveillance recordings and blend the rendered views with background scenes. [sent-170, score-0.813]
46 To these we subsequently apply a set of facial attributes such as increased or decreased body weight or, for example, light or dark skin color. [sent-172, score-0.586]
47 The learning procedure ([2]) involves manual labeling of each database face according to the perceived strength of the attribute in each face. [sent-174, score-0.403]
48 Following the gradient of f in PCA space will then produce a perceived change of the attribute strength in a given face, while all other individual characteristics of the face remain unchanged. [sent-177, score-0.365]
49 We performed this method for the following facial attributes: age (young / old); body weight (obese / skinny); beard shadow (dark shadow / no shadow); skin color (light / dark); ears (big / small) and also for one facial expression (friendly / unfriendly). [sent-179, score-0.815]
50 We apply these attributes to all random faces using a scaling factor σ ∈ [−3. [sent-180, score-0.402]
51 For example, for female faces we avoid bearded female faces by setting σ = 0. [sent-187, score-0.694]
52 Results of the modulation are shown in Figure 4, where we rendered all facial attributes for the values σ = ±2. [sent-188, score-0.455]
53 To obtain an image I_i(x, y) from a given 3D face F_i, we apply standard computer graphics procedures: R_ρ(F_i) = I_i(x, y) (4) [sent-190, score-0.365]
54 All faces are rendered at seven predefined viewing angles. [sent-198, score-0.657]
55 We thus generate a total of seven distinct training sets that all differ in the chosen face viewing angle φ, where φ ∈ {−30°, −20°, −10°, 0°, +10°, +20°, +30°}. [sent-199, score-0.839]
56 Within each set, we modulate, for example, the left-right viewing direction of the face ([−3°; 3°]), the up-down angle ("nodding", θ ∈ [−15°; 15°]) and the in-plane rotation (γ ∈ [−5; 5]). [sent-201, score-0.681]
57 To simulate various environments, we additionally modulate color and intensity of the ambient light and apply random variations to color contrast, color gain and offset. [sent-203, score-0.292]
58 For each of the resulting training sets we later train a single face classifier which we combine to a multi-view face detector in the end. [sent-205, score-0.913]
59 Compositing. In a final step we blend the rendered faces with background scenes that do not contain any faces ('negatives'). [sent-206, score-0.768]
60 To exclude apparent faces from the selected images, we initially remove all images that are labeled to contain humans or human faces. [sent-208, score-0.312]
61 We blend each rendered face with a random background and apply a smooth contour blend (Gaussian blur) using the rendered contour mask at the facial contour. [sent-211, score-0.97]
62 uk/challenges/VOC/voc2012/ randomly vary the size and position of the face in the background image and then store its rectangular coordinates as ground truth. [sent-216, score-0.365]
63 Each composed image contains exactly one individual face in the end. [sent-217, score-0.365]
64 Images containing at least one face are also referred to as ‘positives’ . [sent-218, score-0.365]
65 Enhanced AdaBoost Training. Using the above-described synthetic training data, we train a face classification system. [sent-221, score-0.647]
66 Given an image patch, the system should determine whether the patch contains faces or not and should locate potential faces in the image. [sent-222, score-0.534]
67 While the algorithm performs very well during the testing phase (real-time), the training phase can quickly become slow and tedious, especially when training on numerous images (>3000). [sent-225, score-0.35]
68 Another drawback of Viola and Jones’ method is that only greyscale images are processed as opposed to recent approaches that have shown that color information may improve face detection results ([45]). [sent-229, score-0.501]
69 To overcome these drawbacks we present two major adaptations to the OpenCV system: firstly, we introduce new feature layers that can be used either for color channels or arbitrary descriptors, and secondly, we parallelize the complete training procedure to run on many-core architectures. [sent-230, score-0.355]
70 However, during the detection phase one might want to detect all faces in an image, regardless of their viewing angles. [sent-261, score-0.586]
71 For this purpose we later recombine the view-dependent cascades to a single multiview classifier that is capable of finding faces at any viewing direction in the image (cf. [sent-262, score-0.419]
72 Evaluation and Results. Overall we trained seven distinct classifiers, one for each of our seven synthetic training sets. [sent-265, score-0.499]
73 Examples are challenges such as low resolution faces, out-of-focus faces, occlusions or difficult and unusual face poses. [sent-270, score-0.416]
74 After threshold adjustment, we build a cascade of all seven classifiers, which for each image patch searches for faces at the respective viewing angles. [sent-275, score-0.55]
75 However, the results indicate that training on statistically well distributed synthetic training data seems to be a promising concept: Though the method of Li et al. [sent-280, score-0.375]
76 Applications. Most face detection algorithms presented so far involve a time-consuming initial step: the collection and labeling of training images. [sent-283, score-0.595]
77 Camera-specific properties are ignored and detection might fail, for example at unusual viewing perspectives like a birds-eye view. [sent-289, score-0.333]
78 However, generalized detectors are still standard, since it would be too time-consuming to train hundreds of camera-specific face detectors. [sent-291, score-0.47]
79 [Figure: scene snapshots at night and at daytime. Sample scene snapshots, from which we automatically extract facial illumination and pose parameters. Synthetically tailored views.] [sent-304, score-0.586]
80 Middle row: From the still images we manually extract a few cropped face images which we then use to automatically estimate facial illumination and pose parameters by fitting the 3D morphable face model to them. [sent-307, score-1.348]
81 Bottom row: Using the estimated light and pose parameters we render tailored training imagery showing arbitrarily modulated random faces. [sent-308, score-0.481]
82 The generated files guarantee coverage of the full spectrum of statistical data variability and may be equipped with tailored facial attributes. [sent-310, score-0.43]
83 Summary, Discussion and Future Aspects. We presented a face detection system that is trained only on synthetic training data. [sent-322, score-0.694]
84 The time-consuming process of manually labeling faces can be replaced by a fully automated procedure. [sent-324, score-0.329]
85 However, the generation of synthetic training data highly depends on the availability of a suitable 3D face model (such as a morphable 3D face model or an active shape model). [sent-325, score-1.27]
86 For the detection of facial skin, advanced color models could be helpful to enhance our results. [sent-332, score-0.349]
87 Learned-Miller, “FDDB: A benchmark for face detection in unconstrained settings,” Univ. [sent-336, score-0.457]
88 Zhengyou, “A survey of recent advances in face detection,” Microsoft Research, Tech. [sent-371, score-0.402]
89 [9] “A software-based dynamic-warp scheduling approach for load-balancing the Viola-Jones face detection algorithm on GPUs,” J. [sent-388, score-0.457]
90 Kale, “Towards a robust, real-time face processing system using CUDA-enabled GPUs,” Intl. [sent-396, score-0.365]
91 Ming, “The face detection system based on GPU+CPU desktop cluster,” Intl. [sent-403, score-0.457]
92 Lai, “Multilevel parallelism analysis of face detection on a shared memory multi-core system,” Intl. [sent-450, score-0.499]
93 Liao, “Learning facial attributes by crowdsourcing in social media,” Intl. [sent-605, score-0.391]
94 Gao, “Face detection under variable lighting based on resample by face relighting,” Intl. [sent-655, score-0.457]
95 Tarres, “P2CA: a new face recognition scheme combining 2D and 3D information,” IEEE Intl. [sent-679, score-0.365]
96 Fang, “Improved 3D assisted pose-invariant face recognition,” IEEE Intl. [sent-695, score-0.365]
97 Abdel-Mottaleb, “Normalized 3D to 2D model-based facial image synthesis for 2D model-based face recognition,” IEEE GCC Conf. [sent-702, score-0.578]
98 Flynn, “A survey of approaches and challenges in 3D and multi-modal 3D+2D face recognition,” Comput. [sent-708, score-0.402]
99 Shirazi, “Comparative performance of different skin chrominance models and chrominance spaces for the automatic detection of human faces in color images,” IEEE Intl. [sent-746, score-0.593]
100 Lao, “Vector boosting for rotation invariant multi-view face detection,” IEEE Intl. [sent-756, score-0.365]
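The extracted sentences above describe a sampling pipeline: random faces are drawn in the PCA space of the morphable model (first eigenvectors only, j ≤ 50), shifted along an attribute direction by a strength σ, and rendered at seven viewing angles with small pose jitter. The following is a minimal sketch of that parameter sampling only, not the authors' implementation. The function names, the standard-normal coefficient distribution, the example attribute direction, and the upper bound of the nodding range are assumptions made for illustration; no rendering or morphable-model data is involved.

```python
import random

N_COMPONENTS = 50  # source: only the first eigenvectors (j <= 50) are considered
VIEW_ANGLES_DEG = [-30, -20, -10, 0, 10, 20, 30]  # source: seven view-dependent sets


def sample_face_coefficients(rng, n_components=N_COMPONENTS):
    """Draw PCA coefficients in units of standard deviations.

    Assumption: coefficients are independent and standard normal.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(n_components)]


def apply_attribute(coeffs, direction, sigma):
    """Shift a face along an attribute direction in PCA space by strength sigma."""
    return [c + sigma * d for c, d in zip(coeffs, direction)]


def sample_pose(rng, view_angle_deg):
    """Jitter the pose around one of the seven predefined viewing angles."""
    return {
        # left-right direction, varied within [-3, 3] degrees around the set's angle
        "yaw": view_angle_deg + rng.uniform(-3.0, 3.0),
        # up-down "nodding" angle; the +15 upper bound is an assumption
        "pitch": rng.uniform(-15.0, 15.0),
        # in-plane rotation within [-5, 5]
        "roll": rng.uniform(-5.0, 5.0),
    }


rng = random.Random(0)
face = sample_face_coefficients(rng)

# Hypothetical attribute direction: a unit vector along the first component.
# In the paper the direction is learned from manually labeled attribute strengths.
direction = [1.0] + [0.0] * (N_COMPONENTS - 1)
modulated = apply_attribute(face, direction, sigma=2.0)  # example strength

# One list of sampled poses per view-dependent training set.
training_sets = {
    angle: [sample_pose(rng, angle) for _ in range(3)] for angle in VIEW_ANGLES_DEG
}
```

Keeping the coefficients in standard-deviation units makes the attribute strength σ directly comparable across components, which is one plausible reading of why the sentences speak of controlling the variance of the generated data.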
wordName wordTfidf (topN-words)
[('face', 0.365), ('faces', 0.267), ('morphable', 0.258), ('facial', 0.213), ('fddb', 0.2), ('viewing', 0.152), ('tailored', 0.138), ('training', 0.138), ('attributes', 0.135), ('haar', 0.135), ('seven', 0.131), ('modulate', 0.111), ('adaboost', 0.11), ('jones', 0.11), ('surveillance', 0.11), ('rendered', 0.107), ('blanz', 0.105), ('synthetic', 0.099), ('skin', 0.094), ('detection', 0.092), ('eigenvectors', 0.092), ('viola', 0.091), ('blend', 0.089), ('female', 0.08), ('snapshots', 0.08), ('variability', 0.079), ('parallelize', 0.077), ('scherbaum', 0.072), ('gpus', 0.067), ('classifiers', 0.063), ('manually', 0.062), ('shadow', 0.06), ('detectors', 0.06), ('layers', 0.056), ('modulated', 0.055), ('pictures', 0.055), ('gcc', 0.054), ('mmci', 0.054), ('processors', 0.054), ('weyrauch', 0.054), ('compositing', 0.054), ('angle', 0.053), ('dark', 0.053), ('imagery', 0.052), ('unusual', 0.051), ('light', 0.049), ('render', 0.049), ('illumination', 0.048), ('annotation', 0.048), ('beards', 0.048), ('deploying', 0.048), ('aged', 0.048), ('chrominance', 0.048), ('pishchulin', 0.048), ('siegen', 0.048), ('ucken', 0.048), ('ieee', 0.048), ('age', 0.047), ('mpi', 0.047), ('male', 0.045), ('labeled', 0.045), ('generation', 0.045), ('train', 0.045), ('color', 0.044), ('social', 0.043), ('body', 0.042), ('beard', 0.042), ('flynn', 0.042), ('parallelism', 0.042), ('modulation', 0.042), ('vetter', 0.042), ('recorded', 0.042), ('rendering', 0.041), ('young', 0.04), ('channels', 0.04), ('recordings', 0.04), ('bowyer', 0.04), ('months', 0.04), ('integral', 0.04), ('media', 0.039), ('views', 0.039), ('germany', 0.038), ('chiang', 0.038), ('saarbr', 0.038), ('might', 0.038), ('database', 0.038), ('scenes', 0.038), ('tj', 0.037), ('automatically', 0.037), ('phase', 0.037), ('chuang', 0.037), ('seidel', 0.037), ('parallel', 0.037), ('survey', 0.037), ('freeman', 0.036), ('gpu', 0.036), ('amounts', 0.036), ('james', 0.036), ('night', 0.036), ('sj', 0.036)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000013 157 iccv-2013-Fast Face Detector Training Using Tailored Views
2 0.32367161 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition
Author: Yizhe Zhang, Ming Shao, Edward K. Wong, Yun Fu
Abstract: One of the most challenging tasks in face recognition is to identify people with varied poses. Namely, the test faces have significantly different poses compared with the registered faces. In this paper, we propose a high-level feature learning scheme to extract pose-invariant identity feature for face recognition. First, we build a single-hidden-layer neural network with sparse constraint, to extract pose-invariant feature in a supervised fashion. Second, we further enhance the discriminative capability of the proposed feature by using multiple random faces as the target values for multiple encoders. By enforcing the target values to be unique for input faces over different poses, the learned high-level feature that is represented by the neurons in the hidden layer is pose free and only relevant to the identity information. Finally, we conduct face identification on CMU MultiPIE, and verification on Labeled Faces in the Wild (LFW) databases, where identification rank-1 accuracy and face verification accuracy with ROC curve are reported. These experiments demonstrate that our model is superior to other state-of-the-art approaches on handling pose variations.
3 0.28234658 444 iccv-2013-Viewing Real-World Faces in 3D
Author: Tal Hassner
Abstract: We present a data-driven method for estimating the 3D shapes of faces viewed in single, unconstrained photos (aka “in-the-wild”). Our method was designed with an emphasis on robustness and efficiency with the explicit goal of deployment in real-world applications which reconstruct and display faces in 3D. Our key observation is that for many practical applications, warping the shape of a reference face to match the appearance of a query is enough to produce realistic impressions of the query's 3D shape. Doing so, however, requires matching visual features between the (possibly very different) query and reference images, while ensuring that a plausible face shape is produced. To this end, we describe an optimization process which seeks to maximize the similarity of appearances and depths, jointly, to those of a reference model. We describe our system for monocular face shape reconstruction and present both qualitative and quantitative experiments, comparing our method against alternative systems, and demonstrating its capabilities. Finally, as a testament to its suitability for real-world applications, we offer an open, online implementation of our system, providing unique means of instant 3D viewing of faces appearing in web photos.
4 0.26914534 219 iccv-2013-Internet Based Morphable Model
Author: Ira Kemelmacher-Shlizerman
Abstract: In this paper we present a new concept of building a morphable model directly from photos on the Internet. Morphable models have shown very impressive results more than a decade ago, and could potentially have a huge impact on all aspects of face modeling and recognition. One of the challenges, however, is to capture and register 3D laser scans of large number of people and facial expressions. Nowadays, there are enormous amounts of face photos on the Internet, large portion of which has semantic labels. We propose a framework to build a morphable model directly from photos; the framework includes dense registration of Internet photos, as well as new single view shape reconstruction and modification algorithms.
5 0.23112671 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation
Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
Abstract: We propose an unsupervised detector adaptation algorithm to adapt any offline trained face detector to a specific collection of images, and hence achieve better accuracy. The core of our detector adaptation algorithm is a probabilistic elastic part (PEP) model, which is offline trained with a set of face examples. It produces a statistically aligned part based face representation, namely the PEP representation. To adapt a general face detector to a collection of images, we compute the PEP representations of the candidate detections from the general face detector, and then train a discriminative classifier with the top positives and negatives. Then we re-rank all the candidate detections with this classifier. This way, a face detector tailored to the statistics of the specific image collection is adapted from the original detector. We present extensive results on three datasets with two state-of-the-art face detectors. The significant improvement of detection accuracy over these state-of-the-art face detectors strongly demonstrates the efficacy of the proposed face detector adaptation algorithm.
6 0.2219125 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
7 0.22092657 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection
8 0.19237064 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition
9 0.18588018 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild
10 0.18329293 195 iccv-2013-Hidden Factor Analysis for Age Invariant Face Recognition
11 0.18210556 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
12 0.17293294 44 iccv-2013-Adapting Classification Cascades to New Domains
13 0.16603072 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition
14 0.149169 392 iccv-2013-Similarity Metric Learning for Face Recognition
15 0.14817703 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification
16 0.14751144 180 iccv-2013-From Where and How to What We See
17 0.14595132 153 iccv-2013-Face Recognition Using Face Patch Networks
18 0.13450108 106 iccv-2013-Deep Learning Identity-Preserving Face Space
19 0.13210121 206 iccv-2013-Hybrid Deep Learning for Face Verification
20 0.12568425 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
topicId topicWeight
[(0, 0.287), (1, 0.031), (2, -0.136), (3, -0.172), (4, 0.04), (5, -0.197), (6, 0.322), (7, 0.06), (8, 0.036), (9, 0.045), (10, -0.021), (11, 0.088), (12, 0.092), (13, 0.002), (14, -0.072), (15, -0.026), (16, -0.036), (17, 0.007), (18, -0.075), (19, 0.009), (20, -0.075), (21, -0.079), (22, 0.03), (23, 0.002), (24, -0.037), (25, 0.036), (26, -0.025), (27, -0.042), (28, 0.017), (29, -0.013), (30, -0.004), (31, 0.089), (32, 0.022), (33, -0.027), (34, -0.073), (35, -0.09), (36, 0.028), (37, -0.01), (38, -0.118), (39, -0.004), (40, 0.063), (41, 0.054), (42, 0.019), (43, 0.034), (44, 0.016), (45, -0.016), (46, -0.059), (47, -0.038), (48, -0.061), (49, -0.063)]
simIndex simValue paperId paperTitle
same-paper 1 0.97123194 157 iccv-2013-Fast Face Detector Training Using Tailored Views
2 0.90689987 272 iccv-2013-Modifying the Memorability of Face Photographs
Author: Aditya Khosla, Wilma A. Bainbridge, Antonio Torralba, Aude Oliva
Abstract: Contemporary life bombards us with many new images of faces every day, which poses non-trivial constraints on human memory. The vast majority of face photographs are intended to be remembered, either because of personal relevance, commercial interests or because the pictures were deliberately designed to be memorable. Can we make a portrait more memorable or more forgettable automatically? Here, we provide a method to modify the memorability of individual face photographs, while keeping the identity and other facial traits (e.g. age, attractiveness, and emotional magnitude) of the individual fixed. We show that face photographs manipulated to be more memorable (or more forgettable) are indeed more often remembered (or forgotten) in a crowd-sourcing experiment with an accuracy of 74%. Quantifying and modifying the ‘memorability’ of a face lends itself to many useful applications in computer vision and graphics, such as mnemonic aids for learning, photo editing applications for social networks and tools for designing memorable advertisements.
3 0.86819667 195 iccv-2013-Hidden Factor Analysis for Age Invariant Face Recognition
Author: Dihong Gong, Zhifeng Li, Dahua Lin, Jianzhuang Liu, Xiaoou Tang
Abstract: Age invariant face recognition has received increasing attention due to its great potential in real world applications. In spite of the great progress in face recognition techniques, reliably recognizing faces across ages remains a difficult task. The facial appearance of a person changes substantially over time, resulting in significant intra-class variations. Hence, the key to tackle this problem is to separate the variation caused by aging from the person-specific features that are stable. Specifically, we propose a new method, called Hidden Factor Analysis (HFA). This method captures the intuition above through a probabilistic model with two latent factors: an identity factor that is age-invariant and an age factor affected by the aging process. Then, the observed appearance can be modeled as a combination of the components generated based on these factors. We also develop a learning algorithm that jointly estimates the latent factors and the model parameters using an EM procedure. Extensive experiments on two well-known public domain face aging datasets: MORPH (the largest public face aging database) and FGNET, clearly show that the proposed method achieves notable improvement over state-of-the-art algorithms.
4 0.86697596 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild
Author: Heng Yang, Ioannis Patras
Abstract: In this paper we propose a method for the localization of multiple facial features on challenging face images. In the regression forests (RF) framework, observations (patches) that are extracted at several image locations cast votes for the localization of several facial features. In order to filter out votes that are not relevant, we pass them through two types of sieves, that are organised in a cascade, and which enforce geometric constraints. The first sieve filters out votes that are not consistent with a hypothesis for the location of the face center. Several sieves of the second type, one associated with each individual facial point, filter out distant votes. We propose a method that adjusts on-the-fly the proximity threshold of each second type sieve by applying a classifier which, based on middle-level features extracted from voting maps for the facial feature in question, makes a sequence of decisions on whether the threshold should be reduced or not. We validate our proposed method on two challenging datasets with images collected from the Internet in which we obtain state of the art results without resorting to explicit facial shape models. We also show the benefits of our method for proximity threshold adjustment especially on ‘difficult’ face images.
5 0.85316992 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition
Author: Yizhe Zhang, Ming Shao, Edward K. Wong, Yun Fu
Abstract: One of the most challenging tasks in face recognition is to identify people with varied poses. Namely, the test faces have significantly different poses compared with the registered faces. In this paper, we propose a high-level feature learning scheme to extract pose-invariant identity feature for face recognition. First, we build a single-hidden-layer neural network with sparse constraint, to extract pose-invariant feature in a supervised fashion. Second, we further enhance the discriminative capability of the proposed feature by using multiple random faces as the target values for multiple encoders. By enforcing the target values to be unique for input faces over different poses, the learned high-level feature that is represented by the neurons in the hidden layer is pose free and only relevant to the identity information. Finally, we conduct face identification on CMU MultiPIE, and verification on Labeled Faces in the Wild (LFW) databases, where identification rank-1 accuracy and face verification accuracy with ROC curve are reported. These experiments demonstrate that our model is superior to other state-of-the-art approaches on handling pose variations.
6 0.85188621 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation
7 0.78390664 219 iccv-2013-Internet Based Morphable Model
8 0.77064359 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification
9 0.77001214 154 iccv-2013-Face Recognition via Archetype Hull Ranking
10 0.73046583 355 iccv-2013-Robust Face Landmark Estimation under Occlusion
11 0.72386205 206 iccv-2013-Hybrid Deep Learning for Face Verification
12 0.72101527 106 iccv-2013-Deep Learning Identity-Preserving Face Space
13 0.71134144 444 iccv-2013-Viewing Real-World Faces in 3D
14 0.68903786 158 iccv-2013-Fast High Dimensional Vector Multiplication Face Recognition
15 0.6884622 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
16 0.67925519 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection
17 0.6753099 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition
18 0.67185551 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification
19 0.66253036 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition
20 0.63234526 84 iccv-2013-Complex 3D General Object Reconstruction from Line Drawings
topicId topicWeight
[(2, 0.06), (7, 0.025), (12, 0.023), (26, 0.092), (31, 0.052), (34, 0.023), (40, 0.017), (41, 0.178), (42, 0.137), (64, 0.062), (73, 0.042), (89, 0.174), (98, 0.017)]
simIndex simValue paperId paperTitle
1 0.90154898 257 iccv-2013-Log-Euclidean Kernels for Sparse Representation and Dictionary Learning
Author: Peihua Li, Qilong Wang, Wangmeng Zuo, Lei Zhang
Abstract: The symmetric positive definite (SPD) matrices have been widely used in image and vision problems. Recently there are growing interests in studying sparse representation (SR) of SPD matrices, motivated by the great success of SR for vector data. Though the space of SPD matrices is well-known to form a Lie group that is a Riemannian manifold, existing work fails to take full advantage of its geometric structure. This paper attempts to tackle this problem by proposing a kernel based method for SR and dictionary learning (DL) of SPD matrices. We disclose that the space of SPD matrices, with the operations of logarithmic multiplication and scalar logarithmic multiplication defined in the Log-Euclidean framework, is a complete inner product space. We can thus develop a broad family of kernels that satisfies Mercer’s condition. These kernels characterize the geodesic distance and can be computed efficiently. We also consider the geometric structure in the DL process by updating atom matrices in the Riemannian space instead of in the Euclidean space. The proposed method is evaluated with various vision problems and shows notable performance gains over state-of-the-arts.
same-paper 2 0.87363565 157 iccv-2013-Fast Face Detector Training Using Tailored Views
Author: Kristina Scherbaum, James Petterson, Rogerio S. Feris, Volker Blanz, Hans-Peter Seidel
Abstract: Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola Jones’ AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels such as color features at training time. In our experiments we trained seven view-dependent face detectors and evaluate these on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset.
3 0.83589542 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
Author: Cewu Lu, Jianping Shi, Jiaya Jia
Abstract: Speedy abnormal event detection meets the growing demand to process an enormous number of surveillance videos. Based on inherent redundancy of video structures, we propose an efficient sparse combination learning framework. It achieves decent performance in the detection phase without compromising result quality. The short running time is guaranteed because the new method effectively turns the original complicated problem to one in which only a few costless small-scale least square optimization steps are involved. Our method reaches high detection rates on benchmark datasets at a speed of 140∼150 frames per second on average when computing on an ordinary desktop PC using MATLAB.
4 0.82704139 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
Author: Jungseock Joo, Shuo Wang, Song-Chun Zhu
Abstract: We present a part-based approach to the problem of human attribute recognition from a single image of a human body. To recognize the attributes of human from the body parts, it is important to reliably detect the parts. This is a challenging task due to the geometric variation such as articulation and view-point changes as well as the appearance variation of the parts arisen from versatile clothing types. The prior works have primarily focused on handling geometric variation by relying on pre-trained part detectors or pose estimators, which require manual part annotation, but the appearance variation has been relatively neglected in these works. This paper explores the importance of the appearance variation, which is directly related to the main task, attribute recognition. To this end, we propose to learn a rich appearance part dictionary of human with significantly less supervision by decomposing image lattice into overlapping windows at multiscale and iteratively refining local appearance templates. We also present quantitative results in which our proposed method outperforms the existing approaches.
5 0.81262046 330 iccv-2013-Proportion Priors for Image Sequence Segmentation
Author: Claudia Nieuwenhuis, Evgeny Strekalovskiy, Daniel Cremers
Abstract: We propose a convex multilabel framework for image sequence segmentation which allows imposing proportion priors on object parts in order to preserve their size ratios across multiple images. The key idea is that for strongly deformable objects such as a gymnast the size ratio of respective regions (head versus torso, legs versus full body, etc.) is typically preserved. We propose different ways to impose such priors in a Bayesian framework for image segmentation. We show that near-optimal solutions can be computed using convex relaxation techniques. Extensive qualitative and quantitative evaluations demonstrate that the proportion priors allow for highly accurate segmentations, avoiding seeping-out of regions and preserving semantically relevant small-scale structures such as hands or feet. They naturally apply to multiple object instances such as players in sports scenes, and they can relate different objects instead of object parts, e.g. organs in medical imaging. The algorithm is efficient and easily parallelized, leading to proportion-consistent segmentations at runtimes around one second.
6 0.81235939 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation
7 0.81228191 150 iccv-2013-Exemplar Cut
8 0.81215256 338 iccv-2013-Randomized Ensemble Tracking
9 0.81200856 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation
10 0.81138253 180 iccv-2013-From Where and How to What We See
11 0.80910027 349 iccv-2013-Regionlets for Generic Object Detection
12 0.80780143 44 iccv-2013-Adapting Classification Cascades to New Domains
13 0.80761003 259 iccv-2013-Manifold Based Face Synthesis from Sparse Samples
14 0.80758268 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
15 0.80716723 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation
16 0.80603272 80 iccv-2013-Collaborative Active Learning of a Kernel Machine Ensemble for Recognition
17 0.80520821 379 iccv-2013-Semantic Segmentation without Annotating Segments
18 0.80452645 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
19 0.80447847 277 iccv-2013-Multi-channel Correlation Filters
20 0.80420488 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification