iccv iccv2013 iccv2013-36 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai
Abstract: This paper presents an automatic and robust approach that accurately captures high-quality 3D facial performances using a single RGBD camera. The key of our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. The whole process is fully automatic and robust because it is based on a single-frame facial registration framework. The system is flexible because it does not require any strong 3D facial priors such as blendshape models. We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods.
Reference: text
sentIndex sentText sentNum sentScore
1 The key of our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. [sent-2, score-1.884]
2 In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. [sent-3, score-1.753]
3 The whole process is fully automatic and robust because it is based on a single-frame facial registration framework. [sent-4, score-0.993]
4 The system is flexible because it does not require any strong 3D facial priors such as blendshape models. [sent-5, score-0.874]
5 We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods. [sent-6, score-0.781]
6 Introduction The ability to accurately capture 3D facial performances has many applications including animation, gaming, human-computer interaction, security, and telepresence. [sent-8, score-0.84]
7 This paper presents an alternative to solving this problem: reconstructing the user’s 3D facial performances using a single RGBD camera. [sent-13, score-0.81]
8 The main contribution of this paper is a novel 3D facial modeling process that accurately reconstructs 3D facial expression models from single RGBD images. [sent-14, score-1.492]
9 We focus on single-frame facial reconstruction because it ensures the process is fully automatic and does not suffer from drifting errors. [sent-15, score-0.824]
10 At the core of our system lies a 3D facial deformation registration process that incrementally deforms a template face model to best match observed depth data. [sent-16, score-1.691]
11 We model 3D facial deformation in a reduced subspace through embedded deformation [16] and extend model-based optical flow formulation to depth image data. [sent-17, score-1.397]
12 This allows us to formulate the 3D nonrigid registration process in the Lucas-Kanade registration framework [1] and use linear system solvers to incrementally deform the template face model to match observed depth images. [sent-18, score-1.231]
13 The system often produces poor registration results when facial deformations are far from the template face model. [sent-20, score-1.212]
14 In addition, it does not take into account perceptually significant facial features such as nose tip and mouth corners, thereby resulting in misalignments in those perceptually important facial regions. [sent-21, score-1.691]
15 We address the challenges by complementing our image-based nonrigid registration process with automatic facial feature detection process. [sent-22, score-1.216]
16 Our experiment shows that incorporating important facial features into the nonrigid registration process significantly improves the accuracy and robustness of the reconstruction process. [sent-23, score-1.21]
17 We demonstrate the power of our facial reconstruction system by modeling a wide range of facial expressions using a single Kinect (see Figure 1). [sent-24, score-1.576]
18 We evaluate the performance of the system by comparing against alternative methods including marker-based motion capture [18], “faceshift” system [20], Microsoft face SDK [11], and nonrigid facial registration using Iterative Closest Points (ICP). [sent-25, score-1.413]
19 Background Our system accurately captures high-quality 3D facial performances using a single RGBD camera. [sent-27, score-0.859]
20 Therefore, we will focus our discussion on methods and systems developed for acquiring 3D facial performances. [sent-28, score-0.721]
21 One of the most successful approaches for 3D facial performance capture is based on marker-based motion capture. Figure 1. [sent-29, score-0.784]
22 Accurate and robust facial performance capture using a single Kinect: (top) reference image data; (bottom) the reconstructed facial performances. [sent-30, score-1.505]
23 Much recent research (e.g., [2, 10]) has focused on complementing marker-based systems with other types of capturing devices, such as video cameras and/or 3D scanners, to improve the resolution and details of captured facial geometry. [sent-34, score-0.777]
24 Marker-based motion capture, however, is not practical for the random users targeted by this paper, as such systems are expensive and cumbersome for 3D facial performance capture. [sent-35, score-0.766]
25 Marker-less motion capture provides an appealing alternative to facial performance capture because it is non- intrusive and does not impede the subject’s ability to perform facial expressions. [sent-36, score-1.603]
26 One solution to marker-less facial capture is the use of depth and/or color data obtained from structured light systems [22, 13, 12, 21]. [sent-37, score-0.925]
27 For example, Zhang and his colleagues [22] captured 3D facial geometry and texture over time and built the correspondences across all the facial geometries by deforming a generic face template to fit the acquired depth data using optical flow computed from image sequences. [sent-38, score-2.056]
28 The minimal requirement of a single camera for facial performance capture is particularly appealing, as it offers the lowest cost and a simplified setup. [sent-40, score-0.76]
29 However, previous single RGB camera systems for facial capture [7, 6, 14] are often vulnerable to ambiguity caused by a lack of distinctive features on the face and uncontrolled lighting environments. [sent-41, score-0.825]
30 One way to address the issue is to use 3D prior models to reduce the ambiguity of image-based facial deformations (e. [sent-42, score-0.721]
31 More recent research [5, 4, 20] has been focused on modeling 3D facial deformation using a single RGBD camera such as Microsoft Kinect or time-of-flight (ToF) cameras. [sent-45, score-0.901]
32 [4] constructed 3D identity and expression morphable models from a large corpus of prerecorded 3D facial scans and used them to fit depth data obtained from a ToF camera via similar ICP techniques. [sent-48, score-0.886]
33 [20], which uses RGBD image data captured by a single Kinect and a template, along with a set of predefined blend shape models, to track facial deformations over time. [sent-50, score-0.798]
34 Our system shares a similar perspective as theirs because both are targeting low-cost and portable facial capture accessible to random users. [sent-51, score-0.818]
35 Our goal, however, is different from theirs in that we focus on authentic reconstruction of 3D facial performances rather than performance-based facial retargeting and animation. [sent-52, score-1.547]
36 Our method for facial capture is also significantly different from theirs. [sent-53, score-0.76]
37 Their approach utilizes a set of predefined blend shape models and closest points measurement to sequentially track facial performances in a Maximum A Posteriori (MAP) framework. [sent-54, score-0.866]
38 In contrast, our approach focuses on single-frame facial reconstruction and combines image-based registration techniques with automatic facial feature detection in the Lucas-Kanade registration framework. [sent-55, score-1.958]
39 Another difference is that we model deformation using embedded deformation rather than blendshape representation and therefore do not require any predefined blendshape models, which significantly reduces overhead costs for 3D facial capture. [sent-56, score-1.38]
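For contrast, the blendshape representation that the paper avoids requires a predefined expression basis to exist up front. The minimal sketch below (function name and array layouts are illustrative assumptions, not the paper's code) shows why: every expressible face is a weighted sum of prebuilt offsets from the neutral mesh.

```python
import numpy as np

def blendshape_face(neutral, blendshapes, w):
    """Blendshape model: neutral mesh plus a weighted sum of
    predefined expression offsets. The (K, V, 3) basis must be
    built beforehand, which is the overhead embedded deformation
    avoids."""
    offsets = blendshapes - neutral            # (K, V, 3) expression deltas
    return neutral + np.tensordot(w, offsets, axes=1)
```

This makes the trade-off concrete: embedded deformation needs only the scanned template, while a blendshape rig needs the whole basis `blendshapes` before tracking can start.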
40 Accurate and robust facial performance capture using a single Kinect: (a) the input depth image; (b) the input color image; (c) the detected facial features superimposed on the color image; (d) the reconstructed 3D facial model using both the depth image and detected facial features. [sent-60, score-3.327]
41 Overview Our system acquires high-quality 3D facial models from single RGBD images recorded by a single Kinect. [sent-62, score-0.779]
42 Our facial modeling process leverages automatic facial feature detection and image-based nonrigid registration for 3D facial expression modeling (Figure 2). [sent-64, score-2.63]
43 We start the process by automatically locating important facial features such as the nose tip and the mouth corners in single RGBD images (Figure 2(c)). [sent-65, score-0.829]
44 To handle this challenge, we employ geometric hashing to robustly search for the closest examples in a training set of images where all the key facial features are labeled, and remove misclassified features that are inconsistent with the closest example. [sent-68, score-0.803]
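The exemplar-consistency check above can be sketched in a few lines. This is a simplified stand-in, not the paper's implementation: it assumes the closest labeled exemplar has already been registered into the image frame by the geometric-hashing lookup, and the function name and threshold are hypothetical.

```python
import numpy as np

def filter_inconsistent_features(detected, exemplar, max_dev):
    """Keep only detected 2D features that agree with the retrieved
    labeled exemplar; returns a boolean keep-mask per landmark."""
    detected = np.asarray(detected, float)
    exemplar = np.asarray(exemplar, float)
    residual = np.linalg.norm(detected - exemplar, axis=1)
    return residual < max_dev
```

In the paper the surviving features are then refined with AAM fitting; here the mask alone illustrates the "remove misclassified features" step.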
45 In the final step, we refine feature locations by utilizing active appearance models (AAM) and 2D facial priors embedded in K closest examples of the detected features. [sent-69, score-0.873]
46 We model 3D facial deformation using embedded deformation [16] and this allows us to constrain the solution to be in a reduced subspace. [sent-71, score-1.167]
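A minimal sketch of embedded deformation in the spirit of Sumner et al. [16]: each vertex moves by a normalized blend of per-node affine transforms. The inverse-distance weighting and k-nearest-node association below are simplifying assumptions; the paper follows [16] for the exact weights.

```python
import numpy as np

def embedded_deform(vertices, nodes, A, t, k=4):
    """Deform vertices by blending the affine transforms of the k
    nearest deformation-graph nodes.
    A: (M, 3, 3) per-node affine matrices; t: (M, 3) translations."""
    vertices = np.asarray(vertices, float)
    out = np.zeros_like(vertices)
    for i, v in enumerate(vertices):
        d = np.linalg.norm(nodes - v, axis=1)
        idx = np.argsort(d)[:k]
        w = 1.0 / (d[idx] + 1e-8)      # inverse-distance weights
        w /= w.sum()                   # normalized to sum to one
        for j, wj in zip(idx, w):
            out[i] += wj * (A[j] @ (v - nodes[j]) + nodes[j] + t[j])
    return out
```

With identity affines and zero translations every vertex stays put, which is the sanity check that makes the reduced-subspace parameterization (only the M node transforms are free variables) easy to trust.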
47 We introduce a model-based depth flow algorithm to incrementally deform the face template to best match observed depth data as well as detected facial features. [sent-72, score-1.465]
48 We formulate the problem in the Lucas-Kanade image registration framework and incrementally estimate both rigid transformations (ρ) and nonrigid deformation (g) via linear system solvers. [sent-73, score-0.809]
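The incremental estimation via linear system solvers amounts to repeated Gauss-Newton updates on the stacked residual. A generic sketch (the small damping parameter is a hypothetical addition for numerical safety, not from the paper):

```python
import numpy as np

def gauss_newton_step(residual_fn, jacobian_fn, q, damping=1e-6):
    """One Lucas-Kanade-style update: linearize the residual r(q)
    and solve the (damped) normal equations for the increment."""
    r = residual_fn(q)
    J = jacobian_fn(q)
    H = J.T @ J + damping * np.eye(J.shape[1])  # approximate Hessian
    delta = np.linalg.solve(H, -J.T @ r)
    return q + delta
```

In the paper, q stacks the rigid pose ρ and the graph-node parameters g, and the residual stacks the depth, feature, and regularization terms; iterating this step to convergence per frame is the registration loop.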
49 Image-based Nonrigid Facial Registration This section focuses on automatic 3D facial modeling using single RGBD images, which is achieved by deforming a template mesh model, s0, to best match observed image data. [sent-76, score-1.194]
50 In our implementation, we obtain the template mesh model s0 by scanning facial geometry of the subject under a neutral expression. [sent-77, score-1.04]
51 We choose embedded deformation because it allows us to model the deformation in a reduced subspace, thereby significantly reducing the ambiguity for 3D facial modeling. [sent-85, score-1.187]
52 In our experiment, graph nodes are chosen by uniformly sampling vertices of the template mesh model in the frontal facial region. [sent-103, score-1.112]
53 We have found M = 250 graph nodes are often sufficient to model facial details captured by a Kinect. [sent-104, score-0.782]
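One plausible way to pick such graph nodes is greedy farthest-point sampling, sketched below. The paper states only that nodes are uniformly sampled from the frontal facial region, so this particular procedure (and its function name) is an assumption.

```python
import numpy as np

def sample_graph_nodes(vertices, m=250, seed=0):
    """Pick m well-spread deformation-graph nodes from the template
    vertices via greedy farthest-point sampling."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(vertices)))]
    d = np.linalg.norm(vertices - vertices[idx[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))          # farthest from chosen set
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(vertices - vertices[nxt], axis=1))
    return np.array(idx)
```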
54 The state of our facial modeling process can thus be defined by q = [ρ, g] . [sent-107, score-0.768]
55 Our image-based nonrigid facial registration process aims to minimize the following objective function: min_q E_data + α1 E_feature + α2 E_rot + α3 E_reg (2), where the first term is the data fitting term, which measures how well the deformed template model matches the observed RGBD data. [sent-115, score-1.439]
56 The second term Efeature is the facial feature term which ensures the “hypothesized” facial features are consistent with the “detected” facial features in observed data. [sent-127, score-2.297]
57 Depth Image Term This section introduces a novel model-based depth flow algorithm for incrementally estimating the rigid transformation (ρ) and nonrigid transformation (g) of the template mesh (s0) to best match the observed depth image D. [sent-137, score-1.1]
58 We adopt an “analysis-by-synthesis” strategy to incrementally register the deforming template mesh with the observed depth image via depth flow. [sent-148, score-0.823]
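The analysis-by-synthesis loop compares a depth image rendered from the current model estimate against the observed one. A toy version of the per-pixel data residual, with hypothetical NaN masking and outlier gating (the threshold is an assumption, not a value from the paper):

```python
import numpy as np

def depth_flow_residual(z_rendered, z_observed, max_diff=0.05):
    """Per-pixel residual between synthesized and observed depth.
    Pixels with no depth (NaN) or with a gap above max_diff
    (likely occlusion/outlier) contribute zero."""
    valid = np.isfinite(z_rendered) & np.isfinite(z_observed)
    r = np.where(valid, z_rendered - z_observed, 0.0)
    r[np.abs(r) > max_diff] = 0.0   # robust gating of large residuals
    return r
```

Each Lucas-Kanade iteration would linearize this residual with respect to the model parameters, solve for an increment, re-render, and repeat.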
59 The inner boundary of the face is defined by the enclosed regions of the detected facial features of the mouth and eyes. [sent-160, score-0.922]
60 We address the challenge by including the facial feature term into the objective function. [sent-164, score-0.768]
61 In our implementation, we choose to define the facial feature term based on a combination of 2D and 3D facial points obtained from detection process. [sent-165, score-1.489]
62 In a preprocessing step, we annotate the locations of facial features on the template mesh model by identifying the barycentric coordinates (p̄i) of the facial features on the template mesh. [sent-166, score-1.964]
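Pinning features by barycentric coordinates means a "hypothesized" feature location can be recomputed from any deformed state of the mesh. A sketch under assumed array layouts (triangles as vertex-index triples; names are illustrative):

```python
import numpy as np

def feature_positions(vertices, faces, feat_faces, bary):
    """Hypothesized 3D feature locations: each feature is pinned to
    one triangle of the template by fixed barycentric coordinates,
    so it rides along with the mesh as it deforms.
    faces: (T, 3) int; feat_faces: (F,) triangle index per feature;
    bary: (F, 3) barycentric weights."""
    tri = vertices[faces[feat_faces]]           # (F, 3, 3) corners
    return np.einsum('fk,fkd->fd', bary, tri)   # barycentric blend
```

During optimization these hypothesized positions are compared against the detected 2D/3D features, which is exactly what the feature term penalizes.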
63 The facial feature term minimizes the inconsistency between the “hypothesized” and “observed” features in either 2D or 3D space. [sent-167, score-0.768]
64 The vectors xi and pi are the 2D and 3D coordinates of the i-th detected facial feature. [sent-173, score-0.774]
65 Note that only facial features around important regions, namely the mouth, nose, eyes, and eyebrows, are included in the facial feature term evaluation. [sent-175, score-1.54]
66 This is because facial features located on outer contour are often not very stable. [sent-176, score-0.763]
67 During the nonrigid registration process, we stabilize the outer boundary of the deforming face by penalizing the deviation from the transformed template s0 ⊕ ρ. [sent-184, score-0.856]
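The outer-boundary stabilization can be written as a simple quadratic penalty toward the rigidly transformed template s0 ⊕ ρ. The function below is an illustrative sketch of such a term, not the paper's exact formulation:

```python
import numpy as np

def boundary_energy(deformed, template, R, tvec, boundary_idx):
    """Quadratic penalty keeping outer-boundary vertices of the
    deforming face near the rigidly transformed template s0 ⊕ ρ,
    here parameterized as rotation R and translation tvec."""
    target = template[boundary_idx] @ R.T + tvec   # s0 ⊕ ρ on the boundary
    diff = deformed[boundary_idx] - target
    return (diff ** 2).sum()
```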
68 Registration Optimization Our 3D facial modeling requires minimizing a sum of squared nonlinear function values defined in Equation (2). [sent-190, score-0.721]
69 We evaluate our system on synthetic RGBD image data generated by high-fidelity 3D facial data captured by Huang and colleagues [10]. [sent-211, score-0.896]
70 ground truth facial data acquired by an optical motion capture system [18]. [sent-227, score-0.868]
71 We placed 62 retro-reflective markers (4 mm diameter hemispheres) on the subject’s face and set up a twelve-camera Vicon motion capture system [18] to record dynamic facial movements at 240 frames per second acquisition rate. [sent-228, score-1.019]
72 The average reconstruction error, computed as the average 3D marker-position discrepancy between the reconstructed facial models and the ground-truth mocap data, is reported in Table 1. [sent-240, score-0.876]
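The reported metric reduces to a mean of per-marker Euclidean distances over markers and frames; a sketch (function name assumed):

```python
import numpy as np

def average_marker_error(reconstructed, ground_truth):
    """Average 3D discrepancy (same units as the input, e.g. mm)
    between reconstructed marker positions and mocap ground truth,
    averaged over all markers and frames. Inputs: (..., 3) arrays."""
    diff = np.asarray(reconstructed) - np.asarray(ground_truth)
    return np.linalg.norm(diff, axis=-1).mean()
```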
73 This is because ICP is often sensitive to initial values and prone to local minima, particularly when tracking a high-dimensional facial mesh model from noisy depth data. [sent-259, score-1.078]
74 In addition, we have compared our system against nonrigid ICP registration on synthetic data generated from high-quality facial performance data obtained by [10]. [sent-260, score-1.226]
75 Sequential tracking incorporates temporal coherence into facial reconstruction process. [sent-262, score-0.817]
76 As shown in Figure 3, our facial reconstruction produces more accurate results over nonrigid ICP registration, with and without temporal coherence. [sent-265, score-0.99]
77 Specifically, we instructed the user to sit in front of a Kinect and recorded a small number of facial expressions to retarget a set of predefined blendshape models to the user [20]. [sent-272, score-0.865]
78 With the retargeted blendshape models, we can use their software to sequentially track the facial expression of the user. [sent-273, score-0.816]
79 We evaluate the performance of our system by doing a side-by-side comparison against Microsoft Kinect facial SDK. [sent-276, score-0.779]
80 This is because we model detailed deformation of the whole face using both facial features and per-pixel depth information. [sent-279, score-1.131]
81 Evaluation on 3D Facial Reconstruction Process We have evaluated the performance of our facial registration process by dropping each term of the cost function described in Equation (3). [sent-282, score-1.012]
82 We compared results obtained by the facial registration process with or without the facial feature term. [sent-285, score-1.686]
83 The accompanying video shows that tracking without the facial feature term often results in misalignments of perceptually important facial features. [sent-286, score-1.693]
84 More importantly, when the facial performances are very different or far away from the template model, the optimization often gets stuck in local minima, thereby producing inaccurate results. [sent-287, score-0.945]
85 With the facial feature term, the algorithm can accurately reconstruct facial performances across the entire sequence. [sent-288, score-0.801]
86 This is because the facial feature term only constrains facial deformation at locations of a sparse set of facial features rather than detailed per-pixel constraints. [sent-294, score-2.39]
87 With the depth image term, the average error of our facial reconstruction process is reduced from 5. [sent-295, score-0.961]
88 The accompanying video shows that adding the boundary term stabilizes the facial deformation along the border of the face boundary, which is more obvious in the real data. [sent-300, score-1.21]
89 Applications in Facial Performance Capture We have tested our system on acquiring 3D facial performances of four subjects. [sent-303, score-0.834]
90 We used a Minolta VIVID 910 laser scanner to record high-resolution static facial geometry of an actor/actress as the template mesh. [sent-304, score-0.891]
91 In our implementation, we utilize temporal coherence to speed up the facial tracking process. [sent-306, score-0.767]
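Using temporal coherence only as a warm start keeps the per-frame registration independent (so errors cannot accumulate, as the paper stresses). A schematic driver loop, with hypothetical names:

```python
def track_sequence(frames, solve, q0):
    """Warm-started tracking: each frame's registration is
    initialized from the previous frame's solution, which only
    speeds up convergence; every frame is still solved on its own."""
    results, q = [], q0
    for f in frames:
        q = solve(f, init=q)   # per-frame registration, warm-started
        results.append(q)
    return results
```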
92 Conclusion We have presented an automatic algorithm for accurately capturing 3D facial performances using a single RGBD camera. [sent-311, score-0.868]
93 The key idea of our algorithm is to combine the power of image-based 3D nonrigid registration and automatic facial feature detection for 3D facial registration. [sent-312, score-1.884]
94 We demonstrate the power of our approach by modeling a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods. [sent-313, score-0.781]
95 Our system is appealing for 3D facial modeling and capture. [sent-314, score-0.804]
96 It is flexible because it does not require strong 3D facial priors commonly used in previous facial modeling systems (e. [sent-315, score-1.442]
97 It is robust and does not suffer from error accumulation because it builds on a single-frame facial registration framework. [sent-318, score-0.94]
98 Last but not least, our system is accurate because facial feature detection provides locations of perceptually significant facial features to avoid misalignments, and our image-based nonrigid registration method achieves sub-pixel accuracy. [sent-319, score-1.966]
99 Leveraging motion capture and 3d scanning for high-fidelity facial performance acquisition. [sent-393, score-0.784]
100 Fast and accurate facial alignment from a single rgbd image. [sent-439, score-0.918]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai
2 0.34982795 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection
Author: Xiaowei Zhao, Shiguang Shan, Xiujuan Chai, Xilin Chen
Abstract: In this paper, we propose a novel cascaded face shape space pruning algorithm for robust facial landmark detection. Through progressively excluding the incorrect candidate shapes, our algorithm can accurately and efficiently achieve the globally optimal shape configuration. Specifically, individual landmark detectors are firstly applied to eliminate wrong candidates for each landmark. Then, the candidate shape space is further pruned by jointly removing incorrect shape configurations. To achieve this purpose, a discriminative structure classifier is designed to assess the candidate shape configurations. Based on the learned discriminative structure classifier, an efficient shape space pruning strategy is proposed to quickly reject most incorrect candidate shapes while preserving the true shape. The proposed algorithm is carefully evaluated on a large set of real-world face images. In addition, comparison results on the publicly available BioID and LFW face databases demonstrate that our algorithm outperforms some state-of-the-art algorithms.
3 0.26724982 219 iccv-2013-Internet Based Morphable Model
Author: Ira Kemelmacher-Shlizerman
Abstract: In this paper we present a new concept of building a morphable model directly from photos on the Internet. Morphable models have shown very impressive results more than a decade ago, and could potentially have a huge impact on all aspects of face modeling and recognition. One of the challenges, however, is to capture and register 3D laser scans of a large number of people and facial expressions. Nowadays, there are enormous amounts of face photos on the Internet, a large portion of which has semantic labels. We propose a framework to build a morphable model directly from photos; the framework includes dense registration of Internet photos, as well as new single-view shape reconstruction and modification algorithms.
4 0.24146575 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild
Author: Heng Yang, Ioannis Patras
Abstract: In this paper we propose a method for the localization of multiple facial features on challenging face images. In the regression forests (RF) framework, observations (patches) that are extracted at several image locations cast votes for the localization of several facial features. In order to filter out votes that are not relevant, we pass them through two types of sieves, that are organised in a cascade, and which enforce geometric constraints. The first sieve filters out votes that are not consistent with a hypothesis for the location of the face center. Several sieves of the second type, one associated with each individual facial point, filter out distant votes. We propose a method that adjusts on-the-fly the proximity threshold of each second type sieve by applying a classifier which, based on middle-level features extracted from voting maps for the facial feature in question, makes a sequence of decisions on whether the threshold should be reduced or not. We validate our proposed method on two challenging datasets with images collected from the Internet in which we obtain state of the art results without resorting to explicit facial shape models. We also show the benefits of our method for proximity threshold adjustment especially on ’difficult’ face images.
5 0.23442329 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition
Author: Ziheng Wang, Yongqiang Li, Shangfei Wang, Qiang Ji
Abstract: In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines the bottom-level image features and the top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike methods that can only capture local pair-wise AU dependencies, our model is developed upon the restricted Boltzmann machine and therefore can exploit the global relationships among AUs. 2) Although AU relationships are influenced by many related factors such as facial expressions, these factors are generally ignored by the current methods. Our model, however, can successfully capture them to more accurately characterize the AU relationships. Efficient learning and inference algorithms of the proposed model are also developed. Experimental results on benchmark databases demonstrate the effectiveness of the proposed approach in modelling complex AU relationships as well as its superior AU recognition performance over existing approaches.
6 0.2219125 157 iccv-2013-Fast Face Detector Training Using Tailored Views
7 0.21493784 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification
8 0.20055628 444 iccv-2013-Viewing Real-World Faces in 3D
9 0.19340776 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition
10 0.18906724 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
11 0.18781933 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
12 0.17054483 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
13 0.17048521 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization
14 0.16394962 16 iccv-2013-A Generic Deformation Model for Dense Non-rigid Surface Registration: A Higher-Order MRF-Based Approach
15 0.16336571 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
16 0.15816067 424 iccv-2013-Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
17 0.15517408 283 iccv-2013-Multiple Non-rigid Surface Detection and Registration
18 0.15369742 183 iccv-2013-Geometric Registration Based on Distortion Estimation
19 0.14810805 339 iccv-2013-Rank Minimization across Appearance and Shape for AAM Ensemble Fitting
simIndex simValue paperId paperTitle
2 0.83383012 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification
Author: Hamdi Dibeklioglu, Albert Ali Salah, Theo Gevers
Abstract: Kinship verification from facial appearance is a difficult problem. This paper explores the possibility of employing facial expression dynamics in this problem. By using features that describe facial dynamics and spatio-temporal appearance over smile expressions, we show that it is possible to improve the state of the art in this problem, and verify that it is indeed possible to recognize kinship by resemblance of facial expressions. The proposed method is tested on different kin relationships. On average, 72.89% verification accuracy is achieved on spontaneous smiles.
4 0.66518962 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild
Author: Heng Yang, Ioannis Patras
Abstract: In this paper we propose a method for the localization of multiple facial features on challenging face images. In the regression forests (RF) framework, observations (patches) that are extracted at several image locations cast votes for the localization of several facial features. In order to filter out votes that are not relevant, we pass them through two types of sieves, organised in a cascade, which enforce geometric constraints. The first sieve filters out votes that are not consistent with a hypothesis for the location of the face center. Several sieves of the second type, one associated with each individual facial point, filter out distant votes. We propose a method that adjusts on the fly the proximity threshold of each second-type sieve by applying a classifier which, based on middle-level features extracted from voting maps for the facial feature in question, makes a sequence of decisions on whether the threshold should be reduced or not. We validate our proposed method on two challenging datasets with images collected from the Internet, on which we obtain state-of-the-art results without resorting to explicit facial shape models. We also show the benefits of our method for proximity threshold adjustment, especially on ’difficult’ face images.
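The cascade of sieves described above can be sketched minimally: a face-center consistency sieve first, then a per-point proximity sieve around a preliminary estimate. The thresholds and the mean-based estimate below are assumptions for illustration (the paper adjusts the proximity threshold with a learned classifier):

```python
# Two-sieve vote filtering sketch: drop votes inconsistent with the face
# center hypothesis, then drop votes far from the preliminary estimate
# (here, the mean of the surviving votes). Radii are fixed here, whereas
# the paper adapts the second threshold on the fly.

def sieve_votes(votes, face_center, center_radius, point_radius):
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    # Sieve 1: keep votes consistent with the face-center hypothesis.
    kept = [v for v in votes if dist(v, face_center) <= center_radius]
    if not kept:
        return None
    # Sieve 2: keep votes close to the preliminary estimate.
    cx = sum(v[0] for v in kept) / len(kept)
    cy = sum(v[1] for v in kept) / len(kept)
    kept = [v for v in kept if dist(v, (cx, cy)) <= point_radius]
    if not kept:
        return None
    return (sum(v[0] for v in kept) / len(kept),
            sum(v[1] for v in kept) / len(kept))
```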
5 0.65126127 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization
Author: Feng Zhou, Jonathan Brandt, Zhe Lin
Abstract: Localizing facial landmarks is a fundamental step in facial image analysis. However, the problem is still challenging due to the large variability in pose and appearance, and the existence of occlusions in real-world face images. In this paper, we present exemplar-based graph matching (EGM), a robust framework for facial landmark localization. Compared to conventional algorithms, EGM has three advantages: (1) an affine-invariant shape constraint is learned online from similar exemplars to better adapt to the test face; (2) the optimal landmark configuration can be directly obtained by solving a graph matching problem with the learned shape constraint; (3) the graph matching problem can be optimized efficiently by linear programming. To the best of our knowledge, this is the first attempt to apply a graph matching technique to facial landmark localization. Experiments on several challenging datasets demonstrate the advantages of EGM over state-of-the-art methods.
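The objective behind exemplar-based matching — pick one candidate per landmark so that appearance cost plus deviation of pairwise distances from an exemplar shape is minimal — can be sketched by exhaustive search at toy sizes. The paper solves this as a graph matching problem via linear programming; the brute-force loop below is only an illustration, and all names and costs are assumptions:

```python
from itertools import product

# Brute-force stand-in for exemplar-constrained landmark matching: minimize
# appearance cost plus the deviation of pairwise inter-landmark distances
# from an exemplar shape. Feasible only for toy candidate sets.

def match_landmarks(candidates, appearance_cost, exemplar, w_shape=1.0):
    def d(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    n = len(exemplar)
    best, best_cost = None, float('inf')
    for shape in product(*candidates):
        cost = sum(appearance_cost(i, p) for i, p in enumerate(shape))
        for i in range(n):
            for j in range(i + 1, n):
                cost += w_shape * abs(d(shape[i], shape[j])
                                      - d(exemplar[i], exemplar[j]))
        if cost < best_cost:
            best, best_cost = list(shape), cost
    return best
```

Replacing the enumeration with an LP relaxation is what makes the approach tractable at realistic candidate counts.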
6 0.63648075 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
7 0.63381183 219 iccv-2013-Internet Based Morphable Model
8 0.62879121 183 iccv-2013-Geometric Registration Based on Distortion Estimation
9 0.62020719 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition
10 0.58854187 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
11 0.5615204 16 iccv-2013-A Generic Deformation Model for Dense Non-rigid Surface Registration: A Higher-Order MRF-Based Approach
12 0.52480471 444 iccv-2013-Viewing Real-World Faces in 3D
13 0.50261366 355 iccv-2013-Robust Face Landmark Estimation under Occlusion
14 0.47797012 272 iccv-2013-Modifying the Memorability of Face Photographs
15 0.46084559 157 iccv-2013-Fast Face Detector Training Using Tailored Views
16 0.45932221 196 iccv-2013-Hierarchical Data-Driven Descent for Efficient Optimal Deformation Estimation
17 0.45148957 185 iccv-2013-Go-ICP: Solving 3D Registration Efficiently and Globally Optimally
18 0.44346851 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
19 0.433209 139 iccv-2013-Elastic Fragments for Dense Scene Reconstruction
20 0.43150443 56 iccv-2013-Automatic Registration of RGB-D Scans via Salient Directions
topicId topicWeight
[(2, 0.029), (7, 0.017), (26, 0.049), (31, 0.044), (35, 0.191), (40, 0.013), (42, 0.154), (64, 0.039), (73, 0.071), (78, 0.021), (89, 0.24), (98, 0.02)]
simIndex simValue paperId paperTitle
1 0.98176897 90 iccv-2013-Content-Aware Rotation
Author: Kaiming He, Huiwen Chang, Jian Sun
Abstract: We present an image editing tool called Content-Aware Rotation. Casually shot photos can appear tilted, and are often corrected by rotation and cropping. This trivial solution may remove desired content and hurt image integrity. Instead of doing rigid rotation, we propose a warping method that creates the perception of rotation and avoids cropping. Human vision studies suggest that the perception of rotation is mainly due to horizontal/vertical lines. We design an optimization-based method that preserves the rotation of horizontal/vertical lines, maintains the completeness of the image content, and reduces the warping distortion. An efficient algorithm is developed to address the challenging optimization. We demonstrate our content-aware rotation method on a variety of practical cases.
2 0.9247182 119 iccv-2013-Discriminant Tracking Using Tensor Representation with Semi-supervised Improvement
Author: Jin Gao, Junliang Xing, Weiming Hu, Steve Maybank
Abstract: Visual tracking has witnessed a growing number of methods for object representation, which is crucial to robust tracking. The dominant mechanism in object representation is using image features encoded in a vector as observations to perform tracking, without considering that an image is intrinsically a matrix, or a 2nd-order tensor. Approaches following this mechanism inevitably lose useful information and therefore cannot fully exploit the spatial correlations within the 2D image ensembles. In this paper, we treat an image as a 2nd-order tensor in its original form, and find a discriminative linear embedding space approximation to the original nonlinear submanifold embedded in the tensor space based on the graph embedding framework. We specially design two graphs for characterizing the intrinsic local geometrical structure of the tensor space, so as to retain more discriminant information when reducing the dimension along certain tensor dimensions. However, spatial correlations within a tensor are not limited to the elements along these dimensions. This means that some part of the discriminant information may not be encoded in the embedding space. We introduce a novel technique called semi-supervised improvement to iteratively adjust the embedding space to compensate for the loss of discriminant information, hence improving the performance of our tracker. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracker.
3 0.91460758 104 iccv-2013-Decomposing Bag of Words Histograms
Author: Ankit Gandhi, Karteek Alahari, C.V. Jawahar
Abstract: We aim to decompose a global histogram representation of an image into histograms of its associated objects and regions. This task is formulated as an optimization problem, given a set of linear classifiers, which can effectively discriminate the object categories present in the image. Our decomposition bypasses harder problems associated with accurately localizing and segmenting objects. We evaluate our method on a wide variety of composite histograms, and also compare it with MRF-based solutions. In addition to merely measuring the accuracy of decomposition, we also show the utility of the estimated object and background histograms for the task of image classification on the PASCAL VOC 2007 dataset.
4 0.90018094 403 iccv-2013-Strong Appearance and Expressive Spatial Models for Human Pose Estimation
Author: Leonid Pishchulin, Mykhaylo Andriluka, Peter Gehler, Bernt Schiele
Abstract: Typical approaches to articulated pose estimation combine spatial modelling of the human body with appearance modelling of body parts. This paper aims to push the state of the art in articulated pose estimation in two ways. First, we explore various types of appearance representations aiming to substantially improve the body part hypotheses. Second, we draw on and combine several recently proposed powerful ideas such as more flexible spatial models as well as image-conditioned spatial models. In a series of experiments we draw several important conclusions: (1) we show that the proposed appearance representations are complementary; (2) we demonstrate that even a basic tree-structured spatial human body model achieves state-of-the-art performance when augmented with the proper appearance representation; and (3) we show that the combination of the best-performing appearance model with a flexible image-conditioned spatial model achieves the best result, significantly improving over the state of the art, on the “Leeds Sports Poses” and “Parse” benchmarks.
same-paper 5 0.89850235 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera
Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai
Abstract: This paper presents an automatic and robust approach that accurately captures high-quality 3D facial performances using a single RGBD camera. The key to our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. The whole process is fully automatic and robust because it is based on a single-frame facial registration framework. The system is flexible because it does not require any strong 3D facial priors such as blendshape models. We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy in comparisons against alternative methods.
6 0.86150384 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
7 0.84556717 21 iccv-2013-A Method of Perceptual-Based Shape Decomposition
8 0.84381938 362 iccv-2013-Robust Tucker Tensor Decomposition for Effective Image Representation
9 0.83842719 30 iccv-2013-A Simple Model for Intrinsic Image Decomposition with Depth Cues
10 0.83795214 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model
11 0.83570623 115 iccv-2013-Direct Optimization of Frame-to-Frame Rotation
12 0.83497196 223 iccv-2013-Joint Noise Level Estimation from Personal Photo Collections
13 0.83351815 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency
14 0.83243108 65 iccv-2013-Breaking the Chain: Liberation from the Temporal Markov Assumption for Tracking Human Poses
15 0.83163851 364 iccv-2013-SGTD: Structure Gradient and Texture Decorrelating Regularization for Image Decomposition
16 0.8315891 339 iccv-2013-Rank Minimization across Appearance and Shape for AAM Ensemble Fitting
17 0.83066934 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization
18 0.83049351 300 iccv-2013-Optical Flow via Locally Adaptive Fusion of Complementary Data Costs
19 0.83039904 79 iccv-2013-Coherent Object Detection with 3D Geometric Context from a Single Image
20 0.82804966 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination