
219 iccv-2013-Internet Based Morphable Model


Source: pdf

Author: Ira Kemelmacher-Shlizerman

Abstract: In this paper we present a new concept of building a morphable model directly from photos on the Internet. Morphable models have shown very impressive results more than a decade ago, and could potentially have a huge impact on all aspects of face modeling and recognition. One of the challenges, however, is to capture and register 3D laser scans of large number of people and facial expressions. Nowadays, there are enormous amounts of face photos on the Internet, large portion of which has semantic labels. We propose a framework to build a morphable model directly from photos, the framework includes dense registration of Internet photos, as well as, new single view shape reconstruction and modification algorithms.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 This approach dramatically reduces the degrees of freedom of the face reconstruction problem, and enabled extremely impressive results. [sent-5, score-0.269]

2 We construct a morphable model directly from Internet photos, the model is then used for single view reconstruction from any new input image (Face Analyzer) and further for shape modification (Face Modifier), e. [sent-12, score-0.765]

3 Abstract In this paper we present a new concept of building a morphable model directly from photos on the Internet. [sent-15, score-0.774]

4 One of the challenges, however, is to capture and register 3D laser scans of large number of people and facial expressions. [sent-17, score-0.367]

5 Nowadays, there are enormous amounts of face photos on the Internet, large portion of which has semantic labels. [sent-18, score-0.557]

6 We propose a framework to build a morphable model directly from photos, the framework includes dense registration of Internet photos, as well as, new single view shape reconstruction and modification algorithms. [sent-19, score-0.76]

7 Introduction In their landmark 1999 paper [13], Blanz and Vetter introduced morphable models, a powerful tool for modeling 3D face shape and deformations that can be fit from a single photo. [sent-21, score-0.635]

8 Their key idea was to constrain face reconstruction (FRGC) concluded [ 1] that, if available, 3D models dramatically increase recognition performance due to their invariance to lighting, viewpoint, and occlusion. [sent-22, score-0.24]

9 More than a decade later, however, morphable models have yet to achieve their initial promise; while we’ve seen face detection and recognition enjoy widespread deployment into consumer cameras and photo sharing technology, morphable models have yet to achieve similar impact. [sent-23, score-0.999]

10 Still, face detection and recognition methods operate by training on a very large number of photos to achieve robust performance and often fail, e. [sent-24, score-0.557]

11 In the research community, the number of follow-on research papers on morphable models has declined in recent years. [sent-27, score-0.347]

12 Scale: while it’s relatively simple to train a face detector on 10,000 examples, acquiring, cleaning, and aligning the 3D models needed for morphable models is a painstaking and cumbersome task. [sent-31, score-0.477]

13 In this paper we’d like to address these limitations and introduce a new framework for computing a morphable model. [sent-36, score-0.347]

14 The vast amounts of photos of people already on the Internet can potentially capture many of the degrees of freedom of the human face. [sent-38, score-0.427]

15 In contrast, consider the logistical challenges of trying to acquire 3D scans of many babies in different expressions. [sent-40, score-0.244]

16 Different search terms yield face photos of any desired age, country, ethnicity, etc. [sent-41, score-0.557]

17 Given a collection of photos our method automatically computes pixel-wise correspondence from every photo in the collection to a single reference (which is also computed by the method). [sent-45, score-1.034]

18 This enables putting in correspondence all the photos in the collection. [sent-46, score-0.54]

19 The key idea of the paper is that once the photos are aligned it is possible to derive a 3D shape basis directly from the collection, and further to estimate 3D shape from any single image and modify its shape, e. [sent-47, score-0.934]

20 In particular, we show that the matrix of aligned intensities is a rank 4K matrix under the Lambertian reflectance model assumption and can be factored into K 3D basis shapes, as well as lighting and shape coefficients per image using SVD. [sent-50, score-0.699]

21 We demonstrate the effectiveness of this method on challenging images taken “in the wild”, including images of human faces with varying facial expression taken under arbitrary lighting and pose, and show shape reconstructions and modifications that are produced completely automatically. [sent-51, score-0.889]

22 Section 3 describes how we align collected Internet photos and derive a shape basis, we next describe the factorization method that allows reconstructing shape from a single image in Section 4 and modifying the shape to perform different facial expressions from a single image in Section 5. [sent-55, score-1.344]

23 Related Work Despite a large literature on face modeling, still it is very challenging to estimate 3D shape of a face from a single image and in particular with facial expressions, taken under unconstrained conditions. [sent-58, score-0.7]

24 Indeed, most state of the art techniques for high quality face reconstruction require a subject to come to a lab to be scanned with special equipment (e. [sent-59, score-0.24]

25 This approach works extremely well when the target is roughly within the linear span of the database (in their case, 200 young adults), but is not well suited for capturing facial shape with expressions and subtle details that vary from one individual to the next. [sent-66, score-0.64]

26 There are three publicly available implementations of morphable models [3, 5, 4] to which we compare in the results section. [sent-67, score-0.347]

27 Similarly, [24] reconstruct a shape by combining patches from a database of depths, and [7] proposed general (non face specific) priors on depth and albedo for shape from shading. [sent-68, score-0.47]

28 In our work we build a shape basis directly from the photos, moreover we establish dense correspondence between the shape basis and the input image enabling modification of input’s shape and texture. [sent-73, score-0.863]

29 This registration procedure allows us to construct an image basis that will be used in the next sections for shape estimation and modification from a single image. [sent-163, score-0.384]

30 Data Collection Our major motivation in constructing an Internet based image basis is to address variations in facial shape that are challenging to capture with 3D scanning devices. [sent-166, score-0.543]

31 We used image search queries like “smiling baby”, “crying baby”, “screaming baby” and so forth to collect sets of photos (we call them clusters) divided by semantic labels. [sent-168, score-0.465]

32 Similarly we collected photos of adults making different facial expressions. [sent-169, score-0.73]

33 As a result we collected around 4000 photos with roughly 300 photos per cluster. [sent-170, score-0.918]

34 Pixel-wise correspondence The goal of this part is to obtain dense correspondence between any two photos in the collection. [sent-173, score-0.653]

35 Let’s assume for simplicity that we are given two images (I1 and I2) of the same person making different facial expressions while the pose and lighting are fixed. [sent-174, score-0.688]

36 Finding dense correspondence between such two photos means looking for transformations u(x, y) and v(x, y) such that the distance between I1(x + u(x, y) , y + v(x, y)) and I2 (x, y) is minimized, i. [sent-175, score-0.54]
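The correspondence search described in sentence 36 can be written as an objective. The sketch below is only illustrative: the extracted sentence says "distance" without naming a norm, so the squared difference used here is an assumption, and practical optical flow methods usually add a smoothness term on u and v that is not shown.

```latex
\[
(u^{*}, v^{*}) \;=\; \arg\min_{u,\,v} \sum_{(x,y)}
\bigl( I_1\bigl(x + u(x,y),\; y + v(x,y)\bigr) - I_2(x,y) \bigr)^{2}
\]
```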

37 Recently, however, Kemelmacher and Seitz proposed a method they called “Collection Flow” (CF) [22] where they showed that given a large photo collection of the same person, e. [sent-180, score-0.283]

38 , photos of a celebrity downloaded from the Internet, it is possible to leverage the collection for lighting invariant flow estimation. [sent-182, score-0.897]

39 The projection was done in a way to capture the lighting (low frequency component) of the image while normalizing for the facial expression (high frequency component). [sent-188, score-0.57]

40 Intuitively, flow was computed from an image to the collection average (modified to include the lighting of the particular image). [sent-189, score-0.47]

41 In our method, we follow the ideas presented in [22] but propose to apply collection flow on each cluster independently and then estimate flow between each cluster’s average (computed by the method) to a global average (chosen as one of the clusters). [sent-192, score-0.573]

42 First, we found that CF performs well in case the photos have either similar identity (same gender, age and person) and varying facial expression (as in the original paper) or with roughly similar facial expression but varying identity. [sent-195, score-1.174]

43 Second, since the clusters have semantic meanings having separate flows per cluster enables morphing capabilities. [sent-197, score-0.345]

44 Note that even though the facial expression is roughly the same per cluster it can still vary quite significantly across individuals (and since some of the expressions are not clearly defined with a particular search term, e. [sent-198, score-0.763]

45 Given a photo I1 in cluster i we’d like to estimate flow to photo I2 in cluster j. [sent-203, score-0.833]

46 We first run collection flow on cluster i to get the flow I1 → Ai1 where Ai1 is the average of cluster i illuminated by lighting L1 of the input image, and similarly for I2 we find flow I2 → Aj2. [sent-204, score-1.138]

47 This process is performed in parallel for all photos in all the clusters. [sent-205, score-0.427]

48 The output is pixel wise correspondence from each cluster’s photo to its average. [sent-207, score-0.288]

49 Figure 2 shows the averages of clusters before (top) and after (bottom) the collection flow process. [sent-208, score-0.398]

50 Note how much sharper the facial features look, indicating good correspondence. [sent-209, score-0.276]

51 , “crying babies” and “laughing babies” both have an open mouth and narrower eyes (compared to neutral) however the extent to which the eyes are closed or the shape of the mouth causes a dramatic expression change. [sent-213, score-0.532]

52 To obtain correspondence across clusters we warp all images to their respective cluster average; from the warped images we construct an f × p matrix Mi (for cluster i) where f is the number of images and p the number of pixels in each image. [sent-214, score-0.585]

53 We then project the average of cluster i, Ai onto the global average Ag (chosen as the first cluster) obtaining Aig and estimate optical flow between Aig and Ag. [sent-216, score-0.348]

54 This step is done to match color and lighting of the target cluster, in all cases the rank of the projection is rank-4 as in [22]. [sent-217, score-0.258]
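The rank-4 projection in sentences 53 and 54 can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: `M_g` is taken to be the f × p matrix of warped images of the global cluster, the rank-4 basis is taken to be its top four right singular vectors, and the function names are hypothetical.

```python
import numpy as np

def low_rank_image_basis(M, rank=4):
    """Top `rank` right singular vectors of M (shape rank x p).

    Assumption: the rank-4 subspace used for lighting and color matching
    is the leading row space of the warped-image matrix.
    """
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:rank]

def project_onto_basis(image, basis):
    """Orthogonal projection of a flattened image (length p) onto span(basis rows)."""
    return (image @ basis.T) @ basis

def match_average_to_global(A_i, M_g, rank=4):
    """Hypothetical helper: cluster average A_i re-expressed in the global cluster's subspace."""
    return project_onto_basis(A_i, low_rank_image_basis(M_g, rank))
```

Optical flow would then be estimated between the returned image and the global average; that step is outside this sketch.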

55 Once correspondence was obtained all images are warped to a common global reference and matrix M that contains warped images from all the clusters is constructed. [sent-218, score-0.36]

56 Registration of a new input photo Given an input image, we’d like to align it to the rest of the collection. [sent-221, score-0.247]

57 We choose the cluster number to which the image belongs by measuring to which cluster the majority of nearest neighbor images belong. [sent-223, score-0.334]
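The majority vote in sentence 57 can be sketched as follows. The extracted text does not say which image features or distance are used, nor how many neighbors are considered, so the Euclidean distance on generic feature vectors and the parameter `k` are assumptions made for illustration.

```python
from collections import Counter
import math

def assign_cluster(query, features, labels, k=5):
    """Return the cluster label held by most of the k nearest neighbors.

    query    -- feature vector of the new photo (hypothetical descriptor)
    features -- feature vectors of the collection photos
    labels   -- cluster label of each collection photo
    """
    # Sort collection indices by Euclidean distance to the query.
    order = sorted(range(len(features)),
                   key=lambda i: math.dist(query, features[i]))
    # Count the cluster labels among the k closest photos.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```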

58 Given the cluster number (say i) the image is projected to Mi and low-rank version of the input image is computed, and further optical flow between the low rank version and the input image is computed. [sent-224, score-0.465]

59 The change in intensity can be caused by difference in lighting or surface normal (due to facial expression– even though we aligned for 2D flow still there is a possible change in surface normals), and texture, e. [sent-229, score-0.652]

60 Let us further assume that shape in an image can be represented by a linear combination of a set of basis shapes, i. [sent-233, score-0.261]

61 factorize the matrix to enable recovery of the shape basis, lighting coefficients, and how to combine the shape basis to enable single view reconstruction. [sent-240, score-0.683]

62 The intuition behind this representation is that we use the image set to produce a set of basis shapes, each of size 4 × p, that spans the shapes of the faces captured in the images. [sent-241, score-0.381]

63 We further show how to recover the basis coefficients, and use them to reconstruct a facial shape per image. [sent-245, score-0.567]

64 The main question, however, is how to separate the coefficients from the lighting representation? [sent-246, score-0.277]

65 To this end, we propose a doubleSVD approach, which includes rank constraints due to the lighting representation. [sent-247, score-0.258]

66 We were inspired by Bregler’s nonrigid shape factorization [15], however there it was done for a completely different problem–separating pose and shape parameters. [sent-248, score-0.351]

67 [28] created a basis that spans flow and normals given photos of different people, with the same expression (neutral) to use for recognition. [sent-249, score-0.902]

68 [16] create a basis that spans deformations due to flow but consider a controlled video sequence that is taken with 3 colored lights (thus every facial expression in every frame can be reconstructed using rigid photometric stereo). [sent-253, score-0.864]

69 Factorization to deformation and lighting We factorize M using Singular Value Decomposition, M = U D V^T, and take the rank-4K approximation to get M = PB where P = U√D is f × 4K and B = √D V^T is 4K × p. [sent-258, score-0.274]
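The first factorization step in sentence 69 can be sketched with NumPy's SVD. The function name is hypothetical, and the sketch covers only the truncated SVD split; it does not attempt the later steps that separate lighting from shape coefficients. Note that any such split is unique only up to an invertible 4K × 4K matrix A, since (PA)(A⁻¹B) = PB.

```python
import numpy as np

def factorize_rank_4k(M, K):
    """Split M (f x p) into P (f x 4K) and B (4K x p) via truncated SVD.

    Requires min(f, p) >= 4K. P = U * sqrt(D) and B = sqrt(D) * V^T,
    keeping only the 4K largest singular values.
    """
    r = 4 * K
    U, d, Vt = np.linalg.svd(M, full_matrices=False)
    sqrt_d = np.sqrt(d[:r])
    P = U[:, :r] * sqrt_d            # scale each kept column of U
    B = sqrt_d[:, None] * Vt[:r]     # scale each kept row of V^T
    return P, B
```

When M has rank at most 4K, the product P @ B reproduces M up to floating-point error; otherwise it is the best rank-4K approximation in the least-squares sense.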

70 In the absence of ambiguities, P should contain a combination of low order coefficients of the lighting and coefficients that combine the shape basis B. [sent-260, score-0.538]

71 It was shown in [12] that classic photometric stereo factorization can recover lighting and shape up to a 3 × 3 Generalized Bas-Relief ambiguity, and for arbitrary lighting approximated with first order spherical harmonics up to a 4 × 4 Lorentz transformation [9]. [sent-281, score-0.838]

72 Algorithm 1: Modify P and B to hold the rank-1 condition. Denote P = U√D and B = √D V^T; result: P and B for which the rank-1 condition holds; iterate until convergence. Once P and B are estimated, we can determine the true lighting l and basis coefficients α (in Alg. [sent-289, score-0.409]

73 Given the basis coefficients the sought shape is S = ? [sent-293, score-0.325]

74 in our case we recover a shape matrix per color channel and integrate the three shape matrices together (instead of one equation per pixel we have 3 equations). [sent-299, score-0.351]

75 Once the depth is reconstructed, it is still in the 2d state of the global reference and therefore inverse flow should be applied to transform the shape from reference to the original expression of the input image. [sent-300, score-0.544]

76 The inverse flow is obtained from the flows between cluster averages, as in Section 3. [sent-301, score-0.343]

77 Synthesis of novel 2D and 3D views Once depth per image is reconstructed and correspondence between every image to every other image in the dataset obtained, it becomes possible to transform between different faces and expressions. [sent-304, score-0.331]

78 Specifically, we can change the expression of a person from a single image by transforming it using the flow between the clusters. [sent-305, score-0.347]

79 To synthesize a view of cluster j from an image in cluster i we project the photo (aligned by flow to the cluster average) onto the rank-4 cluster i basis and also onto the rank-4 cluster j basis, yielding a pair of illumination-aligned projections. [sent-306, score-1.257]

80 Subtracting these two projections yields a difference image between clusters i and j, under the same illumination as the input photo, and adding it to the input photo yields a texture change. [sent-307, score-0.35]
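The texture change in sentences 79 and 80 can be sketched as follows. This is an illustration under assumptions: `M_i` and `M_j` stand for the warped-image matrices of the two clusters, the rank-4 bases are taken from their SVDs, the sign of the difference (target minus source) is a guess consistent with moving from cluster i to cluster j, and all function names are hypothetical.

```python
import numpy as np

def rank4_basis(M, rank=4):
    """Leading right singular vectors of a cluster's warped-image matrix."""
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[:rank]

def project(image, basis):
    """Orthogonal projection of a flattened image onto the basis rows."""
    return (image @ basis.T) @ basis

def change_expression_texture(photo, M_i, M_j):
    """Add the difference of the two illumination-aligned projections to the photo.

    photo -- flattened input photo, already aligned to the cluster average
    M_i   -- warped images of the source cluster (f_i x p)
    M_j   -- warped images of the target cluster (f_j x p)
    """
    proj_i = project(photo, rank4_basis(M_i))
    proj_j = project(photo, rank4_basis(M_j))
    return photo + (proj_j - proj_i)
```

If both clusters are the same the photo is returned unchanged, which is a quick sanity check on the sign convention.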

81 We also apply the flow difference between clusters i and j, warped to the coordinate system of the input photo. [sent-308, score-0.352]

82 4 we show many input photos and the corresponding reconstructions automatically obtained using our method (for each photo we show three views of the reconstruction). [sent-313, score-0.754]

83 Note the dramatic difference in facial expressions which is captured in the reconstruction (going from laughing to screaming to sad and so forth), the change in identity, ethnicity and gender, variety of lighting conditions, etc. [sent-315, score-0.98]

84 In Figure 3 we demonstrate how based on the dense correspondence that is found between every photo to every cluster in the collection, it is possible to make automatic modifications to the input photo to achieve change in the facial expression. [sent-317, score-0.97]

85 Figure 7 shows reconstruction results of Vizago [5] and Image Metrics [4], both are implementations of the morphable model method. [sent-323, score-0.457]

86 By observing the profile views we see that the shapes are mostly of an average adult and do not capture the facial expression. [sent-324, score-0.344]

87 Second column presents the ground truth shape–estimated by taking all the photos of the same person given in YaleB and running calibrated photometric stereo (known lighting directions per image) [26]. [sent-332, score-0.926]

88 Column 4 shows the depth map difference between our single view reconstruction and photometric stereo; below each difference there is the mean error and standard deviation in percent. [sent-333, score-0.338]
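One way to compute a per-image mean error and standard deviation in percent is sketched below. The extracted text does not state how the percentage is normalized, so dividing the absolute depth difference by the ground-truth depth range is an assumption, as is the function name.

```python
import numpy as np

def depth_error_percent(z_est, z_gt):
    """Mean and standard deviation of the absolute depth error, in percent.

    Assumption: errors are normalized by the ground-truth depth range
    (max minus min), which must be non-zero.
    """
    z_est = np.asarray(z_est, dtype=float)
    z_gt = np.asarray(z_gt, dtype=float)
    depth_range = z_gt.max() - z_gt.min()
    percent = 100.0 * np.abs(z_est - z_gt) / depth_range
    return percent.mean(), percent.std()
```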

89 Note that [20] is not designed to work with facial expressions and its performance degrades when the input photo is less similar to the reference template. [sent-338, score-0.43]

90 Conclusions We believe that morphable models have a huge potential to advance unconstrained face modeling, however most existing methods heavily depend on priors that are challenging to construct, e. [sent-343, score-0.515]

91 The key idea of this paper is to find a way to leverage photographs (which already exist on the Internet) for construction of a morphable model basis. [sent-346, score-0.347]

92 To this end, we showed that if photos can be divided into “clusters” based on semantic labels (e. [sent-347, score-0.427]

93 , “smiling”, “sad”), we can 1) get dense pixel-wise correspondence between any pair of photos in different clusters that represent facial expressions, e. [sent-349, score-0.857]

94 , smiling photo to sad photo, and 2) use this correspondence to analyze the space of warped images, i. [sent-351, score-0.471]

95 Given a single input image (left), the method can automatically synthesize the same person in different facial expressions using the derived morphable model. [sent-526, score-0.887]

96 This enabled a new single view shape reconstruction and modification method, with exciting results on very challenging photos, e. [sent-622, score-0.411]

97 From few to many: Illumination cone models for face recognition under variable lighting and pose. [sent-785, score-0.343]

98 3d face reconstruction from a single image using a single reference face shape. [sent-801, score-0.472]

99 7 of [20] and show several typical reconstructions on this dataset compared to calibrated photometric stereo (known lighting). [sent-874, score-0.283]

100 The shape does not account for facial expressions, and looks close to the average person model (note the profile views of the reconstruction). [sent-906, score-0.461]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('photos', 0.427), ('morphable', 0.347), ('facial', 0.244), ('lighting', 0.213), ('expressions', 0.175), ('photo', 0.175), ('cluster', 0.167), ('babies', 0.166), ('flow', 0.149), ('basis', 0.132), ('face', 0.13), ('shape', 0.129), ('photometric', 0.119), ('internet', 0.117), ('correspondence', 0.113), ('expression', 0.113), ('reconstruction', 0.11), ('collection', 0.108), ('yaleb', 0.103), ('mouth', 0.086), ('reconstructions', 0.084), ('kemelmacher', 0.08), ('stereo', 0.08), ('scans', 0.078), ('clusters', 0.073), ('basri', 0.072), ('crying', 0.071), ('averages', 0.068), ('shapes', 0.068), ('baby', 0.067), ('faces', 0.066), ('ethnicity', 0.066), ('warped', 0.065), ('smiling', 0.065), ('coefficients', 0.064), ('modification', 0.063), ('blanz', 0.062), ('adults', 0.059), ('gender', 0.056), ('person', 0.056), ('neutral', 0.054), ('aig', 0.054), ('albedo', 0.053), ('sad', 0.053), ('factorization', 0.053), ('view', 0.051), ('cf', 0.049), ('morphing', 0.047), ('iwe', 0.047), ('dvt', 0.047), ('screaming', 0.047), ('vizago', 0.047), ('spans', 0.047), ('aligned', 0.046), ('ambiguity', 0.045), ('eyes', 0.045), ('rank', 0.045), ('lambertian', 0.045), ('laser', 0.045), ('kids', 0.044), ('laughing', 0.044), ('reference', 0.044), ('modify', 0.042), ('frgc', 0.041), ('vetter', 0.041), ('completely', 0.04), ('bregler', 0.039), ('reflectance', 0.039), ('forth', 0.038), ('unconstrained', 0.038), ('scanning', 0.038), ('brightness', 0.037), ('input', 0.036), ('normals', 0.034), ('factorize', 0.034), ('wher', 0.034), ('grand', 0.033), ('span', 0.033), ('roughly', 0.033), ('siggraph', 0.032), ('views', 0.032), ('sharper', 0.032), ('smile', 0.032), ('optical', 0.032), ('per', 0.031), ('recover', 0.031), ('registration', 0.031), ('illumination', 0.03), ('every', 0.03), ('enabled', 0.029), ('depth', 0.029), ('single', 0.029), ('dramatic', 0.028), ('mi', 0.028), ('acquiring', 0.027), ('flows', 0.027), ('deformation', 0.027), ('nose', 0.027), ('jacobs', 0.027), ('young', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999887 219 iccv-2013-Internet Based Morphable Model

Author: Ira Kemelmacher-Shlizerman

Abstract: In this paper we present a new concept of building a morphable model directly from photos on the Internet. Morphable models have shown very impressive results more than a decade ago, and could potentially have a huge impact on all aspects of face modeling and recognition. One of the challenges, however, is to capture and register 3D laser scans of large number of people and facial expressions. Nowadays, there are enormous amounts of face photos on the Internet, large portion of which has semantic labels. We propose a framework to build a morphable model directly from photos, the framework includes dense registration of Internet photos, as well as, new single view shape reconstruction and modification algorithms.

2 0.36489534 444 iccv-2013-Viewing Real-World Faces in 3D

Author: Tal Hassner

Abstract: We present a data-driven method for estimating the 3D shapes of faces viewed in single, unconstrained photos (aka “in-the-wild”). Our method was designed with an emphasis on robustness and efficiency with the explicit goal of deployment in real-world applications which reconstruct and display faces in 3D. Our key observation is that for many practical applications, warping the shape of a reference face to match the appearance of a query, is enough to produce realistic impressions of the query’s 3D shape. Doing so, however, requires matching visual features between the (possibly very different) query and reference images, while ensuring that a plausible face shape is produced. To this end, we describe an optimization process which seeks to maximize the similarity of appearances and depths, jointly, to those of a reference model. We describe our system for monocular face shape reconstruction and present both qualitative and quantitative experiments, comparing our method against alternative systems, and demonstrating its capabilities. Finally, as a testament to its suitability for real-world applications, we offer an open, online implementation of our system, providing unique means of instant 3D viewing of faces appearing in web photos.

3 0.26914534 157 iccv-2013-Fast Face Detector Training Using Tailored Views

Author: Kristina Scherbaum, James Petterson, Rogerio S. Feris, Volker Blanz, Hans-Peter Seidel

Abstract: Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola Jones’ AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels such as color features at training time. In our experiments we trained seven view-dependent face detectors and evaluate these on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset.

4 0.26724982 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera

Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai

Abstract: This paper presents an automatic and robust approach that accurately captures high-quality 3D facial performances using a single RGBD camera. The key of our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. The whole process is fully automatic and robust because it is based on single frame facial registration framework. The system is flexible because it does not require any strong 3D facial priors such as blendshape models. We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods.

5 0.23637721 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination

Author: Yudeog Han, Joon-Young Lee, In So Kweon

Abstract: We present a novel framework to estimate detailed shape of diffuse objects with uniform albedo from a single RGB-D image. To estimate accurate lighting in natural illumination environment, we introduce a general lighting model consisting of two components: global and local models. The global lighting model is estimated from the RGB-D input using the low-dimensional characteristic of a diffuse reflectance model. The local lighting model represents spatially varying illumination and it is estimated by using the smoothly-varying characteristic of illumination. With both the global and local lighting model, we can estimate complex lighting variations in uncontrolled natural illumination conditions accurately. For high quality shape capture, a shape-from-shading approach is applied with the estimated lighting model. Since the entire process is done with a single RGB-D input, our method is capable of capturing the high quality shape details of a dynamic object under natural illumination. Experimental results demonstrate the feasibility and effectiveness of our method that dramatically improves shape details of the rough depth input.

6 0.20527498 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection

7 0.17609163 223 iccv-2013-Joint Noise Level Estimation from Personal Photo Collections

8 0.16763927 147 iccv-2013-Event Recognition in Photo Collections with a Stopwatch HMM

9 0.16538937 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition

10 0.16145846 286 iccv-2013-NYC3DCars: A Dataset of 3D Vehicles in Geographic Context

11 0.1505637 317 iccv-2013-Piecewise Rigid Scene Flow

12 0.13153583 387 iccv-2013-Shape Anchors for Data-Driven Multi-view Reconstruction

13 0.13027522 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model

14 0.12588671 339 iccv-2013-Rank Minimization across Appearance and Shape for AAM Ensemble Fitting

15 0.12558894 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild

16 0.12257602 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification

17 0.12120172 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition

18 0.10973215 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification

19 0.10730992 281 iccv-2013-Multi-view Normal Field Integration for 3D Reconstruction of Mirroring Objects

20 0.10699767 300 iccv-2013-Optical Flow via Locally Adaptive Fusion of Complementary Data Costs


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.235), (1, -0.132), (2, -0.116), (3, -0.05), (4, -0.038), (5, -0.124), (6, 0.294), (7, -0.03), (8, 0.027), (9, 0.02), (10, -0.048), (11, 0.112), (12, 0.139), (13, 0.066), (14, -0.09), (15, -0.092), (16, -0.01), (17, 0.003), (18, 0.0), (19, -0.051), (20, 0.035), (21, 0.005), (22, 0.024), (23, 0.028), (24, 0.01), (25, -0.006), (26, 0.04), (27, -0.108), (28, 0.047), (29, -0.057), (30, 0.106), (31, 0.086), (32, 0.056), (33, -0.026), (34, -0.06), (35, -0.045), (36, 0.154), (37, 0.121), (38, -0.094), (39, -0.076), (40, -0.022), (41, 0.019), (42, 0.113), (43, -0.011), (44, 0.048), (45, -0.073), (46, -0.099), (47, -0.079), (48, -0.064), (49, -0.051)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96894103 219 iccv-2013-Internet Based Morphable Model

Author: Ira Kemelmacher-Shlizerman

Abstract: In this paper we present a new concept of building a morphable model directly from photos on the Internet. Morphable models have shown very impressive results more than a decade ago, and could potentially have a huge impact on all aspects of face modeling and recognition. One of the challenges, however, is to capture and register 3D laser scans of large number of people and facial expressions. Nowadays, there are enormous amounts of face photos on the Internet, large portion of which has semantic labels. We propose a framework to build a morphable model directly from photos, the framework includes dense registration of Internet photos, as well as, new single view shape reconstruction and modification algorithms.

2 0.74736392 444 iccv-2013-Viewing Real-World Faces in 3D

Author: Tal Hassner

Abstract: We present a data-driven method for estimating the 3D shapes of faces viewed in single, unconstrained photos (aka “in-the-wild”). Our method was designed with an emphasis on robustness and efficiency with the explicit goal of deployment in real-world applications which reconstruct and display faces in 3D. Our key observation is that for many practical applications, warping the shape of a reference face to match the appearance of a query, is enough to produce realistic impressions of the query’s 3D shape. Doing so, however, requires matching visual features between the (possibly very different) query and reference images, while ensuring that a plausible face shape is produced. To this end, we describe an optimization process which seeks to maximize the similarity of appearances and depths, jointly, to those of a reference model. We describe our system for monocular face shape reconstruction and present both qualitative and quantitative experiments, comparing our method against alternative systems, and demonstrating its capabilities. Finally, as a testament to its suitability for real-world applications, we offer an open, online implementation of our system, providing unique means of instant 3D viewing of faces appearing in web photos.

3 0.701235 251 iccv-2013-Like Father, Like Son: Facial Expression Dynamics for Kinship Verification

Author: Hamdi Dibeklioglu, Albert Ali Salah, Theo Gevers

Abstract: Kinship verification from facial appearance is a difficult problem. This paper explores the possibility of employing facial expression dynamics in this problem. By using features that describe facial dynamics and spatio-temporal appearance over smile expressions, we show that it is possible to improve the state of the art in this problem, and verify that it is indeed possible to recognize kinship by resemblance of facial expressions. The proposed method is tested on different kin relationships. On the average, 72.89% verification accuracy is achieved on spontaneous smiles.

4 0.69544888 36 iccv-2013-Accurate and Robust 3D Facial Capture Using a Single RGBD Camera

Author: Yen-Lin Chen, Hsiang-Tao Wu, Fuhao Shi, Xin Tong, Jinxiang Chai

Abstract: This paper presents an automatic and robust approach that accurately captures high-quality 3D facial performances using a single RGBD camera. The key of our approach is to combine the power of automatic facial feature detection and image-based 3D nonrigid registration techniques for 3D facial reconstruction. In particular, we develop a robust and accurate image-based nonrigid registration algorithm that incrementally deforms a 3D template mesh model to best match observed depth image data and important facial features detected from single RGBD images. The whole process is fully automatic and robust because it is based on single frame facial registration framework. The system is flexible because it does not require any strong 3D facial priors such as blendshape models. We demonstrate the power of our approach by capturing a wide range of 3D facial expressions using a single RGBD camera and achieve state-of-the-art accuracy by comparing against alternative methods.

5 0.6704427 157 iccv-2013-Fast Face Detector Training Using Tailored Views

Author: Kristina Scherbaum, James Petterson, Rogerio S. Feris, Volker Blanz, Hans-Peter Seidel

Abstract: Face detection is an important task in computer vision and often serves as the first step for a variety of applications. State-of-the-art approaches use efficient learning algorithms and train on large amounts of manually labeled imagery. Acquiring appropriate training images, however, is very time-consuming and does not guarantee that the collected training data is representative in terms of data variability. Moreover, available data sets are often acquired under controlled settings, restricting, for example, scene illumination or 3D head pose to a narrow range. This paper takes a look into the automated generation of adaptive training samples from a 3D morphable face model. Using statistical insights, the tailored training data guarantees full data variability and is enriched by arbitrary facial attributes such as age or body weight. Moreover, it can automatically adapt to environmental constraints, such as illumination or viewing angle of recorded video footage from surveillance cameras. We use the tailored imagery to train a new many-core implementation of Viola Jones' AdaBoost object detection framework. The new implementation is not only faster but also enables the use of multiple feature channels such as color features at training time. In our experiments we trained seven view-dependent face detectors and evaluate these on the Face Detection Data Set and Benchmark (FDDB). Our experiments show that the use of tailored training imagery outperforms state-of-the-art approaches on this challenging dataset.

6 0.63913542 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild

7 0.63820189 272 iccv-2013-Modifying the Memorability of Face Photographs

8 0.61160374 195 iccv-2013-Hidden Factor Analysis for Age Invariant Face Recognition

9 0.60227519 70 iccv-2013-Cascaded Shape Space Pruning for Robust Facial Landmark Detection

10 0.57126993 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination

11 0.56938565 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model

12 0.55462563 223 iccv-2013-Joint Noise Level Estimation from Personal Photo Collections

13 0.54984397 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

14 0.53608727 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition

15 0.53468353 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization

16 0.53227329 355 iccv-2013-Robust Face Landmark Estimation under Occlusion

17 0.51256937 284 iccv-2013-Multiview Photometric Stereo Using Planar Mesh Parameterization

18 0.49248993 407 iccv-2013-Subpixel Scanning Invariant to Indirect Lighting Using Quadratic Code Length

19 0.48800471 154 iccv-2013-Face Recognition via Archetype Hull Ranking

20 0.46216422 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.059), (6, 0.01), (7, 0.012), (12, 0.017), (26, 0.07), (31, 0.039), (41, 0.023), (42, 0.156), (64, 0.029), (73, 0.034), (78, 0.012), (84, 0.157), (89, 0.256), (95, 0.016), (98, 0.019)]
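
The topic vector above is a sparse list of (topicId, topicWeight) pairs. The listing does not say how simValue is derived from such vectors; as a minimal sketch, assuming plain cosine similarity over the sparse weights (an assumption, not the site's documented method), two papers could be compared as follows. The name `other_topics` and its values are hypothetical and exist only for illustration.

```python
import math

# Sparse (topicId, topicWeight) pairs copied from the listing above.
paper_topics = [(2, 0.059), (6, 0.01), (7, 0.012), (12, 0.017), (26, 0.07),
                (31, 0.039), (41, 0.023), (42, 0.156), (64, 0.029),
                (73, 0.034), (78, 0.012), (84, 0.157), (89, 0.256),
                (95, 0.016), (98, 0.019)]

def cosine_similarity(a, b):
    """Cosine similarity of two sparse topic vectors given as (id, weight) pairs.

    Topics missing from a vector are treated as weight 0.
    """
    da, db = dict(a), dict(b)
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical topic vector for some other paper (illustrative values only).
other_topics = [(42, 0.2), (84, 0.1), (89, 0.3)]

print(cosine_similarity(paper_topics, other_topics))
```

Under this measure a paper compared with itself scores 1. The same-paper entry in the list below scores about 0.926, so the values shown there evidently come from a different measure or a different representation than this sketch.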

similar papers list:

simIndex simValue paperId paperTitle

1 0.97587383 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology

Author: Hang Chang, Yin Zhou, Paul Spellman, Bahram Parvin

Abstract: Image-based classification of histology sections, in terms of distinct components (e.g., tumor, stroma, normal), provides a series of indices for tumor composition. Furthermore, aggregation of these indices, from each whole slide image (WSI) in a large cohort, can provide predictive models of the clinical outcome. However, performance of the existing techniques is hindered as a result of large technical variations and biological heterogeneities that are always present in a large cohort. We propose a system that automatically learns a series of basis functions for representing the underlying spatial distribution using stacked predictive sparse decomposition (PSD). The learned representation is then fed into the spatial pyramid matching framework (SPM) with a linear SVM classifier. The system has been evaluated for classification of (a) distinct histological components for two cohorts of tumor types, and (b) colony organization of normal and malignant cell lines in 3D cell culture models. Throughput has been increased through the utility of graphical processing unit (GPU), and evaluation indicates a superior performance results, compared with previous research.

2 0.93636554 381 iccv-2013-Semantically-Based Human Scanpath Estimation with HMMs

Author: Huiying Liu, Dong Xu, Qingming Huang, Wen Li, Min Xu, Stephen Lin

Abstract: We present a method for estimating human scanpaths, which are sequences of gaze shifts that follow visual attention over an image. In this work, scanpaths are modeled based on three principal factors that influence human attention, namely low-level feature saliency, spatial position, and semantic content. Low-level feature saliency is formulated as transition probabilities between different image regions based on feature differences. The effect of spatial position on gaze shifts is modeled as a Levy flight with the shifts following a 2D Cauchy distribution. To account for semantic content, we propose to use a Hidden Markov Model (HMM) with a Bag-of-Visual-Words descriptor of image regions. An HMM is well-suited for this purpose in that 1) the hidden states, obtained by unsupervised learning, can represent latent semantic concepts, 2) the prior distribution of the hidden states describes visual attraction to the semantic concepts, and 3) the transition probabilities represent human gaze shift patterns. The proposed method is applied to task-driven viewing processes. Experiments and analysis performed on human eye gaze data verify the effectiveness of this method.

3 0.9309724 241 iccv-2013-Learning Near-Optimal Cost-Sensitive Decision Policy for Object Detection

Author: Tianfu Wu, Song-Chun Zhu

Abstract: Many object detectors, such as AdaBoost, SVM and deformable part-based models (DPM), compute additive scoring functions at a large number of windows scanned over image pyramid, thus computational efficiency is an important consideration beside accuracy performance. In this paper, we present a framework of learning cost-sensitive decision policy which is a sequence of two-sided thresholds to execute early rejection or early acceptance based on the accumulative scores at each step. A decision policy is said to be optimal if it minimizes an empirical global risk function that sums over the loss of false negatives (FN) and false positives (FP), and the cost of computation. While the risk function is very complex due to high-order connections among the two-sided thresholds, we find its upper bound can be optimized by dynamic programming (DP) efficiently and thus say the learned policy is near-optimal. Given the loss of FN and FP and the cost in three numbers, our method can produce a policy on-the-fly for Adaboost, SVM and DPM. In experiments, we show that our decision policy outperforms state-of-the-art cascade methods significantly in terms of speed with similar accuracy performance.

4 0.92874181 133 iccv-2013-Efficient Hand Pose Estimation from a Single Depth Image

Author: Chi Xu, Li Cheng

Abstract: We tackle the practical problem of hand pose estimation from a single noisy depth image. A dedicated three-step pipeline is proposed: Initial estimation step provides an initial estimation of the hand in-plane orientation and 3D location; Candidate generation step produces a set of 3D pose candidate from the Hough voting space with the help of the rotational invariant depth features; Verification step delivers the final 3D hand pose as the solution to an optimization problem. We analyze the depth noises, and suggest tips to minimize their negative impacts on the overall performance. Our approach is able to work with Kinect-type noisy depth images, and reliably produces pose estimations of general motions efficiently (12 frames per second). Extensive experiments are conducted to qualitatively and quantitatively evaluate the performance with respect to the state-of-the-art methods that have access to additional RGB images. Our approach is shown to deliver on par or even better results.

5 0.92674071 60 iccv-2013-Bayesian Robust Matrix Factorization for Image and Video Processing

Author: Naiyan Wang, Dit-Yan Yeung

Abstract: Matrix factorization is a fundamental problem that is often encountered in many computer vision and machine learning tasks. In recent years, enhancing the robustness of matrix factorization methods has attracted much attention in the research community. To benefit from the strengths of full Bayesian treatment over point estimation, we propose here a full Bayesian approach to robust matrix factorization. For the generative process, the model parameters have conjugate priors and the likelihood (or noise model) takes the form of a Laplace mixture. For Bayesian inference, we devise an efficient sampling algorithm by exploiting a hierarchical view of the Laplace distribution. Besides the basic model, we also propose an extension which assumes that the outliers exhibit spatial or temporal proximity as encountered in many computer vision applications. The proposed methods give competitive experimental results when compared with several state-of-the-art methods on some benchmark image and video processing tasks.

same-paper 6 0.92644191 219 iccv-2013-Internet Based Morphable Model

7 0.91554147 168 iccv-2013-Finding the Best from the Second Bests - Inhibiting Subjective Bias in Evaluation of Visual Tracking Algorithms

8 0.90283895 218 iccv-2013-Interactive Markerless Articulated Hand Motion Tracking Using RGB and Depth Data

9 0.89026058 157 iccv-2013-Fast Face Detector Training Using Tailored Views

10 0.88932145 339 iccv-2013-Rank Minimization across Appearance and Shape for AAM Ensemble Fitting

11 0.88809705 189 iccv-2013-HOGgles: Visualizing Object Detection Features

12 0.88758159 308 iccv-2013-Parsing IKEA Objects: Fine Pose Estimation

13 0.88753986 199 iccv-2013-High Quality Shape from a Single RGB-D Image under Uncalibrated Natural Illumination

14 0.887398 66 iccv-2013-Building Part-Based Object Detectors via 3D Geometry

15 0.88699442 79 iccv-2013-Coherent Object Detection with 3D Geometric Context from a Single Image

16 0.88635242 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB

17 0.88609719 249 iccv-2013-Learning to Share Latent Tasks for Action Recognition

18 0.88590229 115 iccv-2013-Direct Optimization of Frame-to-Frame Rotation

19 0.88517839 62 iccv-2013-Bird Part Localization Using Exemplar-Based Models with Enforced Pose and Subcategory Consistency

20 0.88468772 314 iccv-2013-Perspective Motion Segmentation via Collaborative Clustering