cvpr cvpr2013 cvpr2013-338 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
Abstract: Pose variation remains a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part-based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds a correspondence between a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide whether a pair of faces/face tracks is matched. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state of the art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin.
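The spatial-appearance feature and the spherical Gaussian constraint described in the abstract can be illustrated with a minimal sketch. The function names, the toy descriptor, and the scaling of the patch location to [0, 1] are assumptions made for illustration and are not taken from the paper. The sketch shows why a single shared variance balances the two terms: the log-density separates into an appearance part and a location part, each governed by a plain squared Euclidean distance.

```python
import math


def spatial_appearance(descriptor, x, y, width, height):
    """Append the patch centre, scaled to [0, 1], to an appearance descriptor.

    The [0, 1] scaling is an assumption of this sketch.
    """
    return list(descriptor) + [x / width, y / height]


def spherical_log_density(f, mean, var):
    """Log-density of the isotropic Gaussian N(mean, var * I) evaluated at f."""
    m = len(f)
    sq_dist = sum((a - b) ** 2 for a, b in zip(f, mean))
    return -0.5 * (m * math.log(2.0 * math.pi * var) + sq_dist / var)


# Toy example: a 2-D "descriptor" from a patch centred at (10, 30) in a 100x100 face.
f = spatial_appearance([0.2, 0.4], x=10, y=30, width=100, height=100)
mean, var = [0.0, 0.5, 0.2, 0.2], 0.05  # hypothetical component parameters

# With one shared variance, the log-density is the sum of an appearance term
# and a location term computed with the same variance.
full = spherical_log_density(f, mean, var)
appearance = spherical_log_density(f[:2], mean[:2], var)
location = spherical_log_density(f[2:], mean[2:], var)
print(abs(full - (appearance + location)) < 1e-9)
```

In practice the descriptor would be an LBP or SIFT vector and the component parameters would come from a GMM fitted to the training features; both are replaced by toy values here.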
Reference: text
sentIndex sentText sentNum sentScore
1 Pose variation remains a major challenge for real-world face recognition. [sent-2, score-0.426]
2 We approach this problem through a probabilistic elastic matching method. [sent-3, score-0.317]
3 By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. [sent-7, score-0.646]
4 Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. [sent-8, score-0.389]
5 For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. [sent-10, score-0.727]
6 We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. [sent-11, score-0.975]
7 Our experiments show that our method outperforms the state of the art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin. [sent-12, score-0.674]
8 In recent years, we have witnessed more and more research efforts on face recognition under uncontrolled settings [21, 14, 18, 29, 7]. [sent-15, score-0.456]
9 Face recognition can be categorized into two tasks: face identification and face verification. [sent-16, score-0.882]
10 The former attempts to recognize the identity of a probe face based on a set of gallery face images with known identities. [sent-17, score-0.852]
11 The latter tries to arbitrate if a pair of faces is from the same subject or not. [sent-18, score-0.242]
12 In this paper, we address the problem of pose variant face verification in uncontrolled settings. [sent-19, score-0.741]
13 Among the challenges in face recognition, pose variation is one of the most challenging [30]. [sent-22, score-0.471]
14 Previous work has approached this problem by either exploiting a strong face alignment algorithm [7], or building a robust matching scheme that measures the similarity of faces across different poses [21, 30, 14, 18]. [sent-23, score-0.801]
15 While we have witnessed great progress on face alignment in recent years [4], building a robust face alignment system by itself is a very challenging problem which requires a lot of engineering efforts [4]. [sent-24, score-1.014]
16 As a result, state-of-the-art face alignment systems, even those with published papers, are often not fully accessible to the research community. [sent-25, score-0.525]
17 Although sharing aligned faces in a carefully crafted benchmark face recognition dataset such as the Labeled Face in the Wild (LFW) [16] partly relieves the issue, it immediately becomes a hurdle when one wants to build an end-to-end functioning system for face recognition. [sent-26, score-1.164]
18 Besides, state-of-the-art face alignment results are still far from perfect; aligned faces may still exhibit considerable pose variation. [sent-27, score-0.999]
19 Hence, we take the latter approach by designing robust matching schemes for unaligned or roughly aligned pose variant face verification. [sent-28, score-0.685]
20 We believe it is a more fundamental problem, as it also addresses the residual pose variations left by any state-of-the-art face alignment system. [sent-29, score-0.535]
21 We take a part based representation for a single face image or face tracks. [sent-30, score-0.852]
22 Each face image is densely partitioned into overlapping patches at multiple scales, from each of which a local feature such as Local Binary Pattern (LBP) [1] or SIFT [19] is extracted. [sent-31, score-0.481]
23 We augment each local feature with its location in the face image, and hence a face is represented as a bag of spatial-appearance features. [sent-32, score-1.023]
24 To enable robust matching for pose variant face verification, given a set of training images, we first build a Gaussian mixture model (GMM) on the spatial-appearance features from all the training images. [sent-33, score-0.745]
25 To balance the impact of the appearance and spatial location, we further constrain each mixture component of the UBM to be a spherical Gaussian (Section 4. [sent-35, score-0.324]
26 When matching two face images for face verification, each component of the GMM identifies a pair of appearance features (corresponding to a pair of image patches) from the two face images to be matched (Section 4. [sent-37, score-1.676]
27 We concatenate the absolute difference vector of all these feature pairs from all spherical Gaussian components together to form a long difference vector. [sent-39, score-0.291]
28 An SVM classifier is trained on such difference vectors given a set of training matching/non-matching face/face track pairs, which is subsequently used to verify any new face/face track pairs. [sent-40, score-0.264]
29 One important advantage of this matching framework is that it can be used for both image-to-image and video-to-video face verification without any modification. [sent-41, score-0.728]
30 To make PEM adaptive to each pair of faces, we further propose a joint Bayesian adaptation scheme that adapts the UBM-GMM to better fit the features of the pair of faces/face tracks by Bayesian maximum a posteriori parameter estimation (Section 4. [sent-43, score-0.57]
31 We call this adapted matching algorithm adaptive probabilistic elastic matching (APEM). [sent-45, score-0.48]
32 It consistently improves the face verification accuracy over PEM at the cost of additional computation. [sent-46, score-0.655]
33 Our experiments even show that our PEM and APEM algorithms, when applied to face verification with unaligned faces, i.e., [sent-47, score-0.717]
34 raw face images extracted from the Viola-Jones face detector [24], indeed outperform state-of-the-art algorithms, such as the bio-inspired V1 features with multiple kernel learning applied to faces aligned with the funneling method [15], under the most restricted protocol in LFW. [sent-49, score-1.389]
35 Related Work. Related work includes methods that adopt the UBM-GMM for visual recognition [34, 9, 12, 26], and the current state-of-the-art face verification algorithms on both the LFW [18, 29, 21, 14, 5, 33, 8, 25, 3] and YouTube video face datasets [27]. [sent-53, score-1.18]
36 The Gaussian mixture model has been widely used for various visual recognition tasks including face recognition [12, 26, 34] and scene recognition [9, 34]. [sent-55, score-0.59]
37 While early works [12, 26] focused on modeling the holistic appearance of the face with GMM, more recent works [34, 9] have largely exploited the bag of local feature representation and use GMM to model the local appearances of the images. [sent-56, score-0.576]
38 These latter works also leveraged the UBM-GMM and Bayesian adaptation paradigm to learn adaptive representations, wherein the super-vector representations are adopted for building the final classification model. [sent-57, score-0.231]
39 The most restricted protocol does not allow any additional datasets to be used for face alignment, feature extraction, or building the recognition model. [sent-61, score-0.755]
40 The less restricted protocol allows additional datasets to be used for face alignment and feature extraction, but not for building the recognition model. [sent-62, score-0.819]
41 The current state-of-the-art on the most restricted protocol is the work of the bio-inspired V1-like features presented by Pinto et al. [sent-64, score-0.245]
42 Predominant recent works focused on the less restricted protocol [5, 8, 25] and least restricted protocol [18, 33, 3], which have pushed the recognition accuracy to be as high as 0. [sent-68, score-0.45]
43 We focus our experiments on the most restricted protocol on LFW, as our interest is the design of a robust matching method for pose variant face verification. [sent-74, score-0.695]
44 We also observed consistent improvement when fusing the results from these two types of features together, suggesting that we can further improve face verification accuracy from the proposed method by fusing more types of features, or by feature learning, which we leave as our future work. [sent-77, score-0.745]
45 [21] used the View 1 of LFW for parameter tuning, which may have partly boosted their accuracy on the View 2 of LFW as the face images in View 1 and View 2 are overlapping. [sent-79, score-0.455]
46 [27] published a video face verification benchmark, namely YouTube Faces. [sent-81, score-0.728]
47 To date, the state-of-the-art results are reported by the authors, using a method extended from their previous work [29] on image-based face verification. [sent-82, score-0.426]
48 Our proposed approach can be directly applied to video face verification without any modification, which outperformed their method by a significant margin. [sent-83, score-0.73]
49 Spatial-appearance Feature Extraction. For image-based face verification, we represent each face image as a bag of spatial-appearance features. [sent-85, score-0.909]
50 As shown in Figure 1, for each face image F, we first build a three-layer Gaussian image pyramid. [sent-86, score-0.426]
51 The set of all N patches extracted from face image F is denoted as $P = \{p_i\}_{i=1}^{N}$. [sent-88, score-0.455]
52 As a result, the final feature representation for patch $p_i$ is a spatial-appearance feature $f_{p_i}$ that stacks the appearance descriptor $a_{p_i}$ with the location of the patch. The final representation for face image F is hence an ensemble of these spatial-appearance features. [sent-91, score-0.698]
53 In video based face verification, the task is to verify if two tracks of faces are from the same person or not (assuming each track of faces is the face of a single person). [sent-94, score-1.415]
54 We adopt the same bag of spatial-appearance feature representation for a track of faces by repeating the feature extraction pipeline in Figure 1 on each face image in the track. [sent-95, score-0.872]
55 The features extracted from all the face images from a single track are put together to form a larger set of spatial-appearance features to serve as the final representation of a face track. [sent-96, score-1.066]
56 As a result, we take the same kind of feature representation for both image based and video based face verification. [sent-97, score-0.519]
57 Therefore the probabilistic elastic matching method we will introduce in the next section will apply to both image and video based face verification. [sent-98, score-0.781]
58 Probabilistic Elastic Matching. The exact steps of the proposed probabilistic elastic matching method are illustrated in Figure 2. [sent-101, score-0.317]
59 We start by building a GMM from all the spatial-appearance features extracted from face images in the training set. [sent-102, score-0.524]
60 Given a face/face track pair, both of which are represented as a bag of spatial-appearance features, for each Gaussian component in the UBM-GMM, we look for a pair of features (one from each of the face images/tracks) that induces the highest probability on it. [sent-104, score-0.755]
61 An additional improvement is to conduct a joint Bayesian adaptation step to adapt the UBM-GMM to the union of the spatial-appearance features from both face images/tracks constrained a priori by the parameters of the original UBM-GMM to form a new GMM (A-GMM). [sent-107, score-0.658]
62 We call the approach that uses the UBM-GMM to build the corresponding feature pairs probabilistic elastic matching (PEM), and the approach that uses the A-GMM to build them adaptive probabilistic elastic matching (APEM). [sent-109, score-1.033]
63 , $\omega_K, \mu_K, \sigma_K)$; $K$ is the number of Gaussian mixture components; $I$ is an identity matrix; $\omega_k$ is the mixture weight of the $k$-th Gaussian component; $G(\mu_k, \sigma_k^2 I)$ is a spherical Gaussian with mean $\mu_k$ and variance $\sigma_k^2 I$; and $f$ is an $m$-dimensional spatial-appearance feature vector i. [sent-127, score-0.333]
64 Here we argue and demonstrate that confining each mixture component in GMM to be a spherical Gaussian can handle this issue, as it helps establish a balance between the spatial and appearance constraint. [sent-153, score-0.364]
65 As shown in Equation 10, the spherical Gaussian on the spatial-appearance model can be regarded as the product of two equal-variance Gaussian distributions over two Euclidean distances produced by the appearance and location, respectively. [sent-163, score-0.239]
66 Invariant Matching. After we obtain the K-component UBM-GMM trained over a set of m-dimensional spatial-appearance features, we exploit it to form an elastic matching scheme in the form of a D = m K dimensional long difference vector for a pair of face images/tracks. [sent-179, score-0.42]
67 ) |T, which serves as the final matching feature vector of a pair of faces/face tracks for face verification. [sent-200, score-0.659]
68 We call the matching algorithm presented in this section to be probabilistic elastic matching (PEM). [sent-220, score-0.39]
69 To make the matching process adaptive for each face/face track pair, we propose a joint Bayesian adaptation on the union of the bags of spatial-appearance features from the face/face track pair. [sent-224, score-0.681]
70 In the joint adaptation process, the parameters of the UBM-GMM build the prior distribution for the parameters of the jointly adapted GMM under a Bayesian maximum a posteriori (MAP) framework. [sent-225, score-0.336]
71 Given a face/face track pair Q and S, the adaptive GMM [sent-230, score-0.224]
72 In both figures, the row above shows local patches from face A shown in Figure 2, while the bottom ones are from face B. [sent-252, score-0.852]
73 After we obtain the adapted GMM given a pair of faces/face tracks, we conduct APEM to build the difference vector for invariant matching. [sent-268, score-0.26]
74 To post-fuse multiple features, we repeat the proposed pipeline over all face/face track pairs using D types of different local features to obtain D confidence scores for each face/face track pair pi as a score vector si = [si1 si2 . [sent-273, score-0.374]
75 Labeled Faces in the Wild. The Labeled Faces in the Wild (LFW) [16] dataset is designed to address the unconstrained face verification problem. [sent-287, score-0.7]
76 By design, the image-restricted paradigm does not allow experimenters to use the name of a person to infer whether two face images are matched or non-matched, while in the image-unrestricted paradigm experimenters may form as many matched or non-matched face pairs as desired for training. [sent-290, score-1.048]
77 In our experiments, we followed the most restricted protocol, in which detected faces are aligned with the funneling method [15]. [sent-292, score-0.363]
78 1 Baseline Algorithm. To better investigate our PEM/APEM approach to pose variant face verification, we introduce a baseline algorithm that shows how well a trivial location-based feature pair matching scheme performs. [sent-295, score-0.758]
79 The baseline algorithm provides a basis of comparison to evaluate the effectiveness of building feature pair correspondences bridged by UBM-GMM or adapted GMM. [sent-296, score-0.313]
80 78 training difference vectors to predict if a testing face/face track pair is matched. [sent-368, score-0.227]
81 For APEM, given a pair of face images, all features in the joint feature set are utilized for joint adaptation. [sent-377, score-0.718]
82 We demonstrated the effectiveness of the invariant matching by comparing with the baseline, and we also observed that joint Bayesian adaptation and multi-feature fusion bring consistent improvements. [sent-383, score-0.427]
83 Furthermore, our approach on unaligned faces [16], which are the outputs of the Viola-Jones face detector, even outperformed state-of-the-art methods with faces aligned by the funneling method. [sent-384, score-0.952]
84 YouTube Faces Dataset. This work is a general framework which can handle both image and video based face verification without modification. [sent-387, score-0.693]
85 Performance comparison over YouTube Faces (YTFaces) for studying the problem of unconstrained face recognition in videos. [sent-393, score-0.501]
86 On average, a face track from a video clip consists of 181. [sent-395, score-0.579]
87 Protocols are similar to those of LFW; for the same purpose, we focus on the restricted video face verification paradigm. [sent-398, score-0.793]
88 1 Settings. In the video face experiments, each image frame is center-cropped to 100x100 before feature extraction. [sent-402, score-0.257]
89 On average, more than 40000 features are extracted from one face track. [sent-406, score-0.49]
90 In the stage of joint Bayesian adaptation, to ease the computational intensity, 1000 out of 40000 features are sampled randomly from each face track to be combined into the joint feature set. [sent-407, score-0.755]
91 Conclusion. In this paper, we proposed a probabilistic elastic matching algorithm with an additional joint Bayesian adaptation component as a general framework for both image and video based face verification. [sent-442, score-0.949]
92 Extensive experiments were performed in which PEM/APEM showed superior performance over state-of-the-art methods on two standard face verification benchmarks, the most restricted LFW and the restricted YouTube Faces dataset. [sent-443, score-0.855]
93 Beyond simple features: A large-scale feature search approach to unconstrained face recognition. [sent-498, score-0.526]
94 From few to many: Illumination cone models for face recognition under variable lighting and pose. [sent-520, score-0.456]
95 Growing gaussian mixture models for pose invariant face recognition. [sent-526, score-0.671]
96 A robust elastic and partial matching metric for face recognition. [sent-536, score-0.694]
97 Describable visual attributes for face verification and image search. [sent-564, score-0.655]
98 How far can you get with a modern face recognition test set using only simple features? [sent-582, score-0.456]
99 Effective unconstrained face recognition by combining multiple descriptors and learned background statistics. [sent-629, score-0.501]
100 Implicit elastic matching with randomized projections for pose-variant face recognition. [sent-634, score-0.694]
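The matching step summarized in the sentences above (one feature pair per Gaussian component, absolute differences concatenated into a long vector) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function names and toy data are hypothetical, "highest probability" is assumed to mean highest density under the component, and the difference is taken over the full m-dimensional feature so that the result has m K entries. For a fixed spherical component, the feature with the highest density is simply the one closest to the component mean, so no variance or weight is needed for the selection.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_feature(features, mean):
    """Feature with the highest density under a spherical Gaussian at `mean`.

    For an isotropic component this is the feature nearest to the mean.
    """
    return min(features, key=lambda f: squared_distance(f, mean))


def pem_difference_vector(features_a, features_b, means):
    """Concatenate |f_a - f_b| over the feature pair picked by each component."""
    diff = []
    for mean in means:
        fa = best_feature(features_a, mean)
        fb = best_feature(features_b, mean)
        diff.extend(abs(a - b) for a, b in zip(fa, fb))
    return diff


# Toy data: K = 2 component means and two faces with 2-D features (m = 2).
means = [[0.0, 0.0], [1.0, 1.0]]
face_a = [[0.1, 0.0], [0.9, 1.0]]
face_b = [[0.0, 0.2], [1.0, 0.7]]

vector = pem_difference_vector(face_a, face_b, means)
print(len(vector))  # m * K entries
```

Vectors of this kind, computed for labeled matching and non-matching pairs, are what the SVM described above would be trained on. Because a face track is just a larger bag of features, the same function applies to tracks unchanged.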
wordName wordTfidf (topN-words)
[('face', 0.426), ('apem', 0.319), ('pem', 0.283), ('gmm', 0.242), ('verification', 0.229), ('lfw', 0.217), ('elastic', 0.195), ('faces', 0.164), ('lbp', 0.149), ('ubm', 0.142), ('adaptation', 0.135), ('spherical', 0.13), ('youtube', 0.116), ('track', 0.115), ('protocol', 0.11), ('bayesian', 0.107), ('restricted', 0.1), ('agk', 0.091), ('spatialappearance', 0.091), ('gaussian', 0.088), ('fusion', 0.084), ('tracks', 0.082), ('pair', 0.078), ('mixture', 0.074), ('matching', 0.073), ('fi', 0.071), ('alignment', 0.064), ('unaligned', 0.062), ('joint', 0.062), ('wild', 0.061), ('funneling', 0.061), ('adapted', 0.059), ('wolf', 0.058), ('bag', 0.057), ('feature', 0.055), ('pk', 0.054), ('matched', 0.052), ('build', 0.051), ('bk', 0.05), ('ek', 0.049), ('probabilistic', 0.049), ('algorithmaccuracy', 0.046), ('bridged', 0.046), ('experimenters', 0.046), ('fgk', 0.046), ('pose', 0.045), ('unconstrained', 0.045), ('speech', 0.045), ('component', 0.044), ('kf', 0.043), ('correspondences', 0.041), ('variant', 0.041), ('sift', 0.041), ('hassner', 0.041), ('adobe', 0.041), ('confining', 0.04), ('ftf', 0.04), ('fpi', 0.04), ('stevens', 0.04), ('scheme', 0.04), ('appearance', 0.038), ('components', 0.038), ('aligned', 0.038), ('balance', 0.038), ('invariant', 0.038), ('video', 0.038), ('outperformed', 0.037), ('date', 0.036), ('published', 0.035), ('confined', 0.035), ('inhibition', 0.035), ('pinto', 0.035), ('lnp', 0.035), ('features', 0.035), ('difference', 0.034), ('building', 0.034), ('mbgs', 0.034), ('universal', 0.034), ('variance', 0.033), ('cao', 0.032), ('adopted', 0.031), ('lateral', 0.031), ('sid', 0.031), ('nowak', 0.031), ('adaptive', 0.031), ('em', 0.031), ('svm', 0.031), ('pi', 0.031), ('kg', 0.03), ('location', 0.03), ('recognition', 0.03), ('augment', 0.029), ('gaussians', 0.029), ('posteriori', 0.029), ('extracted', 0.029), ('gk', 0.029), ('partly', 0.029), ('nk', 0.029), ('ka', 0.029), ('protocols', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999905 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification
2 0.31419411 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild
Author: Zhen Cui, Wen Li, Dong Xu, Shiguang Shan, Xilin Chen
Abstract: In many real-world face recognition scenarios, face images can hardly be aligned accurately due to complex appearance variations or low-quality images. To address this issue, we propose a new approach to extract robust face region descriptors. Specifically, we divide each image (resp. video) into several spatial blocks (resp. spatial-temporal volumes) and then represent each block (resp. volume) by sum-pooling the nonnegative sparse codes of position-free patches sampled within the block (resp. volume). Whitened Principal Component Analysis (WPCA) is further utilized to reduce the feature dimension, which leads to our Spatial Face Region Descriptor (SFRD) (resp. Spatial-Temporal Face Region Descriptor, STFRD) for images (resp. videos). Moreover, we develop a new distance method for face verification metric learning called Pairwise-constrained Multiple Metric Learning (PMML) to effectively integrate the face region descriptors of all blocks (resp. volumes) from an image (resp. a video). Our work achieves state-of-the-art performance on two real-world datasets, LFW and YouTube Faces (YTF), according to the restricted protocol.
Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah
Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm, Mean Sequence SRC (MSSRC), that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state of the art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.
4 0.303931 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
5 0.27853984 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification
Author: Dong Chen, Xudong Cao, Fang Wen, Jian Sun
Abstract: Making a high-dimensional (e.g., 100K-dim) feature for face recognition does not seem like a good idea because it brings difficulties for subsequent training, computation, and storage. This prevents further exploration of the use of a high-dimensional feature. In this paper, we study the performance of a high-dimensional feature. We first empirically show that high dimensionality is critical to high performance. A 100K-dim feature, based on a single-type Local Binary Pattern (LBP) descriptor, can achieve significant improvements over both its low-dimensional version and the state of the art. We also make the high-dimensional feature practical. With our proposed sparse projection method, named rotated sparse regression, both computation and model storage can be reduced by over 100 times without sacrificing accuracy.
6 0.27381861 438 cvpr-2013-Towards Pose Robust Face Recognition
7 0.2413931 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
8 0.18870531 152 cvpr-2013-Exemplar-Based Face Parsing
9 0.18523884 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification
10 0.183612 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition
11 0.17318188 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
12 0.15189129 389 cvpr-2013-Semi-supervised Learning with Constraints for Person Identification in Multimedia Data
14 0.13314597 161 cvpr-2013-Facial Feature Tracking Under Varying Facial Expressions and Face Poses Based on Restricted Boltzmann Machines
15 0.12680978 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
16 0.11607129 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
17 0.109005 50 cvpr-2013-Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling
18 0.10608115 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints
19 0.10397553 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
topicId topicWeight
[(0, 0.215), (1, -0.093), (2, -0.098), (3, 0.015), (4, 0.053), (5, -0.019), (6, -0.044), (7, -0.154), (8, 0.319), (9, -0.231), (10, 0.116), (11, -0.05), (12, 0.129), (13, 0.105), (14, -0.02), (15, -0.025), (16, 0.025), (17, -0.065), (18, -0.024), (19, 0.012), (20, -0.082), (21, 0.013), (22, -0.023), (23, 0.033), (24, 0.028), (25, -0.01), (26, -0.053), (27, 0.091), (28, -0.05), (29, -0.077), (30, -0.02), (31, 0.06), (32, 0.121), (33, 0.052), (34, 0.08), (35, 0.048), (36, 0.003), (37, 0.068), (38, 0.048), (39, -0.031), (40, 0.022), (41, 0.006), (42, 0.026), (43, -0.15), (44, -0.065), (45, 0.037), (46, 0.047), (47, 0.059), (48, -0.04), (49, 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 0.98137099 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification
2 0.93108433 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification
Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah
Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.
3 0.92192167 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild
Author: Zhen Cui, Wen Li, Dong Xu, Shiguang Shan, Xilin Chen
Abstract: In many real-world face recognition scenarios, face images can hardly be aligned accurately due to complex appearance variations or low-quality images. To address this issue, we propose a new approach to extract robust face region descriptors. Specifically, we divide each image (resp. video) into several spatial blocks (resp. spatial-temporal volumes) and then represent each block (resp. volume) by sum-pooling the nonnegative sparse codes of position-free patches sampled within the block (resp. volume). Whitened Principal Component Analysis (WPCA) is further utilized to reduce the feature dimension, which leads to our Spatial Face Region Descriptor (SFRD) (resp. Spatial-Temporal Face Region Descriptor, STFRD) for images (resp. videos). Moreover, we develop a new distance metric learning method for face verification called Pairwise-constrained Multiple Metric Learning (PMML) to effectively integrate the face region descriptors of all blocks (resp. volumes) from an image (resp. a video). Our work achieves state-of-the-art performance on two real-world datasets, LFW and YouTube Faces (YTF), according to the restricted protocol.
4 0.86882514 438 cvpr-2013-Towards Pose Robust Face Recognition
Author: Dong Yi, Zhen Lei, Stan Z. Li
Abstract: Most existing pose robust methods are too computationally complex for practical applications, and their performance under unconstrained environments is rarely evaluated. In this paper, we propose a novel method for pose robust face recognition towards practical applications, which is fast, pose robust and can work well under unconstrained environments. Firstly, a 3D deformable model is built and a fast 3D model fitting algorithm is proposed to estimate the pose of the face image. Secondly, a group of Gabor filters is transformed according to the pose and shape of the face image for feature extraction. Finally, PCA is applied to the pose adaptive Gabor features to remove redundancies, and the cosine metric is used to evaluate the similarity. The proposed method has three advantages: (1) The pose correction is applied in the filter space rather than image space, which makes our method less affected by the precision of the 3D model; (2) By combining the holistic pose transformation and local Gabor filtering, the final feature is robust to pose and other negative factors in face recognition; (3) The 3D structure and facial symmetry are successfully used to deal with self-occlusion. Extensive experiments on FERET and PIE show the proposed method outperforms state-of-the-art methods significantly; meanwhile, the method works well on LFW.
5 0.85430169 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
6 0.78721529 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
7 0.78456885 389 cvpr-2013-Semi-supervised Learning with Constraints for Person Identification in Multimedia Data
8 0.76912308 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition
9 0.73251361 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
10 0.72226423 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification
11 0.71653801 152 cvpr-2013-Exemplar-Based Face Parsing
12 0.648166 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
13 0.63850349 463 cvpr-2013-What's in a Name? First Names as Facial Attributes
14 0.63642055 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification
15 0.62467861 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
16 0.62436366 420 cvpr-2013-Supervised Descent Method and Its Applications to Face Alignment
18 0.59274638 415 cvpr-2013-Structured Face Hallucination
19 0.56210619 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
20 0.53485006 261 cvpr-2013-Learning by Associating Ambiguously Labeled Images
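The matching step described in the abstract of paper 338 above can be illustrated with a minimal sketch. It assumes the GMM component means are already trained and passed in as plain lists. It also assumes an absolute difference as the "difference vector", which the abstract does not specify. All function names here are hypothetical; GMM training, the joint Bayesian adaptation, and the SVM are not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def augment(appearance: Vector, location: Vector) -> List[float]:
    """Append the patch location to the appearance descriptor."""
    return list(appearance) + list(location)


def pick_feature(features: List[Vector], mean: Vector) -> Vector:
    # For a spherical Gaussian, the feature with the highest likelihood
    # under the component is the one closest to the component mean.
    return min(features, key=lambda f: sq_dist(f, mean))


def pair_descriptor(face_a: List[Vector], face_b: List[Vector],
                    means: List[Vector]) -> List[float]:
    """Concatenate per-component difference vectors for a pair of faces.

    Assumption: element-wise absolute difference is used so the result
    does not depend on the order of the two faces.
    """
    descriptor: List[float] = []
    for mean in means:
        fa = pick_feature(face_a, mean)
        fb = pick_feature(face_b, mean)
        descriptor.extend(abs(x - y) for x, y in zip(fa, fb))
    return descriptor
```

The resulting vector has length (number of components) x (augmented feature dimension) and would be the input to a binary classifier such as an SVM. Because each component selects its own best-matching feature in each face, the correspondence can shift spatially between the two faces, which is the "elastic" part of the matching.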
topicId topicWeight
[(10, 0.092), (16, 0.044), (19, 0.032), (26, 0.045), (29, 0.18), (33, 0.273), (67, 0.141), (69, 0.033), (87, 0.069)]
simIndex simValue paperId paperTitle
1 0.92300135 140 cvpr-2013-Efficient Color Boundary Detection with Color-Opponent Mechanisms
Author: Kaifu Yang, Shaobing Gao, Chaoyi Li, Yongjie Li
Abstract: Color information plays an important role in better understanding of natural scenes by at least facilitating discriminating boundaries of objects or areas. In this study, we propose a new framework for boundary detection in complex natural scenes based on the color-opponent mechanisms of the visual system. The red-green and blue-yellow color opponent channels in the human visual system are regarded as the building blocks for various color perception tasks such as boundary detection. The proposed framework is a feedforward hierarchical model, which has a direct counterpart in the color-opponent mechanisms involved from the retina to the primary visual cortex (V1). Results show that our simple framework has excellent ability to flexibly capture both the structured chromatic and achromatic boundaries in complex scenes.
2 0.91916972 391 cvpr-2013-Sensing and Recognizing Surface Textures Using a GelSight Sensor
Author: Rui Li, Edward H. Adelson
Abstract: Sensing surface textures by touch is a valuable capability for robots. Until recently it was difficult to build a compliant sensor with high sensitivity and high resolution. The GelSight sensor is compliant and offers sensitivity and resolution exceeding that of the human fingertips. This opens the possibility of measuring and recognizing highly detailed surface textures. The GelSight sensor, when pressed against a surface, delivers a height map. This can be treated as an image, and processed using the tools of visual texture analysis. We have devised a simple yet effective texture recognition system based on local binary patterns, and enhanced it by the use of a multi-scale pyramid and a Hellinger distance metric. We built a database with 40 classes of tactile textures using materials such as fabric, wood, and sandpaper. Our system can correctly categorize materials from this database with high accuracy. This suggests that the GelSight sensor can be useful for material recognition by robots.
3 0.8944847 418 cvpr-2013-Submodular Salient Region Detection
Author: Zhuolin Jiang, Larry S. Davis
Abstract: The problem of salient region detection is formulated as the well-studied facility location problem from operations research. High-level priors are combined with low-level features to detect salient regions. Salient region detection is achieved by maximizing a submodular objective function, which maximizes the total similarities (i.e., total profits) between the hypothesized salient region centers (i.e., facility locations) and their region elements (i.e., clients), and penalizes the number of potential salient regions (i.e., the number of open facilities). The similarities are efficiently computed by finding a closed-form harmonic solution on the constructed graph for an input image. The saliency of a selected region is modeled in terms of appearance and spatial location. By exploiting the submodularity properties of the objective function, a highly efficient greedy-based optimization algorithm can be employed. This algorithm is guaranteed to be at least a (e − 1)/e ≈ 0.632-approximation to the optimum. Experimental results demonstrate that our approach outperforms several recently proposed saliency detection approaches.
same-paper 4 0.89146996 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification
Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
Abstract: Pose variation remains a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part-based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds correspondence of a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state-of-the-art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin.
5 0.8898018 35 cvpr-2013-Adaptive Compressed Tomography Sensing
Author: Oren Barkan, Jonathan Weill, Amir Averbuch, Shai Dekel
Abstract: One of the main challenges in Computed Tomography (CT) is how to balance between the amount of radiation the patient is exposed to during scan time and the quality of the CT image. We propose a mathematical model for adaptive CT acquisition whose goal is to reduce dosage levels while maintaining high image quality at the same time. The adaptive algorithm iterates between selective limited acquisition and improved reconstruction, with the goal of applying only the dose level required for sufficient image quality. The theoretical foundation of the algorithm is nonlinear Ridgelet approximation and a discrete form of Ridgelet analysis is used to compute the selective acquisition steps that best capture the image edges. We show experimental results where for the same number of line projections, the adaptive model produces higher image quality, when compared with standard limited angle, non-adaptive acquisition algorithms.
6 0.86844009 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation
7 0.86642545 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
8 0.86563116 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
9 0.86536604 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation
10 0.86505353 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues
11 0.86410129 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers
13 0.86134589 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search
14 0.85924381 375 cvpr-2013-Saliency Detection via Graph-Based Manifold Ranking
15 0.8571322 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence
16 0.85664469 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
17 0.85620868 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors
18 0.85480249 215 cvpr-2013-Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces
19 0.8541078 246 cvpr-2013-Learning Binary Codes for High-Dimensional Data Using Bilinear Projections
20 0.85317463 438 cvpr-2013-Towards Pose Robust Face Recognition