cvpr cvpr2013 cvpr2013-119 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods[24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
Reference: text
sentIndex sentText sentNum sentScore
1 Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods[24] due to the large variation in facial appearances, as well as occlusion and clutter. [sent-3, score-1.058]
2 In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. [sent-4, score-0.689]
3 A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. [sent-5, score-0.883]
4 As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. [sent-7, score-0.714]
5 Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. [sent-9, score-0.655]
6 We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. [sent-10, score-1.318]
7 Introduction Although boosting-based object detection methods[24] and their variations[28] have achieved great success in frontal-view face detection, so-called face detection in the wild (i. [sent-13, score-1.277]
8 The exemplar-based approach is an intuitive and straightforward alternative, in which a test sample can be directly matched against a collection of face images to determine its label. [sent-22, score-0.597]
9 Without explicit modeling, a face can be detected as long as enough similar exemplars are included in the collection. [sent-23, score-0.709]
10 However, there are two challenges confronting this approach: (1) To achieve good performance, lots of exemplar faces are needed to span the large appearance variation. [sent-24, score-0.631]
11 Our new face detector is essentially an image retrieval system that uses a database of face images annotated with bounding rectangles and landmark locations. [sent-32, score-1.633]
12 The face regions in the test image, even with challenging poses or expressions, shall receive high prediction scores from similar exemplar faces. [sent-35, score-0.963]
13 In addition to the voting-based face detection, we also propose a new face validation step to further boost the detection performance by reducing false positives. [sent-40, score-1.314]
14 Each candidate face rectangle is used to perform search and localization against a face database. [sent-41, score-1.254]
15 True face samples shall retrieve similar faces and accurately localize those faces, while false positives tend to retrieve and localize on non-face image regions, and are consequently removed. [sent-42, score-1.204]
16 We evaluate our method on two public face detection datasets and show that our approach outperforms state-of-the-art methods. [sent-43, score-0.655]
17 Although we mainly focus on face detection in this paper, since we retrieved similar faces to the test image during validation, robust face alignment can also be achieved as a by-product by transferring landmark locations from the exemplar face images, which is an additional benefit of our method. [sent-44, score-2.743]
18 We propose a novel exemplar-based face detection approach by combining image retrieval with discriminative learning, and designing a voting-based method to efficiently detect faces without exhaustive scanning. [sent-48, score-1.091]
19 We introduce an efficient image retrieval-based framework to simultaneously perform face validation and facial landmark localization. [sent-50, score-1.01]
20 We achieve state-of-the-art performance on challenging face detection benchmarks. [sent-52, score-0.615]
21 Sliding window scanning is then performed for face detection. [sent-58, score-0.606]
22 In [27], face images with the same identities were retrieved. [sent-66, score-0.506]
23 However, in all of those methods, the query image is given, and the task is to find the identical object or visually similar objects from the database, which is a different task from face detection, as the category of face has much larger appearance variations than a single object. [sent-68, score-1.072]
24 To the best of our knowledge, there is no previous work on face detection leveraging large-scale image retrieval. [sent-71, score-0.615]
25 Exemplar Database To detect faces using image retrieval, we build a database with 18486 exemplar face images under different viewpoints, poses, expressions and lighting conditions. [sent-75, score-1.303]
26 The face region in the image is around the image center and manually marked with four main facial landmark locations: the center of two eyes, mouth center and nose tip. [sent-76, score-1.017]
27 A rectangle bounding the face is then generated according to the landmark positions (footnote 1). [sent-77, score-0.905]
28 Algorithm In order to detect faces in a test image by searching the database images, we need to define a similarity measure between any detection window (represented by a [Footnote 1: For profile faces, if one annotation would be absent.] [sent-84, score-0.633]
29 (a) test image, (c) The face rectangle in an exemplar image, (b) generated voting map when using (c) to vote on (a). [sent-87, score-1.434]
30 sub-rectangle) (footnote 2) in the test image and the face rectangle in a database image. [sent-88, score-0.82]
31 T is the spatial transformation that maps rectangle 푥 in the test image to 푐푖 in the exemplar image. [sent-96, score-0.598]
32 Consider that if a feature 푔 inside the face rectangle of an exemplar image is matched with a feature 푓 in a possible positive detection window of a test image, the relative locations of 푔 and 푓 to their respective rectangle centers should be consistent under a certain scale change. [sent-103, score-1.417]
33 [...] to the face center in the exemplar image, and use that to predict the location of the face center in the test image accordingly. [sent-105, score-1.524]
34 To achieve better detection performance, we differ from [20] in that each vote [Footnote 2: In this paper, we fix the aspect ratio of a detection window to 1.] [sent-108, score-0.455]
35 only shows voting maps at a certain scale, while in practice we generate voting maps at multiple scales. [sent-110, score-0.612]
36 is further weighted by the distance from the feature 푔 to the face center in the exemplar image. [sent-111, score-0.875]
37 Features closer to the face center will cast votes with higher weights, as they contain more feature information on the faces. [sent-112, score-0.6]
38 2, for example, if we use all the features in the exemplar face (Fig. [sent-114, score-0.844]
39 Therefore the similarities between any sub-rectangle of the test image and the exemplars can be obtained from the voting maps, without resorting to sliding window search. [sent-120, score-0.678]
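The voting idea in the preceding sentences can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cast_votes`, the Gaussian weighting, and the `sigma` value are assumptions chosen only to show that each matched feature predicts a face center and that features closer to the exemplar face center vote with higher weight.

```python
import math
from collections import defaultdict

def cast_votes(matches, scale=1.0, sigma=40.0):
    """Accumulate face-center votes from matched feature pairs.

    matches: list of ((fx, fy), (dx, dy)) where (fx, fy) is the location of a
    test-image feature f and (dx, dy) is the offset from the matched exemplar
    feature g to the exemplar face center.
    """
    votes = defaultdict(float)
    for (fx, fy), (dx, dy) in matches:
        # Predicted face center in the test image at the given scale.
        cx = round(fx + scale * dx)
        cy = round(fy + scale * dy)
        # Assumed weighting: a Gaussian falloff in the distance from g to the
        # exemplar face center (the source only says closer features weigh more).
        weight = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
        votes[(cx, cy)] += weight
    return dict(votes)

# Two matches agree on the same center; a third votes elsewhere.
matches = [((100, 100), (10, 0)), ((120, 110), (-10, -10)), ((300, 50), (5, 5))]
vote_map = cast_votes(matches)
peak = max(vote_map, key=vote_map.get)
```

In this toy input the two consistent matches reinforce one location, so the peak of the map lands there without scanning any windows.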
40 , SIFT[17]) are quantized for fast retrieval, the similarities between a face exemplar and a non-face test sample can be as high as face-to-face similarities, and the voting maps may be noisy. [sent-123, score-1.276]
41 Therefore, only obtaining and simply aggregating the similarities between test samples and the face exemplars is not sufficient to robustly detect the faces. [sent-124, score-0.828]
42 To this end, we combine image retrieval and discriminative learning, and propose the pipeline of our face detection algorithm as illustrated in Fig. [sent-127, score-0.755]
43 Given a test image, we first use all the exemplar faces to vote on the test image and generate corresponding voting maps at multiple scales. [sent-129, score-1.17]
44 The threshold 푡푖 corresponding to each exemplar face is discriminatively learned in the training stage, as explained in Section 3. [sent-133, score-0.879]
45 We then aggregate the gated voting maps together to get the final score map. [sent-136, score-0.403]
46 This operation can be interpreted mathematically in the following equation: $S(x) = \sum_{i:\, s_i(x) > t_i} \left( s_i(x) - t_i \right)$ (2), where 푆(푥) is the final detection score of 푥, 푠푖(푥) is the similarity score between 푥 and database exemplar 푐푖, and 푡푖 is the corresponding threshold. [sent-137, score-0.699]
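As a small worked illustration of Eqn. (2), the gated aggregation can be written in a few lines. The function name `gated_score` and the toy numbers are hypothetical; only the formula itself comes from the text.

```python
def gated_score(similarities, thresholds):
    """S(x) = sum over exemplars i with s_i(x) > t_i of (s_i(x) - t_i)."""
    return sum(s - t for s, t in zip(similarities, thresholds) if s > t)

# Three exemplars: only the first exceeds its threshold and contributes.
s = [0.9, 0.2, 0.6]
t = [0.5, 0.4, 0.6]
score = gated_score(s, t)
```

Exemplars whose similarity does not exceed their threshold contribute nothing, which is how the gate limits the influence of irrelevant exemplars.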
47 Based on the aggregated voting maps, we then select the maximal modes from the maps with non-maxima suppression to get the final detection results, as shown in the last column in Fig. [sent-142, score-0.45]
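One simple way to pick modes with non-maxima suppression is a greedy pass over the score map. The sketch below is an assumption about how such a step could look; the sparse dictionary representation, the `min_score` cutoff, and the circular suppression `radius` are illustrative choices that the source does not specify.

```python
def select_modes(score_map, min_score, radius):
    """Greedy non-maximum suppression over a sparse map {(x, y): score}."""
    candidates = sorted(
        ((s, xy) for xy, s in score_map.items() if s >= min_score),
        reverse=True,
    )
    kept = []
    for s, (x, y) in candidates:
        # Keep a candidate only if it is farther than `radius` from every
        # stronger mode that has already been kept.
        if all((x - kx) ** 2 + (y - ky) ** 2 > radius ** 2 for _, (kx, ky) in kept):
            kept.append((s, (x, y)))
    return kept

score_map = {(110, 100): 1.9, (112, 101): 1.2, (305, 55): 0.98, (10, 10): 0.1}
modes = select_modes(score_map, min_score=0.5, radius=20)
```

Here the weaker response next to the strongest peak is suppressed, and the low-scoring location is discarded by the cutoff.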
48 The reason we use gating before aggregation is to limit the contributions of irrelevant exemplars to a given test image, or more accurately, to a given sub-rectangle of a test image. [sent-144, score-0.418]
49 The appearance variation of face images can be very large, and we expect that only the exemplars which are very similar to the test region are informative for classification, while the more distant exemplars are uninformative. [sent-145, score-0.927]
50 Therefore our assumption is that, if 푥 is sufficiently similar to 푐푖, 푥 should be voted as a face with very high probability, while if 푥 is far away from 푐푖, 푐푖 cannot determine the label of 푥 with any preference. [sent-146, score-0.506]
51 Suppose that we consider the gated voting of a particular exemplar (in Section 3. [sent-153, score-0.638]
52 , face images) 푐푖, and a test sample 푥, let 푠푖 (푥) be the similarity between 푐푖 and 푥. [sent-158, score-0.641]
53 In the test stage, suppose there are 푚 total exemplar faces, and for simplicity we use 푠푖 to denote the similarity 푠푖 (푥), the likelihood ratio can be defined as: 퐿(푠1,. [sent-170, score-0.522]
54 Valid faces tend to retrieve similar faces and accurately localize on these faces, while invalid detections produce inconsistent search and localization results. [sent-204, score-0.825]
55 Thus we choose the final threshold as: $t_i = \max_{j \in \mathcal{N}} s_i(x_j)$ (12) This means the threshold is the maximum similarity score between exemplar 푐푖 and any negative training samples. [sent-216, score-0.503]
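A minimal sketch of Eqn. (12), assuming the similarities between each exemplar and the negative training samples are already available as a nested list. The name `learn_thresholds` and the numbers are hypothetical.

```python
def learn_thresholds(neg_sims):
    """neg_sims[i][j] holds s_i(x_j), the similarity between exemplar i and
    negative training sample j. Returns t_i = max_j s_i(x_j) per exemplar."""
    return [max(row) for row in neg_sims]

neg_sims = [[0.10, 0.35, 0.20], [0.50, 0.05, 0.45]]
thresholds = learn_thresholds(neg_sims)

# With t_i chosen this way, the gate s_i(x) > t_i never opens for any
# negative sample seen during training.
fired = [s > t for row, t in zip(neg_sims, thresholds) for s in row]
```

The design consequence is that an exemplar only votes for a test window that looks more similar to it than every training negative did.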
56 Face Validation After the face detection step, several candidate face rectangles are obtained. [sent-218, score-1.208]
57 Therefore we propose a face validation step using image retrieval again to identify and filter out these false positives and further improve the detection accuracy. [sent-220, score-0.957]
58 We use each detected face window to perform search and localization on a validation face database using the same similarity measure as in Eqn. [sent-221, score-1.48]
59 The validation database is set as the same as our face database for detection, but it can also be augmented with non-face images for improved discriminability. [sent-224, score-0.875]
60 If the candidate region is a true face, it will retrieve faces with similar poses and meanwhile accurately localize the faces, as shown in Fig. [sent-225, score-0.551]
61 Therefore we use such information to generate the validation score and further refine our face detection results. [sent-229, score-0.823]
62 Consider that top-푘 images are retrieved for a detected candidate window 푥, with a localized rectangle obtained in each retrieved image, we calculate the overlap ratio between the localized rectangle 푙푖 and ground truth rectangle 푔푖 for each retrieved image 퐼푖 (푖 = 1. [sent-230, score-1.132]
63 The validation score is then determined by: $V(x) = \sum_{i=1,\; R_i(x) > \theta}^{k} s_i(x)\, R_i(x)$ (14) where 푠푖 (푥) is the similarity score between the test sample 푥 and the 푖-th retrieved image. [sent-234, score-0.523]
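The validation score can be sketched as below. Two points are assumptions, since the defining equation for the overlap ratio is not present in this extract: the overlap ratio is taken to be intersection over union of axis-aligned rectangles `(x0, y0, x1, y1)`, and the threshold `theta=0.5` is an arbitrary illustrative value.

```python
def overlap_ratio(a, b):
    """Assumed overlap measure: intersection over union of two rectangles."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def validation_score(retrieved, theta=0.5):
    """retrieved: list of (similarity, localized_rect, ground_truth_rect) for
    the top-k retrieved images. Sums s_i * R_i over entries with R_i > theta."""
    total = 0.0
    for s, localized, truth in retrieved:
        r = overlap_ratio(localized, truth)
        if r > theta:
            total += s * r
    return total

retrieved = [
    (0.8, (0, 0, 10, 10), (0, 0, 10, 10)),    # localized exactly on the face
    (0.5, (0, 0, 10, 10), (20, 20, 30, 30)),  # localized off the face
]
v = validation_score(retrieved)
```

A candidate that retrieves faces but localizes away from their annotated rectangles accumulates little or no score, which matches the behaviour described for false positives.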
64 Face Alignment In addition to bounding rectangles, our database faces are annotated with landmark locations. [sent-241, score-0.671]
65 Therefore, we can transfer the facial landmark locations from the images retrieved during validation to the test image. [sent-242, score-0.726]
66 In this way, face alignment can be performed without any additional search cost, which is an additional benefit of our method. [sent-243, score-0.598]
67 We localize each landmark using a modified version of our voting scheme in face detection, and generate voting maps for each landmark separately. [sent-244, score-1.596]
68 To vote on a landmark, when we find a matched feature pair between the test sample and an exemplar face, we calculate the relative location of the feature to the landmark in the exemplar face image, and vote on the estimated location of that landmark [sent-245, score-2.124]
69 Figure 5. Face alignment and pose estimation using top retrieved face images. [sent-246, score-0.695]
70 Meanwhile, as in face detection, the vote is weighted by the relative distance from the feature to the landmark in the exemplar face. [sent-249, score-1.204]
71 After voting, the peak location in each individual voting map is the estimated landmark location based on 푐푖. [sent-251, score-0.576]
72 If we have 푘-top retrieved images, then the final estimated location of that landmark is determined as the per-component median value of 푒1 , 푒2 , . [sent-253, score-0.406]
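The per-component median can be illustrated with a few lines of standard-library Python. The function name `fuse_landmark` and the sample coordinates are hypothetical; each tuple stands for the landmark location estimated from one retrieved exemplar.

```python
from statistics import median

def fuse_landmark(estimates):
    """Combine per-exemplar (x, y) estimates of one landmark by taking the
    median of the x values and the median of the y values separately."""
    xs = [x for x, _ in estimates]
    ys = [y for _, y in estimates]
    return (median(xs), median(ys))

# Five estimates, one of them a clear outlier in x.
estimates = [(100, 52), (102, 50), (101, 51), (160, 49), (99, 50)]
fused = fuse_landmark(estimates)
```

Taking the median per component keeps a single badly localized exemplar from pulling the final landmark away from the consensus.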
73 If the exemplar faces in the database are annotated with additional information (e. [sent-258, score-0.751]
74 , attributes such as age, gender and expressions), we can use the top retrieved face images and the same methodology to estimate these attributes in the test image through label transfer. [sent-260, score-0.779]
75 In face detection, the smallest scale on which we vote is 80 × 80 (in a 1280-pixel dimension image). [sent-268, score-0.634]
76 To speed up the process and reduce the memory, given a test image, we first use the bag-of-words model[22] to retrieve 3000 similar images from the database, and then do voting and face detection using only those retrieved images. [sent-271, score-1.143]
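The shortlist step can be sketched as a ranking of database images by bag-of-words similarity. This is only an illustration under stated assumptions: images are represented as lists of visual-word ids, similarity is plain cosine similarity on raw word counts, and the names `cosine` and `shortlist` are hypothetical. The source states only that a bag-of-words model is used to retrieve 3000 similar images.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two visual-word histograms (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def shortlist(query_words, database, top_n):
    """Return the names of the top_n database images most similar to the query."""
    q = Counter(query_words)
    ranked = sorted(
        database,
        key=lambda name: cosine(q, Counter(database[name])),
        reverse=True,
    )
    return ranked[:top_n]

# Toy database: image name -> list of quantized visual-word ids.
database = {"ex_a": [1, 2, 2, 3], "ex_b": [7, 8, 9], "ex_c": [2, 3, 3, 4]}
top = shortlist([2, 2, 3], database, top_n=2)
```

Only the shortlisted exemplars would then take part in voting, which is what reduces time and memory in the described pipeline.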
77 Without code optimization, the entire pipeline of face detection, validation and alignment finishes in less than 10 seconds in a C++ implementation. [sent-272, score-0.723]
78 The voting and validation tasks can be parallelized to further reduce the detection time, which shows its potential in real time processing. [sent-273, score-0.52]
79 Both datasets contain faces in uncontrolled conditions with cluttered backgrounds and large variations in both face viewpoint and appearance, and thus bring forward great challenges to the current face detection algorithms. [sent-277, score-1.479]
80 In the AFW dataset, the results of the following face detection methods are reported in [29]: (1) OpenCV implementations of 2-view Viola-Jones, (2) Boosted 2-view face detector of [11], (3) Deformable part model(DPM)[6], (4) Mixture of trees[29], (5) face. [sent-278, score-1.164]
81 After face validation, our method further outperforms [29], achieving the state-of-the-art in research approaches, and closing the gap with face. [sent-285, score-0.506]
82 On this dataset, our initial face detection has already achieved quite good performance, and face validation does not show much improvement. [sent-296, score-1.278]
83 Moreover, there are many small faces in the ground-truth files which our method will not detect (the minimum resolution of the ground-truth faces is 20 pixels, while the minimum scale of our detection is 80 pixels in a 1280-resolution image). [sent-302, score-0.714]
84 resolutions, poses and attributes, in severe occlusions and cluttered background, as well as blurred face images. [sent-319, score-0.53]
85 Although the main focus of this paper is face detection, the proposed framework allows us to perform face alignment using the same methodology, as described in Section 5. [sent-320, score-1.072]
86 8 we can see that our approach can accurately localize the landmarks under large facial appearance variations, which shows great potential in more complete face alignment (e. [sent-323, score-0.839]
87 , eye corners and mouth corners) given the availability of more precise landmark annotations on our exemplar face database (footnote 8). [sent-325, score-1.225]
88 Discussions Currently, we include only 18486 face images in the database, without specifically selecting the types of faces, [Footnote 8: Please see more results.] [sent-328, score-0.53]
89 In principle, adding more faces to the database will further improve performance since the larger database will better span the face appearance variations. [sent-334, score-0.982]
90 Meanwhile, how to design a better database for face detection is an interesting problem that merits further study. [sent-336, score-0.721]
91 Conclusions In this paper, we propose a robust face detector by combining state-of-the-art visual search with discriminative learning. [sent-338, score-0.608]
92 Simple discriminative classifiers are learned for the exemplar face images in the database and collaboratively cast their prediction scores on the test image. [sent-339, score-1.111]
93 Face detection is then efficiently performed by selecting modes from multi-scale voting maps. [sent-340, score-0.422]
94 A face validation step using image retrieval is further proposed, and face alignment can [sent-341, score-1.342]
95 The evaluation on two public face detection datasets shows that our approach outperforms other state-of-the-art methods. [sent-345, score-0.655]
96 On the design of cascades of boosted ensembles for face detection. [sent-370, score-0.572]
97 Fddb: A benchmark for face detection in unconstrained settings. [sent-410, score-0.615]
98 Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization. [sent-429, score-0.614]
99 Fast rotation invariant multi-view face detection based on real adaboost. [sent-513, score-0.615]
100 Scalable face image retrieval with identity-based quantization and multireference reranking. [sent-521, score-0.619]
wordName wordTfidf (topN-words)
[('face', 0.506), ('exemplar', 0.338), ('faces', 0.264), ('voting', 0.254), ('landmark', 0.232), ('fddb', 0.185), ('exemplars', 0.177), ('validation', 0.157), ('rectangle', 0.141), ('retrieved', 0.129), ('vote', 0.128), ('facial', 0.115), ('retrieval', 0.113), ('detection', 0.109), ('gating', 0.107), ('database', 0.106), ('afw', 0.104), ('roc', 0.083), ('retrieve', 0.078), ('test', 0.067), ('localize', 0.066), ('al', 0.066), ('window', 0.063), ('bayes', 0.061), ('alignment', 0.06), ('rectangles', 0.058), ('maps', 0.052), ('facerelated', 0.052), ('picasa', 0.052), ('score', 0.051), ('brandt', 0.05), ('sliding', 0.049), ('tf', 0.049), ('naive', 0.048), ('wild', 0.047), ('gated', 0.046), ('subburaman', 0.046), ('jain', 0.046), ('landmarks', 0.046), ('accurately', 0.046), ('ratio', 0.046), ('expressions', 0.046), ('location', 0.045), ('meanwhile', 0.044), ('similarity', 0.044), ('detect', 0.043), ('annotated', 0.043), ('mouth', 0.043), ('detector', 0.043), ('mikolajczyk', 0.042), ('calculate', 0.041), ('localized', 0.041), ('public', 0.04), ('localization', 0.04), ('cast', 0.038), ('boosted', 0.037), ('scanning', 0.037), ('false', 0.036), ('positives', 0.036), ('similarities', 0.035), ('google', 0.035), ('overlap', 0.035), ('threshold', 0.035), ('modes', 0.035), ('kalal', 0.035), ('lo', 0.034), ('variations', 0.034), ('files', 0.034), ('continues', 0.033), ('resorting', 0.033), ('apparently', 0.032), ('search', 0.032), ('uncontrolled', 0.031), ('adobe', 0.031), ('cascade', 0.031), ('center', 0.031), ('log', 0.03), ('opencv', 0.03), ('candidate', 0.029), ('methodology', 0.029), ('classifiers', 0.029), ('cascades', 0.029), ('challenges', 0.029), ('exhaustive', 0.029), ('shall', 0.028), ('army', 0.028), ('nose', 0.028), ('suppose', 0.027), ('discriminative', 0.027), ('bounding', 0.026), ('detected', 0.026), ('query', 0.026), ('locations', 0.026), ('positive', 0.026), ('wu', 0.025), ('votes', 0.025), ('selecting', 0.024), ('attributes', 0.024), ('poses', 0.024), ('sample', 
0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999839 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods[24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
2 0.41417533 152 cvpr-2013-Exemplar-Based Face Parsing
Author: Brandon M. Smith, Li Zhang, Jonathan Brandt, Zhe Lin, Jianchao Yang
Abstract: In this work, we propose an exemplar-based face image segmentation algorithm. We take inspiration from previous works on image parsing for general scenes. Our approach assumes a database of exemplar face images, each of which is associated with a hand-labeled segmentation map. Given a test image, our algorithm first selects a subset of exemplar images from the database. Our algorithm then computes a nonrigid warp for each exemplar image to align it with the test image. Finally, we propagate labels from the exemplar images to the test image in a pixel-wise manner, using trained weights to modulate and combine label maps from different exemplars. We evaluate our method on two challenging datasets and compare with two face parsing algorithms and a general scene parsing algorithm. We also compare our segmentation results with contour-based face alignment results; that is, we first run the alignment algorithms to extract contour points and then derive segments from the contours. Our algorithm compares favorably with all previous works on all datasets evaluated.
3 0.303931 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification
Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
Abstract: Pose variation remains a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part-based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds correspondence of a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state-of-the-art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin.
4 0.29057002 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
Author: Baoyuan Wu, Yifan Zhang, Bao-Gang Hu, Qiang Ji
Abstract: In this paper, we focus on face clustering in videos. Given the detected faces from real-world videos, we partition all faces into K disjoint clusters. Different from clustering on a collection of facial images, the faces from videos are organized as face tracks and the frame index of each face is also provided. As a result, many pairwise constraints between faces can be easily obtained from the temporal and spatial knowledge of the face tracks. These constraints can be effectively incorporated into a generative clustering model based on the Hidden Markov Random Fields (HMRFs). Within the HMRF model, the pairwise constraints are augmented by label-level and constraint-level local smoothness to guide the clustering process. The parameters for both the unary and the pairwise potential functions are learned by the simulated field algorithm, and the weights of constraints can be easily adjusted. We further introduce an efficient clustering framework specifically for face clustering in videos, considering that faces in adjacent frames of the same face track are very similar. The framework is applicable to other clustering algorithms to significantly reduce the computational cost. Experiments on two face data sets from real-world videos demonstrate the significantly improved performance of our algorithm over state-of-the-art algorithms.
5 0.28897798 438 cvpr-2013-Towards Pose Robust Face Recognition
Author: Dong Yi, Zhen Lei, Stan Z. Li
Abstract: Most existing pose robust methods are too computationally complex for practical applications, and their performance under unconstrained environments is rarely evaluated. In this paper, we propose a novel method for pose robust face recognition towards practical applications, which is fast, pose robust and can work well under unconstrained environments. Firstly, a 3D deformable model is built and a fast 3D model fitting algorithm is proposed to estimate the pose of the face image. Secondly, a group of Gabor filters are transformed according to the pose and shape of the face image for feature extraction. Finally, PCA is applied on the pose adaptive Gabor features to remove the redundancies, and the cosine metric is used to evaluate the similarity. The proposed method has three advantages: (1) The pose correction is applied in the filter space rather than image space, which makes our method less affected by the precision of the 3D model; (2) By combining the holistic pose transformation and local Gabor filtering, the final feature is robust to pose and other negative factors in face recognition; (3) The 3D structure and facial symmetry are successfully used to deal with self-occlusion. Extensive experiments on FERET and PIE show the proposed method outperforms state-of-the-art methods significantly; meanwhile, the method works well on LFW.
6 0.28115627 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification
7 0.25244373 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild
8 0.25112838 415 cvpr-2013-Structured Face Hallucination
10 0.21339045 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
11 0.20338495 172 cvpr-2013-Finding Group Interactions in Social Clutter
12 0.20000911 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
13 0.19706945 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification
14 0.16852716 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
15 0.16750588 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition
16 0.16581775 325 cvpr-2013-Part Discovery from Partial Correspondence
17 0.16271664 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
18 0.15738335 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition
19 0.14785886 4 cvpr-2013-3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image
20 0.13635023 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes
topicId topicWeight
[(0, 0.249), (1, -0.135), (2, -0.074), (3, -0.009), (4, 0.143), (5, 0.008), (6, -0.051), (7, -0.13), (8, 0.383), (9, -0.276), (10, 0.109), (11, -0.053), (12, 0.207), (13, 0.139), (14, 0.041), (15, -0.062), (16, 0.079), (17, -0.015), (18, -0.026), (19, 0.053), (20, -0.037), (21, -0.006), (22, 0.049), (23, 0.135), (24, 0.085), (25, -0.0), (26, 0.036), (27, 0.018), (28, -0.03), (29, 0.046), (30, -0.0), (31, -0.016), (32, 0.09), (33, 0.021), (34, -0.017), (35, -0.032), (36, 0.063), (37, -0.099), (38, 0.001), (39, 0.024), (40, 0.028), (41, -0.014), (42, -0.012), (43, -0.04), (44, 0.004), (45, -0.012), (46, -0.044), (47, -0.027), (48, -0.04), (49, 0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.97951978 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods[24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
2 0.88184208 152 cvpr-2013-Exemplar-Based Face Parsing
Author: Brandon M. Smith, Li Zhang, Jonathan Brandt, Zhe Lin, Jianchao Yang
Abstract: In this work, we propose an exemplar-based face image segmentation algorithm. We take inspiration from previous works on image parsing for general scenes. Our approach assumes a database of exemplar face images, each of which is associated with a hand-labeled segmentation map. Given a test image, our algorithm first selects a subset of exemplar images from the database. Our algorithm then computes a nonrigid warp for each exemplar image to align it with the test image. Finally, we propagate labels from the exemplar images to the test image in a pixel-wise manner, using trained weights to modulate and combine label maps from different exemplars. We evaluate our method on two challenging datasets and compare with two face parsing algorithms and a general scene parsing algorithm. We also compare our segmentation results with contour-based face alignment results; that is, we first run the alignment algorithms to extract contour points and then derive segments from the contours. Our algorithm compares favorably with all previous works on all datasets evaluated.
3 0.83771181 338 cvpr-2013-Probabilistic Elastic Matching for Pose Variant Face Verification
Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang
Abstract: Pose variation remains to be a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds correspondence of a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state-of-the-art in the most restricted protocol on Labeled Face in the Wild (LFW) and the YouTube video face database by a significant margin.
4 0.80644399 415 cvpr-2013-Structured Face Hallucination
Author: Chih-Yuan Yang, Sifei Liu, Ming-Hsuan Yang
Abstract: The goal of face hallucination is to generate high-resolution images with fidelity from low-resolution ones. In contrast to existing methods based on patch similarity or holistic constraints in the image space, we propose to exploit local image structures for face hallucination. Each face image is represented in terms of facial components, contours and smooth regions. The image structure is maintained via matching gradients in the reconstructed high-resolution output. For facial components, we align input images to generate accurate exemplars and transfer the high-frequency details for preserving structural consistency. For contours, we learn statistical priors to generate salient structures in the high-resolution images. A patch matching method is utilized on the smooth regions where the image gradients are preserved. Experimental results demonstrate that the proposed algorithm generates hallucinated face images with favorable quality and adaptability.
Author: Enrique G. Ortiz, Alan Wright, Mubarak Shah
Abstract: This paper presents an end-to-end video face recognition system, addressing the difficult problem of identifying a video face track using a large dictionary of still face images of a few hundred people, while rejecting unknown individuals. A straightforward application of the popular ℓ1-minimization for face recognition on a frame-by-frame basis is prohibitively expensive, so we propose a novel algorithm Mean Sequence SRC (MSSRC) that performs video face recognition using a joint optimization leveraging all of the available video data and the knowledge that the face track frames belong to the same individual. By adding a strict temporal constraint to the ℓ1-minimization that forces individual frames in a face track to all reconstruct a single identity, we show the optimization reduces to a single minimization over the mean of the face track. We also introduce a new Movie Trailer Face Dataset collected from 101 movie trailers on YouTube. Finally, we show that our method matches or outperforms the state-of-the-art on three existing datasets (YouTube Celebrities, YouTube Faces, and Buffy) and our unconstrained Movie Trailer Face Dataset. More importantly, our method excels at rejecting unknown identities by at least 8% in average precision.
6 0.8010658 438 cvpr-2013-Towards Pose Robust Face Recognition
7 0.79912925 182 cvpr-2013-Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild
8 0.74584204 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
9 0.71983004 463 cvpr-2013-What's in a Name? First Names as Facial Attributes
10 0.69136393 92 cvpr-2013-Constrained Clustering and Its Application to Face Clustering in Videos
11 0.68993026 420 cvpr-2013-Supervised Descent Method and Its Applications to Face Alignment
12 0.67631066 161 cvpr-2013-Facial Feature Tracking Under Varying Facial Expressions and Face Poses Based on Restricted Boltzmann Machines
13 0.65962183 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
14 0.65692306 389 cvpr-2013-Semi-supervised Learning with Constraints for Person Identification in Multimedia Data
15 0.62734246 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition
16 0.61500126 64 cvpr-2013-Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification
17 0.60315865 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
19 0.59127074 359 cvpr-2013-Robust Discriminative Response Map Fitting with Constrained Local Models
20 0.56176859 159 cvpr-2013-Expressive Visual Text-to-Speech Using Active Appearance Models
topicId topicWeight
[(10, 0.118), (16, 0.031), (19, 0.098), (26, 0.087), (33, 0.274), (67, 0.15), (69, 0.082), (87, 0.082)]
simIndex simValue paperId paperTitle
1 0.94240475 463 cvpr-2013-What's in a Name? First Names as Facial Attributes
Author: Huizhong Chen, Andrew C. Gallagher, Bernd Girod
Abstract: This paper introduces a new idea in describing people using their first names, i.e., the name assigned at birth. We show that describing people in terms of similarity to a vector of possible first names is a powerful description of facial appearance that can be used for face naming and building facial attribute classifiers. We build models for 100 common first names used in the United States and for each pair, construct a pairwise first-name classifier. These classifiers are built using training images downloaded from the internet, with no additional user interaction. This gives our approach important advantages in building practical systems that do not require additional human intervention for labeling. We use the scores from each pairwise name classifier as a set of facial attributes. We show several surprising results. Our name attributes predict the correct first names of test faces at rates far greater than chance. The name attributes are applied to gender recognition and to age classification, outperforming state-of-the-art methods with all training images automatically gathered from the internet.
same-paper 2 0.94079185 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval
Author: Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
Abstract: Detecting faces in uncontrolled environments continues to be a challenge to traditional face detection methods [24] due to the large variation in facial appearances, as well as occlusion and clutter. In order to overcome these challenges, we present a novel and robust exemplar-based face detector that integrates image retrieval and discriminative learning. A large database of faces with bounding rectangles and facial landmark locations is collected, and simple discriminative classifiers are learned from each of them. A voting-based method is then proposed to let these classifiers cast votes on the test image through an efficient image retrieval technique. As a result, faces can be very efficiently detected by selecting the modes from the voting maps, without resorting to exhaustive sliding window-style scanning. Moreover, due to the exemplar-based framework, our approach can detect faces under challenging conditions without explicitly modeling their variations. Evaluation on two public benchmark datasets shows that our new face detection approach is accurate and efficient, and achieves the state-of-the-art performance. We further propose to use image retrieval for face validation (in order to remove false positives) and for face alignment/landmark localization. The same methodology can also be easily generalized to other face-related tasks, such as attribute recognition, as well as general object detection.
3 0.93765205 356 cvpr-2013-Representing and Discovering Adversarial Team Behaviors Using Player Roles
Author: Patrick Lucey, Alina Bialkowski, Peter Carr, Stuart Morgan, Iain Matthews, Yaser Sheikh
Abstract: In this paper, we describe a method to represent and discover adversarial group behavior in a continuous domain. In comparison to other types of behavior, adversarial behavior is heavily structured as the location of a player (or agent) is dependent both on their teammates and adversaries, in addition to the tactics or strategies of the team. We present a method which can exploit this relationship through the use of a spatiotemporal basis model. As players constantly change roles during a match, we show that employing a “role-based” representation instead of one based on player “identity” can best exploit the playing structure. As vision-based systems currently do not provide perfect detection/tracking (e.g. missed or false detections), we show that our compact representation can effectively “denoise” erroneous detections as well as enabling temporal analysis, which was previously prohibitive due to the dimensionality of the signal. To evaluate our approach, we used a fully instrumented field-hockey pitch with 8 fixed high-definition (HD) cameras and evaluated our approach on approximately 200,000 frames of data from a state-of-the-art real-time player detector and compare it to manually labelled data.
4 0.93173671 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
Author: Jianguo Li, Yimin Zhang
Abstract: This paper presents a novel learning framework for training boosting cascade based object detector from large scale dataset. The framework is derived from the well-known Viola-Jones (VJ) framework but distinguished by three key differences. First, the proposed framework adopts multi-dimensional SURF features instead of single dimensional Haar features to describe local patches. In this way, the number of used local patches can be reduced from hundreds of thousands to several hundreds. Second, it adopts logistic regression as weak classifier for each local patch instead of decision trees in the VJ framework. Third, we adopt AUC as a single criterion for the convergence test during cascade training rather than the two trade-off criteria (false-positive-rate and hit-rate) in the VJ framework. The benefit is that the false-positive-rate can be adaptive among different cascade stages, and thus yields much faster convergence speed of SURF cascade. Combining these points together, the proposed approach has three good properties. First, the boosting cascade can be trained very efficiently. Experiments show that the proposed approach can train object detectors from billions of negative samples within one hour even on personal computers. Second, the built detector is comparable to the state-of-the-art algorithm not only on the accuracy but also on the processing speed. Third, the built detector is small in model-size due to short cascade stages.
5 0.9286412 2 cvpr-2013-3D Pictorial Structures for Multiple View Articulated Pose Estimation
Author: Magnus Burenius, Josephine Sullivan, Stefan Carlsson
Abstract: We consider the problem of automatically estimating the 3D pose of humans from images, taken from multiple calibrated views. We show that it is possible and tractable to extend the pictorial structures framework, popular for 2D pose estimation, to 3D. We discuss how to use this framework to impose view, skeleton, joint angle and intersection constraints in 3D. The 3D pictorial structures are evaluated on multiple view data from a professional football game. The evaluation is focused on computational tractability, but we also demonstrate how a simple 2D part detector can be plugged into the framework.
6 0.925726 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
7 0.92454195 339 cvpr-2013-Probabilistic Graphlet Cut: Exploiting Spatial Structure Cue for Weakly Supervised Image Segmentation
8 0.92347383 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection
9 0.92043257 160 cvpr-2013-Face Recognition in Movie Trailers via Mean Sequence Sparse Representation-Based Classification
10 0.92013049 45 cvpr-2013-Articulated Pose Estimation Using Discriminative Armlet Classifiers
11 0.91981775 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues
12 0.91913193 311 cvpr-2013-Occlusion Patterns for Object Class Detection
13 0.91756713 122 cvpr-2013-Detection Evolution with Multi-order Contextual Co-occurrence
14 0.91575861 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search
15 0.9139939 197 cvpr-2013-Hallucinated Humans as the Hidden Context for Labeling 3D Scenes
16 0.91366386 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
17 0.91298938 322 cvpr-2013-PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors
18 0.91291195 414 cvpr-2013-Structure Preserving Object Tracking
19 0.91247761 94 cvpr-2013-Context-Aware Modeling and Recognition of Activities in Video
20 0.91146713 325 cvpr-2013-Part Discovery from Partial Correspondence