97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition


Source: pdf

Author: Zhiwu Huang, Xiaowei Zhao, Shiguang Shan, Ruiping Wang, Xilin Chen

Abstract: Still-to-Video (S2V) face recognition systems typically need to match faces in low-quality videos captured under unconstrained conditions against high-quality still face images, which is very challenging because of noise, image blur, low face resolution, varying head pose, complex lighting, and alignment difficulty. To address the problem, one solution is to select the frames of ‘best quality’ from videos (hereinafter called quality alignment in this paper). Meanwhile, the faces in the selected frames should also be geometrically aligned to the still faces, which are well aligned offline in the gallery. In this paper, we discover that the three tasks (quality alignment, geometric alignment and face recognition) can benefit from each other and thus should be performed jointly. With this in mind, we propose a Coupling Alignments with Recognition (CAR) method to tightly couple these tasks via low-rank regularized sparse representation in a unified framework. Our method makes the three tasks promote one another through joint optimization in an Augmented Lagrange Multiplier routine. Extensive experiments on two challenging S2V datasets demonstrate that our method outperforms the state-of-the-art methods.
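The abstract mentions sparse representation solved inside an Augmented Lagrange Multiplier routine. The paper's actual update rules are not reproduced on this page, so the following is only a minimal sketch of the soft-thresholding (shrinkage) operator that solvers for l1-regularized problems commonly use; the function name, the threshold value and the sample coefficients are assumptions made for illustration.

```python
def soft_threshold(values, tau):
    """Shrink each entry toward zero by tau; entries within [-tau, tau] become 0.

    Illustrative only: this is the standard proximal operator of the l1 norm,
    not code taken from the CAR paper.
    """
    out = []
    for v in values:
        if v > tau:
            out.append(v - tau)
        elif v < -tau:
            out.append(v + tau)
        else:
            out.append(0.0)
    return out


# Hypothetical representation coefficients over a gallery dictionary.
coeffs = [0.9, -0.05, 0.3, -0.6, 0.02]
sparse = soft_threshold(coeffs, 0.1)
```

Small coefficients are set exactly to zero while large ones are only reduced in magnitude, which is what makes the resulting representation sparse.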

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 To address the problem, one solution is to select the frames of ‘best quality’ from videos (hereinafter called quality alignment in this paper). [sent-5, score-0.678]

2 Meanwhile, the faces in the selected frames should also be geometrically aligned to the still faces offline well-aligned in the gallery. [sent-6, score-0.95]

3 In this paper, we discover that the interactions among the three tasks–quality alignment, geometric alignment and face recognition–can benefit from each other, thus should be performed jointly. [sent-7, score-0.549]

4 , [1, 2, 3, 4, 5], address the so called Video-to-Video (V2V) face recognition problem, in which query video sequences are matched against a set of target video sequences. [sent-15, score-0.482]

5 In our method, we jointly perform geometric alignment, recognition and quality alignment in a close loop to estimate the alignment parameters T, the identity labels L and the selecting confidences C for a video face sequence. [sent-25, score-1.38]

6 To differentiate it from the V2V face recognition problem, this scenario is specifically called the Still-to-Video (S2V) face recognition problem [6, 7]. [sent-29, score-0.46]

7 Most of them learned the relationship between the still images and video frames but did not directly handle bad quality frames, which very likely make the recognition perform badly. [sent-31, score-0.399]

8 In this paper, we call the task of selecting good quality frames, with the most similar quality to that of still images, as quality alignment. [sent-33, score-0.417]

9 1, the frames in the red box can be selected to match against the target faces, as these faces are in near frontal view. [sent-35, score-0.463]

10 But, how to achieve accurate quality alignment forms the first challenge to attack for S2V face recognition scenario. [sent-36, score-0.686]

11 Specifically, the problem arises because the faces taken from video can hardly be geometrically aligned accurately by existing methods (e. [sent-38, score-0.621]

12 , Active Shape Model (ASM) [11], Active Appearance Model (AAM) [12]), as the faces are generally of low resolution, probably with motion blurring and often taken under non-ideal lighting conditions. [sent-40, score-0.376]

13 Furthermore, it is worth pointing out that here the “misalignment” means not only the mutual misalignment of the video frames but also their joint misalignment with the target faces. [sent-41, score-0.287]

14 As an example, due to geometric misalignments, all the input video faces in Fig. [sent-42, score-0.559]

15 We call this alignment task geometric alignment, in contrast to the quality alignment mentioned above. [sent-44, score-0.84]

16 It is also worth noticing that the above two alignment problems are related. [sent-45, score-0.327]

17 For example, intuitively, it is not necessary for us to geometrically align the target faces with those frame not selected by the quality aligning. [sent-46, score-0.623]

18 Furthermore, it is clear that the above two types of alignments can heavily affect the recognition results. [sent-49, score-0.289]

19 In other words, alignment should not only simply precede recognition, but should also benefit from recognition. [sent-54, score-0.327]

20 To put it in another way, the two kinds of alignments and recognition should be coupled together in a loop. [sent-55, score-0.289]

21 With above belief in mind, in this paper, we propose Coupling Alignments and Recognition (CAR) method which tightly couples the above two alignment tasks with recognition task, thus making them benefit each other in a unified loop, as shown in Fig. [sent-56, score-0.476]

22 Specifically, we assume that if the faces in a video are accurately aligned with well-aligned gallery faces, they can be well represented as sparse linear combinations of the gallery faces with the same identity. [sent-58, score-1.827]

23 This can connect and improve both of geometric alignment and recognition by simultaneously aligning and seeking sparse representations of video faces over gallery still faces. [sent-59, score-1.583]

24 With better alignments and sparse representations, our proposed quality alignment can cluster and weight different quality frames more accurately. [sent-60, score-0.932]

25 In addition, we also adopt low-rank prior that if video faces are in mild variations, a proper low-rank structure will exist. [sent-61, score-0.502]

26 By incorporating the low-rank prior, each cluster of the same quality faces obtained by quality alignment can be jointly aligned and consistently represented as sparse linear combinations of gallery set, which can backward promote both geometric alignment and recognition. [sent-62, score-1.956]

27 Consequently, in this close loop, our method iteratively aligns the video faces, identifies them and selects good frames, which can improve the three tasks mutually and finally corrects the initial possibly erroneous recognition decision. [sent-63, score-0.338]

28 Related Work. In this section, we briefly introduce the sparse representation for alignment and the low-rank representation for subspace segmentation. [sent-69, score-0.494]

29 Robust Alignment by Sparse Representation. Designed for still images, Robust Alignment by Sparse Representation (RASR) [10] simultaneously optimizes the alignment parameters and the sparse representation coefficients. [sent-72, score-0.494]

30 In this case, RASR turns to seek the best alignment of the test face from subject to subject: $\min_{\alpha_i, \tau_i, e_i}$ … [sent-86, score-0.518]

31 To overcome this drawback, the Misalignment-Robust Representation method (MRR) [14] first aligns the gallery images well offline and then directly solves its objective function in a global representation over the whole set of gallery images without concerning local minima. [sent-94, score-0.88]

32 Proposed Method. In this section, we present the Coupling Alignments with Recognition approach to jointly optimize three tasks in a unified framework: geometrically aligning faces (geometric alignment), performing recognition, and selecting good quality frames (quality alignment). [sent-117, score-0.789]

33 Formulation. The S2V face recognition problem matches low quality facial video frames against high quality gallery still faces. [sent-121, score-1.079]

34 , Ac] be the gallery dictionary with c subjects, where Ai represents the well-aligned gallery still faces of the i-th subject. [sent-128, score-1.218]

35 Formally, for the video faces Y , we want to estimate the alignment transformations T and the identity labels L simultaneously by the following: ? [sent-129, score-0.927]

36 the segment of faces with the i-th identity in recognition, B’s columns contain the sparse representations of video faces and E = [e1, e2 , . [sent-155, score-1.016]

37 For joint geometric alignment and recognition, we adopt the sparse representation prior that if the alignments of video faces are accurate, they can be represented as good linear combinations of well-aligned gallery still faces. [sent-159, score-1.663]

38 So, we need to seek an optimal set of deformations T for the video sequence Y simultaneously with their sparse representations over the gallery dictionary A. [sent-160, score-0.731]

39 In this way, the sparse representations over gallery set make faces from video aligned with the gallery faces, thus geometrically align them more accurately. [sent-161, score-1.547]

40 Meanwhile, the aligned video faces will obtain more accurate sparse representations in terms of the entire gallery set. [sent-162, score-1.069]

41 Furthermore, both better geometric alignment and recognition can facilitate quality alignment selecting good quality frames. [sent-163, score-1.034]

42 Specifically, we assume that the video faces with the same recognition identity are similar in appearance. [sent-164, score-0.595]

43 Under this assumption, video faces will be automatically clustered into different segments (i. [sent-165, score-0.502]

44 Additionally, different clusters of faces will be weighted with different confidences, which are defined as: $C_{S_i} = \ldots$ [sent-171, score-0.376]

45 Since the gallery dictionary only contains frontal faces, the reconstructed errors of faces in frontal cluster are often smaller than that of non-frontal ones. [sent-184, score-0.878]

46 As a result, with more accurate clustering and weighting, quality alignment will select good quality frames with higher confidences. [sent-185, score-0.634]

47 Besides, the three tasks are also simultaneously coupled by low-rank prior, which assumes that since faces in one video vary continuously, they should own a good low-rank structure. [sent-186, score-0.578]

48 When one video sequence has large inter-frame differences, better low-rank structures will be obtained in each of the individual clusters divided by quality alignment task. [sent-187, score-0.619]

49 In these different video segments, the sparse representations of video faces are regularized by low-rank prior respectively to achieve more consistent linear combinations of gallery images and more accurate joint alignment of the faces with gallery images. [sent-188, score-2.245]

50 Specifically, in our algorithm, we employ a coarse-to-fine search strategy, which performs … Algorithm 1: Main Algorithm of the CAR method. INPUT: Gallery data matrix A, probe video sequence data matrix Y and initial transformation T of Y. 1. [sent-189, score-0.301]

51 In the input, the initial transformations of the probe video sequence could be the similarity transformations according to the automatically detected locations of eye centers. [sent-283, score-0.359]

52 Following [19], step 3 calculates the collaborative representations of video faces over dictionary. [sent-344, score-0.566]

53 Experiments. In this section, we present extensive experiments to demonstrate the effectiveness of our proposed Coupling Alignments with Recognition (CAR) algorithm in terms of both alignment accuracy and recognition performance. [sent-348, score-0.392]

54 For alignment, we compare our approach with one of the state-of-the-art blind joint alignment algorithm RASL [21]. [sent-350, score-0.358]

55 Besides, the alignment results of the recently proposed simultaneous alignment and recognition method MRR [10] are also shown. [sent-351, score-0.719]

56 As it is not easy to collect more than one high quality frontal face image for all the subjects, and to keep the setting uniform across subjects, we included only 1 frontal face image for each person in the gallery, as in COX-S2V [7]. [sent-357, score-0.497]

57 For evaluation of S2V face recognition, we design an unsupervised scenario, which uses the still images for gallery and the videos for probe without any training set. [sent-358, score-0.717]

58 In the test, the still images are enrolled in the gallery while the video sequences are contained in the probe. [sent-362, score-0.542]

59 Experimental Settings. In our experiments, a commercial face detection SDK, OKAO, was employed to detect faces and locate eyes in both still images and videos. [sent-366, score-0.597]

60 Rows from top to bottom show the alignment and identification results of SRC(original faces), A-SRC(RASL), MRR and CAR respectively on one video sequence from YouTube-S2V. [sent-376, score-0.523]

61 The tick indicates the face is correctly identified, and the red box indicates the face frame is selected in quality alignment. [sent-377, score-0.485]

62 Gallery faces and average faces from videos before and after alignment. [sent-379, score-0.796]

63 SRC shows the average of original video faces from a face detector using its quality alignment result; and A-SRC(RASL), MRR, CAR show the average faces after their respective alignments. [sent-380, score-1.499]

64 Evaluation results on S2V alignment and recognition. In our CAR algorithm, the tasks of alignments and recognition are tightly coupled. [sent-389, score-0.765]

65 However, to facilitate the comparison with conventional alignment and recognition approaches respectively, we will present the results for alignments and recognition respectively in the subsection. [sent-390, score-0.681]

66 1 Evaluation of Alignment Accuracy. We illustrate results on the YouTube-S2V database to evaluate the alignment accuracies of RASL, MRR and our method CAR. [sent-393, score-0.327]

67 As a blind alignment method, original RASL jointly aligns the video faces without considering the gallery still faces. [sent-395, score-1.401]

68 To align with the gallery images, in this experiment, RASL is used to add all the gallery still images into the video sequence and jointly align the still faces and video faces together. [sent-396, score-2.009]

69 As a simultaneous alignment and recognition method designed for still images, we use MRR to align the video faces frame by frame in each video clip. [sent-397, score-1.146]

70 In contrast, our algorithm CAR jointly aligns the video faces mutually with the gallery still images. [sent-398, score-1.036]

71 2 shows alignment and recognition results of one video sequence with 13 selected frames from the YouTube-S2V dataset. [sent-400, score-0.604]

72 2, the misalignments of input video faces are very serious: the eyes of most faces are not in horizontality, the face scales are not the same, and most ones do not have the whole mouth. [sent-402, score-1.096]

73 Although RASL jointly aligns the eyes of video faces in the same horizontality, most aligned faces have partial mouth. [sent-403, score-1.1]

74 MRR aligns several video faces accurately, but it still makes some faces in a wrong scale, such as f2,f3,f5. [sent-404, score-0.985]

75 In contrast, except faces f11-f13 owning exaggerated facial expressions, our method CAR aligns most of faces including f2,f3,f5 to the gallery still image. [sent-405, score-1.245]

76 In addition, faces in red box are the selected ones in quality alignment. [sent-406, score-0.505]

77 Since there is no ground truth for this dataset, we verify performances of involved method visually by plotting the average faces before and after geometric alignment and quality alignment. [sent-408, score-0.889]

78 3 shows the gallery faces and the mean faces of videos from 10 subjects in YouTube-S2V. [sent-410, score-1.243]

79 3, the first average face of SRC is the mean of the faces with red box (i. [sent-412, score-0.541]

80 Note that the average faces after CAR’s alignment are more clear and aligned with gallery images more accurately than those of other methods. [sent-416, score-1.16]

81 This result suggests that our method CAR achieves the improved geometric alignment in unconstrained S2V scenario. [sent-417, score-0.384]

82 This can be explained by the fact that, by jointly exploiting the sparse representation prior and the low-rank prior, our method is much more robust in aligning the video faces. [sent-418, score-0.353]

83 On one hand, the sparse representation over gallery dictionary makes the video faces aligned to the still images, which are well-aligned. [sent-419, score-1.134]

84 But, in the S2V case, several faces are not robustly aligned only with the sparse representation. [sent-420, score-0.521]

85 Consequently, on the other hand, the low-rank prior facilitates those more easily aligned video faces to correct the alignments of the others in the same video sequence. [sent-421, score-0.923]

86 2, the bad aligned faces f2,f3,f5 Table 1. [sent-423, score-0.447]

87 Unsupervised S2V face recognition results (rank-1 recognition rate (%)) on YouTube-S2V(Y) and COX-S2V(Ci) datasets. [sent-424, score-0.295]

88 Due to better geometric alignment of CAR, the faces f2,f3,f5 are also identified correctly. [sent-441, score-0.787]

89 2 Evaluation of S2V Face Recognition. The S2V recognition involves defining the similarity between video sequence and gallery images and determining which strategy is used for classification. [sent-444, score-0.614]

90 This is because C-Voting with quality alignment is more suitable for the S2V scenario. [sent-464, score-0.456]

91 Then, we conclude the results of different methods: Due to the blind alignment preprocess of RASL, A-SRC and A-CRC perform slightly better than the original methods. Table 2. [sent-468, score-0.358]

92 Supervised S2V face recognition results (rank-1 recognition rate (%)) on COX-S2V (Ci) dataset. [sent-469, score-0.295]

93 By simultaneously aligning and recognizing faces frame by frame, MRR works better than RASL. [sent-475, score-0.508]

94 This is because the proposed method CAR improves the recognition by both more accurate geometric alignment and better frame selections in quality alignment: the geometric alignment generates better sparse representations over gallery set by jointly aligning the video faces more accurately. [sent-477, score-2.108]

95 In different clusters divided by quality alignment, low-rank regularizations are conducted on the sparse representations of video faces to make the sparse representation-based classification more robust. [sent-478, score-0.847]

96 In training, for A-SRC, A-CRC, MRR and CAR, the training images are all aligned in advance by the corresponding alignment methods. [sent-480, score-0.398]

97 This supervised case also suggests that the idea of coupling alignments and recognition is highly desirable for S2V face recognition. [sent-488, score-0.542]

98 Conclusion. In this paper, we first studied the mutual influence among geometric alignment, quality alignment and recognition in the S2V face recognition scenario. [sent-504, score-0.808]

99 Patch-based probabilistic image quality assessment for face selection and improved video-based face recognition. [sent-573, score-0.459]

100 Towards a practical face recognition system: robust alignment and illumination by sparse representation. [sent-582, score-0.631]
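The summary above refers to selecting confidences C for frames and to a "C-Voting" strategy, but the defining formula did not survive extraction. As a hedged illustration only, confidence-weighted voting over per-frame identity decisions could look like the sketch below; the function name, the labels and the confidence values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def weighted_vote(frame_labels, frame_confidences):
    """Return the identity whose frames carry the largest total confidence.

    Illustrative assumption: each frame has one predicted label and one
    non-negative confidence. This is not the paper's exact C-Voting rule.
    """
    if len(frame_labels) != len(frame_confidences):
        raise ValueError("one confidence per frame is required")
    totals = defaultdict(float)
    for label, conf in zip(frame_labels, frame_confidences):
        totals[label] += conf
    # Iterating over sorted labels makes ties resolve to the smallest label.
    return max(sorted(totals), key=lambda k: totals[k])


# Hypothetical video: three low-confidence frames vote "id_7",
# two high-confidence (e.g. near-frontal) frames vote "id_3".
labels = ["id_3", "id_7", "id_3", "id_7", "id_7"]
confs = [0.9, 0.1, 0.8, 0.2, 0.1]
winner = weighted_vote(labels, confs)
```

In this toy input a plain majority vote would pick "id_7", while weighting by confidence picks "id_3", which is the kind of effect that frame selection by quality is meant to achieve.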


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('gallery', 0.386), ('faces', 0.376), ('mrr', 0.35), ('alignment', 0.327), ('alignments', 0.224), ('rasl', 0.212), ('car', 0.175), ('face', 0.165), ('quality', 0.129), ('video', 0.126), ('rasr', 0.124), ('src', 0.103), ('probe', 0.092), ('coupling', 0.088), ('multiplier', 0.079), ('aligns', 0.077), ('aligning', 0.074), ('sparse', 0.074), ('lagrange', 0.072), ('bk', 0.071), ('aligned', 0.071), ('loop', 0.068), ('bsi', 0.068), ('recognition', 0.065), ('csi', 0.064), ('akzjk', 0.062), ('subjects', 0.061), ('geometric', 0.057), ('misalignment', 0.056), ('djk', 0.055), ('ek', 0.053), ('augmented', 0.05), ('frames', 0.049), ('jointly', 0.048), ('geometrically', 0.048), ('mink', 0.048), ('alm', 0.047), ('transformation', 0.046), ('az', 0.045), ('xk', 0.045), ('videos', 0.044), ('align', 0.044), ('tasks', 0.044), ('yj', 0.041), ('horizontality', 0.041), ('idy', 0.041), ('misalignmentrobust', 0.041), ('ruiping', 0.041), ('vevce', 0.041), ('zhiwu', 0.041), ('dictionary', 0.04), ('linearized', 0.04), ('linearization', 0.04), ('confidences', 0.04), ('tightly', 0.04), ('tk', 0.039), ('frontal', 0.038), ('ganesh', 0.038), ('lagrangian', 0.038), ('transformations', 0.038), ('sequence', 0.037), ('shiguang', 0.037), ('outer', 0.036), ('arg', 0.036), ('si', 0.036), ('representations', 0.036), ('wright', 0.036), ('convex', 0.034), ('identification', 0.033), ('simultaneously', 0.032), ('combinations', 0.032), ('conducts', 0.032), ('regularizations', 0.032), ('beijing', 0.031), ('subspace', 0.031), ('representation', 0.031), ('blind', 0.031), ('svd', 0.031), ('still', 0.03), ('crc', 0.03), ('searching', 0.03), ('lj', 0.03), ('collaborative', 0.028), ('lda', 0.028), ('shan', 0.028), ('eye', 0.028), ('uk', 0.028), ('identity', 0.028), ('identified', 0.027), ('misalignments', 0.027), ('nst', 0.027), ('mutually', 0.026), ('kh', 0.026), ('iinn', 0.026), ('misaligned', 0.026), ('eyes', 0.026), ('subject', 0.026), ('frame', 0.026), ('fine', 0.026), ('nonlinearity', 0.026)]
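The page does not state how the simValue column in the lists below is derived from these word weights. A common choice for tf-idf vectors is cosine similarity, so the following is a minimal sketch under that assumption; the first vector reuses the top four weights listed above, while the second vector and its weights are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {word: weight}."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Top four weights from the tf-idf list above.
this_paper = {"gallery": 0.386, "faces": 0.376, "mrr": 0.35, "alignment": 0.327}
# Made-up weights for a hypothetical neighbouring paper.
other_paper = {"gallery": 0.30, "faces": 0.40, "keypoints": 0.25}

self_sim = cosine_similarity(this_paper, this_paper)
cross_sim = cosine_similarity(this_paper, other_paper)
```

Under this assumption a paper compared with itself scores approximately 1, which is consistent with the same-paper row in the lists below.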

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition


2 0.34173107 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition

Author: Renliang Weng, Jiwen Lu, Junlin Hu, Gao Yang, Yap-Peng Tan

Abstract: Over the past two decades, a number of face recognition methods have been proposed in the literature. Most of them use holistic face images to recognize people. However, human faces are easily occluded by other objects in many real-world scenarios and we have to recognize the person of interest from his/her partial faces. In this paper, we propose a new partial face recognition approach by using feature set matching, which is able to align partial face patches to holistic gallery faces automatically and is robust to occlusions and illumination changes. Given each gallery image and probe face patch, we first detect keypoints and extract their local features. Then, we propose a Metric Learned Extended Robust Point Matching (MLERPM) method to discriminatively match local feature sets of a pair of gallery and probe samples. Lastly, the similarity of two faces is converted as the distance between two feature sets. Experimental results on three public face databases are presented to show the effectiveness of the proposed approach.

3 0.29905197 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition

Author: Yizhe Zhang, Ming Shao, Edward K. Wong, Yun Fu

Abstract: One of the most challenging tasks in face recognition is to identify people with varied poses. Namely, the test faces have significantly different poses compared with the registered faces. In this paper, we propose a high-level feature learning scheme to extract pose-invariant identity feature for face recognition. First, we build a single-hidden-layer neural network with sparse constraint, to extract pose-invariant feature in a supervised fashion. Second, we further enhance the discriminative capability of the proposed feature by using multiple random faces as the target values for multiple encoders. By enforcing the target values to be unique for input faces over different poses, the learned high-level feature that is represented by the neurons in the hidden layer is pose free and only relevant to the identity information. Finally, we conduct face identification on CMU MultiPIE, and verification on Labeled Faces in the Wild (LFW) databases, where identification rank-1 accuracy and face verification accuracy with ROC curve are reported. These experiments demonstrate that our model is superior to other state-of-the-art approaches on handling pose variations.

4 0.20744832 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person

Author: Meng Yang, Luc Van Gool, Lei Zhang

Abstract: Face recognition (FR) with a single training sample per person (STSPP) is a very challenging problem due to the lack of information to predict the variations in the query sample. Sparse representation based classification has shown interesting results in robust FR; however, its performance will deteriorate much for FR with STSPP. To address this issue, in this paper we learn a sparse variation dictionary from a generic training set to improve the query sample representation by STSPP. Instead of learning from the generic training set independently w.r.t. the gallery set, the proposed sparse variation dictionary learning (SVDL) method is adaptive to the gallery set by jointly learning a projection to connect the generic training set with the gallery set. The learnt sparse variation dictionary can be easily integrated into the framework of sparse representation based classification so that various variations in face images, including illumination, expression, occlusion, pose, etc., can be better handled. Experiments on the large-scale CMU Multi-PIE, FRGC and LFW databases demonstrate the promising performance of SVDL on FR with STSPP.

5 0.20022395 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation

Author: Chunxiao Liu, Chen Change Loy, Shaogang Gong, Guijin Wang

Abstract: Owing to visual ambiguities and disparities, person re-identification methods inevitably produce suboptimal rank-list, which still requires exhaustive human eyeballing to identify the correct target from hundreds of different likely candidates. Existing re-identification studies focus on improving the ranking performance, but rarely look into the critical problem of optimising the time-consuming and error-prone post-rank visual search at the user end. In this study, we present a novel one-shot Post-rank OPtimisation (POP) method, which allows a user to quickly refine their search by either “one-shot” or a couple of sparse negative selections during a re-identification process. We conduct systematic behavioural studies to understand user’s searching behaviour and show that the proposed method allows correct re-identification to converge 2.6 times faster than the conventional exhaustive search. Importantly, through extensive evaluations we demonstrate that the method is capable of achieving significant improvement over the state-of-the-art distance metric learning based ranking models, even with just “one shot” feedback optimisation, by as much as over 30% performance improvement for rank 1 re-identification on the VIPeR and i-LIDS datasets.

6 0.19976221 169 iccv-2013-Fine-Grained Categorization by Alignments

7 0.19237064 157 iccv-2013-Fast Face Detector Training Using Tailored Views

8 0.17168029 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification

9 0.15970077 444 iccv-2013-Viewing Real-World Faces in 3D

10 0.14561464 339 iccv-2013-Rank Minimization across Appearance and Shape for AAM Ensemble Fitting

11 0.11985205 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications

12 0.11488962 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification

13 0.11467555 195 iccv-2013-Hidden Factor Analysis for Age Invariant Face Recognition

14 0.11356188 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

15 0.11265825 180 iccv-2013-From Where and How to What We See

16 0.11075829 321 iccv-2013-Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model

17 0.10840603 106 iccv-2013-Deep Learning Identity-Preserving Face Space

18 0.10696218 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification

19 0.10452343 393 iccv-2013-Simultaneous Clustering and Tracklet Linking for Multi-face Tracking in Videos

20 0.096809946 392 iccv-2013-Similarity Metric Learning for Face Recognition


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.207), (1, 0.045), (2, -0.105), (3, -0.074), (4, -0.099), (5, -0.107), (6, 0.225), (7, 0.096), (8, 0.018), (9, 0.018), (10, 0.008), (11, 0.073), (12, 0.066), (13, 0.013), (14, -0.095), (15, -0.013), (16, -0.056), (17, -0.004), (18, -0.064), (19, 0.003), (20, -0.073), (21, -0.184), (22, -0.012), (23, -0.133), (24, 0.112), (25, 0.155), (26, -0.129), (27, 0.145), (28, 0.005), (29, -0.057), (30, 0.077), (31, -0.184), (32, -0.053), (33, -0.053), (34, -0.018), (35, -0.032), (36, -0.073), (37, 0.006), (38, 0.065), (39, 0.001), (40, 0.004), (41, 0.01), (42, -0.012), (43, 0.002), (44, -0.086), (45, 0.08), (46, 0.05), (47, 0.086), (48, 0.002), (49, -0.098)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95320153 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition


2 0.87069309 356 iccv-2013-Robust Feature Set Matching for Partial Face Recognition

Author: Renliang Weng, Jiwen Lu, Junlin Hu, Gao Yang, Yap-Peng Tan

Abstract: Over the past two decades, a number of face recognition methods have been proposed in the literature. Most of them use holistic face images to recognize people. However, human faces are easily occluded by other objects in many real-world scenarios and we have to recognize the person of interest from his/her partial faces. In this paper, we propose a new partial face recognition approach by using feature set matching, which is able to align partial face patches to holistic gallery faces automatically and is robust to occlusions and illumination changes. Given each gallery image and probe face patch, we first detect keypoints and extract their local features. Then, we propose a Metric Learned Extended Robust Point Matching (MLERPM) method to discriminatively match local feature sets of a pair of gallery and probe samples. Lastly, the similarity of two faces is converted as the distance between two feature sets. Experimental results on three public face databases are presented to show the effectiveness of the proposed approach.

3 0.72841674 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person

Author: Meng Yang, Luc Van Gool, Lei Zhang

Abstract: Face recognition (FR) with a single training sample per person (STSPP) is a very challenging problem due to the lack of information to predict the variations in the query sample. Sparse representation based classification has shown interesting results in robust FR; however, its performance will deteriorate much for FR with STSPP. To address this issue, in this paper we learn a sparse variation dictionary from a generic training set to improve the query sample representation by STSPP. Instead of learning from the generic training set independently w.r.t. the gallery set, the proposed sparse variation dictionary learning (SVDL) method is adaptive to the gallery set by jointly learning a projection to connect the generic training set with the gallery set. The learnt sparse variation dictionary can be easily integrated into the framework of sparse representation based classification so that various variations in face images, including illumination, expression, occlusion, pose, etc., can be better handled. Experiments on the large-scale CMU Multi-PIE, FRGC and LFW databases demonstrate the promising performance of SVDL on FR with STSPP.

4 0.72252798 261 iccv-2013-Markov Network-Based Unified Classifier for Face Identification

Author: Wonjun Hwang, Kyungshik Roh, Junmo Kim

Abstract: We propose a novel unifying framework using a Markov network to learn the relationship between multiple classifiers in face recognition. We assume that we have several complementary classifiers and assign observation nodes to the features of a query image and hidden nodes to the features of gallery images. We connect each hidden node to its corresponding observation node and to the hidden nodes of other neighboring classifiers. For each observation-hidden node pair, we collect a set of gallery candidates that are most similar to the observation instance, and the relationship between the hidden nodes is captured in terms of the similarity matrix between the collected gallery images. Posterior probabilities in the hidden nodes are computed by the belief-propagation algorithm. The novelty of the proposed framework is the method that takes into account the classifier dependency using the results of each neighboring classifier. We present extensive results on two different evaluation protocols, known and unknown image variation tests, using three different databases, which shows that the proposed framework always leads to good accuracy in face recognition.

5 0.72130233 154 iccv-2013-Face Recognition via Archetype Hull Ranking

Author: Yuanjun Xiong, Wei Liu, Deli Zhao, Xiaoou Tang

Abstract: The archetype hull model is playing an important role in large-scale data analytics and mining, but rarely applied to vision problems. In this paper, we migrate such a geometric model to address face recognition and verification together through proposing a unified archetype hull ranking framework. Upon a scalable graph characterized by a compact set of archetype exemplars whose convex hull encompasses most of the training images, the proposed framework explicitly captures the relevance between any query and the stored archetypes, yielding a rank vector over the archetype hull. The archetype hull ranking is then executed on every block of face images to generate a blockwise similarity measure that is achieved by comparing two different rank vectors with respect to the same archetype hull. After integrating blockwise similarity measurements with learned importance weights, we accomplish a sensible face similarity measure which can support robust and effective face recognition and verification. We evaluate the face similarity measure in terms of experiments performed on three benchmark face databases Multi-PIE, Pubfig83, and LFW, demonstrating its performance superior to the state-of-the-arts.

6 0.70386755 195 iccv-2013-Hidden Factor Analysis for Age Invariant Face Recognition

7 0.69741416 335 iccv-2013-Random Faces Guided Sparse Many-to-One Encoder for Pose-Invariant Face Recognition

8 0.6754182 106 iccv-2013-Deep Learning Identity-Preserving Face Space

9 0.60861641 267 iccv-2013-Model Recommendation with Virtual Probes for Egocentric Hand Detection

10 0.58403939 157 iccv-2013-Fast Face Detector Training Using Tailored Views

11 0.56434685 206 iccv-2013-Hybrid Deep Learning for Face Verification

12 0.56116927 272 iccv-2013-Modifying the Memorability of Face Photographs

13 0.55700076 158 iccv-2013-Fast High Dimensional Vector Multiplication Face Recognition

14 0.55124235 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

15 0.55098104 305 iccv-2013-POP: Person Re-identification Post-rank Optimisation

16 0.54900581 393 iccv-2013-Simultaneous Clustering and Tracklet Linking for Multi-face Tracking in Videos

17 0.54408824 153 iccv-2013-Face Recognition Using Face Patch Networks

18 0.53912497 84 iccv-2013-Complex 3D General Object Reconstruction from Line Drawings

19 0.50769889 392 iccv-2013-Similarity Metric Learning for Face Recognition

20 0.50514364 14 iccv-2013-A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.062), (7, 0.018), (12, 0.015), (26, 0.059), (27, 0.011), (31, 0.054), (32, 0.176), (34, 0.017), (35, 0.016), (42, 0.146), (48, 0.015), (64, 0.067), (73, 0.028), (77, 0.01), (89, 0.183), (98, 0.035)]
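
The listing does not state how simValue is derived from these topic weights. As a rough illustration only, the sketch below compares two sparse (topicId, topicWeight) lists with cosine similarity, which is one common choice and is an assumption here. The function name cosine_similarity and the vector other_paper are hypothetical; only this_paper is copied from the weights above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as (topicId, weight) lists."""
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb[t] for t, w in wa.items() if t in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Topic weights for this paper, copied from the listing above.
this_paper = [(2, 0.062), (7, 0.018), (12, 0.015), (26, 0.059), (27, 0.011),
              (31, 0.054), (32, 0.176), (34, 0.017), (35, 0.016), (42, 0.146),
              (48, 0.015), (64, 0.067), (73, 0.028), (77, 0.01), (89, 0.183),
              (98, 0.035)]

# Hypothetical topic weights for some other paper (made up for illustration).
other_paper = [(32, 0.21), (42, 0.05), (89, 0.12), (90, 0.3)]

score = cosine_similarity(this_paper, other_paper)
```

Under this measure a vector compared with itself scores 1.0, whereas the same-paper row below reports a lower value, so the listing presumably uses a different or approximate computation.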

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.86520118 97 iccv-2013-Coupling Alignments with Recognition for Still-to-Video Face Recognition

Author: Zhiwu Huang, Xiaowei Zhao, Shiguang Shan, Ruiping Wang, Xilin Chen

Abstract: The Still-to-Video (S2V) face recognition systems typically need to match faces in low-quality videos captured under unconstrained conditions against high quality still face images, which is very challenging because of noise, image blur, low face resolutions, varying head pose, complex lighting, and alignment difficulty. To address the problem, one solution is to select the frames of ‘best quality’ from videos (hereinafter called quality alignment in this paper). Meanwhile, the faces in the selected frames should also be geometrically aligned to the still faces offline well-aligned in the gallery. In this paper, we discover that the interactions among the three tasks–quality alignment, geometric alignment and face recognition–can benefit from each other, thus should be performed jointly. With this in mind, we propose a Coupling Alignments with Recognition (CAR) method to tightly couple these tasks via low-rank regularized sparse representation in a unified framework. Our method makes the three tasks promote mutually by a joint optimization in an Augmented Lagrange Multiplier routine. Extensive experiments on two challenging S2V datasets demonstrate that our method outperforms the state-of-the-art methods impressively.

2 0.86238813 77 iccv-2013-Codemaps - Segment, Classify and Search Objects Locally

Author: Zhenyang Li, Efstratios Gavves, Koen E.A. van de Sande, Cees G.M. Snoek, Arnold W.M. Smeulders

Abstract: In this paper we aim for segmentation and classification of objects. We propose codemaps that are a joint formulation of the classification score and the local neighborhood it belongs to in the image. We obtain the codemap by reordering the encoding, pooling and classification steps over lattice elements. Other than existing linear decompositions who emphasize only the efficiency benefits for localized search, we make three novel contributions. As a preliminary, we provide a theoretical generalization of the sufficient mathematical conditions under which image encodings and classification becomes locally decomposable. As first novelty we introduce ℓ2 normalization for arbitrarily shaped image regions, which is fast enough for semantic segmentation using our Fisher codemaps. Second, using the same lattice across images, we propose kernel pooling which embeds nonlinearities into codemaps for object classification by explicit or approximate feature mappings. Results demonstrate that ℓ2 normalized Fisher codemaps improve the state-of-the-art in semantic segmentation for PASCAL VOC. For object classification the addition of nonlinearities brings us on par with the state-of-the-art, but is 3x faster. Because of the codemaps’ inherent efficiency, we can reach significant speed-ups for localized search as well. We exploit the efficiency gain for our third novelty: object segment retrieval using a single query image only.

3 0.84486389 255 iccv-2013-Local Signal Equalization for Correspondence Matching

Author: Derek Bradley, Thabo Beeler

Abstract: Correspondence matching is one of the most common problems in computer vision, and it is often solved using photo-consistency of local regions. These approaches typically assume that the frequency content in the local region is consistent in the image pair, such that matching is performed on similar signals. However, in many practical situations this is not the case, for example with low depth of field cameras a scene point may be out of focus in one view and in-focus in the other, causing a mismatch of frequency signals. Furthermore, this mismatch can vary spatially over the entire image. In this paper we propose a local signal equalization approach for correspondence matching. Using a measure of local image frequency, we equalize local signals using an efficient scale-space image representation such that their frequency contents are optimally suited for matching. Our approach allows better correspondence matching, which we demonstrate with a number of stereo reconstruction examples on synthetic and real datasets.

4 0.83725953 223 iccv-2013-Joint Noise Level Estimation from Personal Photo Collections

Author: Yichang Shih, Vivek Kwatra, Troy Chinen, Hui Fang, Sergey Ioffe

Abstract: Personal photo albums are heavily biased towards faces of people, but most state-of-the-art algorithms for image denoising and noise estimation do not exploit facial information. We propose a novel technique for jointly estimating noise levels of all face images in a photo collection. Photos in a personal album are likely to contain several faces of the same people. While some of these photos would be clean and high quality, others may be corrupted by noise. Our key idea is to estimate noise levels by comparing multiple images of the same content that differ predominantly in their noise content. Specifically, we compare geometrically and photometrically aligned face images of the same person. Our estimation algorithm is based on a probabilistic formulation that seeks to maximize the joint probability of estimated noise levels across all images. We propose an approximate solution that decomposes this joint maximization into a two-stage optimization. The first stage determines the relative noise between pairs of images by pooling estimates from corresponding patch pairs in a probabilistic fashion. The second stage then jointly optimizes for all absolute noise parameters by conditioning them upon relative noise levels, which allows for a pairwise factorization of the probability distribution. We evaluate our noise estimation method using quantitative experiments to measure accuracy on synthetic data. Additionally, we employ the estimated noise levels for automatic denoising using “BM3D”, and evaluate the quality of denoising on real-world photos through a user study.

5 0.82337022 328 iccv-2013-Probabilistic Elastic Part Model for Unsupervised Face Detector Adaptation

Author: Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

Abstract: We propose an unsupervised detector adaptation algorithm to adapt any offline trained face detector to a specific collection of images, and hence achieve better accuracy. The core of our detector adaptation algorithm is a probabilistic elastic part (PEP) model, which is offline trained with a set of face examples. It produces a statistically aligned part based face representation, namely the PEP representation. To adapt a general face detector to a collection of images, we compute the PEP representations of the candidate detections from the general face detector, and then train a discriminative classifier with the top positives and negatives. Then we re-rank all the candidate detections with this classifier. This way, a face detector tailored to the statistics of the specific image collection is adapted from the original detector. We present extensive results on three datasets with two state-of-the-art face detectors. The significant improvement of detection accuracy over these state-of-the-art face detectors strongly demonstrates the efficacy of the proposed face detector adaptation algorithm.

6 0.81959885 427 iccv-2013-Transfer Feature Learning with Joint Distribution Adaptation

7 0.81929803 26 iccv-2013-A Practical Transfer Learning Algorithm for Face Verification

8 0.81640816 44 iccv-2013-Adapting Classification Cascades to New Domains

9 0.8157807 124 iccv-2013-Domain Transfer Support Vector Ranking for Person Re-identification without Target Camera Label Information

10 0.81541169 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation

11 0.81482518 52 iccv-2013-Attribute Adaptation for Personalized Image Search

12 0.8147794 438 iccv-2013-Unsupervised Visual Domain Adaptation Using Subspace Alignment

13 0.81413567 338 iccv-2013-Randomized Ensemble Tracking

14 0.81326658 277 iccv-2013-Multi-channel Correlation Filters

15 0.81321573 181 iccv-2013-Frustratingly Easy NBNN Domain Adaptation

16 0.81291211 187 iccv-2013-Group Norm for Learning Structured SVMs with Unstructured Latent Variables

17 0.81240696 157 iccv-2013-Fast Face Detector Training Using Tailored Views

18 0.8120535 123 iccv-2013-Domain Adaptive Classification

19 0.81143069 180 iccv-2013-From Where and How to What We See

20 0.81082577 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps