nips nips2001 nips2001-74 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ming-Hsuan Yang
Abstract: Principal Component Analysis and Fisher Linear Discriminant methods have demonstrated their success in face detection, recognition, and tracking. The representation in these subspace methods is based on second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Recently Higher Order Statistics and Independent Component Analysis (ICA) have been used as informative low dimensional representations for visual recognition. In this paper, we investigate the use of Kernel Principal Component Analysis and Kernel Fisher Linear Discriminant for learning low dimensional representations for face recognition, which we call Kernel Eigenface and Kernel Fisherface methods. While Eigenface and Fisherface methods aim to find projection directions based on the second order correlation of samples, Kernel Eigenface and Kernel Fisherface methods provide generalizations which take higher order correlations into account. We compare the performance of kernel methods with Eigenface, Fisherface and ICA-based methods for face recognition with variation in pose, scale, lighting and expression. Experimental results show that kernel methods provide better representations and achieve lower error rates for face recognition. 1 Motivation and Approach Subspace methods have been applied successfully in numerous visual recognition tasks such as face localization, face recognition, 3D object recognition, and tracking. In particular, Principal Component Analysis (PCA) [20] [13], and Fisher Linear Discriminant (FLD) methods [6] have been applied to face recognition with impressive results. While PCA aims to extract a subspace in which the variance is maximized (or the reconstruction error is minimized), some unwanted variations (due to lighting, facial expressions, viewing points, etc.) may be retained (See [8] for examples). It has been observed that in face recognition the variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to the changes in face identity [1]. Therefore, while the PCA projections are optimal in a correlation sense (or for reconstruction
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Principal Component Analysis and Fisher Linear Discriminant methods have demonstrated their success in face detection, recognition, and tracking. [sent-2, score-0.445]
2 The representation in these subspace methods is based on second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. [sent-3, score-0.568]
3 Recently Higher Order Statistics and Independent Component Analysis (ICA) have been used as informative low dimensional representations for visual recognition. [sent-4, score-0.194]
4 In this paper, we investigate the use of Kernel Principal Component Analysis and Kernel Fisher Linear Discriminant for learning low dimensional representations for face recognition, which we call Kernel Eigenface and Kernel Fisherface methods. [sent-5, score-0.601]
5 While Eigenface and Fisherface methods aim to find projection directions based on the second order correlation of samples, Kernel Eigenface and Kernel Fisherface methods provide generalizations which take higher order correlations into account. [sent-6, score-0.383]
6 We compare the performance of kernel methods with Eigenface, Fisherface and ICA-based methods for face recognition with variation in pose, scale, lighting and expression. [sent-7, score-1.017]
7 Experimental results show that kernel methods provide better representations and achieve lower error rates for face recognition. [sent-8, score-0.795]
8 1 Motivation and Approach Subspace methods have been applied successfully in numerous visual recognition tasks such as face localization, face recognition, 3D object recognition, and tracking. [sent-9, score-1.152]
9 In particular, Principal Component Analysis (PCA) [20] [13], and Fisher Linear Discriminant (FLD) methods [6] have been applied to face recognition with impressive results. [sent-10, score-0.637]
10 While PCA aims to extract a subspace in which the variance is maximized (or the reconstruction error is minimized), some unwanted variations (due to lighting, facial expressions, viewing points, etc.) may be retained (See [8] for examples). [sent-11, score-0.391]
11 It has been observed that in face recognition the variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to the changes in face identity [1]. [sent-13, score-1.693]
12 Therefore, while the PCA projections are optimal in a correlation sense (or for reconstruction from a low dimensional subspace), these eigenvectors or bases may be suboptimal from the classification viewpoint. [sent-14, score-0.224]
13 Representations of Eigenface [20] (based on PCA) and Fisherface [6] (based on FLD) methods encode the pattern information based on the second order dependencies, [sent-15, score-0.089]
14 i.e., pixelwise covariance among the pixels, and are insensitive to the dependencies among multiple (more than two) pixels in the samples. [sent-17, score-0.3]
15 Higher order dependencies in an image include nonlinear relations among the pixel intensity values, such as the relationships among three or more pixels in an edge or a curve, which can capture important information for recognition. [sent-18, score-0.493]
16 Several researchers have conjectured that higher order statistics may be crucial to better represent complex patterns. [sent-19, score-0.172]
17 Recently, Higher Order Statistics (HOS) have been applied to visual learning problems. [sent-20, score-0.046]
18 Rajagopalan et al. use HOS of the images of a target object to get a better approximation of an unknown distribution. [sent-21, score-0.345]
19 Experiments on face detection [16] and vehicle detection [15] show results comparable to, if not better than, those of other PCA-based methods. [sent-22, score-0.491]
20 The concept of Independent Component Analysis (ICA) maximizes the degree of statistical independence of output variables using contrast functions such as Kullback-Leibler divergence, negentropy, and cumulants [9] [10]. [sent-23, score-0.063]
21 A neural network algorithm to carry out ICA was proposed by Bell and Sejnowski [7], and was applied to face recognition [3]. [sent-24, score-0.587]
22 Although the idea of computing higher order moments in the ICA-based face recognition method is attractive, the assumption that the face images comprise a set of independent basis images (or factorial codes) is not intuitively clear. [sent-25, score-1.286]
23 In [3], Bartlett et al. showed that the ICA representation outperforms the PCA representation in face recognition using a subset of frontal FERET face images. [sent-26, score-1.318]
24 However, Moghaddam recently showed that the ICA representation does not provide a significant advantage over PCA [12]. [sent-27, score-0.11]
25 The experimental results suggest that seeking non-Gaussian and independent components may not necessarily yield a better representation for face recognition. [sent-28, score-0.524]
26 In [18], Schölkopf et al. extended conventional PCA to Kernel Principal Component Analysis (KPCA). [sent-29, score-0.253]
27 Empirical results on digit recognition using the MNIST data set and object recognition using a database of rendered chair images showed that Kernel PCA is able to extract nonlinear features and thus provided better recognition results. [sent-30, score-1.027]
28 Recently Baudat and Anouar, Roth and Steinhage, and Mika et al. applied kernel tricks to FLD and proposed the Kernel Fisher Linear Discriminant (KFLD) method [11] [17] [5]. [sent-31, score-0.494]
29 Their experiments showed that KFLD is able to extract the most discriminant features in the feature space, which is equivalent to extracting the most discriminant nonlinear features in the original input space. [sent-32, score-0.559]
30 In this paper we seek a method that not only extracts higher order statistics of samples as features, but also maximizes the class separation when we project these features to a lower dimensional space for efficient recognition. [sent-33, score-0.434]
31 Meanwhile, we explain why kernel methods are suitable for visual recognition tasks such as face recognition. [sent-35, score-0.979]
32 2 Kernel Principal Component Analysis Given a set of $m$ centered (zero mean, unit variance) samples $\mathbf{x}_k$, $\mathbf{x}_k = [x_{k1}, \ldots, x_{kn}]^T \in \mathbb{R}^n$, [sent-36, score-0.113]
33 PCA aims to find the projection directions that maximize the variance, which is equivalent to finding the eigenvalues of the covariance matrix $C$: $\lambda \mathbf{w} = C \mathbf{w}$ (1), for eigenvalues $\lambda \geq 0$ and eigenvectors $\mathbf{w} \in \mathbb{R}^n$. [sent-39, score-0.473]
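As a concrete illustration of the eigenvalue formulation in (1), a minimal numpy sketch follows; the function name and the toy data are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Minimal sketch of the PCA eigenvalue problem lambda * w = C * w in (1).
# X is an m x n array whose rows are the centered samples x_k.
def pca_directions(X, q):
    m = X.shape[0]
    C = (X.T @ X) / m                          # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]      # keep the q largest-variance directions
    return eigvals[order], eigvecs[:, order]

# Toy usage: 100 random 20-dimensional samples, centered to zero mean.
X = np.random.randn(100, 20)
X -= X.mean(axis=0)
variances, W = pca_directions(X, q=5)
projections = X @ W                            # project samples onto the subspace
```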
34 In Kernel PCA, each vector $\mathbf{x}$ is projected from the input space, $\mathbb{R}^n$, to a high dimensional feature space, $\mathbb{R}^f$, by a nonlinear mapping function $\Phi: \mathbb{R}^n \to \mathbb{R}^f$, $f > n$. [sent-40, score-0.236]
35 Note that the dimensionality of the feature space can be arbitrarily large. [sent-41, score-0.027]
36 In $\mathbb{R}^f$, the corresponding eigenvalue problem is $\lambda \mathbf{w}^{\Phi} = C^{\Phi} \mathbf{w}^{\Phi}$ (2), where $C^{\Phi}$ is the covariance matrix of the mapped samples. [sent-42, score-0.032]
37 The eigenvectors $\mathbf{w}^{\Phi}$ lie in the span of $\Phi(\mathbf{x}_1), \ldots, \Phi(\mathbf{x}_m)$, and there exist coefficients $\alpha_i$ such that $\mathbf{w}^{\Phi} = \sum_{i=1}^{m} \alpha_i \Phi(\mathbf{x}_i)$ (3). Denoting an $m \times m$ matrix $K$ by $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j)$ (4), the Kernel PCA problem becomes $m \lambda K \boldsymbol{\alpha} = K^2 \boldsymbol{\alpha}$ (5), which reduces to $m \lambda \boldsymbol{\alpha} = K \boldsymbol{\alpha}$ (6), where $\boldsymbol{\alpha}$ denotes a column vector with entries $\alpha_1, \ldots, \alpha_m$. [sent-46, score-0.108]
38 The above derivations assume that all the projected samples $\Phi(\mathbf{x})$ are centered in $\mathbb{R}^f$. [sent-50, score-0.215]
39 Note that conventional PCA is a special case of Kernel PCA with a polynomial kernel of first order. [sent-52, score-0.323]
40 In other words, Kernel PCA is a generalization of conventional PCA since different kernels can be utilized for different nonlinear projections. [sent-53, score-0.168]
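To illustrate this flexibility, two commonly used kernel functions are sketched below; the parameter values are placeholders rather than the settings used in the paper's experiments.

```python
import numpy as np

# Two commonly used kernel functions that could be plugged into the Gram matrix K of (4).
def polynomial_kernel(x, y, d=2):
    # d = 1 gives the plain dot product, i.e. conventional (linear) PCA as a special case
    return (x @ y) ** d

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))
```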
41 We can now project the vectors in $\mathbb{R}^f$ onto a lower dimensional space spanned by the eigenvectors $\mathbf{w}^{\Phi}$. Let $\mathbf{x}$ be a test sample whose projection is $\Phi(\mathbf{x})$ in $\mathbb{R}^f$; then the projection of $\Phi(\mathbf{x})$ onto the eigenvectors $\mathbf{w}^{\Phi}$ gives the nonlinear principal components corresponding to $\Phi$: [sent-54, score-0.769]
42 $\mathbf{w}^{\Phi} \cdot \Phi(\mathbf{x}) = \sum_{i=1}^{m} \alpha_i \left( \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}) \right) = \sum_{i=1}^{m} \alpha_i k(\mathbf{x}_i, \mathbf{x})$ (7). In other words, we can extract the first $q$ ($1 \leq q \leq m$) nonlinear principal components (i.e., eigenvectors $\mathbf{w}^{\Phi}$) [sent-56, score-0.309]
43 using the kernel function without the expensive operation that explicitly projects the samples to the high dimensional space $\mathbb{R}^f$. The first $q$ components correspond to the first $q$ non-increasing eigenvalues of (6). [sent-57, score-0.721]
44 For face recognition where each x encodes a face image, we call the extracted nonlinear principal components Kernel Eigenfaces. [sent-58, score-1.253]
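Putting equations (4)-(7) together, a hypothetical numpy sketch of Kernel Eigenface training and projection might look as follows. It is illustrative only under stated assumptions (toy random data, a placeholder polynomial kernel, and invented function names), not the authors' code, and it defers the feature-space centering discussed above to the sketch that follows the next sentence.

```python
import numpy as np

def kernel_eigenfaces(X, kernel, q):
    """Sketch of Kernel PCA ('Kernel Eigenface') training following (4)-(6).

    X      : m x n array, one image vector per row (toy input here).
    kernel : kernel function k(x, y), e.g. one of the sketches above.
    q      : number of nonlinear principal components to keep.
    """
    # Gram matrix K_ij = k(x_i, x_j) of (4); for exactness it should be
    # centered in feature space (see the centering sketch below).
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])

    # Eigenproblem of K, equivalent to m * lambda * alpha = K * alpha in (6).
    eigvals, alphas = np.linalg.eigh(K)            # ascending order
    order = np.argsort(eigvals)[::-1][:q]          # q largest eigenvalues
    eigvals, alphas = eigvals[order], alphas[:, order]

    # Scale each alpha so the corresponding w_Phi = sum_i alpha_i Phi(x_i)
    # has unit norm in feature space, i.e. alpha^T K alpha = 1.
    alphas = alphas / np.sqrt(np.maximum(eigvals, 1e-12))
    return alphas

def kernel_project(X_train, alphas, kernel, x):
    """Nonlinear principal components of a test sample x, following (7)."""
    k_vec = np.array([kernel(xi, x) for xi in X_train])
    return alphas.T @ k_vec                        # q-dimensional feature vector

# Toy usage with random "images"; real experiments would use face image vectors.
poly2 = lambda a, b: (a @ b + 1.0) ** 2
X = np.random.randn(40, 256)
alphas = kernel_eigenfaces(X, poly2, q=10)
features = kernel_project(X, alphas, poly2, X[0])
```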
45 3 Kernel Fisher Linear Discriminant Similar to the derivations in Kernel PCA, we assume the projected samples $\Phi(\mathbf{x})$ are centered in $\mathbb{R}^f$ (see [18] for a method to center the vectors $\Phi(\mathbf{x})$ in $\mathbb{R}^f$), and we formulate the equations for FLD in a way that uses dot products only. [sent-59, score-0.215]
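For reference, the centering assumed here and in the Kernel PCA derivation can be written as the standard double-centering of the Gram matrix (as in [18]); the sketch below is illustrative rather than a transcription of the paper.

```python
import numpy as np

def center_gram(K):
    """Center a Gram matrix in feature space (standard double-centering):
    K_c = K - 1_m K - K 1_m + 1_m K 1_m, where 1_m is the m x m matrix
    with every entry equal to 1/m."""
    m = K.shape[0]
    one_m = np.full((m, m), 1.0 / m)
    return K - one_m @ K - K @ one_m + one_m @ K @ one_m
```

An analogous formula centers the kernel values of a test sample against the training samples before applying (7).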
wordName wordTfidf (topN-words)
[('face', 0.395), ('pca', 0.298), ('eigenface', 0.296), ('fisherface', 0.296), ('kernel', 0.268), ('fld', 0.254), ('rf', 0.198), ('recognition', 0.192), ('wei', 0.184), ('ale', 0.169), ('ica', 0.148), ('discriminant', 0.137), ('eigenvectors', 0.127), ('principal', 0.116), ('dimensional', 0.097), ('fisher', 0.094), ('nonlinear', 0.085), ('hos', 0.085), ('kfld', 0.085), ('dependencies', 0.082), ('eigenvalues', 0.075), ('pixels', 0.072), ('images', 0.07), ('extract', 0.069), ('subspace', 0.066), ('higher', 0.063), ('lighting', 0.062), ('image', 0.062), ('projection', 0.057), ('samples', 0.057), ('among', 0.057), ('centered', 0.056), ('conventional', 0.055), ('ai', 0.054), ('projected', 0.054), ('component', 0.053), ('viewing', 0.051), ('representations', 0.051), ('variations', 0.051), ('methods', 0.05), ('aims', 0.049), ('detection', 0.048), ('derivations', 0.048), ('object', 0.046), ('visual', 0.046), ('showed', 0.045), ('features', 0.043), ('rn', 0.042), ('xk', 0.042), ('denoting', 0.041), ('relationships', 0.039), ('reconstruction', 0.039), ('order', 0.039), ('statistics', 0.039), ('components', 0.039), ('moghaddam', 0.037), ('roth', 0.037), ('aik', 0.037), ('ethod', 0.037), ('kpca', 0.037), ('meanwhile', 0.037), ('mountain', 0.037), ('unwanted', 0.037), ('project', 0.037), ('xi', 0.037), ('ka', 0.033), ('sij', 0.033), ('dependences', 0.033), ('rajagopalan', 0.033), ('recently', 0.033), ('representation', 0.032), ('covariance', 0.032), ('maximizes', 0.032), ('projects', 0.031), ('chair', 0.031), ('rendered', 0.031), ('comprise', 0.031), ('factorial', 0.031), ('honda', 0.031), ('illumination', 0.031), ('cumulants', 0.031), ('localization', 0.031), ('call', 0.031), ('better', 0.031), ('find', 0.031), ('frontal', 0.029), ('facial', 0.029), ('et', 0.029), ('utilized', 0.028), ('tricks', 0.028), ('mika', 0.028), ('tasks', 0.028), ('directions', 0.027), ('space', 0.027), ('investigate', 0.027), ('seeking', 0.027), ('xm', 0.027), ('generalizations', 0.027), ('mnist', 0.027), ('retained', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 74 nips-2001-Face Recognition Using Kernel Methods
Author: Ming-Hsuan Yang
Abstract: Principal Component Analysis and Fisher Linear Discriminant methods have demonstrated their success in face detection, recognition, and tracking. The representation in these subspace methods is based on second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Recently Higher Order Statistics and Independent Component Analysis (ICA) have been used as informative low dimensional representations for visual recognition. In this paper, we investigate the use of Kernel Principal Component Analysis and Kernel Fisher Linear Discriminant for learning low dimensional representations for face recognition, which we call Kernel Eigenface and Kernel Fisherface methods. While Eigenface and Fisherface methods aim to find projection directions based on the second order correlation of samples, Kernel Eigenface and Kernel Fisherface methods provide generalizations which take higher order correlations into account. We compare the performance of kernel methods with Eigenface, Fisherface and ICA-based methods for face recognition with variation in pose, scale, lighting and expression. Experimental results show that kernel methods provide better representations and achieve lower error rates for face recognition. 1 Motivation and Approach Subspace methods have been applied successfully in numerous visual recognition tasks such as face localization, face recognition, 3D object recognition, and tracking. In particular, Principal Component Analysis (PCA) [20] [13] ,and Fisher Linear Discriminant (FLD) methods [6] have been applied to face recognition with impressive results. While PCA aims to extract a subspace in which the variance is maximized (or the reconstruction error is minimized), some unwanted variations (due to lighting, facial expressions, viewing points, etc.) may be retained (See [8] for examples). It has been observed that in face recognition the variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to the changes in face identity [1]. Therefore, while the PCA projections are optimal in a correlation sense (or for reconstruction
2 0.2039353 164 nips-2001-Sampling Techniques for Kernel Methods
Author: Dimitris Achlioptas, Frank Mcsherry, Bernhard Schölkopf
Abstract: We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations. Rather intriguingly, all three techniques can be viewed as instantiations of the following idea: replace the kernel function by a “randomized kernel” which behaves like in expectation.
3 0.18043257 46 nips-2001-Categorization by Learning and Combining Object Parts
Author: Bernd Heisele, Thomas Serre, Massimiliano Pontil, Thomas Vetter, Tomaso Poggio
Abstract: We describe an algorithm for automatically learning discriminative components of objects with SVM classifiers. It is based on growing image parts by minimizing theoretical bounds on the error probability of an SVM. Component-based face classifiers are then combined in a second stage to yield a hierarchical SVM classifier. Experimental results in face classification show considerable robustness against rotations in depth and suggest performance at significantly better level than other face detection systems. Novel aspects of our approach are: a) an algorithm to learn component-based classification experts and their combination, b) the use of 3-D morphable models for training, and c) a maximum operation on the output of each component classifier which may be relevant for biological models of visual recognition.
4 0.17242527 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
Author: Michael Collins, S. Dasgupta, Robert E. Schapire
Abstract: Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponential family, Generalized linear models, and Bregman distances, to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss functions, and give examples on simulated data.
5 0.15211102 58 nips-2001-Covariance Kernels from Bayesian Generative Models
Author: Matthias Seeger
Abstract: We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant. 1
6 0.14894404 136 nips-2001-On the Concentration of Spectral Properties
7 0.13564692 15 nips-2001-A New Discriminative Kernel From Probabilistic Models
8 0.12746291 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
9 0.12483628 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
10 0.12148366 170 nips-2001-Spectral Kernel Methods for Clustering
11 0.12033671 77 nips-2001-Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
12 0.11542927 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
13 0.11178228 38 nips-2001-Asymptotic Universality for Learning Curves of Support Vector Machines
14 0.10797206 134 nips-2001-On Kernel-Target Alignment
15 0.09581387 155 nips-2001-Quantizing Density Estimators
16 0.095648751 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
17 0.094719984 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
18 0.087746844 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
19 0.081964344 48 nips-2001-Characterizing Neural Gain Control using Spike-triggered Covariance
20 0.081191845 109 nips-2001-Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions
topicId topicWeight
[(0, -0.228), (1, 0.133), (2, -0.117), (3, -0.158), (4, 0.051), (5, 0.208), (6, -0.112), (7, 0.031), (8, 0.123), (9, 0.025), (10, -0.133), (11, -0.003), (12, 0.088), (13, -0.053), (14, -0.102), (15, -0.096), (16, 0.031), (17, -0.03), (18, -0.088), (19, 0.048), (20, 0.069), (21, -0.081), (22, 0.072), (23, -0.03), (24, -0.154), (25, 0.009), (26, -0.05), (27, 0.043), (28, -0.068), (29, -0.007), (30, -0.182), (31, 0.086), (32, 0.015), (33, -0.148), (34, -0.002), (35, 0.019), (36, 0.014), (37, -0.079), (38, 0.049), (39, 0.035), (40, 0.069), (41, 0.017), (42, -0.017), (43, 0.01), (44, 0.226), (45, -0.025), (46, -0.095), (47, 0.012), (48, 0.106), (49, -0.062)]
simIndex simValue paperId paperTitle
same-paper 1 0.97126353 74 nips-2001-Face Recognition Using Kernel Methods
Author: Ming-Hsuan Yang
Abstract: Principal Component Analysis and Fisher Linear Discriminant methods have demonstrated their success in face detection, recognition, and tracking. The representation in these subspace methods is based on second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Recently Higher Order Statistics and Independent Component Analysis (ICA) have been used as informative low dimensional representations for visual recognition. In this paper, we investigate the use of Kernel Principal Component Analysis and Kernel Fisher Linear Discriminant for learning low dimensional representations for face recognition, which we call Kernel Eigenface and Kernel Fisherface methods. While Eigenface and Fisherface methods aim to find projection directions based on the second order correlation of samples, Kernel Eigenface and Kernel Fisherface methods provide generalizations which take higher order correlations into account. We compare the performance of kernel methods with Eigenface, Fisherface and ICA-based methods for face recognition with variation in pose, scale, lighting and expression. Experimental results show that kernel methods provide better representations and achieve lower error rates for face recognition. 1 Motivation and Approach Subspace methods have been applied successfully in numerous visual recognition tasks such as face localization, face recognition, 3D object recognition, and tracking. In particular, Principal Component Analysis (PCA) [20] [13] ,and Fisher Linear Discriminant (FLD) methods [6] have been applied to face recognition with impressive results. While PCA aims to extract a subspace in which the variance is maximized (or the reconstruction error is minimized), some unwanted variations (due to lighting, facial expressions, viewing points, etc.) may be retained (See [8] for examples). It has been observed that in face recognition the variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to the changes in face identity [1]. Therefore, while the PCA projections are optimal in a correlation sense (or for reconstruction
2 0.64326489 164 nips-2001-Sampling Techniques for Kernel Methods
Author: Dimitris Achlioptas, Frank Mcsherry, Bernhard Schölkopf
Abstract: We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations. Rather intriguingly, all three techniques can be viewed as instantiations of the following idea: replace the kernel function by a “randomized kernel” which behaves like in expectation.
3 0.59681046 155 nips-2001-Quantizing Density Estimators
Author: Peter Meinicke, Helge Ritter
Abstract: We suggest a nonparametric framework for unsupervised learning of projection models in terms of density estimation on quantized sample spaces. The objective is not to optimally reconstruct the data but instead the quantizer is chosen to optimally reconstruct the density of the data. For the resulting quantizing density estimator (QDE) we present a general method for parameter estimation and model selection. We show how projection sets which correspond to traditional unsupervised methods like vector quantization or PCA appear in the new framework. For a principal component quantizer we present results on synthetic and realworld data, which show that the QDE can improve the generalization of the kernel density estimator although its estimate is based on significantly lower-dimensional projection indices of the data.
4 0.54428935 136 nips-2001-On the Concentration of Spectral Properties
Author: John Shawe-Taylor, Nello Cristianini, Jaz S. Kandola
Abstract: We consider the problem of measuring the eigenvalues of a randomly drawn sample of points. We show that these values can be reliably estimated as can the sum of the tail of eigenvalues. Furthermore, the residuals when data is projected into a subspace is shown to be reliably estimated on a random sample. Experiments are presented that confirm the theoretical results. 1
5 0.53093439 15 nips-2001-A New Discriminative Kernel From Probabilistic Models
Author: Koji Tsuda, Motoaki Kawanabe, Gunnar Rätsch, Sören Sonnenburg, Klaus-Robert Müller
Abstract: Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so called
6 0.48224935 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
7 0.47842526 48 nips-2001-Characterizing Neural Gain Control using Spike-triggered Covariance
8 0.47754046 58 nips-2001-Covariance Kernels from Bayesian Generative Models
9 0.4721033 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
10 0.43104893 182 nips-2001-The Fidelity of Local Ordinal Encoding
11 0.41724211 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
12 0.41180217 60 nips-2001-Discriminative Direction for Kernel Classifiers
13 0.41109076 109 nips-2001-Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions
14 0.41059417 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
15 0.39316222 46 nips-2001-Categorization by Learning and Combining Object Parts
16 0.38283369 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
17 0.37818936 38 nips-2001-Asymptotic Universality for Learning Curves of Support Vector Machines
18 0.34654227 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
19 0.34253407 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
20 0.33642617 113 nips-2001-Learning a Gaussian Process Prior for Automatically Generating Music Playlists
topicId topicWeight
[(5, 0.241), (14, 0.054), (17, 0.057), (19, 0.028), (27, 0.141), (30, 0.08), (38, 0.029), (49, 0.011), (59, 0.083), (72, 0.037), (79, 0.019), (83, 0.015), (91, 0.107)]
simIndex simValue paperId paperTitle
same-paper 1 0.85529691 74 nips-2001-Face Recognition Using Kernel Methods
Author: Ming-Hsuan Yang
Abstract: Principal Component Analysis and Fisher Linear Discriminant methods have demonstrated their success in face detection, recognition, and tracking. The representation in these subspace methods is based on second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Recently Higher Order Statistics and Independent Component Analysis (ICA) have been used as informative low dimensional representations for visual recognition. In this paper, we investigate the use of Kernel Principal Component Analysis and Kernel Fisher Linear Discriminant for learning low dimensional representations for face recognition, which we call Kernel Eigenface and Kernel Fisherface methods. While Eigenface and Fisherface methods aim to find projection directions based on the second order correlation of samples, Kernel Eigenface and Kernel Fisherface methods provide generalizations which take higher order correlations into account. We compare the performance of kernel methods with Eigenface, Fisherface and ICA-based methods for face recognition with variation in pose, scale, lighting and expression. Experimental results show that kernel methods provide better representations and achieve lower error rates for face recognition. 1 Motivation and Approach Subspace methods have been applied successfully in numerous visual recognition tasks such as face localization, face recognition, 3D object recognition, and tracking. In particular, Principal Component Analysis (PCA) [20] [13] ,and Fisher Linear Discriminant (FLD) methods [6] have been applied to face recognition with impressive results. While PCA aims to extract a subspace in which the variance is maximized (or the reconstruction error is minimized), some unwanted variations (due to lighting, facial expressions, viewing points, etc.) may be retained (See [8] for examples). It has been observed that in face recognition the variations between the images of the same face due to illumination and viewing direction are almost always larger than image variations due to the changes in face identity [1]. Therefore, while the PCA projections are optimal in a correlation sense (or for reconstruction
2 0.79303038 3 nips-2001-ACh, Uncertainty, and Cortical Inference
Author: Peter Dayan, Angela J. Yu
Abstract: Acetylcholine (ACh) has been implicated in a wide variety of tasks involving attentional processes and plasticity. Following extensive animal studies, it has previously been suggested that ACh reports on uncertainty and controls hippocampal, cortical and cortico-amygdalar plasticity. We extend this view and consider its effects on cortical representational inference, arguing that ACh controls the balance between bottom-up inference, influenced by input stimuli, and top-down inference, influenced by contextual information. We illustrate our proposal using a hierarchical hidden Markov model.
3 0.65348965 164 nips-2001-Sampling Techniques for Kernel Methods
Author: Dimitris Achlioptas, Frank Mcsherry, Bernhard Schölkopf
Abstract: We propose randomized techniques for speeding up Kernel Principal Component Analysis on three levels: sampling and quantization of the Gram matrix in training, randomized rounding in evaluating the kernel expansions, and random projections in evaluating the kernel itself. In all three cases, we give sharp bounds on the accuracy of the obtained approximations. Rather intriguingly, all three techniques can be viewed as instantiations of the following idea: replace the kernel function by a “randomized kernel” which behaves like in expectation.
4 0.63941538 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
Author: Marzia Polito, Pietro Perona
Abstract: Locally Linear Embedding (LLE) is an elegant nonlinear dimensionality-reduction technique recently introduced by Roweis and Saul [2]. It fails when the data is divided into separate groups. We study a variant of LLE that can simultaneously group the data and calculate local embedding of each group. An estimate for the upper bound on the intrinsic dimension of the data set is obtained automatically. 1
5 0.63792515 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
6 0.63779819 27 nips-2001-Activity Driven Adaptive Stochastic Resonance
7 0.63746196 13 nips-2001-A Natural Policy Gradient
8 0.63384557 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
9 0.63299906 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
10 0.63299888 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
11 0.63047308 108 nips-2001-Learning Body Pose via Specialized Maps
12 0.63015008 121 nips-2001-Model-Free Least-Squares Policy Iteration
13 0.62996829 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
14 0.62810284 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
15 0.62760973 60 nips-2001-Discriminative Direction for Kernel Classifiers
16 0.62737137 138 nips-2001-On the Generalization Ability of On-Line Learning Algorithms
17 0.62690997 89 nips-2001-Grouping with Bias
18 0.62640035 190 nips-2001-Thin Junction Trees
19 0.62613976 137 nips-2001-On the Convergence of Leveraging
20 0.62543356 197 nips-2001-Why Neuronal Dynamics Should Control Synaptic Learning Rules