nips nips2006 nips2006-126 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. [sent-14, score-0.306]
2 Framed in this robust statistical framework no prior feature extraction or outlier removal is required. [sent-15, score-0.126]
3 We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. [sent-16, score-0.403]
4 In the first case, the problem is convex and the logistic regression is optimal under a generative model. [sent-17, score-0.374]
5 The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. [sent-19, score-0.235]
6 Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. [sent-20, score-0.048]
7 1 Introduction The goal of Brain-Computer Interface (BCI) research [1, 2, 3, 4, 5, 6, 7] is to provide a direct control pathway from human intentions reflected in brain signals to computers. [sent-21, score-0.074]
8 Machine learning approaches to BCI have proven to be effective by requiring less subject training and by compensating for the high inter-subject variability. [sent-25, score-0.07]
9 In this field, a number of studies have focused on constructing better low dimensional representations that combine various features of brain activities [3, 4], because the problem of classifying EEG signals is intrinsically high dimensional. [sent-26, score-0.123]
10 In particular, efforts have been made to reduce the number of electrodes by eliminating electrodes recursively [8] or by decomposition techniques e. [sent-27, score-0.088]
11 In practice, often a BCI system has been constructed by combining a feature extraction step and a classification step. [sent-30, score-0.044]
12 Our contribution is a logistic regression classifier that integrates both steps under the roof of a single minimization problem and uses well controlled regularization. [sent-31, score-0.374]
13 We study a BCI based on the motor imagery paradigm. [sent-33, score-0.091]
14 Motor imagination can be captured through spatially localized bandpower modulation in the µ- (10-15Hz) or β- (20-30Hz) band, characterized by the second-order statistics of the signal; the underlying neuro-physiology is well known as Event Related Desynchronization (ERD) [10]. [sent-37, score-0.072]
15 1 Problem setting Let us denote by X ∈ R^{d×T} the EEG signal of a single trial of an imaginary motor movement, where d is the number of electrodes and T is the number of sampled time-points in a trial. [sent-39, score-0.352]
16 A trial belonging to one of the two conditions, e.g. right or left hand imaginary movement, is labeled as the positive (+) or negative (−) class. [sent-42, score-0.136]
17 Given a set of trials and labels {Xi, yi}_{i=1}^n, the task is to predict the class label y for an unobserved trial X. [sent-44, score-0.163]
18 2 Conventional method: classifying with CSP features In the motor-imagery EEG signal classification, Common Spatial Pattern (CSP) based classifiers have proven to be powerful [11, 3, 6]. [sent-46, score-0.109]
19 CSP is a decomposition method proposed by Koles [9] that finds a set of projections that simultaneously diagonalize the covariance matrices corresponding to two brain states. [sent-47, score-0.147]
20 Formally, the covariance matrices are defined as: Σc = (1/|Ic|) Σ_{i∈Ic} Xi Xiᵀ (c ∈ {+, −}), (1) where Ic is the set of indices belonging to class c ∈ {+, −}; thus I+ ∪ I− = {1, . . . , n}. [sent-48, score-0.081]
21 Then, the simultaneous diagonalization is achieved by solving the following generalized eigenvalue problem: Σ+ w = λΣ− w. [sent-52, score-0.126]
22 (2) Note that for each pair of eigenvector and eigenvalue (wj, λj), the equality λj = (wjᵀ Σ+ wj) / (wjᵀ Σ− wj) holds. [sent-53, score-0.277]
23 Therefore, the eigenvector with the largest eigenvalue corresponds to the projection with the maximum ratio of power for the "+" class to the "−" class, and the other way around for the eigenvector with the smallest eigenvalue. [sent-54, score-0.157]
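The generalized eigenvalue problem of Eq. (2) can be sketched in a few lines of numpy/scipy; the function name and the `n_of` convention are illustrative, not from the paper:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(sigma_pos, sigma_neg, n_of=1):
    """Solve the generalized eigenvalue problem Sigma+ w = lambda Sigma- w (Eq. (2))
    and keep the n_of eigenvectors from each end of the eigenvalue spectrum."""
    eigvals, eigvecs = eigh(sigma_pos, sigma_neg)  # eigenvalues returned in ascending order
    keep = np.r_[np.arange(n_of), np.arange(len(eigvals) - n_of, len(eigvals))]
    return eigvecs[:, keep], eigvals[keep]
```

The eigenvalue of each kept filter is exactly the power ratio wᵀΣ+w / wᵀΣ−w mentioned above.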
24 It is common practice that only the nof largest and the nof smallest eigenvectors are used to construct a low dimensional feature representation. [sent-56, score-0.334]
25 The feature vector consists of logarithms of the projected signal powers and a Linear Discriminant Analysis (LDA) classifier is trained on the resulting feature vector. [sent-57, score-0.06]
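The log-power feature construction described above can be sketched as follows (a minimal numpy illustration; the helper name is our own, and training the LDA classifier on the returned features is left out):

```python
import numpy as np

def csp_log_power_features(trials, filters):
    """Project each trial through the spatial filters and take the logarithm of
    the projected signal power, giving one feature per filter; an LDA classifier
    would then be trained on these features."""
    return np.array([np.log(np.sum((filters.T @ X) ** 2, axis=1)) for X in trials])
```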
26 To summarize, the conventional CSP based classifier can be constructed as follows: How to build a CSP based classifier: 1. [sent-58, score-0.048]
27 Take the nof largest and the nof smallest eigenvectors {wj}_{j=1}^J. 3. [sent-62, score-0.167]
28 For simplicity, we assume that the signal is already band-pass filtered and each trial is centered and scaled as X = (1/√T) X_original (I_T − (1/T) 1 1ᵀ). [sent-70, score-0.352]
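The centering and scaling of this footnote can be sketched directly; with this convention X Xᵀ equals the biased sample covariance of the trial (function name is illustrative):

```python
import numpy as np

def preprocess_trial(x_original):
    """Center and scale a trial as X = (1/sqrt(T)) * X_original (I_T - (1/T) 1 1^T),
    so that X @ X.T is the (biased) sample covariance of the trial."""
    d, T = x_original.shape
    centering = np.eye(T) - np.ones((T, T)) / T  # removes the temporal mean per channel
    return (x_original @ centering) / np.sqrt(T)
```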
29 Though we call Eq. (1) a covariance matrix, calling it an averaged cross power matrix gives better insight into the nature of the problem, because we are focusing on the task-related modulation of rhythmic activities. [sent-72, score-0.165]
30 The model (3) can be derived by assuming a zero-mean Gaussian distribution with no temporal correlation and a covariance matrix Σ± for each class, as follows: log [P(y = +1|X) / P(y = −1|X)] = (1/2) tr[(−Σ+^{−1} + Σ−^{−1}) XXᵀ] + const. [sent-76, score-0.114]
31 1 Logistic regression In linear logistic regression we minimize the negative log-likelihood of Eq. [sent-86, score-0.522]
32 (3) with an additional regularization term, which is written as follows: min_{W ∈ Sym(d), b ∈ R} C (1/n) Σ_{i=1}^n log(1 + e^{−yi f(Xi; θ)}) + tr(ΣP W ΣP W) + b². (5) [sent-87, score-0.054]
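A sketch of the objective in Eq. (5), assuming f(X; θ) = (1/2) tr(W XXᵀ) + b; the relative scaling of loss and regularizer follows our reading of the garbled formula and should be treated as an assumption:

```python
import numpy as np

def regularized_logistic_loss(W, b, trials, labels, sigma_p, C):
    """Sketch of the objective of Eq. (5): logistic loss of f(X) = 0.5 tr(W X X^T) + b
    plus the regularizer tr(Sigma_P W Sigma_P W) + b^2, blended with constant C."""
    f = np.array([0.5 * np.trace(W @ (X @ X.T)) + b for X in trials])
    loss = np.mean(np.log1p(np.exp(-labels * f)))           # logistic loss
    reg = np.trace(sigma_p @ W @ sigma_p @ W) + b ** 2      # quadratic regularizer
    return C * loss + reg
```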
33 The empirical loss in Eq. (5) converges asymptotically to the true loss, in which the empirical average is replaced by the expectation over X and y; its minimum over functions in L2(PX) is achieved by the symmetric logit transform of P(y = +1|X) [15]. [sent-94, score-0.094]
34 The problem of classifying motor imagery EEG signals is now addressed under a single loss function. [sent-97, score-0.201]
35 2 Rank=2 approximation of the linear logistic regression Here we present a rank=2 approximation of the regression function (3). [sent-102, score-0.522]
36 Using this approximation we can greatly reduce the number of parameters to be estimated from a symmetric matrix coefficient to a pair of projection coefficients and additionally gain insight into the relevant feature the classifier has found. [sent-103, score-0.042]
37 The rank=2 approximation of the regression function (3) is written as follows: f(X; θ̄) := (1/2) tr[(−w1 w1ᵀ + w2 w2ᵀ) XXᵀ] + b, (6) where θ̄ := (w1, w2, b) ∈ R^d × R^d × R. [sent-104, score-0.181]
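Eq. (6) can be evaluated either in its trace form or, equivalently, as a difference of projected signal powers; a small numpy sketch (function names are illustrative):

```python
import numpy as np

def f_rank2_trace(X, w1, w2, b):
    """Rank-2 regression function of Eq. (6) in its trace form."""
    W = -np.outer(w1, w1) + np.outer(w2, w2)
    return 0.5 * np.trace(W @ X @ X.T) + b

def f_rank2_power(X, w1, w2, b):
    """Equivalent form: half the projected power along w2 minus that along w1."""
    return 0.5 * (np.sum((w2 @ X) ** 2) - np.sum((w1 @ X) ** 2)) + b
```

The equivalence follows from tr(w wᵀ X Xᵀ) = wᵀ X Xᵀ w = ‖wᵀX‖².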
38 The rationale for choosing this special form of function is that the Bayes optimal regression coefficient in Eq. [sent-105, score-0.148]
39 (4) is the difference of two positive definite matrices; therefore at least two bases with opposite signs are necessary to capture the nature of Eq. [sent-106, score-0.042]
40 (4) (incorporating more bases goes beyond the scope of this contribution). [sent-107, score-0.042]
41 The rank=2 parameterized logistic regression can be obtained by minimizing the sum of the logistic regression loss and regularization terms similarly to Eq. [sent-108, score-0.831]
42 (7). Here, again, the pooled covariance matrix ΣP is used as a metric in order to ensure invariance to linear transformations. [sent-110, score-0.096]
43 Note that the bases {w1 , w2 } give projections of the signal into a two dimensional feature space in a similar manner as CSP (see Sec. [sent-111, score-0.13]
44 The filters can be topographically mapped onto the scalp, from which insight into the classifier can be obtained. [sent-115, score-0.038]
45 However, the major difference between CSP and the rank=2 parameterized logistic regression (Eq. [sent-116, score-0.403]
46 (7)) is that in our new approach, there is no distinction between the feature extraction step and the classifier training step. [sent-117, score-0.044]
47 1 Results Experimental settings We compare the logistic regression classifiers (Eqs. [sent-123, score-0.374]
48 (3) and (6)) against CSP based classifiers with nof = 1 (total 2 filters) and nof = 3 (total 6 filters). [sent-124, score-0.22]
49 We use 60 BCI experiments [6] from 29 subjects, in which the subjects performed three imaginary movements, namely "right hand" (R), "left hand" (L) and "foot" (F), according to the visual cue presented on the screen, except for 9 experiments in which only two classes were performed. [sent-127, score-0.077]
50 Each dataset contains 70 to 600 trials (at median 280) of imaginary movements. [sent-129, score-0.077]
51 The signal was recorded from the scalp with multi-channel EEG amplifiers using 32, 64 or 128 channels. [sent-133, score-0.109]
52 The signal was sampled at 1000Hz and down-sampled to 100Hz before the processing. [sent-134, score-0.06]
53 The signal is band-pass filtered at 7-30Hz and the interval 500-3500ms after the appearance of visual cue is cut out from the continuous EEG signal as a trial X. [sent-135, score-0.2]
54 (3) and w̃j = ΣP wj (j = 1, 2) for Eq. (6), where wj denotes the minimizer of Eqs. [sent-139, score-0.276]
55 Note that we did not whiten the training and test data jointly, which could have improved the performance. [sent-142, score-0.052]
56 The regularization constant C for the proposed method is chosen by 5×10 cross-validation on the training set. [sent-143, score-0.054]
57 1, logistic regression (LR) classifiers with the full rank parameterization (Eq. [sent-146, score-0.565]
58 Here the bit-rate (per decision) is defined based on the classification test error perr as the capacity of a binary symmetric channel with the same error probability. [sent-150, score-0.186]
59 Figure 1: Comparison of bit-rates achieved by the CSP based classifiers and the logistic regression (LR) classifiers. [sent-182, score-0.374]
60 The bit-rates achieved by the conventional CSP based classifier and the proposed LR classifier are shown as a circle for each dataset. [sent-183, score-0.048]
61 The proportion of datasets lying above/below the diagonal is shown at top-left/bottom-right corners of each plot, respectively. [sent-184, score-0.054]
62 The bit-rate is 1 − [ perr log2(1/perr) + (1 − perr) log2(1/(1 − perr)) ]. [sent-186, score-0.33]
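The bit-rate, i.e. the capacity 1 − H(perr) of a binary symmetric channel, can be computed as follows (a small numpy sketch):

```python
import numpy as np

def bit_rate(p_err):
    """Capacity of a binary symmetric channel with error probability p_err:
    1 - (p log2(1/p) + (1-p) log2(1/(1-p))) = 1 - H(p_err)."""
    p = np.clip(p_err, 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return 1.0 + p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p)
```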
63 The proposed method improves upon the conventional method for datasets lying above the diagonal. [sent-187, score-0.102]
64 Note that our proposed logistic regression ansatz is significantly better only in the lower right plot. [sent-188, score-0.374]
65 Figure 2 shows examples of spatial filter coefficients obtained by CSP (6 filters) and rank=2 parameterized logistic regression. [sent-189, score-0.284]
66 2(a)) include typical cases (the first filter for the "left hand" class and the first two filters for the "right hand" class) of filters corrupted by artifacts, e.g. [sent-191, score-0.071]
67 The CSP filters for the “foot” class in subject B (see Fig. [sent-194, score-0.099]
68 2(b)) are corrupted by strong occipital α-activity, which might have been weakly correlated to the labels by chance. [sent-195, score-0.086]
69 On the other hand the filter coefficients obtained by the logistic regression are clearly focused on the area physiologically corresponding to ERD in the motor cortex (see Figs. [sent-198, score-0.559]
70 In Eq. (7), the regression coefficients w1 and w2 are generalized eigenvectors of two uncertainty weighted covariance matrices corresponding to the two motor imagery classes, in which each sample is weighted by the uncertainty of the decision, 1 − P(y = yi | X = Xi). [sent-202, score-0.493]
71 Samples that are easily explained by the regression function receive a low weight, whereas those lying close to the decision boundary, or on the wrong side of it, are weighted highly. [sent-203, score-0.256]
72 ±(1/n) Σ_{i=1}^n [e^{−zi} / (1 + e^{−zi})] yi Xi Xiᵀ wj∗ + C ΣP wj∗ = 0 (j = 1, 2), (8) where we define the shorthand zi := yi f(Xi; θ̄∗) and ± denotes + and − for j = 1, 2, respectively. [sent-208, score-0.205]
73 (8) can be rewritten as follows: Σ̄−(θ̄∗, 0) w1∗ = Σ̄+(θ̄∗, C) w1∗, (9) Σ̄+(θ̄∗, 0) w2∗ = Σ̄−(θ̄∗, C) w2∗, (10) where we define the uncertainty weighted covariance matrix as: [sent-210, score-0.052]
Σ̄±(θ̄∗, C) = Σ_{i∈I±} [e^{−zi} / (1 + e^{−zi})] Xi Xiᵀ + (C/n) Σ_{i=1}^n Xi Xiᵀ. Note that increasing the regularization constant C biases the uncertainty weighted covariance matrix toward the pooled covariance matrix ΣP; the regularization only affects the right-hand side of Eqs. [sent-211, score-0.256]
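The uncertainty weighted covariance matrices can be sketched as below; identifying the second term with C ΣP assumes ΣP = (1/n) Σi Xi Xiᵀ, which is our reading of the garbled definition:

```python
import numpy as np

def uncertainty_weighted_covariances(trials, labels, f_values, C):
    """Sketch of the uncertainty weighted class covariances: trial i is weighted by
    e^{-z_i}/(1 + e^{-z_i}) = 1 - P(y = y_i | X_i), with z_i = y_i f(X_i),
    and C blends in the pooled covariance Sigma_P."""
    n = len(trials)
    sigma_p = sum(X @ X.T for X in trials) / n  # pooled covariance (assumed form)
    out = {}
    for c in (+1, -1):
        acc = np.zeros_like(sigma_p)
        for X, y, f in zip(trials, labels, f_values):
            if y == c:
                z = y * f
                acc += (X @ X.T) / (1.0 + np.exp(z))  # equals e^{-z}/(1+e^{-z}) * X X^T
        out[c] = acc + C * sigma_p
    return out
```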
75 If C > 0, the optimal filter coefficients w∗ (j = 1, 2) are the j generalized eigenvectors of Eqs. [sent-213, score-0.094]
76 After being introduced to the BCI community by [11], it has also proved powerful in classifying imaginary motor movements [3, 6]. [sent-217, score-0.217]
77 A widely used heuristic is to choose several generalized eigenvectors from both ends of the eigenvalue spectrum. [sent-220, score-0.152]
78 Secondly, simultaneous diagonalization of covariance matrices can suffer greatly from a few outlier trials as seen in subject A in Fig. [sent-224, score-0.233]
79 Again, in practice one can inspect the EEG signals to detect outliers; however, manual outlier detection is a somewhat arbitrary, non-reproducible process that cannot be validated. [sent-226, score-0.087]
80 5 Conclusion In this paper, we have proposed a unified framework for single trial classification of motor-imagery EEG signals. [sent-227, score-0.08]
81 The problem is addressed as a single minimization problem without any prior feature extraction or outlier removal steps. [sent-228, score-0.126]
82 The task is to minimize a logistic regression loss with a regularization term. [sent-229, score-0.428]
83 The regression function is a linear function with respect to the second order statistics of the EEG signal. [sent-230, score-0.148]
84 By parameterizing the whole regression coefficient matrix directly, we have obtained classification accuracy comparable to CSP based classifiers. [sent-232, score-0.205]
85 By parameterizing the regression coefficients as the difference of two rank-one matrices, an improvement over CSP based classifiers was obtained. [sent-233, score-0.205]
86 We have shown that in the rank=2 parameterization of the logistic regression function, the optimal filter coefficients have an interpretation as a solution to a generalized eigenvalue problem, similarly to CSP. [sent-234, score-0.504]
87 However, the difference is that in logistic regression every sample is weighted according to its importance for the overall classification problem, whereas in CSP all samples have uniform importance. [sent-235, score-0.374]
88 For example, incorporating more than two filters will connect the two parameterizations of the regression function shown in this paper and it may allow us to investigate how many filters are sufficient for good classification. [sent-237, score-0.148]
89 Since the classifier output is the logit transform of the class probability, it is straightforward to generalize the method to multi-class problems. [sent-238, score-0.081]
90 Non-stationarity, e.g. caused by a covariate shift (see [16, 17]) in the density P(X) from one session to another, could be corrected by adapting the likelihood model. [sent-241, score-0.066]
91 Curio, "The Berlin Brain-Computer Interface: EEG-based communication without subject training", IEEE Trans. [sent-313, score-0.07]
92 Koles, “The quantitative extraction and topographic mapping of the abnormal components in the clinical EEG”, Electroencephalogr. [sent-338, score-0.044]
93 Pfurtscheller, “Optimal spatial filtering of single trial u EEG during imagined hand movement”, IEEE Trans. [sent-352, score-0.168]
94 Shimodaira, “Improving predictive inference under covariate shift by weighting the loglikelihood function”, Journal of Statistical Planning and Inference, 90: 227–244, 2000. [sent-378, score-0.066]
95 CSP filter coefficients left hand right hand (c) Subject A. [sent-384, score-0.118]
96 CSP filter coefficients left hand foot (d) Subject B. [sent-398, score-0.129]
97 Logistic regression (rank=2) filter coefficients Figure 2: Examples of spatial filter coefficients obtained by CSP and the rank=2 parameterized logistic regression. [sent-399, score-0.432]
98 Some CSP filters are corrupted by strong occipital α-activity. [sent-403, score-0.086]
99 Logistic regression coefficients are focusing on the physiologically expected “left hand” and “right hand” areas. [sent-405, score-0.219]
100 Logistic regression coefficients are focusing on the “left hand” and “foot” areas. [sent-407, score-0.184]
wordName wordTfidf (topN-words)
[('csp', 0.614), ('coe', 0.281), ('eeg', 0.228), ('logistic', 0.226), ('lters', 0.182), ('rank', 0.156), ('bci', 0.153), ('regression', 0.148), ('lr', 0.121), ('nof', 0.11), ('perr', 0.11), ('cients', 0.104), ('classi', 0.1), ('wj', 0.092), ('motor', 0.091), ('erence', 0.087), ('lter', 0.087), ('trial', 0.08), ('imaginary', 0.077), ('ller', 0.07), ('foot', 0.07), ('subject', 0.07), ('er', 0.069), ('xi', 0.068), ('pfurtscheller', 0.061), ('signal', 0.06), ('hand', 0.059), ('filters', 0.058), ('eigenvalue', 0.058), ('eigenvectors', 0.057), ('parameterizing', 0.057), ('hinterberger', 0.057), ('yi', 0.054), ('regularization', 0.054), ('outlier', 0.054), ('lying', 0.054), ('logit', 0.052), ('blankertz', 0.052), ('dornhege', 0.052), ('aihara', 0.052), ('whitened', 0.052), ('covariance', 0.052), ('di', 0.052), ('interface', 0.05), ('classifying', 0.049), ('birbaumer', 0.049), ('scalp', 0.049), ('conventional', 0.048), ('ers', 0.047), ('extraction', 0.044), ('occipital', 0.044), ('sym', 0.044), ('pooled', 0.044), ('tokyo', 0.044), ('electrodes', 0.044), ('corrupted', 0.042), ('bases', 0.042), ('symmetric', 0.042), ('berlin', 0.041), ('brain', 0.041), ('ic', 0.04), ('xx', 0.04), ('modulation', 0.039), ('covariate', 0.039), ('topographically', 0.038), ('rhythmic', 0.038), ('koles', 0.038), ('kunzmann', 0.038), ('losch', 0.038), ('zi', 0.038), ('generalized', 0.037), ('focusing', 0.036), ('parameterization', 0.035), ('physiologically', 0.035), ('ramoser', 0.035), ('curio', 0.035), ('eigenvector', 0.035), ('channel', 0.034), ('tr', 0.033), ('signals', 0.033), ('interfacing', 0.033), ('erd', 0.033), ('imagination', 0.033), ('mcfarland', 0.033), ('rd', 0.032), ('graz', 0.031), ('desynchronization', 0.031), ('diagonalization', 0.031), ('whitening', 0.029), ('parameterized', 0.029), ('spatial', 0.029), ('class', 0.029), ('discriminative', 0.028), ('lda', 0.028), ('removal', 0.028), ('imagery', 0.028), ('projections', 0.028), ('shift', 0.027), ('lal', 0.027), ('matrices', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
2 0.66387022 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now even subjects that are experts in the use of machine learning based BCI systems still have to undergo a calibration session of about 20-30 min. From this data their (movement) intentions are so far infered. We now propose a new paradigm that allows to completely omit such calibration and instead transfer knowledge from prior sessions. To achieve this goal we first define normalized CSP features and distances in-between. Second, we derive prototypical features across sessions: (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that, indeed, classifiers can be successfully transferred to a new session for a number of subjects.
3 0.36102331 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
Author: Moritz Grosse-wentrup, Klaus Gramann, Martin Buss
Abstract: The performance of EEG-based Brain-Computer-Interfaces (BCIs) critically depends on the extraction of features from the EEG carrying information relevant for the classification of different mental states. For BCIs employing imaginary movements of different limbs, the method of Common Spatial Patterns (CSP) has been shown to achieve excellent classification results. The CSP-algorithm however suffers from a lack of robustness, requiring training data without artifacts for good performance. To overcome this lack of robustness, we propose an adaptive spatial filter that replaces the training data in the CSP approach by a-priori information. More specifically, we design an adaptive spatial filter that maximizes the ratio of the variance of the electric field originating in a predefined region of interest (ROI) and the overall variance of the measured EEG. Since it is known that the component of the EEG used for discriminating imaginary movements originates in the motor cortex, we design two adaptive spatial filters with the ROIs centered in the hand areas of the left and right motor cortex. We then use these to classify EEG data recorded during imaginary movements of the right and left hand of three subjects, and show that the adaptive spatial filters outperform the CSP-algorithm, enabling classification rates of up to 94.7 % without artifact rejection. 1
4 0.18583597 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
Author: Steven Lemm, Christin Schäfer, Gabriel Curio
Abstract: We present a method for binary on-line classification of triggered but temporally blurred events that are embedded in noisy time series in the context of on-line discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study. 1
5 0.098158836 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
Author: Dong S. Cheng, Vittorio Murino, Mário Figueiredo
Abstract: This paper proposes a new approach to model-based clustering under prior knowledge. The proposed formulation can be interpreted from two different angles: as penalized logistic regression, where the class labels are only indirectly observed (via the probability density of each class); as finite mixture learning under a grouping prior. To estimate the parameters of the proposed model, we derive a (generalized) EM algorithm with a closed-form E-step, in contrast with other recent approaches to semi-supervised probabilistic clustering which require Gibbs sampling or suboptimal shortcuts. We show that our approach is ideally suited for image segmentation: it avoids the combinatorial nature Markov random field priors, and opens the door to more sophisticated spatial priors (e.g., wavelet-based) in a simple and computationally efficient way. Finally, we extend our formulation to work in unsupervised, semi-supervised, or discriminative modes. 1
6 0.094376415 178 nips-2006-Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation
7 0.086651281 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
8 0.080859371 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
9 0.08051914 148 nips-2006-Nonlinear physically-based models for decoding motor-cortical population activity
10 0.071071923 164 nips-2006-Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension
11 0.069602266 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression
12 0.069045015 131 nips-2006-Mixture Regression for Covariate Shift
13 0.067783795 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis
14 0.065067358 130 nips-2006-Max-margin classification of incomplete data
15 0.064777292 65 nips-2006-Denoising and Dimension Reduction in Feature Space
16 0.063307486 179 nips-2006-Sparse Representation for Signal Classification
17 0.0614358 84 nips-2006-Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space
18 0.061177745 156 nips-2006-Ordinal Regression by Extended Binary Classification
19 0.059173867 186 nips-2006-Support Vector Machines on a Budget
20 0.058429997 75 nips-2006-Efficient sparse coding algorithms
topicId topicWeight
[(0, -0.216), (1, 0.01), (2, 0.14), (3, -0.083), (4, -0.642), (5, -0.095), (6, -0.034), (7, 0.301), (8, 0.107), (9, -0.09), (10, -0.041), (11, -0.057), (12, 0.155), (13, 0.111), (14, -0.038), (15, 0.147), (16, -0.155), (17, 0.041), (18, 0.09), (19, 0.017), (20, -0.031), (21, 0.018), (22, 0.018), (23, -0.014), (24, -0.013), (25, -0.016), (26, 0.054), (27, 0.017), (28, -0.011), (29, 0.026), (30, 0.049), (31, 0.044), (32, -0.107), (33, 0.061), (34, 0.081), (35, -0.002), (36, -0.005), (37, 0.059), (38, -0.014), (39, 0.05), (40, 0.069), (41, 0.046), (42, -0.018), (43, 0.043), (44, 0.023), (45, -0.057), (46, 0.008), (47, -0.03), (48, -0.02), (49, 0.017)]
simIndex simValue paperId paperTitle
1 0.97079945 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now even subjects that are experts in the use of machine learning based BCI systems still have to undergo a calibration session of about 20-30 min. From this data their (movement) intentions are so far infered. We now propose a new paradigm that allows to completely omit such calibration and instead transfer knowledge from prior sessions. To achieve this goal we first define normalized CSP features and distances in-between. Second, we derive prototypical features across sessions: (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that, indeed, classifiers can be successfully transferred to a new session for a number of subjects.
same-paper 2 0.94656104 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
3 0.82834131 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
Author: Moritz Grosse-wentrup, Klaus Gramann, Martin Buss
Abstract: The performance of EEG-based Brain-Computer-Interfaces (BCIs) critically depends on the extraction of features from the EEG carrying information relevant for the classification of different mental states. For BCIs employing imaginary movements of different limbs, the method of Common Spatial Patterns (CSP) has been shown to achieve excellent classification results. The CSP-algorithm however suffers from a lack of robustness, requiring training data without artifacts for good performance. To overcome this lack of robustness, we propose an adaptive spatial filter that replaces the training data in the CSP approach by a-priori information. More specifically, we design an adaptive spatial filter that maximizes the ratio of the variance of the electric field originating in a predefined region of interest (ROI) and the overall variance of the measured EEG. Since it is known that the component of the EEG used for discriminating imaginary movements originates in the motor cortex, we design two adaptive spatial filters with the ROIs centered in the hand areas of the left and right motor cortex. We then use these to classify EEG data recorded during imaginary movements of the right and left hand of three subjects, and show that the adaptive spatial filters outperform the CSP-algorithm, enabling classification rates of up to 94.7 % without artifact rejection. 1
4 0.62903422 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
Author: Steven Lemm, Christin Schäfer, Gabriel Curio
Abstract: We present a method for binary on-line classification of triggered but temporally blurred events that are embedded in noisy time series in the context of on-line discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study. 1
5 0.31285119 178 nips-2006-Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation
Author: Gavin C. Cawley, Nicola L. Talbot, Mark Girolami
Abstract: Multinomial logistic regression provides the standard penalised maximum-likelihood solution to multi-class pattern recognition problems. More recently, the development of sparse multinomial logistic regression models has found application in text processing and microarray classification, where explicit identification of the most informative features is of value. In this paper, we propose a sparse multinomial logistic regression method in which the sparsity arises from the use of a Laplace prior, but where the usual regularisation parameter is integrated out analytically. Evaluation over a range of benchmark datasets reveals that this approach yields generalisation performance similar to that obtained using cross-validation, but at greatly reduced computational expense.
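A Laplace prior corresponds, at the MAP point, to an L1 penalty on the weights. The sketch below illustrates only that connection on a binary (not multinomial) toy problem, using plain proximal gradient descent (ISTA) with a fixed regularisation constant rather than the paper's analytic integration; all data and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 30
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -2.0, 1.5]                 # only 3 informative features
y = (X @ w_true + 0.5 * rng.standard_normal(n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# ISTA for  mean log-loss + lam * ||w||_1 ; the soft-threshold step produces
# exact zeros, which is where the sparsity comes from.
lam, step = 0.05, 0.5
w = np.zeros(d)
for _ in range(500):
    grad = X.T @ (sigmoid(X @ w) - y) / n
    w = w - step * grad
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)

sparsity = (np.abs(w) < 1e-8).mean()
```

With enough regularisation, the uninformative weights are driven exactly to zero while the informative ones keep their signs, mirroring the "explicit identification of the most informative features" the abstract refers to.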
6 0.26589018 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
7 0.26062548 179 nips-2006-Sparse Representation for Signal Classification
8 0.25620753 156 nips-2006-Ordinal Regression by Extended Binary Classification
9 0.24210894 129 nips-2006-Map-Reduce for Machine Learning on Multicore
10 0.23516946 140 nips-2006-Multiple Instance Learning for Computer Aided Diagnosis
11 0.2331624 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
12 0.22972138 73 nips-2006-Efficient Methods for Privacy Preserving Face Detection
13 0.22465166 148 nips-2006-Nonlinear physically-based models for decoding motor-cortical population activity
14 0.21463968 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
15 0.21244381 193 nips-2006-Tighter PAC-Bayes Bounds
16 0.19822912 63 nips-2006-Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods
17 0.19653791 84 nips-2006-Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space
18 0.19074805 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
19 0.19014584 105 nips-2006-Large Margin Component Analysis
20 0.1838069 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis
topicId topicWeight
[(1, 0.113), (3, 0.018), (7, 0.054), (9, 0.037), (13, 0.25), (20, 0.013), (22, 0.132), (44, 0.061), (57, 0.057), (65, 0.042), (69, 0.02), (71, 0.013), (90, 0.011), (95, 0.036), (98, 0.047)]
simIndex simValue paperId paperTitle
same-paper 1 0.83087826 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single-trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework, no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full-rank symmetric matrix coefficient and (b) as a difference of two rank-one matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, a popular technique in Brain-Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compare favorably against conventional CSP-based classifiers.
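Variant (a) can be sketched concretely: the regression function scores each trial covariance Sigma by the inner product <W, Sigma> + b with a symmetric matrix W, and the regularized logistic loss in W is convex. The code below uses synthetic trial covariances and plain gradient descent as a stand-in for the authors' optimizer; everything here is a toy illustration, not their implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_ch, n_samp = 120, 6, 100

# Two classes whose trial covariances differ along one spatial direction,
# a toy stand-in for band-power differences in single-trial EEG.
labels = rng.integers(0, 2, n_trials)
covs = []
for y in labels:
    scale = np.ones(n_ch)
    scale[0] = 2.0 if y == 1 else 0.5         # class-dependent source power
    trial = scale[:, None] * rng.standard_normal((n_ch, n_samp))
    covs.append(trial @ trial.T / n_samp)
covs = np.array(covs)

# Variant (a): f(Sigma) = <W, Sigma> + b with a full symmetric matrix W,
# trained by gradient descent on the convex L2-regularized logistic loss.
W, b = np.zeros((n_ch, n_ch)), 0.0
lam, step = 0.01, 0.1
for _ in range(2000):
    z = np.clip(np.einsum('ij,nij->n', W, covs) + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    r = (p - labels) / n_trials
    gW = np.einsum('n,nij->ij', r, covs) + lam * W
    gW = (gW + gW.T) / 2                      # keep W symmetric
    W -= step * gW
    b -= step * r.sum()

acc = ((np.einsum('ij,nij->n', W, covs) + b > 0).astype(int) == labels).mean()
```

The learned symmetric W plays the role of the matrix coefficient that the paper then maps topographically onto the scalp; variant (b) would constrain W to a difference of two rank-one terms, recovering the CSP-like parameterization.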
2 0.80672997 111 nips-2006-Learning Motion Style Synthesis from Perceptual Observations
Author: Lorenzo Torresani, Peggy Hackney, Christoph Bregler
Abstract: This paper presents an algorithm for synthesis of human motion in specified styles. We use a theory of movement observation (Laban Movement Analysis) to describe movement styles as points in a multi-dimensional perceptual space. We cast the task of learning to synthesize desired movement styles as a regression problem: sequences generated via space-time interpolation of motion capture data are used to learn a nonlinear mapping between animation parameters and movement styles in perceptual space. We demonstrate that the learned model can apply a variety of motion styles to pre-recorded motion sequences and it can extrapolate styles not originally included in the training data.
3 0.77087575 79 nips-2006-Fast Iterative Kernel PCA
Author: Nicol N. Schraudolph, Simon Günter, S.v.n. Vishwanathan
Abstract: We introduce two methods to improve convergence of the Kernel Hebbian Algorithm (KHA) for iterative kernel PCA. KHA has a scalar gain parameter which is either held constant or decreased as 1/t, leading to slow convergence. Our KHA/et algorithm accelerates KHA by incorporating the reciprocal of the current estimated eigenvalues as a gain vector. We then derive and apply Stochastic MetaDescent (SMD) to KHA/et; this further speeds convergence by performing gain adaptation in RKHS. Experimental results for kernel PCA and spectral clustering of USPS digits as well as motion capture and image de-noising problems confirm that our methods converge substantially faster than conventional KHA.
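The reciprocal-eigenvalue gain can be illustrated in the simplest possible setting: a linear Oja-rule update (the linear-kernel analogue of a Hebbian PCA step) whose learning rate is scaled by a running estimate of the component's eigenvalue. This is only a toy analogue; the actual KHA/et operates on kernel expansion coefficients in RKHS.

```python
import numpy as np

rng = np.random.default_rng(4)
# Data with one dominant principal direction (axis 0, variance 5.0).
n, d = 2000, 5
X = rng.standard_normal((n, d)) * np.sqrt([5.0, 1.0, 0.5, 0.2, 0.1])

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
lam_est, t = 1.0, 0
for _ in range(3):                             # a few passes over the data
    for x in X:
        t += 1
        y = w @ x
        lam_est += (y * y - lam_est) / t       # running eigenvalue estimate E[y^2]
        # Gain ~ 1 / (estimated eigenvalue): components with large eigenvalues
        # get smaller steps, the idea behind the KHA/et gain vector.
        eta = min(1.0 / (lam_est * np.sqrt(t)), 0.5)
        w += eta * y * (x - y * w)             # Oja/Hebbian update
        w /= np.linalg.norm(w)

align = abs(w[0])                              # cosine with the true top direction
```

After a few passes the filter aligns with the dominant eigendirection; holding eta constant instead makes the same update either noisy (large eta) or slow (small eta), which is the convergence problem the gain adaptation addresses.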
4 0.75974047 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now, even subjects who are experts in the use of machine-learning-based BCI systems still have to undergo a calibration session of about 20-30 min, from which their (movement) intentions are inferred. We now propose a new paradigm that allows such calibration to be omitted entirely by instead transferring knowledge from prior sessions. To achieve this goal, we first define normalized CSP features and distances between them. Second, we derive prototypical features across sessions, (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that classifiers can indeed be successfully transferred to a new session for a number of subjects.
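The normalization-and-prototype step can be sketched as follows: a spatial filter's overall sign and scale are arbitrary, so filters from different sessions are compared as directions, with a sign-invariant cosine distance, and a cluster's prototype is taken as the dominant direction of its filters. All filters below are synthetic (noisy copies of two hypothetical underlying prototypes); this is an illustration of the distance construction, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical CSP-like spatial filters from several prior sessions: noisy
# copies of two underlying prototype directions.
d, n_sessions = 8, 10
proto = rng.standard_normal((2, d))
filters = np.concatenate([
    proto[k] + 0.1 * rng.standard_normal((n_sessions, d)) for k in (0, 1)
])

# Normalize: only the filter's direction matters, and its sign is arbitrary,
# so use distance = 1 - |cosine similarity|.
filters /= np.linalg.norm(filters, axis=1, keepdims=True)

def dist(u, v):
    return 1.0 - abs(u @ v)

# Prototype of a set of filters: dominant eigendirection of the summed outer
# products, a sign-invariant alternative to averaging the filters directly.
def prototype(F):
    vals, vecs = np.linalg.eigh(F.T @ F)
    return vecs[:, -1]

p0 = prototype(filters[:n_sessions])
p1 = prototype(filters[n_sessions:])
```

Each prototype recovers its underlying direction despite the session-to-session noise, while the two prototypes stay well separated; a classifier built on such prototypes is what replaces the calibration session.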
5 0.61323905 62 nips-2006-Correcting Sample Selection Bias by Unlabeled Data
Author: Jiayuan Huang, Arthur Gretton, Karsten M. Borgwardt, Bernhard Schölkopf, Alex J. Smola
Abstract: We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
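The "matching distributions in feature space" idea can be sketched with kernel mean matching: choose non-negative training weights beta so that the weighted training mean in a Gaussian-kernel feature space approaches the test mean. The code below solves the resulting quadratic objective with plain projected gradient descent instead of a QP solver, and omits the usual constraint that the weights average to one; data and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
# Training sample biased toward negative x; test sample unbiased.
x_tr = rng.normal(-1.0, 1.0, 150)
x_te = rng.normal(0.0, 1.0, 150)

def gauss_kernel(a, b, sigma=1.0):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

K = gauss_kernel(x_tr, x_tr)
kappa = gauss_kernel(x_tr, x_te).mean(axis=1) * len(x_tr)

# Kernel mean matching objective: min_beta 0.5 b'Kb - kappa'b, beta >= 0,
# solved here by projected gradient descent with a box constraint.
beta = np.ones(len(x_tr))
step = 1.0 / np.linalg.eigvalsh(K).max()
for _ in range(2000):
    beta -= step * (K @ beta - kappa)
    beta = np.clip(beta, 0.0, 10.0)

# Reweighting should pull the biased training mean toward the test mean.
mean_raw = x_tr.mean()
mean_rw = (beta * x_tr).sum() / beta.sum()
```

Training points that fall where the test distribution has mass get larger weights, so the reweighted sample mean moves toward the test mean without ever estimating either density explicitly, which is the method's selling point.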
6 0.61036897 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
7 0.60771865 61 nips-2006-Convex Repeated Games and Fenchel Duality
8 0.60740292 152 nips-2006-Online Classification for Complex Problems Using Simultaneous Projections
9 0.60580689 165 nips-2006-Real-time adaptive information-theoretic optimization of neurophysiology experiments
10 0.60325062 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
11 0.60080308 195 nips-2006-Training Conditional Random Fields for Maximum Labelwise Accuracy
12 0.59657973 83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning
13 0.59615737 20 nips-2006-Active learning for misspecified generalized linear models
14 0.59394652 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
15 0.59371471 172 nips-2006-Scalable Discriminative Learning for Natural Language Parsing and Translation
16 0.58931023 131 nips-2006-Mixture Regression for Covariate Shift
17 0.58780479 203 nips-2006-implicit Online Learning with Kernels
18 0.58709979 194 nips-2006-Towards a general independent subspace analysis
19 0.58382726 65 nips-2006-Denoising and Dimension Reduction in Feature Space
20 0.58343595 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression