nips nips2006 nips2006-126 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. [sent-14, score-0.306]
2 Framed in this robust statistical framework no prior feature extraction or outlier removal is required. [sent-15, score-0.126]
3 We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. [sent-16, score-0.403]
4 In the first case, the problem is convex and the logistic regression is optimal under a generative model. [sent-17, score-0.374]
5 The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. [sent-19, score-0.235]
6 Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. [sent-20, score-0.048]
7 1 Introduction The goal of Brain-Computer Interface (BCI) research [1, 2, 3, 4, 5, 6, 7] is to provide a direct control pathway from human intentions reflected in brain signals to computers. [sent-21, score-0.074]
8 Machine learning approaches to BCI have proven to be effective by requiring less subject training and by compensating for the high inter-subject variability. [sent-25, score-0.07]
9 In this field, a number of studies have focused on constructing better low dimensional representations that combine various features of brain activities [3, 4], because the problem of classifying EEG signals is intrinsically high dimensional. [sent-26, score-0.123]
10 In particular, efforts have been made to reduce the number of electrodes by eliminating electrodes recursively [8] or by decomposition techniques e. [sent-27, score-0.088]
11 In practice, often a BCI system has been constructed by combining a feature extraction step and a classification step. [sent-30, score-0.044]
12 Our contribution is a logistic regression classifier that integrates both steps under the roof of a single minimization problem and uses well controlled regularization. [sent-31, score-0.374]
13 We study a BCI based on the motor imagery paradigm. [sent-33, score-0.091]
14 Motor imagination can be captured through spatially localized bandpower modulation in the µ- (10-15Hz) or β- (20-30Hz) band, characterized by the second-order statistics of the signal; the underlying neuro-physiology is well known as Event Related Desynchronization (ERD) [10]. [sent-37, score-0.072]
15 1 Problem setting Let us denote by X ∈ R^{d×T} the EEG signal of a single trial of an imaginary motor movement, where d is the number of electrodes and T is the number of sampled time-points in a trial. [sent-39, score-0.352]
16 A trial belonging to one of the two conditions, e.g. right or left hand imaginary movement, is labeled as the positive (+) or negative (−) class. [sent-42, score-0.136]
17 Given a set of trials and labels {Xi, yi}_{i=1}^n, the task is to predict the class label y for an unobserved trial X. [sent-44, score-0.163]
18 2 Conventional method: classifying with CSP features In the motor-imagery EEG signal classification, Common Spatial Pattern (CSP) based classifiers have proven to be powerful [11, 3, 6]. [sent-46, score-0.109]
19 CSP is a decomposition method proposed by Koles [9] that finds a set of projections that simultaneously diagonalize the covariance matrices corresponding to two brain states. [sent-47, score-0.147]
20 Formally, the covariance matrices are defined as: Σc = (1/|Ic|) Σ_{i∈Ic} Xi Xiᵀ (c ∈ {+, −}), (1) where Ic is the set of indices belonging to class c ∈ {+, −}; thus I+ ∪ I− = {1, . . . , n}. [sent-48, score-0.081]
21 Then, the simultaneous diagonalization is achieved by solving the following generalized eigenvalue problem: Σ+ w = λΣ− w. [sent-52, score-0.126]
22 (2) Note that for each pair of eigenvector and eigenvalue (wj, λj), the equality λj = (wjᵀ Σ+ wj) / (wjᵀ Σ− wj) holds. [sent-53, score-0.277]
23 Therefore, the eigenvector with the largest eigenvalue corresponds to the projection with the maximum ratio of power for the "+" class to the "−" class, and the other way around for the eigenvector with the smallest eigenvalue. [sent-54, score-0.157]
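The generalized eigenvalue problem of Eq. (2) can be sketched in a few lines of numpy/scipy; the function name and the `n_of` convention are illustrative, not from the paper:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(sigma_pos, sigma_neg, n_of=1):
    """Solve the generalized eigenvalue problem Sigma+ w = lambda Sigma- w (Eq. (2))
    and keep the n_of eigenvectors from each end of the eigenvalue spectrum."""
    eigvals, eigvecs = eigh(sigma_pos, sigma_neg)  # eigenvalues returned in ascending order
    keep = np.r_[np.arange(n_of), np.arange(len(eigvals) - n_of, len(eigvals))]
    return eigvecs[:, keep], eigvals[keep]
```

The eigenvalue of each kept filter is exactly the power ratio wᵀΣ+w / wᵀΣ−w mentioned above.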
24 It is common practice that only the nof largest and the nof smallest eigenvectors are used to construct a low dimensional feature representation. [sent-56, score-0.334]
25 The feature vector consists of logarithms of the projected signal powers and a Linear Discriminant Analysis (LDA) classifier is trained on the resulting feature vector. [sent-57, score-0.06]
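The log-power feature construction described above can be sketched as follows (a minimal numpy illustration; the helper name is our own, and training the LDA classifier on the returned features is left out):

```python
import numpy as np

def csp_log_power_features(trials, filters):
    """Project each trial through the spatial filters and take the logarithm of
    the projected signal power, giving one feature per filter; an LDA classifier
    would then be trained on these features."""
    return np.array([np.log(np.sum((filters.T @ X) ** 2, axis=1)) for X in trials])
```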
26 To summarize, the conventional CSP based classifier can be constructed as follows: How to build a CSP based classifier: 1. [sent-58, score-0.048]
27 Take the nof largest and the nof smallest eigenvectors {wj}_{j=1}^J. 3. [sent-62, score-0.167]
28 For simplicity, we assume that the signal is already band-pass filtered and each trial is centered and scaled as X = (1/√T) X_original (I_T − (1/T) 1 1ᵀ). [sent-70, score-0.352]
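The centering and scaling of this footnote can be sketched directly; with this convention X Xᵀ equals the biased sample covariance of the trial (function name is illustrative):

```python
import numpy as np

def preprocess_trial(x_original):
    """Center and scale a trial as X = (1/sqrt(T)) * X_original (I_T - (1/T) 1 1^T),
    so that X @ X.T is the (biased) sample covariance of the trial."""
    d, T = x_original.shape
    centering = np.eye(T) - np.ones((T, T)) / T  # removes the temporal mean per channel
    return (x_original @ centering) / np.sqrt(T)
```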
29 Though we call Eq. (1) a covariance matrix, calling it an averaged cross power matrix gives better insight into the nature of the problem, because we are focusing on the task-related modulation of rhythmic activities. [sent-72, score-0.165]
30 The model (3) can be derived by assuming a zero-mean Gaussian distribution with no temporal correlation and a covariance matrix Σ± for each class, as follows: log [P(y = +1|X) / P(y = −1|X)] = (1/2) tr[(−Σ+^{−1} + Σ−^{−1}) XXᵀ] + const. [sent-76, score-0.114]
31 1 Logistic regression In linear logistic regression we minimize the negative log-likelihood of Eq. [sent-86, score-0.522]
32 (3) with an additional regularization term, which is written as follows: min_{W ∈ Sym(d), b ∈ R} C (1/n) Σ_{i=1}^n log(1 + e^{−yi f(Xi; θ)}) + tr(ΣP W ΣP W) + b². (5) [sent-87, score-0.054]
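A sketch of the objective in Eq. (5), assuming f(X; θ) = (1/2) tr(W XXᵀ) + b; the relative scaling of loss and regularizer follows our reading of the garbled formula and should be treated as an assumption:

```python
import numpy as np

def regularized_logistic_loss(W, b, trials, labels, sigma_p, C):
    """Sketch of the objective of Eq. (5): logistic loss of f(X) = 0.5 tr(W X X^T) + b
    plus the regularizer tr(Sigma_P W Sigma_P W) + b^2, blended with constant C."""
    f = np.array([0.5 * np.trace(W @ (X @ X.T)) + b for X in trials])
    loss = np.mean(np.log1p(np.exp(-labels * f)))           # logistic loss
    reg = np.trace(sigma_p @ W @ sigma_p @ W) + b ** 2      # quadratic regularizer
    return C * loss + reg
```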
33 The empirical loss in Eq. (5) converges asymptotically to the true loss, in which the empirical average is replaced by the expectation over X and y; its minimum over functions in L2(PX) is achieved by the symmetric logit transform of P(y = +1|X) [15]. [sent-94, score-0.094]
34 The problem of classifying motor imagery EEG signals is now addressed under a single loss function. [sent-97, score-0.201]
35 2 Rank=2 approximation of the linear logistic regression Here we present a rank=2 approximation of the regression function (3). [sent-102, score-0.522]
36 Using this approximation we can greatly reduce the number of parameters to be estimated from a symmetric matrix coefficient to a pair of projection coefficients and additionally gain insight into the relevant feature the classifier has found. [sent-103, score-0.042]
37 The rank=2 approximation of the regression function (3) is written as follows: f(X; θ̄) := (1/2) tr[(−w1 w1ᵀ + w2 w2ᵀ) XXᵀ] + b, (6) where θ̄ := (w1, w2, b) ∈ R^d × R^d × R. [sent-104, score-0.181]
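Eq. (6) can be evaluated either in its trace form or, equivalently, as a difference of projected signal powers; a small numpy sketch (function names are illustrative):

```python
import numpy as np

def f_rank2_trace(X, w1, w2, b):
    """Rank-2 regression function of Eq. (6) in its trace form."""
    W = -np.outer(w1, w1) + np.outer(w2, w2)
    return 0.5 * np.trace(W @ X @ X.T) + b

def f_rank2_power(X, w1, w2, b):
    """Equivalent form: half the projected power along w2 minus that along w1."""
    return 0.5 * (np.sum((w2 @ X) ** 2) - np.sum((w1 @ X) ** 2)) + b
```

The equivalence follows from tr(w wᵀ X Xᵀ) = wᵀ X Xᵀ w = ‖wᵀX‖².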
38 The rationale for choosing this special form of function is that the Bayes optimal regression coefficient in Eq. [sent-105, score-0.148]
39 (4) is the difference of two positive definite matrices; therefore at least two bases with opposite signs are necessary to capture the nature of Eq. [sent-106, score-0.042]
40 (4) (incorporating more bases goes beyond the scope of this contribution). [sent-107, score-0.042]
41 The rank=2 parameterized logistic regression can be obtained by minimizing the sum of the logistic regression loss and regularization terms similarly to Eq. [sent-108, score-0.831]
42 (7). Here, again, the pooled covariance matrix ΣP is used as a metric in order to ensure invariance to linear transformations. [sent-110, score-0.096]
43 Note that the bases {w1 , w2 } give projections of the signal into a two dimensional feature space in a similar manner as CSP (see Sec. [sent-111, score-0.13]
44 The filters can be topographically mapped onto the scalp, from which insight into the classifier can be obtained. [sent-115, score-0.038]
45 However, the major difference between CSP and the rank=2 parameterized logistic regression (Eq. [sent-116, score-0.403]
46 (7)) is that in our new approach, there is no distinction between the feature extraction step and the classifier training step. [sent-117, score-0.044]
47 1 Results Experimental settings We compare the logistic regression classifiers (Eqs. [sent-123, score-0.374]
48 (3) and (6)) against CSP based classifiers with nof = 1 (total 2 filters) and nof = 3 (total 6 filters). [sent-124, score-0.22]
49 We use 60 BCI experiments [6] from 29 subjects, in which the subjects performed three imaginary movements, namely "right hand" (R), "left hand" (L) and "foot" (F), according to the visual cue presented on the screen, except for 9 experiments in which only two classes were performed. [sent-127, score-0.077]
50 Each dataset contains 70 to 600 trials (at median 280) of imaginary movements. [sent-129, score-0.077]
51 The signal was recorded from the scalp with multi-channel EEG amplifiers using 32, 64 or 128 channels. [sent-133, score-0.109]
52 The signal was sampled at 1000Hz and down-sampled to 100Hz before the processing. [sent-134, score-0.06]
53 The signal is band-pass filtered at 7-30Hz and the interval 500-3500ms after the appearance of visual cue is cut out from the continuous EEG signal as a trial X. [sent-135, score-0.2]
54 (3) and w̃j = ΣP wj (j = 1, 2) for Eq. (6), where wj denotes the minimizer of Eqs. [sent-139, score-0.276]
55 Note that we did not whiten the training and test data jointly, which could have improved the performance. [sent-142, score-0.052]
56 The regularization constant C for the proposed method is chosen by 5×10 cross-validation on the training set. [sent-143, score-0.054]
57 1, logistic regression (LR) classifiers with the full rank parameterization (Eq. [sent-146, score-0.565]
58 Here the bit-rate (per decision) is defined based on the classification test error perr as the capacity of a binary symmetric channel with the same error probability. [sent-150, score-0.186]
59 Figure 1: Comparison of bit-rates achieved by the CSP based classifiers and the logistic regression (LR) classifiers. [sent-182, score-0.374]
60 The bit-rates achieved by the conventional CSP based classifier and the proposed LR classifier are shown as a circle for each dataset. [sent-183, score-0.048]
61 The proportion of datasets lying above/below the diagonal is shown at top-left/bottom-right corners of each plot, respectively. [sent-184, score-0.054]
62 The bit-rate is 1 − [ perr log2(1/perr) + (1 − perr) log2(1/(1 − perr)) ]. [sent-186, score-0.33]
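The bit-rate, i.e. the capacity 1 − H(perr) of a binary symmetric channel, can be computed as follows (a small numpy sketch):

```python
import numpy as np

def bit_rate(p_err):
    """Capacity of a binary symmetric channel with error probability p_err:
    1 - (p log2(1/p) + (1-p) log2(1/(1-p))) = 1 - H(p_err)."""
    p = np.clip(p_err, 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return 1.0 + p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p)
```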
63 The proposed method improves upon the conventional method for datasets lying above the diagonal. [sent-187, score-0.102]
64 Note that our proposed logistic regression ansatz is significantly better only in the lower right plot. [sent-188, score-0.374]
65 Figure 2 shows examples of spatial filter coefficients obtained by CSP (6 filters) and rank=2 parameterized logistic regression. [sent-189, score-0.284]
66 2(a)) include typical cases (the first filter for the "left hand" class and the first two filters for the "right hand" class) of filters corrupted by artifacts, e.g. [sent-191, score-0.071]
67 The CSP filters for the “foot” class in subject B (see Fig. [sent-194, score-0.099]
68 2(b)) are corrupted by strong occipital α-activity, which might have been weakly correlated to the labels by chance. [sent-195, score-0.086]
69 On the other hand the filter coefficients obtained by the logistic regression are clearly focused on the area physiologically corresponding to ERD in the motor cortex (see Figs. [sent-198, score-0.559]
70 In Eq. (7), the regression coefficients w1 and w2 are generalized eigenvectors of two uncertainty weighted covariance matrices corresponding to the two motor imagery classes, in which each sample is weighted by the uncertainty of the decision, 1 − P(y = yi | X = Xi). [sent-202, score-0.493]
71 Samples that are easily explained by the regression function receive a low weight, whereas those lying close to the decision boundary, or on the wrong side of it, are weighted highly. [sent-203, score-0.256]
72 ±(1/n) Σ_{i=1}^n [e^{−zi} / (1 + e^{−zi})] yi Xi Xiᵀ wj∗ + C ΣP wj∗ = 0 (j = 1, 2), (8) where we define the shorthand zi := yi f(Xi; θ̄∗) and ± denotes + and − for j = 1, 2, respectively. [sent-208, score-0.205]
73 (8) can be rewritten as follows: Σ̄−(θ̄∗, 0) w1∗ = Σ̄+(θ̄∗, C) w1∗, (9) Σ̄+(θ̄∗, 0) w2∗ = Σ̄−(θ̄∗, C) w2∗, (10) where we define the uncertainty weighted covariance matrix as: [sent-210, score-0.052]
Σ̄±(θ̄∗, C) = Σ_{i∈I±} [e^{−zi} / (1 + e^{−zi})] Xi Xiᵀ + (C/n) Σ_{i=1}^n Xi Xiᵀ. Note that increasing the regularization constant C biases the uncertainty weighted covariance matrix toward the pooled covariance matrix ΣP; the regularization only affects the right-hand side of Eqs. [sent-211, score-0.256]
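The uncertainty weighted covariance matrices can be sketched as below; identifying the second term with C ΣP assumes ΣP = (1/n) Σi Xi Xiᵀ, which is our reading of the garbled definition:

```python
import numpy as np

def uncertainty_weighted_covariances(trials, labels, f_values, C):
    """Sketch of the uncertainty weighted class covariances: trial i is weighted by
    e^{-z_i}/(1 + e^{-z_i}) = 1 - P(y = y_i | X_i), with z_i = y_i f(X_i),
    and C blends in the pooled covariance Sigma_P."""
    n = len(trials)
    sigma_p = sum(X @ X.T for X in trials) / n  # pooled covariance (assumed form)
    out = {}
    for c in (+1, -1):
        acc = np.zeros_like(sigma_p)
        for X, y, f in zip(trials, labels, f_values):
            if y == c:
                z = y * f
                acc += (X @ X.T) / (1.0 + np.exp(z))  # equals e^{-z}/(1+e^{-z}) * X X^T
        out[c] = acc + C * sigma_p
    return out
```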
75 If C > 0, the optimal filter coefficients w∗ (j = 1, 2) are the j generalized eigenvectors of Eqs. [sent-213, score-0.094]
76 After being introduced to the BCI community by [11], it has also proved powerful in classifying imaginary motor movements [3, 6]. [sent-217, score-0.217]
77 A widely used heuristic is to choose several generalized eigenvectors from both ends of the eigenvalue spectrum. [sent-220, score-0.152]
78 Secondly, simultaneous diagonalization of covariance matrices can suffer greatly from a few outlier trials as seen in subject A in Fig. [sent-224, score-0.233]
79 Again, in practice one can inspect the EEG signals to detect outliers; however, manual outlier detection is a somewhat arbitrary, non-reproducible process that cannot be validated. [sent-226, score-0.087]
80 5 Conclusion In this paper, we have proposed a unified framework for single trial classification of motor-imagery EEG signals. [sent-227, score-0.08]
81 The problem is addressed as a single minimization problem without any prior feature extraction or outlier removal steps. [sent-228, score-0.126]
82 The task is to minimize a logistic regression loss with a regularization term. [sent-229, score-0.428]
83 The regression function is a linear function with respect to the second order statistics of the EEG signal. [sent-230, score-0.148]
84 By parameterizing the whole regression coefficient matrix directly, we have obtained classification accuracy comparable to CSP based classifiers. [sent-232, score-0.205]
85 By parameterizing the regression coefficients as the difference of two rank-one matrices, an improvement over CSP based classifiers was obtained. [sent-233, score-0.205]
86 We have shown that in the rank=2 parameterization of the logistic regression function, the optimal filter coefficients have an interpretation as a solution to a generalized eigenvalue problem, similarly to CSP. [sent-234, score-0.504]
87 However, the difference is that in logistic regression every sample is weighted according to its importance for the overall classification problem, whereas in CSP all samples have uniform importance. [sent-235, score-0.374]
88 For example, incorporating more than two filters will connect the two parameterizations of the regression function shown in this paper and it may allow us to investigate how many filters are sufficient for good classification. [sent-237, score-0.148]
89 Since the classifier output is the logit transform of the class probability, it is straightforward to generalize the method to multi-class problems. [sent-238, score-0.081]
90 Non-stationarity, e.g. caused by a covariate shift (see [16, 17]) in the density P(X) from one session to another, could be corrected by adapting the likelihood model. [sent-241, score-0.066]
91 Curio, "The Berlin Brain-Computer Interface: EEG-based communication without subject training", IEEE Trans. [sent-313, score-0.07]
92 Koles, “The quantitative extraction and topographic mapping of the abnormal components in the clinical EEG”, Electroencephalogr. [sent-338, score-0.044]
93 Pfurtscheller, “Optimal spatial filtering of single trial u EEG during imagined hand movement”, IEEE Trans. [sent-352, score-0.168]
94 Shimodaira, “Improving predictive inference under covariate shift by weighting the loglikelihood function”, Journal of Statistical Planning and Inference, 90: 227–244, 2000. [sent-378, score-0.066]
95 CSP filter coefficients left hand right hand (c) Subject A. [sent-384, score-0.118]
96 CSP filter coefficients left hand foot (d) Subject B. [sent-398, score-0.129]
97 Logistic regression (rank=2) filter coefficients Figure 2: Examples of spatial filter coefficients obtained by CSP and the rank=2 parameterized logistic regression. [sent-399, score-0.432]
98 Some CSP filters are corrupted by strong occipital α-activity. [sent-403, score-0.086]
99 Logistic regression coefficients are focusing on the physiologically expected “left hand” and “right hand” areas. [sent-405, score-0.219]
100 Logistic regression coefficients are focusing on the “left hand” and “foot” areas. [sent-407, score-0.184]
wordName wordTfidf (topN-words)
[('csp', 0.614), ('coe', 0.281), ('eeg', 0.228), ('logistic', 0.226), ('lters', 0.182), ('rank', 0.156), ('bci', 0.153), ('regression', 0.148), ('lr', 0.121), ('nof', 0.11), ('perr', 0.11), ('cients', 0.104), ('classi', 0.1), ('wj', 0.092), ('motor', 0.091), ('erence', 0.087), ('lter', 0.087), ('trial', 0.08), ('imaginary', 0.077), ('ller', 0.07), ('foot', 0.07), ('subject', 0.07), ('er', 0.069), ('xi', 0.068), ('pfurtscheller', 0.061), ('signal', 0.06), ('hand', 0.059), ('filters', 0.058), ('eigenvalue', 0.058), ('eigenvectors', 0.057), ('parameterizing', 0.057), ('hinterberger', 0.057), ('yi', 0.054), ('regularization', 0.054), ('outlier', 0.054), ('lying', 0.054), ('logit', 0.052), ('blankertz', 0.052), ('dornhege', 0.052), ('aihara', 0.052), ('whitened', 0.052), ('covariance', 0.052), ('di', 0.052), ('interface', 0.05), ('classifying', 0.049), ('birbaumer', 0.049), ('scalp', 0.049), ('conventional', 0.048), ('ers', 0.047), ('extraction', 0.044), ('occipital', 0.044), ('sym', 0.044), ('pooled', 0.044), ('tokyo', 0.044), ('electrodes', 0.044), ('corrupted', 0.042), ('bases', 0.042), ('symmetric', 0.042), ('berlin', 0.041), ('brain', 0.041), ('ic', 0.04), ('xx', 0.04), ('modulation', 0.039), ('covariate', 0.039), ('topographically', 0.038), ('rhythmic', 0.038), ('koles', 0.038), ('kunzmann', 0.038), ('losch', 0.038), ('zi', 0.038), ('generalized', 0.037), ('focusing', 0.036), ('parameterization', 0.035), ('physiologically', 0.035), ('ramoser', 0.035), ('curio', 0.035), ('eigenvector', 0.035), ('channel', 0.034), ('tr', 0.033), ('signals', 0.033), ('interfacing', 0.033), ('erd', 0.033), ('imagination', 0.033), ('mcfarland', 0.033), ('rd', 0.032), ('graz', 0.031), ('desynchronization', 0.031), ('diagonalization', 0.031), ('whitening', 0.029), ('parameterized', 0.029), ('spatial', 0.029), ('class', 0.029), ('discriminative', 0.028), ('lda', 0.028), ('removal', 0.028), ('imagery', 0.028), ('projections', 0.028), ('shift', 0.027), ('lal', 0.027), ('matrices', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
2 0.66387022 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now even subjects that are experts in the use of machine learning based BCI systems still have to undergo a calibration session of about 20-30 min. From this data their (movement) intentions are so far infered. We now propose a new paradigm that allows to completely omit such calibration and instead transfer knowledge from prior sessions. To achieve this goal we first define normalized CSP features and distances in-between. Second, we derive prototypical features across sessions: (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that, indeed, classifiers can be successfully transferred to a new session for a number of subjects.
3 0.36102331 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
Author: Moritz Grosse-wentrup, Klaus Gramann, Martin Buss
Abstract: The performance of EEG-based Brain-Computer-Interfaces (BCIs) critically depends on the extraction of features from the EEG carrying information relevant for the classification of different mental states. For BCIs employing imaginary movements of different limbs, the method of Common Spatial Patterns (CSP) has been shown to achieve excellent classification results. The CSP-algorithm however suffers from a lack of robustness, requiring training data without artifacts for good performance. To overcome this lack of robustness, we propose an adaptive spatial filter that replaces the training data in the CSP approach by a-priori information. More specifically, we design an adaptive spatial filter that maximizes the ratio of the variance of the electric field originating in a predefined region of interest (ROI) and the overall variance of the measured EEG. Since it is known that the component of the EEG used for discriminating imaginary movements originates in the motor cortex, we design two adaptive spatial filters with the ROIs centered in the hand areas of the left and right motor cortex. We then use these to classify EEG data recorded during imaginary movements of the right and left hand of three subjects, and show that the adaptive spatial filters outperform the CSP-algorithm, enabling classification rates of up to 94.7 % without artifact rejection. 1
4 0.18583597 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
Author: Steven Lemm, Christin Schäfer, Gabriel Curio
Abstract: We present a method for binary on-line classification of triggered but temporally blurred events that are embedded in noisy time series in the context of on-line discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study. 1
5 0.098158836 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
Author: Dong S. Cheng, Vittorio Murino, Mário Figueiredo
Abstract: This paper proposes a new approach to model-based clustering under prior knowledge. The proposed formulation can be interpreted from two different angles: as penalized logistic regression, where the class labels are only indirectly observed (via the probability density of each class); as finite mixture learning under a grouping prior. To estimate the parameters of the proposed model, we derive a (generalized) EM algorithm with a closed-form E-step, in contrast with other recent approaches to semi-supervised probabilistic clustering which require Gibbs sampling or suboptimal shortcuts. We show that our approach is ideally suited for image segmentation: it avoids the combinatorial nature Markov random field priors, and opens the door to more sophisticated spatial priors (e.g., wavelet-based) in a simple and computationally efficient way. Finally, we extend our formulation to work in unsupervised, semi-supervised, or discriminative modes. 1
6 0.094376415 178 nips-2006-Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation
7 0.086651281 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
8 0.080859371 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
9 0.08051914 148 nips-2006-Nonlinear physically-based models for decoding motor-cortical population activity
10 0.071071923 164 nips-2006-Randomized PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension
11 0.069602266 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression
12 0.069045015 131 nips-2006-Mixture Regression for Covariate Shift
13 0.067783795 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis
14 0.065067358 130 nips-2006-Max-margin classification of incomplete data
15 0.064777292 65 nips-2006-Denoising and Dimension Reduction in Feature Space
16 0.063307486 179 nips-2006-Sparse Representation for Signal Classification
17 0.0614358 84 nips-2006-Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space
18 0.061177745 156 nips-2006-Ordinal Regression by Extended Binary Classification
19 0.059173867 186 nips-2006-Support Vector Machines on a Budget
20 0.058429997 75 nips-2006-Efficient sparse coding algorithms
topicId topicWeight
[(0, -0.216), (1, 0.01), (2, 0.14), (3, -0.083), (4, -0.642), (5, -0.095), (6, -0.034), (7, 0.301), (8, 0.107), (9, -0.09), (10, -0.041), (11, -0.057), (12, 0.155), (13, 0.111), (14, -0.038), (15, 0.147), (16, -0.155), (17, 0.041), (18, 0.09), (19, 0.017), (20, -0.031), (21, 0.018), (22, 0.018), (23, -0.014), (24, -0.013), (25, -0.016), (26, 0.054), (27, 0.017), (28, -0.011), (29, 0.026), (30, 0.049), (31, 0.044), (32, -0.107), (33, 0.061), (34, 0.081), (35, -0.002), (36, -0.005), (37, 0.059), (38, -0.014), (39, 0.05), (40, 0.069), (41, 0.046), (42, -0.018), (43, 0.043), (44, 0.023), (45, -0.057), (46, 0.008), (47, -0.03), (48, -0.02), (49, 0.017)]
simIndex simValue paperId paperTitle
1 0.97079945 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now even subjects that are experts in the use of machine learning based BCI systems still have to undergo a calibration session of about 20-30 min. From this data their (movement) intentions are so far infered. We now propose a new paradigm that allows to completely omit such calibration and instead transfer knowledge from prior sessions. To achieve this goal we first define normalized CSP features and distances in-between. Second, we derive prototypical features across sessions: (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that, indeed, classifiers can be successfully transferred to a new session for a number of subjects.
same-paper 2 0.94656104 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full rank symmetric matrix coefficient and (b) as a difference of two rank=1 matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, which is a popular technique in Brain Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compares favorably against conventional CSP based classifiers. 1
3 0.82834131 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
Author: Moritz Grosse-wentrup, Klaus Gramann, Martin Buss
Abstract: The performance of EEG-based Brain-Computer-Interfaces (BCIs) critically depends on the extraction of features from the EEG carrying information relevant for the classification of different mental states. For BCIs employing imaginary movements of different limbs, the method of Common Spatial Patterns (CSP) has been shown to achieve excellent classification results. The CSP-algorithm however suffers from a lack of robustness, requiring training data without artifacts for good performance. To overcome this lack of robustness, we propose an adaptive spatial filter that replaces the training data in the CSP approach by a-priori information. More specifically, we design an adaptive spatial filter that maximizes the ratio of the variance of the electric field originating in a predefined region of interest (ROI) and the overall variance of the measured EEG. Since it is known that the component of the EEG used for discriminating imaginary movements originates in the motor cortex, we design two adaptive spatial filters with the ROIs centered in the hand areas of the left and right motor cortex. We then use these to classify EEG data recorded during imaginary movements of the right and left hand of three subjects, and show that the adaptive spatial filters outperform the CSP-algorithm, enabling classification rates of up to 94.7 % without artifact rejection. 1
4 0.62903422 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
Author: Steven Lemm, Christin Schäfer, Gabriel Curio
Abstract: We present a method for binary on-line classification of triggered but temporally blurred events that are embedded in noisy time series in the context of on-line discrimination between left and right imaginary hand-movement. In particular the goal of the binary classification problem is to obtain the decision, as fast and as reliably as possible from the recorded EEG single trials. To provide a probabilistic decision at every time-point t the presented method gathers information from two distinct sequences of features across time. In order to incorporate decisions from prior time-points we suggest an appropriate weighting scheme, that emphasizes time instances, providing a higher discriminatory power between the instantaneous class distributions of each feature, where the discriminatory power is quantified in terms of the Bayes error of misclassification. The effectiveness of this procedure is verified by its successful application in the 3rd BCI competition. Disclosure of the data after the competition revealed this approach to be superior with single trial error rates as low as 10.7, 11.5 and 16.7% for the three different subjects under study. 1
5 0.31285119 178 nips-2006-Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation
Author: Gavin C. Cawley, Nicola L. Talbot, Mark Girolami
Abstract: Multinomial logistic regression provides the standard penalised maximum-likelihood solution to multi-class pattern recognition problems. More recently, the development of sparse multinomial logistic regression models has found application in text processing and microarray classification, where explicit identification of the most informative features is of value. In this paper, we propose a sparse multinomial logistic regression method in which the sparsity arises from the use of a Laplace prior, but where the usual regularisation parameter is integrated out analytically. Evaluation over a range of benchmark datasets reveals that this approach yields generalisation performance similar to that obtained using cross-validation, but at greatly reduced computational expense.
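A Laplace prior corresponds, at the MAP point, to an L1 penalty on the weights. The sketch below illustrates only that connection on a binary (not multinomial) toy problem, using plain proximal gradient descent (ISTA) with a fixed regularisation constant rather than the paper's analytic integration; all data and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 30
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -2.0, 1.5]                 # only 3 informative features
y = (X @ w_true + 0.5 * rng.standard_normal(n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# ISTA for  mean log-loss + lam * ||w||_1 ; the soft-threshold step produces
# exact zeros, which is where the sparsity comes from.
lam, step = 0.05, 0.5
w = np.zeros(d)
for _ in range(500):
    grad = X.T @ (sigmoid(X @ w) - y) / n
    w = w - step * grad
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)

sparsity = (np.abs(w) < 1e-8).mean()
```

With enough regularisation, the uninformative weights are driven exactly to zero while the informative ones keep their signs, mirroring the "explicit identification of the most informative features" the abstract refers to.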
6 0.26589018 72 nips-2006-Efficient Learning of Sparse Representations with an Energy-Based Model
7 0.26062548 179 nips-2006-Sparse Representation for Signal Classification
8 0.25620753 156 nips-2006-Ordinal Regression by Extended Binary Classification
9 0.24210894 129 nips-2006-Map-Reduce for Machine Learning on Multicore
10 0.23516946 140 nips-2006-Multiple Instance Learning for Computer Aided Diagnosis
11 0.2331624 182 nips-2006-Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
12 0.22972138 73 nips-2006-Efficient Methods for Privacy Preserving Face Detection
13 0.22465166 148 nips-2006-Nonlinear physically-based models for decoding motor-cortical population activity
14 0.21463968 51 nips-2006-Clustering Under Prior Knowledge with Application to Image Segmentation
15 0.21244381 193 nips-2006-Tighter PAC-Bayes Bounds
16 0.19822912 63 nips-2006-Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods
17 0.19653791 84 nips-2006-Generalized Regularized Least-Squares Learning with Predefined Features in a Hilbert Space
18 0.19074805 118 nips-2006-Learning to Model Spatial Dependency: Semi-Supervised Discriminative Random Fields
19 0.19014584 105 nips-2006-Large Margin Component Analysis
20 0.1838069 76 nips-2006-Emergence of conjunctive visual features by quadratic independent component analysis
topicId topicWeight
[(1, 0.113), (3, 0.018), (7, 0.054), (9, 0.037), (13, 0.25), (20, 0.013), (22, 0.132), (44, 0.061), (57, 0.057), (65, 0.042), (69, 0.02), (71, 0.013), (90, 0.011), (95, 0.036), (98, 0.047)]
simIndex simValue paperId paperTitle
same-paper 1 0.83087826 126 nips-2006-Logistic Regression for Single Trial EEG Classification
Author: Ryota Tomioka, Kazuyuki Aihara, Klaus-Robert Müller
Abstract: We propose a novel framework for the classification of single-trial ElectroEncephaloGraphy (EEG), based on regularized logistic regression. Framed in this robust statistical framework, no prior feature extraction or outlier removal is required. We present two variations of parameterizing the regression function: (a) with a full-rank symmetric matrix coefficient and (b) as a difference of two rank-one matrices. In the first case, the problem is convex and the logistic regression is optimal under a generative model. The latter case is shown to be related to the Common Spatial Pattern (CSP) algorithm, a popular technique in Brain-Computer Interfacing. The regression coefficients can also be topographically mapped onto the scalp similarly to CSP projections, which allows neuro-physiological interpretation. Simulations on 162 BCI datasets demonstrate that classification accuracy and robustness compare favorably against conventional CSP-based classifiers.
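Variant (a) can be sketched concretely: the regression function scores each trial covariance Sigma by the inner product <W, Sigma> + b with a symmetric matrix W, and the regularized logistic loss in W is convex. The code below uses synthetic trial covariances and plain gradient descent as a stand-in for the authors' optimizer; everything here is a toy illustration, not their implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_ch, n_samp = 120, 6, 100

# Two classes whose trial covariances differ along one spatial direction,
# a toy stand-in for band-power differences in single-trial EEG.
labels = rng.integers(0, 2, n_trials)
covs = []
for y in labels:
    scale = np.ones(n_ch)
    scale[0] = 2.0 if y == 1 else 0.5         # class-dependent source power
    trial = scale[:, None] * rng.standard_normal((n_ch, n_samp))
    covs.append(trial @ trial.T / n_samp)
covs = np.array(covs)

# Variant (a): f(Sigma) = <W, Sigma> + b with a full symmetric matrix W,
# trained by gradient descent on the convex L2-regularized logistic loss.
W, b = np.zeros((n_ch, n_ch)), 0.0
lam, step = 0.01, 0.1
for _ in range(2000):
    z = np.clip(np.einsum('ij,nij->n', W, covs) + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    r = (p - labels) / n_trials
    gW = np.einsum('n,nij->ij', r, covs) + lam * W
    gW = (gW + gW.T) / 2                      # keep W symmetric
    W -= step * gW
    b -= step * r.sum()

acc = ((np.einsum('ij,nij->n', W, covs) + b > 0).astype(int) == labels).mean()
```

The learned symmetric W plays the role of the matrix coefficient that the paper then maps topographically onto the scalp; variant (b) would constrain W to a difference of two rank-one terms, recovering the CSP-like parameterization.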
2 0.80672997 111 nips-2006-Learning Motion Style Synthesis from Perceptual Observations
Author: Lorenzo Torresani, Peggy Hackney, Christoph Bregler
Abstract: This paper presents an algorithm for synthesis of human motion in specified styles. We use a theory of movement observation (Laban Movement Analysis) to describe movement styles as points in a multi-dimensional perceptual space. We cast the task of learning to synthesize desired movement styles as a regression problem: sequences generated via space-time interpolation of motion capture data are used to learn a nonlinear mapping between animation parameters and movement styles in perceptual space. We demonstrate that the learned model can apply a variety of motion styles to pre-recorded motion sequences and it can extrapolate styles not originally included in the training data.
3 0.77087575 79 nips-2006-Fast Iterative Kernel PCA
Author: Nicol N. Schraudolph, Simon Günter, S.v.n. Vishwanathan
Abstract: We introduce two methods to improve convergence of the Kernel Hebbian Algorithm (KHA) for iterative kernel PCA. KHA has a scalar gain parameter which is either held constant or decreased as 1/t, leading to slow convergence. Our KHA/et algorithm accelerates KHA by incorporating the reciprocal of the current estimated eigenvalues as a gain vector. We then derive and apply Stochastic MetaDescent (SMD) to KHA/et; this further speeds convergence by performing gain adaptation in RKHS. Experimental results for kernel PCA and spectral clustering of USPS digits as well as motion capture and image de-noising problems confirm that our methods converge substantially faster than conventional KHA.
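The reciprocal-eigenvalue gain can be illustrated in the simplest possible setting: a linear Oja-rule update (the linear-kernel analogue of a Hebbian PCA step) whose learning rate is scaled by a running estimate of the component's eigenvalue. This is only a toy analogue; the actual KHA/et operates on kernel expansion coefficients in RKHS.

```python
import numpy as np

rng = np.random.default_rng(4)
# Data with one dominant principal direction (axis 0, variance 5.0).
n, d = 2000, 5
X = rng.standard_normal((n, d)) * np.sqrt([5.0, 1.0, 0.5, 0.2, 0.1])

w = rng.standard_normal(d)
w /= np.linalg.norm(w)
lam_est, t = 1.0, 0
for _ in range(3):                             # a few passes over the data
    for x in X:
        t += 1
        y = w @ x
        lam_est += (y * y - lam_est) / t       # running eigenvalue estimate E[y^2]
        # Gain ~ 1 / (estimated eigenvalue): components with large eigenvalues
        # get smaller steps, the idea behind the KHA/et gain vector.
        eta = min(1.0 / (lam_est * np.sqrt(t)), 0.5)
        w += eta * y * (x - y * w)             # Oja/Hebbian update
        w /= np.linalg.norm(w)

align = abs(w[0])                              # cosine with the true top direction
```

After a few passes the filter aligns with the dominant eigendirection; holding eta constant instead makes the same update either noisy (large eta) or slow (small eta), which is the convergence problem the gain adaptation addresses.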
4 0.75974047 168 nips-2006-Reducing Calibration Time For Brain-Computer Interfaces: A Clustering Approach
Author: Matthias Krauledat, Michael Schröder, Benjamin Blankertz, Klaus-Robert Müller
Abstract: Up to now, even subjects who are experts in the use of machine-learning-based BCI systems still have to undergo a calibration session of about 20-30 min, from which their (movement) intentions are inferred. We now propose a new paradigm that allows such calibration to be omitted entirely by instead transferring knowledge from prior sessions. To achieve this goal, we first define normalized CSP features and distances between them. Second, we derive prototypical features across sessions, (a) by clustering or (b) by feature concatenation methods. Finally, we construct a classifier based on these individualized prototypes and show that classifiers can indeed be successfully transferred to a new session for a number of subjects.
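The normalization-and-prototype step can be sketched as follows: a spatial filter's overall sign and scale are arbitrary, so filters from different sessions are compared as directions, with a sign-invariant cosine distance, and a cluster's prototype is taken as the dominant direction of its filters. All filters below are synthetic (noisy copies of two hypothetical underlying prototypes); this is an illustration of the distance construction, not the authors' pipeline.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical CSP-like spatial filters from several prior sessions: noisy
# copies of two underlying prototype directions.
d, n_sessions = 8, 10
proto = rng.standard_normal((2, d))
filters = np.concatenate([
    proto[k] + 0.1 * rng.standard_normal((n_sessions, d)) for k in (0, 1)
])

# Normalize: only the filter's direction matters, and its sign is arbitrary,
# so use distance = 1 - |cosine similarity|.
filters /= np.linalg.norm(filters, axis=1, keepdims=True)

def dist(u, v):
    return 1.0 - abs(u @ v)

# Prototype of a set of filters: dominant eigendirection of the summed outer
# products, a sign-invariant alternative to averaging the filters directly.
def prototype(F):
    vals, vecs = np.linalg.eigh(F.T @ F)
    return vecs[:, -1]

p0 = prototype(filters[:n_sessions])
p1 = prototype(filters[n_sessions:])
```

Each prototype recovers its underlying direction despite the session-to-session noise, while the two prototypes stay well separated; a classifier built on such prototypes is what replaces the calibration session.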
5 0.61323905 62 nips-2006-Correcting Sample Selection Bias by Unlabeled Data
Author: Jiayuan Huang, Arthur Gretton, Karsten M. Borgwardt, Bernhard Schölkopf, Alex J. Smola
Abstract: We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
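The "matching distributions in feature space" idea can be sketched with kernel mean matching: choose non-negative training weights beta so that the weighted training mean in a Gaussian-kernel feature space approaches the test mean. The code below solves the resulting quadratic objective with plain projected gradient descent instead of a QP solver, and omits the usual constraint that the weights average to one; data and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
# Training sample biased toward negative x; test sample unbiased.
x_tr = rng.normal(-1.0, 1.0, 150)
x_te = rng.normal(0.0, 1.0, 150)

def gauss_kernel(a, b, sigma=1.0):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

K = gauss_kernel(x_tr, x_tr)
kappa = gauss_kernel(x_tr, x_te).mean(axis=1) * len(x_tr)

# Kernel mean matching objective: min_beta 0.5 b'Kb - kappa'b, beta >= 0,
# solved here by projected gradient descent with a box constraint.
beta = np.ones(len(x_tr))
step = 1.0 / np.linalg.eigvalsh(K).max()
for _ in range(2000):
    beta -= step * (K @ beta - kappa)
    beta = np.clip(beta, 0.0, 10.0)

# Reweighting should pull the biased training mean toward the test mean.
mean_raw = x_tr.mean()
mean_rw = (beta * x_tr).sum() / beta.sum()
```

Training points that fall where the test distribution has mass get larger weights, so the reweighted sample mean moves toward the test mean without ever estimating either density explicitly, which is the method's selling point.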
6 0.61036897 22 nips-2006-Adaptive Spatial Filters with predefined Region of Interest for EEG based Brain-Computer-Interfaces
7 0.60771865 61 nips-2006-Convex Repeated Games and Fenchel Duality
8 0.60740292 152 nips-2006-Online Classification for Complex Problems Using Simultaneous Projections
9 0.60580689 165 nips-2006-Real-time adaptive information-theoretic optimization of neurophysiology experiments
10 0.60325062 24 nips-2006-Aggregating Classification Accuracy across Time: Application to Single Trial EEG
11 0.60080308 195 nips-2006-Training Conditional Random Fields for Maximum Labelwise Accuracy
12 0.59657973 83 nips-2006-Generalized Maximum Margin Clustering and Unsupervised Kernel Learning
13 0.59615737 20 nips-2006-Active learning for misspecified generalized linear models
14 0.59394652 87 nips-2006-Graph Laplacian Regularization for Large-Scale Semidefinite Programming
15 0.59371471 172 nips-2006-Scalable Discriminative Learning for Natural Language Parsing and Translation
16 0.58931023 131 nips-2006-Mixture Regression for Covariate Shift
17 0.58780479 203 nips-2006-implicit Online Learning with Kernels
18 0.58709979 194 nips-2006-Towards a general independent subspace analysis
19 0.58382726 65 nips-2006-Denoising and Dimension Reduction in Feature Space
20 0.58343595 92 nips-2006-High-Dimensional Graphical Model Selection Using $\ell 1$-Regularized Logistic Regression