nips nips2007 nips2007-127 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Justin Dauwels, François Vialatte, Tomasz Rutkowski, Andrzej S. Cichocki
Abstract: A novel approach to measure the interdependence of two time series is proposed, referred to as “stochastic event synchrony” (SES); it quantifies the alignment of two point processes by means of the following parameters: time delay, variance of the timing jitter, fraction of “spurious” events, and average similarity of events. SES may be applied to generic one-dimensional and multi-dimensional point processes; however, the paper mainly focuses on point processes in the time-frequency domain. The average event similarity is in that case described by two parameters: the average frequency offset between events in the time-frequency plane, and the variance of the frequency offset (“frequency jitter”); SES then consists of five parameters in total. Those parameters quantify the synchrony of oscillatory events, and hence they provide an alternative to existing synchrony measures that quantify amplitude or phase synchrony. The pairwise alignment of point processes is cast as a statistical inference problem, which is solved by applying the max-product algorithm on a graphical model. The SES parameters are determined from the resulting pairwise alignment by maximum a posteriori (MAP) estimation. The proposed interdependence measure is applied to the problem of detecting anomalies in EEG synchrony of Mild Cognitive Impairment (MCI) patients; the results indicate that SES significantly improves the sensitivity of EEG in detecting MCI.
Reference: text
sentIndex sentText sentNum sentScore
1 SES may be applied to generic one-dimensional and multi-dimensional point processes; however, the paper mainly focuses on point processes in the time-frequency domain. [sent-5, score-0.083]
2 The average event similarity is in that case described by two parameters: the average frequency offset between events in the time-frequency plane, and the variance of the frequency offset (“frequency jitter”); SES then consists of five parameters in total. [sent-6, score-0.528]
3 Those parameters quantify the synchrony of oscillatory events, and hence, they provide an alternative to existing synchrony measures that quantify amplitude or phase synchrony. [sent-7, score-0.73]
4 The pairwise alignment of point processes is cast as a statistical inference problem, which is solved by applying the max-product algorithm on a graphical model. [sent-8, score-0.175]
5 The SES parameters are determined from the resulting pairwise alignment by maximum a posteriori (MAP) estimation. [sent-9, score-0.094]
6 The proposed interdependence measure is applied to the problem of detecting anomalies in EEG synchrony of Mild Cognitive Impairment (MCI) patients; the results indicate that SES significantly improves the sensitivity of EEG in detecting MCI. [sent-10, score-0.371]
7 For instance, it is hotly debated whether the synchronous firing of neurons plays a role in cognition [1] and even in consciousness [2]. [sent-12, score-0.086]
8 The synchronous firing paradigm has also attracted substantial attention in both the experimental (e. [sent-13, score-0.059]
9 Moreover, medical studies have reported that many neurophysiological diseases (such as Alzheimer’s disease) are often associated with abnormalities in neural synchrony [5, 6]. [sent-18, score-0.31]
10 The pairwise alignment of point processes is cast as a statistical inference problem, which is solved by applying the max-product algorithm on a graphical model [7]. [sent-20, score-0.175]
11 Our experiments, however, indicate that it finds reasonable alignments in practice. [sent-22, score-0.028]
12 The SES parameters are determined from the resulting pairwise alignments by maximum a posteriori (MAP) estimation. [sent-23, score-0.056]
13 The proposed method may be helpful to detect mental disorders such as Alzheimer’s disease, since mental disorders are often associated with abnormal blood and neural activity flows, and changes in the synchrony of brain activity (see, e. [sent-24, score-0.459]
14 In this paper, we will present promising results on the early prediction of Alzheimer’s disease from EEG signals based on SES. [sent-27, score-0.087]
15 In the next section, we introduce SES for the case of one-dimensional point processes. [sent-29, score-0.022]
16 In Section 3, we consider the extension to multi-dimensional point processes. [sent-30, score-0.022]
17 In Section 4, we use our measure to detect abnormalities in the EEG synchrony of Alzheimer’s disease patients. [sent-31, score-0.383]
18 2 One-Dimensional Point Processes Let us consider the one-dimensional point processes (“event strings”) X and X′ in Fig. [sent-32, score-0.061]
19 We wish to quantify to what extent X and X′ are synchronized. [sent-34, score-0.036]
20 Intuitively speaking, two event strings can be considered as synchronous (or “locked”) if they are identical apart from: (i) a time shift δt ; (ii) small deviations in the event occurrence times (“event timing jitter”); (iii) a few event insertions and/or deletions. [sent-35, score-0.66]
21 More precisely, for two event strings to be synchronous, the event timing jitter should be significantly smaller than the average inter-event time, and the number of deletions and insertions should comprise only a small fraction of the total number of events. [sent-36, score-0.53]
22 This intuitive concept of synchrony is illustrated in Fig. [sent-37, score-0.269]
23 The event string X′ is obtained from event string X by successively shifting X over δt (resulting in Y), slightly perturbing the event occurrence times (resulting in Z), and finally adding (plus sign) and deleting (minus sign) events, resulting in X′. [sent-39, score-0.537]
24 Adding and deleting events in Z leads to “spurious” events in X and X′ (see Fig. [sent-40, score-0.286]
25 1(a); spurious events are marked in red): a spurious event in X is an event that cannot be paired with an event in X′, and vice versa. [sent-41, score-0.897]
26 The above intuitive reasoning leads to our novel measure for synchrony between two event strings, i. [sent-42, score-0.434]
27 SES is related to the metrics (“distances”) proposed in [9]; those metrics are single numbers that quantify the synchrony between event strings. [sent-45, score-0.449]
28 In contrast, we characterize synchrony by means of three parameters, which allows us to distinguish different types of synchrony (see [10]). [sent-46, score-0.538]
29 First, one generates an event string V of length ℓ, where the events Vk are mutually independent and uniformly distributed in [0, T0]. [sent-50, score-0.299]
30 The strings Z and Z′ are generated by delaying V over −δt/2 and δt/2, respectively, and by (slightly) perturbing the resulting event occurrence times (the variance of the timing jitter equals st/2). [sent-51, score-0.652]
31 The sequences X and X′ are obtained from Z and Z′ by removing some of the events; more precisely, from each pair (Zk, Z′k), either Zk or Z′k is removed with probability ps. [sent-52, score-0.057]
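The symmetric generative procedure described in these sentences can be illustrated with a short simulation. This is only a minimal sketch, not the authors' implementation: the function name `generate_event_strings`, the Gaussian form of the jitter, and the choice to drop either member of a pair with equal probability are assumptions made here for illustration.

```python
import random

def generate_event_strings(length, t0, delta_t, s_t, p_s, seed=None):
    """Sample a pair of event strings (x, x_prime) from the symmetric procedure.

    Hypothetical helper: names and the 50/50 removal rule are illustrative.
    """
    rng = random.Random(seed)
    sigma = (s_t / 2.0) ** 0.5  # each string receives jitter of variance s_t / 2
    x, x_prime = [], []
    for _ in range(length):
        v = rng.uniform(0.0, t0)  # hidden event V_k, uniform in [0, T0]
        z = v - delta_t / 2.0 + rng.gauss(0.0, sigma)        # Z_k
        z_prime = v + delta_t / 2.0 + rng.gauss(0.0, sigma)  # Z'_k
        if rng.random() < p_s:
            # One event of the pair is removed, so the survivor is spurious.
            # Assumption: either side is dropped with equal probability.
            if rng.random() < 0.5:
                x_prime.append(z_prime)
            else:
                x.append(z)
        else:
            x.append(z)
            x_prime.append(z_prime)
    return sorted(x), sorted(x_prime)
```

With `p_s = 0` and `s_t = 0`, the two returned strings are identical up to the shift `delta_t`, which matches the intuitive notion of perfectly synchronous event strings.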
32 Since we do not wish/need to encode prior information about δt and st , we adopt improper priors p(δt ) = 1 = p(st ). [sent-56, score-0.247]
33 In the following, we will denote model (6) by p(x, x′, j, j′, δt, st) instead of p(x, x′, b, b′, δt, st, ℓ), since for given x, x′, b, and b′ (and hence given n, n′, and nnon-spur), the length ℓ is fully determined, i. [sent-62, score-0.494]
34 It is also noteworthy that T0, λ and ps do not need to be specified individually, since they appear in (6) only through β. [sent-67, score-0.077]
35 The latter serves in practice as a knob to control the number of spurious events. [sent-68, score-0.166]
36 [Figure 1: One-dimensional stochastic event synchrony. Panels: (a) asymmetric procedure; (b) symmetric procedure.] [sent-69, score-0.144]
37 Given event strings X and X′, we wish to determine the parameters δt and st, and the hidden variables B and B′; the parameter ρspur (cf. [sent-70, score-0.451]
38 (1)) can be obtained from the latter: $\rho_{\text{spur}} = \frac{\sum_{k=1}^{n} b_k + \sum_{k'=1}^{n'} b'_{k'}}{n + n'}$. [sent-71, score-0.316]
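As a small worked example, the fraction of spurious events is the number of events flagged as spurious in either string, divided by the total number of events in both strings. The sketch below assumes the hidden variables are given as 0/1 lists in which 1 marks a spurious event; the function name `spurious_fraction` is hypothetical.

```python
def spurious_fraction(b, b_prime):
    """Fraction of spurious events, given 0/1 indicators (1 = spurious)."""
    total = len(b) + len(b_prime)  # n + n'
    if total == 0:
        raise ValueError("both event strings are empty")
    return (sum(b) + sum(b_prime)) / total

# One spurious event out of four in X, two out of four in X'.
rho = spurious_fraction([0, 1, 0, 0], [0, 0, 1, 1])  # 3 / 8
```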
39 3 Multi-Dimensional Point Processes In this section, we will focus on the interdependence of multi-dimensional point processes. [sent-75, score-0.103]
40 As a concrete example, we will consider multi-dimensional point processes in the time-frequency domain; the proposed algorithm, however, is not restricted to that particular situation; it is applicable to generic multi-dimensional point processes. [sent-76, score-0.083]
41 2 and [17]); each bump is described by five parameters: time X, frequency F , width ∆X, height ∆F , and amplitude W . [sent-81, score-0.214]
42 The resulting bump models Y = ((X1 , F1 , ∆X1 , ∆F1 , W1 ), . [sent-82, score-0.14]
43 , (Xn , Fn , ∆Xn , ∆Fn , Wn )), representing the most prominent oscillatory activity, are thus 5-dimensional point processes. [sent-88, score-0.06]
44 Our extension of stochastic event synchrony to multi-dimensional point processes (and bump models in particular) is derived from the following observation (see Fig. [sent-89, score-0.614]
45 3): bumps in one time-frequency map may not be present in the other map (“spurious” bumps); other bumps are present in both maps (“non-spurious bumps”), but appear at slightly different positions on the maps. [sent-90, score-0.236]
46 3 connect the centers of non-spurious bumps, and hence, visualize the offset between pairs of non-spurious bumps. [sent-92, score-0.064]
47 We quantify the interdependence between two bump models by five parameters, i. [sent-93, score-0.257]
48 , the parameters ρspur , δt , and st introduced in Section 2, in addition to: • δf : the average frequency offset between non-spurious bumps, • sf : the variance of the frequency offset between non-spurious bumps. [sent-95, score-0.719]
49 We determine the alignment of two bump models in addition to the 5 above parameters by an inference algorithm similar to the one of Section 2, as we will explain in the following; we will use the notation θ = (δt , st , δf , sf ). [sent-96, score-0.674]
50 In principle, one may determine the sequences J and J and the parameters θ by cyclic maximization along the lines of (8) and (9). [sent-98, score-0.069]
51 As a result, the Viterbi algorithm (or equivalently, the max-product algorithm applied on cycle-free factor graph of model (10)) becomes impractical. [sent-100, score-0.044]
52 We solve this problem by applying the max-product algorithm on a cyclic factor graph of the system at hand, which will amount to a suboptimal but practical procedure to obtain pairwise alignments of multi-dimensional point processes (and bump models in particular). [sent-101, score-0.37]
53 To this end, we introduce a representation of model (10) that is naturally represented by a cyclic graph: for each pair of events Yk and Y′k′, we introduce a binary variable Ckk′ that equals one if Yk and Y′k′ form a pair of non-spurious events and is zero otherwise. [sent-102, score-0.335]
54 Since each event in Y is associated with at most one event in Y′, we have the constraints: $\sum_{k'=1}^{n'} C_{1k'} = S_1 \in \{0, 1\}$, $\sum_{k'=1}^{n'} C_{2k'} = S_2 \in \{0, 1\}$, . [sent-103, score-0.288]
55 , $\sum_{k'=1}^{n'} C_{nk'} = S_n \in \{0, 1\}$ (11) and similarly, each event in Y′ is associated with at most one event in Y, which is expressed by a similar set of constraints. [sent-106, score-0.288]
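The constraints say that a binary alignment matrix may contain at most one 1 in every row and in every column. A minimal check of that property, assuming the alignment is stored as a list of lists (rows indexed by events of Y, columns by events of Y′), might look as follows; the name `is_valid_alignment` is hypothetical.

```python
def is_valid_alignment(c):
    """Return True if c is binary with every row sum and column sum in {0, 1}."""
    binary = all(v in (0, 1) for row in c for v in row)
    rows_ok = all(sum(row) <= 1 for row in c)        # each event of Y paired at most once
    cols_ok = all(sum(col) <= 1 for col in zip(*c))  # each event of Y' paired at most once
    return binary and rows_ok and cols_ok
```

For instance, `[[1, 0], [0, 1]]` is a valid alignment, whereas `[[1, 1], [0, 0]]` pairs one event of Y with two events of Y′ and is rejected.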
56 On the other hand, we have prior knowledge about st and sf . [sent-111, score-0.468]
57 Indeed, we expect a bump in one time-frequency map to appear in the other map at about the same frequency, but there may be some timing offset between both bumps. [sent-112, score-0.326]
58 8s), since the former is much closer in frequency than the latter. [sent-120, score-0.052]
59 As a consequence, we a priori expect smaller values for sf than for st . [sent-121, score-0.468]
60 We encode this prior information by means of conjugate priors for st and sf , i. [sent-122, score-0.468]
61 4 (each edge represents a variable, each node corresponds to a factor of (14), as indicated by the arrows at the right hand side; we refer to [7] for an introduction to factor graphs). [sent-126, score-0.04]
62 We omitted the edges for the (observed) variables Xk, X′k′, Fk, F′k′, ∆Xk, ∆X′k′, ∆Fk, and ∆F′k′ in order not to clutter the figure. [sent-127, score-0.029]
63 [Figure 2: Two-dimensional stochastic event synchrony. Each time-frequency map is reduced to a bump model, and the two bump models are then compared.] [sent-128, score-0.2]
64 [Figure panel (b); horizontal axis: t [s].] Non-spurious bumps (ρspur = 27%); the black lines connect the centers of non-spurious bumps. [sent-134, score-0.09]
66 [Figure 4: Factor graph of model (14). The recoverable labels are the binary alignment variables ckk′ with their Σ-node constraints, Gaussian factors N(·; δt, st) and N(·; δf, sf) on the width-normalized timing and frequency offsets, the parameter estimate θ̂ = (δt, st, δf, sf), and the factorized prior p(δt, st, δf, sf) = p(δt) p(st) p(δf) p(sf).] [sent-160, score-1.558]
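67 From ĉ, we obtain the estimate ρ̂spur as: $\hat{\rho}_{\text{spur}} = \frac{\sum_{k=1}^{n} \hat{b}_k + \sum_{k'=1}^{n'} \hat{b}'_{k'}}{n + n'} = \frac{n + n' - 2 \sum_{k=1}^{n} \sum_{k'=1}^{n'} \hat{c}_{kk'}}{n + n'}$. [sent-161, score-0.556]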
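Each non-spurious pair accounts for two events, one in each bump model, so the number of spurious events is the total event count minus twice the number of pairs. The sketch below computes the estimate from a binary alignment matrix under that reading; it assumes the matrix has one row per event of Y and one column per event of Y′, and the name `spurious_fraction_from_alignment` is hypothetical.

```python
def spurious_fraction_from_alignment(c):
    """Estimate the spurious fraction from an n x n' binary alignment matrix."""
    n = len(c)
    n_prime = len(c[0]) if n else 0
    if n + n_prime == 0:
        raise ValueError("alignment matrix is empty")
    pairs = sum(sum(row) for row in c)  # number of non-spurious pairs
    return (n + n_prime - 2 * pairs) / (n + n_prime)

# Two events in Y, three in Y', two aligned pairs: one spurious event out of five.
rho_hat = spurious_fraction_from_alignment([[1, 0, 0], [0, 0, 1]])
```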
68 The estimate $\hat{\theta}^{(i+1)}$ (17) is available in closed form; indeed, it is easily verified that the point estimates $\hat{\delta}_t^{(i+1)}$ and $\hat{\delta}_f^{(i+1)}$ are the (sample) means of the timing and frequency offsets, respectively, computed over all pairs of non-spurious events. [sent-163, score-0.204]
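The closed-form update of the two offset parameters amounts to averaging over the aligned pairs. The following sketch shows only that averaging step; it is not the authors' code. It assumes raw (not width-normalized) offsets taken as primed minus unprimed, and the name `update_mean_offsets` and its argument layout are hypothetical. The variance parameters are left out because their update also involves the priors.

```python
from statistics import fmean

def update_mean_offsets(x, f, x_prime, f_prime, c):
    """Return (delta_t, delta_f) as sample means over aligned (non-spurious) pairs.

    Assumption: offsets are raw differences, primed minus unprimed.
    """
    dt, df = [], []
    for k, row in enumerate(c):
        for k2, linked in enumerate(row):
            if linked:  # c[k][k2] == 1 means event k of Y is paired with event k2 of Y'
                dt.append(x_prime[k2] - x[k])
                df.append(f_prime[k2] - f[k])
    if not dt:
        raise ValueError("no non-spurious pairs to average over")
    return fmean(dt), fmean(df)
```

In an alternating scheme, this step would be applied after each new alignment has been computed, and the resulting parameter values would be used for the next alignment.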
69 (i+1) and sf ˆ are obtained simi- ˆ Update (16), i. [sent-165, score-0.221]
70 , finding the optimal pairwise alignment C for given values θ(i) of the parameters θ, is less straightforward: it involves an intractable combinatorial optimization problem. [sent-167, score-0.094]
71 We attempt to solve that problem by applying the max-product algorithm to the (cyclic) factor graph depicted in Fig. [sent-168, score-0.044]
72 Let us first point out that, since the alignment C is computed for given $\theta = \hat{\theta}^{(i)}$, the (upward) messages along the edges θ are the point estimate $\hat{\theta}^{(i)}$ (cf. [sent-170, score-0.2]
73 (16)); equivalently, for the purpose of computing (16), one may remove the θ edges and the two bottom nodes in Fig. [sent-171, score-0.029]
74 The other messages in the graph are iteratively updated according to the generic max-product update rule [7]. [sent-173, score-0.085]
75 The messages µ↑(ckk′) and µ′↑(ckk′) propagate upward along the edges ckk′ towards the Σ-nodes connected to the edges Bk and B′k′, respectively (see Fig. [sent-175, score-0.702]
76 4, left hand side); the messages µ↓(ckk′) and µ′↓(ckk′) propagate downward along the edges ckk′ from the Σ-nodes connected to the edges Bk and B′k′, respectively. [sent-176, score-0.694]
77 After initialization (18) of the messages µ↑(ckk′) and µ′↑(ckk′) (k = 1, 2, . [sent-177, score-0.061]
78 Finally, one computes the marginals p(ckk′) (23), and from the latter, one may determine the decisions ĉkk′ by greedy decimation. [sent-184, score-0.556]
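79 The sampling frequency was 200 Hz, and the signals were bandpass filtered between 4 [Algorithm summary table. Initialization (18): $\mu{\uparrow}(c_{kk'}) = \mu'{\uparrow}(c_{kk'}) \propto \left[ \mathcal{N}\!\left( \frac{x'_{k'} - x_k}{\Delta x_k + \Delta x'_{k'}} ; \delta_t, s_t \right) \mathcal{N}\!\left( \frac{f'_{k'} - f_k}{\Delta f_k + \Delta f'_{k'}} ; \delta_f, s_f \right) \right]^{c_{kk'}}$. Iteratively compute messages until convergence: A.] [sent-186, score-2.076]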
80 The subjects comprised two study groups: the first consisted of a group of 22 patients diagnosed as suffering from MCI, who subsequently developed mild AD. [sent-190, score-0.067]
81 We computed a large variety of synchrony measures for both data sets; the results are summarized in Table 2. [sent-193, score-0.297]
82 We report results for global synchrony, obtained by averaging the synchrony measures over 5 brain regions (frontal, temporal left and right, central, occipital). [sent-194, score-0.34]
83 For SES, the bump models were clustered by means of the aggregation algorithm described in [17]. [sent-195, score-0.14]
84 The strongest observed effect is a significantly higher degree of background noise (ρspur ) in MCI patients, more specifically, a high number of spurious, non-synchronous oscillatory events (p = 0. [sent-196, score-0.171]
85 We verified that the SES measures are not correlated (Pearson r) with other synchrony measures (p > 0. [sent-198, score-0.325]
86 10); in contrast to the other measures, SES quantifies the synchrony of oscillatory events (instead of more conventional amplitude or phase synchrony). [sent-199, score-0.494]
87 Interestingly, we did not observe a significant effect on the timing jitter st of the non-spurious events (p = 0. [sent-203, score-0.54]
88 In other words, AD seems to be associated with a significant increase of spurious background activity, while the non-spurious activity remains well synchronized. [sent-205, score-0.208]
89 Moreover, only the non-spurious activity slows down (p = 0. [sent-206, score-0.042]
90 5(c)), the average frequency of the spurious activity is not affected in MCI patients (see Fig. [sent-208, score-0.31]
91 012∗ References [16] [18] [20] Measure Granger coherence Partial Coherence PDC DTF ffDTF dDTF p-value 0. [sent-216, score-0.032]
92 030∗ Measure Kullback-Leibler Rényi Jensen-Shannon Jensen-Rényi IW I p-value 0. [sent-222, score-0.048]
93 00021∗∗ Table 2: Sensitivity of synchrony measures for early prediction of AD (p-values for Mann-Whitney test; * and ** indicate p < 0. [sent-238, score-0.297]
94 N^k, S^k, and H^k are three measures of nonlinear interdependence [15]. [sent-241, score-0.109]
wordName wordTfidf (topN-words)
[('ckk', 0.556), ('fk', 0.272), ('synchrony', 0.269), ('st', 0.247), ('spur', 0.23), ('sf', 0.221), ('xk', 0.18), ('ses', 0.176), ('spurious', 0.166), ('bk', 0.158), ('event', 0.144), ('bump', 0.14), ('events', 0.133), ('eeg', 0.125), ('mci', 0.122), ('nspur', 0.122), ('jitter', 0.094), ('bumps', 0.09), ('alzheimer', 0.081), ('interdependence', 0.081), ('cyclic', 0.069), ('timing', 0.066), ('alignment', 0.066), ('fn', 0.064), ('offset', 0.064), ('messages', 0.061), ('strings', 0.06), ('synchronous', 0.059), ('ps', 0.057), ('vik', 0.054), ('frequency', 0.052), ('disease', 0.052), ('patients', 0.05), ('brain', 0.043), ('activity', 0.042), ('bn', 0.042), ('abnormalities', 0.041), ('cnn', 0.041), ('ctr', 0.041), ('ffdtf', 0.041), ('ntot', 0.041), ('xjk', 0.041), ('processes', 0.039), ('oscillatory', 0.038), ('synchronization', 0.036), ('quantify', 0.036), ('firing', 0.035), ('signals', 0.035), ('argmax', 0.034), ('zk', 0.033), ('coherence', 0.032), ('phase', 0.032), ('edges', 0.029), ('map', 0.028), ('alignments', 0.028), ('pairwise', 0.028), ('yk', 0.028), ('measures', 0.028), ('cichocki', 0.027), ('cnk', 0.027), ('consciousness', 0.027), ('lachaux', 0.027), ('martinerie', 0.027), ('quiroga', 0.027), ('riken', 0.027), ('saitama', 0.027), ('varela', 0.027), ('vialatte', 0.027), ('upward', 0.027), ('wavelet', 0.025), ('xn', 0.025), ('graph', 0.024), ('nyi', 0.024), ('rodriguez', 0.024), ('kraskov', 0.024), ('grassberger', 0.024), ('point', 0.022), ('string', 0.022), ('amplitude', 0.022), ('justin', 0.022), ('insertions', 0.022), ('disorders', 0.022), ('measure', 0.021), ('occurrence', 0.021), ('likewise', 0.02), ('cast', 0.02), ('noteworthy', 0.02), ('amari', 0.02), ('elapsed', 0.02), ('deleting', 0.02), ('perturbing', 0.02), ('factor', 0.02), ('hz', 0.02), ('downward', 0.019), ('blood', 0.019), ('variance', 0.019), ('clinical', 0.018), ('sk', 0.018), ('alternates', 0.017), ('mild', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 127 nips-2007-Measuring Neural Synchrony by Message Passing
2 0.14404741 102 nips-2007-Incremental Natural Actor-Critic Algorithms
Author: Shalabh Bhatnagar, Mohammad Ghavamzadeh, Mark Lee, Richard S. Sutton
Abstract: We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
3 0.11561605 58 nips-2007-Consistent Minimization of Clustering Objective Functions
Author: Ulrike V. Luxburg, Stefanie Jegelka, Michael Kaufmann, Sébastien Bubeck
Abstract: Clustering is often formulated as a discrete optimization problem. The objective is to find, among all partitions of the data set, the best one according to some quality measure. However, in the statistical setting where we assume that the finite data set has been sampled from some underlying space, the goal is not to find the best partition of the given sample, but to approximate the true partition of the underlying space. We argue that the discrete optimization approach usually does not achieve this goal. As an alternative, we suggest the paradigm of “nearest neighbor clustering”. Instead of selecting the best out of all partitions of the sample, it only considers partitions in some restricted function class. Using tools from statistical learning theory we prove that nearest neighbor clustering is statistically consistent. Moreover, its worst case complexity is polynomial by construction, and it can be implemented with small average case complexity using branch and bound.
4 0.11460064 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
Author: Francois Meyer, Greg Stephens
Abstract: Functional Magnetic Resonance Imaging (fMRI) provides dynamical access into the complex functioning of the human brain, detailing the hemodynamic activity of thousands of voxels during hundreds of sequential time points. One approach towards illuminating the connection between fMRI and cognitive function is through decoding; how do the time series of voxel activities combine to provide information about internal and external experience? Here we seek models of fMRI decoding which are balanced between the simplicity of their interpretation and the effectiveness of their prediction. We use signals from a subject immersed in virtual reality to compare global and local methods of prediction applying both linear and nonlinear techniques of dimensionality reduction. We find that the prediction of complex stimuli is remarkably low-dimensional, saturating with less than 100 features. In particular, we build effective models based on the decorrelated components of cognitive activity in the classically-defined Brodmann areas. For some of the stimuli, the top predictive areas were surprisingly transparent, including Wernicke’s area for verbal instructions, visual cortex for facial and body features, and visual-temporal regions for velocity. Direct sensory experience resulted in the most robust predictions, with the highest correlation (c ∼ 0.8) between the predicted and experienced time series of verbal instructions. Techniques based on non-linear dimensionality reduction (Laplacian eigenmaps) performed similarly. The interpretability and relative simplicity of our approach provides a conceptual basis upon which to build more sophisticated techniques for fMRI decoding and offers a window into cognitive function during dynamic, natural experience.
5 0.11112355 173 nips-2007-Second Order Bilinear Discriminant Analysis for single trial EEG analysis
Author: Christoforos Christoforou, Paul Sajda, Lucas C. Parra
Abstract: Traditional analysis methods for single-trial classification of electroencephalography (EEG) focus on two types of paradigms: phase locked methods, in which the amplitude of the signal is used as the feature for classification, e.g. event related potentials; and second order methods, in which the feature of interest is the power of the signal, e.g. event related (de)synchronization. The procedure for deciding which paradigm to use is ad hoc and is typically driven by knowledge of the underlying neurophysiology. Here we propose a principled method, based on a bilinear model, in which the algorithm simultaneously learns the best first and second order spatial and temporal features for classification of EEG. The method is demonstrated on simulated data as well as on EEG taken from a benchmark data used to test classification algorithms for brain computer interfaces.
6 0.10376125 146 nips-2007-On higher-order perceptron algorithms
7 0.084315971 123 nips-2007-Loop Series and Bethe Variational Bounds in Attractive Graphical Models
8 0.083650783 213 nips-2007-Variational Inference for Diffusion Processes
9 0.077206269 86 nips-2007-Exponential Family Predictive Representations of State
10 0.07609804 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
11 0.071216337 103 nips-2007-Inferring Elapsed Time from Stochastic Neural Processes
12 0.064764179 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess
13 0.062572822 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events
14 0.060851205 197 nips-2007-The Infinite Markov Model
15 0.060745236 151 nips-2007-Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
16 0.052898414 191 nips-2007-Temporal Difference Updating without a Learning Rate
17 0.051750429 106 nips-2007-Invariant Common Spatial Patterns: Alleviating Nonstationarities in Brain-Computer Interfacing
18 0.047558192 74 nips-2007-EEG-Based Brain-Computer Interaction: Improved Accuracy by Automatic Single-Trial Error Detection
19 0.04688992 163 nips-2007-Receding Horizon Differential Dynamic Programming
20 0.041617323 200 nips-2007-The Tradeoffs of Large Scale Learning
topicId topicWeight
[(0, -0.131), (1, -0.054), (2, 0.055), (3, -0.025), (4, -0.06), (5, 0.019), (6, 0.016), (7, 0.018), (8, -0.088), (9, -0.109), (10, -0.054), (11, -0.034), (12, -0.098), (13, -0.029), (14, 0.134), (15, 0.105), (16, -0.044), (17, -0.046), (18, -0.247), (19, 0.247), (20, -0.019), (21, -0.029), (22, 0.117), (23, -0.009), (24, 0.06), (25, 0.067), (26, 0.004), (27, 0.081), (28, 0.036), (29, -0.01), (30, 0.047), (31, 0.074), (32, 0.027), (33, -0.065), (34, 0.078), (35, -0.007), (36, -0.103), (37, 0.026), (38, -0.003), (39, 0.005), (40, 0.013), (41, -0.063), (42, -0.111), (43, 0.043), (44, -0.044), (45, -0.065), (46, 0.076), (47, 0.089), (48, -0.091), (49, 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.96829396 127 nips-2007-Measuring Neural Synchrony by Message Passing
2 0.54532981 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess
Author: Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel
Abstract: We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering. The skill of each participating player, say, every year is represented by a latent skill variable which is affected by the relevant game outcomes that year, and coupled with the skill variables of the previous and subsequent year. Inference in the resulting factor graph is carried out by approximate message passing (EP) along the time series of skills. As before the system tracks the uncertainty about player skills, explicitly models draws, can deal with any number of competing entities and can infer individual skills from team results. We extend the system to estimate player-specific draw margins. Based on these models we present an analysis of the skill curves of important players in the history of chess over the past 150 years. Results include plots of players’ lifetime skill development as well as the ability to compare the skills of different players across time. Our results indicate that a) the overall playing strength has increased over the past 150 years, and b) that modelling a player’s ability to force a draw provides significantly better predictive power.
3 0.46219495 191 nips-2007-Temporal Difference Updating without a Learning Rate
Author: Marcus Hutter, Shane Legg
Abstract: We derive an equation for temporal difference learning from statistical principles. Specifically, we start with the variational principle and then bootstrap to produce an updating rule for discounted state value estimates. The resulting equation is similar to the standard equation for temporal difference learning with eligibility traces, so called TD(λ), however it lacks the parameter α that specifies the learning rate. In the place of this free parameter there is now an equation for the learning rate that is specific to each state transition. We experimentally test this new learning rule against TD(λ) and find that it offers superior performance in various settings. Finally, we make some preliminary investigations into how to extend our new temporal difference algorithm to reinforcement learning. To do this we combine our update equation with both Watkins’ Q(λ) and Sarsa(λ) and find that it again offers superior performance without a learning rate parameter.
4 0.42587838 102 nips-2007-Incremental Natural Actor-Critic Algorithms
Author: Shalabh Bhatnagar, Mohammad Ghavamzadeh, Mark Lee, Richard S. Sutton
Abstract: We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which the value-function parameters are estimated using temporal difference learning and the policy parameters are updated by stochastic gradient descent. Methods based on policy gradients in this way are of special interest because of their compatibility with function approximation methods, which are needed to handle large or infinite state spaces. The use of temporal difference learning in this way is of interest because in many applications it dramatically reduces the variance of the gradient estimates. The use of the natural gradient is of interest because it can produce better conditioned parameterizations and has been shown to further reduce variance in some cases. Our results extend prior two-timescale convergence results for actor-critic methods by Konda and Tsitsiklis by using temporal difference learning in the actor and by incorporating natural gradients, and they extend prior empirical studies of natural actor-critic methods by Peters, Vijayakumar and Schaal by providing the first convergence proofs and the first fully incremental algorithms.
5 0.42094404 123 nips-2007-Loop Series and Bethe Variational Bounds in Attractive Graphical Models
Author: Alan S. Willsky, Erik B. Sudderth, Martin J. Wainwright
Abstract: Variational methods are frequently used to approximate or bound the partition or likelihood function of a Markov random field. Methods based on mean field theory are guaranteed to provide lower bounds, whereas certain types of convex relaxations provide upper bounds. In general, loopy belief propagation (BP) provides often accurate approximations, but not bounds. We prove that for a class of attractive binary models, the so-called Bethe approximation associated with any fixed point of loopy BP always lower bounds the true likelihood. Empirically, this bound is much tighter than the naive mean field bound, and requires no further work than running BP. We establish these lower bounds using a loop series expansion due to Chertkov and Chernyak, which we show can be derived as a consequence of the tree reparameterization characterization of BP fixed points.
6 0.41256329 86 nips-2007-Exponential Family Predictive Representations of State
7 0.40349287 58 nips-2007-Consistent Minimization of Clustering Objective Functions
8 0.37832925 173 nips-2007-Second Order Bilinear Discriminant Analysis for single trial EEG analysis
9 0.35620055 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events
10 0.33500552 197 nips-2007-The Infinite Markov Model
11 0.3297745 106 nips-2007-Invariant Common Spatial Patterns: Alleviating Nonstationarities in Brain-Computer Interfacing
12 0.32308051 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
13 0.29932475 200 nips-2007-The Tradeoffs of Large Scale Learning
14 0.28620479 74 nips-2007-EEG-Based Brain-Computer Interaction: Improved Accuracy by Automatic Single-Trial Error Detection
15 0.28467819 133 nips-2007-Modelling motion primitives and their timing in biologically executed movements
16 0.26731268 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
17 0.25120932 146 nips-2007-On higher-order perceptron algorithms
18 0.25046617 44 nips-2007-Catching Up Faster in Bayesian Model Selection and Model Averaging
19 0.24087413 103 nips-2007-Inferring Elapsed Time from Stochastic Neural Processes
20 0.22755539 213 nips-2007-Variational Inference for Diffusion Processes
topicId topicWeight
[(5, 0.047), (13, 0.059), (16, 0.024), (18, 0.014), (19, 0.057), (21, 0.056), (31, 0.039), (33, 0.349), (34, 0.023), (35, 0.016), (47, 0.052), (49, 0.022), (83, 0.086), (87, 0.014), (90, 0.051)]
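The topic line above is a sparse list of (topicId, topicWeight) pairs. As a minimal sketch of how such a line can be read programmatically, the snippet below parses it and reports the highest-weighted topic. The names `TOPIC_LINE`, `parse_topic_weights` and `dominant_topic` are hypothetical helpers introduced here for illustration; nothing in the listing states how the weights were produced or how they relate to the similarity values.

```python
import ast

# Topic-weight line copied verbatim from the listing above.
TOPIC_LINE = (
    "[(5, 0.047), (13, 0.059), (16, 0.024), (18, 0.014), (19, 0.057), "
    "(21, 0.056), (31, 0.039), (33, 0.349), (34, 0.023), (35, 0.016), "
    "(47, 0.052), (49, 0.022), (83, 0.086), (87, 0.014), (90, 0.051)]"
)


def parse_topic_weights(line):
    """Parse a '[(id, weight), ...]' string into a dict {topic_id: weight}."""
    pairs = ast.literal_eval(line)  # safe evaluation of a Python literal
    return {int(topic_id): float(weight) for topic_id, weight in pairs}


def dominant_topic(weights):
    """Return the (topic_id, weight) pair with the largest weight."""
    return max(weights.items(), key=lambda item: item[1])


weights = parse_topic_weights(TOPIC_LINE)
top_id, top_weight = dominant_topic(weights)
print(f"dominant topic: {top_id} (weight {top_weight})")
```

For this paper the largest weight belongs to topic 33; the remaining listed topics each carry a weight below 0.1.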
simIndex simValue paperId paperTitle
same-paper 1 0.76834738 127 nips-2007-Measuring Neural Synchrony by Message Passing
Author: Justin Dauwels, François Vialatte, Tomasz Rutkowski, Andrzej S. Cichocki
Abstract: A novel approach to measure the interdependence of two time series is proposed, referred to as “stochastic event synchrony” (SES); it quantifies the alignment of two point processes by means of the following parameters: time delay, variance of the timing jitter, fraction of “spurious” events, and average similarity of events. SES may be applied to generic one-dimensional and multi-dimensional point processes; however, the paper mainly focusses on point processes in the time-frequency domain. The average event similarity is in that case described by two parameters: the average frequency offset between events in the time-frequency plane, and the variance of the frequency offset (“frequency jitter”); SES then consists of five parameters in total. Those parameters quantify the synchrony of oscillatory events, and hence, they provide an alternative to existing synchrony measures that quantify amplitude or phase synchrony. The pairwise alignment of point processes is cast as a statistical inference problem, which is solved by applying the max-product algorithm on a graphical model. The SES parameters are determined from the resulting pairwise alignment by maximum a posteriori (MAP) estimation. The proposed interdependence measure is applied to the problem of detecting anomalies in EEG synchrony of Mild Cognitive Impairment (MCI) patients; the results indicate that SES significantly improves the sensitivity of EEG in detecting MCI.
2 0.62288684 194 nips-2007-The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information
Author: John Langford, Tong Zhang
Abstract: We present Epoch-Greedy, an algorithm for contextual multi-armed bandits (also known as bandits with side information). Epoch-Greedy has the following properties: 1. No knowledge of a time horizon T is necessary. 2. The regret incurred by Epoch-Greedy is controlled by a sample complexity bound for a hypothesis class. 3. The regret scales as O(T^{2/3} S^{1/3}) or better (sometimes, much better). Here S is the complexity term in a sample complexity bound for standard supervised learning.
Author: Ping Li, Trevor J. Hastie
Abstract: Many tasks (e.g., clustering) in machine learning only require the l_α distances instead of the original data. For dimension reductions in the l_α norm (0 < α ≤ 2), the method of stable random projections can efficiently compute the l_α distances in massive datasets (e.g., the Web or massive data streams) in one pass of the data. The estimation task for stable random projections has been an interesting topic. We propose a simple estimator based on the fractional power of the samples (projected data), which is surprisingly near-optimal in terms of the asymptotic variance. In fact, it achieves the Cramér-Rao bound when α = 2 and α = 0+. This new result will be useful when applying stable random projections to distance-based clustering, classifications, kernels, massive data streams etc.
Author: Lars Buesing, Wolfgang Maass
Abstract: We show that under suitable assumptions (primarily linearization) a simple and perspicuous online learning rule for Information Bottleneck optimization with spiking neurons can be derived. This rule performs on common benchmark tasks as well as a rather complex rule that has previously been proposed [1]. Furthermore, the transparency of this new learning rule makes a theoretical analysis of its convergence properties feasible. A variation of this learning rule (with sign changes) provides a theoretically founded method for performing Principal Component Analysis (PCA) with spiking neurons. By applying this rule to an ensemble of neurons, different principal components of the input can be extracted. In addition, it is possible to preferentially extract those principal components from incoming signals X that are related or are not related to some additional target signal Y_T. In a biological interpretation, this target signal Y_T (also called relevance variable) could represent proprioceptive feedback, input from other sensory modalities, or top-down signals.
5 0.3773672 106 nips-2007-Invariant Common Spatial Patterns: Alleviating Nonstationarities in Brain-Computer Interfacing
Author: Benjamin Blankertz, Motoaki Kawanabe, Ryota Tomioka, Friederike Hohlefeld, Klaus-Robert Müller, Vadim V. Nikulin
Abstract: Brain-Computer Interfaces can suffer from a large variance of the subject conditions within and across sessions. For example, vigilance fluctuations in the individual, variable task involvement, workload etc. alter the characteristics of EEG signals and thus challenge a stable BCI operation. In the present work we aim to define features based on a variant of the common spatial patterns (CSP) algorithm that are constructed invariant with respect to such nonstationarities. We enforce invariance properties by adding terms to the denominator of a Rayleigh coefficient representation of CSP such as disturbance covariance matrices from fluctuations in visual processing. In this manner physiological prior knowledge can be used to shape the classification engine for BCI. As a proof of concept we present a BCI classifier that is robust to changes in the level of parietal α-activity. In other words, the EEG decoding still works when there are lapses in vigilance.
6 0.3737846 24 nips-2007-An Analysis of Inference with the Universum
7 0.37265271 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
8 0.3702189 138 nips-2007-Near-Maximum Entropy Models for Binary Neural Representations of Natural Images
9 0.36953679 7 nips-2007-A Kernel Statistical Test of Independence
10 0.3691566 173 nips-2007-Second Order Bilinear Discriminant Analysis for single trial EEG analysis
11 0.36836004 88 nips-2007-Fast and Scalable Training of Semi-Supervised CRFs with Application to Activity Recognition
12 0.36816645 63 nips-2007-Convex Relaxations of Latent Variable Training
13 0.36809087 74 nips-2007-EEG-Based Brain-Computer Interaction: Improved Accuracy by Automatic Single-Trial Error Detection
14 0.36787942 18 nips-2007-A probabilistic model for generating realistic lip movements from speech
15 0.36713576 84 nips-2007-Expectation Maximization and Posterior Constraints
16 0.36678752 86 nips-2007-Exponential Family Predictive Representations of State
17 0.36678267 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data
18 0.36487371 45 nips-2007-Classification via Minimum Incremental Coding Length (MICL)
19 0.36484578 79 nips-2007-Efficient multiple hyperparameter learning for log-linear models
20 0.36458057 76 nips-2007-Efficient Convex Relaxation for Transductive Support Vector Machine