nips nips2001 nips2001-71 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimates are strongly correlated with the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to mark the components that have an actual physical meaning. Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. [sent-10, score-0.194]
2 In other words: can we give error bars or can we assess the quality of our separation? [sent-11, score-0.104]
3 We use resampling methods to tackle these questions and show experimentally that our proposed variance estimates are strongly correlated with the separation error. [sent-12, score-0.674]
4 We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to mark the components that have an actual physical meaning. [sent-13, score-0.506]
5 Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach. [sent-14, score-0.071]
6 1 Introduction Blind source separation (BSS) techniques have found widespread use in various application domains, e.g. [sent-15, score-0.386]
7 BSS is a statistical technique to reveal unknown source signals when only mixtures of them can be observed. [sent-21, score-0.162]
8 In the following we will only consider linear mixtures; the goal is then to estimate those projection directions that recover the source signals. [sent-22, score-0.162]
9 Many different BSS algorithms have been proposed, but to our knowledge, so far, no principled attempts have been made to assess the reliability of BSS algorithms, such that error bars are given along with the resulting projection estimates. [sent-23, score-0.357]
10 This lack of error bars or means for selecting between competing models is of course a basic dilemma for most unsupervised learning algorithms. [sent-24, score-0.133]
11 The sources of potential unreliability of unsupervised algorithms are ubiquitous, i.e. [sent-25, score-0.142]
12 Unsupervised projection techniques like PCA or BSS will always give an answer that is found within their model class, e.g. [sent-30, score-0.085]
13 resampling methods (see [12] or [7] for references), where algorithms for assessing the stability of the solution have been analyzed, e.g. [sent-35, score-0.313]
14 This will enable us to select a good BSS model, in order to improve the separation performance and to find potentially meaningful projection directions. [sent-39, score-0.422]
15 In the following we will give an algorithmic description of the resampling methods, accompanied by some theoretical remarks (section 2) and show excellent experimental results (sections 3 and 4). [sent-40, score-0.283]
16 1 Resampling Techniques for BSS: The ICA Model. In blind source separation we assume that at time instant t each component x_i(t) of the observed n-dimensional data vector x(t) is a linear superposition of m ≤ n statistically independent signals: x_i(t) = Σ_{j=1}^{m} A_ij s_j(t) (e.g. [sent-43, score-0.593]
17 The source signals Sj(t) are unknown, as are the coefficients Aij of the mixing matrix A. [sent-46, score-0.155]
18 Let {e_i} be the canonical basis of the true sources s = Σ_i e_i s_i. [sent-54, score-0.127]
19 Using this, we can define a component-wise separation error E_i as the angle difference between the true direction of the source and the direction of the respective ICA channel: E_i = arccos(|e_i · ê_i| / ‖ê_i‖), where ê_i denotes the direction of the i-th ICA channel in source space. [sent-56, score-0.463]
20 To calculate this angle difference, remember that component-wise we have y_j = Σ_{k,i} W_jk A_ki s_i. [sent-57, score-0.068]
21 In the following, we will illustrate our approach for two different source separation algorithms (JADE, TDSEP). [sent-61, score-0.362]
22 JADE [4], using higher-order statistics, is based on the joint diagonalization of matrices obtained from 'parallel slices' of the fourth-order cumulant tensor. [sent-62, score-0.111]
23 TDSEP [14] relies on second order statistics only, enforcing temporal decorrelation between channels. [sent-63, score-0.171]
24 2 About Resampling The objective of resampling techniques is to produce surrogate data sets that eventually allow us to approximate the 'separation error' by repeated estimation of the parameters of interest. [sent-65, score-0.465]
25 The underlying mixing should of course be independent of the generation process of the surrogate data and therefore remain invariant under resampling. [sent-66, score-0.214]
26 Bootstrap Resampling The most popular resampling methods are the Jackknife and the Bootstrap (see e.g. [sent-67, score-0.298]
27 [12, 7]) The Jackknife produces surrogate data sets by just deleting one datum each time from the original data. [sent-69, score-0.225]
28 For obtaining one bootstrap sample, we randomly draw N elements from the original data, i.e. [sent-73, score-0.236]
29 some data points might occur several times, others don't occur at all in the bootstrap sample. [sent-75, score-0.26]
30 Then, the separating matrix is computed on the full block and repeatedly on each of the N-element bootstrap samples. [sent-77, score-0.305]
31 The variance is computed as the average squared difference between the estimate on the full block and the respective bootstrap unmixings. [sent-78, score-0.365]
32 (These resampling methods have some desirable properties, which make them very attractive; for example, it can be shown that for iid data the bootstrap estimators of the distributions of many commonly used statistics are consistent. [sent-79, score-0.643]
33 For example, the time-lagged correlation matrices needed for TDSEP can be obtained from {a_t} by C_ij(τ) = (1/N) Σ_{t=1}^{N} a_t x_i(t) x_j(t+τ), with Σ_t a_t = N and a_t ∈ {0, 1, 2, ...}. [sent-81, score-0.1]
34 Other resampling methods Besides the Bootstrap, there are other resampling methods like the Jackknife or cross-validation, which can be understood as special cases of the Bootstrap. [sent-85, score-0.581]
35 3 The Resampling Algorithm After performing BSS, the estimated ICA-projections are used to generate surrogate data by resampling. [sent-88, score-0.158]
36 On the whitened surrogate data, the source separation algorithm is used again to estimate a rotation that separates this surrogate data. [sent-89, score-0.774]
37 Using this parameterization we can easily compare different N-dimensional rotations by comparing the rotation parameters aij. [sent-91, score-0.146]
38 Since the sources are already separated, the estimated rotation matrices will be in the vicinity of the identity matrix. [sent-92, score-0.262]
39 It is important to perform the resampling when the sources are already separated, so that the aij are distributed around zero, because SO(N) is a non-Abelian group; that means that in general R(α)R(β) ≠ R(β)R(α). [sent-95, score-0.564]
40 Var(aij) measures the instability of the separation with respect to a rotation in the (i, j)-plane. [sent-96, score-0.405]
41 Since the reliability of a projection is bounded by the maximum angle variance of all rotations that affect this direction, we define the uncertainty of the i-th ICA projection as U_i := max_j Var(aij). [sent-97, score-0.542]
42 Produce k surrogate data sets from y and whiten these data sets. [sent-101, score-0.182]
43 For each surrogate data set, do BSS, producing a set of rotation matrices. [sent-102, score-0.318]
44 For each ICA component, calculate the uncertainty U_i = max_j Var(aij). [sent-104, score-0.174]
45 4 Asymptotic Considerations for Resampling Properties of resampling methods are typically studied in the limit where the number of bootstrap samples B → ∞ and the length of the signal T → ∞ [12]. [sent-106, score-0.577]
46 In our case, as B → ∞, the bootstrap variance estimator Û_i(B) computed from the â_ij's converges to Û_i(∞) := max_j Var_F̂[â_ij], where â_ij denotes the resampled deviation and F̂ denotes the distribution generating it. [sent-107, score-0.752]
47 Furthermore, if F̂ → F, Û_i(∞) converges to the true variance U_i = max_j Var_F[a_ij] as T → ∞. [sent-108, score-0.154]
48 When the data has time structure, F̂ does not necessarily converge to the generating distribution F of the original signal anymore. [sent-113, score-0.095]
49 in TDSEP, where the aij depend on the variation of the time-lagged covariances C_ij(τ) of the signals, we can show that their estimators Ĉ_ij(τ) are unbiased; furthermore, we can bound the difference [sent-116, score-0.211]
50 Δ_ijkl(τ,ν) = Cov_F̂[Ĉ_ij(τ), Ĉ_kl(ν)] − Cov_F[C_ij(τ), C_kl(ν)] between the covariance of the real matrices and their bootstrap estimators, provided ∃ a < 1, M ≥ 1, ∀i: |C_ii(τ)| ≤ M a^|τ| |C_ii(0)|. [sent-117, score-0.076]
51 1 Experiments Comparing the separation error with the uncertainty estimate To show the practical applicability of the resampling idea to ICA, the separation error Ei was compared with the uncertainty Ui . [sent-120, score-1.165]
52 The separation was performed on different artificial 2D mixtures of speech and music signals and different iid data sets of the same variance. [sent-121, score-0.457]
53 To achieve different separation qualities, white Gaussian noise of different intensity was added to the mixtures. [sent-122, score-0.316]
54 [Figure 1 plot residue; only the axis label 'separation error E_i' is recoverable.] [sent-137, score-0.331]
55 Figure 1: (a) For a small uncertainty, the probability distribution of the separation error is concentrated close to zero; for higher uncertainty it spreads over a larger range. [sent-143, score-0.548]
56 Figure 1 relates the uncertainty to the separation error for JADE (TDSEP results look qualitatively the same). [sent-145, score-0.427]
57 In Fig. 1 (left) we see the separation error distribution, which has a strong peak for small values of our uncertainty measure, whereas for large uncertainties it tends to become flat, i.e. [sent-147, score-0.427]
58 In Fig. 1 (right) the uncertainty reflects the true separation error very well. [sent-150, score-0.41]
59 2 Selecting the appropriate BSS algorithm As our variance estimate correlates strongly with the (true) separation error, the next logical step is to use it as a model selection criterion for: (a) selecting some hyperparameter of the BSS algorithm, e.g. [sent-152, score-0.404]
60 To illustrate the usefulness of our reliability measure, we study a five-channel mixture of two channels of pure white Gaussian noise, two audio signals and one channel of uniformly distributed noise. [sent-165, score-0.528]
61 The reliability analysis for this mixture using higher order statistics (JADE) and temporal decorrelation (TDSEP) is shown in Figure 2. [sent-166, score-0.388]
62 Figure 2: Uncertainty of ICA projections of an artificial mixture using JADE and TDSEP (x-axis: ICA channel i). [sent-182, score-0.099]
63 Resampling displays the strengths and weaknesses of the different models: JADE gives the advice to rely only on channels 3, 4, 5 (cf. [sent-183, score-0.094]
64 In fact, these are the channels that contain the audio signals and the uniformly distributed noise. [sent-186, score-0.225]
65 , 20) shows that TDSEP can give reliable estimates only for the two audio sources (which is to be expected; cf. [sent-190, score-0.27]
66 According to our measure, the estimation for the audio sources is more reliable in the TDSEP-case. [sent-193, score-0.241]
67 Calculation of the separation error verifies this: TDSEP separates better by about 3 orders of magnitude (JADE: E3 = 1. [sent-194, score-0.331]
68 Finally, in our example, estimating the audio sources with TDSEP and then applying JADE to the orthogonal subspace gives the optimal solution, since it combines the small separation errors E3, E4 for TDSEP with the ability of JADE to separate the uniformly distributed noise. [sent-202, score-0.497]
69 3 Blockwise uncertainty estimates For a longer time series it is not only important to know which ICA channels are reliable, but also to know whether different parts of a given time series are more (or less) reliable to separate than others. [sent-204, score-0.377]
70 To demonstrate these effects, we mixed two audio sources (8 kHz, 10 s = 80000 data points), where the mixtures are partly corrupted by white Gaussian noise. [sent-205, score-0.325]
71 Reliability analysis is performed on windows of length 1000, shifted in steps of 250; the resulting variance estimates are smoothed. [sent-206, score-0.09]
72 Fig. 3 shows again that the uncertainty measure is nicely correlated with the true separation error; furthermore, the variance goes up systematically within the noisy part, but also in other parts of the time series that do not seem to match the assumptions underlying the algorithm. [sent-208, score-0.575]
73 Figure 3: Upper panel: mixtures, partly corrupted by noise. [sent-209, score-0.272]
74 Lower panel: the blockwise variance estimate (solid line) vs. the true separation error on this block (dotted line). [sent-210, score-0.53]
75 So our reliability estimates can eventually be used to improve separation performance by removing all but the 'reliable' parts of the time series. [sent-211, score-0.317]
76 For our example this reduces the overall separation error by 2 orders of magnitude from 2. [sent-212, score-0.331]
77 This moving-window resampling can detect instabilities of the projections in two different ways: besides the resampling variance that can be calculated for each window, one can also calculate the change of the projection directions between two windows. [sent-217, score-0.85]
78 4 Assigning Meaning: Application to Biomedical Data We now apply our reliability analysis to biomedical data that has been produced by an MEG experiment with acoustic stimulation. [sent-221, score-0.299]
79 (Footnote: For example, the peak in the last third of the time series can be traced back to the fact that the original time series are correlated in this region.) [sent-226, score-0.167]
80 While previously analysing the data [13], we found that many of the ICA components are seemingly meaningless, and it took some medical knowledge to find potentially meaningful projections for a later close inspection. [sent-227, score-0.196]
81 However, our reliability assessment can also be seen as an indication of meaningful projections, i.e. [sent-228, score-0.264]
82 In the experiment, BSS was performed on the 23 most powerful principal components using (a) higher order statistics (JADE) and (b) temporal decorrelation (TDSEP, time lag 0 . [sent-231, score-0.296]
83 [Figure 4 residue: panel titles 'higher order statistics (JADE)' and 'temporal decorrelation (TDSEP)'; x-axis 'ICA-Channel i'.] [sent-235, score-0.196]
84 Figure 4: Resampling on the biomedical data from the MEG experiment shows: (a) no JADE projection is reliable (has low uncertainty); (b) TDSEP is able to identify three sources with low uncertainty. [sent-249, score-0.327]
85 The results in Fig. 4 show that none of the JADE projections (left) have small variance, whereas TDSEP (right) identifies three sources with good reliability. [sent-250, score-0.163]
86 Figure 5: Spatial field pattern, frequency content and time course of TDSEP channel 6 (time axis t in minutes). [sent-256, score-0.11]
87 The components found by JADE do not show such a clear structure, and the strongest correlation of any component to the stimulus is about 0.3. [sent-259, score-0.162]
88 This is of the same order of magnitude as the strongest correlated PCA component before applying JADE. [sent-260, score-0.07]
89 5 Discussion We proposed a simple method to estimate the reliability of ICA projections based on resampling techniques. [sent-261, score-0.617]
90 After showing that our technique approximates the separation error, several directions are open(ed) for applications. [sent-262, score-0.318]
91 Second, variances can be estimated on blocks of data and separation performance can be enhanced by using only low variance blocks where the model matches the data nicely. [sent-264, score-0.456]
92 Finally reliability estimates can be used to find meaningful components. [sent-265, score-0.293]
93 Here our assumption is that the more meaningful a component is, the more stably we should be able to estimate it. [sent-266, score-0.144]
94 Future research will focus on applying resampling techniques to other unsupervised learning scenarios. [sent-268, score-0.362]
95 We will also consider Bayesian modeling, where a variance estimate often comes for free along with the trained model. [sent-269, score-0.089]
96 An information maximisation approach to blind separation and blind deconvolution. [sent-296, score-0.559]
97 Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. [sent-334, score-0.289]
98 Assessing reliability of ica projections - a resampling approach. [sent-351, score-0.746]
99 Independent component analysis of non-invasively recorded cortical magnetic dc-fields in humans. [sent-370, score-0.08]
100 TDSEP - an efficient algorithm for blind separation using time structure. [sent-375, score-0.452]
wordName wordTfidf (topN-words)
[('tdsep', 0.413), ('jade', 0.317), ('separation', 0.289), ('resampling', 0.283), ('bss', 0.256), ('bootstrap', 0.236), ('reliability', 0.192), ('aij', 0.179), ('ica', 0.172), ('blind', 0.135), ('surrogate', 0.134), ('amp', 0.118), ('rotation', 0.116), ('sources', 0.102), ('projections', 0.099), ('decorrelation', 0.098), ('uncertainty', 0.096), ('jackknife', 0.091), ('res', 0.09), ('ling', 0.09), ('lea', 0.09), ('biomedical', 0.083), ('audio', 0.082), ('source', 0.073), ('meaningful', 0.072), ('ziehe', 0.072), ('channels', 0.069), ('maxj', 0.068), ('projection', 0.061), ('variance', 0.061), ('channel', 0.058), ('reliable', 0.057), ('meg', 0.054), ('ui', 0.051), ('signals', 0.05), ('cij', 0.05), ('lag', 0.047), ('blockwise', 0.045), ('ctj', 0.045), ('meinecke', 0.045), ('underlines', 0.045), ('matrices', 0.044), ('component', 0.044), ('ei', 0.043), ('signal', 0.043), ('statistics', 0.042), ('error', 0.042), ('correlated', 0.041), ('block', 0.04), ('unsupervised', 0.04), ('makeig', 0.039), ('datum', 0.039), ('mixtures', 0.039), ('stimulus', 0.036), ('fj', 0.036), ('ut', 0.036), ('magnetic', 0.036), ('assess', 0.035), ('series', 0.035), ('calculate', 0.034), ('angle', 0.034), ('kawanabe', 0.033), ('potsdam', 0.033), ('mixing', 0.032), ('estimators', 0.032), ('rg', 0.031), ('temporal', 0.031), ('besides', 0.03), ('assessing', 0.03), ('rotations', 0.03), ('var', 0.03), ('directions', 0.029), ('separating', 0.029), ('music', 0.029), ('blocks', 0.029), ('wx', 0.029), ('strongest', 0.029), ('estimates', 0.029), ('estimate', 0.028), ('correlation', 0.028), ('time', 0.028), ('bars', 0.027), ('white', 0.027), ('corrupted', 0.026), ('hyperparameter', 0.026), ('iid', 0.026), ('angles', 0.026), ('usefulness', 0.026), ('higher', 0.025), ('components', 0.025), ('partly', 0.025), ('displays', 0.025), ('true', 0.025), ('techniques', 0.024), ('group', 0.024), ('purposes', 0.024), ('data', 0.024), ('course', 0.024), ('uniformly', 0.024), ('springer', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimates are strongly correlated with the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to mark the components that have an actual physical meaning. Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach.
2 0.40580028 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
3 0.25390726 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
4 0.16028631 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
5 0.09316913 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
Author: Magnus Rattray, Gleb Basalyga
Abstract: We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least examples are required for -dimensional data and examples are required to extract a symmetrical signal with non-zero kurtosis.
6 0.070151113 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
7 0.065459326 74 nips-2001-Face Recognition Using Kernel Methods
8 0.061920423 164 nips-2001-Sampling Techniques for Kernel Methods
9 0.056567632 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
10 0.05168331 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
11 0.05056154 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
12 0.049760971 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
13 0.047726147 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
14 0.047109306 35 nips-2001-Analysis of Sparse Bayesian Learning
15 0.046512473 195 nips-2001-Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
16 0.045951504 155 nips-2001-Quantizing Density Estimators
17 0.045606393 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
18 0.044143829 21 nips-2001-A Variational Approach to Learning Curves
19 0.042702951 43 nips-2001-Bayesian time series classification
20 0.042688511 8 nips-2001-A General Greedy Approximation Algorithm with Applications
topicId topicWeight
[(0, -0.167), (1, -0.014), (2, -0.044), (3, -0.165), (4, 0.033), (5, 0.041), (6, 0.016), (7, 0.091), (8, 0.277), (9, -0.418), (10, -0.235), (11, -0.104), (12, -0.053), (13, 0.005), (14, 0.162), (15, 0.033), (16, 0.043), (17, 0.036), (18, -0.055), (19, -0.065), (20, -0.023), (21, 0.027), (22, -0.138), (23, -0.018), (24, -0.044), (25, 0.004), (26, -0.043), (27, 0.001), (28, 0.058), (29, -0.036), (30, 0.022), (31, -0.022), (32, -0.041), (33, 0.036), (34, -0.023), (35, -0.041), (36, -0.032), (37, -0.069), (38, -0.042), (39, -0.075), (40, -0.016), (41, -0.03), (42, -0.014), (43, 0.007), (44, -0.085), (45, -0.042), (46, 0.015), (47, -0.004), (48, -0.006), (49, 0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.95459718 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimates are strongly correlated with the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to mark the components that have an actual physical meaning. Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach.
2 0.84315151 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
3 0.81109577 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
4 0.81100827 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
5 0.48664644 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
Author: Magnus Rattray, Gleb Basalyga
Abstract: We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least examples are required for -dimensional data and examples are required to extract a symmetrical signal with non-zero kurtosis.
6 0.24095733 177 nips-2001-Switch Packet Arbitration via Queue-Learning
7 0.22757496 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
8 0.22681344 19 nips-2001-A Rotation and Translation Invariant Discrete Saliency Network
9 0.22568247 57 nips-2001-Correlation Codes in Neuronal Populations
10 0.21070421 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
11 0.20312056 164 nips-2001-Sampling Techniques for Kernel Methods
12 0.18985407 26 nips-2001-Active Portfolio-Management based on Error Correction Neural Networks
13 0.18949822 155 nips-2001-Quantizing Density Estimators
14 0.1889897 74 nips-2001-Face Recognition Using Kernel Methods
15 0.1877899 73 nips-2001-Eye movements and the maturation of cortical orientation selectivity
16 0.18761621 14 nips-2001-A Neural Oscillator Model of Auditory Selective Attention
17 0.17792097 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
18 0.17411445 122 nips-2001-Model Based Population Tracking and Automatic Detection of Distribution Changes
19 0.16922754 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
20 0.16832739 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
topicId topicWeight
[(3, 0.266), (14, 0.034), (17, 0.027), (19, 0.034), (27, 0.112), (30, 0.061), (36, 0.013), (38, 0.021), (55, 0.047), (59, 0.065), (72, 0.061), (79, 0.043), (83, 0.028), (91, 0.099), (97, 0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.81823891 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimates are strongly correlated with the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA model, to significantly enhance the separation performance, and, most importantly, to mark the components that have an actual physical meaning. Application to 49-channel data from a magnetoencephalography (MEG) experiment underlines the usefulness of our approach.
2 0.75754827 138 nips-2001-On the Generalization Ability of On-Line Learning Algorithms
Author: Nicolò Cesa-bianchi, Alex Conconi, Claudio Gentile
Abstract: In this paper we show that on-line algorithms for classification and regression can be naturally used to obtain hypotheses with good datadependent tail bounds on their risk. Our results are proven without requiring complicated concentration-of-measure arguments and they hold for arbitrary on-line learning algorithms. Furthermore, when applied to concrete on-line algorithms, our results yield tail bounds that in many cases are comparable or better than the best known bounds.
3 0.58835089 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
4 0.57226485 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
Author: John R. Hershey, Michael Casey
Abstract: It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audio-visual information for the task of speech enhancement. We propose a method to exploit audio-visual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audio-visual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information. 1
5 0.56532192 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
6 0.56492043 13 nips-2001-A Natural Policy Gradient
7 0.56386578 95 nips-2001-Infinite Mixtures of Gaussian Process Experts
8 0.56015271 29 nips-2001-Adaptive Sparseness Using Jeffreys Prior
9 0.55980951 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
10 0.55835968 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
12 0.55738831 164 nips-2001-Sampling Techniques for Kernel Methods
13 0.55728257 8 nips-2001-A General Greedy Approximation Algorithm with Applications
14 0.55715775 132 nips-2001-Novel iteration schemes for the Cluster Variation Method
15 0.55637509 57 nips-2001-Correlation Codes in Neuronal Populations
16 0.55552351 190 nips-2001-Thin Junction Trees
17 0.55523324 155 nips-2001-Quantizing Density Estimators
18 0.5551917 27 nips-2001-Activity Driven Adaptive Stochastic Resonance
19 0.55450678 121 nips-2001-Model-Free Least-Squares Policy Iteration
20 0.55354613 74 nips-2001-Face Recognition Using Kernel Methods