nips nips2001 nips2001-127 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We present a new method for the blind separation of sources, which do not fulfill the independence assumption. [sent-3, score-0.451]
2 First we extract independent features from the observed patches. [sent-5, score-0.198]
3 It turns out that the average dependencies between these features in different sources are in general lower than the dependencies between the amplitudes of different sources. [sent-6, score-0.781]
4 We show that it might be the case that most of the dependencies are carried by only a small number of features. [sent-7, score-0.233]
5 In this case - provided these features can be identified by some heuristic - we project all patches into the subspace which is orthogonal to the subspace spanned by the "correlated" features. [sent-8, score-0.499]
6 Standard ICA is then performed on the elements of the transformed patches (for which the independence assumption holds) and robustly yields a good estimate of the mixing matrix. [sent-9, score-0.7]
7 1 Introduction ICA as a method for blind source separation has proven very useful in a wide range of statistical data analysis. [sent-10, score-0.665]
8 A strong criterion that allows one to detect and separate linearly mixed source signals from the observed mixtures is the independence of the source signals' amplitude distributions. [sent-11, score-0.893]
9 Others consider higher order moments of the source estimates [4, 5]. [sent-15, score-0.317]
10 In such situations it can be very useful to consider temporal/spatial statistical properties of the source signals as well. [sent-17, score-0.367]
11 In [7] the author suggests modeling the sources as a stochastic process and performing ICA on the innovations rather than on the signals themselves. [sent-19, score-0.362]
12 In this work we extend the ICA to multidimensional channels of neighboring realizations. [sent-20, score-0.258]
13 In section 3 it will be shown that there are optimal features that carry lower dependencies between the sources and can be used for source separation. [sent-22, score-0.84]
14 A heuristic is introduced that allows one to discard those features that carry most of the dependencies. [sent-23, score-0.211]
15 Our method requires (i) sources which exhibit correlations between neighboring pixels (e.g. [sent-25, score-0.44]
16 continuous sources like images or sound signals), and (ii) sources from which sparse and almost independent features can be extracted. [sent-27, score-0.845]
17 In section 5 we show separation results and benchmarks for linearly mixed passport photographs. [sent-28, score-0.407]
18 The method is fast and provides good separation results even for sources whose correlation coefficient is as large as 0. [sent-29, score-0.211]
19 2 Sources and observations Let us consider a set of N source signals S_i(r), i = 1, ..., N. [sent-31, score-0.434]
20 The sample index might be a scalar for sources which are time series and a two-dimensional vector for sources which are images. [sent-36, score-0.678]
21 The sources are linearly combined by an unknown mixing matrix A of full rank to produce a set of N observations X_i(r), X_i(r) = Σ_{j=1}^{N} A_ij S_j(r) (1), and we assume that the mixing process is stationary, i.e. independent of r. [sent-37, score-0.697]
22 The goal is to find an appropriate demixing matrix W which - when applied to the observations X(r) - recovers good estimates Ŝ(r), Ŝ(r) = W X(r) ≈ S(r) (2), of the original source signals (up to a permutation and scaling of the sources). [sent-47, score-0.695]
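A minimal sketch of the mixing and demixing model of Eqs. (1) and (2); NumPy is assumed and all variable names are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

N, T = 8, 10000                      # number of sources, number of samples
S = rng.laplace(size=(N, T))         # stand-in sources, one signal per row
A = rng.normal(size=(N, N))          # unknown full-rank mixing matrix
X = A @ S                            # observations, Eq. (1)

# Given an estimated demixing matrix W, the source estimates of Eq. (2) are
W = np.linalg.inv(A)                 # oracle demixing, only for illustration
S_hat = W @ X                        # equals S up to permutation and scaling
```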
23 Since the mixing matrix A is not known, its inverse W has to be detected blindly, i.e. [sent-48, score-0.192]
24 only properties of the sources which are detectable in the mixtures can be exploited. [sent-50, score-0.331]
25 For a large class of ICA algorithms one assumes that the sources are non-Gaussian and independent, i. [sent-51, score-0.283]
26 In situations, however, where the independence assumption does not hold, it can be helpful to take into account spatial dependencies, which can be very prominent for natural signals, and have been subject for a number of blind source separation algorithms [8, 9, 6]. [sent-57, score-0.805]
27 Let us now consider patches s_i(r) of M neighboring samples, collected row-wise into an N x M matrix s(r) (Eq. 4). (Footnote 1: In the following we will mostly consider images; hence we will refer to the abovementioned neighborhood relations as spatial relations.) [sent-58, score-0.405]
28 s_i(r) could be a sequence of M adjacent samples of an audio signal or a rectangular patch of M pixels in an image. [sent-61, score-0.26]
29 Because of the stationarity of the mixing process we obtain x = A s (Eq. 5) and ŝ = W x (Eq. 6), where x is an N x M matrix of neighboring observations and where the matrices A and W operate on every column vector of s and x. [sent-65, score-0.437]
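A sketch of how such N x M patch matrices could be sampled from a stack of images; square m x m patches and NumPy are assumptions, and the function name is illustrative:

```python
import numpy as np

def extract_patches(images, m, n_patches, rng):
    """Sample n_patches positions r and return an array of shape
    (n_patches, N, m*m): one N x M patch matrix per position."""
    N, H, W_img = images.shape
    ys = rng.integers(0, H - m, size=n_patches)
    xs = rng.integers(0, W_img - m, size=n_patches)
    out = np.empty((n_patches, N, m * m))
    for p, (y, x) in enumerate(zip(ys, xs)):
        out[p] = images[:, y:y + m, x:x + m].reshape(N, -1)
    return out

# Because the mixing is stationary, every observation patch satisfies
# x(r) = A s(r) (Eq. 5), and s_hat(r) = W x(r) (Eq. 6) applies W to each
# column of x(r), i.e. to each pixel position inside the patch.
```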
30 3 Optimal spatial features Let us now consider a set of sources which are not statistically independent, i.e. [sent-66, score-0.447]
31 p(s_1k, ..., s_Nk) ≠ Π_{i=1}^{N} p(s_ik) (Eq. 7). Our goal is to find in a first step a linear transformation Ω ∈ R^(M x M) which - when applied to every patch - yields transformed sources u = s Ω^T for which the independence assumption p(u_1k, ..., u_Nk) = Π_i p(u_ik) holds. [sent-73, score-0.682]
32 When Ω is applied to the observations x, v = x Ω^T, we obtain a modified source separation problem, v = x Ω^T = A s Ω^T = A u (Eq. 8), where the demixing matrix W can be estimated from the transformed observations v in a second step using standard ICA. [sent-80, score-0.9]
33 (7) is tantamount to positive transinformation of the source amplitudes. [sent-82, score-0.344]
34 As all elements of the patches are equally distributed, this quantity is the same for all k. [sent-84, score-0.385]
35 Only if the sources were spatially white and s consisted of independent column vectors would this hold with equality (Eq. 10). [sent-89, score-0.482]
36 When Ω is applied to the source patches, the trans-information between patches is not changed, provided Ω is a non-singular transformation. [sent-90, score-0.648]
37 Neither information is introduced nor discarded by this transformation, and Eq. (11) holds. For the optimal Ω the column vectors of u = s Ω^T shall now be independent. [sent-91, score-0.295]
38 So it remains to estimate a matrix Ω that provides a matrix u with independent columns. [sent-94, score-0.22]
39 We approach this by estimating Ω so that it provides row vectors of u that have independent elements, i.e. [sent-95, score-0.388]
40 With that, and under the assumption that all sources may come from the same distribution and that there are no "cross dependencies" in u (i.e. [sent-99, score-0.315]
41 u_ik and u_jl are independent for k ≠ l), the independence is guaranteed also for whole column vectors of u. [sent-102, score-0.22]
42 Thus, standard ICA can be applied to patches of sources, which yields Ω as the de-mixing matrix. [sent-103, score-0.671]
43 So the column vectors of u are independent of each other if, and only if, the columns of v are independent. [sent-108, score-0.421]
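A sketch of this first step, using scikit-learn's FastICA as an assumed stand-in for whichever ICA implementation is used; because of the equivalence stated above, Ω can be estimated directly from the observation patches in the blind setting:

```python
import numpy as np
from sklearn.decomposition import FastICA   # assumed stand-in for any ICA

def estimate_omega(x_patches):
    """First step: estimate the M x M matrix Omega such that the rows of
    v = x @ Omega.T (and hence of u = s @ Omega.T) have independent elements.

    x_patches: array of shape (n_patches, N, M) of observation patches;
    every row of every patch is treated as one M-dimensional sample."""
    n_patches, N, M = x_patches.shape
    data = x_patches.reshape(n_patches * N, M)      # samples x features
    ica = FastICA(n_components=M, max_iter=1000)
    ica.fit(data)
    return ica.components_                          # M x M "demixing matrix" Omega
```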
44 According to Eq. (12), the trans-information of the elements of columns of u has decreased on average, but not necessarily uniformly. [sent-111, score-0.211]
45 One can expect some columns to have more independent elements than others. [sent-112, score-0.255]
46 The idea is to identify the corresponding rows of Ω and discard them prior to the second ICA step. [sent-114, score-0.186]
47 Each source patch s_i can be considered as a linear combination of independent components, given by the columns of Ω^-1, where the elements of u_i are the coefficients. [sent-115, score-0.678]
48 Therefore, those components that have a large Euclidean norm occur as features with high entropy in the source patches. [sent-117, score-0.378]
49 At the same time it is clear that, if there are features that are responsible for the source dependencies, these features have to be present with large entropy; otherwise the source dependencies would have been low. [sent-118, score-0.96]
50 Accordingly we propose a heuristic that discards the rows of Ω with the smallest Euclidean norm prior to the second ICA step. [sent-119, score-0.391]
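The heuristic itself is simple; a sketch assuming Ω is available as an M x M NumPy array (function name is illustrative):

```python
import numpy as np

def discard_dependent_features(omega, n_discard):
    """Drop the n_discard rows of Omega with the smallest Euclidean norm,
    i.e. the features most likely to carry the dependencies between sources."""
    row_norms = np.linalg.norm(omega, axis=1)
    keep = np.argsort(row_norms)[n_discard:]        # keep the larger norms
    return omega[keep], keep
```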
51 How many rows have to be discarded, and whether this type of heuristic is applicable at all, depends on the statistical nature of the sources. [sent-120, score-0.27]
52 In section 5 we show that for the test data this heuristic is well applicable and almost all dependencies are contained in one feature. [sent-121, score-0.282]
53 The resulting "demixing matrix" Ω is applied to the patches of observations, generating a matrix v(r) = x(r) Ω^T, the columns of which are candidates for the input to the second ICA. [sent-126, score-0.56]
54 A number M_D of columns that belong to rows of Ω with small norm are discarded, as they very likely represent features that carry dependencies between the sources. [sent-127, score-0.825]
55 For the remaining columns it is not obvious which one represents the most sparse and independent feature. [sent-130, score-0.233]
56 So any of them, with equal probability, now serves as an input sample for the second ICA, which estimates the demixing matrix W. [sent-131, score-0.234]
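A sketch of how this second step could look, again with scikit-learn's FastICA as an assumed stand-in: project every observation patch with the reduced Ω and pool the retained columns as N-dimensional input samples for the second ICA.

```python
import numpy as np
from sklearn.decomposition import FastICA   # assumed stand-in for the second ICA

def two_step_separation(x_patches, omega, n_discard):
    """Second step: project the patches with the reduced Omega and run a
    standard ICA over the retained columns to estimate the N x N matrix W."""
    n_patches, N, M = x_patches.shape
    row_norms = np.linalg.norm(omega, axis=1)
    keep = np.argsort(row_norms)[n_discard:]             # drop smallest-norm rows
    omega_red = omega[keep]                              # (M - n_discard) x M

    # v(r) = x(r) Omega_red^T for every patch; each retained column of v(r)
    # is one N-dimensional input sample for the second ICA.
    v = np.einsum('pnm,km->pnk', x_patches, omega_red)
    samples = v.transpose(0, 2, 1).reshape(-1, N)        # pooled N-dim samples

    ica = FastICA(n_components=N, max_iter=1000)
    ica.fit(samples)
    return ica.components_                               # estimated demixing matrix W
```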
57 When the number N of sources is large, the first ICA may fail to extract the independent source features, because, according to the central limit theorem, the distribution of their coefficients in the mixtures may be close to a Gaussian distribution. [sent-132, score-0.833]
58 The source estimates Wx(r) are then used as input for the first ICA to achieve a better Ω, which in turn allows a better estimate of W. [sent-134, score-0.346]
59 Figure 1: Results of standard and multidimensional ICA performed on a set of 8 correlated passport images. [sent-135, score-0.287]
60 Top row: source images; Second row: linearly mixed sources; Third row: separation results using kurtosis optimization (FastICA Matlab package); Bottom row: separation results using multidimensional ICA (For explanation see text). [sent-136, score-0.949]
61 The correlation coefficients of the source images were between 0. [sent-143, score-0.419]
62 The top row shows the row vectors of Ω sorted by the logarithm of their norm. [sent-153, score-0.694]
63 The second row shows the features (the corresponding columns of Ω^-1) which are extracted by Ω. [sent-154, score-0.477]
64 In the diagram below, the stars indicate the logarithm of the row norm and the squares indicate the mutual information I(u_1k, u_7k) between the k-th features in sources 1 and 7, calculated using a histogram estimator. [sent-159, score-0.751]
65 It is quite prominent that (i) a small norm of a row vector corresponds to a strongly correlated feature, and (ii) there is only one feature which carries most of the dependencies between the sources. [sent-160, score-0.602]
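A minimal plug-in histogram estimate of the mutual information between two sequences of feature coefficients; the bin count and names are choices made here, not taken from the paper:

```python
import numpy as np

def mutual_information_hist(a, b, bins=32):
    """Plug-in estimate of I(a; b) in nats from a two-dimensional histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```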
66 A comparison of Fig. 1, top and bottom rows, shows that all sources were successfully recovered. [sent-164, score-0.372]
67 Figure 2: Result of an ICA (kurtosis optimization) performed on patches of observations (cf. [sent-165, score-0.398]
68 Vectors are sorted by increasing norm of the row vectors; dark and bright pixels indicate positive and negative values. [sent-170, score-0.508]
69 Bottom diagram: Logarithm of the norm of the row vectors (stars) and mutual information I(u_1k, u_7k) (squares) between the coefficients of the corresponding features in the source images 1 and 7. [sent-171, score-1.002]
70 As expected, only the first column yields a high reconstruction error. [sent-177, score-0.304]
71 We see that for all values a good reconstruction is achieved (r_e < 0. [sent-182, score-0.155]
72 Even if no row is discarded the result is only slightly worse than for one or two discarded rows. [sent-184, score-0.408]
73 In this case, the dependencies of the first component are "averaged out" by the vast majority of components that carry no dependencies. [sent-185, score-0.328]
74 To evaluate the influence of the patch size M, the Two-Step algorithm was applied to 9 different mixtures of the sources shown in Fig. [sent-188, score-0.521]
75 1, top row, and using patch sizes between M = 2 x 2 and M = 6 x 6. [sent-189, score-0.183]
76 The mixing matrix A was randomly chosen from a normal distribution with mean zero and variance one. [sent-191, score-0.218]
77 10^5 sample patches were used to extract the optimal features and 2. [sent-193, score-0.484]
78 The algorithm shows a quite robust performance, and even for patch sizes of 2 x 2 pixels a fairly good separation result is achieved. [sent-197, score-0.417]
Figure 3: Every single row of Ω used to generate input for the second ICA; the rows are ordered from large to small row norm. [sent-208, score-0.186]
Only the first (smallest norm) row causes a bad reconstruction error for the second ICA step. [sent-209, score-0.471]
Figure 4: M_D rows with smallest norm discarded. [sent-220, score-0.313]
All values of M_D yield a low reconstruction error in the second step. [sent-221, score-0.182]
Table 1: Separation result of the Two-Step algorithm performed on a set of 8 correlated passport images (cf. [sent-223, score-0.282]
The table shows the average reconstruction error μ_re and its standard deviation σ_re, calculated from 9 different mixtures. [sent-226, score-0.182]
(Note, for comparison, that the reconstruction error of the separation in Fig. [sent-227, score-0.393]
6 Summary and outlook We extended the source separation model to multidimensional channels (image patches). [sent-230, score-0.671]
There are two linear transformations to be considered, one operating inside the channels (Ω) and one operating between the different channels (W). [sent-231, score-0.261]
There are mainly two advantages that can be taken from the first transformation: (i) By arranging independence among the columns of the transformed patches, the average trans-information between different channels is decreased. [sent-233, score-0.432]
(ii) A suitable heuristic can be applied to discard those columns of the transformed patches that carry most of the dependencies between different channels. [sent-234, score-0.633]
A heuristic that identifies the dependence-carrying components by the small norm of the corresponding rows of Ω has been introduced. [sent-235, score-0.306]
A Reconstruction error The reconstruction error r_e is a measure for the success of a source separation. [sent-242, score-0.497]
It compares the estimated de-mixing matrix W with the inverse of the original mixing matrix A with respect to the indeterminacies: scalings and permutations. [sent-243, score-0.265]
This is the case when for every row of P exactly one element is different from zero and the rows of P are orthogonal, i.e. when P = WA is a scaled permutation matrix. [sent-245, score-0.378]
The reconstruction error is the sum of measures for both aspects, r_e = [2 Σ_i log Σ_j P_ij^2 − Σ_i log Σ_j P_ij^4] + [Σ_i log Σ_j P_ij^2 − log det(P P^T)] = 3 Σ_i log Σ_j P_ij^2 − Σ_i log Σ_j P_ij^4 − log det(P P^T). [sent-248, score-0.182]
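A direct NumPy implementation of this measure, under the assumption that the formula is as written above with P = WA (a sketch; the function name is illustrative):

```python
import numpy as np

def reconstruction_error(W, A):
    """r_e = 3*sum_i log sum_j P_ij^2 - sum_i log sum_j P_ij^4 - log det(P P^T),
    with P = W A; this is zero iff P is a scaled permutation matrix."""
    P = W @ A
    P2 = P ** 2
    term2 = np.sum(np.log(P2.sum(axis=1)))         # sum_i log sum_j P_ij^2
    term4 = np.sum(np.log((P2 ** 2).sum(axis=1)))  # sum_i log sum_j P_ij^4
    _, logdet = np.linalg.slogdet(P @ P.T)
    return 3.0 * term2 - term4 - logdet
```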
A. J. Bell and T. J. Sejnowski, "An information-maximization approach to blind separation and blind deconvolution," Neural Computation, vol. [sent-253, score-0.543]
S. Amari, A. Cichocki, and H. H. Yang, "A new learning algorithm for blind signal separation," in Advances in Neural Information Processing Systems, D. [sent-261, score-0.192]
J.-F. Cardoso, "Infomax and maximum likelihood for blind source separation," IEEE Signal Processing Lett. [sent-272, score-0.454]
[4] Jean-François Cardoso, Sandip Bose, and Benjamin Friedlander, "On optimal source separation based on second and fourth order cumulants," in Proc. [sent-274, score-0.499]
M. Zibulevsky and B. A. Pearlmutter, "Blind source separation by sparse decomposition in a signal dictionary," Neural Computation, vol. [sent-286, score-0.557]
H. Attias and C. E. Schreiner, "Blind source separation and deconvolution: The dynamic component analysis algorithm," Neural Comput. [sent-308, score-0.529]
wordName wordTfidf (topN-words)
[('patches', 0.331), ('source', 0.288), ('sources', 0.283), ('ica', 0.276), ('row', 0.26), ('separation', 0.211), ('dependencies', 0.204), ('blind', 0.166), ('reconstruction', 0.155), ('lea', 0.154), ('norm', 0.147), ('patch', 0.135), ('columns', 0.127), ('mixing', 0.119), ('rows', 0.118), ('passport', 0.111), ('demixing', 0.103), ('fastica', 0.097), ('column', 0.092), ('features', 0.09), ('correlated', 0.088), ('multidimensional', 0.088), ('neighboring', 0.086), ('channels', 0.084), ('slk', 0.084), ('images', 0.083), ('signals', 0.079), ('heuristic', 0.078), ('independent', 0.074), ('independence', 0.074), ('discarded', 0.074), ('matrix', 0.073), ('pixels', 0.071), ('discard', 0.068), ('observations', 0.067), ('kurtosis', 0.066), ('uik', 0.066), ('realizations', 0.066), ('carry', 0.065), ('transformed', 0.062), ('deconvolution', 0.056), ('euklidian', 0.056), ('llog', 0.056), ('sot', 0.056), ('terrence', 0.056), ('transinformation', 0.056), ('ulk', 0.056), ('vectors', 0.054), ('elements', 0.054), ('wx', 0.053), ('mixed', 0.049), ('ppt', 0.048), ('factorizing', 0.048), ('abovementioned', 0.048), ('coefficients', 0.048), ('mixtures', 0.048), ('smallest', 0.048), ('top', 0.048), ('si', 0.045), ('stars', 0.044), ('transformation', 0.042), ('logarithm', 0.042), ('bottom', 0.041), ('components', 0.041), ('cardoso', 0.041), ('transformations', 0.041), ('package', 0.039), ('carries', 0.037), ('anthony', 0.037), ('linearly', 0.036), ('extract', 0.034), ('prominent', 0.034), ('hold', 0.033), ('holds', 0.033), ('sejnowski', 0.033), ('bell', 0.033), ('assumption', 0.032), ('mutual', 0.032), ('sparse', 0.032), ('dfg', 0.031), ('ui', 0.031), ('sorted', 0.03), ('decreased', 0.03), ('component', 0.03), ('sample', 0.029), ('applied', 0.029), ('estimates', 0.029), ('first', 0.029), ('carried', 0.029), ('fail', 0.029), ('matlab', 0.029), ('yields', 0.028), ('adjacent', 0.028), ('permutation', 0.027), ('error', 0.027), ('relations', 0.026), ('influence', 0.026), ('chosen', 0.026), ('signal', 0.026), ('operating', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
2 0.31875378 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
3 0.25390726 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly the separation performance, and, most important, to mark the components that have a actual physical meaning. Application to 49-channel-data from an magneto encephalography (MEG) experiment underlines the usefulness of our approach. 1
4 0.2408205 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
5 0.18709452 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
Author: Magnus Rattray, Gleb Basalyga
Abstract: We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least examples are required for -dimensional data and examples are required to extract a symmetrical signal with non-zero kurtosis. § ¡ ©£¢ £ §¥ ¡ ¨¦¤£¢
6 0.11542927 74 nips-2001-Face Recognition Using Kernel Methods
7 0.10183514 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
8 0.081136473 171 nips-2001-Spectral Relaxation for K-means Clustering
9 0.080243208 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
10 0.073555596 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
11 0.065910697 164 nips-2001-Sampling Techniques for Kernel Methods
12 0.065418094 191 nips-2001-Transform-invariant Image Decomposition with Similarity Templates
13 0.062228046 111 nips-2001-Learning Lateral Interactions for Feature Binding and Sensory Segmentation
14 0.061434776 190 nips-2001-Thin Junction Trees
15 0.060243644 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
16 0.060189735 129 nips-2001-Multiplicative Updates for Classification by Mixture Models
17 0.059735578 13 nips-2001-A Natural Policy Gradient
18 0.057826545 89 nips-2001-Grouping with Bias
19 0.054286666 75 nips-2001-Fast, Large-Scale Transformation-Invariant Clustering
20 0.053552851 35 nips-2001-Analysis of Sparse Bayesian Learning
topicId topicWeight
[(0, -0.205), (1, 0.002), (2, -0.049), (3, -0.197), (4, 0.036), (5, 0.01), (6, -0.044), (7, 0.097), (8, 0.274), (9, -0.437), (10, -0.202), (11, -0.137), (12, 0.004), (13, -0.015), (14, 0.122), (15, -0.025), (16, 0.07), (17, -0.012), (18, -0.006), (19, -0.085), (20, 0.003), (21, 0.052), (22, -0.133), (23, -0.014), (24, -0.011), (25, 0.009), (26, 0.004), (27, -0.002), (28, 0.064), (29, -0.016), (30, 0.047), (31, -0.014), (32, 0.012), (33, -0.017), (34, 0.044), (35, -0.009), (36, 0.039), (37, 0.026), (38, -0.009), (39, 0.076), (40, 0.051), (41, 0.073), (42, 0.015), (43, -0.04), (44, -0.003), (45, 0.039), (46, -0.034), (47, 0.036), (48, 0.024), (49, -0.0)]
simIndex simValue paperId paperTitle
same-paper 1 0.97721797 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
2 0.91074067 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
3 0.89041948 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly the separation performance, and, most important, to mark the components that have a actual physical meaning. Application to 49-channel-data from an magneto encephalography (MEG) experiment underlines the usefulness of our approach. 1
4 0.72252768 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
5 0.58288884 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
Author: Magnus Rattray, Gleb Basalyga
Abstract: We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least examples are required for -dimensional data and examples are required to extract a symmetrical signal with non-zero kurtosis. § ¡ ©£¢ £ §¥ ¡ ¨¦¤£¢
6 0.28814167 177 nips-2001-Switch Packet Arbitration via Queue-Learning
7 0.27455768 74 nips-2001-Face Recognition Using Kernel Methods
8 0.27348903 53 nips-2001-Constructing Distributed Representations Using Additive Clustering
9 0.27260152 171 nips-2001-Spectral Relaxation for K-means Clustering
10 0.27037278 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
11 0.26967037 182 nips-2001-The Fidelity of Local Ordinal Encoding
12 0.26772529 191 nips-2001-Transform-invariant Image Decomposition with Similarity Templates
13 0.2629689 35 nips-2001-Analysis of Sparse Bayesian Learning
14 0.25939378 110 nips-2001-Learning Hierarchical Structures with Linear Relational Embedding
15 0.2517314 190 nips-2001-Thin Junction Trees
16 0.25133118 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
17 0.24705462 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
18 0.24403448 111 nips-2001-Learning Lateral Interactions for Feature Binding and Sensory Segmentation
19 0.23148645 109 nips-2001-Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions
20 0.22958066 185 nips-2001-The Method of Quantum Clustering
topicId topicWeight
[(14, 0.035), (17, 0.035), (19, 0.023), (20, 0.035), (27, 0.212), (30, 0.059), (38, 0.016), (59, 0.067), (72, 0.068), (79, 0.044), (83, 0.01), (91, 0.133), (97, 0.186)]
simIndex simValue paperId paperTitle
same-paper 1 0.9105764 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
2 0.88832271 57 nips-2001-Correlation Codes in Neuronal Populations
Author: Maoz Shamir, Haim Sompolinsky
Abstract: Population codes often rely on the tuning of the mean responses to the stimulus parameters. However, this information can be greatly suppressed by long range correlations. Here we study the efficiency of coding information in the second order statistics of the population responses. We show that the Fisher Information of this system grows linearly with the size of the system. We propose a bilinear readout model for extracting information from correlation codes, and evaluate its performance in discrimination and estimation tasks. It is shown that the main source of information in this system is the stimulus dependence of the variances of the single neuron responses.
3 0.81323344 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
Author: Michael Collins, S. Dasgupta, Robert E. Schapire
Abstract: Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws on ideas from the Exponential family, Generalized linear models, and Bregman distances, to give a generalization of PCA to loss functions that we argue are better suited to other data types. We describe algorithms for minimizing the loss functions, and give examples on simulated data.
4 0.80969226 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
5 0.80367166 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
Author: Marzia Polito, Pietro Perona
Abstract: Locally Linear Embedding (LLE) is an elegant nonlinear dimensionality-reduction technique recently introduced by Roweis and Saul [2]. It fails when the data is divided into separate groups. We study a variant of LLE that can simultaneously group the data and calculate local embedding of each group. An estimate for the upper bound on the intrinsic dimension of the data set is obtained automatically. 1
6 0.79407537 172 nips-2001-Speech Recognition using SVMs
7 0.7916069 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
8 0.79142773 8 nips-2001-A General Greedy Approximation Algorithm with Applications
9 0.78896821 84 nips-2001-Global Coordination of Local Linear Models
10 0.78861278 98 nips-2001-Information Geometrical Framework for Analyzing Belief Propagation Decoder
11 0.78705084 190 nips-2001-Thin Junction Trees
12 0.78664482 137 nips-2001-On the Convergence of Leveraging
13 0.78348505 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
14 0.78290206 114 nips-2001-Learning from Infinite Data in Finite Time
15 0.78288668 13 nips-2001-A Natural Policy Gradient
16 0.78098643 197 nips-2001-Why Neuronal Dynamics Should Control Synaptic Learning Rules
17 0.77606738 188 nips-2001-The Unified Propagation and Scaling Algorithm
18 0.77522928 89 nips-2001-Grouping with Bias
19 0.77430117 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes