nips nips2001 nips2001-44 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently that exploiting the sparsity of sources in an appropriate representation, according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multiscale transforms, such as wavelets or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
Reference: text
sentIndex sentText sentNum sentScore
1 We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. [sent-13, score-0.74]
2 It was discovered recently that exploiting the sparsity of sources in an appropriate representation, according to some signal dictionary, dramatically improves the quality of separation. [sent-14, score-0.575]
3 In this work we use the property of multiscale transforms, such as wavelets or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. [sent-15, score-0.609]
4 We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. [sent-16, score-0.06]
5 The performance of the algorithm is verified on noise-free and noisy data. [sent-17, score-0.063]
6 Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. [sent-18, score-0.316]
7 1 Introduction. In the blind source separation problem an N-channel sensor signal x(ξ) is generated by M unknown scalar source signals s_m(ξ), linearly mixed together by an unknown N × M mixing, or crosstalk, matrix A, and possibly corrupted by additive noise n(ξ): x(ξ) = A s(ξ) + n(ξ). (1) [sent-19, score-1.189]
8 The independent variable ξ is either time or spatial coordinates in the case of images. [sent-20, score-0.034]
9 We wish to estimate the mixing matrix A and the M-dimensional source signal s(ξ). [sent-21, score-0.517]
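To make the mixing model (1) concrete, here is a minimal NumPy sketch (an illustration added here, not code from the paper; the dimensions, sparsity level and noise scale are arbitrary assumptions) that generates sparse synthetic sources and their instantaneous noisy mixtures:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, T = 2, 2, 4096                        # sources, sensors, samples (arbitrary)
# Sparse sources: mostly zero, with occasional Laplacian-distributed spikes.
S = rng.laplace(size=(M, T)) * (rng.random((M, T)) < 0.05)
A = rng.standard_normal((N, M))             # unknown mixing (crosstalk) matrix
noise = 0.01 * rng.standard_normal((N, T))  # additive sensor noise

X = A @ S + noise                           # instantaneous linear mixtures, Eq. (1)
print(X.shape)                              # (N, T): the observed sensor signals
```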
10 The assumption of statistical independence of the source components s_m(ξ), m = 1, ..., M, is the basis of standard independent component analysis (ICA). [sent-22, score-0.258]
11 A stronger assumption is the sparsity of decomposition coefficients, when the sources are properly represented [3]. [sent-26, score-0.582]
12 In particular, let each s_m(ξ) have a sparse representation obtained by means of its decomposition coefficients C_mk according to a signal dictionary of functions φ_k(ξ): s_m(ξ) = Σ_k C_mk φ_k(ξ). (2) The functions φ_k(ξ) are called atoms or elements of the dictionary. [sent-27, score-0.819]
13 These elements do not have to be linearly independent, and instead may form an overcomplete dictionary, e.g., wavelet packets. [sent-28, score-0.032]
14 Sparsity means that only a small number of coefficients Cmk differ significantly from zero. [sent-32, score-0.297]
15 Then, unmixing of the sources is performed in the transform domain, i.e., in the domain of these decomposition coefficients. [sent-33, score-0.375]
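As a small illustration of the decomposition (2) and of why unmixing in the transform domain helps, the following sketch (assuming the PyWavelets package, which the paper itself does not mention; the test signal and threshold are made up) compares the fraction of significant raw samples of a blocky signal with the fraction of significant coefficients in its orthonormal Haar decomposition:

```python
import numpy as np
import pywt  # PyWavelets; an assumed dependency for this illustration

rng = np.random.default_rng(1)

# A piecewise-constant ("blocky") test signal: 16 segments of length 64.
s = np.repeat(rng.standard_normal(16), 64)

# Orthonormal Haar decomposition, s = sum_k C_k * phi_k, cf. Eq. (2).
coeffs = np.concatenate(pywt.wavedec(s, 'haar'))

frac_significant = lambda v: np.mean(np.abs(v) > 1e-6 * np.abs(v).max())
print("significant samples:     ", frac_significant(s))        # close to 1
print("significant coefficients:", frac_significant(coeffs))   # much smaller
```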
16 The property of sparsity often yields much better source separation than standard ICA, and can work well even with more sources than mixtures. [sent-36, score-0.89]
17 In many cases there are distinct groups of coefficients, wherein sources have different sparsity properties. [sent-37, score-0.539]
18 The key idea in this study is to select only a subset of features (coefficients) which is best suited for separation, with respect to the following criteria: (1) sparsity of coefficients, and (2) separability of sources' features. [sent-38, score-0.574]
19 After this subset is formed, one uses it in the separation process, which can be accomplished by standard ICA algorithms or by clustering. [sent-39, score-0.217]
20 The performance of our approach is verified on noise-free and noisy data. [sent-40, score-0.063]
21 Our experiments with 1D signals and images demonstrate that the proposed method further improves separation quality, as compared with results obtained by using the sparsity of all decomposition coefficients. [sent-41, score-0.749]
22 2 Two approaches to sparse source separation: InfoMax and Clustering. Sparse sources can be separated by each one of several techniques, e.g., the BS InfoMax algorithm or clustering. [sent-42, score-0.577]
23 In the former case, the algorithm estimates the unmixing matrix W = A^{-1}, while in the latter case the output is the estimated mixing matrix. [sent-45, score-0.417]
24 In both cases, these matrices can be estimated only up to a column permutation and a scaling factor [4]. [sent-46, score-0.069]
25 Under the assumption of a noiseless system and a square mixing matrix in (1), the BS InfoMax is equivalent to the maximum likelihood (ML) formulation of the problem [4], which is used in this section. [sent-48, score-0.363]
26 For the sake of simplicity of the presentation, let us consider the case where the dictionary of functions used in a source decomposition (2) is an orthonormal basis. [sent-49, score-0.516]
27 (In this case, the corresponding coefficients are C_mk = ⟨s_m, φ_k⟩, where ⟨·,·⟩ denotes the inner product.) [sent-50, score-0.297]
28 From (1) and (2), the decomposition coefficients of the noiseless mixtures, according to the same signal dictionary of functions φ_k(ξ), are: λ_k = A C_k, (3) where the M-dimensional vector C_k forms the k-th column of the matrix C = {C_mk}. [sent-51, score-0.863]
29 Let Y be the features, or (new) data, matrix of dimension M × K, where K is the number of features. [sent-52, score-0.095]
30 Its rows are either the samples of sensor signals (mixtures), or their decomposition coefficients. [sent-53, score-0.414]
31 In the latter case, the coefficient vectors λ_k of (3) form the columns of Y. [sent-54, score-0.353]
32 Let the corresponding coefficients C_mk be independent random variables with a probability density function (pdf) of an exponential type, p(C_mk) ∝ exp{-ν(C_mk)}, (4) where the scalar function ν(·) is a smooth approximation of the absolute value function. [sent-57, score-0.332]
33 This kind of distribution is widely used for modeling sparsity [5]. [sent-58, score-0.218]
34 In view of the independence of the C_mk, and (4), the prior pdf of C is p(C) ∝ ∏_{m,k} exp{-ν(C_mk)}. (5) [sent-59, score-0.11]
35 Taking into account that Y = AC, the parametric model for the pdf of Y with respect to the parameters A is p_A(Y) ∝ |det A|^{-K} ∏_{m,k} exp{-ν([A^{-1}Y]_{mk})}. (6) Let W = A^{-1} be the unmixing matrix, to be estimated. [sent-60, score-0.23]
36 The corresponding log-likelihood is L_W(Y) = K log|det W| - Σ_{m=1}^{M} Σ_{k=1}^{K} ν([WY]_{mk}). (7) Maximization of L_W(Y) with respect to W is equivalent to the BS InfoMax, and can be solved efficiently by the Natural Gradient algorithm [6]. [sent-62, score-0.04]
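The following is a minimal natural-gradient sketch for maximizing a likelihood of the form (7); it is not the toolbox routine the authors used, and the score function tanh(u/eps), the learning rate and the iteration count are illustrative assumptions consistent with a smooth absolute-value prior ν:

```python
import numpy as np

def natural_gradient_unmix(Y, n_iter=500, lr=0.1, eps=0.1, seed=0):
    """Sketch of natural-gradient maximization of the log-likelihood (7).

    Y : (M, K) feature matrix (mixtures or their decomposition coefficients).
    The score nu'(u) is taken as tanh(u/eps), a smooth surrogate for sign(u)
    matching a smooth absolute-value prior nu(.) as in Eq. (4).
    """
    M, K = Y.shape
    rng = np.random.default_rng(seed)
    W = np.eye(M) + 0.01 * rng.standard_normal((M, M))
    for _ in range(n_iter):
        U = W @ Y                         # current source estimates
        score = np.tanh(U / eps)          # nu'(U), elementwise
        # Natural gradient of L_W(Y):  dW = (I - E[nu'(U) U^T]) W
        dW = (np.eye(M) - (score @ U.T) / K) @ W
        W += lr * dW
    return W
```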
37 We used this algorithm as implemented in the ICA/EEG Matlab toolbox [7]. [sent-63, score-0.037]
38 In the case of geometry based methods, separation of sparse sources can be achieved by clustering along orientations of data concentration in the N-dimensional space wherein each column Yk of the matrix Y represents a data point (N is the number of mixtures). [sent-65, score-0.971]
39 Let us consider a two-dimensional noiseless case, wherein two source signals, s_1(t) and s_2(t), are mixed by a 2×2 matrix A, arriving at two mixtures x_1(t) and x_2(t). [sent-66, score-0.73]
40 (Here, the data matrix is constructed from these mixtures x_1(t) and x_2(t).) [sent-67, score-0.243]
41 Typically, a scatter plot of two sparse mixtures, x_1(t) versus x_2(t), looks like the rightmost plot in Figure 2. [sent-68, score-0.377]
42 If only one source, say s_1(t), were present, the sensor signals would be x_1(t) = a_11 s_1(t) and x_2(t) = a_21 s_1(t), and the data points in the scatter diagram of x_1(t) versus x_2(t) would lie on the straight line along the vector [a_11 a_21]^T. [sent-69, score-0.469]
43 The same thing happens when two sparse sources are present. [sent-70, score-0.38]
44 In this sparse case, at each particular index where a sample of the first source is large, there is a high probability that the corresponding sample of the second source is small, and the point in the scatter diagram still lies close to the aforementioned straight line. [sent-71, score-0.74]
45 As a result, data points are concentrated around two dominant orientations, which are directly related to the columns of A. [sent-73, score-0.093]
46 Source signals are rarely sparse in their original domain. [sent-74, score-0.258]
47 In contrast, their decomposition coefficients (2) usually show high sparsity. [sent-75, score-0.435]
48 Therefore, we construct the data matrix Y from the decomposition coefficients of mixtures (3), rather than from the mixtures themselves. [sent-76, score-0.826]
49 In order to determine orientations of scattered data, we project the data points onto the surface of a unit sphere by normalizing corresponding vectors, and then apply a standard clustering algorithm. [sent-77, score-0.405]
50 This clustering approach works efficiently even if the number of sources is greater than the number of sensors. [sent-78, score-0.397]
51 Our clustering procedure can be summarized as follows: 1. [sent-79, score-0.131]
52 Form the feature matrix Y by putting samples of the sensor signals or (a subset of) their decomposition coefficients into the corresponding rows of the matrix. [sent-80, score-0.919]
53 Normalize the feature vectors (columns of Y), y_k = y_k / ||y_k||_2, in order to project the data points onto the surface of a unit sphere, where ||·||_2 denotes the l2 norm; before normalization, it is reasonable to remove data points with a very small norm, since these are very likely to be crosstalk-corrupted by small coefficients from other sources. [sent-82, score-0.334]
54 Move the data points to a half-sphere, e.g., by forcing the sign of the first coordinate of each y_k to be positive: IF y_k(1) < 0 THEN y_k = -y_k. [sent-86, score-0.422]
55 Otherwise, centrally symmetric (e.g., along a line) clustered data points would yield two clusters on opposite sides of the sphere. [sent-89, score-0.037]
56 The coordinates of the cluster centers will form the columns of the estimated mixing matrix A. [sent-92, score-0.402]
57 We used the Fuzzy C-means (FCM) clustering algorithm, as implemented in the Matlab Fuzzy Logic Toolbox. [sent-93, score-0.131]
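A rough sketch of this geometric procedure is given below, with scikit-learn's KMeans standing in for the Matlab Fuzzy C-means routine used by the authors; the norm threshold and other parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans  # stand-in for Matlab's Fuzzy C-means (assumption)

def estimate_mixing_by_clustering(Y, n_sources, norm_floor=1e-3, seed=0):
    """Geometric estimate of the mixing matrix from the N x K feature matrix Y."""
    # Remove points with very small norm; they are likely crosstalk-dominated.
    norms = np.linalg.norm(Y, axis=0)
    Z = Y[:, norms > norm_floor * norms.max()]
    # Project the remaining points onto the unit sphere.
    Z = Z / np.linalg.norm(Z, axis=0)
    # Fold onto a half-sphere by forcing the first coordinate to be non-negative,
    # so that antipodal points fall into the same cluster.
    Z = Z * np.where(Z[0, :] < 0, -1.0, 1.0)
    # Cluster the points; the cluster centers approximate the columns of A.
    km = KMeans(n_clusters=n_sources, n_init=10, random_state=seed).fit(Z.T)
    A_hat = km.cluster_centers_.T
    return A_hat / np.linalg.norm(A_hat, axis=0)
```

The returned columns approximate those of A only up to permutation, sign and scale, consistent with the ambiguity noted above.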
58 The estimated unmixing matrix A^{-1} is obtained by either the BS InfoMax or the above clustering procedure, applied either to the complete data set or to some subsets of the data (to be explained in the next section). [sent-95, score-0.436]
59 Then, the sources are recovered in their original domain by s(t) = A^{-1} x(t). [sent-96, score-0.261]
60 We should stress here that if the clustering approach is used, the estimation of the sources is not restricted to the case of square mixing matrices, although source recovery is more complicated in the rectangular case (this topic is outside the scope of this paper). [sent-97, score-0.79]
61 3 Multinode-based source separation. Motivating example: sparsity of random blocks in the Haar basis. [sent-98, score-0.635]
62 To provide intuitive insight into the practical implications of our main idea, we first use 1D block functions that are piecewise constant, with random amplitude and duration of each constant piece (Figure 1). [sent-99, score-0.068]
63 It is known that the Haar wavelet basis provides a compact representation of such functions. [sent-100, score-0.222]
64 Let us take a close look at the Haar wavelet coefficients at different resolution levels j = 0, 1, ... [sent-101, score-0.559]
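A short sketch of such an inspection (again assuming PyWavelets; the random block signal and the significance threshold are arbitrary choices) prints, for each resolution level, the fraction of Haar coefficients that differ significantly from zero; the finest levels of a blocky signal are typically far sparser than the coarse ones, which is the property the multinode feature selection exploits:

```python
import numpy as np
import pywt  # assumed dependency, as in the earlier sketch

rng = np.random.default_rng(2)

# Random block signal: piecewise constant, random amplitudes and durations.
edges = np.sort(rng.choice(np.arange(1, 1024), size=15, replace=False))
amps = rng.standard_normal(16)
s = np.concatenate([np.full(n, a) for n, a in
                    zip(np.diff(np.r_[0, edges, 1024]), amps)])

# Haar coefficients grouped by resolution level (coarse -> fine).
for j, c in enumerate(pywt.wavedec(s, 'haar')):
    frac = np.mean(np.abs(c) > 1e-3 * np.abs(s).max())
    print(f"level {j:2d}: {c.size:4d} coefficients, {frac:.2f} significant")
```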
wordName wordTfidf (topN-words)
[('cmk', 0.301), ('coefficients', 0.297), ('source', 0.229), ('sources', 0.226), ('wavelet', 0.222), ('sparsity', 0.218), ('yk', 0.211), ('separation', 0.188), ('infomax', 0.17), ('dictionary', 0.149), ('haar', 0.149), ('unmixing', 0.149), ('mixtures', 0.148), ('mixing', 0.143), ('decomposition', 0.138), ('signals', 0.136), ('clustering', 0.131), ('sparse', 0.122), ('haifa', 0.112), ('sensor', 0.104), ('technion', 0.102), ('bs', 0.1), ('noiseless', 0.095), ('wherein', 0.095), ('matrix', 0.095), ('eoo', 0.086), ('multinode', 0.086), ('toc', 0.086), ('zeevi', 0.086), ('blind', 0.085), ('pdf', 0.081), ('oo', 0.076), ('scatter', 0.075), ('orientations', 0.075), ('xl', 0.075), ('lw', 0.075), ('packets', 0.075), ('israel', 0.07), ('wy', 0.068), ('fuzzy', 0.068), ('mk', 0.063), ('verified', 0.063), ('sl', 0.061), ('electrical', 0.061), ('columns', 0.056), ('sphere', 0.054), ('signal', 0.05), ('ica', 0.05), ('ak', 0.048), ('id', 0.045), ('centers', 0.044), ('matlab', 0.044), ('diagram', 0.043), ('straight', 0.042), ('quality', 0.041), ('improves', 0.04), ('resolution', 0.04), ('efficiently', 0.04), ('surface', 0.039), ('column', 0.039), ('block', 0.038), ('mixed', 0.038), ('points', 0.037), ('albuquerque', 0.037), ('ack', 0.037), ('finest', 0.037), ('mother', 0.037), ('pavel', 0.037), ('toolbox', 0.037), ('wavelets', 0.037), ('project', 0.037), ('rows', 0.036), ('domain', 0.035), ('scalar', 0.035), ('stress', 0.034), ('pearlmutter', 0.034), ('mexico', 0.034), ('nm', 0.034), ('coordinates', 0.034), ('maximization', 0.033), ('versus', 0.032), ('engineering', 0.032), ('imaging', 0.032), ('thing', 0.032), ('scattered', 0.032), ('overcomplete', 0.032), ('israeli', 0.032), ('motivating', 0.032), ('subsets', 0.031), ('square', 0.03), ('musical', 0.03), ('piece', 0.03), ('separability', 0.03), ('arriving', 0.03), ('estimated', 0.03), ('property', 0.029), ('subset', 0.029), ('independence', 0.029), ('images', 0.029), ('sounds', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
2 0.31875378 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
3 0.19219825 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
4 0.16028631 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly the separation performance, and, most important, to mark the components that have a actual physical meaning. Application to 49-channel-data from an magneto encephalography (MEG) experiment underlines the usefulness of our approach. 1
5 0.094928563 171 nips-2001-Spectral Relaxation for K-means Clustering
Author: Hongyuan Zha, Xiaofeng He, Chris Ding, Ming Gu, Horst D. Simon
Abstract: The popular K-means clustering partitions a data set by minimizing a sum-of-squares cost function. A coordinate descend method is then used to find local minima. In this paper we show that the minimization can be reformulated as a trace maximization problem associated with the Gram matrix of the data vectors. Furthermore, we show that a relaxed version of the trace maximization problem possesses global optimal solutions which can be obtained by computing a partial eigendecomposition of the Gram matrix, and the cluster assignment for each data vectors can be found by computing a pivoted QR decomposition of the eigenvector matrix. As a by-product we also derive a lower bound for the minimum of the sum-of-squares cost function. 1
6 0.094435781 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
7 0.082013227 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
8 0.08072973 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
9 0.076937683 35 nips-2001-Analysis of Sparse Bayesian Learning
10 0.072059259 185 nips-2001-The Method of Quantum Clustering
11 0.068540893 164 nips-2001-Sampling Techniques for Kernel Methods
12 0.065311857 8 nips-2001-A General Greedy Approximation Algorithm with Applications
13 0.064386778 170 nips-2001-Spectral Kernel Methods for Clustering
14 0.057797201 38 nips-2001-Asymptotic Universality for Learning Curves of Support Vector Machines
15 0.05674044 69 nips-2001-Escaping the Convex Hull with Extrapolated Vector Machines
16 0.056633621 177 nips-2001-Switch Packet Arbitration via Queue-Learning
17 0.056181643 136 nips-2001-On the Concentration of Spectral Properties
18 0.054208498 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
19 0.052309208 13 nips-2001-A Natural Policy Gradient
20 0.051513068 129 nips-2001-Multiplicative Updates for Classification by Mixture Models
topicId topicWeight
[(0, -0.18), (1, 0.018), (2, -0.029), (3, -0.187), (4, 0.047), (5, -0.001), (6, -0.027), (7, 0.087), (8, 0.229), (9, -0.376), (10, -0.119), (11, -0.118), (12, -0.02), (13, 0.023), (14, 0.125), (15, -0.001), (16, 0.073), (17, -0.034), (18, 0.01), (19, -0.055), (20, 0.034), (21, 0.019), (22, -0.08), (23, 0.022), (24, 0.071), (25, 0.033), (26, -0.0), (27, 0.007), (28, 0.012), (29, 0.004), (30, 0.122), (31, -0.071), (32, 0.016), (33, -0.001), (34, 0.009), (35, -0.006), (36, -0.047), (37, 0.018), (38, -0.009), (39, 0.047), (40, 0.067), (41, 0.078), (42, -0.013), (43, -0.022), (44, -0.121), (45, 0.05), (46, -0.054), (47, 0.039), (48, -0.016), (49, 0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.97263163 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
2 0.91373253 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
3 0.84313995 71 nips-2001-Estimating the Reliability of ICA Projections
Author: Frank C. Meinecke, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly the separation performance, and, most important, to mark the components that have a actual physical meaning. Application to 49-channel-data from an magneto encephalography (MEG) experiment underlines the usefulness of our approach. 1
4 0.66632819 103 nips-2001-Kernel Feature Spaces and Nonlinear Blind Souce Separation
Author: Stefan Harmeling, Andreas Ziehe, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: In kernel based learning the data is mapped to a kernel feature space of a dimension that corresponds to the number of training data points. In practice, however, the data forms a smaller submanifold in feature space, a fact that has been used e.g. by reduced set techniques for SVMs. We propose a new mathematical construction that permits to adapt to the intrinsic dimension and to find an orthonormal basis of this submanifold. In doing so, computations get much simpler and more important our theoretical framework allows to derive elegant kernelized blind source separation (BSS) algorithms for arbitrary invertible nonlinear mixings. Experiments demonstrate the good performance and high computational efficiency of our kTDSEP algorithm for the problem of nonlinear BSS.
5 0.46184748 165 nips-2001-Scaling Laws and Local Minima in Hebbian ICA
Author: Magnus Rattray, Gleb Basalyga
Abstract: We study the dynamics of a Hebbian ICA algorithm extracting a single non-Gaussian component from a high-dimensional Gaussian background. For both on-line and batch learning we find that a surprisingly large number of examples are required to avoid trapping in a sub-optimal state close to the initial conditions. To extract a skewed signal at least examples are required for -dimensional data and examples are required to extract a symmetrical signal with non-zero kurtosis. § ¡ ©£¢ £ §¥ ¡ ¨¦¤£¢
6 0.38995209 171 nips-2001-Spectral Relaxation for K-means Clustering
7 0.36358413 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
8 0.35729569 35 nips-2001-Analysis of Sparse Bayesian Learning
9 0.31961274 177 nips-2001-Switch Packet Arbitration via Queue-Learning
10 0.30342132 53 nips-2001-Constructing Distributed Representations Using Additive Clustering
11 0.28530025 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning
12 0.27484232 159 nips-2001-Reducing multiclass to binary by coupling probability estimates
13 0.27064413 185 nips-2001-The Method of Quantum Clustering
14 0.2675381 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity
15 0.26163054 110 nips-2001-Learning Hierarchical Structures with Linear Relational Embedding
16 0.25632483 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
17 0.23392142 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables
18 0.21729729 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
19 0.20569114 29 nips-2001-Adaptive Sparseness Using Jeffreys Prior
20 0.20558611 129 nips-2001-Multiplicative Updates for Classification by Mixture Models
topicId topicWeight
[(14, 0.041), (17, 0.048), (19, 0.025), (20, 0.111), (27, 0.143), (30, 0.054), (38, 0.025), (40, 0.155), (59, 0.064), (72, 0.05), (79, 0.05), (83, 0.021), (91, 0.121)]
simIndex simValue paperId paperTitle
same-paper 1 0.88936144 44 nips-2001-Blind Source Separation via Multinode Sparse Representation
Author: Michael Zibulevsky, Pavel Kisilev, Yehoshua Y. Zeevi, Barak A. Pearlmutter
Abstract: We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results. 1
2 0.82442975 172 nips-2001-Speech Recognition using SVMs
Author: N. Smith, Mark Gales
Abstract: An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more useful mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space. The mapping, and appropriate normalisation schemes, are evaluated on a speaker-independent isolated letter task where the new mapping outperforms both the Fisher score and HMMs trained to maximise likelihood. 1
3 0.78514391 116 nips-2001-Linking Motor Learning to Function Approximation: Learning in an Unlearnable Force Field
Author: O. Donchin, Reza Shadmehr
Abstract: Reaching movements require the brain to generate motor commands that rely on an internal model of the task’s dynamics. Here we consider the errors that subjects make early in their reaching trajectories to various targets as they learn an internal model. Using a framework from function approximation, we argue that the sequence of errors should reflect the process of gradient descent. If so, then the sequence of errors should obey hidden state transitions of a simple dynamical system. Fitting the system to human data, we find a surprisingly good fit accounting for 98% of the variance. This allows us to draw tentative conclusions about the basis elements used by the brain in transforming sensory space to motor commands. To test the robustness of the results, we estimate the shape of the basis elements under two conditions: in a traditional learning paradigm with a consistent force field, and in a random sequence of force fields where learning is not possible. Remarkably, we find that the basis remains invariant. 1
4 0.76040089 127 nips-2001-Multi Dimensional ICA to Separate Correlated Sources
Author: Roland Vollgraf, Klaus Obermayer
Abstract: We present a new method for the blind separation of sources, which do not fulfill the independence assumption. In contrast to standard methods we consider groups of neighboring samples (
5 0.74572456 1 nips-2001-(Not) Bounding the True Error
Author: John Langford, Rich Caruana
Abstract: We present a new approach to bounding the true error rate of a continuous valued classifier based upon PAC-Bayes bounds. The method first constructs a distribution over classifiers by determining how sensitive each parameter in the model is to noise. The true error rate of the stochastic classifier found with the sensitivity analysis can then be tightly bounded using a PAC-Bayes bound. In this paper we demonstrate the method on artificial neural networks with results of a order of magnitude improvement vs. the best deterministic neural net bounds. £ ¡ ¤¢
6 0.73769033 88 nips-2001-Grouping and dimensionality reduction by locally linear embedding
7 0.7359699 13 nips-2001-A Natural Policy Gradient
8 0.73014867 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
9 0.7264179 89 nips-2001-Grouping with Bias
10 0.72433478 132 nips-2001-Novel iteration schemes for the Cluster Variation Method
11 0.7236321 9 nips-2001-A Generalization of Principal Components Analysis to the Exponential Family
12 0.72352773 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
13 0.72338772 95 nips-2001-Infinite Mixtures of Gaussian Process Experts
14 0.723234 27 nips-2001-Activity Driven Adaptive Stochastic Resonance
16 0.72091019 121 nips-2001-Model-Free Least-Squares Policy Iteration
17 0.72016943 74 nips-2001-Face Recognition Using Kernel Methods
18 0.71955019 84 nips-2001-Global Coordination of Local Linear Models
19 0.71940762 8 nips-2001-A General Greedy Approximation Algorithm with Applications
20 0.71885705 190 nips-2001-Thin Junction Trees