nips nips2000 nips2000-36 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Wei Lu, Jagath C. Rajapakse
Abstract: The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulting independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate ICA's indeterminacy in permutation and dilation. The experiments demonstrate the use of CICA in the ordering of independent components while providing normalized demixing processes. Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. [sent-4, score-1.307]
2 This paper shows that CICA can be used to order the resulting independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. [sent-5, score-0.918]
3 It can systematically eliminate ICA's indeterminacy in permutation and dilation. [sent-6, score-0.24]
4 The experiments demonstrate the use of CICA in the ordering of independent components while providing normalized demixing processes. [sent-7, score-0.727]
5 There has been growing interest in efficient realizations of ICA neural networks (ICNNs). [sent-9, score-0.081]
6 These neural algorithms provide adaptive solutions that satisfy the independence conditions after the learning converges [2, 3, 4]. [sent-10, score-0.2]
7 However, ICA only defines the directions of independent components. [sent-11, score-0.198]
8 The magnitudes of the independent components and the norms of the demixing matrix may still vary. [sent-12, score-0.757]
9 Also, the order of the resulting components is arbitrary. [sent-13, score-0.229]
10 In general, ICA has an inherent indeterminacy in dilation and permutation. [sent-14, score-0.278]
11 Such indeterminacy cannot be reduced further without additional assumptions and constraints [5]. [sent-15, score-0.139]
12 • To produce unity transform operators: normalizing the demixing channels reduces the dilation effect on the resulting components (see the sketch below). [sent-17, score-0.747]
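As an illustration of this normalization constraint, here is a minimal NumPy sketch (the helper name and matrix values are hypothetical, not taken from the paper) that rescales each demixing row to unit Euclidean norm, which removes the dilation ambiguity in u = Wx:

```python
import numpy as np

def normalize_demixing_rows(W):
    """Rescale each row of a demixing matrix W to unit Euclidean norm.

    The directions of the recovered components u = W x are unchanged;
    only the arbitrary per-channel gain (dilation) is fixed.
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, 1e-12)  # guard against zero rows

# Example: a 2x2 demixing matrix with arbitrary row scales
W = np.array([[2.0, 0.5],
              [-0.3, 4.0]])
W_unit = normalize_demixing_rows(W)
print(np.linalg.norm(W_unit, axis=1))  # -> [1. 1.]
```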
13 With such conditions applied, the ICA problem becomes a constrained optimization problem. [sent-19, score-0.279]
14 In the present paper, Lagrange multiplier methods are adopted to provide an adaptive solution to this problem. [sent-20, score-0.557]
15 This solution can be implemented as an iterative updating system of neural networks, referred to as ICNNs. [sent-21, score-0.098]
16 The next section briefly introduces the problem formulation, analysis, and solution of Lagrange multiplier methods. [sent-22, score-0.427]
17 Lagrange multiplier methods are then utilized to develop a systematic approach to CICA. [sent-24, score-0.494]
18 Simulations are performed to demonstrate the usefulness of the analytical results and indicate the improvements due to the constraints. [sent-25, score-0.099]
19 2 Lagrange Multiplier Methods Lagrange multiplier methods introduce Lagrange multipliers to solve a constrained optimization problem iteratively. [sent-26, score-0.843]
20 A penalty parameter is also introduced so that the local convexity assumption holds at the solution. [sent-27, score-0.224]
21 Lagrange multiplier methods can handle problems with both equality and inequality constraints. [sent-28, score-0.711]
22 Because Lagrangian methods cannot directly deal with inequality constraints g_i(X) ≤ 0, it is possible to transform them into equality constraints by introducing a vector of slack variables z = [z_1 ... z_m]^T. [sent-33, score-1.013]
23 This results in the equality constraints p_i(X) = g_i(X) + z_i² = 0, i = 1, ..., m. [sent-36, score-0.346]
24 Based on the transformation, the corresponding simplified augmented Lagrangian function for problem (1) is defined as in eq. (2), where μ = [μ_1 ... μ_m]^T [sent-39, score-0.148]
25 and λ = [λ_1 ... λ_n]^T are two sets of Lagrange multipliers, γ is the scalar penalty parameter, ĝ_i(X) equals μ_i + γ g_i(X), ‖·‖ denotes the Euclidean norm, [sent-45, score-0.209]
26 and ½γ‖h(X)‖² is the penalty term to ensure that the optimization problem is held at the condition of the local convexity assumption: ∇²_XX L > 0. [sent-47, score-0.371]
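For reference, a standard simplified augmented Lagrangian consistent with the definitions above (a sketch of the usual max-form after eliminating the slack variables, not a verbatim copy of the paper's eq. (2)) is:

```latex
\mathcal{L}(X,\mu,\lambda) = f(X)
  + \frac{1}{2\gamma}\sum_{i=1}^{m}\left[\max^{2}\{0,\hat{g}_i(X)\} - \mu_i^{2}\right]
  + \lambda^{T} h(X) + \tfrac{1}{2}\gamma \|h(X)\|^{2},
\qquad \hat{g}_i(X) = \mu_i + \gamma\, g_i(X).
```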
27 We use the augmented Lagrangian function in this paper because it gives wider applicability and provides better stability [6]. [sent-48, score-0.202]
28 For discrete problems, the changes in the augmented Lagrangian function can be defined as Δ_X L(X, μ, λ) to achieve the saddle point in the discrete variable space. [sent-49, score-0.223]
29 The iterative equations to solve the problem in eq. (2) are given as follows: [sent-50, score-0.098]
30 X(k+1) = X(k) − Δ_X L(X(k), μ(k), λ(k)); μ(k+1) = μ(k) + γ p(X(k)) = max{0, ĝ(X(k))}; λ(k+1) = λ(k) + γ h(X(k)), (3) where k denotes the iteration index and ĝ(X(k)) = μ(k) + γ g(X(k)). [sent-51, score-0.067]
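A minimal sketch of one pass of these updates (Python/NumPy; the callables grad_L, g, h, the step size eta, and the variable names are assumptions for illustration, not the paper's implementation):

```python
import numpy as np

def augmented_lagrangian_step(X, mu, lam, grad_L, g, h, gamma, eta=0.01):
    """One iteration of the multiplier updates in eq. (3).

    grad_L(X, mu, lam): change of the augmented Lagrangian w.r.t. X
    g(X), h(X):         inequality and equality constraint vectors
    gamma:              scalar penalty parameter; eta: step size (assumed)
    """
    X_new = X - eta * grad_L(X, mu, lam)      # descend in X
    g_hat = mu + gamma * g(X)                 # g_hat(X) = mu + gamma * g(X)
    mu_new = np.maximum(0.0, g_hat)           # multiplier update for g(X) <= 0
    lam_new = lam + gamma * h(X)              # multiplier update for h(X) = 0
    return X_new, mu_new, lam_new
```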
31 3 Unconstrained ICA Let the time-varying input signal be x = (x_1, x_2, ..., x_N)^T [sent-52, score-0.092]
32 and the signal of interest, consisting of independent components (ICs), be c = (c_1, c_2, ..., c_M)^T. [sent-55, score-0.343]
33 The signal x is considered to be a linear mixture of independent components c: x = Ac, where A is an N x M mixing matrix with full column rank. [sent-59, score-0.428]
34 The goal of general ICA is to obtain a linear M × N demixing matrix W to recover the independent components c with minimal knowledge of A and c; normally M = N. [sent-60, score-0.774]
35 Then, the recovered components u are given by u = Wx. [sent-61, score-0.125]
36 In the present paper, the contrast function used is the mutual information (M) of the output signal, defined in terms of the output entropies to measure independence: M = Σ_i H(u_i) − H(u) (4), where H(u_i) is the marginal entropy of component u_i and H(u) is the joint entropy of the output. [sent-62, score-0.421]
37 M is non-negative and equals zero when the components are completely independent. [sent-63, score-0.191]
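To make the mixing/demixing model concrete, here is a small sketch (a hypothetical two-source example with made-up matrices, not one of the paper's experiments) of x = Ac and u = Wx, illustrating that any permutation or rescaling of the demixing rows is an equally valid separation, which is exactly the indeterminacy that CICA's constraints remove:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent sources (ICs): one uniform, one Laplacian, T samples each
T = 1000
c = np.vstack([rng.uniform(-1, 1, T), rng.laplace(0, 1, T)])  # c: M x T

A = np.array([[1.0, 0.6],
              [0.4, 1.0]])       # N x M mixing matrix with full column rank
x = A @ c                         # observed mixtures, x = A c

W = np.linalg.inv(A)              # one valid M x N demixing matrix
u = W @ x                         # recovered components, u = W x

# Any permutation P and diagonal rescaling D yield another valid separation:
P = np.array([[0.0, 1.0], [1.0, 0.0]])
D = np.diag([3.0, -0.5])
u_alt = (D @ P @ W) @ x           # still independent, but reordered and rescaled
```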
wordName wordTfidf (topN-words)
[('lagrange', 0.38), ('demixing', 0.37), ('multiplier', 0.356), ('ica', 0.3), ('cica', 0.247), ('constrained', 0.174), ('lagrangian', 0.158), ('constraints', 0.139), ('equality', 0.138), ('independent', 0.126), ('components', 0.125), ('dilation', 0.123), ('indeterminacy', 0.123), ('augmented', 0.119), ('penalty', 0.113), ('optimization', 0.105), ('inequality', 0.104), ('resulted', 0.104), ('component', 0.094), ('signal', 0.092), ('rca', 0.089), ('jt', 0.084), ('convexity', 0.079), ('multipliers', 0.075), ('defines', 0.072), ('ordering', 0.072), ('zl', 0.069), ('transform', 0.069), ('iterative', 0.067), ('equals', 0.066), ('recover', 0.066), ('matrix', 0.056), ('ui', 0.055), ('methods', 0.054), ('introducing', 0.051), ('highlight', 0.048), ('singapore', 0.048), ('wei', 0.048), ('hl', 0.048), ('keywords', 0.048), ('operators', 0.048), ('resolve', 0.048), ('wider', 0.048), ('output', 0.046), ('adopted', 0.045), ('unity', 0.045), ('growing', 0.045), ('normalize', 0.045), ('norms', 0.045), ('analysis', 0.043), ('deal', 0.043), ('email', 0.042), ('systematic', 0.042), ('utilized', 0.042), ('permutation', 0.042), ('salient', 0.042), ('held', 0.042), ('entropy', 0.041), ('systematically', 0.04), ('saddle', 0.04), ('cm', 0.04), ('adaptive', 0.039), ('ih', 0.038), ('mutually', 0.038), ('ip', 0.038), ('realization', 0.036), ('sort', 0.036), ('channels', 0.036), ('provide', 0.035), ('applicability', 0.035), ('eliminate', 0.035), ('magnitudes', 0.035), ('demonstrate', 0.034), ('technique', 0.034), ('slack', 0.033), ('usefulness', 0.033), ('unconstrained', 0.033), ('discrete', 0.032), ('condition', 0.032), ('improvements', 0.032), ('inherent', 0.032), ('lu', 0.032), ('introduce', 0.031), ('solve', 0.031), ('updating', 0.031), ('mutual', 0.031), ('normally', 0.031), ('sense', 0.03), ('handle', 0.03), ('norm', 0.03), ('scalar', 0.03), ('arguments', 0.03), ('indices', 0.03), ('problems', 0.029), ('characteristics', 0.029), ('simplified', 0.029), ('mixing', 0.029), ('solution', 0.028), ('ci', 0.028), ('school', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 36 nips-2000-Constrained Independent Component Analysis
Author: Wei Lu, Jagath C. Rajapakse
Abstract: The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulted independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate the ICA's indeterminacy on permutation and dilation. The experiments demonstrate the use of CICA in ordering of independent components while providing normalized demixing processes. Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods 1
2 0.23349746 33 nips-2000-Combining ICA and Top-Down Attention for Robust Speech Recognition
Author: Un-Min Bae, Soo-Young Lee
Abstract: We present an algorithm which compensates for the mismatches between characteristics of real-world problems and assumptions of independent component analysis algorithm. To provide additional information to the ICA network, we incorporate top-down selective attention. An MLP classifier is added to the separated signal channel and the error of the classifier is backpropagated to the ICA network. This backpropagation process results in estimation of expected ICA output signal for the top-down attention. Then, the unmixing matrix is retrained according to a new cost function representing the backpropagated error as well as independence. It modifies the density of recovered signals to the density appropriate for classification. For noisy speech signal recorded in real environments, the algorithm improved the recognition performance and showed robustness against parametric changes. 1
3 0.096145771 59 nips-2000-From Mixtures of Mixtures to Adaptive Transform Coding
Author: Cynthia Archer, Todd K. Leen
Abstract: We establish a principled framework for adaptive transform coding. Transform coders are often constructed by concatenating an ad hoc choice of transform with suboptimal bit allocation and quantizer design. Instead, we start from a probabilistic latent variable model in the form of a mixture of constrained Gaussian mixtures. From this model we derive a transform coding algorithm, which is a constrained version of the generalized Lloyd algorithm for vector quantizer design. A byproduct of our derivation is the introduction of a new transform basis, which unlike other transforms (PCA, DCT, etc.) is explicitly optimized for coding. Image compression experiments show adaptive transform coders designed with our algorithm improve compressed image signal-to-noise ratio up to 3 dB compared to global transform coding and 0.5 to 2 dB compared to other adaptive transform coders. 1
4 0.074301258 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
Author: Michael S. Gray, Terrence J. Sejnowski, Javier R. Movellan
Abstract: We examine eight different techniques for developing visual representations in machine vision tasks. In particular we compare different versions of principal component and independent component analysis in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance. 1
5 0.074123643 38 nips-2000-Data Clustering by Markovian Relaxation and the Information Bottleneck Method
Author: Naftali Tishby, Noam Slonim
Abstract: We introduce a new, non-parametric and principled, distance based clustering method. This method combines a pairwise based approach with a vector-quantization method which provide a meaningful interpretation to the resulting clusters. The idea is based on turning the distance matrix into a Markov process and then examine the decay of mutual-information during the relaxation of this process. The clusters emerge as quasi-stable structures during this relaxation, and then are extracted using the information bottleneck method. These clusters capture the information about the initial point of the relaxation in the most effective way. The method can cluster data with no geometric or other bias and makes no assumption about the underlying distribution. 1
6 0.063396767 65 nips-2000-Higher-Order Statistical Properties Arising from the Non-Stationarity of Natural Signals
7 0.062276278 12 nips-2000-A Support Vector Method for Clustering
8 0.061240438 24 nips-2000-An Information Maximization Approach to Overcomplete and Recurrent Representations
9 0.054041322 74 nips-2000-Kernel Expansions with Unlabeled Examples
10 0.052449871 31 nips-2000-Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models
11 0.052039478 86 nips-2000-Model Complexity, Goodness of Fit and Diminishing Returns
12 0.051184904 51 nips-2000-Factored Semi-Tied Covariance Matrices
13 0.049084712 69 nips-2000-Incorporating Second-Order Functional Knowledge for Better Option Pricing
14 0.048614983 76 nips-2000-Learning Continuous Distributions: Simulations With Field Theoretic Priors
15 0.048022047 121 nips-2000-Sparse Kernel Principal Component Analysis
16 0.047126852 18 nips-2000-Active Support Vector Machine Classification
17 0.045683138 46 nips-2000-Ensemble Learning and Linear Response Theory for ICA
18 0.045549367 60 nips-2000-Gaussianization
19 0.042483997 32 nips-2000-Color Opponency Constitutes a Sparse Representation for the Chromatic Structure of Natural Scenes
20 0.042454738 106 nips-2000-Propagation Algorithms for Variational Bayesian Learning
topicId topicWeight
[(0, 0.15), (1, -0.007), (2, 0.058), (3, 0.067), (4, 0.006), (5, -0.01), (6, -0.111), (7, -0.078), (8, -0.047), (9, 0.047), (10, -0.309), (11, -0.027), (12, 0.045), (13, -0.008), (14, -0.108), (15, -0.06), (16, -0.242), (17, -0.108), (18, 0.118), (19, 0.189), (20, -0.102), (21, -0.058), (22, -0.021), (23, -0.053), (24, -0.004), (25, -0.133), (26, -0.162), (27, -0.262), (28, 0.071), (29, 0.002), (30, -0.215), (31, -0.047), (32, -0.124), (33, -0.122), (34, 0.082), (35, -0.002), (36, -0.007), (37, -0.009), (38, 0.068), (39, 0.034), (40, 0.056), (41, 0.142), (42, 0.028), (43, 0.078), (44, -0.015), (45, 0.133), (46, -0.006), (47, 0.099), (48, -0.014), (49, -0.05)]
simIndex simValue paperId paperTitle
same-paper 1 0.9817332 36 nips-2000-Constrained Independent Component Analysis
Author: Wei Lu, Jagath C. Rajapakse
Abstract: The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulted independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate the ICA's indeterminacy on permutation and dilation. The experiments demonstrate the use of CICA in ordering of independent components while providing normalized demixing processes. Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods 1
2 0.84227765 33 nips-2000-Combining ICA and Top-Down Attention for Robust Speech Recognition
Author: Un-Min Bae, Soo-Young Lee
Abstract: We present an algorithm which compensates for the mismatches between characteristics of real-world problems and assumptions of independent component analysis algorithm. To provide additional information to the ICA network, we incorporate top-down selective attention. An MLP classifier is added to the separated signal channel and the error of the classifier is backpropagated to the ICA network. This backpropagation process results in estimation of expected ICA output signal for the top-down attention. Then, the unmixing matrix is retrained according to a new cost function representing the backpropagated error as well as independence. It modifies the density of recovered signals to the density appropriate for classification. For noisy speech signal recorded in real environments, the algorithm improved the recognition performance and showed robustness against parametric changes. 1
3 0.35342592 59 nips-2000-From Mixtures of Mixtures to Adaptive Transform Coding
Author: Cynthia Archer, Todd K. Leen
Abstract: We establish a principled framework for adaptive transform coding. Transform coders are often constructed by concatenating an ad hoc choice of transform with suboptimal bit allocation and quantizer design. Instead, we start from a probabilistic latent variable model in the form of a mixture of constrained Gaussian mixtures. From this model we derive a transform coding algorithm, which is a constrained version of the generalized Lloyd algorithm for vector quantizer design. A byproduct of our derivation is the introduction of a new transform basis, which unlike other transforms (PCA, DCT, etc.) is explicitly optimized for coding. Image compression experiments show adaptive transform coders designed with our algorithm improve compressed image signal-to-noise ratio up to 3 dB compared to global transform coding and 0.5 to 2 dB compared to other adaptive transform coders. 1
4 0.29362401 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
Author: Michael S. Gray, Terrence J. Sejnowski, Javier R. Movellan
Abstract: We examine eight different techniques for developing visual representations in machine vision tasks. In particular we compare different versions of principal component and independent component analysis in combination with stepwise regression methods for variable selection. We found that local methods, based on the statistics of image patches, consistently outperformed global methods based on the statistics of entire images. This result is consistent with previous work on emotion and facial expression recognition. In addition, the use of a stepwise regression technique for selecting variables and regions of interest substantially boosted performance. 1
5 0.29313844 60 nips-2000-Gaussianization
Author: Scott Saobing Chen, Ramesh A. Gopinath
Abstract: High dimensional data modeling is difficult mainly because the so-called
7 0.24973461 65 nips-2000-Higher-Order Statistical Properties Arising from the Non-Stationarity of Natural Signals
8 0.23777334 38 nips-2000-Data Clustering by Markovian Relaxation and the Information Bottleneck Method
9 0.21434024 24 nips-2000-An Information Maximization Approach to Overcomplete and Recurrent Representations
10 0.20734654 32 nips-2000-Color Opponency Constitutes a Sparse Representation for the Chromatic Structure of Natural Scenes
12 0.20242706 86 nips-2000-Model Complexity, Goodness of Fit and Diminishing Returns
13 0.20201242 18 nips-2000-Active Support Vector Machine Classification
14 0.20045798 12 nips-2000-A Support Vector Method for Clustering
15 0.18119615 69 nips-2000-Incorporating Second-Order Functional Knowledge for Better Option Pricing
16 0.17627685 74 nips-2000-Kernel Expansions with Unlabeled Examples
17 0.17568853 51 nips-2000-Factored Semi-Tied Covariance Matrices
18 0.17060632 99 nips-2000-Periodic Component Analysis: An Eigenvalue Method for Representing Periodic Structure in Speech
19 0.15638155 96 nips-2000-One Microphone Source Separation
20 0.15233608 126 nips-2000-Stagewise Processing in Error-correcting Codes and Image Restoration
topicId topicWeight
[(10, 0.017), (17, 0.137), (32, 0.019), (33, 0.054), (35, 0.292), (54, 0.02), (55, 0.02), (62, 0.035), (65, 0.075), (67, 0.066), (76, 0.053), (79, 0.014), (90, 0.081), (97, 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.82365417 36 nips-2000-Constrained Independent Component Analysis
Author: Wei Lu, Jagath C. Rajapakse
Abstract: The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulted independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate the ICA's indeterminacy on permutation and dilation. The experiments demonstrate the use of CICA in ordering of independent components while providing normalized demixing processes. Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods 1
2 0.7857005 71 nips-2000-Interactive Parts Model: An Application to Recognition of On-line Cursive Script
Author: Predrag Neskovic, Philip C. Davis, Leon N. Cooper
Abstract: In this work, we introduce an Interactive Parts (IP) model as an alternative to Hidden Markov Models (HMMs). We tested both models on a database of on-line cursive script. We show that implementations of HMMs and the IP model, in which all letters are assumed to have the same average width, give comparable results. However, in contrast to HMMs, the IP model can handle duration modeling without an increase in computational complexity. 1
3 0.54497671 68 nips-2000-Improved Output Coding for Classification Using Continuous Relaxation
Author: Koby Crammer, Yoram Singer
Abstract: Output coding is a general method for solving multiclass problems by reducing them to multiple binary classification problems. Previous research on output coding has employed, almost solely, predefined discrete codes. We describe an algorithm that improves the performance of output codes by relaxing them to continuous codes. The relaxation procedure is cast as an optimization problem and is reminiscent of the quadratic program for support vector machines. We describe experiments with the proposed algorithm, comparing it to standard discrete output codes. The experimental results indicate that continuous relaxations of output codes often improve the generalization performance, especially for short codes.
4 0.5315088 74 nips-2000-Kernel Expansions with Unlabeled Examples
Author: Martin Szummer, Tommi Jaakkola
Abstract: Modern classification applications necessitate supplementing the few available labeled examples with unlabeled examples to improve classification performance. We present a new tractable algorithm for exploiting unlabeled examples in discriminative classification. This is achieved essentially by expanding the input vectors into longer feature vectors via both labeled and unlabeled examples. The resulting classification method can be interpreted as a discriminative kernel density estimate and is readily trained via the EM algorithm, which in this case is both discriminative and achieves the optimal solution. We provide, in addition, a purely discriminative formulation of the estimation problem by appealing to the maximum entropy framework. We demonstrate that the proposed approach requires very few labeled examples for high classification accuracy.
5 0.52588367 7 nips-2000-A New Approximate Maximal Margin Classification Algorithm
Author: Claudio Gentile
Abstract: A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data. Our algorithm, called ALMAp (Approximate Large Margin algorithm w.r.t. norm p), takes O((p−1)/(α²γ²)) corrections to separate the data with p-norm margin larger than (1 − α)γ, where γ is the p-norm margin of the data and X is a bound on the p-norm of the instances. ALMAp avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms, such as Rosenblatt's perceptron. We report on some experiments comparing ALMAp to two incremental algorithms: Perceptron and Li and Long's ROMMA. Our algorithm seems to perform quite better than both. The accuracy levels achieved by ALMAp are slightly inferior to those obtained by Support vector Machines (SVMs). On the other hand, ALMAp is quite faster and easier to implement than standard SVMs training algorithms.
6 0.52280623 111 nips-2000-Regularized Winnow Methods
7 0.52105892 4 nips-2000-A Linear Programming Approach to Novelty Detection
8 0.51766557 52 nips-2000-Fast Training of Support Vector Classifiers
9 0.51471704 122 nips-2000-Sparse Representation for Gaussian Process Models
10 0.51239204 119 nips-2000-Some New Bounds on the Generalization Error of Combined Classifiers
11 0.50921649 37 nips-2000-Convergence of Large Margin Separable Linear Classification
12 0.50921649 107 nips-2000-Rate-coded Restricted Boltzmann Machines for Face Recognition
13 0.50840628 21 nips-2000-Algorithmic Stability and Generalization Performance
14 0.50799638 133 nips-2000-The Kernel Gibbs Sampler
15 0.50749314 79 nips-2000-Learning Segmentation by Random Walks
16 0.50680125 60 nips-2000-Gaussianization
17 0.50109661 130 nips-2000-Text Classification using String Kernels
18 0.49979639 95 nips-2000-On a Connection between Kernel PCA and Metric Multidimensional Scaling
19 0.49907404 12 nips-2000-A Support Vector Method for Clustering
20 0.49850956 106 nips-2000-Propagation Algorithms for Variational Bayesian Learning