nips nips2001 nips2001-173 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: S. Parveen, P. Green
Abstract: In the ‘missing data’ approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be ‘missing’. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In the ‘missing data’ approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. [sent-12, score-0.65]
2 In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. [sent-14, score-0.952]
3 In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. [sent-15, score-0.622]
4 We report encouraging results on an isolated digit recognition task. [sent-16, score-0.227]
5 Introduction Automatic Speech Recognition systems perform reasonably well in controlled and matched training and recognition conditions. [sent-18, score-0.15]
6 However, performance deteriorates when there is a mismatch between training and testing conditions, caused for instance by additive noise (Lippmann, 1997). [sent-19, score-0.052]
7 Missing data techniques, which make minimal assumptions about the nature of the noise, provide an alternative solution for speech corrupted by additive noise. [sent-21, score-0.308]
8 They are based on identifying uncorrupted, reliable regions in the frequency domain and adapting recognition algorithms so that classification is based on these regions. [sent-22, score-0.231]
9 Present missing data techniques developed at Sheffield (Barker et al. [sent-23, score-0.649]
10 Neural Networks, unlike HMMs, are discriminative models which do give direct estimates of posterior probabilities and have been used with success in hybrid ANN/HMM speech recognition systems (Bourlard et al. [sent-30, score-0.425]
11 In this paper, we adapt a recurrent neural network architecture introduced by Gingras and Bengio (1998) for robust ASR with missing data. [sent-32, score-0.78]
12 1 Missing data masks Speech recognition with missing data is based on the assumption that some regions in time/frequency remain uncorrupted for speech with added noise. [sent-35, score-1.065]
13 Initial processes, based on local signal-to-noise estimates, on auditory grouping cues, or a combination (Barker et al. [sent-38, score-0.086]
14 , 2001) define a binary ‘missing data mask’: ones in the mask indicate reliable (or ‘present’) features and zeros indicate unreliable (or ‘missing’) features. [sent-39, score-0.221]
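As an illustration of how such a mask might be constructed from local signal-to-noise estimates, here is a minimal sketch (not taken from the paper: the SNR threshold, array shapes, function name and use of numpy are all assumptions):

```python
import numpy as np

def snr_mask(noisy_energy, noise_estimate, threshold_db=0.0):
    """Binary missing-data mask: 1 = reliable ('present'), 0 = unreliable ('missing').

    noisy_energy, noise_estimate: (frames, channels) arrays of spectral energy for the
    noisy speech and for a (hypothetical) running estimate of the noise floor.
    """
    # Simplified local-SNR criterion; a real front end would use a more careful estimate.
    local_snr_db = 10.0 * np.log10(noisy_energy / np.maximum(noise_estimate, 1e-10))
    return (local_snr_db > threshold_db).astype(np.float32)
```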
15 2 Classification with missing data Techniques for classification with incomplete data can be divided into imputation and marginalisation. [sent-41, score-1.01]
16 Imputation is a technique in which missing features are replaced by estimated values to allow the recognition process to proceed in the normal way. [sent-42, score-0.785]
17 If the missing values are replaced by either zeros, random values or their means based on training data, the approach is called unconditional imputation. [sent-43, score-0.649]
18 On the other hand, in conditional imputation, conditional statistics are used to estimate the missing values given the present values. [sent-44, score-0.938]
19 In the marginalisation approach missing values are ignored (by integrating over their possible ranges) and recognition is performed with the reduced data vector which is considered reliable. [sent-45, score-0.798]
20 For the multivariate mixture Gaussian distributions used in CDHMMs, marginalisation and conditional imputation can be formulated analytically (Cooke et al. [sent-46, score-0.505]
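For a single Gaussian component, the two analytic formulations can be sketched as follows (a simplified illustration rather than the CDHMM mixture case described in the cited work; the variable names and the use of numpy/scipy are assumptions):

```python
import numpy as np
from scipy.stats import multivariate_normal

def marginal_likelihood(x, mask, mu, Sigma):
    """Marginalisation: score only the reliable ('present') dimensions of x."""
    p = mask.astype(bool)
    return multivariate_normal.pdf(x[p], mean=mu[p], cov=Sigma[np.ix_(p, p)])

def conditional_impute(x, mask, mu, Sigma):
    """Conditional imputation: replace missing dimensions by E[x_missing | x_present]."""
    p = mask.astype(bool)
    m = ~p
    Spp = Sigma[np.ix_(p, p)]   # covariance of the present dimensions
    Smp = Sigma[np.ix_(m, p)]   # cross-covariance missing/present
    x_hat = x.copy()
    x_hat[m] = mu[m] + Smp @ np.linalg.solve(Spp, x[p] - mu[p])
    return x_hat
```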
21 For missing data ASR, further improvements in both techniques follow from using the knowledge that for spectral energy features the unreliable data is bounded between zero and the energy in the speech+noise mixture (Vizinho et al. [sent-48, score-0.922]
22 These techniques are referred to as bounded marginalisation and bounded imputation. [sent-51, score-0.092]
23 3 Why recurrent neural nets for missing data robust ASR? [sent-55, score-0.755]
24 Several neural net architectures have been proposed to deal with the missing data problem in general (Ahmed & Tresp, 1993; Ghahramani & Jordan, 1994). [sent-56, score-0.588]
25 The problem in using neural networks with missing data is to compute the output of a node/unit when some of its input values are unavailable. [sent-57, score-0.652]
26 For marginalisation, this involves finding a way of integrating over the range of the missing values. [sent-58, score-0.525]
27 A robust ASR system to deal with missing data using neural networks has recently been proposed by (Morris et al. [sent-59, score-0.702]
28 This is basically a radial basis function neural network with the hidden units associated with a diagonal-covariance Gaussian. [sent-61, score-0.12]
29 The marginal over the missing values can be computed in this case and hence the resulting system is equivalent to the HMM-based missing data speech recognition system using marginalisation. [sent-62, score-1.477]
30 Reported performance is also comparable to that of the HMM-based speech recognition system. [sent-63, score-0.365]
31 In this paper missing data is dealt with by imputation. [sent-64, score-0.558]
32 We use recurrent neural networks to estimate missing values in the input vector. [sent-65, score-0.691]
33 RNNs have the potential to capture long-term contextual effects over time, and hence to use temporal context to compensate for missing data, which CDHMM-based missing data techniques do not do. [sent-66, score-1.147]
34 RNNs also allow a single net to perform both imputation and classification, with the potential of combining these processes to mutual benefit. [sent-68, score-0.414]
35 The architecture of Gingras and Bengio (1998) is based on a fully-connected feedforward network with input, hidden and output layers using hyperbolic tangent activation functions. [sent-70, score-0.136]
36 The output layer has one unit for each class and the network is trained with the correct classification as target. [sent-71, score-0.141]
37 Recurrent links are added to the feedforward net with a unit delay from the output to the hidden units, as in Jordan networks (Jordan, 1988). [sent-72, score-0.371]
38 There are also recurrent links with unit delay from hidden units to missing input units to impute missing features. [sent-73, score-1.525]
39 In addition, there are self-delayed terms with a fixed weight for each unit which basically serve to stabilise RNN behaviour over time and help in imputation as well. [sent-74, score-0.469]
40 Gingras and Bengio used this RNN both for a pattern classification task with static data (one input vector for each example) and for sequential data (a sequence of input values for each example). [sent-76, score-0.167]
41 Our aim is to adapt this architecture for robust ASR with missing data. [sent-77, score-0.667]
42 Some preliminary static classification experiments were performed on vowel spectra (individual spectral slices excised from the TIMIT database). [sent-78, score-0.057]
43 RNN performance on this task with missing data was better than standard MLP and Gaussian classifiers. [sent-79, score-0.558]
44 In the next section we show how the net can be adapted for dynamic classification of the spectral sequences constituting words. [sent-80, score-0.087]
45 RNN architecture for robust ASR with missing data Figure 1 illustrates our modified version of the Gingras and Bengio architecture. [sent-82, score-0.677]
46 Instead of taking feedback from the output to the hidden layer we have chosen a fully connected or Elman RNN (Elman, 1990) where there are full recurrent links from the past hidden layer to the present hidden layer (figure 1). [sent-83, score-0.557]
47 We have observed that these links produce faster convergence, in agreement with Pedersen (1997). [sent-84, score-0.067]
48 The number of input units depends on the size of the feature vector. [sent-85, score-0.099]
49 The number of hidden units is determined by experimentation. [sent-88, score-0.12]
50 In our case the classes are taken to be whole words, so in the isolated digit recognition experiments we report, there are eleven output units, for ‘1’ - ‘9’, ‘zero’ and ‘oh’. [sent-90, score-0.268]
51 In training, missing inputs are initialised with their unconditional means. [sent-91, score-0.566]
52 The RNN is then allowed to impute missing values for the next frame through the recurrent links, after a feedforward pass. [sent-92, score-0.771]
53 $X(m,t) = (1-\gamma)\,X(m,t-1) + \sum_{j=1}^{H} v_{jm}\, f(\mathrm{hid}(j,t-1))$, where $X(m,t)$ is the missing feature $m$ at time $t$, $\gamma$ is the learning rate, $v_{jm}$ is the recurrent link from hidden unit $j$ to the missing input, $\mathrm{hid}(j,t-1)$ is the activation of hidden unit $j$ at time $t-1$, and $H$ is the number of hidden units. [sent-93, score-1.606]
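A minimal code sketch of this imputation update (the function and variable names, the default value of gamma, and the choice of tanh as the activation f are assumptions; only the update rule itself comes from the equation above):

```python
import numpy as np

def impute_inputs(x_obs, mask, x_prev, hidden_prev, V, gamma=0.1):
    """Fill the input vector at time t: keep reliable features, impute missing ones.

    x_obs       : (n_features,) observed features at time t
    mask        : (n_features,) 1 = reliable, 0 = missing
    x_prev      : (n_features,) imputed input vector at time t-1
    hidden_prev : (n_hidden,)   hidden-unit values at time t-1
    V           : (n_hidden, n_features) recurrent weights from hidden units to inputs
    """
    # X(m,t) = (1 - gamma) * X(m,t-1) + sum_j v_jm * f(hid(j, t-1)), with f taken as tanh
    proposal = (1.0 - gamma) * x_prev + np.tanh(hidden_prev) @ V
    return mask * x_obs + (1.0 - mask) * proposal
```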
54 The average of the RNN output over all the frames of an example is taken after these frames have gone through a forward pass. [sent-94, score-0.063]
55 The sum squared error between the correct targets and the RNN output for each frame is back-propagated through time and RNN weights are updated until a stopping criterion is reached. [sent-95, score-0.119]
56 The recognition phase consists of a forward pass to produce RNN output for unseen data and imputation of missing features at each time step. [sent-96, score-1.211]
57 The highest value in the averaged output vector is taken as the correct class. [sent-97, score-0.041]
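Putting these steps together, the recognition pass might look like the sketch below (the `rnn.step` interface and all attribute names are hypothetical; the text above specifies only the per-frame forward pass with imputation, output averaging, and taking the highest-scoring class):

```python
import numpy as np

def classify_utterance(frames, masks, rnn, feature_means):
    """Forward pass over one utterance; returns the index of the recognised class.

    `rnn` is a hypothetical object exposing step(x, h) -> (output, new_hidden),
    recurrent weights rnn.V of shape (n_hidden, n_features), rnn.gamma and rnn.n_hidden.
    """
    h = np.zeros(rnn.n_hidden)
    x_prev = feature_means.copy()      # missing inputs are initialised with their means
    outputs = []
    for x_obs, mask in zip(frames, masks):
        # impute missing features from the previous input vector and hidden state
        x = mask * x_obs + (1.0 - mask) * ((1.0 - rnn.gamma) * x_prev + np.tanh(h) @ rnn.V)
        y, h = rnn.step(x, h)          # feedforward pass for this frame
        outputs.append(y)
        x_prev = x
    return int(np.argmax(np.mean(outputs, axis=0)))   # highest averaged output wins
```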
58 [Figure 1 schematic: reliable input features, hidden layer, and output units labelled ‘one’, ‘two’, ‘three’, … ‘nine’, ‘zero’, ‘oh’.] Figure 1: RNN architecture for robust ASR with the missing data technique. [sent-98, score-0.778]
59 Solid arrows show full forward and recurrent connections between two layers. [sent-99, score-0.161]
60 Shaded blocks in the input layer indicate missing inputs which keep changing at every time step. [sent-100, score-0.592]
61 Missing inputs are fully connected (solid arrows) to the hidden layer with a unit delay, in addition to a delayed self-connection (thin arrows) with a fixed weight. [sent-101, score-0.206]
62 Isolated word recognition experiments Continuous pattern classification experiments were performed using data from 30 male speakers in the isolated digits section of the TIDIGIT database (Leonard, 1984). [sent-103, score-0.283]
63 220 examples were chosen from a subset of 10 speakers for training. [sent-107, score-0.035]
64 Features were extracted from Hamming-windowed speech with a window size of 25 msec and 50% overlap. [sent-110, score-0.215]
65 In the initial experiments we report, the missing data masks were formed by deleting spectral energy features at random. [sent-112, score-0.769]
66 This allows comparison with early results with HMM-based missing data recognition (Cooke et al. [sent-113, score-0.768]
67 Recognition performance was evaluated with 0% to 80% missing features, in increments of 10%. [sent-116, score-0.581]
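A sketch of how such random-deletion masks might be generated for the 0% to 80% conditions (the function name, RNG handling and example shapes are assumptions):

```python
import numpy as np

def random_deletion_mask(n_frames, n_features, frac_missing, rng=None):
    """Binary mask with roughly `frac_missing` of the features deleted at random."""
    rng = np.random.default_rng() if rng is None else rng
    return (rng.random((n_frames, n_features)) >= frac_missing).astype(np.float32)

# Example: masks for one utterance (35 frames, 20 features) at each missing-data level.
masks = {frac: random_deletion_mask(35, 20, frac) for frac in np.arange(0.0, 0.81, 0.1)}
```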
68 1 RNN performance as a classifier An RNN with 20 inputs, 65 hidden units and 11 output units was chosen for recognition and imputation with 20 features per time frame. [sent-119, score-0.751]
69 Its performance on various amounts of missing features from 0% to 80%, shown in Figure 2 (the ‘RNN imputation’ curve), is much better than that of a standard Elman RNN trained on clean speech only for the classification task and tested with mean imputation. [sent-120, score-0.882]
70 Use of the self-delayed term in addition to the recurrent links for imputation of missing features contributes positively in the case of sequential data. [sent-121, score-1.196]
71 Results resemble those reported for HMMs in (Cooke et al. [sent-122, score-0.06]
72 We also show that results are superior to ‘last reliable imputation’ in which the imputed value of a feature is the last reliable value for that feature. [sent-124, score-0.25]
73 2 RNN performance on pattern completion Imputation, or pattern completion, performance was observed for an RNN trained with 4 features per frame of speech, and is shown in Figure 3. [sent-127, score-0.426]
74 The RNN for this task had 4 input, 45 hidden and 11 output units. [sent-128, score-0.11]
75 In figure 3(a), solid curves show the true values of the feature in each frequency band at every frame for an example of a spoken ‘9’, the horizontal lines are mean feature values, and the circles are the missing values imputed by the RNN. [sent-129, score-0.796]
76 For this network, classification error for recognition was 10. [sent-131, score-0.15]
77 The bottom curve is the average pattern completion error with missing features imputed by the network. [sent-138, score-0.768]
78 This demonstrates the clear advantage of using the RNN for both imputation and classification. [sent-139, score-0.384]
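One way such an average imputation error could be computed for the comparison between mean imputation and RNN imputation (the exact error measure is not stated in the text, so the metric and names here are assumptions):

```python
import numpy as np

def average_imputation_error(true_feats, imputed_feats, mask):
    """Mean absolute error computed over the missing entries only."""
    missing = (mask == 0)
    return float(np.abs(true_feats - imputed_feats)[missing].mean())

# e.g. err_rnn  = average_imputation_error(X_true, X_rnn_imputed,  mask)
#      err_mean = average_imputation_error(X_true, X_mean_imputed, mask)
```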
79 5 0 5 10 15 20 25 30 35 frame number (a) 0. [sent-144, score-0.043]
80 2 10 20 30 40 50 60 70 80 % missing (b) Figure 3: (a) Missing values for digit 9 imputed by an RNN (b) Average imputation errors for mean imputation and RNN imputation 6. [sent-146, score-1.864]
81 Our next step will be to extend this recognition system to the connected digits recognition task with missing data, following the Aurora standard for robust ASR (Pearce et al. [sent-148, score-0.969]
82 This will provide a direct comparison with HMM-based missing data recognition (Barker et al. [sent-150, score-0.768]
83 In this case we will need to introduce ‘silence’ as an additional recognition class, and the training targets will be obtained by forced-alignment on clean speech with an existing recogniser. [sent-152, score-0.463]
84 We will use realistic missing data masks, rather than random deletions. [sent-153, score-0.558]
85 This is known to be a more demanding condition (Cooke et al. [sent-154, score-0.06]
86 When we are training using clean speech with added noise, another possibility is to use the true values of the corrupted features as training targets for imputation. [sent-156, score-0.42]
87 Use of actual targets for missing values has been reported by Seung (1997), but the RNN architecture in the latter work supports only pattern completion. [sent-157, score-0.648]
88 Some solutions to the missing feature problem in vision. [sent-162, score-0.549]
89 Linking auditory scene analysis and robust ASR by missing data techniques. [sent-177, score-0.668]
90 Soft decisions in missing data techniques for robust automatic speech recognition. [sent-186, score-0.914]
91 Decoding speech in the presence of other sound sources. [sent-195, score-0.215]
92 Hybrid HMM/ANN systems for speech recognition: Overview and new research directions. [sent-199, score-0.215]
93 Robust automatic speech recognition with missing and unreliable acoustic data. [sent-212, score-0.951]
94 Speaker verification in noisy environment with combined spectral subtraction and missing data theory. [sent-225, score-0.641]
95 Supervised learning from incomplete data via an EM approach. [sent-251, score-0.068]
96 State based imputation of missing data for robust speech recognition and speech enhancement. [sent-269, score-1.606]
97 A neural network for classification with incomplete data: application to robust ASR. [sent-299, score-0.119]
98 The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. [sent-305, score-0.395]
99 Reconstruction of damaged spectrographic features for robust speech recognition. [sent-318, score-0.355]
100 Missing data theory, spectral subtraction and signal-to-noise estimation for robust ASR: An integrated study. [sent-331, score-0.2]
wordName wordTfidf (topN-words)
[('missing', 0.525), ('rnn', 0.424), ('imputation', 0.384), ('speech', 0.215), ('cooke', 0.212), ('asr', 0.182), ('recognition', 0.15), ('barker', 0.14), ('imputed', 0.122), ('recurrent', 0.113), ('josifovski', 0.105), ('gingras', 0.087), ('shef', 0.087), ('robust', 0.084), ('classi', 0.075), ('green', 0.072), ('vizinho', 0.07), ('hidden', 0.069), ('links', 0.067), ('clean', 0.063), ('marginalisation', 0.061), ('et', 0.06), ('spectral', 0.057), ('features', 0.056), ('cation', 0.054), ('reliable', 0.052), ('beijing', 0.052), ('bourlard', 0.052), ('masks', 0.052), ('elman', 0.051), ('units', 0.051), ('energy', 0.046), ('rnns', 0.045), ('oh', 0.045), ('icslp', 0.045), ('layer', 0.043), ('frame', 0.043), ('output', 0.041), ('unconditional', 0.041), ('completion', 0.041), ('isolated', 0.041), ('morris', 0.039), ('hz', 0.036), ('digit', 0.036), ('architecture', 0.035), ('targets', 0.035), ('incomplete', 0.035), ('ahmed', 0.035), ('cdhmms', 0.035), ('furui', 0.035), ('hid', 0.035), ('impute', 0.035), ('leonard', 0.035), ('pedersen', 0.035), ('uncorrupted', 0.035), ('speakers', 0.035), ('unreliable', 0.035), ('unit', 0.034), ('data', 0.033), ('eld', 0.033), ('bengio', 0.032), ('jordan', 0.032), ('techniques', 0.031), ('delay', 0.031), ('pearce', 0.03), ('aurora', 0.03), ('raj', 0.03), ('budapest', 0.03), ('cdhmm', 0.03), ('net', 0.03), ('values', 0.029), ('noise', 0.029), ('delayed', 0.029), ('adapting', 0.029), ('hearing', 0.028), ('automatic', 0.026), ('arrows', 0.026), ('auditory', 0.026), ('jm', 0.026), ('eurospeech', 0.026), ('subtraction', 0.026), ('accepted', 0.026), ('lippmann', 0.026), ('feedforward', 0.026), ('replaced', 0.025), ('pattern', 0.024), ('feature', 0.024), ('input', 0.024), ('trained', 0.023), ('adapt', 0.023), ('communication', 0.023), ('zeros', 0.023), ('mismatch', 0.023), ('forward', 0.022), ('self', 0.022), ('mask', 0.022), ('speaker', 0.022), ('added', 0.022), ('uk', 0.022), ('workshop', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 173 nips-2001-Speech Recognition with Missing Data using Recurrent Neural Nets
Author: S. Parveen, P. Green
Abstract: In the ‘missing data’ approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be ‘missing’. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.
2 0.19343275 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
Author: John R. Hershey, Michael Casey
Abstract: It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audio-visual information for the task of speech enhancement. We propose a method to exploit audio-visual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide-band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audio-visual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information. 1
3 0.18340524 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
Author: Brendan J. Frey, Trausti T. Kristjansson, Li Deng, Alex Acero
Abstract: A challenging, unsolved problem in the speech recognition community is recognizing speech signals that are corrupted by loud, highly nonstationary noise. One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it into a clean speech recognizer. In previous work published in Eurospeech, we showed how a probability model trained on clean speech and a separate probability model trained on noise could be combined for the purpose of estimating the noise-free speech from the noisy speech. We showed how an iterative 2nd order vector Taylor series approximation could be used for probabilistic inference in this model. In many circumstances, it is not possible to obtain examples of noise without speech. Noise statistics may change significantly during an utterance, so that speech-free frames are not sufficient for estimating the noise model. In this paper, we show how the noise model can be learned even when the data contains speech. In particular, the noise model can be learned from the test utterance and then used to denoise the test utterance. The approximate inference technique is used as an approximate E step in a generalized EM algorithm that learns the parameters of the noise model from a test utterance. For both Wall Street Journal data with added noise samples and the Aurora benchmark, we show that the new noise adaptive technique performs as well as or significantly better than the non-adaptive algorithm, without the need for a separate training set of noise examples. 1
4 0.11924254 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
Author: Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Abstract: A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). 1
5 0.10456865 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
Author: William M. Campbell
Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one-hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
6 0.092347205 190 nips-2001-Thin Junction Trees
7 0.082024284 52 nips-2001-Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks
8 0.081560463 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
9 0.079603337 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
10 0.077553049 168 nips-2001-Sequential Noise Compensation by Sequential Monte Carlo Method
11 0.071326964 172 nips-2001-Speech Recognition using SVMs
12 0.061673176 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
13 0.059239198 46 nips-2001-Categorization by Learning and Combining Object Parts
14 0.057378564 167 nips-2001-Semi-supervised MarginBoost
15 0.057124507 109 nips-2001-Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions
16 0.056121543 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
17 0.055389628 129 nips-2001-Multiplicative Updates for Classification by Mixture Models
18 0.052700419 16 nips-2001-A Parallel Mixture of SVMs for Very Large Scale Problems
19 0.051152129 43 nips-2001-Bayesian time series classification
20 0.048450559 193 nips-2001-Unsupervised Learning of Human Motion Models
topicId topicWeight
[(0, -0.157), (1, 0.01), (2, -0.065), (3, 0.026), (4, -0.229), (5, 0.079), (6, 0.086), (7, -0.108), (8, -0.001), (9, -0.067), (10, -0.013), (11, 0.049), (12, -0.02), (13, 0.214), (14, -0.082), (15, 0.086), (16, -0.05), (17, -0.059), (18, -0.122), (19, -0.001), (20, -0.037), (21, -0.035), (22, 0.05), (23, 0.125), (24, -0.048), (25, 0.054), (26, 0.151), (27, -0.052), (28, 0.085), (29, 0.061), (30, 0.037), (31, 0.083), (32, -0.073), (33, -0.103), (34, -0.059), (35, 0.087), (36, -0.047), (37, -0.027), (38, -0.073), (39, -0.04), (40, 0.002), (41, -0.008), (42, 0.061), (43, 0.018), (44, 0.041), (45, 0.023), (46, -0.015), (47, 0.008), (48, -0.005), (49, 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.96589547 173 nips-2001-Speech Recognition with Missing Data using Recurrent Neural Nets
Author: S. Parveen, P. Green
Abstract: In the ‘missing data’ approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be ‘missing’. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.
2 0.73783052 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
Author: Brendan J. Frey, Trausti T. Kristjansson, Li Deng, Alex Acero
Abstract: A challenging, unsolved problem in the speech recognition community is recognizing speech signals that are corrupted by loud, highly nonstationary noise. One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it into a clean speech recognizer. In previous work published in Eurospeech, we showed how a probability model trained on clean speech and a separate probability model trained on noise could be combined for the purpose of estimating the noise-free speech from the noisy speech. We showed how an iterative 2nd order vector Taylor series approximation could be used for probabilistic inference in this model. In many circumstances, it is not possible to obtain examples of noise without speech. Noise statistics may change significantly during an utterance, so that speech-free frames are not sufficient for estimating the noise model. In this paper, we show how the noise model can be learned even when the data contains speech. In particular, the noise model can be learned from the test utterance and then used to denoise the test utterance. The approximate inference technique is used as an approximate E step in a generalized EM algorithm that learns the parameters of the noise model from a test utterance. For both Wall Street Journal data with added noise samples and the Aurora benchmark, we show that the new noise adaptive technique performs as well as or significantly better than the non-adaptive algorithm, without the need for a separate training set of noise examples. 1
3 0.73686147 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
Author: John R. Hershey, Michael Casey
Abstract: It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audio-visual information for the task of speech enhancement. We propose a method to exploit audio-visual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide-band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audio-visual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information. 1
4 0.66201937 168 nips-2001-Sequential Noise Compensation by Sequential Monte Carlo Method
Author: K. Yao, S. Nakamura
Abstract: We present a sequential Monte Carlo method applied to additive noise compensation for robust speech recognition in time-varying noise. The method generates a set of samples according to the prior distribution given by clean speech models and noise prior evolved from previous estimation. An explicit model representing noise effects on speech features is used, so that an extended Kalman filter is constructed for each sample, generating the updated continuous state estimate as the estimation of the noise parameter, and prediction likelihood for weighting each sample. Minimum mean square error (MMSE) inference of the time-varying noise parameter is carried out over these samples by fusion the estimation of samples according to their weights. A residual resampling selection step and a Metropolis-Hastings smoothing step are used to improve calculation efficiency. Experiments were conducted on speech recognition in simulated non-stationary noises, where noise power changed artificially, and highly non-stationary Machinegun noise. In all the experiments carried out, we observed that the method can have significant recognition performance improvement, over that achieved by noise compensation with stationary noise assumption. 1
5 0.52099884 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
Author: William M. Campbell
Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one-hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
6 0.424932 52 nips-2001-Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks
7 0.42240632 109 nips-2001-Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions
8 0.40039653 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
9 0.38581672 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
10 0.35294977 91 nips-2001-Improvisation and Learning
11 0.32523191 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
12 0.31235191 172 nips-2001-Speech Recognition using SVMs
13 0.30224809 14 nips-2001-A Neural Oscillator Model of Auditory Selective Attention
14 0.29648638 190 nips-2001-Thin Junction Trees
15 0.2905674 26 nips-2001-Active Portfolio-Management based on Error Correction Neural Networks
16 0.27306485 12 nips-2001-A Model of the Phonological Loop: Generalization and Binding
17 0.26759869 77 nips-2001-Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
18 0.25893012 193 nips-2001-Unsupervised Learning of Human Motion Models
19 0.25835216 16 nips-2001-A Parallel Mixture of SVMs for Very Large Scale Problems
20 0.25426626 76 nips-2001-Fast Parameter Estimation Using Green's Functions
topicId topicWeight
[(13, 0.014), (14, 0.018), (19, 0.015), (20, 0.01), (27, 0.07), (30, 0.577), (38, 0.021), (59, 0.025), (72, 0.043), (79, 0.03), (83, 0.011), (91, 0.066)]
simIndex simValue paperId paperTitle
Author: Takashi Morie, Tomohiro Matsuura, Makoto Nagata, Atsushi Iwata
Abstract: This paper describes a clustering algorithm for vector quantizers using a “stochastic association model”. It offers a new simple and powerful softmax adaptation rule. The adaptation process is the same as the on-line K-means clustering method except for adding random fluctuation in the distortion error evaluation process. Simulation results demonstrate that the new algorithm can achieve efficient adaptation as high as the “neural gas” algorithm, which is reported as one of the most efficient clustering methods. It is a key to add uncorrelated random fluctuation in the similarity evaluation process for each reference vector. For hardware implementation of this process, we propose a nanostructure, whose operation is described by a single-electron circuit. It positively uses fluctuation in quantum mechanical tunneling processes.
same-paper 2 0.96119928 173 nips-2001-Speech Recognition with Missing Data using Recurrent Neural Nets
Author: S. Parveen, P. Green
Abstract: In the ‘missing data’ approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be ‘missing’. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.
3 0.9446795 82 nips-2001-Generating velocity tuning by asymmetric recurrent connections
Author: Xiaohui Xie, Martin A. Giese
Abstract: Asymmetric lateral connections are one possible mechanism that can account for the direction selectivity of cortical neurons. We present a mathematical analysis for a class of these models. Contrasting with earlier theoretical work that has relied on methods from linear systems theory, we study the network’s nonlinear dynamic properties that arise when the threshold nonlinearity of the neurons is taken into account. We show that such networks have stimulus-locked traveling pulse solutions that are appropriate for modeling the responses of direction selective cortical neurons. In addition, our analysis shows that outside a certain regime of stimulus speeds the stability of this solutions breaks down giving rise to another class of solutions that are characterized by specific spatiotemporal periodicity. This predicts that if direction selectivity in the cortex is mainly achieved by asymmetric lateral connections lurching activity waves might be observable in ensembles of direction selective cortical neurons within appropriate regimes of the stimulus speed.
4 0.93044734 159 nips-2001-Reducing multiclass to binary by coupling probability estimates
Author: B. Zadrozny
Abstract: This paper presents a method for obtaining class membership probability estimates for multiclass classification problems by coupling the probability estimates produced by binary classifiers. This is an extension for arbitrary code matrices of a method due to Hastie and Tibshirani for pairwise coupling of probability estimates. Experimental results with Boosted Naive Bayes show that our method produces calibrated class membership probability estimates, while having similar classification accuracy as loss-based decoding, a method for obtaining the most likely class that does not generate probability estimates.
5 0.90397882 163 nips-2001-Risk Sensitive Particle Filters
Author: Sebastian Thrun, John Langford, Vandi Verma
Abstract: We propose a new particle filter that incorporates a model of costs when generating particles. The approach is motivated by the observation that the costs of accidentally not tracking hypotheses might be significant in some areas of state space, and next to irrelevant in others. By incorporating a cost model into particle filtering, states that are more critical to the system performance are more likely to be tracked. Automatic calculation of the cost model is implemented using an MDP value function calculation that estimates the value of tracking a particular state. Experiments in two mobile robot domains illustrate the appropriateness of the approach.
6 0.88888961 151 nips-2001-Probabilistic principles in unsupervised learning of visual structure: human data and a model
7 0.7533744 65 nips-2001-Effective Size of Receptive Fields of Inferior Temporal Visual Cortex Neurons in Natural Scenes
8 0.75072104 149 nips-2001-Probabilistic Abstraction Hierarchies
9 0.73449868 73 nips-2001-Eye movements and the maturation of cortical orientation selectivity
10 0.69916314 102 nips-2001-KLD-Sampling: Adaptive Particle Filters
11 0.65484875 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
12 0.61707878 46 nips-2001-Categorization by Learning and Combining Object Parts
13 0.60987276 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
14 0.60825652 77 nips-2001-Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
15 0.59826183 60 nips-2001-Discriminative Direction for Kernel Classifiers
16 0.59725791 52 nips-2001-Computing Time Lower Bounds for Recurrent Sigmoidal Neural Networks
17 0.59284693 34 nips-2001-Analog Soft-Pattern-Matching Classifier using Floating-Gate MOS Technology
18 0.59216511 176 nips-2001-Stochastic Mixed-Signal VLSI Architecture for High-Dimensional Kernel Machines
19 0.58757699 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
20 0.58675164 116 nips-2001-Linking Motor Learning to Function Approximation: Learning in an Unlearnable Force Field