nips nips2001 nips2001-63 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Abstract: A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. [sent-11, score-0.855]
2 Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. [sent-12, score-0.539]
3 The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. [sent-13, score-0.601]
4 Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). [sent-14, score-0.327]
5 1 Introduction Support Vector Machine (SVM) [1] is one of the latest and most successful statistical pattern classifiers that utilize a kernel technique [2, 3]. [sent-15, score-0.24]
6 Despite the successful applications of SVMs to pattern-recognition problems such as character recognition and text classification, SVMs have rarely been applied to speech recognition. [sent-17, score-0.825]
7 This is because the SVM assumes that each sample is a vector of fixed dimension, and hence it cannot deal with variable-length sequences directly. [sent-18, score-0.207]
8 Because of this, most of the efforts made so far to apply SVMs to speech recognition employ linear time normalization, where input feature-vector sequences with different lengths are aligned to the same length [4]. [sent-19, score-0.732]
9 A variant of this approach is a hybrid of SVM and HMM (hidden Markov model), in which HMM works as a pre-processor to feed time-aligned fixed-dimensional vectors to SVM [5]. [sent-20, score-0.035]
10 Another approach is to utilize probabilistic generative models as an SVM kernel function. [sent-21, score-0.267]
11 This includes the Fisher kernels [6, 7], and conditional symmetric independence (CSI) kernels [8], both of which employ HMMs as the generative models. [sent-22, score-0.191]
12 Since HMMs can handle sequential patterns, an SVM that employs HMM-based generative models can handle sequential patterns as well. [sent-23, score-0.204]
13 In contrast to those approaches, our approach is a direct extension of the original SVM to the case of variable-length sequences. [sent-24, score-0.135]
14 The idea is to incorporate the operation of dynamic time alignment into the kernel function itself. [sent-25, score-0.459]
15 Because of this, the proposed new SVM is called the “Dynamic Time-Alignment Kernel SVM (DTAK-SVM)”. [sent-26, score-0.045]
16 Unlike the SVM with the Fisher kernel, which requires two training stages with different training criteria (one for training the generative models and a second for training the SVM), the DTAK-SVM uses a single training criterion, just like the original SVM. [sent-27, score-0.678]
17 2 Dynamic Time-Alignment Kernel We consider a sequence of vectors X = (x_1, x_2, ..., x_L), where x_i ∈ R^n and L is the length of the sequence; the notation |X| is sometimes used to denote the length of the sequence instead. [sent-28, score-0.323]
18 For simplicity, we first assume the so-called linear SVM, which does not employ a non-linear mapping function φ. [sent-29, score-0.089]
19 In that case, the kernel operation in (1) is identical to the inner-product operation. [sent-30, score-0.531]
20 2.1 Formulation for linear kernel Assume that we have two vector sequences X and V. [sent-32, score-0.323]
21 If |X| = |V| = L, then the inner product between X and V can be obtained easily as the summation of the frame-wise inner products between x_k and v_k for k = 1, ..., L: X · V = Σ_{k=1}^{L} x_k · v_k, (2) and therefore an SVM classifier can be defined as given in (1). [sent-35, score-0.586]
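For concreteness, a minimal numeric sketch of (2) in Python (the frame vectors below are hypothetical and not taken from the paper):

import numpy as np

# Two hypothetical sequences of 2-dimensional frame vectors, both of length L = 3.
X = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
V = np.array([[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]])

# Eq. (2): sum the frame-wise inner products x_k · v_k over k = 1..L.
frame_dots = np.einsum('kd,kd->k', X, V)   # [0.5, 0.5, 1.0]
print(float(frame_dots.sum()))             # 2.0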
22 On the other hand, when the two sequences differ in length, the inner product cannot be calculated directly. [sent-36, score-0.328]
23 Even in such a case, however, an inner-product-like operation can be defined if we align the lengths of the patterns. [sent-37, score-0.394]
24 To that end, let ψ(k) and θ(k) be the time-warping functions of the normalized time frame k for the patterns X and V, respectively, and let “◦” denote the new inner-product operator that replaces the original inner product “·”. [sent-38, score-0.622]
25 Then the new inner product between the two vector sequences X and V can be given by X ◦ V = (1/L) Σ_{k=1}^{L} x_{ψ(k)} · v_{θ(k)}, (3) where L is a normalized length that can be either |X|, |V|, or an arbitrary positive integer. [sent-39, score-0.466]
26 As can be seen from the definition given above, the linear warping function is not suitable for continuous speech recognition, i.e. [sent-43, score-0.477]
27 frame-synchronous processing, because the sequence lengths |X| and |V| would have to be known beforehand. [sent-45, score-0.043]
28 On the other hand, non-linear time warping, or in other words dynamic time warping (DTW) [9], enables frame-synchronous processing. [sent-46, score-0.404]
29 Furthermore, past research on speech recognition has shown that non-linear time normalization yields better recognition performance than linear time normalization. [sent-47, score-0.663]
30 For these reasons, we focus on non-linear time warping based on DTW. [sent-48, score-0.275]
31 In the standard DTW, the normalizing factor M_{ψθ} is given as Σ_{k=1}^{L} m(k), and the weighting coefficients m(k) are chosen so that M_{ψθ} is independent of the warping functions. [sent-50, score-0.337]
32 The above optimization problem can be solved efficiently by dynamic programming. [sent-51, score-0.129]
33 The recursive formula of the dynamic programming employed in the present study is as follows: G(i, j) = max{ G(i−1, j) + Inp(i, j), G(i−1, j−1) + 2·Inp(i, j), G(i, j−1) + Inp(i, j) }, (6) where Inp(i, j) is the standard inner product between the two vectors corresponding to points i and j. [sent-52, score-0.534]
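As a rough sketch (not the authors' implementation), recursion (6) can be computed as below; the symmetric step weights m ∈ {1, 2} follow (6), and the normalizer is assumed here to be the standard symmetric choice M_{ψθ} = |X| + |V|, which is independent of the warping path as required in the text. Inp is left as a pluggable frame-level similarity.

import numpy as np

def dtak_dp(X, V, inp):
    # X, V: sequences of frame vectors with shapes (Lx, n) and (Lv, n).
    # inp:  frame-level similarity Inp(x, v); a plain dot product in the
    #       linear case, a frame-level kernel K(x, v) in the non-linear case.
    Lx, Lv = len(X), len(V)
    G = np.full((Lx, Lv), -np.inf)
    G[0, 0] = 2.0 * inp(X[0], V[0])              # start cell, diagonal weight m = 2
    for i in range(Lx):
        for j in range(Lv):
            if i == 0 and j == 0:
                continue
            s = inp(X[i], V[j])
            cands = []
            if i > 0:
                cands.append(G[i - 1, j] + s)            # vertical step, m = 1
            if j > 0:
                cands.append(G[i, j - 1] + s)            # horizontal step, m = 1
            if i > 0 and j > 0:
                cands.append(G[i - 1, j - 1] + 2.0 * s)  # diagonal step, m = 2
            G[i, j] = max(cands)
    return G[-1, -1] / (Lx + Lv)                 # normalize by M = |X| + |V| (assumed)

# Linear case of Section 2.1: Inp is the ordinary inner product,
# e.g. dtak_dp(X, V, dot) with
dot = lambda x, v: float(np.dot(x, v))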
34 2.2 Formulation for non-linear kernel In the last subsection, a linear kernel, i.e. [sent-55, score-0.21]
35 the inner product, for two vector sequences with different lengths was formulated in the framework of dynamic time warping. [sent-57, score-0.466]
36 With a small constraint, a similar formulation is possible for the case where the SVM's non-linear mapping function Φ is applied to the vector sequences. [sent-58, score-0.127]
37 To that end, Φ is restricted to the one having the following form: Φ(X) = (φ(x_1), φ(x_2), ..., φ(x_L)), (8) where φ is a non-linear mapping function that is applied to each frame vector x_i, as given in (1). [sent-59, score-0.122]
38 It should be noted that, under the above restriction, Φ preserves the original length of the sequence at the cost of losing long-term correlations such as the one between x_1 and x_L. [sent-60, score-0.178]
39 As a result, a new class of kernel can be defined by using the extended inner product introduced in the previous section: K_s(X, V) = Φ(X) ◦ Φ(V) (9) = max_{ψ,θ} (1/M_{ψθ}) Σ_{k=1}^{L} m(k) φ(x_{ψ(k)}) · φ(v_{θ(k)}) (10) = max_{ψ,θ} (1/M_{ψθ}) Σ_{k=1}^{L} m(k) K(x_{ψ(k)}, v_{θ(k)}). (11) [sent-61, score-0.597]
40 We call this new kernel the “dynamic time-alignment kernel (DTAK)”. [sent-62, score-0.42]
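Under the same assumptions as the DP sketch above, obtaining K_s in (11) only requires swapping the frame-level inner product for a frame-level kernel K; an RBF kernel with a hypothetical width is used below purely for illustration (the paper's actual kernel settings are those of its Table 1, which is not reproduced in this extraction).

import numpy as np

# Frame-level RBF kernel K(x, v); gamma = 1.0 is a placeholder value.
gamma = 1.0
rbf = lambda x, v: float(np.exp(-gamma * np.sum((x - v) ** 2)))

def dtak_kernel(X, V):
    # K_s(X, V) of (11), reusing dtak_dp from the sketch after recursion (6).
    return dtak_dp(X, V, rbf)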
41 3 Properties of the dynamic time-alignment kernel It has not been proven that the proposed function K_s(·, ·) is an admissible SVM kernel, i.e. one that guarantees the existence of a feature space. [sent-64, score-0.623]
42 This is because the mapping function to a feature space is not fixed but depends on the given vector sequences. [sent-65, score-0.118]
43 Although a class of data-dependent asymmetric kernels for SVMs has been developed in [10], our proposed function is more complicated and difficult to analyze, because the input data are variable-length vector sequences and non-linear time normalization is embedded in the function. [sent-66, score-0.536]
44 On the other hand, K_s(X, X) = max_{ψ,θ} Σ_{k=1}^{L} φ(x_{ψ(k)}) · φ(x_{θ(k)}) = Σ_{k=1}^{L} φ(x_{ψ⁺(k)}) · φ(x_{θ⁺(k)}). (15) [sent-69, score-0.064]
45 Because ψ⁺(k) and θ⁺(k) are here the optimal warping functions that maximize (15), for any warping functions, including ψ*(k), the following inequality holds: K_s(X, X) ≥ Σ_{k=1}^{L} φ(x_{ψ*(k)}) · φ(x_{ψ*(k)}) = Σ_{k=1}^{L} ‖φ(x_{ψ*(k)})‖². [sent-70, score-0.612]
46 As can be seen from these expressions, the SVM discriminant function for time sequences has the same form as that of the original SVM, except for the difference in kernels. [sent-73, score-0.116]
47 It is straightforward to deduce the learning problem, which is given as min_{W,b,ξ_i} (1/2) W ◦ W + C Σ_{i=1}^{N} ξ_i (21) subject to y_i (W ◦ Φ(X^{(i)}) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, ..., N. (22) [sent-74, score-0.089]
48 Again, since the learning problem defined above is almost the same as that of the original SVM, the same training algorithms as for the original SVM can be used to solve it. [sent-75, score-0.194]
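Because (21)-(22) has the same form as the original SVM problem with K_s playing the role of the kernel, any solver that accepts a precomputed Gram matrix can be used. The paper uses SVMTorch [11]; purely as an illustrative substitute (not the authors' setup), the same idea with scikit-learn would look roughly like this, where the sequence lists, labels, and C value are hypothetical placeholders:

import numpy as np
from sklearn.svm import SVC

def gram_matrix(seqs_a, seqs_b):
    # Pairwise DTAK values between two lists of variable-length sequences,
    # using dtak_kernel from the sketch above.
    return np.array([[dtak_kernel(a, b) for b in seqs_b] for a in seqs_a])

# train_seqs, test_seqs: lists of (L_i, n) arrays; train_labels: +/-1 labels.
# K_train = gram_matrix(train_seqs, train_seqs)
# clf = SVC(C=10.0, kernel='precomputed').fit(K_train, train_labels)
# K_test = gram_matrix(test_seqs, train_seqs)   # rows: test items, columns: training items
# predictions = clf.predict(K_test)

Note that, as discussed in Section 3, K_s has not been proven admissible, so the resulting Gram matrix is not guaranteed to be positive semi-definite.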
49 4 Experiments Speech recognition experiments were carried out to evaluate the classification performance of the DTAK-SVM. [sent-76, score-0.247]
50 As our objective is to evaluate the basic performance of the proposed method, a very limited task was chosen: hand-segmented phoneme recognition, in which the positions of the target patterns in the utterance are known. [sent-77, score-0.55]
51 A continuous speech recognition task that does not require phoneme labeling would be our next step. [sent-78, score-0.592]
52 4.1 Experimental conditions The details of the experimental conditions are given in Table 1. [sent-80, score-0.043]
53 [Figure 1 residue: plots of the correct classification rate [%] and the ratio of SVs to training samples [%] for several values of C (e.g. C=10); the axis ticks and remaining curve labels are not recoverable.] [sent-83, score-0.182]
54 In the consonant-recognition task (Experiment-1), only six voiced consonants /b, d, g, m, n, N/ were used, to save time. [sent-88, score-0.036]
55 Classifying those 6 phonemes without using contextual information is considered a relatively difficult task, whereas classifying the 5 vowels /a, i, u, e, o/ (Experiment-2) is considered easier. [sent-89, score-0.148]
56 The proposed DTAK-SVM has been implemented with the publicly available toolkit, SVMTorch [11]. [sent-91, score-0.045]
57 Fig. 1 depicts the experimental results for Experiment-1, where average values over 5 speakers are shown. [sent-94, score-0.134]
58 Table 2: Recognition performance comparison of DTAK-SVM with HMM. [sent-102, score-0.039]
59 Results of Experiment-1 for 1 male and 1 female speaker are shown. [sent-103, score-0.257]
60 (Numbers represent the correct classification rate [%].) [Table 2 residue: rows for HMM (1 mix.) and DTAK-SVM, columns for 50, 100, and 200 training samples per phoneme for the male and the female speaker; the individual entries did not survive extraction.] [sent-104, score-0.037] [sent-108, score-0.24]
62 Next, the classification performance of the DTAK-SVM was compared with that of the state-of-the-art HMM. [sent-138, score-0.039]
63 In order to see the effect of training-set size and model complexity on generalization performance, experiments were carried out by varying the number of training samples (50, 100, 200) and the number of mixtures (1, 4, 8, 16) per HMM state. [sent-139, score-0.295]
64 The HMM used in this experiment was a 3-state, continuous-density, context-independent model with diagonal-covariance Gaussian mixtures. [sent-140, score-0.031]
65 The results of Experiment-1 for 1 male and 1 female speaker are given in Table 2. [sent-144, score-0.257]
66 The experimental results indicate that the DTAK-SVM achieves better classification performance when the number of training samples is 50, and comparable performance when the number of samples is 200. [sent-145, score-0.448]
67 One might argue that the number of training samples used in this experiment is far from enough for the HMM to achieve its best performance. [sent-146, score-0.182]
68 However, such a shortage of training samples often occurs in HMM-based real-world speech recognition, especially when context-dependent models are employed, and it prevents the HMM from improving its generalization performance. [sent-147, score-0.353]
69 5 Conclusions A novel approach to extending the SVM framework to the sequential-pattern classification problem has been proposed by embedding a dynamic time-alignment operation into the kernel. [sent-148, score-0.236]
70 Though long-term correlations between the feature vectors are sacrificed in order to achieve frame-synchronous processing for speech recognition, the proposed DTAK-SVMs demonstrated performance comparable to HMMs in hand-segmented phoneme recognition. [sent-149, score-0.706]
71 The DTAK-SVM is potentially applicable to continuous speech recognition with some extension of the one-pass search algorithm [9]. [sent-150, score-0.41]
72 Picone, “Hybrid SVM/HMM architectures for speech recognition,” in ICSLP2000, 2000. [sent-172, score-0.171]
73 Jaakkola and David Haussler, “Exploiting generative models in discriminative classifiers,” in Advances in Neural Information Processing Systems 11 (M. [sent-174, score-0.057]
74 Niranjan, “Data-dependent Kernels in SVM classification of speech patterns,” in ICSLP-2000, vol. [sent-185, score-0.171]
wordName wordTfidf (topN-words)
[('svm', 0.385), ('ks', 0.38), ('warping', 0.275), ('kernel', 0.21), ('recognition', 0.208), ('classi', 0.191), ('phoneme', 0.177), ('speech', 0.171), ('svs', 0.163), ('inner', 0.151), ('dtw', 0.137), ('hmm', 0.134), ('dynamic', 0.129), ('cation', 0.114), ('samples', 0.108), ('product', 0.108), ('inp', 0.101), ('length', 0.094), ('di', 0.092), ('speakers', 0.091), ('female', 0.09), ('yi', 0.089), ('vowels', 0.082), ('employed', 0.082), ('male', 0.076), ('training', 0.074), ('lengths', 0.073), ('hmms', 0.073), ('erent', 0.071), ('dtak', 0.069), ('htk', 0.069), ('sagayama', 0.069), ('sequences', 0.069), ('speaker', 0.065), ('max', 0.064), ('operation', 0.062), ('inequality', 0.062), ('toolkit', 0.06), ('advanced', 0.058), ('alignment', 0.058), ('generative', 0.057), ('school', 0.057), ('er', 0.055), ('japan', 0.055), ('males', 0.054), ('path', 0.054), ('rn', 0.052), ('rbf', 0.052), ('sequential', 0.051), ('xi', 0.049), ('svmtorch', 0.048), ('patterns', 0.045), ('mapping', 0.045), ('proposed', 0.045), ('kernels', 0.045), ('vector', 0.044), ('employ', 0.044), ('sequence', 0.043), ('technology', 0.043), ('experimental', 0.043), ('accumulated', 0.042), ('original', 0.041), ('performance', 0.039), ('formulation', 0.038), ('comparable', 0.037), ('correct', 0.037), ('normalization', 0.037), ('task', 0.036), ('nds', 0.035), ('hybrid', 0.035), ('xk', 0.034), ('frame', 0.033), ('asymmetric', 0.033), ('lkopf', 0.033), ('normalizing', 0.033), ('support', 0.032), ('discriminant', 0.032), ('sch', 0.032), ('continuous', 0.031), ('embedded', 0.03), ('fisher', 0.03), ('holds', 0.03), ('pattern', 0.03), ('phonemes', 0.03), ('ganapathiraju', 0.03), ('juang', 0.03), ('niranjan', 0.03), ('picone', 0.03), ('females', 0.03), ('sim', 0.03), ('japanese', 0.03), ('voiced', 0.03), ('atr', 0.03), ('hmmbased', 0.03), ('mfccs', 0.03), ('orts', 0.03), ('rhs', 0.03), ('feature', 0.029), ('table', 0.029), ('weighting', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
Author: Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Abstract: A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). 1
2 0.26012498 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
Author: William M. Campbell
Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one-hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
3 0.24527632 172 nips-2001-Speech Recognition using SVMs
Author: N. Smith, Mark Gales
Abstract: An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more useful mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space. The mapping, and appropriate normalisation schemes, are evaluated on a speaker-independent isolated letter task where the new mapping outperforms both the Fisher score and HMMs trained to maximise likelihood. 1
4 0.23627912 46 nips-2001-Categorization by Learning and Combining Object Parts
Author: Bernd Heisele, Thomas Serre, Massimiliano Pontil, Thomas Vetter, Tomaso Poggio
Abstract: We describe an algorithm for automatically learning discriminative components of objects with SVM classifiers. It is based on growing image parts by minimizing theoretical bounds on the error probability of an SVM. Component-based face classifiers are then combined in a second stage to yield a hierarchical SVM classifier. Experimental results in face classification show considerable robustness against rotations in depth and suggest performance at significantly better level than other face detection systems. Novel aspects of our approach are: a) an algorithm to learn component-based classification experts and their combination, b) the use of 3-D morphable models for training, and c) a maximum operation on the output of each component classifier which may be relevant for biological models of visual recognition.
5 0.18793483 38 nips-2001-Asymptotic Universality for Learning Curves of Support Vector Machines
Author: Manfred Opper, Robert Urbanczik
Abstract: Using methods of Statistical Physics, we investigate the role of model complexity in learning with support vector machines (SVMs). We show the advantages of using SVMs with kernels of infinite complexity on noisy target rules, which, in contrast to common theoretical beliefs, are found to achieve optimal generalization error although the training error does not converge to the generalization error. Moreover, we find a universal asymptotics of the learning curves which only depend on the target rule but not on the SVM kernel. 1
6 0.18515263 122 nips-2001-Model Based Population Tracking and Automatic Detection of Distribution Changes
7 0.18010297 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
8 0.16280083 164 nips-2001-Sampling Techniques for Kernel Methods
9 0.15766864 16 nips-2001-A Parallel Mixture of SVMs for Very Large Scale Problems
10 0.15668295 134 nips-2001-On Kernel-Target Alignment
11 0.15230688 29 nips-2001-Adaptive Sparseness Using Jeffreys Prior
12 0.14806961 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
13 0.14729035 58 nips-2001-Covariance Kernels from Bayesian Generative Models
14 0.1446726 15 nips-2001-A New Discriminative Kernel From Probabilistic Models
15 0.14451054 104 nips-2001-Kernel Logistic Regression and the Import Vector Machine
16 0.1424897 170 nips-2001-Spectral Kernel Methods for Clustering
17 0.14144085 60 nips-2001-Discriminative Direction for Kernel Classifiers
18 0.14128305 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
19 0.13298184 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
20 0.13244584 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
topicId topicWeight
[(0, -0.328), (1, 0.235), (2, -0.122), (3, 0.053), (4, -0.169), (5, 0.333), (6, 0.132), (7, -0.125), (8, -0.01), (9, 0.132), (10, -0.009), (11, -0.065), (12, -0.003), (13, 0.144), (14, 0.007), (15, 0.03), (16, 0.01), (17, -0.024), (18, -0.026), (19, -0.044), (20, 0.005), (21, -0.002), (22, 0.023), (23, -0.054), (24, -0.013), (25, -0.001), (26, -0.109), (27, 0.018), (28, 0.127), (29, -0.062), (30, -0.024), (31, -0.106), (32, 0.012), (33, 0.041), (34, 0.024), (35, 0.034), (36, 0.062), (37, -0.041), (38, 0.016), (39, -0.029), (40, -0.086), (41, 0.054), (42, -0.041), (43, -0.022), (44, 0.089), (45, -0.049), (46, -0.032), (47, -0.056), (48, -0.006), (49, 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.97296971 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
Author: Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Abstract: A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). 1
2 0.86167902 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
Author: William M. Campbell
Abstract: A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one-hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
3 0.70469695 104 nips-2001-Kernel Logistic Regression and the Import Vector Machine
Author: Ji Zhu, Trevor Hastie
Abstract: The support vector machine (SVM) is known for its good performance in binary classification, but its extension to multi-class classification is still an on-going research issue. In this paper, we propose a new approach for classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show that the IVM not only performs as well as the SVM in binary classification, but also can naturally be generalized to the multi-class case. Furthermore, the IVM provides an estimate of the underlying probability. Similar to the “support points” of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM. This gives the IVM a computational advantage over the SVM, especially when the size of the training data set is large.
4 0.68970138 172 nips-2001-Speech Recognition using SVMs
Author: N. Smith, Mark Gales
Abstract: An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more useful mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space. The mapping, and appropriate normalisation schemes, are evaluated on a speaker-independent isolated letter task where the new mapping outperforms both the Fisher score and HMMs trained to maximise likelihood. 1
5 0.62113762 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
Author: Olivier Chapelle, Bernhard Schölkopf
Abstract: The choice of an SVM kernel corresponds to the choice of a representation of the data in a feature space and, to improve performance , it should therefore incorporate prior knowledge such as known transformation invariances. We propose a technique which extends earlier work and aims at incorporating invariances in nonlinear kernels. We show on a digit recognition task that the proposed approach is superior to the Virtual Support Vector method, which previously had been the method of choice. 1
6 0.5856483 15 nips-2001-A New Discriminative Kernel From Probabilistic Models
7 0.57946199 46 nips-2001-Categorization by Learning and Combining Object Parts
8 0.54483098 38 nips-2001-Asymptotic Universality for Learning Curves of Support Vector Machines
9 0.53973675 29 nips-2001-Adaptive Sparseness Using Jeffreys Prior
10 0.5397262 60 nips-2001-Discriminative Direction for Kernel Classifiers
11 0.52363312 16 nips-2001-A Parallel Mixture of SVMs for Very Large Scale Problems
12 0.51416743 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
13 0.51271617 173 nips-2001-Speech Recognition with Missing Data using Recurrent Neural Nets
14 0.49431831 164 nips-2001-Sampling Techniques for Kernel Methods
15 0.46150765 74 nips-2001-Face Recognition Using Kernel Methods
16 0.44885254 62 nips-2001-Duality, Geometry, and Support Vector Regression
17 0.44582692 58 nips-2001-Covariance Kernels from Bayesian Generative Models
18 0.44552806 99 nips-2001-Intransitive Likelihood-Ratio Classifiers
19 0.44521406 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
20 0.43064472 105 nips-2001-Kernel Machines and Boolean Functions
topicId topicWeight
[(14, 0.07), (17, 0.027), (19, 0.022), (20, 0.02), (27, 0.115), (30, 0.131), (38, 0.013), (59, 0.059), (66, 0.212), (72, 0.104), (79, 0.052), (83, 0.023), (91, 0.078)]
simIndex simValue paperId paperTitle
same-paper 1 0.85782826 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
Author: Hiroshi Shimodaira, Ken-ichi Noma, Mitsuru Nakai, Shigeki Sagayama
Abstract: A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of non-linear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs). 1
2 0.8177796 128 nips-2001-Multiagent Planning with Factored MDPs
Author: Carlos Guestrin, Daphne Koller, Ronald Parr
Abstract: We present a principled and efficient planning algorithm for cooperative multiagent dynamic systems. A striking feature of our method is that the coordination and communication between the agents is not imposed, but derived directly from the system dynamics and function approximation architecture. We view the entire multiagent system as a single, large Markov decision process (MDP), which we assume can be represented in a factored way using a dynamic Bayesian network (DBN). The action space of the resulting MDP is the joint action space of the entire set of agents. Our approach is based on the use of factored linear value functions as an approximation to the joint value function. This factorization of the value function allows the agents to coordinate their actions at runtime using a natural message passing scheme. We provide a simple and efficient method for computing such an approximate value function by solving a single linear program, whose size is determined by the interaction between the value function structure and the DBN. We thereby avoid the exponential blowup in the state and action space. We show that our approach compares favorably with approaches based on reward sharing. We also show that our algorithm is an efficient alternative to more complicated algorithms even in the single agent case.
3 0.7117694 102 nips-2001-KLD-Sampling: Adaptive Particle Filters
Author: Dieter Fox
Abstract: Over the last years, particle filters have been applied with great success to a variety of state estimation problems. We present a statistical approach to increasing the efficiency of particle filters by adapting the size of sample sets on-the-fly. The key idea of the KLD-sampling method is to bound the approximation error introduced by the sample-based representation of the particle filter. The name KLD-sampling is due to the fact that we measure the approximation error by the Kullback-Leibler distance. Our adaptation approach chooses a small number of samples if the density is focused on a small part of the state space, and it chooses a large number of samples if the state uncertainty is high. Both the implementation and computation overhead of this approach are small. Extensive experiments using mobile robot localization as a test application show that our approach yields drastic improvements over particle filters with fixed sample set sizes and over a previously introduced adaptation technique.
4 0.69975913 29 nips-2001-Adaptive Sparseness Using Jeffreys Prior
Author: Mário Figueiredo
Abstract: In this paper we introduce a new sparseness inducing prior which does not involve any (hyper)parameters that need to be adjusted or estimated. Although other applications are possible, we focus here on supervised learning problems: regression and classification. Experiments with several publicly available benchmark data sets show that the proposed approach yields state-of-the-art performance. In particular, our method outperforms support vector machines and performs competitively with the best alternative techniques, both in terms of error rates and sparseness, although it involves no tuning or adjusting of sparsenesscontrolling hyper-parameters.
5 0.69731265 149 nips-2001-Probabilistic Abstraction Hierarchies
Author: Eran Segal, Daphne Koller, Dirk Ormoneit
Abstract: Many domains are naturally organized in an abstraction hierarchy or taxonomy, where the instances in “nearby” classes in the taxonomy are similar. In this paper, we provide a general probabilistic framework for clustering data into a set of classes organized as a taxonomy, where each class is associated with a probabilistic model from which the data was generated. The clustering algorithm simultaneously optimizes three things: the assignment of data instances to clusters, the models associated with the clusters, and the structure of the abstraction hierarchy. A unique feature of our approach is that it utilizes global optimization algorithms for both of the last two steps, reducing the sensitivity to noise and the propensity to local maxima that are characteristic of algorithms such as hierarchical agglomerative clustering that only take local steps. We provide a theoretical analysis for our algorithm, showing that it converges to a local maximum of the joint likelihood of model and data. We present experimental results on synthetic data, and on real data in the domains of gene expression and text.
6 0.69278699 60 nips-2001-Discriminative Direction for Kernel Classifiers
7 0.68721956 92 nips-2001-Incorporating Invariances in Non-Linear Support Vector Machines
8 0.6866684 46 nips-2001-Categorization by Learning and Combining Object Parts
9 0.68588132 20 nips-2001-A Sequence Kernel and its Application to Speaker Recognition
10 0.68061846 77 nips-2001-Fast and Robust Classification using Asymmetric AdaBoost and a Detector Cascade
11 0.6782645 56 nips-2001-Convolution Kernels for Natural Language
12 0.67748553 27 nips-2001-Activity Driven Adaptive Stochastic Resonance
13 0.67734194 28 nips-2001-Adaptive Nearest Neighbor Classification Using Support Vector Machines
14 0.67655385 185 nips-2001-The Method of Quantum Clustering
15 0.67607898 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
16 0.67544651 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
17 0.67535985 22 nips-2001-A kernel method for multi-labelled classification
18 0.67479366 1 nips-2001-(Not) Bounding the True Error
19 0.67105484 50 nips-2001-Classifying Single Trial EEG: Towards Brain Computer Interfacing
20 0.66868043 172 nips-2001-Speech Recognition using SVMs