nips nips2001 nips2001-183 knowledge-graph by maker-knowledge-mining

183 nips-2001-The Infinite Hidden Markov Model


Source: pdf

Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen

Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. [sent-10, score-0.537]

2 By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. [sent-11, score-0.529]

3 These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. [sent-12, score-0.492]

4 The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. [sent-13, score-0.664]

5 In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text. [sent-14, score-0.167]

6 An HMM defines a probability distribution over sequences of observations (symbols) by invoking another sequence of unobserved, or hidden, discrete state variables. [sent-16, score-0.299]

7 The basic idea in an HMM is that the sequence of hidden states has Markov dynamics: given the current hidden state, past and future states are conditionally independent. [sent-17, score-0.42]

8 The model is defined in terms of two sets of parameters: the transition matrix, whose $(i,j)$ element is the probability of moving from hidden state $i$ to state $j$, and the emission matrix, whose $(i,q)$ element is the probability of emitting symbol $q$ from state $i$. [sent-20, score-0.621]
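
To make the standard finite parameterisation concrete before the infinite extension, here is a minimal Python sketch (an illustration written for this summary, not code from the paper) of sampling a sequence from an ordinary HMM with an explicit transition matrix A and emission matrix E; the infinite model described below replaces these explicit matrices with count-based Dirichlet process mechanisms.

```python
import numpy as np

def sample_finite_hmm(A, E, pi0, T, rng=None):
    """Sample hidden states and observations from a finite HMM.

    A[i, j] : probability of moving from hidden state i to state j
    E[i, k] : probability of emitting symbol k from hidden state i
    pi0[i]  : initial state distribution
    T       : sequence length
    """
    rng = np.random.default_rng() if rng is None else rng
    states = np.zeros(T, dtype=int)
    symbols = np.zeros(T, dtype=int)
    s = rng.choice(len(pi0), p=pi0)
    for t in range(T):
        states[t] = s
        symbols[t] = rng.choice(E.shape[1], p=E[s])   # emit from current state
        s = rng.choice(A.shape[1], p=A[s])            # Markov transition
    return states, symbols

# Example: 3 hidden states, 4 symbols, rows drawn from a Dirichlet prior.
rng = np.random.default_rng(0)
A = rng.dirichlet(np.ones(3), size=3)
E = rng.dirichlet(np.ones(4), size=3)
states, symbols = sample_finite_hmm(A, E, np.ones(3) / 3, T=20, rng=rng)
```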

9 It has been proposed to approximate such Bayesian integration both using variational methods [3] and by conditioning on a single most likely hidden state sequence [8]. [sent-28, score-0.511]

10 In this paper we start from the point of view that the basic modelling assumption of HMMs—that the data was generated by some discrete state variable which can take on one of several values—is unreasonable for most real-world problems. [sent-29, score-0.207]

11 Instead we formulate the idea of HMMs with a countably infinite number of hidden states. [sent-30, score-0.29]

12 In principle, such models have infinitely many parameters in the state transition matrix. [sent-31, score-0.4]

13 Obviously it would not be sensible to optimise these parameters; instead we use the theory of Dirichlet processes (DPs) [2, 1] to implicitly integrate them out, leaving just three hyperparameters defining the prior over transition dynamics. [sent-32, score-0.565]

14 Because of this we have extended the notion of a DP to a two-stage hierarchical process which couples transitions between different states. [sent-35, score-0.204]

15 It should be stressed that Dirichlet distributions have been used extensively both as priors for mixing proportions and to smooth n-gram models over finite alphabets [4], which differs considerably from the model presented here. [sent-36, score-0.142]

16 We explore properties of the HDP prior, showing that it can generate interesting hidden state sequences and that it can also be used as an emission model for an infinite alphabet of symbols. [sent-39, score-0.895]

17 This infinite emission model is controlled by two additional hyperparameters. [sent-40, score-0.389]

18 In section 4 we describe the procedures for inference (Gibbs sampling the hidden states), learning (optimising the hyperparameters), and likelihood evaluation (infinite-state particle filtering). [sent-41, score-0.421]

19 2 Properties of the Dirichlet Process. Let us examine in detail the statistics of hidden state transitions from a particular state $i$ to state $j$, with the number of hidden states finite and equal to $K$. [sent-43, score-1.13]

20 The transition probabilities given in the $i$th row of the transition matrix can be interpreted as mixing proportions over the next state. [sent-44, score-0.672]

21 Consider drawing samples from a discrete indicator variable which can take on one of $K$ values with proportions given by these mixing proportions. [sent-46, score-0.139]

22 Let us see what happens to the distribution of these indicators when we integrate out the mixing proportions under a conjugate prior. [sent-49, score-0.25]

23 The conditional probability of one indicator given the setting of all other indicators is given by (4), where the counts are those in (1) with that indicator's contribution removed. [sent-53, score-0.286]

24 A key property of DPs, which is at the very heart of the model in this paper, is the expression obtained from (4) when we take the limit as the number of hidden states tends to infinity. [sent-55, score-0.42]

25 In this limit the probability of each represented value is proportional to its count, while all unrepresented values combined receive probability proportional to $\beta$, where the represented values are those with non-zero counts. [sent-58, score-0.295]
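
The limiting behaviour summarised above is the standard Polya-urn form of a Dirichlet-multinomial model. The following LaTeX block is a reconstruction written for this summary, using generic notation ($c_t$ for an indicator, $n_{-t,j}$ for the count of value $j$ with $c_t$ removed, $N$ samples, $K$ values, concentration $\beta$); it should be read as a sketch of the construction behind equations (1) and (4), not a verbatim quotation of them.

```latex
% Finite-K conditional under a symmetric Dirichlet(\beta/K) prior:
P(c_t = j \mid \mathbf{c}_{-t}) \;=\; \frac{n_{-t,j} + \beta/K}{N - 1 + \beta}

% Taking the limit K -> \infty:
P(c_t = j \mid \mathbf{c}_{-t}) \;=\;
\begin{cases}
  \dfrac{n_{-t,j}}{N - 1 + \beta} & j \text{ represented},\\[1.5ex]
  \dfrac{\beta}{N - 1 + \beta}    & \text{all unrepresented } j \text{ combined}.
\end{cases}
```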

26 3 Hierarchical Dirichlet Process (HDP). We now consider modelling each row of the transition and emission matrices of an HMM as a DP. [sent-65, score-0.767]

27 The first is that we can integrate out the infinite number of transition parameters, and represent the process with a finite number of indicator variables. [sent-67, score-0.37]

28 The second is that under a DP there is a natural tendency to use existing transitions in proportion to their previous usage, which gives rise to typical trajectories. [sent-68, score-0.224]

29 In the next two subsections we describe in detail the HDP model for transitions and emissions for an infinite-state HMM. [sent-71, score-0.212]

30 3.1 Hidden state transition mechanism. Imagine we have generated a hidden state sequence up to and including time $t$, building a table of counts $n_{ij}$ of the transitions that have occurred so far from state $i$ to state $j$. [sent-73, score-1.355]

31 Given that we are in state $i$, we impose on the choice of the next state a DP (5) with parameter $\beta$, whose counts are the entries in the $i$th row of the count matrix. [sent-76, score-0.491]

32 Given that we have defaulted to the oracle DP, the probabilities of transitioning now become those given in equation (7). [sent-82, score-0.286]

33 Under the infinite model, at any time, there are an infinite number of (indistinguishable) unrepresented states available, each of which has infinitesimal mass proportional to $\gamma$. [sent-86, score-0.327]

34 Figure 1: (left) State transition generative mechanism, shown as a decision tree. From state $i$: a self-transition ($j=i$) occurs with probability $\frac{n_{ii}+\alpha}{\sum_j n_{ij}+\beta+\alpha}$, an existing transition to state $j$ with probability $\frac{n_{ij}}{\sum_j n_{ij}+\beta+\alpha}$, and the oracle is consulted with probability $\frac{\beta}{\sum_j n_{ij}+\beta+\alpha}$; the oracle then chooses an existing state $j$ with probability $\frac{n^o_j}{\sum_j n^o_j+\gamma}$ or an entirely new state with probability $\frac{\gamma}{\sum_j n^o_j+\gamma}$. [sent-87, score-1.625]

35 (right a-d) Sampled state trajectories (time along the horizontal axis) from the HDP: we give examples of four modes of behaviour; one hyperparameter setting explores many states with a sparse transition matrix. [sent-88, score-0.672]

36 Other settings give strict left-to-right transition dynamics or switch between a few different states. [sent-90, score-0.232]

37 Under the oracle, with probability proportional to $\gamma$ an entirely new state is transitioned to. [sent-94, score-0.275]

38 This is the only mechanism for visiting new states from the infinitely many available to us. [sent-95, score-0.208]

39 After each transition we increment the transition count $n_{ij}$ and, if we transitioned to the state via the oracle DP just described, then in addition we increment the oracle count $n^o_j$. [sent-96, score-0.507]

40 If we transitioned to a new state then the sizes of the count matrices $n$ and $n^o$ will increase. [sent-97, score-0.48]
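
Putting the transition description together, the following Python sketch simulates the two-level mechanism. It is an illustration written for this summary, under the assumption that the decision probabilities are exactly those recovered in the Figure 1 caption above (self-transition mass $n_{ii}+\alpha$, existing-transition mass $n_{ij}$, oracle mass $\beta$, and within the oracle existing-state mass $n^o_j$ versus new-state mass $\gamma$).

```python
import numpy as np

def sample_next_state(i, n, n_oracle, alpha, beta, gamma, rng):
    """Sample the next hidden state from state i under the two-level DP.

    n        : dict of dicts of transition counts, n[i][j]
    n_oracle : dict of oracle counts, n_oracle[j]
    alpha    : prior mass on self-transitions
    beta     : mass on consulting the oracle
    gamma    : oracle mass on entirely new states
    Counts are updated in place, mirroring the generative description above.
    """
    row = n.setdefault(i, {})
    others = [j for j in row if j != i]
    total = sum(row.values()) + alpha + beta
    # Level 1: self-transition, an existing transition, or consult the oracle.
    options = [i, None] + others                      # None stands for "oracle"
    weights = [row.get(i, 0) + alpha, beta] + [row[j] for j in others]
    j = options[rng.choice(len(options), p=np.array(weights, dtype=float) / total)]
    if j is None:
        # Level 2 (oracle): an already-represented state or a brand new one.
        states = list(n_oracle)
        o_weights = np.array([n_oracle[s] for s in states] + [gamma], dtype=float)
        k = rng.choice(len(states) + 1, p=o_weights / o_weights.sum())
        if k < len(states):
            j = states[k]
        else:
            seen = set(n_oracle) | set(n) | {s for r in n.values() for s in r} | {i}
            j = max(seen) + 1                         # fresh integer label
        n_oracle[j] = n_oracle.get(j, 0) + 1          # oracle count update
    row[j] = row.get(j, 0) + 1                        # transition count update
    return j

# Sample a short trajectory under the prior.
rng = np.random.default_rng(0)
n, n_oracle = {}, {0: 1}
s, trajectory = 0, [0]
for _ in range(50):
    s = sample_next_state(s, n, n_oracle, alpha=2.0, beta=1.0, gamma=2.0, rng=rng)
    trajectory.append(s)
print(trajectory)
```

Varying the hyperparameters in this sketch should qualitatively reproduce the regimes described alongside Figure 1: a larger $\gamma$ keeps creating new states, while a larger $\alpha$ produces long runs of self-transitions.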

41 Self-transitions are special because their probability defines a time scale over which the dynamics of the hidden state evolves. [sent-98, score-0.446]

42 We assign a finite prior mass to self transitions for each state; this is the third hyperparameter in our model. [sent-99, score-0.347]

43 Therefore, when a state is first visited (via the oracle in the HDP), its self-transition count is initialised to $\alpha$. [sent-100, score-0.148]

44 The full hidden state transition mechanism is a two-level DP hierarchy shown in decision tree form in Figure 1. [sent-101, score-0.682]

45 Alongside are shown typical state trajectories under the prior with different hyperparameters. [sent-102, score-0.276]

46 Note that $\gamma$ controls the expected number of represented hidden states, and $\beta$ influences the tendency to explore new transitions, corresponding to the size and density respectively of the resulting transition count matrix. [sent-104, score-0.718]

47 First it serves to couple the transition DPs from different hidden states. [sent-107, score-0.479]

48 Since a newly visited state has no previous transitions to existing states, without an oracle (which necessarily has knowledge of all represented states as it created them) it would transition to itself or yet another new state with probability 1. [sent-108, score-1.174]

49 By consulting the oracle, new states can have finite probability of transitioning to represented states. [sent-109, score-0.287]

50 The second role of the oracle is to allow some states to be more influential (more commonly transitioned to) than others. [sent-110, score-0.485]

51 3.2 Emission mechanism. The emission process is identical to the transition process in every respect except that there is no concept analogous to a self-transition. [sent-112, score-0.712]

52 Therefore we need only introduce two further hyperparameters, analogues of $\beta$ and $\gamma$, for the emission HDP. [sent-113, score-0.572]
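
Since the emission process is described as identical to the transition process minus the self-transition branch, its decision probabilities presumably mirror those in the Figure 1 caption. The LaTeX sketch below is a conjectured analogue written for this summary: the count symbols $m_{iq}$ (emissions of symbol $q$ from state $i$) and $m^o_q$ (oracle emission counts) are suggested by the extracted vocabulary (miq, mqo), and the two extra hyperparameters are written here as $\beta^e$ and $\gamma^e$.

```latex
% Conjectured emission mechanism, by analogy with the transition mechanism:
P(\text{emit } q \mid \text{state } i) \;\propto\;
\begin{cases}
  m_{iq}  & \text{existing emission of symbol } q \text{ from state } i,\\
  \beta^e & \text{consult the emission oracle,}
\end{cases}
\qquad
P_{\text{oracle}}(q) \;\propto\;
\begin{cases}
  m^o_q    & \text{existing symbol } q,\\
  \gamma^e & \text{entirely new symbol,}
\end{cases}
```

with normalisers $\sum_q m_{iq} + \beta^e$ and $\sum_q m^o_q + \gamma^e$ respectively.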

53 Figure 2: (left) State emission generative mechanism. [sent-117, score-0.389]

54 (right) (Exp 1) Evolution of the number of represented states (vertical axis), plotted against iterations of Gibbs sweeps (horizontal axis), during learning of the ascending-descending sequence which requires exactly 10 states to model the data perfectly. [sent-120, score-0.593]

55 Each line represents initialising the hidden state sequence to a random sequence containing a different number of distinct represented states. [sent-121, score-0.63]

56 A separate oracle count records how many times each symbol has been emitted using the emission oracle. [sent-123, score-0.44]

57 The combination of an HDP for both hidden states and emissions may well be able to capture the somewhat super-logarithmic word generation found in Alice. [sent-132, score-0.542]

58 4 Inference, learning and likelihoods. Given a sequence of observations, there are two sets of unknowns in the infinite HMM: the hidden state sequence, and the five hyperparameters defining the transition and emission HDPs. [sent-133, score-1.411]

59 Note that by using HDPs for both states and observations, we have implicitly integrated out the infinitely many transition and emission parameters. [sent-134, score-0.794]

60 Making an analogy with non-parametric models such as Gaussian Processes, we define a learned model as a set of counts and optimised hyperparameters. [sent-135, score-0.297]

61 We first describe an approximate Gibbs sampling procedure for inferring the posterior over the hidden state sequence. [sent-136, score-0.473]

62 - Gibbs sample each hidden state given the hyperparameter settings, count matrices, and observations. [sent-142, score-0.226]

63 - Update the count matrices to reflect the newly sampled states; this may change the number of represented hidden states. [sent-143, score-0.472]

64 4.1 Gibbs sampling the hidden state sequence. Define reduced count matrices as the result of removing from the transition and emission count matrices the counts contributed by the time step being resampled. [sent-150, score-1.276]

65 Define similar reduced quantities related to the transition and emission oracles. [sent-151, score-0.621]

66 An exact Gibbs sweep of the hidden state sequence takes $O(T^2)$ operations, since under the HDP generative process changing one state affects the probability of all subsequent hidden state transitions and emissions. [sent-154, score-1.015]

67 However this computation can be reasonably approximated in $O(T)$, by basing the Gibbs update for each state only on the states of its neighbours and the total counts. [sent-155, score-0.282]
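
To illustrate the flavour of such a neighbour-based update, here is a simplified collapsed Gibbs sweep for a finite-K HMM with symmetric Dirichlet priors on the rows of the transition and emission matrices. It is a stand-in written for this summary, not the paper's infinite-state sampler, and it deliberately ignores the small corrections needed when the same state occurs at adjacent time steps, in the spirit of the approximation described above.

```python
import numpy as np

def gibbs_sweep(states, obs, K, V, alpha=1.0, beta=1.0, rng=None):
    """One approximate collapsed Gibbs sweep over the hidden states.

    states : length-T array of current state assignments in {0, ..., K-1}
    obs    : length-T array of observed symbols in {0, ..., V-1}
    Transition and emission parameters are integrated out under symmetric
    Dirichlet priors; each s_t is resampled from a conditional that depends
    only on its neighbours, the observation y_t and the count matrices.
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(obs)
    trans, emit = np.zeros((K, K)), np.zeros((K, V))
    for t in range(T - 1):
        trans[states[t], states[t + 1]] += 1
    for t in range(T):
        emit[states[t], obs[t]] += 1

    for t in range(T):
        s_old = states[t]
        # Remove the counts contributed by time step t.
        emit[s_old, obs[t]] -= 1
        if t > 0:
            trans[states[t - 1], s_old] -= 1
        if t < T - 1:
            trans[s_old, states[t + 1]] -= 1
        # Unnormalised conditional over the K candidate values of s_t.
        p = (emit[:, obs[t]] + beta) / (emit.sum(axis=1) + V * beta)
        if t > 0:
            prev = states[t - 1]
            p = p * (trans[prev, :] + alpha) / (trans[prev, :].sum() + K * alpha)
        if t < T - 1:
            nxt = states[t + 1]
            p = p * (trans[:, nxt] + alpha) / (trans.sum(axis=1) + K * alpha)
        s_new = rng.choice(K, p=p / p.sum())
        # Restore the counts with the new assignment.
        states[t] = s_new
        emit[s_new, obs[t]] += 1
        if t > 0:
            trans[states[t - 1], s_new] += 1
        if t < T - 1:
            trans[s_new, states[t + 1]] += 1
    return states

# Usage: a few sweeps over random data.
rng = np.random.default_rng(1)
obs = rng.integers(0, 5, size=100)
states = rng.integers(0, 4, size=100)
for _ in range(20):
    states = gibbs_sweep(states, obs, K=4, V=5, rng=rng)
```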

68 4.2 Hyperparameter optimisation. We place vague Gamma priors on the hyperparameters. [sent-158, score-0.183]

69 Here the relevant quantities include the number of represented states that are transitioned to from a given state (including itself) and the number of possible emissions from that state. [sent-162, score-0.759]

70 Similarly, they include the number of times the oracle has been used for the transition and emission processes, calculated from the indicator variables. [sent-163, score-0.884]

71 4.3 Infinite-state particle filter. The likelihood for a particular observable sequence of symbols involves intractable sums over the possible hidden state trajectories. [sent-165, score-0.669]

72 In particular, in the DP, making a given transition makes that same transition more likely later on in the sequence, so we cannot use standard tricks like dynamic programming. [sent-167, score-0.464]

73 Furthermore, the number of distinct states can grow with the sequence length as new states are generated. [sent-168, score-0.53]

74 If the chain starts with $K$ distinct states, at time $t$ there could be $K+t$ possible distinct states, making the total number of possible trajectories over the entire length of the sequence as large as $\prod_{t=1}^{T}(K+t)$. [sent-169, score-0.49]

75 Although the hidden states in an HMM satisfy the Markov condition, integrating out the parameters induces these long-range dependencies. [sent-170, score-0.42]

76 Consider sampling parameters from the posterior distribution of parameter matrices, which will depend on the count matrices. [sent-172, score-0.159]

77 We propose estimating the likelihood of a test sequence given a learned model using particle filtering. [sent-175, score-0.211]

78 The idea is to start with some number of particles distributed on the represented hidden states according to the final state marginal from the training sequence (some of the particles may fall onto new states). [sent-176, score-0.801]

79 Update the transition and emission tables for each particle. [sent-180, score-0.621]

80 Since it is a discrete state space, with much of the probability mass concentrated on the represented states, it is feasible to use a modest number of particles. [sent-194, score-0.258]
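
A generic sequential importance resampling estimator for the likelihood of a discrete-state model has the following shape. This is a standard particle-filter sketch written for this summary, not the paper's exact procedure; it assumes we can evaluate one-step predictive transition and emission probabilities, which in the infinite HMM would come from the learned count matrices and hyperparameters.

```python
import numpy as np

def particle_filter_loglik(obs, init_states, trans_prob, emit_prob, rng=None):
    """Estimate log p(obs) with a simple particle filter over discrete states.

    init_states     : initial particle states, e.g. drawn from the final-state
                      marginal of the training sequence
    trans_prob(s)   : dict {next_state: probability}, one-step predictive model
    emit_prob(s, y) : predictive probability of emitting symbol y from state s
    """
    rng = np.random.default_rng() if rng is None else rng
    particles = list(init_states)
    R = len(particles)
    loglik = 0.0
    for y in obs:
        # Propagate each particle through the predictive transition model.
        moved = []
        for s in particles:
            dist = trans_prob(s)
            keys = list(dist)
            moved.append(keys[rng.choice(len(keys), p=list(dist.values()))])
        particles = moved
        # Weight by the predictive emission probability of the observed symbol.
        w = np.array([emit_prob(s, y) for s in particles], dtype=float) + 1e-300
        loglik += np.log(w.mean())
        # Resample particles in proportion to their weights.
        idx = rng.choice(R, size=R, p=w / w.sum())
        particles = [particles[i] for i in idx]
    return loglik
```

In the infinite-state setting, trans_prob(s) would also carry mass for unrepresented states, which is where particles falling onto new states, as mentioned above, come from.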

81 5 Synthetic experiments. Exp 1: Discovering the number of hidden states. We applied the infinite HMM inference algorithm to an ascending-descending observation sequence consisting of 30 concatenated copies of the basic pattern. [sent-195, score-0.545]

82 The most parsimonious HMM which models this data perfectly has exactly 10 hidden states. [sent-196, score-0.247]

83 The infinite HMM was initialised with a random hidden state sequence containing some number of distinct represented states. [sent-197, score-0.581]

84 In Figure 2 (right) we show how the number of represented states evolves with successive Gibbs sweeps, starting from a variety of initialisations. [sent-198, score-0.231]

85 Exp 2: Expansive. A sequence was generated from a 4-state 8-symbol HMM with the transition and emission probabilities as shown in Figure 3 (top left). [sent-200, score-0.769]

86 Exp 3: Compressive. A sequence was generated from a 4-state 3-symbol HMM with the transition and emission probabilities as shown in Figure 3 (bottom left). [sent-201, score-0.769]

87 In both Exp 2 and Exp 3 the infinite HMM was initialised with a hidden state sequence containing some number of distinct states. [sent-202, score-0.619]

88 Figure 3 shows that, over successive Gibbs sweeps and hyperparameter learning, the count matrices for the infinite HMM converge to resemble the true probability matrices as shown on the far left. [sent-203, score-0.481]

89 The HDP implicitly integrates out the transition and emission parameters of the HMM. [sent-206, score-0.651]

90 An advantage of this is that it is no longer necessary to constrain the HMM to have finitely many states and observation symbols. [sent-207, score-0.173]

91 The prior over hidden state transitions defined by the HDP is capable of producing a wealth of interesting trajectories by varying the three hyperparameters that control it. [sent-208, score-0.867]

92 We have presented the necessary tools for using the infinite HMM, namely a linear-time approximate Gibbs sampler for inference, equations for hyperparameter learning, and a particle filter for likelihood evaluation. [sent-209, score-0.27]

93 Different particle initialisations apply if we do not assume that the test sequence immediately follows the training sequence. [sent-210, score-0.183]

94 True transition and emission probability matrices used for Exp 2. [sent-211, score-0.687]

95 True transition and emission probability matrices used for Exp 3. [sent-215, score-0.687]

96 Figure 3: The far left pair of Hinton diagrams represents the true transition and emission probabilities used to generate the data for experiments 2 and 3 (up to a permutation of the hidden states; lighter boxes correspond to higher values). [sent-219, score-0.893]

97 Similar to the top row, displaying the count matrices after Gibbs sampling. [sent-223, score-0.208]

98 On synthetic data we have shown that the infinite HMM discovers both the appropriate number of states required to model the data and the structure of the emission and transition matrices. [sent-226, score-0.821]

99 It is important to emphasise that although the count matrices found by the infinite HMM resemble point estimates of HMM parameters, they are not conventional parameter estimates. [sent-227, score-0.197]

100 We believe that for many problems the infinite HMM's flexible nature and its ability to automatically determine the required number of hidden states make it superior to the conventional treatment of HMMs with its associated difficult model selection problem. [sent-230, score-0.42]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('emission', 0.389), ('hidden', 0.247), ('pbi', 0.235), ('transition', 0.232), ('hdp', 0.222), ('oracle', 0.205), ('hmm', 0.185), ('hyperparameters', 0.183), ('nite', 0.179), ('dp', 0.175), ('states', 0.173), ('hf', 0.169), ('state', 0.168), ('gibbs', 0.163), ('dirichlet', 0.151), ('transitions', 0.127), ('hyperparameter', 0.125), ('counts', 0.114), ('transitioned', 0.107), ('count', 0.101), ('sequence', 0.096), ('sweeps', 0.093), ('particle', 0.087), ('dps', 0.085), ('emissions', 0.085), ('proportions', 0.081), ('nitely', 0.081), ('trajectories', 0.072), ('hmms', 0.068), ('matrices', 0.066), ('miq', 0.064), ('unrepresented', 0.064), ('nij', 0.063), ('mixing', 0.061), ('distinct', 0.061), ('qc', 0.059), ('particles', 0.059), ('indicator', 0.058), ('represented', 0.058), ('indicators', 0.056), ('transitioning', 0.056), ('exp', 0.055), ('tendency', 0.054), ('integrate', 0.052), ('emitted', 0.051), ('hierarchical', 0.049), ('markov', 0.049), ('initialised', 0.047), ('sa', 0.047), ('symbols', 0.043), ('bbg', 0.043), ('compressive', 0.043), ('countably', 0.043), ('expansive', 0.043), ('hdps', 0.043), ('jo', 0.043), ('linger', 0.043), ('mqo', 0.043), ('occurence', 0.043), ('existing', 0.043), ('row', 0.041), ('symbol', 0.041), ('map', 0.039), ('modelling', 0.039), ('nonparametric', 0.037), ('alongside', 0.037), ('word', 0.037), ('prior', 0.036), ('mechanism', 0.035), ('sequences', 0.035), ('processes', 0.035), ('de', 0.035), ('wealth', 0.034), ('goto', 0.034), ('alice', 0.034), ('mass', 0.032), ('vu', 0.031), ('dynamics', 0.031), ('sampling', 0.03), ('alphabet', 0.03), ('uf', 0.03), ('resemble', 0.03), ('integrates', 0.03), ('sampler', 0.03), ('sweep', 0.03), ('inference', 0.029), ('likelihood', 0.028), ('au', 0.028), ('hh', 0.028), ('posterior', 0.028), ('bayesian', 0.028), ('process', 0.028), ('ne', 0.027), ('self', 0.027), ('leaving', 0.027), ('synthetic', 0.027), ('horizontal', 0.027), ('length', 0.027), ('explore', 0.026), ('probabilities', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 183 nips-2001-The Infinite Hidden Markov Model

Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen

Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.

2 0.19998382 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Author: Andrew D. Brown, Geoffrey E. Hinton

Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's. 1

3 0.18770547 115 nips-2001-Linear-time inference in Hierarchical HMMs

Author: Kevin P. Murphy, Mark A. Paskin

Abstract: The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original infertime, where is ence algorithm is rather complicated, and takes the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference. ¥ ©§ £ ¨¦¥¤¢ © £ ¦¥¤¢

4 0.17367113 95 nips-2001-Infinite Mixtures of Gaussian Process Experts

Author: Carl E. Rasmussen, Zoubin Ghahramani

Abstract: We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets – thus potentially overcoming two of the biggest hurdles with GP models. Simulations show the viability of this approach.

5 0.15708108 43 nips-2001-Bayesian time series classification

Author: Peter Sykacek, Stephen J. Roberts

Abstract: This paper proposes an approach to classification of adjacent segments of a time series as being either of classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. To use one joint model is a Bayesian requirement, which has the advantage to fuse information according to its certainty. The classifier is implemented as hidden Markov model with Gaussian and Multinomial observation distributions defined on a suitably chosen representation of autoregressive models. The Markov dependency is motivated by the assumption that successive classifications will be correlated. Inference is done with Markov chain Monte Carlo (MCMC) techniques. We apply the proposed approach to synthetic data and to classification of EEG that was recorded while the subjects performed different cognitive tasks. All experiments show that using a latent feature space results in a significant improvement in generalization accuracy. Hence we expect that this idea generalizes well to other hierarchical models.

6 0.14145924 123 nips-2001-Modeling Temporal Structure in Classical Conditioning

7 0.12886883 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data

8 0.11896734 163 nips-2001-Risk Sensitive Particle Filters

9 0.11511179 172 nips-2001-Speech Recognition using SVMs

10 0.10163124 35 nips-2001-Analysis of Sparse Bayesian Learning

11 0.10066237 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models

12 0.084706292 3 nips-2001-ACh, Uncertainty, and Cortical Inference

13 0.084241971 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity

14 0.081149928 102 nips-2001-KLD-Sampling: Adaptive Particle Filters

15 0.081046738 61 nips-2001-Distribution of Mutual Information

16 0.075152799 40 nips-2001-Batch Value Function Approximation via Support Vectors

17 0.073777974 67 nips-2001-Efficient Resources Allocation for Markov Decision Processes

18 0.072903253 153 nips-2001-Product Analysis: Learning to Model Observations as Products of Hidden Variables

19 0.071912259 76 nips-2001-Fast Parameter Estimation Using Green's Functions

20 0.069561593 59 nips-2001-Direct value-approximation for factored MDPs


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.216), (1, -0.055), (2, 0.087), (3, -0.115), (4, -0.263), (5, -0.03), (6, 0.183), (7, 0.073), (8, -0.061), (9, -0.004), (10, 0.01), (11, 0.042), (12, 0.001), (13, -0.014), (14, -0.021), (15, -0.164), (16, 0.012), (17, -0.009), (18, 0.255), (19, -0.016), (20, 0.138), (21, 0.065), (22, -0.025), (23, -0.025), (24, 0.013), (25, 0.124), (26, -0.199), (27, -0.055), (28, -0.046), (29, 0.016), (30, 0.016), (31, -0.005), (32, 0.052), (33, 0.098), (34, -0.085), (35, -0.077), (36, -0.07), (37, -0.063), (38, 0.145), (39, 0.111), (40, 0.035), (41, -0.032), (42, 0.073), (43, -0.046), (44, -0.016), (45, -0.014), (46, 0.003), (47, -0.001), (48, -0.03), (49, -0.009)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97343302 183 nips-2001-The Infinite Hidden Markov Model

Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen

Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.

2 0.81058365 115 nips-2001-Linear-time inference in Hierarchical HMMs

Author: Kevin P. Murphy, Mark A. Paskin

Abstract: The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original infertime, where is ence algorithm is rather complicated, and takes the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference. ¥ ©§ £ ¨¦¥¤¢ © £ ¦¥¤¢

3 0.69778639 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Author: Andrew D. Brown, Geoffrey E. Hinton

Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's. 1

4 0.62593251 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data

Author: Jens Kohlmorgen, Steven Lemm

Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1

5 0.5886901 43 nips-2001-Bayesian time series classification

Author: Peter Sykacek, Stephen J. Roberts

Abstract: This paper proposes an approach to classification of adjacent segments of a time series as being either of classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier which is built on top of these features. Such two stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. To use one joint model is a Bayesian requirement, which has the advantage to fuse information according to its certainty. The classifier is implemented as hidden Markov model with Gaussian and Multinomial observation distributions defined on a suitably chosen representation of autoregressive models. The Markov dependency is motivated by the assumption that successive classifications will be correlated. Inference is done with Markov chain Monte Carlo (MCMC) techniques. We apply the proposed approach to synthetic data and to classification of EEG that was recorded while the subjects performed different cognitive tasks. All experiments show that using a latent feature space results in a significant improvement in generalization accuracy. Hence we expect that this idea generalizes well to other hierarchical models.

6 0.58580548 3 nips-2001-ACh, Uncertainty, and Cortical Inference

7 0.58481759 123 nips-2001-Modeling Temporal Structure in Classical Conditioning

8 0.51499957 95 nips-2001-Infinite Mixtures of Gaussian Process Experts

9 0.48270813 172 nips-2001-Speech Recognition using SVMs

10 0.42717171 148 nips-2001-Predictive Representations of State

11 0.3764115 67 nips-2001-Efficient Resources Allocation for Markov Decision Processes

12 0.35916361 163 nips-2001-Risk Sensitive Particle Filters

13 0.34893665 179 nips-2001-Tempo tracking and rhythm quantization by sequential Monte Carlo

14 0.34616986 178 nips-2001-TAP Gibbs Free Energy, Belief Propagation and Sparsity

15 0.34466752 194 nips-2001-Using Vocabulary Knowledge in Bayesian Multinomial Estimation

16 0.34342393 68 nips-2001-Entropy and Inference, Revisited

17 0.34187078 61 nips-2001-Distribution of Mutual Information

18 0.33691692 85 nips-2001-Grammar Transfer in a Second Order Recurrent Neural Network

19 0.31617042 83 nips-2001-Geometrical Singularities in the Neuromanifold of Multilayer Perceptrons

20 0.30526516 55 nips-2001-Convergence of Optimistic and Incremental Q-Learning


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(14, 0.027), (17, 0.017), (19, 0.014), (27, 0.098), (30, 0.082), (38, 0.02), (51, 0.214), (59, 0.054), (72, 0.069), (79, 0.124), (83, 0.04), (91, 0.154)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.8484748 183 nips-2001-The Infinite Hidden Markov Model

Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen

Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.

2 0.72145188 78 nips-2001-Fragment Completion in Humans and Machines

Author: David Jacobs, Bas Rokers, Archisman Rudra, Zili Liu

Abstract: Partial information can trigger a complete memory. At the same time, human memory is not perfect. A cue can contain enough information to specify an item in memory, but fail to trigger that item. In the context of word memory, we present experiments that demonstrate some basic patterns in human memory errors. We use cues that consist of word fragments. We show that short and long cues are completed more accurately than medium length ones and study some of the factors that lead to this behavior. We then present a novel computational model that shows some of the flexibility and patterns of errors that occur in human memory. This model iterates between bottom-up and top-down computations. These are tied together using a Markov model of words that allows memory to be accessed with a simple feature set, and enables a bottom-up process to compute a probability distribution of possible completions of word fragments, in a manner similar to models of visual perceptual completion.

3 0.71697676 115 nips-2001-Linear-time inference in Hierarchical HMMs

Author: Kevin P. Murphy, Mark A. Paskin

Abstract: The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original infertime, where is ence algorithm is rather complicated, and takes the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference. ¥ ©§ £ ¨¦¥¤¢ © £ ¦¥¤¢

4 0.70331073 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Author: Andrew D. Brown, Geoffrey E. Hinton

Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's. 1

5 0.70107329 95 nips-2001-Infinite Mixtures of Gaussian Process Experts

Author: Carl E. Rasmussen, Zoubin Ghahramani

Abstract: We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets – thus potentially overcoming two of the biggest hurdles with GP models. Simulations show the viability of this approach.

6 0.69616741 132 nips-2001-Novel iteration schemes for the Cluster Variation Method

7 0.69364309 3 nips-2001-ACh, Uncertainty, and Cortical Inference

8 0.69353724 123 nips-2001-Modeling Temporal Structure in Classical Conditioning

9 0.69340014 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data

10 0.69328451 169 nips-2001-Small-World Phenomena and the Dynamics of Information

11 0.69258392 161 nips-2001-Reinforcement Learning with Long Short-Term Memory

12 0.69143844 56 nips-2001-Convolution Kernels for Natural Language

13 0.68994379 180 nips-2001-The Concave-Convex Procedure (CCCP)

14 0.68673968 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes

15 0.68664217 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments

16 0.68577534 2 nips-2001-3 state neurons for contextual processing

17 0.68479311 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning

18 0.6832754 194 nips-2001-Using Vocabulary Knowledge in Bayesian Multinomial Estimation

19 0.68276823 27 nips-2001-Activity Driven Adaptive Stochastic Resonance

20 0.68271899 182 nips-2001-The Fidelity of Local Ordinal Encoding