nips nips2001 nips2001-7 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. [sent-7, score-0.261]
2 In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. [sent-8, score-0.255]
3 Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. [sent-9, score-0.114]
4 We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i. [sent-10, score-0.529]
5 the method is able to process incoming data in real-time. [sent-12, score-0.176]
6 The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. [sent-13, score-0.547]
7 The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. [sent-14, score-0.23]
8 1 Introduction Abrupt changes can occur in many different real-world systems like, for example, in speech, in climatological or industrial processes, in financial markets, and also in physiological signals (EEG/MEG). [sent-15, score-0.048]
9 Methods for the analysis of time-varying dynamical systems are therefore an important issue in many application areas. [sent-16, score-0.069]
10 In [12], we introduced the annealed competition of experts method for time series from nonlinear switching dynamics; related approaches were presented, e. [sent-17, score-0.179]
11 First, the segmentation does not depend on the predictability of the system. [sent-22, score-0.292]
12 Instead, we merely estimate the density distribution of the data and track its changes. [sent-23, score-0.164]
13 This is a particular improvement for systems where the data is hard to predict, for example, EEG recordings [7] or financial data. [sent-24, score-0.109]
14 An incoming data stream is processed incrementally while keeping the computational effort limited by a fixed upper bound, i.e. [sent-26, score-0.443]
15 the algorithm is able to perpetually segment and classify data streams with a fixed amount of memory and CPU resources. [sent-34, score-0.333]
16 It is even possible to continuously monitor measured data in real-time, as long as the sampling rate is not too high. [sent-35, score-0.103]
17 Instead, it optimizes the segmentation on-the-fly by means of dynamic programming [1], thereby automatically correcting or fine-tuning previously estimated segmentation bounds. [sent-39, score-0.584]
18 2 The segmentation algorithm We consider the problem of continuously segmenting a data stream on-line and simultaneously labeling the segments. [sent-40, score-0.687]
19 The data stream is assumed to have a sequential or temporal structure as follows: it consists of consecutive blocks of data such that the data points in each block originate from the same underlying distribution. [sent-41, score-0.521]
20 The segmentation task is to be performed in an unsupervised fashion, i.e. [sent-42, score-0.292]
21 Using pdfs as features for segmentation Consider $y_1, y_2, y_3, \dots$, [sent-46, score-0.716]
22 with $y_t \in \mathbb{R}^n$, an incoming data stream to be analyzed. [sent-49, score-0.273]
23 The sequence might have already passed a pre-processing step like filtering or subsampling, as long as this can be done on-the-fly in case of an on-line scenario. [sent-50, score-0.15]
24 The idea behind embedding is that the measured data might be a potentially non-linear projection of the system's state or phase space. [sent-53, score-0.334]
25 In any case, an embedding in a higher-dimensional space might help to resolve structure in the data, a property which is exploited, e. [sent-54, score-0.143]
26 After the embedding step one might perform a sub-sampling of the embedded data in order to reduce the amount of data for real-time processing. [sent-57, score-0.368]
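The mined text does not reproduce the paper's embedding equation, so the following is only a rough illustration: a minimal sketch of a standard time-delay embedding, where the dimension m, the lag tau, and the function name are our assumptions, not the authors' choices.

```python
import numpy as np

def delay_embed(y, m=3, tau=1):
    """Hypothetical time-delay embedding: x_t = (y_t, y_{t-tau}, ..., y_{t-(m-1)tau}).

    y is a (T,) or (T, n) array of measurements; the result has one row per
    embedded vector.  This is an assumed, generic embedding -- the paper's own
    embedding step is not detailed in the text above.
    """
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    start = (m - 1) * tau
    rows = [np.concatenate([y[t - k * tau] for k in range(m)])
            for t in range(start, len(y))]
    return np.stack(rows)
```

The sub-sampling mentioned above would then simply keep every k-th row of the embedded array.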
27 Next, we want to track the density distribution of the embedded data and therefore estimate the probability density function (pdf) in a sliding window of length W. [sent-58, score-0.402]
28 We use a standard density estimator with multivariate Gaussian kernels [4] for this purpose, centered on the data points in the window $\{x_{t-w}\}_{w=0}^{W-1}$: $p_t(x) = \frac{1}{W} \sum_{w=0}^{W-1} \frac{1}{(2\pi\sigma^2)^{d/2}} \exp\!\left(-\frac{(x - x_{t-w})^2}{2\sigma^2}\right)$. (2) [sent-59, score-0.165]
29 The kernel width $\sigma$ is a smoothing parameter and its value is important to obtain a good representation of the underlying distribution. [sent-60, score-0.044]
30 We propose to choose $\sigma$ proportional to the mean distance of each $x_t$ to its first d nearest neighbors, averaged over a sample set $\{x_t\}$. [sent-61, score-0.031]
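As a concrete reading of eq. (2) and of the bandwidth heuristic above, here is a minimal sketch; the function names are ours and the proportionality constant for sigma (1.0) is an assumption.

```python
import numpy as np

def window_pdf(X_window, sigma):
    """Gaussian kernel density estimate p_t(x) of eq. (2) over the W
    embedded points in X_window (shape (W, d))."""
    W, d = X_window.shape
    norm = (2.0 * np.pi * sigma ** 2) ** (d / 2.0)

    def pdf(x):
        sq = np.sum((X_window - np.asarray(x)) ** 2, axis=1)  # squared distances to kernels
        return float(np.mean(np.exp(-sq / (2.0 * sigma ** 2)) / norm))

    return pdf

def choose_sigma(X_sample, d_neighbors):
    """Bandwidth heuristic from the text: sigma proportional (factor 1.0
    assumed) to the mean distance of each point to its first d nearest
    neighbours, averaged over the sample set."""
    diffs = X_sample[:, None, :] - X_sample[None, :, :]
    D = np.sqrt((diffs ** 2).sum(-1))
    np.fill_diagonal(D, np.inf)                  # exclude the point itself
    nearest = np.sort(D, axis=1)[:, :d_neighbors]
    return float(nearest.mean())
```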
31 1 In our reported application we can process data at 1000 Hz (450 Hz including display) on a 1. [sent-62, score-0.061]
32 2In that case, our further notation of time indices would refer to the subsampled data. [sent-64, score-0.072]
33 2 Similarity of two pdfs Once we have sampled enough data points to compute the first pdf according to eq. [sent-67, score-0.931]
34 (2), we can compute a new pdf with each new incoming data point. [sent-68, score-0.547]
35 3 The HMM in the off-line case Before we can discuss the on-line variant, it is necessary to introduce the HMM and the respective off-line algorithm first. [sent-70, score-0.045]
36 For a given data sequence, $\{x_t\}_{t=1}^{T}$, we can obtain the corresponding sequence of pdfs $\{p_t(x)\}_{t \in S}$, $S = \{W, \dots, T\}$. [sent-71, score-0.588]
37 We now construct a hidden Markov model (HMM) where each of these pdfs is represented by a state $s \in S$, with S being the set of states in the HMM. [sent-76, score-0.654]
38 For each state s, we define a continuous observation probability distribution, $p(p_t(x) \mid s) = \frac{1}{\sqrt{2\pi}\,\varsigma} \exp\!\left(-\frac{d(p_s(x), p_t(x))}{2\varsigma^2}\right)$, (4) for observing a pdf $p_t(x)$ in state s. [sent-77, score-0.55]
39 Next, the initial state distribution $\{\pi_s\}_{s \in S}$ of the HMM is given by the uniform distribution, $\pi_s = 1/N$, with $N = |S|$ being the number of states. [sent-78, score-0.13]
40 The HMM transition matrix, $A = (p_{ij})_{i,j \in S}$, determines each probability to switch from a state $s_i$ to a state $s_j$. [sent-80, score-0.323]
41 Our aim is to find a representation of the given sequence of pdfs in terms of a sequence built from a small number of representative pdfs, which we call prototypes, and which moreover exhibits only a small number of prototype changes. [sent-81, score-0.875]
42 We therefore define A in such a way that transitions to the same state are k times more likely than transitions to any of the other states: $p_{ij} = \frac{k}{k+N-1}$ if $i = j$, and $p_{ij} = \frac{1}{k+N-1}$ if $i \neq j$. (5) [sent-82, score-0.194]
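Putting eqs. (4) and (5) together, a small sketch of how the HMM parameters could be assembled from a matrix of pdf distances; the helper name and the log-domain representation are our choices, not the authors'.

```python
import numpy as np

def build_hmm(dist, k, varsigma):
    """Assemble HMM parameters from dist[s, t] = d(p_s(x), p_t(x)).

    Returns (log_pi, log_B, log_A): the uniform initial distribution,
    the observation log-densities of eq. (4), and the transition matrix
    of eq. (5) in which self-transitions are k times more likely.
    """
    N = dist.shape[0]
    log_pi = np.full(N, -np.log(N))                      # pi_s = 1/N
    log_B = (-0.5 * np.log(2.0 * np.pi) - np.log(varsigma)
             - dist / (2.0 * varsigma ** 2))             # eq. (4), log-domain
    A = np.full((N, N), 1.0 / (k + N - 1.0))
    np.fill_diagonal(A, k / (k + N - 1.0))               # eq. (5)
    return log_pi, log_B, np.log(A)
```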
43 The well-known Viterbi algorithm [13] can now be applied to the above HMM in order to compute the optimal, i.e. [sent-85, score-0.18]
44 the most likely, state sequence of prototype pdfs that might have generated the given sequence of pdfs. [sent-87, score-0.993]
45 This state sequence represents the segmentation we are aiming at. [sent-88, score-0.525]
46 We can compute the most likely state sequence more efficiently if we compute it in terms of costs, $c = -\log(p)$, instead of probabilities p, i.e. [sent-89, score-0.395]
47 instead of computing the maximum of the likelihood function L, we compute the minimum of the cost function, $-\log(L)$, which yields the optimal state sequence as well. [sent-91, score-0.414]
48 In addition to that, we can further simplify the computation for the special case of our particular HMM architecture, which finally results in the following algorithm. [sent-93, score-0.033]
49 For each time step, $t = W, \dots, T$, we compute for all $s \in S$ the cost $c_s(t)$ of the optimal state sequence from W to t, subject to the constraint that it ends in state s at time t. [sent-96, score-0.62]
50 We call these constrained optimal sequences c-paths and the unconstrained optimum the $o^*$-path. [sent-97, score-0.054]
51 The iteration can be formulated as follows, with $d_{s,t}$ being a shorthand for $d(p_s(x), p_t(x))$ and $\delta_{\hat{s},s}$ denoting the Kronecker delta function. Initialization, $\forall s \in S$: $c_s(W) := d_{s,W}$. (6) Induction, $\forall s \in S$: $c_s(t) := d_{s,t} + \min_{\hat{s} \in S} \{ c_{\hat{s}}(t-1) + C\,(1 - \delta_{\hat{s},s}) \}$, for $t = W+1, \dots, T$. (7) [sent-98, score-0.079]
52 Termination: $o^* := \min_{s \in S} \{ c_s(T) \}$. (8) [sent-101, score-0.079]
53 The regularization constant C, which is given by $C = 2\varsigma^2 \log(k)$ and thus subsumes our two free HMM parameters, can be interpreted as a transition cost for switching to a new state in the path. [sent-102, score-0.292]
54 The optimal prototype sequence with minimal costs $o^*$ for the complete series of pdfs, which is determined in the last step, is obtained by logging and updating the c-paths for all states s during the iteration and finally choosing the one with minimal costs according to eq. [sent-103, score-0.687]
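A compact sketch of the cost-domain recursion (6)-(8) with backtracking of the prototype sequence; this is our reading of the algorithm, not the authors' code, and it assumes a precomputed, symmetric distance matrix between all pdfs.

```python
import numpy as np

def offline_segmentation(dist, C):
    """Off-line segmentation via eqs. (6)-(8).

    dist[s, t] = d(p_s(x), p_t(x)) for the N pdfs in the sequence
    (states and time steps are both indexed 0..N-1 here); C is the
    transition cost, C = 2*varsigma**2*log(k).  Returns the prototype
    index chosen for every time step.
    """
    N = dist.shape[0]
    c = dist[:, 0].copy()                        # initialization, eq. (6)
    back = np.empty((N, N), dtype=int)           # back[s, t]: best predecessor of s at t
    back[:, 0] = np.arange(N)
    for t in range(1, N):                        # induction, eq. (7)
        stay = c                                 # no switch: c_s(t-1)
        switch_from = int(np.argmin(c))
        switch = c[switch_from] + C              # switch from the cheapest state, plus penalty C
        take_switch = switch < stay
        back[:, t] = np.where(take_switch, switch_from, np.arange(N))
        c = dist[:, t] + np.where(take_switch, switch, stay)
    s = int(np.argmin(c))                        # termination, eq. (8)
    path = [s]
    for t in range(N - 1, 0, -1):                # backtrack the prototype sequence
        s = int(back[s, t])
        path.append(s)
    return path[::-1]
```

The returned path is a piecewise-constant index sequence; its change points are the estimated segmentation bounds.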
55 4 The on-line algorithm In order to turn the above segmentation algorithm into an on-line algorithm, we must restrict the incremental update in eq. [sent-106, score-0.512]
56 (7), such that it only uses pdfs (and therewith states) from past data points. [sent-107, score-0.518]
57 We neglect at this stage that memory and CPU resources are limited. [sent-108, score-0.147]
58 Suppose that we have already processed data up to T - 1. [sent-109, score-0.103]
59 When a new data point $y_T$ arrives at time T, we can generate a new embedded vector $x_T$ (once we have sampled enough initial data points for the embedding), we have a new pdf $p_T(x)$ (once we have sampled enough embedded vectors $x_t$ for the first pdf window), and thus we are given a new HMM state. [sent-110, score-1.021]
60 We can also readily compute the distances between the new pdf and all the previous pdfs, $d_{T,t}$, $t < T$, according to eq. [sent-111, score-0.371]
61 A similarly simple and straightforward update of the costs, the c-paths and the optimal state sequence is only possible, however, if we neglect to consider potential c-paths that would have contained the new pdf as a prototype in previous segments. [sent-113, score-0.883]
62 The on-line update at time T for these restricted paths, which we henceforth denote with a tilde, can be performed as follows: For $T = W$, we have $\tilde{c}_W(W) := \tilde{o}^*(W) := d_{W,W} = 0$. [sent-115, score-0.097]
63 Compute the cost $\tilde{c}_T(T-1)$ for the new state $s = T$ at time $T-1$: For $t = W, \dots, T-1$, compute [sent-117, score-0.257]
64 $\tilde{c}_T(t) := d_{T,t} + \begin{cases} 0 & \text{if } t = W \\ \min\{\tilde{c}_T(t-1);\ \tilde{o}^*(t-1) + C\} & \text{else} \end{cases}$ (9) and update $\tilde{o}^*(t) := \tilde{c}_T(t)$, if $\tilde{c}_T(t) < \tilde{o}^*(t)$. (10) [sent-120, score-0.097]
65 Here we use all previous optimal segmentations $\tilde{o}^*(t)$, so we don't need to keep the complete matrix $(\tilde{c}_s(t))_{s,t \in S}$ and repeatedly compute the minimum. ⁴We developed an algorithm that computes an appropriate value for the hyperparameter C from a sample set $\{x_t\}$. [sent-121, score-0.256]
66 Due to the limited space we will present that algorithm in a forthcoming publication [8]. [sent-122, score-0.077]
67 However, we must store and update the history of optimal segmentations $\tilde{o}^*(t)$. [sent-124, score-0.225]
68 Update from $T-1$ to $T$ and compute $\tilde{c}_s(T)$ for all states $s \in S$ obtained so far, and also get $\tilde{o}^*(T)$: [sent-126, score-0.181]
69 For $s = W, \dots, T$, compute $\tilde{c}_s(T) := d_{s,T} + \min\{\tilde{c}_s(T-1);\ \tilde{o}^*(T-1) + C\}$ (11) and finally get the cost of the optimal path $\tilde{o}^*(T) := \min_{s \in S}\{\tilde{c}_s(T)\}$. (12) [sent-129, score-0.384]
70 As for the off-line case, the above algorithm only shows the update equations for the costs of the $\tilde{c}$- and $\tilde{o}^*$-paths. [sent-130, score-0.231]
71 The associated state sequences must be logged simultaneously during the computation. [sent-131, score-0.232]
72 Note that this can be done by just storing the sequence of switching points for each path. [sent-132, score-0.219]
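For illustration, a sketch of the on-line cost recursions (9)-(12) alone, without memory limiting and without the path logging discussed above; the class and method names are ours, the pdf distance is assumed symmetric, and the order of the eq. (10) updates within the loop is our interpretation.

```python
class OnlineCosts:
    """Incremental update of the restricted costs of eqs. (9)-(12)."""

    def __init__(self, C):
        self.C = C
        self.c = []     # c~_s(T) for every state s kept so far
        self.o = []     # o~*(t) for every processed time step t

    def add_pdf(self, d_new):
        """d_new[t] = d(p_T(x), p_t(x)) for all pdfs t = 0..T seen so far
        (d_new[-1] == 0 for the new pdf itself)."""
        C, T = self.C, len(self.o)
        if T == 0:                                # c~_W(W) := o~*(W) := 0
            self.c.append(0.0)
            self.o.append(0.0)
            return 0.0
        # eqs. (9)-(10): cost of the new state s = T over past times t < T
        c_T = 0.0
        for t in range(T):
            prev = 0.0 if t == 0 else min(c_T, self.o[t - 1] + C)
            c_T = d_new[t] + prev
            if c_T < self.o[t]:
                self.o[t] = c_T
        # eq. (11): advance every state, old and new, from T-1 to T
        o_prev = self.o[T - 1]
        for s in range(T):
            self.c[s] = d_new[s] + min(self.c[s], o_prev + C)
        self.c.append(d_new[T] + min(c_T, o_prev + C))
        self.o.append(min(self.c))                # eq. (12)
        return self.o[-1]
```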
73 So far we have presented the incremental version of the segmentation algorithm. [sent-134, score-0.327]
74 This algorithm still needs an amount of memory and CPU time that is increasing with each new data point. [sent-135, score-0.209]
75 In order to limit both resources to a fixed amount, we must remove old pdfs, i. [sent-136, score-0.279]
76 We propose to do this by discarding all states with time indices smaller than or equal to s each time the path associated with $\tilde{c}_s(T)$ in eq. [sent-139, score-0.329]
77 (11) exhibits a switch back from a more recent state/pdf to the currently considered state s as a result of the min-operation in eq. [sent-140, score-0.252]
78 In the above algorithm this can simply be done by setting $W := s + 1$ in that case, which also allows us to discard the corresponding old $\tilde{c}_{\hat{s}}(T)$- and $\tilde{o}^*(t)$-paths, for all $\hat{s} \le s$ and $t < s$. [sent-142, score-0.233]
79 In addition, the "if t = W" initialization clause in eq. [sent-143, score-0.066]
80 (9) must be ignored after the first such cut and the $\tilde{o}^*(W-1)$-path must therefore still be kept to compute the else-part also for $t = W$ now. [sent-144, score-0.143]
81 Moreover, we do not have $\tilde{c}_T(W-1)$ and we therefore assume $\min\{\tilde{c}_T(W-1);\ \tilde{o}^*(W-1) + C\} = \tilde{o}^*(W-1) + C$ (in eq. [sent-145, score-0.079]
82 The explanation for this is as follows: A switch back in eq. [sent-147, score-0.063]
83 Vice versa, a newly obtained pdf is unlikely to properly represent the previous mode then, which justifies our above assumption about $\tilde{c}_T(W-1)$. [sent-149, score-0.358]
84 The effect of the proposed cut-off strategy is that we discard paths that end in pdfs from old modes but still allow the optimal pdf prototype to be found within the current segment. [sent-150, score-1.178]
85 Cut-off conditions occur shortly after mode changes in the data and cause the removal of HMM states with pdfs from old modes. [sent-151, score-0.801]
86 However, if no mode change takes place in the incoming data sequence, no states will be discarded. [sent-152, score-0.344]
87 We therefore still need to set a fixed upper limit $\kappa$ for the number of candidate paths/pdfs that are simultaneously under consideration if we only have limited resources available. [sent-153, score-0.169]
88 When this limit is reached because no switches are detected, we must successively discard the oldest path/pdf stored, which finally might result in choosing a suboptimal prototype for that segment, however. [sent-154, score-0.491]
89 Ultimately, a continuous discarding even enforces a change of prototypes after $2\kappa$ time steps if no switching is induced by the data until then. [sent-155, score-0.41]
90 The buffer size $\kappa$ should therefore be as large as possible. [sent-156, score-0.066]
91 In any case, the buffer overflow condition can be recorded along with the segmentation, which allows us to identify such artificial switchings. [sent-157, score-0.097]
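A rough sketch of the two resource-limiting rules described above; the list layout and helper name are hypothetical, and the associated cost and $\tilde{o}^*$-path entries would be dropped alongside.

```python
def prune_states(states, kappa, cut_index=None):
    """Keep the candidate list bounded.

    - Cut-off: if the path of some c~_s(T) switched back from a newer pdf
      to state s (cut_index = s), discard all states up to and including s,
      i.e. move W to s + 1.
    - Hard limit: if no switch is detected and more than kappa candidates
      accumulate, drop the oldest ones (possibly forcing a suboptimal
      prototype for the current segment).
    Returns the index of the first state to keep.
    """
    start = 0 if cut_index is None else cut_index + 1
    start = max(start, len(states) - kappa)   # enforce the buffer size kappa
    return max(start, 0)
```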
92 5 The labeling algorithm A labeling algorithm is required to identify segments that represent the same underlying distribution and thus have similar pdf prototypes. [sent-159, score-0.674]
93 The labeling algorithm generates labels for the segments and assigns identical labels to segments that are similar in this respect. [sent-160, score-0.318]
94 To this end, we propose a relatively simple on-line clustering scheme for the prototypes, since we expect the prototypes obtained from the same underlying distribution to be already well-separated from the other prototypes as a result of the segmentation algorithm. [sent-161, score-0.671]
95 We assign a new label to a segment if the distance of its associated prototype to all preceding prototypes exceeds a certain threshold and we assign the existing label of the closest preceding prototype otherwise. [sent-162, score-0.687]
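A minimal sketch of this on-line labeling scheme; the threshold and function name are our assumptions, and dist stands for the same pdf distance d used for the segmentation.

```python
def assign_labels(prototypes, dist, threshold):
    """Give each segment's prototype the label of the closest preceding
    prototype if that distance stays below `threshold`; otherwise open a
    new label.  Processes prototypes in segment order, i.e. on-line."""
    labels = []
    for i, p in enumerate(prototypes):
        if i == 0:
            labels.append(0)
            continue
        d_prev = [dist(p, q) for q in prototypes[:i]]
        j = min(range(i), key=d_prev.__getitem__)   # closest preceding prototype
        labels.append(labels[j] if d_prev[j] <= threshold else max(labels) + 1)
    return labels
```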
wordName wordTfidf (topN-words)
[('pdfs', 0.424), ('segmentation', 0.292), ('pdf', 0.29), ('cs', 0.249), ('hmm', 0.232), ('prototype', 0.186), ('ct', 0.179), ('prototypes', 0.152), ('state', 0.13), ('costs', 0.122), ('old', 0.118), ('switching', 0.116), ('incoming', 0.115), ('xt', 0.109), ('sequence', 0.103), ('pt', 0.101), ('states', 0.1), ('stream', 0.097), ('embedding', 0.096), ('ses', 0.085), ('segment', 0.081), ('compute', 0.081), ('min', 0.079), ('labeling', 0.079), ('lemm', 0.076), ('segmentations', 0.076), ('cpu', 0.076), ('discard', 0.07), ('dynamical', 0.069), ('mode', 0.068), ('embedded', 0.068), ('buffer', 0.066), ('sliding', 0.066), ('update', 0.064), ('switch', 0.063), ('segments', 0.061), ('supposed', 0.061), ('data', 0.061), ('window', 0.06), ('track', 0.059), ('exhibits', 0.059), ('neglect', 0.056), ('resources', 0.056), ('optimal', 0.054), ('viterbi', 0.053), ('incrementally', 0.053), ('fraunhofer', 0.053), ('discarding', 0.048), ('financial', 0.048), ('ps', 0.048), ('might', 0.047), ('cost', 0.046), ('algorithm', 0.045), ('path', 0.045), ('switches', 0.045), ('density', 0.044), ('underlying', 0.044), ('ends', 0.043), ('fixed', 0.043), ('sequential', 0.042), ('processed', 0.042), ('continuously', 0.042), ('sampled', 0.042), ('preceding', 0.041), ('yt', 0.041), ('hz', 0.039), ('indices', 0.039), ('simultaneously', 0.038), ('initialization', 0.036), ('paths', 0.036), ('labels', 0.036), ('amount', 0.035), ('memory', 0.035), ('vs', 0.035), ('incremental', 0.035), ('abrupt', 0.033), ('embed', 0.033), ('ise', 0.033), ('kohlmorgen', 0.033), ('logged', 0.033), ('markets', 0.033), ('originate', 0.033), ('perpetually', 0.033), ('ppt', 0.033), ('segmenting', 0.033), ('therewith', 0.033), ('enough', 0.033), ('time', 0.033), ('transitions', 0.032), ('limited', 0.032), ('berlin', 0.032), ('germany', 0.032), ('must', 0.031), ('identify', 0.031), ('variant', 0.031), ('propose', 0.031), ('limit', 0.031), ('clause', 0.03), ('shortly', 0.03), ('annealed', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1
2 0.19136864 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
Author: Andrew D. Brown, Geoffrey E. Hinton
Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's. 1
3 0.14097348 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
Author: Aaron C. Courville, David S. Touretzky
Abstract: The Temporal Coding Hypothesis of Miller and colleagues [7] suggests that animals integrate related temporal patterns of stimuli into single memory representations. We formalize this concept using quasi-Bayes estimation to update the parameters of a constrained hidden Markov model. This approach allows us to account for some surprising temporal effects in the second order conditioning experiments of Miller et al. [1 , 2, 3], which other models are unable to explain. 1
4 0.13769585 89 nips-2001-Grouping with Bias
Author: Stella X. Yu, Jianbo Shi
Abstract: With the optimization of pattern discrimination as a goal, graph partitioning approaches often lack the capability to integrate prior knowledge to guide grouping. In this paper, we consider priors from unitary generative models, partially labeled data and spatial attention. These priors are modelled as constraints in the solution space. By imposing uniformity condition on the constraints, we restrict the feasible space to one of smooth solutions. A subspace projection method is developed to solve this constrained eigenproblema We demonstrate that simple priors can greatly improve image segmentation results. 1
5 0.12886883 183 nips-2001-The Infinite Hidden Markov Model
Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.
6 0.11219033 115 nips-2001-Linear-time inference in Hierarchical HMMs
7 0.10543779 111 nips-2001-Learning Lateral Interactions for Feature Binding and Sensory Segmentation
8 0.10418443 172 nips-2001-Speech Recognition using SVMs
9 0.098833278 67 nips-2001-Efficient Resources Allocation for Markov Decision Processes
10 0.077280156 43 nips-2001-Bayesian time series classification
11 0.076807939 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
12 0.073823199 63 nips-2001-Dynamic Time-Alignment Kernel in Support Vector Machine
13 0.072040752 156 nips-2001-Rao-Blackwellised Particle Filtering via Data Augmentation
14 0.069241777 122 nips-2001-Model Based Population Tracking and Automatic Detection of Distribution Changes
15 0.068584047 55 nips-2001-Convergence of Optimistic and Incremental Q-Learning
16 0.068497926 105 nips-2001-Kernel Machines and Boolean Functions
17 0.06832286 54 nips-2001-Contextual Modulation of Target Saliency
18 0.068250038 40 nips-2001-Batch Value Function Approximation via Support Vectors
19 0.067996904 4 nips-2001-ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
20 0.065541536 195 nips-2001-Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
topicId topicWeight
[(0, -0.213), (1, -0.042), (2, 0.058), (3, -0.097), (4, -0.135), (5, 0.006), (6, 0.064), (7, 0.033), (8, -0.063), (9, -0.056), (10, 0.01), (11, 0.039), (12, 0.05), (13, 0.229), (14, 0.015), (15, -0.108), (16, -0.012), (17, 0.08), (18, 0.271), (19, -0.032), (20, 0.052), (21, 0.158), (22, -0.017), (23, -0.004), (24, 0.002), (25, -0.017), (26, -0.026), (27, -0.049), (28, -0.141), (29, -0.04), (30, -0.096), (31, -0.012), (32, 0.077), (33, 0.118), (34, 0.063), (35, 0.055), (36, 0.033), (37, 0.159), (38, -0.005), (39, 0.041), (40, -0.001), (41, -0.0), (42, -0.033), (43, 0.025), (44, -0.015), (45, -0.104), (46, -0.02), (47, 0.064), (48, -0.008), (49, 0.035)]
simIndex simValue paperId paperTitle
same-paper 1 0.94455451 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1
2 0.65668178 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
Author: Andrew D. Brown, Geoffrey E. Hinton
Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's. 1
3 0.63313508 89 nips-2001-Grouping with Bias
Author: Stella X. Yu, Jianbo Shi
Abstract: With the optimization of pattern discrimination as a goal, graph partitioning approaches often lack the capability to integrate prior knowledge to guide grouping. In this paper, we consider priors from unitary generative models, partially labeled data and spatial attention. These priors are modelled as constraints in the solution space. By imposing uniformity condition on the constraints, we restrict the feasible space to one of smooth solutions. A subspace projection method is developed to solve this constrained eigenproblema We demonstrate that simple priors can greatly improve image segmentation results. 1
4 0.60296935 183 nips-2001-The Infinite Hidden Markov Model
Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.
5 0.58335006 115 nips-2001-Linear-time inference in Hierarchical HMMs
Author: Kevin P. Murphy, Mark A. Paskin
Abstract: The hierarchical hidden Markov model (HHMM) is a generalization of the hidden Markov model (HMM) that models sequences with structure at many length/time scales [FST98]. Unfortunately, the original infertime, where is ence algorithm is rather complicated, and takes the length of the sequence, making it impractical for many domains. In this paper, we show how HHMMs are a special kind of dynamic Bayesian network (DBN), and thereby derive a much simpler inference algorithm, which only takes time. Furthermore, by drawing the connection between HHMMs and DBNs, we enable the application of many standard approximation techniques to further speed up inference. ¥ ©§ £ ¨¦¥¤¢ © £ ¦¥¤¢
6 0.51247615 111 nips-2001-Learning Lateral Interactions for Feature Binding and Sensory Segmentation
7 0.50573766 172 nips-2001-Speech Recognition using SVMs
8 0.47369424 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
9 0.41261163 177 nips-2001-Switch Packet Arbitration via Queue-Learning
10 0.40443102 67 nips-2001-Efficient Resources Allocation for Markov Decision Processes
11 0.38680804 148 nips-2001-Predictive Representations of State
12 0.35572952 43 nips-2001-Bayesian time series classification
13 0.34675583 55 nips-2001-Convergence of Optimistic and Incremental Q-Learning
14 0.33221772 3 nips-2001-ACh, Uncertainty, and Cortical Inference
15 0.32580811 195 nips-2001-Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
16 0.32245162 156 nips-2001-Rao-Blackwellised Particle Filtering via Data Augmentation
17 0.30787155 93 nips-2001-Incremental A*
18 0.30521774 26 nips-2001-Active Portfolio-Management based on Error Correction Neural Networks
19 0.30158186 180 nips-2001-The Concave-Convex Procedure (CCCP)
20 0.30023423 94 nips-2001-Incremental Learning and Selective Sampling via Parametric Optimization Framework for SVM
topicId topicWeight
[(14, 0.038), (17, 0.062), (19, 0.033), (27, 0.119), (30, 0.071), (36, 0.011), (38, 0.015), (52, 0.186), (59, 0.04), (72, 0.052), (79, 0.082), (83, 0.042), (91, 0.182)]
simIndex simValue paperId paperTitle
same-paper 1 0.89599997 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system. 1
2 0.78547698 13 nips-2001-A Natural Policy Gradient
Author: Sham M. Kakade
Abstract: We provide a natural gradient method that represents the steepest descent direction based on the underlying structure of the parameter space. Although gradient methods cannot make large changes in the values of the parameters, we show that the natural gradient is moving toward choosing a greedy optimal action rather than just a better action. These greedy optimal actions are those that would be chosen under one improvement step of policy iteration with approximate, compatible value functions, as defined by Sutton et al. [9]. We then show drastic performance improvements in simple MDPs and in the more challenging MDP of Tetris. 1
3 0.78194916 183 nips-2001-The Infinite Hidden Markov Model
Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.
4 0.77977812 89 nips-2001-Grouping with Bias
Author: Stella X. Yu, Jianbo Shi
Abstract: With the optimization of pattern discrimination as a goal, graph partitioning approaches often lack the capability to integrate prior knowledge to guide grouping. In this paper, we consider priors from unitary generative models, partially labeled data and spatial attention. These priors are modelled as constraints in the solution space. By imposing uniformity condition on the constraints, we restrict the feasible space to one of smooth solutions. A subspace projection method is developed to solve this constrained eigenproblema We demonstrate that simple priors can greatly improve image segmentation results. 1
5 0.77626824 132 nips-2001-Novel iteration schemes for the Cluster Variation Method
Author: Hilbert J. Kappen, Wim Wiegerinck
Abstract: The Cluster Variation method is a class of approximation methods containing the Bethe and Kikuchi approximations as special cases. We derive two novel iteration schemes for the Cluster Variation Method. One is a fixed point iteration scheme which gives a significant improvement over loopy BP, mean field and TAP methods on directed graphical models. The other is a gradient based method, that is guaranteed to converge and is shown to give useful results on random graphs with mild frustration. We conclude that the methods are of significant practical value for large inference problems. 1
6 0.77305603 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
7 0.77144635 135 nips-2001-On Spectral Clustering: Analysis and an algorithm
8 0.77035594 169 nips-2001-Small-World Phenomena and the Dynamics of Information
9 0.76924753 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
10 0.768713 55 nips-2001-Convergence of Optimistic and Incremental Q-Learning
11 0.76826048 95 nips-2001-Infinite Mixtures of Gaussian Process Experts
12 0.76770401 121 nips-2001-Model-Free Least-Squares Policy Iteration
14 0.76540458 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments
15 0.76535571 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
16 0.76458156 36 nips-2001-Approximate Dynamic Programming via Linear Programming
17 0.76221019 59 nips-2001-Direct value-approximation for factored MDPs
18 0.76170588 3 nips-2001-ACh, Uncertainty, and Cortical Inference
19 0.76015472 56 nips-2001-Convolution Kernels for Natural Language
20 0.75981289 66 nips-2001-Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms