nips nips2012 nips2012-41 knowledge-graph by maker-knowledge-mining

41 nips-2012-Ancestor Sampling for Particle Gibbs


Source: pdf

Author: Fredrik Lindsten, Thomas Schön, Michael I. Jordan

Abstract: We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. We apply the PG-AS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). [sent-12, score-1.018]

2 Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. [sent-13, score-0.886]

3 Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. [sent-14, score-0.416]

4 We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. [sent-16, score-0.219]

5 In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. [sent-17, score-0.311]

6 Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models. [sent-18, score-0.504]

7 It remains a major challenge to develop inference algorithms for non-Markovian SSMs: $x_{t+1} \sim f(x_{t+1} \mid \theta, x_{1:t})$, $y_t \sim g(y_t \mid \theta, x_{1:t})$, (1) where $\theta \in \Theta$ is a static parameter with prior density $p(\theta)$, and $x_t$ and $y_t$ are the latent state and the observation at time t, respectively. [sent-22, score-0.373]
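
As a rough illustration of this model class, here is a minimal simulation sketch; the samplers `f_sample`/`g_sample` and the concrete long-memory model below are hypothetical stand-ins, not the paper's examples (θ is suppressed).

```python
import numpy as np

def simulate_nonmarkov_ssm(T, f_sample, g_sample, rng):
    """Simulate x_{1:T}, y_{1:T} from (1): both the new state and the
    observation may depend on the entire state history so far."""
    xs, ys = [], []
    for _ in range(T):
        xs.append(f_sample(xs, rng))   # x_t ~ f(. | x_{1:t-1})
        ys.append(g_sample(xs, rng))   # y_t ~ g(. | x_{1:t})
    return np.array(xs), np.array(ys)

# Hypothetical long-memory model: the state mean-reverts toward the
# running average of the whole trajectory generated so far.
rng = np.random.default_rng(0)
f = lambda hist, rng: 0.9 * (np.mean(hist) if hist else 0.0) + 0.5 * rng.normal()
g = lambda hist, rng: hist[-1] + 0.1 * rng.normal()
x, y = simulate_nonmarkov_ssm(100, f, g, rng)
```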

8 To tackle the challenging problem of inference for non-Markovian SSMs, we work within the framework of particle MCMC (PMCMC), a family of inferential methods introduced in [1]. [sent-25, score-0.398]

9 In an idealized Gibbs sampler we would target this density by sampling as follows: (i) Draw $\theta' \sim p(\theta \mid x_{1:T}, y_{1:T})$; (ii) Draw $x'_{1:T} \sim p(x_{1:T} \mid \theta', y_{1:T})$. [sent-31, score-0.335]
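
A sketch of that two-step sweep, assuming both conditionals could be sampled exactly — which is precisely what fails in practice and motivates PMCMC; the two sampler callables are placeholders.

```python
def ideal_gibbs_sweep(theta, x, sample_theta_given_x, sample_x_given_theta, rng):
    """One sweep of the idealized Gibbs sampler: (i) theta' | x_{1:T}, y_{1:T};
    (ii) x'_{1:T} | theta', y_{1:T}.  Both conditionals are assumed exact."""
    theta = sample_theta_given_x(x, rng)   # step (i)
    x = sample_x_given_theta(theta, rng)   # step (ii)
    return theta, x
```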

10 In PMCMC, this is addressed by instead sampling a particle trajectory $x'_{1:T}$ based on an SMC approximation of the joint smoothing density. [sent-35, score-0.555]

11 More precisely, we run an SMC sampler targeting $p(x_{1:T} \mid \theta', y_{1:T})$. [sent-36, score-0.213]

12 We then sample one of the particles at the final time T, according to their importance weights, and trace the ancestral lineage of this particle to obtain the trajectory $x'_{1:T}$. [sent-37, score-0.53]

13 This overall procedure is referred to as particle Gibbs (PG). [sent-38, score-0.339]
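
A minimal sketch of extracting that trajectory, assuming the SMC run stored a (T, N) array of states and the ancestor indices; the array names are illustrative.

```python
import numpy as np

def trace_lineage(particles, ancestors, weights_T, rng):
    """Sample an index k at time T by the final importance weights, then
    follow the ancestor indices backward to recover a full trajectory.
    particles[t, m] is state x_t^m; ancestors[t, m] is a_t^m (row 0 unused);
    weights_T is the normalized weight vector at the final time."""
    T, N = particles.shape
    b = rng.choice(N, p=weights_T)        # k = b_T
    traj = np.empty(T)
    for t in range(T - 1, -1, -1):
        traj[t] = particles[t, b]
        if t > 0:
            b = ancestors[t, b]           # b_{t-1} = a_t^{b_t}
    return traj
```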

14 To exploit this flexibility we must address a drawback of PG in the high-dimensional setting, which is that the mixing of the PG kernel can be very poor when there is path degeneracy in the SMC sampler [2, 3]. [sent-40, score-0.321]

15 This problem has been addressed in the generic setting of SSMs by adding a backward simulation step to the PG sampler, yielding a method denoted PG with backward simulation (PG-BS). [sent-41, score-0.832]

16 Unfortunately, however, the application of backward simulation is problematic for non-Markovian models. [sent-43, score-0.442]

17 The reason is that we need to consider full state trajectories during the backward simulation pass, leading to $O(T^2)$ computational complexity (see Section 4 for details). [sent-44, score-0.471]

18 To address this issue, we develop a novel PMCMC method which we refer to as particle Gibbs with ancestor sampling (PG-AS) that achieves the effect of backward sampling without an explicit backward pass. [sent-45, score-1.436]

19 As part of our development, we also develop a truncation method geared to non-Markovian models. [sent-46, score-0.219]

20 This truncation is generic and also applicable to PG-BS, but, as we show in a simulation study in Section 6, the effect of the truncation error is much less severe for PG-AS than for PG-BS. [sent-47, score-0.311]

21 Assume that $\{x_{1:t-1}^m, w_{t-1}^m\}_{m=1}^N$ is a weighted particle system targeting $\bar\gamma_{t-1}(x_{1:t-1})$. [sent-59, score-0.405]

22 This particle system is propagated to time t by sampling independently from a proposal kernel,
$$M_t(a_t, x_t) = \frac{w_{t-1}^{a_t}\, \nu_{t-1}^{a_t}}{\sum_l w_{t-1}^l\, \nu_{t-1}^l}\, R_t(x_t \mid x_{1:t-1}^{a_t}). \qquad (2)$$
[sent-60, score-0.621]

23 In this formulation, the resampling step is implicit and corresponds to sampling the ancestor indices $a_t$. [sent-61, score-0.376]

24 Note that $a_t^m$ is the index of the ancestor particle of $x_t^m$. [sent-62, score-0.785]

25 When we write $x_{1:t}^m$ we refer to the ancestral path of $x_t^m$. [sent-63, score-0.502]

26 The factors $\nu_t^m = \nu_t(x_{1:t}^m)$, known as adjustment multiplier weights, are used in the auxiliary SMC sampler to increase the probability of sampling ancestors that can better describe the current observation [5]. [sent-64, score-0.316]

27 The particles are then weighted according to $w_t^m = W_t(x_{1:t}^m)$, where the weight function is given by
$$W_t(x_{1:t}) = \frac{\gamma_t(x_{1:t})}{\gamma_{t-1}(x_{1:t-1})\, \nu_{t-1}(x_{1:t-1})\, R_t(x_t \mid x_{1:t-1})} \qquad (3)$$
for $t \ge 2$. [sent-65, score-0.237]

28 The procedure is initiated by sampling from a proposal density $x_1^m \sim R_1(x_1)$ and assigning importance weights $w_1^m = W_1(x_1^m)$ with $W_1(x_1) = \gamma_1(x_1)/R_1(x_1)$. [sent-66, score-0.399]
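
One propagation-and-weighting step of this auxiliary SMC scheme, per (2)-(3), might look as follows; `r_sample`, `weight_fn` and the adjustment multipliers `nu` are model-specific inputs assumed to be supplied by the user.

```python
import numpy as np

def smc_step(paths, w, nu, r_sample, weight_fn, rng):
    """Propagate a weighted particle system one step, cf. (2)-(3).
    paths: (t-1, N) array of ancestral paths; w, nu: (N,) importance
    weights and adjustment multipliers at time t-1."""
    N = w.shape[0]
    probs = w * nu
    probs = probs / probs.sum()                   # mixture weights in (2)
    a = rng.choice(N, size=N, p=probs)            # implicit resampling: a_t^m
    x_new = np.array([r_sample(paths[:, a[m]], rng) for m in range(N)])
    new_paths = np.vstack([paths[:, a], x_new])   # x_{1:t}^m = {x_{1:t-1}^{a_t^m}, x_t^m}
    w_new = np.array([weight_fn(new_paths[:, m]) for m in range(N)])  # W_t in (3)
    return new_paths, w_new, a
```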

29 In PMCMC it is instructive to view this sampling procedure as a way of generating a single sample from the density
$$\psi(x_{1:T}, a_{2:T}) \triangleq \prod_{m=1}^{N} R_1(x_1^m) \prod_{t=2}^{T} \prod_{m=1}^{N} M_t(a_t^m, x_t^m) \qquad (4)$$
on the space $\mathsf{X}^{NT} \times \{1, \dots, N\}^{N(T-1)}$. [sent-67, score-0.355]

30 3 Particle Gibbs with ancestor sampling. PMCMC methods are a class of MCMC samplers in which SMC is used to construct proposal kernels [1]. [sent-75, score-0.464]

31 The validity of these methods can be assessed by viewing them as MCMC samplers on an extended state space in which all the random variables generated by the SMC sampler are seen as auxiliary variables. [sent-76, score-0.325]

32 The extended target is
$$\phi(x_{1:T}, a_{2:T}, k) \triangleq \frac{\bar\gamma_T(x_{1:T}^k)}{N^T}\, \frac{\psi(x_{1:T}, a_{2:T})}{R_1(x_1^{b_1}) \prod_{t=2}^{T} M_t(a_t^{b_t}, x_t^{b_t})}. \qquad (5)$$
By construction, this density admits $\bar\gamma_T(x_{1:T}^k)$ as a marginal, and can thus be used as a surrogate for the original target density $\bar\gamma_T$ [1]. [sent-78, score-0.205]

33 Here k is a variable indexing one of the particles at the final time point and $b_{1:T}$ corresponds to the ancestral path of this particle: $x_{1:T}^k = x_{1:T}^{b_{1:T}} = \{x_1^{b_1}, \dots, x_T^{b_T}\}$. [sent-79, score-0.244]

34 These indices are given recursively from the ancestor indices by $b_T = k$ and $b_t = a_{t+1}^{b_{t+1}}$. [sent-83, score-0.594]

35 The PG sampler [1] is a Gibbs sampler targeting $\phi$ using the following sweep (note that $b_{1:T} = \{a_{2:T}^{b_{2:T}}, b_T\}$). [sent-84, score-0.425]

36 In [1], a sequential procedure for sampling from the conditional density appearing in Step 1 is given. [sent-98, score-0.174]

37 It takes the form of an SMC sampler in which we condition on the event that a prespecified path $x'_{1:T} = x_{1:T}^{b_{1:T}}$, with indices $b_{1:T}$, is maintained throughout the sampler (see Algorithm 1 for a related procedure). [sent-100, score-0.426]

38 Furthermore, the conditional distribution appearing in Step 2 of the PG sampler is shown to be proportional to $w_T^k$, and it can thus straightforwardly be sampled from. [sent-101, score-0.197]

39 Our idea is to sample new values for the ancestor indices $b_{1:T-1}$ as part of the CSMC procedure. [sent-108, score-0.288]

40 For t = 1, …, T, draw
$$\{x_t^{-b_t}, a_t^{-b_t}\} \sim \phi(x_t^{-b_t}, a_t^{-b_t} \mid x_{1:t-1}^{-b_{1:t-1}}, a_{2:t-1}^{-b_{2:t-1}}, x_{1:T}^{b_{1:T}}, b_{t-1:T}),$$
$$(a_t^{b_t} =)\; b_{t-1} \sim \phi(b_{t-1} \mid x_{1:t-1}^{-b_{1:t-1}}, a_{2:t-1}^{-b_{2:t-1}}, x_{1:T}^{b_{1:T}}, b_{t:T}).$$
[sent-118, score-0.175]

41 Using (2)-(3), the target-to-proposal ratio along the conditioned path telescopes as
$$\frac{\gamma_t(x_{1:t}^{b_t})}{R_1(x_1^{b_1}) \prod_{s=2}^{t} M_s(a_s^{b_s}, x_s^{b_s})} = w_t^{b_t} \prod_{s=1}^{t-1} \Big( \sum_l w_s^l\, \nu_s^l \Big). \qquad (6)$$
[sent-125, score-0.192]

42 From (5), the conditional for $b_t$ satisfies
$$\phi(b_t \mid x_{1:t}, a_{2:t}, x_{t+1:T}^{b_{t+1:T}}, b_{t+1:T}) \propto \frac{\gamma_T(x_{1:T}^k)}{R_1(x_1^{b_1}) \prod_{s=2}^{t} M_s(a_s^{b_s}, x_s^{b_s})}, \qquad (7)$$
and by plugging (6) into the numerator we get
$$\phi(b_t \mid x_{1:t}, a_{2:t}, x_{t+1:T}^{b_{t+1:T}}, b_{t+1:T}) \propto w_t^{b_t}\, \frac{\gamma_T(x_{1:T}^k)}{\gamma_t(x_{1:t}^{b_t})}. \qquad (8)$$
[sent-128, score-0.462]

43 Hence, to sample a new ancestor index for the conditioned path at time t + 1, we proceed as follows. [sent-129, score-0.294]

44 Given $x_{t+1:T}$ ($= x_{t+1:T}^{b_{t+1:T}}$) we compute the backward sampling weights,
$$w_{t|T}^m = w_t^m\, \frac{\gamma_T(\{x_{1:t}^m, x_{t+1:T}\})}{\gamma_t(x_{1:t}^m)}, \qquad (9)$$
for $m = 1, \dots, N$. [sent-130, score-0.566]

45 We then set $b_t = m$ with probability proportional to $w_{t|T}^m$. [sent-134, score-0.27]
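
In log space, to avoid underflow over long sequences, this step might be sketched as follows; the argument names are illustrative.

```python
import numpy as np

def draw_ancestor_index(log_w, log_gamma_T, log_gamma_t, rng):
    """Draw b_t with P(b_t = m) proportional to the weight in (9):
    w_{t|T}^m = w_t^m * gamma_T({x_{1:t}^m, x_{t+1:T}}) / gamma_t(x_{1:t}^m).
    All three arguments are length-N arrays of the corresponding log terms."""
    log_wtT = log_w + log_gamma_T - log_gamma_t
    p = np.exp(log_wtT - log_wtT.max())   # shift for numerical stability
    return rng.choice(len(p), p=p / p.sum())
```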

46 It follows that the proposed CSMC with ancestor sampling (Step 1′), conditioned on $\{x'_{1:T}, b_{1:T}\}$, can be realized as in Algorithm 1. [sent-135, score-0.34]

47 The difference between this algorithm and the CSMC sampler derived in [1] lies in the ancestor sampling step 2(b) (where instead, they set $a_t^{b_t} = b_{t-1}$). [sent-136, score-0.608]

48 By introducing the ancestor sampling, we break the strong dependence between the generated particle trajectories and the path on which we condition. [sent-137, score-0.655]

49 We call the resulting method, defined by Steps 1 and 2 above, PG with ancestor sampling (PG-AS). [sent-138, score-0.34]

50 Algorithm 1 (CSMC with ancestor sampling, conditioned on $\{x'_{1:T}, b_{1:T}\}$): [sent-139, score-0.252]

51 1. Initialize (t = 1): (a) Draw $x_1^m \sim R_1(x_1)$ for $m \ne b_1$ and set $x_1^{b_1} = x'_1$; (b) set $w_1^m = W_1(x_1^m)$ for $m = 1, \dots, N$. [sent-140, score-0.194]

52 2. For t = 2, …, T: (a) Draw $\{a_t^m, x_t^m\} \sim M_t(a_t, x_t)$ for $m \ne b_t$ and set $x_t^{b_t} = x'_t$. [sent-149, score-0.833]

53 (b) Draw $a_t^{b_t}$ with probability proportional to the backward sampling weights $w_{t-1|T}^m$ in (9); (c) set $x_{1:t}^m = \{x_{1:t-1}^{a_t^m}, x_t^m\}$ and $w_t^m = W_t(x_{1:t}^m)$ for $m = 1, \dots, N$. [sent-151, score-0.542]
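
To make the algorithm concrete, here is a self-contained bootstrap-proposal sketch: the adjustment multipliers are taken as 1, the conditioned path is kept in slot N-1, and the backward weights are truncated to a single step (p = 1 in the notation of Section 4). The callables `f_mean` and `g_logpdf` and the noise scale `sigma` are illustrative assumptions, not the paper's model.

```python
import numpy as np

def csmc_as_bootstrap(y, x_ref, N, f_mean, g_logpdf, rng, sigma=1.0):
    """Bootstrap CSMC with ancestor sampling (sketch of Algorithm 1).
    f_mean(hist) gives the transition mean from the full state history
    (non-Markovian); g_logpdf(y_t, hist) the observation log-density."""
    T = len(y)
    b = N - 1                                    # slot of the conditioned path
    X = np.empty((T, N))
    A = np.zeros((T, N), dtype=int)
    logW = np.empty((T, N))
    X[0] = rng.normal(0.0, sigma, size=N)        # 1(a): draw x_1^m for m != b ...
    X[0, b] = x_ref[0]                           # ... and keep the reference state
    logW[0] = [g_logpdf(y[0], X[:1, m]) for m in range(N)]   # 1(b): weights W_1
    for t in range(1, T):
        w = np.exp(logW[t-1] - logW[t-1].max())
        w /= w.sum()
        A[t] = rng.choice(N, size=N, p=w)        # 2(a): ancestors for m != b
        means = np.array([f_mean(X[:t, A[t, m]]) for m in range(N)])
        X[t] = means + sigma * rng.normal(size=N)
        X[t, b] = x_ref[t]                       # keep the reference state
        # 2(b): ancestor sampling, cf. (9); with a one-step truncation the
        # gamma-ratio reduces to the transition density of the next
        # reference state given each candidate history.
        log_as = logW[t-1] + np.array(
            [-0.5 * ((x_ref[t] - f_mean(X[:t, m])) / sigma) ** 2 for m in range(N)])
        pa = np.exp(log_as - log_as.max())
        A[t, b] = rng.choice(N, p=pa / pa.sum())
        X[:t] = X[:t, A[t]]                      # 2(c): relink ancestral paths
        logW[t] = [g_logpdf(y[t], X[:t+1, m]) for m in range(N)]
    return X, logW
```

A new reference trajectory is then obtained by drawing an index k with probability proportional to the final weights and taking column k of X (Step 2 of the PG sweep).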

54 This previous work, however, accomplishes this with an explicit backward simulation pass, which, as we discuss in the following section, is problematic for our applications to non-Markovian SSMs. [sent-156, score-0.442]

55 In the PG-AS sampler, instead of requiring distinct forward and backward sequences of Gibbs steps as in PG with backward simulation (PG-BS), we obtain a similar effect via a single forward sweep. [sent-157, score-0.832]

56 To employ backward sampling, we need to evaluate the ratio
$$\frac{\gamma_T(x_{1:T})}{\gamma_t(x_{1:t})} = \frac{p(x_{1:T}, y_{1:T})}{p(x_{1:t}, y_{1:t})} = \prod_{s=t+1}^{T} g(y_s \mid x_{1:s})\, f(x_s \mid x_{1:s-1}). \qquad (10)$$
[sent-159, score-0.324]

57 In general, the computational cost of computing the backward sampling weights will thus be $O(T)$. [sent-160, score-0.444]

58 This implies that the cost of generating a full backward trajectory is $O(T^2)$. [sent-161, score-0.36]

59 It is therefore computationally prohibitive to employ backward-simulation-type particle smoothers, as well as the PG samplers discussed above, for general non-Markovian models. [sent-162, score-0.843]

60 Figure 1: Probability under $P_p$ as a function of the truncation level p for two different systems; one 5-dimensional (left) and one 20-dimensional (right). [sent-169, score-0.242]

61 The dashed vertical lines show the value of the truncation level $p_{\text{adpt}}$. [sent-176, score-0.242]

62 Consider the distributions P and $P_p$ on $\{1, \dots, N\}$, defined by the backward sampling weights (9) and the truncated backward sampling weights (11), respectively. [sent-189, score-0.856]

63 From (11), we see that we can compute the backward weights in constant time under the truncation within the PG-AS framework. [sent-194, score-0.575]
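
A sketch of that computation: the product in (10) is cut after p factors, so each of the N weights costs O(p), constant in T. The callables `log_f` and `log_g` (pointwise transition and observation log-densities) are assumed to be available.

```python
import numpy as np

def truncated_backward_logweights(log_w, paths, x_future, y_future, p, log_f, log_g):
    """Truncated backward log-weights, cf. (10)-(11).  paths: (t, N) array of
    ancestral paths x_{1:t}^m; x_future/y_future: the conditioned reference
    states and observations from time t+1 onward; p: truncation level."""
    t, N = paths.shape
    horizon = min(p, len(x_future))
    out = np.array(log_w, dtype=float)
    for m in range(N):
        hist = list(paths[:, m])
        for s in range(horizon):           # only p factors of the product in (10)
            hist.append(x_future[s])
            out[m] += log_f(hist[-1], hist[:-1]) + log_g(y_future[s], hist)
    return out
```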

64 In general, however, it will not be known a priori how to set the truncation level p for any given problem. [sent-196, score-0.242]

65 To address this problem, we propose to adapt the truncation level. [sent-197, score-0.315]

66 Let $\varepsilon_p = D_{\mathrm{TV}}(P_p, P_{p-1})$ be the total variation (TV) distance between the distributions for two consecutive truncation levels. [sent-201, score-0.219]
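
One way such an adaptation might be implemented, assuming `logw_fn(p)` returns the length-N truncated log-weights at level p; the threshold `gamma` and this particular stopping rule are illustrative assumptions.

```python
import numpy as np

def adapt_truncation(logw_fn, p_max, gamma=0.1):
    """Grow p until the TV distance between the induced distributions
    P_p and P_{p-1} over {1, ..., N} falls below gamma (cf. eps_p)."""
    def normalize(logw):
        w = np.exp(logw - logw.max())
        return w / w.sum()
    p = 1
    prev = normalize(logw_fn(p))
    while p < p_max:
        p += 1
        cur = normalize(logw_fn(p))
        eps_p = 0.5 * np.abs(cur - prev).sum()   # D_TV(P_p, P_{p-1})
        if eps_p < gamma:
            break
        prev = cur
    return p
```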

67 5.1 Rao-Blackwellized particle smoothing. One popular approach to increase the efficiency of SMC samplers for SSMs is to marginalize over one component of the state, and apply an SMC sampler in the lower-dimensional marginal space. [sent-217, score-0.717]

68 This leads to what is known as the Rao-Blackwellized particle filter (RBPF) [8–10]. [sent-218, score-0.339]

69 A mixed linear/nonlinear SSM (see, e.g., [10]) with "nonlinear state" $x_t$ and "linear state" $z_t$ can be reduced, by marginalizing over $z_t$, to $x_t \sim p(x_t \mid x_{1:t-1}, y_{1:t-1})$ and $y_t \sim p(y_t \mid x_{1:t}, y_{1:t-1})$. [sent-222, score-0.355]

70 To employ backward sampling (i.e., a particle smoother) for this model, we need to evaluate the backward sampling weights (9). [sent-226, score-0.783]

71 The computational complexity of this calculation can be reduced by employing the truncation proposed in Section 4. [sent-231, score-0.219]

72 5.2 Particle smoothing for degenerate state-space models. Many dynamical systems are most naturally modelled as degenerate in the sense that the transition kernel of the state process does not admit any dominating measure. [sent-233, score-0.232]

73 SMC samplers can straightforwardly be applied to this type of model, but it is more problematic to address the smoothing problem using particle methods. [sent-236, score-0.589]

74 The reason is that the backward kernel will also be degenerate, and it cannot be approximated in a natural way by the forward filter particles, as is normally done in backward-simulation-based particle smoothers. [sent-237, score-0.752]

75 With $v_t = \Sigma_V^{\mathsf{T}} \omega_t$ and by appropriate definitions of the functions $f_x$ and $h$, the model (12) can thus be rewritten as $x_t = f_x(x_{1:t-1}) + v_{t-1}$ and $y_t = h(x_{1:t}) + e_t$, which is a non-degenerate, non-Markovian SSM. [sent-243, score-0.233]

76 By exploiting the truncation proposed in Section 4 we can thus apply PG-AS to do inference in this model. [sent-244, score-0.249]

77 (Footnote 2.) For the specific problem of Rao-Blackwellized smoothing in conditionally Gaussian models, a backward simulator which can be implemented in O(T) computational complexity has recently been proposed in [11]. [sent-251, score-0.445]

78 This is based on the idea of propagating information backward in time as the backward samples are generated. [sent-252, score-0.648]

79 Figure 2: Rao-Blackwellized state smoothing using PG. [sent-255, score-0.541]

80 To employ backward simulation for these models, we are thus faced with problems of a similar nature to those discussed in Section 4. [sent-271, score-0.416]

81 6.1 RBPS: Linear Gaussian state-space model. As a first example, we consider Rao-Blackwellized particle smoothing (RBPS) in a single-output 4th-order linear Gaussian SSM. [sent-278, score-0.431]

82 The truncation level is set to p = 1, leading to a coarse approximation. [sent-283, score-0.242]

83 Furthermore, it appears as if the error for PG-AS would decrease further, given more iterations, suggesting that the bias caused by the truncation is dominated by the Monte Carlo variance, even after R = 10000 iterations. [sent-290, score-0.219]

84 For further comparison, we also run an untruncated forward filter/backward simulator (FF-BS) particle smoother [21], using N = 5000 forward filter particles and M = 500 backward trajectories (with a computational complexity of $O(NMT^2)$). [sent-291, score-0.989]

85 These results suggest that PMCMC samplers, such as PG-AS, can indeed be serious competitors to more "standard" particle smoothers. [sent-293, score-0.339]

86 [Figure legend: backward simulation; p = 1, p = 5, p = 10, adaptive.] [sent-298, score-0.416]

87 Different values for the truncation level p are considered. [sent-303, score-0.242]

88 PG-AS is comparable to FF-BS in terms of accuracy and, due to the fact that the ancestor sampling allows us to use as few as N = 5 particles at each iteration, comes at a lower computational cost. [sent-306, score-0.423]

89 6.2 Random linear Gaussian systems with rank-deficient process noise covariances. To see how the PG samplers are affected by the choice of truncation level p and by the mixing properties of the system, we evaluate them on random linear Gaussian SSMs of different orders. [sent-308, score-0.388]

90 We consider different fixed truncation levels, as well as an adaptive level with γ = 0. [sent-314, score-0.242]

91 Figure 1 shows how the distributions $P_p$ on $\{1, \dots, N\}$ change as we increase the truncation level, in two representative cases for a 5th- and a 20th-order system, respectively. [sent-322, score-0.243]

92 7 Discussion. PG-AS is a novel approach to PMCMC that makes use of backward simulation ideas without needing an explicit backward pass. [sent-324, score-0.74]

93 Compared to PG-BS, a conceptually similar method that does require an explicit backward pass, PG-AS has advantages, most notably for inference in the non-Markovian SSMs that have been our focus here. [sent-325, score-0.354]

94 When using the proposed truncation of the backward weights, we have found PG-AS to be more robust to the approximation error than PG-BS. [sent-326, score-0.543]

95 It can also be more memory efficient, since it does not require us to store intermediate quantities that are needed for a separate backward simulation pass, as is done in PG-BS. [sent-328, score-0.416]

96 Doucet, "Efficient Bayesian inference for switching state-space models using discrete particle Markov chain Monte Carlo methods," Bristol Statistics Research Report 10:04, Tech. [sent-341, score-0.369]

97 Schön, "On the use of backward simulation in the particle Gibbs sampler," in Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, Mar. [sent-347, score-0.755]

98 Johansen, “A tutorial on particle filtering and smoothing: Fifteen years later,” in The Oxford Handbook of Nonlinear Filtering, D. [sent-351, score-0.339]

99 Shephard, “Filtering via simulation: Auxiliary particle filters,” Journal of the American Statistical Association, vol. [sent-358, score-0.339]

100 Nordlund, "Marginalized particle filters for mixed linear/nonlinear state-space models," IEEE Transactions on Signal Processing, vol. [sent-391, score-0.339]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('particle', 0.339), ('backward', 0.324), ('pg', 0.29), ('smc', 0.274), ('bt', 0.27), ('ancestor', 0.252), ('truncation', 0.219), ('xm', 0.194), ('ssms', 0.188), ('sampler', 0.174), ('pmcmc', 0.166), ('wt', 0.154), ('csmc', 0.132), ('xt', 0.131), ('xbt', 0.107), ('abt', 0.094), ('smoothing', 0.092), ('simulation', 0.092), ('sampling', 0.088), ('samplers', 0.088), ('gibbs', 0.086), ('pp', 0.085), ('particles', 0.083), ('adaption', 0.075), ('lindsten', 0.075), ('ancestral', 0.072), ('doucet', 0.067), ('carlo', 0.065), ('monte', 0.063), ('mixing', 0.058), ('godsill', 0.058), ('whiteley', 0.056), ('yt', 0.054), ('mcmc', 0.05), ('smoother', 0.05), ('untruncated', 0.05), ('density', 0.049), ('xk', 0.047), ('forward', 0.046), ('ssm', 0.046), ('mt', 0.045), ('draw', 0.044), ('degenerate', 0.043), ('path', 0.042), ('andrieu', 0.041), ('phylogenetic', 0.039), ('targeting', 0.039), ('zt', 0.039), ('lters', 0.038), ('sweep', 0.038), ('dkld', 0.038), ('dpmm', 0.038), ('mbf', 0.038), ('rbpf', 0.038), ('rbps', 0.038), ('xbs', 0.038), ('sequential', 0.037), ('trajectory', 0.036), ('proposal', 0.036), ('indices', 0.036), ('pass', 0.034), ('state', 0.033), ('rmses', 0.033), ('kalman', 0.033), ('weights', 0.032), ('swedish', 0.031), ('rmse', 0.031), ('rt', 0.03), ('inference', 0.03), ('auxiliary', 0.03), ('markovian', 0.029), ('simulator', 0.029), ('abs', 0.029), ('inferential', 0.029), ('system', 0.027), ('lter', 0.026), ('west', 0.026), ('degeneracy', 0.026), ('jordan', 0.026), ('problematic', 0.026), ('ping', 0.025), ('nonlinear', 0.025), ('sch', 0.025), ('fx', 0.024), ('idealized', 0.024), ('increase', 0.024), ('instructive', 0.024), ('dim', 0.024), ('level', 0.023), ('straightforwardly', 0.023), ('agglomerative', 0.023), ('hs', 0.023), ('gaussian', 0.023), ('static', 0.022), ('trajectories', 0.022), ('densities', 0.022), ('considerably', 0.022), ('address', 0.021), ('dynamical', 0.021), ('filtering', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999958 41 nips-2012-Ancestor Sampling for Particle Gibbs

Author: Fredrik Lindsten, Thomas Schön, Michael I. Jordan

Abstract: We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. We apply the PG-AS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models. 1

2 0.33680218 118 nips-2012-Entangled Monte Carlo

Author: Seong-hwan Jun, Liangliang Wang, Alexandre Bouchard-côté

Abstract: We propose a novel method for scalable parallelization of SMC algorithms, Entangled Monte Carlo simulation (EMC). EMC avoids the transmission of particles between nodes, and instead reconstructs them from the particle genealogy. In particular, we show that we can reduce the communication to the particle weights for each machine while efficiently maintaining implicit global coherence of the parallel simulation. We explain methods to efficiently maintain a genealogy of particles from which any particle can be reconstructed. We demonstrate using examples from Bayesian phylogenetics that the computational gain from parallelization using EMC significantly outweighs the cost of particle reconstruction. The timing experiments show that reconstruction of particles is indeed much more efficient as compared to transmission of particles. 1

3 0.16650841 205 nips-2012-MCMC for continuous-time discrete-state systems

Author: Vinayak Rao, Yee W. Teh

Abstract: We propose a simple and novel framework for MCMC inference in continuoustime discrete-state systems with pure jump trajectories. We construct an exact MCMC sampler for such systems by alternately sampling a random discretization of time given a trajectory of the system, and then a new trajectory given the discretization. The first step can be performed efficiently using properties of the Poisson process, while the second step can avail of discrete-time MCMC techniques based on the forward-backward algorithm. We show the advantage of our approach compared to particle MCMC and a uniformization-based sampler. 1

4 0.15599549 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

Author: James Scott, Jonathan W. Pillow

Abstract: Characterizing the information carried by neural populations in the brain requires accurate statistical models of neural spike responses. The negative-binomial distribution provides a convenient model for over-dispersed spike counts, that is, responses with greater-than-Poisson variability. Here we describe a powerful data-augmentation framework for fully Bayesian inference in neural models with negative-binomial spiking. Our approach relies on a recently described latentvariable representation of the negative-binomial distribution, which equates it to a Polya-gamma mixture of normals. This framework provides a tractable, conditionally Gaussian representation of the posterior that can be used to design efficient EM and Gibbs sampling based algorithms for inference in regression and dynamic factor models. We apply the model to neural data from primate retina and show that it substantially outperforms Poisson regression on held-out data, and reveals latent structure underlying spike count correlations in simultaneously recorded spike trains. 1

5 0.14526454 121 nips-2012-Expectation Propagation in Gaussian Process Dynamical Systems

Author: Marc Deisenroth, Shakir Mohamed

Abstract: Rich and complex time-series data, such as those generated from engineering systems, financial markets, videos, or neural recordings are now a common feature of modern data analysis. Explaining the phenomena underlying these diverse data sets requires flexible and accurate models. In this paper, we promote Gaussian process dynamical systems as a rich model class that is appropriate for such an analysis. We present a new approximate message-passing algorithm for Bayesian state estimation and inference in Gaussian process dynamical systems, a nonparametric probabilistic generalization of commonly used state-space models. We derive our message-passing algorithm using Expectation Propagation and provide a unifying perspective on message passing in general state-space models. We show that existing Gaussian filters and smoothers appear as special cases within our inference framework, and that these existing approaches can be improved upon using iterated message passing. Using both synthetic and real-world data, we demonstrate that iterated message passing can improve inference in a wide range of tasks in Bayesian state estimation, thus leading to improved predictions and more effective decision making. 1

6 0.13463148 292 nips-2012-Regularized Off-Policy TD-Learning

7 0.12141824 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

8 0.11866777 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models

9 0.11794846 336 nips-2012-The Coloured Noise Expansion and Parameter Estimation of Diffusion Processes

10 0.10701154 23 nips-2012-A lattice filter model of the visual pathway

11 0.10678002 218 nips-2012-Mixing Properties of Conditional Markov Chains with Unbounded Feature Functions

12 0.10583349 56 nips-2012-Bayesian active learning with localized priors for fast receptive field characterization

13 0.10402285 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

14 0.10214404 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo

15 0.10089846 13 nips-2012-A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

16 0.1005915 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

17 0.099905849 324 nips-2012-Stochastic Gradient Descent with Only One Projection

18 0.098345503 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

19 0.095053777 126 nips-2012-FastEx: Hash Clustering with Exponential Families

20 0.093728036 349 nips-2012-Training sparse natural image models with a fast Gibbs sampler of an extended state space


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.192), (1, -0.003), (2, 0.071), (3, 0.187), (4, -0.11), (5, -0.1), (6, 0.01), (7, -0.049), (8, 0.099), (9, -0.047), (10, -0.093), (11, -0.11), (12, 0.006), (13, -0.049), (14, -0.042), (15, -0.107), (16, -0.064), (17, -0.072), (18, 0.071), (19, -0.071), (20, 0.091), (21, 0.095), (22, -0.165), (23, -0.145), (24, -0.12), (25, -0.023), (26, -0.195), (27, 0.066), (28, 0.038), (29, -0.223), (30, 0.052), (31, -0.093), (32, 0.077), (33, 0.065), (34, 0.057), (35, 0.116), (36, -0.15), (37, 0.082), (38, -0.008), (39, 0.012), (40, 0.077), (41, -0.032), (42, 0.043), (43, -0.062), (44, 0.033), (45, 0.058), (46, 0.069), (47, 0.033), (48, 0.02), (49, 0.059)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95378435 41 nips-2012-Ancestor Sampling for Particle Gibbs

Author: Fredrik Lindsten, Thomas Schön, Michael I. Jordan

Abstract: We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. We apply the PG-AS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models. 1

2 0.81627172 118 nips-2012-Entangled Monte Carlo

Author: Seong-hwan Jun, Liangliang Wang, Alexandre Bouchard-côté

Abstract: We propose a novel method for scalable parallelization of SMC algorithms, Entangled Monte Carlo simulation (EMC). EMC avoids the transmission of particles between nodes, and instead reconstructs them from the particle genealogy. In particular, we show that we can reduce the communication to the particle weights for each machine while efficiently maintaining implicit global coherence of the parallel simulation. We explain methods to efficiently maintain a genealogy of particles from which any particle can be reconstructed. We demonstrate using examples from Bayesian phylogenetics that the computational gain from parallelization using EMC significantly outweighs the cost of particle reconstruction. The timing experiments show that reconstruction of particles is indeed much more efficient as compared to transmission of particles. 1

3 0.72024304 205 nips-2012-MCMC for continuous-time discrete-state systems

Author: Vinayak Rao, Yee W. Teh

Abstract: We propose a simple and novel framework for MCMC inference in continuoustime discrete-state systems with pure jump trajectories. We construct an exact MCMC sampler for such systems by alternately sampling a random discretization of time given a trajectory of the system, and then a new trajectory given the discretization. The first step can be performed efficiently using properties of the Poisson process, while the second step can avail of discrete-time MCMC techniques based on the forward-backward algorithm. We show the advantage of our approach compared to particle MCMC and a uniformization-based sampler. 1

4 0.59321427 11 nips-2012-A Marginalized Particle Gaussian Process Regression

Author: Yali Wang, Brahim Chaib-draa

Abstract: We present a novel marginalized particle Gaussian process (MPGP) regression, which provides a fast, accurate online Bayesian filtering framework to model the latent function. Using a state space model established by the data construction procedure, our MPGP recursively filters out the estimation of hidden function values by a Gaussian mixture. Meanwhile, it provides a new online method for training hyperparameters with a number of weighted particles. We demonstrate the estimated performance of our MPGP on both simulated and real large data sets. The results show that our MPGP is a robust estimation algorithm with high computational efficiency, which outperforms other state-of-the-art sparse GP methods. 1

5 0.55348861 136 nips-2012-Forward-Backward Activation Algorithm for Hierarchical Hidden Markov Models

Author: Kei Wakabayashi, Takao Miura

Abstract: Hierarchical Hidden Markov Models (HHMMs) are sophisticated stochastic models that enable us to capture a hierarchical context characterization of sequence data. However, existing HHMM parameter estimation methods require large computations of time complexity $O(TN^{2D})$ at least for model inference, where D is the depth of the hierarchy, N is the number of states in each level, and T is the sequence length. In this paper, we propose a new inference method of HHMMs for which the time complexity is $O(TN^{D+1})$. A key idea of our algorithm is application of the forward-backward algorithm to state activation probabilities. The notion of a state activation, which offers a simple formalization of the hierarchical transition behavior of HHMMs, enables us to conduct model inference efficiently. We present some experiments to demonstrate that our proposed method works more efficiently to estimate HHMM parameters than do some existing methods such as the flattening method and Gibbs sampling method. 1

6 0.5502708 336 nips-2012-The Coloured Noise Expansion and Parameter Estimation of Diffusion Processes

7 0.48863617 299 nips-2012-Scalable imputation of genetic data with a discrete fragmentation-coagulation process

8 0.44146338 349 nips-2012-Training sparse natural image models with a fast Gibbs sampler of an extended state space

9 0.43963283 82 nips-2012-Continuous Relaxations for Discrete Hamiltonian Monte Carlo

10 0.43676633 111 nips-2012-Efficient Sampling for Bipartite Matching Problems

11 0.43187755 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models

12 0.41843989 56 nips-2012-Bayesian active learning with localized priors for fast receptive field characterization

13 0.40548012 107 nips-2012-Effective Split-Merge Monte Carlo Methods for Nonparametric Models of Sequential Data

14 0.39633352 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

15 0.38640594 138 nips-2012-Fully Bayesian inference for neural models with negative-binomial spiking

16 0.38546735 128 nips-2012-Fast Resampling Weighted v-Statistics

17 0.38135216 292 nips-2012-Regularized Off-Policy TD-Learning

18 0.37203333 333 nips-2012-Synchronization can Control Regularization in Neural Systems via Correlated Noise Processes

19 0.36167791 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

20 0.35314703 121 nips-2012-Expectation Propagation in Gaussian Process Dynamical Systems


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.027), (17, 0.011), (21, 0.038), (23, 0.248), (38, 0.069), (39, 0.014), (42, 0.019), (54, 0.024), (55, 0.024), (60, 0.012), (74, 0.052), (76, 0.192), (80, 0.115), (92, 0.078)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.84069234 132 nips-2012-Fiedler Random Fields: A Large-Scale Spectral Approach to Statistical Network Modeling

Author: Antonino Freno, Mikaela Keller, Marc Tommasi

Abstract: Statistical models for networks have been typically committed to strong prior assumptions concerning the form of the modeled distributions. Moreover, the vast majority of currently available models are explicitly designed for capturing some specific graph properties (such as power-law degree distributions), which makes them unsuitable for application to domains where the behavior of the target quantities is not known a priori. The key contribution of this paper is twofold. First, we introduce the Fiedler delta statistic, based on the Laplacian spectrum of graphs, which allows to dispense with any parametric assumption concerning the modeled network properties. Second, we use the defined statistic to develop the Fiedler random field model, which allows for efficient estimation of edge distributions over large-scale random networks. After analyzing the dependence structure involved in Fiedler random fields, we estimate them over several real-world networks, showing that they achieve a much higher modeling accuracy than other well-known statistical approaches.

same-paper 2 0.81942481 41 nips-2012-Ancestor Sampling for Particle Gibbs

Author: Fredrik Lindsten, Thomas Schön, Michael I. Jordan

Abstract: We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PG-AS). Similarly to the existing PG with backward simulation (PG-BS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PG-BS, however, we achieve the same effect in a single forward sweep. We apply the PG-AS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy of these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PG-AS framework. In particular, as we show in a simulation study, PG-AS can yield an order-of-magnitude improved accuracy relative to PG-BS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models. 1

3 0.80238074 214 nips-2012-Minimizing Sparse High-Order Energies by Submodular Vertex-Cover

Author: Andrew Delong, Olga Veksler, Anton Osokin, Yuri Boykov

Abstract: Inference in high-order graphical models has become important in recent years. Several approaches are based, for example, on generalized message-passing, or on transformation to a pairwise model with extra ‘auxiliary’ variables. We focus on a special case where a much more efficient transformation is possible. Instead of adding variables, we transform the original problem into a comparatively small instance of submodular vertex-cover. These vertex-cover instances can then be attacked by existing algorithms (e.g. belief propagation, QPBO), where they often run 4–15 times faster and find better solutions than when applied to the original problem. We evaluate our approach on synthetic data, then we show applications within a fast hierarchical clustering and model-fitting framework. 1

4 0.69853497 197 nips-2012-Learning with Recursive Perceptual Representations

Author: Oriol Vinyals, Yangqing Jia, Li Deng, Trevor Darrell

Abstract: Linear Support Vector Machines (SVMs) have become very popular in vision as part of state-of-the-art object recognition and other classification tasks but require high dimensional feature spaces for good performance. Deep learning methods can find more compact representations but current methods employ multilayer perceptrons that require solving a difficult, non-convex optimization problem. We propose a deep non-linear classifier whose layers are SVMs and which incorporates random projection as its core stacking element. Our method learns layers of linear SVMs recursively transforming the original data manifold through a random projection of the weak prediction computed from each layer. Our method scales as linear SVMs, does not rely on any kernel computations or nonconvex optimization, and exhibits better generalization ability than kernel-based SVMs. This is especially true when the number of training samples is smaller than the dimensionality of data, a common scenario in many real-world applications. The use of random projections is key to our method, as we show in the experiments section, in which we observe a consistent improvement over previous –often more complicated– methods on several vision and speech benchmarks. 1

5 0.69601989 329 nips-2012-Super-Bit Locality-Sensitive Hashing

Author: Jianqiu Ji, Jianmin Li, Shuicheng Yan, Bo Zhang, Qi Tian

Abstract: Sign-random-projection locality-sensitive hashing (SRP-LSH) is a probabilistic dimension reduction method which provides an unbiased estimate of angular similarity, yet suffers from the large variance of its estimation. In this work, we propose the Super-Bit locality-sensitive hashing (SBLSH). It is easy to implement, which orthogonalizes the random projection vectors in batches, and it is theoretically guaranteed that SBLSH also provides an unbiased estimate of angular similarity, yet with a smaller variance when the angle to estimate is within (0, π/2]. The extensive experiments on real data well validate that given the same length of binary code, SBLSH may achieve significant mean squared error reduction in estimating pairwise angular similarity. Moreover, SBLSH shows the superiority over SRP-LSH in approximate nearest neighbor (ANN) retrieval experiments. 1

6 0.69048226 315 nips-2012-Slice sampling normalized kernel-weighted completely random measure mixture models

7 0.68924594 279 nips-2012-Projection Retrieval for Classification

8 0.68876123 188 nips-2012-Learning from Distributions via Support Measure Machines

9 0.68772441 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

10 0.68725961 336 nips-2012-The Coloured Noise Expansion and Parameter Estimation of Diffusion Processes

11 0.68659562 318 nips-2012-Sparse Approximate Manifolds for Differential Geometric MCMC

12 0.68607354 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

13 0.68597871 321 nips-2012-Spectral learning of linear dynamics from generalised-linear observations with application to neural population data

14 0.68520528 74 nips-2012-Collaborative Gaussian Processes for Preference Learning

15 0.68289787 54 nips-2012-Bayesian Probabilistic Co-Subspace Addition

16 0.68214542 209 nips-2012-Max-Margin Structured Output Regression for Spatio-Temporal Action Localization

17 0.68167347 103 nips-2012-Distributed Probabilistic Learning for Camera Networks with Missing Data

18 0.68130785 280 nips-2012-Proper losses for learning from partial labels

19 0.67958081 203 nips-2012-Locating Changes in Highly Dependent Data with Unknown Number of Change Points

20 0.67951483 168 nips-2012-Kernel Latent SVM for Visual Recognition