nips nips2007 nips2007-214 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Manfred Opper, Guido Sanguinetti
Abstract: Markov jump processes play an important role in a large number of application domains. However, realistic systems are analytically intractable and they have traditionally been analysed using simulation based techniques, which do not provide a framework for statistical inference. We propose a mean field approximation to perform posterior inference and parameter estimation. The approximation allows a practical solution to the inference problem, while still retaining a good degree of accuracy. We illustrate our approach on two biologically motivated systems.
Reference: text
sentIndex sentText sentNum sentScore
1 Variational inference for Markov jump processes Guido Sanguinetti Department of Computer Science University of Sheffield, U.K. [sent-1, score-0.205]
2 Abstract Markov jump processes play an important role in a large number of application domains. [sent-8, score-0.145]
3 We propose a mean field approximation to perform posterior inference and parameter estimation. [sent-10, score-0.172]
4 Introduction Markov jump processes (MJPs) underpin our understanding of many important systems in science and technology. [sent-13, score-0.145]
5 They provide a rigorous probabilistic framework to model the joint dynamics of groups (species) of interacting individuals, with applications ranging from information packets in a telecommunications network to epidemiology and population levels in the environment. [sent-14, score-0.093]
6 A traditional approach, which has been very successful throughout the past century, is to ignore the discrete nature of the processes and to approximate the stochastic process with a deterministic process whose behaviour is described by a system of non-linear, coupled ODEs. [sent-17, score-0.493]
7 This approximation relies on the stochastic fluctuations being negligible compared to the average population counts. [sent-18, score-0.194]
8 Researchers are now able to obtain accurate estimates of the number of macromolecules of a certain species within a cell [2, 3], prompting a need for practical statistical tools to handle discrete data. [sent-20, score-0.4]
9 However, it is not clear how observed data can be incorporated in a principled way, which renders this approach of limited use for posterior inference and parameter estimation. [sent-24, score-0.14]
10 In this paper we present an alternative, deterministic approach to posterior inference and parameter estimation in MJPs. [sent-29, score-0.17]
11 In this way, we replace the couplings between the different species by their average, mean-field (MF) effect. [sent-34, score-0.352]
12 The rest of this paper is organised as follows: in sections 1 and 2 we review the theory of Markov jump processes and introduce our general strategy to obtain a MF approximation. [sent-36, score-0.145]
13 In section 4 we present experimental results on simulated data from the Lotka-Volterra model and from a simple gene regulatory network. [sent-38, score-0.097]
14 Finally, we discuss the relationship of our study to other stochastic models, as well as further extensions and developments of our approach. [sent-39, score-0.112]
15 1 Markov jump processes We start off by establishing some notation and basic definitions. [sent-40, score-0.145]
16 A D-dimensional discrete stochastic process is a family of D-dimensional discrete random variables x(t) indexed by the continuous time t. [sent-41, score-0.269]
17 The dimensionality D represents the number of (molecular) species present in the system; the components of the vector x(t) then represent the number of individuals of each species present at time t. [sent-43, score-0.782]
18 Furthermore, the stochastic processes we will consider will always be Markovian, i.e. [sent-44, score-0.124]
19 given any sequence of observations for the state of the system $(x_{t_1}, \ldots, x_{t_N})$ with $t_1 < \ldots < t_N$, the conditional distribution of $x_{t_N}$ depends only on the most recent state $x_{t_{N-1}}$. [sent-46, score-0.109]
20 A discrete stochastic process which exhibits the Markov property is called a Markov jump process (MJP). [sent-50, score-0.415]
21 An MJP is characterised by its process rates $f(x'|x)$, defined for all $x' \neq x$; in an infinitesimal time interval $\delta t$, the quantity $f(x'|x)\,\delta t$ represents the infinitesimal probability that the system will make a transition from state $x$ at time $t$ to state $x'$ at time $t + \delta t$. [sent-51, score-0.301]
22 The interpretation of the process rates as infinitesimal transition probabilities highlights the simple relationship between the marginal distribution $p_t(x)$ and the process rates. [sent-54, score-0.414]
23 In formulae, this is given by $p_{t+\delta t}(x) = p_t(x)\big[1 - \sum_{x' \neq x} f(x'|x)\,\delta t\big] + \sum_{x' \neq x} p_t(x')\,f(x|x')\,\delta t$. [sent-56, score-0.156]
24 x =x Taking the limit for δt → 0 we obtain the (forward) Master equation for the marginal probabilities dpt (x) = dt 2 [−pt (x) f (x |x) + pt (x ) f (x|x )] . [sent-57, score-0.275]
25 2 Variational approximate inference Let us assume that we have noisy observations $y_l$, $l = 1, \ldots, N$, [sent-58, score-0.209]
26 of the state of the system at a discrete number of time points; the noise model is specified by a likelihood function $\hat{p}(y_l \mid x(t_l))$. [sent-61, score-0.127]
27 We can combine this likelihood with the prior process to obtain a posterior process. [sent-62, score-0.184]
28 As the observations happen at discrete time points, the posterior process is clearly still a Markov jump process. [sent-63, score-0.352]
29 While this is possible in principle, the computations would require simultaneously solving a very large system of coupled linear ODEs (the number of equations is of order $S^D$, $S$ being the number of states accessible to the system), which is not feasible even in simple systems. [sent-65, score-0.131]
30 In the following, we will use the variational mean field (MF) approach to approximate the posterior process by a factorizing process, minimising the Kullback-Leibler (KL) divergence between processes. [sent-66, score-0.343]
31 The inference process is then reduced to the solution of D one-dimensional Master and backward equations of size S. [sent-67, score-0.212]
32 This is still nontrivial because the KL divergence requires the joint probabilities of variables x(t) at infinitely many different times t, i.e. [sent-68, score-0.094]
33 probabilities over entire paths of a process rather than the simpler marginals $p_t(x)$. [sent-70, score-0.258]
34 Hence, we write the joint posterior probability as $p_{\mathrm{post}}(x_{0:K}) = \frac{1}{Z}\, p_{\mathrm{prior}}(x_{0:K}) \prod_{l=1}^{N} \hat{p}(y_l \mid x(t_l))$, with $p_{\mathrm{prior}}(x_{0:K}) = p(x_0) \prod_{k=0}^{K-1} p(x_{k+1} \mid x_k)$ and $Z = p(y_1, \ldots, y_N)$. [sent-76, score-0.373]
35 In the rest of this section, we will show how to compute the posterior rates and marginals by minimising the KL divergence. [sent-81, score-0.197]
36 We notice in passing that a similar framework for continuous stochastic processes was proposed recently in [8]. [sent-82, score-0.124]
37 2.1 KL divergence between MJPs The KL divergence between two MJPs defined by their path probabilities $p(x_{0:K})$ and $q(x_{0:K})$ is $\mathrm{KL}[q, p] = \sum_{x_{0:K}} q(x_{0:K}) \ln \frac{q(x_{0:K})}{p(x_{0:K})} = \sum_{k=0}^{K-1} \sum_{x_k} q(x_k) \sum_{x_{k+1}} q(x_{k+1}|x_k) \ln \frac{q(x_{k+1}|x_k)}{p(x_{k+1}|x_k)} + K_0$, where $K_0$ will be set to zero in the following. [sent-84, score-0.462]
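To make the discretised KL expression above concrete, here is a small sketch that evaluates it for two time-homogeneous chains on a toy three-state space, building the transition probabilities $p(x_{k+1}|x_k) \approx f(x'|x)\,\delta t$ from rate matrices. The rate values, the Euler discretisation and all names are illustrative assumptions.

```python
import numpy as np

def transition_matrix(rates, dt):
    """p(x_{k+1}|x_k): rate * dt off the diagonal, stay-probability on the diagonal."""
    P = rates * dt
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))
    return P

def path_kl(q_rates, p_rates, q0, dt, K):
    """sum_k sum_{x_k} q(x_k) sum_{x_{k+1}} q(x_{k+1}|x_k) ln[q(x_{k+1}|x_k)/p(x_{k+1}|x_k)]."""
    Q, P = transition_matrix(q_rates, dt), transition_matrix(p_rates, dt)
    marg, kl = np.asarray(q0, dtype=float), 0.0
    for _ in range(K):
        kl += np.sum(marg[:, None] * Q * (np.log(Q + 1e-300) - np.log(P + 1e-300)))
        marg = marg @ Q                           # propagate the marginal q(x_k) one step
    return kl

# toy 3-state rate matrices, rows = current state x, entry [x, x'] = f(x'|x)
f_q = np.array([[0.0, 1.0, 0.2], [0.5, 0.0, 0.5], [0.1, 1.0, 0.0]])
f_p = np.array([[0.0, 0.8, 0.4], [0.6, 0.0, 0.4], [0.2, 0.8, 0.0]])
print(path_kl(f_q, f_p, q0=[1.0, 0.0, 0.0], dt=0.01, K=1000))
```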
38 Notice that we have swapped from the path probabilities framework to an expression that depends solely on the process rates and marginals. [sent-86, score-0.282]
39 2.2 MF approximation to posterior MJPs We will now consider the case where p is a posterior MJP and q is an approximating process. [sent-88, score-0.245]
40 The prior process will be denoted as $p_{\mathrm{prior}}$ and its rates will be denoted by $f$. [sent-89, score-0.279]
41 The KL divergence then is $\mathrm{KL}(q, p_{\mathrm{post}}) = \ln Z + \mathrm{KL}(q, p_{\mathrm{prior}}) - \sum_{l=1}^{N} \mathbb{E}_q[\ln \hat{p}(y_l \mid x(t_l))]$. [sent-90, score-0.344]
42 To obtain a tractable inference problem, we will assume that, in the approximating process q, the joint path probability for all the species factorises into the product of path probabilities for individual species. [sent-91, score-0.757]
43 This gives the following equations for the species probabilities and transition rates: $q_t(x) = \prod_{i=1}^{D} q_{it}(x_i)$, $g_t(x'|x) = \sum_{i=1}^{D} \big[\prod_{j \neq i} \delta_{x'_j, x_j}\big]\, g_{it}(x'_i \mid x_i)$ (4). [sent-92, score-1.097]
44 Notice that we have emphasised that the process rates for the approximating process may depend explicitly on time, even if the process rates of the original process do not. [sent-93, score-0.639]
45 In order to find the MF approximation to the posterior process we must optimise the KL divergence (5) with respect to the marginals $q_{it}(x)$ and the rates $g_{it}(x'|x)$. [sent-98, score-0.918]
46 We will take care of this constraint by using a Lagrange multiplier function $\lambda_i(x, t)$ and compute the stationary values of the Lagrangian $\mathcal{L} = \mathrm{KL}(q, p_{\mathrm{post}}) - \sum_i \int_0^T dt \sum_x \lambda_i(x, t) \big[ \partial_t q_{it}(x) - \sum_{x' \neq x} \{ g_{it}(x|x')\, q_{it}(x') - g_{it}(x'|x)\, q_{it}(x) \} \big]$. [sent-100, score-1.145]
47 Then equation (8) yields $\lim_{t \to t_l^-} r_i(x, t) = \hat{p}_i(y_{il} \mid x_i(t_l) = x)\, \lim_{t \to t_l^+} r_i(x, t)$. [sent-103, score-0.171]
48 Starting with an initial guess for $q_t(x)$ and selecting a species i, we can compute $\hat{f}_i(x'|x)$ and $\tilde{f}_i(x'|x)$. [sent-105, score-0.669]
49 Using these, we can solve equation (10) backwards starting from the terminal condition $r_i(x, T) = 1$ for all $x$ (i.e. a flat condition, since no observations lie beyond $T$). [sent-106, score-0.111]
50 This allows us to update our estimate of the rates $g_{it}(x'|x)$ using equation (9), which can then be used to solve the master equation (2) and update our guess of $q_{it}(x)$. [sent-109, score-0.797]
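For concreteness, here is a minimal numerical sketch (Python with numpy assumed) of the one-dimensional building block that each species update reduces to: a backward Euler pass for $r_i(x,t)$ with multiplicative likelihood jumps at the observation times, a rate update of the standard smoothing form $g_t(x'|x) = f(x'|x)\,r(x',t)/r(x,t)$, and a forward pass of the master equation. Equations (8)-(10) themselves are not reproduced in this extraction, so this generic smoothing recursion stands in for them; in the full mean-field scheme the prior rates F would be replaced by the effective rates $\hat f_i$, $\tilde f_i$ obtained by averaging over the other species' marginals, and the sweep would be cycled over species until the KL bound stops decreasing. All names, the Euler discretisation and the rescaling tricks are illustrative.

```python
import numpy as np

def smoothing_sweep(F, p0, obs_times, obs_lik, T, n_grid=2000):
    """Single-species sweep: backward pass for r(x, t), posterior-rate update, forward
    master equation for q_t(x).  F[x, xp] holds the prior rates f(xp|x) on a truncated
    state space of size S; obs_lik[l] is the likelihood vector p(y_l | x) over states."""
    S = F.shape[0]
    A = F.copy()
    np.fill_diagonal(A, 0.0)
    np.fill_diagonal(A, -A.sum(axis=1))          # generator: diagonal = minus total exit rate
    ts = np.linspace(0.0, T, n_grid)
    dt = ts[1] - ts[0]

    # Backward pass, dr/dt = -A r, from r(x, T) = 1, with jumps at the observation times.
    r = np.ones((n_grid, S))
    pending = sorted(zip(obs_times, obs_lik), key=lambda o: -o[0])
    k = 0
    for i in range(n_grid - 2, -1, -1):
        r[i] = r[i + 1] + dt * (A @ r[i + 1])
        while k < len(pending) and pending[k][0] >= ts[i]:
            r[i] = r[i] * pending[k][1]          # multiply in the likelihood p(y_l | x)
            k += 1
        r[i] /= r[i].max()                       # rescale: only ratios of r matter

    # Forward pass: master equation driven by the posterior rates g_t(xp|x).
    q = np.zeros((n_grid, S))
    q[0] = p0 * r[0]
    q[0] /= q[0].sum()
    for i in range(n_grid - 1):
        G = F * (r[i][None, :] / np.maximum(r[i][:, None], 1e-30))
        np.fill_diagonal(G, 0.0)
        np.fill_diagonal(G, -G.sum(axis=1))
        q[i + 1] = np.maximum(q[i] + dt * (q[i] @ G), 0.0)
        q[i + 1] /= q[i + 1].sum()
    return ts, q

# Example use (hypothetical inputs): ts, q = smoothing_sweep(F, p0, [5.0, 15.0], [lik1, lik2], T=20.0)
```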
51 2.3 Parameter estimation Since $\mathrm{KL}[q, p_{\mathrm{post}}] \geq 0$, we obtain as a useful by-product of the MF approximation a tractable variational lower bound on the log-likelihood of the data, $\log Z = \log p(y_1, \ldots, y_N)$. [sent-112, score-0.221]
52 As in e.g. [7], such a bound can be used in order to optimise model parameters using a variational E-M algorithm. [sent-117, score-0.121]
53 3 Example: the Lotka-Volterra process The Lotka-Volterra (LV) process is often used as perhaps the simplest non-trivial MJP [6, 4]. [sent-118, score-0.208]
54 First proposed independently by Alfred J. Lotka in 1925 and Vito Volterra in 1926, it describes the dynamics of a population composed of two interacting species, traditionally referred to as predator and prey. [sent-120, score-0.183]
55 The process rates for the LV system are given by $f_{\mathrm{prey}}(x+1 \mid x, y) = \alpha x$, $f_{\mathrm{prey}}(x-1 \mid x, y) = \beta x y$, $f_{\mathrm{predator}}(y+1 \mid x, y) = \delta x y$, $f_{\mathrm{predator}}(y-1 \mid x, y) = \gamma y$ (11), where x is the number of prey and y is the number of predators. [sent-121, score-0.439]
56 All other rates are zero: individuals can only be created or destroyed one at a time. [sent-122, score-0.163]
57 Rate sparsity is a characteristic of very many processes, including all chemical kinetic processes (indeed, the LV model can be interpreted as a chemical kinetic model). [sent-123, score-0.353]
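Such systems are conventionally simulated with Gillespie's stochastic simulation algorithm; the sketch below draws exact sample paths from the LV rates (11). This is a standard SSA implementation written for illustration, not code from the paper; the parameters and initial counts are the ones quoted in the experiment described below, and the time horizon is an assumption.

```python
import numpy as np

def gillespie_lv(x0, y0, alpha, beta, gamma, delta, T, rng=None):
    """Exact stochastic simulation (Gillespie) of the Lotka-Volterra MJP with rates (11)."""
    rng = rng or np.random.default_rng(0)
    t, x, y = 0.0, x0, y0
    path = [(t, x, y)]
    while t < T:
        rates = np.array([alpha * x,      # prey birth:      x -> x + 1
                          beta * x * y,   # prey death:      x -> x - 1
                          delta * x * y,  # predator birth:  y -> y + 1
                          gamma * y])     # predator death:  y -> y - 1
        total = rates.sum()
        if total == 0.0:                  # both species extinct: no further events
            break
        t += rng.exponential(1.0 / total)           # exponential waiting time
        event = rng.choice(4, p=rates / total)      # which reaction fires
        if event == 0:
            x += 1
        elif event == 1:
            x -= 1
        elif event == 2:
            y += 1
        else:
            y -= 1
        path.append((t, x, y))
    return path

# 19 prey, 7 predators, parameters from the experiment described in the text
path = gillespie_lv(19, 7, 5e-4, 1e-4, 5e-4, 1e-4, T=3000.0)
print(len(path), path[-1])
```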
58 An immediate difficulty in implementing our strategy is that some of the process rates are identically zero when one of the species is extinct (i.e. [sent-124, score-0.586]
59 its numbers have reached zero); this will lead to infinities when computing the expectation of the logarithm of the rates in equation (6). [sent-126, score-0.172]
60 To avoid this, we will “regularise” the process by adding a small constant to the rates $f(1|0)$; it can be proved that on average over the data generating process the variational approximation to the regularised process still optimises a bound analogous to (3) on the original process [9]. [sent-127, score-0.524]
61 The variational estimates for the parameters of the LV process are obtained by inserting the process rates (11) into the MF bound and taking derivatives w.r.t. the parameters. [sent-128, score-0.401]
62 Equations (12) have an appealing intuitive meaning in terms of the physics of the process: for example, α is given by the average total increase rate of the approximating process divided by the average total number of prey. [sent-133, score-0.157]
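Read literally, that description of the α update can be written in a few lines. The sketch below follows the verbal reading only (equation (12) itself is not reproduced in this extraction) and assumes the posterior marginals and the prey-increase rates of the approximating process have already been computed on a regular time grid; all names are hypothetical.

```python
import numpy as np

def estimate_alpha(q_prey, g_up, dt):
    """alpha-hat = (time-averaged expected prey-increase rate) / (time-averaged expected
    prey count); q_prey[t, x] are the marginals q_t(x), g_up[t, x] the rates g_t(x+1|x)."""
    counts = np.arange(q_prey.shape[1])
    mean_increase = np.sum(q_prey * g_up) * dt    # time integral of E_q[g_t(x+1|x)]
    mean_prey = np.sum(q_prey @ counts) * dt      # time integral of E_q[x]
    return mean_increase / mean_prey
```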
63 We generated 15 counts of predator and prey numbers at regular intervals from an LV process with parameters $\alpha = 5 \times 10^{-4}$, $\beta = 1 \times 10^{-4}$, $\gamma = 5 \times 10^{-4}$ and $\delta = 1 \times 10^{-4}$, starting from initial population levels of seven predators and nineteen prey. [sent-134, score-0.475]
64 These counts were then corrupted according to the following noise model: $\hat{p}_i(y_{il} \mid x_i(t_l)) \propto 2^{-|y_{il} - x_i(t_l)|} + 10^{-6}$ (13), where $x_i(t_l)$ is the (discrete) count for species i at time $t_l$ before the addition of noise. [sent-135, score-0.672]
65 Notice that, since population numbers are constrained to be positive, the noise model is not symmetric. [sent-136, score-0.129]
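A small sketch of the corruption model (13): the first function gives the (unnormalised) likelihood over states that enters the inference, the second samples noisy counts by renormalising the same expression over the admissible non-negative values of y, which is what produces the asymmetry noted above. Names and the truncation bounds are illustrative.

```python
import numpy as np

def obs_likelihood(y, x_grid):
    """Unnormalised likelihood p-hat(y | x) of model (13) as a vector over the states x."""
    return 2.0 ** (-np.abs(y - x_grid)) + 1e-6

x_grid = np.arange(200)                              # truncated state space, as in the text
print(obs_likelihood(12, x_grid)[[10, 11, 12, 13, 14]].round(3))

def sample_noisy_count(x, rng, y_max=400):
    """Draw an observed count y >= 0 for a true count x, renormalising (13) over y."""
    p = 2.0 ** (-np.abs(np.arange(y_max) - x)) + 1e-6
    return int(rng.choice(y_max, p=p / p.sum()))

rng = np.random.default_rng(1)
print([sample_noisy_count(3, rng) for _ in range(10)])
```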
66 While in theory each species can have an arbitrarily large number of individuals, in order to solve the differential equations (2) and (10) we have to truncate the process. [sent-139, score-0.431]
67 While the truncation threshold could be viewed as another parameter and optimised variationally, in these experiments we took a more heuristic approach and limited the maximum number of individuals of each species to 200. [sent-140, score-0.498]
68 This was justified by considering that an exponential growth pattern fitted to the available data led to an estimate of approximately 90 individuals in the most abundant species, well below the truncation threshold. [sent-141, score-0.11]
69 The solid line is the mean of the approximating distribution, the dashed lines are the 90% confidence intervals, the dotted line is the true path from which the data was obtained. [sent-143, score-0.103]
70 Figure 1: MF approximation to posterior LV process: (a) predator population and (b) prey population (population count against time t in both panels). [sent-147, score-0.363]
71 While the process is well approximated in the area where data is present, the free-form prediction is less good, especially for the predator population. [sent-152, score-0.194]
72 The approximate posterior displays nontrivial emerging properties: for example, we predict that there is a 10% chance that the prey population will become extinct at the end of the period of interest. [sent-154, score-0.286]
73 4 Example: gene autoregulatory network As a second example we consider a gene autoregulatory network. [sent-160, score-0.302]
74 The system consists again of two species, mRNA and protein; the process rates are given by $f_{\mathrm{RNA}}(x+1 \mid x, y) = \alpha\,(1 - 0.99\,\Theta(y - y_c))$, [sent-163, score-0.235]
75 $f_{\mathrm{RNA}}(x-1 \mid x, y) = \beta x$, $f_p(y+1 \mid x, y) = \gamma x$, $f_p(y-1 \mid x, y) = \delta y$ (14), where $\Theta$ is the Heaviside step function, y the protein number and x the mRNA number. [sent-164, score-0.487]
76 The intuitive meaning of these equations is simple: both protein and mRNA decay exponentially. [sent-165, score-0.146]
77 On the other hand, mRNA production depends on protein concentration levels through a logical function: as soon as protein numbers increase beyond a certain critical parameter $y_c$, mRNA production drops dramatically by a factor of 100. [sent-167, score-0.641]
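The switch-like rates (14) are simple to write down explicitly. The sketch below uses the parameter values quoted later in the text; treating the threshold case $\Theta(0) = 1$ as repressed is an assumption of convention, and the function names are illustrative.

```python
# Rate functions (14) for the autoregulatory network: x = mRNA count, y = protein count.
def gene_rates(x, y, alpha=2e-3, beta=6e-5, gamma=5e-4, delta=7e-5, y_c=20):
    theta = 1.0 if y >= y_c else 0.0                    # Heaviside step Theta(y - y_c)
    return {
        "mRNA_birth":    alpha * (1.0 - 0.99 * theta),  # transcription, repressed 100-fold above y_c
        "mRNA_death":    beta * x,                      # mRNA decay
        "protein_birth": gamma * x,                     # translation, proportional to mRNA
        "protein_death": delta * y,                     # protein decay
    }

print(gene_rates(x=12, y=17))   # below the threshold: full transcription rate
print(gene_rates(x=12, y=25))   # above the threshold: transcription reduced by a factor of 100
```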
78 The dependence of the MF bound on the critical parameter $y_c$ is less straightforward and is given by $L_{y_c} = \mathrm{const} + 2 \int_0^T dt\, \bar{g}\, h(y_c) + \log\big(1 - 0.99\,\tfrac{1}{T}\int_0^T h(y_c)\, dt\big) \int_0^T dt\, \bar{g}$ (15), [sent-172, score-0.349]
79 where $\bar{g} = \langle g_{\mathrm{RNA}}(x+1|x) \rangle_{q_{\mathrm{RNA}}}$ and $h(y_c) = \sum_{y \geq y_c} q_p(y)$. [sent-173, score-0.103]
80 We can determine the minimum of (15) by searching over the possible (discrete) values of $y_c$. [sent-175, score-0.317]
81 Figure 2: (a) Mean squared error (MSE) in the estimate of the parameters as a function of the number of observations N for the LV process. [sent-182, score-0.347]
82 (b) Negative variational likelihood bound for the gene autoregulatory network as a function of the critical parameter $y_c$. [sent-183, score-0.576]
83 Figure 3: MF approximation to posterior autoregulatory network process: (a) protein population and (b) mRNA population. [sent-184, score-0.393]
84 Again, we generated data by simulating the process with parameter values $y_c = 20$, $\alpha = 2 \times 10^{-3}$, $\beta = 6 \times 10^{-5}$, $\gamma = 5 \times 10^{-4}$ and $\delta = 7 \times 10^{-5}$. [sent-186, score-0.104]
85 Fifteen counts were generated for both mRNA and protein, with initial counts of 17 protein and 12 mRNA molecules. [sent-187, score-0.177]
86 The results of the approximate posterior inference are shown in Figure 3. [sent-189, score-0.14]
87 The inferred parameter values are in good agreement with the true values: yc = 19, α = 2. [sent-190, score-0.317]
88 Interestingly, if the data is such that the protein count never exceeds the critical parameter $y_c$, this parameter becomes unidentifiable (the likelihood bound is optimised by $y_c = \infty$ or $y_c = 0$), as may be expected. [sent-195, score-0.317]
89 5 Discussion In this contribution we have shown how an MF approximation can be used to perform posterior inference in MJPs from discretely observed noisy data. [sent-197, score-0.237]
90 While these MCMC-based approaches focus on sampling from the distribution of reactions happening in a small interval in time, we compute an approximation to the probability distribution over possible paths of the system. [sent-200, score-0.146]
91 The factorisation assumption implies that the computational complexity grows linearly in the number of species D; it is unclear how MCMC would scale to larger systems. [sent-203, score-0.352]
92 An alternative suggestion, proposed in [11], was to seek a middle way between an MJP and a deterministic, ODE-based approach by approximating the MJP with a continuous stochastic process, i.e. a diffusion. [sent-204, score-0.122]
93 While these authors show that this approximation works reasonably well for inference purposes, it is worth pointing out that the population sizes in their experimental results were approximately one order of magnitude larger than in ours. [sent-207, score-0.185]
94 It is arguable that a diffusion approximation might be suitable for population sizes as low as a few hundred, but it cannot be expected to be reasonable for population sizes of the order of 10. [sent-208, score-0.268]
95 The availability of a practical tool for statistical inference in MJPs opens a number of important possible developments for modelling. [sent-209, score-0.103]
96 It would be of interest, for example, to develop mixed models where one species with low counts interacts with another species with high counts that can be modelled using a deterministic or diffusion approximation. [sent-210, score-0.872]
97 All of these extensions rely centrally on the possibility of estimating posterior probabilities, and we expect that the availability of a practical tool for the inference task will be very useful to facilitate this. [sent-214, score-0.14]
98 Stochastic protein expression in individual cells at the single molecule level. [sent-221, score-0.098]
99 Bayesian inference for a discretely observed stochastic kinetic model. [sent-238, score-0.255]
100 Bayesian inference for stochastic kinetic models using a diffusion approximation. [sent-261, score-0.269]
wordName wordTfidf (topN-words)
[('species', 0.352), ('git', 0.317), ('yc', 0.317), ('tl', 0.211), ('qit', 0.204), ('mrna', 0.197), ('mf', 0.196), ('lv', 0.177), ('mjp', 0.158), ('mjps', 0.136), ('kl', 0.133), ('fi', 0.12), ('ppost', 0.113), ('predatort', 0.113), ('preyt', 0.113), ('process', 0.104), ('dt', 0.103), ('protein', 0.098), ('population', 0.093), ('autoregulatory', 0.09), ('kinetic', 0.09), ('pprior', 0.09), ('predator', 0.09), ('jump', 0.09), ('yl', 0.09), ('ln', 0.09), ('xk', 0.087), ('rates', 0.085), ('posterior', 0.08), ('yil', 0.079), ('pt', 0.078), ('individuals', 0.078), ('variational', 0.076), ('stochastic', 0.069), ('diamonds', 0.068), ('fprey', 0.068), ('prey', 0.068), ('opper', 0.063), ('manfred', 0.063), ('gene', 0.061), ('master', 0.06), ('ri', 0.06), ('inference', 0.06), ('chemical', 0.059), ('processes', 0.055), ('approximating', 0.053), ('divergence', 0.051), ('equation', 0.051), ('nitesimal', 0.05), ('path', 0.05), ('diffusion', 0.05), ('mcmc', 0.05), ('discrete', 0.048), ('equations', 0.048), ('qt', 0.048), ('system', 0.046), ('darren', 0.045), ('extinct', 0.045), ('factorises', 0.045), ('frn', 0.045), ('gpredatort', 0.045), ('gpreyt', 0.045), ('guido', 0.045), ('jumped', 0.045), ('reactions', 0.045), ('xtn', 0.045), ('optimise', 0.045), ('counts', 0.044), ('developments', 0.043), ('proteins', 0.043), ('probabilities', 0.043), ('markov', 0.04), ('intervals', 0.04), ('boys', 0.039), ('ex', 0.038), ('coupled', 0.037), ('numbers', 0.036), ('regulatory', 0.036), ('happening', 0.036), ('fp', 0.036), ('optimised', 0.036), ('discretely', 0.036), ('count', 0.035), ('lagrangian', 0.034), ('mse', 0.034), ('state', 0.033), ('paths', 0.033), ('approximation', 0.032), ('minimising', 0.032), ('truncation', 0.032), ('inserting', 0.032), ('critical', 0.032), ('differential', 0.031), ('eld', 0.031), ('production', 0.03), ('observations', 0.03), ('xi', 0.03), ('deterministic', 0.03), ('noisy', 0.029), ('guess', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 214 nips-2007-Variational inference for Markov jump processes
Author: Manfred Opper, Guido Sanguinetti
Abstract: Markov jump processes play an important role in a large number of application domains. However, realistic systems are analytically intractable and they have traditionally been analysed using simulation based techniques, which do not provide a framework for statistical inference. We propose a mean field approximation to perform posterior inference and parameter estimation. The approximation allows a practical solution to the inference problem, while still retaining a good degree of accuracy. We illustrate our approach on two biologically motivated systems.
2 0.18589564 213 nips-2007-Variational Inference for Diffusion Processes
Author: Cédric Archambeau, Manfred Opper, Yuan Shen, Dan Cornford, John S. Shawe-taylor
Abstract: Diffusion processes are a family of continuous-time continuous-state stochastic processes that are in general only partially observed. The joint estimation of the forcing parameters and the system noise (volatility) in these dynamical systems is a crucial, but non-trivial task, especially when the system is nonlinear and multimodal. We propose a variational treatment of diffusion processes, which allows us to compute type II maximum likelihood estimates of the parameters by simple gradient techniques and which is computationally less demanding than most MCMC approaches. We also show how a cheap estimate of the posterior over the parameters can be constructed based on the variational free energy. 1
3 0.10438471 48 nips-2007-Collective Inference on Markov Models for Modeling Bird Migration
Author: M.a. S. Elmohamed, Dexter Kozen, Daniel R. Sheldon
Abstract: We investigate a family of inference problems on Markov models, where many sample paths are drawn from a Markov chain and partial information is revealed to an observer who attempts to reconstruct the sample paths. We present algorithms and hardness results for several variants of this problem which arise by revealing different information to the observer and imposing different requirements for the reconstruction of sample paths. Our algorithms are analogous to the classical Viterbi algorithm for Hidden Markov Models, which finds the single most probable sample path given a sequence of observations. Our work is motivated by an important application in ecology: inferring bird migration paths from a large database of observations. 1
4 0.10212287 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
Author: Omer Bobrowski, Ron Meir, Shy Shoham, Yonina Eldar
Abstract: It is becoming increasingly evident that organisms acting in uncertain dynamical environments often employ exact or approximate Bayesian statistical calculations in order to continuously estimate the environmental state, integrate information from multiple sensory modalities, form predictions and choose actions. What is less clear is how these putative computations are implemented by cortical neural networks. An additional level of complexity is introduced because these networks observe the world through spike trains received from primary sensory afferents, rather than directly. A recent line of research has described mechanisms by which such computations can be implemented using a network of neurons whose activity directly represents a probability distribution across the possible “world states”. Much of this work, however, uses various approximations, which severely restrict the domain of applicability of these implementations. Here we make use of rigorous mathematical results from the theory of continuous time point process filtering, and show how optimal real-time state estimation and prediction may be implemented in a general setting using linear neural networks. We demonstrate the applicability of the approach with several examples, and relate the required network properties to the statistical nature of the environment, thereby quantifying the compatibility of a given network with its environment. 1
5 0.08011087 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
Author: Matthew Hoffman, Arnaud Doucet, Nando D. Freitas, Ajay Jasra
Abstract: A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. In this paper, we begin by making the crucial observation that the stochastic control problem can be reinterpreted as one of trans-dimensional inference. With this new interpretation, we are able to propose a novel reversible jump Markov chain Monte Carlo (MCMC) algorithm that is more efficient than its EM counterparts. Moreover, it enables us to implement full Bayesian policy search, without the need for gradients and with one single Markov chain. The new approach involves sampling directly from a distribution that is proportional to the reward and, consequently, performs better than classic simulations methods in situations where the reward is a rare event.
6 0.077637009 167 nips-2007-Regulator Discovery from Gene Expression Time Series of Malaria Parasites: a Hierachical Approach
7 0.071913838 176 nips-2007-Sequential Hypothesis Testing under Stochastic Deadlines
8 0.06706398 87 nips-2007-Fast Variational Inference for Large-scale Internet Diagnosis
9 0.066542618 203 nips-2007-The rat as particle filter
10 0.063228719 103 nips-2007-Inferring Elapsed Time from Stochastic Neural Processes
11 0.059178378 31 nips-2007-Bayesian Agglomerative Clustering with Coalescents
12 0.05759551 162 nips-2007-Random Sampling of States in Dynamic Programming
13 0.057144169 140 nips-2007-Neural characterization in partially observed populations of spiking neurons
14 0.056485374 47 nips-2007-Collapsed Variational Inference for HDP
15 0.054164998 59 nips-2007-Continuous Time Particle Filtering for fMRI
16 0.053778201 82 nips-2007-Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
17 0.051814456 145 nips-2007-On Sparsity and Overcompleteness in Image Models
18 0.051781315 174 nips-2007-Selecting Observations against Adversarial Objectives
19 0.051635548 33 nips-2007-Bayesian Inference for Spiking Neuron Models with a Sparsity Prior
20 0.05124576 170 nips-2007-Robust Regression with Twinned Gaussian Processes
topicId topicWeight
[(0, -0.179), (1, -0.02), (2, 0.048), (3, -0.031), (4, -0.017), (5, -0.052), (6, -0.056), (7, -0.057), (8, -0.097), (9, -0.15), (10, 0.007), (11, 0.069), (12, -0.05), (13, 0.037), (14, 0.068), (15, 0.048), (16, -0.053), (17, 0.097), (18, 0.01), (19, -0.066), (20, 0.1), (21, 0.07), (22, -0.115), (23, -0.001), (24, 0.047), (25, -0.04), (26, -0.046), (27, 0.065), (28, -0.058), (29, -0.068), (30, -0.073), (31, -0.02), (32, 0.219), (33, -0.08), (34, 0.203), (35, 0.06), (36, 0.121), (37, -0.021), (38, 0.105), (39, 0.031), (40, -0.143), (41, -0.022), (42, -0.154), (43, -0.03), (44, 0.041), (45, 0.007), (46, 0.004), (47, -0.132), (48, 0.066), (49, -0.057)]
simIndex simValue paperId paperTitle
same-paper 1 0.9441756 214 nips-2007-Variational inference for Markov jump processes
Author: Manfred Opper, Guido Sanguinetti
Abstract: Markov jump processes play an important role in a large number of application domains. However, realistic systems are analytically intractable and they have traditionally been analysed using simulation based techniques, which do not provide a framework for statistical inference. We propose a mean field approximation to perform posterior inference and parameter estimation. The approximation allows a practical solution to the inference problem, while still retaining a good degree of accuracy. We illustrate our approach on two biologically motivated systems.
2 0.73441958 213 nips-2007-Variational Inference for Diffusion Processes
Author: Cédric Archambeau, Manfred Opper, Yuan Shen, Dan Cornford, John S. Shawe-taylor
Abstract: Diffusion processes are a family of continuous-time continuous-state stochastic processes that are in general only partially observed. The joint estimation of the forcing parameters and the system noise (volatility) in these dynamical systems is a crucial, but non-trivial task, especially when the system is nonlinear and multimodal. We propose a variational treatment of diffusion processes, which allows us to compute type II maximum likelihood estimates of the parameters by simple gradient techniques and which is computationally less demanding than most MCMC approaches. We also show how a cheap estimate of the posterior over the parameters can be constructed based on the variational free energy. 1
3 0.52243108 59 nips-2007-Continuous Time Particle Filtering for fMRI
Author: Lawrence Murray, Amos J. Storkey
Abstract: We construct a biologically motivated stochastic differential model of the neural and hemodynamic activity underlying the observed Blood Oxygen Level Dependent (BOLD) signal in Functional Magnetic Resonance Imaging (fMRI). The model poses a difficult parameter estimation problem, both theoretically due to the nonlinearity and divergence of the differential system, and computationally due to its time and space complexity. We adapt a particle filter and smoother to the task, and discuss some of the practical approaches used to tackle the difficulties, including use of sparse matrices and parallelisation. Results demonstrate the tractability of the approach in its application to an effective connectivity study. 1
4 0.51561064 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding
Author: Omer Bobrowski, Ron Meir, Shy Shoham, Yonina Eldar
Abstract: It is becoming increasingly evident that organisms acting in uncertain dynamical environments often employ exact or approximate Bayesian statistical calculations in order to continuously estimate the environmental state, integrate information from multiple sensory modalities, form predictions and choose actions. What is less clear is how these putative computations are implemented by cortical neural networks. An additional level of complexity is introduced because these networks observe the world through spike trains received from primary sensory afferents, rather than directly. A recent line of research has described mechanisms by which such computations can be implemented using a network of neurons whose activity directly represents a probability distribution across the possible “world states”. Much of this work, however, uses various approximations, which severely restrict the domain of applicability of these implementations. Here we make use of rigorous mathematical results from the theory of continuous time point process filtering, and show how optimal real-time state estimation and prediction may be implemented in a general setting using linear neural networks. We demonstrate the applicability of the approach with several examples, and relate the required network properties to the statistical nature of the environment, thereby quantifying the compatibility of a given network with its environment. 1
5 0.50017995 48 nips-2007-Collective Inference on Markov Models for Modeling Bird Migration
Author: M.a. S. Elmohamed, Dexter Kozen, Daniel R. Sheldon
Abstract: We investigate a family of inference problems on Markov models, where many sample paths are drawn from a Markov chain and partial information is revealed to an observer who attempts to reconstruct the sample paths. We present algorithms and hardness results for several variants of this problem which arise by revealing different information to the observer and imposing different requirements for the reconstruction of sample paths. Our algorithms are analogous to the classical Viterbi algorithm for Hidden Markov Models, which finds the single most probable sample path given a sequence of observations. Our work is motivated by an important application in ecology: inferring bird migration paths from a large database of observations. 1
6 0.47483015 31 nips-2007-Bayesian Agglomerative Clustering with Coalescents
7 0.45785058 167 nips-2007-Regulator Discovery from Gene Expression Time Series of Malaria Parasites: a Hierachical Approach
8 0.45662978 87 nips-2007-Fast Variational Inference for Large-scale Internet Diagnosis
9 0.43989643 176 nips-2007-Sequential Hypothesis Testing under Stochastic Deadlines
10 0.42031556 47 nips-2007-Collapsed Variational Inference for HDP
11 0.40530378 174 nips-2007-Selecting Observations against Adversarial Objectives
12 0.38719764 163 nips-2007-Receding Horizon Differential Dynamic Programming
13 0.35869625 203 nips-2007-The rat as particle filter
14 0.35432035 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
15 0.33160776 170 nips-2007-Robust Regression with Twinned Gaussian Processes
16 0.32863215 178 nips-2007-Simulated Annealing: Rigorous finite-time guarantees for optimization on continuous domains
17 0.32162571 139 nips-2007-Nearest-Neighbor-Based Active Learning for Rare Category Detection
18 0.31365678 208 nips-2007-TrueSkill Through Time: Revisiting the History of Chess
19 0.3058714 82 nips-2007-Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization
20 0.30329674 195 nips-2007-The Generalized FITC Approximation
topicId topicWeight
[(5, 0.036), (13, 0.027), (16, 0.049), (18, 0.059), (21, 0.068), (31, 0.415), (34, 0.033), (35, 0.014), (46, 0.01), (47, 0.063), (49, 0.014), (83, 0.079), (85, 0.017), (90, 0.041)]
simIndex simValue paperId paperTitle
1 0.83598995 162 nips-2007-Random Sampling of States in Dynamic Programming
Author: Chris Atkeson, Benjamin Stephens
Abstract: We combine three threads of research on approximate dynamic programming: sparse random sampling of states, value function and policy approximation using local models, and using local trajectory optimizers to globally optimize a policy and associated value function. Our focus is on finding steady state policies for deterministic time invariant discrete time control problems with continuous states and actions often found in robotics. In this paper we show that we can now solve problems we couldn’t solve previously. 1
same-paper 2 0.80059481 214 nips-2007-Variational inference for Markov jump processes
Author: Manfred Opper, Guido Sanguinetti
Abstract: Markov jump processes play an important role in a large number of application domains. However, realistic systems are analytically intractable and they have traditionally been analysed using simulation based techniques, which do not provide a framework for statistical inference. We propose a mean field approximation to perform posterior inference and parameter estimation. The approximation allows a practical solution to the inference problem, while still retaining a good degree of accuracy. We illustrate our approach on two biologically motivated systems.
3 0.76884258 144 nips-2007-On Ranking in Survival Analysis: Bounds on the Concordance Index
Author: Harald Steck, Balaji Krishnapuram, Cary Dehing-oberije, Philippe Lambin, Vikas C. Raykar
Abstract: In this paper, we show that classical survival analysis involving censored data can naturally be cast as a ranking problem. The concordance index (CI), which quantifies the quality of rankings, is the standard performance measure for model assessment in survival analysis. In contrast, the standard approach to learning the popular proportional hazard (PH) model is based on Cox’s partial likelihood. We devise two bounds on CI–one of which emerges directly from the properties of PH models–and optimize them directly. Our experimental results suggest that all three methods perform about equally well, with our new approach giving slightly better results. We also explain why a method designed to maximize the Cox’s partial likelihood also ends up (approximately) maximizing the CI. 1
4 0.6828323 80 nips-2007-Ensemble Clustering using Semidefinite Programming
Author: Vikas Singh, Lopamudra Mukherjee, Jiming Peng, Jinhui Xu
Abstract: We consider the ensemble clustering problem where the task is to ‘aggregate’ multiple clustering solutions into a single consolidated clustering that maximizes the shared information among given clustering solutions. We obtain several new results for this problem. First, we note that the notion of agreement under such circumstances can be better captured using an agreement measure based on a 2D string encoding rather than voting strategy based methods proposed in literature. Using this generalization, we first derive a nonlinear optimization model to maximize the new agreement measure. We then show that our optimization problem can be transformed into a strict 0-1 Semidefinite Program (SDP) via novel convexification techniques which can subsequently be relaxed to a polynomial time solvable SDP. Our experiments indicate improvements not only in terms of the proposed agreement measure but also the existing agreement measures based on voting strategies. We discuss evaluations on clustering and image segmentation databases. 1
5 0.47165316 163 nips-2007-Receding Horizon Differential Dynamic Programming
Author: Yuval Tassa, Tom Erez, William D. Smart
Abstract: The control of high-dimensional, continuous, non-linear dynamical systems is a key problem in reinforcement learning and control. Local, trajectory-based methods, using techniques such as Differential Dynamic Programming (DDP), are not directly subject to the curse of dimensionality, but generate only local controllers. In this paper,we introduce Receding Horizon DDP (RH-DDP), an extension to the classic DDP algorithm, which allows us to construct stable and robust controllers based on a library of local-control trajectories. We demonstrate the effectiveness of our approach on a series of high-dimensional problems using a simulated multi-link swimming robot. These experiments show that our approach effectively circumvents dimensionality issues, and is capable of dealing with problems of (at least) 24 state and 9 action dimensions. 1
6 0.44180185 168 nips-2007-Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
7 0.42485988 174 nips-2007-Selecting Observations against Adversarial Objectives
8 0.4112536 34 nips-2007-Bayesian Policy Learning with Trans-Dimensional MCMC
9 0.39348742 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI
10 0.3932133 213 nips-2007-Variational Inference for Diffusion Processes
11 0.38934961 105 nips-2007-Infinite State Bayes-Nets for Structured Domains
12 0.38745931 100 nips-2007-Hippocampal Contributions to Control: The Third Way
13 0.38637373 52 nips-2007-Competition Adds Complexity
14 0.3863408 127 nips-2007-Measuring Neural Synchrony by Message Passing
15 0.38630661 87 nips-2007-Fast Variational Inference for Large-scale Internet Diagnosis
16 0.38241103 76 nips-2007-Efficient Convex Relaxation for Transductive Support Vector Machine
17 0.3797864 185 nips-2007-Stable Dual Dynamic Programming
18 0.37917864 209 nips-2007-Ultrafast Monte Carlo for Statistical Summations
19 0.37912461 94 nips-2007-Gaussian Process Models for Link Analysis and Transfer Learning
20 0.37813476 89 nips-2007-Feature Selection Methods for Improving Protein Structure Prediction with Rosetta