nips nips2006 nips2006-135 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Neil D. Lawrence, Guido Sanguinetti, Magnus Rattray
Abstract: Modelling the dynamics of transcriptional processes in the cell requires the knowledge of a number of key biological quantities. While some of them are relatively easy to measure, such as mRNA decay rates and mRNA abundance levels, it is still very hard to measure the active concentration levels of the transcription factor proteins that drive the process and the sensitivity of target genes to these concentrations. In this paper we show how these quantities for a given transcription factor can be inferred from gene expression levels of a set of known target genes. We treat the protein concentration as a latent function with a Gaussian process prior, and include the sensitivities, mRNA decay rates and baseline expression levels as hyperparameters. We apply this procedure to a human leukemia dataset, focusing on the tumour repressor p53 and obtaining results in good accordance with recent biological studies.
Reference: text
sentIndex sentText sentNum sentScore
1 Modelling transcriptional regulation using Gaussian processes Neil D. [sent-1, score-0.154]
2 Abstract Modelling the dynamics of transcriptional processes in the cell requires the knowledge of a number of key biological quantities. [sent-17, score-0.207]
3 While some of them are relatively easy to measure, such as mRNA decay rates and mRNA abundance levels, it is still very hard to measure the active concentration levels of the transcription factor proteins that drive the process and the sensitivity of target genes to these concentrations. [sent-18, score-0.977]
4 In this paper we show how these quantities for a given transcription factor can be inferred from gene expression levels of a set of known target genes. [sent-19, score-0.678]
5 We treat the protein concentration as a latent function with a Gaussian process prior, and include the sensitivities, mRNA decay rates and baseline expression levels as hyperparameters. [sent-20, score-0.562]
6 We apply this procedure to a human leukemia dataset, focusing on the tumour repressor p53 and obtaining results in good accordance with recent biological studies. [sent-21, score-0.114]
7 Microarray technology now allows measurement of mRNA abundance on a genomewide scale, and techniques such as chromatin immunoprecipitation (ChIP) have largely unveiled the wiring of the cellular transcriptional regulatory network, identifying which genes are bound by which transcription factors. [sent-23, score-0.761]
8 mRNA decay rates), most of them are very hard to measure with current techniques, and have therefore to be inferred from the available data. [sent-27, score-0.115]
9 One can formulate a large scale simplified model of regulation (for example assuming a linear response to protein concentrations) and then combine network architecture data and gene expression data to infer transcription factors’ protein concentrations on a genome-wide scale. [sent-29, score-1.027]
10 Alternatively, one can formulate a realistic model of a small subnetwork where few transcription factors regulate a small number of established target genes, trying to include the finer points of the dynamics of transcriptional regulation. [sent-31, score-0.513]
11 In this paper we follow the second approach, focussing on the simplest subnetwork consisting of one transcription factor regulating its target genes, but using a detailed model of the interaction dynamics to infer the transcription factor concentrations and the gene-specific constants. [sent-32, score-0.737]
12 In these studies, parametric models were developed describing the rate of production of certain genes as a function of the concentration of transcription factor protein at some specified time points. [sent-36, score-0.864]
13 Markov chain Monte Carlo (MCMC) methods were then used to carry out Bayesian inference of the protein concentrations, requiring substantial computational resources and limiting the inference to the discrete time-points where the data was collected. [sent-37, score-0.234]
14 We show here how a Gaussian process model provides a simple and computationally efficient method for Bayesian inference of continuous transcription factor concentration profiles and associated model parameters. [sent-38, score-0.6]
15 Firstly, it allows for the inference of continuous quantities (concentration profiles) without discretization, therefore accounting naturally for the temporal structure of the data. [sent-41, score-0.062]
16 Secondly, it avoids the use of cumbersome interpolation techniques to estimate mRNA production rates from mRNA abundance data, and it allows us to deal naturally with the noise inherent in the measurements. [sent-42, score-0.202]
17 Finally, it greatly outstrips MCMC techniques in terms of computational efficiency, which we expect to be crucial in future extensions to more complex (and realistic) regulatory networks. [sent-43, score-0.094]
18 The paper is organised as follows: in the first section we discuss linear response models. [sent-44, score-0.091]
19 These are simplified models in which the mRNA production rate depends linearly on the transcription factor protein concentration. [sent-45, score-0.629]
20 Although the linear assumption is not verified in practice, it has the advantage of giving rise to an exactly tractable inference problem. [sent-46, score-0.054]
21 We then discuss how to extend the formalism to model cases where the dependence of mRNA production rate on transcription factor protein concentration is not linear, and propose a MAP-Laplace approach to carry out Bayesian inference. [sent-47, score-0.777]
22 1 Linear Response Model Let the data set under consideration consist of T measurements of the mRNA abundance of N genes. [sent-51, score-0.121]
23 We consider a linear differential equation that relates a given gene j's expression level $x_j(t)$ at time $t$ to the concentration of the regulating transcription factor protein $f(t)$: $\frac{dx_j}{dt} = B_j + S_j f(t) - D_j x_j(t)$. (1) [sent-23, score-1.088]
24 Here, $B_j$ is the basal transcription rate of gene $j$, $S_j$ is the sensitivity of gene $j$ to the transcription factor and $D_j$ is the decay rate of the mRNA. [sent-24, score-1.059]
25 Crucially, the dependence of the mRNA transcription rate on the protein concentration (response) is linear. [sent-54, score-0.675]
26 Assuming a linear response is a crude simplification, but it can still lead to interesting results in certain modelling situations. [sent-55, score-0.132]
27 [1] to model a simple network consisting of the tumour suppressor transcription factor p53 and five of its target genes. [sent-57, score-0.522]
28 The equation given in (1) can be solved to recover $x_j(t) = \frac{B_j}{D_j} + k_j \exp(-D_j t) + S_j \exp(-D_j t) \int_0^t f(u) \exp(D_j u)\, du$ (2), where $k_j$ arises from the initial conditions, and is zero if we assume an initial baseline expression level $x_j(0) = B_j / D_j$. [sent-28, score-0.532]
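As a sanity check on equations (1) and (2), the following minimal Python sketch integrates the differential equation numerically and compares the result with the closed-form solution, taking k_j = 0 (that is, x_j(0) = B_j/D_j). The function names, the quadrature and Runge-Kutta choices, and any profile f or parameter values passed in are assumptions made for illustration; none of them come from the paper.

```python
import math

def closed_form_expression(f, B, S, D, t, n=2000):
    """Equation (2) with k_j = 0:
    x(t) = B/D + S * exp(-D t) * integral_0^t f(u) exp(D u) du,
    with the integral approximated by the trapezoid rule on n intervals."""
    h = t / n
    total = 0.5 * (f(0.0) + f(t) * math.exp(D * t))
    for i in range(1, n):
        u = i * h
        total += f(u) * math.exp(D * u)
    return B / D + S * math.exp(-D * t) * total * h

def integrate_ode(f, B, S, D, t, n=2000):
    """Fourth-order Runge-Kutta integration of equation (1),
    dx/dt = B + S f(t) - D x, started at the baseline x(0) = B/D."""
    h = t / n
    x = B / D
    rhs = lambda s, x_s: B + S * f(s) - D * x_s
    for i in range(n):
        s = i * h
        k1 = rhs(s, x)
        k2 = rhs(s + h / 2, x + h * k1 / 2)
        k3 = rhs(s + h / 2, x + h * k2 / 2)
        k4 = rhs(s + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x
```

For a smooth hypothetical profile such as f(u) = exp(-(u - 4)^2 / 4), the two functions should agree to within the quadrature error, and with f identically zero the closed form returns the baseline B/D.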
29 We will model the protein concentration f as a latent function drawn from a Gaussian process prior distribution. [sent-60, score-0.439]
30 This implies immediately that the mRNA abundance levels will also be modelled as a Gaussian process, and the covariance function of the marginal distribution $p(x_1, \ldots, x_N)$ [sent-30, score-0.229]
31 can be worked out explicitly from the covariance function of the latent function $f$. [sent-31, score-0.113]
32 If the covariance function associated with $f(t)$ is given by $k_{ff}(t, t')$ then elementary functional analysis yields that $\mathrm{cov}(L_j[f](t), L_k[f](t')) = L_j \otimes L_k [k_{ff}](t, t')$. [sent-32, score-0.204]
33 Explicitly, this is given by the following formula: $k_{x_j x_k}(t, t') = S_j S_k \exp(-D_j t - D_k t') \int_0^t \exp(D_j u) \int_0^{t'} \exp(D_k u')\, k_{ff}(u, u')\, du'\, du$. (4) [sent-33, score-0.452]
34 If the process prior over $f(t)$ is taken to be a squared exponential kernel, $k_{ff}(t, t') = \exp\left(-\frac{(t - t')^2}{l^2}\right)$, where $l$ controls the width of the basis functions, the integrals in equation (4) can be computed analytically. [sent-34, score-0.399]
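35 The resulting covariances are obtained as $k_{x_j x_k}(t, t') = S_j S_k \frac{\sqrt{\pi}\, l}{2} \left[ h_{kj}(t', t) + h_{jk}(t, t') \right]$ (5), where $h_{kj}(t', t) = \frac{\exp(\gamma_k^2)}{D_j + D_k} \left\{ \exp[-D_k (t' - t)] \left[ \mathrm{erf}\left(\frac{t' - t}{l} - \gamma_k\right) + \mathrm{erf}\left(\frac{t}{l} + \gamma_k\right) \right] - \exp[-(D_k t' + D_j t)] \left[ \mathrm{erf}\left(\frac{t'}{l} - \gamma_k\right) + \mathrm{erf}(\gamma_k) \right] \right\}$. Here $\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-y^2)\, dy$ and $\gamma_k = \frac{D_k l}{2}$. [sent-35, score-0.631]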
36 To infer the protein concentration levels, one also needs the “cross-covariance” terms between $x_j(t)$ and $f(t')$, which are obtained as $k_{x_j f}(t, t') = S_j \exp(-D_j t) \int_0^t \exp(D_j u)\, k_{ff}(u, t')\, du$. (6) [sent-36, score-0.749]
37 Again, this can be obtained explicitly for squared exponential priors on the latent function $f$ as $k_{x_j f}(t', t) = \frac{\sqrt{\pi}\, l S_j}{2} \exp(\gamma_j^2) \exp[-D_j (t' - t)] \left[ \mathrm{erf}\left(\frac{t' - t}{l} - \gamma_j\right) + \mathrm{erf}\left(\frac{t}{l} + \gamma_j\right) \right]$. [sent-37, score-0.503]
38 Standard Gaussian process regression techniques [see e. [sent-38, score-0.098]
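The closed-form cross-covariance can be checked against the integral in equation (6) with a short Python sketch. It uses the squared exponential kernel exp(-(t - t')^2 / l^2) and gamma_j = D_j l / 2 as defined in the text. The function names and the argument names t_x (time at which x_j is evaluated) and t_f (time at which f is evaluated) are hypothetical, and the trapezoid rule is merely one convenient way to approximate the integral.

```python
import math

def k_ff(t, t_prime, l):
    """Squared exponential covariance of the latent function f."""
    return math.exp(-((t - t_prime) ** 2) / l ** 2)

def k_xf_closed(t_x, t_f, S, D, l):
    """Closed-form covariance between x_j(t_x) and f(t_f)."""
    gamma = D * l / 2.0
    return (0.5 * math.sqrt(math.pi) * l * S * math.exp(gamma ** 2)
            * math.exp(-D * (t_x - t_f))
            * (math.erf((t_x - t_f) / l - gamma) + math.erf(t_f / l + gamma)))

def k_xf_quadrature(t_x, t_f, S, D, l, n=4000):
    """Equation (6) evaluated numerically:
    S exp(-D t_x) * integral_0^{t_x} exp(D u) k_ff(u, t_f) du (trapezoid rule)."""
    h = t_x / n
    total = 0.5 * (k_ff(0.0, t_f, l) + math.exp(D * t_x) * k_ff(t_x, t_f, l))
    for i in range(1, n):
        u = i * h
        total += math.exp(D * u) * k_ff(u, t_f, l)
    return S * math.exp(-D * t_x) * total * h
```

Agreement between the two functions for a few arbitrary inputs is a cheap way to catch sign or argument-order slips when transcribing the erf expression.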
39 Alternatively, they can be assigned vague gamma prior distributions and estimated a posteriori using MCMC sampling. [sent-81, score-0.069]
40 In practice, we will allow the mRNA abundance of each gene at each time point to be corrupted by some noise, so that we can model the observations at times $t_i$ for $i = 1, \ldots, T$ as [sent-40, score-0.453]
41 $y_j(t_i) = x_j(t_i) + \epsilon_j(t_i)$ (8) with $\epsilon_j(t_i) \sim \mathcal{N}(0, \sigma_{ji}^2)$. [sent-41, score-0.153]
42 The covariance of the noisy process is simply obtained as $K_{yy} = \Sigma + K_{xx}$, with $\Sigma = \mathrm{diag}(\sigma_{11}^2, \ldots)$. [sent-42, score-0.087]
43 Also, all the quantities in equation (1) are positive, but one cannot constrain samples from a Gaussian process to be positive. [sent-98, score-0.087]
44 Modelling the response of the transcription rate to protein concentration using a positive nonlinear function is an elegant way to enforce this constraint. [sent-99, score-0.766]
45 In this case the induced distribution of xj (t) is no longer a Gaussian process. [sent-102, score-0.082]
46 The functional derivative of the log-likelihood with respect to $f$ is then obtained as $\frac{\delta \log p(Y|f)}{\delta f(t)} = -\sum_{i=1}^{T} \Theta(t_i - t) \sum_{j=1}^{N} \frac{x_j(t_i) - y_j(t_i)}{\sigma_{ji}^2}\, g'(f(t))\, e^{-D_j (t_i - t)}$ (11), where $\Theta(x)$ is the Heaviside step function and we have omitted the model parameters for brevity. [sent-46, score-0.106]
47 In these and the following formulae ti is understood to mean the index of the grid point corresponding to the ith data point, whereas tp and tq correspond to the grid points themselves. [sent-115, score-0.595]
48 We can then compute the gradient and Hessian of the (discretised) un-normalised log posterior $\Psi(f) = \log p(Y|f) + \log p(f)$ [see 8, chapter 3]: $\nabla \Psi(f) = \nabla \log p(Y|f) - K^{-1} f$ and $\nabla\nabla \Psi(f) = -(W + K^{-1})$ (15), where $K$ is the prior covariance matrix evaluated at the grid points. [sent-48, score-0.276]
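Given the gradient and Hessian in equation (15), one standard way to locate the MAP estimate is Newton's method, as described in [8, chapter 3]. The extracted text does not state which optimiser the authors used, so the sketch below is an assumption. It rewrites the Newton update as f_new = (I + K W)^{-1} K (W f + grad log p(Y|f)), which avoids forming K^{-1} explicitly. The helper names (solve, matvec, newton_step) are hypothetical, and the plain-Python linear solver is only meant for small illustrative problems.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def newton_step(K, f, grad, W):
    """One Newton update on Psi(f).

    K    : prior covariance on the grid
    f    : current estimate of the latent function on the grid
    grad : gradient of log p(Y|f) at f
    W    : negative Hessian of log p(Y|f) at f
    """
    n = len(f)
    KW = [[sum(K[i][k] * W[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    lhs = [[KW[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    rhs = matvec(K, [wf + g for wf, g in zip(matvec(W, f), grad)])
    return solve(lhs, rhs)
```

A useful check is the Gaussian-likelihood case, where W is diagonal with entries 1/sigma^2 and a single step from any starting point should land on the usual Gaussian process posterior mean K (K + sigma^2 I)^{-1} y.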
49 The Laplace approximation to the log-marginal likelihood is then (ignoring terms that do not involve model parameters) $\log p(Y) \simeq \log p(Y|\hat{f}) - \frac{1}{2} \hat{f}^T K^{-1} \hat{f} - \frac{1}{2} \log |I + KW|$. (16) [sent-49, score-0.105]
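50 The gradient of the log-marginal with respect to the kernel parameters is [8] $\frac{\partial \log p(Y|\Xi)}{\partial \Xi} = \frac{1}{2} \hat{f}^T K^{-1} \frac{\partial K}{\partial \Xi} K^{-1} \hat{f} - \frac{1}{2} \mathrm{tr}\left( (I + KW)^{-1} W \frac{\partial K}{\partial \Xi} \right) + \sum_p \frac{\partial \log p(Y|\Xi)}{\partial \hat{f}_p} \frac{\partial \hat{f}_p}{\partial \Xi}$ (17), where the final term is due to the implicit dependence of $\hat{f}$ on $\Xi$. [sent-50, score-0.324]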
51 3 Example: exponential response As an example, we consider the case in which $g(f(t), \theta_j) = S_j \exp(f(t))$ (18), which provides a useful way of constraining the protein concentration to be positive. [sent-51, score-0.538]
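52 Substituting equation (18) in equations (13) and (14) one obtains $\frac{\partial \log p(Y|f)}{\partial f_p} = -\Delta \sum_{i=1}^{T} \sum_{j=1}^{N} \Theta(t_i - t_p)\, \frac{x_j(t_i) - y_j(t_i)}{\sigma_{ji}^2}\, S_j e^{f_p - D_j (t_i - t_p)}$ and $W_{pq} = -\delta_{pq} \frac{\partial \log p(Y|f)}{\partial f_p} + \Delta^2 \sum_{i=1}^{T} \sum_{j=1}^{N} \Theta(t_i - t_p)\, \Theta(t_i - t_q)\, \sigma_{ji}^{-2} S_j^2 e^{f_p + f_q - D_j (2 t_i - t_p - t_q)}$. [sent-52, score-0.674]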
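The gradient expression for the exponential response can be verified against finite differences with a small Python sketch. Several things here are assumptions rather than details taken from the text: the discretised model x_j(t_i) = B_j/D_j + Delta * sum_p Theta(t_i - t_p) S_j exp(f_p - D_j (t_i - t_p)), the convention Theta(0) = 1, a single noise level sigma shared by all observations, and all function names. Observation times are given as grid indices, in line with the convention that t_i denotes the grid point corresponding to the ith data point.

```python
import math

def predicted_expression(f, delta, idx, B, S, D):
    """Discretised x_j(t_i) for an observation at grid index idx (t_i = idx * delta)."""
    total = 0.0
    for p in range(idx + 1):  # Heaviside step, assuming Theta(0) = 1
        total += S * math.exp(f[p] - D * (idx - p) * delta)
    return B / D + delta * total

def log_likelihood(f, delta, obs_idx, y, genes, sigma):
    """Gaussian log-likelihood up to an additive constant.
    genes is a list of (B, S, D) tuples; y[j][i] is gene j at observation i."""
    ll = 0.0
    for j, (B, S, D) in enumerate(genes):
        for i, idx in enumerate(obs_idx):
            r = predicted_expression(f, delta, idx, B, S, D) - y[j][i]
            ll -= 0.5 * r * r / sigma ** 2
    return ll

def gradient(f, delta, obs_idx, y, genes, sigma):
    """Analytic gradient of the log-likelihood with respect to each f_p."""
    grad = [0.0] * len(f)
    for j, (B, S, D) in enumerate(genes):
        for i, idx in enumerate(obs_idx):
            r = predicted_expression(f, delta, idx, B, S, D) - y[j][i]
            for p in range(idx + 1):
                grad[p] -= (delta * r / sigma ** 2
                            * S * math.exp(f[p] - D * (idx - p) * delta))
    return grad
```

Comparing gradient(...) with central differences of log_likelihood(...) on arbitrary inputs is a quick guard against sign and indexing errors before the gradient is handed to an optimiser.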
53 ∂fp i=1 j=1 The terms required in equation (17) are, ∂ log p(Y |Ξ) 1 = −(AW )pp − ˆ 2 ∂ fp where A = (W + K −1 −1 ) . [sent-124, score-0.19]
54 Aqq Wqp q ˆ ∂f ∂K = AK −1 ∂Ξ ∂Ξ ˆ log p(Y |f ) , 3 Results To test the efficacy of our method, we used a recently published biological data set which was studied using a linear response model by Barenco et al. [sent-125, score-0.192]
55 This study focused on the tumour suppressor protein p53. [sent-127, score-0.269]
56 mRNA abundance was measured at regular intervals in three independent human cell lines using Affymetrix U133A oligonucleotide microarrays. [sent-128, score-0.156]
57 The authors then restricted their interest to five known target genes of p53: DDB2, p21, SESN1/hPA26, BIK and TNFRSF10b. [sent-129, score-0.119]
58 They estimated the mRNA production rates by using quadratic interpolation between any three consecutive time points. [sent-130, score-0.081]
59 They then discretised the model and used MCMC sampling (assuming a log-normal noise model) to obtain estimates of the model parameters Bj , Sj , Dj and f (t). [sent-131, score-0.087]
60 To make the model identifiable, the value of the mRNA decay of one of the target genes, p21, was measured experimentally. [sent-132, score-0.09]
61 Also, the scale of the sensitivities was fixed by choosing p21’s sensitivity to be equal to one, and f (0) was constrained to be zero. [sent-133, score-0.064]
62 Their predictions were then validated by doing explicit protein concentration measurements and growing mutant cell lines where the p53 gene had been knocked out. [sent-134, score-0.444]
63 1 Linear response analysis We first analysed the data using the simple linear response model used by Barenco et al. [sent-136, score-0.218]
64 Raw data was processed using the mmgMOS model of [4], which also provides estimates of the credibility associated with each measurement. [sent-138, score-0.123]
65 Data from the different cell lines were treated as independent instantiations of f but sharing the model parameters {Bj , Sj , Dj , Ξ}. [sent-139, score-0.064]
66 We used a squared exponential covariance function for the prior distribution on the latent function f . [sent-140, score-0.254]
67 The inferred posterior mean function for f , together with 95% confidence intervals, is shown in Figure 1(a). [sent-141, score-0.057]
68 Notice that the right-hand tail of the inferred mean function shows an oscillatory behaviour. [sent-68, score-0.081]
69 We believe that this is an artifact caused by the squared exponential covariance; the steep rise between time zero and time two forces the length scale of the function to be small, hence giving rise to wavy functions [see page 123 in 8]. [sent-146, score-0.156]
70 To avoid this, we repeated the experiment using the “MLP” covariance function for the prior distribution over f [12]. [sent-147, score-0.099]
71 The MLP covariance is obtained as the limiting case of an infinite number of sigmoidal neural networks and has the following covariance function $k(t, t') = \arcsin\left( \frac{w t t' + b}{\sqrt{(w t^2 + b + 1)(w t'^2 + b + 1)}} \right)$ (19), where $w$ and $b$ are parameters known as the weight and the bias variance. [sent-71, score-0.124]
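Equation (19) translates directly into code. The minimal Python sketch below evaluates the MLP covariance for scalar inputs; the function name is hypothetical, and the sketch assumes non-negative w and b so that the argument of arcsin stays strictly inside (-1, 1).

```python
import math

def mlp_covariance(t, t_prime, w, b):
    """MLP covariance of equation (19) for scalar inputs t and t_prime.

    w is the weight variance and b the bias variance (both assumed >= 0).
    """
    numerator = w * t * t_prime + b
    denominator = math.sqrt((w * t * t + b + 1.0)
                            * (w * t_prime * t_prime + b + 1.0))
    return math.asin(numerator / denominator)
```

Unlike the squared exponential, this covariance is not a function of t - t' alone, so it is non-stationary. For example, mlp_covariance(0.0, 0.0, w, b) and mlp_covariance(5.0, 5.0, w, b) generally differ.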
72 The results using this covariance function are shown in Figure 1(b). [sent-150, score-0.062]
73 The resulting profile does not show the unexpected oscillatory behaviour and has tighter credibility intervals. [sent-151, score-0.115]
74 The hyperparameters were assigned a vague gamma prior distribution (a = b = 0. [sent-154, score-0.069]
75 Differences in the estimates of the basal transcription rates are probably due to the different methods used for probe-level processing of the microarray data. [sent-160, score-0.52]
76 2 Non-linear response analysis We then used the non-linear response model of section 2 in order to constrain the protein concentrations inferred to be positive. [sent-162, score-0.498]
77 We achieved this by using an exponential response of the transcription rate to the logged protein concentration. [sent-163, score-0.676]
78 The inferred MAP solutions for the latent function f are plotted in Figure 3 for the squared exponential prior (a) and for the MLP prior (b). [sent-164, score-0.286]
79 Figure 1: Predicted protein concentration for p53 using a linear response model: (a) squared exponential prior on f; (b) MLP prior on f. [sent-79, score-0.595]
80 Solid line is mean prediction, dashed lines are 95% credibility intervals. [sent-168, score-0.091]
81 The bar charts show (a) Basal transcription rates from our model and that of Barenco et al. [sent-189, score-0.414]
82 Grey are estimates obtained with our model, white are the estimates obtained by Barenco et al. [sent-191, score-0.1]
83 4 Discussion In this paper we showed how Gaussian processes can be used effectively in modelling the dynamics of a very simple regulatory network motif. [sent-194, score-0.205]
84 This approach has many advantages over standard parametric approaches: first of all, there is no need to restrict the inference to the observed time points, and the temporal continuity of the inferred functions is accounted for naturally. [sent-195, score-0.085]
85 It is well known that biological data exhibits a large variability, partly because of technical noise (due to the difficulty to measure mRNA abundance for low expressed genes, for example), and partly because of the difference between different cell lines. [sent-197, score-0.186]
86 Accounting for these sources of noise in a parametric model can be difficult (particularly when estimates of the derivatives of the measured quantities are required), while Gaussian Processes can incorporate this information naturally. [sent-198, score-0.066]
87 Finally, MCMC parameter estimation in a discretised model can be computationally expensive due to the high correlations between variables. [sent-199, score-0.055]
88 This is a consequence of treating the protein concentrations as parameters, and results in many MCMC iterations to obtain reliable samples. [sent-200, score-0.259]
89 For example, it is well known that transcriptional delays can play a significant role in determining the dynamics of many cellular processes [5]. [sent-203, score-0.204]
90 These effects can be introduced naturally in a Gaussian process model; however, the data must be sampled at a reasonably high frequency in order for delays to become identifiable in a stochastic model, which is often not the case with microarray data sets. [sent-204, score-0.104]
91 Another natural extension of our work would be to consider more biologically meaningful nonlinearities, such as the popular Michaelis-Menten model of transcription used in [9]. [sent-205, score-0.349]
92 Finally, networks consisting of a single transcription factor are very useful to study small systems of particular interest such as p53. [sent-206, score-0.399]
93 Solid line is mean prediction, dashed lines show 95% credibility intervals. [sent-208, score-0.091]
94 The results shown are for exp(f ), hence the asymmetry of the credibility intervals. [sent-209, score-0.091]
95 We gratefully acknowledge support from BBSRC Grant No BBS/B/0076X “Improved processing of microarray data with probabilistic models”. [sent-215, score-0.055]
96 Network component analysis: Reconstruction of regulatory signals in biological systems. [sent-243, score-0.124]
97 A tractable probabilistic model for affymetrix probelevel analysis across multiple chips. [sent-251, score-0.055]
98 Model based identification of transcription factor activity from microarray data. [sent-286, score-0.454]
99 Bayesian sparse hidden components analysis for transcription regulation networks. [sent-292, score-0.389]
100 A probabilistic dynamical model for quantitative inference of the regulatory mechanism of transcription. [sent-299, score-0.122]
wordName wordTfidf (topN-words)
[('dj', 0.457), ('transcription', 0.349), ('mrna', 0.303), ('barenco', 0.255), ('ti', 0.249), ('protein', 0.178), ('tp', 0.161), ('concentration', 0.148), ('kf', 0.142), ('fp', 0.127), ('sj', 0.124), ('bj', 0.122), ('abundance', 0.121), ('ji', 0.113), ('tq', 0.111), ('erf', 0.111), ('regulatory', 0.094), ('response', 0.091), ('credibility', 0.091), ('mlp', 0.091), ('genes', 0.087), ('gene', 0.083), ('xj', 0.082), ('concentrations', 0.081), ('dk', 0.076), ('bik', 0.073), ('kxj', 0.073), ('transcriptional', 0.072), ('yj', 0.071), ('fq', 0.063), ('exp', 0.063), ('covariance', 0.062), ('exponential', 0.058), ('decay', 0.058), ('inferred', 0.057), ('microarray', 0.055), ('affymetrix', 0.055), ('basal', 0.055), ('discretised', 0.055), ('kxx', 0.055), ('tumour', 0.055), ('production', 0.052), ('latent', 0.051), ('factor', 0.05), ('mcmc', 0.049), ('lj', 0.048), ('du', 0.048), ('pq', 0.047), ('levels', 0.046), ('squared', 0.046), ('processes', 0.042), ('modelling', 0.041), ('regulation', 0.04), ('cellular', 0.038), ('kj', 0.038), ('prior', 0.037), ('grid', 0.037), ('efp', 0.036), ('hkj', 0.036), ('magnus', 0.036), ('manchester', 0.036), ('mmgmos', 0.036), ('rattray', 0.036), ('sanguinetti', 0.036), ('suppressor', 0.036), ('wpq', 0.036), ('et', 0.036), ('cell', 0.035), ('log', 0.035), ('quantities', 0.034), ('estimates', 0.032), ('sensitivity', 0.032), ('guido', 0.032), ('post', 0.032), ('sensitivities', 0.032), ('sabatti', 0.032), ('vague', 0.032), ('regulating', 0.032), ('subnetwork', 0.032), ('target', 0.032), ('hessian', 0.031), ('pointwise', 0.031), ('biological', 0.03), ('kw', 0.029), ('instantiations', 0.029), ('rogers', 0.029), ('leukemia', 0.029), ('dxj', 0.029), ('rates', 0.029), ('inference', 0.028), ('equation', 0.028), ('dynamics', 0.028), ('expression', 0.027), ('lawrence', 0.027), ('rise', 0.026), ('gaussian', 0.026), ('collectively', 0.025), ('process', 0.025), ('oscillatory', 0.024), ('delays', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 135 nips-2006-Modelling transcriptional regulation using Gaussian Processes
2 0.12288638 81 nips-2006-Game Theoretic Algorithms for Protein-DNA binding
Author: Luis Pérez-breva, Luis E. Ortiz, Chen-hsiang Yeang, Tommi S. Jaakkola
Abstract: We develop and analyze game-theoretic algorithms for predicting coordinate binding of multiple DNA binding regulators. The allocation of proteins to local neighborhoods and to sites is carried out with resource constraints while explicating competing and coordinate binding relations among proteins with affinity to the site or region. The focus of this paper is on mathematical foundations of the approach. We also briefly demonstrate the approach in the context of the λ-phage switch.
3 0.10525024 3 nips-2006-A Complexity-Distortion Approach to Joint Pattern Alignment
Author: Andrea Vedaldi, Stefano Soatto
Abstract: Image Congealing (IC) is a non-parametric method for the joint alignment of a collection of images affected by systematic and unwanted deformations. The method attempts to undo the deformations by minimizing a measure of complexity of the image ensemble, such as the averaged per-pixel entropy. This enables alignment without an explicit model of the aligned dataset as required by other methods (e.g. transformed component analysis). While IC is simple and general, it may introduce degenerate solutions when the transformations allow minimizing the complexity of the data by collapsing them to a constant. Such solutions need to be explicitly removed by regularization. In this paper we propose an alternative formulation which solves this regularization issue on a more principled ground. We make the simple observation that alignment should simplify the data while preserving the useful information carried by them. Therefore we trade off fidelity and complexity of the aligned ensemble rather than minimizing the complexity alone. This eliminates the need for an explicit regularization of the transformations, and has a number of other useful properties such as noise suppression. We show the modeling and computational benefits of the approach to some of the problems on which IC has been demonstrated.
4 0.088053286 54 nips-2006-Comparative Gene Prediction using Conditional Random Fields
Author: Jade P. Vinson, David Decaprio, Matthew D. Pearson, Stacey Luoma, James E. Galagan
Abstract: Computational gene prediction using generative models has reached a plateau, with several groups converging to a generalized hidden Markov model (GHMM) incorporating phylogenetic models of nucleotide sequence evolution. Further improvements in gene calling accuracy are likely to come through new methods that incorporate additional data, both comparative and species specific. Conditional Random Fields (CRFs), which directly model the conditional probability P (y|x) of a vector of hidden states conditioned on a set of observations, provide a unified framework for combining probabilistic and non-probabilistic information and have been shown to outperform HMMs on sequence labeling tasks in natural language processing. We describe the use of CRFs for comparative gene prediction. We implement a model that encapsulates both a phylogenetic-GHMM (our baseline comparative model) and additional non-probabilistic features. We tested our model on the genome sequence of the fungal human pathogen Cryptococcus neoformans. Our baseline comparative model displays accuracy comparable to the best available gene prediction tool for this organism. Moreover, we show that discriminative training and the incorporation of non-probabilistic evidence significantly improve performance. Our software implementation, Conrad, is freely available with an open source license at http://www.broad.mit.edu/annotation/conrad/.
5 0.077662647 175 nips-2006-Simplifying Mixture Models through Function Approximation
Author: Kai Zhang, James T. Kwok
Abstract: Finite mixture model is a powerful tool in many statistical learning problems. In this paper, we propose a general, structure-preserving approach to reduce its model complexity, which can bring significant computational benefits in many applications. The basic idea is to group the original mixture components into compact clusters, and then minimize an upper bound on the approximation error between the original and simplified models. By adopting the L2 norm as the distance measure between mixture models, we can derive closed-form solutions that are more robust and reliable than using the KL-based distance measure. Moreover, the complexity of our algorithm is only linear in the sample size and dimensionality. Experiments on density estimation and clustering-based image segmentation demonstrate its outstanding performance in terms of both speed and accuracy.
6 0.074577309 64 nips-2006-Data Integration for Classification Problems Employing Gaussian Process Priors
7 0.071289383 29 nips-2006-An Information Theoretic Framework for Eukaryotic Gradient Sensing
8 0.063702926 195 nips-2006-Training Conditional Random Fields for Maximum Labelwise Accuracy
9 0.061428867 31 nips-2006-Analysis of Contour Motions
10 0.059731849 132 nips-2006-Modeling Dyadic Data with Binary Latent Factors
11 0.058821693 28 nips-2006-An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models
12 0.05881593 98 nips-2006-Inferring Network Structure from Co-Occurrences
13 0.056841895 153 nips-2006-Online Clustering of Moving Hyperplanes
14 0.055300336 41 nips-2006-Bayesian Ensemble Learning
15 0.050567802 108 nips-2006-Large Scale Hidden Semi-Markov SVMs
16 0.050410092 1 nips-2006-A Bayesian Approach to Diffusion Models of Decision-Making and Response Time
17 0.050409328 103 nips-2006-Kernels on Structured Objects Through Nested Histograms
18 0.050341167 75 nips-2006-Efficient sparse coding algorithms
19 0.049402792 117 nips-2006-Learning on Graph with Laplacian Regularization
20 0.048803933 113 nips-2006-Learning Structural Equation Models for fMRI
topicId topicWeight
[(0, -0.153), (1, -0.011), (2, 0.022), (3, -0.043), (4, 0.053), (5, 0.02), (6, 0.075), (7, 0.001), (8, -0.049), (9, 0.088), (10, -0.098), (11, 0.034), (12, 0.022), (13, 0.044), (14, -0.114), (15, 0.021), (16, -0.007), (17, -0.013), (18, -0.067), (19, -0.056), (20, 0.03), (21, 0.016), (22, -0.126), (23, -0.17), (24, 0.093), (25, 0.012), (26, -0.054), (27, -0.143), (28, 0.074), (29, 0.017), (30, -0.026), (31, -0.194), (32, -0.012), (33, 0.072), (34, 0.233), (35, -0.008), (36, 0.128), (37, -0.013), (38, 0.119), (39, 0.147), (40, -0.011), (41, 0.049), (42, 0.045), (43, -0.021), (44, -0.014), (45, -0.036), (46, -0.01), (47, 0.033), (48, -0.011), (49, 0.076)]
simIndex simValue paperId paperTitle
same-paper 1 0.93877757 135 nips-2006-Modelling transcriptional regulation using Gaussian Processes
2 0.66692704 81 nips-2006-Game Theoretic Algorithms for Protein-DNA binding
Author: Luis Pérez-breva, Luis E. Ortiz, Chen-hsiang Yeang, Tommi S. Jaakkola
Abstract: We develop and analyze game-theoretic algorithms for predicting coordinate binding of multiple DNA binding regulators. The allocation of proteins to local neighborhoods and to sites is carried out with resource constraints while explicating competing and coordinate binding relations among proteins with affinity to the site or region. The focus of this paper is on mathematical foundations of the approach. We also briefly demonstrate the approach in the context of the λ-phage switch.
3 0.55519247 29 nips-2006-An Information Theoretic Framework for Eukaryotic Gradient Sensing
Author: Joseph M. Kimmel, Richard M. Salter, Peter J. Thomas
Abstract: Chemical reaction networks by which individual cells gather and process information about their chemical environments have been dubbed “signal transduction” networks. Despite this suggestive terminology, there have been few attempts to analyze chemical signaling systems with the quantitative tools of information theory. Gradient sensing in the social amoeba Dictyostelium discoideum is a well characterized signal transduction system in which a cell estimates the direction of a source of diffusing chemoattractant molecules based on the spatiotemporal sequence of ligand-receptor binding events at the cell membrane. Using Monte Carlo techniques (MCell) we construct a simulation in which a collection of individual ligand particles undergoing Brownian diffusion in a three-dimensional volume interact with receptors on the surface of a static amoeboid cell. Adapting a method for estimation of spike train entropies described by Victor (originally due to Kozachenko and Leonenko), we estimate lower bounds on the mutual information between the transmitted signal (direction of ligand source) and the received signal (spatiotemporal pattern of receptor binding/unbinding events). Hence we provide a quantitative framework for addressing the question: how much could the cell know, and when could it know it? We show that the time course of the mutual information between the cell’s surface receptors and the (unknown) gradient direction is consistent with experimentally measured cellular response times. We find that the acquisition of directional information depends strongly on the time constant at which the intracellular response is filtered. 1 Introduction: gradient sensing in eukaryotes Biochemical signal transduction networks provide the computational machinery by which neurons, amoebae or other single cells sense and react to their chemical environments. 
The precision of this chemical sensing is limited by fluctuations inherent in reaction and diffusion processes involving a finite quantity of molecules [1, 2]. The theory of communication provides a framework that makes explicit the noise dependence of chemical signaling. For example, in any reaction A + B → C, we may view the time varying reactant concentrations A(t) and B(t) as input signals to a noisy channel, and the product concentration C(t) as an output signal carrying information about A(t) and B(t). In the present study we show that the mutual information between the (known) state of the cell’s surface receptors and the (unknown) gradient direction follows a time course consistent with experimentally measured cellular response times, reinforcing earlier claims that information theory can play a role in understanding biochemical cellular communication [3, 4]. Dictyostelium is a soil dwelling amoeba that aggregates into a multicellular form in order to survive conditions of drought or starvation. During aggregation individual amoebae perform chemotaxis, or chemically guided movement, towards sources of the signaling molecule cAMP, secreted by nearby amoebae. Quantitative studies have shown that Dictyostelium amoebae can sense shallow, static gradients of cAMP over long time scales (∼30 minutes), and that gradient steepness plays a crucial role in guiding cells [5].
Author notes: ∗ Current address: Computational Neuroscience Graduate Program, The University of Chicago. † Oberlin Center for Computation and Modeling, http://occam.oberlin.edu/. ‡ To whom correspondence should be addressed. http://www.case.edu/artsci/math/thomas/thomas.html; Oberlin College Research Associate.
The chemotactic efficiency (CE), the population average of the cosine between the cell displacement directions and the true gradient direction, peaks at a cAMP concentration of 25 nanoMolar, similar to the equilibrium constant for the cAMP receptor (the Keq is the concentration of cAMP at which the receptor has a 50% chance of being bound or unbound, respectively). For smaller or larger concentrations the CE dropped rapidly. Nevertheless over long times cells were able (on average) to detect gradients as small as 2% change in [cAMP] per cell length. At an early stage of development when the pattern of chemotactic centers and spirals is still forming, individual amoebae presumably experience an inchoate barrage of weak, noisy and conflicting directional signals. When cAMP binds receptors on a cell’s surface, second messengers trigger a chain of subsequent intracellular events including a rapid spatial reorganization of proteins involved in cell motility. Advances in fluorescence microscopy have revealed that the oriented subcellular response to cAMP stimulation is already well underway within two seconds [6, 7]. In order to understand the fundamental limits to communication in this cell signaling process we abstract the problem faced by a cell to that of rapidly identifying the direction of origin of a stimulus gradient superimposed on an existing mean background concentration. We model gradient sensing as an information channel in which an input signal – the direction of a chemical source – is noisily transmitted via a gradient of diffusing signaling molecules; and the “received signal” is the spatiotemporal pattern of binding events between cAMP and the cAMP receptors [8]. We neglect downstream intracellular events, which cannot increase the mutual information between the state of the cell and the direction of the imposed extracellular gradient [9]. 
The analysis of any signal transmission system depends on precise representation of the noise corrupting transmitted signals. We develop a Monte Carlo simulation (MCell, [10, 11]) in which a simulated cell is exposed to a cAMP distribution that evolves from a uniform background to a gradient at low (1 nMol) average concentration. The noise inherent in the communication of a diffusion-mediated signal is accurately represented by this method. Our approach bridges both the transient and the steady state regimes and allows us to estimate the amount of stimulus-related information that is in principle available to the cell through its receptors as a function of time after stimulus initiation. Other efforts to address aspects of cell signaling using the conceptual tools of information theory have considered neurotransmitter release [3] and sensing temporal signals [4], but not gradient sensing in eukaryotic cells. A typical natural habitat for social amoebae such as Dictyostelium is the complex anisotropic three-dimensional matrix of the forest floor. Under experimental conditions cells typically aggregate on a flat two-dimensional surface. We approach the problem of gradient sensing on a sphere, which is both harder and more natural for the amoeba, while still simple enough for us to treat analytically and numerically. Directional data is naturally described using unit vectors in spherical coordinates, but the amoebae receive signals as binding events involving intramembrane protein complexes, so we have developed a method for projecting the ensemble of receptor bindings onto coordinates in $\mathbb{R}^3$. In loose analogy with the chemotactic efficiency [5], we compare the projected directional estimate with the true gradient direction represented as a unit vector on $S^2$.
Consistent with observed timing of the cell’s response to cAMP stimulation, we find that the directional signal converges quickly enough for the cell to make a decision about which direction to move within the first two seconds following stimulus onset.
2 Methods
2.1 Monte Carlo simulations
Using MCell and DReAMM [10, 11] we construct a spherical cell (radius R = 7.5µm, [12]) centered in a cubic volume (side length L = 30µm). N = 980 triangular tiles partition the surface (mesh generated by DOME, http://nwg.phy.bnl.gov/∼bviren/uno/other/); each contained one cell surface receptor for cAMP with binding rate $k_+ = 4.4 \times 10^{7}\ \mathrm{sec}^{-1}\,\mathrm{M}^{-1}$, first-order cAMP unbinding rate $k_- = 1.1\ \mathrm{sec}^{-1}$ [12] and $K_{eq} = k_-/k_+ = 25$ nMol cAMP. We established a baseline concentration of approximately 1nMol by releasing a cAMP bolus at time 0 inside the cube with zero-flux boundary conditions imposed on each wall. At t = 2 seconds we introduced a steady flux at the x = −L/2 wall of 1 molecule of cAMP per square micron per msec, adding signaling molecules from the left. Simultaneously, the x = +L/2 wall of the cube assumes absorbing boundary conditions. The new boundary conditions lead (at equilibrium) to a linear gradient of 2 nMol/30µm, ranging from ≈ 2.0 nMol at the flux source wall to ≈ 0 nMol at the absorbing wall (see Figure 1); the concentration profile approaches this new steady state with a time constant of approximately 1.25 msec. Sampling boxes centered along the planes x = ±13.5µm measured the local concentration, allowing us to validate the expected model behavior.
Figure 1: Gradient sensing simulations performed with MCell (a Monte Carlo simulator of cellular microphysiology, http://www.mcell.cnl.salk.edu/) and rendered with DReAMM (Design, Render, and Animate MCell Models, http://www.mcell.psc.edu/). The model cell comprised a sphere triangulated with 980 tiles with one cAMP receptor per tile. Cell radius R = 7.5µm; cube side L = 30µm. Left: Initial equilibrium condition, before imposition of gradient. [cAMP] ≈ 1nMol (c. 15,000 molecules in the volume outside the sphere). Right: Gradient condition after transient (c. 15,000 molecules; see Methods for details).
2.2 Analysis
2.2.1 Assumptions
We make the following assumptions to simplify the analysis of the distribution of receptor activities at equilibrium, whether pre- or post-stimulus onset:
1. Independence. At equilibrium, the state of each receptor (bound vs unbound) is independent of the states of the other receptors.
2. Linear Gradient. At equilibrium under the imposed gradient condition, the concentration of ligand molecule varies linearly with position along the gradient axis.
3. Symmetry.
(a) Rotational equivariance of receptor activities. In the absence of an applied gradient signal, the probability distribution describing the receptor states is equivariant with respect to arbitrary rotations of the sphere.
(b) Rotational invariance of gradient direction. The imposed gradient seen by a model cell is equally likely to be coming from any direction; therefore the gradient direction vector is uniformly distributed over $S^2$.
(c) Axial equivariance about the gradient direction. Once a gradient direction is imposed, the probability distribution describing receptor states is rotationally equivariant with respect to rotations about the axis parallel with the gradient.
Berg and Purcell [1] calculate the inaccuracy in concentration estimates due to nonindependence of adjacent receptors; for our parameters (effective receptor radius = 5nm, receptor spacing ∼ 1µm) the fractional error in estimating concentration differences due to receptor nonindependence is negligible ($10^{-11}$) [1, 2]. Because we fix receptors to be in 1:1 correspondence with surface tiles, spherical symmetry and uniform distribution of the receptors are only approximate.
The gradient signal communicated via diffusion does not involve sharp spatial changes on the scale of the distance between nearby receptors; therefore spherical symmetry and uniform identical receptor distribution are good analytic approximations of the model configuration. By rotational equivariance we mean that combining any rotation of the sphere with a corresponding rotation of the indices labeling the N receptors, $\{j = 1, \dots, N\}$, leads to a statistically indistinguishable distribution of receptor activities. This same spherical symmetry is reflected in the a priori distribution of gradient directions, which is uniform over the sphere (with density $1/(4\pi)$). Spherical symmetry is broken by the gradient signal, which fixes a preferred direction in space. About this axis, however, we assume the system retains the rotational symmetry of the cylinder.
2.2.2 Mutual information of the receptors
In order to quantify the directional information available to the cell from its surface receptors we construct an explicit model for the receptor states and the cell’s estimated direction. We model the receptor states via a collection of random variables $\{B_j\}$ and develop an expression for the entropy of $\{B_j\}$. Then in section 2.2.3 we present a method for projecting a temporally filtered estimated direction, $\hat{g}$, into three (rather than N) dimensions. Let the random variables $\{B_j\}_{j=1}^{N}$ represent the states of the N cAMP receptors on the cell surface; $B_j = 1$ if the receptor is bound to a molecule of cAMP, otherwise $B_j = 0$. Let $x_j \in S^2$ represent the direction from the center of the cell to the jth receptor. Invoking assumption 2 above, we take the equilibrium concentration of cAMP at x to be $c(x|g) = a + b\,(x \cdot g)$ where $g \in S^2$ is a unit vector in the direction of the gradient. The parameter a is the mean concentration over the cell surface, and $b = R\,|\nabla c|$ is half the drop in concentration from one extreme on the cell surface to the other.
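As a rough illustration of the linear-gradient model just described, the sketch below computes the receptor equilibrium constant from the kinetic rates and the equilibrium probability that a receptor at direction x is bound. The occupancy form c/(c + Keq) is the argument that appears inside the entropy integrals of this section; the function names are hypothetical, and the numerical values are the rates and simulation parameters quoted in the text.

```python
def equilibrium_constant(k_minus, k_plus):
    """K_eq = k_- / k_+ (molar, for k_- in 1/sec and k_+ in 1/(sec*M))."""
    return k_minus / k_plus


def local_concentration(x, g, a, b):
    """Linear-gradient concentration c(x|g) = a + b (x . g) for unit vectors x, g."""
    return a + b * sum(xi * gi for xi, gi in zip(x, g))


def bound_probability(x, g, a, b, k_eq):
    """Equilibrium probability that the receptor at direction x is bound."""
    c = local_concentration(x, g, a, b)
    return c / (c + k_eq)


# Rates quoted in the Methods; the result is converted from molar to nMol.
k_eq_nmol = equilibrium_constant(1.1, 4.4e7) * 1e9

# Receptors facing up and down the gradient (a, b in nMol, as quoted for the simulation).
g = (1.0, 0.0, 0.0)
p_up = bound_probability((1.0, 0.0, 0.0), g, a=1.078, b=0.512, k_eq=k_eq_nmol)
p_down = bound_probability((-1.0, 0.0, 0.0), g, a=1.078, b=0.512, k_eq=k_eq_nmol)
print(k_eq_nmol, p_up, p_down)
```

At these low concentrations only a few percent of receptors are bound at any moment, which is one way to see why the per-receptor information about direction is small.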
Before the stimulus begins, the gradient direction is undefined. It can be shown (see Supplemental Materials) that the entropy of receptor states given a fixed gradient direction g, $H[\{B_j\}|g]$, is given by an integral over the sphere:
$$H[\{B_j\}|g] \sim N \int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} \Phi\!\left[\frac{a + b\cos(\theta)}{a + b\cos(\theta) + K_{eq}}\right] \frac{\sin(\theta)}{4\pi}\, d\phi\, d\theta \quad (\text{as } N \to \infty). \qquad (1)$$
On the other hand, if the gradient direction remains unspecified, the entropy of receptor states is given by
$$H[\{B_j\}] \sim N\, \Phi\!\left[\int_{\theta=0}^{\pi} \int_{\phi=0}^{2\pi} \frac{a + b\cos(\theta)}{a + b\cos(\theta) + K_{eq}}\, \frac{\sin(\theta)}{4\pi}\, d\phi\, d\theta\right] \quad (\text{as } N \to \infty), \qquad (2)$$
where
$$\Phi[p] = \begin{cases} -\left(p \log_2(p) + (1-p)\log_2(1-p)\right), & 0 < p < 1 \\ 0, & p = 0 \text{ or } 1 \end{cases}$$
denotes the entropy for a binary random variable with state probabilities p and (1 − p). In both equations (1) and (2), the argument of Φ is a probability taking values 0 ≤ p ≤ 1. In (1) the values of Φ are averaged over the sphere; in (2) Φ is evaluated after averaging probabilities. Because Φ[p] is concave for 0 ≤ p ≤ 1, the integral in equation (1) cannot exceed that in equation (2). Therefore the mutual information upon receiving the signal is nonnegative (as expected): $MI[\{B_j\}; g] \triangleq H[\{B_j\}] - H[\{B_j\}|g] \ge 0$. The analytic solution for equation (1) involves the polylogarithm function. For the parameters shown in the simulation (a = 1.078 nMol, b = .512 nMol, Keq = 25 nMol), the mutual information with 980 receptors is 2.16 bits. As one would expect, the mutual information peaks when the mean concentration is close to the Keq of the receptor, exceeding 16 bits when a = 25, b = 12.5 and Keq = 25 (nMol).
2.2.3 Dimension reduction
The estimate obtained above does not tell us how quickly the directional information available to the cell evolves over time. Direct estimation of the mutual information from stochastic simulations is impractical because the aggregate random variables occupy a 980 dimensional space that a limited number of simulation runs cannot sample adequately.
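Returning briefly to the analytic receptor entropies of section 2.2.2: the two sphere integrals depend on θ only through cos(θ), so they reduce to one-dimensional averages over u = cos(θ), uniform on [−1, 1]. The sketch below evaluates them with a simple midpoint rule rather than the polylogarithm solution mentioned in the text; the function names and grid size are illustrative assumptions, not the authors' code.

```python
import math


def binary_entropy(p):
    """Phi[p]: entropy in bits of a binary variable with probabilities p and 1 - p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def receptor_mutual_information(a, b, k_eq, n_receptors, n_grid=20000):
    """H[{B_j}] - H[{B_j}|g] in bits, in the large-N form used in the text."""
    du = 2.0 / n_grid
    mean_phi = 0.0  # average of Phi[p] over the sphere (conditional entropy per receptor)
    mean_p = 0.0    # average bound probability over the sphere
    for i in range(n_grid):
        u = -1.0 + (i + 0.5) * du       # midpoint in cos(theta)
        c = a + b * u                   # local concentration
        p = c / (c + k_eq)              # bound probability
        mean_phi += binary_entropy(p) * du / 2.0
        mean_p += p * du / 2.0
    h_given_g = n_receptors * mean_phi
    h_marginal = n_receptors * binary_entropy(mean_p)
    return h_marginal - h_given_g


# Simulation parameters quoted in the text (concentrations in nMol).
print(receptor_mutual_information(1.078, 0.512, 25.0, 980))
```

With the simulation parameters this evaluates to roughly the 2.16 bits quoted in the text, and setting b = 0 (no gradient) gives zero, as the Jensen argument requires.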
Instead, we construct a deterministic function from the set of 980 time courses of the receptors, $\{B_j(t)\}$, to an aggregate directional estimate in $\mathbb{R}^3$. Because of the cylindrical symmetry inherent in the system, our directional estimator $\hat{g}$ is an unbiased estimator of the true gradient direction g. The estimator $\hat{g}(t)$ may be thought of as representing a downstream chemical process that accumulates directional information and decays with some time constant τ. Let $\{x_j\}_{j=1}^{N}$ be the spatial locations of the N receptors on the cell’s surface. Each vector is associated with a weight $w_j$. Whenever the jth receptor binds a cAMP molecule, $w_j$ is incremented by one; otherwise $w_j$ decays with time constant τ. We construct an instantaneous estimate of the gradient direction from the linear combination of receptor positions, $\hat{g}_\tau(t) = \sum_{j=1}^{N} w_j(t)\, x_j$. This procedure reflects the accumulation and reabsorption of intracellular second messengers released from the cell membrane upon receptor binding. Before the stimulus is applied, the weighted directional estimates $\hat{g}_\tau$ are small in absolute magnitude, with direction uniformly distributed on $S^2$. In order to determine the information gained as the estimate vector evolves after stimulus application, we wish to determine the change in entropy in an ensemble of such estimates. As the cell gains information about the direction of the gradient signal from its receptors, the entropy of the estimate should decrease, leading to a rise in mutual information. By repeating multiple runs (M = 600) of the simulation we obtain samples from the ensemble of direction estimates, given a particular stimulus direction, g. In the method of Kozachenko and Leonenko [13], adapted for the analysis of neural spike train data by Victor [14] (“KLV method”), the cumulative distribution function is approximated directly from the observed samples, and the entropy is estimated via a change of variables transformation (see below).
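The weighted directional estimate described earlier in this section can be sketched as a discrete-time update. This is a minimal illustration under stated assumptions: time is advanced in fixed steps dt, every weight is decayed by exp(−dt/τ) on each step, and a binding event then adds one to the corresponding weight. The text does not specify the discretization, and the function names are hypothetical.

```python
import math


def update_weights(weights, bound_now, dt, tau):
    """Decay all weights with time constant tau, then add 1 for each new binding event."""
    decay = math.exp(-dt / tau)
    return [w * decay + (1.0 if hit else 0.0) for w, hit in zip(weights, bound_now)]


def direction_estimate(weights, positions):
    """g_hat_tau = sum_j w_j x_j, with positions given as 3-tuples."""
    return tuple(
        sum(w * x[axis] for w, x in zip(weights, positions)) for axis in range(3)
    )


# Three toy receptors; only the one facing +x binds during the first step.
positions = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
weights = update_weights([0.0, 0.0, 0.0], [True, False, False], dt=0.001, tau=0.5)
print(direction_estimate(weights, positions))

# With no further binding, the estimate shrinks by a factor of e after one time constant.
weights = update_weights(weights, [False, False, False], dt=0.5, tau=0.5)
print(direction_estimate(weights, positions))
```

A short τ forgets old binding events quickly and so responds fast but noisily; a long τ averages over more events, which is the trade-off examined in the Results.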
This method may be formulated in vector spaces $\mathbb{R}^d$ for d > 1 ([13]), but it is not guaranteed to be unbiased in the multivariate case [15] and has not been extended to curved manifolds such as the sphere. In the present case, however, we may exploit the symmetries inherent in the model (Assumptions 3a-3c) to reduce the empirical entropy estimation problem to one dimension. Adapting the argument in [14] to the case of spherical data from a distribution with rotational symmetry about a given axis, we obtain an estimate of the entropy based on a series of observations of the angles $\{\theta_1, \dots, \theta_M\}$ between the estimates $\hat{g}_\tau$ and the true gradient direction g (for details, see Supplemental Materials):
$$H \sim \frac{1}{M} \sum_{k=1}^{M} \left[ \log_2(\lambda_k) + \log_2(2(M-1)) + \frac{\gamma}{\log_e(2)} + \log_2(2\pi) + \log_2(\sin(\theta_k)) \right] \quad (\text{as } M \to \infty) \qquad (3)$$
where, after sorting the $\theta_k$ in monotonic order, $\lambda_k \triangleq \min(|\theta_k - \theta_{k\pm1}|)$ is the distance between each angle and its nearest neighbor in the sample, and γ is the Euler-Mascheroni constant. As shown in Figure 2, this approximation agrees with the analytic result for the uniform distribution, $H_{unif} = \log_2(4\pi) \approx 3.651$.
3 Results
Figure 3 shows the results of M = 600 simulation runs. Panel A shows the concentration averaged across a set of 1µm³ sample boxes, four in the x = −13.5µm plane and four in the x = +13.5µm plane. The initial bolus of cAMP released into the volume at t = 0 sec is not uniformly distributed, but spreads out evenly within 0.25 sec. At t = 2.0 sec the boundary conditions are changed, causing a gradient to emerge along a realistic time course. Consistent with the analytic solution for the mean concentration (not shown), the concentration approaches equilibrium more rapidly near the absorbing wall (descending trace) than at the imposed flux wall (ascending trace).
Figure 2: Monte Carlo simulation results and information analysis. A: Average concentration profiles along two planes perpendicular to the gradient, at x = ±13.5µm. B: Estimated direction vector (x, y, and z components; x = dark blue trace) $\hat{g}_\tau$, τ = 500 msec. C: Entropy of the ensemble of directional vector estimates for different values of the intracellular filtering time constant τ. Given the directions of the estimates $\theta_k, \phi_k$ on each of M runs, we calculate the entropy of the ensemble using equation (3). All time constants yield uniformly distributed directional estimates in the pre-stimulus period, 0 ≤ t ≤ 2 (sec). After stimulus onset, directional estimates obtained with shorter time constants respond more quickly but achieve smaller gains in mutual information (smaller reductions in entropy). Filtering time constants τ range from lightest to darkest colors: 20, 50, 100, 200, 500, 1000, 2000 msec.
Panel B shows the evolution of a directional estimate vector $\hat{g}_\tau$ for a single run, with τ = 500 msec. During uniform conditions all vectors fluctuate near the origin. After gradient onset the variance increases and the x component (dark trace) becomes biased towards the gradient source (g = [−1, 0, 0]) while the y and z components still have a mean of zero. Across all 600 runs the mean of the y and z components remains close to zero, while the mean of the x component systematically departs from zero shortly after stimulus onset (not shown). Hence the directional estimator is unbiased (as required by symmetry). See Supplemental Materials for the population average of $\hat{g}$. Panel C shows the time course of the entropy of the ensemble of normalized directional estimate vectors $\hat{g}_\tau / |\hat{g}_\tau|$ over M = 600 simulations, for intracellular filtering time constants ranging from 20 msec to 2000 msec (light to dark shading), calculated using equation (3).
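For readers who want to experiment with the entropy estimate of equation (3), the following is a minimal sketch of a nearest-neighbor estimator over the angles θ_k, written from the formula as stated in the text rather than from the authors' implementation. The function name is hypothetical, and the sketch assumes no two angles coincide and none equals exactly 0 or π (otherwise a logarithm of zero would occur).

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def klv_sphere_entropy_bits(thetas):
    """Entropy estimate (bits) for axially symmetric directions, from angles to the axis."""
    th = sorted(thetas)
    m = len(th)
    total = 0.0
    for k, t in enumerate(th):
        # Distance to the nearest neighbor in the sorted sample.
        gaps = []
        if k > 0:
            gaps.append(t - th[k - 1])
        if k < m - 1:
            gaps.append(th[k + 1] - t)
        lam = min(gaps)
        total += math.log2(lam) + math.log2(math.sin(t))
    constant = (
        math.log2(2 * (m - 1)) + EULER_GAMMA / math.log(2) + math.log2(2 * math.pi)
    )
    return total / m + constant


# Sanity check on directions uniform over the sphere: cos(theta) is uniform on [-1, 1].
random.seed(0)
sample = [math.acos(1.0 - 2.0 * random.random()) for _ in range(4000)]
print(klv_sphere_entropy_bits(sample), math.log2(4 * math.pi))
```

For uniformly distributed directions the estimate should land near log2(4π) ≈ 3.651 bits, with sampling scatter that shrinks as the number of samples grows.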
Following stimulus onset, entropy decreases steadily, showing an increase in information available to the amoeba about the direction of the stimulus; the mutual information at a given point in time is the difference between the entropy before stimulus onset and the entropy at that time. For a cell with roughly 1000 receptors the mutual information has increased at most by ∼ 2 bits of information by one second (for τ = 500 msec), and at most by ∼ 3 bits of information by two seconds (for τ = 1000 or 2000 msec), under our stimulation protocol. A one bit reduction in uncertainty is equivalent to identifying the correct value of the x component (positive versus negative) when the stimulus direction is aligned along the x-axis. Alternatively, note that a one bit reduction results in going from the uniform distribution on the sphere to the uniform distribution on one hemisphere. For τ ≤ 100 msec, the weighted average with decay time τ never gains more than one bit of information about the stimulus direction, even at long times. This observation suggests that signaling must involve some chemical components with lifetimes longer than 100 msec. The τ = 200 msec filter saturates after about one second, at ∼ 1 bit of information gain. Longer lived second messengers would respond more slowly to changes from the background stimulus distribution, but would provide more informative estimates over time. The τ = 500 msec estimate gains roughly two bits of information within 1.5 seconds, but not much more over time. Heuristically, we may think of a two bit gain in information as corresponding to the change from a uniform distribution to one uniformly covering one quarter of $S^2$, i.e. all points within π/3 of the true direction. Within two seconds the τ = 1000 msec and τ = 2000 msec weighted averages have each gained approximately three bits of information, equivalent to a uniform distribution covering all points within 0.23π or 41° of the true direction.
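The heuristic correspondence between bits gained and angular uncertainty can be checked directly. A gain of n bits relative to the uniform distribution on the sphere corresponds to a uniform distribution on a spherical cap covering a fraction 2^(−n) of the sphere's area, and a cap of half-angle θ covers the fraction (1 − cos θ)/2. The short sketch below (with a hypothetical function name) inverts that relation.

```python
import math


def cap_half_angle(bits_gained):
    """Half-angle (radians) of the spherical cap covering 2**(-bits_gained) of the sphere."""
    area_fraction = 2.0 ** (-bits_gained)
    return math.acos(1.0 - 2.0 * area_fraction)


for bits in (1, 2, 3):
    angle = cap_half_angle(bits)
    print(bits, angle / math.pi, math.degrees(angle))
```

One bit gives a hemisphere (π/2), two bits give π/3, and three bits give about 0.23π, or roughly 41 degrees, matching the figures quoted above.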
4 Discussion & conclusions Clearly there is an opportunity for more precise control of experimental conditions to deepen our understanding of spatio-temporal information processing at the membranes of gradient-sensitive cells. Efforts in this direction are now using microfluidic technology to create carefully regulated spatial profiles for probing cellular responses [16]. Our results suggest that molecular processes relevant to these responses must have lasting effects ≥ 100 msec. We use a static, immobile cell. Could cell motion relative to the medium increase sensitivity to changes in the gradient? No: the Dictyostelium velocity required to affect concentration perception is on order 1cm sec−1 [1], whereas reported velocities are on the order µm sec−1 [5]. The chemotactic response mechanism is known to begin modifying the cell membrane on the edge facing up the gradient within two seconds after stimulus initiation [7, 6], suggesting that the cell strikes a balance between gathering data and deciding quickly. Indeed, our results show that the reported activation of the G-protein signaling system on the leading edge of a chemotactically responsive cell [7] rises at roughly the same rate as the available chemotactic information. Results such as these ([7, 6]) are obtained by introducing a pipette into the medium near the amoeba; the magnitude and time course of cAMP release are not precisely known, and when estimated the cAMP concentration at the cell surface is over 25 nMol by a full order of magnitude. Thomson and Kristan [17] show that for discrete probability distributions and for continuous distributions over linear spaces, stimulus discriminability may be better quantified using ideal observer analysis (mean squared error, for continuous variables) than information theory. 
The machinery of mean squared error (variance, expectation) do not carry over to the case of directional data without fundamental modifications [18]; in particular the notion of mean squared error is best represented by the mean resultant length 0 ≤ ρ ≤ 1, the expected length of the vector average of a collection of unit vectors representing samples from directional data. A resultant with length ρ ≈ 1 corresponds to a highly focused probability density function on the sphere. In addition to measuring the mutual information between the gradient direction and an intracellular estimate of direction, we also calculated the time evolution of ρ (see Supplemental Materials.) We find that ρ rapidly approaches 1 and can exceed 0.9, depending on τ . We found that in this case at least the behavior of the mean resultant length and the mutual information are very similar; there is no evidence of discrepancies of the sort described in [17]. We have shown that the mutual information between an arbitrarily oriented stimulus and the directional signal available at the cell’s receptors evolves with a time course consistent with observed reaction times of Dictyostelium amoeba. Our results reinforce earlier claims that information theory can play a role in understanding biochemical cellular communication. Acknowledgments MCell simulations were run on the Oberlin College Beowulf Cluster, supported by NSF grant CHE0420717. References [1] Howard C. Berg and Edward M. Purcell. Physics of chemoreception. Biophysical Journal, 20:193, 1977. [2] William Bialek and Sima Setayeshgar. Physical limits to biochemical signaling. PNAS, 102(29):10040– 10045, July 19 2005. [3] S. Qazi, A. Beltukov, and B.A. Trimmer. Simulation modeling of ligand receptor interactions at nonequilibrium conditions: processing of noisy inputs by ionotropic receptors. Math Biosci., 187(1):93–110, Jan 2004. [4] D. J. Spencer, S. K. Hampton, P. Park, J. P. Zurkus, and P. J. Thomas. 
The diffusion-limited biochemical signal-relay channel. In S. Thrun, L. Saul, and B. Schölkopf, editors, Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004. [5] P.R. Fisher, R. Merkl, and G. Gerisch. Quantitative analysis of cell motility and chemotaxis in Dictyostelium discoideum by using an image processing system and a novel chemotaxis chamber providing stationary chemical gradients. J. Cell Biology, 108:973–984, March 1989. [6] Carole A. Parent, Brenda J. Blacklock, Wendy M. Froehlich, Douglas B. Murphy, and Peter N. Devreotes. G protein signaling events are activated at the leading edge of chemotactic cells. Cell, 95:81–91, 2 October 1998. [7] Xuehua Xu, Martin Meier-Schellersheim, Xuanmao Jiao, Lauren E. Nelson, and Tian Jin. Quantitative imaging of single live cells reveals spatiotemporal dynamics of multistep signaling events of chemoattractant gradient sensing in Dictyostelium. Molecular Biology of the Cell, 16:676–688, February 2005. [8] Wouter-Jan Rappel, Peter J. Thomas, Herbert Levine, and William F. Loomis. Establishing direction during chemotaxis in eukaryotic cells. Biophys. J., 83:1361–1367, 2002. [9] T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley, New York, 1990. [10] J. R. Stiles, D. Van Helden, T. M. Bartol, E.E. Salpeter, and M. M. Salpeter. Miniature endplate current rise times less than 100 microseconds from improved dual recordings can be modeled with passive acetylcholine diffusion from a synaptic vesicle. Proc. Natl. Acad. Sci. U.S.A., 93(12):5747–52, Jun 11 1996. [11] J. R. Stiles and T. M. Bartol. Computational Neuroscience: Realistic Modeling for Experimentalists, chapter Monte Carlo methods for realistic simulation of synaptic microphysiology using MCell, pages 87–127. CRC Press, Boca Raton, FL, 2001. [12] M. Ueda, Y. Sako, T. Tanaka, P. Devreotes, and T. Yanagida. Single-molecule analysis of chemotactic signaling in Dictyostelium cells. Science, 294:864–867, October 2001.
[13] L.F. Kozachenko and N.N. Leonenko. Probl. Peredachi Inf. [Probl. Inf. Transm.], 23(9):95, 1987. [14] Jonathan D. Victor. Binless strategies for estimation of information from neural data. Physical Review E, 66:051903, Nov 11 2002. [15] Marc M. Van Hulle. Edgeworth approximation of multivariate differential entropy. Neural Computation, 17:1903–1910, 2005. [16] Loling Song, Sharvari M. Nadkarni, Hendrik U. Bödeker, Carsten Beta, Albert Bae, Carl Franck, Wouter-Jan Rappel, William F. Loomis, and Eberhard Bodenschatz. Dictyostelium discoideum chemotaxis: Threshold for directed motion. Euro. J. Cell Bio, 85(9-10):981–9, 2006. [17] Eric E. Thomson and William B. Kristan. Quantifying stimulus discriminability: A comparison of information theory and ideal observer analysis. Neural Computation, 17:741–778, 2005. [18] Kanti V. Mardia and Peter E. Jupp. Directional Statistics. John Wiley & Sons, West Sussex, England, 2000.
4 0.4446187 54 nips-2006-Comparative Gene Prediction using Conditional Random Fields
Author: Jade P. Vinson, David Decaprio, Matthew D. Pearson, Stacey Luoma, James E. Galagan
Abstract: Computational gene prediction using generative models has reached a plateau, with several groups converging to a generalized hidden Markov model (GHMM) incorporating phylogenetic models of nucleotide sequence evolution. Further improvements in gene calling accuracy are likely to come through new methods that incorporate additional data, both comparative and species specific. Conditional Random Fields (CRFs), which directly model the conditional probability P (y|x) of a vector of hidden states conditioned on a set of observations, provide a unified framework for combining probabilistic and non-probabilistic information and have been shown to outperform HMMs on sequence labeling tasks in natural language processing. We describe the use of CRFs for comparative gene prediction. We implement a model that encapsulates both a phylogenetic-GHMM (our baseline comparative model) and additional non-probabilistic features. We tested our model on the genome sequence of the fungal human pathogen Cryptococcus neoformans. Our baseline comparative model displays accuracy comparable to the best available gene prediction tool for this organism. Moreover, we show that discriminative training and the incorporation of non-probabilistic evidence significantly improve performance. Our software implementation, Conrad, is freely available with an open source license at http://www.broad.mit.edu/annotation/conrad/.
5 0.40905482 3 nips-2006-A Complexity-Distortion Approach to Joint Pattern Alignment
Author: Andrea Vedaldi, Stefano Soatto
Abstract: Image Congealing (IC) is a non-parametric method for the joint alignment of a collection of images affected by systematic and unwanted deformations. The method attempts to undo the deformations by minimizing a measure of complexity of the image ensemble, such as the averaged per-pixel entropy. This enables alignment without an explicit model of the aligned dataset as required by other methods (e.g. transformed component analysis). While IC is simple and general, it may introduce degenerate solutions when the transformations allow minimizing the complexity of the data by collapsing them to a constant. Such solutions need to be explicitly removed by regularization. In this paper we propose an alternative formulation which solves this regularization issue on a more principled ground. We make the simple observation that alignment should simplify the data while preserving the useful information carried by them. Therefore we trade off fidelity and complexity of the aligned ensemble rather than minimizing the complexity alone. This eliminates the need for an explicit regularization of the transformations, and has a number of other useful properties such as noise suppression. We show the modeling and computational benefits of the approach on some of the problems on which IC has been demonstrated.
6 0.39640412 90 nips-2006-Hidden Markov Dirichlet Process: Modeling Genetic Recombination in Open Ancestral Space
7 0.37976283 108 nips-2006-Large Scale Hidden Semi-Markov SVMs
8 0.35393491 166 nips-2006-Recursive Attribute Factoring
9 0.33495012 98 nips-2006-Inferring Network Structure from Co-Occurrences
10 0.31710228 192 nips-2006-Theory and Dynamics of Perceptual Bistability
11 0.31304106 103 nips-2006-Kernels on Structured Objects Through Nested Histograms
12 0.30748478 101 nips-2006-Isotonic Conditional Random Fields and Local Sentiment Flow
13 0.29948062 175 nips-2006-Simplifying Mixture Models through Function Approximation
14 0.29873079 64 nips-2006-Data Integration for Classification Problems Employing Gaussian Process Priors
15 0.29195192 1 nips-2006-A Bayesian Approach to Diffusion Models of Decision-Making and Response Time
16 0.28310817 153 nips-2006-Online Clustering of Moving Hyperplanes
17 0.28027818 119 nips-2006-Learning to Rank with Nonsmooth Cost Functions
18 0.26974013 43 nips-2006-Bayesian Model Scoring in Markov Random Fields
19 0.26708475 31 nips-2006-Analysis of Contour Motions
20 0.26294199 71 nips-2006-Effects of Stress and Genotype on Meta-parameter Dynamics in Reinforcement Learning
topicId topicWeight
[(1, 0.079), (3, 0.019), (7, 0.072), (9, 0.034), (20, 0.035), (22, 0.039), (44, 0.053), (57, 0.038), (65, 0.033), (69, 0.03), (71, 0.459), (90, 0.022)]
simIndex simValue paperId paperTitle
1 0.96477538 145 nips-2006-Neurophysiological Evidence of Cooperative Mechanisms for Stereo Computation
Author: Jason M. Samonds, Brian R. Potetz, Tai S. Lee
Abstract: Although there has been substantial progress in understanding the neurophysiological mechanisms of stereopsis, how neurons interact in a network during stereo computation remains unclear. Computational models of stereopsis suggest local competition and long-range cooperation are important for resolving ambiguity during stereo matching. To test these predictions, we simultaneously recorded from multiple neurons in V1 of awake, behaving macaques while presenting surfaces of different depths rendered in dynamic random dot stereograms. We found that the interaction between pairs of neurons was a function of similarity in receptive fields, as well as of the input stimulus. Neurons coding the same depth experienced common inhibition early in their responses for stimuli presented at their nonpreferred disparities. They experienced mutual facilitation later in their responses for stimulation at their preferred disparity. These findings are consistent with a local competition mechanism that first removes gross mismatches, and a global cooperative mechanism that further refines depth estimates. 1 Introduction The human visual system is able to extract three-dimensional (3D) structures in random noise stereograms even when such images evoke no perceptible patterns when viewed monocularly [1]. Bela Julesz proposed that this is accomplished by a stereopsis mechanism that detects correlated shifts in 2D noise patterns between the two eyes. He also suggested that this mechanism likely involves cooperative neural processing early in the visual system. Marr and Poggio formalized the computational constraints for solving stereo matching (Fig. 1a) and devised an algorithm that can discover the underlying 3D structures in a variety of random dot stereogram patterns [2].
Their algorithm was based on two rules: (1) each element or feature is unique (i.e., can be assigned only one disparity) and (2) surfaces of objects are cohesive (i.e., depth changes gradually across space). To describe their algorithm in neurophysiological terms, we can consider neurons in primary visual cortex as simple element or feature detectors. The first rule is implemented by introducing competitive interactions (mutual inhibition) among neurons of different disparity tuning at each location (Fig. 1b, blue solid horizontal or vertical lines), allowing only one disparity to be detected at each location. The second rule is implemented by introducing cooperative interactions (mutual facilitation) among neurons tuned to the same depth (image disparity) across different spatial locations (Fig. 1b, along the red dashed diagonal lines). In other words, a disparity estimate at one location is more likely to be correct if neighboring locations have similar disparity estimates. A dynamic system under such constraints can relax to a stable global disparity map. Here, we present neurophysiological evidence of interactions between disparity-tuned neurons in the primary visual cortex that is consistent with this general approach. We sampled from a variety of spatially distributed disparity tuned neurons (see electrodes Fig. 1b) while displaying DRDS stimuli defined at various disparities (see stimulus Fig. 1b). We then measured the dynamics of interactions by assessing the temporal evolution of correlation in neural responses. Figure 1: (a) Left and right images of random dot stereogram (right image has been shifted to the right). (b) 1D graphical depiction of competition (blue solid lines) and cooperation (red dashed lines) among disparity-tuned neurons with respect to space as defined by Marr and Poggio’s stereo algorithm [2].
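The two rules described above (competition among disparities at one location, cooperation among equal disparities at neighboring locations) can be sketched as a simple iterative relaxation. This is a simplified 1D illustration, not Marr and Poggio's original formulation: the function name `cooperative_stereo_1d` and the parameters `radius`, `inhibition`, and `threshold` are assumptions chosen for the example.

```python
import numpy as np

def cooperative_stereo_1d(initial, n_iters=10, radius=2, inhibition=1.0, threshold=1.5):
    """Relax a grid of candidate matches toward one disparity per position.

    initial: array of shape (n_positions, n_disparities) with 1 where a
    left/right feature match is possible and 0 elsewhere (assumed input format).
    """
    initial = np.asarray(initial, dtype=float)
    state = initial.copy()
    kernel = np.ones(2 * radius + 1)
    kernel[radius] = 0.0  # a unit does not support itself
    for _ in range(n_iters):
        # Cooperation: support from nearby positions tuned to the same disparity.
        excite = np.stack(
            [np.convolve(state[:, d], kernel, mode="same") for d in range(state.shape[1])],
            axis=1,
        )
        # Competition: active units at the same position with other disparities.
        inhibit = state.sum(axis=1, keepdims=True) - state
        drive = excite - inhibition * inhibit + initial
        state = (drive >= threshold).astype(float)
    return state
```

With a surface at a single disparity plus a few isolated false matches, the false matches receive no neighborhood support and are switched off, while the coherent surface survives. The threshold and inhibition weight interact, so other settings may behave differently.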
2 Methods 2.1 Recording and stimulation Recordings were made in V1 of two awake, behaving macaques. We simultaneously recorded from 4-8 electrodes providing data from up to 10 neurons in a single recording session (some electrodes recorded from as many as 3 neurons). We collected data from 112 neurons that provided 224 pairs for cross-correlation analysis. For stimuli, we used 12 Hz dynamic random dot stereograms (DRDS; 25% density black and white pixels on a mean luminance background) presented in a 3.5-degree aperture. Liquid crystal shutter goggles were used to present random dot patterns to each eye separately. Eleven horizontal disparities between the two eyes, ranging from ±0.9 degrees, were tested. Seventy-four neurons (66%) had significant disparity tuning and 99 pairs (44%) were comprised of neurons that both had significant disparity tuning (1-way ANOVA, p<0.05). Figure 2: (a) Example recording session from five electrodes in V1. (b) Receptive field (white box; arrow represents direction preference) and random dot stereogram locations for same recording session (small red square is the fixation spot). 2.2 Data analysis Interaction between neurons was described as
2 0.93311459 36 nips-2006-Attentional Processing on a Spike-Based VLSI Neural Network
Author: Yingxue Wang, Rodney J. Douglas, Shih-Chii Liu
Abstract: The neurons of the neocortex communicate by asynchronous events called action potentials (or ’spikes’). However, for simplicity of simulation, most models of processing by cortical neural networks have assumed that the activations of their neurons can be approximated by event rates rather than taking account of individual spikes. The obstacle to exploring the more detailed spike processing of these networks has been reduced considerably in recent years by the development of hybrid analog-digital Very-Large Scale Integrated (hVLSI) neural networks composed of spiking neurons that are able to operate in real-time. In this paper we describe such an hVLSI neural network that performs an interesting task of selective attentional processing that was previously described for a simulated ’pointer-map’ rate model by Hahnloser and colleagues. We found that most of the computational features of their rate model can be reproduced in the spiking implementation, but that spike-based processing requires a modification of the original network architecture in order to memorize a previously attended target.
same-paper 3 0.84680432 135 nips-2006-Modelling transcriptional regulation using Gaussian Processes
Author: Neil D. Lawrence, Guido Sanguinetti, Magnus Rattray
Abstract: Modelling the dynamics of transcriptional processes in the cell requires the knowledge of a number of key biological quantities. While some of them are relatively easy to measure, such as mRNA decay rates and mRNA abundance levels, it is still very hard to measure the active concentration levels of the transcription factor proteins that drive the process and the sensitivity of target genes to these concentrations. In this paper we show how these quantities for a given transcription factor can be inferred from gene expression levels of a set of known target genes. We treat the protein concentration as a latent function with a Gaussian process prior, and include the sensitivities, mRNA decay rates and baseline expression levels as hyperparameters. We apply this procedure to a human leukemia dataset, focusing on the tumour repressor p53 and obtaining results in good accordance with recent biological studies.
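As a rough illustration of the model described in this abstract, the sketch below draws a latent function from a Gaussian process prior and uses it to drive the mRNA level of one target gene. The abstract does not give the equations, so two things are assumptions here: a squared-exponential (RBF) covariance for the prior, and a linear production/decay equation dx/dt = B + S f(t) - D x with baseline B, sensitivity S, and decay rate D. All function names are hypothetical, and forward Euler integration is used only for simplicity.

```python
import numpy as np

def rbf_kernel(t, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance matrix on time points t (assumed kernel)."""
    diff = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_latent(t, rng, lengthscale=1.0, jitter=1e-6):
    """Draw one sample of the latent function f(t) from a zero-mean GP prior."""
    K = rbf_kernel(t, lengthscale) + jitter * np.eye(len(t))  # jitter for stability
    return np.linalg.cholesky(K) @ rng.standard_normal(len(t))

def simulate_mrna(t, f, baseline, sensitivity, decay, x0=None):
    """Forward-Euler solution of the assumed ODE dx/dt = B + S*f(t) - D*x."""
    x = np.empty(len(t), dtype=float)
    x[0] = baseline / decay if x0 is None else x0  # start at the f = 0 steady state
    for k in range(len(t) - 1):
        dt = t[k + 1] - t[k]
        x[k + 1] = x[k] + dt * (baseline + sensitivity * f[k] - decay * x[k])
    return x
```

In a setup like this, B, S, and D would play the role of the hyperparameters mentioned in the abstract, and inference would run in the opposite direction: from observed expression levels back to f(t).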
4 0.74697334 191 nips-2006-The Robustness-Performance Tradeoff in Markov Decision Processes
Author: Huan Xu, Shie Mannor
Abstract: Computation of a satisfactory control policy for a Markov decision process when the parameters of the model are not exactly known is a problem encountered in many practical applications. The traditional robust approach is based on a worst-case analysis and may lead to an overly conservative policy. In this paper we consider the tradeoff between nominal performance and the worst case performance over all possible models. Based on parametric linear programming, we propose a method that computes the whole set of Pareto efficient policies in the performance-robustness plane when only the reward parameters are subject to uncertainty. In the more general case when the transition probabilities are also subject to error, we show that the strategy with the “optimal” tradeoff might be non-Markovian and hence is in general not tractable.
Author: Elisabetta Chicca, Giacomo Indiveri, Rodney J. Douglas
Abstract: Cooperative competitive networks are believed to play a central role in cortical processing and have been shown to exhibit a wide set of useful computational properties. We propose a VLSI implementation of a spiking cooperative competitive network and show how it can perform context dependent computation both in the mean firing rate domain and in spike timing correlation space. In the mean rate case the network amplifies the activity of neurons belonging to the selected stimulus and suppresses the activity of neurons receiving weaker stimuli. In the event correlation case, the recurrent network amplifies with a higher gain the correlation between neurons which receive highly correlated inputs while leaving the mean firing rate unaltered. We describe the network architecture and present experimental data demonstrating its context dependent computation capabilities.
6 0.61452115 187 nips-2006-Temporal Coding using the Response Properties of Spiking Neurons
7 0.53460431 99 nips-2006-Information Bottleneck Optimization and Independent Component Extraction with Spiking Neurons
8 0.52424324 162 nips-2006-Predicting spike times from subthreshold dynamics of a neuron
9 0.51083529 189 nips-2006-Temporal dynamics of information content carried by neurons in the primary visual cortex
10 0.50029802 18 nips-2006-A selective attention multi--chip system with dynamic synapses and spiking neurons
11 0.46735272 154 nips-2006-Optimal Change-Detection and Spiking Neurons
12 0.45414895 165 nips-2006-Real-time adaptive information-theoretic optimization of neurophysiology experiments
13 0.43082884 29 nips-2006-An Information Theoretic Framework for Eukaryotic Gradient Sensing
14 0.4289726 71 nips-2006-Effects of Stress and Genotype on Meta-parameter Dynamics in Reinforcement Learning
15 0.40860188 81 nips-2006-Game Theoretic Algorithms for Protein-DNA binding
16 0.40096056 16 nips-2006-A Theory of Retinal Population Coding
17 0.37669852 192 nips-2006-Theory and Dynamics of Perceptual Bistability
18 0.37591726 49 nips-2006-Causal inference in sensorimotor integration
19 0.36617151 190 nips-2006-The Neurodynamics of Belief Propagation on Binary Markov Random Fields
20 0.36060515 17 nips-2006-A recipe for optimizing a time-histogram