nips nips2004 nips2004-128 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jongmin Kim, John Hopfield, Erik Winfree
Abstract: The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Neural network computation by in vitro transcriptional circuits Jongmin Kim1 , John J. [sent-1, score-0.493]
2 edu 1 Abstract The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. [sent-7, score-0.256]
3 In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. [sent-8, score-0.336]
4 We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. [sent-10, score-0.494]
5 A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. [sent-12, score-0.596]
6 Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. [sent-13, score-0.353]
7 Characterizing and decoding the connectivity of the genetic regulatory networks that govern these responses is a major challenge of the post-genome era [4]. [sent-15, score-0.236]
8 Understanding the operation of biological networks is intricately intertwined with the ability to create sophisticated biochemical networks de novo. [sent-16, score-0.252]
9 Recent work developing synthetic genetic regulatory networks has focused on engineered circuits in bacteria wherein protein signals are produced and degraded [5, 6]. [sent-17, score-0.347]
10 We propose a biochemical model system – a simplified analog of genetic regulatory circuits – that provides well-defined connectivity and uses nucleic acid species as fuel and signals that control the network. [sent-19, score-0.515]
11 Our goal is to establish an explicit model to guide the laboratory construction of synthetic biomolecular systems in which every component is known and RNAP transcript RNase inhibitor (A) activator DNA switch (B) Figure 1: (A) The components of an in vitro circuit. [sent-20, score-0.575]
12 The switch template (blue) is shown with the activator (red) attached. [sent-21, score-0.336]
13 (B) The correspondence between a neural network and an in vitro biochemical network. [sent-23, score-0.34]
14 Neuron activity corresponds to RNA transcript concentration, while synaptic connections correspond to DNA switches with specified input and output. [sent-24, score-0.222]
15 Only two enzymes are used in addition to synthetic DNA templates: RNA polymerase, which recognizes a specific promoter sequence in double-stranded DNA and transcribes the downstream DNA to produce an RNA transcript, and ribonuclease, which degrades RNA but not DNA. [sent-26, score-0.19]
16 In this system, RNA transcript concentrations are taken as signals. [sent-27, score-0.175]
17 Upon interaction with a RNA transcript of the appropriate sequence, the DNA template switches between different conformations like a gene regulated by transcription factors. [sent-29, score-0.343]
18 The network computation is powered by rNTP that drives the synthesis of RNA signals by RNAP, while RNase forces transient signals to decay. [sent-31, score-0.165]
19 With a few assumptions, we find that this stripped-down analog of genetic regulatory networks is mathematically equivalent to recurrent neural networks, confirming that a wide range of programmable dynamical behaviors is attainable. [sent-32, score-0.234]
20 2 Construction of the transcriptional network The DNA transcriptional switch. [sent-33, score-0.625]
21 The elementary unit of our networks will be a DNA switch, which serves the role of a gene in a genetic regulatory circuit. [sent-34, score-0.213]
22 The basic requirements for a DNA switch are to have separate input and output domains, to transcribe poorly by itself [7], and to transcribe efficiently when an activator is bound to it. [sent-35, score-0.369]
23 A possible mechanism of activation is the complementation of an incomplete promoter region, allowing more favorable binding of RNAP to the DNA template. [sent-36, score-0.288]
24 Figure 1A illustrates our proposed design for DNA transcriptional switches and circuits. [sent-37, score-0.392]
25 Furthermore, activator A contains a “toehold” region [8] that overhangs past the end of D, allowing inhibitor I to strip off A from the DA complex. [sent-40, score-0.314]
26 This set of binding reactions provides a means to choose the threshold of the sigmoidal activation function, as will be explained later. [sent-42, score-0.374]
27 RNAP and RNase drive changes in RNA transcript concentration; their activity is modeled using a first-order approximation for enzyme kinetics. [sent-43, score-0.262]
28 By RNA polymerase By RNase kp k d R→φ DA → DA + R αkp D → D+R where 0 < α < 1 due to lack of activation and φ represents the complete degradation of RNA products by RNase. [sent-45, score-0.311]
29 kd and kp are set by the concentration of enzymes. [sent-46, score-0.322]
30 Analysis of our i system is greatly simplified by the assumption that the binding reactions are fast and go to completion. [sent-48, score-0.271]
31 We define D tot as the sum of free and bound species:D tot = [D] + [DA]. [sent-49, score-1.072]
32 Similarly, I tot = [I]+[AI] and Atot = [A]+[DA]+[AI]. [sent-50, score-0.536]
33 Then, [DA] depends on D tot and ∆, where ∆ = Atot − I tot . [sent-51, score-1.072]
34 The amount of [DA] is proportional to ∆ when 0 < ∆ < D tot , as shown in Figure 2A. [sent-53, score-0.536]
35 At steady-state, kd [R] = kp [DA] + αkp [D]; thus, 1 kp tot ˆ [R] = D ((1 − α)σ(∆) + 1 + α) . [sent-56, score-0.888]
36 2 kd If we consider the activator concentration as an input and the steady-state transcript concentration as an output, then the (presumed constant) inhibitor concentration, I tot , sets the threshold, and the function assumes a sigmoidal shape (Fig. [sent-57, score-1.225]
37 Adjusting the amount of template, D tot , sets the magnitude of the output signal and the width of the transition region (Fig. [sent-59, score-0.589]
38 Thus, we have a sigmoidal function with an adjustable threshold, without reliance on cooperative binding of transcription factors as is common in biological systems [9]. [sent-62, score-0.251]
39 The input domain of a DNA switch is upstream of the promoter region; the output domain is downstream of the promoter region. [sent-64, score-0.408]
40 This separation of domains allows us to design DNA switches that have any desired connectivity. [sent-65, score-0.13]
41 [ DA ] D σ (x) tot 1 D tot −1 ∆ [R] 1 [R] x −1 (A) (B) (C) A tot (D) A tot Figure 2: (A) [DA] as a function of ∆. [sent-66, score-2.144]
42 (C,D) [R] as a function of Atot for three values of D tot and I tot , respectively. [sent-68, score-1.072]
43 We assume that distinct signals in the network are represented as distinct RNA sequences that have negligible crosstalk (undesired binding of two molecules representing different signals). [sent-69, score-0.353]
44 The tot change of RNA concentrations over time is easy to express with ∆i = Atot − Ii : i d∆i = −kd · ∆i + kp dt sij ([Dij Aj ] + α[Dij ]) . [sent-75, score-0.863]
45 We show next that the time evolution of this biochemical network model is equivalent to that of a general Hopfield neural network model [3]: τ dxi = −xi + dt wij σ(xj ) + θi . [sent-77, score-0.399]
46 2∆i ˆ Let ∆i = Dtot − 1 be a rescaled difference between activator and inhibitor concentrations, ∗i tot where D∗i is the load on Ai , i. [sent-79, score-0.864]
47 , the total concentration of all switches that bind to Ai : tot tot tot D∗i = j Dji and Dij = [Dij Aj ] + [Dij ]. [sent-81, score-1.905]
48 Then, we can derive the following rate ˆ equation, where ∆i plays the role of unit i’s activity xi : ˆ 1 d∆ i ˆ = − ∆i + kd dt j tot Dij kp (1 − α)sij tot kd D∗i ˆ σ(∆j )+ j tot Dij kp (1 + α)sij tot − 1 . [sent-82, score-2.618]
49 kd D∗i (3) Given the set of constants describing an arbitrary transcriptional network, the constants for an equivalent neural network can be obtained immediately by comparing Equations 2 and 3. [sent-83, score-0.505]
50 The time constant τ is the inverse of the RNase degradation rate: fast turnover of RNA molecules leads to fast response of the network. [sent-84, score-0.129]
51 The synaptic weight wij is proportional to the concentration of switch template ij, attenuated by the load on Ai . [sent-85, score-0.347]
52 To implement an arbitrary neural network, we must introduce two new types of switches to the transcriptional network. [sent-87, score-0.392]
53 To achieve arbitrary thresholds, we introduce bias switches D iB which have no input domain and thus produce outputs constitutively; this adds an adjustable constant to the right hand side of Equation 3. [sent-88, score-0.182]
54 To balance the load on Ai , we add null switches tot D0i which bind to Ai but have no output domain; this allows us to ensure that all D∗i are equal. [sent-89, score-0.811]
55 Consequently, given any neural network with weights wij and thresholds θi , we can tot specify concentrations Dij such that the biochemical network has identical dynamics, for some τ . [sent-90, score-1.012]
56 Next, we explore the validity of our assumption that enzyme kinetics are first-order reactions. [sent-92, score-0.193]
57 A basic but more realistic model is the Michaelis–Menten mechanism [10], in which the enzyme and substrate bind to form an enzyme-substrate complex. [sent-93, score-0.222]
58 An important ramification of Michaelis–Menten reactions is that there is competition for the enzyme by the substrates, because the concentration of available enzymes is reduced as they bind to substrates, leading to saturation when the enzyme concentration is limiting. [sent-95, score-0.833]
59 E tot [Dij A ] tot and Ed are the concentrations of RNAP and RNase, respectively. [sent-97, score-1.155]
60 L = i,j KM j + [Dij ] i,j KM [A ]+[I ]+[A I ]+[D A ] j j j j ij j is the load on RNAP and Ld = is the load on i,j Kd,M RNase (i. [sent-98, score-0.174]
61 , the total concentration of binding targets divided by the Michaelis constants of the enzymes), both of which may be time varying. [sent-100, score-0.29]
62 Introduction of a new type of switch with different Michaelis constants can make L constant by balancing the load on the enzyme. [sent-102, score-0.232]
63 3 Example computations by transcriptional networks Feed-forward networks. [sent-104, score-0.305]
64 We first consider a feed-forward network to compute f (x, y, z) = xyz + y z + x. [sent-105, score-0.127]
65 Using the conversion rule discussed in the network equivalence section, we can calculate the parameters of the transcriptional network. [sent-108, score-0.363]
66 For comparison, we also explicitly simulated mass action dynamics for the full set of chemical equations with the Michaelis–Menten enzyme reactions, using biologically tot plausible rate constants and with E tot and Ed calculated from Equation 4 using estimated values of L and Ld . [sent-111, score-1.336]
67 After training, the parameters of the neural network are converted to the parameters of the transcriptional network as previously described. [sent-118, score-0.464]
68 Starting from a random 6 f 1 −1 y y z (B) z 4 −2 1 1 2 2 1 f 1 0 −2 −1 1 (A) 1 −1 ^ ∆i x x −4 (C) 0 200 400 600 time(sec) 800 1000 Figure 3: (A,B) A Boolean circuit and a neural network to compute f (x, y, z) = xyz+¯z+ ¯ y x. [sent-119, score-0.141]
69 (B) Time-course for the transcriptional network recovery of the third pattern. [sent-122, score-0.363]
70 (odd columns: blue lines, even columns: red lines) initial state, a typical response of the transcriptional network (with the first-order approximation of Equation 3) is shown in Figure 4B. [sent-123, score-0.363]
71 Thus, our in vitro transcriptional networks can support complex sets of stable steady-states. [sent-124, score-0.378]
72 In a biochemical system, a limited global resource, such as RNAP, can act to regulate all the DNA switches and thus similarly produce global inhibition. [sent-129, score-0.296]
73 This effect is exploited by the simple transcriptional network shown in Figure 5B, in which the output from each DNA switch activates the same DNA switch itself, and mutual inhibition is achieved by competition for RNAP. [sent-130, score-0.642]
74 Specifically, we have switch templates D ii with fixed thresholds set by Ii , and Dii produces Ai as its output RNA. [sent-131, score-0.251]
75 With the instant binding assumption, we then derive the following equation: dAtot E tot kd,cat tot E tot i =− d · Ai + dt 1 + Ld Kd,M 1+L kcat k [Dii Ai ] + cat [Dii ] KM KM . [sent-132, score-1.865]
76 (5) The production rate of Ai depends on Atot and on L, while the degradation rate of Ai i depends on Atot and on Ld , as shown in Figure 6A. [sent-133, score-0.209]
77 For a winner-take-all network, an ON i state switch draws more RNAP than an OFF state switch (because of the smaller Michaelis constant for the ON state). [sent-134, score-0.252]
78 Thus, if the other switches are turned OFF, the load on RNAP (L) becomes small, leading to faster production of the remaining ON switches. [sent-135, score-0.264]
79 When the production rate curve and the degradation rate curve have three intersections, bistability is achieved such that the switches remain ON or OFF, depending on their current state. [sent-136, score-0.339]
80 Consider n equivalent switches starting with initial activator concentrations above the threshold, and with the highest concentration at least δ above the rest (as a percentage). [sent-137, score-0.498]
81 Analysis indicates that a less leaky system (small α) and sufficient differences in initial activator concentrations (large δ) can guarantee the existence of a unique winner. [sent-138, score-0.253]
82 Switches get turned OFF one by one whenever the activator level approaches the threshold, until only one switch remains ON. [sent-142, score-0.296]
83 4 L : high 10 tot Ii tot tot Ii +D ii tot Ai (B) 2 1. [sent-154, score-2.18]
84 5 i L : low dt 1 25 [A ] / µM 30 tot dAi 0. [sent-157, score-0.567]
85 Similarly, we can consider a k-WTA network where k winners persist. [sent-163, score-0.143]
86 The structure of our biochemical networks differs from that of previous formal models of genetic regulatory circuits [14, 15, 16]. [sent-167, score-0.436]
87 There, the occupancy of regulatory binding sites corresponds to the state of neurons, the weights are set by the cooperative interaction among transcription factors, and the thresholds are the effective dissociation constants at a binding site. [sent-169, score-0.45]
88 Thus, implementing a general N -unit neural network requires only O(N ) biochemical species, but up to O(N 2 ) significant binding interactions must be encoded in the molecular sequences. [sent-170, score-0.402]
89 In contrast, in our transcriptional networks, each weight and threshold is represented by the continuously adjustable concentration of a distinct species, and the introduction or deletion of any node is straightforward. [sent-172, score-0.437]
90 Each synapse is represented by a DNA switch with a single input–output specification, so the number of DNA switches grows as O(N 2 ) for a fully recurrent neural network with N neurons (unlike the circuits of [16]). [sent-173, score-0.414]
91 The time for computation will increase as O(N ) due to finite hybridization rates because, if the total concentration of all RNA signals is capped, the concentration of any given species will decrease as 1/N . [sent-175, score-0.363]
92 The weights are capped by the maximum gain of the system, which is the production rate divided by the degradation rate. [sent-176, score-0.206]
93 Since the time constant of the network is the inverse of the degradation rate, if we wish to implement a network with large weights, we must increase the time constant. [sent-177, score-0.285]
94 , 1013 for a 1µM signal in 20µl) of a given species must be produced to change the concentration in a bulk sample. [sent-183, score-0.182]
95 Worse yet, because degradation is not modulated in the transcriptional network, switching relies on selective change of production rates, thus continually using energy to maintain an ON state. [sent-184, score-0.413]
96 The theory presented here is meant to serve as a guide for the construction of real biochemical computing networks. [sent-186, score-0.166]
97 For example, hybridization is neither instantaneous nor irreversible, strands can have undesired conformations and crosstalk, and enzyme reactions depend on the sequence and are subject to side reactions that generate incomplete products. [sent-188, score-0.567]
98 Some problems, such as hybridization speed and crosstalk, can be reduced by slowing the enzyme reactions and using proper sequence design [19]. [sent-189, score-0.34]
99 Restoration of outputs to digital values, achieved by any sufficiently highgain sigmoidal activation function, provides some level of immunity to noise at the gate level, and attractor dynamics can provide restoration at the network level. [sent-191, score-0.221]
100 A full understanding of fault tolerance in biochemical computing remains an important open question. [sent-192, score-0.189]
wordName wordTfidf (topN-words)
[('tot', 0.536), ('transcriptional', 0.262), ('dij', 0.228), ('rna', 0.208), ('dna', 0.208), ('activator', 0.17), ('enzyme', 0.17), ('biochemical', 0.166), ('rnap', 0.157), ('kp', 0.145), ('reactions', 0.136), ('binding', 0.135), ('michaelis', 0.131), ('rnase', 0.131), ('switches', 0.13), ('switch', 0.126), ('da', 0.12), ('aj', 0.119), ('concentration', 0.115), ('network', 0.101), ('genetic', 0.097), ('inhibitor', 0.092), ('transcript', 0.092), ('km', 0.091), ('promoter', 0.091), ('concentrations', 0.083), ('degradation', 0.083), ('atot', 0.078), ('menten', 0.078), ('regulatory', 0.073), ('vitro', 0.073), ('ai', 0.073), ('ld', 0.069), ('sij', 0.068), ('production', 0.068), ('species', 0.067), ('load', 0.066), ('kcat', 0.065), ('kd', 0.062), ('circuits', 0.057), ('polymerase', 0.052), ('bind', 0.052), ('molecules', 0.046), ('enzymes', 0.046), ('sigmoidal', 0.043), ('networks', 0.043), ('ij', 0.042), ('jj', 0.042), ('transcription', 0.042), ('winners', 0.042), ('constants', 0.04), ('circuit', 0.04), ('template', 0.04), ('conformations', 0.039), ('crosstalk', 0.039), ('dtot', 0.039), ('hop', 0.039), ('templates', 0.037), ('ii', 0.036), ('hybridization', 0.034), ('signals', 0.032), ('incomplete', 0.031), ('activation', 0.031), ('adjustable', 0.031), ('downstream', 0.031), ('dt', 0.031), ('threshold', 0.029), ('saturation', 0.029), ('rate', 0.029), ('dii', 0.027), ('output', 0.027), ('region', 0.026), ('capped', 0.026), ('elowitz', 0.026), ('stoichiometry', 0.026), ('strip', 0.026), ('xyz', 0.026), ('yurke', 0.026), ('cat', 0.026), ('thresholds', 0.025), ('chemical', 0.025), ('reaction', 0.025), ('gate', 0.025), ('sec', 0.024), ('connectivity', 0.023), ('wta', 0.023), ('fault', 0.023), ('mills', 0.023), ('kinetics', 0.023), ('ribonuclease', 0.023), ('transcribe', 0.023), ('bacteria', 0.023), ('synthetic', 0.022), ('domain', 0.021), ('behaviors', 0.021), ('undesired', 0.021), ('complements', 0.021), ('restoration', 0.021), ('substrates', 0.021), ('units', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 128 nips-2004-Neural Network Computation by In Vitro Transcriptional Circuits
Author: Jongmin Kim, John Hopfield, Erik Winfree
Abstract: The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. 1
2 0.065512873 143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data
Author: Mario Marchand, Mohak Shah
Abstract: We propose a “soft greedy” learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets. 1
3 0.062110297 149 nips-2004-Probabilistic Inference of Alternative Splicing Events in Microarray Data
Author: Ofer Shai, Brendan J. Frey, Quaid D. Morris, Qun Pan, Christine Misquitta, Benjamin J. Blencowe
Abstract: Alternative splicing (AS) is an important and frequent step in mammalian gene expression that allows a single gene to specify multiple products, and is crucial for the regulation of fundamental biological processes. The extent of AS regulation, and the mechanisms involved, are not well understood. We have developed a custom DNA microarray platform for surveying AS levels on a large scale. We present here a generative model for the AS Array Platform (GenASAP) and demonstrate its utility for quantifying AS levels in different mouse tissues. Learning is performed using a variational expectation maximization algorithm, and the parameters are shown to correctly capture expected AS trends. A comparison of the results obtained with a well-established but low through-put experimental method demonstrate that AS levels obtained from GenASAP are highly predictive of AS levels in mammalian tissues. 1 Biological diversity through alternative splicing Current estimates place the number of genes in the human genome at approximately 30,000, which is a surprisingly small number when one considers that the genome of yeast, a singlecelled organism, has 6,000 genes. The number of genes alone cannot account for the complexity and cell specialization exhibited by higher eukaryotes (i.e. mammals, plants, etc.). Some of that added complexity can be achieved through the use of alternative splicing, whereby a single gene can be used to code for a multitude of products. Genes are segments of the double stranded DNA that contain the information required by the cell for protein synthesis. That information is coded using an alphabet of 4 (A, C, G, and T), corresponding to the four nucleotides that make up the DNA. In what is known as the central dogma of molecular biology, DNA is transcribed to RNA, which in turn is translated into proteins. Messenger RNA (mRNA) is synthesized in the nucleus of the cell and carries the genomic information to the ribosome. In eukaryotes, genes are generally comprised of both exons, which contain the information needed by the cell to synthesize proteins, and introns, sometimes referred to as spacer DNA, which are spliced out of the pre-mRNA to create mature mRNA. An estimated 35%-75% of human genes [1] can be C1 (a) C1 A C1 C2 C1 A 3’ C2 (b) C2 C1 A 3’ C1 A 5’ C2 A 5’ C2 C1 C1 C2 C1 (c) A2 A1 C1 A1 C1 A2 C1 C2 C2 C1 (d) C2 C2 A C2 C2 C2 C1 C2 Figure 1: Four types of AS. Boxes represent exons and lines represent introns, with the possible splicing alternatives indicated by the connectors. (a) Single cassette exon inclusion/exclusion. C1 and C2 are constitutive exons (exons that are included in all isoforms) and flank a single alternative exon (A). The alternative exon is included in one isoform and excluded in the other. (b) Alternative 3’ (or donor) and alternative 5’ (acceptor) splicing sites. Both exons are constitutive, but may contain alternative donor and/or acceptor splicing sites. (c) Mutually exclusive exons. One of the two alternative exons (A1 and A2 ) may be included in the isoform, but not both. (d) Intron inclusion. An intron may be included in the mature mRNA strand. spliced to yield different combinations of exons (called isoforms), a phenomenon referred to as alternative splicing (AS). There are four major types of AS as shown in Figure 1. Many multi-exon genes may undergo more than one alternative splicing event, resulting in many possible isoforms from a single gene. [2] In addition to adding to the genetic repertoire of an organism by enabling a single gene to code for more than one protein, AS has been shown to be critical for gene regulation, contributing to tissue specificity, and facilitating evolutionary processes. Despite the evident importance of AS, its regulation and impact on specific genes remains poorly understood. The work presented here is concerned with the inference of single cassette exon AS levels (Figure 1a) based on data obtained from RNA expression arrays, also known as microarrays. 1.1 An exon microarray data set that probes alternative splicing events Although it is possible to directly analyze the proteins synthesized by a cell, it is easier, and often more informative, to instead measure the abundance of mRNA present. Traditionally, gene expression (abundance of mRNA) has been studied using low throughput techniques (such as RT-PCR or Northern blots), limited to studying a few sequences at a time and making large scale analysis nearly impossible. In the early 1990s, microarray technology emerged as a method capable of measuring the expression of thousands of DNA sequences simultaneously. Sequences of interest are deposited on a substrate the size of a small microscope slide, to form probes. The mRNA is extracted from the cell and reverse-transcribed back into DNA, which is labelled with red and green fluorescent dye molecules (cy3 and cy5 respectively). When the sample of tagged DNA is washed over the slide, complementary strands of DNA from the sample hybridize to the probes on the array forming A-T and C-G pairings. The slide is then scanned and the fluorescent intensity is measured at each probe. It is generally assumed that the intensity measure at the probe is linearly related to the abundance of mRNA in the cell over a wide dynamic range. Despite significant improvements in microarray technologies in recent years, microarray data still presents some difficulties in analysis. Low measurements tend to have extremely low signal to noise ratio (SNR) [7] and probes often bind to sequences that are very similar, but not identical, to the one for which they were designed (a process referred to as cross- C1 A C1 A C1:A C1 C2 C2 3 Body probes A:C2 A C2 2 Inclusion junction probes C1:C2 C1 C2 1 Exclusion junction probe Figure 2: Each alternative splicing event is studied using six probes. Probes were chosen to measure the expression levels of each of the three exons involved in the event. Additionally, 3 probes are used that target the junctions that are formed by each of the two isoforms. The inclusion isoform would express the junctions formed by C1 and A, and A and C2 , while the exclusion isoform would express the junction formed by C1 and C2 hybridization). Additionally, probes exhibit somewhat varying hybridization efficiency, and sequences exhibit varying labelling efficiency. To design our data sets, we mined public sequence databases and identified exons that were strong candidates for exhibiting AS (the details of that analysis are provided elsewhere [4, 3]). Of the candidates, 3,126 potential AS events in 2,647 unique mouse genes were selected for the design of Agilent Custom Oligonucleotide microarray. The arrays were hybridized with unamplified mRNA samples extracted from 10 wild-type mouse tissues (brain, heart, intestine, kidney, liver, lung, salivary gland, skeletal muscle, spleen, and testis). Each AS event has six target probes on the arrays, chosen from regions of the C1 exon, C2 exon, A exon, C1 :A splice junction, A:C2 splice junction, and C1 :C2 splice junction, as shown in Figure 2. 2 Unsupervised discovery of alternative splicing With the exception of the probe measuring the alternative exon, A (Figure 2), all probes measure sequences that occur in both isoforms. For example, while the sequence of the probe measuring the junction A:C1 is designed to measure the inclusion isoform, half of it corresponds to a sequence that is found in the exclusion isoform. We can therefore safely assume that the measured intensity at each probe is a result of a certain amount of both isoforms binding to the probe. Due to the generally assumed linear relationship between the abundance of mRNA hybridized at a probe and the fluorescent intensity measured, we model the measured intensity as a weighted sum of the overall abundance of the two isoforms. A stronger assumption is that of a single, consistent hybridization profile for both isoforms across all probes and all slides. Ideally, one would prefer to estimate an individual hybridization profile for each AS event studied across all slides. However, in our current setup, the number of tissues is small (10), resulting in two difficulties. First, the number of parameters is very large when compared to the number of data point using this model, and second, a portion of the events do not exhibit tissue specific alternative splicing within our small set of tissues. While the first hurdle could be accounted for using Baysian parameter estimation, the second cannot. 2.1 GenASAP - a generative model for alternative splicing array platform Using the setup described above, the expression vector x, containing the six microarray measurements as real numbers, can be decomposed as a linear combination of the abundance of the two splice isoforms, represented by the real vector s, with some added noise: x = Λs + noise, where Λ is a 6 × 2 weight matrix containing the hybridization profiles for s1 ^ xC ^ xC 1 2 s2 ^ xC :A ^ xA ^ xA:C 2 ^ xC1:C2 2 xC1:C2 1 r xC xC 2 xA xC :A xA:C oC1 oC2 oA oC :A oA:C 1 1 1 2 oC :C 1 2 Σn 2 Figure 3: Graphical model for alternative splicing. Each measurement in the observed expression profile, x, is generated by either using a scale factor, r, on a linear combination of the isoforms, s, or drawing randomly from an outlier model. For a detailed description of the model, see text. the two isoforms across the six probes. Note that we may not have a negative amount of a given isoform, nor can the presence of an isoform deduct from the measured expression, and so both s and Λ are constrained to be positive. Expression levels measured by microarrays have previously been modelled as having expression-dependent noise [7]. To address this, we rewrite the above formulation as x = r(Λs + ε), (1) where r is a scale factor and ε is a zero-mean normally distributed random variable with a diagonal covariance matrix, Ψ, denoted as p(ε) = N (ε; 0, Ψ). The prior distribution for the abundance of the splice isoforms is given by a truncated normal distribution, denoted as p(s) ∝ N (s, 0, I)[s ≥ 0], where [·] is an indicator function such that [s ≥ 0] = 1 if ∀i, si ≥ 0, and [s ≥ 0] = 0 otherwise. Lastly, there is a need to account for aberrant observations (e.g. due to faulty probes, flakes of dust, etc.) with an outlier model. The complete GenASAP model (shown in Figure 3) accounts for the observations as the outcome of either applying equation (1) or an outlier model. To avoid degenerate cases and ensure meaningful and interpretable results, the number of faulty probes considered for each AS event may not exceed two, as indicated by the filled-in square constraint node in Figure 3. The distribution of x conditional on the latent variables, s, r, and o, is: N (xi ; rΛi s, r2 Ψi )[oi =0] N (xi ; Ei , Vi )[oi =1] , p(x|s, r, o) = (2) i where oi ∈ {0, 1} is a bernoulli random variable indicating if the measurement at probe xi is the result of the AS model or the outlier model parameterized by p(oi = 1) = γi . The parameters of the outlier model, E and V, are not optimized and are set to the mean and variance of the data. 2.2 Variational learning in the GenASAP model To infer the posterior distribution over the splice isoform abundances while at the same time learning the model parameters we use a variational expectation-maximization algorithm (EM). EM maximizes the log likelihood of the data by iteratively estimating the posterior distribution of the model given the data in the expectation (E) step, and maximizing the log likelihood with respect to the parameters, while keeping the posterior fixed, in the maximization (M) step. Variational EM is used when, as in the case of GenASAP, the exact posterior is intractable. Variational EM minimizes the free energy of the model, defined as the KL-divergence between the joint distribution of the latent and observed variables and the approximation to the posterior under the model parameters [5, 6]. We approximate the true posterior using the Q distribution given by T Q({s(t) }, {o(t) }, {r(t) }) = (t) Q(r(t) )Q(o(t) |r(t) ) t=1 (t) Q(si |oi , r(t) ) i −1 T =Z (t) (3) d d ρ(t) ω (t) N (s(t) ; µ(t) , Σ(t) )[s(t) ≥ 0], ro ro t=1 where Z is a normalization constant, the superscript d indicates that Σ is constrained to be diagonal, and there are T iid AS events. For computational efficiency, r is selected from a finite set, r ∈ {r1 , r2 , . . . , rC } with uniform probability. The variational free energy is given by Q({s(t) }, {o(t) }, {r(t) }) . P ({s(t) }, {o(t) }, {r(t) }, {x(t) }) s r o (4) Variational EM minimizes the free energy by iteratively updating the Q distribution’s vari(t)d (t)d ational parameters (ρ(t) , ω (t) , µro , and Σro ) in the E-step, and the model parameters (Λ, Ψ, {r1 , r2 , . . . , rC }, and γ) in the M-step. The resulting updates are too long to be shown in the context of this paper and are discussed in detail elsewhere [3]. A few particular points regarding the E-step are worth covering in detail here. Q({s(t) }, {o(t) }, {r(t) }) log F(Q, P ) = If the prior on s was a full normal distribution, there would be no need for a variational approach, and exact EM is possible. For a truncated normal distribution, however, the mixing proportions, Q(r)Q(o|r) cannot be calculated analytically except for the case where s is scalar, necessitating the diagonality constraint. Note that if Σ was allowed to be a full covariance matrix, equation (3) would be the true posterior, and we could find the sufficient statistics of Q(s(t) |o(t) , r(t) ): −1 µ(t) = (I + ΛT (I − O(t) )T Ψ−1 (I − O(t) )Λ)−1 ΛT (I − O(t) )T Ψ−1 x(t) r(t) ro −1 Σ(t) ro = (I + ΛT (I − O(t) )T Ψ−1 (I − O(t) )Λ) (5) (6) where O is a diagonal matrix with elements Oi,i = oi . Furthermore, it can be easily shown that the optimal settings for µd and Σd approximating a normal distribution with full covariance Σ and mean µ is µd optimal = µ −1 Σd optimal = diag(Σ (7) −1 ) (8) In the truncated case, equation (8) is still true. Equation (7) does not hold, though, and µd optimal cannot be found analytically. In our experiments, we found that using equation (7) still decreases the free energy every E-step, and it is significantly more efficient than using, for example, a gradient decent method to compute the optimal µd . Intuitive Weigh Matrix Optimal Weight Matrix 50 50 40 40 30 30 20 20 10 0 10 Inclusion Isoform 0 Exclusion Isoform Inclusion Isoform (a) Exclusion Isoform (b) Figure 4: (a) An intuitive set of weights. Based on the biological background, one would expect to see the inclusion isoform hybridize to the probes measuring C1 , C2 , A, C1 :A, and A:C2 , while the exclusion isoform hybridizes to C1 , C2 , and C1 :C2 . (b) The learned set of weights closely agrees with the intuition, and captures cross hybridization between the probes Contribution of exclusion isoform Contribution of inclusion isoform AS model Original Data (a) RT−PCR AS model measurement prediction (% exclusion) (% exclusion) 14% 72% (b) 27% 70% 8% 22% outliers (c) Figure 5: Three examples of data cases and their predictions. (a) The data does not follow our notion of single cassette exon AS, but the AS level is predicted accurately by the model.(b) The probe C1 :A is marked as outlier, allowing the model to predict the other probes accurately. (c) Two probes are marked as outliers, and the model is still successful in predicting the AS levels. 3 Making biological predictions about alternative splicing The results presented in this paper were obtained using two stages of learning. In the first step, the weight matrix, Λ, is learned on a subset of the data that is selected for quality. Two selection criteria were used: (a) sequencing data was used to select those cases for which, with high confidence, no other AS event is present (Figure 1) and (b) probe sets were selected for high expression, as determined by a set of negative controls. The second selection criterion is motivated by the common assumption that low intensity measurements are of lesser quality (see Section 1.1). In the second step, Λ is kept fixed, and we introduce the additional constraint that the noise is isotropic (Ψ = ψI) and learn on the entire data set. The constraint on the noise is introduced to prevent the model from using only a subset of the six probes for making the final set of predictions. We show a typical learned set of weights in Figure 4. The weights fit well with our intuition of what they should be to capture the presence of the two isoforms. Moreover, the learned weights account for the specific trends in the data. Examples of model prediction based on the microarray data are shown in Figure 5. Due to the nature of the microarray data, we do not expect all the inferred abundances to be equally good, and we devised a scoring criterion that ranks each AS event based on its fit to the model. Intuitively, given two input vectors that are equivalent up to a scale factor, with inferred MAP estimations that are equal up to the same scale factor, we would like their scores to be identical. The scoring criterion used, therefore is k (xk − rΛk s)2 /(xk + Rank 500 1000 2000 5000 10000 15000 20000 30000 Pearson’s correlation coefficient 0.94 0.95 0.95 0.79 0.79 0.78 0.75 0.65 False positive rate 0.11 0.08 0.05 0.2 0.25 0.29 0.32 0.42 Table 1: Model performance evaluated at various ranks. Using 180 RT-PCR measurements, we are able to predict the model’s performance at various ranks. Two evaluation criteria are used: Pearson’s correlation coefficient between the model’s predictions and the RT-PCR measurements and false positive rate, where a prediction is considered to be false positive if it is more than 15% away from the RT-PCR measurement. rΛk s)2 , where the MAP estimations for r and s are used. This scoring criterion can be viewed as proportional to the sum of noise to signal ratios, as estimated using the two values given by the observation and the model’s best prediction of that observation. Since it is the relative amount of the isoforms that is of most interest, we need to use the inferred distribution of the isoform abundances to obtain an estimate for the relative levels of AS. It is not immediately clear how this should be done. We do, however, have RTPCR measurements for 180 AS events to guide us (see figure 6 for details). Using the top 50 ranked RT-PCR measurement, we fit three parameters, {a1 , a2 , a3 }, such that the s2 proportion of excluded isoform present, p, is given by p = a1 s1 +a2 s2 + a3 , where s1 is the MAP estimation of the abundance of the inclusion isoform, s2 is the MAP estimation of the abundance of the exclusion isoform, and the RT-PCR measurement are used for target p. The parameters are fitted using gradient descent on a least squared error (LSE) evaluation criterion. We used two criteria to evaluate the quality of the AS model predictions. Pearson’s correlation coefficient (PCC) is used to evaluate the overall ability of the model to correctly estimate trends in the data. PCC is invariant to affine transformation and so is independent of the transformation parameters a1 and a3 discussed above, while the parameter a2 was found to effect PCC very little. The PCC stays above 0.75 for the top two thirds ranked predictions. The second evaluation criterion used is the false positive rate, where a prediction is considered to be false positive if it is more than 15% away from the RT-PCR measurement. This allows us to say, for example, that if a prediction is within the top 10000, we are 75% confident that it is within 15% of the actual levels of AS. 4 Summary We designed a novel AS model for the inference of the relative abundance of two alternatively spliced isoforms from six measurements. Unsupervised learning in the model is performed using a structured variational EM algorithm, which correctly captures the underlying structure of the data, as suggested by its biological nature. The AS model, though presented here for a cassette exon AS events, can be used to learn any type of AS, and with a simple adjustment, multiple types. The predictions obtained from the AS model are currently being used to verify various claims about the role of AS in evolution and functional genomics, and to help identify sequences that affect the regulation of AS. % Exclusion isoform RT−PCR measurement Vs. AS model predictions RT−PCR measurements: 90 80 AS model prediction Int es Te tine sti Kid s n Sa ey liva Br ry ain Sp le Liv en er Mu sc Lu le ng 100 70 60 50 40 30 14 22 27 32 47 46 66 78 63 20 AS model prediction: 10 27 24 26 26 51 75 60 85 100 (a) 0 0 20 40 60 80 RT−PCR measurement 100 (b) Figure 6: (a) Sample RT-PCR. RNA extracted from the cell is reverse-transcribed to DNA, amplified and labelled with radioactive or fluorescent molecules. The sample is pulled through a viscous gel in an electric field (DNA, being an acid, is positively charged). Shorter strands travel further through the gel than longer ones, resulting in two distinct bands, corresponding to the two isoforms, when exposed to a photosensitive or x-ray film. (b) A scatter plot showing the RT-PCR measurements as compared to the AS model predictions. The plot shows all available RT-PCR measurements with a rank of 8000 or better. The AS model presented assumes a single weight matrix for all data cases. This is an oversimplified view of the data, and current work is being carried out in identifying probe specific expression profiles. However, due to the low dimensionality of the problem (10 tissues, six probes per event), care must be taken to avoid overfitting and to ensure meaningful interpretations. Acknowledgments We would like to thank Wen Zhang, Naveed Mohammad, and Timothy Hughes for their contributions in generating the data set. This work was funded in part by an operating and infrastructure grants from the CIHR and CFI, and a operating grants from NSERC and a Premier’s Research Excellence Award. References [1] J. M. Johnson et al. Genome-wide survey of human alternative pre-mrna splicing with exon junction microarrays. Science, 302:2141–44, 2003. [2] L. Cartegni et al. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nature Gen. Rev., 3:285–98, 2002. [3] Q. Pan et al. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Molecular Cell, 16(6):929–41, 2004. [4] Q. Pan et al. Alternative splicing of conserved exons is frequently species specific in human and mouse. Trends Gen., In Press, 2005. [5] M. I. Jordan, Z. Ghahramani, T. Jaakkola, and Lawrence K. Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183– 233, 1999. [6] R. M. Neal and G. E. Hinton. A view of the em algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models. Cambridge, MIT Press, 1998. [7] D. M. Rocke and B. Durbin. A model for measurement error for gene expression arrays. Journal of Computational Biology, 8(6):557–69, 2001.
4 0.058858149 26 nips-2004-At the Edge of Chaos: Real-time Computations and Self-Organized Criticality in Recurrent Neural Networks
Author: Nils Bertschinger, Thomas Natschläger, Robert A. Legenstein
Abstract: In this paper we analyze the relationship between the computational capabilities of randomly connected networks of threshold gates in the timeseries domain and their dynamical properties. In particular we propose a complexity measure which we find to assume its highest values near the edge of chaos, i.e. the transition from ordered to chaotic dynamics. Furthermore we show that the proposed complexity measure predicts the computational capabilities very well: only near the edge of chaos are such networks able to perform complex computations on time series. Additionally a simple synaptic scaling rule for self-organized criticality is presented and analyzed. 1
5 0.052719459 108 nips-2004-Markov Networks for Detecting Overalpping Elements in Sequence Data
Author: Mark Craven, Joseph Bockhorst
Abstract: Many sequential prediction tasks involve locating instances of patterns in sequences. Generative probabilistic language models, such as hidden Markov models (HMMs), have been successfully applied to many of these tasks. A limitation of these models however, is that they cannot naturally handle cases in which pattern instances overlap in arbitrary ways. We present an alternative approach, based on conditional Markov networks, that can naturally represent arbitrarily overlapping elements. We show how to efficiently train and perform inference with these models. Experimental results from a genomics domain show that our models are more accurate at locating instances of overlapping patterns than are baseline models based on HMMs. 1
6 0.051198509 151 nips-2004-Rate- and Phase-coded Autoassociative Memory
7 0.049516052 118 nips-2004-Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits
8 0.049231794 76 nips-2004-Hierarchical Bayesian Inference in Networks of Spiking Neurons
9 0.045341402 59 nips-2004-Efficient Kernel Discriminant Analysis via QR Decomposition
10 0.044896092 28 nips-2004-Bayesian inference in spiking neurons
11 0.043779418 58 nips-2004-Edge of Chaos Computation in Mixed-Mode VLSI - A Hard Liquid
12 0.043693706 177 nips-2004-Supervised Graph Inference
13 0.041572779 32 nips-2004-Boosting on Manifolds: Adaptive Regularization of Base Classifiers
14 0.040747788 194 nips-2004-Theory of localized synfire chain: characteristic propagation speed of stable spike pattern
15 0.040230632 175 nips-2004-Stable adaptive control with online learning
16 0.037377361 135 nips-2004-On-Chip Compensation of Device-Mismatch Effects in Analog VLSI Neural Networks
17 0.03657354 176 nips-2004-Sub-Microwatt Analog VLSI Support Vector Machine for Pattern Classification and Sequence Estimation
18 0.034865193 33 nips-2004-Brain Inspired Reinforcement Learning
19 0.034851361 181 nips-2004-Synergies between Intrinsic and Synaptic Plasticity in Individual Model Neurons
20 0.033648696 157 nips-2004-Saliency-Driven Image Acuity Modulation on a Reconfigurable Array of Spiking Silicon Neurons
topicId topicWeight
[(0, -0.095), (1, -0.067), (2, -0.0), (3, 0.009), (4, -0.006), (5, 0.021), (6, 0.028), (7, 0.045), (8, -0.021), (9, -0.033), (10, -0.025), (11, -0.093), (12, -0.035), (13, 0.02), (14, 0.025), (15, -0.057), (16, -0.105), (17, -0.067), (18, 0.049), (19, -0.061), (20, -0.03), (21, 0.04), (22, -0.054), (23, -0.011), (24, 0.033), (25, -0.057), (26, -0.059), (27, -0.043), (28, -0.006), (29, 0.004), (30, -0.127), (31, 0.082), (32, 0.064), (33, 0.021), (34, -0.008), (35, 0.098), (36, 0.017), (37, 0.026), (38, 0.109), (39, -0.033), (40, 0.077), (41, -0.03), (42, 0.112), (43, 0.048), (44, 0.053), (45, 0.002), (46, 0.099), (47, -0.074), (48, -0.255), (49, 0.04)]
simIndex simValue paperId paperTitle
same-paper 1 0.93548703 128 nips-2004-Neural Network Computation by In Vitro Transcriptional Circuits
Author: Jongmin Kim, John Hopfield, Erik Winfree
Abstract: The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. 1
2 0.53739762 149 nips-2004-Probabilistic Inference of Alternative Splicing Events in Microarray Data
Author: Ofer Shai, Brendan J. Frey, Quaid D. Morris, Qun Pan, Christine Misquitta, Benjamin J. Blencowe
Abstract: Alternative splicing (AS) is an important and frequent step in mammalian gene expression that allows a single gene to specify multiple products, and is crucial for the regulation of fundamental biological processes. The extent of AS regulation, and the mechanisms involved, are not well understood. We have developed a custom DNA microarray platform for surveying AS levels on a large scale. We present here a generative model for the AS Array Platform (GenASAP) and demonstrate its utility for quantifying AS levels in different mouse tissues. Learning is performed using a variational expectation maximization algorithm, and the parameters are shown to correctly capture expected AS trends. A comparison of the results obtained with a well-established but low through-put experimental method demonstrate that AS levels obtained from GenASAP are highly predictive of AS levels in mammalian tissues. 1 Biological diversity through alternative splicing Current estimates place the number of genes in the human genome at approximately 30,000, which is a surprisingly small number when one considers that the genome of yeast, a singlecelled organism, has 6,000 genes. The number of genes alone cannot account for the complexity and cell specialization exhibited by higher eukaryotes (i.e. mammals, plants, etc.). Some of that added complexity can be achieved through the use of alternative splicing, whereby a single gene can be used to code for a multitude of products. Genes are segments of the double stranded DNA that contain the information required by the cell for protein synthesis. That information is coded using an alphabet of 4 (A, C, G, and T), corresponding to the four nucleotides that make up the DNA. In what is known as the central dogma of molecular biology, DNA is transcribed to RNA, which in turn is translated into proteins. Messenger RNA (mRNA) is synthesized in the nucleus of the cell and carries the genomic information to the ribosome. In eukaryotes, genes are generally comprised of both exons, which contain the information needed by the cell to synthesize proteins, and introns, sometimes referred to as spacer DNA, which are spliced out of the pre-mRNA to create mature mRNA. An estimated 35%-75% of human genes [1] can be C1 (a) C1 A C1 C2 C1 A 3’ C2 (b) C2 C1 A 3’ C1 A 5’ C2 A 5’ C2 C1 C1 C2 C1 (c) A2 A1 C1 A1 C1 A2 C1 C2 C2 C1 (d) C2 C2 A C2 C2 C2 C1 C2 Figure 1: Four types of AS. Boxes represent exons and lines represent introns, with the possible splicing alternatives indicated by the connectors. (a) Single cassette exon inclusion/exclusion. C1 and C2 are constitutive exons (exons that are included in all isoforms) and flank a single alternative exon (A). The alternative exon is included in one isoform and excluded in the other. (b) Alternative 3’ (or donor) and alternative 5’ (acceptor) splicing sites. Both exons are constitutive, but may contain alternative donor and/or acceptor splicing sites. (c) Mutually exclusive exons. One of the two alternative exons (A1 and A2 ) may be included in the isoform, but not both. (d) Intron inclusion. An intron may be included in the mature mRNA strand. spliced to yield different combinations of exons (called isoforms), a phenomenon referred to as alternative splicing (AS). There are four major types of AS as shown in Figure 1. Many multi-exon genes may undergo more than one alternative splicing event, resulting in many possible isoforms from a single gene. [2] In addition to adding to the genetic repertoire of an organism by enabling a single gene to code for more than one protein, AS has been shown to be critical for gene regulation, contributing to tissue specificity, and facilitating evolutionary processes. Despite the evident importance of AS, its regulation and impact on specific genes remains poorly understood. The work presented here is concerned with the inference of single cassette exon AS levels (Figure 1a) based on data obtained from RNA expression arrays, also known as microarrays. 1.1 An exon microarray data set that probes alternative splicing events Although it is possible to directly analyze the proteins synthesized by a cell, it is easier, and often more informative, to instead measure the abundance of mRNA present. Traditionally, gene expression (abundance of mRNA) has been studied using low throughput techniques (such as RT-PCR or Northern blots), limited to studying a few sequences at a time and making large scale analysis nearly impossible. In the early 1990s, microarray technology emerged as a method capable of measuring the expression of thousands of DNA sequences simultaneously. Sequences of interest are deposited on a substrate the size of a small microscope slide, to form probes. The mRNA is extracted from the cell and reverse-transcribed back into DNA, which is labelled with red and green fluorescent dye molecules (cy3 and cy5 respectively). When the sample of tagged DNA is washed over the slide, complementary strands of DNA from the sample hybridize to the probes on the array forming A-T and C-G pairings. The slide is then scanned and the fluorescent intensity is measured at each probe. It is generally assumed that the intensity measure at the probe is linearly related to the abundance of mRNA in the cell over a wide dynamic range. Despite significant improvements in microarray technologies in recent years, microarray data still presents some difficulties in analysis. Low measurements tend to have extremely low signal to noise ratio (SNR) [7] and probes often bind to sequences that are very similar, but not identical, to the one for which they were designed (a process referred to as cross- C1 A C1 A C1:A C1 C2 C2 3 Body probes A:C2 A C2 2 Inclusion junction probes C1:C2 C1 C2 1 Exclusion junction probe Figure 2: Each alternative splicing event is studied using six probes. Probes were chosen to measure the expression levels of each of the three exons involved in the event. Additionally, 3 probes are used that target the junctions that are formed by each of the two isoforms. The inclusion isoform would express the junctions formed by C1 and A, and A and C2 , while the exclusion isoform would express the junction formed by C1 and C2 hybridization). Additionally, probes exhibit somewhat varying hybridization efficiency, and sequences exhibit varying labelling efficiency. To design our data sets, we mined public sequence databases and identified exons that were strong candidates for exhibiting AS (the details of that analysis are provided elsewhere [4, 3]). Of the candidates, 3,126 potential AS events in 2,647 unique mouse genes were selected for the design of Agilent Custom Oligonucleotide microarray. The arrays were hybridized with unamplified mRNA samples extracted from 10 wild-type mouse tissues (brain, heart, intestine, kidney, liver, lung, salivary gland, skeletal muscle, spleen, and testis). Each AS event has six target probes on the arrays, chosen from regions of the C1 exon, C2 exon, A exon, C1 :A splice junction, A:C2 splice junction, and C1 :C2 splice junction, as shown in Figure 2. 2 Unsupervised discovery of alternative splicing With the exception of the probe measuring the alternative exon, A (Figure 2), all probes measure sequences that occur in both isoforms. For example, while the sequence of the probe measuring the junction A:C1 is designed to measure the inclusion isoform, half of it corresponds to a sequence that is found in the exclusion isoform. We can therefore safely assume that the measured intensity at each probe is a result of a certain amount of both isoforms binding to the probe. Due to the generally assumed linear relationship between the abundance of mRNA hybridized at a probe and the fluorescent intensity measured, we model the measured intensity as a weighted sum of the overall abundance of the two isoforms. A stronger assumption is that of a single, consistent hybridization profile for both isoforms across all probes and all slides. Ideally, one would prefer to estimate an individual hybridization profile for each AS event studied across all slides. However, in our current setup, the number of tissues is small (10), resulting in two difficulties. First, the number of parameters is very large when compared to the number of data point using this model, and second, a portion of the events do not exhibit tissue specific alternative splicing within our small set of tissues. While the first hurdle could be accounted for using Baysian parameter estimation, the second cannot. 2.1 GenASAP - a generative model for alternative splicing array platform Using the setup described above, the expression vector x, containing the six microarray measurements as real numbers, can be decomposed as a linear combination of the abundance of the two splice isoforms, represented by the real vector s, with some added noise: x = Λs + noise, where Λ is a 6 × 2 weight matrix containing the hybridization profiles for s1 ^ xC ^ xC 1 2 s2 ^ xC :A ^ xA ^ xA:C 2 ^ xC1:C2 2 xC1:C2 1 r xC xC 2 xA xC :A xA:C oC1 oC2 oA oC :A oA:C 1 1 1 2 oC :C 1 2 Σn 2 Figure 3: Graphical model for alternative splicing. Each measurement in the observed expression profile, x, is generated by either using a scale factor, r, on a linear combination of the isoforms, s, or drawing randomly from an outlier model. For a detailed description of the model, see text. the two isoforms across the six probes. Note that we may not have a negative amount of a given isoform, nor can the presence of an isoform deduct from the measured expression, and so both s and Λ are constrained to be positive. Expression levels measured by microarrays have previously been modelled as having expression-dependent noise [7]. To address this, we rewrite the above formulation as x = r(Λs + ε), (1) where r is a scale factor and ε is a zero-mean normally distributed random variable with a diagonal covariance matrix, Ψ, denoted as p(ε) = N (ε; 0, Ψ). The prior distribution for the abundance of the splice isoforms is given by a truncated normal distribution, denoted as p(s) ∝ N (s, 0, I)[s ≥ 0], where [·] is an indicator function such that [s ≥ 0] = 1 if ∀i, si ≥ 0, and [s ≥ 0] = 0 otherwise. Lastly, there is a need to account for aberrant observations (e.g. due to faulty probes, flakes of dust, etc.) with an outlier model. The complete GenASAP model (shown in Figure 3) accounts for the observations as the outcome of either applying equation (1) or an outlier model. To avoid degenerate cases and ensure meaningful and interpretable results, the number of faulty probes considered for each AS event may not exceed two, as indicated by the filled-in square constraint node in Figure 3. The distribution of x conditional on the latent variables, s, r, and o, is: N (xi ; rΛi s, r2 Ψi )[oi =0] N (xi ; Ei , Vi )[oi =1] , p(x|s, r, o) = (2) i where oi ∈ {0, 1} is a bernoulli random variable indicating if the measurement at probe xi is the result of the AS model or the outlier model parameterized by p(oi = 1) = γi . The parameters of the outlier model, E and V, are not optimized and are set to the mean and variance of the data. 2.2 Variational learning in the GenASAP model To infer the posterior distribution over the splice isoform abundances while at the same time learning the model parameters we use a variational expectation-maximization algorithm (EM). EM maximizes the log likelihood of the data by iteratively estimating the posterior distribution of the model given the data in the expectation (E) step, and maximizing the log likelihood with respect to the parameters, while keeping the posterior fixed, in the maximization (M) step. Variational EM is used when, as in the case of GenASAP, the exact posterior is intractable. Variational EM minimizes the free energy of the model, defined as the KL-divergence between the joint distribution of the latent and observed variables and the approximation to the posterior under the model parameters [5, 6]. We approximate the true posterior using the Q distribution given by T Q({s(t) }, {o(t) }, {r(t) }) = (t) Q(r(t) )Q(o(t) |r(t) ) t=1 (t) Q(si |oi , r(t) ) i −1 T =Z (t) (3) d d ρ(t) ω (t) N (s(t) ; µ(t) , Σ(t) )[s(t) ≥ 0], ro ro t=1 where Z is a normalization constant, the superscript d indicates that Σ is constrained to be diagonal, and there are T iid AS events. For computational efficiency, r is selected from a finite set, r ∈ {r1 , r2 , . . . , rC } with uniform probability. The variational free energy is given by Q({s(t) }, {o(t) }, {r(t) }) . P ({s(t) }, {o(t) }, {r(t) }, {x(t) }) s r o (4) Variational EM minimizes the free energy by iteratively updating the Q distribution’s vari(t)d (t)d ational parameters (ρ(t) , ω (t) , µro , and Σro ) in the E-step, and the model parameters (Λ, Ψ, {r1 , r2 , . . . , rC }, and γ) in the M-step. The resulting updates are too long to be shown in the context of this paper and are discussed in detail elsewhere [3]. A few particular points regarding the E-step are worth covering in detail here. Q({s(t) }, {o(t) }, {r(t) }) log F(Q, P ) = If the prior on s was a full normal distribution, there would be no need for a variational approach, and exact EM is possible. For a truncated normal distribution, however, the mixing proportions, Q(r)Q(o|r) cannot be calculated analytically except for the case where s is scalar, necessitating the diagonality constraint. Note that if Σ was allowed to be a full covariance matrix, equation (3) would be the true posterior, and we could find the sufficient statistics of Q(s(t) |o(t) , r(t) ): −1 µ(t) = (I + ΛT (I − O(t) )T Ψ−1 (I − O(t) )Λ)−1 ΛT (I − O(t) )T Ψ−1 x(t) r(t) ro −1 Σ(t) ro = (I + ΛT (I − O(t) )T Ψ−1 (I − O(t) )Λ) (5) (6) where O is a diagonal matrix with elements Oi,i = oi . Furthermore, it can be easily shown that the optimal settings for µd and Σd approximating a normal distribution with full covariance Σ and mean µ is µd optimal = µ −1 Σd optimal = diag(Σ (7) −1 ) (8) In the truncated case, equation (8) is still true. Equation (7) does not hold, though, and µd optimal cannot be found analytically. In our experiments, we found that using equation (7) still decreases the free energy every E-step, and it is significantly more efficient than using, for example, a gradient decent method to compute the optimal µd . Intuitive Weigh Matrix Optimal Weight Matrix 50 50 40 40 30 30 20 20 10 0 10 Inclusion Isoform 0 Exclusion Isoform Inclusion Isoform (a) Exclusion Isoform (b) Figure 4: (a) An intuitive set of weights. Based on the biological background, one would expect to see the inclusion isoform hybridize to the probes measuring C1 , C2 , A, C1 :A, and A:C2 , while the exclusion isoform hybridizes to C1 , C2 , and C1 :C2 . (b) The learned set of weights closely agrees with the intuition, and captures cross hybridization between the probes Contribution of exclusion isoform Contribution of inclusion isoform AS model Original Data (a) RT−PCR AS model measurement prediction (% exclusion) (% exclusion) 14% 72% (b) 27% 70% 8% 22% outliers (c) Figure 5: Three examples of data cases and their predictions. (a) The data does not follow our notion of single cassette exon AS, but the AS level is predicted accurately by the model.(b) The probe C1 :A is marked as outlier, allowing the model to predict the other probes accurately. (c) Two probes are marked as outliers, and the model is still successful in predicting the AS levels. 3 Making biological predictions about alternative splicing The results presented in this paper were obtained using two stages of learning. In the first step, the weight matrix, Λ, is learned on a subset of the data that is selected for quality. Two selection criteria were used: (a) sequencing data was used to select those cases for which, with high confidence, no other AS event is present (Figure 1) and (b) probe sets were selected for high expression, as determined by a set of negative controls. The second selection criterion is motivated by the common assumption that low intensity measurements are of lesser quality (see Section 1.1). In the second step, Λ is kept fixed, and we introduce the additional constraint that the noise is isotropic (Ψ = ψI) and learn on the entire data set. The constraint on the noise is introduced to prevent the model from using only a subset of the six probes for making the final set of predictions. We show a typical learned set of weights in Figure 4. The weights fit well with our intuition of what they should be to capture the presence of the two isoforms. Moreover, the learned weights account for the specific trends in the data. Examples of model prediction based on the microarray data are shown in Figure 5. Due to the nature of the microarray data, we do not expect all the inferred abundances to be equally good, and we devised a scoring criterion that ranks each AS event based on its fit to the model. Intuitively, given two input vectors that are equivalent up to a scale factor, with inferred MAP estimations that are equal up to the same scale factor, we would like their scores to be identical. The scoring criterion used, therefore is k (xk − rΛk s)2 /(xk + Rank 500 1000 2000 5000 10000 15000 20000 30000 Pearson’s correlation coefficient 0.94 0.95 0.95 0.79 0.79 0.78 0.75 0.65 False positive rate 0.11 0.08 0.05 0.2 0.25 0.29 0.32 0.42 Table 1: Model performance evaluated at various ranks. Using 180 RT-PCR measurements, we are able to predict the model’s performance at various ranks. Two evaluation criteria are used: Pearson’s correlation coefficient between the model’s predictions and the RT-PCR measurements and false positive rate, where a prediction is considered to be false positive if it is more than 15% away from the RT-PCR measurement. rΛk s)2 , where the MAP estimations for r and s are used. This scoring criterion can be viewed as proportional to the sum of noise to signal ratios, as estimated using the two values given by the observation and the model’s best prediction of that observation. Since it is the relative amount of the isoforms that is of most interest, we need to use the inferred distribution of the isoform abundances to obtain an estimate for the relative levels of AS. It is not immediately clear how this should be done. We do, however, have RTPCR measurements for 180 AS events to guide us (see figure 6 for details). Using the top 50 ranked RT-PCR measurement, we fit three parameters, {a1 , a2 , a3 }, such that the s2 proportion of excluded isoform present, p, is given by p = a1 s1 +a2 s2 + a3 , where s1 is the MAP estimation of the abundance of the inclusion isoform, s2 is the MAP estimation of the abundance of the exclusion isoform, and the RT-PCR measurement are used for target p. The parameters are fitted using gradient descent on a least squared error (LSE) evaluation criterion. We used two criteria to evaluate the quality of the AS model predictions. Pearson’s correlation coefficient (PCC) is used to evaluate the overall ability of the model to correctly estimate trends in the data. PCC is invariant to affine transformation and so is independent of the transformation parameters a1 and a3 discussed above, while the parameter a2 was found to effect PCC very little. The PCC stays above 0.75 for the top two thirds ranked predictions. The second evaluation criterion used is the false positive rate, where a prediction is considered to be false positive if it is more than 15% away from the RT-PCR measurement. This allows us to say, for example, that if a prediction is within the top 10000, we are 75% confident that it is within 15% of the actual levels of AS. 4 Summary We designed a novel AS model for the inference of the relative abundance of two alternatively spliced isoforms from six measurements. Unsupervised learning in the model is performed using a structured variational EM algorithm, which correctly captures the underlying structure of the data, as suggested by its biological nature. The AS model, though presented here for a cassette exon AS events, can be used to learn any type of AS, and with a simple adjustment, multiple types. The predictions obtained from the AS model are currently being used to verify various claims about the role of AS in evolution and functional genomics, and to help identify sequences that affect the regulation of AS. % Exclusion isoform RT−PCR measurement Vs. AS model predictions RT−PCR measurements: 90 80 AS model prediction Int es Te tine sti Kid s n Sa ey liva Br ry ain Sp le Liv en er Mu sc Lu le ng 100 70 60 50 40 30 14 22 27 32 47 46 66 78 63 20 AS model prediction: 10 27 24 26 26 51 75 60 85 100 (a) 0 0 20 40 60 80 RT−PCR measurement 100 (b) Figure 6: (a) Sample RT-PCR. RNA extracted from the cell is reverse-transcribed to DNA, amplified and labelled with radioactive or fluorescent molecules. The sample is pulled through a viscous gel in an electric field (DNA, being an acid, is positively charged). Shorter strands travel further through the gel than longer ones, resulting in two distinct bands, corresponding to the two isoforms, when exposed to a photosensitive or x-ray film. (b) A scatter plot showing the RT-PCR measurements as compared to the AS model predictions. The plot shows all available RT-PCR measurements with a rank of 8000 or better. The AS model presented assumes a single weight matrix for all data cases. This is an oversimplified view of the data, and current work is being carried out in identifying probe specific expression profiles. However, due to the low dimensionality of the problem (10 tissues, six probes per event), care must be taken to avoid overfitting and to ensure meaningful interpretations. Acknowledgments We would like to thank Wen Zhang, Naveed Mohammad, and Timothy Hughes for their contributions in generating the data set. This work was funded in part by an operating and infrastructure grants from the CIHR and CFI, and a operating grants from NSERC and a Premier’s Research Excellence Award. References [1] J. M. Johnson et al. Genome-wide survey of human alternative pre-mrna splicing with exon junction microarrays. Science, 302:2141–44, 2003. [2] L. Cartegni et al. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nature Gen. Rev., 3:285–98, 2002. [3] Q. Pan et al. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Molecular Cell, 16(6):929–41, 2004. [4] Q. Pan et al. Alternative splicing of conserved exons is frequently species specific in human and mouse. Trends Gen., In Press, 2005. [5] M. I. Jordan, Z. Ghahramani, T. Jaakkola, and Lawrence K. Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183– 233, 1999. [6] R. M. Neal and G. E. Hinton. A view of the em algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models. Cambridge, MIT Press, 1998. [7] D. M. Rocke and B. Durbin. A model for measurement error for gene expression arrays. Journal of Computational Biology, 8(6):557–69, 2001.
3 0.50750637 143 nips-2004-PAC-Bayes Learning of Conjunctions and Classification of Gene-Expression Data
Author: Mario Marchand, Mohak Shah
Abstract: We propose a “soft greedy” learning algorithm for building small conjunctions of simple threshold functions, called rays, defined on single real-valued attributes. We also propose a PAC-Bayes risk bound which is minimized for classifiers achieving a non-trivial tradeoff between sparsity (the number of rays used) and the magnitude of the separating margin of each ray. Finally, we test the soft greedy algorithm on four DNA micro-array data sets. 1
4 0.48795915 108 nips-2004-Markov Networks for Detecting Overalpping Elements in Sequence Data
Author: Mark Craven, Joseph Bockhorst
Abstract: Many sequential prediction tasks involve locating instances of patterns in sequences. Generative probabilistic language models, such as hidden Markov models (HMMs), have been successfully applied to many of these tasks. A limitation of these models however, is that they cannot naturally handle cases in which pattern instances overlap in arbitrary ways. We present an alternative approach, based on conditional Markov networks, that can naturally represent arbitrarily overlapping elements. We show how to efficiently train and perform inference with these models. Experimental results from a genomics domain show that our models are more accurate at locating instances of overlapping patterns than are baseline models based on HMMs. 1
5 0.42508823 177 nips-2004-Supervised Graph Inference
Author: Jean-philippe Vert, Yoshihiro Yamanishi
Abstract: We formulate the problem of graph inference where part of the graph is known as a supervised learning problem, and propose an algorithm to solve it. The method involves the learning of a mapping of the vertices to a Euclidean space where the graph is easy to infer, and can be formulated as an optimization problem in a reproducing kernel Hilbert space. We report encouraging results on the problem of metabolic network reconstruction from genomic data. 1
6 0.41858721 58 nips-2004-Edge of Chaos Computation in Mixed-Mode VLSI - A Hard Liquid
7 0.38116762 80 nips-2004-Identifying Protein-Protein Interaction Sites on a Genome-Wide Scale
8 0.33767679 26 nips-2004-At the Edge of Chaos: Real-time Computations and Self-Organized Criticality in Recurrent Neural Networks
9 0.32263157 90 nips-2004-Joint Probabilistic Curve Clustering and Alignment
10 0.3209734 135 nips-2004-On-Chip Compensation of Device-Mismatch Effects in Analog VLSI Neural Networks
11 0.31807813 157 nips-2004-Saliency-Driven Image Acuity Modulation on a Reconfigurable Array of Spiking Silicon Neurons
13 0.27904129 118 nips-2004-Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits
14 0.27831268 193 nips-2004-Theories of Access Consciousness
15 0.26882675 176 nips-2004-Sub-Microwatt Analog VLSI Support Vector Machine for Pattern Classification and Sequence Estimation
16 0.24920081 52 nips-2004-Discrete profile alignment via constrained information bottleneck
17 0.24902868 57 nips-2004-Economic Properties of Social Networks
18 0.23997402 155 nips-2004-Responding to Modalities with Different Latencies
19 0.22406134 184 nips-2004-The Cerebellum Chip: an Analog VLSI Implementation of a Cerebellar Model of Classical Conditioning
20 0.21568504 76 nips-2004-Hierarchical Bayesian Inference in Networks of Spiking Neurons
topicId topicWeight
[(13, 0.069), (15, 0.098), (26, 0.032), (31, 0.016), (33, 0.127), (35, 0.02), (39, 0.013), (50, 0.014), (51, 0.013), (69, 0.471)]
simIndex simValue paperId paperTitle
same-paper 1 0.79802537 128 nips-2004-Neural Network Computation by In Vitro Transcriptional Circuits
Author: Jongmin Kim, John Hopfield, Erik Winfree
Abstract: The structural similarity of neural networks and genetic regulatory networks to digital circuits, and hence to each other, was noted from the very beginning of their study [1, 2]. In this work, we propose a simple biochemical system whose architecture mimics that of genetic regulation and whose components allow for in vitro implementation of arbitrary circuits. We use only two enzymes in addition to DNA and RNA molecules: RNA polymerase (RNAP) and ribonuclease (RNase). We develop a rate equation for in vitro transcriptional networks, and derive a correspondence with general neural network rate equations [3]. As proof-of-principle demonstrations, an associative memory task and a feedforward network computation are shown by simulation. A difference between the neural network and biochemical models is also highlighted: global coupling of rate equations through enzyme saturation can lead to global feedback regulation, thus allowing a simple network without explicit mutual inhibition to perform the winner-take-all computation. Thus, the full complexity of the cell is not necessary for biochemical computation: a wide range of functional behaviors can be achieved with a small set of biochemical components. 1
2 0.64480966 74 nips-2004-Harmonising Chorales by Probabilistic Inference
Author: Moray Allan, Christopher Williams
Abstract: We describe how we used a data set of chorale harmonisations composed by Johann Sebastian Bach to train Hidden Markov Models. Using a probabilistic framework allows us to create a harmonisation system which learns from examples, and which can compose new harmonisations. We make a quantitative comparison of our system’s harmonisation performance against simpler models, and provide example harmonisations. 1
3 0.60222638 18 nips-2004-Algebraic Set Kernels with Application to Inference Over Local Image Representations
Author: Amnon Shashua, Tamir Hazan
Abstract: This paper presents a general family of algebraic positive definite similarity functions over spaces of matrices with varying column rank. The columns can represent local regions in an image (whereby images have varying number of local parts), images of an image sequence, motion trajectories in a multibody motion, and so forth. The family of set kernels we derive is based on a group invariant tensor product lifting with parameters that can be naturally tuned to provide a cook-book of sorts covering the possible ”wish lists” from similarity measures over sets of varying cardinality. We highlight the strengths of our approach by demonstrating the set kernels for visual recognition of pedestrians using local parts representations. 1
4 0.35596111 29 nips-2004-Beat Tracking the Graphical Model Way
Author: Dustin Lang, Nando D. Freitas
Abstract: We present a graphical model for beat tracking in recorded music. Using a probabilistic graphical model allows us to incorporate local information and global smoothness constraints in a principled manner. We evaluate our model on a set of varied and difficult examples, and achieve impressive results. By using a fast dual-tree algorithm for graphical model inference, our system runs in less time than the duration of the music being processed. 1
5 0.35457072 131 nips-2004-Non-Local Manifold Tangent Learning
Author: Yoshua Bengio, Martin Monperrus
Abstract: We claim and present arguments to the effect that a large class of manifold learning algorithms that are essentially local and can be framed as kernel learning algorithms will suffer from the curse of dimensionality, at the dimension of the true underlying manifold. This observation suggests to explore non-local manifold learning algorithms which attempt to discover shared structure in the tangent planes at different positions. A criterion for such an algorithm is proposed and experiments estimating a tangent plane prediction function are presented, showing its advantages with respect to local manifold learning algorithms: it is able to generalize very far from training data (on learning handwritten character image rotations), where a local non-parametric method fails. 1
6 0.35418236 178 nips-2004-Support Vector Classification with Input Data Uncertainty
7 0.35249367 102 nips-2004-Learning first-order Markov models for control
8 0.35246935 189 nips-2004-The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees
9 0.35212255 31 nips-2004-Blind One-microphone Speech Separation: A Spectral Learning Approach
10 0.35165542 127 nips-2004-Neighbourhood Components Analysis
11 0.35112831 133 nips-2004-Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning
12 0.35109651 16 nips-2004-Adaptive Discriminative Generative Model and Its Applications
13 0.35071227 4 nips-2004-A Generalized Bradley-Terry Model: From Group Competition to Individual Skill
14 0.35058984 174 nips-2004-Spike Sorting: Bayesian Clustering of Non-Stationary Data
15 0.35055292 25 nips-2004-Assignment of Multiplicative Mixtures in Natural Images
16 0.35029143 187 nips-2004-The Entire Regularization Path for the Support Vector Machine
17 0.34957144 61 nips-2004-Efficient Out-of-Sample Extension of Dominant-Set Clusters
18 0.34931058 110 nips-2004-Matrix Exponential Gradient Updates for On-line Learning and Bregman Projection
19 0.34915134 60 nips-2004-Efficient Kernel Machines Using the Improved Fast Gauss Transform
20 0.34909052 70 nips-2004-Following Curved Regularized Optimization Solution Paths