nips nips2002 nips2002-7 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
Abstract: We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. Variational inference is also used for detecting hidden motifs. Our model improves over previous models that ignore biological priors and positional dependence. It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract. We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. [sent-6, score-1.398]
2 Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. [sent-7, score-0.187]
3 Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. [sent-8, score-0.367]
4 Our model improves over previous models that ignore biological priors and positional dependence. [sent-10, score-0.094]
5 It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns. [sent-11, score-0.727]
6 1 Introduction The identification of motif structures in biopolymer sequences such as proteins and DNA is an important task in computational biology and is essential in advancing our knowledge about biological systems. [sent-12, score-0.988]
7 For example, the gene regulatory motifs in DNA provide key clues about the regulatory network underlying the complex control and coordination of gene expression in response to physiological or environmental changes in living cells [11]. [sent-13, score-0.548]
8 Most motif models assume independence of position-specific multinomial distributions of monomers such as nucleotides (nt) and amino acids (aa). [sent-15, score-1.004]
9 Such strategies contradict our intuition that the sites in motifs naturally possess spatial dependencies for functional reasons. [sent-16, score-0.356]
10 Furthermore, the vague Dirichlet prior used in some of these models acts as no more than a smoother, taking little account of the rich prior knowledge in biologically identified motifs. [sent-17, score-0.094]
11 The distribution of the monomers is a continuous mixture of position-specific multinomials which admit a Dirichlet prior according to the hidden Markov states, introducing both multi-modal prior information and dependencies. [sent-20, score-0.153]
12 We also propose a framework for decomposing the general motif model into a local alignment model for motif pattern and a global model for motif instance distribution, which allows complex models to be developed in a modular way. [sent-21, score-2.718]
13 To simplify our discussion, we use DNA motif modeling as a running example in this paper, though it should be clear that the model is applicable to other sequence modeling problems. [sent-22, score-0.929]
14 2 Preliminaries DNA motifs are short (about 6-30 bp) stochastic string patterns (Figure 1) in the regulatory sequences of genes that facilitate control functions by interacting with specific transcriptional regulatory proteins. [sent-23, score-0.599]
15 Each motif typically appears once or multiple times in the control regions of a small set of genes. [sent-24, score-0.841]
16 The goal of motif detection is to identify instances of possible motifs hidden in sequences and learn a model for each motif for future prediction. [sent-27, score-2.168]
17 A regulatory DNA sequence can be fully specified by a character string over the alphabet {A, T, C, G} and an indicator string that signals the locations of the motif occurrences. [sent-28, score-1.035]
18 A motif is called a stochastic string pattern rather than a word because of the variability in the “spellings” of different instances of the same motif in the genome. [sent-29, score-1.748]
19 Conventionally, biologists display a motif pattern (of a given length) by a multi-alignment of all its instances. [sent-30, score-0.862]
20 The stochasticity of motif patterns is reflected in the heterogeneity of nucleotide species appearing in each column (corresponding to a position or site in the motif) of the multi-alignment. [sent-31, score-0.988]
21 We denote the multi-alignment of all instances of a motif, as specified by the indicator string in a sequence, by a dedicated alignment variable. [sent-32, score-0.962]
22 Since any such alignment can be characterized by the nucleotide counts in each column, we define a counting matrix in which each column is an integer vector with four elements, giving the number of occurrences of each nucleotide at the corresponding position of the motif. [sent-33, score-0.186]
23 With these settings, one can model the nt-distribution at each position of the motif by a position-specific multinomial distribution. [sent-35, score-0.981]
24 Formally, the problem of inferring the motif locations and the position-specific multinomial parameters (often called a position-weight matrix, or PWM), given a sequence set, is motif detection in a nutshell (see Footnote 1). [sent-36, score-0.916]
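For concreteness, here is a minimal sketch (our own illustration, not code from the paper) of how a counting matrix and the corresponding PWM could be built from a toy multi-alignment; the nucleotide coding and the pseudocount smoothing are our assumptions.

```python
import numpy as np

NT = {'A': 0, 'C': 1, 'G': 2, 'T': 3}

def counting_matrix(aligned_instances):
    """4 x L matrix of nucleotide counts per column of a motif multi-alignment."""
    L = len(aligned_instances[0])
    counts = np.zeros((4, L), dtype=int)
    for seq in aligned_instances:
        for pos, base in enumerate(seq):
            counts[NT[base], pos] += 1
    return counts

def pwm(counts, pseudocount=1.0):
    """Position-weight matrix: per-column multinomial estimates,
    smoothed with a vague (symmetric Dirichlet) pseudocount."""
    smoothed = counts + pseudocount
    return smoothed / smoothed.sum(axis=0, keepdims=True)

# toy multi-alignment of one motif's instances
instances = ["TGACTC", "TGACTA", "TGACTC", "TTACTC"]
theta = pwm(counting_matrix(instances))
```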
25 The horizontal axis indexes position and the vertical axis represents the information content of the multinomial distribution of nt at each position. [sent-45, score-0.194]
26 Figure 2: (Left) A general motif model is a Bayesian multinet. [sent-47, score-0.861]
27 (Right) The HMDM model for motif instances specified by a given indicator string. [sent-49, score-0.903]
28 Footnote 1: Multiple motif detection can be formulated in a similar way, but for simplicity, we omit this elaboration. [sent-51, score-0.931]
29 3 Generative models for regulatory DNA sequences [sent-54, score-0.129]
3.1 General setting and related work. Without loss of generality, assume that the occurrences of motifs in a DNA sequence, as indicated by the indicator string, are governed by a global distribution; for each type of motif, the nucleotide sequence pattern shared by all its instances admits a local alignment model. [sent-55, score-0.612]
(Usually, the background non-motif sequences are modeled by a simple conditional model in which the background nt-distribution is assumed to be learned a priori from the entire sequence and supplied as constants in the motif detection process.) [sent-56, score-0.988]
Thus, the likelihood of a regulatory sequence $\mathbf{y}$ is $p(\mathbf{y}) = \sum_{\mathbf{x}} p(\mathbf{x} \mid \theta_g)\, p(\mathbf{y} \mid \mathbf{x}, \theta_l, \theta_{bg})$, summing over the latent indicator string $\mathbf{x}$. [sent-58, score-0.129]
Note that the parameter here is not necessarily equivalent to the position-specific multinomial parameters in Eq. [sent-74, score-0.087]
34 2 below, but is a generic symbol for the parameters of a general model of aligned motif instances. [sent-75, score-0.861]
35 The model captures properties such as the frequencies of different motifs and the dependencies between motif occurrences. [sent-76, score-1.217]
36 Although specifying this model is an important aspect of motif detection and remains largely unexplored, we defer this issue to future work. [sent-77, score-0.898]
37 In the current paper, our focus is on capturing the intrinsic properties within motifs that can help to improve sensitivity and specificity to genuine motif patterns. [sent-78, score-1.207]
For this, the key lies in the local alignment model, which determines the PWM of the motif. [sent-79, score-0.1]
Depending on the value of the latent indicator (motif or not at a given position), the corresponding sequence position admits different probabilistic models, such as a motif alignment model or a background model. [sent-80, score-1.872]
Thus a sequence is characterized by a Bayesian multinet [6], a mixture model in which each component of the mixture is a specific nt-distribution model corresponding to sequences of a particular nature. [sent-81, score-0.169]
41 Our goal in this paper is to develop an expressive local alignment model capable of capturing characteristic site-dependencies in motifs. [sent-82, score-0.137]
42 £ 54 3 ¥¨ ¨ ¡ ¨ £ # (2) G "DA BE FEC 3 B ¦ ¦09 A 0@93 ¡ ¡ £ h 1¥# 98 Y8 X Although a popular model for many motif finders, PM nevertheless is sensitive to noise and random or trivial recurrent patterns, and is unable to capture potential site-dependencies inside the motifs. [sent-89, score-0.861]
Extensions of the PM model (e.g., splitting a ’two-block’ motif into two coupled sub-motifs [9, 1]) have been developed to handle special patterns such as the U-shaped motifs, but they are inflexible and difficult to generalize. [sent-94, score-0.876]
We depart from the PM model and introduce a dynamic hierarchical Bayesian model for motif alignment, which captures site dependencies inside the motif so that we can predict biologically more plausible motifs and incorporate prior knowledge of nucleotide frequencies of general motif sites. [sent-96, score-2.8]
In order to keep the local alignment model our main focus and to simplify the presentation, we adopt an idealized global motif distribution model called “one-per-sequence” [8], which, as the name suggests, assumes each sequence harbors one motif instance (at an unknown location). [sent-97, score-1.911]
3.2 Hidden Markov Dirichlet-Multinomial (HMDM) Model. In the HMDM model, we assume that there are underlying latent nt-distribution prototypes, according to which position-specific multinomial distributions of nt are determined, and that each prototype is represented by a Dirichlet distribution. [sent-100, score-0.168]
47 Furthermore, the choice of prototype at each position in the motif is governed by a first-order Markov process. [sent-101, score-0.903]
48 More precisely, a multi-alignment containing motif instances is generated by the following process. [sent-102, score-0.883]
(1) First, we sample a sequence of prototype indicators from a first-order Markov process with a given initial distribution and transition matrix. [sent-103, score-0.109]
(2) A multinomial distribution is sampled according to the probability density defined by the chosen Dirichlet component over all such distributions. [sent-106, score-0.103]
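The generative process just described is easy to spell out in code. The sketch below is our own illustration under assumed dimensions (M Dirichlet prototypes, motif length L, N instances); the final step of drawing nucleotides from the sampled multinomials is made explicit even though the prose above stops at step (2), and all function and argument names are hypothetical.

```python
import numpy as np

def sample_hmdm_alignment(pi, A, alphas, L, N, rng=np.random.default_rng(0)):
    """Generate a motif multi-alignment under the HMDM generative process.

    pi     : (M,)   initial distribution over Dirichlet prototypes
    A      : (M, M) prototype transition matrix (first-order Markov)
    alphas : (M, 4) Dirichlet parameters of each prototype
    L, N   : motif length and number of motif instances
    """
    M = len(pi)
    # (1) sample a sequence of prototype indicators from the Markov chain
    q = np.zeros(L, dtype=int)
    q[0] = rng.choice(M, p=pi)
    for l in range(1, L):
        q[l] = rng.choice(M, p=A[q[l - 1]])
    # (2) sample a position-specific multinomial from the chosen Dirichlet component
    theta = np.array([rng.dirichlet(alphas[q[l]]) for l in range(L)])   # (L, 4)
    # (3) sample N motif instances, one nucleotide per position drawn from theta[l]
    instances = np.array([[rng.choice(4, p=theta[l]) for l in range(L)]
                          for _ in range(N)])
    return q, theta, instances
```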
The complete likelihood of a motif alignment characterized by counting matrix $\mathbf{n}$ is: $p(\mathbf{n}, \theta, \mathbf{q}) = \Big[ p(q_1) \prod_{l=2}^{L} p(q_l \mid q_{l-1}) \Big] \prod_{l=1}^{L} p(\theta_l \mid \alpha_{q_l}) \prod_{l=1}^{L} \prod_{k} \theta_{l,k}^{\,n_{l,k}}$. [sent-111, score-0.954]
In such a model the transition would be between the emission models (i.e., the multinomial distributions themselves). [sent-114, score-0.09]
53 In HMDM, the transitions are between different priors of the emission models, and the direct output of the HMM is the parameter vector of a generative model, which will be sampled multiple times at each position to generate random instances. [sent-117, score-0.096]
For example, in the case of motifs, biological evidence shows that conserved positions (manifested by a low-entropy multinomial nt-distribution) are likely to concatenate, and perhaps the less conserved positions do as well. [sent-119, score-0.246]
55 However, it is unlikely that conserved and less conserved positions are interpolated [4]. [sent-120, score-0.131]
4.1 Variational Bayesian Learning. In order to do Bayesian estimation of the motif parameters, and to predict the locations of motif instances, we need to be able to compute the posterior distribution, which is infeasible in a complex motif model. [sent-123, score-2.607]
We seek to approximate the joint posterior over parameters and hidden states with a simpler factorized distribution, whose factors can be, for the time being, thought of as free distributions to be optimized. [sent-125, score-0.084]
Thus, maximizing the lower bound of the log likelihood with respect to the free distributions is equivalent to minimizing the KL divergence between the true joint posterior and its variational approximation. [sent-142, score-0.086]
In our motif model, the prior and the conditional submodels form a conjugate-exponential pair (Dirichlet-Multinomial). [sent-147, score-0.883]
It can be shown that in this case we can essentially recover the same form of the original conditional and prior distributions in their variational approximations, except that the parameterization is augmented with appropriate Bayesian and posterior updates, respectively (Eqs. 7 and 8). [sent-148, score-0.107]
As Eqs. 7 and 8 make clear, the locality of inference and marginalization on the latent variables is preserved in the variational approximation, which means probabilistic calculations can be performed in the prior and the conditional models separately and iteratively. [sent-158, score-0.132]
62 For motif modeling, this modular property means that the motif alignment model and motif distribution model can be treated separately with a simple interface of the posterior mean for the motif parameters and expected sufficient statistics for the motif instances. [sent-159, score-4.388]
We next compute the expectation of the natural parameters (which, for multinomial parameters, is the expected logarithm of the multinomial probabilities). [sent-168, score-0.087]
Given the posterior means of the multinomial parameters, computing the expected counting matrix under the one-per-sequence global model for each sequence is straightforward. [sent-179, score-0.225]
Variational M step: compute the expected natural parameters via inference in the local motif alignment model, given the expected counting matrices. [sent-183, score-0.941]
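To make the E-step/M-step interplay concrete, here is a simplified sketch of the variational EM loop under the one-per-sequence assumption. It is our own illustration: it uses a single fixed Dirichlet prior per motif position in place of the full HMM over Dirichlet components, a uniform prior over motif start positions, and an assumed nucleotide coding (0-3); the function and argument names are hypothetical.

```python
import numpy as np
from scipy.special import digamma

def variational_em(sequences, L, alpha_prior, bg_logp, n_iter=50):
    """Simplified variational EM for one motif under the one-per-sequence model.

    sequences   : list of 1-D int arrays (nucleotides coded 0..3)
    L           : motif length
    alpha_prior : (L, 4) Dirichlet prior per motif position (a single fixed
                  prior here; the full HMDM mixes M priors via an HMM)
    bg_logp     : (4,) log background nucleotide probabilities
    """
    alpha_post = alpha_prior.copy()
    for _ in range(n_iter):
        # expected natural parameters of the position-specific multinomials: E[log theta]
        e_log_theta = digamma(alpha_post) - digamma(alpha_post.sum(axis=1, keepdims=True))

        # variational E step: posterior over the motif start in each sequence,
        # then the expected counting matrix under the one-per-sequence model
        exp_counts = np.zeros((L, 4))
        for y in sequences:
            n_starts = len(y) - L + 1
            log_post = np.array([
                sum(e_log_theta[l, y[s + l]] - bg_logp[y[s + l]] for l in range(L))
                for s in range(n_starts)])
            post = np.exp(log_post - log_post.max())
            post /= post.sum()
            for s, w in enumerate(post):
                for l in range(L):
                    exp_counts[l, y[s + l]] += w

        # variational M step: conjugate Dirichlet update with the expected counts
        alpha_post = alpha_prior + exp_counts
    return alpha_post
```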
66 For example, the motif distribution model can be made more sophisticated so as to model complex properties of multiple motifs such as motif-level dependencies (e. [sent-185, score-1.253]
67 , co-occurrence, overlaps and concentration within regulatory modules) without complicating the inference in the local alignment model. [sent-187, score-0.207]
68 Similarly, the motif alignment model can also be more expressive (e. [sent-188, score-0.947]
69 , a mixture of HMDMs) without interfering with inference in the motif distribution model. [sent-190, score-0.913]
70 5 Experiments We test the HMDM model on a motif collection from The Promoter Database of Saccharomyces cerevisiae (SCPD). [sent-192, score-0.861]
The posterior distribution of the position-specific multinomial parameters, reflected in the parameters of the Dirichlet mixtures learned from data, can reveal the nt-distribution patterns of the motifs. [sent-195, score-0.164]
72 (c) Boxplots of hit and mishit rate of HMDM(1) and PM(2) on two motifs used during HMDM training. [sent-221, score-0.392]
73 Are the motif properties captured in HMDM useful in motif detection? [sent-222, score-1.682]
74 We first examine an HMDM trained on the complete dataset for its ability to detect motifs used in training in the presence of a “decoy”: a permuted motif. [sent-223, score-0.343]
75 By randomly permuting the positions in the motif, the shapes of the “U-shaped” motifs (e. [sent-224, score-0.341]
Footnote 2: We insert each instance of a motif/decoy pair into a 300-500 bp random background sequence at random positions. [sent-227, score-0.127]
We allow a 3 bp offset as a tolerance window, and score a hit when the position at which a motif instance is found falls within this window around the true motif location (and a mis-hit when it falls within the window around the decoy location). [sent-228, score-0.964]
78 The (mis)hit rate is the proportion of (mis)hits to the total number of motif instances to be found in an experiment. [sent-229, score-0.883]
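The hit/mis-hit bookkeeping is simple enough to illustrate directly; the function name and the exact comparison against the decoy position below are our reading of the protocol, not the authors' code.

```python
def hit_and_mishit_rate(found, true_pos, decoy_pos, tol=3):
    """Hit rate: fraction of predictions within `tol` bp of the planted motif;
    mis-hit rate: fraction within `tol` bp of the planted decoy."""
    n = len(found)
    hits = sum(abs(f - t) <= tol for f, t in zip(found, true_pos))
    mishits = sum(abs(f - d) <= tol for f, d in zip(found, decoy_pos))
    return hits / n, mishits / n
```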
79 Figure 3(c) shows a boxplot of the hit and mishit rate of HMDM on abf1 and gal4 over 50 randomly generated experiments. [sent-230, score-0.093]
Note the dramatic contrast between the sensitivity of the HMDM to true motifs and that of the PM model (which is essentially the MEME model). [sent-231, score-0.355]
81 2 0 0 0 0 0 0 0 1 2 3 4 1 mat−a2 2 3 4 1 mcb 2 3 4 1 2 mig1 3 4 1 crp 2 3 4 1 mat−a2 2 3 4 0. [sent-263, score-0.085]
82 2 0 1 mcb 2 3 4 1 2 mig1 3 4 3 4 crp 1 1 1 1 1 1 1 1 0. [sent-264, score-0.085]
83 2 0 1 2 3 4 0 1 2 3 4 1 2 (a) true motif only (b) true motif + decoy Figure 4: Motif detection on an independent test dataset (the 8 motifs in Figure 1(a)). [sent-296, score-2.06]
84 In the first motif finding task, we are given sequences each of which has only one true motif instance at a random position. [sent-305, score-1.734]
85 In three other cases they are comparable, but for motif mcb, all HMDM models lose. [sent-308, score-0.841]
The second task is more challenging and biologically more realistic: we have both the true motifs and the permuted “decoys.” [sent-314, score-0.364]
87 6 Conclusions We have presented a generative probabilistic framework for modeling motifs in biopolymer sequences. [sent-317, score-0.388]
88 Naively, categorical random variables with spatial/temporal dependencies can be modeled by a standard HMM with multinomial emission models. [sent-318, score-0.18]
89 However, the limited flexibility of each multinomial distribution and the concomitant need for a potentially large number of states to model complex domains may require a large parameter count and lead to overfitting. [sent-319, score-0.123]
Footnote 3: We resisted the temptation to use biological background sequences because we would not know whether and how many other motifs are in such sequences, which renders them ill-suited for purposes of evaluation. [sent-322, score-0.403]
Furthermore, when the output of the HMM involves hidden variables (as in the case of motif detection), inference and learning are further complicated. [sent-324, score-0.906]
92 HMDM assumes that positional dependencies are induced at a higher level among the finite number of informative Dirichlet priors rather than between the multinomials themselves. [sent-325, score-0.121]
93 In motif modeling, such a strategy was used to capture different distribution patterns of nucleotides (homogeneous and heterogeneous) and transition properties between patterns (site clustering). [sent-327, score-0.995]
94 Such a prior proves to be beneficial in searching for unseen motifs in our experiment and helps to distinguish more probable motifs from biologically meaningless random recurrent patterns. [sent-328, score-0.682]
95 This divide and conquer strategy makes it much easier to develop more sophisticated models for various aspects of motif analysis without being overburdened by the somewhat daunting complexity of the full motif problem. [sent-330, score-1.682]
96 Unsupervised learning of multiple motifs in biopolymers using EM. [sent-335, score-0.32]
97 The value of prior knowledge in discovering motifs with MEME. [sent-341, score-0.355]
98 Bioprospector: Discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. [sent-397, score-0.466]
99 Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. [sent-407, score-0.118]
100 Deciphering genetic regulatory codes: A challenge for functional genomics. [sent-414, score-0.091]
wordName wordTfidf (topN-words)
[('motif', 0.841), ('motifs', 0.32), ('hmdm', 0.263), ('regulatory', 0.091), ('multinomial', 0.087), ('dirichlet', 0.085), ('alignment', 0.066), ('pm', 0.058), ('conserved', 0.055), ('biopolymer', 0.053), ('mcb', 0.053), ('nucleotide', 0.053), ('hit', 0.051), ('dna', 0.049), ('variational', 0.047), ('emission', 0.044), ('nucleotides', 0.042), ('instances', 0.042), ('sequence', 0.038), ('sequences', 0.038), ('detection', 0.037), ('dependencies', 0.036), ('inference', 0.036), ('hmm', 0.035), ('patterns', 0.035), ('counting', 0.034), ('mis', 0.033), ('position', 0.033), ('crp', 0.032), ('hmdms', 0.032), ('mat', 0.032), ('pwm', 0.032), ('heterogeneous', 0.031), ('hidden', 0.029), ('prototype', 0.029), ('bayesian', 0.029), ('biological', 0.028), ('positional', 0.027), ('transition', 0.026), ('site', 0.026), ('posterior', 0.026), ('multinomials', 0.025), ('nt', 0.025), ('bp', 0.025), ('liu', 0.025), ('homogeneous', 0.024), ('string', 0.024), ('admits', 0.023), ('permuted', 0.023), ('gene', 0.023), ('bailey', 0.021), ('biologists', 0.021), ('bioprospector', 0.021), ('boxplot', 0.021), ('decoy', 0.021), ('harbors', 0.021), ('meme', 0.021), ('mishit', 0.021), ('modular', 0.021), ('monomer', 0.021), ('monomers', 0.021), ('neuwald', 0.021), ('recurring', 0.021), ('submodels', 0.021), ('uhsq', 0.021), ('ah', 0.021), ('positions', 0.021), ('biologically', 0.021), ('prior', 0.021), ('expressive', 0.02), ('hd', 0.02), ('mixture', 0.02), ('model', 0.02), ('global', 0.02), ('priors', 0.019), ('ql', 0.018), ('markov', 0.018), ('background', 0.017), ('pi', 0.017), ('indicator', 0.017), ('capturing', 0.017), ('vague', 0.017), ('distribution', 0.016), ('prototypes', 0.016), ('sensitivity', 0.015), ('modeling', 0.015), ('latent', 0.014), ('instance', 0.014), ('knowledge', 0.014), ('biology', 0.014), ('informative', 0.014), ('yt', 0.014), ('genuine', 0.014), ('picked', 0.014), ('local', 0.014), ('separately', 0.014), ('boxes', 0.013), ('categorical', 0.013), ('distributions', 0.013), ('characterized', 0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 7 nips-2002-A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences
Author: Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
Abstract: We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. Variational inference is also used for detecting hidden motifs. Our model improves over previous models that ignore biological priors and positional dependence. It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns.
2 0.062517822 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
Author: Harald Steck, Tommi S. Jaakkola
Abstract: A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a product of independent Dirichlet priors over the model parameters affects the learned model structure in a domain with discrete variables. We show that a small scale parameter - often interpreted as
3 0.054227073 145 nips-2002-Mismatch String Kernels for SVM Protein Classification
Author: Eleazar Eskin, Jason Weston, William S. Noble, Christina S. Leslie
Abstract: We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem. These kernels measure sequence similarity based on shared occurrences of -length subsequences, counted with up to mismatches, and do not rely on any generative model for the positive training sequences. We compute the kernels efficiently using a mismatch tree data structure and report experiments on a benchmark SCOP dataset, where we show that the mismatch kernel used with an SVM classifier performs as well as the Fisher kernel, the most successful method for remote homology detection, while achieving considerable computational savings.
4 0.050916832 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables
Author: David Barber
Abstract: The application of latent/hidden variable Dynamic Bayesian Networks is constrained by the complexity of marginalising over latent variables. For this reason either small latent dimensions or Gaussian latent conditional tables linearly dependent on past states are typically considered in order that inference is tractable. We suggest an alternative approach in which the latent variables are modelled using deterministic conditional probability tables. This specialisation has the advantage of tractable inference even for highly complex non-linear/non-Gaussian visible conditional probability tables. This approach enables the consideration of highly complex latent dynamics whilst retaining the benefits of a tractable probabilistic model. 1
5 0.042908154 21 nips-2002-Adaptive Classification by Variational Kalman Filtering
Author: Peter Sykacek, Stephen J. Roberts
Abstract: We propose in this paper a probabilistic approach for adaptive inference of generalized nonlinear classification that combines the computational advantage of a parametric solution with the flexibility of sequential sampling techniques. We regard the parameters of the classifier as latent states in a first order Markov process and propose an algorithm which can be regarded as variational generalization of standard Kalman filtering. The variational Kalman filter is based on two novel lower bounds that enable us to use a non-degenerate distribution over the adaptation rate. An extensive empirical evaluation demonstrates that the proposed method is capable of infering competitive classifiers both in stationary and non-stationary environments. Although we focus on classification, the algorithm is easily extended to other generalized nonlinear models.
6 0.042540975 25 nips-2002-An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition
7 0.038480867 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks
8 0.03117943 36 nips-2002-Automatic Alignment of Local Representations
9 0.030417392 31 nips-2002-Application of Variational Bayesian Approach to Speech Recognition
10 0.030321224 69 nips-2002-Discriminative Learning for Label Sequences via Boosting
11 0.029462151 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
12 0.029027835 37 nips-2002-Automatic Derivation of Statistical Algorithms: The EM Family and Beyond
13 0.027412765 191 nips-2002-String Kernels, Fisher Kernels and Finite State Automata
14 0.026669355 98 nips-2002-Going Metric: Denoising Pairwise Data
15 0.026052235 113 nips-2002-Information Diffusion Kernels
16 0.02603716 53 nips-2002-Clustering with the Fisher Score
17 0.025941817 93 nips-2002-Forward-Decoding Kernel-Based Phone Recognition
18 0.025339542 10 nips-2002-A Model for Learning Variance Components of Natural Images
19 0.025182473 163 nips-2002-Prediction and Semantic Association
20 0.024219502 140 nips-2002-Margin Analysis of the LVQ Algorithm
topicId topicWeight
[(0, -0.085), (1, -0.014), (2, -0.01), (3, 0.015), (4, -0.04), (5, 0.047), (6, -0.035), (7, 0.013), (8, 0.02), (9, -0.017), (10, 0.024), (11, -0.003), (12, -0.0), (13, 0.031), (14, -0.108), (15, -0.024), (16, -0.018), (17, 0.058), (18, 0.022), (19, 0.016), (20, 0.039), (21, -0.075), (22, -0.032), (23, 0.004), (24, -0.097), (25, 0.052), (26, -0.016), (27, 0.004), (28, -0.027), (29, -0.042), (30, -0.057), (31, -0.011), (32, -0.026), (33, -0.026), (34, -0.004), (35, -0.016), (36, 0.051), (37, 0.011), (38, -0.068), (39, 0.021), (40, 0.066), (41, -0.023), (42, 0.025), (43, 0.007), (44, -0.037), (45, 0.02), (46, 0.033), (47, 0.101), (48, -0.106), (49, -0.161)]
simIndex simValue paperId paperTitle
same-paper 1 0.88060552 7 nips-2002-A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences
Author: Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
Abstract: We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. Variational inference is also used for detecting hidden motifs. Our model improves over previous models that ignore biological priors and positional dependence. It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns.
2 0.56600595 157 nips-2002-On the Dirichlet Prior and Bayesian Regularization
Author: Harald Steck, Tommi S. Jaakkola
Abstract: A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a product of independent Dirichlet priors over the model parameters affects the learned model structure in a domain with discrete variables. We show that a small scale parameter - often interpreted as
3 0.49690762 25 nips-2002-An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition
Author: Samy Bengio
Abstract: This paper presents a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences describing the same event. It is based on two other Markovian models, namely Asynchronous Input/ Output Hidden Markov Models and Pair Hidden Markov Models. An EM algorithm to train the model is presented, as well as a Viterbi decoder that can be used to obtain the optimal state sequence as well as the alignment between the two sequences. The model has been tested on an audio-visual speech recognition task using the M2VTS database and yielded robust performances under various noise conditions. 1
4 0.44945911 31 nips-2002-Application of Variational Bayesian Approach to Speech Recognition
Author: Shinji Watanabe, Yasuhiro Minami, Atsushi Nakamura, Naonori Ueda
Abstract: In this paper, we propose a Bayesian framework, which constructs shared-state triphone HMMs based on a variational Bayesian approach, and recognizes speech based on the Bayesian prediction classification; variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within a VBEC framework. Unlike conventional methods, including BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle, even when there are insufficient amounts of data, because it does not use an asymptotic assumption. In isolated word recognition experiments, we show the advantage of VBEC over conventional methods, especially when dealing with small amounts of data.
5 0.40383983 53 nips-2002-Clustering with the Fisher Score
Author: Koji Tsuda, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: Recently the Fisher score (or the Fisher kernel) is increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of loglikelihood of a probabilistic model. This paper gives a theoretical analysis about how class information is preserved in the space of the Fisher score, which turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we will develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).
6 0.39562583 150 nips-2002-Multiple Cause Vector Quantization
7 0.38086724 137 nips-2002-Location Estimation with a Differential Update Network
8 0.37670702 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables
9 0.34310496 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks
10 0.34078383 22 nips-2002-Adaptive Nonlinear System Identification with Echo State Networks
11 0.33759853 21 nips-2002-Adaptive Classification by Variational Kalman Filtering
12 0.33080977 69 nips-2002-Discriminative Learning for Label Sequences via Boosting
13 0.32569969 191 nips-2002-String Kernels, Fisher Kernels and Finite State Automata
14 0.31864813 195 nips-2002-The Effect of Singularities in a Learning Machine when the True Parameters Do Not Lie on such Singularities
15 0.31344551 114 nips-2002-Information Regularization with Partially Labeled Data
16 0.30597061 84 nips-2002-Fast Exact Inference with a Factored Model for Natural Language Parsing
17 0.27346507 101 nips-2002-Handling Missing Data with Variational Bayesian Learning of ICA
18 0.27309644 1 nips-2002-"Name That Song!" A Probabilistic Approach to Querying on Music and Text
19 0.269427 98 nips-2002-Going Metric: Denoising Pairwise Data
20 0.263197 117 nips-2002-Intrinsic Dimension Estimation Using Packing Numbers
topicId topicWeight
[(11, 0.024), (14, 0.012), (23, 0.026), (42, 0.033), (44, 0.011), (54, 0.088), (55, 0.032), (57, 0.016), (67, 0.024), (68, 0.037), (74, 0.079), (77, 0.318), (87, 0.02), (92, 0.029), (98, 0.108)]
simIndex simValue paperId paperTitle
same-paper 1 0.75067896 7 nips-2002-A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences
Author: Eric P. Xing, Michael I. Jordan, Richard M. Karp, Stuart Russell
Abstract: We propose a dynamic Bayesian model for motifs in biopolymer sequences which captures rich biological prior knowledge and positional dependencies in motif structure in a principled way. Our model posits that the position-specific multinomial parameters for monomer distribution are distributed as a latent Dirichlet-mixture random variable, and the position-specific Dirichlet component is determined by a hidden Markov process. Model parameters can be fit on training motifs using a variational EM algorithm within an empirical Bayesian framework. Variational inference is also used for detecting hidden motifs. Our model improves over previous models that ignore biological priors and positional dependence. It has much higher sensitivity to motifs during detection and a notable ability to distinguish genuine motifs from false recurring patterns.
2 0.55278438 53 nips-2002-Clustering with the Fisher Score
Author: Koji Tsuda, Motoaki Kawanabe, Klaus-Robert Müller
Abstract: Recently the Fisher score (or the Fisher kernel) is increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of loglikelihood of a probabilistic model. This paper gives a theoretical analysis about how class information is preserved in the space of the Fisher score, which turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we will develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).
3 0.46770969 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits
Author: Wolfgang Maass, Thomas Natschläger, Henry Markram
Abstract: A key challenge for neural modeling is to explain how a continuous stream of multi-modal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrate-and-fire neurons in real-time. We propose a new computational model that is based on principles of high dimensional dynamical systems in combination with statistical learning theory. It can be implemented on generic evolved or found recurrent circuitry.
4 0.46522319 135 nips-2002-Learning with Multiple Labels
Author: Rong Jin, Zoubin Ghahramani
Abstract: In this paper, we study a special kind of learning problem in which each training instance is given a set of (or distribution over) candidate class labels and only one of the candidate labels is the correct one. Such a problem can occur, e.g., in an information retrieval setting where a set of words is associated with an image, or if classes labels are organized hierarchically. We propose a novel discriminative approach for handling the ambiguity of class labels in the training examples. The experiments with the proposed approach over five different UCI datasets show that our approach is able to find the correct label among the set of candidate labels and actually achieve performance close to the case when each training instance is given a single correct label. In contrast, naIve methods degrade rapidly as more ambiguity is introduced into the labels. 1
5 0.46511593 10 nips-2002-A Model for Learning Variance Components of Natural Images
Author: Yan Karklin, Michael S. Lewicki
Abstract: We present a hierarchical Bayesian model for learning efficient codes of higher-order structure in natural images. The model, a non-linear generalization of independent component analysis, replaces the standard assumption of independence for the joint distribution of coefficients with a distribution that is adapted to the variance structure of the coefficients of an efficient image basis. This offers a novel description of higherorder image structure and provides a way to learn coarse-coded, sparsedistributed representations of abstract image properties such as object location, scale, and texture.
6 0.46506298 44 nips-2002-Binary Tuning is Optimal for Neural Rate Coding with High Temporal Resolution
7 0.46437269 93 nips-2002-Forward-Decoding Kernel-Based Phone Recognition
8 0.46360195 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
9 0.46204028 24 nips-2002-Adaptive Scaling for Feature Selection in SVMs
10 0.46195403 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
11 0.46183944 204 nips-2002-VIBES: A Variational Inference Engine for Bayesian Networks
12 0.46134502 48 nips-2002-Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories
13 0.4613297 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
14 0.46002644 37 nips-2002-Automatic Derivation of Statistical Algorithms: The EM Family and Beyond
15 0.45968992 68 nips-2002-Discriminative Densities from Maximum Contrast Estimation
16 0.45860577 41 nips-2002-Bayesian Monte Carlo
17 0.45803529 27 nips-2002-An Impossibility Theorem for Clustering
18 0.45776731 31 nips-2002-Application of Variational Bayesian Approach to Speech Recognition
19 0.45773482 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
20 0.45739457 2 nips-2002-A Bilinear Model for Sparse Coding