
70 nips-2008-Efficient Inference in Phylogenetic InDel Trees


Author: Alexandre Bouchard-Côté, Dan Klein, Michael I. Jordan

Abstract: Accurate and efficient inference in evolutionary trees is a central problem in computational biology. While classical treatments have made unrealistic site independence assumptions, ignoring insertions and deletions, realistic approaches require tracking insertions and deletions along the phylogenetic tree, a challenging and unsolved computational problem. We propose a new ancestry resampling procedure for inference in evolutionary trees. We evaluate our method in two problem domains, multiple sequence alignment and reconstruction of ancestral sequences, and show substantial improvement over the current state of the art.

reference text

[...] perform equally, SSR beating AR slightly for trees with three nodes, which is not surprising since [...] performs exact inference in this tiny topology configuration. However, as trees get taller, the [...] more difficult, and only AR manages to maintain good performance.

[1] N. Galtier, N. Tourasse, and M. Gouy. A nonhyperthermophilic common ancestor to extant life forms. Science, 283:220–221, 1999.

[2] I. Holmes and W. J. Bruno. Evolutionary HMM: a Bayesian approach to multiple alignment. Bioinformatics, 17:803–820, 2001.

[3] J. Felsenstein. Inferring Phylogenies. Sinauer Associates, 2003.

[4] Z. Yang and B. Rannala. Bayesian phylogenetic inference using DNA sequences: A Markov chain Monte Carlo method. Molecular Biology and Evolution, 14:717–724, 1997.

[5] B. Mau and M. A. Newton. Phylogenetic inference for binary data on dendrograms using Markov chain Monte Carlo. Journal of Computational and Graphical Statistics, 6:122–131, 1997.

[6] S. Li, D. K. Pearl, and H. Doss. Phylogenetic tree construction using Markov chain Monte Carlo. Journal of the American Statistical Association, 95:493–508, 2000.

[7] W. J. Bruno, N. D. Socci, and A. L. Halpern. Weighted neighbor joining: A likelihood-based approach to distance-based phylogeny reconstruction. Molecular Biology and Evolution, 17:189–197, 2000.

[8] D. G. Higgins and P. M. Sharp. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73:237–244, 1988.

[9] J. L. Thorne, H. Kishino, and J. Felsenstein. Inching toward reality: an improved likelihood model of sequence evolution. Journal of Molecular Evolution, 34:3–16, 1992.

[10] J. L. Thorne, H. Kishino, and J. Felsenstein. An evolutionary model for maximum likelihood alignment of DNA sequences. Journal of Molecular Evolution, 33:114–124, 1991.

[11] G. A. Lunter, I. Miklós, Y. S. Song, and J. Hein. An efficient algorithm for statistical multiple alignment on arbitrary phylogenetic trees. Journal of Computational Biology, 10:869–889, 2003.

[12] A. Bouchard-Côté, P. Liang, D. Klein, and T. L. Griffiths. A probabilistic approach to diachronic phonology. In Proceedings of EMNLP 2007, 2007.

[13] P. Diaconis, S. Holmes, and R. M. Neal. Analysis of a non-reversible Markov chain sampler. Technical report, Cornell University, 1997.

[14] S. Needleman and C. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48:443–453, 1970.

[15] K. St. John, T. Warnow, B. M. E. Moret, and L. Vawter. Performance study of phylogenetic methods: (unweighted) quartet methods and neighbor-joining. Journal of Algorithms, 48:173–193, 2003.

[16] J. Thompson, F. Plewniak, and O. Poch. BAliBASE: A benchmark alignments database for the evaluation of multiple sequence alignment programs. Bioinformatics, 15:87–88, 1999.

[17] C. B. Do, M. S. P. Mahabhashyam, M. Brudno, and S. Batzoglou. PROBCONS: Probabilistic consistency-based multiple sequence alignment. Genome Research, 15:330–340, 2005.