nips nips2010 nips2010-283 nips2010-283-reference knowledge-graph by maker-knowledge-mining

283 nips-2010-Variational Inference over Combinatorial Spaces

Source: pdf

Author: Alexandre Bouchard-côté, Michael I. Jordan

Abstract: Since the discovery of sophisticated fully polynomial randomized algorithms for a range of #P problems [1, 2, 3], theoretical work on approximate inference in combinatorial spaces has focused on Markov chain Monte Carlo methods. Despite their strong theoretical guarantees, the slow running time of many of these randomized algorithms and the restrictive assumptions on the potentials have hindered the applicability of these algorithms to machine learning. Because of this, in applications to combinatorial spaces simple exact models are often preferred to more complex models that require approximate inference [4]. Variational inference would appear to provide an appealing alternative, given the success of variational methods for graphical models [5]; unfortunately, however, it is not obvious how to develop variational approximations for combinatorial objects such as matchings, partial orders, plane partitions and sequence alignments. We propose a new framework that extends variational inference to a wide range of combinatorial spaces. Our method is based on a simple assumption: the existence of a tractable measure factorization, which we show holds in many examples. Simulations on a range of matching models show that the algorithm is more general and empirically faster than a popular fully polynomial randomized algorithm. We also apply the framework to the problem of multiple alignment of protein sequences, obtaining state-of-the-art results on the BAliBASE dataset [6]. 1

reference text

[1] Alexander Karzanov and Leonid Khachiyan. On the conductance of order Markov chains. Order, V8(1):7–15, March 1991.

[2] Mark Jerrum, Alistair Sinclair, and Eric Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. In Proceedings of the Annual ACM Symposium on Theory of Computing, pages 712–721, 2001.

[3] David Wilson. Mixing times of lozenge tiling and card shufﬂing Markov chains. The Annals of Applied Probability, 14:274–325, 2004.

[4] Adam Siepel and David Haussler. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol, 21(3):468–488, 2004.

[5] Martin J. Wainwright and Michael I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1:1–305, 2008.

[6] Julie Thompson, Fr´ d´ ric Plewniak, and Olivier Poch. BAliBASE: A benchmark alignments database for e e the evaluation of multiple sequence alignment programs. Bioinformatics, 15:87–88, 1999.

[7] David A. Smith and Jason Eisner. Dependency parsing by belief propagation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 145–156, Honolulu, October 2008.

[8] David Burkett, John Blitzer, and Dan Klein. Joint parsing and alignment with weakly synchronized grammars. In North American Association for Computational Linguistics, Los Angeles, 2010.

[9] Bert Huang and Tony Jebara. Approximating the permanent with belief propagation. ArXiv e-prints, 2009.

[10] Yusuke Watanabe and Michael Chertkov. Belief propagation and loop calculus for the permanent of a non-negative matrix. J. Phys. A: Math. Theor., 2010.

[11] Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Christopher Manning. Max-margin parsing. In EMNLP, 2004.

[12] Ben Taskar, Simon Lacoste-Julien, and Dan Klein. A discriminative matching approach to word alignment. In EMNLP 2005, 2005.

[13] John Duchi, Daniel Tarlow, Gal Elidan, and Daphne Koller. Using combinatorial optimization within max-product belief propagation. In Advances in Neural Information Processing Systems, 2007.

[14] Aron Culotta, Andrew McCallum, Bart Selman, and Ashish Sabharwal. Sparse message passing algorithms for weighted maximum satisﬁability. In New England Student Symposium on Artiﬁcial Intelligence, 2007.

[15] Percy Liang, Ben Taskar, and Dan Klein. Alignment by agreement. In North American Association for Computational Linguistics (NAACL), pages 104–111, 2006.

[16] Percy Liang, Dan Klein, and Michael I. Jordan. Agreement-based learning. In Advances in Neural Information Processing Systems (NIPS), 2008.

[17] Leslie G. Valiant. The complexity of computing the permanent. Theoret. Comput. Sci., 1979.

[18] Jonathan S. Yedidia, William T. Freeman, and Yair Weiss. Generalized belief propagation. In Advances in Neural Information Processing Systems, pages 689–695, Cambridge, MA, 2001. MIT Press.

[19] Carsten Peterson and James R. Anderson. A mean ﬁeld theory learning algorithm for neural networks. Complex Systems, 1:995–1019, 1987.

[20] Martin J. Wainwright, Tommi S. Jaakkola, and Alan S. Willsky. Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudomoment matching. In Proceedings of the International Conference on Articial Intelligence and Statistics, 2003.

[21] Alexandre Bouchard-Cˆ t´ and Michael I. Jordan. Optimization of structured mean ﬁeld objectives. In oe Proceedings of Uncertainty in Artiﬁcal Intelligence, 2009.

[22] Graham Brightwell and Peter Winkler. Counting linear extensions. Order, 1991.

[23] Lars Eilstrup Rasmussen. Approximating the permanent: A simple approach. Random Structures and Algorithms, 1992.

[24] Des G. Higgins and Paul M. Sharp. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene, 73:237–244, 1988.

[25] Chuong B. Do, Mahathi S. P. Mahabhashyam, Michael Brudno, and Seraﬁm Batzoglou. PROBCONS: Probabilistic consistency-based multiple sequence alignment. Genome Research, 15:330–340, 2005.

[26] David B. Searls and Kevin P. Murphy. Automata-theoretic models of mutation and alignment. In Proc Int Conf Intell Syst Mol Biol., 1995. 9