
48 jmlr-2008-Learning Bounded Treewidth Bayesian Networks


Source: pdf

Author: Gal Elidan, Stephen Gould

Abstract: With the increased availability of data for complex domains, it is desirable to learn Bayesian network structures that are sufficiently expressive for generalization while at the same time allowing for tractable inference. While the method of thin junction trees can, in principle, be used for this purpose, its fully greedy nature makes it prone to overfitting, particularly when data is scarce. In this work we present a novel method for learning Bayesian networks of bounded treewidth that employs global structure modifications and that is polynomial both in the size of the graph and the treewidth bound. At the heart of our method is a dynamic triangulation that we update in a way that facilitates the addition of chain structures that increase the bound on the model’s treewidth by at most one. We demonstrate the effectiveness of our “treewidth-friendly” method on several real-life data sets and show that it is superior to the greedy approach as soon as the bound on the treewidth is nontrivial. Importantly, we also show that by making use of global operators, we are able to achieve better generalization even when learning Bayesian networks of unbounded treewidth.

Keywords: Bayesian networks, structure learning, model selection, bounded treewidth
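As background for the treewidth bound that the abstract revolves around: the treewidth of a graph can be upper-bounded by simulating a vertex-elimination ordering, where eliminating a vertex connects all of its remaining neighbors (the triangulation fill-in) and the bound is the largest neighborhood eliminated. The sketch below uses the common min-degree heuristic; it is an illustrative example only, not the paper's dynamic-triangulation algorithm, and the function name and dictionary graph encoding are my own.

```python
from itertools import combinations

def treewidth_upper_bound(adj):
    """Upper-bound the treewidth of an undirected graph.

    Simulates a min-degree elimination ordering: eliminating a vertex
    adds fill-in edges between all its remaining neighbors (as in a
    triangulation); the returned bound is the largest neighborhood
    size seen during elimination.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # defensive copy
    bound = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))  # min-degree heuristic
        nbrs = adj.pop(v)
        bound = max(bound, len(nbrs))
        for a, b in combinations(nbrs, 2):       # fill-in edges
            adj[a].add(b)
            adj[b].add(a)
        for u in nbrs:                           # detach eliminated vertex
            adj[u].discard(v)
    return bound

# A 4-cycle has treewidth 2; a path (a tree) has treewidth 1.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

For chordal graphs the heuristic is exact (treewidth equals the largest clique size minus one); in general it only gives an upper bound, which is why the paper's method maintains the triangulation incrementally rather than recomputing it.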


reference text

P. Abbeel, D. Koller, and A. Y. Ng. Learning factor graphs in polynomial time and sample complexity. Journal of Machine Learning Research, 7:1743–1788, 2006.

F. Bach and M. I. Jordan. Thin junction trees. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, Cambridge, MA, 2002. MIT Press.

A. Beygelzimer and I. Rish. Approximability of probability distributions. In Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA, 2004.

J. Bilmes and R. Dechter. Evaluation of probabilistic inference. The Twenty-Second Conference on Uncertainty in Artificial Intelligence, ssli.ee.washington.edu/∼bilmes/UAI06InferenceEvaluation, 2006.

H. L. Bodlaender. A linear time algorithm for finding tree-decompositions of small treewidth. SIAM Journal on Computing, 25:1305–1317, 1996.

A. Chechetka and C. Guestrin. Efficient principled learning of thin junction trees. In Advances in Neural Information Processing Systems 20, pages 273–280. MIT Press, Cambridge, MA, 2008.

D. M. Chickering. Learning Bayesian networks is NP-complete. In D. Fisher and H. J. Lenz, editors, Learning from Data: Artificial Intelligence and Statistics V, pages 121–130. Springer-Verlag, New York, 1996.

C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14:462–467, 1968.

The International HapMap Consortium. The International HapMap Project. Nature, 426:789–796, 2003.

G. F. Cooper. The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence, 42:393–405, 1990.

P. Dagum and M. Luby. An optimal approximation algorithm for Bayesian inference. Artificial Intelligence, 60:141–153, 1993.

A. Darwiche. Recursive conditioning. Artificial Intelligence, 126, 2001.

S. Dasgupta. Learning polytrees. In K. Laskey and H. Prade, editors, Proc. Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI ’99), pages 134–141, San Francisco, 1999. Morgan Kaufmann.

A. Deshpande, C. Guestrin, S. Madden, J. Hellerstein, and W. Hong. Model-driven data acquisition in sensor networks. In Proceedings of the Very Large Data Bases (VLDB) Conference, 2004.

R. Diestel. Graph Theory. Springer, 3rd edition, 2005.

G. A. Dirac. On rigid circuit graphs. Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 25, 1961.

G. Elidan, I. Nachman, and N. Friedman. “Ideal parent” structure learning for continuous variable Bayesian networks. Journal of Machine Learning Research, 8:1799–1833, 2007.

N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7:601–620, 2000.

A. Gasch, P. Spellman, C. Kao, O. Carmel-Harel, M. Eisen, G. Storz, D. Botstein, and P. Brown. Genomic expression program in the response of yeast cells to environmental changes. Molecular Biology of the Cell, 11:4241–4257, 2000.

F. Glover and M. Laguna. Tabu search. In C. Reeves, editor, Modern Heuristic Techniques for Combinatorial Problems, Oxford, England, 1993. Blackwell Scientific Publishing.

D. Heckerman. A tutorial on learning with Bayesian networks. In M. I. Jordan, editor, Learning in Graphical Models. Kluwer, Dordrecht, Netherlands, 1998.

D. Heckerman, D. Geiger, and D. M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20:197–243, 1995.

D. Karger and N. Srebro. Learning Markov networks: maximum bounded tree-width graphs. In Symposium on Discrete Algorithms, pages 392–401, 2001.

A. Koster, H. Bodlaender, and S. Van Hoesel. Treewidth: Computational experiments. Technical report, Universiteit Utrecht, 2001.

A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In F. Bacchus and T. Jaakkola, editors, Proc. Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI ’05), San Francisco, 2005. Morgan Kaufmann.

W. Lam and F. Bacchus. Learning Bayesian belief networks: An approach based on the MDL principle. Computational Intelligence, 10:269–293, 1994.

S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B, 50(2):157–224, 1988.

R. Marinescu and R. Dechter. AND/OR branch-and-bound for graphical models. In IJCAI, 2005.

C. Meek. Finding a path is harder than finding a tree. Journal of Artificial Intelligence Research, 15:383–389, 2001.

M. Meila and M. I. Jordan. Learning with mixtures of trees. Journal of Machine Learning Research, 1:1–48, 2000.

M. Narasimhan and J. Bilmes. PAC-learning bounded tree-width graphical models. In M. Chickering and J. Halpern, editors, Proc. Twentieth Conference on Uncertainty in Artificial Intelligence (UAI ’04), San Francisco, 2004. Morgan Kaufmann.

J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.

N. Robertson and P. D. Seymour. Graph minors. II. Algorithmic aspects of tree-width. Journal of Algorithms, 7:309–322, 1987.

G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–464, 1978.

R. Tarjan and M. Yannakakis. Simple linear time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Journal on Computing, 13(3):566–579, 1984.