nips2008-115: reference knowledge graph by maker-knowledge-mining
Source: pdf
Authors: Gal Elidan, Stephen Gould
Abstract: With the increased availability of data for complex domains, it is desirable to learn Bayesian network structures that are sufficiently expressive for generalization while also allowing for tractable inference. While the method of thin junction trees can, in principle, be used for this purpose, its fully greedy nature makes it prone to overfitting, particularly when data is scarce. In this work we present a novel method for learning Bayesian networks of bounded treewidth that employs global structure modifications and that is polynomial in the size of the graph and the treewidth bound. At the heart of our method is a triangulated graph that we dynamically update in a way that facilitates the addition of chain structures that increase the bound on the model’s treewidth by at most one. We demonstrate the effectiveness of our “treewidth-friendly” method on several real-life datasets. Importantly, we also show that by using global operators, we are able to achieve better generalization even when learning Bayesian networks of unbounded treewidth.
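The method described in the abstract hinges on maintaining a triangulated graph whose largest clique certifies a bound on the model's treewidth. As a rough illustration only (this is not the authors' treewidth-friendly operator), the Python sketch below shows the standard way such a bound is checked in generic structure search: moralize a candidate DAG, triangulate it with the min-fill elimination heuristic, and reject a candidate move when the bound would be exceeded. The function names (moralize, treewidth_upper_bound) and the min-fill heuristic itself are illustrative assumptions, not taken from the paper.

    # Minimal sketch, assuming a DAG given as a child -> list-of-parents dict.
    from itertools import combinations

    def moralize(parents):
        """Turn a DAG into its undirected moral graph (marry co-parents)."""
        adj = {v: set() for v in parents}
        for child, ps in parents.items():
            for p in ps:
                adj[child].add(p)
                adj[p].add(child)
            for u, w in combinations(ps, 2):  # connect co-parents
                adj[u].add(w)
                adj[w].add(u)
        return adj

    def treewidth_upper_bound(adj):
        """Upper-bound treewidth via min-fill elimination on a graph copy."""
        adj = {v: set(ns) for v, ns in adj.items()}
        width = 0
        while adj:
            # eliminate the vertex whose removal adds the fewest fill-in edges
            def fill_cost(v):
                ns = adj[v]
                return sum(1 for u, w in combinations(ns, 2) if w not in adj[u])
            v = min(adj, key=fill_cost)
            ns = adj[v]
            width = max(width, len(ns))  # clique formed around v has |ns|+1 nodes
            for u, w in combinations(ns, 2):  # add fill-in edges
                adj[u].add(w)
                adj[w].add(u)
            for u in ns:
                adj[u].discard(v)
            del adj[v]
        return width

    # Usage: a structure-search step would reject an edge addition if this
    # bound exceeded the target treewidth.
    parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
    print(treewidth_upper_bound(moralize(parents)))  # prints 2 for this example

Min-fill only upper-bounds the true treewidth and requires re-triangulating after every candidate move; the paper's contribution is precisely to avoid this by dynamically updating a single triangulated graph with global chain operators that provably raise the bound by at most one per iteration.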
[1] P. Abbeel, D. Koller, and A. Ng. Learning factor graphs in polynomial time and sample complexity. JMLR, 2006.
[2] F. Bach and M. I. Jordan. Thin junction trees. In NIPS, 2001.
[3] A. Chechetka and C. Guestrin. Efficient principled learning of thin junction trees. In NIPS, 2008.
[4] D. Chickering. Learning Bayesian networks is NP-complete. In Learning from Data: AI & Stats V, 1996.
[5] C. Chow and C. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 1968.
[6] The International HapMap Consortium. The International HapMap Project. Nature, 2003.
[7] G. F. Cooper. The computational complexity of probabilistic inference using belief networks. Artificial Intelligence, 1990.
[8] P. Dagum and M. Luby. An optimal approximation algorithm for Bayesian inference. Artificial Intelligence, 1993.
[9] A. Darwiche. Recursive conditioning. Artificial Intelligence, 2001.
[10] S. Dasgupta. Learning polytrees. In UAI, 1999.
[11] G. A. Dirac. On rigid circuit graphs. Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 25, 1961.
[12] A. Gasch et al. Genomic expression programs in the response of yeast cells to environmental changes. Molecular Biology of the Cell, 2000.
[13] F. Glover and M. Laguna. Tabu search. In Modern Heuristic Techniques for Combinatorial Problems, 1993.
[14] D. Heckerman. A tutorial on learning Bayesian networks. Technical report, Microsoft Research, 1995.
[15] D. Karger and N. Srebro. Learning Markov networks: maximum bounded tree-width graphs. In Symposium on Discrete Algorithms, 2001.
[16] A. Koster, H. Bodlaender, and S. van Hoesel. Treewidth: Computational experiments. Technical report, Universiteit Utrecht, 2001.
[17] S. Lauritzen and D. Spiegelhalter. Local computations with probabilities on graphical structures. Journal of the Royal Statistical Society, 1988.
[18] R. Marinescu and R. Dechter. AND/OR branch-and-bound for graphical models. In IJCAI, 2005.
[19] C. Meek. Finding a path is harder than finding a tree. Journal of Artificial Intelligence Research, 2001.
[20] M. Meila and M. I. Jordan. Learning with mixtures of trees. JMLR, 2000.
[21] M. Narasimhan and J. Bilmes. PAC-learning bounded tree-width graphical models. In UAI, 2004.
[22] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.
[23] N. Robertson and P. Seymour. Graph minors. II. Algorithmic aspects of tree-width. Journal of Algorithms, 1987.
[24] G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–464, 1978.