jmlr jmlr2008 jmlr2008-35 jmlr2008-35-reference knowledge-graph by maker-knowledge-mining

35 jmlr-2008-Finding Optimal Bayesian Network Given a Super-Structure

Source: pdf

Author: Eric Perrier, Seiya Imoto, Satoru Miyano

Abstract: Classical approaches used to learn Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm improves sensitively accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains on the directed acyclic graphs (DAG) considered during the search-and-score phase. Subsequently, we theorize the structural constraint by introducing the concept of super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We develop a super-structure constrained optimal search (COS): its time complexity is upper bounded by O(γm n ), where γm < 2 depends on the maximal degree m of S. Empirically, complexity depends on the average degree m and sparse structures ˜ allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders and even ﬁnds more accurate results when given a sound super-structure. Practically, S can be approximated by IT approaches; signiﬁcance level of the tests controls its sparseness, enabling to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still enables to signiﬁcantly outperform other heuristic searches. Keywords: subset Bayesian networks, structure learning, optimal search, super-structure, connected

reference text

H. Akaike. A new look at the statistical model identiﬁcation. IEEE Transactions on Automatic Control, 19:716–723, 1974. I. A. Beinlich, H. Suermondt, R. Chavez, G. Cooper, and et al. The alarm monitoring system: A case study with two probabilistic inference techniques for belief networks. In Second European Conference in Artiﬁcial Intelligence in Medicine, 1989. A. Bj¨ rklund, T. Husfeldt, P. Kaski, and M. Koivisto. The travelling salesman problem in bounded o degree graphs. In Proceedings of the 35th International Colloquium on Automata, Languages and Programming, 2008. R. Bouckaert. Bayesian Belief Networks from Construction to Inference. PhD thesis, University of Utrecht, 1995. J. Cheng, R. Greiner, J. Kelly, D.A. Bell, and W. Liu. Learning Bayesian networks from data: an information-theory based approach. Artiﬁcial Intelligence, 137:43–90, 2002. D. Chickering. Learning Bayesian networks is NP-complete. Learning from Data: Artiﬁcial Intelligence and Statistics V, pages 121–130, 1996. D.M. Chickering. A transformational characterization of equivalent Bayesian network structures. In Proceedings of the Eleventh Conference on Uncertainty in Artiﬁcial Intelligence, pages 87–98. Morgan Kaufman, 1995. D.M. Chickering. Learning equivalence classes of Bayesian-network structures. Journal of Machine Learning Research, pages 445–498, 2002b. D.M. Chickering, D. Geiger, and D. Heckerman. Learning Bayesian networks: Search methods and experimental results. In Fifth International Workshop on Artiﬁcial Intelligence and Statistics, pages 112–128, 1995. G.F. Cooper and E. Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4):309–347, 1992. R.G. Cowell, A.P. Dawid, S.L. Lauritzen, and D.J. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer, 1999. N. Friedman, I. Nachman, and D. Pe’er. Learning Bayesian network structure from massive datasets: The “sparse candidate” algorithm. In Fifteenth Conference on Uncertainty in Artiﬁcial Intelligence, UAI-99, 1999. 2284 F INDING O PTIMAL BAYESIAN N ETWORK G IVEN A S UPER -S TRUCTURE N. Friedman, M. Linial, I. Nachman, and D. Pe’er. Using Bayesian networks to analyze expression data. Computational Biology, 7:601–620, 2000. C.N. Glymour. The Mind’s Arrows: Bayes Nets & Graphical Causal Models in Psychology. MIT Press, 2001. C.N. Glymour and G.F. Cooper. Computation, Causation, and Discovery. AAAI Press / The MIT Press, 1999. D. Heckerman. A tutorial on learning with Bayesian networks. Technical report, Microsoft Research, 1996. D.E. Heckerman, D. Geiger, and D.M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20:197–243, 1995. S. Imoto, T. Goto, and S. Miyano. Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression. Paciﬁc Symposium on Biocomputing, 7:175–186, 2002. M. Kalisch and P. B¨ hlmann. Estimating high-dimensional directed acyclic graphs with the PCu algorithm. Journal of Machine Learning Research, 8:613–636, 2007. M. Koivisto and K. Sood. Exact Bayesian structure discovery in Bayesian networks. Journal of Machine Learning Research, 5:549–573, 2004. D. Margaritis and S. Thrun. Bayesian network induction via local neighborhoods. In S.A. Solla, T.K. Leen, and K.R. M¨ ller, editors, Advances in Neural Information Processing Systems, volume 12, u pages 505–511. MIT Press, 2000. C. Meek. Strong completeness and faithfulness in Bayesian networks. In Conference on Uncertainty in Artiﬁcial Intelligence, pages 411–418, 1995. A. Moore and W. Wong. Optimal reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning. In Twentieth International Conference on Machine Learning, ICML-2003, 2003. R. Neapolitan. Learning Bayesian Networks. Prentice Hall, 2003. S. Ott and S. Miyano. Finding optimal gene networks using biological constraints. Genome Informatics, 14:124–133, 2003. S. Ott, S. Imoto, and S. Miyano. Finding optimal models for small gene networks. Paciﬁc Symposium on Biocomputing, 9:557–567, 2004. S. Ott, A. Hansen, S.Y. Kim, and S. Miyano. Superiority of network motifs over optimal networks and an application to the revelation of gene network evolution. Bioinformatics, 21(2):227–238, 2005. J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufman Publishers, San Mateo, CA, 1988. 2285 P ERRIER J. Rissanen. Modeling by shortest data description. Automatica, 14:465–671, 1978. R. Robinson. Counting labelled acyclic digraphs. In New Directions in the Theory of Graphs, pages 239–273. Academic Press, 1973. G. Schwartz. Estimating the dimension of a model. The Annals of Statistics, 6:461–464, 1978. E. Segal, D. Pe’er, A. Regev, D. Koller, and N. Friedman. Learning module networks. Journal of Machine Learning Research, 6:557–588, 2005. T. Silander and P. Myllym¨ ki. A simple approach for ﬁnding the globally optimal Bayesian network a structure. In Conference on Uncertainty in Artiﬁcial Intelligence, pages 445–452, 2006. A.P. Singh and A.W. Moore. Finding optimal Bayesian networks by dynamic programming. Technical report, Carnegie Mellon University, 2005. P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction, and Search. MIT Press, second edition, 2000. H. Steck and T. Jaakkola. On the Dirichlet prior and Bayesian regularization. In Advances in Neural Information Processing Systems, volume 15, 2002. J. Suzuki. Learning Bayesian belief networks based on the MDL principle: an efﬁcient algorithm using branch and bound technique. IEICE Transactions on Information and Systems, 12, 1998. M. Teyssier and D. Koller. Ordering-based search: a simple and effective algorithm for learning Bayesian networks. In Proceedings of the 21th Annual Conference on Uncertainty in Artiﬁcial Intelligence, 2005. I. Tsamardinos, L.E. Brown, and C.F. Aliferi. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65:31–78, 2006. 2286