
15 jmlr-2005-Asymptotic Model Selection for Naive Bayesian Networks


Source: pdf

Author: Dmitry Rusakov, Dan Geiger

Abstract: We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features. This formula deviates from the standard BIC score. Our work provides a concrete example that the BIC score is generally incorrect for statistical models that belong to stratified exponential families. This claim stands in contrast to linear and curved exponential families, where the BIC score has been proven to provide a correct asymptotic approximation for the marginal likelihood.

Keywords: Bayesian networks, asymptotic model selection, Bayesian information criterion (BIC)
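For orientation, the standard BIC score that the abstract refers to can be written as below. This is the textbook form from Schwarz (1978), not the paper's corrected formula, which is not reproduced on this page. The symbols are assumptions introduced here for illustration: $D$ is a data set of $N$ samples, $\hat{\omega}$ the maximum likelihood parameters, and $d$ the number of free parameters. For a naive Bayesian network with one binary hidden class and $n$ binary features, counting parameters gives $d = 2n + 1$ (one for the class prior, two per feature).

```latex
% Standard BIC approximation to the log marginal likelihood (Schwarz, 1978).
% D: data of N samples, M: model, \hat{\omega}: maximum likelihood parameters,
% d: number of free parameters of M.
\ln P(D \mid M) \;\approx\; \ln P(D \mid \hat{\omega}, M) \;-\; \frac{d}{2} \ln N

% Parameter count for a naive Bayesian network with a binary hidden class
% and n binary features: 1 (class prior) + 2n (feature conditionals).
d = 2n + 1
```

The abstract's claim is that this approximation does not hold in general for stratified exponential families, which include the naive Bayesian model above.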


reference text

Shreeram S. Abhyankar. Algebraic Geometry for Scientists and Engineers. Number 35 in Mathematical Surveys and Monographs. American Mathematical Society, 1990.
Hirotugu Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723, December 1974.
M. F. Atiyah. Resolution of singularities and division of distributions. Communications on Pure and Applied Mathematics, 23:145–150, 1970.
Peter Cheeseman and John Stutz. Bayesian classification (AutoClass): Theory and results. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 153–180. AAAI Press, 1995.
Gregory F. Cooper and Edward Herskovits. A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9(4):309–347, October 1992.
Morris H. DeGroot. Optimal Statistical Decisions. McGraw-Hill Book Company, 1970.
Nir Friedman, Dan Geiger, and Moises Goldszmidt. Bayesian network classifiers. Machine Learning, 29(2-3):131–163, 1997.
Dan Geiger, David Heckerman, Henry King, and Christopher Meek. Stratified exponential families: Graphical models and model selection. Annals of Statistics, 29(2):505–529, 2001.
Dan Geiger, David Heckerman, and Christopher Meek. Asymptotic model selection for directed networks with hidden variables. In Eric Horvitz and Finn Jensen, editors, Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, pages 283–290. Morgan Kaufmann Publishers, Inc., 1996.
Dominique Haughton. On the choice of a model to fit data from an exponential family. Annals of Statistics, 16(1):342–355, 1988.
David Heckerman, Dan Geiger, and David M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197–243, 1995.
Heisuke Hironaka. Resolution of singularities of an algebraic variety over a field of characteristic zero. Annals of Mathematics, 79(1,2):109–326, 1964.
Christine Keribin. Consistent estimation of the order of mixture models. Sankhya, Series A, 62(1), February 2000.
Serge Lang. Complex Analysis. Springer-Verlag, 3rd edition, 1993.
Steffen L. Lauritzen. Graphical Models. Number 17 in Oxford Statistical Science Series. Clarendon Press, 1996.
James D. Murray. Asymptotic Analysis. Number 48 in Applied Mathematical Sciences. Springer-Verlag, 1984.
Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
Dmitry Rusakov and Dan Geiger. Asymptotic model selection for naive Bayesian networks. In Adnan Darwiche and Nir Friedman, editors, Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI-02), 2002.
Gideon Schwarz. Estimating the dimension of a model. Annals of Statistics, 6(2):461–464, 1978.
Raffaella Settimi and Jim Q. Smith. On the geometry of Bayesian graphical models with hidden variables. In Gregory F. Cooper and Serafin Moral, editors, Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 472–479. Morgan Kaufmann Publishers, Inc., 1998.
Raffaella Settimi and Jim Q. Smith. Geometry, moments and conditional independence trees with hidden variables. Annals of Statistics, 28:1179–1205, 2000.
Peter Spirtes, T. Richardson, and Christopher Meek. The dimensionality of mixed ancestral graphs. Technical Report CMU-PHIL-83, Philosophy Department, Carnegie Mellon University, 1997.
Sumio Watanabe. Algebraic analysis for nonidentifiable learning machines. Neural Computation, 13(4):899–933, 2001.
Roderick Wong. Asymptotic Approximations of Integrals. Computer Science and Scientific Computing. Academic Press, 1989.