Authors: Daniel Lowd, Pedro Domingos
Abstract: Arithmetic circuits (ACs) exploit context-specific independence and determinism to allow exact inference even in networks with high treewidth. In this paper, we introduce the first ever approximate inference methods using ACs, for domains where exact inference remains intractable. We propose and evaluate a variety of techniques based on exact compilation, forward sampling, AC structure learning, Markov network parameter learning, variational inference, and Gibbs sampling. In experiments on eight challenging real-world domains, we find that the methods based on sampling and learning work best: one such method (AC2-F) is faster and usually more accurate than loopy belief propagation, mean field, and Gibbs sampling; another (AC2-G) has a running time similar to Gibbs sampling but is consistently more accurate than all baselines.
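To make the abstract's central object concrete, the following is a minimal sketch (not the paper's implementation; all names are hypothetical) of an arithmetic circuit for a two-variable Bayesian network P(A)P(B|A). Leaves are indicator (lambda) and parameter (theta) nodes; internal nodes are sums and products. Evaluating the circuit bottom-up with indicators set according to the evidence yields the probability of that evidence, which is what makes exact inference a single pass over the circuit.

```python
# Sketch of an arithmetic circuit (AC) for the network P(A)P(B|A).
# Nodes are plain dicts; "theta" leaves hold CPT parameters, "lambda"
# leaves look up indicator values, and internal nodes are sums/products.

def make_node(kind, children=None, value=None):
    return {"kind": kind, "children": children or [], "value": value}

def evaluate(node, indicators):
    """Evaluate the circuit bottom-up for a given indicator assignment."""
    if node["kind"] == "theta":
        return node["value"]
    if node["kind"] == "lambda":
        return indicators[node["value"]]  # value is the indicator's name
    vals = [evaluate(c, indicators) for c in node["children"]]
    if node["kind"] == "product":
        out = 1.0
        for v in vals:
            out *= v
        return out
    return sum(vals)  # sum node

# Example network: P(A=1)=0.3; P(B=1|A=1)=0.9, P(B=1|A=0)=0.2.
def branch(p_a, a_name, p_b1):
    # One branch of the outer sum: theta_a * lambda_a * (sum over B).
    return make_node("product", [
        make_node("theta", value=p_a),
        make_node("lambda", value=a_name),
        make_node("sum", [
            make_node("product", [make_node("theta", value=p_b1),
                                  make_node("lambda", value="b1")]),
            make_node("product", [make_node("theta", value=1 - p_b1),
                                  make_node("lambda", value="b0")]),
        ]),
    ])

root = make_node("sum", [branch(0.3, "a1", 0.9), branch(0.7, "a0", 0.2)])

# All indicators 1: the circuit sums out everything and returns 1.0.
print(evaluate(root, {"a1": 1, "a0": 1, "b1": 1, "b0": 1}))  # 1.0
# Evidence B=1: P(B=1) = 0.3*0.9 + 0.7*0.2 = 0.41
print(evaluate(root, {"a1": 1, "a0": 1, "b1": 1, "b0": 0}))  # 0.41
```

Conditional queries follow by taking ratios of two such evaluations, e.g. P(A=1 | B=1) = eval(a1 only, b1 only) / eval(b1 only); the paper's approximate methods come into play when a compact exact circuit like this cannot be built.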
[1] A. Darwiche. A differential approach to inference in Bayesian networks. Journal of the ACM, 50(3):280–305, 2003.
[2] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, CA, 1988.
[3] C. Boutilier, N. Friedman, M. Goldszmidt, and D. Koller. Context-specific independence in Bayesian networks. In Proc. of the 12th Conference on Uncertainty in Artificial Intelligence, pages 115–123, Portland, OR, 1996. Morgan Kaufmann.
[4] N. Friedman and M. Goldszmidt. Learning Bayesian networks with local structure. In Proc. of the 12th Conference on Uncertainty in Artificial Intelligence, pages 252–262, Portland, OR, 1996. Morgan Kaufmann.
[5] D. Chickering, D. Heckerman, and C. Meek. A Bayesian approach to learning Bayesian networks with local structure. In Proc. of the 13th Conference on Uncertainty in Artificial Intelligence, pages 80–89, Providence, RI, 1997. Morgan Kaufmann.
[6] Arthur Choi and Adnan Darwiche. A variational approach for approximating Bayesian networks by edge deletion. In Proc. of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI-06), Arlington, Virginia, 2006. AUAI Press.
[7] E. P. Xing, M. I. Jordan, and S. Russell. Graph partition strategies for generalized mean field inference. In Proc. of the 20th Conference on Uncertainty in Artificial Intelligence, pages 602–610, Banff, Canada, 2004.
[8] D. Geiger, C. Meek, and Y. Wexler. A variational inference procedure allowing internal structure for overlapping clusters and deterministic constraints. Journal of Artificial Intelligence Research, 27:1–23, 2006.
[9] M. Chavira and A. Darwiche. Compiling Bayesian networks using variable elimination. In Proc. of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 2443–2449, 2007.
[10] R. Dechter and R. Mateescu. AND/OR search spaces for graphical models. Artificial Intelligence, 171:73–106, 2007.
[11] R. Dechter. Bucket elimination: a unifying framework for reasoning. Artificial Intelligence, 113:41–85, 1999.
[12] Y. Wexler and C. Meek. MAS: a multiplicative approximation scheme for probabilistic inference. In Advances in Neural Information Processing Systems 22, Cambridge, MA, 2008. MIT Press.
[13] D. Lowd and P. Domingos. Learning arithmetic circuits. In Proc. of the 24th Conference on Uncertainty in Artificial Intelligence, Helsinki, Finland, 2008. AUAI Press.
[14] G. Hulten and P. Domingos. Mining complex models from arbitrarily large databases in constant time. In Proc. of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 525–531, Edmonton, Canada, 2002. ACM Press.
[15] Y. Wang, N. L. Zhang, and T. Chen. Latent tree models and approximate inference in Bayesian networks. Journal of Artificial Intelligence Research, 32:879–900, 2008.
[16] P. Liang, H. Daumé III, and D. Klein. Structure compilation: trading structure for features. In Proc. of the 25th International Conference on Machine Learning, pages 592–599, Helsinki, Finland, 2008. ACM.
[17] D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(3):503–528, 1989.
[18] D. M. Chickering. The WinMine toolkit. Technical Report MSR-TR-2002-103, Microsoft, Redmond, WA, 2002.
[19] J. Davis and P. Domingos. Bottom-up learning of Markov network structure. In Proc. of the 27th International Conference on Machine Learning, Haifa, Israel, 2010. ACM Press.
[20] C. K. Chow and C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14:462–467, 1968.