nips nips2003 nips2003-169 nips2003-169-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Mark A. Paskin
Abstract: Rao–Blackwellization is an approximation technique for probabilistic inference that flexibly combines exact inference with sampling. It is useful in models where conditioning on some of the variables leaves a simpler inference problem that can be solved tractably. This paper presents Sample Propagation, an efficient implementation of Rao–Blackwellized approximate inference for a large class of models. Sample Propagation tightly integrates sampling with message passing in a junction tree, and is named for its simple, appealing structure: it walks the clusters of a junction tree, sampling some of the current cluster’s variables and then passing a message to one of its neighbors. We discuss the application of Sample Propagation to conditional Gaussian inference problems such as switching linear dynamical systems. 1
X1 X2 Y1 Y2 Z1 Z2 XT ... YT ZT ... (a) X1, X2, X2, X3, Z1, Z2 XT - 1, XT, ... Z2, Z3 Z T - 1, Z T (b) Assumed Density Filtering Sample Propagation Gibbs Sampling 9 average position error Lauritzen’s algorithm is intractable in this case because any strongly rooted junction tree for this network must have a cluster containing all of the discrete variables [3, Thm. 3.18]. Therefore, instead of comparing our approximate position estimates to the correct answer, we sampled a trajectory from the network and computed the average position error to the (unobserved) ground truth. Both Gibbs sampling and Sample Propagation were run with a forwards–backwards sampling schedule; Sample Propagation used the junction tree of Figure 2(b).6 Both algorithms were started in the same state and both were allowed to “burn in” for five forwards–backwards passes. We repeated this 10 times and averaged the results over trials. Figure 2(c) shows that Sample Propagation converged much more quickly than Gibbs sampling. Also, Sample Propagation found better answers than Assumed Density Filtering (a standard algorithm for this problem), but at increased computational cost. 8 7 6 5 4 3 5 10 6 7 8 10 10 10 floating point operations 9 10 (c) Figure 2: The TRACKING example.
[1] G. Shafer and P. Shenoy. Probability propagation. Annals of Mathematics and Artificial Intelligence, 2:327–352, 1990.
[2] R. Cowell, P. Dawid, S. Lauritzen, and D. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer, 1999.
[3] U. Lerner. Hybrid Bayesian Networks for Reasoning About Complex Systems. PhD thesis, Stanford University, October 2002.
[4] A. Dawid, U. Kjærulff, and S. Lauritzen. Hybrid propagation in junction trees. In Advances in Intelligent Computing, volume 945 of Lecture Notes in Computer Science. Springer, 1995.
[5] U. Kjærulff. HUGS: Combining exact inference and Gibbs sampling in junction trees. In Proc. of the 11th Conf. on Uncertainty in Artificial Intelligence (UAI-95). Morgan Kaufmann, 1995.
[6] A. Doucet, N. de Freitas, K. Murphy, and S. Russell. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In Proc. of the 16th Conf. on Uncertainty in AI (UAI-00), 2000.
[7] B. Bidyuk and R. Dechter. An empirical study of w-cutset sampling for Bayesian networks. In Proc. of the 19th Conf. on Uncertainty in AI (UAI-03). Morgan Kaufmann, 2003.
[8] R. Neal. Probabilistic inference using Markov chain Monte Carlo methods. Technical Report CRG-TR-93-1, University of Toronto, 1993.
[9] C. S. Jensen, A. Kong, and U. Kjærulff. Blocking Gibbs sampling in very large probabilistic expert systems. International Journal of Human-Computer Studies, 42:647–666, 1995.
[10] Y. W. Teh and M. Welling. On improving the efficiency of the iterative proportional fitting procedure. In Proc. of the 9th Int’l. Workshop on AI and Statistics (AISTATS-03), 2003.
[11] U. Lerner and R. Parr. Inference in hybrid networks: Theoretical limits and practical algorithms. In Proc. of the 17th Conf. on Uncertainty in AI (UAI-01). Morgan Kaufmann, 2001.
[12] S. Lauritzen. Propagation of probabilities, means, and variances in mixed graphical association models. Journal of the American Statistical Association, 87(420):1098–1108, 1992.
[13] C. Carter and R. Kohn. Markov chain Monte Carlo in conditionally Gaussian state space models. Biometrika, 83:589–601, 1996. 6 Carter & Kohn [13] describe a specialized algorithm for this model that is similar to a version of Sample Propagation that does not resample the discrete variables on the backwards pass.