nips nips2013 nips2013-347 nips2013-347-reference knowledge-graph by maker-knowledge-mining

347 nips-2013-Variational Planning for Graph-based MDPs


Source: pdf

Author: Qiang Cheng, Qiang Liu, Feng Chen, Alex Ihler

Abstract: Markov Decision Processes (MDPs) are extremely useful for modeling and solving sequential decision making problems. Graph-based MDPs provide a compact representation for MDPs with large numbers of random variables. However, the complexity of exactly solving a graph-based MDP usually grows exponentially in the number of variables, which limits their application. We present a new variational framework to describe and solve the planning problem of MDPs, and derive both exact and approximate planning algorithms. In particular, by exploiting the graph structure of graph-based MDPs, we propose a factored variational value iteration algorithm in which the value function is first approximated by the multiplication of local-scope value functions, then solved by minimizing a Kullback-Leibler (KL) divergence. The KL divergence is optimized using the belief propagation algorithm, with complexity exponential in only the cluster size of the graph. Experimental comparison on different models shows that our algorithm outperforms existing approximation algorithms at finding good policies. 1


reference text

R Iris Bahar, Erica A Frohm, Charles M Gaona, Gary D Hachtel, Enrico Macii, Abelardo Pardo, and Fabio Somenzi. Algebraic decision diagrams and their applications. In IEEE/ACM International Conference on Computer-Aided Design, pages 188–191, 1993. David Barber. Bayesian reasoning and machine learning. Cambridge University Press, 2011. Craig Boutilier, Richard Dearden, and Mois´ s Goldszmidt. Stochastic dynamic programming with factored e representations. Artificial Intelligence, 121(1):49–107, 2000. Nicklas Forsell and R´ gis Sabbadin. Approximate linear-programming algorithms for graph-based markov e decision processes. Frontiers in Artificial Intelligence and Applications, 141:590, 2006. Carlos Guestrin, Daphne Koller, and Ronald Parr. Multiagent planning with factored MDPs. Advances in Neural Information Processing Systems, 14:1523–1530, 2001. Carlos Guestrin, Daphne Koller, Ronald Parr, and Shobha Venkataraman. Efficient solution algorithms for factored mdps. Journal of Artificial Intelligence Research, 19:399–468, 2003. Jesse Hoey, Robert St-Aubin, Alan Hu, and Craig Boutilier. SPUDD: Stochastic planning using decision diagrams. In Proceedings of the Fifteenth conference on Uncertainty in Artificial Intelligence, pages 279– 288, 1999. Daphne Koller and Ronald Parr. Policy iteration for factored mdps. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, pages 326–334, 2000. Qiang Liu and Alexander Ihler. Variational algorithms for marginal MAP. In Uncertainty in Artificial Intelligence (UAI), 2011. Qiang Liu and Alexander Ihler. Belief propagation for structured decision making. In Uncertainty in Artificial Intelligence (UAI), pages 523–532, August 2012. A. Nath and P. Domingos. Efficient belief propagation for utility maximization and repeated inference. In The Proceeding of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010. Nathalie Peyrard and R´ gis Sabbadin. Mean field approximation of the policy iteration algorithm for graphe based markov decision processes. Frontiers in Artificial Intelligence and Applications, 141:595, 2006. Martin L Puterman. Markov decision processes: discrete stochastic dynamic programming. Wiley-Interscience, 2009. Matthew Richardson and Pedro Domingos. Mining knowledge-sharing sites for viral marketing. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 61– 70, 2002. R. Sabbadin, N. Peyrard, and N. Forsell. A framework and a mean-field algorithm for the local control of spatial processes. International Journal of Approximate Reasoning, 53(1):66 – 86, 2012. Ross D Shachter. Model building with belief networks and influence diagrams. Advances in decision analysis: from foundations to applications, pages 177–201, 2007. Olivier Sigaud, Olivier Buffet, et al. Markov decision processes in artificial intelligence. ISTE-Jonh Wiley & Sons, 2010. M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008. 9