NIPS 2003, paper nips2003-167: abstract and references (knowledge-graph record)
Source: pdf
Author: Arnab Nilim, Laurent El Ghaoui
Abstract: Optimal solutions to Markov Decision Problems (MDPs) are highly sensitive to the state transition probabilities. In many practical problems, the estimation of those probabilities is far from accurate; hence, estimation errors are a limiting factor in applying MDPs to real-world problems. We propose an algorithm for solving finite-state and finite-action MDPs whose solution is guaranteed to be robust to estimation errors in the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as that of the original Bellman recursion; hence, robustness can be added at practically no extra computing cost.
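The abstract's claim that robustness adds essentially no cost can be made concrete: each robust Bellman backup replaces the nominal expectation with a worst-case expectation over a KL ball around the estimated transition distribution, and that inner problem reduces to a one-dimensional convex dual. The sketch below is an illustrative reconstruction under those assumptions, not the authors' code; the function names, the MDP data, and the KL radius `beta` are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def worst_case_value(v, q, beta):
    """Adversarial inner problem of a robust Bellman backup (a sketch):
        min_p  p @ v   s.t.  KL(p || q) <= beta,  p a distribution.
    Using the support function of the KL ball,
        max_{KL(p||q)<=beta} p @ u = min_{mu>0} mu*beta + mu*log sum_i q_i e^{u_i/mu},
    the minimum is -sigma(-v), a 1-D convex minimization over mu."""
    u = -np.asarray(v, float)          # negate: min p@v = -max p@(-v)
    q = np.asarray(q, float)
    umax = u.max()
    def dual(mu):
        # log-sum-exp shifted by umax for numerical stability
        return mu * beta + umax + mu * np.log(np.sum(q * np.exp((u - umax) / mu)))
    res = minimize_scalar(dual, bounds=(1e-8, 1e6), method="bounded")
    return -res.fun

def robust_value_iteration(P, R, beta, gamma=0.9, iters=200):
    """Robust value iteration: same loop structure (and hence the same
    per-iteration complexity, up to the cheap 1-D dual) as nominal Bellman.
    P[a][s] is the nominal next-state distribution; R[a][s] the reward."""
    n_actions, n_states = len(P), len(P[0])
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = np.array([[R[a][s] + gamma * worst_case_value(V, P[a][s], beta)
                       for s in range(n_states)] for a in range(n_actions)])
        V = Q.max(axis=0)              # agent maximizes against worst-case nature
    return V
```

With `beta` near zero the KL ball collapses to the nominal distribution and the recursion reduces to standard value iteration; growing `beta` monotonically lowers the robust value, since the adversary's feasible set only enlarges.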
[1] J. Bagnell, A. Ng, and J. Schneider. Solving uncertain Markov decision problems. Technical Report CMU-RI-TR-01-25, Robotics Institute, Carnegie Mellon University, August 2001.
[2] L. El Ghaoui and A. Nilim. Robust solution to Markov decision problems with uncertain transition matrices: proofs and complexity analysis. Technical Report UCB/ERL M04/07, Department of EECS, University of California, Berkeley, January 2004. A related version was submitted to Operations Research in December 2003.
[3] E. Feinberg and A. Shwartz. Handbook of Markov Decision Processes: Methods and Applications. Kluwer Academic Publishers, Boston, 2002.
[4] T. Ferguson. Prior distributions on spaces of probability measures. The Annals of Statistics, 2(4):615–629, 1974.
[5] R. Givan, S. Leach, and T. Dean. Bounded parameter Markov decision processes. In Fourth European Conference on Planning, pages 234–246, 1997.
[6] E. Lehmann and G. Casella. Theory of Point Estimation. Springer-Verlag, New York, USA, 1998.
[7] M. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience, New York, 1994.
[8] J. K. Satia and R. L. Lave. Markov decision processes with uncertain transition probabilities. Operations Research, 21(3):728–740, 1973.
[9] A. Shapiro and A. J. Kleywegt. Minimax analysis of stochastic problems. Optimization Methods and Software, 2002. To appear.
[10] C. C. White and H. K. Eldeib. Markov decision processes with imprecise transition probabilities. Operations Research, 42(4):739–749, 1994.