NIPS 2003, paper nips2003-167: abstract and references (knowledge-graph record)
Source: pdf
Author: Arnab Nilim, Laurent El Ghaoui
Abstract: Optimal solutions to Markov Decision Problems (MDPs) are highly sensitive to the state transition probabilities. In many practical problems, the estimation of those probabilities is far from accurate; hence, estimation errors are a limiting factor in applying MDPs to real-world problems. We propose an algorithm for solving finite-state and finite-action MDPs whose solution is guaranteed to be robust to estimation errors in the state transition probabilities. Our algorithm involves a statistically accurate yet numerically efficient representation of uncertainty, via Kullback-Leibler divergence bounds. The worst-case complexity of the robust algorithm is the same as that of the original Bellman recursion; hence, robustness can be added at practically no extra computing cost.
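The abstract's claim that robustness adds essentially no cost can be made concrete: each robust Bellman backup replaces the nominal expectation with a worst-case expectation over a KL ball around the estimated transition distribution, and that inner problem reduces to a one-dimensional convex dual. The sketch below is an illustrative reconstruction under those assumptions, not the authors' code; the function names, the MDP data, and the KL radius `beta` are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def worst_case_value(v, q, beta):
    """Adversarial inner problem of a robust Bellman backup (a sketch):
        min_p  p @ v   s.t.  KL(p || q) <= beta,  p a distribution.
    Using the support function of the KL ball,
        max_{KL(p||q)<=beta} p @ u = min_{mu>0} mu*beta + mu*log sum_i q_i e^{u_i/mu},
    the minimum is -sigma(-v), a 1-D convex minimization over mu."""
    u = -np.asarray(v, float)          # negate: min p@v = -max p@(-v)
    q = np.asarray(q, float)
    umax = u.max()
    def dual(mu):
        # log-sum-exp shifted by umax for numerical stability
        return mu * beta + umax + mu * np.log(np.sum(q * np.exp((u - umax) / mu)))
    res = minimize_scalar(dual, bounds=(1e-8, 1e6), method="bounded")
    return -res.fun

def robust_value_iteration(P, R, beta, gamma=0.9, iters=200):
    """Robust value iteration: same loop structure (and hence the same
    per-iteration complexity, up to the cheap 1-D dual) as nominal Bellman.
    P[a][s] is the nominal next-state distribution; R[a][s] the reward."""
    n_actions, n_states = len(P), len(P[0])
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = np.array([[R[a][s] + gamma * worst_case_value(V, P[a][s], beta)
                       for s in range(n_states)] for a in range(n_actions)])
        V = Q.max(axis=0)              # agent maximizes against worst-case nature
    return V
```

With `beta` near zero the KL ball collapses to the nominal distribution and the recursion reduces to standard value iteration; growing `beta` monotonically lowers the robust value, since the adversary's feasible set only enlarges.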
[1] J. Bagnell, A. Ng, and J. Schneider. Solving uncertain Markov decision problems. Technical Report CMU-RI-TR-01-25, Robotics Institute, Carnegie Mellon University, August 2001.
[2] L. El Ghaoui and A. Nilim. Robust solution to Markov decision problems with uncertain transition matrices: proofs and complexity analysis. Technical Report UCB/ERL M04/07, Department of EECS, University of California, Berkeley, January 2004. A related version was submitted to Operations Research in December 2003.
[3] E. Feinberg and A. Shwartz. Handbook of Markov Decision Processes: Methods and Applications. Kluwer Academic Publishers, Boston, 2002.
[4] T. Ferguson. Prior distributions on spaces of probability measures. The Annals of Statistics, 2(4):615–629, 1974.
[5] R. Givan, S. Leach, and T. Dean. Bounded parameter Markov decision processes. In Fourth European Conference on Planning, pages 234–246, 1997.
[6] E. Lehmann and G. Casella. Theory of Point Estimation. Springer-Verlag, New York, USA, 1998.
[7] M. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience, New York, 1994.
[8] J. K. Satia and R. L. Lave. Markov decision processes with uncertain transition probabilities. Operations Research, 21(3):728–740, 1973.
[9] A. Shapiro and A. J. Kleywegt. Minimax analysis of stochastic problems. Optimization Methods and Software, 2002. To appear.
[10] C. C. White and H. K. Eldeib. Markov decision processes with imprecise transition probabilities. Operations Research, 42(4):739–749, 1994.