
39 nips-2004-Coarticulation in Markov Decision Processes


Source: pdf

Author: Khashayar Rohanimanesh, Robert Platt, Sridhar Mahadevan, Roderic Grupen

Abstract: We investigate an approach for simultaneously committing to multiple activities, each modeled as a temporally extended action in a semi-Markov decision process (SMDP). For each activity we define a set of admissible solutions consisting of the redundant set of optimal policies, and those policies that ascend the optimal state-value function associated with the activity. A plan is then generated by merging the activities in such a way that the solutions to the subordinate activities are realized within the set of admissible solutions satisfying the superior activities. We present our theoretical results and empirically evaluate our approach in a simulated domain.
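As a rough illustration of the merging idea in the abstract, the Python sketch below treats each activity's admissible set as the redundant set of (near-)optimal actions, and resolves a subordinate activity only within the superior activity's admissible set. All names here (Q_sup, Q_sub, epsilon) are hypothetical, and the near-optimality test stands in for the paper's fuller admissibility condition (optimal policies plus those that ascend the optimal state-value function), so this is a sketch of the flavor of the approach, not the authors' algorithm.

```python
import numpy as np

def admissible_actions(Q, state, epsilon=1e-6):
    """Redundant set of admissible actions for one activity: every
    action whose value is within epsilon of the best (a stand-in for
    the paper's 'optimal or value-ascending' condition)."""
    q = Q[state]
    return np.flatnonzero(q >= q.max() - epsilon)

def coarticulated_action(Q_sup, Q_sub, state):
    """Merge two activities: among the superior activity's admissible
    actions, pick the one the subordinate activity values most."""
    admissible = admissible_actions(Q_sup, state)
    return admissible[np.argmax(Q_sub[state][admissible])]

# Toy example: one state, four actions. Actions 0 and 2 tie for the
# superior activity; the subordinate activity breaks the tie toward 2.
Q_sup = np.array([[1.0, 0.2, 1.0, 0.5]])
Q_sub = np.array([[0.1, 0.9, 0.8, 0.3]])
print(coarticulated_action(Q_sup, Q_sub, 0))  # -> 2
```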


reference text

[1] C. Boutilier, R. Brafman, and C. Geib. Prioritized goal decomposition of Markov decision processes: Towards a synthesis of classical and decision theoretic planning. In M. Pollack, editor, Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence, pages 1156–1163, San Francisco, CA, 1997. Morgan Kaufmann.

[2] C. Guestrin and G. Gordon. Distributed planning in hierarchical factored MDPs. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, pages 197–206, Edmonton, Canada, 2002.

[3] M. Huber. A Hybrid Architecture for Adaptive Robot Control. PhD thesis, University of Massachusetts, Amherst, 2000.

[4] Y. Nakamura. Advanced Robotics: Redundancy and Optimization. Addison-Wesley, 1991.

[5] T. J. Perkins and A. G. Barto. Lyapunov-constrained action sets for reinforcement learning. In Proceedings of the Eighteenth International Conference on Machine Learning, pages 409–416, San Francisco, CA, 2001. Morgan Kaufmann.

[6] R. Platt, A. Fagg, and R. Grupen. Nullspace composition of control laws for grasping. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2002.

[7] D. Precup. Temporal Abstraction in Reinforcement Learning. PhD thesis, Department of Computer Science, University of Massachusetts, Amherst, 2000.

[8] K. Rohanimanesh, R. Platt, S. Mahadevan, and R. Grupen. A framework for coarticulation in Markov decision processes. Technical Report 04-33, Department of Computer Science, University of Massachusetts, Amherst, 2004. (www.cs.umass.edu/~khash/coarticulation04.pdf)