148 nips-2005-Online Discovery and Learning of Predictive State Representations


Source: pdf

Author: Peter McCracken, Michael Bowling

Abstract: Predictive state representations (PSRs) are a method of modeling dynamical systems using only observable data, such as actions and observations, to describe their model. PSRs use predictions about the outcome of future tests to summarize the system state. The best existing techniques for discovery and learning of PSRs use a Monte Carlo approach to explicitly estimate these outcome probabilities. In this paper, we present a new algorithm for discovery and learning of PSRs that uses a gradient descent approach to compute the predictions for the current state. The algorithm takes advantage of the large amount of structure inherent in a valid prediction matrix to constrain its predictions. Furthermore, the algorithm can be used online by an agent to constantly improve its prediction quality, something that current state-of-the-art discovery and learning algorithms are unable to do. We give empirical results to show that our constrained gradient algorithm is able to discover core tests using very small amounts of data, and with larger amounts of data can compute accurate predictions of the system dynamics.


reference text

[1] Herbert Jaeger. Observable operator models for discrete stochastic time series. Neural Computation, 12(6):1371–1398, 2000.

[2] Michael Littman, Richard Sutton, and Satinder Singh. Predictive representations of state. In Advances in Neural Information Processing Systems 14 (NIPS), pages 1555–1561, 2002.

[3] Satinder Singh, Michael R. James, and Matthew R. Rudary. Predictive state representations: A new theory for modeling dynamical systems. In Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI), pages 512–519, 2004.

[4] Richard Sutton and Brian Tanner. Temporal-difference networks. In Advances in Neural Information Processing Systems 17, pages 1377–1384, 2005.

[5] Matthew Rosencrantz, Geoff Gordon, and Sebastian Thrun. Learning low dimensional predictive representations. In Twenty-First International Conference on Machine Learning (ICML), 2004.

[6] Michael R. James and Satinder Singh. Learning and discovery of predictive state representations in dynamical systems with reset. In Twenty-First International Conference on Machine Learning (ICML), 2004.

[7] Britton Wolfe, Michael R. James, and Satinder Singh. Learning predictive state representations in dynamical systems without reset. In Twenty-Second International Conference on Machine Learning (ICML), 2005.

[8] Satinder Singh, Michael Littman, Nicholas Jong, David Pardoe, and Peter Stone. Learning predictive state representations. In Twentieth International Conference on Machine Learning (ICML), pages 712–719, 2003.

[9] Peter McCracken. An online algorithm for discovery and learning of predictive state representations. Master’s thesis, University of Alberta, 2005.

[10] Eric Wiewiora. Learning predictive representations from a history. In Twenty-Second International Conference on Machine Learning (ICML), 2005.

[11] Anthony Cassandra. Tony’s POMDP file repository page. http://www.cs.brown.edu/research/ai/pomdp/examples/index.html, 1999.