nips nips2007 nips2007-86 nips2007-86-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David Wingate, Satinder S. Baveja
Abstract: In order to represent state in controlled, partially observable, stochastic dynamical systems, some sort of sufficient statistic for history is necessary. Predictive representations of state (PSRs) capture state as statistics of the future. We introduce a new model of such systems called the “Exponential family PSR,” which defines as state the time-varying parameters of an exponential family distribution which models n sequential observations in the future. This choice of state representation explicitly connects PSRs to state-of-the-art probabilistic modeling, which allows us to take advantage of current efforts in high-dimensional density estimation, and in particular, graphical models and maximum entropy models. We present a parameter learning algorithm based on maximum likelihood, and we show how a variety of current approximate inference methods apply. We evaluate the quality of our model with reinforcement learning by directly evaluating the control performance of the model. 1
[1] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8):1771–1800, 2002.
[2] E. T. Jaynes. Notes on present status and future prospects. In W. Grandy and L. Schick, editors, Maximum Entropy and Bayesian Methods, pages 1–13, 1991.
[3] J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In International Conference on Machine Learning (ICML), 2001.
[4] M. L. Littman, R. S. Sutton, and S. Singh. Predictive representations of state. In Neural Information Processing Systems (NIPS), pages 1555–1561, 2002.
[5] A. McCallum, D. Freitag, and F. Pereira. Maximum entropy Markov models for information extraction and segmentation. In International Conference on Machine Learning (ICML), pages 591–598, 2000.
[6] J. Peters, S. Vijayakumar, and S. Schaal. Natural actor-critic. In European Conference on Machine Learning (ECML), pages 280–291, 2005.
[7] S. D. Pietra, V. D. Pietra, and J. Lafferty. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(4):380–393, 1997.
[8] M. Rudary, S. Singh, and D. Wingate. Predictive linear-Gaussian models of stochastic dynamical systems. In Uncertainty in Artificial Intelligence (UAI), pages 501–508, 2005.
[9] M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference. Technical Report 649, UC Berkeley, 2003.
[10] M. J. Wainwright and M. I. Jordan. Log-determinant relaxation for approximate inference in discrete Markov random fields. IEEE Transactions on Signal Processing, 54(6):2099–2109, 2006.
[11] D. Wingate. Exponential Family Predictive Representations of State. PhD thesis, University of Michigan, 2008.
[12] J. S. Yedida, W. T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. Technical Report TR-2001-22, Mitsubishi Electric Research Laboratories, 2001. 8