
Gaussian process regression with Student-t likelihood


Source: pdf

Author: Jarno Vanhatalo, Pasi Jylänki, Aki Vehtari

Abstract: In Gaussian process regression, the observation model is commonly assumed to be Gaussian, which is convenient from a computational perspective. However, the drawback is that the predictive accuracy of the model can be significantly compromised if the observations are contaminated by outliers. A robust observation model, such as the Student-t distribution, reduces the influence of outlying observations and improves the predictions. The problem, however, is that inference then becomes analytically intractable. In this work, we discuss the properties of a Gaussian process regression model with the Student-t likelihood and utilize the Laplace approximation for approximate inference. We compare our approach to a variational approximation and a Markov chain Monte Carlo scheme, which utilize the commonly used scale-mixture representation of the Student-t distribution.
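
For concreteness, the Student-t observation model and the scale-mixture representation referred to in the abstract can be written out as follows. This is the standard textbook formulation; the symbols ν (degrees of freedom), σ (scale), and V_i (latent noise variance) are assumed notation for this sketch, not taken from the paper body.

```latex
% Student-t likelihood for observation y_i given latent function value f_i
p(y_i \mid f_i) =
  \frac{\Gamma\left(\frac{\nu+1}{2}\right)}
       {\Gamma\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}\,\sigma}
  \left(1 + \frac{(y_i - f_i)^2}{\nu\sigma^2}\right)^{-\frac{\nu+1}{2}}

% Equivalent scale mixture of Gaussians with inverse-gamma mixing
y_i \mid f_i, V_i \sim \mathcal{N}(f_i, V_i),
\qquad
V_i \sim \text{Inv-Gamma}\left(\tfrac{\nu}{2},\, \tfrac{\nu\sigma^2}{2}\right)
```

As ν grows the likelihood approaches a Gaussian, while small ν gives heavy tails that let the model effectively ignore outlying observations.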

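The Laplace approximation mentioned in the abstract replaces the posterior over the latent values f with a Gaussian N(f_hat, (K^-1 + W)^-1) centered at the posterior mode f_hat, where K is the prior covariance and W is the negative Hessian of the log likelihood at the mode. The following is a minimal runnable sketch of that recipe in Python, assuming a squared-exponential kernel and generic quasi-Newton mode finding; the function names, kernel, and hyperparameter values are illustrative, not the authors' implementation. Note that with a Student-t likelihood W can have negative entries (the likelihood is not log-concave), which is one of the difficulties the paper discusses.

```python
import numpy as np
from scipy.optimize import minimize

def se_kernel(x1, x2, ell=1.0, sf2=1.0):
    """Squared-exponential covariance matrix (illustrative kernel choice)."""
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def laplace_student_t(x, y, nu=4.0, sigma=0.2):
    """Laplace approximation N(f_hat, (K^-1 + W)^-1) for a GP prior
    combined with a Student-t likelihood (hypothetical helper)."""
    n = len(y)
    K = se_kernel(x, x) + 1e-6 * np.eye(n)   # jitter for conditioning
    K_inv = np.linalg.inv(K)

    def neg_log_post(f):
        # -log p(y|f) - log p(f), up to additive constants
        r = y - f
        nll = 0.5 * (nu + 1) * np.sum(np.log1p(r**2 / (nu * sigma**2)))
        return nll + 0.5 * f @ K_inv @ f

    def grad(f):
        r = y - f
        g = (nu + 1) * r / (nu * sigma**2 + r**2)   # d log lik / d f
        return -g + K_inv @ f

    # quasi-Newton search for the posterior mode (robust to the
    # non-log-concave likelihood, unlike a plain Newton iteration)
    res = minimize(neg_log_post, np.zeros(n), jac=grad, method="L-BFGS-B")
    f_hat = res.x

    # curvature at the mode: W = -d^2/df^2 log p(y|f); entries can be
    # negative where the residual is large
    r = y - f_hat
    W = (nu + 1) * (nu * sigma**2 - r**2) / (nu * sigma**2 + r**2) ** 2
    Sigma = np.linalg.inv(K_inv + np.diag(W))    # Laplace covariance
    return f_hat, Sigma

# Toy usage: smooth data with one gross outlier
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(40)
y[10] += 5.0                                     # inject an outlier
f_hat, Sigma = laplace_student_t(x, y)
# f_hat tracks sin(x); the outlier barely moves the fitted mode because
# the likelihood gradient decays to zero as the residual grows
```

The final comment captures the robustness property at stake: the gradient of the Student-t log likelihood vanishes for large residuals, whereas under a Gaussian likelihood it grows linearly, so a single wild observation can drag the whole fit.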

Reference text

[1] Bruno de Finetti. The Bayesian approach to the rejection of outliers. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, pages 199–210. University of California Press, 1961.

[2] A. Philip Dawid. Posterior expectations for large observations. Biometrika, 60(3):664–667, December 1973.

[3] Anthony O’Hagan. On outlier rejection phenomena in Bayes inference. Journal of the Royal Statistical Society. Series B, 41(3):358–367, 1979.

[4] Mike West. Outlier models and prior distributions in Bayesian linear regression. Journal of the Royal Statistical Society. Series B, 46(3):431–439, 1984.

[5] John Geweke. Bayesian treatment of the independent Student-t linear model. Journal of Applied Econometrics, 8:519–540, 1993.

[6] Radford M. Neal. Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification. Technical Report 9702, Dept. of Statistics and Dept. of Computer Science, University of Toronto, January 1997.

[7] Malte Kuss. Gaussian Process Models for Robust Regression, Classification, and Reinforcement Learning. PhD thesis, Technische Universität Darmstadt, 2006.

[8] Paul W. Goldberg, Christopher K. I. Williams, and Christopher M. Bishop. Regression with input-dependent noise: A Gaussian process treatment. In M. I. Jordan, M. J. Kearns, and S. A. Solla, editors, Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, MA, 1998.

[9] Andrew Naish-Guzman and Sean Holden. Robust regression with twinned Gaussian processes. In J. C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1065–1072. MIT Press, Cambridge, MA, 2008.

[10] Oliver Stegle, Sebastian V. Fallert, David J. C. MacKay, and Søren Brage. Gaussian process robust regression for noisy heart rate data. IEEE Transactions on Biomedical Engineering, 55(9):2143–2151, September 2008. ISSN 0018-9294. doi: 10.1109/TBME.2008.923118.

[11] Michael E. Tipping and Neil D. Lawrence. Variational inference for Student-t models: Robust Bayesian interpolation and generalised component analysis. Neurocomputing, 69:123–141, 2005.

[12] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.

[13] Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian Data Analysis. Chapman & Hall/CRC, second edition, 2004.

[14] Christopher K. I. Williams and David Barber. Bayesian classification with Gaussian processes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(12):1342–1351, 1998.

[15] Håvard Rue, Sara Martino, and Nicolas Chopin. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society. Series B, 71(2):1–35, 2009.

[16] David A. Harville. Matrix Algebra From a Statistician’s Perspective. Springer-Verlag, 1997.

[17] Aki Vehtari and Jouko Lampinen. Bayesian model assessment and comparison using cross-validation predictive densities. Neural Computation, 14(10):2439–2468, 2002.

[18] Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer Science+Business Media, LLC, 2006.

[19] Manfred Opper and Cédric Archambeau. The variational Gaussian approximation revisited. Neural Computation, 21(3):786–792, March 2009.

[20] Thomas Minka. A family of algorithms for approximate Bayesian inference. PhD thesis, Massachusetts Institute of Technology, 2001.

[21] Matthias Seeger. Bayesian inference and optimal design for the sparse linear model. Journal of Machine Learning Research, 9:759–813, 2008.