NIPS 2012, paper 55: reference knowledge graph by maker-knowledge-mining
Source: pdf
Author: Miguel Lázaro-Gredilla
Abstract: Warped Gaussian processes (WGP) [1] model output observations in regression tasks as a parametric nonlinear transformation of a Gaussian process (GP). The use of this nonlinear transformation, which is included as part of the probabilistic model, was shown to enhance performance by providing a better prior model on several data sets. In order to learn its parameters, maximum likelihood was used. In this work we show that it is possible to use a non-parametric nonlinear transformation in WGP and variationally integrate it out. The resulting Bayesian WGP is then able to work in scenarios in which the maximum likelihood WGP failed: low-data regimes, data with censored values, classification, etc. We demonstrate the superior performance of Bayesian warped GPs on several real data sets.
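For concreteness, here is a minimal sketch of the warped-GP construction the abstract refers to, following [1]; the notation below (g for the monotonic warping applied to each observation, K for the GP covariance matrix, sigma^2 for the noise variance) is introduced for illustration and is not taken verbatim from the paper:

\log p(\mathbf{y} \mid X) = \log \mathcal{N}\!\left( g(\mathbf{y}) \,\middle|\, \mathbf{0},\; K + \sigma^2 I \right) + \sum_{i=1}^{n} \log g'(y_i)

The warped observations g(y) are modeled by a GP, and the Jacobian term accounts for the change of variables from observation space to the latent space. Maximizing this quantity over the kernel and warping parameters is the maximum-likelihood training mentioned above; the Bayesian WGP instead places a prior over a non-parametric g and integrates it out variationally.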
[1] E. Snelson, Z. Ghahramani, and C. Rasmussen. Warped Gaussian processes. In Advances in Neural Information Processing Systems 16, 2003.
[2] C. E. Rasmussen. Evaluation of Gaussian Processes and other Methods for Non-linear Regression. PhD thesis, University of Toronto, 1996.
[3] M.N. Gibbs. Bayesian Gaussian Processes for Regression and Classification. PhD thesis, University of Cambridge, 1997.
[4] C.E. Rasmussen and C.K.I. Williams. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, 2006.
[5] M.N. Schmidt. Function factorization using warped Gaussian processes. In Proc. of the 26th International Conference on Machine Learning, pages 921–928. Omnipress, 2009.
[6] Y. Zhang and D.-Y. Yeung. Multi-task warped Gaussian process for personalized age estimation. In IEEE Conf. on Computer Vision and Pattern Recognition, pages 2622–2629, 2010.
[7] C.K.I. Williams and C.E. Rasmussen. Gaussian processes for regression. In Advances in Neural Information Processing Systems 8. MIT Press, 1996.
[8] W. Chu and Z. Ghahramani. Gaussian processes for ordinal regression. Journal of Machine Learning Research, 6:1019–1041, 2005.
[9] M.K. Titsias and N.D. Lawrence. Bayesian Gaussian process latent variable model. In Proc. of the 13th International Workshop on Artificial Intelligence and Statistics, volume 9 of JMLR: W&CP, pages 844–851, 2010.
[10] M.K. Titsias. Variational learning of inducing variables in sparse Gaussian processes. In Proc. of the 12th International Workshop on Artificial Intelligence and Statistics, 2009.
[11] M. Lázaro-Gredilla and M. Titsias. Variational heteroscedastic Gaussian process regression. In Proc. of the 28th International Conference on Machine Learning (ICML-11), pages 841–848, New York, NY, USA, June 2011. ACM.
[12] M. Opper and C. Archambeau. The variational Gaussian approximation revisited. Neural Computation, 21(3):786–792, 2009.
[13] A.C. Damianou, M.K. Titsias, and N.D. Lawrence. Variational Gaussian process dynamical systems. In Advances in Neural Information Processing Systems 24, 2011.
[14] A. Frank and A. Asuncion. UCI machine learning repository, 2010. http://archive.ics.uci.edu/ml. University of California, Irvine, School of Information and Computer Sciences.
[15] Materials Algorithms Project (MAP) program and data library. http://www.msm.cam.ac.uk/map/map.html.
[16] D. Cole, C. Martin-Moran, A. G. Sheard, H. K. D. H. Bhadeshia, and D. J. C. MacKay. Modelling creep rupture strength of ferritic steel welds. Science and Technology of Welding and Joining, 5:81–90, 2000.
[17] L. Torgo. http://www.liacc.up.pt/~ltorgo/Regression/.
[18] G. Rätsch, T. Onoda, and K.-R. Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287–320, 2001. http://people.tuebingen.mpg.de/vipin/www.fml.tuebingen.mpg.de/Members/raetsch/benchmark.1.html.
[19] A. Naish-Guzman and S. Holden. The generalized FITC approximation. In Advances in Neural Information Processing Systems 20, pages 1057–1064. MIT Press, 2008.
[20] R.B. Nelsen. An Introduction to Copulas. Springer, 1999.
[21] P.X.-K. Song. Multivariate dispersion models generated from Gaussian copula. Scandinavian Journal of Statistics, 27(2):305–320, 2000.
[22] A. Wilson and Z. Ghahramani. Copula processes. In Advances in Neural Information Processing Systems 23, pages 2460–2468. MIT Press, 2010.
[23] F.L. Wauthier and M.I. Jordan. Heavy-tailed process priors for selective shrinkage. In Advances in Neural Information Processing Systems 23. MIT Press, 2010.