
jmlr2011-44: Information Rates of Nonparametric Gaussian Process Methods


Source: pdf

Authors: Aad van der Vaart, Harry van Zanten

Abstract: We consider the quality of learning a response function by a nonparametric Bayesian approach using a Gaussian process (GP) prior on the response function. We upper bound the quadratic risk of the learning procedure, which in turn is an upper bound on the Kullback-Leibler information between the predictive and true data distribution. The upper bound is expressed in small ball probabilities and concentration measures of the GP prior. We illustrate the computation of the upper bound for the Matérn and squared exponential kernels. For these priors the risk, and hence the information criterion, tends to zero for all continuous response functions. However, the rate at which this happens depends on the combination of true response function and Gaussian prior, and is expressible in a certain concentration function. In particular, the results show that for good performance, the regularity of the GP prior should match the regularity of the unknown response function.

Keywords: Bayesian learning, Gaussian prior, information rate, risk, Matérn kernel, squared exponential kernel
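
For context, the "certain concentration function" mentioned in the abstract is, in the notation of van der Vaart and van Zanten (2008a) listed below, roughly of the following form (conventions differ by a constant factor on the first term, and the norm depends on the sampling model):

  \varphi_{w_0}(\varepsilon) = \inf_{h \in \mathbb{H} : \|h - w_0\| \le \varepsilon} \|h\|_{\mathbb{H}}^2 \;-\; \log P\bigl(\|W\| \le \varepsilon\bigr),

where \mathbb{H} is the reproducing kernel Hilbert space of the prior W and w_0 is the true response function; in that earlier work the contraction rate \varepsilon_n is essentially the solution of \varphi_{w_0}(\varepsilon_n) \le n \varepsilon_n^2.

The regularity-matching message can also be tried out numerically. The sketch below is illustrative only and not taken from the paper; it assumes scikit-learn is available, and the kernel settings, noise level, and test function are arbitrary choices. The Matern smoothness parameter nu plays the role of the prior regularity, and the RBF kernel is the squared exponential.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, RBF

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, size=(n, 1))
f = lambda t: np.abs(t - 0.5)          # a response function of limited smoothness
y = f(x).ravel() + 0.1 * rng.standard_normal(n)
x_test = np.linspace(0.0, 1.0, 400).reshape(-1, 1)

# GP priors of different regularity: rough Matern, smoother Matern, squared exponential.
for kernel in (Matern(length_scale=0.2, nu=0.5),
               Matern(length_scale=0.2, nu=2.5),
               RBF(length_scale=0.2)):
    gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1 ** 2, normalize_y=True)
    gp.fit(x, y)
    mse = float(np.mean((gp.predict(x_test) - f(x_test).ravel()) ** 2))
    print(kernel, "test MSE:", round(mse, 4))

The printed errors only illustrate the qualitative point of the abstract (how well a prior of a given smoothness recovers a response of a given smoothness), not the risk bounds of the paper.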


Reference text

A. R. Barron. Information-theoretic characterization of Bayes performance and the choice of priors in parametric and nonparametric problems. In Bayesian Statistics 6 (Alcoceber, 1998), pages 27–52. Oxford Univ. Press, New York, 1999.

H. Bauer. Measure and Integration Theory, volume 26 of de Gruyter Studies in Mathematics. Walter de Gruyter & Co., Berlin, 2001.

C. Borell. Inequalities of the Brunn-Minkowski type for Gaussian measures. Probab. Theory Related Fields, 140(1-2):195–205, 2008.

I. Castillo. Lower bounds for posterior rates with Gaussian process priors. Electron. J. Stat., 2:1281–1299, 2008.

A. Cohen, W. Dahmen, I. Daubechies, and R. DeVore. Tree approximation and optimal encoding. Appl. Comput. Harmon. Anal., 11(2):192–226, 2001.

D. E. Edmunds and H. Triebel. Function Spaces, Entropy Numbers, Differential Operators, volume 120 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1996.

S. Ghosal, J. K. Ghosh, and A. W. van der Vaart. Convergence rates of posterior distributions. Ann. Statist., 28(2):500–531, 2000.

I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus, 2nd edition. Springer-Verlag, New York, 1991.

J. Kuelbs and W. V. Li. Metric entropy and the small ball problem for Gaussian measures. J. Funct. Anal., 116(1):133–157, 1993.

J. Kuelbs, W. V. Li, and W. Linde. The Gaussian measure of shifted balls. Probab. Theory Related Fields, 98(2):143–162, 1994.

A. N. Kolmogorov and V. M. Tihomirov. ε-entropy and ε-capacity of sets in functional space. Amer. Math. Soc. Transl. (2), 17:277–364, 1961.

W. V. Li and Q.-M. Shao. Gaussian processes: inequalities, small ball probabilities and applications. In Stochastic Processes: Theory and Methods, volume 19 of Handbook of Statist., pages 533–597. North-Holland, Amsterdam, 2001.

W. V. Li and W. Linde. Existence of small ball constants for fractional Brownian motions. C. R. Acad. Sci. Paris Sér. I Math., 326(11):1329–1334, 1998.

M. A. Lifshits. Gaussian Random Functions. Kluwer Academic Publishers, Dordrecht, 1995.

C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006.

M. W. Seeger, S. M. Kakade, and D. P. Foster. Information consistency of nonparametric Gaussian process methods. IEEE Trans. Inform. Theory, 54(5):2376–2382, 2008.

A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer, New York, 2009.

A. W. van der Vaart. Asymptotic Statistics. Cambridge University Press, Cambridge, 1998.

A. W. van der Vaart and J. H. van Zanten. Bayesian inference with rescaled Gaussian process priors. Electron. J. Stat., 1:433–448 (electronic), 2007.

A. W. van der Vaart and J. H. van Zanten. Rates of contraction of posterior distributions based on Gaussian process priors. Ann. Statist., 36(3):1435–1463, 2008a.

A. W. van der Vaart and J. H. van Zanten. Reproducing kernel Hilbert spaces of Gaussian priors. In Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, volume 3 of Inst. Math. Stat. Collect., pages 200–222. Inst. Math. Statist., Beachwood, OH, 2008b.

A. W. van der Vaart and J. H. van Zanten. Adaptive Bayesian estimation using a Gaussian random field with inverse gamma bandwidth. Ann. Statist., 37(5B):2655–2675, 2009.

A. W. van der Vaart and J. A. Wellner. Weak Convergence and Empirical Processes. Springer Series in Statistics. Springer-Verlag, New York, 1996.

G. Wahba. Improper priors, spline smoothing and the problem of guarding against model errors in regression. J. Roy. Statist. Soc. Ser. B, 40(3):364–372, 1978.