
133 nips-2000-The Kernel Gibbs Sampler

Source: pdf

Author: Thore Graepel, Ralf Herbrich

Abstract: We present an algorithm that samples the hypothesis space of kernel classifiers. A uniform prior over normalised weight vectors, combined with a likelihood based on a model of label noise, leads to a piecewise constant posterior that can be sampled by the kernel Gibbs sampler (KGS). The KGS is a Markov chain Monte Carlo method that chooses a random direction in parameter space and samples from the resulting piecewise constant density along the line chosen. The KGS can be used as an analytical tool for the exploration of Bayesian transduction, Bayes point machines, active learning, and evidence-based model selection on small data sets that are contaminated with label noise. For a simple toy example we demonstrate experimentally how a Bayes point machine based on the KGS outperforms an SVM that is incapable of taking into account label noise.
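The sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: a linear kernel (so weight vectors are explicit), a "line" taken to be a random great circle through the current unit-norm weight vector, and a label-noise likelihood of the form noise^errors * (1 - noise)^(m - errors) with 0 < noise < 1. The names `kgs_step`, `noise`, and `w_mean` are hypothetical.

```python
import numpy as np

def kgs_step(w, X, y, noise, rng):
    """One illustrative update: resample w on a random great circle through w.

    Assumes w has unit norm, labels y are in {-1, +1}, and 0 < noise < 1.
    """
    m, d = X.shape
    # Random unit direction v orthogonal to the current w.
    v = rng.standard_normal(d)
    v -= (v @ w) * w
    v /= np.linalg.norm(v)
    # Along w(t) = cos(t) w + sin(t) v, the margin of example i is
    # a_i cos(t) + b_i sin(t); it changes sign at two angles pi apart.
    a = y * (X @ w)
    b = y * (X @ v)
    roots = np.mod(np.arctan2(-a, b), np.pi)
    cuts = np.sort(np.concatenate(([0.0, 2.0 * np.pi], roots, roots + np.pi)))
    lengths = np.diff(cuts)
    mids = 0.5 * (cuts[:-1] + cuts[1:])
    # The number of training errors is constant on each interval,
    # so it is enough to count errors at the interval midpoints.
    margins = np.outer(np.cos(mids), a) + np.outer(np.sin(mids), b)
    errors = (margins <= 0).sum(axis=1)
    # Unnormalised posterior mass of each interval (piecewise constant density).
    mass = lengths * noise ** errors * (1.0 - noise) ** (m - errors)
    k = rng.choice(len(mass), p=mass / mass.sum())
    # Within the chosen interval the density is flat, so sample uniformly.
    t = rng.uniform(cuts[k], cuts[k + 1])
    w_new = np.cos(t) * w + np.sin(t) * v
    return w_new / np.linalg.norm(w_new)  # guard against numerical drift

# Toy usage on synthetic data (all values here are made up for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
y = np.sign(X @ np.array([1.0, -1.0, 0.5]))
w = np.ones(3) / np.sqrt(3.0)
samples = []
for _ in range(200):
    w = kgs_step(w, X, y, noise=0.1, rng=rng)
    samples.append(w)
w_mean = np.mean(samples, axis=0)  # crude estimate of the posterior centre of mass
```

Averaging the samples, as in the last line, is one rough way to approximate the centre of mass that a Bayes point machine targets. A kernelised version would instead represent `w` through expansion coefficients over the training points, which this sketch omits.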

reference text

[1] C. Cortes and V. Vapnik. Support Vector Networks. Machine Learning, 20:273-297, 1995.

[2] Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28:133-168, 1997.

[3] T. Graepel, R. Herbrich, and K. Obermayer. Bayesian Transduction. In Advances in Neural Information Processing Systems 12, pages 456-462, 2000.

[4] R. Herbrich, T. Graepel, and C. Campbell. Bayesian learning in reproducing kernel Hilbert spaces. Technical report, Technical University of Berlin, 1999. TR 99-11.

[5] D. MacKay. The evidence framework applied to classification networks. Neural Computation, 4(5):720-736, 1992.

[6] D. A. McAllester. Some PAC Bayesian theorems. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 230-234, Madison, Wisconsin, 1998.

[7] R. M. Neal. Probabilistic inference using Markov chain Monte Carlo methods. Technical report, Dept. of Computer Science, University of Toronto, 1993. CRG-TR-93-1.

[8] P. Sollich. Probabilistic methods for Support Vector Machines. In Advances in Neural Information Processing Systems 12, pages 349-355, San Mateo, CA, 2000. Morgan Kaufmann.

[9] G. Wahba. Support Vector Machines, Reproducing Kernel Hilbert Spaces and the randomized GACV. Technical report, Department of Statistics, University of Wisconsin, Madison, 1997. TR-NO-984.