nips nips2013 nips2013-242 nips2013-242-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ilya O. Tolstikhin, Yevgeny Seldin
Abstract: We present a PAC-Bayes-Empirical-Bernstein inequality. The inequality is based on a combination of the PAC-Bayesian bounding technique with an Empirical Bernstein bound. We show that when the empirical variance is significantly smaller than the empirical loss the PAC-Bayes-Empirical-Bernstein inequality is significantly tighter than the PAC-Bayes-kl inequality of Seeger (2002) and otherwise it is comparable. Our theoretical analysis is confirmed empirically on a synthetic example and several UCI datasets. The PAC-Bayes-Empirical-Bernstein inequality is an interesting example of an application of the PAC-Bayesian bounding technique to self-bounding functions. 1
[1] John Langford and John Shawe-Taylor. PAC-Bayes & margins. In Advances in Neural Information Processing Systems (NIPS), 2002.
[2] David McAllester. PAC-Bayesian stochastic model selection. Machine Learning, 51(1), 2003.
[3] John Langford. Tutorial on practical prediction theory for classification. Journal of Machine Learning Research, 2005.
[4] Yevgeny Seldin and Naftali Tishby. PAC-Bayesian analysis of co-clustering and beyond. Journal of Machine Learning Research, 11, 2010.
[5] Matthew Higgs and John Shawe-Taylor. A PAC-Bayes bound for tailored density estimation. In Proceedings of the International Conference on Algorithmic Learning Theory (ALT), 2010. 8
[6] Yevgeny Seldin, Peter Auer, Francois Laviolette, John Shawe-Taylor, and Ronald Ortner. PAC¸ Bayesian analysis of contextual bandits. In Advances in Neural Information Processing Systems (NIPS), 2011.
[7] Matthias Seeger. PAC-Bayesian generalization error bounds for Gaussian process classification. Journal of Machine Learning Research, 2002.
[8] Yevgeny Seldin, Francois Laviolette, Nicol` Cesa-Bianchi, John Shawe-Taylor, and Peter ¸ o Auer. PAC-Bayesian inequalities for martingales. IEEE Transactions on Information Theory, 58, 2012.
[9] Andreas Maurer and Massimiliano Pontil. Empirical Bernstein bounds and sample variance penalization. In Proceedings of the International Conference on Computational Learning Theory (COLT), 2009.
[10] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, 1991.
[11] Andreas Maurer. A note on the PAC-Bayesian theorem. www.arxiv.org, 2004.
[12] David McAllester. Some PAC-Bayesian theorems. Machine Learning, 37, 1999.
[13] A.W. Van Der Vaart. Asymptotic statistics. Cambridge University Press, 1998.
[14] St´ phane Boucheron, G´ bor Lugosi, and Olivier Bousquet. Concentration inequalities. In e a O. Bousquet, U.v. Luxburg, and G. R¨ tsch, editors, Advanced Lectures in Machine Learning. a Springer, 2004.
[15] A. Asuncion and D.J. Newman. UCI machine http://www.ics.uci.edu/∼mlearn/MLRepository.html. 9 learning repository, 2007.