jmlr2005-67 reference knowledge-graph by maker-knowledge-mining
Source: pdf
Authors: Andre Elisseeff, Theodoros Evgeniou, Massimiliano Pontil
Abstract: We extend the existing theory on stability, that is, how much changes in the training data influence the estimated model, and on the generalization performance of deterministic learning algorithms to the case of randomized algorithms. We give formal definitions of stability for randomized algorithms and prove non-asymptotic bounds, depending on their random stability, on the difference between the empirical error and the expected error, as well as between the leave-one-out error and the expected error, of such algorithms. The setup we develop for this purpose can also be used to study randomized learning algorithms more generally. We then use these general results to study the effects of bagging on the stability of a learning method and to prove non-asymptotic bounds on the predictive performance of bagging, which have not been possible to prove with the existing theory of stability for deterministic learning algorithms. Keywords: stability, randomized learning algorithms, sensitivity analysis, bagging, bootstrap methods, generalization error, leave-one-out error.
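To make the notion of stability for a randomized learner concrete, the following minimal Python sketch (an illustration under assumed tooling, not the paper's construction) uses scikit-learn's BaggingRegressor as the randomized algorithm and measures an empirical leave-one-out proxy for stability: retrain with one training example removed and record how much the predictions change.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x) + noise
n = 30
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
x_test = np.linspace(-3, 3, 200).reshape(-1, 1)

def fit_predict(X_train, y_train, seed):
    # Bagged decision trees: the seed drives the bootstrap resampling,
    # so the learning algorithm is randomized.
    model = BaggingRegressor(DecisionTreeRegressor(max_depth=3),
                             n_estimators=25, random_state=seed)
    model.fit(X_train, y_train)
    return model.predict(x_test)

# Leave-one-out perturbation: remove example i, refit, and compare
# predictions. The maximum change over i is an empirical proxy for
# the (uniform) stability of the bagged predictor on this sample.
full_pred = fit_predict(X, y, seed=0)
changes = []
for i in range(n):
    X_i = np.delete(X, i, axis=0)
    y_i = np.delete(y, i)
    loo_pred = fit_predict(X_i, y_i, seed=0)
    changes.append(np.max(np.abs(full_pred - loo_pred)))

print("empirical leave-one-out stability proxy:", max(changes))
```

A small value of this quantity indicates that removing any single training point barely moves the bagged predictor, which is the informal sense in which stability controls the gap between empirical (or leave-one-out) error and expected error.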
[1] Andonova, S., Elisseeff, A., Evgeniou, T., and Pontil, M. (2002), “A simple algorithm to learn stable machines”, Proceedings of the 15th European Conference on Artificial Intelligence (ECAI).
[2] Bousquet, O., and Elisseeff, A. (2002), “Stability and generalization”, Journal of Machine Learning Research, 2:499–526.
[3] Breiman, L. (1996a), “Bagging predictors”, Machine Learning, 26(2):123–140.
[4] Breiman, L. (1996b), “Heuristics of instability and stabilization in model selection”, Annals of Statistics, 24(6):2350–2383.
[5] Devroye, L., Györfi, L., and Lugosi, G. (1996), A Probabilistic Theory of Pattern Recognition, Number 31 in Applications of Mathematics, Springer, New York.
[6] Devroye, L., and Wagner, T. (1979), “Distribution-free performance bounds for potential function rules”, IEEE Transactions on Information Theory, 25(5):601–604.
[7] Evgeniou, T., Pontil, M., and Elisseeff, A. (2004), “Leave-one-out error, stability, and generalization of voting combinations of classifiers”, Machine Learning, 55(1).
[8] Kearns, M., and Ron, D. (1999), “Algorithmic stability and sanity check bounds for leave-one-out cross validation bounds”, Neural Computation, 11(6):1427–1453.
[9] Kutin, S., and Niyogi, P. (2002), “Almost-everywhere algorithmic stability and generalization error”, Uncertainty in Artificial Intelligence (UAI), August, 2002, Edmonton, Canada.
[10] McDiarmid, C. (1989), “On the method of bounded differences”, in Surveys in Combinatorics, pp. 148–188, Cambridge University Press, Cambridge.
[11] Poggio, T., and Girosi, F. (1990), “Networks for approximation and learning”, Proceedings of the IEEE, 78(9).
[12] Poggio, T., Rifkin, R., Mukherjee, S., and Niyogi, P. (2004), “Learning Theory: general conditions for predictivity”, Nature, 428:419–422.
[13] Vapnik, V. (1998), Statistical Learning Theory, Wiley, New York.