nips nips2008 nips2008-88 nips2008-88-reference knowledge-graph by maker-knowledge-mining

88 nips-2008-From Online to Batch Learning with Cutoff-Averaging

Source: pdf

Author: Ofer Dekel

Abstract: We present cutoff averaging, a technique for converting any conservative online learning algorithm into a batch learning algorithm. Most online-to-batch conversion techniques work well with certain types of online learning algorithms and not with others, whereas cutoff averaging explicitly tries to adapt to the characteristics of the online algorithm being converted. An attractive property of our technique is that it preserves the efﬁciency of the original online algorithm, making it appropriate for large-scale learning problems. We provide a statistical analysis of our technique and back our theoretical claims with experimental results. 1

reference text

[1] Anonimous. Technical appendix submitted with this manuscript, 2008.

[2] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of online learning algorithms. IEEE Transactions on Information Theory, 50(9):2050–2057, September 2004.

[3] N. Cesa-Bianchi and C. Gentile. Improved risk bounds for online algorithms. NIPS 19, 2006.

[4] O. Dekel, S. Shalev-Shwartz, and Y. Singer. The Forgetron: A kernel-based perceptron on a budget. SIAM Journal on Computing, 37:1342–1372, 2008.

[5] O. Dekel and Y. Singer. Data-driven online to batch conversions. NIPS 18, 2006.

[6] D. A. Freedman. On tail probabilities for martingales. Annals of Prob., 3(1):100–118, 1975.

[7] Y. Freund and R. E. Schapire. Large margin classiﬁcation using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.

[8] S. I. Gallant. Optimal linear discriminants. Proc. of ICPR 8, pages 849–852. IEEE, 1986.

[9] R. Khardon and G. Wachman. Noise tolerant variants of the perceptron algorithm. Journal of Machine Learning Research, 8:227–248, 2007.

[10] Y. Li. Selective voting for perceptron-like learning. Proc. of ICML 17, pages 559–566, 2000.

[11] Y. Li, H. Zaragoza, R. He, J. ShaweTaylor, and J. Kandola. The perceptron algorithm with uneven margins. Proc. of ICML 19, pages 379–386, 2002.

[12] N. Littlestone. From online to batch learning. Proc. of COLT 2, pages 269–284, 1989.

[13] F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–407, 1958.

[14] T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. Proc. of ICML 21, 2004. 8