nips nips2008 nips2008-88 nips2008-88-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ofer Dekel
Abstract: We present cutoff averaging, a technique for converting any conservative online learning algorithm into a batch learning algorithm. Most online-to-batch conversion techniques work well with certain types of online learning algorithms and not with others, whereas cutoff averaging explicitly tries to adapt to the characteristics of the online algorithm being converted. An attractive property of our technique is that it preserves the efficiency of the original online algorithm, making it appropriate for large-scale learning problems. We provide a statistical analysis of our technique and back our theoretical claims with experimental results. 1
[1] Anonimous. Technical appendix submitted with this manuscript, 2008.
[2] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of online learning algorithms. IEEE Transactions on Information Theory, 50(9):2050–2057, September 2004.
[3] N. Cesa-Bianchi and C. Gentile. Improved risk bounds for online algorithms. NIPS 19, 2006.
[4] O. Dekel, S. Shalev-Shwartz, and Y. Singer. The Forgetron: A kernel-based perceptron on a budget. SIAM Journal on Computing, 37:1342–1372, 2008.
[5] O. Dekel and Y. Singer. Data-driven online to batch conversions. NIPS 18, 2006.
[6] D. A. Freedman. On tail probabilities for martingales. Annals of Prob., 3(1):100–118, 1975.
[7] Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
[8] S. I. Gallant. Optimal linear discriminants. Proc. of ICPR 8, pages 849–852. IEEE, 1986.
[9] R. Khardon and G. Wachman. Noise tolerant variants of the perceptron algorithm. Journal of Machine Learning Research, 8:227–248, 2007.
[10] Y. Li. Selective voting for perceptron-like learning. Proc. of ICML 17, pages 559–566, 2000.
[11] Y. Li, H. Zaragoza, R. He, J. ShaweTaylor, and J. Kandola. The perceptron algorithm with uneven margins. Proc. of ICML 19, pages 379–386, 2002.
[12] N. Littlestone. From online to batch learning. Proc. of COLT 2, pages 269–284, 1989.
[13] F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–407, 1958.
[14] T. Zhang. Solving large scale linear prediction problems using stochastic gradient descent algorithms. Proc. of ICML 21, 2004. 8