nips nips2005 nips2005-54 nips2005-54-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ofer Dekel, Yoram Singer
Abstract: Online learning algorithms are typically fast, memory efficient, and simple to implement. However, many common learning problems fit more naturally in the batch learning setting. The power of online learning algorithms can be exploited in batch settings by using online-to-batch conversions techniques which build a new batch algorithm from an existing online algorithm. We first give a unified overview of three existing online-to-batch conversion techniques which do not use training data in the conversion process. We then build upon these data-independent conversions to derive and analyze data-driven conversions. Our conversions find hypotheses with a small risk by explicitly minimizing datadependent generalization bounds. We experimentally demonstrate the usefulness of our approach and in particular show that the data-driven conversions consistently outperform the data-independent conversions.
[1] K. Azuma. Weighted sums of certain dependent random variables. Tohoku Mathematical Journal, 68:357–367, 1967.
[2] N. Cesa-Bianchi, A. Conconi, and C.Gentile. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 2004.
[3] K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive aggressive algorithms. Journal of Machine Learning Research, 2006.
[4] K. Crammer, J. Kandola, and Y. Singer. Online classification on a budget. NIPS 16, 2003.
[5] O. Dekel, S. Shalev-Shwartz, and Y. Singer. The Forgetron: A kernel-based perceptron on a fixed budget. NIPS 18, 2005.
[6] Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
[7] S. I. Gallant. Optimal linear discriminants. ICPR 8, pages 849–852. IEEE, 1986.
[8] D. P. Helmbold and M. K. Warmuth. On weak learning. Journal of Computer and System Sciences, 50:551–573, 1995.
[9] Y. Li. Selective voting for perceptron-like on-line learning. In ICML 17, 2000.
[10] N. Littlestone. From on-line to batch learning. COLT 2, pages 269–284, July 1989.
[11] N. Littlestone and M. Warmuth. Relating data compression and learnability. Unpublished manuscript, November 1986.
[12] V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
[13] J. Weston, A. Bordes, and L. Bottou. Online (and offline) on a tighter budget. AISTAT 10, 2005.