nips nips2006 nips2006-203 nips2006-203-reference knowledge-graph by maker-knowledge-mining

203 nips-2006-implicit Online Learning with Kernels


Source: pdf

Author: Li Cheng, Dale Schuurmans, Shaojun Wang, Terry Caelli, S.v.n. Vishwanathan

Abstract: We present two new algorithms for online learning in reproducing kernel Hilbert spaces. Our first algorithm, ILK (implicit online learning with kernels), employs a new, implicit update technique that can be applied to a wide variety of convex loss functions. We then introduce a bounded memory version, SILK (sparse ILK), that maintains a compact representation of the predictor without compromising solution quality, even in non-stationary environments. We prove loss bounds and analyze the convergence rate of both. Experimental evidence shows that our proposed algorithms outperform current methods on synthetic and real data. 1


reference text

[1] J. Kivinen and M. K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1–64, 1997.

[2] J. Kivinen, A. J. Smola, and R. C. Williamson. Online learning with kernels. IEEE Transactions on Signal Processing, 52(8), 2004.

[3] S. V. N. Vishwanathan, N. N. Schraudolph, and A. J. Smola. Step size adaptation in reproducing kernel Hilbert space. Journal of Machine Learning Research, 7, 2006.

[4] R. T. Rockafellar. Convex Analysis, volume 28 of Princeton Mathematics Series. Princeton University Press, 1970.

[5] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing (2nd ed.). Cambridge University Press, Cambridge, 1992. ISBN 0 - 521 - 43108 - 5.

[6] V. Vapnik. Statistical Learning Theory. John Wiley and Sons, New York, 1998.

[7] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of on-line learning algorithms. IEEE Trans. Information Theory, 50(9):2050–2057, 2004.

[8] K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research, 7:551–585, 2006.