nips nips2003 nips2003-148 nips2003-148-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Shai Shalev-shwartz, Koby Crammer, Ofer Dekel, Yoram Singer
Abstract: We present a unified view for online classification, regression, and uniclass problems. This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. A conversion of our main online algorithm to the setting of batch learning is also discussed. The end result is new algorithms and accompanying loss bounds for the hinge-loss. 1
[1] H. H. Bauschke and J. M. Borwein. On projection algorithms for solving convex feasibility problems. SIAM Review, 1996.
[2] Y. Censor and S. A. Zenios. Parallel Optimization.. Oxford University Press, 1997.
[3] K. Crammer and Y. Singer. Ultraconservative online algorithms for multiclass problems. Jornal of Machine Learning Research, 3:951–991, 2003.
[4] Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
[5] C. Gentile. A new approximate maximal margin classification algorithm. Journal of Machine Learning Research, 2:213–242, 2001.
[6] C. Gentile and M. Warmuth. Linear hinge loss and average margin. In NIPS’98.
[7] D. P. Helmbold, R. E. Schapire, Y. Singer, and M. K. Warmuth. A comparison of new and old algorithms for a mixture estimation problem. In COLT’95.
[8] M. Herbster. COLT’01. Learning additive models online with fast evaluating kernels. In
[9] J. Kivinen, D. P. Helmbold, and M. Warmuth. Relative loss bounds for single neurons. IEEE Transactions on Neural Networks, 10(6):1291–1304, 1999.
[10] J. Kivinen, A. J. Smola, and R. C. Williamson. Online learning with kernels. In NIPS’02.
[11] J. Kivinen and M. K. Warmuth. Exponentiated gradient versus gradient descent for linear predictors. Information and Computation, 132(1):1–64, January 1997.
[12] J. Kivinen and M. K. Warmuth. Relative loss bounds for multidimensional regression problems. Journal of Machine Learning, 45(3):301–329, July 2001.
[13] N. Klasner and H. U. Simon. From noise-free to noise-tolerant and from on-line to batch learning. In COLT’95.
[14] Y. Li and P. M. Long. The relaxed online maximum margin algorithm. Machine Learning, 46(1–3):361–387, 2002.
[15] V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
[16] E. Xing, A. Y. Ng, M. Jordan, and S. Russel. Distance metric learning, with application to clustering with side-information. In NIPS’03.
[17] M. Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In ICML’03.