nips nips2000 nips2000-7 nips2000-7-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Claudio Gentile
Abstract: A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm p ≥ 2 for a set of linearly separable data. Our algorithm, called ALMA_p (Approximate Large Margin algorithm w.r.t. norm p), takes O((p-1) X² / (α² γ²)) corrections to separate the data with p-norm margin larger than (1 - α) γ, where γ is the p-norm margin of the data and X is a bound on the p-norm of the instances. ALMA_p avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms such as Rosenblatt's perceptron. We report on experiments comparing ALMA_p to two incremental algorithms: the Perceptron algorithm and Li and Long's ROMMA. Our algorithm appears to perform noticeably better than both. The accuracy levels achieved by ALMA_p are slightly inferior to those obtained by Support Vector Machines (SVMs). On the other hand, ALMA_p is much faster and easier to implement than standard SVM training algorithms.
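To make the correction-driven update scheme described in the abstract concrete, below is a minimal Python sketch of an ALMA-style approximate large-margin perceptron for the p = 2 case. It is illustrative only: the shrinking margin threshold (1 - α)·B·sqrt(C/k), the step size C/sqrt(k), and the default constants B and C are assumptions patterned on the ALMA_2 tuning, not the paper's exact pseudocode.

```python
import numpy as np

def alma2_sketch(X, y, alpha=0.5, B=None, C=np.sqrt(2.0), epochs=10):
    """Illustrative ALMA_2-style approximate large-margin perceptron (p = 2).

    Hedged sketch: the margin threshold (1 - alpha) * B * sqrt(C / k) and the
    learning rate C / sqrt(k) shrink as the correction count k grows; treat the
    exact constants as assumptions rather than the paper's definitive values.
    """
    if B is None:
        B = np.sqrt(8.0) / alpha              # assumed default tuning for B
    n, d = X.shape
    w = np.zeros(d)
    k = 1                                     # counts corrections made so far
    for _ in range(epochs):
        for i in range(n):
            x_hat = X[i] / np.linalg.norm(X[i])          # unit-norm instance
            # correct only when the current (approximate) margin is violated
            if y[i] * np.dot(w, x_hat) <= (1.0 - alpha) * B * np.sqrt(C / k):
                w = w + (C / np.sqrt(k)) * y[i] * x_hat  # perceptron-style step
                w = w / max(1.0, np.linalg.norm(w))      # project back into unit ball
                k += 1
    return w
```

Predictions are made with sign(w·x). Handling general p ≥ 2 requires the p-norm duality maps used in the paper; this 2-norm sketch omits them.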
[1] M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
[2] P. Auer and C. Gentile. Adaptive and self-confident on-line learning algorithms. In 13th COLT, 107-117, 2000.
[3] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20, 3: 273-297, 1995.
[4] Y. Freund and R. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37, 3: 277-296, 1999.
[5] T.-T. Friess, N. Cristianini, and C. Campbell. The kernel adatron algorithm: a fast and simple learning procedure for support vector machines. In 15th ICML, 1998.
[6] C. Gentile and N. Littlestone. The robustness of the p-norm algorithms. In 12th COLT, 1-11, 1999.
[7] C. Gentile and M. K. Warmuth. Linear hinge loss and average margin. In 11th NIPS, 225-231, 1999.
[8] A. J. Grove, N. Littlestone, and D. Schuurmans. General convergence results for linear discriminant updates. In 10th COLT, 171-183, 1997.
[9] D. Helmbold and M. K. Warmuth. On weak learning. JCSS, 50, 3: 551-573, 1995.
[10] A. Kowalczyk. Maximal margin perceptron. In Smola, Bartlett, Scholkopf, and Schuurmans, editors, Advances in Large Margin Classifiers, MIT Press, 1999.
[11] Y. Le Cun, L. D. Jackel, L. Bottou, A. Brunot, C. Cortes, J. S. Denker, H. Drucker, I. Guyon, U. Muller, E. Sackinger, P. Simard, and V. Vapnik. Comparison of learning algorithms for handwritten digit recognition. In ICANN, 53-60, 1995.
[12] Y. Li and P. Long. The relaxed online maximum margin algorithm. In 12th NIPS, 498-504, 2000.
[13] Y. Li. From support vector machines to large margin classifiers. PhD Thesis, School of Computing, National University of Singapore, 2000.
[14] N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2: 285-318, 1988.
[15] O. Mangasarian. Mathematical programming in data mining. Data Mining and Knowledge Discovery, 1, 2: 183-201, 1997.
[16] P. Nachbar, J. A. Nossek, and J. Strobl. The generalized adatron algorithm. In Proc. 1993 IEEE ISCAS, 2152-2155, 1993.
[17] J. C. Platt. Fast training of support vector machines using sequential minimal optimization. In Scholkopf, Burges, and Smola, editors, Advances in Kernel Methods: Support Vector Learning, MIT Press, 1998.
[18] F. Rosenblatt. Principles of neurodynamics: Perceptrons and the theory of brain mechanisms. Spartan Books, Washington, D.C., 1962.