jmlr jmlr2007 jmlr2007-30 jmlr2007-30-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Michael Biehl, Anarta Ghosh, Barbara Hammer
Abstract: Learning vector quantization (LVQ) schemes constitute intuitive, powerful classification heuristics with numerous successful applications but, so far, limited theoretical background. We study LVQ rigorously within a simplifying model situation: two competing prototypes are trained from a sequence of examples drawn from a mixture of Gaussians. Concepts from statistical physics and the theory of on-line learning allow for an exact description of the training dynamics in high-dimensional feature space. The analysis yields typical learning curves, convergence properties, and achievable generalization abilities. This is also possible for heuristic training schemes which do not relate to a cost function. We compare the performance of several algorithms, including Kohonen’s LVQ1 and LVQ+/-, a limiting case of LVQ2.1. The former shows close to optimal performance, while LVQ+/- displays divergent behavior. We investigate how early stopping can overcome this difficulty. Furthermore, we study a crisp version of robust soft LVQ, which was recently derived from a statistical formulation. Surprisingly, it exhibits relatively poor generalization. Performance improves if a window for the selection of data is introduced; the resulting algorithm corresponds to cost function based LVQ2. The dependence of these results on the model parameters, for example, prior class probabilities, is investigated systematically; simulations confirm our analytical findings. Keywords: prototype based classification, learning vector quantization, Winner-Takes-All algorithms, on-line learning, competitive learning
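To make the two training heuristics compared in the abstract concrete, the following is a minimal, illustrative sketch (not the authors' code) of on-line LVQ1 and LVQ+/- updates for two prototypes trained on examples from a mixture of two high-dimensional Gaussians, as in the model situation described above. The dimension, learning rate, cluster separation, and prior are arbitrary assumed values.

# Illustrative sketch only; all parameter values are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
N = 100          # feature-space dimension
eta = 0.2        # learning rate (applied as eta/N per example, as usual in on-line learning)
ell = 1.0        # separation of the two cluster centers (assumed value)
p_plus = 0.5     # prior probability of class +1

# cluster centers of the Gaussian mixture
B_plus, B_minus = np.zeros(N), np.zeros(N)
B_plus[0], B_minus[1] = ell, ell

def sample():
    """Draw one labeled example from the two-Gaussian mixture."""
    sigma = 1 if rng.random() < p_plus else -1
    center = B_plus if sigma == 1 else B_minus
    return center + rng.standard_normal(N), sigma

def lvq1_step(w, xi, sigma):
    """LVQ1 (Winner-Takes-All): move only the closest prototype, towards the
    example if the class labels agree, away from it otherwise."""
    s = min((1, -1), key=lambda k: np.sum((xi - w[k]) ** 2))
    w[s] += (eta / N) * (1 if s == sigma else -1) * (xi - w[s])

def lvq_pm_step(w, xi, sigma):
    """LVQ+/-: update both prototypes, attracting the correct-class prototype
    and repelling the other one."""
    for s in (1, -1):
        w[s] += (eta / N) * (1 if s == sigma else -1) * (xi - w[s])

# two prototypes, one per class, initialized near the origin
w = {1: 0.01 * rng.standard_normal(N), -1: 0.01 * rng.standard_normal(N)}
for _ in range(5000):
    xi, sigma = sample()
    lvq1_step(w, xi, sigma)   # swap in lvq_pm_step to observe the divergent behavior noted in the abstract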
N. Barkai, H.S. Seung, and H. Sompolinsky. Scaling laws in learning of classification tasks. Physical Review Letters, 70(20):3167–3170, 1993.
M. Biehl and N. Caticha. The statistical mechanics of on-line learning and generalization. In M.A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 1095–1098. MIT Press, Cambridge, MA, 2003.
M. Biehl, A. Freking, and G. Reents. Dynamics of on-line competitive learning. Europhysics Letters, 38(1):73–78, 1997.
M. Biehl, A. Ghosh, and B. Hammer. The dynamics of learning vector quantization. In M. Verleysen, editor, European Symposium on Artificial Neural Networks, ESANN’05, pages 13–18. d-side, Evere, Belgium, 2005.
T. Bojer, B. Hammer, and C. Koers. Monitoring technical systems with prototype based clustering. In M. Verleysen, editor, European Symposium on Artificial Neural Networks, pages 433–439. d-side, Evere, Belgium, 2003.
L. Bottou. Stochastic gradient learning in neural networks. In Proc. of Neuro-Nimes 91. EC2 editions, 1991.
K. Crammer, R. Gilad-Bachrach, A. Navot, and A. Tishby. Margin analysis of the LVQ algorithm. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15, pages 462–469. MIT Press, Cambridge, MA, 2003.
R.O. Duda, P.E. Hart, and D.G. Stork. Pattern Classification. Wiley, New York, 2000.
A. Engel and C. van den Broeck. The Statistical Mechanics of Learning. Cambridge University Press, Cambridge, UK, 2001.
A. Ghosh, M. Biehl, A. Freking, and G. Reents. A theoretical framework for analysing the dynamics of LVQ: A statistical physics approach. Technical Report 2004-9-02, Mathematics and Computing Science, University of Groningen, P.O. Box 800, 9700 AV Groningen, The Netherlands, available from www.cs.rug.nl/~biehl, 2004.
A. Ghosh, M. Biehl, and B. Hammer. Dynamical analysis of LVQ type learning rules. In M. Cottrell, editor, Workshop on the Self-Organizing-Map WSOM’05. Univ. de Paris (I), 2005.
B. Hammer and T. Villmann. Generalized relevance learning vector quantization. Neural Networks, 15:1059–1068, 2002.
B. Hammer, M. Strickert, and T. Villmann. On the generalization ability of GRLVQ networks. Neural Processing Letters, 21(2):109–120, 2005a.
B. Hammer, M. Strickert, and T. Villmann. Prototype based recognition of splice sites. In U. Seiffert, L. C. Jain, and P. Schweitzer, editors, Bioinformatics using Computational Intelligence Paradigms, pages 25–55. Springer, Berlin, 2005b.
B. Hammer, M. Strickert, and T. Villmann. Supervised neural gas with general similarity measure. Neural Processing Letters, 21(1):21–44, 2005c.
T. Kohonen. Improved versions of learning vector quantization. In Proc. of the International Joint Conference on Neural Networks (San Diego, 1990), 1:545–550, 1990.
T. Kohonen. Learning vector quantization. In M.A. Arbib, editor, The Handbook of Brain Theory and Neural Networks, pages 537–540. MIT Press, Cambridge, MA, 1995.
T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.
T. Kohonen, G. Barna, and R. Chrisley. Statistical pattern recognition with neural networks: benchmarking studies. In Proc. of the IEEE Second International Conference on Neural Networks (San Diego, 1988), volume 1, pages 61–68. IEEE, New York, 1988.
L. I. Kuncheva. Classifier ensembles for changing environments. In F. Roli, J. Kittler, and T. Windeatt, editors, Multiple Classifier Systems: 5th International Workshop, MCS2004, Cagliari, Italy, volume 3077 of Lecture Notes in Computer Science, pages 1–15. Springer, Berlin, 2004.
C. Marangi, M. Biehl, and S.A. Solla. Supervised learning from clustered input examples. Europhysics Letters, 30(2):117, 1995.
R. Meir. Empirical risk minimization versus maximum-likelihood estimation: a case study. Neural Computation, 7(1):144–157, 1995.
Neural Networks Research Centre, Helsinki. Bibliography on the self-organizing maps (SOM) and learning vector quantization (LVQ). Otaniemi: Helsinki Univ. of Technology. Available on-line: http://liinwww.ira.uka.de/bibliography/Neural/SOM.LVQ.html, 2002.
M. Pregenzer, G. Pfurtscheller, and D. Flotzinger. Automated feature selection with distinction sensitive learning vector quantization. Neurocomputing, 11:19–20, 1996.
G. Reents and R. Urbanczik. Self-averaging and on-line learning. Physical Review Letters, 80(24):5445–5448, 1998.
P. Riegler, M. Biehl, S.A. Solla, and C. Marangi. On-line learning from clustered input examples. In M. Marinaro and R. Tagliaferri, editors, Neural Nets WIRN Vietri-95, Proc. of the 7th Italian Workshop on Neural Nets, pages 87–92. World Scientific, Singapore, 1996.
D. Saad, editor. On-line Learning in Neural Networks. Cambridge University Press, Cambridge, UK, 1999.
A.S. Sato and K. Yamada. Generalized learning vector quantization. In G. Tesauro, D. Touretzky, and T. Leen, editors, Advances in Neural Information Processing Systems, volume 7, pages 423–429, 1995.
A.S. Sato and K. Yamada. An analysis of convergence in generalized LVQ. In L. Niklasson, M. Bodén, and T. Ziemke, editors, International Conference on Artificial Neural Networks, ICANN’98, pages 172–176. Springer, Berlin, 1998.
F.-M. Schleif, T. Villmann, and B. Hammer. Local metric adaptation for soft nearest prototype classification to classify proteomic data. In I. Bloch, A. Petrosino, and A.G.B. Tettamanzi, editors, International Workshop on Fuzzy Logic and Applications, volume 3849 of Lecture Notes in Computer Science, pages 290–296. Springer, Berlin, 2006.
S. Seo and K. Obermayer. Soft learning vector quantization. Neural Computation, 15:1589–1604, 2003.
S. Seo, M. Bode, and K. Obermayer. Soft nearest prototype classification. IEEE Transactions on Neural Networks, 14(2):390–398, 2003.
T. Villmann, E. Merenyi, and B. Hammer. Neural maps in remote sensing image analysis. Neural Networks, 16(3-4):389–403, 2003.
T. H. L. Watkin, A. Rau, and M. Biehl. The statistical mechanics of learning a rule. Reviews of Modern Physics, 65:499–556, 1993.