nips nips2013 nips2013-64 nips2013-64-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Rupesh K. Srivastava, Jonathan Masci, Sohrob Kazerounian, Faustino Gomez, Jürgen Schmidhuber
Abstract: Local competition among neighboring neurons is common in biological neural networks (NNs). In this paper, we apply the concept to gradient-based, backprop-trained artificial multilayer NNs. NNs with competing linear units tend to outperform those with non-competing nonlinear units, and avoid catastrophic forgetting when training sets change over time.
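The abstract describes competition among linear units only at a high level. As an illustration, here is a minimal sketch of one way such local competition can be implemented, assuming units are grouped into fixed-size blocks and only the maximally active unit in each block passes its value forward; the function name lwta_forward, the block size, and the toy data below are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def lwta_forward(x, W, b, block_size=2):
        """Sketch of local competition among linear units (assumed block-wise
        winner-take-all): within each block, only the unit with the largest
        activation passes its value forward; the others output zero."""
        a = x @ W + b                                  # linear activations, shape (batch, units)
        n_blocks = a.shape[1] // block_size
        a_blocks = a.reshape(a.shape[0], n_blocks, block_size)
        winners = a_blocks.argmax(axis=2)              # index of the winning unit in each block
        mask = np.zeros_like(a_blocks)
        np.put_along_axis(mask, winners[..., None], 1.0, axis=2)
        return (a_blocks * mask).reshape(a.shape)      # losing units are silenced

    # Toy usage: batch of 4 inputs, 6 linear units grouped into 3 blocks of 2.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 10))
    W = 0.1 * rng.standard_normal((10, 6))
    b = np.zeros(6)
    h = lwta_forward(x, W, b)                          # shape (4, 6), at most one nonzero unit per block

Because the winning units are linear, the gradient passes through them unchanged during backpropagation, which is the property the abstract contrasts with non-competing nonlinear units.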
[1] Per Andersen, Gary N. Gross, Terje Lømo, and Ola Sveen. Participation of inhibitory and excitatory interneurones in the control of hippocampal cortical output. In Mary A.B. Brazier, editor, The Interneuron, volume 11. University of California Press, Los Angeles, 1969.
[2] John Carew Eccles, Masao Ito, and János Szentágothai. The cerebellum as a neuronal machine. Springer-Verlag New York, 1967.
[3] Costas Stefanis. Interneuronal mechanisms in the cortex. In Mary A.B. Brazier, editor, The Interneuron, volume 11. University of California Press, Los Angeles, 1969.
[4] Stephen Grossberg. Contour enhancement, short-term memory, and constancies in reverberating neural networks. Studies in Applied Mathematics, 52:213–257, 1973.
[5] Michael McCloskey and Neal J. Cohen. Catastrophic interference in connectionist networks: The sequential learning problem. The Psychology of Learning and Motivation, 24:109–164, 1989.
[6] Gail A. Carpenter and Stephen Grossberg. The ART of adaptive pattern recognition by a self-organizing neural network. Computer, 21(3):77–88, 1988.
[7] Mark B. Ring. Continual Learning in Reinforcement Environments. PhD thesis, Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, August 1994.
[8] Samuel A. Ellias and Stephen Grossberg. Pattern formation, contrast control, and oscillations in the short term memory of shunting on-center off-surround networks. Biological Cybernetics, 1975.
[9] Brad Ermentrout. Complex dynamics in winner-take-all neural nets with slow inhibition. Neural Networks, 5(1):415–431, 1992.
[10] Christoph von der Malsburg. Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14(2):85–100, December 1973.
[11] Teuvo Kohonen. Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43(1):59–69, 1982.
[12] Risto Miikkulainen, James A. Bednar, Yoonsuck Choe, and Joseph Sirosh. Computational maps in the visual cortex. Springer Science+Business Media, 2005.
[13] Dale K. Lee, Laurent Itti, Christof Koch, and Jochen Braun. Attention activates winner-take-all competition among visual filters. Nature Neuroscience, 2(4):375–381, April 1999.
[14] Matthias Oster and Shih-Chii Liu. Spiking inputs to a winner-take-all network. In Proceedings of NIPS, volume 18, 2006.
[15] John P. Lazzaro, Sylvie Ryckebusch, Misha Anne Mahowald, and Carver A. Mead. Winner-take-all networks of O(n) complexity. Technical report, 1988.
[16] Giacomo Indiveri. Modeling selective attention using a neuromorphic analog VLSI device. Neural Computation, 12(12):2857–2880, 2000.
[17] Wolfgang Maass. Neural computation with winner-take-all as the only nonlinear operation. In Proceedings of NIPS, volume 12, 1999.
[18] Wolfgang Maass. On the computational power of winner-take-all. Neural Computation, 12:2519–2535, 2000.
[19] Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, and Yoshua Bengio. Maxout networks. In Proceedings of the ICML, 2013.
[20] Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors, 2012. arXiv:1207.0580.
[21] Juergen Schmidhuber. A local learning algorithm for dynamic feedforward and recurrent networks. Connection Science, 1(4):403–412, 1989.
[22] Rupesh K. Srivastava, Bas R. Steunebrink, and Juergen Schmidhuber. First experiments with PowerPlay. Neural Networks, 2013.
[23] Maximilian Riesenhuber and Tomaso Poggio. Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11), 1999.
[24] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of NIPS, pages 1–9, 2012.
[25] Dan Ciresan, Ueli Meier, and Jürgen Schmidhuber. Multi-column deep neural networks for image classification. In Proceedings of the CVPR, 2012.
[26] Vinod Nair and Geoffrey E. Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the ICML, number 3, 2010.
[27] Xavier Glorot, Antoine Bordes, and Yoshua Bengio. Deep sparse rectifier neural networks. In AISTATS, volume 15, pages 315–323, 2011.
[28] George E. Dahl, Tara N. Sainath, and Geoffrey E. Hinton. Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proceedings of ICASSP, 2013.
[29] Andrew L. Maas, Awni Y. Hannun, and Andrew Y. Ng. Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the ICML, 2013.
[30] Tijmen Tieleman. Gnumpy: an easy way to use GPU boards in Python. Department of Computer Science, University of Toronto, 2010.
[31] Volodymyr Mnih. CUDAMat: a CUDA-based matrix class for Python. Department of Computer Science, University of Toronto, Tech. Rep. UTML TR, 4, 2009.
[32] Patrice Y. Simard, Dave Steinkraus, and John C. Platt. Best practices for convolutional neural networks applied to visual document analysis. In International Conference on Document Analysis and Recognition (ICDAR), 2003.
[33] Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[34] Marc’Aurelio Ranzato, Christopher Poultney, Sumit Chopra, and Yann LeCun. Efficient learning of sparse representations with an energy-based model. In Proceedings of NIPS, 2007.
[35] Matthew D. Zeiler and Rob Fergus. Stochastic pooling for regularization of deep convolutional neural networks. In Proceedings of the ICLR, 2013.
[36] Kevin Jarrett, Koray Kavukcuoglu, Marc’Aurelio Ranzato, and Yann LeCun. What is the best multi-stage architecture for object recognition? In Proc. of the ICCV, pages 2146–2153, 2009.
[37] John Blitzer, Mark Dredze, and Fernando Pereira. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the Annual Meeting of the ACL, 2007.
[38] Xavier Glorot, Antoine Bordes, and Yoshua Bengio. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the ICML, number 1, 2011.