
271 nips-2010-Tiled convolutional neural networks


Source: pdf

Author: Jiquan Ngiam, Zhenghao Chen, Daniel Chia, Pang W. Koh, Quoc V. Le, Andrew Y. Ng

Abstract: Convolutional neural networks (CNNs) have been successfully applied to many tasks such as digit and object recognition. Using convolutional (tied) weights significantly reduces the number of parameters that have to be learned, and also allows translational invariance to be hard-coded into the architecture. In this paper, we consider the problem of learning invariances, rather than relying on hard-coding. We propose tiled convolutional neural networks (Tiled CNNs), which use a regular “tiled” pattern of tied weights that does not require adjacent hidden units to share identical weights, but instead requires only that hidden units k steps away from each other have tied weights. By pooling over neighboring units, this architecture is able to learn complex invariances (such as scale and rotational invariance) beyond translational invariance. Further, it retains much of CNNs’ advantage of having a relatively small number of learned parameters (and hence their ease of learning and greater scalability). We provide an efficient learning algorithm for Tiled CNNs based on Topographic ICA, and show that learning complex invariant features allows us to achieve highly competitive results on both the NORB and CIFAR-10 datasets.
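To make the tiling scheme concrete, the following is a minimal 1-D NumPy sketch, not the authors' implementation: the function names, the non-overlapping square-root pooling, and the example numbers are illustrative assumptions. With tile size k, a bank of k distinct filters is cycled across the input, so units exactly k steps apart share weights, and k = 1 recovers an ordinary convolutional layer.

```python
import numpy as np

def tiled_conv1d(x, filters):
    """1-D 'tiled' convolution sketch: hidden units exactly k steps apart
    share weights, where k = filters.shape[0] is the tile size.
    k = 1 recovers an ordinary (fully tied) convolutional layer."""
    k, w = filters.shape                     # k filters, each of width w
    n_out = x.shape[0] - w + 1
    out = np.empty(n_out)
    for i in range(n_out):
        out[i] = filters[i % k] @ x[i:i + w]  # weight set cycles with period k
    return out

def l2_pool(h, pool_size):
    """Square-root (L2) pooling over neighboring hidden units, in the spirit
    of TICA-style pooling (illustrative, non-overlapping pools)."""
    n = h.shape[0] - h.shape[0] % pool_size
    return np.sqrt((h[:n].reshape(-1, pool_size) ** 2).sum(axis=1))

# Tiny usage example (hypothetical sizes)
x = np.random.randn(32)
filters = np.random.randn(2, 5)   # tile size k = 2, receptive field width 5
pooled = l2_pool(tiled_conv1d(x, filters), pool_size=2)
```

Pooling over the k differently-weighted but neighboring units is what lets the learned features go beyond pure translational invariance, which is the point the abstract makes.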


reference text

[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.

[2] P. Simard, D. Steinkraus, and J. Platt. Best practices for convolutional neural networks applied to visual document analysis. In ICDAR, 2003.

[3] Y. LeCun, F.J. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. In CVPR, 2004.

[4] R. Collobert and J. Weston. A unified architecture for natural language processing: Deep neural networks with multitask learning. In ICML, 2008.

[5] R. Raina, A. Battle, H. Lee, B. Packer, and A.Y. Ng. Self-taught learning: Transfer learning from unlabeled data. In ICML, 2007.

[6] G.E. Hinton, S. Osindero, and Y.W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 2006.

[7] D. Erhan, A. Courville, Y. Bengio, and P. Vincent. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 2010.

[8] A. Hyvarinen and P. Hoyer. Topographic independent component analysis as a model of V1 organization and receptive fields. Neural Computation, 2001.

[9] A. Hyvarinen, J. Hurri, and P. Hoyer. Natural Image Statistics. Springer, 2009.

[10] A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, U. Toronto, 2009.

[11] H. Lee, R. Grosse, R. Ranganath, and A.Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In ICML, 2009.

[12] K. Jarrett, K. Kavukcuoglu, M.A. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In ICCV, 2009.

[13] I. Goodfellow, Q.V. Le, A. Saxe, H. Lee, and A.Y. Ng. Measuring invariances in deep networks. In NIPS, 2010.

[14] B. Olshausen and D. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996.

[15] A. Hyvarinen, J. Karhunen, and E. Oja. Independent Component Analysis. Wiley Interscience, 2001.

[16] A. Hyvarinen. Estimation of non-normalized statistical models using score matching. JMLR, 2005.

[17] K. Kavukcuoglu, M.A. Ranzato, R. Fergus, and Y. LeCun. Learning invariant features through topographic filter maps. In CVPR, 2009.

[18] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In NIPS, 2007.

[19] V. Nair and G. Hinton. 3D object recognition with deep belief nets. In NIPS, 2009.

[20] Y. Bengio and Y. LeCun. Scaling learning algorithms towards AI. In Large-Scale Kernel Machines, 2007.

[21] R. Salakhutdinov and H. Larochelle. Efficient learning of deep Boltzmann machines. In AISTATS, 2010.

[22] R.E. Fan, K.W. Chang, C.J. Hsieh, X.R. Wang, and C.J. Lin. LIBLINEAR: A library for large linear classification. JMLR, 9:1871–1874, 2008.

[23] G. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 2006.

[24] M. Ranzato and G. Hinton. Modeling pixel means and covariances using factorized third-order Boltzmann machines. In CVPR, 2010.

[25] K. Yu and T. Zhang. Improved local coordinate coding using local tangents. In ICML, 2010.

[26] Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009.

[27] A. Saxe, M. Bhand, Z. Chen, P. W. Koh, B. Suresh, and A. Y. Ng. On random weights and unsupervised feature learning. In Workshop: Deep Learning and Unsupervised Feature Learning (NIPS), 2010.

[28] D. Erhan, Y. Bengio, A. Courville, and P. Vincent. Visualizing higher-layer features of a deep network. Technical report, University of Montreal, 2009.

[29] P. Berkes and L. Wiskott. Slow feature analysis yields a rich repertoire of complex cell properties. Journal of Vision, 2005.

[30] A. Torralba, R. Fergus, and W.T. Freeman. 80 million tiny images: a large dataset for non-parametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.