
A Better Way to Pretrain Deep Boltzmann Machines


Source: pdf

Authors: Geoffrey E. Hinton, Ruslan Salakhutdinov

Abstract: We describe how the pretraining algorithm for Deep Boltzmann Machines (DBMs) is related to the pretraining algorithm for Deep Belief Networks and we show that under certain conditions, the pretraining procedure improves the variational lower bound of a two-hidden-layer DBM. Based on this analysis, we develop a different method of pretraining DBMs that distributes the modelling work more evenly over the hidden layers. Our results on the MNIST and NORB datasets demonstrate that the new pretraining algorithm allows us to learn better generative models.
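For readers unfamiliar with the greedy, layer-by-layer pretraining that the abstract builds on, the sketch below shows the generic idea in NumPy: a stack of restricted Boltzmann machines (RBMs), each trained with one-step contrastive divergence (CD-1) on the hidden activities of the layer below. This is only an illustrative sketch of the standard DBN-style stack, not the authors' modified DBM pretraining procedure; all names (RBM, cd1_step, pretrain_stack) and hyperparameters are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: data-driven hidden probabilities.
        h0 = self.hidden_probs(v0)
        # Negative phase: one Gibbs step from a sampled hidden state.
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # CD-1 parameter updates, averaged over the batch.
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / v0.shape[0]
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def pretrain_stack(data, layer_sizes, epochs=5):
    # Greedily train one RBM per layer; each layer models the
    # hidden activities of the layer below.
    rbms, x = [], data
    for n_hid in layer_sizes:
        rbm = RBM(x.shape[1], n_hid)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)   # input to the next layer
    return rbms

# Toy usage: binary data, two hidden layers as in the two-hidden-layer DBM analysis.
data = (rng.random((100, 784)) < 0.1).astype(float)
stack = pretrain_stack(data, layer_sizes=[500, 1000])

A faithful DBM pretraining implementation would additionally follow the weight-halving and doubling conventions the paper derives so that the stacked RBMs compose into a valid DBM; the sketch above omits that step and shows only the generic greedy stack.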


References

[1] Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009.

[2] G. E. Hinton, S. Osindero, and Y. W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527–1554, 2006.

[3] H. Larochelle, Y. Bengio, J. Louradour, and P. Lamblin. Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10:1–40, 2009.

[4] Y. LeCun, F. J. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. In CVPR (2), pages 97–104, 2004.

[5] V. Nair and G. E. Hinton. Implicit mixtures of restricted Boltzmann machines. In Advances in Neural Information Processing Systems, volume 21, 2009.

[6] M. A. Ranzato. Unsupervised learning of feature hierarchies. PhD thesis, New York University, 2009.

[7] R. R. Salakhutdinov and G. E. Hinton. Deep Boltzmann machines. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 12, 2009.

[8] R. R. Salakhutdinov and G. E. Hinton. An efficient learning procedure for Deep Boltzmann Machines. Neural Computation, 24:1967–2006, 2012.

[9] R. R. Salakhutdinov and H. Larochelle. Efficient learning of deep Boltzmann machines. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 13, 2010.

[10] R. R. Salakhutdinov and I. Murray. On the quantitative analysis of deep belief networks. In Proceedings of the International Conference on Machine Learning, volume 25, pages 872–879, 2008.

[11] T. Tieleman. Training restricted Boltzmann machines using approximations to the likelihood gradient. In Proceedings of the International Conference on Machine Learning, 2008.

[12] M. Welling and G. E. Hinton. A new learning algorithm for mean field Boltzmann machines. Lecture Notes in Computer Science, 2415, 2002.

[13] M. Welling and C. Sutton. Learning in Markov random fields with contrastive free energies. In International Workshop on AI and Statistics (AISTATS), 2005.

[14] L. Younes. On the convergence of Markovian stochastic algorithms with rapidly decreasing ergodicity rates, March 17, 2000.

[15] A. L. Yuille. The convergence of contrastive divergences. In Advances in Neural Information Processing Systems, 2004.