NIPS 2008, paper 103: reference knowledge graph by maker-knowledge-mining
Source: pdf
Author: Vinod Nair, Geoffrey E. Hinton
Abstract: We present a mixture model whose components are Restricted Boltzmann Machines (RBMs). This possibility has not been considered before because computing the partition function of an RBM is intractable, which appears to make learning a mixture of RBMs intractable as well. Surprisingly, when formulated as a third-order Boltzmann machine, such a mixture model can be learned tractably using contrastive divergence. The energy function of the model captures three-way interactions among visible units, hidden units, and a single hidden discrete variable that represents the cluster label. The distinguishing feature of this model is that, unlike other mixture models, the mixing proportions are not explicitly parameterized. Instead, they are defined implicitly via the energy function and depend on all the parameters in the model. We present results for the MNIST and NORB datasets showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.
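The abstract describes a model whose energy has three-way terms coupling visible units, hidden units, and a cluster label, with mixing proportions left implicit in the parameters. The sketch below is a minimal, hypothetical NumPy illustration of that idea (not the authors' implementation): each component is a small binary RBM with its own weight tensor, a component is picked for a data vector from its unnormalized free energies (so the "mixing proportions" fall out of the parameters rather than being separate variables), and one CD-1 step updates only the chosen component. Biases, the NORB/MNIST data, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: K mixture components, V visible units, H hidden units.
K, V, H = 3, 6, 4
W = 0.01 * rng.standard_normal((K, V, H))  # one weight matrix per component

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def free_energy(v, Wk):
    # Free energy of binary vector v under one RBM component
    # (biases omitted for brevity): F(v) = -sum_j log(1 + exp(v . W[:, j])).
    return -np.sum(np.log1p(np.exp(v @ Wk)))

def pick_component(v):
    # Responsibilities come from unnormalized free energies; the mixing
    # proportions are implicit in the weights, not separate parameters.
    f = np.array([free_energy(v, W[k]) for k in range(K)])
    p = np.exp(-(f - f.min()))  # shift for numerical stability
    return rng.choice(K, p=p / p.sum())

def cd1_step(v, lr=0.05):
    k = pick_component(v)
    # Positive phase: hidden activation probabilities given the data.
    ph = sigmoid(v @ W[k])
    # Negative phase: one Gibbs step (sample h, reconstruct v, re-infer h).
    h = (rng.random(H) < ph).astype(float)
    vneg = (rng.random(V) < sigmoid(W[k] @ h)).astype(float)
    nh = sigmoid(vneg @ W[k])
    # Contrastive-divergence update for the selected component only.
    W[k] += lr * (np.outer(v, ph) - np.outer(vneg, nh))
    return k

# Toy binary "dataset"; real experiments in the paper use MNIST and NORB.
data = (rng.random((20, V)) < 0.5).astype(float)
for v in data:
    cd1_step(v)
```

Because the component is chosen through the energies themselves, components that model a data vector well are picked more often and trained further on it, which is how the clustering emerges without explicit mixing weights.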
[1] The MNIST database of handwritten digits, http://yann.lecun.com/exdb/mnist/.
[2] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[3] Z. Ghahramani and G. E. Hinton. The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1, Dept. of Computer Science, University of Toronto, 1996.
[4] X. He, R. S. Zemel, and M. A. Carreira-Perpinan. Multiscale conditional random fields for image labeling. In CVPR, pages 695–702, 2004.
[5] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8):1771–1800, 2002.
[6] G. E. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313:504–507, 2006.
[7] Y. LeCun, F. J. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. In CVPR, Washington, D.C., 2004.
[8] S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In CVPR, pages 860–867, 2005.
[9] S. Roth and M. J. Black. Steerable random fields. In ICCV, 2007.
[10] N. Le Roux and Y. Bengio. Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, to appear.
[11] R. Salakhutdinov and I. Murray. On the quantitative analysis of deep belief networks. In ICML, Helsinki, 2008.
[12] I. Sutskever and G. E. Hinton. Deep narrow sigmoid belief networks are universal approximators. Neural Computation, to appear.
[13] M. Welling, M. Rosen-Zvi, and G. E. Hinton. Exponential family harmoniums with an application to information retrieval. In NIPS 17, 2005.