
3D Object Recognition with Deep Belief Nets (NIPS 2009)

Source: pdf

Author: Vinod Nair, Geoffrey E. Hinton

Abstract: We introduce a new type of top-level model for Deep Belief Nets and evaluate it on a 3D object recognition task. The top-level model is a third-order Boltzmann machine, trained using a hybrid algorithm that combines both generative and discriminative gradients. Performance is evaluated on the NORB database (normalized-uniform version), which contains stereo-pair images of objects under different lighting conditions and viewpoints. Our model achieves 6.5% error on the test set, which is close to the best published result for NORB (5.9%) using a convolutional neural net that has built-in knowledge of translation invariance. It substantially outperforms shallow models such as SVMs (11.6%). DBNs are especially suited for semi-supervised learning, and to demonstrate this we consider a modified version of the NORB recognition task in which additional unlabeled images are created by applying small translations to the images in the database. With the extra unlabeled data (and the same amount of labeled data as before), our model achieves 5.2% error.
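The abstract names a third-order Boltzmann machine trained with generative and discriminative gradients but does not give its energy function. The following is only a minimal sketch of the discriminative half, under an assumed form: a label-gated three-way energy with binary hidden units, where each class label selects its own visible-to-hidden weight slice. The names `W`, `c`, `d`, `class_posterior`, and `discriminative_grad_W`, and the toy layer sizes, are illustrative assumptions and are not taken from the paper. The generative (contrastive-divergence style) gradient that a hybrid rule would combine with this one is not shown.

```python
import numpy as np


def class_posterior(v, W, c, d):
    """p(label | v) for an assumed label-gated third-order RBM.

    v: (n_visible,) input feature vector
    W: (n_labels, n_visible, n_hidden) three-way weights
    c: (n_labels, n_hidden) hidden biases, one set per label (assumption)
    d: (n_labels,) label biases
    Label-independent visible biases would cancel in the softmax, so they
    are omitted here.
    """
    # Hidden pre-activations under each candidate label: (n_labels, n_hidden)
    act = np.einsum("i,kij->kj", v, W) + c
    # Negative free energy per label; binary hidden units are summed out
    # analytically, giving a softplus term per hidden unit.
    neg_free_energy = d + np.logaddexp(0.0, act).sum(axis=1)
    # Softmax over labels, shifted by the max for numerical stability
    z = neg_free_energy - neg_free_energy.max()
    p = np.exp(z)
    return p / p.sum()


def discriminative_grad_W(v, y, W, c, d):
    """Gradient of log p(y | v) with respect to W, same shape as W."""
    act = np.einsum("i,kij->kj", v, W) + c
    p = class_posterior(v, W, c, d)
    h_prob = 1.0 / (1.0 + np.exp(-act))  # P(h_j = 1 | v, label k)
    coeff = -p                            # new array, safe to modify
    coeff[y] += 1.0                       # indicator(k == y) - p(k | v)
    return coeff[:, None, None] * v[None, :, None] * h_prob[:, None, :]


# Toy usage with made-up sizes (not the paper's architecture)
rng = np.random.default_rng(0)
n_labels, n_visible, n_hidden = 5, 8, 4
W = 0.1 * rng.standard_normal((n_labels, n_visible, n_hidden))
c = np.zeros((n_labels, n_hidden))
d = np.zeros(n_labels)
v = rng.random(n_visible)

p = class_posterior(v, W, c, d)
g = discriminative_grad_W(v, 2, W, c, d)
print(p.shape, g.shape)
```

In this sketch the posterior is exact because the number of labels is small enough to enumerate, which is what makes a discriminative gradient tractable for a model of this kind. How the two gradients are weighted in the hybrid rule is left open here, since the abstract does not say.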

References

[1] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy Layer-Wise Training of Deep Networks. In NIPS, 2006.

[2] Y. Bengio and Y. LeCun. Scaling learning algorithms towards AI. In Large-Scale Kernel Machines, 2007.

[3] D. DeCoste and B. Schölkopf. Training Invariant Support Vector Machines. Machine Learning, 46:161–190, 2002.

[4] G. E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8):1771–1800, 2002.

[5] G. E. Hinton. To Recognize Shapes, First Learn to Generate Images. Technical Report UTML TR 2006-04, Dept. of Computer Science, University of Toronto, 2006.

[6] G. E. Hinton, S. Osindero, and Y. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527–1554, 2006.

[7] G. E. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313:504–507, 2006.

[8] M. Kelm, C. Pal, and A. McCallum. Combining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning. In ICPR, 2006.

[9] H. Larochelle and Y. Bengio. Classification Using Discriminative Restricted Boltzmann Machines. In ICML, pages 536–543, 2008.

[10] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In ICML, pages 473–480, 2007.

[11] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, November 1998.

[12] Y. LeCun, F. J. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. In CVPR, Washington, D.C., 2004.

[13] H. Lee, R. Grosse, R. Ranganath, and A. Ng. Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations. In ICML, 2009.

[14] V. Nair and G. E. Hinton. Implicit Mixtures of Restricted Boltzmann Machines. In NIPS, 2008.

[15] R. Raina, A. Madhavan, and A. Ng. Large-scale Deep Unsupervised Learning using Graphics Processors. In ICML, 2009.

[16] M. Ranzato, F.-J. Huang, Y.-L. Boureau, and Y. LeCun. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In CVPR, 2007.

[17] T. J. Sejnowski. Higher-order Boltzmann Machines. In AIP Conference Proceedings, pages 398–403, 1987.

[18] G. Taylor and G. E. Hinton. Factored Conditional Restricted Boltzmann Machines for Modeling Motion Style. In ICML, 2009.

[19] V. Vapnik. Statistical Learning Theory. John Wiley and Sons, 1998.

[20] P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol. Extracting and Composing Robust Features with Denoising Autoencoders. In ICML, 2008.

[21] M. Welling, M. Rosen-Zvi, and G. E. Hinton. Exponential family harmoniums with an application to information retrieval. In NIPS 17, 2005.