NIPS 2013, Paper 226: reference knowledge graph (maker-knowledge-mining)
Source: pdf
Author: Brenden M. Lake, Ruslan Salakhutdinov, Josh Tenenbaum
Abstract: People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also tested the model on another conceptual task, generating new examples, by using a “visual Turing test” to show that our model produces human-like performance.
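The one-shot classification task the abstract mentions can be made concrete with a small harness: each episode provides one labeled example per class, and the test item is assigned to the class whose single example it scores best against. This is a minimal sketch, not the paper's method; the `neg_sq_dist` scoring function is a placeholder assumption (the paper instead scores images under a generative model of strokes), and the function names are invented for illustration.

```python
def one_shot_classify(test_item, support, score):
    """Assign test_item to the class whose single support example scores highest."""
    return max(support, key=lambda label: score(test_item, support[label]))

def error_rate(episodes, score):
    """Fraction of (test_item, true_label, support) episodes classified wrongly."""
    errors = sum(one_shot_classify(x, support, score) != y
                 for x, y, support in episodes)
    return errors / len(episodes)

# Placeholder compatibility score: negative squared Euclidean distance
# between feature vectors. Any model-based score could be dropped in here.
def neg_sq_dist(a, b):
    return -sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Toy 2-way episode: one support example per class, one test item.
support = {"A": (0.0, 0.0), "B": (5.0, 5.0)}
predicted = one_shot_classify((0.1, -0.2), support, neg_sq_dist)
```

Here `predicted` is `"A"`, since the test point lies far closer to class A's lone example than to class B's.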
[1] M. K. Babcock and J. Freyd. Perception of dynamic information in static handwritten forms. American Journal of Psychology, 101(1):111–130, 1988.
[2] I. Biederman. Recognition-by-components: a theory of human image understanding. Psychological Review, 94(2):115–47, 1987.
[3] S. Carey and E. Bartlett. Acquiring a single new word. Papers and Reports on Child Language Development, 15:17–29, 1978.
[4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[5] L. Fei-Fei, R. Fergus, and P. Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):594–611, 2006.
[6] J. Feldman. The structure of perceptual categories. Journal of Mathematical Psychology, 41:145–170, 1997.
[7] J. Freyd. Representing the dynamics of a static form. Memory and Cognition, 11(4):342–346, 1983.
[8] S. Geman, E. Bienenstock, and R. Doursat. Neural Networks and the Bias/Variance Dilemma. Neural Computation, 4:1–58, 1992.
[9] E. Gilet, J. Diard, and P. Bessière. Bayesian action-perception computational model: interaction of production and recognition of cursive letters. PLoS ONE, 6(6), 2011.
[10] G. E. Hinton and V. Nair. Inferring motor programs from images of handwritten digits. In Advances in Neural Information Processing Systems 19, 2006.
[11] K. H. James and I. Gauthier. Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia, 44(14):2937–2949, 2006.
[12] K. H. James and I. Gauthier. When writing impairs reading: letter perception’s susceptibility to motor interference. Journal of Experimental Psychology: General, 138(3):416–31, Aug. 2009.
[13] C. Kemp and A. Jern. Abstraction and relational learning. In Advances in Neural Information Processing Systems 22, 2009.
[14] A. Krizhevsky. Learning multiple layers of features from tiny images. PhD thesis, University of Toronto, 2009.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems 25, 2012.
[16] B. M. Lake, R. Salakhutdinov, J. Gross, and J. B. Tenenbaum. One shot learning of simple visual concepts. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011.
[17] B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. Concept learning as motor program induction: A large-scale empirical study. In Proceedings of the 34th Annual Conference of the Cognitive Science Society, 2012.
[18] L. Lam, S.-W. Lee, and C. Y. Suen. Thinning Methodologies - A Comprehensive Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(9):869–885, 1992.
[19] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11):2278–2323, 1998.
[20] K. Liu, Y. S. Huang, and C. Y. Suen. Identification of Fork Points on the Skeletons of Handwritten Chinese Characters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10):1095–1100, 1999.
[21] M. Longcamp, J. L. Anton, M. Roth, and J. L. Velay. Visual presentation of single letters activates a premotor area involved in writing. Neuroimage, 19(4):1492–1500, 2003.
[22] E. M. Markman. Categorization and Naming in Children. MIT Press, Cambridge, MA, 1989.
[23] E. G. Miller, N. E. Matsakis, and P. A. Viola. Learning from one example through shared densities on transformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[24] M. Revow, C. K. I. Williams, and G. E. Hinton. Using Generative Models for Handwritten Digit Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6):592–606, 1996.
[25] R. Salakhutdinov and G. E. Hinton. Deep Boltzmann Machines. In 12th International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.
[26] R. Salakhutdinov, J. B. Tenenbaum, and A. Torralba. Learning with Hierarchical-Deep Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1958–71, 2013.
[27] P. H. Winston. Learning structural descriptions from examples. In P. H. Winston, editor, The Psychology of Computer Vision. McGraw-Hill, New York, 1975.
[28] F. Xu and J. B. Tenenbaum. Word Learning as Bayesian Inference. Psychological Review, 114(2):245–272, 2007.