NIPS 2013, Paper 226: reference knowledge graph (maker-knowledge-mining)
Source: pdf
Author: Brenden M. Lake, Ruslan Salakhutdinov, Josh Tenenbaum
Abstract: People can learn a new visual class from just one example, yet machine learning algorithms typically require hundreds or thousands of examples to tackle the same problems. Here we present a Hierarchical Bayesian model based on compositionality and causality that can learn a wide range of natural (although simple) visual concepts, generalizing in human-like ways from just one image. We evaluated performance on a challenging one-shot classification task, where our model achieved a human-level error rate while substantially outperforming two deep learning models. We also tested the model on another conceptual task, generating new examples, by using a “visual Turing test” to show that our model produces human-like performance.
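The one-shot classification task the abstract mentions can be made concrete with a small harness: each episode provides one labeled example per class, and the test item is assigned to the class whose single example it scores best against. This is a minimal sketch, not the paper's method; the `neg_sq_dist` scoring function is a placeholder assumption (the paper instead scores images under a generative model of strokes), and the function names are invented for illustration.

```python
def one_shot_classify(test_item, support, score):
    """Assign test_item to the class whose single support example scores highest."""
    return max(support, key=lambda label: score(test_item, support[label]))

def error_rate(episodes, score):
    """Fraction of (test_item, true_label, support) episodes classified wrongly."""
    errors = sum(one_shot_classify(x, support, score) != y
                 for x, y, support in episodes)
    return errors / len(episodes)

# Placeholder compatibility score: negative squared Euclidean distance
# between feature vectors. Any model-based score could be dropped in here.
def neg_sq_dist(a, b):
    return -sum((ai - bi) ** 2 for ai, bi in zip(a, b))

# Toy 2-way episode: one support example per class, one test item.
support = {"A": (0.0, 0.0), "B": (5.0, 5.0)}
predicted = one_shot_classify((0.1, -0.2), support, neg_sq_dist)
```

Here `predicted` is `"A"`, since the test point lies far closer to class A's lone example than to class B's.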
[1] M. K. Babcock and J. Freyd. Perception of dynamic information in static handwritten forms. American Journal of Psychology, 101(1):111–130, 1988.
[2] I. Biederman. Recognition-by-components: a theory of human image understanding. Psychological Review, 94(2):115–47, 1987.
[3] S. Carey and E. Bartlett. Acquiring a single new word. Papers and Reports on Child Language Development, 15:17–29, 1978.
[4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[5] L. Fei-Fei, R. Fergus, and P. Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):594–611, 2006.
[6] J. Feldman. The structure of perceptual categories. Journal of Mathematical Psychology, 41:145–170, 1997.
[7] J. Freyd. Representing the dynamics of a static form. Memory and Cognition, 11(4):342–346, 1983.
[8] S. Geman, E. Bienenstock, and R. Doursat. Neural Networks and the Bias/Variance Dilemma. Neural Computation, 4:1–58, 1992.
[9] E. Gilet, J. Diard, and P. Bessière. Bayesian action-perception computational model: interaction of production and recognition of cursive letters. PLoS ONE, 6(6), 2011.
[10] G. E. Hinton and V. Nair. Inferring motor programs from images of handwritten digits. In Advances in Neural Information Processing Systems 19, 2006.
[11] K. H. James and I. Gauthier. Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia, 44(14):2937–2949, 2006.
[12] K. H. James and I. Gauthier. When writing impairs reading: letter perception’s susceptibility to motor interference. Journal of Experimental Psychology: General, 138(3):416–31, Aug. 2009.
[13] C. Kemp and A. Jern. Abstraction and relational learning. In Advances in Neural Information Processing Systems 22, 2009.
[14] A. Krizhevsky. Learning multiple layers of features from tiny images. PhD thesis, University of Toronto, 2009.
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems 25, 2012.
[16] B. M. Lake, R. Salakhutdinov, J. Gross, and J. B. Tenenbaum. One shot learning of simple visual concepts. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society, 2011.
[17] B. M. Lake, R. Salakhutdinov, and J. B. Tenenbaum. Concept learning as motor program induction: A large-scale empirical study. In Proceedings of the 34th Annual Conference of the Cognitive Science Society, 2012.
[18] L. Lam, S.-W. Lee, and C. Y. Suen. Thinning Methodologies - A Comprehensive Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(9):869–885, 1992.
[19] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11):2278–2323, 1998.
[20] K. Liu, Y. S. Huang, and C. Y. Suen. Identification of Fork Points on the Skeletons of Handwritten Chinese Characters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10):1095–1100, 1999.
[21] M. Longcamp, J. L. Anton, M. Roth, and J. L. Velay. Visual presentation of single letters activates a premotor area involved in writing. Neuroimage, 19(4):1492–1500, 2003.
[22] E. M. Markman. Categorization and Naming in Children. MIT Press, Cambridge, MA, 1989.
[23] E. G. Miller, N. E. Matsakis, and P. A. Viola. Learning from one example through shared densities on transformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[24] M. Revow, C. K. I. Williams, and G. E. Hinton. Using Generative Models for Handwritten Digit Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(6):592–606, 1996.
[25] R. Salakhutdinov and G. E. Hinton. Deep Boltzmann Machines. In 12th International Conference on Artificial Intelligence and Statistics (AISTATS), 2009.
[26] R. Salakhutdinov, J. B. Tenenbaum, and A. Torralba. Learning with Hierarchical-Deep Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1958–71, 2013.
[27] P. H. Winston. Learning structural descriptions from examples. In P. H. Winston, editor, The Psychology of Computer Vision. McGraw-Hill, New York, 1975.
[28] F. Xu and J. B. Tenenbaum. Word Learning as Bayesian Inference. Psychological Review, 114(2):245–272, 2007.