nips nips2010 nips2010-264 nips2010-264-reference knowledge-graph by maker-knowledge-mining

264 nips-2010-Synergies in learning words and their referents


Source: pdf

Author: Mark Johnson, Katherine Demuth, Bevan Jones, Michael C. Frank

Abstract: This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves are novel kinds of Adaptor Grammars that extend an embedding of topic models into PCFGs. These models simultaneously segment phoneme sequences into words and learn the relationship between non-linguistic objects and the words that refer to them. We show (i) that modelling inter-word dependencies improves not only the accuracy of word segmentation but also the accuracy of the learned word-object relationships, and (ii) that a model that simultaneously learns word-object relationships and word segmentation segments more accurately than one that learns word segmentation alone. We argue that these results support an interactive view of language acquisition that can take advantage of synergies such as these.
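The paper's models are Adaptor Grammars fit by MCMC (see [11], [15], [18]); as a rough, self-contained illustration of the non-parametric machinery they build on, and not the paper's actual model, grammar, or data, the sketch below implements a unigram Dirichlet-process word segmenter in the spirit of Goldwater et al. [9], resampling one candidate word boundary at a time with a Gibbs sampler. The toy corpus, the hyperparameters ALPHA and P_STOP, and the geometric base distribution are all invented for illustration.

```python
import random
from collections import Counter

random.seed(0)

# Toy unsegmented "utterances" (characters stand in for phonemes); purely
# illustrative, not the child-directed-speech corpus used in the paper.
CORPUS = ["isthatthedoggy", "lookatthedoggy", "isthatthebook", "lookatthebook"]

ALPHA = 10.0   # Dirichlet-process concentration (illustrative value)
P_STOP = 0.5   # geometric word-length parameter of the base distribution
INVENTORY = len(set("".join(CORPUS)))  # size of the symbol inventory


def base_prob(word):
    """Base distribution P0(w): geometric length prior, i.i.d. uniform phonemes."""
    return P_STOP * (1.0 - P_STOP) ** (len(word) - 1) * (1.0 / INVENTORY) ** len(word)


def segment(utt, bounds):
    """Cut an utterance at the given interior boundary positions."""
    cuts = [0] + sorted(bounds) + [len(utt)]
    return [utt[i:j] for i, j in zip(cuts, cuts[1:])]


def crp_prob(word, counts, total):
    """Chinese-restaurant-process predictive probability of generating `word`."""
    return (counts[word] + ALPHA * base_prob(word)) / (total + ALPHA)


def gibbs(corpus, iters=500):
    bounds = [set() for _ in corpus]          # start with no word boundaries
    counts = Counter(w for u, b in zip(corpus, bounds) for w in segment(u, b))
    total = sum(counts.values())
    for _ in range(iters):
        for i, utt in enumerate(corpus):
            for pos in range(1, len(utt)):
                # The word(s) currently spanning position `pos`.
                left = max([0] + [b for b in bounds[i] if b < pos])
                right = min([len(utt)] + [b for b in bounds[i] if b > pos])
                old = [utt[left:pos], utt[pos:right]] if pos in bounds[i] else [utt[left:right]]
                for w in old:                  # remove them from the CRP state
                    counts[w] -= 1
                    total -= 1
                # Posterior odds of splitting vs. merging at `pos`.
                w1, w2 = utt[left:pos], utt[pos:right]
                p_merge = crp_prob(utt[left:right], counts, total)
                p_split = crp_prob(w1, counts, total)
                counts[w1] += 1                # account for w1 before generating w2
                p_split *= crp_prob(w2, counts, total + 1)
                counts[w1] -= 1
                if random.random() < p_split / (p_split + p_merge):
                    bounds[i].add(pos)
                    new = [w1, w2]
                else:
                    bounds[i].discard(pos)
                    new = [utt[left:right]]
                for w in new:                  # add the sampled word(s) back
                    counts[w] += 1
                    total += 1
    return [segment(u, b) for u, b in zip(corpus, bounds)]


if __name__ == "__main__":
    for seg in gibbs(CORPUS):
        print(" ".join(seg))
```

On such a tiny corpus the sampler may or may not recover intuitive words, but it illustrates the CRP "rich get richer" caching of reusable units that Adaptor Grammars generalize from words to arbitrary PCFG nonterminals, and that the paper couples with topic-model-style referent variables.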


reference text

[1] Patricia K. Kuhl. Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5:831–843, 2004.

[2] Katharine Graf Estes, Julia L. Evans, Martha W. Alibali, and Jenny R. Saffran. Can infants map meaning to newly segmented words? Statistical segmentation and word learning. Psychological Science, 18(3):254–260, 2007.

[3] James L. McClelland and David E. Rumelhart. An interactive activation model of context effects in letter perception. Psychological Review, 88(5):375–407, 1981.

[4] Jeffrey Elman. Finding structure in time. Cognitive Science, 14:179–211, 1990.

[5] Michael R. Brent and Timothy A. Cartwright. Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61:93–125, 1996.

[6] Michael R. Brent. An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34:71–105, 1999.

[7] Michael C. Frank, Noah Goodman, and Joshua Tenenbaum. Using speakers’ referential intentions to model early cross-situational word learning. Psychological Science, 20:579–585, 2009.

[8] Bevan K. Jones, Mark Johnson, and Michael C. Frank. Learning words and their meanings from unsegmented child-directed speech. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 501–509, Los Angeles, California, June 2010. Association for Computational Linguistics.

[9] Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson. A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1):21–54, 2009.

[10] Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, and David M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101:1566–1581, 2006.

[11] Mark Johnson, Thomas L. Griffiths, and Sharon Goldwater. Adaptor Grammars: A framework for specifying compositional nonparametric Bayesian models. In B. Schölkopf, J. Platt, and T. Hofmann, editors, Advances in Neural Information Processing Systems 19, pages 641–648. MIT Press, Cambridge, MA, 2007.

[12] Mark Johnson. Using adaptor grammars to identify synergies in the unsupervised acquisition of linguistic structure. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, 2008. Association for Computational Linguistics.

[13] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, 2003.

[14] Mark Johnson. PCFGs, topic models, adaptor grammars and learning topical collocations and the structure of proper names. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1148–1157, Uppsala, Sweden, July 2010. Association for Computational Linguistics.

[15] Mark Johnson and Sharon Goldwater. Improving nonparametric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 317–325, Boulder, Colorado, June 2009. Association for Computational Linguistics.

[16] Shay B. Cohen, David M. Blei, and Noah A. Smith. Variational inference for adaptor grammars. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 564–572, Los Angeles, California, June 2010. Association for Computational Linguistics.

[17] Kenichi Kurihara and Taisuke Sato. Variational Bayesian grammar induction for natural language. In 8th International Colloquium on Grammatical Inference, 2006.

[18] Mark Johnson, Thomas Griffiths, and Sharon Goldwater. Bayesian inference for PCFGs via Markov chain Monte Carlo. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 139–146, Rochester, New York, April 2007. Association for Computational Linguistics.

[19] Anne Fernald and Hiromi Morikawa. Common themes and cultural variations in Japanese and American mothers’ speech to infants. Child Development, 64(3):637–656, 1993.