
41 acl-2012-Bootstrapping a Unified Model of Lexical and Phonetic Acquisition

Source: pdf

Author: Micha Elsner ; Sharon Goldwater ; Jacob Eisenstein

Affiliation: ILCC, School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK; School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA, 30308, USA

Abstract: During early language acquisition, infants must learn both a lexicon and a model of phonetics that explains how lexical items can vary in pronunciation; for instance, “the” might be realized as [Di] or [D@]. Previous models of acquisition have generally tackled these problems in isolation, yet behavioral evidence suggests infants acquire lexical and phonetic knowledge simultaneously. We present a Bayesian model that clusters together phonetic variants of the same lexical item while learning both a language model over lexical items and a log-linear model of pronunciation variability based on articulatory features. The model is trained on transcribed surface pronunciations, and learns by bootstrapping, without access to the true lexicon. We test the model using a corpus of child-directed speech with realistic phonetic variation and either gold standard or automatically induced word boundaries. In both cases, modeling variability improves the accuracy of the learned lexicon over a system that assumes each lexical item has a unique pronunciation.

Figure example: (a) intended: /ju want w2n/ /want e kUki/ (b) surface: [j@ w a?P w2n] [wan @ kUki]
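The log-linear model of pronunciation variability mentioned in the abstract can be pictured with a toy sketch. The Python below is not the authors' implementation. The phone inventory, the feature names, and the hand-set weights are all hypothetical, chosen only to show how a distribution over surface phones given an intended phone might be scored from shared articulatory features.

```python
import math

# Hypothetical toy inventory: phone -> (voicing, place, manner).
PHONES = {
    "t": ("voiceless", "alveolar", "stop"),
    "d": ("voiced", "alveolar", "stop"),
    "s": ("voiceless", "alveolar", "fricative"),
    "z": ("voiced", "alveolar", "fricative"),
    "k": ("voiceless", "velar", "stop"),
}

# Hypothetical weights, set by hand for illustration (not learned).
WEIGHTS = {
    "same_phone": 2.0,
    "same_voicing": 1.0,
    "same_place": 1.0,
    "same_manner": 1.0,
}


def features(intended, surface):
    """Binary features comparing the articulation of two phones."""
    a, b = PHONES[intended], PHONES[surface]
    return {
        "same_phone": float(intended == surface),
        "same_voicing": float(a[0] == b[0]),
        "same_place": float(a[1] == b[1]),
        "same_manner": float(a[2] == b[2]),
    }


def score(intended, surface):
    """Unnormalized log-linear score: dot product of weights and features."""
    feats = features(intended, surface)
    return sum(WEIGHTS[name] * value for name, value in feats.items())


def surface_distribution(intended):
    """P(surface | intended), normalized over the toy inventory."""
    scores = {s: score(intended, s) for s in PHONES}
    log_z = math.log(sum(math.exp(v) for v in scores.values()))
    return {s: math.exp(v - log_z) for s, v in scores.items()}


if __name__ == "__main__":
    for phone, prob in sorted(surface_distribution("t").items()):
        print(f"P({phone} | t) = {prob:.3f}")
```

With these assumed weights, the faithful realization receives the most probability mass, and variants that share more articulatory features with the intended phone are preferred over those that share fewer.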

reference text

Guillaume Aimetti. 2009. Modelling early language acquisition skills: Towards a general statistical learning mechanism. In Proceedings of the Student Research Workshop at EACL.
Armen Allahverdyan and Aram Galstyan. 2011. Comparative analysis of Viterbi training and ML estimation for HMMs. In Advances in Neural Information Processing Systems (NIPS).
Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. 2007. OpenFst: A general and efficient weighted finite-state transducer library. In Proceedings of the Ninth International Conference on Implementation and Application of Automata (CIAA 2007), volume 4783 of Lecture Notes in Computer Science, pages 11–23. Springer. http://www.openfst.org.
Galen Andrew and Jianfeng Gao. 2007. Scalable training of L1-regularized log-linear models. In ICML ’07.
Lalit Bahl, Raimo Bakis, Frederick Jelinek, and Robert Mercer. 1980. Language-model/acoustic-channel-model balance mechanism. Technical disclosure bulletin Vol. 23, No. 7b, IBM, December.
Nan Bernstein-Ratner. 1987. The phonology of parent-child speech. In K. Nelson and A. van Kleeck, editors, Children’s Language, volume 6. Erlbaum, Hillsdale, NJ.
L. Boruta, S. Peperkamp, B. Crabbé, E. Dupoux, et al. 2011. Testing the robustness of online word segmentation: effects of linguistic diversity and phonetic variation. ACL HLT 2011, page 1.
Michael Brent and Timothy Cartwright. 1996. Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61:93–125.
Michael R. Brent. 1999. An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34:71–105, February.
R. Daland and J.B. Pierrehumbert. 2010. Learning diphone-based segmentation. Cognitive Science.
Markus Dreyer, Jason R. Smith, and Jason Eisner. 2008. Latent-variable modeling of string transductions with finite-state methods. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’08, pages 1080–1089, Stroudsburg, PA, USA. Association for Computational Linguistics.
Joris Driesen, Louis ten Bosch, and Hugo Van hamme. 2009. Adaptive non-negative matrix factorization in a computational model of language acquisition. In Proceedings of Interspeech.
E. Dupoux, G. Beraud-Sudreau, and S. Sagayama. 2011. Templatic features for modeling phoneme acquisition. In Proceedings of the 33rd Annual Cognitive Science Society.
Naomi Feldman, Thomas Griffiths, and James Morgan. 2009. Learning phonetic categories by learning a lexicon. In Proceedings of the 31st Annual Conference of the Cognitive Science Society (CogSci).
Naomi Feldman. 2011. Interactions between word and speech sound categorization in language acquisition. Ph.D. thesis, Brown University.
Margaret M. Fleck. 2008. Lexicalized phonotactic word segmentation. In Proceedings of ACL-08: HLT, pages 130–138, Columbus, Ohio, June. Association for Computational Linguistics.
Sharon Goldwater and Mark Johnson. 2003. Learning OT constraint rankings using a maximum entropy model. In J. Spenader, A. Eriksson, and Osten Dahl, editors, Proceedings of the Stockholm Workshop on Variation within Optimality Theory, pages 111–120, Stockholm. Stockholm University.
Sharon Goldwater, Tom Griffiths, and Mark Johnson. 2006. Interpolating between types and tokens by estimating power-law generators. In Advances in Neural Information Processing Systems (NIPS) 18.
Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson. 2009. A Bayesian framework for word segmentation: Exploring the effects of context. In 46th Annual Meeting of the ACL, pages 398–406.
Aria Haghighi and Dan Klein. 2006. Prototype-driven learning for sequence models. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 320–327, New York City, USA, June. Association for Computational Linguistics.
Bruce Hayes and Colin Wilson. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39(3):379–440.
Bruce Hayes. 2011. Introductory Phonology. John Wiley and Sons.
A. Jansen, K. Church, and H. Hermansky. 2010. Towards spoken term discovery at scale with zero resources. In Proceedings of Interspeech, pages 1676–1679.
R. Kneser and H. Ney. 1995. Improved backing-off for M-gram language modeling. In Proc. ICASSP ’95, pages 181–184, Detroit, MI, May.
B. MacWhinney. 2000. The CHILDES Project: Tools for Analyzing Talk. Vol 2: The Database. Lawrence Erlbaum Associates, Mahwah, NJ, 3rd edition.
Fergus R. McInnes and Sharon Goldwater. 2011. Unsupervised extraction of recurring words from infant-directed speech. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society.
A. S. Park and J. R. Glass. 2008. Unsupervised pattern discovery in speech. IEEE Transactions on Audio, Speech and Language Processing, 16:186–197.
Fernando Pereira, Michael Riley, and Richard Sproat. 1994. Weighted rational transductions and their application to human language processing. In HLT.
Mark A. Pitt, Laura Dilley, Keith Johnson, Scott Kiesling, William Raymond, Elizabeth Hume, and Eric Fosler-Lussier. 2007. Buckeye corpus of conversational speech (2nd release).
Okko Räsänen. 2011. A computational model of word segmentation from continuous speech using transitional probabilities of atomic acoustic events. Cognition, 120(2):28.
Anton Rytting. 2007. Preserving Subsegmental Variation in Modeling Word Segmentation (Or, the Raising of Baby Mondegreen). Ph.D. thesis, The Ohio State University.
Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jurafsky, and Christopher D. Manning. 2010. Viterbi training improves unsupervised dependency parsing. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pages 9–17, Uppsala, Sweden, July. Association for Computational Linguistics.
D. Swingley. 2005. Statistical clustering and the contents of the infant vocabulary. Cognitive Psychology, 50:86–132.
Yee Whye Teh. 2006. A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 985–992, Sydney, Australia, July. Association for Computational Linguistics.
G.K. Vallabha, J.L. McClelland, F. Pons, J.F. Werker, and S. Amano. 2007. Unsupervised learning of vowel categories from infant-directed speech. Proceedings of the National Academy of Sciences, 104(33):13273–13278.
B. Varadarajan, S. Khudanpur, and E. Dupoux. 2008. Unsupervised learning of acoustic sub-word units. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers, pages 165–168. Association for Computational Linguistics.
A. Venkataraman. 2001. A statistical model for word discovery in transcribed speech. Computational Linguistics, 27(3):351–372.