emnlp emnlp2013 emnlp2013-100 emnlp2013-100-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hiroshi Noji; Daichi Mochihashi; Yusuke Miyao
Abstract: One of the language phenomena that n-gram language models fail to capture is the topic information of a given situation. We advance the previous study of the Bayesian topic language model by Wallach (2006) in two directions: first, we investigate new priors that alleviate the sparseness problem caused by dividing all n-grams into exclusive topics; second, we develop a novel Gibbs sampler that can move multiple n-grams across different documents to another topic in a single step. Our blocked sampler can efficiently explore higher-probability regions of the sample space even with higher-order n-grams. In terms of modeling assumptions, we found it effective to assign topics to only some parts of a document.
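The blocked sampler described above generalizes the standard token-level collapsed Gibbs sampling used in topic models such as LDA (Blei et al., 2003; Griffiths and Steyvers, 2004). As a minimal sketch only, and not the paper's blocked n-gram sampler, the following Python illustrates the per-token resampling step that blocked moves extend from single tokens to groups of n-grams; the function and variable names here are illustrative assumptions:

    import numpy as np

    def collapsed_gibbs_lda(docs, V, K, iters=200, alpha=0.1, beta=0.01, seed=0):
        # Collapsed Gibbs sampling for a unigram topic model (LDA):
        # resample one token's topic at a time from its full conditional.
        rng = np.random.default_rng(seed)
        D = len(docs)
        ndk = np.zeros((D, K))   # topic counts per document
        nkw = np.zeros((K, V))   # word counts per topic
        nk = np.zeros(K)         # total tokens per topic
        z = [rng.integers(K, size=len(doc)) for doc in docs]  # random init
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1
        for _ in range(iters):
            for d, doc in enumerate(docs):
                for i, w in enumerate(doc):
                    k = z[d][i]
                    # Remove the token's current assignment from the counts ...
                    ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                    # ... and resample from the collapsed conditional,
                    # p(k) proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
                    p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                    k = rng.choice(K, p=p / p.sum())
                    z[d][i] = k
                    ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
        return z, ndk, nkw

Here docs is a list of documents, each a list of integer word ids in [0, V). A blocked move, by contrast, reassigns a whole set of tokens (e.g., all occurrences of an n-gram across documents) to a new topic jointly, which is what lets the sampler escape the local modes that per-token moves get stuck in with higher-order n-grams.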
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3:993–1022.
Phil Blunsom and Trevor Cohn. 2011. A hierarchical Pitman-Yor process HMM for unsupervised part of speech induction. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 865–874, Portland, Oregon, USA, June. Association for Computational Linguistics.
Phil Blunsom, Trevor Cohn, Sharon Goldwater, and Mark Johnson. 2009. A note on the implementation of hierarchical Dirichlet processes. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 337–340, Suntec, Singapore, August. Association for Computational Linguistics.
Daniel Gildea and Thomas Hofmann. 1999. Topic-based language models using EM. In Proceedings of EUROSPEECH, pages 2167–2170.
Thomas L. Griffiths and Mark Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl 1):5228–5235.
Thomas L. Griffiths, Mark Steyvers, David M. Blei, and Joshua B. Tenenbaum. 2005. Integrating topics and syntax. In Advances in Neural Information Processing Systems 17, pages 537–544. MIT Press.
Songfang Huang and Steve Renals. 2008. Unsupervised language model adaptation based on topic and role information in multiparty meetings. In Proceedings of Interspeech 2008, pages 833–836.
F. Jelinek, B. Merialdo, S. Roukos, and M. Strauss. 1991. A dynamic language model for speech recognition. In Proceedings of the Workshop on Speech and Natural Language, HLT '91, pages 293–295, Stroudsburg, PA, USA. Association for Computational Linguistics.
Robert Lindsey, William Headden, and Michael Stipicevic. 2012. A phrase-discovering topic model using hierarchical Pitman-Yor processes. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 214–222, Jeju Island, Korea, July. Association for Computational Linguistics.
Daichi Mochihashi and Eiichiro Sumita. 2008. The infinite Markov model. In J.C. Platt, D. Koller, Y. Singer, and S. Roweis, editors, Advances in Neural Information Processing Systems 20, pages 1017–1024. MIT Press, Cambridge, MA.
Adam Pauls and Dan Klein. 2012. Large-scale syntactic language modeling with treelets. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, pages 959–968. Association for Computational Linguistics.
Yik-Cheung Tam and Tanja Schultz. 2005. Dynamic language model adaptation using variational Bayes inference. In INTERSPEECH, pages 5–8.
Yik-Cheung Tam and Tanja Schultz. 2009. Correlated bigram LSA for unsupervised language model adaptation. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21, pages 1633–1640.
Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, and David M. Blei. 2006. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581.
Yee Whye Teh. 2006a. A Bayesian interpretation of interpolated Kneser-Ney. NUS School of Computing Technical Report TRA2/06.
Yee Whye Teh. 2006b. A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 985–992, Sydney, Australia, July. Association for Computational Linguistics.
Hanna M. Wallach, Iain Murray, Ruslan Salakhutdinov, and David Mimno. 2009. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pages 1105–1112, New York, NY, USA. ACM.
Hanna M. Wallach. 2006. Topic modeling: beyond bag-of-words. In Proceedings of the 23rd International Conference on Machine Learning, ICML '06, pages 977–984.
Xuerui Wang, Andrew McCallum, and Xing Wei. 2007. Topical n-grams: Phrase and topic discovery, with an application to information retrieval. In Proceedings of the 2007 Seventh IEEE International Conference on Data Mining, ICDM '07, pages 697–702, Washington, DC, USA. IEEE Computer Society.
Frank Wood and Yee Whye Teh. 2009. A hierarchical nonparametric Bayesian approach to statistical language model domain adaptation. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 12.