
286 nips-2010-Word Features for Latent Dirichlet Allocation


Author: James Petterson, Wray Buntine, Shravan M. Narayanamurthy, Tibério S. Caetano, Alex J. Smola

Abstract: We extend Latent Dirichlet Allocation (LDA) by explicitly allowing for the encoding of side information in the distribution over words. This results in a variety of new capabilities, such as improved estimates for infrequently occurring words, as well as the ability to leverage thesauri and dictionaries in order to boost topic cohesion within and across languages. We present experiments on multi-language topic synchronisation where dictionary information is used to bias corresponding words towards similar topics. Results indicate that our model substantially improves topic cohesion when compared to the standard LDA model.

reference text

[1] David Andrzejewski, Xiaojin Zhu, and Mark Craven. Incorporating domain knowledge into topic modeling via Dirichlet Forest priors. In ICML, pages 1–8. ACM Press, 2009.

[2] C. Antoniak. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics, 2:1152–1174, 1974.

[3] David M. Blei and John D. Lafferty. Dynamic topic models. In W. W. Cohen and A. Moore, editors, ICML, volume 148, pages 113–120. ACM, 2006.

[4] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.

[5] Jordan Boyd-Graber, David Blei, and Xiaojin Zhu. A Topic Model for Word Sense Disambiguation. In EMNLP-CoNLL, pages 1024–1033, 2007.

[6] Jordan Boyd-Graber and David M. Blei. Multilingual topic models for unaligned text. In Proceedings of the 25th Conference in Uncertainty in Artificial Intelligence (UAI), 2009.

[7] Jonathan Chang, Jordan Boyd-Graber, Sean Gerrish, Chong Wang, and David Blei. Reading tea leaves: How humans interpret topic models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, NIPS, pages 288–296. 2009.

[8] Thomas L. Griffiths and Mark Steyvers. Finding scientific topics. Proceedings of the National Academy of Sciences, 101:5228–5235, 2004.

[9] Woosung Kim and Sanjeev Khudanpur. Lexical triggers and latent semantic analysis for crosslingual language model adaptation. ACM Transactions on Asian Language Information Processing, 3, 2004.

[10] T.B. Kirkpatrick, A.B. Côté, J. DeNero, and Dan Klein. Painless Unsupervised Learning with Features. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2010.

[11] Philipp Koehn. Europarl: A parallel corpus for statistical machine translation. In Machine Translation Summit X, pages 79–86, 2005.

[12] Dong C. Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(3):503–528, 1989.

[13] David Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, and Andrew McCallum. Polylingual topic models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 880–889, Singapore, August 2009. ACL.

[14] David M. Mimno and Andrew McCallum. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In D. A. McAllester and P. Myllymäki, editors, UAI, Proceedings of the 24th Conference in Uncertainty in Artificial Intelligence, pages 411–418. AUAI Press, 2008.

[15] Xiaochuan Ni, Jian-Tao Sun, Jian Hu, and Zheng Chen. Mining multilingual topics from Wikipedia. In 18th International World Wide Web Conference, pages 1155–1155, April 2009.

[16] Patrick Pantel and Dekang Lin. Discovering word senses from text. In David Hand, Daniel Keim, and Raymond Ng, editors, Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 613–619, New York, July 2002. ACM Press.

[17] Noah A. Smith and Shay B. Cohen. The Shared Logistic Normal Distribution for Grammar Induction. In NIPS Workshop on Speech and Language: Unsupervised Latent-Variable Models, pages 1–4, 2008.

[18] S. V. N. Vishwanathan and A. J. Smola. Fast kernels for string and tree matching. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems 15, pages 569–576. MIT Press, Cambridge, MA, 2003.

[19] Limin Yao, David Mimno, and Andrew McCallum. Efficient methods for topic model inference on streaming document collections. In KDD’09, 2009.

[20] Bing Zhao and Eric P. Xing. BiTAM: Bilingual Topic AdMixture Models for Word Alignment. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics (ACL’06), 2006.