emnlp emnlp2013 emnlp2013-133 emnlp2013-133-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: James Foulds; Padhraic Smyth
Abstract: When reviewing scientific literature, it would be useful to have automatic tools that identify the most influential scientific articles as well as how ideas propagate between articles. In this context, this paper introduces topical influence, a quantitative measure of the extent to which an article tends to spread its topics to the articles that cite it. Given the text of the articles and their citation graph, we show how to learn a probabilistic model to recover both the degree of topical influence of each article and the influence relationships between articles. Experimental results on corpora from two well-known computer science conferences are used to illustrate and validate the proposed approach.
[A’Hearn2004] B. A’Hearn. 2004. A restricted maximum likelihood estimator for truncated height samples. Economics & Human Biology, 2(1):5–19.
[Blei et al.2003] D.M. Blei, A.Y. Ng, and M.I. Jordan. 2003. Latent Dirichlet allocation. The Journal of Machine Learning Research, 3:993–1022.
[Brin and Page1998] S. Brin and L. Page. 1998. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(17):107–117.
[Chang and Blei2009] J. Chang and D. Blei. 2009. Relational topic models for document networks. In Artificial Intelligence and Statistics, pages 81–88.
[Cohn and Hofmann2001] D. Cohn and T. Hofmann. 2001. The missing link - a probabilistic model of document content and hypertext connectivity. In Advances in Neural Information Processing Systems, pages 430–436.
[Dietz et al.2007] L. Dietz, S. Bickel, and T. Scheffer. 2007. Unsupervised prediction of citation influences. In Proceedings of the 24th International Conference on Machine Learning, pages 233–240.
[Gerrish and Blei2010] S. Gerrish and D.M. Blei. 2010. A language-based approach to measuring scholarly impact. In Proceedings of the 26th International Conference on Machine Learning, pages 375–382.
[Griffiths and Steyvers2004] T.L. Griffiths and M. Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl 1):5228.
[He et al.2009] Q. He, B. Chen, J. Pei, B. Qiu, P. Mitra, and L. Giles. 2009. Detecting topic evolution in scientific literature: how can citations help? In Proceedings of the 18th ACM Conference on Information and Knowledge Management, pages 957–966. ACM.
[Le Cun et al.1990] B.B. Le Cun, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, and L.D. Jackel. 1990. Handwritten digit recognition with a back-propagation network. In Advances in Neural Information Processing Systems, pages 396–404.
[Lin2008] J. Lin. 2008. PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval. BMC Bioinformatics, 9(1):270.
[Mimno and McCallum2008] D. Mimno and A. McCallum. 2008. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. In Uncertainty in Artificial Intelligence, pages 411–418.
[Nallapati et al.2011] R. Nallapati, D. McFarland, and C. Manning. 2011. Topicflow model: Unsupervised learning of topic-specific influences of hyperlinked documents. In International Conference on Artificial Intelligence and Statistics, pages 543–551.
[Neal2001] R.M. Neal. 2001. Annealed importance sampling. Statistics and Computing, 11(2):125–139.
[Radev et al.2009] D.R. Radev, P. Muthukrishnan, and V. Qazvinian. 2009. The ACL anthology network corpus. In Proceedings, ACL Workshop on Natural Language Processing and Information Retrieval for Digital Libraries, pages 54–61, Singapore.
[Shaparenko and Joachims2009] B. Shaparenko and T. Joachims. 2009. Identifying the original contribution of a document via language modeling. In Machine Learning and Knowledge Discovery in Databases, pages 350–365. Springer.
[Teufel et al.2006] S. Teufel, A. Siddharthan, and D. Tidhar. 2006. Automatic classification of citation function. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 103–110. Association for Computational Linguistics.
[Wallach et al.2009] H.M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. 2009. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1105–1112. ACM.
[Wallach2006] H.M. Wallach. 2006. Topic modeling: beyond bag-of-words. In Proceedings of the 23rd International Conference on Machine Learning, pages 977–984. ACM.
[Ziman1968] J.M. Ziman. 1968. Public knowledge: an essay concerning the social dimension of science. Cambridge University Press.