
61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model

Source: pdf

Author: Peng Li ; Yinglin Wang ; Wei Gao ; Jing Jiang

Abstract: In this paper, we propose a novel approach to the automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster, and Integer Linear Programming to select sentences. Key features of our method include the automatic grouping of semantically related sentences and sentence ranking based on an extension of the random walk model. We also implement a new sentence compression algorithm that uses dependency trees instead of parse trees. We compare our method with four baseline methods. Quantitative evaluation based on the ROUGE metric demonstrates the effectiveness and advantages of our method.
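The ranking step named in the abstract builds on LexRank, a random walk over a sentence-similarity graph. The sketch below illustrates only the basic, unextended LexRank idea on whitespace-tokenised sentences; it is not the authors' extended algorithm. The function names `cosine` and `lexrank`, the threshold and damping values, and the choice to omit self-links are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, iters=100, tol=1e-8):
    """Score sentences by power iteration on a thresholded similarity graph.

    Hypothetical helper: parameter defaults are illustrative assumptions.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    if n == 0:
        return []
    # Link two distinct sentences when their similarity exceeds the threshold
    # (self-links are left out here for simplicity).
    adj = [
        [1.0 if i != j and cosine(bags[i], bags[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise into transition probabilities; a sentence with no
    # neighbours falls back to a uniform row so the walk stays well defined.
    for i, row in enumerate(adj):
        total = sum(row)
        adj[i] = [x / total for x in row] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [
            (1 - damping) / n
            + damping * sum(scores[i] * adj[i][j] for i in range(n))
            for j in range(n)
        ]
        converged = max(abs(x - y) for x, y in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores
```

In a pipeline like the one described, such scores would be computed per aspect cluster, and a separate selection step would then choose sentences under a length budget.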

reference text

Chaitanya Chemudugunta, Padhraic Smyth, and Mark Steyvers. 2007. Modeling general and specific aspects of documents with a probabilistic topic model. In Advances in Neural Information Processing Systems 19, pages 241–248.

Hal Daumé III and Daniel Marcu. 2006. Bayesian query-focused summarization. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, pages 305–312. Association for Computational Linguistics.

Günes Erkan and Dragomir Radev. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22(1):457–479.

K. Filippova and M. Strube. 2008. Dependency tree based sentence compression. In Proceedings of the Fifth International Natural Language Generation Conference, pages 25–32. Association for Computational Linguistics.

Dan Gillick and Benoit Favre. 2009. A scalable global model for summarization. In Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, pages 10–18.

Dan Gillick, Benoit Favre, D. Hakkani-Tur, B. Bohnet, Y. Liu, and S. Xie. 2010. The ICSI/UTD summarization system at TAC 2009. In Proceedings of the Second Text Analysis Conference, Gaithersburg, Maryland, USA: National Institute of Standards and Technology.

Thomas L. Griffiths and Mark Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl. 1):5228–5235.

A. Haghighi and L. Vanderwende. 2009. Exploring content models for multi-document summarization. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics on ZZZ, pages 362–370. Association for Computational Linguistics.

Peng Li, Jing Jiang, and Yinglin Wang. 2010. Generating templates of entity summaries with an entity-aspect model and pattern mining. In Proceedings of the Joint Conference of the 48th Annual Meeting of the ACL. Association for Computational Linguistics.

C.Y. Lin and E. Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, pages 71–78. Association for Computational Linguistics.

Ryan McDonald. 2007. A study of global inference algorithms in multi-document summarization. Advances in Information Retrieval, pages 557–564.

A. Nenkova and L. Vanderwende. 2005. The impact of frequency on summarization. Microsoft Research, Redmond, Washington, Tech. Rep. MSR-TR-2005-101.

Michael J. Paul and Roxana Girju. 2010. A two-dimensional topic-aspect model for discovering multifaceted topics. In AAAI-2010: Twenty-Fourth Conference on Artificial Intelligence.

Michael J. Paul, ChengXiang Zhai, and Roxana Girju. 2010. Summarizing contrastive viewpoints in opinionated text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP ’10, pages 66–76, Morristown, NJ, USA. Association for Computational Linguistics.

Christina Sauper and Regina Barzilay. 2009. Automatically generating Wikipedia articles: A structure-aware approach. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 208–216, Suntec, Singapore, August. Association for Computational Linguistics.

Ivan Titov and Ryan McDonald. 2008. Modeling online reviews with multi-grain topic models. In Proceeding of the 17th International Conference on World Wide Web, pages 111–120.

D. Zajic, B.J. Dorr, J. Lin, and R. Schwartz. 2007. Multicandidate reduction: Sentence compression as a tool for document summarization tasks. Information Processing & Management, 43(6):1549–1570.

Jin Zhang, Xueqi Cheng, and Hongbo Xu. 2008. GSPSummary: a graph-based sub-topic partition algorithm for summarization. In Proceedings of the 4th Asia information retrieval conference on Information retrieval technology, pages 321–334. Springer-Verlag.