acl acl2011 acl2011-98 acl2011-98-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Asli Celikyilmaz ; Dilek Hakkani-Tur
Abstract: Extractive methods for multi-document summarization are mainly governed by information overlap, coherence, and content constraints. We present an unsupervised probabilistic approach to model the hidden abstract concepts across documents as well as the correlation between these concepts, to generate topically coherent and non-redundant summaries. Based on human evaluations our models generate summaries with higher linguistic quality in terms of coherence, readability, and redundancy compared to benchmark systems. Although our system is unsupervised and optimized for topical coherence, we achieve a 44.1 ROUGE on the DUC-07 test set, roughly in the range of state-of-the-art supervised models.
R. Barzilay and L. Lee. 2004. Catching the drift: Probabilistic content models with applications to generation and summarization. In Proc. HLT-NAACL’04.
R. Barzilay, K.R. McKeown, and M. Elhadad. 1999. Information fusion in the context of multi-document summarization. Proc. 37th ACL, pages 550–557.
D. Blei, A. Ng, and M. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research.
D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. 2004. Hierarchical topic models and the nested Chinese restaurant process. In Neural Information Processing Systems [NIPS].
A. Celikyilmaz and D. Hakkani-Tur. 2010. A hybrid hierarchical model for multi-document summarization. Proc. 48th ACL 2010.
D. Chen, J. Tang, L. Yao, J. Li, and L. Zhou. 2000. Query-focused summarization by combining topic model and affinity propagation. LNCS – Advances in Data and Web Development.
J. Conroy, H. Schlesinger, and D. O'Leary. 2006. Topic-focused multi-document summarization using an approximate oracle score. Proc. ACL.
H. Daumé III and D. Marcu. 2006. Bayesian query focused summarization. Proc. ACL-06.
J. Eisenstein and R. Barzilay. 2008. Bayesian unsupervised topic segmentation. Proc. EMNLP-SIGDAT.
A. Haghighi and L. Vanderwende. 2009. Exploring content models for multi-document summarization. NAACL HLT-09.
S. Harabagiu, A. Hickl, and F. Lacatusu. 2007. Satisfying information needs with multi-document summaries. Information Processing and Management.
W. Li and A. McCallum. 2006. Pachinko allocation: DAG-structured mixture models of topic correlations. Proc. ICML.
W. Li, D. Blei, and A. McCallum. 2007. Nonparametric Bayes pachinko allocation. The 23rd Conference on Uncertainty in Artificial Intelligence.
C.Y. Lin and E. Hovy. 2002. The automated acquisition of topic signatures for text summarization. Proc. CoLing.
G. A. Miller. 1995. WordNet: A lexical database for English. ACM, Vol. 38, No. 11: 39-41.
D. Mimno, W. Li, and A. McCallum. 2007. Mixtures of hierarchical topics with pachinko allocation. Proc. ICML.
A. Nenkova and L. Vanderwende. 2005a. Document summarization using conditional random fields. Technical report, Microsoft Research.
A. Nenkova and L. Vanderwende. 2005b. The impact of frequency on summarization. Technical report, Microsoft Research.
A. Nenkova, L. Vanderwende, and K. McKeown. 2006. A compositional context sensitive multi-document summarizer. Proc. SIGIR.
D. R. Radev. 2004. LexRank: Graph-based centrality as salience in text summarization. Jrnl. Artificial Intelligence Research.
M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth. 2004. The author-topic model for authors and documents. UAI.
J. Tang, L. Yao, and D. Chen. 2009. Multi-topic based query-oriented summarization. SIAM International Conference Data Mining.
K. Toutanova, C. Brockett, M. Gamon, J. Jagarlamudi, H. Suzuki, and L. Vanderwende. 2007. The PYTHY summarization system: Microsoft Research at DUC 2007. In Proc. DUC.
H. Wallach. 2006. Topic modeling: Beyond bag-of-words. Proc. ICML 2006.
X. Wan and J. Yang. 2006. Improved affinity graph based multi-document summarization. HLT-NAACL.
D. Wang, S. Zhu, T. Li, and Y. Gong. 2009. Multi-document summarization using sentence-based topic models. Proc. ACL 2009.