129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization

Source: pdf

Authors: Lu Wang; Claire Cardie

Abstract: We address the challenge of generating natural language abstractive summaries for spoken meetings in a domain-independent fashion. We apply Multiple-Sequence Alignment to induce abstract generation templates that can be used for different domains. An Overgenerate-and-Rank strategy is utilized to produce and rank candidate abstracts. Experiments using in-domain and out-of-domain training on disparate corpora show that our system uniformly outperforms state-of-the-art supervised extract-based approaches. In addition, human judges rate our system summaries significantly higher than compared systems in fluency and overall quality.

reference text

Gabor Angeli, Percy Liang, and Dan Klein. 2010. A simple domain-independent probabilistic approach to generation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP ’10, pages 502–512, Stroudsburg, PA, USA. Association for Computational Linguistics.
Regina Barzilay and Lillian Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, NAACL ’03, pages 16–23, Stroudsburg, PA, USA. Association for Computational Linguistics.
Trung H. Bui, Matthew Frampton, John Dowding, and Stanley Peters. 2009. Extracting decisions from multi-party dialogue using directed graphical models and semantic similarity. In Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL ’09, pages 235–243, Stroudsburg, PA, USA. Association for Computational Linguistics.
Giuseppe Carenini, Gabriel Murray, and Raymond Ng. 2011. Methods for Mining and Summarizing Text Conversations. Morgan & Claypool Publishers.
Harr Chen, Edward Benson, Tahira Naseem, and Regina Barzilay. 2011. In-domain relation discovery with meta-constraints via posterior regularization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT ’11, pages 530–540, Stroudsburg, PA, USA. Association for Computational Linguistics.
Hoa T. Dang. 2005. Overview of DUC 2005. In Document Understanding Conference.
Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. 1998. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, July.
Raquel Fernández, Matthew Frampton, John Dowding, Anish Adukuzhiyil, Patrick Ehlen, and Stanley Peters. 2008. Identifying relevant phrases to summarize decisions in spoken meetings. In INTERSPEECH, pages 78–81.
Michel Galley. 2006. A skip-chain conditional random field for ranking meeting utterances by importance. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP ’06, pages 364–372, Stroudsburg, PA, USA. Association for Computational Linguistics.
David Graff. 2003. English Gigaword.
Michael Heilman and Noah A. Smith. 2010. Good question! statistical ranking for question generation. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, pages 609–617, Stroudsburg, PA, USA. Association for Computational Linguistics.
A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, and C. Wooters. 2003. The ICSI meeting corpus. Volume 1, pages I-364–I-367.
Thorsten Joachims. 1998. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the 10th European Conference on Machine Learning, ECML ’98, pages 137–142, London, UK. Springer-Verlag.
Thorsten Joachims. 1999. Advances in kernel methods. Chapter Making large-scale support vector machine learning practical, pages 169–184. MIT Press, Cambridge, MA, USA.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, ACL ’03, pages 423–430, Stroudsburg, PA, USA. Association for Computational Linguistics.
Ioannis Konstas and Mirella Lapata. 2012. Concept-to-text generation via discriminative reranking. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL ’12, pages 369–378, Stroudsburg, PA, USA. Association for Computational Linguistics.
J. R. Landis and G. G. Koch. 1977. The measurement of observer agreement for categorical data. Biometrics, 33(1):159–174.
Hui Lin and Jeff Bilmes. 2010. Multi-document summarization via budgeted maximization of submodular functions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, pages 912–920, Stroudsburg, PA, USA. Association for Computational Linguistics.
Chin-Yew Lin and Eduard Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, pages 71–78.
Fei Liu and Yang Liu. 2009. From extractive to abstractive meeting summaries: can it be done by sentence compression? In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, ACLShort ’09, pages 261–264, Stroudsburg, PA, USA. Association for Computational Linguistics.
I. McCowan, G. Lathoud, M. Lincoln, A. Lisowska, W. Post, D. Reidsma, and P. Wellner. 2005. The AMI meeting corpus. In Proceedings Measuring Behavior 2005, 5th International Conference on Methods and Techniques in Behavioral Research. L.P.J.J. Noldus, F. Grieco, L.W.S. Loijens and P.H. Zimmerman (Eds.), Wageningen: Noldus Information Technology.
Gabriel Murray, Steve Renals, and Jean Carletta. 2005. Extractive summarization of meeting recordings. In INTERSPEECH, pages 593–596.
Gabriel Murray, Giuseppe Carenini, and Raymond Ng. 2010a. Interpretation and transformation for abstracting conversations. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT ’10, pages 894–902, Stroudsburg, PA, USA. Association for Computational Linguistics.
Gabriel Murray, Giuseppe Carenini, and Raymond T. Ng. 2010b. Generating and validating abstracts of meeting conversations: a user study. In INLG.
S. B. Needleman and C. D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3):443–453, March.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, pages 311–318, Stroudsburg, PA, USA. Association for Computational Linguistics.
Ehud Reiter and Robert Dale. 2000. Building natural language generation systems. Cambridge University Press, New York, NY, USA.
Korbinian Riedhammer, Benoit Favre, and Dilek Hakkani-Tür. 2010. Long story short - global unsupervised models for keyphrase based meeting summarization. Speech Commun., 52(10):801–815, October.
Oana Sandu, Giuseppe Carenini, Gabriel Murray, and Raymond Ng. 2010. Domain adaptation to summarize human conversations. In Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing, DANLP 2010, pages 16–22, Stroudsburg, PA, USA. Association for Computational Linguistics.
Alex J. Smola and Bernhard Schölkopf. 2004. A tutorial on support vector regression. Statistics and Computing, 14(3):199–222, August.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proceedings of ICSLP, volume 2, pages 901–904, Denver, USA.
Marilyn A. Walker, Owen Rambow, and Monica Rogati. 2001. Spot: a trainable sentence planner. In Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies, NAACL ’01, pages 1–8, Stroudsburg, PA, USA. Association for Computational Linguistics.
Lu Wang and Claire Cardie. 2011. Summarizing decisions in spoken meetings. In Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages, WASDGML ’11, pages 16–24, Stroudsburg, PA, USA. Association for Computational Linguistics.
Lu Wang and Claire Cardie. 2012. Focused meeting summarization via unsupervised relation extraction. In Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL ’12, pages 304–313, Stroudsburg, PA, USA. Association for Computational Linguistics.
Lusheng Wang and Tao Jiang. 1994. On the complexity of multiple sequence alignment. Journal of Computational Biology, 1(4):337–348.
Shasha Xie, Yang Liu, and Hui Lin. 2008. Evaluating the effectiveness of features and sampling in extractive meeting summarization. In Proc. of IEEE Spoken Language Technology (SLT).