
A Two-step Approach to Sentence Compression of Spoken Utterances (ACL 2012)


Author: Dong Wang ; Xian Qian ; Yang Liu

Abstract: This paper presents a two-step approach to compress spontaneous spoken utterances. In the first step, we use a sequence labeling method to determine if a word in the utterance can be removed, and generate n-best compressed sentences. In the second step, we use a discriminative training approach to capture sentence level global information from the candidates and rerank them. For evaluation, we compare our system output with multiple human references. Our results show that the new features we introduced in the first compression step improve performance upon the previous work on the same data set, and reranking is able to yield additional gain, especially when training is performed to take into account multiple references.
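The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes independent per-word keep probabilities and enumerates every labeling exhaustively, where the paper uses a sequence labeling model, and it replaces the discriminatively trained reranker with an arbitrary sentence-level scoring function. The names `n_best_compressions`, `rerank`, `keep_probs`, `sentence_score`, and `weight` are all hypothetical.

```python
from itertools import product
from math import log


def n_best_compressions(words, keep_probs, n=3):
    """Step 1 (toy): return the n highest-scoring word-deletion compressions.

    keep_probs[i] is an assumed probability, strictly between 0 and 1,
    that words[i] is kept. Labelings are enumerated exhaustively, which
    is only practical for short utterances.
    """
    candidates = []
    for labels in product([True, False], repeat=len(words)):
        if not any(labels):
            continue  # skip the empty compression
        # Log-probability of this keep/delete labeling under independence.
        score = sum(
            log(p if keep else 1.0 - p) for keep, p in zip(labels, keep_probs)
        )
        kept = [w for w, keep in zip(words, labels) if keep]
        candidates.append((score, kept))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:n]


def rerank(candidates, sentence_score, weight=1.0):
    """Step 2 (toy): pick the candidate with the best combined score.

    sentence_score is a hypothetical function that scores a whole
    candidate sentence; it stands in for global, sentence-level features.
    """
    best = max(candidates, key=lambda c: c[0] + weight * sentence_score(c[1]))
    return best[1]


words = ["um", "i", "think", "it", "works"]
keep_probs = [0.1, 0.9, 0.6, 0.9, 0.9]  # made-up values for illustration
top2 = n_best_compressions(words, keep_probs, n=2)
# A toy global feature that prefers shorter output.
print(rerank(top2, lambda kept: -len(kept)))
```

With these made-up probabilities, the first step ranks "i think it works" above "i it works"; the toy length feature then reverses that order. This shows how a sentence-level score can override the word-level ranking, which is the role reranking plays in the approach.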

reference text

Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and maxent discriminative reranking. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 173–180, Stroudsburg, PA, USA. Proceedings of ACL.

James Clarke and Mirella Lapata. 2008. Global inference for sentence compression: An integer linear programming approach. Journal of Artificial Intelligence Research, 31:399–429, March.

Trevor Cohn and Mirella Lapata. 2008. Sentence compression beyond word deletion. In Proceedings of COLING.

Michel Galley and Kathleen R. McKeown. 2007. Lexicalized Markov grammars for sentence compression. In Proceedings of HLT-NAACL.

Kevin Knight and Daniel Marcu. 2000. Statistics-based summarization - step one: Sentence compression. In Proceedings of AAAI.

Fei Liu and Yang Liu. 2009. From extractive to abstractive meeting summaries: Can it be done by sentence compression? In Proceedings of the ACL-IJCNLP.

Fei Liu and Yang Liu. 2010. Using spoken utterance compression for meeting summarization: A pilot study. In Proceedings of SLT.

Gabriel Murray, Steve Renals, and Jean Carletta. 2005. Extractive summarization of meeting recordings. In Proceedings of EUROSPEECH.

Courtney Napoles, Benjamin Van Durme, and Chris Callison-Burch. 2011. Evaluating sentence compression: Pitfalls and suggested remedies. In Proceedings of the Workshop on Monolingual Text-To-Text Generation, pages 91–97, Portland, Oregon, June. Association for Computational Linguistics.

Wen Wang, A. Stolcke, and Jing Zheng. 2007. Reranking machine translation hypotheses with structured and web-based language models. In Proceedings of IEEE Workshop on Speech Recognition and Understanding, pages 159–164, Kyoto.

Simon Zwarts and Mark Johnson. 2011. The impact of language models and loss functions on repair disfluency detection. In Proceedings of ACL.