
257 acl-2011-Question Detection in Spoken Conversations Using Textual Conversations

Source: pdf

Author: Anna Margolis ; Mari Ostendorf

Abstract: We investigate the use of textual Internet conversations for detecting questions in spoken conversations. We compare the text-trained model with models trained on manually labeled, domain-matched spoken utterances with and without prosodic features. Overall, the text-trained model achieves over 90% of the performance (measured in Area Under the Curve) of the domain-matched model including prosodic features, but does especially poorly on declarative questions. We describe efforts to utilize unlabeled spoken utterances and prosodic features via domain adaptation.
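The abstract reports performance as Area Under the Curve (AUC) and compares the text-trained model against the domain-matched one. The following is a minimal sketch of one reading of that comparison, the ratio of the two AUC values. The function name `auc`, the variable names, and the toy scores and labels are all hypothetical and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    scores: classifier scores, higher means "more likely a question".
    labels: 1 for question, 0 for non-question.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not results from the paper.
labels = [1, 0, 1, 0, 1, 0]
text_trained_scores = [0.9, 0.2, 0.4, 0.5, 0.8, 0.1]
domain_matched_scores = [0.9, 0.2, 0.7, 0.5, 0.8, 0.1]

auc_text = auc(text_trained_scores, labels)
auc_matched = auc(domain_matched_scores, labels)

# Assumed interpretation: relative performance as a ratio of AUC values.
relative = auc_text / auc_matched
print(f"text-trained AUC = {auc_text:.3f}")
print(f"domain-matched AUC = {auc_matched:.3f}")
print(f"relative performance = {relative:.1%}")
```

The pairwise formulation is used here because it needs no external library and handles tied scores explicitly. For large test sets a sort-based implementation would be more efficient.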

reference text

Jeremy Ang, Yang Liu, and Elizabeth Shriberg. 2005. Automatic dialog act segmentation and classification in multiparty meetings. In Proc. Int. Conference on Acoustics, Speech, and Signal Processing.

John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 120–128, Sydney, Australia, July. Association for Computational Linguistics.

Kofi Boakye, Benoit Favre, and Dilek Hakkani-Tür. 2009. Any questions? Automatic question detection in meetings. In Proc. IEEE Workshop on Automatic Speech Recognition and Understanding.

Zhigang Chen, Guoping Hu, and Wei Jiang. 2010. Improving prosodic phrase prediction by unsupervised adaptation and syntactic features extraction. In Proc. Interspeech.

Heidi Christensen, Yoshihiko Gotoh, and Steve Renals. 2001. Punctuation annotation using statistical prosody models. In Proc. ISCA Workshop on Prosody in Speech Recognition and Understanding, pages 35–40.

Mark G. Core and James F. Allen. 1997. Coding dialogs with the DAMSL annotation scheme. In Proc. of the Working Notes of the AAAI Fall Symposium on Communicative Action in Humans and Machines, Cambridge, MA, November.

Rajdip Dhillon, Sonali Bhagat, Hannah Carvey, and Elizabeth Shriberg. 2004. Meeting recorder project: Dialog act labeling guide. Technical report, ICSI Tech. Report.

Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874, August.

Agustin Gravano, Martin Jansche, and Michiel Bacchiani. 2009. Restoring punctuation and capitalization in transcribed speech. In Proc. Int. Conference on Acoustics, Speech, and Signal Processing.

Umit Guz, Sébastien Cuendet, Dilek Hakkani-Tür, and Gokhan Tur. 2007. Co-training using prosodic and lexical information for sentence segmentation. In Proc. Interspeech.

Umit Guz, Gokhan Tur, Dilek Hakkani-Tür, and Sébastien Cuendet. 2010. Cascaded model adaptation for dialog act segmentation and tagging. Computer Speech & Language, 24(2):289–306, April.

Jing Huang and Geoffrey Zweig. 2002. Maximum entropy model for punctuation annotation from speech. In Proc. Int. Conference on Spoken Language Processing, pages 917–920.

Minwoo Jeong, Chin-Yew Lin, and Gary G. Lee. 2009. Semi-supervised speech act recognition in emails and forums. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1250–1259, Singapore, August. Association for Computational Linguistics.

Ji-Hwan Kim and Philip C. Woodland. 2003. A combined punctuation generation and speech recognition system and its performance enhancement using prosody. Speech Communication, 41(4):563–577, November.

Xin Lei. 2006. Modeling lexical tones for Mandarin large vocabulary continuous speech recognition. Ph.D. thesis, Department of Electrical Engineering, University of Washington.

Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, and Mary Harper. 2006. Enriching speech recognition with automatic detection of sentence boundaries and disfluencies. IEEE Trans. Audio, Speech, and Language Processing, 14(5):1526–1540, September.

Anna Margolis, Karen Livescu, and Mari Ostendorf. 2010. Domain adaptation with unlabeled data for dialog act tagging. In Proceedings of the 2010 Workshop on Domain Adaptation for Natural Language Processing, pages 45–52, Uppsala, Sweden, July. Association for Computational Linguistics.

Helena Moniz, Fernando Batista, Isabel Trancoso, and Ana Mata. 2011. Analysis of interrogatives in different domains. In Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues, volume 6456 of Lecture Notes in Computer Science, chapter 12, pages 134–146. Springer Berlin / Heidelberg.

Jeffrey C. Reynar and Adwait Ratnaparkhi. 1997. A maximum entropy approach to identifying sentence boundaries. In Proc. 5th Conf. on Applied Natural Language Processing, April.

Wenzhu Shen, Roger P. Yu, Frank Seide, and Ji Wu. 2009. Automatic punctuation generation for speech. In Proc. IEEE Workshop on Automatic Speech Recognition and Understanding, pages 586–589, December.

Elizabeth Shriberg, Rebecca Bates, Andreas Stolcke, Paul Taylor, Daniel Jurafsky, Klaus Ries, Noah Coccaro, Rachel Martin, Marie Meteer, and Carol Van Ess-Dykema. 1998. Can prosody aid the automatic classification of dialog acts in conversational speech? Language and Speech (Special Double Issue on Prosody and Conversation), 41(3-4):439–487.

Elizabeth Shriberg, Raj Dhillon, Sonali Bhagat, Jeremy Ang, and Hannah Carvey. 2004. The ICSI meeting recorder dialog act (MRDA) corpus. In Proc. of the 5th SIGdial Workshop on Discourse and Dialogue, pages 97–100.

Kemal Sönmez, Elizabeth Shriberg, Larry Heck, and Mitchel Weintraub. 1998. Modeling dynamic prosodic variation for speaker verification. In Proc. Int. Conference on Spoken Language Processing, pages 3189–3192.

Andreas Stolcke, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Daniel Jurafsky, Paul Taylor, Rachel Martin, Carol Van Ess-Dykema, and Marie Meteer. 2000. Dialogue act modeling for automatic tagging and recognition of conversational speech. Computational Linguistics, 26:339–373.

Dinoj Surendran and Gina-Anne Levow. 2006. Dialog act tagging with support vector machines and hidden Markov models. In Proc. Interspeech, pages 1950–1953.

Anand Venkataraman, Luciana Ferrer, Andreas Stolcke, and Elizabeth Shriberg. 2003. Training a prosody-based dialog act tagger from unlabeled data. In Proc. Int. Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 272–275, April.