
113 acl-2012-INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis


Source: pdf

Author: Timo Baumann ; David Schlangen

Abstract: We present a component for incremental speech synthesis (iSS) and a set of applications that demonstrate its capabilities. This component can be used to increase the responsivity and naturalness of spoken interactive systems. While iSS can show its full strength in systems that generate output incrementally, we also discuss how even otherwise unchanged systems may profit from its capabilities.


reference text

Timo Baumann and David Schlangen. 2011. Predicting the Micro-Timing of User Input for an Incremental Spoken Dialogue System that Completes a User’s Ongoing Turn. In Proceedings of SigDial 2011, pages 120–129, Portland, USA, June.
Timo Baumann and David Schlangen. 2012. The INPROTK 2012 release. In Proceedings of SDCTD. To appear.
Herbert H. Clark. 1996. Using Language. Cambridge University Press.
Thierry Dutoit, Maria Astrinaki, Onur Babacan, Nicolas d’Alessandro, and Benjamin Picart. 2011. pHTS for Max/MSP: A Streaming Architecture for Statistical Parametric Speech Synthesis. Technical Report 1, numediart Research Program on Digital Art Technologies, March.
Jens Edlund. 2008. Incremental speech synthesis. In Second Swedish Language Technology Conference, pages 53–54, Stockholm, Sweden, November. System Demonstration.
Wolfgang Finkler. 1997. Automatische Selbstkorrektur bei der inkrementellen Generierung gesprochener Sprache unter Realzeitbedingungen. Dissertationen zur Künstlichen Intelligenz. infix Verlag.
Markus Guhe. 2007. Incremental Conceptualization for Language Production. Lawrence Erlbaum Asso., Inc., Mahwah, USA.
Anne Kilger and Wolfgang Finkler. 1995. Incremental Generation for Real-time Applications. Technical Report RR-95-11, DFKI, Saarbrücken, Germany.
William J.M. Levelt. 1989. Speaking: From Intention to Articulation. MIT Press.
Kyoko Matsuyama, Kazunori Komatani, Ryu Takeda, Toru Takahashi, Tetsuya Ogata, and Hiroshi G. Okuno. 2010. Analyzing User Utterances in Barge-in-able Spoken Dialogue System for Improving Identification Accuracy. In Proceedings of Interspeech, pages 3050–3053, Makuhari, Japan, September.
Michael McTear. 2002. Spoken Dialogue Technology. Toward the Conversational User-Interface. Springer, London, UK.
David Schlangen and Gabriel Skantze. 2009. A General, Abstract Model of Incremental Dialogue Processing. In Proceedings of the EACL, Athens, Greece.
David Schlangen, Timo Baumann, Hendrik Buschmeier, Okko Buß, Stefan Kopp, Gabriel Skantze, and Ramin Yaghoubzadeh. 2010. Middleware for Incremental Processing in Conversational Agents. In Proceedings of SigDial 2010, pages 51–54, Tokyo, Japan, September.
Marc Schröder and Jürgen Trouvain. 2003. The German Text-to-Speech Synthesis System MARY: A Tool for Research, Development and Teaching. International Journal of Speech Technology, 6(3):365–377, October.
Gabriel Skantze and Anna Hjalmarsson. 2010. Towards incremental speech generation in dialogue systems. In Proceedings of SigDial 2010, pages 1–8, Tokyo, Japan, September.
Gabriel Skantze and David Schlangen. 2009. Incremental dialogue processing in a micro-domain. In Proceedings of EACL 2009, Athens, Greece, April.
Paul Taylor. 2009. Text-to-Speech Synthesis. Cambridge Univ Press, Cambridge, UK.
Tomoki Toda and Keiichi Tokuda. 2007. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-based Speech Synthesis. IEICE Transactions on Information and Systems, 90(5):816–824.
Keiichi Tokuda, Takayoshi Yoshimura, Takashi Masuko, Takao Kobayashi, and Tadashi Kitamura. 2000. Speech Parameter Generation Algorithms for HMM-based Speech Synthesis. In Proceedings of ICASSP 2000, pages 1315–1318, Istanbul, Turkey.