acl acl2010 acl2010-187 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Verena Rieser ; Oliver Lemon ; Xingkun Liu
Abstract: We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods. We show that the optimised policies significantly outperform the baselines in a variety of generation scenarios: while the supervised model is able to attain up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. The RL policies perform especially well in more complex scenarios. We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. This provides new insights into the nature of the IP problem for SDS.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. [sent-10, score-0.274]
2 While the supervised model attains up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. [sent-14, score-0.261]
3 We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. [sent-19, score-0.446]
4 1 Introduction Work on evaluating SDS suggests that the Information Presentation (IP) phase is the primary contributor to dialogue duration (Walker et al. [sent-21, score-0.22]
5 An inherent problem in this task is the trade-off between presenting “enough” information to the user (for example helping them to feel confident that they have a good overview of the search results) versus keeping the utterances short and understandable. [sent-24, score-0.319]
6 A similar approach has been applied to the problem of Referring Expression Generation in dialogue (Janarthanam and Lemon, 2010). [sent-26, score-0.22]
7 It draws the user’s attention to relevant attributes by grouping the current results from the database into clusters, e. [sent-31, score-0.29]
8 This method employs utility-based attribute selection with respect to how each attribute (e. [sent-47, score-0.292]
9 Related work explores a user modelling approach, where attributes are ranked according to user preferences (Demberg and Moore, 2006; Winterboer et al. [sent-52, score-0.767]
10 combinations of Information Presentation strategies as well as attribute selection), and to show the utility of both lower-level features (e. [sent-57, score-0.274]
11 how many attributes to generate, or when to use a SUMMARY), using a pipeline model for SDS with DM features as input, and where NLG has no knowledge of lower level features (e. [sent-64, score-0.283]
12 In the following we use Reinforcement Learning (RL) as a statistical planning framework (Sutton and Barto, 1998) to explore the contextual features for making these decisions, and propose a new joint optimisation method for IP strategies combining content structuring and attribute selection. [sent-69, score-0.457]
13 The IP module has to decide which action to take next, how many attributes to mention, and when to stop generating. [sent-80, score-0.288]
14 They were instructed to select IP structures and attributes for NLG so as to most efficiently allow users to find a restaurant matching their search constraints. [sent-83, score-0.302]
15 2 for a list of IP strategies to choose from), which attributes to mention (e. [sent-86, score-0.298]
16 cuisine, price range, location, food quality, and/or service quality), and whether to stop generating, given varying numbers of database matches, varying prompt realisations, and varying user behaviour. [sent-88, score-0.422]
17 The user speech input was delivered to the wizard using Voice Over IP. [sent-90, score-0.471]
18 Each user performed a total of 12 tasks, where no task set was seen twice by any one wizard. [sent-95, score-0.289]
19 [A:] The wizard selects attribute values as specified by the user’s query. [sent-97, score-0.339]
20 [C:] The wizard then chooses which strategy and which attributes to generate next, by clicking radio buttons. [sent-102, score-0.41]
21 The attribute/s specified in the last user query are pre-selected by default. [sent-103, score-0.289]
22 [D:] An utterance is automatically generated by the NLG realiser every time the wizard selects a strategy, and is displayed in an intermediate text panel. [sent-105, score-0.555]
23 [E:] The wizard can decide to add the generated utterance to the final output panel or to start over again. [sent-106, score-0.231]
24 The text in the final panel is sent to the user via TTS, once the wizard decides to stop generating. [sent-107, score-0.498]
25 Table 1 (columns: Strategy, Example utterance): Example realisations, generated when the user provided cuisine=Indian, and where the wizard has also selected the additional attribute price for presentation to the user. [sent-108, score-0.755]
26 After each task the user answered a questionnaire on a 6 point Likert scale, regarding the perceived generation quality in that task. [sent-110, score-0.338]
27 The data contains 2236 utterances in total: 1465 wizard utterances and 771 user utterances. [sent-115, score-0.531]
28 2 NLG Realiser In the Wizard-of-Oz environment we implemented a NLG realiser for the chosen IP structures and attribute choices, in order to realise the wizards’ choices in real time. [sent-123, score-0.467]
29 The length of an utterance also depends on the number of attributes chosen, i. [sent-134, score-0.238]
30 The approach using a UM assumes that the user has certain preferences (e. [sent-139, score-0.289]
31 listing all the attributes for the first item and then for the other) or by Attribute (i. [sent-144, score-0.249]
32 IF ((prevNLG = summary) ∧ (dbHits <= 10)): THEN nlgStrategy = Recommend; ELSE nlgStrategy = summary; Figure 3: Rules learned by JRip for the wizard model (‘dbHits’ = number of database matches, ‘prevNLG’ = previous NLG action) The features selected by this model were only “high-level” features, i. [sent-167, score-0.33]
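A minimal sketch of the supervised wizard baseline expressed as code. Because the Figure 3 text above is reconstructed from garbled extraction, the exact condition (prevNLG = summary and dbHits <= 10) and the threshold should be treated as assumptions rather than a verified transcription of the learned JRip rules.

```python
# Sketch of the supervised wizard baseline as an explicit rule. The condition
# and threshold below are reconstructed from the garbled Figure 3 text and
# should be read as assumptions, not a verified transcription.

def wizard_baseline(db_hits: int, prev_nlg: str) -> str:
    """Return the next IP strategy from high-level dialogue features."""
    if prev_nlg == "summary" and db_hits <= 10:
        return "recommend"  # few matches remain after a SUMMARY
    return "summary"        # otherwise summarise the current matches

print(wizard_baseline(db_hits=8, prev_nlg="summary"))   # recommend
print(wizard_baseline(db_hits=80, prev_nlg="start"))    # summary
```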
33 A user simulation for NLG is very similar, in that it is a predictive model of the most likely next user act. [sent-182, score-0.673]
34 However, this NLG predicted user act does not actually change the overall dialogue state (e. [sent-183, score-0.564]
35 (Footnote 4) Similar to the internal user models applied in recent work on POMDP (Partially Observable Markov Decision Process) dialogue managers (Young et al. [sent-186, score-0.509]
36 In other words, the NLG user simulation tells us what the user is most likely to do next, if we were to stop generating now. [sent-189, score-0.7]
37 We are most interested in the following user reactions: 1. [sent-190, score-0.289]
38 select : the user chooses one of the presented items, e. [sent-191, score-0.289]
39 This reply type indicates that the Information Presentation was sufficient for the user to make a choice. [sent-195, score-0.325]
40 This reply type indicates that the user has more specific requests, which s/he wants to specify after being presented with the current information. [sent-201, score-0.325]
41 requestMoreInfo: The user asks for more information, e. [sent-203, score-0.289]
42 This reply type indicates that the system failed to present the information the user was looking for. [sent-208, score-0.325]
43 askRepeat : The user asks the system to repeat the same message again, e. [sent-210, score-0.289]
44 This reply type indicates that the utterance was either too long or confusing for the user to remember, or the TTS quality was not good enough, or both. [sent-214, score-0.374]
45 We build user simulations using n-gram models of system (s) and user (u) acts, as first introduced by (Eckert et al. [sent-220, score-0.621]
46 IP structure + attribute choice) model for predicting user reactions to the system’s combined IP structure and attribute selection decisions: P(a_{u,t} | IP_{s,t}, attributes_{s,t}). [sent-225, score-0.506]
47 (Footnote 5) Where a_{u,t} is the predicted next user action at time t, IP_{s,t} is the system’s Information Presentation action at t, and attributes_{s,t} are the attributes selected by the system at t. [sent-226, score-0.677]
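The tri-gram user simulation P(a_{u,t} | IP_{s,t}, attributes_{s,t}) can be pictured with a simple count-based sketch. This is not the authors' implementation: the class name, the uniform back-off over the four reaction types, and the toy training triples are all illustrative assumptions.

```python
# Illustrative count-based tri-gram user simulation:
# P(user act | IP structure, attribute set), estimated from WoZ-style triples.
import random
from collections import Counter, defaultdict

USER_ACTS = ["select", "addInfo", "requestMoreInfo", "askRepeat"]

class TrigramUserSimulation:
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, corpus):
        # corpus: iterable of (ip_structure, attribute_set, user_act) triples
        for ip_structure, attributes, user_act in corpus:
            self.counts[(ip_structure, frozenset(attributes))][user_act] += 1

    def sample(self, ip_structure, attributes):
        dist = self.counts[(ip_structure, frozenset(attributes))]
        if not dist:                       # unseen context: back off to uniform
            return random.choice(USER_ACTS)
        acts, freqs = zip(*dist.items())
        return random.choices(acts, weights=freqs)[0]

sim = TrigramUserSimulation()
sim.train([("recommend", {"cuisine", "price"}, "select"),
           ("summary", {"cuisine"}, "addInfo")])
print(sim.sample("recommend", {"cuisine", "price"}))
```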
48 We use the most similar user models for system training, and the most dissimilar user models for testing NLG policies, in order to test whether the learned policies are robust and adaptive to unseen dialogue contexts. [sent-240, score-0.987]
49 For example, the user’s focus after the SUMMARY (with UM) in Table 1 is DBhits = 10, since the user is only interested in cheap, Indian places. [sent-249, score-0.289]
50 This reflects the fact that good IP strategies should help the user to select an item (valueUserReaction = +100) or provide more constraints addInfo (valueUserReaction = ±0), but the user should not do anything else (valueUserReaction = −100). [sent-256, score-0.747]
51 The lowest reward is achieved when many sentences are generated in such a way that the user ends the conversation unsuccessfully. [sent-271, score-0.578]
52 The top possible reward is achieved in the rare cases where the system can immediately present 1 item to the user using just 2 sentences, and the user then selects that item, i. [sent-272, score-0.78]
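A hedged sketch of the user-reaction component of this reward: +100 when the user selects an item, 0 when they add constraints, and -100 otherwise. The per-sentence length penalty and its weight below are placeholders; the actual data-driven reward function (Equation 1) is learned from the WoZ ratings and is not reproduced here.

```python
# User-reaction component of the reward described in the text, plus a
# placeholder per-sentence length penalty (the real weights come from the
# paper's data-driven reward function, Equation 1, not shown here).

USER_REACTION_VALUE = {"select": 100, "addInfo": 0,
                       "requestMoreInfo": -100, "askRepeat": -100}

def final_reward(user_reaction: str, n_sentences: int,
                 length_penalty: float = 1.0) -> float:
    value = USER_REACTION_VALUE.get(user_reaction, -100)
    return value - length_penalty * n_sentences

# Best case described above: one item presented in just two sentences,
# after which the user selects it.
print(final_reward("select", n_sentences=2))
```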
53 the user simulation and realiser), and explicitly captures the uncertainty in the generation environment. [sent-283, score-0.467]
54 The aim of the MDP is to maximise long-term expected reward of its decisions, resulting in a policy which maps each possible state to an appropriate action in that state. [sent-286, score-0.276]
55 We treat IP as a hierarchical joint optimisation problem, where first one of the IP structures (1-3) is chosen and then the number of attributes is decided, as shown in Figure 4. [sent-287, score-0.245]
56 At each generation step, the MDP can choose 1-5 attributes (e. [sent-288, score-0.238]
57 Generation stops as soon as the user is predicted to select an item, i. [sent-291, score-0.344]
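The hierarchical decision process just described can be sketched as a generation loop: first pick an IP structure, then an attribute count (1-5), and stop once the simulated user is predicted to select an item. The toy policy and toy user simulation below are invented stand-ins for illustration, not the trained RL policy or the tri-gram simulation from the paper.

```python
# Toy version of the hierarchical generation loop: choose an IP structure,
# then an attribute count, and stop when a selection is predicted.
import random

def toy_policy(db_hits):
    # Stand-in for the learned policy: summarise while many matches remain,
    # otherwise recommend; attribute count fixed to 4 for simplicity.
    structure = "summary" if db_hits > 10 else "recommend"
    return structure, 4

def toy_user_sim(structure, n_attributes):
    # Stand-in user simulation: a RECOMMEND is more likely to trigger a select.
    p_select = 0.7 if structure == "recommend" else 0.2
    return "select" if random.random() < p_select else "addInfo"

def generate_ip_sequence(db_hits, max_steps=5):
    sequence = []
    for _ in range(max_steps):
        structure, n_attributes = toy_policy(db_hits)
        sequence.append((structure, n_attributes))
        if toy_user_sim(structure, n_attributes) == "select":
            break                       # user predicted to choose an item
        db_hits = max(1, db_hits // 2)  # assume added constraints narrow the search
    return sequence

print(generate_ip_sequence(db_hits=40))
```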
58 For attribute selection we choose a majority baseline (randomly choosing between 3 or 4 attributes) since the attribute selection models learned by Supervised Learning on the WoZ data didn’t show significant improvements. [sent-304, score-0.363]
59 For training, we used the user simulation model most similar to the data, see Section 4. [sent-305, score-0.384]
60 For testing, we use a different user simulation model (the one which is most dissimilar to the data). [sent-307, score-0.413]
61 We first investigate how well IP structure (without attribute choice) can be learned in increasingly complex generation scenarios. [sent-308, score-0.224]
62 A generation scenario is a combination of a particular kind of NLG realiser (template vs. [sent-309, score-0.418]
63 stochastic) along with different levels of variation introduced by certain features of the dialogue context. [sent-310, score-0.253]
64 In general, the stochastic realiser introduces more variation in lower level features than the template-based realiser. [sent-311, score-0.41]
65 IP structure choice, Template realiser: Predicted next user action varies according to the bi-gram model (P(a_{u,t} | IP_{s,t})); Number of sentences and attributes per IP strategy is set by defaults, reflecting a template-based realiser. [sent-316, score-0.589]
66 set by the DM); Sentence generation according to the SPaRKy stochastic realiser model as described in Section 3. [sent-321, score-0.398]
67 We then investigate different scenarios for jointly optimising IP structure (IPS) and attribute selection (Attr) decisions. [sent-323, score-0.271]
68 IPS+Attr choice, Template realiser: Predicted next user action varies according to tri-gram (P(a_{u,t} | IP_{s,t}, attributes_{s,t})) model; Number of sentences per IP structure set to default. [sent-326, score-0.361]
69 IPS+Attr choice, Template realiser+Focus model: Tri-gram user simulation with Template realiser and Focus of attention model with respect to #DBhits and #attributes as described in section 4. [sent-329, score-0.712]
70 IPS+Attr choice, Stochastic realiser: Trigram user simulation with sentence/attribute relationship according to Stochastic realiser as described in Section 3. [sent-333, score-0.683]
71 the full model = Predicted next user action varies according to tri-gram model + Focus of attention model + Sentence/attribute relationship according to stochastic realiser. [sent-339, score-0.44]
72 2 Results We compare the average final reward (see Equation 1) gained by the baseline against the trained RL policies in the different scenarios, for 1000 test runs each, using a paired samples t-test. [sent-341, score-0.297]
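The comparison described here can be sketched with a paired-samples t-test over matched test runs. The reward samples below are synthetic placeholders (expressed as percent of the possible reward, loosely around the reported 87.6% and 91.5% figures) and serve only to show the shape of the test, not to reproduce the paper's results.

```python
# Sketch of the evaluation: average final reward of baseline vs. RL policy
# over matched test runs, compared with a paired-samples t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
baseline = rng.normal(loc=87.6, scale=5.0, size=1000)   # synthetic placeholder
rl_policy = rng.normal(loc=91.5, scale=5.0, size=1000)  # synthetic placeholder

t_stat, p_value = stats.ttest_rel(rl_policy, baseline)
print(f"baseline mean = {baseline.mean():.1f}, "
      f"RL mean = {rl_policy.mean():.1f}, p = {p_value:.3g}")
```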
73 The learned RL policies show that lower level features are important in gaining significant improvements over the baseline. [sent-348, score-0.248]
74 Note that these strategies are context-dependent: the learner chooses how to proceed dependent on the context. (Footnote 7: the baseline does reasonably well in scenarios with variation introduced by only higher level features, e. [sent-351, score-0.233]
75 Table 4: RL strategies learned for the different scenarios, where (n) denotes the number of attributes generated. [sent-379, score-0.341]
76 It will then stop generating if the user is predicted to select an item. [sent-382, score-0.371]
77 If the number of database items is low, it will start with a COMPARE and then continue with a RECOMMEND, unless the user selects an item. [sent-384, score-0.437]
78 2 learns to adapt to a more complex scenario: the number of attributes requested by the DM and produced by the stochastic sentence realiser. [sent-388, score-0.239]
79 The RL policies for jointly optimising IP strategy and attribute selection learn to select the number of attributes according to the generation scenarios 2. [sent-392, score-0.665]
80 1 generates a RECOMMEND with 5 attributes if the database hits are low (< 13). [sent-396, score-0.298]
81 If the user is predicted to narrow down his focus after the SUMMARY, the policy continues with a COMPARE using 1 attribute only, otherwise it helps the user by presenting 4 attributes. [sent-398, score-0.879]
82 It then continues with RECOMMEND(5), and stops as soon as the user is predicted to select one item. [sent-399, score-0.344]
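The learned joint strategy described in this passage can be paraphrased as explicit rules, purely for illustration; the actual policy conditions on the full generation state, so the flat, step-indexed if/else below is an assumption about how its behaviour unfolds rather than the policy itself.

```python
# Rough paraphrase of the learned joint policy described above, flattened
# into step-indexed rules for illustration only.

def learned_policy_sketch(db_hits, step, focus_narrows_after_summary=False):
    if step == 0:
        if db_hits < 13:
            return ("recommend", 5)  # few matches: recommend with 5 attributes
        return ("summary", None)     # many matches: summarise first
    if step == 1:
        # After the SUMMARY: 1 attribute if the user is predicted to narrow
        # their focus, otherwise 4 attributes.
        return ("compare", 1 if focus_narrows_after_summary else 4)
    return ("recommend", 5)          # then recommend until a select is predicted

print(learned_policy_sketch(db_hits=40, step=1, focus_narrows_after_summary=True))
```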
83 the cumulative number of attributes generated in the whole NLG sequence, where the same attribute may be repeated within the sequence). [sent-404, score-0.321]
84 This strategy primarily adapts to the variations from the user simulation (tri-gram model). [sent-405, score-0.423]
85 6 Conclusion We have presented a new data-driven method for Information Presentation (IP) in Spoken Dialogue Systems using a statistical optimisation framework for content structure planning and attribute selection. [sent-428, score-0.274]
86 The WoZ data was used to build statistical models of user reactions to IP strategies, and a data-driven reward function for Reinforcement Learning (RL). [sent-431, score-0.463]
87 We compared a model of human behaviour (the ‘human wizard baseline’) against policies optimised using Reinforcement Learning, in a variety of scenarios. [sent-433, score-0.369]
88 Our optimised policies significantly outperform the IP structuring and attribute selection present in the WoZ data, especially when performing in complex generation scenarios which require adaptation to, e. [sent-434, score-0.467]
89 While the supervised model attains up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. [sent-438, score-0.261]
90 We also showed that adding predictive “lower level” features, e.g. from the NLG realiser and a user reaction model, is important for learning optimal IP strategies according to user preferences. [sent-442, score-0.986]
91 This methodology provides new insights into the nature of the IP problem, which has previously been treated as a module following dialogue management with no access to lower-level context features. [sent-448, score-0.245]
92 Learning to adapt to unknown users: Referring expression generation in spoken dialogue systems. [sent-519, score-0.361]
93 Datadriven user simulation for automated evaluation of spoken dialog systems. [sent-523, score-0.505]
94 Learning what to say and how to say it: joint optimization of spoken dialogue management and Natural Language Generation. [sent-539, score-0.337]
95 A wizard-of-oz interface to study information presentation strategies for spoken dialogue systems. [sent-543, score-0.524]
96 Learning database content for spoken dialogue system design. [sent-553, score-0.384]
97 Trainable sentence planning for complex information presentation in spoken dialog systems. [sent-583, score-0.31]
98 Quantitative and qualitative evaluation of DARPA Communicator spoken dialogue systems. [sent-607, score-0.312]
99 Fish or Fowl: A Wizard of Oz evaluation of dialogue strategies in the restaurant domain. [sent-616, score-0.376]
100 The influence of user tailoring and cognitive load on user performance in spoken dialogue systems. [sent-622, score-0.89]
wordName wordTfidf (topN-words)
[('ip', 0.352), ('realiser', 0.299), ('user', 0.289), ('nlg', 0.241), ('dialogue', 0.22), ('woz', 0.213), ('attributes', 0.189), ('wizard', 0.182), ('rl', 0.162), ('wizards', 0.156), ('rieser', 0.139), ('sds', 0.137), ('attribute', 0.132), ('oliver', 0.125), ('lemon', 0.121), ('reward', 0.117), ('policies', 0.117), ('strategies', 0.109), ('walker', 0.103), ('presentation', 0.103), ('reinforcement', 0.1), ('dbhits', 0.1), ('polifroni', 0.1), ('simulation', 0.095), ('spoken', 0.092), ('recommend', 0.092), ('policy', 0.087), ('verena', 0.087), ('planning', 0.086), ('valueuserreaction', 0.085), ('database', 0.072), ('action', 0.072), ('attr', 0.071), ('scenario', 0.07), ('users', 0.066), ('marilyn', 0.064), ('scenarios', 0.063), ('item', 0.06), ('um', 0.058), ('ips', 0.057), ('reactions', 0.057), ('sparky', 0.057), ('optimisation', 0.056), ('predicted', 0.055), ('dm', 0.052), ('items', 0.051), ('summary', 0.05), ('stochastic', 0.05), ('utterance', 0.049), ('generation', 0.049), ('optimising', 0.048), ('restaurant', 0.047), ('realisations', 0.046), ('janarthanam', 0.046), ('restaurants', 0.045), ('discounting', 0.043), ('simulations', 0.043), ('learned', 0.043), ('addinfo', 0.043), ('stent', 0.043), ('xingkun', 0.043), ('structuring', 0.041), ('tts', 0.04), ('strategy', 0.039), ('hits', 0.037), ('optimised', 0.037), ('environment', 0.036), ('reply', 0.036), ('price', 0.034), ('uncertainty', 0.034), ('mdp', 0.034), ('features', 0.033), ('behaviour', 0.033), ('edinburgh', 0.032), ('young', 0.031), ('utterances', 0.03), ('ratings', 0.029), ('attention', 0.029), ('dissimilar', 0.029), ('dialog', 0.029), ('selection', 0.028), ('asru', 0.028), ('boidin', 0.028), ('clarkson', 0.028), ('cuay', 0.028), ('cuisine', 0.028), ('dbhit', 0.028), ('gasic', 0.028), ('jrip', 0.028), ('winterboer', 0.028), ('template', 0.028), ('level', 0.028), ('stop', 0.027), ('decisions', 0.027), ('narrow', 0.027), ('gaining', 0.027), ('johanna', 0.026), ('management', 0.025), ('selects', 0.025), ('kingdom', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
Author: Verena Rieser ; Oliver Lemon ; Xingkun Liu
Abstract: We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods. We show that the optimised policies significantly outperform the baselines in a variety of generation scenarios: while the supervised model is able to attain up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. The RL policies perform especially well in more complex scenarios. We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. This provides new insights into the nature of the IP problem for SDS.
2 0.37160614 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
Author: Srinivasan Janarthanam ; Oliver Lemon
Abstract: We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to understand in technical domains where users may not know the technical ‘jargon’ names of the domain entities. In such cases, dialogue systems must be able to model the user’s (lexical) domain knowledge and use appropriate referring expressions. We present a reinforcement learning (RL) framework in which the sys- tem learns REG policies which can adapt to unknown users online. Furthermore, unlike supervised learning methods which require a large corpus of expert adaptive behaviour to train on, we show that effective adaptive policies can be learned from a small dialogue corpus of non-adaptive human-machine interaction, by using a RL framework and a statistical user simulation. We show that in comparison to adaptive hand-coded baseline policies, the learned policy performs significantly better, with an 18.6% average increase in adaptation accuracy. The best learned policy also takes less dialogue time (average 1.07 min less) than the best hand-coded policy. This is because the learned policies can adapt online to changing evidence about the user’s domain expertise.
3 0.2079792 239 acl-2010-Towards Relational POMDPs for Adaptive Dialogue Management
Author: Pierre Lison
Abstract: Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly challenging problem. Traditional approaches have focused on providing theoretical accounts for either the uncertainty or the complexity of spoken dialogue, but rarely considered the two issues simultaneously. This paper describes ongoing work on a new approach to dialogue management which attempts to fill this gap. We represent the interaction as a Partially Observable Markov Decision Process (POMDP) over a rich state space incorporating both dialogue, user, and environment models. The tractability of the resulting POMDP can be preserved using a mechanism for dynamically constraining the action space based on prior knowledge over locally relevant dialogue structures. These constraints are encoded in a small set of general rules expressed as a Markov Logic network. The first-order expressivity of Markov Logic enables us to leverage the rich relational structure of the problem and efficiently abstract over large regions ofthe state and action spaces.
4 0.20547915 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems
Author: Ethan Selfridge ; Peter Heeman
Abstract: Current turn-taking approaches for spoken dialogue systems rely on the speaker releasing the turn before the other can take it. This reliance results in restricted interactions that can lead to inefficient dialogues. In this paper we present a model we refer to as Importance-Driven Turn-Bidding that treats turn-taking as a negotiative process. Each conversant bids for the turn based on the importance of the intended utterance, and Reinforcement Learning is used to indirectly learn this parameter. We find that Importance-Driven Turn-Bidding performs better than two current turntaking approaches in an artificial collaborative slot-filling domain. The negotiative nature of this model creates efficient dia- logues, and supports the improvement of mixed-initiative interaction.
5 0.16425937 194 acl-2010-Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
Author: Francois Mairesse ; Milica Gasic ; Filip Jurcicek ; Simon Keizer ; Blaise Thomson ; Kai Yu ; Steve Young
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation perfor- mance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
6 0.14711928 82 acl-2010-Demonstration of a Prototype for a Conversational Companion for Reminiscing about Images
7 0.12445407 47 acl-2010-Beetle II: A System for Tutoring and Computational Linguistics Experimentation
8 0.12236305 227 acl-2010-The Impact of Interpretation Problems on Tutorial Dialogue
9 0.12113605 202 acl-2010-Reading between the Lines: Learning to Map High-Level Instructions to Commands
10 0.10281092 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
11 0.10258164 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
12 0.10135176 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach
13 0.094250277 178 acl-2010-Non-Cooperation in Dialogue
14 0.089220092 35 acl-2010-Automated Planning for Situated Natural Language Generation
15 0.088139176 168 acl-2010-Learning to Follow Navigational Directions
16 0.080303796 199 acl-2010-Preferences versus Adaptation during Referring Expression Generation
17 0.073723115 58 acl-2010-Classification of Feedback Expressions in Multimodal Data
18 0.068425119 209 acl-2010-Sentiment Learning on Product Reviews via Sentiment Ontology Tree
19 0.067984663 179 acl-2010-Now, Where Was I? Resumption Strategies for an In-Vehicle Dialogue System
20 0.06734661 224 acl-2010-Talking NPCs in a Virtual Game World
topicId topicWeight
[(0, -0.156), (1, 0.099), (2, -0.119), (3, -0.228), (4, -0.064), (5, -0.292), (6, -0.22), (7, 0.084), (8, -0.046), (9, 0.002), (10, 0.057), (11, -0.102), (12, -0.015), (13, -0.047), (14, 0.062), (15, -0.141), (16, 0.102), (17, 0.048), (18, -0.1), (19, 0.027), (20, 0.013), (21, -0.014), (22, -0.025), (23, 0.134), (24, -0.004), (25, 0.103), (26, 0.026), (27, -0.029), (28, -0.052), (29, 0.037), (30, 0.03), (31, 0.001), (32, -0.03), (33, -0.015), (34, 0.017), (35, -0.046), (36, -0.06), (37, -0.028), (38, 0.094), (39, -0.03), (40, -0.025), (41, -0.016), (42, -0.004), (43, 0.042), (44, -0.002), (45, -0.097), (46, 0.105), (47, 0.079), (48, -0.11), (49, 0.036)]
simIndex simValue paperId paperTitle
same-paper 1 0.97000325 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
Author: Verena Rieser ; Oliver Lemon ; Xingkun Liu
Abstract: We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods. We show that the optimised policies significantly outperform the baselines in a variety of generation scenarios: while the supervised model is able to attain up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. The RL policies perform especially well in more complex scenarios. We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. This provides new insights into the nature of the IP problem for SDS.
2 0.91618359 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
Author: Srinivasan Janarthanam ; Oliver Lemon
Abstract: We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to understand in technical domains where users may not know the technical ‘jargon’ names of the domain entities. In such cases, dialogue systems must be able to model the user’s (lexical) domain knowledge and use appropriate referring expressions. We present a reinforcement learning (RL) framework in which the sys- tem learns REG policies which can adapt to unknown users online. Furthermore, unlike supervised learning methods which require a large corpus of expert adaptive behaviour to train on, we show that effective adaptive policies can be learned from a small dialogue corpus of non-adaptive human-machine interaction, by using a RL framework and a statistical user simulation. We show that in comparison to adaptive hand-coded baseline policies, the learned policy performs significantly better, with an 18.6% average increase in adaptation accuracy. The best learned policy also takes less dialogue time (average 1.07 min less) than the best hand-coded policy. This is because the learned policies can adapt online to changing evidence about the user’s domain expertise.
3 0.9078297 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems
Author: Ethan Selfridge ; Peter Heeman
Abstract: Current turn-taking approaches for spoken dialogue systems rely on the speaker releasing the turn before the other can take it. This reliance results in restricted interactions that can lead to inefficient dialogues. In this paper we present a model we refer to as Importance-Driven Turn-Bidding that treats turn-taking as a negotiative process. Each conversant bids for the turn based on the importance of the intended utterance, and Reinforcement Learning is used to indirectly learn this parameter. We find that Importance-Driven Turn-Bidding performs better than two current turntaking approaches in an artificial collaborative slot-filling domain. The negotiative nature of this model creates efficient dia- logues, and supports the improvement of mixed-initiative interaction.
4 0.75460941 239 acl-2010-Towards Relational POMDPs for Adaptive Dialogue Management
Author: Pierre Lison
Abstract: Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly challenging problem. Traditional approaches have focused on providing theoretical accounts for either the uncertainty or the complexity of spoken dialogue, but rarely considered the two issues simultaneously. This paper describes ongoing work on a new approach to dialogue management which attempts to fill this gap. We represent the interaction as a Partially Observable Markov Decision Process (POMDP) over a rich state space incorporating both dialogue, user, and environment models. The tractability of the resulting POMDP can be preserved using a mechanism for dynamically constraining the action space based on prior knowledge over locally relevant dialogue structures. These constraints are encoded in a small set of general rules expressed as a Markov Logic network. The first-order expressivity of Markov Logic enables us to leverage the rich relational structure of the problem and efficiently abstract over large regions ofthe state and action spaces.
5 0.73994362 82 acl-2010-Demonstration of a Prototype for a Conversational Companion for Reminiscing about Images
Author: Yorick Wilks ; Roberta Catizone ; Alexiei Dingli ; Weiwei Cheng
Abstract: This paper describes an initial prototype demonstrator of a Companion, designed as a platform for novel approaches to the following: 1) The use of Information Extraction (IE) techniques to extract the content of incoming dialogue utterances after an Automatic Speech Recognition (ASR) phase, 2) The conversion of the input to Resource Descriptor Format (RDF) to allow the generation of new facts from existing ones, under the control of a Dialogue Manger (DM), that also has access to stored knowledge and to open knowledge accessed in real time from the web, all in RDF form, 3) A DM implemented as a stack and network virtual machine that models mixed initiative in dialogue control, and 4) A tuned dialogue act detector based on corpus evidence. The prototype platform was evaluated, and we describe this briefly; it is also designed to support more extensive forms of emotion detection carried by both speech and lexical content, as well as extended forms of machine learning.
6 0.67583781 179 acl-2010-Now, Where Was I? Resumption Strategies for an In-Vehicle Dialogue System
7 0.57171822 194 acl-2010-Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
8 0.56753802 224 acl-2010-Talking NPCs in a Virtual Game World
9 0.51351541 178 acl-2010-Non-Cooperation in Dialogue
10 0.47608808 199 acl-2010-Preferences versus Adaptation during Referring Expression Generation
11 0.43597585 35 acl-2010-Automated Planning for Situated Natural Language Generation
12 0.4187369 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach
13 0.41786382 202 acl-2010-Reading between the Lines: Learning to Map High-Level Instructions to Commands
14 0.41435668 81 acl-2010-Decision Detection Using Hierarchical Graphical Models
15 0.41042981 254 acl-2010-Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study
16 0.38261503 47 acl-2010-Beetle II: A System for Tutoring and Computational Linguistics Experimentation
17 0.37005794 168 acl-2010-Learning to Follow Navigational Directions
18 0.34980145 227 acl-2010-The Impact of Interpretation Problems on Tutorial Dialogue
19 0.34935287 58 acl-2010-Classification of Feedback Expressions in Multimodal Data
20 0.33213496 204 acl-2010-Recommendation in Internet Forums and Blogs
topicId topicWeight
[(14, 0.043), (25, 0.043), (39, 0.012), (41, 0.026), (42, 0.041), (44, 0.018), (59, 0.06), (72, 0.02), (73, 0.034), (76, 0.013), (78, 0.035), (83, 0.089), (84, 0.038), (94, 0.307), (98, 0.123)]
simIndex simValue paperId paperTitle
same-paper 1 0.75811654 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
Author: Verena Rieser ; Oliver Lemon ; Xingkun Liu
Abstract: We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods. We show that the optimised policies significantly outperform the baselines in a variety of generation scenarios: while the supervised model is able to attain up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. The RL policies perform especially well in more complex scenarios. We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. This provides new insights into the nature of the IP problem for SDS.
2 0.55005908 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
Author: John DeNero ; Dan Klein
Abstract: We present a discriminative model that directly predicts which set ofphrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principle advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as providing up to a 1.4 improvement in BLEU score in Chinese-to-English translation experiments.
3 0.53467536 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
Author: Srinivasan Janarthanam ; Oliver Lemon
Abstract: We present a data-driven approach to learn user-adaptive referring expression generation (REG) policies for spoken dialogue systems. Referring expressions can be difficult to understand in technical domains where users may not know the technical ‘jargon’ names of the domain entities. In such cases, dialogue systems must be able to model the user’s (lexical) domain knowledge and use appropriate referring expressions. We present a reinforcement learning (RL) framework in which the sys- tem learns REG policies which can adapt to unknown users online. Furthermore, unlike supervised learning methods which require a large corpus of expert adaptive behaviour to train on, we show that effective adaptive policies can be learned from a small dialogue corpus of non-adaptive human-machine interaction, by using a RL framework and a statistical user simulation. We show that in comparison to adaptive hand-coded baseline policies, the learned policy performs significantly better, with an 18.6% average increase in adaptation accuracy. The best learned policy also takes less dialogue time (average 1.07 min less) than the best hand-coded policy. This is because the learned policies can adapt online to changing evidence about the user’s domain expertise.
4 0.50332445 214 acl-2010-Sparsity in Dependency Grammar Induction
Author: Jennifer Gillenwater ; Kuzman Ganchev ; Joao Graca ; Fernando Pereira ; Ben Taskar
Abstract: A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In ex- periments with 12 languages, we achieve substantial gains over the standard expectation maximization (EM) baseline, with average improvement in attachment accuracy of 6.3%. Further, our method outperforms models based on a standard Bayesian sparsity-inducing prior by an average of 4.9%. On English in particular, we show that our approach improves on several other state-of-the-art techniques.
5 0.50231737 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
Author: Liang Huang ; Kenji Sagae
Abstract: Incremental parsing techniques such as shift-reduce have gained popularity thanks to their efficiency, but there remains a major problem: the search is greedy and only explores a tiny fraction of the whole space (even with beam search) as opposed to dynamic programming. We show that, surprisingly, dynamic programming is in fact possible for many shift-reduce parsers, by merging “equivalent” stacks based on feature values. Empirically, our algorithm yields up to a five-fold speedup over a state-of-the-art shift-reduce depen- dency parser with no loss in accuracy. Better search also leads to better learning, and our final parser outperforms all previously reported dependency parsers for English and Chinese, yet is much faster.
6 0.49422824 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews
7 0.49412704 62 acl-2010-Combining Orthogonal Monolingual and Multilingual Sources of Evidence for All Words WSD
8 0.49238339 71 acl-2010-Convolution Kernel over Packed Parse Forest
9 0.49229434 202 acl-2010-Reading between the Lines: Learning to Map High-Level Instructions to Commands
10 0.49107069 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization
11 0.49067944 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
12 0.49028379 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
13 0.48977643 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
14 0.48953944 39 acl-2010-Automatic Generation of Story Highlights
15 0.4894678 245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries
16 0.48943579 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
17 0.48896331 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse
18 0.48844376 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
19 0.48819375 116 acl-2010-Finding Cognate Groups Using Phylogenies
20 0.4880113 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans