acl acl2010 acl2010-194 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Francois Mairesse ; Milica Gasic ; Filip Jurcicek ; Simon Keizer ; Blaise Thomson ; Kai Yu ; Steve Young
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. [sent-2, score-0.289]
2 A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. [sent-5, score-0.289]
3 Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data. [sent-6, score-0.298]
4 This paper presents BAGEL (Bayesian networks for generation using active learning), an NLG system that can be fully trained from aligned data. [sent-23, score-0.334]
5 While the main requirement of the generator is to produce natural utterances within a dialogue system domain, a second objective is to minimise the overall development effort. [sent-24, score-0.51]
6 to improve generation performance on sparse datasets, by guiding the data collection process using certainty-based active learning (Lewis and Catlett, 1994). [sent-33, score-0.271]
7 We train BAGEL in the information presentation domain, from a corpus of utterances produced by 42 untrained annotators (see Section 5. [sent-34, score-0.307]
8 Finally, our human evaluation shows that training using active learning significantly improves generation performance on sparse datasets, yielding results close to the human gold standard using a fraction of the data. [sent-39, score-0.326]
9 2 Phrase-based generation from semantic stacks BAGEL uses a stack-based semantic representation to constrain the sequence of semantic concepts to be searched. [sent-40, score-0.756]
10 A stack representation provides useful generalisation properties (see Section 3. [sent-42, score-0.326]
11 1), while the resulting stack sequences are relatively easy to align (see Section 5. [sent-43, score-0.306]
12 In the context of dialogue systems, Table 1 illustrates how the input dialogue act is first mapped to a set of stacks of semantic concepts, and then aligned with a word sequence. [sent-45, score-0.957]
13 The bottom concept in the stack will typically be a dialogue act type, e. [sent-46, score-0.558]
14 The generator’s goal is thus finding the most likely realisation given an unordered set of mandatory semantic stacks Sm derived from the input dialogue act. [sent-58, score-1.074]
15 For example, s = inform(area(centre)) is a mandatory stack associated with the dialogue act in Table 1 (frame 8). [sent-59, score-0.738]
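As a concrete illustration of this representation, the sketch below encodes a semantic stack as a bottom-up tuple of concepts (dialogue act type at the bottom, value on top), with head and tail accessors used later for back-off; the class and helper names are illustrative assumptions, not BAGEL's actual data structures.

```python
# Minimal sketch of the stack-based semantic representation (illustrative only).
# A stack is read bottom-up: dialogue act type first, value on top,
# e.g. inform(area(centre)) -> ('inform', 'area', 'centre').
from typing import Tuple

SemanticStack = Tuple[str, ...]

def head(stack: SemanticStack) -> str:
    """Top concept of the stack, e.g. 'centre' for inform(area(centre))."""
    return stack[-1]

def tail(stack: SemanticStack) -> SemanticStack:
    """Concepts below the head, e.g. ('inform', 'area')."""
    return stack[:-1]

# Hypothetical mandatory stack set Sm for a dialogue act such as
# inform(name(X), area(centre), pricerange(cheap)):
S_m = {
    ("inform", "name", "X"),
    ("inform", "area", "centre"),
    ("inform", "pricerange", "cheap"),
}

if __name__ == "__main__":
    s = ("inform", "area", "centre")
    print(head(s), tail(s))  # centre ('inform', 'area')
```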
16 While mandatory stacks must all be conveyed in the output realisation, Sm does not contain the optional intermediary stacks Si that can refer to (a) general attributes of the object under discussion (e. [sent-60, score-0.673]
17 , inform(area) in Table 1), or (b) to concepts that are not in the input at all, which are associated with the singleton stack inform (e. [sent-62, score-0.396]
18 , phrases expressing the dialogue act type, or clause aggregation operations). [sent-64, score-0.292]
19 For example, the stack sequence in Table 1 contains 3 intermediary stacks for t = 2, 5 and 7. [sent-65, score-0.756]
20 Contiguous words belonging to the same semantic stack are modelled as an atomic observation unit or phrase. [sent-68, score-0.336]
21 3 Dynamic Bayesian networks for NLG Dynamic Bayesian networks have been used successfully for speech recognition, natural language understanding, dialogue management and text-to-speech synthesis (Rabiner, 1989; He and Young, 2005; Lefèvre, 2006; Thomson and Young, 2010; Tokuda et al. [sent-70, score-0.324]
22 BAGEL models the generation task as finding the most likely sequence of realisation phrases R∗ = (r1. [sent-76, score-0.347]
23 rL) given an unordered set of mandatory semantic stacks Sm, with |Sm | ≤ L. [sent-79, score-0.635]
24 semantic stacks S∗ that will appear in the utterance given Sm, i.e. [sent-81, score-0.503]
25 by inserting intermediary stacks if needed and by performing content ordering. [sent-83, score-0.453]
26 Any number of intermediary stacks can be inserted between two consecutive mandatory stacks, as long as all their concepts are included in either the previous or following mandatory stack, and as long as each stack transition leads to a different stack (see example in Table 1). [sent-84, score-1.388]
27 Let us define the set of possible stack sequences matching these constraints as Seq(Sm) ⊆ {S = (s1. [sent-85, score-0.306]
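A minimal sketch of the constraints defining Seq(Sm), assuming that "all their concepts are included in either the previous or following mandatory stack" can be read as the intermediary stack being a bottom-up prefix of a neighbouring mandatory stack; the function name and this prefix interpretation are assumptions for illustration.

```python
# Illustrative membership test for Seq(Sm): all mandatory stacks must be
# conveyed, consecutive stacks must differ, and every intermediary stack must
# be a (bottom-up) prefix of the previous or following mandatory stack.
from typing import Sequence, Set, Tuple

SemanticStack = Tuple[str, ...]

def in_seq_sm(seq: Sequence[SemanticStack], mandatory: Set[SemanticStack]) -> bool:
    # 1. Every mandatory stack must appear in the sequence.
    if not mandatory.issubset(set(seq)):
        return False
    # 2. Each stack transition must lead to a different stack.
    if any(a == b for a, b in zip(seq, seq[1:])):
        return False
    # 3. Each intermediary stack must be licensed by a neighbouring mandatory stack.
    mand_pos = [i for i, s in enumerate(seq) if s in mandatory]
    for i, s in enumerate(seq):
        if s in mandatory:
            continue
        prev_m = max((p for p in mand_pos if p < i), default=None)
        next_m = min((p for p in mand_pos if p > i), default=None)
        licensed = any(p is not None and seq[p][:len(s)] == s for p in (prev_m, next_m))
        if not licensed:
            return False
    return True
```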
28 The decoded stack sequence S∗ is then treated as observed in the realisation phase, in which the model in Fig. [sent-98, score-0.532]
29 2 is used to find the realisation phrase sequence R∗ maximising P(R | S∗) over all phrase sequences of length L = |S∗| in our vocabulary: R∗ = argmax_{R=(r1...rL)} P(R | S∗). [sent-99, score-0.432]
30 In order to reduce model complexity, we factorise our model by conditioning the realisation phrase at time t on the previous phrase rt−1, and the previous, current, and following semantic stacks. [sent-105, score-0.357]
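The sketch below spells out this factorisation with a simple count-based estimator and a greedy left-to-right approximation of the argmax; the smoothing constants and the dictionary-based implementation are illustrative assumptions, since BAGEL itself implements the model as a dynamic Bayesian network.

```python
# Sketch of the factored realisation model:
#   P(R | S*) ~ prod_t P(r_t | r_{t-1}, s_{t-1}, s_t, s_{t+1})
# with a greedy decoder standing in for the full search over phrase sequences.
import math
from collections import defaultdict

class RealisationModel:
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, phrases, stacks):
        """Collect counts from an aligned (phrase sequence, stack sequence) pair."""
        padded = [None] + list(stacks) + [None]
        prev_r = None
        for t, r in enumerate(phrases):
            ctx = (prev_r, padded[t], padded[t + 1], padded[t + 2])
            self.counts[ctx][r] += 1
            prev_r = r

    def log_prob(self, r, ctx):
        total = sum(self.counts[ctx].values())
        if total == 0:
            return math.log(1e-6)  # crude floor standing in for the back-off of Section 3.1
        types = max(len(self.counts[ctx]), 1)
        return math.log((self.counts[ctx][r] + 0.1) / (total + 0.1 * types))

    def decode(self, stacks, vocabulary):
        """Greedy argmax of the phrase sequence for a decoded stack sequence S*."""
        padded = [None] + list(stacks) + [None]
        phrases, prev_r = [], None
        for t in range(len(stacks)):
            ctx = (prev_r, padded[t], padded[t + 1], padded[t + 2])
            best = max(vocabulary, key=lambda r: self.log_prob(r, ctx))
            phrases.append(best)
            prev_r = best
        return phrases
```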
31 The semantic stack st at time t is assumed ... Figure 1: Graphical model for the semantic decoding phase. [sent-106, score-0.49]
32 The generation of the end semantic stack symbol deterministically triggers the final frame. [sent-108, score-0.417]
33 1), and pruning any sequence that has not included all mandatory input stacks on reaching the final frame (see observed stack set validator variable in Fig. [sent-110, score-0.868]
34 Since the number of intermediary stacks is not known at decoding time, the network is unrolled for a fixed number of frames T defining the maximum number of phrases that can be generated (e. [sent-112, score-0.495]
35 The end of the stack sequence is then determined by a special end symbol, which can only be emitted within the T frames once all mandatory stacks have been visited. [sent-115, score-0.868]
36 A consequence is that shorter sequences containing only mandatory stacks are likely to be favoured. [sent-118, score-0.633]
37 While future work should investigate length normalisation strategies, we find that the learned transition probabilities are skewed enough to favour stack sequences including intermediary stacks. [sent-119, score-0.413]
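The sketch below shows one way the frame-unrolling constraints could be enforced during decoding: the network is unrolled for at most T frames, self-transitions are forbidden, and the special end symbol may only follow once every mandatory stack has been visited; the function name and the END encoding are assumptions for illustration.

```python
# Illustrative successor filter for the unrolled decoder: at most T frames,
# no repeated stack, and END only once all mandatory stacks are covered.
END = ("END",)

def allowed_next_stacks(history, mandatory, candidates, T):
    """Stacks that may extend `history` (a partial stack sequence)."""
    if len(history) >= T:
        return []
    allowed = [s for s in candidates if not history or s != history[-1]]
    if mandatory - set(history):       # some mandatory stacks still missing
        allowed = [s for s in allowed if s != END]
    return allowed
```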
38 In terms of computational complexity, it is important to note that the number of stack sequences Seq(Sm) to search over increases exponentially with the number of input mandatory stacks. [sent-122, score-0.486]
39 1 Generalisation to unseen semantic stacks In order to generalise to semantic stacks which have not been observed during training, the realisation phrase r is made dependent on underspecified stack configurations, i. [sent-125, score-1.465]
40 For example, the last stack in Table 1 is associated with the head centre and the tail inform(area). [sent-128, score-0.399]
41 As a result, BAGEL assigns non-zero probabilities to realisation phrases in unseen semantic contexts, by backing off to the head and the tail of the stack. [sent-129, score-0.405]
42 A consequence is that BAGEL’s lexical realisation can generalise across contexts. [sent-130, score-0.257]
43 For example, if reject(area(centre)) was never observed at training time, P(r = centre of town | s = reject(area(centre))) will be estimated by backing off to P(r = centre of town | h = centre). [sent-131, score-0.342]
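A minimal sketch of this back-off, assuming simple relative-frequency tables over the full stack, its head h, and its tail l; the table layout and the smoothing floor are illustrative, not BAGEL's actual estimator.

```python
# Sketch of backing off from P(r | s) to P(r | h) and P(r | l) for unseen stacks.
from collections import defaultdict

class BackoffPhraseModel:
    def __init__(self):
        self.full = defaultdict(lambda: defaultdict(int))     # counts for P(r | s)
        self.by_head = defaultdict(lambda: defaultdict(int))  # counts for P(r | h)
        self.by_tail = defaultdict(lambda: defaultdict(int))  # counts for P(r | l)

    def observe(self, phrase, stack):
        h, l = stack[-1], stack[:-1]
        self.full[stack][phrase] += 1
        self.by_head[h][phrase] += 1
        self.by_tail[l][phrase] += 1

    def prob(self, phrase, stack):
        h, l = stack[-1], stack[:-1]
        for table, key in ((self.full, stack), (self.by_head, h), (self.by_tail, l)):
            total = sum(table[key].values())
            if total > 0:
                return table[key][phrase] / total
        return 1e-6  # smoothed floor for completely unseen contexts

model = BackoffPhraseModel()
model.observe("centre of town", ("inform", "area", "centre"))
# reject(area(centre)) was never observed, so the estimate backs off to h = centre:
print(model.prob("centre of town", ("reject", "area", "centre")))  # 1.0
```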
44 Figure 3: Backoff graphs for the semantic decoding model (st | st−1, st−2, su → st | st−1, st−2 → st | st−1 → st) and the realisation model (rt | ht, lt, rt−1, lt−1, lt+1, st, st−1, st+1 → rt | ht, lt, rt−1, lt−1, lt+1, st → rt | ht, lt, rt−1, lt−1, lt+1 → rt | ht, lt → rt | ht → rt). [sent-137, score-0.759]
45 Variables which are the furthest away in time are dropped first, and partial stack variables are dropped last as they are observed the most. [sent-140, score-0.266]
46 It is important to note that generating unseen semantic stacks requires all possible mandatory semantic stacks in the target domain to be predefined, in order for all stack unigrams to be assigned a smoothed non-zero probability. [sent-141, score-1.422]
47 2 High cardinality concept abstraction While one should expect a trainable generator to learn multiple lexical realisations for low-cardinality semantic concepts, learning lexical realisations for high-cardinality database entries (e. [sent-143, score-0.271]
48 We thus divide pre-terminal concepts in the semantic stacks into two types: (a) enumerable attributes whose values are associated with distinct semantic stacks in our model (e. [sent-146, score-1.019]
49 , inform(pricerange(cheap))), and (b) non-enumerable attributes whose values are replaced by a generic symbol before training, in both the utterance and the semantic stack (e. [sent-148, score-0.494]
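A small sketch of this abstraction step, assuming the value sits on top of the stack and the attribute just below it; the generic symbol "X" and the attribute list are illustrative assumptions.

```python
# Sketch of non-enumerable value abstraction: replace the value by a generic
# symbol in both the stack and the utterance before training, and substitute
# the real value back after generation.
NON_ENUMERABLE = {"name", "addr", "phone"}  # illustrative attribute list

def delexicalise(stack, phrase):
    if len(stack) >= 2 and stack[-2] in NON_ENUMERABLE:
        value = stack[-1]
        return stack[:-1] + ("X",), phrase.replace(value, "X"), value
    return stack, phrase, None

def relexicalise(phrase, value):
    return phrase.replace("X", value) if value is not None else phrase

stack, phrase, value = delexicalise(("inform", "name", "Charlie Chan"),
                                    "Charlie Chan is a nice restaurant")
print(stack, phrase)                  # ('inform', 'name', 'X') X is a nice restaurant
print(relexicalise(phrase, value))    # Charlie Chan is a nice restaurant
```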
50 BAGEL supports the optimisation of the data collection process through active learning, in which the next semantic input to annotate is determined by the current model. [sent-155, score-0.26]
51 The probabilistic nature of BAGEL allows the use of certaintybased active learning (Lewis and Catlett, 1994), by querying the k semantic inputs for which the model is the least certain about its output realisation. [sent-156, score-0.339]
52 Given a finite semantic input space I representing all possible dialogue acts in our domain (i. [sent-157, score-0.34]
53 , the set of all sets of mandatory semantic stacks Sm), BAGEL’s active learning training process iterates over the following steps: 1. [sent-159, score-0.783]
54 The number of utterances to be queried k should depend on the flexibility of the annotators and the time required for generating all possible utterances in the domain. [sent-178, score-0.278]
55 Since each active learning iteration requires generating all training utterances in our domain, they are generated using a larger clique pruning threshold than the test utterances used for evaluation. [sent-181, score-0.636]
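A compact sketch of the certainty-based query selection described above: the current model generates its best realisation for every remaining semantic input, and the k inputs with the least probable best output are sent to the annotators; `generate_best` is a hypothetical stand-in for the full DBN decoder, and the optional length normalisation mirrors the variant discussed later.

```python
# Illustrative certainty-based query selection for active learning.
def select_queries(model, unannotated_inputs, k=1, normalise=False):
    scored = []
    for sem_input in unannotated_inputs:
        phrases, log_prob = model.generate_best(sem_input)  # hypothetical decoder API
        if normalise:
            log_prob /= max(len(sem_input), 1)  # optional length normalisation
        scored.append((log_prob, sem_input))
    scored.sort(key=lambda x: x[0])  # least certain (lowest log probability) first
    return [sem_input for _, sem_input in scored[:k]]
```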
56 The domain contains two dialogue act types: (a) inform: presenting information about a restaurant (see Table 1), and (b) reject: informing that the user’s constraints cannot be met (e. [sent-184, score-0.414]
57 3 Our input semantic space is approximated by the set of information presentation dialogue acts produced over 20,000 simulated dialogues between our statistical dialogue manager (Young et al. [sent-188, score-0.55]
58 , 2007), which results in 202 unique dialogue acts after replacing non-enumerable values by a generic symbol. [sent-190, score-0.27]
59 As one of our objectives is to test whether BAGEL can learn from data provided by a large sample of untrained annotators, we collected a corpus of semantically-aligned utterances using Amazon’s Mechanical Turk data collection service. [sent-193, score-0.269]
60 Annotators were first asked to provide an utterance matching an abstract description of the dialogue act, regardless of the order in which the constraints are presented (e. [sent-195, score-0.328]
61 Two paraphrases were collected for each dialogue act in our domain, resulting in a total of 404 aligned utterances. 3With the exception of areas defined as proper nouns. [sent-204, score-0.337]
62 After manually checking and normalising the dataset,4 the layered annotations were automatically mapped to phrase-level semantic stacks by splitting the utterance into phrases at annotation boundaries. [sent-208, score-0.603]
63 The resulting vocabulary consists of 52 distinct semantic stacks and 109 distinct realisation phrases, with an average of 8. [sent-210, score-0.684]
64 , 2002), which measures the word n-gram overlap between the generated utterances and the 2 reference paraphrases over a test corpus (with n up to 4). [sent-214, score-0.254]
65 Our results are averaged over a 10-fold cross-validation over distinct dialogue acts, i. [sent-217, score-0.237]
66 dialogue acts used for testing are not seen at training time,5 and all systems are tested on the same folds. [sent-219, score-0.298]
67 The training and test sets respectively contain an average of 181 and 21 distinct dialogue acts, and each dialogue act is associated with two paraphrases, resulting in 362 training utterances. [sent-220, score-0.558]
68 5We do not evaluate performance on dialogue acts used for training, as the training examples can trivially be used as generation templates. [sent-222, score-0.379]
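A short sketch of this evaluation setup, assuming the corpus is stored as a mapping from each dialogue act to its two paraphrases; folds are drawn over distinct dialogue acts so that test acts never appear in training.

```python
# Illustrative 10-fold split over distinct dialogue acts (not over utterances).
import random

def dialogue_act_folds(corpus, n_folds=10, seed=0):
    """corpus: {dialogue_act: [paraphrase_1, paraphrase_2]}"""
    acts = sorted(corpus)
    random.Random(seed).shuffle(acts)
    for i in range(n_folds):
        test_acts = set(acts[i::n_folds])
        train = [(a, u) for a in acts if a not in test_acts for u in corpus[a]]
        test = [(a, u) for a in test_acts for u in corpus[a]]
        yield train, test
```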
69 4 shows that adding a dependency on the future semantic stack improves performance for all training set sizes, despite the added model complexity. [sent-227, score-0.364]
70 Backing off to partial stacks also improves performance, but only for sparse training sets. [sent-228, score-0.413]
71 As our dataset only contains two paraphrases per dialogue act, the same dialogue act can only be queried twice during the active learning procedure. [sent-232, score-0.768]
72 A consequence is that the training set used for active learning converges towards the randomly sampled set as its size increases. [sent-233, score-0.246]
73 Results show that increasing the training set one utterance at a time using active learning (k = 1) significantly outperforms random sampling when using 40, 80, and 100 utterances (p < . [sent-234, score-0.589]
74 Increasing the number of utterances to be queried at each iteration to k = 10 results in a smaller performance increase. [sent-236, score-0.24]
75 As the length of the semantic stack sequence is not known before decoding, the active learning selection criterion presented in (9) is biased towards longer utterances, which tend to have a lower probability. [sent-242, score-0.563]
76 6 shows that normalising the log probability by the number of semantic stacks does not improve overall learning performance. [sent-244, score-0.485]
77 6 shows that a baseline selecting the largest remaining semantic input at each iteration performs worse than the active learning scheme for training sets above 20 utterances. [sent-246, score-0.288]
78 The judges are then asked to evaluate the informativeness and naturalness of each of the 8 utterances on a 5-point Likert scale. [sent-253, score-0.447]
79 the models are trained on up to 90% of the data and the training set does not contain the dialogue act being tested. [sent-258, score-0.355]
80 7 and 8 compare the naturalness and informativeness scores of each system averaged over all 202 dialogue acts. [sent-260, score-0.475]
81 A paired t-test shows that models trained on 40 utterances or less produce utterances that are rated significantly lower than human utterances for both naturalness and informativeness (p < . [sent-261, score-0.9]
82 However, models trained on 100 utterances or more do not perform significantly worse than human utterances for both dimensions, with a mean difference below . [sent-263, score-0.453]
83 As far as the learning method is concerned, a paired t-test shows that models trained on 20 and 40 utterances using active learning significantly outperform models trained using random sampling, for both dimensions (p < . [sent-266, score-0.469]
84 0" /)#&$&$0 -"+ -&1" Figure 7: Naturalness mean opinion scores for different training set sizes, using random sampling and active learning. [sent-278, score-0.262]
85 Interestingly, while models trained on 100 utterances outperform models trained on 40 utterances using random sampling (p < . [sent-281, score-0.532]
86 05), they do not significantly outperform models trained on 40 utterances using active learning (p = . [sent-282, score-0.434]
87 These results suggest that certainty-based active learning is beneficial for training a generator from a limited amount of data given the domain size. [sent-285, score-0.318]
88 Figure 8: Informativeness mean opinion scores for different training set sizes, using random sampling and active learning. [sent-293, score-0.262]
89 Although early experiments showed that GIZA++ did not perform well on our data—possibly because of the coarse granularity of our semantic representation—future work should evaluate the generalisation performance of synchronous CFGs in a dialogue system domain. [sent-306, score-0.34]
90 Although we do not know of any work on active learning for NLG, previous work has used active learning for semantic parsing and information extraction (Thompson et al. [sent-307, score-0.45]
91 While certainty-based methods have been widely used, future work should investigate the performance of committee-based active learning for NLG, in which examples are selected based on the level of disagreement between models trained on subsets of the data (Freund et al. [sent-312, score-0.255]
92 To train BAGEL in a dialogue system domain, we propose a stack-based semantic representation at the phrase level, which is expressive enough to generate natural utterances from unseen inputs, yet simple enough for data to be collected from 42 untrained annotators with a minimal normalisation step. [sent-318, score-0.686]
93 A human evaluation over 202 dialogue acts does not show any difference in naturalness and informativeness between BAGEL’s outputs and human utterances. [sent-319, score-0.508]
94 While this paper only evaluates the most likely realisation given a dialogue act, we believe that BAGEL’s probabilistic nature and generalisation capabilities are well suited to model the linguistic variation resulting from the diversity of annotators. [sent-322, score-0.499]
95 For example, the realisation phrase can easily be conditioned on syntactic constructs governing that phrase, and the recursive nature of syntax can be modelled by keeping track of the depth of the current embedded clause. [sent-326, score-0.258]
96 While syntactic information can be included with no human effort by using syntactic parsers, their robustness to dialogue system utterances must first be evaluated. [sent-327, score-0.419]
97 Natural language generation as planning under uncertainty for spoken dialogue systems. [sent-451, score-0.329]
98 Bayesian update ofdialogue state: A POMDP framework for spoken dialogue systems. [sent-486, score-0.248]
99 Training a sentence planner for spoken dialogue using boosting. [sent-513, score-0.248]
100 The Hidden Information State model: a practical framework for POMDP-based spoken dialogue management. [sent-535, score-0.248]
wordName wordTfidf (topN-words)
[('bagel', 0.45), ('stacks', 0.385), ('stack', 0.266), ('sm', 0.247), ('realisation', 0.229), ('dialogue', 0.21), ('utterances', 0.209), ('active', 0.19), ('mandatory', 0.18), ('naturalness', 0.15), ('nlg', 0.145), ('utterance', 0.118), ('rt', 0.094), ('centre', 0.094), ('informativeness', 0.088), ('inform', 0.087), ('act', 0.082), ('generation', 0.081), ('semantic', 0.07), ('intermediary', 0.068), ('bilmes', 0.066), ('generator', 0.065), ('acts', 0.06), ('untrained', 0.06), ('generalisation', 0.06), ('town', 0.06), ('seq', 0.056), ('area', 0.053), ('thomson', 0.051), ('inputs', 0.049), ('mairesse', 0.048), ('realisations', 0.048), ('paraphrases', 0.045), ('rej', 0.045), ('pricerange', 0.045), ('sxeq', 0.045), ('sampling', 0.044), ('concepts', 0.043), ('ect', 0.043), ('bleu', 0.042), ('decoding', 0.042), ('st', 0.042), ('restaurant', 0.042), ('trainable', 0.04), ('sequences', 0.04), ('attributes', 0.04), ('riverside', 0.039), ('tokuda', 0.039), ('belz', 0.039), ('normalisation', 0.039), ('tail', 0.039), ('young', 0.039), ('food', 0.039), ('annotators', 0.038), ('spoken', 0.038), ('sequence', 0.037), ('backing', 0.036), ('sizes', 0.035), ('domain', 0.035), ('trained', 0.035), ('argmax', 0.034), ('handcrafted', 0.034), ('backoff', 0.034), ('arrows', 0.034), ('maximising', 0.034), ('icassp', 0.032), ('schatzmann', 0.032), ('unseen', 0.031), ('queried', 0.031), ('certaintybased', 0.03), ('espinosa', 0.03), ('gmtk', 0.03), ('langkilde', 0.03), ('lef', 0.03), ('nakanishi', 0.03), ('normalising', 0.03), ('sates', 0.03), ('varges', 0.03), ('wasp', 0.03), ('synthesis', 0.029), ('bayesian', 0.029), ('phrase', 0.029), ('speech', 0.029), ('training', 0.028), ('networks', 0.028), ('white', 0.028), ('factored', 0.028), ('consequence', 0.028), ('walker', 0.027), ('averaged', 0.027), ('yielding', 0.027), ('reiter', 0.026), ('catlett', 0.026), ('bloodgood', 0.026), ('isard', 0.026), ('minimise', 0.026), ('keizer', 0.026), ('enumerable', 0.026), ('paiva', 0.026), ('zweig', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 194 acl-2010-Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
Author: Francois Mairesse ; Milica Gasic ; Filip Jurcicek ; Simon Keizer ; Blaise Thomson ; Kai Yu ; Steve Young
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
2 0.16425937 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
Author: Verena Rieser ; Oliver Lemon ; Xingkun Liu
Abstract: We present a novel approach to Information Presentation (IP) in Spoken Dialogue Systems (SDS) using a data-driven statistical optimisation framework for content planning and attribute selection. First we collect data in a Wizard-of-Oz (WoZ) experiment and use it to build a supervised model of human behaviour. This forms a baseline for measuring the performance of optimised policies, developed from this data using Reinforcement Learning (RL) methods. We show that the optimised policies significantly outperform the baselines in a variety of generation scenarios: while the supervised model is able to attain up to 87.6% of the possible reward on this task, the RL policies are significantly better in 5 out of 6 scenarios, gaining up to 91.5% of the total possible reward. The RL policies perform especially well in more complex scenarios. We are also the first to show that adding predictive “lower level” features (e.g. from the NLG realiser) is important for optimising IP strategies according to user preferences. This provides new insights into the nature of the IP problem for SDS.
3 0.13834062 239 acl-2010-Towards Relational POMDPs for Adaptive Dialogue Management
Author: Pierre Lison
Abstract: Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly challenging problem. Traditional approaches have focused on providing theoretical accounts for either the uncertainty or the complexity of spoken dialogue, but rarely considered the two issues simultaneously. This paper describes ongoing work on a new approach to dialogue management which attempts to fill this gap. We represent the interaction as a Partially Observable Markov Decision Process (POMDP) over a rich state space incorporating both dialogue, user, and environment models. The tractability of the resulting POMDP can be preserved using a mechanism for dynamically constraining the action space based on prior knowledge over locally relevant dialogue structures. These constraints are encoded in a small set of general rules expressed as a Markov Logic network. The first-order expressivity of Markov Logic enables us to leverage the rich relational structure of the problem and efficiently abstract over large regions of the state and action spaces.
4 0.11894352 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems
Author: Ethan Selfridge ; Peter Heeman
Abstract: Current turn-taking approaches for spoken dialogue systems rely on the speaker releasing the turn before the other can take it. This reliance results in restricted interactions that can lead to inefficient dialogues. In this paper we present a model we refer to as Importance-Driven Turn-Bidding that treats turn-taking as a negotiative process. Each conversant bids for the turn based on the importance of the intended utterance, and Reinforcement Learning is used to indirectly learn this parameter. We find that Importance-Driven Turn-Bidding performs better than two current turn-taking approaches in an artificial collaborative slot-filling domain. The negotiative nature of this model creates efficient dialogues, and supports the improvement of mixed-initiative interaction.
5 0.11569145 24 acl-2010-Active Learning-Based Elicitation for Semi-Supervised Word Alignment
Author: Vamshi Ambati ; Stephan Vogel ; Jaime Carbonell
Abstract: Semi-supervised word alignment aims to improve the accuracy of automatic word alignment by incorporating full or partial manual alignments. Motivated by standard active learning query sampling frameworks like uncertainty-, margin- and query-by-committee sampling we propose multiple query strategies for the alignment link selection task. Our experiments show that by active selection of uncertain and informative links, we reduce the overall manual effort involved in elicitation of alignment link data for training a semisupervised word aligner.
6 0.11046597 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
7 0.11028218 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
8 0.099350914 82 acl-2010-Demonstration of a Prototype for a Conversational Companion for Reminiscing about Images
9 0.094969705 81 acl-2010-Decision Detection Using Hierarchical Graphical Models
10 0.093271457 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing
11 0.085445657 178 acl-2010-Non-Cooperation in Dialogue
12 0.084521085 58 acl-2010-Classification of Feedback Expressions in Multimodal Data
13 0.082264625 227 acl-2010-The Impact of Interpretation Problems on Tutorial Dialogue
14 0.080522195 47 acl-2010-Beetle II: A System for Tutoring and Computational Linguistics Experimentation
15 0.068616785 179 acl-2010-Now, Where Was I? Resumption Strategies for an In-Vehicle Dialogue System
16 0.066110805 240 acl-2010-Training Phrase Translation Models with Leaving-One-Out
17 0.061832331 27 acl-2010-An Active Learning Approach to Finding Related Terms
18 0.061581004 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
19 0.059500068 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data
20 0.058448955 188 acl-2010-Optimizing Informativeness and Readability for Sentiment Summarization
topicId topicWeight
[(0, -0.169), (1, 0.025), (2, -0.06), (3, -0.135), (4, -0.02), (5, -0.171), (6, -0.139), (7, 0.047), (8, -0.028), (9, 0.039), (10, -0.016), (11, -0.041), (12, 0.004), (13, 0.041), (14, 0.027), (15, -0.067), (16, 0.058), (17, 0.012), (18, -0.108), (19, -0.02), (20, 0.008), (21, 0.022), (22, -0.023), (23, -0.004), (24, -0.027), (25, -0.027), (26, 0.09), (27, 0.041), (28, 0.026), (29, -0.058), (30, 0.05), (31, 0.121), (32, -0.056), (33, 0.019), (34, -0.03), (35, -0.025), (36, -0.082), (37, -0.1), (38, 0.052), (39, 0.179), (40, 0.052), (41, -0.106), (42, -0.096), (43, 0.001), (44, 0.095), (45, -0.019), (46, -0.044), (47, 0.132), (48, -0.044), (49, -0.095)]
simIndex simValue paperId paperTitle
same-paper 1 0.90446633 194 acl-2010-Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
Author: Francois Mairesse ; Milica Gasic ; Filip Jurcicek ; Simon Keizer ; Blaise Thomson ; Kai Yu ; Steve Young
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
2 0.74228561 239 acl-2010-Towards Relational POMDPs for Adaptive Dialogue Management
Author: Pierre Lison
Abstract: Open-ended spoken interactions are typically characterised by both structural complexity and high levels of uncertainty, making dialogue management in such settings a particularly challenging problem. Traditional approaches have focused on providing theoretical accounts for either the uncertainty or the complexity of spoken dialogue, but rarely considered the two issues simultaneously. This paper describes ongoing work on a new approach to dialogue management which attempts to fill this gap. We represent the interaction as a Partially Observable Markov Decision Process (POMDP) over a rich state space incorporating both dialogue, user, and environment models. The tractability of the resulting POMDP can be preserved using a mechanism for dynamically constraining the action space based on prior knowledge over locally relevant dialogue structures. These constraints are encoded in a small set of general rules expressed as a Markov Logic network. The first-order expressivity of Markov Logic enables us to leverage the rich relational structure of the problem and efficiently abstract over large regions of the state and action spaces.
3 0.70263207 81 acl-2010-Decision Detection Using Hierarchical Graphical Models
Author: Trung H. Bui ; Stanley Peters
Abstract: We investigate hierarchical graphical models (HGMs) for automatically detecting decisions in multi-party discussions. Several types of dialogue act (DA) are distinguished on the basis of their roles in formulating decisions. HGMs enable us to model dependencies between observed features of discussions, decision DAs, and subdialogues that result in a decision. For the task of detecting decision regions, an HGM classifier was found to outperform non-hierarchical graphical models and support vector machines, raising the F1-score to 0.80 from 0.55.
4 0.68522245 179 acl-2010-Now, Where Was I? Resumption Strategies for an In-Vehicle Dialogue System
Author: Jessica Villing
Abstract: In-vehicle dialogue systems often contain more than one application, e.g. a navigation and a telephone application. This means that the user might, for example, interrupt the interaction with the telephone application to ask for directions from the navigation application, and then resume the dialogue with the telephone application. In this paper we present an analysis of interruption and resumption behaviour in human-human in-vehicle dialogues and also propose some implications for resumption strategies in an in-vehicle dialogue system.
5 0.66565657 178 acl-2010-Non-Cooperation in Dialogue
Author: Brian Pluss
Abstract: This paper presents ongoing research on computational models for non-cooperative dialogue. We start by analysing different levels of cooperation in conversation. Then, inspired by findings from an empirical study, we propose a technique for measuring non-cooperation in political interviews. Finally, we describe a research programme towards obtaining a suitable model and discuss previous accounts for conflictive dialogue, identifying the differences with our work.
6 0.66381949 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems
7 0.6211378 82 acl-2010-Demonstration of a Prototype for a Conversational Companion for Reminiscing about Images
8 0.611112 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
9 0.60497606 58 acl-2010-Classification of Feedback Expressions in Multimodal Data
10 0.5150227 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
11 0.44367695 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing
12 0.43802634 173 acl-2010-Modeling Norms of Turn-Taking in Multi-Party Conversation
13 0.38824585 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
14 0.35307232 111 acl-2010-Extracting Sequences from the Web
15 0.35233688 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
16 0.33562064 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure
17 0.30413485 224 acl-2010-Talking NPCs in a Virtual Game World
18 0.29893494 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification
19 0.29680938 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation
20 0.29405376 256 acl-2010-Vocabulary Choice as an Indicator of Perspective
topicId topicWeight
[(3, 0.013), (13, 0.242), (14, 0.021), (25, 0.049), (39, 0.026), (42, 0.056), (44, 0.02), (59, 0.096), (65, 0.01), (72, 0.016), (73, 0.036), (78, 0.024), (83, 0.072), (84, 0.022), (94, 0.022), (98, 0.146)]
simIndex simValue paperId paperTitle
1 0.82010096 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation
Author: Federico Sangati
Abstract: We present a probabilistic model extension to the Tesnière Dependency Structure (TDS) framework formulated in (Sangati and Mazza, 2009). This representation incorporates aspects from both constituency and dependency theory. In addition, it makes use of junction structures to handle coordination constructions. We test our model on parsing the English Penn WSJ treebank using a re-ranking framework. This technique allows us to efficiently test our model without needing a specialized parser, and to use the standard evaluation metric on the original Phrase Structure version of the treebank. We obtain encouraging results: we achieve a small improvement over state-of-the-art results when re-ranking a small number of candidate structures, on all the evaluation metrics except for chunking.
2 0.80818498 212 acl-2010-Simple Semi-Supervised Training of Part-Of-Speech Taggers
Author: Anders Sogaard
Abstract: Most attempts to train part-of-speech taggers on a mixture of labeled and unlabeled data have failed. In this work stacked learning is used to reduce tagging to a classification task. This simplifies semi-supervised training considerably. Our preferred semi-supervised method combines tri-training (Li and Zhou, 2005) and disagreement-based co-training. On the Wall Street Journal, we obtain an error reduction of 4.2% with SVMTool (Gimenez and Marquez, 2004).
same-paper 3 0.79946303 194 acl-2010-Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
Author: Francois Mairesse ; Milica Gasic ; Filip Jurcicek ; Simon Keizer ; Blaise Thomson ; Kai Yu ; Steve Young
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
4 0.75975335 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
Author: Ivan Titov ; Mikhail Kozhevnikov
Abstract: We argue that groups of unannotated texts with overlapping and non-contradictory semantics represent a valuable source of information for learning semantic representations. A simple and efficient inference method recursively induces joint semantic representations for each group and discovers correspondence between lexical entries and latent semantic concepts. We consider the generative semantics-text correspondence model (Liang et al., 2009) and demonstrate that exploiting the noncontradiction relation between texts leads to substantial improvements over natural baselines on a problem of analyzing human-written weather forecasts.
5 0.64323676 214 acl-2010-Sparsity in Dependency Grammar Induction
Author: Jennifer Gillenwater ; Kuzman Ganchev ; Joao Graca ; Fernando Pereira ; Ben Taskar
Abstract: A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments with 12 languages, we achieve substantial gains over the standard expectation maximization (EM) baseline, with average improvement in attachment accuracy of 6.3%. Further, our method outperforms models based on a standard Bayesian sparsity-inducing prior by an average of 4.9%. On English in particular, we show that our approach improves on several other state-of-the-art techniques.
6 0.63309872 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval
7 0.63039291 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
8 0.62887573 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
9 0.62737489 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
10 0.62534213 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews
11 0.62530732 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
12 0.62343216 133 acl-2010-Hierarchical Search for Word Alignment
13 0.62264431 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
14 0.62176394 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
15 0.62113768 50 acl-2010-Bilingual Lexicon Generation Using Non-Aligned Signatures
16 0.62073869 245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries
17 0.62060875 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
18 0.62001091 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
19 0.61993796 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
20 0.61971742 5 acl-2010-A Framework for Figurative Language Detection Based on Sense Differentiation