emnlp emnlp2012 emnlp2012-114 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bo Pang ; Sujith Ravi
Abstract: The question “how predictable is English?” has long fascinated researchers. While prior work has focused on formal English typically used in news articles, we turn to texts generated by users in online settings that are more informal in nature. We are motivated by a novel application scenario: given the difficulty of typing on mobile devices, can we help reduce typing effort with message completion, especially in conversational settings? We propose a method for automatic response completion. Our approach models both the language used in responses and the specific context provided by the original message. Our experimental results on a large-scale dataset show that both components help reduce typing effort. We also perform an information-theoretic study in this setting and examine the entropy of user-generated content, especially in conversational scenarios, to better understand predictability of user-generated English.
Reference: text
sentIndex sentText sentNum sentScore
1 We are motivated by a novel application scenario: given the difficulty of typing on mobile devices, can we help reduce typing effort with message completion, especially in conversational settings? [sent-6, score-0.782]
2 Our experimental results on a large-scale dataset show that both components help reduce typing effort. [sent-9, score-0.302]
3 We also perform an information-theoretic study in this setting and examine the entropy of user-generated content, especially in conversational scenarios, to better understand predictability of user-generated English. [sent-10, score-0.498]
4 Text completion for user-generated texts: Consider a user who is chatting with her contact or posting to a social media site using a mobile device. [sent-18, score-0.441]
5 If we can predict the next word given the preceding words that were already typed in, we can help reduce the typing cost by offering users suggestions of possible completions of their partially typed messages (e. [sent-19, score-0.629]
6 If the intended word is ranked reasonably high, the user can select the word instead of typing it. [sent-22, score-0.443]
7 Assuming a lower cost associated with selections, this could lead to less typing effort for the user. [sent-23, score-0.346]
8 Each time, we propose candidate completions at the word level when the user is about to start a new word, or has partially entered the first few letters; once this word is successfully completed, we move on to the next one. [sent-33, score-0.224]
9 The response completion task: In addition, our task has another interesting difference from Shannon’s human experiment. [sent-35, score-0.525]
10 , an instant message sent by a contact), we have an additional source of contextual information in the stimulus, or the text which triggered the response that the user is trying to type. [sent-39, score-0.593]
11 That is, can we take advantage of this conversational setting and effectively use the information provided by stimulus to better predict the next word in the response? [sent-41, score-0.583]
12 We refer to this task as the response completion task. [sent-42, score-0.525]
13 Our task is different from “chatter-bots” (Weizenbaum, 1966), where the goal is to generate a response to an input that would resemble a human conversation partner. [sent-43, score-0.382]
14 Instead, we want to complete a response as the replier intends to. [sent-44, score-0.382]
15 al. (2011) experimented with automatic response generation in social media. [sent-46, score-0.382]
16 They had a similar conversational setting, but instead of completion based on partial input, they attempted to generate a response in its entirety given only the stimulus. [sent-47, score-0.641]
17 While many of the generated responses are deemed possible replies to the stimulus, they have a low chance of actually matching the real response given by the user: they reported BLEU scores between 0 and 2 for various systems. [sent-48, score-0.579]
18 In this paper, we propose a method for automatic response completion. [sent-51, score-0.382]
19 We construct a large-scale dataset of user-generated textual exchanges, and our experimental results show that both components help reduce typing effort. [sent-53, score-0.302]
20 In addition, to better understand predictability of user generated English, we perform an information-theoretic study in this conversational setting to investigate the entropy of user-generated content. [sent-54, score-0.551]
21 As discussed in Section 1, the response generation task (Ritter et al. [sent-61, score-0.382]
22 , 2003), we are not aware of much previous work that has taken advantage of information in the stimulus for word predictions in responses. [sent-65, score-0.458]
23 Previous work on entropy of language stems from the field of information theory (Shannon, 1948), starting with Shannon (1951). [sent-66, score-0.216]
24 , insights into the structure of language via information theory, entropy estimates via other techniques and/or for different languages, as well as a broad range of applications of such estimates) can be found in (Cover and King, 1978). [sent-69, score-0.216]
25 al. (1992) computed an upper bound for the entropy of English with a trigram model, using the Brown corpus. [sent-71, score-0.33]
26 There was also a recent study using entropy in the context of Web search (Mei and Church, 2008). [sent-74, score-0.216]
27 We perform entropy studies over texts generated in online settings which are more informal in nature. [sent-82, score-0.278]
28 Additionally, we utilize the properties of language predictability within a novel application for automatically completing responses in conversational settings. [sent-83, score-0.365]
29 3 Model In this section, we first state our problem more formally, followed by descriptions of the basic N-gram language model we use, as well as two approaches that model both stimulus and preceding words in response as the context for the next-word generation. [sent-86, score-0.888]
30 1 Problem definition Consider a stimulus-response pair, where the stimulus is a sequence of tokens s = (s1, s2 , . [sent-89, score-0.501]
31 , sm), and the response is a sequence of tokens r = (r1, r2, . [sent-92, score-0.425]
32 2 Generic Response Language Model First, we consider an N-gram language model trained on all responses in the training data as our generic response language model. [sent-107, score-0.581]
33 Normally, trigram models use back-off to both bigrams and unigrams; in order to compare the effectiveness of trigram models vs. [sent-109, score-0.228]
34 i) = $\lambda_1 \, P_2(r_{i+1} \mid r_i) + (1 - \lambda_1) \, P_1(r_{i+1})$. If we ignore the context provided by texts in the stimulus, we can simply generate and rank candidate words from the dictionary according to the generic response LM: P(ri+1 | r1. [sent-114, score-0.505]
35 al. (2011) have considered a related task of generating a response in its entirety given only the text in the stimulus. [sent-126, score-0.407]
36 They cast the problem as a translation task, where the stimulus is considered as the source language and the response is considered as the target language. [sent-127, score-0.84]
37 We can adapt this approach for our response completion task. [sent-128, score-0.525]
38 That is, let P(n) be the distribution of response length, let r0 be a possible completion of r1. [sent-135, score-0.525]
39 Instead, we take a greedy approach, and choose ri+1 which yields the optimal partial response (without looking ahead): P(r1. [sent-145, score-0.382]
40 4 Mixture Model One potential concern over applying the translation model is that the response can often contain novel information not implied by the stimulus. [sent-160, score-0.382]
41 Alternatively, one can model the response generation process with a mixture model: with probability λs, we generate a word according to a distribution over s (P(w | s)), and with probability 1 − λs, we generate a word using the response language model: P(ri+1 | s, r1. [sent-163, score-0.851]
42 More specifically, $P(r_{i+1} \mid s) = \mathbb{1}[r_{i+1} \in s] \,/\, |s|$. We can take λs to be a constant λselect, which can be estimated in the training data as the probability of a response token being a repetition of a token in the corresponding stimulus. [sent-172, score-0.478]
43 Given a new stimulus s, we then select the highest ranked topic as being representative of s. [sent-178, score-0.563]
44 4 Data In order to investigate text completion in a conversational setting, we need to construct a large-scale dataset with textual exchanges among users. [sent-184, score-0.403]
45 Many sites with a user comment environment allow other users to reply to existing comments, where the original comment and its reply can form a (stimulus, response) pair for our purposes. [sent-188, score-0.519]
46 News, where under each news article, a user can post a new comment or reply to an existing comment. [sent-190, score-0.326]
47 To ensure the reply is a direct response to the original comment, we took only the first reply to a comment, and consider the resulting pair as a textual exchange in the form of a (stimulus, response) pair. [sent-192, score-0.554]
48 A random sample yielded a total of 1,487,995 exchanges, representing 237,040 unique users posting responses to stimulus comments authored by 357,811 users. [sent-194, score-0.437]
49 , before tokenization), stimuli average at 59 tokens (332 characters), and responses average at 26 tokens (144 characters). [sent-197, score-0.404]
50 While this is a straight-forward measure to assess the overall quality of different top-k lists, it is not tailored to suit our specific task of response completion. [sent-207, score-0.382]
51 TypRed : Our main evaluation measure is based on “reduction in typing effort for a user of the system”, which is a more informative measure for our task. [sent-209, score-0.413]
52 We estimate the typing reduction via a hypothetical typing model in the following manner: Suppose we show top k predictions for a given setting. [sent-210, score-0.549]
53 The typing cost is then estimated to be the number of characters in the word lw; 2. [sent-213, score-0.355]
54 if the user spots the correct answer in the list, the cost for choosing the word is proportional to the rank of the word rankw, with a fixed cost ratio c0. [sent-214, score-0.334]
55 Suppose the user scrolls down the list using the down-arrow (↓) to reach the intended word (instead of typing); then rankw · c0 reflects the scrolling effort required, where c0 is the relative cost of scrolling down versus typing a character. [sent-215, score-0.718]
56 In general, pressing a fixed key can have a lower cost than typing a new one; in addition, we can imagine a virtual keyboard where navigational keys occupy bigger real estate and thus incur less cost to press. [sent-216, score-0.367]
57 < typing model assumes a user selects the intended word using an interface that is similar to a mousing device, the cost may increase with rankw at a sublinear rate; in that case, our measure will be overestimating the cost. [sent-230, score-0.605]
58 In order to have a consistent measure that always improves as the ranking improves, we assume a clever user who will choose to finish the word by typing or by selecting, depending on which cost is lower. [sent-231, score-0.461]
59 Combining these two cases under the clever-user model, we estimate the reduction in typing cost for every word as follows: (0. [sent-232, score-0.346]
60 3 Results Previous words from response observed: We first present results for the setting where only previous words from the response are provided as context. [sent-237, score-0.798]
61 A higher value of TypRed implies higher savings achieved in typing cost and thereby better prediction performance. [sent-242, score-0.336]
62 2 Experimental setup We run experiments using the models described in Section 3 under two different settings: (1) previous words from the response are provided, and (2) previous words from response + first c characters of the current word are provided. [sent-244, score-0.815]
63 During the candidate generation phase, for every position in the response message we present the top 1,000 candidates (as scored by the generic response language model or mixture models). [sent-245, score-0.939]
64 For the generic response language models, we set the interpolation weight λ1 = 0. [sent-247, score-0.421]
65 Recall that in all experiments, we set c0, the cost ratio of selecting a candidate from the ranked top-k list (via scrolling) versus typing a character, to a value of 0. [sent-253, score-0.367]
66 But we also experimented with a hypothetical setting where c0 = 1 and noticed that the trigram LM achieves a slightly lower but still significant typing reduction (TypRed score of 9. [sent-255, score-0.422]
67 This comparison also holds for c > 0; that is, a naive version of LM+Selection that selects a word from the stimulus whenever the prefix allows would not have worked well. [sent-270, score-0.458]
68 Previous words from response + first c characters of current word observed: Table 1 also compares the TypRed performance of all the models under settings where c > 0. [sent-274, score-0.464]
69 Our best model is able to save the user approximately 23% in terms of typing effort (according to TypRed scores). [sent-276, score-0.413]
70 Interestingly a lot less reduction was observed for c = 2: the second character, on average, does not improve the ranking enough to justify the cost of typing this extra character. [sent-277, score-0.346]
71 This demonstrates the utility of such a response completion system, especially since shorter words are predominant in conversational settings. [sent-291, score-0.616]
72 Interestingly, emoticons and informal expressions like : ) or lmao in the stimulus tend to evoke similar type of expressions in the response (as seen in Table 3). [sent-309, score-0.871]
73 6 Entropy of user comments We adapt the notion of predictability of English as examined by Shannon (1951) from letter-prediction to token-prediction, and define the predictability of English as how well the next token can be predicted when the preceding N tokens are known. [sent-326, score-0.566]
74 How much does the immediate context in the response help reduce the uncertainty? [sent-327, score-0.409]
75 And how about the corresponding stimuli: given the preceding N tokens, does the knowledge of the stimulus further reduce the uncertainty? [sent-329, score-0.691]
76 This conditional entropy reflects how much the uncertainty of the next token is reduced by knowing the preceding N−1 tokens. [sent-333, score-0.393]
77 Under this measure, is user-generated content more predictable or less predictable than the more formal “printed” English examined by Shannon? [sent-334, score-0.231]
78 To answer this question empirically, we construct a reference dataset written in more formal English (Df) to be compared against the user comments dataset described in Section 4 (Du). [sent-338, score-0.286]
79 5 Next, we compare both the entropy over unigrams and N-gram entropy in three datasets: the news article dataset described above, and comments data (Section 4) separated into stimuli and responses. [sent-350, score-0.73]
80 Note that datasets with different vocabulary size can lead to different entropy: the entropy of picking a word from the vocabulary uniformly at random would have been different. [sent-352, score-0.316]
81 Thus, we sample each dataset at different rates, and plot the (conditional) entropy in the sample against the corresponding vocabulary size. [sent-353, score-0.3]
82 As shown in Figure 4(a), the entropy of unigrams in Du (both stimuli and responses) is consistently lower than in Df. [sent-354, score-0.414]
83 6 On the other hand, both stimuli and responses exhibit higher uncertainty in bigram entropy (Figure 4(b)) and trigram entropy (Figure 4(c)). [sent-355, score-0.954]
84 We postulate that the difference in unigram entropy could be due to (a) more balanced topic coverage in Df vs. [sent-357, score-0.323]
85 If (b) is the main reason, however, the lower trigram entropy in Df would seem unexpected — shouldn’t professional journalists also have a more balanced use of different phrases? [sent-359, score-0.375]
86 Upon further contemplation, what we hypothesized earlier could be true: professional writers use the “proper” English expected in news coverage, which could limit [Footnote 5: We note that this does not guarantee the exact same topic distribution as in the comment data.] [sent-360, score-0.235]
87 Figure 4: Entropy of unigrams and N-gram entropy, plotted against vocabulary size. Panels: (a) entropy of unigrams; (b) bigram entropy (F2); (c) trigram entropy (F3). [sent-365, score-0.662]
88 Interestingly, distributions in the stimulus dataset are closer to news articles: they have a higher unigram entropy than responses, but a lower trigram entropy at comparable vocabulary sizes. [sent-367, score-1.119]
89 In particular, recall from Section 4 that our comments dataset contains roughly 237K repliers and 357K original com- vocabulary size Figure 5: Predicting the next word in responses: bigram entropy vs. [sent-368, score-0.385]
90 If higher trigram entropy is due to variance among different users, the stimulus dataset should have had a higher trigram entropy. [sent-372, score-0.936]
91 2 Information in stimuli We now examine the next question: does knowing words in the stimulus further reduce the uncertainty of the next word in the response? [sent-375, score-0.762]
92 For simplicity, we model the stimulus as a collection of unigrams. [sent-376, score-0.458]
93 Consider the following conditional entropy: $G_N = -\sum_{i,j,k} p(b_i, j, s_k) \log_2 p(j \mid b_i, s_k)$, where bi is a block of N − 1 tokens in a response r, j is an arbitrary token following bi, and sk is an arbitrary token in the corresponding stimulus s for r. [sent-377, score-1.096]
94 Let V be the vocabulary of user comments (ignoring, for now, differences between responses and stimuli). [sent-388, score-0.375]
95 Still, since GN shows a consistent improvement over FN, there could be more information in the stimulus that we are not yet fully utilizing, which can be interesting future work. [sent-406, score-0.458]
96 7 Conclusions In this paper, we examined a novel application: automatic response completion in conversational settings. [sent-407, score-0.65]
97 We investigated the effectiveness of several models that incorporate contextual information provided by the partially typed response as well as the stimulus. [sent-408, score-0.467]
98 We found that the partially typed response provides strong signals. [sent-409, score-0.433]
99 In addition, using a mixture model which also incorporates stimulus content yielded the best overall result. [sent-410, score-0.545]
100 Our analysis (entropy estimates along with upper-bound numbers observed from experiments) suggests that there can be interesting future work to explore the contextual information provided by the stimulus more effectively and further improve the response completion task. [sent-412, score-1.017]
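Sentences 34, 41 and 42 above describe an interpolated bigram response LM and a mixture that, with probability λselect, picks a token from the stimulus. The following is a minimal Python sketch of how those two pieces could be combined to rank next-word candidates. It is an illustration, not the authors' implementation: the function names, the default weights `lam1=0.5` and `lam_select=0.1` (the paper's values are truncated in this extract), and the use of unsmoothed maximum-likelihood counts are all assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_lm(responses):
    """Count unigrams and bigrams over tokenized training responses."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for tokens in responses:
        unigrams.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            bigrams[prev][nxt] += 1
    return unigrams, bigrams


def lm_prob(word, prev, unigrams, bigrams, lam1=0.5):
    """Interpolated estimate: lam1 * P2(word | prev) + (1 - lam1) * P1(word)."""
    total = sum(unigrams.values())
    p1 = unigrams[word] / total if total else 0.0
    followers = bigrams.get(prev, Counter())
    prev_total = sum(followers.values())
    p2 = followers[word] / prev_total if prev_total else 0.0
    return lam1 * p2 + (1 - lam1) * p1


def selection_prob(word, stimulus):
    """P(word | s) = 1[word in s] / |s| (sentence 42)."""
    if not stimulus:
        return 0.0
    return (1.0 if word in stimulus else 0.0) / len(stimulus)


def rank_candidates(prev, stimulus, unigrams, bigrams,
                    lam_select=0.1, lam1=0.5, k=5):
    """Rank candidates by the mixture of stimulus selection and response LM."""
    vocab = set(unigrams) | set(stimulus)
    scores = {
        w: lam_select * selection_prob(w, stimulus)
        + (1 - lam_select) * lm_prob(w, prev, unigrams, bigrams, lam1)
        for w in vocab
    }
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

With `lam_select=0` the ranking falls back to the generic response LM, which makes it easy to compare the two settings on the same data.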
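Sentences 52 to 59 above describe the clever-user typing model behind TypRed: typing a word costs its length in characters, selecting it from the list costs rankw · c0, and the user takes whichever is cheaper. The per-word formula is cut off in this extract, so the sketch below is only one plausible reading. The aggregation into a percentage of characters saved, the default `c0=0.5`, and the names `word_cost` and `typing_reduction` are assumptions made for illustration.

```python
def word_cost(word, rank, c0=0.5):
    """Effort for one word under the clever-user assumption.

    rank is the 1-based position of the word in the suggestion list,
    or None if the word was not suggested at all.
    """
    typing_cost = float(len(word))
    if rank is None:
        return typing_cost
    # The clever user picks the cheaper of typing and scrolling to the word.
    return min(typing_cost, rank * c0)


def typing_reduction(words_with_ranks, c0=0.5):
    """Percentage of typing effort saved over a sequence of (word, rank) pairs."""
    total = sum(len(word) for word, _ in words_with_ranks)
    if total == 0:
        return 0.0
    saved = sum(len(word) - word_cost(word, rank, c0)
                for word, rank in words_with_ranks)
    return 100.0 * saved / total
```

Because the cost is capped at the typing cost, a worse ranking can never make this measure negative, which matches the stated goal of a measure that always improves as the ranking improves.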
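Sentences 73 to 76 above define predictability through the conditional entropy of the next token given the preceding N − 1 tokens, in the style of Shannon's F_N = −Σ p(b_i, j) log2 p(j | b_i). The following sketch computes a plug-in (maximum-likelihood) estimate of that quantity from tokenized text. It is a simplified illustration: the function name is hypothetical, and estimating on the same sample without smoothing or held-out data is an assumption that will understate entropy on small corpora.

```python
import math
from collections import Counter


def conditional_entropy(sentences, n):
    """Plug-in estimate of F_n in bits: entropy of a token given the n-1 before it.

    With n = 1 the context is empty and the result is the unigram entropy.
    """
    joint = Counter()    # counts of (block of n-1 tokens, next token)
    context = Counter()  # counts of the block alone
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            block = tuple(tokens[i:i + n - 1])
            nxt = tokens[i + n - 1]
            joint[(block, nxt)] += 1
            context[block] += 1
    total = sum(joint.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for (block, nxt), count in joint.items():
        p_joint = count / total
        p_cond = count / context[block]
        entropy -= p_joint * math.log2(p_cond)
    return entropy
```

The stimulus-conditioned quantity G_N in sentence 93 could be estimated the same way by extending the context key with a stimulus token, at the price of much sparser counts.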
wordName wordTfidf (topN-words)
[('stimulus', 0.458), ('response', 0.382), ('typred', 0.284), ('typing', 0.241), ('entropy', 0.216), ('ri', 0.187), ('responses', 0.16), ('stimuli', 0.158), ('completion', 0.143), ('exchanges', 0.135), ('user', 0.13), ('shannon', 0.123), ('bi', 0.121), ('trigram', 0.114), ('predictability', 0.114), ('lm', 0.111), ('conversational', 0.091), ('mobile', 0.091), ('mixture', 0.087), ('predictable', 0.086), ('reply', 0.086), ('topic', 0.08), ('comment', 0.079), ('ritter', 0.064), ('rankw', 0.063), ('cost', 0.063), ('interface', 0.061), ('users', 0.059), ('df', 0.059), ('completions', 0.057), ('du', 0.055), ('gn', 0.053), ('typed', 0.051), ('characters', 0.051), ('rank', 0.05), ('vocabulary', 0.05), ('bigram', 0.05), ('message', 0.049), ('preceding', 0.048), ('token', 0.048), ('scrolling', 0.047), ('intended', 0.047), ('fn', 0.045), ('professional', 0.045), ('sk', 0.044), ('tokens', 0.043), ('reduction', 0.042), ('effort', 0.042), ('printed', 0.041), ('knowing', 0.041), ('unigrams', 0.04), ('uncertainty', 0.04), ('generic', 0.039), ('examine', 0.038), ('versus', 0.038), ('entered', 0.037), ('replies', 0.037), ('comments', 0.035), ('yahoo', 0.035), ('dataset', 0.034), ('weeks', 0.034), ('examined', 0.034), ('provided', 0.034), ('messages', 0.032), ('savings', 0.032), ('predictive', 0.032), ('bulyko', 0.032), ('farmer', 0.032), ('indus', 0.032), ('instant', 0.032), ('mackenzie', 0.032), ('moradi', 0.032), ('nurmi', 0.032), ('reischel', 0.032), ('teahan', 0.032), ('informal', 0.031), ('news', 0.031), ('settings', 0.031), ('query', 0.028), ('answer', 0.028), ('devices', 0.027), ('reduce', 0.027), ('postulate', 0.027), ('clever', 0.027), ('claude', 0.027), ('kucera', 0.027), ('contact', 0.027), ('zipf', 0.027), ('conversations', 0.026), ('interestingly', 0.026), ('brown', 0.026), ('site', 0.025), ('scenarios', 0.025), ('ranked', 0.025), ('formal', 0.025), ('mei', 0.025), ('entirety', 0.025), ('interfaces', 0.025), ('posting', 0.025), ('hypothetical', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
2 0.11225528 21 emnlp-2012-Assessment of ESL Learners' Syntactic Competence Based on Similarity Measures
Author: Su-Youn Yoon ; Suma Bhat
Abstract: This study presents a novel method that measures English language learners’ syntactic competence towards improving automated speech scoring systems. In contrast to most previous studies which focus on the length of production units such as the mean length of clauses, we focused on capturing the differences in the distribution of morpho-syntactic features or grammatical expressions across proficiency. We estimated the syntactic competence through the use of corpus-based NLP techniques. Assuming that the range and sophistication of grammatical expressions can be captured by the distribution of Part-of-Speech (POS) tags, vector space models of POS tags were constructed. We use a large corpus of English learners’ responses that are classified into four proficiency levels by human raters. Our proposed feature measures the similarity of a given response with the most proficient group and then estimates the learner’s syntactic competence level. Widely outperforming the state-of-the-art measures of syntactic complexity, our method attained a significant correlation with human-rated scores. The correlation between human-rated scores and features based on manual transcription was 0.43 and the same based on ASR-hypothesis was slightly lower, 0.42. An important advantage of our method is its robustness against speech recognition errors not to mention the simplicity of feature generation that captures a reasonable set of learner-specific syntactic errors.
3 0.10515279 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
Author: Michael J. Paul
Abstract: Recent work has explored the use of hidden Markov models for unsupervised discourse and conversation modeling, where each segment or block of text such as a message in a conversation is associated with a hidden state in a sequence. We extend this approach to allow each block of text to be a mixture of multiple classes. Under our model, the probability of a class in a text block is a log-linear function of the classes in the previous block. We show that this model performs well at predictive tasks on two conversation data sets, improving thread reconstruction accuracy by up to 15 percentage points over a standard HMM. Additionally, we show quantitatively that the induced word clusters correspond to speech acts more closely than baseline models.
4 0.091720045 60 emnlp-2012-Generative Goal-Driven User Simulation for Dialog Management
Author: Aciel Eshky ; Ben Allison ; Mark Steedman
Abstract: User simulation is frequently used to train statistical dialog managers for task-oriented domains. At present, goal-driven simulators (those that have a persistent notion of what they wish to achieve in the dialog) require some task-specific engineering, making them impossible to evaluate intrinsically. Instead, they have been evaluated extrinsically by means of the dialog managers they are intended to train, leading to circularity of argument. In this paper, we propose the first fully generative goal-driven simulator that is fully induced from data, without hand-crafting or goal annotation. Our goals are latent, and take the form of topics in a topic model, clustering together semantically equivalent and phonetically confusable strings, implicitly modelling synonymy and speech recognition noise. We evaluate on two standard dialog resources, the Communicator and Let’s Go datasets, and demonstrate that our model has substantially better fit to held out data than competing approaches. We also show that features derived from our model allow significantly greater improvement over a baseline at distinguishing real from randomly permuted dialogs.
5 0.08524362 128 emnlp-2012-Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
Author: Shixiang Lu ; Wei Wei ; Xiaoyin Fu ; Bo Xu
Abstract: In this paper, we propose a novel translation model (TM) based cross-lingual data selection model for language model (LM) adaptation in statistical machine translation (SMT), from word models to phrase models. Given a source sentence in the translation task, this model directly estimates the probability that a sentence in the target LM training corpus is similar. Compared with the traditional approaches which utilize the first pass translation hypotheses, cross-lingual data selection model avoids the problem of noisy proliferation. Furthermore, phrase TM based cross-lingual data selection model is more effective than the traditional approaches based on bag-of-words models and word-based TM, because it captures contextual information in modeling the selection of phrase as a whole. Experiments conducted on large-scale data sets demonstrate that our approach significantly outperforms the state-of-the-art approaches on both LM perplexity and SMT performance.
6 0.072477348 102 emnlp-2012-Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
7 0.066324249 90 emnlp-2012-Modelling Sequential Text with an Adaptive Topic Model
8 0.064009361 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
9 0.063908458 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
10 0.063430965 134 emnlp-2012-User Demographics and Language in an Implicit Social Network
11 0.060926717 8 emnlp-2012-A Phrase-Discovering Topic Model Using Hierarchical Pitman-Yor Processes
12 0.059460789 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
13 0.048670869 32 emnlp-2012-Detecting Subgroups in Online Discussions by Modeling Positive and Negative Relations among Participants
14 0.047219075 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities
15 0.045842659 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
16 0.044308595 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques
17 0.043409038 115 emnlp-2012-SSHLDA: A Semi-Supervised Hierarchical Topic Model
18 0.042923924 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
19 0.042864371 35 emnlp-2012-Document-Wide Decoding for Phrase-Based Statistical Machine Translation
20 0.041199293 5 emnlp-2012-A Discriminative Model for Query Spelling Correction with Latent Structural SVM
topicId topicWeight
[(0, 0.174), (1, 0.018), (2, 0.01), (3, 0.124), (4, -0.149), (5, 0.013), (6, 0.014), (7, -0.049), (8, 0.042), (9, -0.042), (10, 0.085), (11, -0.091), (12, -0.156), (13, -0.05), (14, -0.008), (15, -0.117), (16, -0.033), (17, 0.006), (18, -0.064), (19, -0.03), (20, 0.01), (21, -0.094), (22, 0.003), (23, -0.094), (24, 0.038), (25, 0.009), (26, 0.044), (27, -0.1), (28, -0.024), (29, -0.075), (30, 0.039), (31, -0.044), (32, 0.09), (33, 0.081), (34, -0.133), (35, -0.025), (36, 0.038), (37, -0.155), (38, -0.005), (39, 0.343), (40, -0.175), (41, -0.215), (42, -0.105), (43, 0.031), (44, 0.193), (45, 0.066), (46, -0.043), (47, 0.065), (48, 0.07), (49, 0.005)]
simIndex simValue paperId paperTitle
same-paper 1 0.95109218 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
2 0.49222323 21 emnlp-2012-Assessment of ESL Learners' Syntactic Competence Based on Similarity Measures
Author: Su-Youn Yoon ; Suma Bhat
Abstract: This study presents a novel method that measures English language learners’ syntactic competence towards improving automated speech scoring systems. In contrast to most previous studies which focus on the length of production units such as the mean length of clauses, we focused on capturing the differences in the distribution of morpho-syntactic features or grammatical expressions across proficiency. We estimated the syntactic competence through the use of corpus-based NLP techniques. Assuming that the range and sophistication of grammatical expressions can be captured by the distribution of Part-of-Speech (POS) tags, vector space models of POS tags were constructed. We use a large corpus of English learners’ responses that are classified into four proficiency levels by human raters. Our proposed feature measures the similarity of a given response with the most proficient group and then estimates the learner’s syntactic competence level. Widely outperforming the state-of-the-art measures of syntactic complexity, our method attained a significant correlation with human-rated scores. The correlation between human-rated scores and features based on manual transcription was 0.43 and the same based on ASR-hypothesis was slightly lower, 0.42. An important advantage of our method is its robustness against speech recognition errors not to mention the simplicity of feature generation that captures a reasonable set of learner-specific syntactic errors.
3 0.46859184 128 emnlp-2012-Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
Author: Shixiang Lu ; Wei Wei ; Xiaoyin Fu ; Bo Xu
Abstract: In this paper, we propose a novel translation model (TM) based cross-lingual data selection model for language model (LM) adaptation in statistical machine translation (SMT), from word models to phrase models. Given a source sentence in the translation task, this model directly estimates the probability that a sentence in the target LM training corpus is similar. Compared with the traditional approaches which utilize the first pass translation hypotheses, cross-lingual data selection model avoids the problem of noisy proliferation. Furthermore, phrase TM based cross-lingual data selection model is more effective than the traditional approaches based on bag-of-words models and word-based TM, because it captures contextual information in modeling the selection of phrase as a whole. Experiments conducted on large-scale data sets demonstrate that our approach significantly outperforms the state-of-the-art approaches on both LM perplexity and SMT performance.
4 0.44384325 60 emnlp-2012-Generative Goal-Driven User Simulation for Dialog Management
Author: Aciel Eshky ; Ben Allison ; Mark Steedman
Abstract: User simulation is frequently used to train statistical dialog managers for task-oriented domains. At present, goal-driven simulators (those that have a persistent notion of what they wish to achieve in the dialog) require some task-specific engineering, making them impossible to evaluate intrinsically. Instead, they have been evaluated extrinsically by means of the dialog managers they are intended to train, leading to circularity of argument. In this paper, we propose the first fully generative goal-driven simulator that is fully induced from data, without hand-crafting or goal annotation. Our goals are latent, and take the form of topics in a topic model, clustering together semantically equivalent and phonetically confusable strings, implicitly modelling synonymy and speech recognition noise. We evaluate on two standard dialog resources, the Communicator and Let’s Go datasets, and demonstrate that our model has substantially better fit to held-out data than competing approaches. We also show that features derived from our model allow significantly greater improvement over a baseline at distinguishing real from randomly permuted dialogs.
5 0.35237601 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
Author: Vera Demberg ; Asad Sayeed ; Philip Gorinski ; Nikolaos Engonopoulos
Abstract: We present results of a novel experiment to investigate speech production in conversational data that links speech rate to information density. We provide the first evidence for an association between syntactic surprisal and word duration in recorded speech. Using the AMI corpus which contains transcriptions of focus group meetings with precise word durations, we show that word durations correlate with syntactic surprisal estimated from the incremental Roark parser over and above simpler measures, such as word duration estimated from a state-of-the-art text-to-speech system and word frequencies, and that the syntactic surprisal estimates are better predictors of word durations than a simpler version of surprisal based on trigram probabilities. This result supports the uniform information density (UID) hypothesis and points a way to more realistic artificial speech generation.
6 0.34579432 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
7 0.31081766 134 emnlp-2012-User Demographics and Language in an Implicit Social Network
8 0.28556934 55 emnlp-2012-Forest Reranking through Subtree Ranking
9 0.27744395 118 emnlp-2012-Source Language Adaptation for Resource-Poor Machine Translation
10 0.27063715 17 emnlp-2012-An "AI readability" Formula for French as a Foreign Language
11 0.26043862 102 emnlp-2012-Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
12 0.24792618 8 emnlp-2012-A Phrase-Discovering Topic Model Using Hierarchical Pitman-Yor Processes
13 0.24159408 33 emnlp-2012-Discovering Diverse and Salient Threads in Document Collections
14 0.23861855 48 emnlp-2012-Exploring Adaptor Grammars for Native Language Identification
15 0.23769754 41 emnlp-2012-Entity based QA Retrieval
16 0.22964884 32 emnlp-2012-Detecting Subgroups in Online Discussions by Modeling Positive and Negative Relations among Participants
17 0.22636935 78 emnlp-2012-Learning Lexicon Models from Search Logs for Query Expansion
18 0.22154917 49 emnlp-2012-Exploring Topic Coherence over Many Models and Many Topics
19 0.21603779 121 emnlp-2012-Supervised Text-based Geolocation Using Language Models on an Adaptive Grid
20 0.21527573 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
topicId topicWeight
[(2, 0.031), (6, 0.297), (16, 0.027), (25, 0.019), (34, 0.073), (45, 0.01), (60, 0.089), (63, 0.07), (64, 0.029), (65, 0.031), (70, 0.021), (74, 0.056), (76, 0.042), (80, 0.018), (86, 0.036), (94, 0.018), (95, 0.037)]
simIndex simValue paperId paperTitle
1 0.77436936 117 emnlp-2012-Sketch Algorithms for Estimating Point Queries in NLP
Author: Amit Goyal ; Hal Daume III ; Graham Cormode
Abstract: Many NLP tasks rely on accurate statistics from large corpora. Tracking complete statistics is memory intensive, so recent work has proposed using compact approximate “sketches” of frequency distributions. We describe 10 sketch methods, including existing and novel variants. We compare and study the errors (over-estimation and under-estimation) made by the sketches. We evaluate several sketches on three important NLP problems. Our experiments show that one sketch performs best for all three tasks.
same-paper 2 0.74555105 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
Author: Bo Pang ; Sujith Ravi
Abstract: The question “how predictable is English?” has long fascinated researchers. While prior work has focused on formal English typically used in news articles, we turn to texts generated by users in online settings that are more informal in nature. We are motivated by a novel application scenario: given the difficulty of typing on mobile devices, can we help reduce typing effort with message completion, especially in conversational settings? We propose a method for automatic response completion. Our approach models both the language used in responses and the specific context provided by the original message. Our experimental results on a large-scale dataset show that both components help reduce typing effort. We also perform an information-theoretic study in this setting and examine the entropy of user-generated content, especially in conversational scenarios, to better understand predictability of user-generated English.
3 0.5448795 52 emnlp-2012-Fast Large-Scale Approximate Graph Construction for NLP
Author: Amit Goyal ; Hal Daume III ; Raul Guerra
Abstract: Many natural language processing problems involve constructing large nearest-neighbor graphs. We propose a system called FLAG to construct such graphs approximately from large data sets. To handle the large amount of data, our algorithm maintains approximate counts based on sketching algorithms. To find the approximate nearest neighbors, our algorithm pairs a new distributed online-PMI algorithm with novel fast approximate nearest neighbor search algorithms (variants of PLEB). These algorithms return the approximate nearest neighbors quickly. We show our system’s efficiency in both intrinsic and extrinsic experiments. We further evaluate our fast search algorithms both quantitatively and qualitatively on two NLP applications.
4 0.47591263 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
Author: Jayant Krishnamurthy ; Tom Mitchell
Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.
5 0.47446918 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
Author: Michael J. Paul
Abstract: Recent work has explored the use of hidden Markov models for unsupervised discourse and conversation modeling, where each segment or block of text such as a message in a conversation is associated with a hidden state in a sequence. We extend this approach to allow each block of text to be a mixture of multiple classes. Under our model, the probability of a class in a text block is a log-linear function of the classes in the previous block. We show that this model performs well at predictive tasks on two conversation data sets, improving thread reconstruction accuracy by up to 15 percentage points over a standard HMM. Additionally, we show quantitatively that the induced word clusters correspond to speech acts more closely than baseline models.
6 0.47046757 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
7 0.46353814 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
8 0.46075711 103 emnlp-2012-PATTY: A Taxonomy of Relational Patterns with Semantic Types
9 0.46074098 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
10 0.45930681 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
11 0.45810047 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
12 0.45780671 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
13 0.45706114 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
14 0.45525733 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
15 0.45498696 110 emnlp-2012-Reading The Web with Learned Syntactic-Semantic Inference Rules
16 0.45493105 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
17 0.45459095 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
18 0.45301715 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
19 0.45290884 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
20 0.45260498 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT