acl acl2012 acl2012-14 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Asli Celikyilmaz ; Dilek Hakkani-Tur
Abstract: We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects regarding utterance’s target domain (e.g. movies), intention (e.g., finding a movie) along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. Using utterances from five domains, our approach shows up to 4.5% improvement on domain and dialog act performance over cascaded approach in which each semantic component is learned sequentially and a supervised joint learning model (which requires fully labeled data).
Reference: text
sentIndex sentText sentNum sentScore
1 Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects regarding utterance’s target domain (e. [sent-3, score-0.618]
2 We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. [sent-10, score-0.565]
3 Using utterances from five domains, our approach shows up to 4. [sent-11, score-0.265]
4 5% improvement on domain and dialog act performance over cascaded approach in which each semantic component is learned sequentially and a supervised joint learning model (which requires fully labeled data). [sent-12, score-1.173]
5 1 Introduction. Virtual personal assistance (VPA) is a human-to-machine dialog system, which is designed to perform tasks such as making reservations at restaurants, checking flight statuses, or planning weekend activities. [sent-13, score-0.422]
6 While target domain corresponds to the context of an utterance in a dialog, the dialog act represents overall intent of an utterance. [sent-15, score-1.097]
7 Sample utterances on the ’plan a night out’ scenario: (I) I’m in the mood for [indian] food tonight, show me … (II) show me the theaters in [austin] playing [iron man 2]. [sent-18, score-0.265]
8 We build a joint understanding framework and introduce a multi-layer context model for semantic representation of utterances of multiple domains. [sent-25, score-0.431]
9 , the domain and slot or dialog act and slot components together. [sent-32, score-1.701]
10 …dependency and language variability issues, a model that considers dependencies between semantic components and utilizes information from large bodies of unlabeled text can be beneficial for SLU. [sent-36, score-0.259]
11 We incorporate prior knowledge that we observe in web search query logs as constraints on these latent aspects. [sent-39, score-0.283]
12 Our model can discover associations between words within a multi-layered aspect model, in which some words are indicative of higher layer (meta) aspects (domain or dialog act components), while others are indicative of lower layer specific entities. [sent-40, score-0.942]
13 However data sources in VPA systems pose new challenges, such as variability and ambiguities in natural language, or short utterances that rarely contain contextual information, etc. [sent-50, score-0.296]
14 Thus, SLU plays an important role in allowing any sophisticated spoken dialog system (e. [sent-51, score-0.475]
15 Earlier work takes dialog act identification as a classification task to capture the user’s intentions (Margolis et al. [sent-57, score-0.775]
16 , 2010) and slot filling as a sequence learning task specific to a given domain class (Wang et al. [sent-58, score-0.535]
17 Their discriminative approach represents semantic slots and discourse-level utterance labels (domain or dialog act) in a single structure to encode dependencies. [sent-69, score-0.814]
18 However, their model requires fully labeled utterances for training, which can be time consuming and expensive to generate for dynamic systems. [sent-70, score-0.341]
19 Our joint model can discover domain D, and user’s act A as higher layer latent concepts of utterances in relation to lower layer latent semantic topics (slots) S such as named-entities (”New York”) or context bearing non-named entities (”vegan ”). [sent-73, score-1.18]
20 , directed acyclic graphs representing mixtures of hierarchical topic structures, where upper level topics are multinomial over lower level topics in a hierarchy. [sent-77, score-0.253]
21 Concretely, correlated topics eliminate assignment of semantic tags to segments in an utterance that belong to other domains, e. [sent-80, score-0.337]
22 Be- ing generative, our model can incorporate unlabeled utterances and encode prior information of concepts. [sent-83, score-0.437]
23 Our corpus mainly contains NL utterances (”show me the nearest dimsum places”) and some keyword queries (”iron man 2 trailers”). [sent-86, score-0.326]
24 Our corpus contains utterances from KD=4 main domains ∈ {movies, hotels, restaurants, events}, as well as an out-of-domain ’other’ class. [sent-89, score-0.265]
25 Each utterance has one dialog act (A) associated with it. [sent-90, score-0.743]
26 We assume a fixed number of possible dialog acts KA for each domain. [sent-91, score-0.478]
27 Semantic Tags, slots (S) are lexical units (segments) of an utterance, which we classify into two types: domain-independent slots that are shared across all domains, (e. [sent-92, score-0.308]
28 For tractability, we consider a fixed number of latent slot types KS. [sent-99, score-0.394]
29 We represent domain and dialog act components as meta-variables of utterances. [sent-101, score-0.989]
30 In our model, each utterance u is associated with domain and dialog act topics. [sent-105, score-1.097]
31 A word wuj in u is generated by first selecting a domain and an act topic, and then a slot topic over the words of u. [sent-106, score-1.199]
32 The domain-dependent slots in utterances are usually not dependent on the dialog act. [sent-107, score-0.841]
33 For instance, while ”find [hugo] trailer” and ”show me where [hugo] is playing” both have a movie-name slot (”hugo”), they have different dialog acts, i. [sent-108, score-0.778]
34 We predict posterior probabilities for the domain P˜(d ∈ D|u), dialog act P˜(a ∈ A|ud), and slots P˜(sj ∈ S|wuj, d, sj−1) of words wuj in u, in sequence. [sent-111, score-1.237]
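To make the prediction order above concrete, here is a minimal decoding sketch in Python; it is purely illustrative, not the paper's inference procedure, and the parameter names, the random parameters, and the greedy argmax strategy are all assumptions: pick a domain, then a dialog act given the domain, then each slot state conditioned on the current word and the previous slot state.

import numpy as np

rng = np.random.default_rng(1)
KD, KA, KS, V = 5, 8, 10, 50                          # hypothetical sizes
theta_D = rng.dirichlet(np.ones(KD))                  # stand-in for P~(d | u)
theta_A = rng.dirichlet(np.ones(KA), size=KD)         # stand-in for P~(a | u, d)
pi_S    = rng.dirichlet(np.ones(KS), size=KD)         # initial slot state | domain
trans_S = rng.dirichlet(np.ones(KS), size=(KD, KS))   # slot transitions | domain
phi_S   = rng.dirichlet(np.ones(V), size=KS)          # word | slot state

def decode(word_ids):
    """Greedy cascade: domain, then act given the domain, then the slot state of
    each word given the word and the previous slot state (HMM-style)."""
    d = int(np.argmax(theta_D))
    a = int(np.argmax(theta_A[d]))
    slots, prev = [], None
    for w in word_ids:
        trans = pi_S[d] if prev is None else trans_S[d, prev]
        prev = int(np.argmax(trans * phi_S[:, w]))     # P~(s_j | w_uj, d, s_{j-1})
        slots.append(prev)
    return d, a, slots

print(decode([3, 17, 42]))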
35 The click information can be used to infer domain class labels, and therefore, can provide (noisy) supervision in training domain classifiers. [sent-115, score-0.358]
36 We inject them as nonuniform priors over domain and dialog act parameters in §4. [sent-126, score-0.913]
37 (Footnote 1) Two utterances can be intrinsically related but contain no common terms, e. [sent-135, score-0.33]
38 ((ψE) Entity List Prior; (ψG) Web N-Gram Context Prior.) …domain, dialog act and slot in a hierarchy, each consisting of KD, KA, KS components. [sent-149, score-1.099]
39 We represent each entity list as observed nonuniform priors ψE and inject them into our joint learning process as V sparse multinomial distributions over latent topics D, and S to ”guide” the generation of utterances (Fig. [sent-154, score-0.726]
40 In our MCM model, we assume that each utterance is represented as a hidden Markov model with KS slot states. [sent-164, score-0.356]
41 Once domain Du and act Aud topics are sampled for u, a slot state topic Sujd is drawn to generate each segment wuj of u by considering the word-tag sequence frequencies based on a simple HMM assumption, similar to the content models of (Sauper et al. [sent-166, score-1.233]
42 Initial and transition probability distributions over the HMM states are sampled from a Dirichlet distribution over slots θdSs. [sent-168, score-0.25]
43 Each slot state s generates words according to a multinomial word distribution φsS. [sent-169, score-0.404]
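A minimal sketch of the generative story described in this section, assuming only numpy, with hypothetical sizes and plain symmetric Dirichlet priors (the actual model uses the asymmetric, web-informed priors of §4): sample a domain for the utterance, an act given the domain, then an HMM over slot states that emits the words.

import numpy as np

rng = np.random.default_rng(0)
KD, KA, KS, V = 5, 8, 10, 1000          # domains, acts, slot states, vocabulary (hypothetical)
aD, aA, aS, beta = 1.0, 1.0, 1.0, 0.1   # symmetric concentrations, for this sketch only

theta_D = rng.dirichlet(aD * np.ones(KD))                  # domain proportions
theta_A = rng.dirichlet(aA * np.ones(KA), size=KD)         # act | domain
pi_S    = rng.dirichlet(aS * np.ones(KS), size=KD)         # initial slot state | domain
trans_S = rng.dirichlet(aS * np.ones(KS), size=(KD, KS))   # slot transitions | domain
phi_S   = rng.dirichlet(beta * np.ones(V), size=KS)        # word | slot state, phi_s ~ Dir(beta)

def generate_utterance(n_words):
    """Ancestral sampling of one utterance: (domain, act, slot sequence, word ids)."""
    d = rng.choice(KD, p=theta_D)            # utterance-level domain topic Du
    a = rng.choice(KA, p=theta_A[d])         # dialog act topic Aud given the domain
    s = rng.choice(KS, p=pi_S[d])            # initial slot state
    slots, words = [], []
    for _ in range(n_words):
        words.append(rng.choice(V, p=phi_S[s]))
        slots.append(s)
        s = rng.choice(KS, p=trans_S[d, s])  # Markov transition between slot states
    return d, a, slots, words

print(generate_utterance(6))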
44 Every time a wj is sampled for a domain d, we increment its count in a V × KD matrix, a degree of domain-bearing words. [sent-171, score-0.331]
45 Similarly, we keep track of dialog act and slot bearing words in V × KA and V × KS matrices, MA and MS (shown as red arrows in Fig 1). [sent-172, score-1.15]
46 7: end for; 8: draw φsS ∼ Dir(β) for each slot type s ← 1, …, KS. [sent-194, score-0.356]
47 ‡ Here the HMM assumption over utterance words is used. In hierarchical topic models, prior … [sent-215, score-0.34]
48 , 2009)) or in terms of pre-learnt topics encoded as prior knowledge on topic distributions in documents (Reisinger and Paşca, 2009). [sent-222, score-0.282]
49 Different from earlier work though, we also inject knowledge that we extract from several resources including entity lists from web search query click logs as well as seed labeled training utterances as prior information. [sent-226, score-0.716]
50 We constrain the generation of the semantic components of our model by encoding prior knowledge in terms of asymmetric Dirichlet topic priors α=(αm1,…,αmK), [sent-227, score-0.354]
51 where each kth topic has a prior weight αk=αmk, with varying base measure m=(m1,…,mK). [sent-230, score-0.282]
52 We update the parameter vectors of the Dirichlet domain prior αDu. [sent-234, score-0.291]
53 Because base measure updates are dependent on prior knowledge of corpus words, each utterance u gets a different base measure. [sent-239, score-0.449]
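As a small illustration of this asymmetric prior construction (the function name, the alpha value, and the example base measure are assumptions, not values from the paper), the per-topic concentration vector is simply the scalar weight times the utterance-specific base measure:

import numpy as np

def asymmetric_dirichlet_prior(alpha, base_measure):
    """Concentration vector (alpha*m_1, ..., alpha*m_K) for an asymmetric
    Dirichlet prior, given a base measure m normalized to sum to 1."""
    m = np.asarray(base_measure, dtype=float)
    m = m / m.sum()
    return alpha * m

# Hypothetical utterance whose words mostly signal the 'hotel' domain.
domains = ["movie", "hotel", "restaurant", "event", "other"]
base_m = [0.05, 0.70, 0.10, 0.05, 0.10]          # utterance-specific base measure
print(dict(zip(domains, asymmetric_dirichlet_prior(10.0, base_m))))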
54 Similarly, we update the parameter vectors of the Dirichlet dialog act and slot priors, αAu and αSu, analogous to the domain prior αD·ψDu, where ψDu={ψuDd}, d=1,…,KD. [sent-240, score-1.196]
55 Before describing the base measure updates for the domain, act and slot Dirichlet priors, we explain the constraining prior knowledge parameters below: [sent-249, score-0.906]
56 Entity List Base Measure (ψEj): Entity features are indicative of domains and slots, and MCM utilizes these features while sampling topics. [sent-250, score-0.367]
57 For instance, entities hotel-name ”Hilton” and location ”New York” are discriminative features in classifying ”find nice cheap double room in New York Hilton” into correct domain (hotel) and slot (hotelname) clusters. [sent-251, score-0.535]
58 We represent entity lists corresponding to known domains as multinomial distributions where each entry is the probability of entity word wj used in the domain d. [sent-252, score-0.45]
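A hypothetical sketch of how such entity-list multinomials could be estimated; the (entity, domain) click counts below are invented, and only the per-entity normalization reflects the description above:

from collections import defaultdict

# Invented click-log counts: how often an entity n-gram co-occurs with queries
# whose clicked URLs map to a given domain.
entity_domain_counts = {
    ("hilton", "hotel"): 120, ("hilton", "restaurant"): 5,
    ("new york", "hotel"): 40, ("new york", "event"): 35, ("new york", "movie"): 10,
    ("iron man 2", "movie"): 90,
}

totals = defaultdict(float)
for (entity, _), count in entity_domain_counts.items():
    totals[entity] += count

# For each entity word/segment, a multinomial over domains (the prior psi^E).
psi_E = defaultdict(dict)
for (entity, domain), count in entity_domain_counts.items():
    psi_E[entity][domain] = count / totals[entity]

print(psi_E["hilton"])   # e.g. {'hotel': 0.96, 'restaurant': 0.04}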
59 Normalized word distributions over domains were used as weights for the domain and dialog act base measures. [sent-258, score-1.112]
60 Concretely, we calculate the frequency of vocabulary items given domain-act label pairs from the labeled training utterances and convert these into probability measures over domain-acts. [sent-261, score-0.372]
61 We update the base measures of each sampled domain Du = d given each vocabulary item wj as: ψDdj = ψEd|j if ψEd|j > 0, otherwise ψGd|j (1). In (1) we assume that entities (E) are more indicative of the domain compared to other n-grams (G) and should be more dominant in the sampling decision for domain topics. [sent-270, score-0.824]
62 Given an utterance u, we calculate its base measure ψuDd = (ΣjNu ψDdj)/Nu. Once the domain is sampled, we update the prior weight of dialog acts Aud = a: ψAaj = ψCad|j ∗ ψGd|j (2), [sent-271, score-1.061]
63 and the slot components ψSsj = ψEds|j. Then we update their base measures (3) for a given u as: ψuAa = (ΣjNu ψAaj)/Nu and ψuSs = (ΣjNu ψSsj)/Nu. [sent-272, score-0.477]
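A toy sketch of the base measure updates as reconstructed in Eqs. (1)-(3); the helper names and the example numbers are assumptions used only for illustration:

import numpy as np

def word_domain_measure(psi_E_dj, psi_G_dj):
    """Eq. (1): use the entity-list prior when the word is a listed entity for
    domain d (psi^E > 0); otherwise fall back to the web n-gram context prior."""
    return psi_E_dj if psi_E_dj > 0 else psi_G_dj

def utterance_base_measure(per_word_measures):
    """Eq. (2)/(3)-style aggregation: average of the per-word measures over the
    Nu words of the utterance."""
    return float(np.mean(per_word_measures))

# Hypothetical per-word priors for one domain in a 4-word utterance.
psi_E = [0.0, 0.9, 0.0, 0.8]   # entity prior (0 where the word is not a listed entity)
psi_G = [0.2, 0.1, 0.3, 0.4]   # web n-gram context prior
per_word = [word_domain_measure(e, g) for e, g in zip(psi_E, psi_G)]
print(per_word, utterance_base_measure(per_word))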
64 1 Inference and Learning. The goal of inference is to predict the domain, user’s act and slot distributions over each segment given an utterance. [sent-274, score-0.769]
65 The MCM has the following set of parameters: domain-topic distributions θDd for each u, the act-topic distributions θdAa for each domain topic d of u, local slot-topic distributions for each domain θS, and φsS for slot-word distributions. [sent-275, score-0.629]
66 For each utterance u, we sample a domain Du and act Aud and hyper-parameters αD and αA and their base measures ψuDd, ψuAa (from Eq. [sent-283, score-0.82]
67 (4) Here Nud is the number of occurrences of domain topic d in utterance u, and Na|ud is the number of occurrences of act a given d in u. [sent-286, score-0.766]
68 During sampling of a slot state Sujd, we assume that the utterance is generated by the HMM model associated with the assigned domain. [sent-287, score-0.531]
69 For each segment wuj in u, we sample a slot state Sujd given the remaining slots and hyperparameters αS, β and base measure ψuSs (Eq. [sent-288, score-0.851]
70 … + αSψSus (5). Here Nukjd is the number of times segment wuj is generated from slot state s in all utterances assigned to domain topic d; the transition count is the number of transitions from slot state s1 to s2, where s1 ∈ {Su(j−1)d, Su(j+1)d}, and I(s1, s2)=1 if slot s1=s2. [sent-293, score-1.796]
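The extracted equations (4) and (5) are incomplete above, so the sketch below only illustrates, under assumed functional forms, how counts, prior-weighted base measures, and an HMM transition factor could combine into unnormalized Gibbs sampling scores:

import numpy as np

def domain_act_score(N_ud, N_a_given_ud, alpha_D, psi_D_ud, alpha_A, psi_A_ua):
    """Assumed Eq. (4)-style score for (domain d, act a) of utterance u:
    counts smoothed by the prior-weighted base measures (normalizers omitted)."""
    return (N_ud + alpha_D * psi_D_ud) * (N_a_given_ud + alpha_A * psi_A_ua)

def slot_score(n_word_slot, n_words_slot, V, beta,
               n_slot_domain, alpha_S, psi_S_us, transition_factor):
    """Assumed Eq. (5)-style score for slot state s of segment w_uj: emission
    count with beta smoothing, slot count with prior alpha_S * psi^S_us, and a
    transition factor for the neighbouring slot states."""
    emission = (n_word_slot + beta) / (n_words_slot + V * beta)
    return emission * (n_slot_domain + alpha_S * psi_S_us) * transition_factor

def sample_index(scores, rng):
    p = np.asarray(scores, dtype=float)
    return int(rng.choice(len(p), p=p / p.sum()))

rng = np.random.default_rng(0)
scores = [domain_act_score(3, 1, 1.0, 0.7, 1.0, 0.5),
          domain_act_score(1, 0, 1.0, 0.1, 1.0, 0.2)]
print(sample_index(scores, rng))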
71 2 Semantic Structure Extraction with MCM. During Gibbs sampling, we keep track of the frequency of draws of domain, dialog act and slot indicating n-grams wj, in the MD, MA and MS matrices, respectively. [sent-295, score-1.099]
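An assumed illustration of how domain-indicating n-grams could be read off a V × KD count matrix such as MD after sampling; the vocabulary and counts below are invented:

import numpy as np

def top_indicator_ngrams(M, vocab, topic, top_n=5):
    """Return the top_n vocabulary items with the largest accumulated draw
    counts for a given topic column of a V x K count matrix."""
    column = M[:, topic]
    order = np.argsort(-column)
    return [(vocab[i], int(column[i])) for i in order[:top_n] if column[i] > 0]

vocab = ["cheap", "hilton", "trailer", "dimsum", "vegan"]
M_D = np.array([[2, 9, 1],      # rows: vocabulary items
                [0, 14, 0],     # columns: domain topics (say movie, hotel, restaurant)
                [11, 0, 0],
                [0, 0, 8],
                [1, 0, 6]])
print(top_indicator_ngrams(M_D, vocab, topic=1))   # hotel-bearing n-grams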
72 1 Datasets, Labels and Tags. Our dataset contains utterances obtained from dialogs between human users and our personal assistant system. [sent-303, score-0.265]
73 … the utterances obtained from the (acoustic modeling engine) to train our models.4 [sent-306, score-0.326]
74 i.e., movie, restaurant, hotel, event, other, with 42 unique dialog acts and 41 slot tags. [sent-309, score-0.834]
75 Each utterance is labeled with a domain, dialog act and a sequence of slot tags corresponding to segments in utterance (see examples in Table 1). [sent-310, score-1.567]
76 We pulled a month of web query logs and extracted over 2 million search queries from the movie, hotel, event, and restaurant domains. [sent-313, score-0.28]
77 We also used generic web queries to compile a set of ’other’ domain queries. [sent-314, score-0.295]
78 Our vocabulary consists of n-grams and segments (phrases) in utterances that are extracted using web n-grams and entity lists of §3. [sent-315, score-0.413]
79 We extracted distributions of n-grams and entity lists (§3) to inject as prior weights for entity list base and web n-gram context base measures (see §4). [sent-316, score-0.456]
80 Sequence-SLU: A traditional approach to SLU extracts domain, dialog act and slots as semantic components of utterances using three sequential models. [sent-319, score-1.292]
81 Typically, domain and dialog act detection models are taken as query classification, where a given NL query is assigned domain and act labels. [sent-320, score-1.544]
82 Among supervised query classification methods… (Footnote 4) We submitted sample utterances used in our models as an additional resource. [sent-321, score-0.391]
83 Due to licensing issues, we will reveal the full train/test utterances upon acceptance of our paper. [sent-322, score-0.265]
84 Slot discovery is taken as a sequence labeling task in which segments in utterances are labeled (Li, 2010). [sent-324, score-0.383]
85 It is a state-of-the-art method that learns the sequence labels and utterance class (domain or dialog act) as a meta-sequence in a joint framework. [sent-328, score-0.634]
86 It encodes the inter-dependence between the slot sequence s and meta-sequence label (d or a) using a triangular chain (dual-layer) structure. [sent-329, score-0.398]
87 Base-MCM: Our first version injects an informative prior for domain, dialog act and slot topic distributions using information extracted only from labeled training utterances, injected as prior constraints (the corpus n-gram base measure) during topic assignments. [sent-331, score-2.026]
88 WebPrior-MCM: Our full model encodes distributions extracted from labeled training data as well as structured web logs as asymmetric Dirichlet priors. [sent-333, score-0.246]
89 For domain and dialog act detection performance we present results in accuracy, and for slot detection we use the pairwise F1 measure. [sent-344, score-1.278]
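For concreteness, a small sketch of the two metrics; the accuracy part is standard, while the pairwise F1 below is one common reading of the measure over latent slot clusters and is an assumption rather than the paper's exact definition:

from itertools import combinations

def accuracy(gold, pred):
    """Utterance-level accuracy, as used for domain and dialog act detection."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def pairwise_f1(gold, pred):
    """Pairwise F1 over cluster assignments: a token pair is a true positive if
    both the gold tags and the predicted clusters put the two tokens together."""
    pairs = list(combinations(range(len(gold)), 2))
    tp = sum(1 for i, j in pairs if gold[i] == gold[j] and pred[i] == pred[j])
    pred_pos = sum(1 for i, j in pairs if pred[i] == pred[j])
    gold_pos = sum(1 for i, j in pairs if gold[i] == gold[j])
    prec = tp / pred_pos if pred_pos else 0.0
    rec = tp / gold_pos if gold_pos else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

print(accuracy(["hotel", "movie"], ["hotel", "event"]))
print(pairwise_f1(["name", "loc", "name"], [2, 0, 2]))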
90 When the amount of labeled data is small (≤ 25% of nL), our WebPrior-MCM has better performance on domain and act predictions compared to the two baselines. [sent-357, score-0.576]
91 As the percentage of labeled utterances in the training data increases, Tri-CRF performance increases; however, WebPrior-MCM is still comparable with Sequence-SLU. [sent-363, score-0.341]
92 This is because we utilize domain priors obtained from web sources as supervision during the generative process, as well as unlabeled utterances that help handle language variability. [sent-364, score-0.656]
93 Although WebPrior-MCM’s domain and dialog act performance is comparable to (if not better than) the other baselines, it falls short on the semantic tagging model. [sent-366, score-0.985]
94 Here, we evaluate the performance gain on domain, act and slot predictions as more unlabeled data is introduced at learning time. [sent-374, score-0.775]
95 We use only 10% of the utterances as labeled data in this experiment and incrementally add unlabeled data (90% of labeled data are treated as unlabeled). [sent-375, score-0.515]
96 (n%) unlabeled data indicates that the WebPrior-MCM is trained using n% of unlabeled utterances along with the training utterances. [sent-379, score-0.461]
97 Adding unlabeled data has a positive impact on the performance of all three semantic components (Table 3: performance evaluation results of WebPrior-MCM using different sizes of unlabeled utterances at learning time). [sent-380, score-0.461]
98 We proposed a semi-supervised generative joint learning approach tailored for injecting prior knowledge to enhance the semantic component extraction from utterances as a unifying framework. [sent-390, score-0.48]
99 Domain adaptation with unlabeled data for dialog act tagging. [sent-493, score-0.841]
100 Semi-supervised learning of semantic classes for query understanding from the web and for the web. [sent-588, score-0.245]
wordName wordTfidf (topN-words)
[('dialog', 0.422), ('slot', 0.356), ('act', 0.321), ('utterances', 0.265), ('domain', 0.179), ('utterance', 0.175), ('wuj', 0.161), ('slu', 0.154), ('slots', 0.154), ('mcm', 0.145), ('sujd', 0.145), ('webp', 0.113), ('du', 0.099), ('unlabeled', 0.098), ('aud', 0.097), ('topic', 0.091), ('dir', 0.084), ('base', 0.083), ('inject', 0.079), ('hotel', 0.079), ('nl', 0.078), ('labeled', 0.076), ('prior', 0.074), ('hotels', 0.07), ('ud', 0.07), ('components', 0.067), ('understanding', 0.066), ('wj', 0.065), ('jeong', 0.064), ('dirichlet', 0.063), ('semantic', 0.063), ('query', 0.061), ('queries', 0.061), ('distributions', 0.06), ('priors', 0.059), ('topics', 0.057), ('acts', 0.056), ('movies', 0.056), ('jg', 0.056), ('logs', 0.055), ('web', 0.055), ('spoken', 0.053), ('bearing', 0.051), ('entity', 0.051), ('layer', 0.049), ('dd', 0.049), ('acd', 0.048), ('daa', 0.048), ('eraacwh', 0.048), ('pjnu', 0.048), ('uaa', 0.048), ('udd', 0.048), ('vpa', 0.048), ('multinomial', 0.048), ('kd', 0.048), ('reisinger', 0.048), ('restaurant', 0.048), ('hmm', 0.047), ('movie', 0.047), ('domains', 0.047), ('mimno', 0.045), ('triangular', 0.042), ('tur', 0.042), ('uss', 0.042), ('segments', 0.042), ('component', 0.041), ('ks', 0.038), ('update', 0.038), ('latent', 0.038), ('ka', 0.037), ('joint', 0.037), ('sampled', 0.036), ('hugo', 0.036), ('restaurants', 0.036), ('baselines', 0.034), ('su', 0.034), ('wallach', 0.034), ('indicative', 0.034), ('measure', 0.034), ('supervised', 0.034), ('discover', 0.033), ('aaj', 0.032), ('asuncion', 0.032), ('begeja', 0.032), ('berry', 0.032), ('ddj', 0.032), ('dinarelli', 0.032), ('dud', 0.032), ('hilton', 0.032), ('intentions', 0.032), ('margolis', 0.032), ('nonuniform', 0.032), ('qjn', 0.032), ('vegan', 0.032), ('segment', 0.032), ('bayesian', 0.031), ('sample', 0.031), ('user', 0.031), ('variability', 0.031), ('measures', 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 14 acl-2012-A Joint Model for Discovery of Aspects in Utterances
Author: Asli Celikyilmaz ; Dilek Hakkani-Tur
Abstract: We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects regarding utterance’s target domain (e.g. movies), intention (e.g., finding a movie) along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. Using utterances from five domains, our approach shows up to 4.5% improvement on domain and dialog act performance over cascaded approach in which each semantic component is learned sequentially and a supervised joint learning model (which requires fully labeled data).
2 0.11705193 142 acl-2012-Mining Entity Types from Query Logs via User Intent Modeling
Author: Patrick Pantel ; Thomas Lin ; Michael Gamon
Abstract: We predict entity type distributions in Web search queries via probabilistic inference in graphical models that capture how entitybearing queries are generated. We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base. Our models are efficiently trained using maximum likelihood estimation over millions of real-world Web search queries. We show that modeling user intent significantly improves entity type resolution for head queries over the state ofthe art, on several metrics, without degradation in tail query performance.
3 0.10791162 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
Author: Chia-ying Lee ; James Glass
Abstract: We investigate the problem of acoustic modeling in which prior language-specific knowledge and transcribed data are unavailable. We present an unsupervised model that simultaneously segments the speech, discovers a proper set of sub-word units (e.g., phones) and learns a Hidden Markov Model (HMM) for each induced acoustic unit. Our approach is formulated as a Dirichlet process mixture model in which each mixture is an HMM that represents a sub-word unit. We apply our model to the TIMIT corpus, and the results demonstrate that our model discovers sub-word units that are highly correlated with English phones and also produces better segmentation than the state-of-the-art unsupervised baseline. We test the quality of the learned acoustic models on a spoken term detection task. Compared to the baselines, our model improves the relative precision of top hits by at least 22.1% and outper- forms a language-mismatched acoustic model.
4 0.10278625 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation
Author: Xinyan Xiao ; Deyi Xiong ; Min Zhang ; Qun Liu ; Shouxun Lin
Abstract: Previous work using topic model for statistical machine translation (SMT) explore topic information at the word level. However, SMT has been advanced from word-based paradigm to phrase/rule-based paradigm. We therefore propose a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation. We associate each synchronous rule with a topic distribution, and select desirable rules according to the similarity of their topic distributions with given documents. We show that our model significantly improves the translation performance over the baseline on NIST Chinese-to-English translation experiments. Our model also achieves a better performance and a faster speed than previous approaches that work at the word level.
5 0.10254306 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu
Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the performance of supervised methods highly relies on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic- lexicon co-extraction in a domain of interest where we do not require any labeled data, but have lots of labeled data in another related domain. The framework is twofold. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.
7 0.093977764 113 acl-2012-INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis
8 0.090701796 177 acl-2012-Sentence Dependency Tagging in Online Question Answering Forums
9 0.089602403 199 acl-2012-Topic Models for Dynamic Translation Model Adaptation
10 0.082672119 79 acl-2012-Efficient Tree-Based Topic Modeling
11 0.081513837 191 acl-2012-Temporally Anchored Relation Extraction
12 0.081067666 88 acl-2012-Exploiting Social Information in Grounded Language Learning via Grammatical Reduction
13 0.077043682 212 acl-2012-Using Search-Logs to Improve Query Tagging
14 0.072532825 59 acl-2012-Corpus-based Interpretation of Instructions in Virtual Environments
15 0.071056031 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
16 0.069986328 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
17 0.065962218 144 acl-2012-Modeling Review Comments
18 0.065509349 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
19 0.063468456 73 acl-2012-Discriminative Learning for Joint Template Filling
20 0.062656619 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
topicId topicWeight
[(0, -0.191), (1, 0.109), (2, 0.062), (3, 0.061), (4, -0.115), (5, 0.107), (6, -0.013), (7, -0.018), (8, 0.001), (9, -0.002), (10, 0.028), (11, 0.028), (12, -0.092), (13, 0.019), (14, -0.046), (15, 0.018), (16, -0.045), (17, 0.015), (18, 0.083), (19, 0.003), (20, 0.013), (21, 0.102), (22, -0.019), (23, -0.067), (24, -0.02), (25, -0.136), (26, 0.124), (27, 0.047), (28, -0.056), (29, 0.056), (30, -0.041), (31, 0.115), (32, 0.025), (33, -0.043), (34, -0.095), (35, -0.004), (36, -0.025), (37, -0.047), (38, -0.004), (39, -0.033), (40, -0.17), (41, 0.102), (42, -0.035), (43, -0.081), (44, -0.043), (45, 0.065), (46, -0.069), (47, 0.054), (48, 0.022), (49, 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.95161778 14 acl-2012-A Joint Model for Discovery of Aspects in Utterances
Author: Asli Celikyilmaz ; Dilek Hakkani-Tur
Abstract: We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects regarding utterance’s target domain (e.g. movies), intention (e.g., finding a movie) along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. Using utterances from five domains, our approach shows up to 4.5% improvement on domain and dialog act performance over cascaded approach in which each semantic component is learned sequentially and a supervised joint learning model (which requires fully labeled data).
2 0.62009519 113 acl-2012-INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis
Author: Timo Baumann ; David Schlangen
Abstract: We present a component for incremental speech synthesis (iSS) and a set of applications that demonstrate its capabilities. This component can be used to increase the responsivity and naturalness of spoken interactive systems. While iSS can show its full strength in systems that generate output incrementally, we also discuss how even otherwise unchanged systems may profit from its capabilities.
3 0.61949939 142 acl-2012-Mining Entity Types from Query Logs via User Intent Modeling
Author: Patrick Pantel ; Thomas Lin ; Michael Gamon
Abstract: We predict entity type distributions in Web search queries via probabilistic inference in graphical models that capture how entitybearing queries are generated. We jointly model the interplay between latent user intents that govern queries and unobserved entity types, leveraging observed signals from query formulations and document clicks. We apply the models to resolve entity types in new queries and to assign prior type distributions over an existing knowledge base. Our models are efficiently trained using maximum likelihood estimation over millions of real-world Web search queries. We show that modeling user intent significantly improves entity type resolution for head queries over the state ofthe art, on several metrics, without degradation in tail query performance.
4 0.6125595 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
Author: Chia-ying Lee ; James Glass
Abstract: We investigate the problem of acoustic modeling in which prior language-specific knowledge and transcribed data are unavailable. We present an unsupervised model that simultaneously segments the speech, discovers a proper set of sub-word units (e.g., phones) and learns a Hidden Markov Model (HMM) for each induced acoustic unit. Our approach is formulated as a Dirichlet process mixture model in which each mixture is an HMM that represents a sub-word unit. We apply our model to the TIMIT corpus, and the results demonstrate that our model discovers sub-word units that are highly correlated with English phones and also produces better segmentation than the state-of-the-art unsupervised baseline. We test the quality of the learned acoustic models on a spoken term detection task. Compared to the baselines, our model improves the relative precision of top hits by at least 22.1% and outper- forms a language-mismatched acoustic model.
5 0.55478656 211 acl-2012-Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation
Author: Benjamin Borschinger ; Mark Johnson
Abstract: We present a novel extension to a recently proposed incremental learning algorithm for the word segmentation problem originally introduced in Goldwater (2006). By adding rejuvenation to a particle filter, we are able to considerably improve its performance, both in terms of finding higher probability and higher accuracy solutions.
6 0.51355147 129 acl-2012-Learning High-Level Planning from Text
8 0.46711612 79 acl-2012-Efficient Tree-Based Topic Modeling
9 0.45359206 182 acl-2012-Spice it up? Mining Refinements to Online Instructions from User Generated Content
10 0.43318322 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
11 0.42814773 212 acl-2012-Using Search-Logs to Improve Query Tagging
12 0.42812401 44 acl-2012-CSNIPER - Annotation-by-query for Non-canonical Constructions in Large Corpora
13 0.42578644 59 acl-2012-Corpus-based Interpretation of Instructions in Virtual Environments
14 0.42413557 31 acl-2012-Authorship Attribution with Author-aware Topic Models
15 0.41436961 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
16 0.41160446 110 acl-2012-Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model
17 0.40513542 88 acl-2012-Exploiting Social Information in Grounded Language Learning via Grammatical Reduction
18 0.39930454 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
19 0.39375475 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
20 0.38953477 35 acl-2012-Automatically Mining Question Reformulation Patterns from Search Log Data
topicId topicWeight
[(25, 0.015), (26, 0.032), (28, 0.033), (30, 0.024), (37, 0.025), (39, 0.057), (41, 0.337), (57, 0.012), (59, 0.013), (74, 0.025), (82, 0.018), (84, 0.017), (85, 0.017), (90, 0.132), (92, 0.072), (94, 0.03), (99, 0.072)]
simIndex simValue paperId paperTitle
same-paper 1 0.75055516 14 acl-2012-A Joint Model for Discovery of Aspects in Utterances
Author: Asli Celikyilmaz ; Dilek Hakkani-Tur
Abstract: We describe a joint model for understanding user actions in natural language utterances. Our multi-layer generative approach uses both labeled and unlabeled utterances to jointly learn aspects regarding utterance’s target domain (e.g. movies), intention (e.g., finding a movie) along with other semantic units (e.g., movie name). We inject information extracted from unstructured web search query logs as prior information to enhance the generative process of the natural language utterance understanding model. Using utterances from five domains, our approach shows up to 4.5% improvement on domain and dialog act performance over cascaded approach in which each semantic component is learned sequentially and a supervised joint learning model (which requires fully labeled data).
2 0.47283992 191 acl-2012-Temporally Anchored Relation Extraction
Author: Guillermo Garrido ; Anselmo Penas ; Bernardo Cabaleiro ; Alvaro Rodrigo
Abstract: Although much work on relation extraction has aimed at obtaining static facts, many of the target relations are actually fluents, as their validity is naturally anchored to a certain time period. This paper proposes a methodological approach to temporally anchored relation extraction. Our proposal performs distant supervised learning to extract a set of relations from a natural language corpus, and anchors each of them to an interval of temporal validity, aggregating evidence from documents supporting the relation. We use a rich graphbased document-level representation to generate novel features for this task. Results show that our implementation for temporal anchoring is able to achieve a 69% of the upper bound performance imposed by the relation extraction step. Compared to the state of the art, the overall system achieves the highest precision reported.
3 0.47142076 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu
Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the performance of supervised methods highly relies on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic- lexicon co-extraction in a domain of interest where we do not require any labeled data, but have lots of labeled data in another related domain. The framework is twofold. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.
4 0.47114971 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
Author: Ce Zhang ; Feng Niu ; Christopher Re ; Jude Shavlik
Abstract: Classically, training relation extractors relies on high-quality, manually annotated training data, which can be expensive to obtain. To mitigate this cost, NLU researchers have considered two newly available sources of less expensive (but potentially lower quality) labeled data from distant supervision and crowd sourcing. There is, however, no study comparing the relative impact of these two sources on the precision and recall of post-learning answers. To fill this gap, we empirically study how state-of-the-art techniques are affected by scaling these two sources. We use corpus sizes of up to 100 million documents and tens of thousands of crowd-source labeled examples. Our experiments show that increasing the corpus size for distant supervision has a statistically significant, positive impact on quality (F1 score). In contrast, human feedback has a positive and statistically significant, but lower, impact on precision and recall.
5 0.47100055 167 acl-2012-QuickView: NLP-based Tweet Search
Author: Xiaohua Liu ; Furu Wei ; Ming Zhou ; QuickView Team Microsoft
Abstract: Tweets have become a comprehensive repository for real-time information. However, it is often hard for users to quickly get information they are interested in from tweets, owing to the sheer volume of tweets as well as their noisy and informal nature. We present QuickView, an NLP-based tweet search platform to tackle this issue. Specifically, it exploits a series of natural language processing technologies, such as tweet normalization, named entity recognition, semantic role labeling, sentiment analysis, tweet classification, to extract useful information, i.e., named entities, events, opinions, etc., from a large volume of tweets. Then, non-noisy tweets, together with the mined information, are indexed, on top of which two brand new scenarios are enabled, i.e., categorized browsing and advanced search, allowing users to effectively access either the tweets or fine-grained information they are interested in.
6 0.47042957 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
7 0.47009373 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
8 0.47004151 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
9 0.46952927 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars
10 0.46905354 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
11 0.46729293 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
13 0.46393895 182 acl-2012-Spice it up? Mining Refinements to Online Instructions from User Generated Content
14 0.46346447 142 acl-2012-Mining Entity Types from Query Logs via User Intent Modeling
15 0.46293977 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
16 0.46276581 38 acl-2012-Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing
17 0.46175498 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification
18 0.46098387 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
19 0.46049488 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
20 0.46000928 73 acl-2012-Discriminative Learning for Joint Template Filling