acl acl2013 acl2013-60 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
Reference: text
sentIndex sentText sentNum sentScore
1 We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. [sent-2, score-0.538]
2 Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1. [sent-3, score-0.669]
3 Common approaches such as query expansion, structured retrieval, and translation models show patterns of complicated engineering on the IR side, or isolate the upstream passage retrieval from downstream answer extraction. [sent-6, score-1.012]
4 We propose a coupled retrieval method with prior knowledge of its downstream QA component, which feeds QA with exactly the information needed. [sent-10, score-0.567]
5 While this relates Alaska and purchased, it is not a useful passage for the given question. [sent-16, score-0.216]
6 2 It is apparent that the question asks for a date. [sent-17, score-0.196]
7 , 2006): text is first annotated in a predictive manner (of what types of questions it might answer) with 20 answer types and then indexed. [sent-20, score-0.584]
8 A question analysis component (consisting of 400 question templates) maps the desired answer type to one of the 20 existing answer types. [sent-21, score-0.752]
9 Retrieval is then performed with both the question and predicted answer types in the query. [sent-22, score-0.59]
10 However, predictive annotation has the limitation of being labor intensive and assuming the underlying NLP pipeline to be accurate. [sent-23, score-0.146]
11 We avoid these limitations by directly asking the downstream QA system for the information about which entities answer which questions, via two steps: 1. [sent-24, score-0.522]
12 forming a query based on the most relevant answer features given a question from the learned QA model. [sent-26, score-0.724]
13 , entity recognition errors, because answer typing knowledge is learned from how the data was actually labeled, not from how the data was assumed to be labeled (e. [sent-30, score-0.429]
14 , manual templates usually assume perfect labeling of named entities, but often it is not the case. (Footnote 2: Based on a non-optimized IR configuration, none of the top 1000 returned passages contained the correct answer: 1867.) [sent-32, score-0.138]
15 , 2013) that recognizes the association between question type and expected answer types through various features. [sent-37, score-0.619]
16 , 2001) and tags each token as either an answer (ANS) or not (O). [sent-39, score-0.452]
17 This will be our off-the-shelf QA system, which recognizes the association between question type and expected answer types through various features based on e. [sent-40, score-0.651]
18 With weights optimized by CRF training (Table 1), we can learn how answer features are correlated with question features. [sent-43, score-0.646]
19 These features, whose weights are optimized by the CRF training, directly reflect what the most important answer types associated with each question type are. [sent-44, score-0.648]
20 For instance, line 2 in Table 1 says that if there is a when question, and the current token’s NER label is DATE, then it is likely that this token is tagged as ANS. [sent-45, score-0.073]
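A minimal Python sketch of this coupling, assuming a toy table of learned (question word, answer-type feature) weights loosely modeled on Table 1; the feature names and weight values here are illustrative, not the trained model's actual parameters:

# Toy table of learned CRF weights pairing a question word with an answer-type
# feature of the current token; names and values are illustrative only.
learned_weights = {
    ("when",  "POS[0]=CD"):     0.86,
    ("when",  "NER[0]=DATE"):   0.79,
    ("who",   "NER[0]=PERSON"): 0.91,
    ("where", "NER[0]=GPE"):    0.74,
    ("where", "POS[0]=NNP"):    0.33,
}

def top_answer_features(qword, k=5):
    """Return the k highest-weighted answer-type features for a question word."""
    scored = [(feat, w) for (q, feat), w in learned_weights.items() if q == qword]
    return sorted(scored, key=lambda item: -item[1])[:k]

print(top_answer_features("when"))   # [('POS[0]=CD', 0.86), ('NER[0]=DATE', 0.79)]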
21 The analyzing power of discriminative answer features for IR comes for free from a trained QA system. [sent-48, score-0.392]
22 Unlike predictive annotation, statistical evidence determines the best answer features given the question, with no manual patterns or templates needed. [sent-49, score-0.515]
23 To compare predictive annotation with our approach again: predictive annotation works in a forward mode, where downstream QA is tailored for upstream IR, i. [sent-50, score-0.445]
24 Our method works in reverse (backward): downstream QA dictates upstream IR, i. [sent-53, score-0.211]
25 Moreover, our approach extends easily beyond fixed answer types such as named entities: we are already using POS tags as a demonstration. [sent-56, score-0.48]
26 We can potentially use any helpful answer features in retrieval. [sent-57, score-0.392]
27 For instance, if the QA system learns that the phrase “in order to” is highly correlated with why questions through lexicalized features, or that certain dependency relations are helpful in answering questions with specific structures, then it is natural and easy for the IR component to incorporate them. [sent-58, score-0.35]
28 Table 1: Learned weights for sampled features with respect to the label of the current token (indexed by [0]) in a CRF. [sent-62, score-0.105]
29 For instance, line 1 says that when answering a when question, if the POS of the current token is CD (cardinal number), it is likely (large weight) that the token is tagged as ANS. [sent-64, score-0.206]
30 Our method is a QA-driven approach that provides supervision for IR from a learned QA model, while learning to rank is essentially an IR-driven approach: the supervision for IR comes from a labeled ranking list of retrieval results. [sent-68, score-0.323]
31 Overall, we make the following contributions: • Our proposed method tightly integrates QA with IR, and the reuse of analysis from QA does not put extra overhead on the IR queries. [sent-69, score-0.091]
32 This provides great flexibility in using answer features in IR queries. [sent-72, score-0.392]
33 We give a full-spectrum evaluation of all three stages of IR+QA: document retrieval, passage retrieval, and answer extraction, to thoroughly examine the effectiveness of the method. [sent-73, score-0.732]
34 , 2005; Kaisser, 2012), graph matching with Semantic Role Labeling (Shen and Lapata, 2007) and answer type checking (Pinchak et al. [sent-76, score-0.36]
35 3 Method Table 1 already shows some examples of features associating question types with answer types. [sent-91, score-0.622]
36 We store the features and their learned weights from the trained model for IR usage. [sent-92, score-0.099]
37 We let the trained QA system guide the query formulation when performing coupled retrieval with Indri (Strohman et al. [sent-93, score-0.502]
38 The question analysis component from QA is reused here. [sent-97, score-0.196]
39 In this implementation, the only information we have chosen to use from the question is the question word (e. [sent-98, score-0.392]
40 , how, who) and the lexical answer types (LAT) in case of what/which questions. [sent-100, score-0.394]
41 Given the question word, we select the 5 highest weighted features (e. [sent-103, score-0.252]
42 The original question is combined with the top features as the query. [sent-108, score-0.263]
43 IR reuses both the question analysis code and the top-weighted features from QA. [sent-113, score-0.287]
44 3 that keyword- and named-entity-based retrieval actually outperformed SRL-based structured retrieval in MAP for the answer-bearing sentence retrieval task in their setting. [sent-116, score-0.81]
45 Systematic errors made by the processing tools are tolerated, in the sense that if the same preprocessing error is made on both the question and sentence, an answer may still be found. [sent-122, score-0.556]
46 But since the importance of this feature is recognized by downstream QA, the upstream IR is still motivated to retrieve it. [sent-125, score-0.211]
47 Queries were lightly optimized using the following strategies: Query Weighting: In practice query words are weighted: #weight(1. [sent-126, score-0.101]
48 0 purchased α #max(#any:CD #any:DATE)) with a weight α for the answer types tuned via cross-validation. [sent-130, score-0.486]
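The #weight, #max, and #any:TYPE constructs above are operators of the Indri query language; the helper below is a hypothetical sketch of how such a coupled query string could be assembled from question keywords and expected answer types, with the answer-type weight alpha tuned separately (it is not the paper's implementation):

def coupled_indri_query(keywords, answer_types, alpha=0.1):
    # Question keywords get weight 1.0; expected answer types are wrapped in
    # #any:TYPE operators inside a #max, down-weighted by alpha.
    any_ops = " ".join("#any:{}".format(t) for t in answer_types)
    terms = " ".join("1.0 {}".format(w) for w in keywords)
    return "#weight({} {} #max({}))".format(terms, alpha, any_ops)

print(coupled_indri_query(["alaska", "purchased"], ["CD", "DATE"], alpha=0.1))
# #weight(1.0 alaska 1.0 purchased 0.1 #max(#any:CD #any:DATE))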
49 Since NER and POS tags are not lexicalized they accumulate many more counts (i. [sent-131, score-0.078]
50 0, giving the expected answer types “enough say” but not “too much say”. NER Types First: We found NER labels to be better indicators of expected answer types than POS tags. [sent-134, score-0.788]
51 In general, POS tags are coarser-grained indicators of answer types than NER labels. [sent-136, score-0.442]
52 , NNP can answer who and where questions, but is not as precise as PERSON and GPE. [sent-139, score-0.36]
53 POS tags accumulate even more counts than NER labels, thus they need separate downweighting. [sent-141, score-0.078]
54 If the top-weighted features are based on NER, then we do not include POS tags for that question. [sent-143, score-0.08]
55 Otherwise POS tags are useful, for instance, in answering how questions. [sent-144, score-0.137]
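One way to read the two rules above as code; this is a sketch only, with hypothetical feature-name conventions (features prefixed NER or POS), not the paper's actual query builder:

def select_query_answer_features(ranked_features):
    """ranked_features: list of (feature, weight) sorted by weight, descending."""
    ner = [f for f, _ in ranked_features if f.startswith("NER")]
    pos = [f for f, _ in ranked_features if f.startswith("POS")]
    if ner and ranked_features[0][0].startswith("NER"):
        return ner           # NER labels are precise enough; drop coarser POS tags
    return ner + pos         # otherwise keep POS tags, e.g. CD for "how" questions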
56 Unigram QA Model The QA system uses up to trigram features (Table 1 shows examples of unigram and bigram features). [sent-145, score-0.159]
57 Thus it is able to learn, for instance, that a POS sequence of IN CD NNS is likely an answer to a when question (such as: in 5 years). [sent-146, score-0.556]
58 Simple question analysis (reuse from QA) qword=when 2. [sent-149, score-0.196]
59 Figure 1: Coupled retrieval with queries directly constructed from highest weighted features of downstream QA. [sent-181, score-0.5]
60 The retrieved and ranked list of sentences is POS and NER tagged, but only query-relevant tags are shown due to space limit. [sent-182, score-0.094]
61 A bag-of-words retrieval approach would have ranked the sentence shown above at 50 rather than at the top position. [sent-183, score-0.32]
62 We drop this strict constraint (which may need further smoothing) and only use unigram features, not by simply extracting “good” unigram features from the trained model, but by re-training the model with only unigram features. [sent-185, score-0.293]
63 In answer extraction, we still use up to trigram features. [sent-186, score-0.4]
64 6 4 Experiments We want to measure and compare the performance of the following retrieval techniques: 1. [sent-187, score-0.24]
65 uncoupled retrieval with an off-the-shelf IR engine by using the question as query (baseline), 2. [sent-188, score-0.704]
66 answer-bearing retrieval by using both the question and known answer as query, only evaluated for answer extraction (upper bound), at the three stages of question answering: 1. [sent-190, score-1.382]
67 Document retrieval (for relevant docs from corpus), measured by Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR). [sent-191, score-0.266]
68 Passage retrieval (finding relevant sentences from the document), also by MAP and MRR. [sent-193, score-0.266]
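For reference, the two metrics can be computed from ranked result lists and binary relevance judgments as in the sketch below (standard definitions; the per-question relevance sets come from the MIT109/MIT99 judgments or the TREC answer patterns):

def average_precision(ranked, relevant):
    hits, precisions = 0, []
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per question."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

def reciprocal_rank(ranked, relevant):
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0

def mean_reciprocal_rank(runs):
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)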
69 6This is because the weights of unigram to trigram features in a log-linear CRF model are balanced jointly during maximization. [sent-196, score-0.188]
70 A unigram feature might end up with lower weight because another trigram containing this unigram gets a higher weight. [sent-197, score-0.238]
71 Then we would have missed this feature if we only used top unigram features. [sent-198, score-0.122]
72 Thus we re-train the model with only unigram features to make sure weights are “assigned properly” among only unigram features. [sent-199, score-0.235]
73 All coupled and uncoupled queries are performed with Indri v5. [sent-205, score-0.417]
74 1 Data Test Set for IR and QA The MIT109 test collection by Lin and Katz (2006) contains 109 questions from TREC 2002 and provides a near-exhaustive judgment of relevant documents for each question. [sent-209, score-0.091]
75 We removed 10 questions that do not have an answer by matching the TREC answer patterns. [sent-210, score-0.785]
76 For the top 10 retrieved sentences for each question, three Turkers judged whether each sentence contained the answer. [sent-213, score-0.081]
77 Note that only 88 questions out of MIT99 have an answer from the top 10 query results. [sent-217, score-0.532]
78 But only sentence boundaries, POS tags and NER labels were kept as the annotation of the corpus. [sent-220, score-0.074]
79 2 Document and Passage Retrieval We issued uncoupled queries consisting of question words, and QA-driven coupled queries consisting of both the question and expected answer types, then retrieved the top 1000 documents. (Table 2 header residue: MAP and MRR columns for coupled vs. uncoupled retrieval; cell values not recoverable.) [sent-222, score-1.347]
80 uncoupled document/sentence retrieval in MAP and MRR on MIT99. [sent-231, score-0.4]
81 To find the best weighting α for coupled retrieval, we used 5-fold cross-validation and finalized at α = 0. [sent-237, score-0.19]
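A sketch of that tuning loop, assuming a helper retrieval_map(questions, alpha) that runs coupled retrieval with the given alpha and returns MAP on a held-out fold (the helper and fold structure are assumptions, not the paper's code):

def tune_alpha(folds, candidate_alphas, retrieval_map):
    """folds: list of held-out question sets from 5-fold cross-validation."""
    best_alpha, best_score = None, float("-inf")
    for alpha in candidate_alphas:
        # Average held-out MAP across folds for this candidate weight.
        score = sum(retrieval_map(f, alpha) for f in folds) / len(folds)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha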
82 Coupled retrieval outperforms (20% by MAP with p < 0. [sent-240, score-0.24]
83 01) uncoupled retrieval significantly according to paired randomization test (Smucker et al. [sent-242, score-0.4]
84 To generate a test set for sentence retrieval, we matched each sentence from relevant documents provided by MIT99 for each question against the TREC answer patterns. [sent-246, score-0.582]
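TREC answer patterns are per-question regular expressions, so the matching step can be sketched as below; the pattern shown is an illustrative placeholder, not an entry from the official TREC pattern file:

import re

# A sentence counts as answer-bearing if any of the question's answer patterns
# matches it; the pattern below is a made-up example for illustration.
answer_patterns = {
    "q-alaska": [r"\b1867\b"],
}

def is_answer_bearing(qid, sentence):
    return any(re.search(p, sentence) for p in answer_patterns.get(qid, []))

print(is_answer_bearing("q-alaska", "Alaska was purchased from Russia in 1867."))  # True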
85 We found no significant difference between retrieving sentences from the documents returned by document retrieval or directly from the corpus. [sent-247, score-0.27]
86 Still, coupled retrieval is significantly better by about 10% in MAP and 17% in MRR. [sent-249, score-0.43]
87 3 Answer Extraction Lastly we sent the sentences to the downstream QA engine (trained on TRAIN) and computed F1 per K for the top K retrieved sentences, shown in Figure 2. [sent-251, score-0.254]
88 The best F1 with coupled sentence retrieval is 0. [sent-252, score-0.43]
89 Thus we also computed F1’s assuming perfect voting: a voting oracle that always selects the correct answer as long as the QA system produces one, hence the two ascending lines in the center of Figure 2. [sent-256, score-0.481]
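A sketch of that oracle evaluation: a question counts as correct if the QA system produced at least one correct answer among its candidates for the top-K sentences, and F1 is computed over questions. Variable names are hypothetical and the paper's exact bookkeeping may differ:

def oracle_f1(candidates_per_q, is_correct, total_questions):
    """candidates_per_q: {qid: [answer strings]}; is_correct(qid, ans) -> bool."""
    answered = {q for q, cands in candidates_per_q.items() if cands}
    correct = {q for q in answered
               if any(is_correct(q, a) for a in candidates_per_q[q])}
    precision = len(correct) / len(answered) if answered else 0.0
    recall = len(correct) / total_questions if total_questions else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)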
90 Still, F1 with coupled retrieval is always better: reiterating the fact that coupled retrieval covers more answer-bearing sentences. [sent-257, score-0.86]
91 The gap between the top two and other lines signals more room for improvement for IR in terms of better coverage and better rank for answer-bearing sentences. [sent-264, score-0.107]
92 Figure 2: F1 values for answer extraction on MIT99. [sent-265, score-0.39]
93 “Oracle” methods assumed perfect voting of answer candidates (a question is answered correctly if the system ever produced one correct answer for it). [sent-267, score-0.981]
94 5 Conclusion We described a method to perform coupled information retrieval with prior knowledge of the downstream QA system. [sent-269, score-0.567]
95 Specifically, we coupled IR queries with automatically learned answer features from QA and observed significant improvements in document/passage retrieval and boosted F1 in answer extraction. [sent-270, score-1.287]
96 This method has the merits of not requiring hand-built question and answer templates and being flexible in incorporating various answer features automatically learned and optimized from the downstream QA system. [sent-271, score-1.184]
97 Improving text retrieval precision and answer accuracy in question answering systems. [sent-291, score-0.885]
98 Rank learning for factoid question answering with linguistic and semantic constraints. [sent-310, score-0.317]
99 An exploration of the principles underlying redundancy-based factoid question answering. [sent-364, score-0.228]
100 A comparison of statistical significance tests for information retrieval evaluation. [sent-416, score-0.24]
wordName wordTfidf (topN-words)
[('qa', 0.497), ('answer', 0.36), ('ir', 0.341), ('retrieval', 0.24), ('question', 0.196), ('coupled', 0.19), ('ner', 0.179), ('uncoupled', 0.16), ('downstream', 0.137), ('bilotti', 0.105), ('passage', 0.102), ('indri', 0.093), ('qword', 0.091), ('predictive', 0.091), ('answering', 0.089), ('unigram', 0.087), ('cd', 0.081), ('alaska', 0.081), ('upstream', 0.074), ('pos', 0.074), ('query', 0.072), ('mrr', 0.07), ('purchas', 0.068), ('purchased', 0.068), ('strohman', 0.068), ('testgold', 0.068), ('queries', 0.067), ('prager', 0.067), ('questions', 0.065), ('kaisser', 0.06), ('smucker', 0.06), ('coupling', 0.051), ('trec', 0.05), ('tags', 0.048), ('map', 0.047), ('ans', 0.046), ('retrieved', 0.046), ('aquaint', 0.046), ('aska', 0.046), ('greenwood', 0.046), ('rank', 0.045), ('token', 0.044), ('retrieves', 0.042), ('pinchak', 0.04), ('trigram', 0.04), ('learned', 0.038), ('named', 0.038), ('yao', 0.037), ('vulcan', 0.037), ('xuchen', 0.037), ('sakai', 0.037), ('sigir', 0.037), ('engine', 0.036), ('date', 0.036), ('ny', 0.035), ('top', 0.035), ('crf', 0.035), ('management', 0.034), ('reuse', 0.034), ('types', 0.034), ('acm', 0.033), ('callan', 0.033), ('ogilvie', 0.033), ('perfect', 0.033), ('voting', 0.032), ('features', 0.032), ('templates', 0.032), ('durme', 0.032), ('factoid', 0.032), ('turkers', 0.032), ('gpe', 0.032), ('purchase', 0.031), ('typing', 0.031), ('document', 0.03), ('extraction', 0.03), ('overhead', 0.03), ('accumulate', 0.03), ('agarwal', 0.03), ('tagged', 0.029), ('optimized', 0.029), ('weights', 0.029), ('assuming', 0.029), ('recognizes', 0.029), ('nltk', 0.028), ('loc', 0.027), ('artstein', 0.027), ('structured', 0.027), ('lines', 0.027), ('person', 0.027), ('reciprocal', 0.027), ('tightly', 0.027), ('relevant', 0.026), ('york', 0.026), ('annotation', 0.026), ('ratinov', 0.026), ('nns', 0.025), ('cui', 0.025), ('entities', 0.025), ('weight', 0.024), ('weighted', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
2 0.43916452 241 acl-2013-Minimum Bayes Risk based Answer Re-ranking for Question Answering
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks single QA system’s outputs by using a traditional MBR model, by measuring correlations between answer candidates; while the second approach reranks the combined outputs of multiple QA systems with heterogenous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
3 0.30132335 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
Author: Wen-tau Yih ; Ming-Wei Chang ; Christopher Meek ; Andrzej Pastusiak
Abstract: In this paper, we study the answer sentence selection problem for question answering. Unlike previous work, which primarily leverages syntactic analysis through dependency tree matching, we focus on improving the performance using models of lexical semantic resources. Experiments show that our systems can be consistently and significantly improved with rich lexical semantic information, regardless of the choice of learning algorithms. When evaluated on a benchmark dataset, the MAP and MRR scores are increased by 8 to 10 points, compared to one of our baseline systems using only surface-form matching. Moreover, our best system also outperforms pervious work that makes use of the dependency tree structure by a wide margin.
4 0.24493214 107 acl-2013-Deceptive Answer Prediction with User Preference Graph
Author: Fangtao Li ; Yang Gao ; Shuchang Zhou ; Xiance Si ; Decheng Dai
Abstract: In Community question answering (QA) sites, malicious users may provide deceptive answers to promote their products or services. It is important to identify and filter out these deceptive answers. In this paper, we first solve this problem with the traditional supervised learning methods. Two kinds of features, including textual and contextual features, are investigated for this task. We further propose to exploit the user relationships to identify the deceptive answers, based on the hypothesis that similar users will have similar behaviors to post deceptive or authentic answers. To measure the user similarity, we propose a new user preference graph based on the answer preference expressed by users, such as “helpful” voting and “best answer” selection. The user preference graph is incorporated into traditional supervised learning framework with the graph regularization technique. The experiment results demonstrate that the user preference graph can indeed help improve the performance of deceptive answer prediction.
5 0.21648271 292 acl-2013-Question Classification Transfer
Author: Anne-Laure Ligozat
Abstract: Question answering systems have been developed for many languages, but most resources were created for English, which can be a problem when developing a system in another language such as French. In particular, for question classification, no labeled question corpus is available for French, so this paper studies the possibility to use existing English corpora and transfer a classification by translating the question and their labels. By translating the training corpus, we obtain results close to a monolingual setting.
6 0.19141066 290 acl-2013-Question Analysis for Polish Question Answering
7 0.18008736 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
9 0.15300012 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
10 0.11605618 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
11 0.10951616 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
12 0.10491311 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
13 0.099781796 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals
14 0.099646233 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
15 0.096127331 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition
16 0.091745295 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations
17 0.085610956 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions
18 0.080474623 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation
19 0.077028535 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
20 0.076854512 338 acl-2013-Task Alternation in Parallel Sentence Retrieval for Twitter Translation
topicId topicWeight
[(0, 0.188), (1, 0.055), (2, -0.006), (3, -0.141), (4, 0.134), (5, 0.085), (6, 0.002), (7, -0.446), (8, 0.183), (9, 0.004), (10, 0.128), (11, -0.092), (12, 0.001), (13, -0.037), (14, 0.052), (15, 0.048), (16, 0.023), (17, -0.078), (18, 0.086), (19, 0.09), (20, 0.054), (21, 0.001), (22, 0.042), (23, -0.138), (24, -0.021), (25, -0.058), (26, -0.038), (27, -0.071), (28, 0.06), (29, -0.027), (30, 0.089), (31, -0.037), (32, -0.056), (33, 0.02), (34, 0.014), (35, -0.012), (36, 0.008), (37, 0.019), (38, -0.001), (39, -0.0), (40, 0.002), (41, -0.054), (42, -0.011), (43, -0.028), (44, 0.103), (45, 0.06), (46, 0.01), (47, -0.029), (48, 0.013), (49, -0.023)]
simIndex simValue paperId paperTitle
same-paper 1 0.97825658 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
2 0.9649452 241 acl-2013-Minimum Bayes Risk based Answer Re-ranking for Question Answering
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks single QA system’s outputs by using a traditional MBR model, by measuring correlations between answer candidates; while the second approach reranks the combined outputs of multiple QA systems with heterogenous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
3 0.81500769 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
Author: Xipeng Qiu ; Le Tian ; Xuanjing Huang
Abstract: Retrieving similar questions is very important in community-based question answering(CQA) . In this paper, we propose a unified question retrieval model based on latent semantic indexing with tensor analysis, which can capture word associations among different parts of CQA triples simultaneously. Thus, our method can reduce lexical chasm of question retrieval with the help of the information of question content and answer parts. The experimental result shows that our method outperforms the traditional methods.
4 0.80630332 292 acl-2013-Question Classification Transfer
Author: Anne-Laure Ligozat
Abstract: Question answering systems have been developed for many languages, but most resources were created for English, which can be a problem when developing a system in another language such as French. In particular, for question classification, no labeled question corpus is available for French, so this paper studies the possibility to use existing English corpora and transfer a classification by translating the question and their labels. By translating the training corpus, we obtain results close to a monolingual setting.
5 0.80627602 290 acl-2013-Question Analysis for Polish Question Answering
Author: Piotr Przybyla
Abstract: This study is devoted to the problem of question analysis for a Polish question answering system. The goal of the question analysis is to determine its general structure, type of an expected answer and create a search query for finding relevant documents in a textual knowledge base. The paper contains an overview of available solutions of these problems, description of their implementation and presents an evaluation based on a set of 1137 questions from a Polish quiz TV show. The results help to understand how an environment of a Slavonic language affects the performance of methods created for English.
6 0.78200305 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions
7 0.76396173 107 acl-2013-Deceptive Answer Prediction with User Preference Graph
8 0.74200171 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
9 0.73078787 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
11 0.58043951 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
12 0.55958629 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals
13 0.53549206 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
14 0.4838101 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations
15 0.47459844 239 acl-2013-Meet EDGAR, a tutoring agent at MONSERRATE
16 0.47345653 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation
17 0.39891139 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
18 0.36082602 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
19 0.35148543 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
20 0.34172544 271 acl-2013-ParaQuery: Making Sense of Paraphrase Collections
topicId topicWeight
[(0, 0.049), (6, 0.036), (11, 0.053), (19, 0.212), (24, 0.04), (26, 0.055), (35, 0.152), (42, 0.045), (48, 0.044), (64, 0.017), (70, 0.066), (88, 0.022), (90, 0.031), (95, 0.077)]
simIndex simValue paperId paperTitle
1 0.91358382 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision
Author: Michael Speriosu ; Jason Baldridge
Abstract: Toponym resolvers identify the specific locations referred to by ambiguous placenames in text. Most resolvers are based on heuristics using spatial relationships between multiple toponyms in a document, or metadata such as population. This paper shows that text-driven disambiguation for toponyms is far more effective. We exploit document-level geotags to indirectly generate training instances for text classifiers for toponym resolution, and show that textual cues can be straightforwardly integrated with other commonly used ones. Results are given for both 19th century texts pertaining to the American Civil War and 20th century newswire articles.
2 0.85259801 126 acl-2013-Diverse Keyword Extraction from Conversations
Author: Maryam Habibi ; Andrei Popescu-Belis
Abstract: A new method for keyword extraction from conversations is introduced, which preserves the diversity of topics that are mentioned. Inspired from summarization, the method maximizes the coverage of topics that are recognized automatically in transcripts of conversation fragments. The method is evaluated on excerpts of the Fisher and AMI corpora, using a crowdsourcing platform to elicit comparative relevance judgments. The results demonstrate that the method outperforms two competitive baselines.
same-paper 3 0.84901494 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
4 0.81321776 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations
Author: Aliaksei Severyn ; Massimo Nicosia ; Alessandro Moschitti
Abstract: Measuring semantic textual similarity (STS) is at the cornerstone of many NLP applications. Different from the majority of approaches, where a large number of pairwise similarity features are used to represent a text pair, our model features the following: (i) it directly encodes input texts into relational syntactic structures; (ii) relies on tree kernels to handle feature engineering automatically; (iii) combines both structural and feature vector representations in a single scoring model, i.e., in Support Vector Regression (SVR); and (iv) delivers significant improvement over the best STS systems.
5 0.78924668 4 acl-2013-A Context Free TAG Variant
Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber
Abstract: We propose a new variant of TreeAdjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient gram- mar estimation and parsing algorithms. We perform parsing experiments the Penn Treebank and draw comparisons to TreeSubstitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.
6 0.77308577 317 acl-2013-Sentence Level Dialect Identification in Arabic
7 0.70160311 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
8 0.70058733 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
9 0.69803292 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
10 0.69619566 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
11 0.6932463 172 acl-2013-Graph-based Local Coherence Modeling
12 0.69229871 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
13 0.69034177 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
14 0.68796456 58 acl-2013-Automated Collocation Suggestion for Japanese Second Language Learners
15 0.68663824 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
16 0.68620926 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
17 0.68588442 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
18 0.68494141 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
19 0.68376535 121 acl-2013-Discovering User Interactions in Ideological Discussions
20 0.68190408 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics