acl acl2013 acl2013-241 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks a single QA system's outputs by using a traditional MBR model that measures correlations between answer candidates; while the second approach re-ranks the combined outputs of multiple QA systems with heterogeneous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. [sent-2, score-0.186]
2 Evaluations are performed on factoid questions selected from two different domains: Jeopardy! [sent-4, score-0.114]
3 This work makes further exploration along this line of research by applying the MBR technique to question answering (QA). [sent-7, score-0.121]
4 The function of a typical factoid question answering system is to automatically give answers to questions, in most cases asking about entities; such a system usually consists of three key components: question understanding, passage retrieval, and answer extraction. [sent-8, score-0.747]
5 In this paper, we propose two MBR-based Answer Re-ranking (MBRAR) approaches, aiming to re-rank answer candidates from either single or multiple QA systems. [sent-9, score-0.483]
6 The key contribution of this work is that our MBRAR approaches assume little about QA systems and can be easily applied to QA systems with arbitrary sub-components. [sent-11, score-0.066]
7 The remainder of this paper is organized as follows: Section 2 gives a brief review of the QA task and describes two types of QA systems with different pros and cons. [sent-12, score-0.069]
8 Section 3 presents two MBRAR approaches that can re-rank the answer candidates from single and multiple QA systems respectively. [sent-13, score-0.489]
9 Section 5 evaluates our methods on large-scale questions selected from two domains (Jeopardy! [sent-15, score-0.074]
10 (2) Passage Retrieval, which formulates queries based on Q and retrieves passages from an offline corpus or online search engines. [sent-21, score-0.091]
11 (3) Answer Extraction, which first extracts answer candidates from retrieved passages, and then ranks them based on specific ranking models. [sent-24, score-0.494]
12 2 Two Types of QA Systems We present two different QA systems, which are distinguished in three aspects: answer typing, answer generation, and answer ranking. [sent-28, score-0.969]
13 The 1st QA system is denoted as the Type-Dependent QA engine (TD-QA). [sent-29, score-0.049]
14 In the answer typing phase, TD-QA assigns the most probable answer type Tˆ to a given question Q based on: Tˆ = argmax_T P(T|Q), where P(T|Q) is a probabilistic answer-typing model that is similar to Pinchak and Lin (2006)'s work. [sent-30, score-0.785]
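As a concrete illustration, here is a minimal Python sketch of this argmax type selection; the toy predict_type_distribution stand-in and its type inventory are illustrative assumptions, not the paper's actual answer-typing model.

```python
# Sketch of TD-QA's answer typing: pick the single most probable answer
# type, T_hat = argmax_T P(T | Q). The distribution below is a toy
# stand-in for the learned answer-typing model P(T | Q).

def predict_type_distribution(question: str) -> dict[str, float]:
    """Hypothetical stand-in for the answer-typing model P(T | Q)."""
    if question.lower().startswith("who"):
        return {"PERSON": 0.80, "ORGANIZATION": 0.15, "LOCATION": 0.05}
    return {"OTHER": 0.60, "LOCATION": 0.25, "PERSON": 0.15}

def td_qa_answer_type(question: str) -> str:
    dist = predict_type_distribution(question)
    return max(dist, key=dist.get)  # argmax over answer types T

print(td_qa_answer_type("Who wrote Hamlet?"))  # -> PERSON
```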
15 The 2nd QA system is denoted as the Type-Independent QA engine (TI-QA). [sent-32, score-0.096]
16 In the answer typing phase, TI-QA assigns the top N answer types TN(Q), instead of only the best one, to each question Q. [sent-33, score-0.765]
17 The probability of each type candidate is maintained as well. [sent-34, score-0.059]
18 In the answer generation phase, TI-QA extracts all answer candidates from retrieved passages based on the answer types in TN(Q), using the same NER as in TD-QA. [sent-35, score-1.153]
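A minimal sketch of this top-N typing step, assuming the type distribution P(T|Q) is already available as a dictionary (the distribution values here are illustrative):

```python
# Sketch of TI-QA's answer typing: keep the top-N answer types T_N(Q)
# together with their probabilities, instead of only the argmax.

def top_n_types(type_dist: dict[str, float], n: int = 3) -> list[tuple[str, float]]:
    ranked = sorted(type_dist.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]  # [(type, P(T|Q)), ...], probabilities maintained

dist = {"PERSON": 0.80, "ORGANIZATION": 0.15, "LOCATION": 0.05}
print(top_n_types(dist, n=2))
# -> [('PERSON', 0.8), ('ORGANIZATION', 0.15)]
```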
19 However, as the answer-typing model is far from perfect, when prediction errors happen, TD-QA cannot give correct answers at all. [sent-37, score-0.073]
20 On the other hand, TI-QA can provide higher answer coverage, as it can extract answer candidates with multiple answer types. [sent-38, score-1.076]
21 However, more answer candidates with different types make it more difficult for the answer ranking model to rank the correct answer in the top 1 position. [sent-39, score-1.159]
22 So the ranking precision of TI-QA is not as good as that of TD-QA. [sent-40, score-0.08]
23 1 MBRAR for Single QA System MBR decoding (Bickel and Doksum, 1977) aims to select the hypothesis that minimizes the expected loss in classification. [sent-42, score-0.043]
24 In MBRAR, we replace the loss function with a gain function that measures the correlation between answer candidates. [sent-43, score-0.381]
25 G(A, Ak) is the gain function that denotes the degree to which Ak supports A. [sent-45, score-0.054]
26 This function can be further expanded as a weighted combination of a set of correlation features: ∑j λj · hj(A, Ak). [sent-46, score-0.045]
27 The following correlation features are used in G(·): • answer-level n-gram correlation feature: hanswer(A, Ak) = ∑_{ω∈A} #ω(Ak), where ω denotes an n-gram in A and #ω(Ak) denotes the number of times that ω occurs in Ak. [sent-47, score-0.114]
28 • passage-level n-gram correlation feature: hpassage(A, Ak) = ∑_{ω∈PA} #ω(PAk), where PA denotes the passages from which A is extracted. [sent-48, score-0.126]
29 • answer-type agreement feature: htype(A, Ak) = δ(TA, TAk), where δ(TA, TAk) denotes an indicator function that equals 1 when the answer types of A and Ak are the same, and 0 otherwise. [sent-50, score-0.438]
30 • answer-length feature that is used to penalize long answer candidates. [sent-51, score-0.347]
31 • averaged passage-length feature that is used to penalize passages with a long averaged length; a code sketch of the full gain function follows this list. [sent-52, score-0.117]
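As forward-referenced above, here is a minimal sketch of the gain function G(A, Ak) = ∑j λj · hj(A, Ak) assembled from the five features just listed; the candidate dictionary format and the weights λj are illustrative assumptions, not the trained values.

```python
# Sketch of the weighted gain function G(A, A_k) = sum_j lambda_j * h_j(A, A_k)
# over the five correlation features listed above. For simplicity, each
# candidate carries a single supporting passage, so the averaged
# passage-length feature degenerates to that passage's length.

def ngrams(text: str, n: int = 1) -> list[tuple[str, ...]]:
    toks = text.lower().split()
    return [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]

def ngram_overlap(a: str, b: str, n: int = 1) -> float:
    # counts how many times each n-gram of `a` occurs in `b` (h_answer style)
    b_ngrams = ngrams(b, n)
    return float(sum(b_ngrams.count(w) for w in set(ngrams(a, n))))

def gain(cand: dict, other: dict, weights: dict) -> float:
    h = {
        "answer_ngram": ngram_overlap(cand["answer"], other["answer"]),
        "passage_ngram": ngram_overlap(cand["passage"], other["passage"]),
        "type_agree": 1.0 if cand["type"] == other["type"] else 0.0,  # delta
        "ans_len": -float(len(cand["answer"].split())),   # penalize long answers
        "psg_len": -float(len(cand["passage"].split())),  # penalize long passages
    }
    return sum(weights[k] * v for k, v in h.items())

weights = {"answer_ngram": 1.0, "passage_ngram": 0.5,
           "type_agree": 1.0, "ans_len": 0.1, "psg_len": 0.01}
a = {"answer": "William Shakespeare",
     "passage": "Shakespeare wrote Hamlet", "type": "PERSON"}
b = {"answer": "Shakespeare",
     "passage": "Hamlet was written by Shakespeare", "type": "PERSON"}
print(round(gain(a, b, weights), 3))  # -> 2.77
```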
32 2 MBRAR for Multiple QA Systems Aiming to apply MBRAR to the outputs from N QA systems, we modify MBR components as follows. [sent-54, score-0.051]
33 First, the posterior probability is mixed over systems as P(A|HC(Q)) = ∑_{i=1}^{N} αi · P(A|Hi(Q)), where α1, …, αN are coefficients with the following constraints: 0 ≤ αi ≤ 1 and ∑i αi = 1; P(A|Hi(Q)) is the posterior probability of A estimated on the i-th QA system's search space Hi(Q). [sent-58, score-0.019]
34 Third, the features used in the gain function G(·) can be grouped into two categories, including: • system-independent features, which include all features described in Section 3.1 for the single system based MBRAR method; [sent-59, score-0.049]
35 1For simplicity, the coefficients are equally set: αi = 1/N. [sent-60, score-0.086]
36 If QAi fails to generate A, then it equals 0; the ensemble feature hcons(A) equals 1 when A can be generated by all individual QA systems, and 0 otherwise. [sent-62, score-0.136]
37 Thus, the MBRAR for multiple QA systems can be finally formulated as: Aˆ = argmax_{A∈HC(Q)} ∑_{Ai∈HC(Q)} G(A, Ai) · P(Ai|HC(Q)), where the training process of the weights in the gain function is carried out with Ranking SVM2 based on the method described in Verberne et al. [sent-63, score-0.105]
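A self-contained sketch of this mixture-model re-ranking, with equal coefficients αi = 1/N as in footnote 1; the (answer, score) candidate format and the stand-in unigram-overlap gain are illustrative assumptions rather than the paper's exact components (whose gain weights are trained with Ranking SVM).

```python
# Sketch of multi-system MBRAR: pool the candidates of N systems into
# H_C(Q), mix their normalized posteriors with alpha_i = 1/N, and select
#   A_hat = argmax_{A in H_C(Q)} sum_{A_i} G(A, A_i) * P(A_i | H_C(Q)).

def gain(a: str, ak: str) -> float:
    # stand-in gain: unigram overlap between two answer strings
    return float(len(set(a.lower().split()) & set(ak.lower().split())))

def mbrar_rerank(system_nbests: list[list[tuple[str, float]]]) -> str:
    n = len(system_nbests)
    pooled: list[tuple[str, float]] = []
    for nbest in system_nbests:
        z = sum(score for _, score in nbest) or 1.0  # normalize per system
        for ans, score in nbest:
            pooled.append((ans, (score / z) / n))    # alpha_i * P(A | H_i(Q))
    best = max(pooled, key=lambda a: sum(gain(a[0], ak) * p for ak, p in pooled))
    return best[0]

td_qa = [("William Shakespeare", 0.6), ("Francis Bacon", 0.4)]
ti_qa = [("Shakespeare", 0.5), ("William Shakespeare", 0.3), ("Marlowe", 0.2)]
print(mbrar_rerank([td_qa, ti_qa]))  # -> William Shakespeare
```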
38 4 Related Work MBR decoding has been successfully applied to many NLP tasks, e.g. [sent-65, score-0.018]
39 (2009) proposed a classification-based method for the QA task that jointly uses multiple 5-W QA systems by selecting one optimal QA system for each question. [sent-70, score-0.099]
40 Compared to their work, our MBRAR approaches assume little about the question types, and all QA systems contribute to the re-ranking model. [sent-71, score-0.09]
41 (2008) presented an answer validation method that helps individual QA systems automatically detect their own errors based on information from multiple QA systems. [sent-73, score-0.398]
42 (2003) presented a multi-level answer resolution algorithm to merge results from the answering agents at the question, passage, and answer levels. [sent-75, score-0.728]
43 (2012) proposed to use different score combinations to merge answers from different QA systems. [sent-81, score-0.091]
44 Although all methods mentioned above leverage information provided by multiple QA systems, our work is the first to explore the use of the MBR principle for the QA task. [sent-82, score-0.042]
45 1 Data and Metric Questions from two different domains are used as our evaluation data sets: the first data set includes 10,051 factoid question-answer pairs selected from the Jeopardy! [sent-84, score-0.078]
46 quiz show3; while the second data set includes 360 celebrity-asking web questions4 selected from a commercial search engine; the answers for each question are labeled by human annotators. [sent-85, score-0.154]
47 The evaluation metric Succeed@n is defined as the number of questions whose correct answers are successfully ranked within the top n answer candidates. [sent-86, score-0.451]
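A small sketch of computing Succeed@n over such labeled data; the (ranked answers, gold answer) result format is an illustrative assumption.

```python
# Sketch of Succeed@n: count questions whose correct answer appears
# among the top-n re-ranked candidates.

def succeed_at_n(results: list[tuple[list[str], str]], n: int) -> int:
    return sum(1 for ranked, gold in results if gold in ranked[:n])

data = [(["Shakespeare", "Marlowe"], "Shakespeare"),
        (["Dickens", "Austen"], "Austen")]
print(succeed_at_n(data, 1))  # -> 1
print(succeed_at_n(data, 2))  # -> 2
```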
48 2 MBRAR for Single QA System We first evaluate the effectiveness of our MBRAR for a single QA system. [sent-88, score-0.026]
49 Given the N-best answer outputs from each single QA system, together with their ranking scores assigned by the corresponding ranking components, we further perform MBRAR to re-rank them and show the resulting numbers on two evaluation data sets in Tables 1 and 2, respectively. [sent-89, score-0.56]
50 Both Table 1 and Table 2 show that, by leveraging our MBRAR method on individual QA systems, the rankings of correct answers are consistently improved on both Jeopardy! [sent-90, score-0.073]
51 Table 1: Impacts of MBRAR for a single QA system on Jeopardy! questions. [Columns: Succeed@1, Succeed@2, Succeed@3; rows: TD-QA, MBRAR, TI-QA, MBRAR; the numeric cells are interleaved beyond reliable recovery in this extraction.] [sent-93, score-0.05]
52 This is due to the fact that when the answer type is fixed (PERSON for celebrity-asking questions), TI-QA will generate candidates with wrong answer types, which will definitely deteriorate the ranking accuracy. [sent-97, score-0.343] [sent-101, score-0.468]
53 3http://www. com/ [footnote URL truncated in extraction] 4The answers of such questions are person names. [sent-99, score-0.128]
54 Table 2: Impacts of MBRAR for a single QA system on web questions. [Columns: Succeed@1, Succeed@2, Succeed@3; rows: TD-QA, MBRAR, TI-QA, MBRAR; numeric cells interleaved beyond reliable recovery in this extraction.] [sent-100, score-0.074]
56 3 MBRAR for Multiple QA Systems We then evaluate the effectiveness of our MBRAR for multiple QA systems. [sent-103, score-0.042]
57 The mixture model-based MBRAR method described in Section 3.2 is used to rank the combined answer outputs from TD-QA and TI-QA, with ranking results shown in Tables 3 and 4. [sent-104, score-0.022] [sent-105, score-0.48]
59 From Table 3 and Table 4 we can see that, compared to the ranking performance of the single QA systems TD-QA and TI-QA, MBRAR using two QA systems' outputs shows significant improvements on both Jeopardy! [sent-106, score-0.19]
60 Furthermore, compared to MBRAR on a single QA system, MBRAR on multiple QA systems can provide extra gains on both question sets as well. [sent-108, score-0.156]
61 Table 3: Impacts of MBRAR for multiple QA systems on Jeopardy! questions. [Columns: Succeed@1, Succeed@2, Succeed@3; MBRAR: 2,891 / 3,668 / 4,033; the TD-QA and TI-QA rows are interleaved beyond reliable recovery in this extraction.] [sent-110, score-0.075]
62 Table 4: Impacts of MBRAR for multiple QA systems on web questions. [Columns: Succeed@1, Succeed@2, Succeed@3; TD-QA: 97 / 122 / 136 and TI-QA: 95 / 128 / 146, as de-interleaved from the extraction; MBRAR: 108 / 137 / 152.] [sent-112, score-0.099]
63 6 Conclusions and Future Work In this paper, we present two MBR-based answer re-ranking approaches for QA. [sent-113, score-0.323]
64 Compared to previous methods, MBRAR provides a systematic way to re-rank answers from either single or multiple QA systems, without considering their heterogeneous implementations of internal components. [sent-114, score-0.141]
65 Experiments on questions from two different domains show that our proposed method can significantly improve the ranking performance. [sent-115, score-0.154]
66 In the future, we will add more QA systems into our MBRAR framework and design more features for the MBR gain function. [sent-116, score-0.063]
67 Methods Combination and ML-based Re-ranking of Multiple Hypothesis for Question-Answering Systems. In Proceedings of EACL. [sent-134, score-0.077]
68 Training Linear SVMs in Linear Time. In Proceedings of KDD. [sent-137, score-0.077]
69 Evaluating machine learning techniques for ranking answers to why-questions. [sent-157, score-0.153]
wordName wordTfidf (topN-words)
[('qa', 0.557), ('mbrar', 0.55), ('answer', 0.323), ('mbr', 0.243), ('ak', 0.17), ('jeopardy', 0.133), ('ghm', 0.106), ('aar', 0.093), ('aqx', 0.085), ('qai', 0.085), ('ranking', 0.08), ('proceeding', 0.077), ('passages', 0.074), ('answers', 0.073), ('candidates', 0.065), ('answering', 0.064), ('nijmegen', 0.063), ('hc', 0.06), ('factoid', 0.059), ('question', 0.057), ('clst', 0.056), ('questions', 0.055), ('equals', 0.053), ('outputs', 0.051), ('phase', 0.05), ('impacts', 0.049), ('hi', 0.047), ('ru', 0.047), ('succeed', 0.043), ('typing', 0.043), ('ofmbrar', 0.042), ('websucceed', 0.042), ('yaman', 0.042), ('multiple', 0.042), ('risk', 0.038), ('goel', 0.037), ('grappy', 0.037), ('qax', 0.037), ('passage', 0.035), ('bayes', 0.035), ('systems', 0.033), ('verberne', 0.032), ('bickel', 0.031), ('gain', 0.03), ('tehde', 0.03), ('byrne', 0.028), ('correlation', 0.028), ('minimum', 0.027), ('tn', 0.027), ('aiming', 0.027), ('single', 0.026), ('rank', 0.026), ('retrieved', 0.026), ('engine', 0.025), ('titov', 0.025), ('hypothesis', 0.025), ('denotes', 0.024), ('system', 0.024), ('web', 0.024), ('penalize', 0.024), ('nan', 0.022), ('mixture', 0.022), ('candidate', 0.02), ('type', 0.02), ('correlations', 0.019), ('coefficients', 0.019), ('domains', 0.019), ('types', 0.019), ('reranks', 0.019), ('fhe', 0.019), ('ttohe', 0.019), ('pinchak', 0.019), ('ssim', 0.019), ('pak', 0.019), ('czuba', 0.019), ('aqt', 0.019), ('eina', 0.019), ('eofs', 0.019), ('mrank', 0.019), ('nanduan', 0.019), ('raaijmakers', 0.019), ('suzan', 0.019), ('kumar', 0.019), ('decoding', 0.018), ('merge', 0.018), ('formulates', 0.017), ('tahnadt', 0.017), ('sharp', 0.017), ('pros', 0.017), ('igs', 0.017), ('arn', 0.017), ('tak', 0.017), ('teach', 0.017), ('lou', 0.017), ('hthe', 0.017), ('inat', 0.017), ('anselmo', 0.017), ('inr', 0.017), ('aonf', 0.017), ('brigitte', 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 241 acl-2013-Minimum Bayes Risk based Answer Re-ranking for Question Answering
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks single QA system’s outputs by using a traditional MBR model, by measuring correlations between answer candidates; while the second approach reranks the combined outputs of multiple QA systems with heterogenous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
2 0.43916452 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
3 0.23627102 107 acl-2013-Deceptive Answer Prediction with User Preference Graph
Author: Fangtao Li ; Yang Gao ; Shuchang Zhou ; Xiance Si ; Decheng Dai
Abstract: In Community question answering (QA) sites, malicious users may provide deceptive answers to promote their products or services. It is important to identify and filter out these deceptive answers. In this paper, we first solve this problem with the traditional supervised learning methods. Two kinds of features, including textual and contextual features, are investigated for this task. We further propose to exploit the user relationships to identify the deceptive answers, based on the hypothesis that similar users will have similar behaviors to post deceptive or authentic answers. To measure the user similarity, we propose a new user preference graph based on the answer preference expressed by users, such as “helpful” voting and “best answer” selection. The user preference graph is incorporated into traditional supervised learning framework with the graph regularization technique. The experiment results demonstrate that the user preference graph can indeed help improve the performance of deceptive answer prediction.
4 0.21180305 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
Author: Wen-tau Yih ; Ming-Wei Chang ; Christopher Meek ; Andrzej Pastusiak
Abstract: In this paper, we study the answer sentence selection problem for question answering. Unlike previous work, which primarily leverages syntactic analysis through dependency tree matching, we focus on improving the performance using models of lexical semantic resources. Experiments show that our systems can be consistently and significantly improved with rich lexical semantic information, regardless of the choice of learning algorithms. When evaluated on a benchmark dataset, the MAP and MRR scores are increased by 8 to 10 points, compared to one of our baseline systems using only surface-form matching. Moreover, our best system also outperforms previous work that makes use of the dependency tree structure by a wide margin.
5 0.17467886 292 acl-2013-Question Classification Transfer
Author: Anne-Laure Ligozat
Abstract: Question answering systems have been developed for many languages, but most resources were created for English, which can be a problem when developing a system in another language such as French. In particular, for question classification, no labeled question corpus is available for French, so this paper studies the possibility to use existing English corpora and transfer a classification by translating the question and their labels. By translating the training corpus, we obtain results close to a monolingual setting.
6 0.11559853 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
7 0.10344613 290 acl-2013-Question Analysis for Polish Question Answering
8 0.087292373 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals
9 0.084403917 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
10 0.081580505 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation
11 0.081524976 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations
12 0.074033819 329 acl-2013-Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization
13 0.068973728 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions
14 0.058606971 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
15 0.048923358 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
16 0.047252927 255 acl-2013-Name-aware Machine Translation
17 0.038050234 269 acl-2013-PLIS: a Probabilistic Lexical Inference System
18 0.035103183 262 acl-2013-Offspring from Reproduction Problems: What Replication Failure Teaches Us
19 0.034439351 250 acl-2013-Models of Translation Competitions
20 0.033148549 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations
topicId topicWeight
[(0, 0.097), (1, 0.041), (2, 0.016), (3, -0.098), (4, 0.077), (5, 0.066), (6, 0.012), (7, -0.378), (8, 0.159), (9, 0.03), (10, 0.092), (11, -0.064), (12, -0.025), (13, -0.048), (14, 0.051), (15, 0.04), (16, 0.01), (17, -0.075), (18, 0.082), (19, 0.087), (20, 0.075), (21, -0.035), (22, 0.051), (23, -0.164), (24, -0.045), (25, -0.041), (26, -0.012), (27, -0.044), (28, 0.071), (29, -0.066), (30, 0.091), (31, -0.047), (32, -0.018), (33, 0.034), (34, 0.05), (35, 0.012), (36, -0.048), (37, 0.032), (38, -0.006), (39, -0.036), (40, 0.016), (41, -0.032), (42, -0.03), (43, -0.02), (44, 0.111), (45, 0.062), (46, -0.009), (47, -0.029), (48, -0.035), (49, -0.071)]
simIndex simValue paperId paperTitle
same-paper 1 0.98108423 241 acl-2013-Minimum Bayes Risk based Answer Re-ranking for Question Answering
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks a single QA system's outputs by using a traditional MBR model that measures correlations between answer candidates; while the second approach re-ranks the combined outputs of multiple QA systems with heterogeneous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
2 0.89495575 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
Author: Xuchen Yao ; Benjamin Van Durme ; Peter Clark
Abstract: Information Retrieval (IR) and Answer Extraction are often designed as isolated or loosely connected components in Question Answering (QA), with repeated overengineering on IR, and not necessarily performance gain for QA. We propose to tightly integrate them by coupling automatically learned features for answer extraction to a shallow-structured IR model. Our method is very quick to implement, and significantly improves IR for QA (measured in Mean Average Precision and Mean Reciprocal Rank) by 10%-20% against an uncoupled retrieval baseline in both document and passage retrieval, which further leads to a downstream 20% improvement in QA F1.
3 0.80905217 107 acl-2013-Deceptive Answer Prediction with User Preference Graph
Author: Fangtao Li ; Yang Gao ; Shuchang Zhou ; Xiance Si ; Decheng Dai
Abstract: In Community question answering (QA) sites, malicious users may provide deceptive answers to promote their products or services. It is important to identify and filter out these deceptive answers. In this paper, we first solve this problem with the traditional supervised learning methods. Two kinds of features, including textual and contextual features, are investigated for this task. We further propose to exploit the user relationships to identify the deceptive answers, based on the hypothesis that similar users will have similar behaviors to post deceptive or authentic answers. To measure the user similarity, we propose a new user preference graph based on the answer preference expressed by users, such as “helpful” voting and “best answer” selection. The user preference graph is incorporated into traditional supervised learning framework with the graph regularization technique. The experiment results demonstrate that the user preference graph can indeed help improve the performance of deceptive answer prediction.
4 0.72043025 266 acl-2013-PAL: A Chatterbot System for Answering Domain-specific Questions
Author: Yuanchao Liu ; Ming Liu ; Xiaolong Wang ; Limin Wang ; Jingjing Li
Abstract: In this paper, we propose PAL, a prototype chatterbot for answering non-obstructive psychological domain-specific questions. This system focuses on providing primary suggestions or helping people relieve pressure by extracting knowledge from online forums, based on which the chatterbot system is constructed. The strategies used by PAL, including semantic-extension-based question matching, solution management with personal information consideration, and XML-based knowledge pattern construction, are described and discussed. We also conduct a primary test for the feasibility of our system.
5 0.71546268 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
Author: Xipeng Qiu ; Le Tian ; Xuanjing Huang
Abstract: Retrieving similar questions is very important in community-based question answering (CQA). In this paper, we propose a unified question retrieval model based on latent semantic indexing with tensor analysis, which can capture word associations among different parts of CQA triples simultaneously. Thus, our method can reduce the lexical chasm of question retrieval with the help of the information in the question content and answer parts. The experimental result shows that our method outperforms the traditional methods.
6 0.71463352 292 acl-2013-Question Classification Transfer
7 0.64767003 290 acl-2013-Question Analysis for Polish Question Answering
9 0.6263296 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
10 0.56457949 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
11 0.52607906 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals
12 0.47802132 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation
13 0.43404132 239 acl-2013-Meet EDGAR, a tutoring agent at MONSERRATE
14 0.43071046 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations
15 0.41520321 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
16 0.38814166 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
17 0.28183943 122 acl-2013-Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners
18 0.26408792 250 acl-2013-Models of Translation Competitions
19 0.25937432 350 acl-2013-TopicSpam: a Topic-Model based approach for spam detection
20 0.25262105 262 acl-2013-Offspring from Reproduction Problems: What Replication Failure Teaches Us
topicId topicWeight
[(0, 0.045), (4, 0.035), (6, 0.025), (11, 0.059), (24, 0.021), (26, 0.039), (35, 0.093), (42, 0.07), (48, 0.016), (49, 0.333), (70, 0.063), (88, 0.024), (90, 0.024), (95, 0.044)]
simIndex simValue paperId paperTitle
same-paper 1 0.73031968 241 acl-2013-Minimum Bayes Risk based Answer Re-ranking for Question Answering
Author: Nan Duan
Abstract: This paper presents two minimum Bayes risk (MBR) based Answer Re-ranking (MBRAR) approaches for the question answering (QA) task. The first approach re-ranks a single QA system's outputs by using a traditional MBR model that measures correlations between answer candidates; while the second approach re-ranks the combined outputs of multiple QA systems with heterogeneous answer extraction components by using a mixture model-based MBR model. Evaluations are performed on factoid questions selected from two different domains: Jeopardy! and Web, and significant improvements are achieved on all data sets.
2 0.43189615 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
Author: Trevor Cohn ; Gholamreza Haffari
Abstract: Modern phrase-based machine translation systems make extensive use of wordbased translation models for inducing alignments from parallel corpora. This is problematic, as the systems are incapable of accurately modelling many translation phenomena that do not decompose into word-for-word translation. This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior. Overall this leads to a model which learns translations of entire sentences, while also learning their decomposition into smaller units (phrase-pairs) recursively, terminating at word translations. Our experiments on Arabic, Urdu and Farsi to English demonstrate improvements over competitive baseline systems.
3 0.42962265 121 acl-2013-Discovering User Interactions in Ideological Discussions
Author: Arjun Mukherjee ; Bing Liu
Abstract: Online discussion forums are a popular platform for people to voice their opinions on any subject matter and to discuss or debate any issue of interest. In forums where users discuss social, political, or religious issues, there are often heated debates among users or participants. Existing research has studied mining of user stances or camps on certain issues, opposing perspectives, and contention points. In this paper, we focus on identifying the nature of interactions among user pairs. The central questions are: How does each pair of users interact with each other? Does the pair of users mostly agree or disagree? What is the lexicon that people often use to express agreement and disagreement? We present a topic model based approach to answer these questions. Since agreement and disagreement expressions are usually multiword phrases, we propose to employ a ranking method to identify highly relevant phrases prior to topic modeling. After modeling, we use the modeling results to classify the nature of interaction of each user pair. Our evaluation results using real-life discussion/debate posts demonstrate the effectiveness of the proposed techniques.
4 0.42799738 173 acl-2013-Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
Author: Xiaodong Zeng ; Derek F. Wong ; Lidia S. Chao ; Isabel Trancoso
Abstract: This paper introduces a graph-based semisupervised joint model of Chinese word segmentation and part-of-speech tagging. The proposed approach is based on a graph-based label propagation technique. One constructs a nearest-neighbor similarity graph over all trigrams of labeled and unlabeled data for propagating syntactic information, i.e., label distributions. The derived label distributions are regarded as virtual evidences to regularize the learning of linear conditional random fields (CRFs) on unlabeled data. An inductive character-based joint model is obtained eventually. Empirical results on Chinese tree bank (CTB-7) and Microsoft Research corpora (MSR) reveal that the proposed model can yield better results than the supervised baselines and other competitive semi-supervised CRFs in this task.
5 0.42796248 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
Author: Wei Xu ; Raphael Hoffmann ; Le Zhao ; Ralph Grishman
Abstract: Distant supervision has attracted recent interest for training information extraction systems because it does not require any human annotation but rather employs existing knowledge bases to heuristically label a training corpus. However, previous work has failed to address the problem of false negative training examples mislabeled due to the incompleteness of knowledge bases. To tackle this problem, we propose a simple yet novel framework that combines a passage retrieval model using coarse features into a state-of-the-art relation extractor using multi-instance learning with fine features. We adapt the information retrieval technique of pseudo- relevance feedback to expand knowledge bases, assuming entity pairs in top-ranked passages are more likely to express a relation. Our proposed technique significantly improves the quality of distantly supervised relation extraction, boosting recall from 47.7% to 61.2% with a consistently high level of precision of around 93% in the experiments.
6 0.42784905 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
7 0.4277814 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus
8 0.42747986 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
9 0.42695442 315 acl-2013-Semi-Supervised Semantic Tagging of Conversational Understanding using Markov Topic Regression
10 0.42537335 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing
11 0.42375895 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
12 0.42338359 172 acl-2013-Graph-based Local Coherence Modeling
13 0.42175692 273 acl-2013-Paraphrasing Adaptation for Web Search Ranking
14 0.42152679 4 acl-2013-A Context Free TAG Variant
15 0.42065594 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions
16 0.42052671 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
17 0.42032638 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
18 0.41997981 225 acl-2013-Learning to Order Natural Language Texts
19 0.41981211 275 acl-2013-Parsing with Compositional Vector Grammars
20 0.41945922 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization