emnlp emnlp2010 emnlp2010-74 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Razvan Bunescu ; Yunfeng Huang
Abstract: We present a machine learning approach for the task of ranking previously answered questions in a question repository with respect to their relevance to a new, unanswered reference question. The ranking model is trained on a collection of question groups manually annotated with a partial order relation reflecting the relative utility of questions inside each group. Based on a set of meaning and structure aware features, the new ranking model is able to substantially outperform more straightforward, unsupervised similarity measures.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We present a machine learning approach for the task of ranking previously answered questions in a question repository with respect to their relevance to a new, unanswered reference question. [sent-2, score-0.95]
2 The ranking model is trained on a collection of question groups manually annotated with a partial order relation reflecting the relative utility of questions inside each group. [sent-3, score-0.869]
3 a new approach to question answering that shifts the inherent complexity of open domain QA from the computer system to volunteer contributors. [sent-16, score-0.301]
4 The computer is no longer required to perform a deep linguistic analysis of questions and generate corresponding answers, and instead acts as a mediator between users submitting questions and volunteers providing the answers. [sent-17, score-0.7]
5 An important objective in community QA is to minimize the time elapsed between the submission of questions by users and the subsequent posting of answers by volunteer contributors. [sent-18, score-0.474]
6 One useful strategy for minimizing the response latency is to search the QA repository for similar questions that have already been answered, and provide the corresponding ranked list of answers, if such a question is found. [sent-19, score-0.638]
7 In the simplest solution, the system searches for previously answered questions based on exact string matching with the reference question. [sent-21, score-0.553]
8 Alternatively, sites such as WikiAnswers allow the users to mark questions they think are rephrasings (“alternate wordings”, or paraphrases) of existing questions. [sent-22, score-0.414]
9 These question clusters are then taken into account when performing exact string matching, therefore increasing the likelihood of finding previously answered questions that are semantically equivalent to the reference question. [sent-23, score-0.758]
10 In order to lessen the amount of work required from the contributors, an alternative approach is to build a system that automatically finds rephrasings of questions, especially since question rephrasing [page footer: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, MIT, Massachusetts, USA, 9-11 October 2010] [sent-24, score-0.342]
11 According to previous work in this domain, a question is considered a rephrasing of a reference question Q0 if it uses an alternate wording to express an identical information need. [sent-27, score-0.668]
12 We believe that computing a ranked list of existing questions that at least partially address the original information need could also be useful to the user, at least until other users volunteer to give an exact answer to the original, unanswered reference question. [sent-32, score-0.696]
13 For example, in the absence of any additional information about the reference question Q0, the expected answers to questions Q2 and Q3 below may be seen as partially overlapping in information content with the expected answer for the reference question Q0. [sent-33, score-1.362]
14 An answer to question Q4, on the other hand, is less likely to benefit the user, even though it has a significant lexical overlap with the reference question. [sent-34, score-0.475]
15 Underlying the question ranking task is the expectation that the user who submits a reference question will find the answers of the highly ranked questions to be more useful than the answers associated with the lower ranked questions. [sent-39, score-1.323]
16 For the reference question Q0 above, the learned ranking model is expected to produce a partial order in which Q1 is ranked higher than Q2, Q3 and Q4, whereas Q2 and Q3 are ranked higher than Q4. [sent-40, score-0.608]
17 2 Partially Ordered Datasets for Question Ranking In order to enable the evaluation of question ranking approaches, we have previously created a dataset of 60 groups of questions (Bunescu and Huang, 2010b). [sent-41, score-0.772]
18 Q0 above) that is associated with a partially ordered set of questions (e. [sent-44, score-0.429]
19 For each reference question, its corresponding partially ordered set is created from questions in Yahoo! [sent-47, score-0.534]
20 Answers, the 60 reference questions span a diverse set of categories. [sent-50, score-0.455]
21 Figure 1 lists the 20 categories covered, where each category is shown with the number of corresponding reference questions between parentheses. [sent-51, score-0.455]
22 Inside each group, the questions are manually annotated with a partial order relation, according to their utility with respect to the reference question. [sent-52, score-0.564]
23 We use the notation ⟨Qi ≻ Qj | Qr⟩ to encode the fact that question Qi is more useful than question Qj with respect to the reference question Qr. [sent-53, score-0.916]
24 The partial ordering among the questions Q0 to Q4 above can therefore be expressed concisely as follows: ⟨Q0 = Q1⟩, ⟨Q1 ≻ Q2 | Q0⟩, ⟨Q1 ≻ Q3 | Q0⟩, ⟨Q2 ≻ Q4 | Q0⟩, ⟨Q3 ≻ Q4 | Q0⟩. [sent-55, score-0.383]
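The partial order for Q0 can be held as a set of (more useful, less useful) pairs. The following is a minimal Python sketch under stated assumptions: the string identifiers, the function names, and the explicit transitive-closure step are illustrative choices and are not taken from the paper.

```python
from itertools import product

def transitive_closure(pairs):
    """Return the transitive closure of a set of (better, worse) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def comparable(a, b, ordered_pairs):
    """True if the partial order ranks a above b or b above a."""
    return (a, b) in ordered_pairs or (b, a) in ordered_pairs

# Relations annotated with respect to reference question Q0.
annotated = {("Q1", "Q2"), ("Q1", "Q3"), ("Q2", "Q4"), ("Q3", "Q4")}
ordered = transitive_closure(annotated)
```

With this representation, Q1 ends up ranked above Q4 through the closure, while Q2 and Q3 stay incomparable, which matches the annotation described for this group.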
25 This reflects our belief that, in the absence of any additional information regarding the user or the “turtle” referenced in Q0, we cannot compare questions Q2 and Q3 in terms of their usefulness with respect to Q0. [sent-72, score-0.434]
26 Table 1 shows another reference question Q5 from our dataset, together with its annotated group of questions Q6 to Q20. [sent-73, score-0.716]
27 During the first annotation stage, each question group is partitioned manually into 3 subgroups of questions: • P is the set of paraphrasing questions. [sent-75, score-0.343]
28 A question is deemed useful if its expected answer may overlap in information content with the expected answer of the reference question. [sent-78, score-0.67]
29 ⟨Qp ≻ Qu | Qr⟩: a paraphrasing question is more useful than a useful question. [sent-82, score-0.315]
30 a paraphrasing question is more useful than a in. [sent-88, score-0.315]
31 In the second annotation stage, we perform a finer annotation of relations between questions in the middle group U. [sent-93, score-0.445]
32 f the reference questions are shorter than the other questions in their group. [sent-104, score-0.805]
33 We have also created a complex version of the same dataset, by selecting as the reference question in each group a longer question from the same group. [sent-105, score-0.627]
34 We believe that the new complex dataset is closer to the actual distribution of questions in community QA repositories: unanswered questions tend to be more specific (longer), whereas general questions (shorter) are more likely to have been answered already. [sent-108, score-1.169]
35 P = |Pairs1 ∩ Pairs2| / |Pairs1|, R = |Pairs1 ∩ Pairs2| / |Pairs2|. The statistics in Table 2 indicate that the second annotator was in general more conservative in tagging questions as paraphrases or useful questions. [sent-114, score-0.404]
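The agreement figures can be computed directly from two sets of ordered pairs. The sketch below assumes that P and R are the size of the intersection divided by the size of the first and second annotator's pair set respectively, which is one reading of the damaged formulas; the function name and the toy pairs are hypothetical.

```python
def pair_agreement(pairs1, pairs2):
    """Precision and recall of one set of ordered pairs against another.

    Assumed reading: P = |pairs1 & pairs2| / |pairs1|,
                     R = |pairs1 & pairs2| / |pairs2|.
    """
    common = pairs1 & pairs2
    precision = len(common) / len(pairs1) if pairs1 else 0.0
    recall = len(common) / len(pairs2) if pairs2 else 0.0
    return precision, recall

# Toy example: the second annotator produced fewer ordered pairs.
pairs_a1 = {("Q1", "Q2"), ("Q1", "Q3"), ("Q2", "Q4"), ("Q3", "Q4")}
pairs_a2 = {("Q1", "Q2"), ("Q2", "Q4")}
p, r = pair_agreement(pairs_a1, pairs_a2)
```

In this toy case every pair from the second set also appears in the first, so recall is 1.0 while precision is 0.5.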
36 3 Unsupervised Methods for Question Ranking An ideal question ranking method would take an arbitrary triplet of questions Qr, Qi and Qj as input, and output an ordering between Qi and Qj with respect to the reference question Qr, i. [sent-115, score-1.129]
37 As a measure of question similarity, one major drawback of cosine similarity is that it is oblivious of the meanings of words in each question. [sent-122, score-0.404]
38 This particular problem is illustrated by the three questions below. [sent-123, score-0.35]
39 4 Supervised Learning for Question Ranking Cosine similarity, henceforth referred to as cos, treats questions as bags-of-words. [sent-136, score-0.35]
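A bag-of-words cosine similarity can be sketched in a few lines of Python. This version assumes whitespace tokenization and raw term counts; any term weighting used in the paper (for example idf) is not reproduced here, and the example questions are made up.

```python
import math
from collections import Counter

def cos_sim(q1, q2):
    """Cosine similarity between two questions treated as bags of words."""
    a = Counter(q1.lower().split())
    b = Counter(q2.lower().split())
    # Dot product over the words shared by both questions.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# "buy" and "purchase" are synonyms, but cos only sees a word mismatch.
score = cos_sim("where can i buy a turtle", "where can i purchase a turtle")
```

The score for the two example questions stays below 1 even though they express the same need, which illustrates why a measure that ignores word meaning can misjudge paraphrases.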
40 Both cos and mcs ignore the syntactic relations between the words in a question, and therefore may miss important structural information. [sent-139, score-0.31]
41 In the next three sections we describe a set of structural features that we believe are relevant for judging question similarity. [sent-140, score-0.297]
42 1 Matching the Focus Words If we consider the question Q24 below as reference, question Q26 will be deemed more useful than Q25 when using cos or mcs because of the higher relative lexical and conceptual overlap with Q24. [sent-144, score-0.844]
43 However, instead of relying exclusively on a predefined hierarchy of answer types, we identify the question focus of a question, defined as the set of maximal noun phrases in the question that corefer with the expected answer (Bunescu and Huang, 2010a). [sent-150, score-0.808]
44 We use answer types only for questions such as Q27 or Q28 below that lack an explicit question focus. [sent-152, score-0.72]
45 In such cases, an artificial question focus is created from the answer type (e. [sent-153, score-0.407]
46 Let fi and fr be the focus words corresponding to questions Qi and Qr. [sent-158, score-0.455]
47 We introduce a focus feature φf, and set its value equal to the similarity between the focus words: φf(Qi, Qr) = wsim(fi, fr) (Equation 1). We use wsim to denote a generic word meaning similarity measure (e. [sent-159, score-0.38]
48 2 Matching the Main Verbs In addition to the question focus, the main verb of a question can also provide key information in estimating question-to-question similarity. [sent-166, score-0.561]
49 If the question does not contain a content verb, the main verb is defined to be the highest verb in the dependency tree, as for example are in Q24 to Q26. [sent-170, score-0.382]
50 The utility of a question’s main verb in judging its similarity to other questions can be seen more clearly in the questions below, where Q29 is the reference: Q29 How can I transfer music from iTunes to my iPod? [sent-171, score-0.96]
51 Let vi and vr be the main verbs corresponding to questions Qi and Qr. [sent-177, score-0.543]
52 We introduce a main verb feature φv as follows: φv(Qi, Qr) = wsim(vi, vr) (Equation 2). If Q29 is considered as the reference question, it is expected that the main verb feature for question Q30 will have a higher value than the main verb feature for Q31, i. [sent-178, score-0.514]
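Both the focus feature and the main verb feature reduce to one call of a word similarity function on a pair of words. The sketch below uses a hypothetical lookup table in place of wsim; the table, its scores, and the function names are invented for illustration and do not come from the paper, which leaves wsim generic.

```python
# Toy similarity scores; the values are made up for illustration only.
TOY_SCORES = {
    ("transfer", "copy"): 0.8,
    ("transfer", "buy"): 0.2,
}

def toy_wsim(w1, w2):
    """Hypothetical stand-in for wsim: symmetric lookup, 1.0 for identical words."""
    if w1 == w2:
        return 1.0
    return TOY_SCORES.get((w1, w2), TOY_SCORES.get((w2, w1), 0.0))

def focus_feature(focus_i, focus_r, wsim=toy_wsim):
    """phi_f(Qi, Qr) = wsim(f_i, f_r), taking one focus word per question."""
    return wsim(focus_i, focus_r)

def main_verb_feature(verb_i, verb_r, wsim=toy_wsim):
    """phi_v(Qi, Qr) = wsim(v_i, v_r), taking one main verb per question."""
    return wsim(verb_i, verb_r)
```

Passing wsim as a parameter keeps the features independent of any particular word similarity measure, so a lexical-resource-based measure could be swapped in without changing the feature code.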
53 3 Matching the Dependency Trees The question focus and the main verb are only two of the nodes in the syntactic dependency tree of a question. [sent-183, score-0.449]
54 In general, all the words in a question are important when judging its semantic similarity with another question. [sent-184, score-0.41]
55 We therefore propose a more general feature that exploits the dependency structure of the question and, in doing so, it also considers all the words in the question, like cos and mcs. [sent-185, score-0.428]
56 For any given question we initially ignore the direction of the dependency arcs and change the question dependency tree to be rooted at the focus word, as illustrated in Figure 2 for questions Q5 and Q9. [sent-186, score-1.019]
57 We define the dependency tree similarity between two questions Qi and Qr to be a function of similarities wsim(vi, vr) computed between aligned nodes vi ∈ Qi and vr ∈ Qr. [sent-188, score-0.768]
58 ) C: for finding a matching between two question dependency trees rooted at the focus words. [sent-212, score-0.397]
59 Figure 2 shows the results of applying the tree matching algorithm on questions Q5 and Q9. [sent-219, score-0.43]
60 We introduce a new feature φt(Qi, Qr) whose value is defined as the dependency tree similarity between questions Qi and Qr. [sent-221, score-0.53]
61 When computing the similarity between two matched nodes, we factor in the similarities between corresponding pairs of words on the paths fi ⇝ vi and fr ⇝ vr between the focus words fi, fr and the nodes vi, vr, as shown in Equation 5. [sent-223, score-0.455]
62 Each of the generic features φf, φv, φt, and mcs corresponds to four actual features, one for each possible choice of the word similarity function wsim (i. [sent-242, score-0.34]
63 An additional pair of features is targeted at questions containing locations: 6. [sent-245, score-0.35]
64 φl (Qi, Qr) = 1 if both questions contain locations, 0 otherwise. [sent-246, score-0.35]
65 5 Experimental Evaluation We use the four question ranking datasets described in Section 2 to evaluate the three similarity measures cos, mcs, and φt, as well as the SVM ranking model. [sent-255, score-0.676]
66 Each question similarity measure is evaluated in terms of its accuracy on the set of ordered pairs, and the performance is averaged between the two annotators for the Simple and Complex datasets. [sent-257, score-0.428]
67 If ⟨Qi ≻ Qj | Qr⟩ is a relation specified in the annotation, we consider the tuple ⟨Qi, Qj, Qr⟩ correctly classified if and only if u(Qi, Qr) > u(Qj, Qr), where u is the question similarity measure. [sent-258, score-0.4]
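The accuracy criterion can be written as a short function over annotated triples. This is a sketch, assuming each relation is stored as a (Qi, Qj, Qr) tuple meaning that Qi is more useful than Qj with respect to Qr; the word-overlap utility u and the example questions are toy stand-ins.

```python
def pairwise_accuracy(relations, u):
    """Fraction of (qi, qj, qr) triples with u(qi, qr) > u(qj, qr)."""
    relations = list(relations)
    if not relations:
        return 0.0
    correct = sum(1 for qi, qj, qr in relations if u(qi, qr) > u(qj, qr))
    return correct / len(relations)

def overlap(q, r):
    """Toy utility: number of distinct words shared with the reference."""
    return len(set(q.split()) & set(r.split()))

relations = [
    ("buy turtle food", "feed cat", "buy turtle"),  # 2 > 0, counted correct
    ("cat", "turtle", "buy turtle"),                # 0 > 1 fails, counted wrong
]
accuracy = pairwise_accuracy(relations, overlap)
```

Because the comparison is strict, a measure that assigns equal scores to both questions in a triple is counted as wrong for that triple.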
68 For each question, the focus is identified automatically by an SVM tagger trained on a separate corpus of 2,000 questions manually annotated with focus information (Bunescu and Huang, 2010a). [sent-263, score-0.424]
69 The main verb of a question is identified deterministically using a breadth first traversal of the dependency tree. [sent-268, score-0.343]
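A breadth-first search for the main verb might look as follows. The sketch rests on several assumptions that the text does not spell out: the tree is given as a child map over unique token strings, verbs carry Penn Treebank tags starting with "VB", the first content verb in breadth-first order is preferred, and the highest verb is the fallback. The NON_CONTENT list and both toy trees are hypothetical.

```python
from collections import deque

# Hypothetical list of verbs that are not treated as content verbs.
NON_CONTENT = frozenset({
    "be", "am", "is", "are", "was", "were", "been", "being",
    "do", "does", "did", "have", "has", "had",
})

def main_verb(root, children, pos, non_content=NON_CONTENT):
    """Return the first content verb in BFS order, else the highest verb."""
    first_verb = None
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if pos[node].startswith("VB"):
            if node.lower() not in non_content:
                return node
            if first_verb is None:
                first_verb = node  # remember the highest verb as fallback
        queue.extend(children.get(node, []))
    return first_verb

# Toy tree for "How can I transfer music from iTunes to my iPod?"
children1 = {"transfer": ["How", "can", "I", "music", "iTunes", "iPod"],
             "iTunes": ["from"], "iPod": ["to", "my"]}
pos1 = {"transfer": "VB", "How": "WRB", "can": "MD", "I": "PRP",
        "music": "NN", "iTunes": "NNP", "from": "IN", "iPod": "NNP",
        "to": "TO", "my": "PRP$"}

# Toy tree for a question with no content verb, rooted at "are".
children2 = {"are": ["What", "movies"], "movies": ["some", "good", "thriller"]}
pos2 = {"are": "VBP", "What": "WP", "movies": "NNS", "some": "DT",
        "good": "JJ", "thriller": "NN"}
```

For the first tree the search returns "transfer"; for the second it falls back to "are" because no content verb is present.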
70 The random baseline assigning a random similarity value to each pair of questions results in 50% accuracy. [sent-270, score-0.463]
71 Even though its use of word senses was expected to lead to superior results, mcs does not perform better than cos on this dataset. [sent-271, score-0.302]
72 Our implementation of mcs did however perform better than cos on the Microsoft paraphrase corpus (Dolan et al. [sent-272, score-0.297]
73 One possible reason for this behavior is that mcs seems to be less resilient than cos to differences in question length. [sent-274, score-0.532]
74 Whereas the Microsoft paraphrase corpus was specifically designed such that “the length of the shorter of the two sentences, in words, is at least 66% that of the longer” (Dolan and Brockett, 2005), the question ranking datasets place no constraints on the lengths of the [sent-275, score-0.436]
75 However, even though by themselves the meaning aware mcs and the structure-and-meaning aware φt do not outperform the bag-of-words cos, they do help in increasing the performance of the SVM ranking model, as can be inferred from the corresponding columns in Table 5. [sent-302, score-0.323]
76 The following question patterns illustrate the need to design more complex similarity measures that take into account the context of every word in the question: P1 Where can I find a job around ⟨City⟩? [sent-306, score-0.466]
77 Below are three instantiations of the first question pattern: Q32 Where can I find a job around Anaheim, CA? [sent-309, score-0.299]
78 If we take Q32 as reference question, the fact that the distance between Los Angeles and Anaheim is smaller than the distance between Vista and Anaheim leads the ranking system to rank Q33 as more useful than Q34 with respect to Q32, which is the 105 expected result. [sent-312, score-0.288]
79 6 Future Work We plan to integrate context dependent word similarity measures into a more robust question utility function. [sent-315, score-0.451]
80 The questions that are posted on community QA sites often contain spelling or grammatical errors. [sent-317, score-0.374]
81 Consequently, we will work on interfacing the question ranking system with a separate module aimed at fixing orthographic and grammatical errors. [sent-318, score-0.385]
82 7 Related Work The question rephrasing subtask has spawned a diverse set of approaches. [sent-319, score-0.302]
83 , 2002) derive a set of phrasal patterns for question reformulation by generalizing surface patterns acquired automatically from a large corpus of web documents. [sent-321, score-0.359]
84 , 2005), word translation probabilities are trained on pairs of semantically similar questions that are automatically extracted from an FAQ archive, and then used in a language model that retrieves question reformulations. [sent-324, score-0.611]
85 (Jijkoun and de Rijke, 2005) describe an FAQ question retrieval system in which weighted combinations of similarity functions corresponding to questions, existing answers, FAQ titles and pages are computed using a vector space model. [sent-325, score-0.374]
86 , 2007) exploit the Encarta logs to automatically extract clusters containing question paraphrases and further train a perceptron to recognize question paraphrases inside each cluster based on a combination of lexical, syntactic and semantic similarity features. [sent-327, score-0.743]
87 More recently, (Bernhard and Gurevych, 2008) evaluated various string similarity measures and vector space based similarity measures on the task of retrieving question paraphrases from the WikiAnswers repository. [sent-328, score-0.624]
88 , 2008) is to return questions that are semantically equivalent or close to the queried question, and is therefore similar to our question ranking task. [sent-330, score-0.735]
89 Their approach is evaluated on a dataset in which questions are categorized either as relevant or irrelevant. [sent-331, score-0.387]
90 Our formulation of question ranking is more general, and in particular subsumes the annotation of binary question categories such as relevant vs. [sent-332, score-0.674]
91 The question ranking task was first formulated in (Bunescu and Huang, 2010b), where an initial version of the dataset was also described. [sent-337, score-0.422]
92 In this paper, we introduce 4 versions of the dataset, a more general meaning and structure aware similarity measure, and a supervised model for ranking that substantially outperforms the previously proposed utility measures. [sent-338, score-0.311]
93 8 Conclusion We presented a supervised learning approach to the question ranking task in which previously known questions are ordered based on their relative utility with respect to a new, reference question. [sent-339, score-0.997]
94 We created four versions of a dataset of 60 groups of questions, each annotated with a partial order relation reflecting the relative utility of questions inside each group. [sent-340, score-0.871]
95 Answering learners’ questions by retrieving question paraphrases from social Q&A sites. [sent-351, score-0.69]
96 A utility-driven approach to question ranking in social QA. [sent-360, score-0.385]
97 Searching questions by identifying question topic and question focus. [sent-379, score-0.872]
98 Natural language based reformulation resource and web exploitation for question answering. [sent-383, score-0.309]
99 Finding similar questions in large question and answer archives. [sent-388, score-0.72]
100 Retrieving answers from frequently asked questions pages on the Web. [sent-401, score-0.434]
wordName wordTfidf (topN-words)
[('qr', 0.516), ('qi', 0.404), ('questions', 0.35), ('question', 0.261), ('qj', 0.187), ('mcs', 0.147), ('qri', 0.147), ('vr', 0.124), ('cos', 0.124), ('ranking', 0.124), ('similarity', 0.113), ('answer', 0.109), ('qa', 0.107), ('reference', 0.105), ('hqi', 0.094), ('answers', 0.084), ('turtle', 0.08), ('wsim', 0.08), ('vi', 0.069), ('wup', 0.067), ('res', 0.057), ('matching', 0.056), ('usefulness', 0.056), ('paraphrases', 0.054), ('ordered', 0.054), ('paraphrasing', 0.054), ('bunescu', 0.053), ('utility', 0.048), ('reformulation', 0.048), ('movies', 0.046), ('nodes', 0.045), ('buy', 0.045), ('dependency', 0.043), ('feed', 0.042), ('answered', 0.042), ('rephrasing', 0.041), ('svm', 0.04), ('anaheim', 0.04), ('hcity', 0.04), ('jcn', 0.04), ('maxmat', 0.04), ('rephrasings', 0.04), ('thriller', 0.04), ('unanswered', 0.04), ('volunteer', 0.04), ('yunfeng', 0.04), ('relations', 0.039), ('verb', 0.039), ('dolan', 0.038), ('job', 0.038), ('idf', 0.038), ('focus', 0.037), ('dataset', 0.037), ('judging', 0.036), ('duan', 0.034), ('faq', 0.034), ('maxsim', 0.034), ('matched', 0.034), ('ohio', 0.034), ('fr', 0.034), ('fi', 0.034), ('partial', 0.033), ('mihalcea', 0.032), ('summer', 0.032), ('expected', 0.031), ('interrogative', 0.031), ('qn', 0.031), ('cosine', 0.03), ('measures', 0.029), ('francisco', 0.029), ('respect', 0.028), ('annotation', 0.028), ('relative', 0.027), ('ranked', 0.027), ('camp', 0.027), ('eecs', 0.027), ('hqp', 0.027), ('hydrangea', 0.027), ('ipod', 0.027), ('irps', 0.027), ('itunes', 0.027), ('paraphrasings', 0.027), ('qul', 0.027), ('ternary', 0.027), ('vista', 0.027), ('wikianswers', 0.027), ('xidf', 0.027), ('sim', 0.027), ('relation', 0.026), ('aware', 0.026), ('paraphrase', 0.026), ('partially', 0.025), ('retrieving', 0.025), ('datasets', 0.025), ('patterns', 0.025), ('tree', 0.024), ('music', 0.024), ('deemed', 0.024), ('sites', 0.024), ('razvan', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999899 74 emnlp-2010-Learning the Relative Usefulness of Questions in Community QA
Author: Razvan Bunescu ; Yunfeng Huang
Abstract: We present a machine learning approach for the task of ranking previously answered questions in a question repository with respect to their relevance to a new, unanswered reference question. The ranking model is trained on a collection of question groups manually annotated with a partial order relation reflecting the relative utility of questions inside each group. Based on a set of meaning and structure aware features, the new ranking model is able to substantially outperform more straightforward, unsupervised similarity measures.
2 0.26518166 51 emnlp-2010-Function-Based Question Classification for General QA
Author: Fan Bu ; Xingwei Zhu ; Yu Hao ; Xiaoyan Zhu
Abstract: In contrast with the booming increase of internet data, state-of-art QA (question answering) systems, otherwise, concerned data from specific domains or resources such as search engine snippets, online forums and Wikipedia in a somewhat isolated way. Users may welcome a more general QA system for its capability to answer questions of various sources, integrated from existed specialized sub-QA engines. In this framework, question classification is the primary task. However, the current paradigms of question classification were focused on some specified type of questions, i.e. factoid questions, which are inappropriate for the general QA. In this paper, we propose a new question classification paradigm, which includes a question taxonomy suitable to the general QA and a question classifier based on MLN (Markov logic network), where rule-based methods and statistical methods are unified into a single framework in a fuzzy discriminative learning approach. Experiments show that our method outperforms traditional question classification approaches.
3 0.11426165 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
4 0.08610142 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval
Author: Danish Contractor ; Govind Kothari ; Tanveer Faruquie ; L V Subramaniam ; Sumit Negi
Abstract: Recent times have seen a tremendous growth in mobile based data services that allow people to use Short Message Service (SMS) to access these data services. In a multilingual society it is essential that data services that were developed for a specific language be made accessible through other local languages also. In this paper, we present a service that allows a user to query a Frequently-Asked-Questions (FAQ) database built in a local language (Hindi) using Noisy SMS English queries. The inherent noise in the SMS queries, along with the language mismatch makes this a challenging problem. We handle these two problems by formulating the query similarity over FAQ questions as a combinatorial search problem where the search space consists of combinations of dictionary variations of the noisy query and its top-N translations. We demonstrate the effectiveness of our approach on a real-life dataset.
5 0.071951754 47 emnlp-2010-Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation
Author: Aurelien Max
Abstract: In this article, an original view on how to improve phrase translation estimates is proposed. This proposal is grounded on two main ideas: first, that appropriate examples of a given phrase should participate more in building its translation distribution; second, that paraphrases can be used to better estimate this distribution. Initial experiments provide evidence of the potential of our approach and its implementation for effectively improving translation performance.
6 0.063636094 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
7 0.06230573 79 emnlp-2010-Mining Name Translations from Entity Graph Mapping
8 0.061582927 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
9 0.050199158 63 emnlp-2010-Improving Translation via Targeted Paraphrasing
10 0.046803709 77 emnlp-2010-Measuring Distributional Similarity in Context
11 0.04657511 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
12 0.046154682 20 emnlp-2010-Automatic Detection and Classification of Social Events
13 0.045693774 89 emnlp-2010-PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts
14 0.044649426 95 emnlp-2010-SRL-Based Verb Selection for ESL
15 0.043034401 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
16 0.041798633 50 emnlp-2010-Facilitating Translation Using Source Language Paraphrase Lattices
17 0.039089411 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
18 0.039081786 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
19 0.037149988 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar
20 0.036799595 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text
topicId topicWeight
[(0, 0.16), (1, 0.054), (2, -0.029), (3, 0.165), (4, 0.026), (5, 0.121), (6, -0.014), (7, 0.099), (8, 0.039), (9, 0.088), (10, -0.09), (11, -0.057), (12, 0.223), (13, 0.005), (14, -0.012), (15, 0.347), (16, -0.239), (17, -0.126), (18, 0.231), (19, -0.05), (20, -0.047), (21, 0.109), (22, -0.18), (23, -0.009), (24, 0.065), (25, -0.076), (26, 0.234), (27, 0.029), (28, -0.057), (29, -0.041), (30, 0.119), (31, -0.046), (32, -0.024), (33, -0.153), (34, -0.103), (35, 0.028), (36, 0.051), (37, -0.06), (38, 0.015), (39, -0.004), (40, -0.073), (41, 0.024), (42, -0.014), (43, -0.002), (44, -0.057), (45, 0.061), (46, -0.033), (47, -0.035), (48, 0.028), (49, -0.058)]
simIndex simValue paperId paperTitle
same-paper 1 0.98313171 74 emnlp-2010-Learning the Relative Usefulness of Questions in Community QA
Author: Razvan Bunescu ; Yunfeng Huang
Abstract: We present a machine learning approach for the task of ranking previously answered questions in a question repository with respect to their relevance to a new, unanswered reference question. The ranking model is trained on a collection of question groups manually annotated with a partial order relation reflecting the relative utility of questions inside each group. Based on a set of meaning and structure aware features, the new ranking model is able to substantially outperform more straightforward, unsupervised similarity measures.
2 0.90141183 51 emnlp-2010-Function-Based Question Classification for General QA
Author: Fan Bu ; Xingwei Zhu ; Yu Hao ; Xiaoyan Zhu
Abstract: In contrast with the booming increase of internet data, state-of-art QA (question answering) systems, otherwise, concerned data from specific domains or resources such as search engine snippets, online forums and Wikipedia in a somewhat isolated way. Users may welcome a more general QA system for its capability to answer questions of various sources, integrated from existed specialized sub-QA engines. In this framework, question classification is the primary task. However, the current paradigms of question classification were focused on some specified type of questions, i.e. factoid questions, which are inappropriate for the general QA. In this paper, we propose a new question classification paradigm, which includes a question taxonomy suitable to the general QA and a question classifier based on MLN (Markov logic network), where rule-based methods and statistical methods are unified into a single framework in a fuzzy discriminative learning approach. Experiments show that our method outperforms traditional question classification approaches.
3 0.50361943 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval
Author: Danish Contractor ; Govind Kothari ; Tanveer Faruquie ; L V Subramaniam ; Sumit Negi
Abstract: Recent times have seen a tremendous growth in mobile based data services that allow people to use Short Message Service (SMS) to access these data services. In a multilingual society it is essential that data services that were developed for a specific language be made accessible through other local languages also. In this paper, we present a service that allows a user to query a Frequently-Asked-Questions (FAQ) database built in a local language (Hindi) using Noisy SMS English queries. The inherent noise in the SMS queries, along with the language mismatch makes this a challenging problem. We handle these two problems by formulating the query similarity over FAQ questions as a combinatorial search problem where the search space consists of combinations of dictionary variations of the noisy query and its top-N translations. We demonstrate the effectiveness of our approach on a real-life dataset.
4 0.37360069 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
Author: Slav Petrov ; Pi-Chuan Chang ; Michael Ringgaard ; Hiyan Alshawi
Abstract: It is well known that parsing accuracies drop significantly on out-of-domain data. What is less known is that some parsers suffer more from domain shifts than others. We show that dependency parsers have more difficulty parsing questions than constituency parsers. In particular, deterministic shift-reduce dependency parsers, which are of highest interest for practical applications because of their linear running time, drop to 60% labeled accuracy on a question test set. We propose an uptraining procedure in which a deterministic parser is trained on the output of a more accurate, but slower, latent variable constituency parser (converted to dependencies). Uptraining with 100K unlabeled questions achieves results comparable to having 2K labeled questions for training. With 100K unlabeled and 2K labeled questions, uptraining is able to improve parsing accuracy to 84%, closing the gap between in-domain and out-of-domain performance.
5 0.23515883 79 emnlp-2010-Mining Name Translations from Entity Graph Mapping
Author: Gae-won You ; Seung-won Hwang ; Young-In Song ; Long Jiang ; Zaiqing Nie
Abstract: This paper studies the problem of mining entity translation, specifically, mining English and Chinese name pairs. Existing efforts can be categorized into (a) a transliteration-based approach leveraging phonetic similarity and (b) a corpus-based approach exploiting bilingual co-occurrences, each of which suffers from inaccuracy and scarcity respectively. In clear contrast, we use unleveraged resources of monolingual entity co-occurrences, crawled from entity search engines, represented as two entity-relationship graphs extracted from two language corpora respectively. Our problem is then abstracted as finding correct mappings across two graphs. To achieve this goal, we propose a holistic approach, of exploiting both transliteration similarity and monolingual co-occurrences. This approach, building upon monolingual corpora, complements existing corpus-based work, requiring scarce resources of parallel or comparable corpus, while significantly boosting the accuracy of transliteration-based work. We validate our proposed system using real-life datasets.
6 0.19778103 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
7 0.18953261 41 emnlp-2010-Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models
8 0.18014817 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
9 0.17642985 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions
10 0.16870268 77 emnlp-2010-Measuring Distributional Similarity in Context
11 0.16193751 89 emnlp-2010-PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts
12 0.15574706 92 emnlp-2010-Predicting the Semantic Compositionality of Prefix Verbs
13 0.14418527 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text
14 0.14363986 47 emnlp-2010-Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation
15 0.13961154 90 emnlp-2010-Positional Language Models for Clinical Information Retrieval
16 0.13834225 95 emnlp-2010-SRL-Based Verb Selection for ESL
17 0.13795957 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?
18 0.13700916 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
19 0.13439678 20 emnlp-2010-Automatic Detection and Classification of Social Events
20 0.1312478 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics
topicId topicWeight
[(10, 0.442), (12, 0.06), (29, 0.084), (30, 0.031), (52, 0.018), (56, 0.041), (62, 0.024), (66, 0.086), (72, 0.04), (76, 0.028), (82, 0.013), (87, 0.015), (89, 0.021)]
simIndex simValue paperId paperTitle
1 0.97647887 59 emnlp-2010-Identifying Functional Relations in Web Text
Author: Thomas Lin ; Mausam ; Oren Etzioni
Abstract: Determining whether a textual phrase denotes a functional relation (i.e., a relation that maps each domain element to a unique range element) is useful for numerous NLP tasks such as synonym resolution and contradiction detection. Previous work on this problem has relied on either counting methods or lexico-syntactic patterns. However, determining whether a relation is functional, by analyzing mentions of the relation in a corpus, is challenging due to ambiguity, synonymy, anaphora, and other linguistic phenomena. We present the LEIBNIZ system that overcomes these challenges by exploiting the synergy between the Web corpus and freely available knowledge resources such as Freebase. It first computes multiple typed functionality scores, representing functionality of the relation phrase when its arguments are constrained to specific types. It then aggregates these scores to predict the global functionality for the phrase. LEIBNIZ outperforms previous work, increasing area under the precision-recall curve from 0.61 to 0.88. We utilize LEIBNIZ to generate the first public repository of automatically-identified functional relations.
same-paper 2 0.87259024 74 emnlp-2010-Learning the Relative Usefulness of Questions in Community QA
Author: Razvan Bunescu ; Yunfeng Huang
Abstract: We present a machine learning approach for the task of ranking previously answered questions in a question repository with respect to their relevance to a new, unanswered reference question. The ranking model is trained on a collection of question groups manually annotated with a partial order relation reflecting the relative utility of questions inside each group. Based on a set of meaning and structure aware features, the new ranking model is able to substantially outperform more straightforward, unsupervised similarity measures.
3 0.72995508 5 emnlp-2010-A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages
Author: Minh-Thang Luong ; Preslav Nakov ; Min-Yen Kan
Abstract: We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level phrase extraction, (2) minimum error-rate training for a morpheme-level translation model using word-level BLEU, and (3) joint scoring with morpheme- and word-level language models. Further improvements are achieved by combining our model with the classic one. The evaluation on English to Finnish using Europarl (714K sentence pairs; 15.5M English words) shows statistically significant improvements over the classic model based on BLEU and human judgments.
4 0.49173537 51 emnlp-2010-Function-Based Question Classification for General QA
Author: Fan Bu ; Xingwei Zhu ; Yu Hao ; Xiaoyan Zhu
Abstract: In contrast with the booming increase of internet data, state-of-art QA (question answering) systems, otherwise, concerned data from specific domains or resources such as search engine snippets, online forums and Wikipedia in a somewhat isolated way. Users may welcome a more general QA system for its capability to answer questions of various sources, integrated from existed specialized sub-QA engines. In this framework, question classification is the primary task. However, the current paradigms of question classification were focused on some specified type of questions, i.e. factoid questions, which are inappropriate for the general QA. In this paper, we propose a new question classification paradigm, which includes a question taxonomy suitable to the general QA and a question classifier based on MLN (Markov logic network), where rule-based methods and statistical methods are unified into a single framework in a fuzzy discriminative learning approach. Experiments show that our method outperforms traditional question classification approaches.
5 0.38798869 31 emnlp-2010-Constraints Based Taxonomic Relation Classification
Author: Quang Do ; Dan Roth
Abstract: Determining whether two terms in text have an ancestor relation (e.g. Toyota and car) or a sibling relation (e.g. Toyota and Honda) is an essential component of textual inference in NLP applications such as Question Answering, Summarization, and Recognizing Textual Entailment. Significant work has been done on developing stationary knowledge sources that could potentially support these tasks, but these resources often suffer from low coverage, noise, and are inflexible when needed to support terms that are not identical to those placed in them, making their use as general purpose background knowledge resources difficult. In this paper, rather than building a stationary hierarchical structure of terms and relations, we describe a system that, given two terms, determines the taxonomic relation between them using a machine learning-based approach that makes use of existing resources. Moreover, we develop a global constraint optimization inference process and use it to leverage an existing knowledge base also to enforce relational constraints among terms and thus improve the classifier predictions. Our experimental evaluation shows that our approach significantly outperforms other systems built upon existing well-known knowledge sources.
6 0.37938565 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text
7 0.37722829 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval
8 0.37677941 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing
9 0.37431183 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks
10 0.37075442 53 emnlp-2010-Fusing Eye Gaze with Speech Recognition Hypotheses to Resolve Exophoric References in Situated Dialogue
11 0.37041861 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding
12 0.36551705 12 emnlp-2010-A Semi-Supervised Method to Learn and Construct Taxonomies Using the Web
13 0.36363348 123 emnlp-2010-Word-Based Dialect Identification with Georeferenced Rules
14 0.34676245 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
15 0.34555054 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
16 0.34305614 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
17 0.34268913 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
18 0.34199101 63 emnlp-2010-Improving Translation via Targeted Paraphrasing
19 0.34185836 84 emnlp-2010-NLP on Spoken Documents Without ASR
20 0.34132102 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media