acl acl2010 acl2010-189 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui
Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.
Reference: text
sentIndex sentText sentNum sentScore
1 Whittaker and Sadaoki Furui Department of Computer Science Tokyo Institute of Technology Tokyo 152-8552, Japan {heie, edw, furui}@furui. [sent-4, score-0.208]
2 j p Abstract In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. [sent-8, score-0.233]
3 We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. [sent-9, score-0.221]
4 Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters. [sent-10, score-0.225]
5 1 Introduction Question Answering (QA) distinguishes itself from other information retrieval tasks in that the system tries to return accurate answers to queries posed in natural language. [sent-11, score-0.137]
6 Factoid QA limits itself to questions that can usually be answered with a few words. [sent-12, score-0.188]
7 Typically factoid QA systems employ some form of question type analysis, so that a question such as What is the capital of Japan? [sent-13, score-0.232]
8 Machine learning methods have been proposed, such as question classification using support vector machines (Zhang and Lee, 2003) and language modeling (Merkel and Klakow, 2007). [sent-16, score-0.105]
9 In these approaches, question categories are predefined and a classifier is trained on manually labeled data. [sent-17, score-0.105]
10 In this paper we present an unsupervised method, where we attempt to cluster question-and-answer (q-a) pairs without any predefined question categories, hence no manually class-labeled questions are used. [sent-19, score-0.504]
11 We use a statistical QA framework, described in Section 2, where the system is trained with clusters of q-a pairs. [sent-20, score-0.149]
12 In Section 3 we show that answer accuracy is strongly correlated with the log-likelihood of the q-a pairs computed by this statistical model. [sent-23, score-0.374]
13 In Section 4 we propose an algorithm to cluster q-a pairs by maximizing the log-likelihood of a disjoint set of q-a pairs. [sent-24, score-0.38]
14 In Section 5 we evaluate the QA accuracy by training the QA system with the resulting clusters. [sent-25, score-0.076]
15 what the question is actually about and what it refers to. [sent-29, score-0.077]
16 For example, in the questions Where is Mount Fuji? [sent-30, score-0.125]
17 , the question type features W differ, while the information-bearing features X are identical. [sent-32, score-0.077]
18 Finding the best answer involves a search over all A for the one which maximizes the probability of the above model, i. [sent-33, score-0.155]
19 (2) Given the correct probability distribution, this will give us the optimal answer in a maximum likelihood sense. [sent-36, score-0.155]
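The search over candidate answers can be sketched as a simple argmax. This is a minimal illustration, assuming the retrieval model P(A|X) and the filter model P(W|A) are combined as a product (a sum in log space); the function name `best_answer` and all scores are hypothetical and invented for the example.

```python
import math


def best_answer(candidates, retrieval_logp, filter_logp):
    """Return the candidate A that maximises log P(A|X) + log P(W|A).

    Assumption: the two model scores are multiplied, i.e. added in log space.
    """
    return max(candidates, key=lambda a: retrieval_logp[a] + filter_logp[a])


# Toy scores, invented purely for illustration.
candidates = ["Tokyo", "Kyoto", "1868"]
retrieval_logp = {"Tokyo": math.log(0.3), "Kyoto": math.log(0.2), "1868": math.log(0.5)}
filter_logp = {"Tokyo": math.log(0.6), "Kyoto": math.log(0.3), "1868": math.log(0.01)}

answer = best_answer(candidates, retrieval_logp, filter_logp)
print(answer)
```

In this toy case the retrieval score alone would prefer "1868", while the combined score prefers "Tokyo", which is the kind of re-ranking the filter model is meant to provide.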
20 1 Retrieval Model The retrieval model P(A|X) is essentially a language model which models the probability of an answer sequence A given a set of information-bearing features X = {x1, . [sent-42, score-0.243]
21 This set is constructed by extracting single-word features from Q that are not present in a stop-list of high-frequency words. [sent-46, score-0.054]
22 The implementation of the retrieval model used for the experiments described in this paper models the proximity of A to features in X. [sent-47, score-0.064]
23 ) such as where, in what and when were from the input question Q. [sent-57, score-0.077]
24 The 2522 most frequent words in a collection of example questions are considered in-vocabulary words; all other words are out-of-vocabulary words and are substituted with <UNK>. [sent-59, score-0.125]
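The vocabulary cut-off described above can be sketched as follows. This is only an illustration: whitespace tokenisation, lower-casing, the token spelling `<UNK>`, and the helper names `build_vocabulary` and `map_oov` are assumptions, not details given in the text.

```python
from collections import Counter

UNK = "<UNK>"  # assumed spelling of the unknown-word token


def build_vocabulary(questions, size):
    """Keep the `size` most frequent words across the example questions."""
    counts = Counter(word for q in questions for word in q.lower().split())
    return {word for word, _ in counts.most_common(size)}


def map_oov(question, vocabulary):
    """Replace every out-of-vocabulary word with the UNK token."""
    return [w if w in vocabulary else UNK for w in question.lower().split()]


# Toy questions; the paper uses a vocabulary of 2522 words, here only 2.
questions = ["where is mount fuji", "where is tokyo", "who is the emperor"]
vocab = build_vocabulary(questions, size=2)
mapped = map_oov("where is kyoto", vocab)
print(mapped)
```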
25 Modeling the complex relationship between W and A directly is non-trivial. [sent-60, score-0.025]
26 In order to construct these classes, given a set E = {t1, . [sent-65, score-0.028]
27 , t|E|} of example q-a pairs, we define a mapping function f : E → CE which maps each example q-a pair tj for j = 1 . [sent-68, score-0.222]
28 A class ce may be defined as the union of all component q-a features from each tj satisfying f(tj) = ce. [sent-75, score-0.396]
29 Hence each class ce constitutes a cluster of q-a pairs. [sent-76, score-0.437]
30 Finally, to facilitate modeling we say that W is conditionally independent of A given ce so that $P(W \mid A) = \sum_{e=1}^{|C_E|} P(W \mid c_e^W) \cdot P(c_e^A \mid A)$, (4) where $c_e^W$ and $c_e^A$ refer to the subsets of question-type features and example answers for the class $c_e$, respectively. [sent-77, score-0.355]
31 Due to data sparsity, our set of example q-a pairs cannot be expected to cover all the possible answers to questions that may ever be asked. [sent-80, score-0.262]
32 We therefore employ answer class modeling rather than answer word modeling by expanding Eq. [sent-81, score-0.408]
33 (4) as follows: $P(W \mid A) = \sum_{e=1}^{|C_E|} P(W \mid c_e^W) \cdot \sum_{a=1}^{|K_A|} P(c_e^A \mid k_a) P(k_a \mid A)$, (5) where $k_a$ is a concrete class in the set of $|K_A|$ answer classes $K_A$. [sent-82, score-0.406]
34 These classes are generated using the Kneser-Ney clustering algorithm, commonly used for generating class definitions for class language models (Kneser and Ney, 1993). [sent-83, score-0.146]
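The expansion in Eq. (5) is a sum over q-a clusters, each term being the question-type probability for that cluster times an inner sum over answer classes. A minimal sketch of that double sum, assuming all component probabilities are already available as plain numbers; the dictionary layout, the class names `PLACE` and `PERSON`, and the values are hypothetical.

```python
def filter_score(clusters, p_class_given_answer):
    """Evaluate sum_e P(W|c_e^W) * sum_a P(c_e^A|k_a) * P(k_a|A).

    clusters: one dict per cluster c_e with
        "p_w": P(W | c_e^W) for the current question-type features W
        "p_answers_given_class": {k_a: P(c_e^A | k_a)}
    p_class_given_answer: {k_a: P(k_a | A)} for the candidate answer A
    """
    total = 0.0
    for cluster in clusters:
        answer_term = sum(
            cluster["p_answers_given_class"].get(k, 0.0) * p_k
            for k, p_k in p_class_given_answer.items()
        )
        total += cluster["p_w"] * answer_term
    return total


# Toy numbers, invented for illustration only.
clusters = [
    {"p_w": 0.8, "p_answers_given_class": {"PLACE": 0.9, "PERSON": 0.1}},
    {"p_w": 0.1, "p_answers_given_class": {"PLACE": 0.2, "PERSON": 0.7}},
]
p_class_given_answer = {"PLACE": 1.0, "PERSON": 0.0}
score = filter_score(clusters, p_class_given_answer)
print(score)
```

The outer loop visits each cluster once, so the cost of scoring one candidate grows linearly with the number of clusters.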
35 3 The Relationship between Mean Reciprocal Rank and Log-Likelihood We use Mean Reciprocal Rank (MRR) as our metric when evaluating the QA accuracy on a set of questions G = {g1. [sent-88, score-0.177]
36 LL (average per q-a pair) for 100 random cluster configurations. [sent-92, score-0.183]
37 where Ri is the rank of the highest ranking correct candidate answer for gi. [sent-93, score-0.155]
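MRR as described here averages the reciprocal of Ri over the question set. A small sketch, assuming ranks are 1-based and that a question with no correct candidate contributes zero (a common convention, not stated in the text); the function name is hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/R_i over all questions.

    ranks[i] is the 1-based rank of the highest-ranking correct candidate
    for question i, or None when no correct candidate was returned
    (assumed here to contribute 0 to the sum).
    """
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Four toy questions: correct answer at rank 1, rank 2, not found, rank 4.
mrr = mean_reciprocal_rank([1, 2, None, 4])
print(mrr)  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```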
38 d|D| ) of q-a pairs disjoint from the q-a pairs in CE, we can, using Eq. [sent-97, score-0.179]
39 To examine the relationship between MRR and LL, we randomly generate configurations CE, with a fixed cluster size of 4, and plot the resulting MRR and LL, computed on the same data set D, as data points in a scatter plot, as seen in Figure 1. [sent-99, score-0.341]
40 We find that LL and MRR are strongly correlated, with a correlation coefficient ρ = 0. [sent-100, score-0.032]
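A correlation coefficient over (LL, MRR) points can be computed as below. This sketch assumes ρ is the Pearson coefficient, which the text does not specify; the data points are invented and are not the paper's measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical (LL, MRR) values for four random cluster configurations.
ll_values = [-1.30, -1.25, -1.20, -1.10]
mrr_values = [0.20, 0.22, 0.23, 0.27]
rho = pearson(ll_values, mrr_values)
print(round(rho, 3))
```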
41 4 Clustering algorithm Using the observation that LL is correlated with MRR on the same data set, we expect that optimizing LL on a development set (LLdev) will also improve MRR on an evaluation set (MRReval). [sent-104, score-0.16]
42 Hence we propose the following greedy algorithm to maximize LLdev: init: c1 ∈ CE contains all |E| training pairs; while improvement > threshold do: best LLdev ← −∞; for all j = 1 . [sent-105, score-0.09]
43 |E| do original cluster = f(tj) Take tj out of f(tj) for e = −1, 1. [sent-108, score-0.367]
44 c|C|+1 refers to a new, empty cluster, hence this algorithm automatically finds the optimal number of clusters as well as the optimal configuration of them. [sent-112, score-0.288]
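The greedy procedure can be sketched as follows. Because the pseudocode is only partly recoverable here, this is one reading of it under stated assumptions: each iteration tries moving every pair into every other cluster (including one new, empty cluster), applies the single best move, and stops when the gain falls to the threshold or below. `ll_dev` is a hypothetical callable standing in for the development-set log-likelihood, and `toy_ll` is an invented objective, not the paper's model.

```python
def greedy_cluster(n_pairs, ll_dev, threshold=1e-6):
    """Greedily reassign q-a pairs to clusters to maximise ll_dev(assignment).

    assignment[j] is the cluster index of training pair t_j.
    """
    assignment = [0] * n_pairs  # init: one cluster holds all training pairs
    current = ll_dev(assignment)
    while True:
        best_ll, best_move = float("-inf"), None
        n_clusters = max(assignment) + 1
        for j in range(n_pairs):
            original = assignment[j]
            # Index n_clusters denotes a new, empty cluster.
            for e in range(n_clusters + 1):
                if e == original:
                    continue
                assignment[j] = e
                score = ll_dev(assignment)
                if score > best_ll:
                    best_ll, best_move = score, (j, e)
            assignment[j] = original  # put t_j back before trying the next pair
        if best_move is None or best_ll - current <= threshold:
            return assignment, current
        j, e = best_move
        assignment[j] = e
        current = best_ll


# Toy stand-in objective: penalise pairs of items whose "same cluster" status
# disagrees with their "same question word" status.
labels = ["who", "who", "when", "when"]


def toy_ll(assignment):
    n = len(assignment)
    return -sum(
        (labels[i] == labels[j]) != (assignment[i] == assignment[j])
        for i in range(n)
        for j in range(i + 1, n)
    )


final_assignment, final_ll = greedy_cluster(len(labels), toy_ll)
print(final_assignment, final_ll)
```

Allowing a move into a fresh cluster is what lets the number of clusters grow without being fixed in advance.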
45 1 Experimental Setup For our data sets, we restrict ourselves to questions that start with who, when or where. [sent-114, score-0.125]
46 Furthermore, we only use q-a pairs which can be answered with a single word. [sent-115, score-0.127]
47 As training data we use questions and answers from the Knowledge-Master collection1. [sent-116, score-0.198]
48 Development/evaluation questions are the questions from TREC QA evaluations from TREC 2002 to TREC 2006, the answers to which are to be retrieved from the AQUAINT corpus. [sent-117, score-0.349]
49 In total we have 2016 q-a pairs for training and 568 questions for development/evaluation. [sent-118, score-0.189]
50 We are able to retrieve the correct answer for 317 of the development/evaluation questions, thus the theoretical upper bound for our experiments is an answer accuracy of MRR = 0. [sent-119, score-0.362]
51 Accuracy is evaluated using 5-fold (rotating) cross-validation, where in each fold the TREC QA data is partitioned into a development set of 1http : / /www . [sent-121, score-0.028]
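A rotating split over the five TREC years might look like the sketch below. It assumes each fold holds out one year for evaluation and uses the remaining four for development, which is an interpretation of the description rather than a confirmed detail; `rotating_folds` is a hypothetical name.

```python
def rotating_folds(years):
    """Yield (development_years, evaluation_year), holding out each year once."""
    for held_out in years:
        yield [y for y in years if y != held_out], held_out


folds = list(rotating_folds([2002, 2003, 2004, 2005, 2006]))
for development, evaluation in folds:
    print("dev:", development, "eval:", evaluation)
```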
52 2814 Table 1: LLeval (average per q-a pair) and MRReval (over all held-out TREC years), and number of clusters (median of the cross-evaluation folds) for the various configurations. [sent-127, score-0.149]
53 For each TREC question the top 50 documents from the AQUAINT corpus are retrieved using Lucene2. [sent-129, score-0.103]
54 These clusters are obtained by putting all who q-a pairs in one cluster, all when pairs in a second and all where pairs in a third. [sent-133, score-0.341]
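The manual baseline amounts to grouping q-a pairs by their leading question word. A minimal sketch, assuming the question word is simply the first whitespace-separated token; the function name and the example pairs are invented for illustration.

```python
from collections import defaultdict


def manual_clusters(qa_pairs):
    """Group (question, answer) pairs by the question's first word."""
    clusters = defaultdict(list)
    for question, answer in qa_pairs:
        clusters[question.split()[0].lower()].append((question, answer))
    return dict(clusters)


# Toy single-word-answer pairs, not taken from the training data.
qa_pairs = [
    ("Where is Mount Fuji?", "Japan"),
    ("Who wrote Hamlet?", "Shakespeare"),
    ("When did World War II end?", "1945"),
    ("Who painted Guernica?", "Picasso"),
]
baseline = manual_clusters(qa_pairs)
print({word: len(pairs) for word, pairs in baseline.items()})
```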
55 We compare this baseline with using clusters resulting from the algorithm described in Section 4. [sent-134, score-0.199]
56 We run this algorithm until there are no further improvements in LLdev. [sent-135, score-0.026]
57 Two other cluster configurations are also investigated: all q-a pairs in one cluster (all-in-one), and each q-a pair in its own cluster (one-in-each). [sent-136, score-1.059]
58 The all-in- one configuration is equivalent to not using the filter model, i. [sent-137, score-0.155]
59 answer candidates are ranked solely by the retrieval model. [sent-139, score-0.219]
60 The one-in-each configuration was shown to perform well in the TREC 2006 QA evaluation (Whittaker et al. [sent-140, score-0.086]
61 , 2006), where it ranked 9th among 27 participants on the factoid QA task. [sent-141, score-0.078]
62 2 Results In Table 1, we see that the manual clusters (baseline) achieve an MRReval of 0. [sent-143, score-0.18]
63 262, while the clusters resulting from the clustering algorithm give an MRReval of 0. [sent-144, score-0.261]
64 The one-in-each cluster configuration achieves an MRReval of 0. [sent-148, score-0.269]
65 no filter model) has the lowest accuracy, with an MRReval of 0. [sent-152, score-0.069]
66 [Figure 2 residue: x-axis "# iterations"; panel (a) Development set, 4 years of TREC.] [sent-156, score-0.042]
67 6 Discussion Manual inspection of the automatically derived clusters showed that the algorithm had constructed configurations where typically who, when and where q-a pairs were put in separate clusters, as in the manual configuration. [sent-160, score-0.344]
68 However, in some cases both who and where q-a pairs occurred in the same cluster, so as to better answer questions like Who won the World Cup? [sent-161, score-0.344]
69 As can be seen from Table 1, there are only 4 clusters in the automatic configuration, compared to 2016 in the one-in-each configuration. [sent-163, score-0.149]
70 Since the computational complexity of the filter model described in Section 2. [sent-164, score-0.069]
71 2 is linear in the number of clusters, a beneficial side effect of our clustering procedure is a significant reduction in the computational requirement of the filter model. [sent-165, score-0.131]
72 In Figure 2 we plot LL and MRR for one of the cross-validation folds over multiple iterations (the while loop) of the clustering algorithm in Section 4. [sent-166, score-0.212]
73 It can clearly be seen that the optimization of LLdev leads to improvement in MRReval, and that LLeval is also well correlated with MRReval. [sent-167, score-0.071]
74 7 Conclusions and Future Work In this paper we have shown that the log-likelihood of our statistical model is strongly correlated with answer accuracy. [sent-168, score-0.258]
75 Using this information, we have clustered training q-a pairs by maximizing log-likelihood on a disjoint development set of q-a pairs. [sent-169, score-0.204]
76 The experiments show that with these clusters we achieve better QA accuracy than using manually clustered training q-a pairs. [sent-170, score-0.234]
77 In future work we will extend the types of questions that we consider, and also allow for multiword answers. [sent-171, score-0.125]
wordName wordTfidf (topN-words)
[('qa', 0.366), ('lldev', 0.323), ('mrreval', 0.259), ('mrr', 0.232), ('whittaker', 0.219), ('ce', 0.212), ('ka', 0.209), ('cea', 0.208), ('ll', 0.191), ('tj', 0.184), ('cluster', 0.183), ('trec', 0.17), ('cew', 0.162), ('answer', 0.155), ('clusters', 0.149), ('questions', 0.125), ('furui', 0.104), ('tbest', 0.097), ('configuration', 0.086), ('merkel', 0.085), ('sadaoki', 0.078), ('dietrich', 0.078), ('factoid', 0.078), ('question', 0.077), ('answers', 0.073), ('correlated', 0.071), ('filter', 0.069), ('argmaaxp', 0.065), ('lleval', 0.065), ('odzel', 0.065), ('retrieval', 0.064), ('pairs', 0.064), ('answered', 0.063), ('kneser', 0.063), ('clustering', 0.062), ('tokyo', 0.057), ('fuji', 0.057), ('klakow', 0.057), ('aquaint', 0.057), ('maximizing', 0.056), ('edward', 0.052), ('mount', 0.052), ('accuracy', 0.052), ('disjoint', 0.051), ('dx', 0.049), ('answering', 0.048), ('huang', 0.044), ('plot', 0.043), ('configurations', 0.042), ('iterations', 0.042), ('class', 0.042), ('folds', 0.039), ('optimizing', 0.038), ('pair', 0.038), ('isolation', 0.037), ('reciprocal', 0.037), ('ex', 0.035), ('wd', 0.035), ('clustered', 0.033), ('put', 0.032), ('strongly', 0.032), ('manual', 0.031), ('modeling', 0.028), ('eurospeech', 0.028), ('tish', 0.028), ('pce', 0.028), ('cfe', 0.028), ('seest', 0.028), ('acero', 0.028), ('wode', 0.028), ('rotating', 0.028), ('gp', 0.028), ('sift', 0.028), ('ato', 0.028), ('eut', 0.028), ('lucene', 0.028), ('partitioned', 0.028), ('ad', 0.028), ('predefined', 0.028), ('hence', 0.027), ('year', 0.027), ('zhang', 0.027), ('algorithm', 0.026), ('retrieved', 0.026), ('japan', 0.026), ('ooff', 0.026), ('bise', 0.026), ('ade', 0.026), ('pj', 0.026), ('ech', 0.026), ('apache', 0.026), ('typing', 0.026), ('relationship', 0.025), ('observation', 0.025), ('resulting', 0.024), ('lfa', 0.024), ('ess', 0.024), ('saddle', 0.024), ('cup', 0.024), ('scatter', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood
Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui
Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.
2 0.19498846 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
Author: Baoxun Wang ; Xiaolong Wang ; Chengjie Sun ; Bingquan Liu ; Lin Sun
Abstract: Quantifying the semantic relevance between questions and their candidate answers is essential to answer detection in social media corpora. In this paper, a deep belief network is proposed to model the semantic relevance for question-answer pairs. Observing the textual similarity between the community-driven questionanswering (cQA) dataset and the forum dataset, we present a novel learning strategy to promote the performance of our method on the social community datasets without hand-annotating work. The experimental results show that our method outperforms the traditional approaches on both the cQA and the forum corpora.
3 0.13824441 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices
Author: Taniya Mishra ; Srinivas Bangalore
Abstract: The Deep Web is the collection of information repositories that are not indexed by search engines. These repositories are typically accessible through web forms and contain dynamically changing information. In this paper, we present a system that allows users to access such rich repositories of information on mobile devices using spoken language.
4 0.12334646 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
Author: Dmitry Davidov ; Ari Rappoport
Abstract: We present a novel framework for automated extraction and approximation of numerical object attributes such as height and weight from the Web. Given an object-attribute pair, we discover and analyze attribute information for a set of comparable objects in order to infer the desired value. This allows us to approximate the desired numerical values even when no exact values can be found in the text. Our framework makes use of relation defining patterns and WordNet similarity information. First, we obtain from the Web and WordNet a list of terms similar to the given object. Then we retrieve attribute values for each term in this list, and information that allows us to compare different objects in the list and to infer the attribute value range. Finally, we combine the retrieved data for all terms from the list to select or approximate the requested value. We evaluate our method using automated question answering, WordNet enrichment, and comparison with answers given in Wikipedia and by leading search engines. In all of these, our framework provides a significant improvement.
5 0.11653415 171 acl-2010-Metadata-Aware Measures for Answer Summarization in Community Question Answering
Author: Mattia Tomasoni ; Minlie Huang
Abstract: This paper presents a framework for automatically processing information coming from community Question Answering (cQA) portals with the purpose of generating a trustful, complete, relevant and succinct summary in response to a question. We exploit the metadata intrinsically present in User Generated Content (UGC) to bias automatic multi-document summarization techniques toward high quality information. We adopt a representation of concepts alternative to n-grams and propose two concept-scoring functions based on semantic overlap. Experimental re- sults on data drawn from Yahoo! Answers demonstrate the effectiveness of our method in terms of ROUGE scores. We show that the information contained in the best answers voted by users of cQA portals can be successfully complemented by our method.
6 0.10888885 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
7 0.08167278 236 acl-2010-Top-Down K-Best A* Parsing
8 0.070152625 2 acl-2010-"Was It Good? It Was Provocative." Learning the Meaning of Scalar Adjectives
9 0.069717899 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -
10 0.062913559 248 acl-2010-Unsupervised Ontology Induction from Text
11 0.058387477 227 acl-2010-The Impact of Interpretation Problems on Tutorial Dialogue
12 0.05590982 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons
13 0.054054178 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
14 0.052237473 129 acl-2010-Growing Related Words from Seed via User Behaviors: A Re-Ranking Based Approach
15 0.045995601 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval
16 0.045823269 125 acl-2010-Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining
17 0.045571771 5 acl-2010-A Framework for Figurative Language Detection Based on Sense Differentiation
18 0.042574648 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
19 0.041721106 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
20 0.040257305 238 acl-2010-Towards Open-Domain Semantic Role Labeling
topicId topicWeight
[(0, -0.124), (1, 0.045), (2, -0.052), (3, -0.014), (4, 0.015), (5, -0.063), (6, -0.005), (7, 0.008), (8, 0.001), (9, -0.006), (10, -0.07), (11, 0.014), (12, -0.079), (13, -0.13), (14, 0.064), (15, 0.147), (16, -0.247), (17, -0.12), (18, 0.023), (19, 0.022), (20, 0.025), (21, -0.027), (22, 0.097), (23, -0.07), (24, 0.068), (25, 0.165), (26, -0.037), (27, -0.063), (28, -0.112), (29, -0.003), (30, 0.026), (31, -0.027), (32, 0.026), (33, 0.072), (34, -0.061), (35, 0.051), (36, 0.129), (37, 0.159), (38, -0.017), (39, -0.002), (40, -0.035), (41, -0.114), (42, -0.119), (43, 0.032), (44, 0.013), (45, 0.046), (46, -0.003), (47, 0.02), (48, -0.07), (49, 0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.96063322 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood
Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui
Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.
2 0.80458927 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
Author: Baoxun Wang ; Xiaolong Wang ; Chengjie Sun ; Bingquan Liu ; Lin Sun
Abstract: Quantifying the semantic relevance between questions and their candidate answers is essential to answer detection in social media corpora. In this paper, a deep belief network is proposed to model the semantic relevance for question-answer pairs. Observing the textual similarity between the community-driven questionanswering (cQA) dataset and the forum dataset, we present a novel learning strategy to promote the performance of our method on the social community datasets without hand-annotating work. The experimental results show that our method outperforms the traditional approaches on both the cQA and the forum corpora.
3 0.71572578 171 acl-2010-Metadata-Aware Measures for Answer Summarization in Community Question Answering
Author: Mattia Tomasoni ; Minlie Huang
Abstract: This paper presents a framework for automatically processing information coming from community Question Answering (cQA) portals with the purpose of generating a trustful, complete, relevant and succinct summary in response to a question. We exploit the metadata intrinsically present in User Generated Content (UGC) to bias automatic multi-document summarization techniques toward high quality information. We adopt a representation of concepts alternative to n-grams and propose two concept-scoring functions based on semantic overlap. Experimental re- sults on data drawn from Yahoo! Answers demonstrate the effectiveness of our method in terms of ROUGE scores. We show that the information contained in the best answers voted by users of cQA portals can be successfully complemented by our method.
4 0.67822665 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices
Author: Taniya Mishra ; Srinivas Bangalore
Abstract: The Deep Web is the collection of information repositories that are not indexed by search engines. These repositories are typically accessible through web forms and contain dynamically changing information. In this paper, we present a system that allows users to access such rich repositories of information on mobile devices using spoken language.
5 0.54875743 2 acl-2010-"Was It Good? It Was Provocative." Learning the Meaning of Scalar Adjectives
Author: Marie-Catherine de Marneffe ; Christopher D. Manning ; Christopher Potts
Abstract: Texts and dialogues often express information indirectly. For instance, speakers’ answers to yes/no questions do not always straightforwardly convey a ‘yes’ or ‘no’ answer. The intended reply is clear in some cases (Was it good? It was great!) but uncertain in others (Was it acceptable? It was unprecedented.). In this paper, we present methods for interpreting the answers to questions like these which involve scalar modifiers. We show how to ground scalar modifier meaning based on data collected from the Web. We learn scales between modifiers and infer the extent to which a given answer conveys ‘yes’ or ‘no’ . To evaluate the methods, we collected examples of question–answer pairs involving scalar modifiers from CNN transcripts and the Dialog Act corpus and use response distributions from Mechanical Turk workers to assess the degree to which each answer conveys ‘yes’ or ‘no’ . Our experimental results closely match the Turkers’ response data, demonstrating that meanings can be learned from Web data and that such meanings can drive pragmatic inference.
6 0.53991324 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
7 0.50381398 248 acl-2010-Unsupervised Ontology Induction from Text
8 0.41379198 63 acl-2010-Comparable Entity Mining from Comparative Questions
9 0.37108296 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
10 0.3465209 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.
11 0.2863346 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
13 0.24987902 40 acl-2010-Automatic Sanskrit Segmentizer Using Finite State Transducers
14 0.24275163 254 acl-2010-Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study
15 0.23530962 60 acl-2010-Collocation Extraction beyond the Independence Assumption
16 0.23403655 183 acl-2010-Online Generation of Locality Sensitive Hash Signatures
17 0.22572628 263 acl-2010-Word Representations: A Simple and General Method for Semi-Supervised Learning
18 0.22385898 227 acl-2010-The Impact of Interpretation Problems on Tutorial Dialogue
19 0.22267169 204 acl-2010-Recommendation in Internet Forums and Blogs
20 0.21404907 236 acl-2010-Top-Down K-Best A* Parsing
topicId topicWeight
[(25, 0.049), (59, 0.132), (72, 0.034), (73, 0.033), (78, 0.023), (83, 0.078), (84, 0.018), (97, 0.379), (98, 0.144)]
simIndex simValue paperId paperTitle
1 0.81261992 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation
Author: Neil McIntyre ; Mirella Lapata
Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms pre- vious data-driven approaches.
same-paper 2 0.7884236 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood
Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui
Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.
3 0.76695681 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages
Author: Steven Abney ; Steven Bird
Abstract: We present a grand challenge to build a corpus that will include all of the world’s languages, in a consistent structure that permits large-scale cross-linguistic processing, enabling the study of universal linguistics. The focal data types, bilingual texts and lexicons, relate each language to one of a set of reference languages. We propose that the ability to train systems to translate into and out of a given language be the yardstick for determining when we have successfully captured a language. We call on the computational linguistics community to begin work on this Universal Corpus, pursuing the many strands of activity described here, as their contribution to the global effort to document the world’s linguistic heritage before more languages fall silent.
4 0.63318938 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
Author: Omri Abend ; Ari Rappoport
Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.
5 0.49811697 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People
Author: Nancy Ide ; Collin Baker ; Christiane Fellbaum ; Rebecca Passonneau
Abstract: The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, communitybased effort to create much needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.
6 0.49681556 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
7 0.49515206 88 acl-2010-Discriminative Pruning for Discriminative ITG Alignment
9 0.49359143 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
10 0.493476 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
11 0.49299565 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules
12 0.49276656 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
13 0.49206114 145 acl-2010-Improving Arabic-to-English Statistical Machine Translation by Reordering Post-Verbal Subjects for Alignment
14 0.49154773 215 acl-2010-Speech-Driven Access to the Deep Web on Mobile Devices
15 0.49135149 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences
16 0.49132434 51 acl-2010-Bilingual Sense Similarity for Statistical Machine Translation
17 0.49125201 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
18 0.49044079 54 acl-2010-Boosting-Based System Combination for Machine Translation
19 0.48999476 169 acl-2010-Learning to Translate with Source and Target Syntax
20 0.48958442 254 acl-2010-Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study