acl acl2013 acl2013-130 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nathan Gilbert ; Ellen Riloff
Abstract: Most coreference resolvers rely heavily on string matching, syntactic properties, and semantic attributes of words, but they lack the ability to make decisions based on individual words. In this paper, we explore the benefits of lexicalized features in the setting of domain-specific coreference resolution. We show that adding lexicalized features to off-the-shelf coreference resolvers yields significant performance gains on four domain-specific data sets and with two types of coreference resolution architectures.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Most coreference resolvers rely heavily on string matching, syntactic properties, and semantic attributes of words, but they lack the ability to make decisions based on individual words. [sent-5, score-0.715]
2 In this paper, we explore the benefits of lexicalized features in the setting of domain-specific coreference resolution. [sent-6, score-0.895]
3 We show that adding lexicalized features to off-the-shelf coreference resolvers yields significant performance gains on four domain-specific data sets and with two types of coreference resolution architectures. [sent-7, score-1.785]
4 1 Introduction Coreference resolvers are typically evaluated on collections of news articles that cover a wide range of topics, such as the ACE (ACE03, 2003; ACE04, 2004; ACE05, 2005) and OntoNotes (Pradhan et al. [sent-8, score-0.092]
5 Many NLP applications, however, involve text analysis for specialized domains, such as clinical medicine (Gooch and Roudsari, 2012; Glinos, 2011), legal text analysis (Bouayad-Agha et al. [sent-10, score-0.165]
6 Learning-based coreference resolvers can be easily retrained for a specialized domain given annotated training texts for that domain. [sent-13, score-0.823]
7 However, we found that retraining an off-the-shelf coreference resolver with domain-specific texts showed little benefit. [sent-14, score-0.819]
8 This surprising result led us to question the nature of the feature sets used by noun phrase (NP) coreference resolvers. [sent-15, score-0.698]
9 Nearly all of the features employed by recent systems fall into three categories: string match and word overlap, syntactic properties (e. [sent-16, score-0.094]
10 Conspicuously absent from most systems are lexical features that allow the classifier to consider the specific words when making a coreference decision. [sent-23, score-0.736]
11 A few researchers have experimented with lexical features, but they achieved mixed results in evaluations on broad-coverage corpora (Bengston and Roth, 2008; Björkelund and Nugues, 2011; Rahman and Ng, 2011a). [sent-24, score-0.075]
12 We hypothesized that lexicalized features can have a more substantial impact in domain-specific settings. [sent-25, score-0.3]
13 Lexical features can capture domain-specific knowledge and subtle semantic distinctions that may be important within a domain. [sent-26, score-0.126]
14 For example, based on the resolutions found in domain-specific training sets, our lexicalized features captured the knowledge that “tomcat” can be coreferent with “plane”, “UAW” can be coreferent with “union”, and “anthrax” can be coreferent with “diagnosis”. [sent-27, score-0.728]
15 1 In this paper, we evaluate the impact of lexicalized features on 4 domains: management succession (MUC-6 data), vehicle launches (MUC-7 data), disease outbreaks (ProMed texts), and terrorism (MUC-4 data). [sent-30, score-0.596]
16 We incorporate lexicalized feature sets into two different coreference architectures: Reconcile (Stoyanov et al. [sent-31, score-0.901]
17 , 2010), a pairwise coreference classifier, and Sieve (Raghunathan et al. [sent-32, score-0.595]
18 Our results show that lexicalized features significantly improve performance in all four domains and in both types of coreference architectures. [sent-34, score-0.978]
19 2 Related Work We are not the first researchers to use lexicalized features for coreference resolution. [sent-35, score-0.895]
20 The Paired Permutation test (Pesarin, 2001) was used for statistical significance testing and gray cells represent results that are not significantly different from the best result. [sent-52, score-0.111]
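For readers unfamiliar with the significance test mentioned above, the following is a minimal Python sketch of a generic sign-flip paired permutation test over per-document score differences. It is a standard approximation written for illustration, not necessarily the exact procedure of Pesarin (2001), and the function and parameter names are assumptions.

```python
import random

def paired_permutation_test(scores_a, scores_b, trials=10000, seed=0):
    """Approximate two-sided paired permutation test.

    scores_a, scores_b: per-document scores of two systems on the same documents.
    Randomly flips the sign of each paired difference and estimates how often
    the permuted mean difference is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    hits = 0
    for _ in range(trials):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) / n >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)  # smoothed p-value estimate
```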
21 Previous work has evaluated the benefit of lexical features only for broad-coverage data sets. [sent-53, score-0.176]
22 Bengston and Roth (2008) incorporated a memorization feature to learn which entities can refer to one another. [sent-54, score-0.113]
23 Rahman and Ng (2011a) also utilized lexical features, going beyond strict memorization with methods to combat data sparseness and incorporating semantic information. [sent-57, score-0.153]
24 Semi-lexical features were also used when one NP was a Named Entity, and unseen features were used when the NPs were not in the training set. [sent-59, score-0.176]
25 Their features did yield improvements on both the ACE 2005 and OntoNotes-2 data, but the semilexical features included Named Entity classes as well as word-based features. [sent-60, score-0.162]
26 Rahman and Ng (2011b) explored the use of lexical features in greater detail and showed their benefit on the ACE05 corpus independent of, and combined with, a conventional set of coreference features. [sent-61, score-0.771]
27 The authors experimented with utilizing lexical information drawn from different sources. [sent-63, score-0.137]
28 The results showed that the best performance came from training and testing with lexical knowledge drawn from the same source. [sent-64, score-0.179]
29 Although our approach is similar, this paper focuses on learning lexical information from different domains as opposed to the different genres found in the six sources of the ACE05 corpus. [sent-65, score-0.121]
30 Björkelund and Nugues (2011) used lexical word pairs for the 2011 CoNLL Shared Task, showing significant positive impact on performance. [sent-66, score-0.075]
31 Our work aims to show the benefit of lexical features using much smaller training sets (< 50 documents) focused on specific domains. [sent-68, score-0.257]
32 Florian et al. (2004) utilized lexical information such as mention spelling and context for entity tracking in ACE. [sent-71, score-0.111]
33 Ng (2007) used lexical information to assess the likelihood of a noun phrase being anaphoric, but this did not show clear improvements on ACE data. [sent-72, score-0.106]
34 There has been previous work on domain-specific coreference resolution for several domains, including biological literature (Castaño et al. [sent-73, score-0.825]
35 , 2011; Batista-Navarro and Ananiadou, 2011), clinical medicine (He, 2007; Zheng et al. [sent-75, score-0.122]
36 In addition, BABAR (Bean and Riloff, 2004) used contextual role knowledge for coreference resolution in the domains of terrorism and natural disasters. [sent-78, score-0.839]
37 But BABAR acquired and used lexical information to match the compatibility of contexts surrounding NPs, not the NPs themselves. [sent-79, score-0.107]
38 To the best of our knowledge, our work is the first to examine the impact of lexicalized features for domain-specific coreference resolution. [sent-80, score-0.895]
39 3 Exploiting Lexicalized Features Table 1 shows the performance of a learning-based coreference resolver, Reconcile (Stoyanov et al. [sent-81, score-0.595]
40 , 2010), with its default feature set using different combinations of training and testing data. [sent-82, score-0.108]
41 Reconcile does not include any lexical features, but does contain over 60 general features covering semantic agreement, syntactic constraints, string match and recency. [sent-83, score-0.169]
42 Each row represents a training set, each column represents a test set, and each cell shows precision (P), recall (R), and F score results under the B3 metric when using the corresponding training and test data. [sent-84, score-0.118]
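For reference, one standard formulation of the B3 metric (Bagga and Baldwin, 1998) reported in these tables, with each mention weighted equally; this formulation is supplied here for clarity and is not quoted from the paper.

```latex
% B^3: for a document with mentions m_1,...,m_n, let R_i be the system
% (response) chain containing m_i and K_i the gold (key) chain containing m_i.
\[
P_{B^3} = \frac{1}{n}\sum_{i=1}^{n}\frac{|R_i \cap K_i|}{|R_i|},
\qquad
R_{B^3} = \frac{1}{n}\sum_{i=1}^{n}\frac{|R_i \cap K_i|}{|K_i|},
\qquad
F_{B^3} = \frac{2\,P_{B^3}\,R_{B^3}}{P_{B^3}+R_{B^3}}
\]
```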
43 Table 2: B3 results for baselines and lexicalized feature sets across four domains. [sent-109, score-0.306]
44 If just one cell is gray in a column, that indicates the result was significantly better than the other results in the same column with p ≤ 0.05. [sent-112, score-0.112]
45 Reconcile does not show much benefit from training on the same domain as the test set. [sent-116, score-0.113]
46 Three different training sets produce F scores that are not significantly different for both the MUC-6 and MUC-4 test data. [sent-117, score-0.118]
47 For ProMed, training on the MUC-7 data yields significantly better results than training on all the other data sets, including ProMed texts! [sent-118, score-0.153]
48 Based on these results, it would seem that training on the MUC-7 texts is likely to yield the best results no matter what domain you plan to use the coreference resolver for. [sent-119, score-0.867]
49 The goal of our work is to investigate whether lexical features can extract additional knowledge from domain-specific training texts to help tailor a coreference resolver to perform better for a specific domain. [sent-120, score-0.944]
50 We follow Stoyanov et al. (2009) to define a coreference element (CE) as a noun phrase that can participate in a coreference relation based on the task definition. [sent-123, score-1.221]
51 Each training document has manually annotated gold coreference chains corresponding to the sets of CEs that are coreferent. [sent-124, score-0.676]
52 We consider the coreference relation to be bi-directional, so we don’t retain information about which CE was the antecedent. [sent-126, score-0.622]
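As an illustration of the pair-extraction step just described, here is a minimal Python sketch (not code from the paper) that collects unordered coreferent CE pairs, with counts, from gold chains; the chain format, normalization, and function names are assumptions for illustration only. Sorting each pair makes it order-free, so no antecedent information is kept, matching the bi-directional treatment above.

```python
from itertools import combinations
from collections import Counter

def collect_coreferent_pairs(gold_chains):
    """Count how often two CEs appear in the same gold coreference chain.

    gold_chains: iterable of chains, each chain an iterable of CE strings.
    Pairs are stored unordered (sorted), so (x, y) and (y, x) collapse.
    """
    pair_counts = Counter()
    for chain in gold_chains:
        # normalize and deduplicate mentions within one chain
        ces = sorted({ce.lower() for ce in chain})
        for x, y in combinations(ces, 2):
            pair_counts[(x, y)] += 1
    return pair_counts

# Example with two toy training chains
chains = [["the tomcat", "the plane", "it"],
          ["a plane", "the tomcat"]]
pair_counts = collect_coreferent_pairs(chains)
```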
53 2 Lexicalized Feature Sets We explore two ways to capture lexicalized information as features. [sent-131, score-0.234]
54 The first approach indicates whether two CEs have ever been coreferent in the training data. [sent-132, score-0.172]
55 We create a single feature called LEXLOOKUP(X,Y) that receives a value of 1 when x and y have been coreferent at least twice, or a value of 0 otherwise. [sent-133, score-0.163]
56 LEXLOOKUP(X,Y) is a single feature that captures all CE pairs that were coreferent in the training data. [sent-134, score-0.207]
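A minimal sketch of how the LEXLOOKUP(X,Y) feature could be computed, assuming the pair_counts table from the previous sketch; the twice-seen threshold follows the description above, while the function and parameter names are illustrative assumptions, not the authors' implementation.

```python
def lex_lookup(x, y, pair_counts, min_count=2):
    """LEXLOOKUP(X,Y): 1 if CEs x and y were seen coreferent at least
    `min_count` times in the training data, else 0."""
    key = tuple(sorted((x.lower(), y.lower())))
    return 1 if pair_counts.get(key, 0) >= min_count else 0

# Usage: lex_lookup("the tomcat", "the plane", pair_counts) -> 0 or 1
```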
57 We also created set-based features that capture the set of terms that have been coreferent with a particular CE. [sent-135, score-0.194]
58 The CorefSet(x) is the set of CEs that have appeared in the same coreference chain as mention x at least twice. [sent-136, score-0.631]
59 We create a set of binary-valued features LEXSET(X,Y), one for each CE x in the training data. [sent-137, score-0.11]
60 LEXSET(X,Y) = 1 if y ∈ CorefSet(x), or 0 otherwise. The benefit of the set-based features over a single monolithic feature is that the classifier has one set-based feature for each mention found in the training data, so it can learn to handle individual terms differently. [sent-139, score-0.15]
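A corresponding sketch for the set-based features, again assuming the pair_counts table from the earlier sketch. CorefSet(x) and the per-CE LEXSET features follow the definitions above, but the data structures and names are illustrative assumptions rather than the authors' code.

```python
from collections import defaultdict

def build_coref_sets(pair_counts, min_count=2):
    """CorefSet(x): CEs that appeared in the same chain as x at least
    `min_count` times in the training data (illustrative reconstruction)."""
    coref_sets = defaultdict(set)
    for (x, y), n in pair_counts.items():
        if n >= min_count:
            coref_sets[x].add(y)
            coref_sets[y].add(x)
    return coref_sets

def lexset_features(x, y, coref_sets):
    """Sparse binary LEXSET features for a mention pair (x, y): the feature
    indexed by a training CE fires when the other CE of the pair is in its
    CorefSet (symmetric, since pairs are unordered)."""
    x, y = x.lower(), y.lower()
    feats = {}
    if y in coref_sets.get(x, ()):
        feats[f"LEXSET({x})"] = 1
    if x in coref_sets.get(y, ()):
        feats[f"LEXSET({y})"] = 1
    return feats
```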
61 We also tried encoding a separate feature for each distinct pair of words, analogous to the memorization feature in Bengston and Roth (2008). [sent-140, score-0.148]
62 1 Data Sets We evaluated the performance of lexicalized features on 4 domain-specific corpora including two standard coreference benchmarks, the MUC-6 and MUC-7 data sets. [sent-143, score-0.895]
63 The MUC-6 domain is management succession and consists of 30 training texts and 30 test texts. [sent-144, score-0.188]
64 The MUC-7 domain is vehicle launches and consists of 30 training texts and 20 test texts. [sent-146, score-0.148]
65 We also created 2 new coreference data sets which we will make freely available. [sent-148, score-0.632]
66 org) about disease outbreaks and 45 MUC-4 texts about terrorism, following the MUC guidelines (Hirschman, 1997). [sent-151, score-0.161]
67 2 Coreference Resolution Models We conducted experiments using two coreference resolution architectures. [sent-159, score-0.733]
68 We also conducted experiments with the Sieve coreference resolver, which applies high precision heuristic rules to incrementally build coreference chains. [sent-165, score-1.252]
69 We implemented the LEXLOOKUP(X,Y) feature as an additional heuristic rule. [sent-166, score-0.097]
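Sieve's actual rule set is described in Raghunathan et al. (2010); the following is only a schematic Python sketch of how a lexical-lookup pass might merge partial clusters when some mention pair fires LEXLOOKUP. It reuses the hypothetical lex_lookup function from the earlier sketch, and all names are assumptions, not the released Sieve code.

```python
def lexical_sieve_pass(clusters, pair_counts, min_count=2):
    """Schematic sieve-style pass: merge an incoming cluster into the first
    already-merged cluster that shares a lexically licensed mention pair."""
    merged = []
    for cluster in clusters:
        target = None
        for existing in merged:
            if any(lex_lookup(x, y, pair_counts, min_count)
                   for x in cluster for y in existing):
                target = existing
                break
        if target is not None:
            target.extend(cluster)      # merge into the matched cluster
        else:
            merged.append(list(cluster))  # otherwise keep it as its own entity
    return merged
```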
70 3 Experimental Results Table 2 presents results for Reconcile trained with and without lexical features and when adding a lexical heuristic with data drawn from same-domain texts to Sieve. [sent-169, score-0.419]
71 The first row shows the results without the lexicalized features (from Table 1). [sent-170, score-0.3]
72 All F scores for Reconcile with lexicalized features are significantly better than without these features based on the Paired Permutation test (Pesarin, 2001) with p ≤ 0.05. (We also computed κ on MUC-4, but unfortunately the score and original data were lost.) [sent-171, score-0.403]
73 For most domains, adding the lexical features to Reconcile substantially increased precision with comparable levels of recall. [sent-180, score-0.141]
74 The bottom half of Table 2 contains the results of adding a lexical heuristic to Sieve. [sent-181, score-0.137]
75 The first row shows the default system with no lexical information. [sent-182, score-0.075]
76 All F scores with the lexical heuristic are significantly better than without it. [sent-183, score-0.174]
77 In Sieve’s high-precision coreference architecture, the lexical heuristic yields additional recall gains without sacrificing much precision. [sent-184, score-0.76]
78 Table 3: B3 results for baselines and lexicalized feature sets on the broad-coverage ACE 2004 data set. [sent-188, score-0.306]
79 For Sieve, the unlexicalized system yields a significantly higher F score than when adding the lexical heuristic. [sent-192, score-0.14]
80 These results support our hypothesis that lexicalized information can be beneficial for capturing domain-specific word associations, but may not be as helpful in a broad-coverage setting where the language covers a diverse set of topics. [sent-193, score-0.234]
81 The bottom half of the table shows cross-domain experiments for Sieve using the lexical heuristic at the end of its rule set (LexEnd). [sent-195, score-0.137]
82 Training and testing on the same domain always produced the highest recall scores for MUC7, ProMed, and MUC-4 when utilizing lexical features. [sent-198, score-0.169]
83 In all cases, lexical features acquired from same-domain texts yield results that are either clearly the best or not significantly different from the best. [sent-199, score-0.298]
84 For MUC-6 and MUC-7, the highest F score results almost always come from training on same-domain texts, although in some cases these results are not significantly different from training on other domains. [sent-202, score-0.177]
85 Lexical features can yield improvements when training on a different domain if there is overlap in the vocabulary across the domains. [sent-203, score-0.174]
86 For the ProMed domain, the Sieve system performs significantly better, under both metrics, with same-domain lexical features than with lexical features acquired from a different domain. [sent-204, score-0.276]
87 In the MUC-4 domain, using same-domain lexical information always produces the best F score, under both metrics and in both coreference systems. [sent-206, score-0.67]
88 5 Conclusions We explored the use of lexical information for domain-specific coreference resolution using 4 domain-specific data sets and 2 coreference resolvers. [sent-207, score-1.44]
89 Lexicalized features consistently improved performance for all of the domains and in both coreference architectures. [sent-208, score-0.707]
90 We see benefits from lexicalized features in cross-domain training, but the gains are often more substantial when utilizing same-domain lexical knowledge. [sent-209, score-0.406]
91 In the future, we plan to explore additional types of lexical information to benefit domain-specific coreference resolution. [sent-210, score-0.705]
92 Unsupervised learning of Contextual Role Knowledge for coreference resolution. [sent-245, score-0.595]
93 A search based method for clinical text coreference resolution. [sent-274, score-0.686]
94 Lexical patterns, features and knowledge resources for coreference resolution in clinical notes. [sent-278, score-0.89]
95 Coreference resolution on entities and events for hospital discharge summaries. [sent-282, score-0.138]
96 The taming of Reconcile as a Biomedical coreference resolver. [sent-292, score-0.595]
97 Anaphora resolution for biomedical literature by exploiting multiple resources. [sent-296, score-0.233]
98 Narrowing the modelling gap: A cluster-ranking approach to coreference resolution. [sent-325, score-0.595]
99 Conundrums in noun phrase coreference resolution: Making sense of the Stateof-the-Art. [sent-329, score-0.626]
100 Coreference resolution: A review of general methodologies and applications in the clinical domain. [sent-346, score-0.091]
wordName wordTfidf (topN-words)
[('coreference', 0.595), ('reconcile', 0.297), ('lexicalized', 0.234), ('promed', 0.207), ('sieve', 0.182), ('ace', 0.144), ('resolution', 0.138), ('coreferent', 0.128), ('stoyanov', 0.12), ('resolver', 0.106), ('biomedical', 0.095), ('ces', 0.095), ('bengston', 0.092), ('resolvers', 0.092), ('clinical', 0.091), ('rahman', 0.091), ('ce', 0.087), ('anthrax', 0.078), ('gooch', 0.078), ('lexlookup', 0.078), ('lexset', 0.078), ('memorization', 0.078), ('pesarin', 0.078), ('villain', 0.078), ('nps', 0.076), ('lexical', 0.075), ('casta', 0.069), ('ellen', 0.067), ('features', 0.066), ('heuristic', 0.062), ('domainspecific', 0.06), ('gilbert', 0.06), ('orkelund', 0.06), ('terrorism', 0.06), ('nathan', 0.059), ('texts', 0.058), ('permutation', 0.056), ('bionlp', 0.055), ('diagnosis', 0.055), ('muc', 0.055), ('babar', 0.052), ('corefset', 0.052), ('glinos', 0.052), ('lexend', 0.052), ('outbreaks', 0.052), ('roudsari', 0.052), ('samedomain', 0.052), ('succession', 0.052), ('uaw', 0.052), ('disease', 0.051), ('anaphora', 0.048), ('domains', 0.046), ('bean', 0.046), ('launches', 0.046), ('tomcat', 0.046), ('gray', 0.045), ('training', 0.044), ('legal', 0.043), ('bj', 0.043), ('gasperin', 0.042), ('veselin', 0.042), ('lynette', 0.042), ('altaf', 0.042), ('vincent', 0.04), ('riloff', 0.039), ('nominals', 0.038), ('significantly', 0.037), ('cardie', 0.037), ('sets', 0.037), ('claire', 0.036), ('ut', 0.036), ('ananiadou', 0.036), ('raghunathan', 0.036), ('mention', 0.036), ('benefit', 0.035), ('feature', 0.035), ('vehicle', 0.035), ('ah', 0.035), ('ng', 0.034), ('domain', 0.034), ('florian', 0.034), ('afrl', 0.034), ('ontonotes', 0.033), ('pradhan', 0.033), ('biological', 0.032), ('nugues', 0.032), ('nist', 0.032), ('acquired', 0.032), ('roth', 0.031), ('medicine', 0.031), ('drawn', 0.031), ('utilizing', 0.031), ('noun', 0.031), ('column', 0.03), ('yield', 0.03), ('testing', 0.029), ('witten', 0.029), ('string', 0.028), ('yields', 0.028), ('retain', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features
Author: Nathan Gilbert ; Ellen Riloff
Abstract: Most coreference resolvers rely heavily on string matching, syntactic properties, and semantic attributes of words, but they lack the ability to make decisions based on individual words. In this paper, we explore the benefits of lexicalized features in the setting of domain-specific coreference resolution. We show that adding lexicalized features to off-the-shelf coreference resolvers yields significant performance gains on four domain-specific data sets and with two types of coreference resolution architectures.
2 0.39172822 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
Author: Sebastian Martschat
Abstract: We present an unsupervised model for coreference resolution that casts the problem as a clustering task in a directed labeled weighted multigraph. The model outperforms most systems participating in the English track of the CoNLL’ 12 shared task.
3 0.25801012 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
Author: Emmanuel Lassalle ; Pascal Denis
Abstract: This paper proposes a new method for significantly improving the performance of pairwise coreference models. Given a set of indicators, our method learns how to best separate types of mention pairs into equivalence classes for which we construct distinct classification models. In effect, our approach finds an optimal feature space (derived from a base feature set and indicator set) for discriminating coreferential mention pairs. Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators. Our experiments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features. Our best system obtains a competitive 67.2 of average F1 over MUC, B3, and CEAF, which, despite its simplicity, places it above the mean score of other systems on these datasets.
4 0.23483494 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution
Author: Greg Durrett ; David Hall ; Dan Klein
Abstract: Efficiently incorporating entity-level information is a challenge for coreference resolution systems due to the difficulty of exact inference over partitions. We describe an end-to-end discriminative probabilistic model for coreference that, along with standard pairwise features, enforces structural agreement constraints between specified properties of coreferent mentions. This model can be represented as a factor graph for each document that admits efficient inference via belief propagation. We show that our method can use entity-level information to outperform a basic pairwise system.
5 0.14404657 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
Author: Kartik Goyal ; Sujay Kumar Jauhar ; Huiying Li ; Mrinmaya Sachan ; Shashank Srivastava ; Eduard Hovy
Abstract: In this paper we present a novel approach to modelling distributional semantics that represents meaning as distributions over relations in syntactic neighborhoods. We argue that our model approximates meaning in compositional configurations more effectively than standard distributional vectors or bag-of-words models. We test our hypothesis on the problem of judging event coreferentiality, which involves compositional interactions in the predicate-argument structure of sentences, and demonstrate that our model outperforms both state-of-the-art window-based word embeddings as well as simple approaches to compositional semantics previously employed in the literature.
6 0.14116241 267 acl-2013-PARMA: A Predicate Argument Aligner
7 0.14040601 205 acl-2013-Joint Apposition Extraction with Syntactic and Semantic Constraints
8 0.09884771 172 acl-2013-Graph-based Local Coherence Modeling
9 0.095082544 225 acl-2013-Learning to Order Natural Language Texts
10 0.087177463 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
11 0.084933363 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
12 0.083589748 166 acl-2013-Generalized Reordering Rules for Improved SMT
13 0.080607757 386 acl-2013-What causes a causal relation? Detecting Causal Triggers in Biomedical Scientific Discourse
14 0.07375823 296 acl-2013-Recognizing Identical Events with Graph Kernels
15 0.070523947 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
16 0.069752604 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
17 0.068893246 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling
18 0.063121855 52 acl-2013-Annotating named entities in clinical text by combining pre-annotation and active learning
19 0.063085906 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
20 0.057162229 177 acl-2013-GuiTAR-based Pronominal Anaphora Resolution in Bengali
topicId topicWeight
[(0, 0.149), (1, 0.029), (2, -0.007), (3, -0.102), (4, 0.005), (5, 0.214), (6, -0.003), (7, 0.137), (8, -0.028), (9, 0.08), (10, 0.01), (11, -0.075), (12, -0.072), (13, 0.017), (14, -0.089), (15, 0.145), (16, -0.137), (17, 0.258), (18, -0.087), (19, 0.153), (20, -0.178), (21, 0.103), (22, -0.04), (23, -0.279), (24, -0.001), (25, 0.033), (26, -0.026), (27, 0.102), (28, 0.14), (29, -0.036), (30, 0.013), (31, 0.058), (32, 0.014), (33, -0.017), (34, 0.025), (35, 0.065), (36, 0.064), (37, 0.017), (38, 0.012), (39, -0.03), (40, -0.051), (41, 0.009), (42, 0.011), (43, -0.036), (44, -0.002), (45, 0.072), (46, 0.003), (47, 0.025), (48, -0.003), (49, 0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.93687409 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features
Author: Nathan Gilbert ; Ellen Riloff
Abstract: Most coreference resolvers rely heavily on string matching, syntactic properties, and semantic attributes of words, but they lack the ability to make decisions based on individual words. In this paper, we explore the benefits of lexicalized features in the setting of domain-specific coreference resolution. We show that adding lexicalized features to off-the-shelf coreference resolvers yields significant performance gains on four domain-specific data sets and with two types of coreference resolution architectures.
2 0.91447324 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
Author: Sebastian Martschat
Abstract: We present an unsupervised model for coreference resolution that casts the problem as a clustering task in a directed labeled weighted multigraph. The model outperforms most systems participating in the English track of the CoNLL’ 12 shared task.
3 0.90597326 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution
Author: Greg Durrett ; David Hall ; Dan Klein
Abstract: Efficiently incorporating entity-level information is a challenge for coreference resolution systems due to the difficulty of exact inference over partitions. We describe an end-to-end discriminative probabilistic model for coreference that, along with standard pairwise features, enforces structural agreement constraints between specified properties of coreferent mentions. This model can be represented as a factor graph for each document that admits efficient inference via belief propagation. We show that our method can use entity-level information to outperform a basic pairwise system.
4 0.76854199 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
Author: Emmanuel Lassalle ; Pascal Denis
Abstract: This paper proposes a new method for significantly improving the performance of pairwise coreference models. Given a set of indicators, our method learns how to best separate types of mention pairs into equivalence classes for which we construct distinct classification models. In effect, our approach finds an optimal feature space (derived from a base feature set and indicator set) for discriminating coreferential mention pairs. Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators. Our experiments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features. Our best system obtains a competitive 67.2 of average F1 over MUC, B3, and CEAF, which, despite its simplicity, places it above the mean score of other systems on these datasets.
5 0.71660984 177 acl-2013-GuiTAR-based Pronominal Anaphora Resolution in Bengali
Author: Apurbalal Senapati ; Utpal Garain
Abstract: This paper attempts to use an off-the-shelf anaphora resolution (AR) system for Bengali. The language specific preprocessing modules of GuiTAR (v3.0.3) are identified and suitably designed for Bengali. Anaphora resolution module is also modified or replaced in order to realize different configurations of GuiTAR. Performance of each configuration is evaluated and experiment shows that the off-the-shelf AR system can be effectively used for Indic languages. 1
6 0.64411044 205 acl-2013-Joint Apposition Extraction with Syntactic and Semantic Constraints
7 0.46506667 267 acl-2013-PARMA: A Predicate Argument Aligner
8 0.45831716 172 acl-2013-Graph-based Local Coherence Modeling
9 0.4414767 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
10 0.39508554 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision
11 0.39113903 225 acl-2013-Learning to Order Natural Language Texts
12 0.35462776 382 acl-2013-Variational Inference for Structured NLP Models
13 0.33691967 296 acl-2013-Recognizing Identical Events with Graph Kernels
14 0.29757538 139 acl-2013-Entity Linking for Tweets
15 0.28639385 179 acl-2013-HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text
16 0.2769528 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
17 0.27511856 280 acl-2013-Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation
18 0.27215576 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling
19 0.27207777 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
20 0.25333264 365 acl-2013-Understanding Tables in Context Using Standard NLP Toolkits
topicId topicWeight
[(0, 0.043), (6, 0.028), (11, 0.045), (24, 0.035), (26, 0.041), (35, 0.05), (42, 0.047), (48, 0.028), (64, 0.011), (70, 0.047), (80, 0.018), (85, 0.273), (88, 0.101), (90, 0.066), (95, 0.08)]
simIndex simValue paperId paperTitle
same-paper 1 0.71815646 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features
Author: Nathan Gilbert ; Ellen Riloff
Abstract: Most coreference resolvers rely heavily on string matching, syntactic properties, and semantic attributes of words, but they lack the ability to make decisions based on individual words. In this paper, we explore the benefits of lexicalized features in the setting of domain-specific coreference resolution. We show that adding lexicalized features to off-the-shelf coreference resolvers yields significant performance gains on four domain-specific data sets and with two types of coreference resolution architectures.
2 0.71239072 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures
Author: Shu Cai ; Kevin Knight
Abstract: The evaluation of whole-sentence semantic structures plays an important role in semantic parsing and large-scale semantic structure annotation. However, there is no widely-used metric to evaluate wholesentence semantic structures. In this paper, we present smatch, a metric that calculates the degree of overlap between two semantic feature structures. We give an efficient algorithm to compute the metric and show the results of an inter-annotator agreement study.
3 0.69077039 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers
Author: Yoav Goldberg ; Kai Zhao ; Liang Huang
Abstract: Beam search incremental parsers are accurate, but not as fast as they could be. We demonstrate that, contrary to popular belief, most current implementations of beam parsers in fact run in O(n2), rather than linear time, because each state-transition is actually implemented as an O(n) operation. We present an improved implementation, based on Tree Structured Stack (TSS), in which a transition is performed in O(1), resulting in a real linear-time algorithm, which is verified empirically. We further improve parsing speed by sharing feature-extraction and dot-product across beam items. Practically, our methods combined offer a speedup of ∼2x over strong baselines on Penn Treebank sentences, and are orders of magnitude faster on much longer sentences.
4 0.62434101 6 acl-2013-A Java Framework for Multilingual Definition and Hypernym Extraction
Author: Stefano Faralli ; Roberto Navigli
Abstract: In this paper we present a demonstration of a multilingual generalization of Word-Class Lattices (WCLs), a supervised lattice-based model used to identify textual definitions and extract hypernyms from them. Lattices are learned from a dataset of automatically-annotated definitions from Wikipedia. We release a Java API for the programmatic use of multilingual WCLs in three languages (English, French and Italian), as well as a Web application for definition and hypernym extraction from user-provided sentences.
5 0.59235394 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
Author: Graham Neubig
Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on English-Japanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http://phontron.com/travatar
6 0.53563756 136 acl-2013-Enhanced and Portable Dependency Projection Algorithms Using Interlinear Glossed Text
7 0.52622199 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
8 0.5261392 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts
9 0.52594393 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
10 0.52371705 41 acl-2013-Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation
11 0.52276474 327 acl-2013-Sorani Kurdish versus Kurmanji Kurdish: An Empirical Comparison
12 0.52032316 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
13 0.51235515 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
15 0.49355015 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
16 0.49070877 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
17 0.49040881 97 acl-2013-Cross-lingual Projections between Languages from Different Families
18 0.48809341 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
19 0.48803765 292 acl-2013-Question Classification Transfer
20 0.48776102 250 acl-2013-Models of Translation Competitions