acl acl2012 acl2012-50 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Katja Markert ; Yufang Hou ; Michael Strube
Abstract: Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying fine-grained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity, and solve the task by collectively classifying the information status of all mentions. Our approach strongly outperforms reimplementations of previous work.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. [sent-6, score-0.301]
2 We here introduce the task of classifying fine-grained information status and work on written text. [sent-7, score-0.202]
3 We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. [sent-8, score-0.162]
4 We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity, and solve the task by collectively classifying the information status of all mentions. [sent-9, score-0.919]
5 While information structure affects all kinds of constituents in a sentence, we here adopt the more restricted notion of information status, which concerns only discourse entities realized as noun phrases. [sent-12, score-0.3]
6 Information status (IS henceforth) describes the degree to which a discourse entity is available to the hearer with regard to the speaker’s assumptions about the hearer’s knowledge and beliefs (Nissim et al. [sent-15, score-0.272]
7 Old mentions are known to the hearer and have been referred to before. (Since not all noun phrases are referential, we use the term mentions for noun phrases which carry information status.) [sent-17, score-0.478]
8 Mediated mentions have not been mentioned before but are also not autonomous, i.e., they can only be correctly interpreted by reference to another mention or to prior world knowledge. [sent-19, score-0.229] [sent-21, score-0.183]
10 We also report the first results on fine-grained IS classification by modelling further distinctions within the category of mediated mentions, such as comparative and bridging anaphora (see Examples 1 and 2, respectively). [sent-37, score-0.833] (Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju, Republic of Korea, 8-14 July 2012.)
11 2 Fine-grained IS is a prerequisite to full bridging/comparative anaphora resolution, and therefore necessary to fill gaps in entity grids (Barzilay and Lapata, 2008) based on coreference only. [sent-40, score-0.232]
12 Thus, Examples 1 and 2 do not exhibit any coreferential entity coherence but coherence can be established when the comparative anaphor others is resolved to others than freeway survivor Buck Helm, and the bridging anaphor the streets is resolved to the streets of Oranjemund, respectively. [sent-41, score-0.671]
13 We approach the challenge of modeling IS via collective classification, using several novel linguistically motivated features. [sent-51, score-0.207]
14 First, comparative anaphora are not specifically handled in Nissim et al. [sent-57, score-0.21]
15 (2010)), although some of them might be included in their respective bridging subcategories. [sent-60, score-0.225]
16 Second, we apply the annotation scheme reliably to a new genre, namely news. [sent-61, score-0.169]
17 The mention in question is typed in boldface; antecedents, where applicable, are displayed in italics. [sent-70, score-0.183]
18 To the best of our knowledge, we therefore present the first English corpus reliably annotated for a wide range of IS categories as well as full anaphoric information for three main anaphora types (coreference, bridging, comparative). [sent-74, score-0.222]
19 As their approach is restricted to definites, they only analyse a subset of the mentions we consider carrying IS. [sent-77, score-0.264]
20 Both papers treat IS classification as a local classification problem, whereas we look at dependencies between the IS of different mentions, leading to collective classification. [sent-82, score-0.543]
21 In addition, they only distinguish the three main categories old, mediated and new. [sent-83, score-0.233]
22 Anaphoricity determination (Ng, 2009; Zhou and Kong, 2009) identifies many or most old mentions. [sent-85, score-0.206]
23 However, no distinction between mediated and new mentions is made. [sent-86, score-0.495]
24 Most approaches to bridging resolution (Meyer and Dale, 2002; Poesio et al. [sent-87, score-0.293]
25 Sasano and Kurohashi (2009) also tackle bridging recognition, but they depend on language-specific, non-transferable features for Japanese. [sent-90, score-0.225]
26 (2004) in distinguishing three major IS categories: old, new and mediated. [sent-93, score-0.233]
27 A mention is old if it is either coreferential with an already introduced entity or a generic or deictic pronoun. [sent-94, score-0.255]
28 This definition includes coreference with noun phrase as well as verb phrase antecedents (footnote 3). [sent-97, score-0.147]
29 Mediated refers to entities which have not yet been introduced in the text but are inferrable via other mentions or are known via world knowledge. [sent-98, score-0.303]
30 We distinguish the following six subcategories: the category mediated/comparative comprises mentions compared via either a contrast or a similarity to another one (see Example 1). [sent-99, score-0.325]
31 We also include a category mediated/bridging (see Examples 2, 3 and 4). [sent-101, score-0.247]
32 Bridging anaphora can be any noun phrase and are not limited to definite NPs as in Poesio et al. [sent-102, score-0.165]
33 (2004), antecedents for both comparative and bridging categories are annotated and can be noun phrases, verb phrases or even clauses. [sent-106, score-0.43]
Mentions that are syntactically linked via a possessive relation or a PP modification to other, old or mediated mentions fall into the type mediated/synt (see Examples 5 and 6). [sent-109, score-0.788]
35 ’s scheme, coordinated mentions where at least one element in the conjunction is old or mediated are covered by the category mediated/aggregate, and mentions referring to a value of a previously mentioned function by the type mediated/func. [sent-111, score-0.934]
36 All other mentions are annotated as new, including most generics as well as newly introduced, specific mentions such as Example 7. [sent-112, score-0.229] [sent-118, score-0.26] (Footnote 3: In contrast to Nissim et al. [...])
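The label scheme described above can be summarised as a two-level taxonomy. The sketch below (not the authors' code; label spellings are illustrative) maps each fine-grained IS label to its coarse category:

```python
# Minimal sketch of the two-level information-status label scheme:
# coarse categories old / mediated / new, with mediated split into
# the six subtypes described above.
COARSE_OF = {
    "old": "old",
    "new": "new",
    "mediated/comparative": "mediated",
    "mediated/bridging": "mediated",
    "mediated/synt": "mediated",
    "mediated/aggregate": "mediated",
    "mediated/func": "mediated",
    "mediated/knowledge": "mediated",
}

def coarsen(fine_label):
    """Map a fine-grained IS label to its coarse category."""
    return COARSE_OF[fine_label]
```

The annotation additionally uses a non-mention label, which is omitted here for brevity.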
38 There were no restrictions on which texts to include apart from (i) exclusion of letters to the editor as they contain cross-document links and (ii) a preference for longer texts with potentially richer discourse structure. [sent-130, score-0.159]
39 The existing coreference annotation was automatically carried over to the IS task by marking all mentions in a coreference chain (apart from the first mention in the chain) as old. [sent-132, score-0.686]
40 The annotation task consisted of marking all mentions for their IS (old, mediated or new) as well as marking mediated subcategories (see Section 3.1) and the antecedents for comparative and bridging anaphora. [sent-133, score-0.571] [sent-134, score-0.164]
42 The annotations of 1499 of these were carried over from OntoNotes, leaving 4406 potential mentions for annotation and agreement measurement. [sent-137, score-0.276]
43 Table 1 shows agreement results for the overall scheme at the coarse-grained (4 categories: non-mention, old, new, mediated) and the fine-grained level (9 categories: non-mention, old, new and the 6 mediated subtypes). [sent-180, score-0.37]
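Agreement figures like those in Table 1 are typically chance-corrected. The following is a standard two-annotator kappa computation (a generic formulation, not tied to the paper's exact setup or weighting):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators:
    kappa = (observed - expected) / (1 - expected)."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed proportion of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Multi-annotator studies like the one reported here usually use a generalisation such as Fleiss' or Artstein and Poesio's formulations; this two-annotator version shows the core idea.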
44 The reliability of the category bridging is more annotator-dependent, although still higher, sometimes considerably, than other previous attempts at bridging annotation. [sent-185, score-0.416] (Footnote 7: Often, annotation is considered highly reliable when κ exceeds 0.8. Footnote 8: The low reliability of the rare category func, when involving Annotator B, was explained by Annotator B forgetting about this category after having used it once. [sent-190, score-0.19])
46 3 Gold Standard: Our final gold standard corpus consists of 50 texts from the WSJ portion of the OntoNotes corpus. The corpus will be made publicly available as an OntoNotes annotation layer via http://www. [sent-197, score-0.145]
47 Disagreements in the 35 texts used for annotator training (9 texts) and testing (26 texts) were resolved via discussion between the annotators. [sent-200, score-0.144]
48 The gold standard includes 10,980 true mentions (see Table 3). [sent-203, score-0.229]
49 1 Features for Local Classification: We use the following local features, including the features in Nissim (2006) and Rahman and Ng (2011), to be able to gauge how their systems fare on our corpus and as a comparison point for our novel collective classification approach. [sent-206, score-0.308]
50 Also, previously unmentioned proper names are more likely to be hearer-old and therefore mediated/knowledge, although their exact status will depend on how well known a particular proper name is. [sent-213, score-0.438]
51 Rahman and Ng (2011) add all unigrams appearing in any mention in the training set as features. [sent-214, score-0.183]
52 They also integrated (via a convolution tree-kernel SVM (Collins and Duffy, 2001)) partial parse trees that capture the generalised syntactic context of a mention e and include the mention’s parent and sibling nodes without lexical leaves. [sent-215, score-0.183]
53 However, they use no structure underneath the mention node e itself, assuming that “any NP-internal information has presumably been captured by the flat features”. [sent-216, score-0.183]
54 These track partial previous mentions by also counting partial previous mention time as well as the previous mention of content words only. [sent-218, score-0.595]
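The previous-mention features just described can be sketched as overlap counts against earlier mentions. This is an illustrative reimplementation, not the authors' code; the exact matching criteria are assumptions:

```python
# Sketch of previous-mention features: counts of full-string,
# partial (any token overlap), and content-word overlap with
# earlier mentions in the document.
def prev_mention_counts(mention_tokens, earlier_mentions, content_words):
    """Return (full, partial, content) match counts."""
    mset = {t.lower() for t in mention_tokens}
    full = partial = content = 0
    for prev in earlier_mentions:
        pset = {t.lower() for t in prev}
        if pset == mset:
            full += 1          # exact token-set match
        elif pset & mset:
            partial += 1       # some but not all tokens shared
        if pset & mset & content_words:
            content += 1       # overlap restricted to content words
    return full, partial, content
```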
55 We also add a mention’s number as one of singular, plural or unknown, and whether the mention is modified by an adjective. [sent-219, score-0.183]
56 Another feature encapsulates whether the mention is modified by a comparative marker, using a small set of 10 markers such as another, such, and similar. [sent-220, score-0.268]
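The comparative-marker feature can be sketched as a simple lookup over the mention's premodifiers. The marker list below is illustrative (the paper's full set of 10 markers is not reproduced here):

```python
# Sketch of the comparative-marker feature: does the mention carry a
# premodifier from a small closed list of comparative markers?
COMPARATIVE_MARKERS = {"another", "other", "such", "similar",
                       "additional", "further", "comparable"}

def has_comparative_marker(mention_tokens):
    """True if any token before the head is a comparative marker
    (the head is assumed to be the last token)."""
    return any(tok.lower() in COMPARATIVE_MARKERS
               for tok in mention_tokens[:-1])
```

For example, the mention "another survivor" would fire this feature, while "the streets" would not.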
57 2 Relations for Collective Classification: Both Nissim (2006) and Rahman and Ng (2011) classify each mention individually in a standard supervised ML setting, not considering potential dependencies between the IS categories of different mentions. (Footnote 9: We changed the value of "full meric" to {yes, no, NA}.) [sent-226, score-0.228]
58 However, collective or joint classification has made a substantial impact in other NLP tasks, such as opinion mining (Pang and Lee, 2004; Somasundaran et al., 2002) and the related task of coreference resolution (Denis and Baldridge, 2007). [sent-228, score-0.243] [sent-231, score-0.175]
60 We investigate two types of relations between mentions that might impact IS classification. [sent-232, score-0.229]
61 mediated/aggregate is for coordinations in which at least one of the children is old or mediated. [sent-236, score-0.253]
62 We therefore link a mention m1 to a mention m2 via a hasChild relation if (i) m2 is a possessive or prepositional modification of m1, or (ii) m1 is a coordination and m2 is one of its children. [sent-238, score-0.403]
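The hasChild relation just defined can be sketched over an assumed mention representation (the dict keys here are hypothetical, not the authors' data structures):

```python
# Sketch of hasChild links: m1 -> m2 if m2 modifies m1 via a
# possessive or PP relation, or m1 is a coordination and m2 is
# one of its conjuncts.
def has_child_links(mentions):
    """Yield (parent_id, child_id) pairs for collective classification.

    Each mention is a dict with keys 'id', 'parent' (id of the
    embedding mention, or None) and 'relation' ('poss', 'pp',
    'coord', or None).
    """
    for m in mentions:
        if m["parent"] is not None and m["relation"] in ("poss", "pp", "coord"):
            yield (m["parent"], m["id"])
```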
5% of all mentions are mediated/synt) will make this feature highly effective in distinguishing between new and mediated categories. [sent-241, score-0.417]
64 Therefore, we integrate dependencies between the IS classification of mentions in precedence relations. [sent-244, score-0.466]
65 For Example 8 (slightly simplified) we extract the precedence relations shown in Table 5. [sent-246, score-0.164]
66 We therefore exclude all precedence relations where one element of the pair is a proper name. [sent-260, score-0.208]
67 Table 6 shows the statistics on precedence with the first mention in a pair in rows and the second in columns. [sent-262, score-0.347]
68 Mediated and new mentions indeed rarely precede old mentions, so that precedence should help separate old from other mentions. [sent-263, score-0.805]
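Extracting the precedence relations described above can be sketched as follows; the representation is hypothetical, and the proper-name filter implements the exclusion motivated earlier:

```python
# Sketch of precedence-pair extraction: every ordered pair of
# mentions in the same sentence, skipping pairs that involve a
# proper name.
def precedence_pairs(sentence_mentions):
    """sentence_mentions: list of (mention_id, is_proper_name)
    tuples in surface order; returns (earlier, later) id pairs."""
    pairs = []
    for i, (m1, proper1) in enumerate(sentence_mentions):
        if proper1:
            continue
        for m2, proper2 in sentence_mentions[i + 1:]:
            if not proper2:
                pairs.append((m1, m2))
    return pairs
```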
69 Following Nissim (2006) and Rahman and Ng (2011), we perform all experiments on gold standard mentions and use the human WSJ syntactic annotation for feature extraction, when necessary. [sent-267, score-0.289]
70 For the extraction of semantic class, we use OntoNotes entity type annotation for proper names and an automatic assignment of semantic class via WordNet hypernyms for common nouns. [sent-268, score-0.141]
71 Fine-grained versions distinguish between the categories old, the six mediated subtypes, and new. [sent-270, score-0.311]
72 ICA initializes each mention with its most likely IS, according to the local classifier and features. [sent-284, score-0.248]
73 It then iterates a relational classifier, which uses both local and relational features (our hasChild and precedes features) taking IS assignments to neighbouring mentions into account. [sent-285, score-0.463]
74 We use NetKit (Macskassy and Provost, 2007) with its standard ICA settings for collective inference, as it allows direct comparison between local and collective classification. [sent-287, score-0.405]
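The iterative classification procedure (ICA) described in the two sentences above can be sketched as follows. This is a minimal illustration, not NetKit's implementation; the classifiers are assumed to expose a scikit-learn-style predict(), and the feature builders are hypothetical:

```python
# Minimal sketch of iterative classification (ICA): bootstrap every
# mention with the local classifier, then repeatedly re-classify
# with a relational classifier whose features are computed from the
# current label assignment, until a fixed point or an iteration cap.
def ica(mentions, local_clf, relational_clf,
        local_feats, relational_feats, max_iters=10):
    # Initialize each mention with its most likely IS label.
    labels = {m: local_clf.predict([local_feats(m)])[0] for m in mentions}
    for _ in range(max_iters):
        new_labels = {
            m: relational_clf.predict(
                [local_feats(m) + relational_feats(m, labels)])[0]
            for m in mentions
        }
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels
```

In this setting the relational features would encode the current IS labels of a mention's hasChild and precedence neighbours.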
Table 7: Collective classification compared to Nissim’s local classifier. [sent-300, score-0.138]
76 The relational classifiers are the local ones with the relational features added: thus, if the local classifier is a tree kernel SVM, so is the relational one. [sent-302, score-0.258]
77 4 Results: Table 7 shows the comparison of collective classification to local classification, using Nissim’s framework and features, and Table 8 the equivalent table for Rahman and Ng’s approach. [sent-305, score-0.235]
78 In particular, the inclusion of semantic classes improves mediated/knowledge and mediated/func, and comparative anaphora are recognised highly reliably via a small set of comparative markers. [sent-307, score-0.619]
79 The hasChild relation leads to significant improvement in accuracy over local classification in all cases, showing the value of collective classification. [sent-308, score-0.308]
80 The improvement here is centered on the categories mediated/synt (for both cases) and mediated/aggregate (for Nissim+ol+hasChild), as well as their distinction from new. [sent-309, score-0.316]
It is also interesting that collective classification with a concise feature set and a simple decision tree, as used in Nissim+ol+hasChild, performs equally well as RahmanNg+ol+hasChild, which uses thousands of unigram and tree features and a more sophisticated local classifier. [sent-310, score-0.308]
82 We investigated several variations of the precedence link, such as restricting it to certain grammatical relations or taking into account definiteness or NP type, but none of them led to any improvement. [sent-313, score-0.164]
83 new mentions does not follow a clear order and is therefore not a very predictive feature (see Table 6). [sent-316, score-0.229]
84 However, many of the clearest precedences they find are more specific variants of the old >p mediated or old >p new precedence, or they are preferences at an even finer level than the one we annotate, including for example the identification of generics. [sent-318, score-0.708]
85 Second, the clear old >p mediated and old >p new preferences are partially already captured by the local features, especially the grammatical role: for example, subjects are often both old and early in the sentence. [sent-335, score-0.477] (Footnote 10: For RahmanNg+ol+hasChild, the aggregate class suffers from collective classification.)
87 With regard to fine-grained classification, many categories, including comparative anaphora, are identified quite reliably, especially in the multiclass classification setting (Nissim+ol+hasChild). [sent-336, score-0.203]
88 Most bridging mentions do not have any clear internal structure or external syntactic contexts that signal their presence. [sent-338, score-0.454]
89 Unigrams could potentially encapsulate some of this lexical knowledge but without generalization are too sparse for a relatively rare category such as bridging (6% of all mentions) to perform well. [sent-340, score-0.284]
90 The difficulty of bridging recognition is an important insight of this paper as it casts doubt on the strategy in previous research to concentrate almost exclusively on antecedent selection (see Section 2). [sent-341, score-0.26]
6 Conclusions: We presented a new approach to information status classification in written text, for which we also provide the first reliably annotated English-language corpus. [sent-342, score-0.287]
92 Based on linguistic intuition, we define features for classifying mentions collectively. [sent-343, score-0.229]
93 We show that our collective classification approach outperforms the state-of-the-art in coarse-grained IS classification by about 10% (Nissim, 2006) and 5% (Rahman and Ng, 2011) accuracy. [sent-344, score-0.316]
94 The gain is almost entirely due to improvements in distinguishing between new and mediated mentions. [sent-345, score-0.266]
Since the work reported in this paper relied, following Nissim (2006) and Rahman and Ng (2011), on gold standard mentions and syntactic annotations, we plan to perform experiments with predicted mentions as well. [sent-347, score-0.458]
96 In addition, we plan to integrate IS resolution with our coreference resolution system (Cai et al. [sent-349, score-0.243]
97 Joint determination of anaphoricity and coreference resolution using integer programming. [sent-389, score-0.238]
98 Learning the information status of noun phrases in spoken dialogues. [sent-507, score-0.202]
99 Information status distinctions and referring expressions: An empirical study of references to people in news summaries. [sent-525, score-0.162]
100 Global learning of noun phrase anaphoricity in coreference resolution via label propagation. [sent-551, score-0.315]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 50 acl-2012-Collective Classification for Fine-grained Information Status
Author: Katja Markert ; Yufang Hou ; Michael Strube
Abstract: Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying fine-grained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity, and solve the task by collectively classifying the information status of all mentions. Our approach strongly outperforms reimplementations of previous work.
2 0.2047213 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Author: Michael Wick ; Sameer Singh ; Andrew McCallum
Abstract: Methods that measure compatibility between mention pairs are currently the dominant approach to coreference. However, they suffer from a number of drawbacks, including difficulties scaling to large numbers of mentions and limited representational power. As these drawbacks become increasingly restrictive, the need to replace the pairwise approaches with a more expressive, highly scalable alternative is becoming urgent. In this paper we propose a novel discriminative hierarchical model that recursively partitions entities into trees of latent sub-entities. These trees succinctly summarize the mentions, providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales. We demonstrate that the hierarchical model is several orders of magnitude faster than pairwise approaches, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.
3 0.18221498 58 acl-2012-Coreference Semantics from Web Features
Author: Mohit Bansal ; Dan Klein
Abstract: To address semantic ambiguities in coreference resolution, we use Web n-gram features that capture a range of world knowledge in a diffuse but robust way. Specifically, we exploit short-distance cues to hypernymy, semantic compatibility, and semantic context, as well as general lexical co-occurrence. When added to a state-of-the-art coreference baseline, our Web features give significant gains on multiple datasets (ACE 2004 and ACE 2005) and metrics (MUC and B3), resulting in the best results reported to date for the end-to-end task of coreference resolution.
4 0.15126997 18 acl-2012-A Probabilistic Model for Canonicalizing Named Entity Mentions
Author: Dani Yogatama ; Yanchuan Sim ; Noah A. Smith
Abstract: We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, firstorder dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.
5 0.11016322 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
Author: Wei Lu ; Dan Roth
Abstract: This paper presents a novel sequence labeling model based on the latent-variable semiMarkov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), that allows arbitrary preference to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive to some baseline supervised approaches.
6 0.10425029 73 acl-2012-Discriminative Learning for Joint Template Filling
7 0.083805725 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
8 0.079779796 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
9 0.074953318 193 acl-2012-Text-level Discourse Parsing with Rich Linguistic Features
10 0.073194906 157 acl-2012-PDTB-style Discourse Annotation of Chinese Text
11 0.065856665 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
12 0.063608252 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
13 0.06242935 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
14 0.059058279 47 acl-2012-Chinese Comma Disambiguation for Discourse Analysis
15 0.055980474 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
16 0.054354928 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
17 0.052153978 187 acl-2012-Subgroup Detection in Ideological Discussions
18 0.050801713 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
19 0.050756793 115 acl-2012-Identifying High-Impact Sub-Structures for Convolution Kernels in Document-level Sentiment Classification
20 0.048965901 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
topicId topicWeight
[(0, -0.161), (1, 0.112), (2, -0.075), (3, 0.058), (4, 0.042), (5, 0.047), (6, -0.068), (7, 0.04), (8, 0.009), (9, 0.041), (10, 0.008), (11, -0.17), (12, -0.089), (13, -0.063), (14, 0.039), (15, 0.086), (16, 0.044), (17, 0.111), (18, -0.297), (19, -0.15), (20, -0.074), (21, 0.08), (22, 0.038), (23, 0.034), (24, 0.072), (25, 0.02), (26, -0.041), (27, 0.054), (28, -0.036), (29, -0.038), (30, -0.087), (31, 0.035), (32, 0.039), (33, -0.008), (34, 0.07), (35, 0.037), (36, -0.05), (37, -0.035), (38, 0.09), (39, -0.016), (40, -0.043), (41, -0.02), (42, 0.048), (43, 0.05), (44, 0.077), (45, -0.101), (46, 0.107), (47, -0.115), (48, 0.005), (49, -0.106)]
simIndex simValue paperId paperTitle
same-paper 1 0.92999721 50 acl-2012-Collective Classification for Fine-grained Information Status
Author: Katja Markert ; Yufang Hou ; Michael Strube
Abstract: Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying finegrained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity and solve the task by collectively classifying the information status ofall mentions. Our approach strongly outperforms reimplementations of previous work.
2 0.8279655 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Author: Michael Wick ; Sameer Singh ; Andrew McCallum
Abstract: Sameer Singh Andrew McCallum University of Massachusetts University of Massachusetts 140 Governor’s Drive 140 Governor’s Drive Amherst, MA Amherst, MA s ameer@ cs .umas s .edu mccal lum@ c s .umas s .edu Hamming” who authored “The unreasonable effectiveness of mathematics.” Features of the mentions Methods that measure compatibility between mention pairs are currently the dominant ap- proach to coreference. However, they suffer from a number of drawbacks including difficulties scaling to large numbers of mentions and limited representational power. As these drawbacks become increasingly restrictive, the need to replace the pairwise approaches with a more expressive, highly scalable alternative is becoming urgent. In this paper we propose a novel discriminative hierarchical model that recursively partitions entities into trees of latent sub-entities. These trees succinctly summarize the mentions providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales. We demonstrate that the hierarchical model is several orders of magnitude faster than pairwise, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.
3 0.82720035 58 acl-2012-Coreference Semantics from Web Features
Author: Mohit Bansal ; Dan Klein
Abstract: To address semantic ambiguities in coreference resolution, we use Web n-gram features that capture a range of world knowledge in a diffuse but robust way. Specifically, we exploit short-distance cues to hypernymy, semantic compatibility, and semantic context, as well as general lexical co-occurrence. When added to a state-of-the-art coreference baseline, our Web features give significant gains on multiple datasets (ACE 2004 and ACE 2005) and metrics (MUC and B3), resulting in the best results reported to date for the end-to-end task of coreference resolution.
4 0.63309199 18 acl-2012-A Probabilistic Model for Canonicalizing Named Entity Mentions
Author: Dani Yogatama ; Yanchuan Sim ; Noah A. Smith
Abstract: We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, firstorder dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.
5 0.43402562 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
Author: Nathan Schneider ; Behrang Mohit ; Kemal Oflazer ; Noah A. Smith
Abstract: “Lightweight” semantic annotation of text calls for a simple representation, ideally without requiring a semantic lexicon to achieve good coverage in the language and domain. In this paper, we repurpose WordNet’s supersense tags for annotation, developing specific guidelines for nominal expressions and applying them to Arabic Wikipedia articles in four topical domains. The resulting corpus has high coverage and was completed quickly with reasonable inter-annotator agreement.
6 0.40768382 73 acl-2012-Discriminative Learning for Joint Template Filling
7 0.39632979 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
8 0.386646 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
9 0.30800533 195 acl-2012-The Creation of a Corpus of English Metalanguage
10 0.29830337 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
11 0.28699049 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction
12 0.27098632 34 acl-2012-Automatically Learning Measures of Child Language Development
13 0.25628549 193 acl-2012-Text-level Discourse Parsing with Rich Linguistic Features
14 0.24518904 124 acl-2012-Joint Inference of Named Entity Recognition and Normalization for Tweets
15 0.24372971 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
16 0.24286333 200 acl-2012-Toward Automatically Assembling Hittite-Language Cuneiform Tablet Fragments into Larger Texts
17 0.24233942 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
18 0.2392637 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
19 0.23653844 47 acl-2012-Chinese Comma Disambiguation for Discourse Analysis
20 0.23148625 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
topicId topicWeight
[(25, 0.034), (26, 0.043), (28, 0.037), (30, 0.031), (31, 0.016), (37, 0.039), (39, 0.032), (74, 0.03), (82, 0.026), (84, 0.033), (85, 0.038), (90, 0.094), (92, 0.037), (94, 0.017), (96, 0.322), (98, 0.012), (99, 0.069)]
simIndex simValue paperId paperTitle
same-paper 1 0.74712551 50 acl-2012-Collective Classification for Fine-grained Information Status
Author: Katja Markert ; Yufang Hou ; Michael Strube
Abstract: Previous work on classifying information status (Nissim, 2006; Rahman and Ng, 2011) is restricted to coarse-grained classification and focuses on conversational dialogue. We here introduce the task of classifying fine-grained information status and work on written text. We add a fine-grained information status layer to the Wall Street Journal portion of the OntoNotes corpus. We claim that the information status of a mention depends not only on the mention itself but also on other mentions in the vicinity and solve the task by collectively classifying the information status of all mentions. Our approach strongly outperforms reimplementations of previous work.
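The core claim here — that a mention's information status depends on neighbouring mentions — can be illustrated with a minimal iterative collective classifier. The label set, the toy local rules, and the single relational feature (a repeated head noun forces "old") are all hypothetical simplifications for illustration, not the paper's model:

```python
PRONOUNS = {"he", "she", "it", "they", "we", "i", "you"}

def local_label(mention):
    """Toy local classifier: pronouns are 'old', definites 'mediated',
    everything else 'new' (invented rules, purely for illustration)."""
    text = mention["text"].lower()
    if text in PRONOUNS:
        return "old"
    if text.startswith("the "):
        return "mediated"
    return "new"

def collective_classify(mentions):
    """Relabel each mention using the other mentions' current state:
    a mention whose head noun was already introduced earlier in the
    document becomes 'old'. Iterate until no label changes."""
    labels = [local_label(m) for m in mentions]
    changed = True
    while changed:
        changed = False
        seen_heads = set()
        for i, m in enumerate(mentions):
            if m["head"] in seen_heads and labels[i] != "old":
                labels[i] = "old"
                changed = True
            seen_heads.add(m["head"])
    return labels

doc = [
    {"text": "a ruling", "head": "ruling"},
    {"text": "the Wall Street Journal", "head": "Journal"},
    {"text": "the ruling", "head": "ruling"},
]
print(collective_classify(doc))  # ['new', 'mediated', 'old']
```

Note how "the ruling" is classified locally as a definite ('mediated') but corrected to 'old' once the earlier mention with the same head is taken into account — the kind of interaction a purely local classifier misses.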
2 0.57291722 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
Author: Enrique Alfonseca ; Katja Filippova ; Jean-Yves Delort ; Guillermo Garrido
Abstract: We describe the use of a hierarchical topic model for automatically identifying syntactic and lexical patterns that explicitly state ontological relations. We leverage distant supervision using relations from the knowledge base FreeBase, but do not require any manual heuristic nor manual seed list selections. Results show that the learned patterns can be used to extract new relations with good precision.
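The distant-supervision step this abstract relies on — projecting a known knowledge-base pair into a sentence to harvest a lexical pattern — can be sketched in a few lines. The slot markers and the example seed triple are illustrative assumptions, not taken from the paper:

```python
import re

def extract_pattern(sentence, subj, obj):
    """Replace a known (subject, object) pair from the knowledge base
    with slot markers and keep the lexical material in between."""
    slotted = sentence.replace(subj, "[X]").replace(obj, "[Y]")
    m = re.search(r"\[X\](.*?)\[Y\]", slotted)
    return "[X]" + m.group(1) + "[Y]" if m else None

# FreeBase-style seed: (Paris, capital_of, France)
p = extract_pattern("Paris is the capital of France .", "Paris", "France")
print(p)  # '[X] is the capital of [Y]'
```

Patterns harvested this way are noisy, which is why the paper layers a hierarchical topic model on top to separate relation-bearing patterns from incidental co-occurrence.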
3 0.42460433 58 acl-2012-Coreference Semantics from Web Features
Author: Mohit Bansal ; Dan Klein
Abstract: To address semantic ambiguities in coreference resolution, we use Web n-gram features that capture a range of world knowledge in a diffuse but robust way. Specifically, we exploit short-distance cues to hypernymy, semantic compatibility, and semantic context, as well as general lexical co-occurrence. When added to a state-of-the-art coreference baseline, our Web features give significant gains on multiple datasets (ACE 2004 and ACE 2005) and metrics (MUC and B3), resulting in the best results reported to date for the end-to-end task of coreference resolution.
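One of the "short-distance cues to hypernymy" described here can be approximated by counting copular n-grams such as "X is a Y" and normalising by the mention's own frequency. The count table below is a made-up stand-in for a Web-scale n-gram corpus, and the scoring formula is an illustrative guess, not the paper's feature definition:

```python
# Hypothetical n-gram counts standing in for a Web-scale corpus.
NGRAM_COUNTS = {
    "apple is a company": 120,
    "apple is a fruit": 340,
    "microsoft is a company": 560,
    "microsoft": 100000,
    "apple": 150000,
}

def hypernymy_score(mention, candidate_class):
    """How often the (mock) Web attests 'mention is a candidate_class',
    relative to the mention's overall frequency."""
    hits = NGRAM_COUNTS.get(f"{mention} is a {candidate_class}", 0)
    total = NGRAM_COUNTS.get(mention, 0)
    return hits / total if total else 0.0

print(hypernymy_score("microsoft", "company"))  # 0.0056
```

Fed into a coreference model as a real-valued feature, a score like this lets the learner prefer linking "Microsoft" to "the company" over "the fruit" without any hand-built ontology.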
4 0.41719925 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
Author: Danilo Croce ; Alessandro Moschitti ; Roberto Basili ; Martha Palmer
Abstract: In this paper, we propose innovative representations for automatic classification of verbs according to mainstream linguistic theories, namely VerbNet and FrameNet. First, syntactic and semantic structures capturing essential lexical and syntactic properties of verbs are defined. Then, we design advanced similarity functions between such structures, i.e., semantic tree kernel functions, for exploiting distributional and grammatical information in Support Vector Machines. The extensive empirical analysis on VerbNet class and frame detection shows that our models capture meaningful syntactic/semantic structures, which allows for improving the state-of-the-art.
5 0.41623896 191 acl-2012-Temporally Anchored Relation Extraction
Author: Guillermo Garrido ; Anselmo Penas ; Bernardo Cabaleiro ; Alvaro Rodrigo
Abstract: Although much work on relation extraction has aimed at obtaining static facts, many of the target relations are actually fluents, as their validity is naturally anchored to a certain time period. This paper proposes a methodological approach to temporally anchored relation extraction. Our proposal performs distant supervised learning to extract a set of relations from a natural language corpus, and anchors each of them to an interval of temporal validity, aggregating evidence from documents supporting the relation. We use a rich graphbased document-level representation to generate novel features for this task. Results show that our implementation for temporal anchoring is able to achieve a 69% of the upper bound performance imposed by the relation extraction step. Compared to the state of the art, the overall system achieves the highest precision reported.
6 0.41389811 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
7 0.41075116 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence
8 0.41003895 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
9 0.4083727 29 acl-2012-Assessing the Effect of Inconsistent Assessors on Summarization Evaluation
10 0.40802583 219 acl-2012-langid.py: An Off-the-shelf Language Identification Tool
11 0.40579516 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification
12 0.40559292 52 acl-2012-Combining Coherence Models and Machine Translation Evaluation Metrics for Summarization Evaluation
13 0.40415466 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
14 0.4032197 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models
15 0.40279672 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
16 0.4024061 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
17 0.40171698 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
18 0.40168619 167 acl-2012-QuickView: NLP-based Tweet Search
19 0.40076295 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
20 0.4005909 187 acl-2012-Subgroup Detection in Ideological Discussions