acl acl2012 acl2012-73 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Einat Minkov ; Luke Zettlemoyer
Abstract: This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. Such an approach can, for example, learn likely event durations and the fact that start times should come before end times. While the joint inference space is large, we demonstrate effective learning with a Perceptron-style approach that uses simple, greedy beam decoding. Empirical results in two benchmark domains demonstrate consistently strong performance on both mention detection and template filling tasks.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. [sent-4, score-1.051]
2 The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. [sent-5, score-0.611]
3 Empirical results in two benchmark domains demonstrate consistently strong performance on both mention detection and template filling tasks. [sent-8, score-0.426]
4 Template filling is an IE task where the goal is to populate the fields of a target relation, for example to extract the attributes of a job posting (Califf and Mooney, 2003) or to recover the details of a corporate acquisition event from a news story (Freitag and McCallum, 2000). [sent-10, score-0.674]
5 For example, Figure 1 shows an extraction from the CMU seminar announcement corpus (Freitag and McCallum, 2000). [sent-12, score-0.47]
6 that describe field values, unify these mentions by grouping them according to target field, and normalize the results within each group to provide the final extractions. [sent-17, score-0.416]
7 In this paper, we present a joint modeling and learning approach for the combined tasks of mention detection, unification, and template filling, as described above. [sent-22, score-0.346]
8 We present a simple, feature-rich, discriminative model that readily incorporates a broad range of possible constraints on the mentions and joint field assignments. [sent-26, score-0.46]
9 We demonstrate empirically that these challenges can be solved with a combination of greedy beam decoding, performed directly in the joint space of possible mention clusters and field assignments, and a structured Perceptron-style learning algorithm (Collins, 2002). [sent-29, score-0.495]
10 We report experimental evaluations on two benchmark datasets in different genres, the CMU seminar announcements and corporate acquisitions (Freitag and McCallum, 2000). [sent-30, score-0.584]
11 In each case, we evaluated both template extraction and mention detection performance. [sent-31, score-0.401]
12 2 Related Work Research on the task of template filling has focused on the extraction of field value mentions from the underlying text. [sent-34, score-0.757]
13 There has been little effort towards a comprehensive approach that includes mention unification and that considers the structure of the target relational schema to create semantically valid outputs. [sent-36, score-0.516]
14 In their model, slot-filling entities are first generated, and entity mentions are then realized in text. [sent-38, score-0.328]
15 In this IE task, the target relation must agree with the entity types assigned to it; e. [sent-47, score-0.267]
16 Compared with the extraction of tuples of entity mention pairs, template filling is associated with a more complex target relational schema. [sent-52, score-1.213]
17 Interestingly, several researchers have attempted to model label consistency and high-level relational constraints using state-of-the-art sequential models of named entity recognition (NER). [sent-53, score-0.298]
18 (2005) further modelled high-level semantic constraints; for example, using the CMU seminar announcements dataset, spans labeled as start time or end time were required to be semantically consistent. [sent-57, score-0.581]
19 In the proposed framework we take a bottom-up approach to identifying entity mentions in text, where given a noisy set of candidate named entities, described using rich semantic and surface features, discriminative learning is applied to label these mentions. [sent-58, score-0.543]
20 We will show that this approach yields better performance on the CMU seminar announcement dataset when evaluated in terms of NER. [sent-59, score-0.436]
21 3 Problem Setting In the template filling task, a target relation r is provided, comprised of attributes (also referred to as fields). Figure 2: The relational schema for the seminars domain. [sent-61, score-0.883]
22 Figure 3: A record partially populated from text. [sent-62, score-0.215]
23 Given a document d, which is known to describe a tuple of the underlying relation, the goal is to populate the fields with values based on the text. [sent-64, score-0.366]
24 In this work, we describe domain knowledge through an extended relational database schema R. [sent-66, score-0.275]
25 In this schema, every field of the target relation maps to a tuple of another relation, giving rise to a hierarchical view of template filling. [sent-67, score-0.688]
26 Figure 2 describes a relational schema for the seminar announcement domain. [sent-68, score-0.662]
27 As shown, each field of the seminar relation maps to another relation; e. [sent-69, score-0.661]
28 , person) consist of a single attribute, whereas the date and time relations are characterised by multiple attributes; for example, the time relation includes the fields of hour, minutes and ampm. [sent-74, score-0.423]
29 The attributes of a relation may be defined as mandatory or optional. [sent-79, score-0.278]
30 , either day-of-month or day-of-week must be populated in the date relation. [sent-89, score-0.252]
31 Finally, a combination of field values may be required to be valid, e. [sent-90, score-0.278]
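To make the extended relational schema concrete, the sketch below encodes the seminar schema of Figure 2 in Python; the Relation class, the choice of mandatory attributes, and the validity check are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical, minimal encoding of an extended relational schema (cf. Figure 2).
from dataclasses import dataclass, field

@dataclass
class Relation:
    name: str
    attributes: dict                       # attribute name -> target relation / primitive type
    mandatory: set = field(default_factory=set)

# Parent-free relations map directly to named-entity mentions.
LOCATION = Relation("location", {"value": "string"})
PERSON   = Relation("person",   {"value": "string"})
TITLE    = Relation("title",    {"value": "string"})
DATE     = Relation("date",     {"day_of_week": "string", "day_of_month": "int",
                                 "month": "string", "year": "int"})
TIME     = Relation("time",     {"hour": "int", "minutes": "int", "ampm": "string"},
                    mandatory={"hour"})          # which attributes are mandatory is assumed
SEMINAR  = Relation("seminar",  {"date": "date", "stime": "time", "etime": "time",
                                 "location": "location", "speaker": "person", "title": "title"},
                    mandatory={"date", "stime"})  # assumed

def date_is_valid(t: dict) -> bool:
    # Example validity constraint: either day-of-month or day-of-week must be populated.
    return t.get("day_of_month") is not None or t.get("day_of_week") is not None
```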
32 This function checks whether two valid tuples v1 and v2 are inconsistent, ruling out any possible unification of these tuples. [sent-94, score-0.598]
33 In this work, we consider date and time tuples as contradictory if they contain semantically different values for some field; tuples of location, person and title are required to have minimal overlap in their string values to avoid contradiction. [sent-95, score-1.225]
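The contradiction test over two valid tuples can be sketched roughly as follows; representing tuples as attribute dicts and using token overlap as the "minimal string overlap" criterion are assumptions made for illustration.

```python
def contradicts(v1: dict, v2: dict, rel_name: str) -> bool:
    """Return True if two valid tuples of relation rel_name cannot be unified (sketch)."""
    if rel_name in ("date", "time"):
        # Semantically detailed relations: a shared attribute populated with
        # different values in the two tuples blocks unification.
        return any(v1[a] is not None and v2[a] is not None and v1[a] != v2[a]
                   for a in v1.keys() & v2.keys())
    # Textual relations (location, person, title): require minimal overlap
    # between the string values; token overlap is used here as a stand-in.
    tokens1 = set(v1["value"].lower().split())
    tokens2 = set(v2["value"].lower().split())
    return len(tokens1 & tokens2) == 0

# e.g., "Wean 5409" and "Wean Hall 5409" share tokens, so they do not contradict
# and can be unified into a single location tuple.
```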
34 Given document d, the hierarchical schema R is populated in a bottom-up fashion. [sent-97, score-0.35]
35 Generally, parent-free relations in the hierarchy correspond to generic entities, realized as entity mentions in the text. [sent-98, score-0.329]
36 In Figure 2, these relations are denoted by a double-line boundary, including location, person, title, date and time; every tuple of these relations maps to a named entity mention. [sent-99, score-0.558]
37 Figure 3 demonstrates the correct mapping of named entity mentions to tuples, as well as tuple unification, for the example shown in Figure 1. [sent-100, score-0.468]
38 For example, the mentions “Wean 5409” and “Wean Hall 5409” correspond to tuples of the location relation, where the two tuples are resolved into a unified set. [sent-101, score-1.161]
39 To complete template filling, the remaining relations of the schema are populated bottom-up, where each field links to a unified set of populated tuples. [sent-102, score-0.978]
40 Value normalization of the unified tuples is another component of template filling. [sent-105, score-0.507]
41 We partially address normalization: tuples of semantically detailed (multi-attribute) relations, e. [sent-106, score-0.504]
42 , date and time, are resolved into their semantic union, while textual tuples (e. [sent-108, score-0.564]
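A rough sketch of this partial normalization: tuples in a unified set of a multi-attribute relation are merged attribute-wise into their semantic union, assuming contradictions were already filtered out during unification.

```python
def normalize_union(unified_set: list) -> dict:
    """Merge a unified set of non-contradictory date/time tuples into their semantic union."""
    merged = {}
    for tup in unified_set:
        for attr, val in tup.items():
            if val is not None:
                merged[attr] = val   # safe: contradictory values were filtered earlier
    return merged

# e.g., [{"hour": 3, "minutes": 30, "ampm": None}, {"hour": 3, "minutes": None, "ampm": "pm"}]
# normalizes to {"hour": 3, "minutes": 30, "ampm": "pm"}.
```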
43 In this work, we assume that each template slot contains at most one value. [sent-111, score-0.208]
44 In the multi-attribute relations of date and time, each attribute maps to a text span, where the set of spans at tuple-level is required to be sequential (up to a small distance d). [sent-113, score-0.432]
45 A set of candidate mentions Sd(a) is extracted from document d for each attribute a of a relation r ∈ L, where L is the set of parent-free relations in T. [sent-123, score-0.585]
46 For each relation r ∈ L, valid candidate tuples Ed(r) are constructed from the candidate mentions that map to its attributes. [sent-128, score-0.825]
47 Importantly, however, the tuples within a candidate unification set are required to be non-contradictory. [sent-133, score-0.707]
48 In addition, the text spans that comprise the mentions within each set must not overlap. [sent-134, score-0.281]
49 Finally, we do not split tuples with identical string values between different sets. [sent-135, score-0.486]
50 To construct the space of candidate tuples of the target relation, the remaining relations r ∈ {T − L} are visited bottom-up, where each field a ∈ A(r) is mapped in turn to a (possibly unified) populated tuple of its type. [sent-137, score-0.943]
51 The valid (and non-overlapping) combinations of field mappings constitute a set of candidate tuples of r. [sent-138, score-0.744]
52 The candidate tuples generated using this procedure are structured entities, constructed using typed named entity recognition, unification, and hierarchical assignment of field values (Figure 3). [sent-139, score-0.982]
53 We will derive features that describe local and global properties of the candidate tuples, encoding both surface and semantic information. [sent-140, score-0.203]
54 Populate every low-level relation r ∈ L from text d: • Construct a set of candidate valid tuples Ed(r) given high-recall typed candidate text spans Sd(a), a ∈ A(r). [sent-144, score-0.813]
55 • Iterate bottom-up through relations r ∈ {T − L}: Initialize the set of candidate tuples Ed(r) [sent-147, score-0.611]
56 to an empty set. • Iterate through attributes a ∈ A(r): – Retrieve the set of candidate tuples (or unified tuple sets) Ed(r′), where r′ is the relation that attribute a links to in T. [sent-148, score-1.073]
57 – For every pair of candidate tuples e ∈ Ed(r) and tuple e′ ∈ Ed(r′). [sent-150, score-0.55]
58 Apply Equation 1 to rank the partially filled candidate tuples e ∈ Ed(r). [sent-152, score-0.55]
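A possible rendering of this bottom-up beam procedure is sketched below; `build_valid_tuples`, `is_valid`, and the schema interface are hypothetical placeholders, and `score` stands in for the linear model of Equation 1.

```python
import heapq

def top_k(candidates, score, k):
    """Keep the k highest-scoring candidates (the beam)."""
    return heapq.nlargest(k, candidates, key=score)

def beam_decode(doc, schema, score, k=10):
    """Greedy bottom-up beam search over candidate tuples (illustrative sketch only).

    `build_valid_tuples` and `is_valid` are assumed helpers that enforce
    non-contradiction, non-overlap, and schema validity.
    """
    candidates = {}
    # 1. Populate every parent-free relation from high-recall typed candidate spans.
    for rel in schema.low_level_relations:
        tuples = build_valid_tuples(doc, rel)
        candidates[rel.name] = top_k(tuples, score, k)
    # 2. Visit the remaining relations bottom-up, filling one field at a time.
    for rel in schema.bottom_up_order():              # relations in T - L
        beam = [{}]                                   # start from an empty partial tuple
        for attr, child_rel in rel.attributes.items():
            extended = [dict(partial, **{attr: child})
                        for partial in beam
                        for child in candidates[child_rel]
                        if is_valid(partial, attr, child)]
            beam = top_k(extended, score, k)          # rank partially filled tuples (Equation 1)
        candidates[rel.name] = beam
    return candidates[schema.target_relation]
```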
59 , m, are pre-defined feature functions describing a candidate record y of the target relation given document d and the extended schema T. [sent-158, score-0.532]
60 Given α¯, if the top-scoring candidate is different from the known correct mapping, then: (i) α¯ is incremented with the feature vector of the correct candidate, and (ii) the feature vector of the top-scoring candidate is subtracted from α¯. [sent-162, score-0.285]
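The Perceptron-style update just described corresponds roughly to the following sketch; the feature and decoding interfaces are assumptions carried over from the decoding sketch above, not the paper's actual code.

```python
import numpy as np

def perceptron_train(train_data, feature_fn, decode_fn, dim, num_iters=10):
    """Structured perceptron over joint template candidates (rough sketch).

    feature_fn(doc, record) -> np.ndarray and decode_fn(doc, weights) -> record
    are assumed interfaces.
    """
    weights = np.zeros(dim)
    for _ in range(num_iters):
        for doc, gold_record in train_data:
            predicted = decode_fn(doc, weights)           # greedy beam decoding
            if predicted != gold_record:
                weights += feature_fn(doc, gold_record)   # promote the correct mapping
                weights -= feature_fn(doc, predicted)     # demote the top-scoring candidate
    return weights
```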
61 This means that rather than instantiate the full space of valid candidate records (Section 4. [sent-168, score-0.213]
62 As detailed, only a set of top scoring tuples of size k (beam size) is maintained per relation r ∈ T during candidate generation. [sent-171, score-0.737]
63 An advantage of the proposed approach is that rather than output a single prediction, a list of coherent candidate tuples may be generated, ranked according to Equation 1. [sent-175, score-0.55]
64 5 Seminar Extraction Task Dataset The CMU seminar announcement dataset (Freitag and McCallum, 2000) includes 485 emails containing seminar announcements. [sent-176, score-0.754]
65 We consider this corpus as an example of semi-structured text, where some of the field values appear in the email header, in a tabular structure, or using special formatting (Califf and Mooney, 1999; Minkov et al. [sent-179, score-0.354]
66 We used a set of rules to extract candidate named entities per the types specified in Figure 2. [sent-181, score-0.289]
67 Recall for the named entities of type date and time is near perfect, and is estimated at 96%, 91% and 90% for location, speaker and title, respectively. [sent-192, score-0.266]
68 , whether the spans assigned to the location field include words typical of location, such as “room” or “hall”. [sent-206, score-0.365]
69 We propose a set of features that model correspondence between the text spans assigned to each field and document structure. [sent-215, score-0.387]
70 Specifically, these features model whether at least one of the spans mapped to each field appears in the email header; captures a full line in the document; is indented; appears within spaced lines; or appears in a tabular format. [sent-216, score-0.522]
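These document-structure cues might be computed along the following lines, assuming a simple line-based representation of the email; interpreting "appears within spaced lines" as being surrounded by blank lines is my reading, and the tabular heuristic is invented for the example.

```python
import re

def structural_features(doc_lines, header_end, line_index, span_text):
    """Binary document-structure features for one mention span (illustrative sketch)."""
    line = doc_lines[line_index]
    prev_blank = line_index == 0 or not doc_lines[line_index - 1].strip()
    next_blank = line_index + 1 >= len(doc_lines) or not doc_lines[line_index + 1].strip()
    return {
        "in_header": line_index < header_end,                    # span appears in the email header
        "full_line": line.strip() == span_text.strip(),          # span captures a full line
        "indented":  line.startswith((" ", "\t")),
        "spaced":    prev_blank and next_blank,                  # surrounded by blank lines (assumed reading)
        "tabular":   bool(re.search(r" {2,}|\t", line.strip())), # crude multi-column heuristic
    }
```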
71 These features refer to the semantic interpretation of field values. [sent-222, score-0.279]
72 According to the relational schema (Figure 2), date and time include detailed attributes, whereas other relations are represented as strings. [sent-223, score-0.468]
73 Another feature encodes the size of the most semantically detailed named entity that maps to a field; for example, the most detailed entity mention of type stime in Figure 1 is “3:30”, comprising two attribute values, namely hour and minutes. [sent-278, score-0.7]
74 These features were designed to favor semantically detailed mentions and unified sets. [sent-280, score-0.36]
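The semantic-detail feature could be counted roughly as below, with mentions represented as attribute dicts (an assumption); e.g., the stime mention "3:30" populates two attributes (hour and minutes).

```python
def semantic_detail_feature(field_mentions):
    """Size of the most semantically detailed mention mapped to a field (sketch)."""
    return max((sum(v is not None for v in mention.values()) for mention in field_mentions),
               default=0)
```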
75 We have experimented with features that encode the shortest distance between named entity mentions mapping to different fields (measured in terms of separating lines or sentences), based on the hypothesis that field values typically co-appear in the same segments of the document. [sent-283, score-0.764]
76 Experiments We conducted 5-fold cross validation experiments using the seminar extraction dataset. [sent-286, score-0.401]
77 As discussed earlier, we assume that a single record is described in each document, and that each field corresponds to a single value. [sent-287, score-0.251]
78 In evaluating the template filling task, only exact matches are accepted as true positives, while partial matches are counted as errors (Siefkes, 2008). [sent-289, score-0.314]
79 In another experiment we therefore mimic the typical scenario of template filling, in which the value of the highest scoring named entity is assigned to each field. [sent-299, score-0.404]
80 Finally, we experimented with populating every field of the target schema independently of the other fields. [sent-303, score-0.411]
81 This is largely due to erroneous assignments of named entities of other types (mainly, person) as titles; such errors are avoided in the full joint model, where tuple validity is enforced. [sent-305, score-0.383]
82 The best results per field are marked in boldface. [sent-326, score-0.223]
83 While a variety of methods have been applied in previous works, none has modeled template filling in a joint fashion. [sent-328, score-0.377]
84 (2005) applied sequential models to perform NER on this dataset, identifying named entities that pertain to the template slots. [sent-331, score-0.313]
85 We believe that these results demonstrate the benefit of performing mention recognition as part of a joint model that takes into account detailed semantics of the underlying relational schema, when available. [sent-336, score-0.327]
86 Importantly, an advantage of the proposed approach ... Figure 4: The relational schema for acquisitions. [sent-342, score-0.275]
87 6 Corporate Acquisitions Dataset The corporate acquisitions corpus contains 600 newswire articles, describing factual or potential corporate acquisition events. [sent-348, score-0.369]
88 The corpus has been annotated with the official names of the parties to an acquisition: acquired, purchaser and seller, as well as their corresponding abbreviated names and company codes. [sent-349, score-0.239]
89 We describe the target schema using the relational structure depicted in Figure 4. [sent-350, score-0.331]
90 The schema includes two relations: the corp relation describes a corporate entity, including its full name, abbreviated name and code as attributes; the target acquisition relation includes three role-designating attributes, each linked to a corp tuple. [sent-351, score-0.963]
91 Semantic features are applied to corp tuples: we model whether the abbreviated name is a subset of the full name; whether the corporate code forms exact initials of the full or abbreviated names; or whether it has high string similarity to any of these values. [sent-431, score-0.578]
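The corp-tuple semantic features (abbreviation containment, code-as-initials, string similarity) might be realized roughly as follows; the similarity measure and its threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def corp_semantic_features(full_name, abbr_name, code):
    """Semantic cohesiveness features for a candidate corp tuple (illustrative sketch)."""
    initials_full = "".join(w[0] for w in full_name.split()).lower()
    initials_abbr = "".join(w[0] for w in abbr_name.split()).lower()
    similarity = max(SequenceMatcher(None, code.lower(), s.lower()).ratio()
                     for s in (full_name, abbr_name))
    return {
        "abbr_in_full":  abbr_name.lower() in full_name.lower(),
        "code_initials": code.lower() in (initials_full, initials_abbr),
        "code_similar":  similarity > 0.5,   # threshold chosen for illustration
    }
```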
92 Finally, cross-type features encode the shortest distance between spans mapping to different roles in the acquisition relation. [sent-432, score-0.219]
93 Experiments We applied beam search, where corp tuples are extracted first, and acquisition tuples are constructed using the top scoring corp entities. [sent-433, score-1.286]
94 Interestingly, the performance of our model on the code fields is high; these fields do not involve boundary prediction, and thus reflect the quality of role assignment. [sent-440, score-0.275]
95 In contrast, the semantic features, which account for the semantic cohesiveness of the populated corp tuples, are shown to be necessary. [sent-443, score-0.329]
96 Removing them degrades the extraction of the abbreviated names; these features allow prediction of abbreviated names jointly with the full corporate names, which are more regular (e. [sent-445, score-0.558]
97 Table 5 further shows results on NER, the task of recovering the sets of named entity mentions pertaining to each target field. [sent-450, score-0.406]
98 These results are consistent with the case study of seminar extraction. [sent-452, score-0.318]
99 7 Summary and Future Work We presented a joint approach for template filling that models mention detection, unification, and field extraction in a flexible, feature-rich model. [sent-453, score-0.766]
100 Extracting personal names from emails: Applying named entity recognition to informal text. [sent-522, score-0.236]
wordName wordTfidf (topN-words)
[('tuples', 0.432), ('seminar', 0.318), ('field', 0.194), ('template', 0.171), ('mentions', 0.166), ('schema', 0.161), ('populated', 0.158), ('filling', 0.143), ('attributes', 0.129), ('unification', 0.127), ('beam', 0.126), ('corporate', 0.126), ('siefkes', 0.119), ('fields', 0.119), ('tuple', 0.118), ('candidate', 0.118), ('spans', 0.115), ('relational', 0.114), ('mention', 0.112), ('relation', 0.109), ('minkov', 0.103), ('entity', 0.102), ('freitag', 0.099), ('corp', 0.095), ('abbreviated', 0.095), ('date', 0.094), ('attribute', 0.092), ('ed', 0.091), ('extraction', 0.083), ('named', 0.082), ('announcements', 0.08), ('califf', 0.08), ('wean', 0.08), ('unified', 0.075), ('announcement', 0.069), ('sutton', 0.069), ('email', 0.066), ('finkel', 0.066), ('joint', 0.063), ('title', 0.063), ('relations', 0.061), ('tie', 0.061), ('entities', 0.06), ('full', 0.06), ('acquisitions', 0.06), ('datestimeetimelocationspeakertitle', 0.06), ('stime', 0.06), ('acquisition', 0.057), ('record', 0.057), ('target', 0.056), ('location', 0.056), ('haghighi', 0.054), ('values', 0.054), ('einat', 0.052), ('header', 0.052), ('names', 0.052), ('ner', 0.051), ('cmu', 0.051), ('dataset', 0.049), ('scoring', 0.049), ('features', 0.047), ('mccallum', 0.045), ('populate', 0.044), ('roth', 0.041), ('maps', 0.04), ('yih', 0.04), ('characterised', 0.04), ('elaine', 0.04), ('etime', 0.04), ('mandatory', 0.04), ('oatft', 0.04), ('perceptronstyle', 0.04), ('peshkin', 0.04), ('purchaser', 0.04), ('purnamepurabrpurcodeacqnameacqabracqcodesellnamesellabrsellcode', 0.04), ('tabular', 0.04), ('valid', 0.039), ('detailed', 0.038), ('batch', 0.038), ('semantic', 0.038), ('boundary', 0.037), ('ie', 0.037), ('discriminative', 0.037), ('slot', 0.037), ('detection', 0.035), ('haifa', 0.035), ('records', 0.035), ('semantically', 0.034), ('equation', 0.033), ('ijcai', 0.032), ('person', 0.032), ('interdependencies', 0.032), ('freund', 0.032), ('seller', 0.032), ('collins', 0.031), ('document', 0.031), ('required', 0.03), ('structural', 0.03), ('speaker', 0.03), ('per', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 73 acl-2012-Discriminative Learning for Joint Template Filling
Author: Einat Minkov ; Luke Zettlemoyer
Abstract: This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. Such an approach can, for example, learn likely event durations and the fact that start times should come before end times. While the joint inference space is large, we demonstrate effective learning with a Perceptron-style approach that uses simple, greedy beam decoding. Empirical results in two benchmark domains demonstrate consistently strong performance on both mention de- tection and template filling tasks.
2 0.17123471 18 acl-2012-A Probabilistic Model for Canonicalizing Named Entity Mentions
Author: Dani Yogatama ; Yanchuan Sim ; Noah A. Smith
Abstract: We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, firstorder dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.
3 0.15714863 57 acl-2012-Concept-to-text Generation via Discriminative Reranking
Author: Ioannis Konstas ; Mirella Lapata
Abstract: This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem. We define a probabilistic context-free grammar that describes the structure of the input (a corpus of database records and text describing some of them) and represent it compactly as a weighted hypergraph. The hypergraph structure encodes exponentially many derivations, which we rerank discriminatively using local and global features. We propose a novel decoding algorithm for finding the best scoring derivation and generating in this setting. Experimental evaluation on the ATIS domain shows that our model outperforms a competitive discriminative system both using BLEU and in a judgment elicitation study.
4 0.15702167 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Author: Michael Wick ; Sameer Singh ; Andrew McCallum
Abstract: Methods that measure compatibility between mention pairs are currently the dominant approach to coreference. However, they suffer from a number of drawbacks including difficulties scaling to large numbers of mentions and limited representational power. As these drawbacks become increasingly restrictive, the need to replace the pairwise approaches with a more expressive, highly scalable alternative is becoming urgent. In this paper we propose a novel discriminative hierarchical model that recursively partitions entities into trees of latent sub-entities. These trees succinctly summarize the mentions providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales. We demonstrate that the hierarchical model is several orders of magnitude faster than pairwise, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.
5 0.15240978 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
Author: Wei Lu ; Dan Roth
Abstract: This paper presents a novel sequence labeling model based on the latent-variable semiMarkov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), that allows arbitrary preference to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive to some baseline supervised approaches.
6 0.1465584 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
7 0.14087869 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
8 0.12698989 191 acl-2012-Temporally Anchored Relation Extraction
9 0.10926953 150 acl-2012-Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
10 0.10425029 50 acl-2012-Collective Classification for Fine-grained Information Status
11 0.097570777 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
12 0.095718391 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
13 0.095372267 153 acl-2012-Named Entity Disambiguation in Streaming Data
14 0.094732068 124 acl-2012-Joint Inference of Named Entity Recognition and Normalization for Tweets
15 0.088537723 186 acl-2012-Structuring E-Commerce Inventory
16 0.084435776 58 acl-2012-Coreference Semantics from Web Features
17 0.081215598 6 acl-2012-A Comprehensive Gold Standard for the Enron Organizational Hierarchy
18 0.081024371 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
19 0.079644389 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
20 0.075973906 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
topicId topicWeight
[(0, -0.232), (1, 0.133), (2, -0.081), (3, 0.152), (4, 0.059), (5, 0.058), (6, -0.058), (7, 0.044), (8, 0.079), (9, -0.056), (10, 0.056), (11, -0.128), (12, -0.143), (13, -0.09), (14, 0.069), (15, 0.094), (16, 0.002), (17, 0.093), (18, -0.117), (19, -0.076), (20, -0.016), (21, 0.002), (22, -0.025), (23, 0.013), (24, -0.001), (25, 0.043), (26, -0.011), (27, -0.026), (28, -0.058), (29, -0.065), (30, 0.027), (31, 0.022), (32, 0.048), (33, 0.04), (34, -0.08), (35, 0.016), (36, 0.008), (37, 0.1), (38, -0.008), (39, -0.002), (40, 0.027), (41, -0.025), (42, -0.093), (43, -0.062), (44, -0.077), (45, 0.072), (46, -0.053), (47, -0.006), (48, 0.034), (49, 0.178)]
simIndex simValue paperId paperTitle
same-paper 1 0.95093054 73 acl-2012-Discriminative Learning for Joint Template Filling
Author: Einat Minkov ; Luke Zettlemoyer
Abstract: This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. Such an approach can, for example, learn likely event durations and the fact that start times should come before end times. While the joint inference space is large, we demonstrate effective learning with a Perceptron-style approach that uses simple, greedy beam decoding. Empirical results in two benchmark domains demonstrate consistently strong performance on both mention de- tection and template filling tasks.
2 0.74286377 18 acl-2012-A Probabilistic Model for Canonicalizing Named Entity Mentions
Author: Dani Yogatama ; Yanchuan Sim ; Noah A. Smith
Abstract: We present a statistical model for canonicalizing named entity mentions into a table whose rows represent entities and whose columns are attributes (or parts of attributes). The model is novel in that it incorporates entity context, surface features, firstorder dependencies among attribute-parts, and a notion of noise. Transductive learning from a few seeds and a collection of mention tokens combines Bayesian inference and conditional estimation. We evaluate our model and its components on two datasets collected from political blogs and sports news, finding that it outperforms a simple agglomerative clustering approach and previous work.
3 0.62367105 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
Author: Michael Wick ; Sameer Singh ; Andrew McCallum
Abstract: Methods that measure compatibility between mention pairs are currently the dominant approach to coreference. However, they suffer from a number of drawbacks including difficulties scaling to large numbers of mentions and limited representational power. As these drawbacks become increasingly restrictive, the need to replace the pairwise approaches with a more expressive, highly scalable alternative is becoming urgent. In this paper we propose a novel discriminative hierarchical model that recursively partitions entities into trees of latent sub-entities. These trees succinctly summarize the mentions providing a highly compact, information-rich structure for reasoning about entities and coreference uncertainty at massive scales. We demonstrate that the hierarchical model is several orders of magnitude faster than pairwise, allowing us to perform coreference on six million author mentions in under four hours on a single CPU.
4 0.60871351 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum
Abstract: To discover relation types from text, most methods cluster shallow or syntactic patterns of relation mentions, but consider only one possible sense per pattern. In practice this assumption is often violated. In this paper we overcome this issue by inducing clusters of pattern senses from feature representations of patterns. In particular, we employ a topic model to partition entity pairs associated with patterns into sense clusters using local and global features. We merge these sense clusters into semantic relations using hierarchical agglomerative clustering. We compare against several baselines: a generative latent-variable model, a clustering method that does not disambiguate between path senses, and our own approach but with only local features. Experimental results show our proposed approach discovers dramatically more accurate clusters than models without sense disambiguation, and that incorporating global features, such as the document theme, is crucial.
5 0.59769011 57 acl-2012-Concept-to-text Generation via Discriminative Reranking
Author: Ioannis Konstas ; Mirella Lapata
Abstract: This paper proposes a data-driven method for concept-to-text generation, the task of automatically producing textual output from non-linguistic input. A key insight in our approach is to reduce the tasks of content selection (“what to say”) and surface realization (“how to say”) into a common parsing problem. We define a probabilistic context-free grammar that describes the structure of the input (a corpus of database records and text describing some of them) and represent it compactly as a weighted hypergraph. The hypergraph structure encodes exponentially many derivations, which we rerank discriminatively using local and global features. We propose a novel decoding algorithm for finding the best scoring derivation and generating in this setting. Experimental evaluation on the ATIS domain shows that our model outperforms a competitive discriminative system both using BLEU and in a judgment elicitation study.
6 0.58788657 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
7 0.58722478 153 acl-2012-Named Entity Disambiguation in Streaming Data
8 0.58279687 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
9 0.54558921 50 acl-2012-Collective Classification for Fine-grained Information Status
10 0.53646386 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
11 0.52506906 124 acl-2012-Joint Inference of Named Entity Recognition and Normalization for Tweets
12 0.51654989 58 acl-2012-Coreference Semantics from Web Features
13 0.5150727 186 acl-2012-Structuring E-Commerce Inventory
14 0.49099517 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
15 0.47563207 43 acl-2012-Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench
16 0.43942642 150 acl-2012-Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
17 0.41738915 191 acl-2012-Temporally Anchored Relation Extraction
18 0.40862429 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
19 0.40607035 6 acl-2012-A Comprehensive Gold Standard for the Enron Organizational Hierarchy
20 0.40362161 75 acl-2012-Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
topicId topicWeight
[(23, 0.232), (25, 0.02), (26, 0.05), (28, 0.035), (30, 0.03), (37, 0.026), (39, 0.084), (74, 0.029), (82, 0.051), (84, 0.032), (85, 0.023), (90, 0.157), (92, 0.059), (94, 0.023), (99, 0.066)]
simIndex simValue paperId paperTitle
1 0.92813784 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
Author: Shay B. Cohen ; Karl Stratos ; Michael Collins ; Dean P. Foster ; Lyle Ungar
Abstract: We introduce a spectral learning algorithm for latent-variable PCFGs (Petrov et al., 2006). Under a separability (singular value) condition, we prove that the method provides consistent parameter estimates.
same-paper 2 0.80326837 73 acl-2012-Discriminative Learning for Joint Template Filling
Author: Einat Minkov ; Luke Zettlemoyer
Abstract: This paper presents a joint model for template filling, where the goal is to automatically specify the fields of target relations such as seminar announcements or corporate acquisition events. The approach models mention detection, unification and field extraction in a flexible, feature-rich model that allows for joint modeling of interdependencies at all levels and across fields. Such an approach can, for example, learn likely event durations and the fact that start times should come before end times. While the joint inference space is large, we demonstrate effective learning with a Perceptron-style approach that uses simple, greedy beam decoding. Empirical results in two benchmark domains demonstrate consistently strong performance on both mention de- tection and template filling tasks.
3 0.67458546 187 acl-2012-Subgroup Detection in Ideological Discussions
Author: Amjad Abu-Jbara ; Pradeep Dasigi ; Mona Diab ; Dragomir Radev
Abstract: The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.
4 0.66915447 191 acl-2012-Temporally Anchored Relation Extraction
Author: Guillermo Garrido ; Anselmo Penas ; Bernardo Cabaleiro ; Alvaro Rodrigo
Abstract: Although much work on relation extraction has aimed at obtaining static facts, many of the target relations are actually fluents, as their validity is naturally anchored to a certain time period. This paper proposes a methodological approach to temporally anchored relation extraction. Our proposal performs distant supervised learning to extract a set of relations from a natural language corpus, and anchors each of them to an interval of temporal validity, aggregating evidence from documents supporting the relation. We use a rich graphbased document-level representation to generate novel features for this task. Results show that our implementation for temporal anchoring is able to achieve a 69% of the upper bound performance imposed by the relation extraction step. Compared to the state of the art, the overall system achieves the highest precision reported.
Author: Pradeep Dasigi ; Weiwei Guo ; Mona Diab
Abstract: We describe an unsupervised approach to the problem of automatically detecting subgroups of people holding similar opinions in a discussion thread. An intuitive way of identifying this is to detect the attitudes of discussants towards each other or named entities or topics mentioned in the discussion. Sentiment tags play an important role in this detection, but we also note another dimension to the detection of people’s attitudes in a discussion: if two persons share the same opinion, they tend to use similar language content. We consider the latter to be an implicit attitude. In this paper, we investigate the impact of implicit and explicit attitude in two genres of social media discussion data, more formal wikipedia discussions and a debate discussion forum that is much more informal. Experimental results strongly suggest that implicit attitude is an important complement for explicit attitudes (expressed via sentiment) and it can improve the sub-group detection performance independent of genre.
6 0.66165596 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
7 0.65523219 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
8 0.65490878 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
9 0.65389287 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
10 0.64960396 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
11 0.64913988 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
12 0.64907575 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
13 0.64804208 167 acl-2012-QuickView: NLP-based Tweet Search
14 0.647246 142 acl-2012-Mining Entity Types from Query Logs via User Intent Modeling
15 0.64617372 182 acl-2012-Spice it up? Mining Refinements to Online Instructions from User Generated Content
16 0.64456582 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
17 0.64346528 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification
18 0.64319909 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models
19 0.64264101 37 acl-2012-Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
20 0.64253068 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places