emnlp emnlp2010 emnlp2010-20 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Apoorv Agarwal ; Owen Rambow
Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper we introduce the new task of social event extraction from text. [sent-6, score-0.812]
2 We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. [sent-7, score-1.065]
3 A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. [sent-10, score-0.364]
4 We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. [sent-11, score-1.086]
5 1 Introduction This paper introduces a novel natural language processing (NLP) task, social event extraction. [sent-14, score-0.763]
6 We are interested in this task because it contributes to our overall research goal, which is to extract a social network from written text. [sent-15, score-0.451]
7 The extracted social network can be used for various applications such as summarization, question-answering, or the detection of main characters in a story. [sent-16, score-0.524]
8 For example, we manually extracted the social network of characters in [sent-17, score-0.451]
9 Alice in Wonderland and ran standard social network analysis algorithms on the network. [sent-22, score-0.451]
10 Moreover, characters occurring in a scene together were given the same social roles and positions. [sent-24, score-0.392]
11 We take a “social network” to be a network consisting of individual human beings and groups of human beings who are connected to each other by virtue of participating in social events. [sent-27, score-0.496]
12 We define social events to be events that occur between people where at least one person is aware of the other and of the event taking place. [sent-28, score-1.25]
13 In the sentence John thinks Mary is great, only John is aware of Mary and the event is the thinking event. [sent-30, score-0.439]
14 A text can describe a social network in two ways: explicitly, by stating the type of relationship between two individuals (e. [sent-32, score-0.451]
15 husband-wife), or implicitly, by describing an event which creates or perpetuates a social relationship (e. [sent-34, score-0.763]
16 We will call these types of events social events. [sent-37, score-0.573]
17 We define two types of social events: interaction, in which both parties are aware of the social event (e. [sent-38, score-1.255]
18 This paper is the first attempt to detect and classify social events present in text. [sent-46, score-0.573]
19 Our task is different from related tasks, notably from the Automated Content Extraction (ACE) relation and event extraction tasks because the events are different (they are a class of events defined through the effect on participants’ cognitive state), and the linguistic realization is different. [sent-47, score-1.001]
20 Mentions of entities engaged in a social event are often quite distant from each other in the sentence (unlike in ACE relations where about 70% of relations are local, in our social event annotation, only 25% of the events are local. [sent-48, score-1.855]
21 In fact, the average number of words between entities participating in any social event is 9. [sent-49, score-0.946]
22 ) We use tree kernel methods (on structures derived from phrase structure trees and dependency trees) in conjunction with Support Vector Machines (SVMs) to solve our tasks. [sent-50, score-0.528]
23 Data skewness turns out to be a big challenge for the task of relation detection since there are many more pairs of entities without a relation as compared to pairs of entities that have a relation. [sent-53, score-0.663]
24 Moreover, we introduce a new sequence kernel that outperforms previously proposed sequence kernels for the task of social event detection and plays a role in achieving the best performing system for the tasks of social event detection and classification. [sent-55, score-2.133]
25 We then discuss kernel methods and the structures we use, and introduce our new structure in Section 4. [sent-60, score-0.323]
26 experiments and results for social event detection and social event classification tasks. [sent-64, score-1.652]
27 2 Literature Survey There has not been much work in developing techniques for ACE event extraction as compared to ACE relation extraction. [sent-66, score-0.553]
28 The most salient work for event extraction is Grishman et al. [sent-67, score-0.42]
29 To solve the task for event extraction, Grishman et al. [sent-69, score-0.371]
30 They extract two kinds of patterns: 1) the sequence of constituent heads separating anchor and its arguments and 2) a predicate argument subgraph of the sentence connecting anchor to all the event arguments. [sent-71, score-0.433]
31 The structures we use for kernel methods are a super-set of the patterns used by Grishman et al. [sent-76, score-0.289]
32 Moreover, in our work, we take gold annotation for entity mentions, and do not deal with the task of named entity detection or resolution. [sent-78, score-0.294]
33 Finally, our social events are a broad class of event types, and they involve linguistic expressions for expressing interactions and cognition that do not seem to have a correlation with the topics of documents. [sent-79, score-1.038]
34 The supervised approaches used for relation extraction can broadly be divided into three main categories: 1) feature-based approaches, 2) kernel-based approaches, and 3) a combination of feature- and kernel-based approaches. [sent-81, score-0.351]
35 Collins and Duffy (2002) are among the earliest researchers to propose the use of tree kernels for various NLP tasks. [sent-92, score-0.305]
36 Since then kernels have been used for the task of relation extraction (Zelenko et al. [sent-93, score-0.412]
37 Apart from using kernels over dependency trees, Culotta and Sorensen (2004) incorporate features like words, part of speech (POS) tags, syntactic chunk tag, entity type, entity level, relation argument and WordNet hypernym. [sent-102, score-0.615]
38 We discuss their structures and kernel method in detail in Section 4. [sent-114, score-0.289]
39 We leverage this work by annotating social events on the English part of ACE 2005 Multilingual Training Data that has already been annotated for entities, relations and events. [sent-118, score-0.647]
40 (2010), we introduce a comprehensive set of social events which are conceptually different from the event annotation that already exists for ACE. [sent-120, score-0.977]
41 Our annotation scheme is reliable, achieving a moderate kappa for relation detection (0. [sent-123, score-0.3]
42 Following are the two broad types of social events that were annotated: Interaction event (INR): When both entities participating in an event are aware of each other and of the social event, we say they have an INR relation. [sent-129, score-1.958]
43 INR As is intuitive, if one person informs the other about something, both have to be cognizant of each other and of the informing event in which they are both participating. [sent-132, score-0.476]
44 Observation event (OBS): When only one person (out of the two people that are participating in an event) is aware of the other and of the social event, we say they have an OBS relation. [sent-133, score-0.933]
45 PPR requires that one entity can observe the other entity in real time not through a broadcast medium, in contrast to the subtype PCR, where one entity observes the other through media (TV, radio, magazines etc. [sent-135, score-0.314]
46 ) Any other observation event that is not PPR or PCR 2Version: 6. [sent-136, score-0.371]
47 In this sentence, the event said marks a COG relation between Toujan Faisal and the committee. [sent-139, score-0.545]
48 2 Comparison Between Social Events and ACE Annotations The ACE effort is about entity, relation and event annotation. [sent-145, score-0.504]
49 Our event annotations are different from ACE event annotations because we annotate text that expresses the cognitive states of the people involved, or allows the annotator to infer it. [sent-149, score-0.851]
50 Therefore, at the top level of classification we differentiate between events in which only one entity is cognizant of the other (observation) versus events when both entities are cognizant of each other (interaction). [sent-150, score-0.743]
51 This distinction is, we believe, novel in event or relation annotation. [sent-151, score-0.504]
52 Now we present statistics and examples to make clear how our annotations are different from ACE event annotations. [sent-152, score-0.44]
53 These files contain a total of 212 social events. [sent-154, score-0.425]
54 We found a total of 63 candidate ACE events that had at least two Person entities involved. [sent-155, score-0.319]
55 The majority of social events that match the ACE events are of type INR. [sent-157, score-0.754]
56 On analysis, we found that most of these correspond to the ACE event type CONTACT. [sent-158, score-0.371]
57 Specifically, the “meeting” event, which is an ACE CONTACT event and an INR event according to our definition, is the major cause of overlap. [sent-159, score-0.742]
58 For example, in Example 1, we recorded an INR event between Toujan Faisal and committee (event span: informed). [sent-161, score-0.421]
59 ACE does not record any event between these two entities because informed does not entail a CONTACT event for ACE event annotations. [sent-162, score-1.308]
60 Being an event that has two person entities involved makes the above sentence a potential social event. [sent-164, score-0.958]
61 However, we do not record any event between these entities since the text does not reveal the cognitive states of the two entities; we do not know whether one was aware of the other. [sent-165, score-0.618]
62 ACE defines a class of social relations (PERSOC) that records named relations like friendship, co-worker, long lasting etc. [sent-166, score-0.585]
63 Therefore, even though these relations are directly relevant to our overall goal of social event extraction, we do not annotate, detect or classify these relations in this paper. [sent-168, score-0.911]
64 4 Tree Kernels, Discrete Structures, and Language In this section, we give details of the structures and kernel we use for our classification tasks. [sent-169, score-0.342]
65 Therefore, we use convolution kernels with a linear learning machine (Support Vector Machines) for our classification task. [sent-185, score-0.368]
66 Now we present the “discrete” structures followed by the kernel we used. [sent-186, score-0.289]
67 In Figure 1, since the target entities are at the leftmost and rightmost branch of the depen- common dependency tree that 3We omitted SK6, which is the worst performing kernel in (Nguyen et al. [sent-195, score-0.474]
68 We use the Partial Tree (PT) kernel, first proposed by Moschitti (2006a), for structures derived from dependency trees and Subset Tree (SST) kernel, proposed by Collins and Duffy (2002), for structures derived from phrase structure trees. [sent-210, score-0.404]
69 We are interested in modeling classes of events which are characterized by the cognitive states of participants–who is aware of whom. [sent-217, score-0.29]
70 And even more strikingly, any verb that can be put in that position is likely to have this interpretation; for example, we are likely to interpret the neologistic John gazooked to Mary about Percy as a similarly structured social event. [sent-222, score-0.421]
71 These techniques are non-heuristic sampling methods that aim at balancing the class proportions by removing examples of the majority class and by duplicating instances of the minority class respectively. [sent-235, score-0.33]
72 Analogously, we tried two transformations on our dependency tree structures to produce synthetic examples. [sent-250, score-0.335]
73 The first transformation is based on the observation that in control verb constructions, the matrix verb typically does not contribute to the interpretation as a social event or not. [sent-251, score-0.868]
74 Here, the observation is that for the COG social events, the second target may be very deeply embedded in the tree. [sent-255, score-0.42]
75 For example, in Example 1, Toujan Faisal and the Interior Ministry Committee participate in a COG event (because Faisal is aware of the Committee during the saying event). [sent-256, score-0.439]
76 6 Experiments And Results In this section we present experiments and results for our two tasks: social event detection and classification. [sent-261, score-0.836]
77 For the social event detection task, we wish to validate the following research hypotheses. [sent-262, score-0.836]
78 In contrast, the social event classification task does not suffer from data skewness because the INR and COG relations both occur almost the same number of times. [sent-264, score-0.864]
79 1 Experimental Set-up We use part of ACE data that we annotated for social events. [sent-269, score-0.392]
80 If our annotators recorded a relation between a pair of entity mentions, we say there is a relation between the corresponding entities. [sent-273, score-0.36]
81 2 Social Event Detection Social event detection is the task of detecting if any social event exists between a pair of entities in a sentence. [sent-288, score-1.345]
82 We formulate the problem as a binary classification task by labeling an example that does not have a social event as class -1 and by labeling an example that either has an INR or COG social event as class 1. [sent-289, score-1.669]
83 Table 1: Baseline System for the task of social event detection. [sent-298, score-0.763]
84 Grammatical relation tree structure (GR), a structure derived from the dependency tree by replacing the words with their grammatical relations, achieves the best precision. [sent-303, score-0.53]
85 classifier learns that if both the arguments of a predicate contain target entities then it is a social event. [sent-305, score-0.589]
86 Among kernels for single structures, the path enclosed tree for PSTs (PET) achieves the best recall. [sent-306, score-0.305]
87 As in the undersampled system, when the data is balanced, SqGRW (sequence kernel on dependency tree in which grammatical relations are inserted as intermediate nodes) achieves the best recall. [sent-350, score-0.423]
88 This exemplifies the difference in the nature of our event annotations from that of ACE relations. [sent-353, score-0.405]
89 Table 4 presents results for using the oversampling method with transformation that produces synthetic positive examples by using a transformation on dependency trees such that the new synthetic examples are “close” to the original examples. [sent-366, score-0.56]
90 3 Social Event Classification For the social event classification task, we only consider pairs of entities that have an event. [sent-372, score-0.954]
91 Even though the task of reasoning if an event is about one-way or mutual cognition seems hard, our system beats the chance baseline by 28. [sent-384, score-0.42]
92 Once again we notice that the combination of kernels works better than single kernels alone, though the difference here is less pronounced. [sent-387, score-0.46]
93 7 Conclusion And Future Work In this paper, we have introduced the novel tasks of social event detection and classification. [sent-390, score-0.836]
94 We also introduced a new sequence structure (SqGRW) which plays a role in achieving the best accuracy for both the social event detection and the social event classification tasks. [sent-398, score-1.75]
95 We will also investigate the relation between classes of social events and their syntactic realization. [sent-400, score-0.706]
96 Annotation scheme for social network extraction from text. [sent-410, score-0.5]
97 A study on convolution kernels for shallow semantic parsing. [sent-492, score-0.315]
98 Efficient convolution kernels for dependency and constituent syntactic trees. [sent-496, score-0.379]
99 Convolution kernels on constituent, dependency and sequential structures for relation extraction. [sent-505, score-0.547]
100 A composite kernel to extract relations between entities with both flat and structured features. [sent-527, score-0.381]
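The extracted sentences above describe random under-sampling and over-sampling as removing majority-class examples and duplicating minority-class examples (sentence 71), with detection labels of -1 (no social event) and 1 (INR or COG) from sentence 82. The following is a minimal Python sketch of those two sampling steps only. The function names, the toy data, and the choice to balance the classes exactly are illustrative assumptions, not the authors' implementation, and the tree-kernel SVM itself is not shown.

```python
import random
from collections import Counter


def _group_by_label(examples, labels):
    """Collect examples into a dict keyed by class label."""
    groups = {}
    for example, label in zip(examples, labels):
        groups.setdefault(label, []).append(example)
    return groups


def random_undersample(examples, labels, seed=0):
    """Randomly drop examples so every class shrinks to the smallest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(examples, labels)
    target = min(len(members) for members in groups.values())
    sampled = []
    for label, members in groups.items():
        sampled.extend((m, label) for m in rng.sample(members, target))
    rng.shuffle(sampled)
    return sampled


def random_oversample(examples, labels, seed=0):
    """Randomly duplicate examples so every class grows to the largest class size."""
    rng = random.Random(seed)
    groups = _group_by_label(examples, labels)
    target = max(len(members) for members in groups.values())
    sampled = []
    for label, members in groups.items():
        duplicates = [rng.choice(members) for _ in range(target - len(members))]
        sampled.extend((m, label) for m in members + duplicates)
    rng.shuffle(sampled)
    return sampled


# Hypothetical toy data: 2 entity pairs with a social event (1), 8 without (-1).
toy_pairs = [f"pair{i}" for i in range(10)]
toy_labels = [1, 1] + [-1] * 8

under = random_undersample(toy_pairs, toy_labels)
over = random_oversample(toy_pairs, toy_labels)
print(Counter(label for _, label in under))
print(Counter(label for _, label in over))
```

Under-sampling discards information from the majority class, while over-sampling by duplication keeps every original example but repeats minority examples. The synthetic-example variant mentioned in sentences 72 and 89 replaces plain duplication with transformed dependency trees and is not covered by this sketch.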
wordName wordTfidf (topN-words)
[('social', 0.392), ('event', 0.371), ('ace', 0.308), ('kernels', 0.23), ('inr', 0.223), ('events', 0.181), ('kernel', 0.169), ('faisal', 0.164), ('toujan', 0.159), ('entities', 0.138), ('relation', 0.133), ('cog', 0.123), ('structures', 0.12), ('nguyen', 0.117), ('sqgrw', 0.096), ('entity', 0.094), ('convolution', 0.085), ('oversampling', 0.08), ('moschitti', 0.076), ('synthetic', 0.076), ('tree', 0.075), ('pet', 0.074), ('relations', 0.074), ('sampling', 0.073), ('detection', 0.073), ('grishman', 0.07), ('aware', 0.068), ('gr', 0.066), ('trees', 0.066), ('dependency', 0.064), ('mentions', 0.064), ('cameraman', 0.064), ('ppr', 0.064), ('psts', 0.064), ('sst', 0.064), ('alessandro', 0.064), ('guodong', 0.06), ('network', 0.059), ('informed', 0.057), ('person', 0.057), ('agarwal', 0.055), ('alice', 0.055), ('interior', 0.055), ('minority', 0.055), ('pcr', 0.055), ('classification', 0.053), ('committee', 0.05), ('cognition', 0.049), ('imbalanced', 0.049), ('mary', 0.049), ('extraction', 0.049), ('apoorv', 0.048), ('cognizant', 0.048), ('daughter', 0.048), ('grw', 0.048), ('rabbit', 0.048), ('skewness', 0.048), ('transformation', 0.047), ('participating', 0.045), ('class', 0.045), ('grammatical', 0.041), ('cognitive', 0.041), ('obs', 0.041), ('said', 0.041), ('examples', 0.035), ('positive', 0.034), ('structure', 0.034), ('literary', 0.034), ('rambow', 0.034), ('nsubj', 0.034), ('annotations', 0.034), ('files', 0.033), ('belonging', 0.033), ('discrete', 0.033), ('achieving', 0.033), ('interaction', 0.033), ('annotation', 0.033), ('mention', 0.032), ('appos', 0.032), ('balancing', 0.032), ('chawla', 0.032), ('kolcz', 0.032), ('kotsiantis', 0.032), ('overseeing', 0.032), ('parties', 0.032), ('subtype', 0.032), ('tagpbrelegt', 0.032), ('telecinco', 0.032), ('harabagiu', 0.032), ('arguments', 0.031), ('machines', 0.031), ('sequence', 0.031), ('classifier', 0.031), ('ji', 0.03), ('proportion', 0.029), ('verb', 0.029), ('culotta', 0.028), ('ministry', 0.028), ('kappa', 
0.028), ('target', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.000001 20 emnlp-2010-Automatic Detection and Classification of Social Events
Author: Apoorv Agarwal ; Owen Rambow
Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.
2 0.21371825 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
Author: Ekaterina Buyko ; Udo Hahn
Abstract: In state-of-the-art approaches to information extraction (IE), dependency graphs constitute the fundamental data structure for syntactic structuring and subsequent knowledge elicitation from natural language documents. The top-performing systems in the BioNLP 2009 Shared Task on Event Extraction all shared the idea to use dependency structures generated by a variety of parsers either directly or in some converted manner, and optionally modified their output to fit the special needs of IE. As there are systematic differences between various dependency representations being used in this competition, we scrutinize on different encoding styles for dependency information and their possible impact on solving several IE tasks. After assessing more or less established dependency representations such as the Stanford and CoNLL-X dependencies, we will then focus on trimming operations that pave the way to more effective IE. Our evaluation study covers data from a number of constituency- and dependency-based parsers and provides experimental evidence which dependency representations are particularly beneficial for the event extraction task. Based on empirical findings from our study we were able to achieve the performance of 57.2% F-score on the development data set of the BioNLP Shared Task 2009.
3 0.13989292 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification
Author: Longhua Qian ; Guodong Zhou
Abstract: Seed sampling is critical in semi-supervised learning. This paper proposes a clustering-based stratified seed sampling approach to semi-supervised learning. First, various clustering algorithms are explored to partition the unlabeled instances into different strata with each stratum represented by a center. Then, diversity-motivated intra-stratum sampling is adopted to choose the center and additional instances from each stratum to form the unlabeled seed set for an oracle to annotate. Finally, the labeled seed set is fed into a bootstrapping procedure as the initial labeled data. We systematically evaluate our stratified bootstrapping approach in the semantic relation classification subtask of the ACE RDC (Relation Detection and Classification) task. In particular, we compare various clustering algorithms on the stratified bootstrapping performance. Experimental results on the ACE RDC 2004 corpus show that our clustering-based stratified bootstrapping approach achieves the best F1-score of 75.9 on the subtask of semantic relation classification, approaching the one with golden clustering.
4 0.13404831 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions
Author: Pawel Mazur ; Robert Dale
Abstract: The reliable extraction of knowledge from text requires an appropriate treatment of the time at which reported events take place. Unfortunately, there are very few annotated data sets that support the development of techniques for event time-stamping and tracking the progression of time through a narrative. In this paper, we present a new corpus of temporally-rich documents sourced from English Wikipedia, which we have annotated with TIMEX2 tags. The corpus contains around 120000 tokens, and 2600 TIMEX2 expressions, thus comparing favourably in size to other existing corpora used in these areas. We describe the preparation of the corpus, and compare the profile of the data with other existing temporally annotated corpora. We also report the results obtained when we use DANTE, our temporal expression tagger, to process this corpus, and point to where further work is required. The corpus is publicly available for research purposes.
5 0.11494692 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data
Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum
Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.
7 0.090466544 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media
8 0.090430811 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input
9 0.07954859 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution
10 0.078431457 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
11 0.075224087 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
12 0.074127302 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions
13 0.07143978 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora
14 0.0707523 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
15 0.069119863 80 emnlp-2010-Modeling Organization in Student Essays
16 0.061010912 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
17 0.056988437 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing
18 0.056722142 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation
19 0.054383792 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning
20 0.052943211 73 emnlp-2010-Learning Recurrent Event Queries for Web Search
topicId topicWeight
[(0, 0.201), (1, 0.133), (2, -0.003), (3, 0.273), (4, 0.042), (5, -0.156), (6, 0.052), (7, 0.094), (8, 0.021), (9, -0.096), (10, -0.08), (11, -0.113), (12, 0.032), (13, 0.043), (14, -0.023), (15, 0.018), (16, 0.197), (17, 0.285), (18, -0.014), (19, 0.063), (20, 0.163), (21, -0.04), (22, 0.078), (23, -0.153), (24, -0.109), (25, -0.109), (26, 0.249), (27, 0.084), (28, 0.022), (29, -0.202), (30, -0.021), (31, -0.076), (32, -0.156), (33, -0.059), (34, 0.003), (35, -0.017), (36, -0.017), (37, -0.02), (38, -0.034), (39, -0.011), (40, 0.055), (41, 0.027), (42, -0.047), (43, -0.024), (44, 0.035), (45, 0.01), (46, 0.006), (47, 0.038), (48, 0.095), (49, -0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.97758853 20 emnlp-2010-Automatic Detection and Classification of Social Events
Author: Apoorv Agarwal ; Owen Rambow
Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.
2 0.72618908 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks
Author: Ekaterina Buyko ; Udo Hahn
Abstract: In state-of-the-art approaches to information extraction (IE), dependency graphs constitute the fundamental data structure for syntactic structuring and subsequent knowledge elicitation from natural language documents. The top-performing systems in the BioNLP 2009 Shared Task on Event Extraction all shared the idea to use dependency structures generated by a variety of parsers either directly or in some converted manner, and optionally modified their output to fit the special needs of IE. As there are systematic differences between various dependency representations being used in this competition, we scrutinize on different encoding styles for dependency information and their possible impact on solving several IE tasks. After assessing more or less established dependency representations such as the Stanford and CoNLL-X dependencies, we will then focus on trimming operations that pave the way to more effective IE. Our evaluation study covers data from a number of constituency- and dependency-based parsers and provides experimental evidence which dependency representations are particularly beneficial for the event extraction task. Based on empirical findings from our study we were able to achieve the performance of 57.2% F-score on the development data set of the BioNLP Shared Task 2009.
3 0.55438662 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions
Author: Pawel Mazur ; Robert Dale
Abstract: The reliable extraction of knowledge from text requires an appropriate treatment of the time at which reported events take place. Unfortunately, there are very few annotated data sets that support the development of techniques for event time-stamping and tracking the progression of time through a narrative. In this paper, we present a new corpus of temporally-rich documents sourced from English Wikipedia, which we have annotated with TIMEX2 tags. The corpus contains around 120000 tokens, and 2600 TIMEX2 expressions, thus comparing favourably in size to other existing corpora used in these areas. We describe the preparation of the corpus, and compare the profile of the data with other existing temporally annotated corpora. We also report the results obtained when we use DANTE, our temporal expression tagger, to process this corpus, and point to where further work is required. The corpus is publicly available for research purposes.
4 0.41980037 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification
Author: Longhua Qian ; Guodong Zhou
Abstract: Seed sampling is critical in semi-supervised learning. This paper proposes a clustering-based stratified seed sampling approach to semi-supervised learning. First, various clustering algorithms are explored to partition the unlabeled instances into different strata with each stratum represented by a center. Then, diversity-motivated intra-stratum sampling is adopted to choose the center and additional instances from each stratum to form the unlabeled seed set for an oracle to annotate. Finally, the labeled seed set is fed into a bootstrapping procedure as the initial labeled data. We systematically evaluate our stratified bootstrapping approach in the semantic relation classification subtask of the ACE RDC (Relation Detection and Classification) task. In particular, we compare various clustering algorithms on the stratified bootstrapping performance. Experimental results on the ACE RDC 2004 corpus show that our clustering-based stratified bootstrapping approach achieves the best F1-score of 75.9 on the subtask of semantic relation classification, approaching the one with golden clustering.
5 0.39788529 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data
Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum
Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.
7 0.29557112 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions
8 0.28941762 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
9 0.28688148 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications
10 0.26706871 80 emnlp-2010-Modeling Organization in Student Essays
11 0.24948066 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution
12 0.24935558 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media
13 0.24269693 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input
14 0.24240026 73 emnlp-2010-Learning Recurrent Event Queries for Web Search
15 0.22588749 59 emnlp-2010-Identifying Functional Relations in Web Text
16 0.20945393 31 emnlp-2010-Constraints Based Taxonomic Relation Classification
17 0.19636804 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora
18 0.19459215 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams
19 0.19249065 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment
20 0.19243346 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
topicId topicWeight
[(2, 0.308), (3, 0.031), (10, 0.015), (12, 0.062), (29, 0.075), (30, 0.024), (32, 0.016), (52, 0.027), (56, 0.064), (62, 0.017), (66, 0.105), (72, 0.085), (76, 0.023), (77, 0.016), (79, 0.014), (82, 0.023), (87, 0.029), (89, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.76457644 20 emnlp-2010-Automatic Detection and Classification of Social Events
Author: Apoorv Agarwal ; Owen Rambow
Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.
2 0.47466695 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation
Author: Sankaranarayanan Ananthakrishnan ; Rohit Prasad ; David Stallard ; Prem Natarajan
Abstract: Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. Active sample selection aims to reduce the labor, time, and expense incurred in producing such resources, attaining a given performance benchmark with the smallest possible training corpus by choosing informative, nonredundant source sentences from an available candidate pool for manual translation. We present a novel, discriminative sample selection strategy that preferentially selects batches of candidate sentences with constructs that lead to erroneous translations on a held-out development set. The proposed strategy supports a built-in diversity mechanism that reduces redundancy in the selected batches. Simulation experiments on English-to-Pashto and Spanish-to-English translation tasks demonstrate the superiority of the proposed approach to a number of competing techniques, such as random selection, dissimilarity-based selection, as well as a recently proposed semi-supervised active learning strategy.
Author: Hugo Hernault ; Danushka Bollegala ; Mitsuru Ishizuka
Abstract: Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to create an extensive training corpus beforehand, which is a time-consuming and costly process. On the other hand, unlabeled data is abundant and cheap to collect. In this paper, we propose a novel semi-supervised method for discourse relation classification based on the analysis of co-occurring features in unlabeled data, which is then taken into account for extending the feature vectors given to a classifier. Our experimental results on the RST Discourse Treebank corpus and Penn Discourse Treebank indicate that the proposed method brings a significant improvement in classification accuracy and macro-average F-score when small training datasets are used. For instance, with training sets of ca. 1000 labeled instances, the proposed method brings improvements in accuracy and macro-average F-score up to 50% compared to a baseline classifier. We believe that the proposed method is a first step towards detecting low-occurrence relations, which is useful for domains with a lack of annotated data.
4 0.46838439 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks
Author: Xian Qian ; Qi Zhang ; Yaqian Zhou ; Xuanjing Huang ; Lide Wu
Abstract: Many sequence labeling tasks in NLP require solving a cascade of segmentation and tagging subtasks, such as Chinese POS tagging, named entity recognition, and so on. Traditional pipeline approaches usually suffer from error propagation. Joint training/decoding in the cross-product state space could cause too many parameters and high inference complexity. In this paper, we present a novel method which integrates graph structures of two subtasks into one using virtual nodes, and performs joint training and decoding in the factorized state space. Experimental evaluations on CoNLL 2000 shallow parsing data set and Fourth SIGHAN Bakeoff CTB POS tagging data set demonstrate the superiority of our method over cross-product, pipeline and candidate reranking approaches.
5 0.46284014 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media
Author: Yunliang Jiang ; Cindy Xide Lin ; Qiaozhu Mei
Abstract: In this paper, we conducted a systematic comparative analysis of language in different contexts of bursty topics, including web search, news media, blogging, and social bookmarking. We analyze (1) the content similarity and predictability between contexts, (2) the coverage of search content by each context, and (3) the intrinsic coherence of information in each context. Our experiments show that social bookmarking is a better predictor of bursty search queries, but news media and social blogging media have much more compelling coverage. This comparison provides insights into how the search behaviors and social information sharing behaviors of users are correlated with the professional news media in the context of bursty events.
6 0.46265113 49 emnlp-2010-Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields
7 0.46221969 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors
8 0.4606806 86 emnlp-2010-Non-Isomorphic Forest Pair Translation
9 0.46035931 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice
10 0.4583824 82 emnlp-2010-Multi-Document Summarization Using A* Search and Discriminative Learning
11 0.45699108 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval
12 0.45678219 84 emnlp-2010-NLP on Spoken Documents Without ASR
13 0.45643014 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields
14 0.4562344 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar
15 0.45565027 92 emnlp-2010-Predicting the Semantic Compositionality of Prefix Verbs
16 0.4555032 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
17 0.45459959 31 emnlp-2010-Constraints Based Taxonomic Relation Classification
18 0.45445853 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding
19 0.4536446 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions
20 0.45317441 2 emnlp-2010-A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model