acl acl2010 acl2010-247 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
Reference: text
sentIndex sentText sentNum sentScore
1 The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task. [sent-3, score-0.788]
2 1 Introduction The event coreference task consists of finding clusters of event mentions that refer to the same event. [sent-4, score-1.651]
3 Although it has not been extensively studied in comparison with the related problem of entity coreference resolution, solving event coreference has already proved its usefulness in various applications such as topic detection and tracking (Allan et al. [sent-5, score-1.011]
4 Previous approaches for solving event coreference relied on supervised learning methods that explore various linguistic properties in order to decide if a pair of event mentions is coreferential or not (Humphreys et al. [sent-10, score-1.73]
5 Also, since these models are dependent on local pairwise decisions, they are unable to capture a global event distribution at topic or document collection level. [sent-14, score-0.676]
6 To address these limitations and to provide a more flexible representation for modeling observable data with rich properties, we present two novel, fully generative, nonparametric Bayesian models for unsupervised within- and cross-document event coreference resolution. [sent-15, score-0.981]
7 , 2006) to take into account additional properties associated with observable objects (i. [sent-17, score-0.215]
8 , 2002) in order to (1) consider a potentially infinite number of features associated with observable objects, (2) perform an automatic selection of the most salient features, and (3) capture the structural dependencies of observable objects at the discourse level. [sent-23, score-0.442]
9 These models provide additional details and experimental results to our preliminary work on unsupervised event coreference resolution (Bejan et al. [sent-27, score-0.848]
10 One relevant theory on event identity was proposed by Davidson (1969) who argued that two events are identical if they have the same causes and effects. [sent-30, score-0.702]
11 Later on, a different theory was proposed by Quine (1985), who considered that each event refers to a physical object (which is well defined in space and time), and therefore, two events are identical if they occupy the same region of space and time. [sent-31, score-0.702]
12 In (Davidson, 1985), Davidson abandoned his suggestion to embrace the Quinean theory on event identity (Malpas, 2009). [sent-34, score-0.565]
13 1 An Example In accordance with the Quinean theory, we consider that two event mentions are coreferential if they have the same event properties and share the same event participants. [sent-36, score-2.072]
14 For instance, the sentences from Example 1 encode event mentions that refer to several individuated events. [sent-37, score-0.902]
15 These sentences are extracted from a newly annotated corpus with event coreference information (see Section 4). [sent-38, score-0.788]
16 In this corpus, we organize documents that describe the same seminal event into topics. [sent-39, score-0.6]
17 In particular, the topics shown in this example describe the seminal event of AMD buying ATI (topic 43) and the seminal event of HP buying EDS (topic 44). [sent-40, score-1.258]
18 Although all the event mentions of interest emphasized in boldface in Example 1 evoke the same generic event buy, they refer to three individuated events: e1 = {em1, em2}, e2 = {em3−6, em8}, and e3 = {em7}. [sent-41, score-1.467]
19 This organization of event mentions leads to the idea of creating an event hierarchy which has, on the first level, event mentions; on the second level, individuated events; and on the third level, generic events. [sent-43, score-1.203]
20 In particular, the event hierarchy corresponding to the event mentions annotated in our example is illustrated in Figure 1. [sent-44, score-1.394]
21 Solving the event coreference problem poses many interesting challenges. [sent-45, score-0.788]
22 , em3(buy)–verb, em4(purchase)–noun, i.e., mentions from the same chain can belong to different word classes); (iii) not all the mentions from the same chain are synonymous (e. [sent-48, score-0.297]
23 , in WordNet (Fellbaum, 1998), the genus of buy is acquire); (iv) some (or all) properties and participants of an event mention can be omitted in the text (e. [sent-52, score-0.841]
24 Figure 1: Fragment from the event hierarchy (the generic event buy dominates the individuated events e1, e2, and e3, which group the mentions em1-em8). [sent-64, score-0.676]
25 5, we discuss additional aspects of the event coreference problem that are not revealed in Example 1. [sent-65, score-0.788]
26 2 Linguistic Features The events representing coreference clusters of event mentions are characterized by a large set of linguistic features. [sent-67, score-1.223]
27 To compute an accurate event distribution for event coreference resolution, we associate the following categories of linguistic features with each annotated event mention. [sent-68, score-2.005]
28 For instance, the lexical features extracted for the event mention em7(bought) from our example are HW:bought, HL:buy, LHL:it, RHL:Compaq, LHE:acquisition, and RHE:acquire. [sent-70, score-0.752]
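A minimal sketch of this lexical feature extraction in Python with NLTK; the mention representation (token list plus head indices) and the verb-first lemmatization are illustrative assumptions, not the authors' implementation:

from nltk.stem import WordNetLemmatizer

lemmatizer = WordNetLemmatizer()

def lexical_features(tokens, mention_heads, i):
    # HW/HL: head word of the i-th event mention and its lemma;
    # LHL/RHL: lemmas of the tokens to its left and right;
    # LHE/RHE: head lemmas of the previous and next event mentions.
    h = mention_heads[i]
    feats = {"HW": tokens[h], "HL": lemmatizer.lemmatize(tokens[h], pos="v")}
    if h > 0:
        feats["LHL"] = lemmatizer.lemmatize(tokens[h - 1])
    if h + 1 < len(tokens):
        feats["RHL"] = lemmatizer.lemmatize(tokens[h + 1])
    if i > 0:
        feats["LHE"] = lemmatizer.lemmatize(tokens[mention_heads[i - 1]], pos="v")
    if i + 1 < len(mention_heads):
        feats["RHE"] = lemmatizer.lemmatize(tokens[mention_heads[i + 1]], pos="v")
    return feats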
29 Class Features (CF) These features aim to group mentions into several types of classes: the partof-speech of the HW feature (POS), the word class of the HW feature (HWC), and the event class of the mention (EC). [sent-71, score-1.204]
30 As values for the EC feature, we consider the seven event classes defined in the TimeML specification language (Pustejovsky et al. [sent-73, score-0.565]
31 In order to extract the event classes corresponding to the event mentions from a given dataset, we employed the event extractor described in (Bejan, 2007). [sent-75, score-1.959]
32 WordNet Features (WF) In our efforts to create clusters of event mention attributes as close as possible to the true attribute clusters of the individuated events, we build two sets of word clusters using the entire lexical information from the WordNet database. [sent-78, score-0.31]
33 After creating these sets of clusters, we then associate each event mention with only one cluster from each set. [sent-79, score-0.7]
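A hedged sketch of the cluster lookup step only (the offline construction of the two WordNet cluster sets is not shown); using the first verb synset as a stand-in cluster id is an assumption for illustration:

from nltk.corpus import wordnet as wn

def wns_cluster(lemma):
    # Map a mention's head lemma to a WordNet-derived cluster id;
    # the first verb synset stands in for the real offline cluster.
    synsets = wn.synsets(lemma, pos=wn.VERB)
    return synsets[0].name() if synsets else lemma

# wns_cluster("buy") and wns_cluster("purchase") both yield 'buy.v.01',
# so em3(buy) and em4(purchase) receive the same cluster feature.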
34 Semantic Features (SF) To extract features that characterize participants and properties of event mentions, we use the semantic parser described in (Bejan and Hathaway, 2007). [sent-84, score-0.688]
35 One category of semantic features that we identify for event mentions is the predicate argument structures encoded in PropBank annotations (Palmer et al. [sent-85, score-0.955]
36 In PropBank, the predicate argument structures are represented by events expressed as verbs in text and by the semantic roles, or predicate arguments, associated with these events. [sent-87, score-0.283]
37 In particular, the predicate arguments associated with the event mention em8(bought) from Example 1 are ARG0:[it], ARG1:[Compaq Computer Corp. [sent-90, score-0.772]
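A small sketch of flattening such a parse into mention features; the SRL output format and the SF: prefix are hypothetical:

def semantic_features(predicate_args):
    # predicate_args: hypothetical SRL output for a mention, e.g. for
    # em8(bought): {"ARG0": "it", "ARG1": "Compaq Computer Corp."}
    return {"SF:" + role: arg for role, arg in predicate_args.items()}

print(semantic_features({"ARG0": "it", "ARG1": "Compaq Computer Corp."}))
# -> {'SF:ARG0': 'it', 'SF:ARG1': 'Compaq Computer Corp.'}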
38 Event mentions are not only expressed as verbs in text, but also as nouns and adjectives. [sent-92, score-0.264]
39 The semantic roles associated with a word in FrameNet, or frame elements, are locally defined for the semantic frame evoked by the word. [sent-96, score-0.251]
40 Additionally, we use this map to create a more general semantic feature which assigns to each predicate argument a frame element label. [sent-104, score-0.233]
41 Two additional semantic features used in our experiments are: (1) the semantic frame (FR) evoked by every mention;1 and (2) the WNS feature applied to the head word of every semantic role (e. [sent-106, score-0.334]
42 It is worth noting that there exist event mentions for which not all the features can be extracted. [sent-111, score-0.881]
43 For example, the LHE and RHE features are missing for the first and last event mentions in a document, respectively. [sent-112, score-0.881]
44 Also, many semantic roles can be absent for an event mention in a given context. [sent-113, score-0.741]
45 1 The reason for extracting this feature is that, in general, frames are able to capture properties of generic events (Lowe et al. [sent-114, score-0.261]
46 3 Nonparametric Bayesian Models As input for our models, we consider a collection of I documents, where each document i has Ji event mentions. [sent-116, score-0.609]
47 Each event mention is characterized by L feature types, FT, and each feature type is represented by a finite vocabulary of feature values, fv. [sent-120, score-1.023]
48 Thus, we can represent the observable properties of an event mention as a vector of L (feature type : feature value) pairs ⟨(FT1 : fv1), . [sent-121, score-1.028]
49 , event mention) by a finite number of feature types L. [sent-128, score-0.7]
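For concreteness, a minimal sketch of such a vector for em7(bought), assembled from the feature values listed in Section 2; the POS, EC, and FR values shown are illustrative assumptions:

# An event mention as a vector of (feature type : feature value) pairs.
mention_em7 = [
    ("HW", "bought"), ("HL", "buy"),
    ("LHL", "it"), ("RHL", "Compaq"),
    ("LHE", "acquisition"), ("RHE", "acquire"),
    ("POS", "VBD"),            # illustrative class feature values
    ("EC", "OCCURRENCE"),
    ("FR", "Commerce_buy"),    # illustrative FrameNet frame
]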
50 However, their model is strictly customized for entity coreference resolution, and therefore, extending it to include additional features for each observable object is a challenging task (Ng, 2008; Poon and Domingos, 2008). [sent-130, score-0.385]
51 To describe its extension, we consider Z the set of indicator random variables for indices of events, φz the set of parameters associated with an event z, φ a notation for all model parameters, and X a notation for all random variables that represent observable features. [sent-134, score-0.79]
52 2 Given a document collection annotated with event mentions, the goal is to find the best assignment of event indices Z∗, which maximizes the posterior probability P(Z|X). [sent-135, score-1.174]
53 Similar to the HDP model, the distribution over events associated with each document, β, is generated by a Dirichlet process with a concentration parameter. 2 In this subsection, the feature term is used in the context of a feature type. [sent-137, score-0.399]
54 Since this setting enables a clustering of event mentions at the document level, it is desirable that events be shared across documents and the number of events K be inferred from data. [sent-139, score-1.147]
55 The global distribution drawn from this DP prior, denoted as β0 in Figure 2(a), encodes the event mixing weights. [sent-142, score-0.632]
56 Thus, the same global events are used for each document, but each event has a document-specific distribution βi that is drawn from a DP prior centered on the global weights β0. [sent-143, score-0.845]
57 , 2006) and use the Gibbs sampling algorithm (Geman and Geman, 1984) based on the direct assignment sampling scheme. [sent-145, score-0.226]
58 Thus, by Bayes' rule, the formula for sampling an event index for mention j from document i, Zi,j, is:3 P(Zi,j | Z−i,j, X) ∝ P(Zi,j | Z−i,j) ∏X P(Xi,j | Z, X−i,j) where Xi,j represents the feature value of a feature type corresponding to the event mention j from the document i. [sent-149, score-0.565]
59 In the process of generating an event mention, an event index z is first sampled by using a mechanism that facilitates sampling from a prior for infinite mixture models called the Chinese restaurant franchise (CRF) representation, as reported in (Teh et al. [sent-150, score-1.474]
60 Next, a feature value x (with the feature type X) of the event mention is generated conditioned on the event z. 3 Z−i,j is a notation for Z − {Zi,j }. [sent-152, score-1.318]
61 The value is drawn from a multinomial with a symmetric Dirichlet prior with concentration λX, which yields: P(Xi,j = x | Z, X−i,j) ∝ nx,z + λX where Xi,j is the feature type of the mention j from the document i, and nx,z is the number of times the feature value x has been associated with the event index z in (Z, X−i,j). [sent-160, score-1.037]
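Putting the two factors together, a schematic collapsed Gibbs step (not the authors' code); for brevity it collapses the hierarchy to a single CRF-style prior with concentration alpha, and all names are illustrative:

import random
from collections import defaultdict

def sample_event_index(mention, event_sizes, counts, totals, vocab, lam, alpha):
    # mention: list of (feature_type, feature_value) pairs;
    # event_sizes[z]: number of mentions currently assigned to event z;
    # counts[(ft, fv, z)] and totals[(ft, z)]: sufficient statistics with the
    # current mention removed; pass defaultdict(int) so unseen keys read as 0.
    candidates = list(event_sizes) + ["NEW"]
    weights = []
    for z in candidates:
        w = alpha if z == "NEW" else event_sizes[z]      # prior P(z | Z-ij)
        for ft, fv in mention:                           # feature likelihoods
            w *= (counts[(ft, fv, z)] + lam) / (totals[(ft, z)] + vocab[ft] * lam)
        weights.append(w)
    r = random.uniform(0, sum(weights))                  # draw z proportionally
    for z, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return z
    return candidates[-1]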
62 2 An Infinite Feature Model To relax some of the restrictions of the first model, we devise an approach that combines the infinite factorial hidden Markov model (iFHMM) with the infinite hidden Markov model (iHMM) to form the iFHMM-iHMM model. [sent-174, score-0.322]
63 4 Specifically, the mIBP defines a distribution over an unbounded set of binary Markov chains, where each chain can be associated with a binary latent feature that evolves over time according to Markov dynamics. [sent-179, score-0.269]
64 Therefore, if we denote by M the total number of feature chains and by T the number of observable components, the mIBP defines a probability distribution over a binary matrix F with T rows, which correspond to observations, and an unbounded number of columns M, which correspond to features. [sent-180, score-0.273]
65 In other words, F decomposes the observations and represents them as feature factors, which can then be associated with hidden variables in an iFHMM model, as depicted in Figure 2(c). [sent-188, score-0.218]
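A hedged generative sketch of the mIBP binary matrix F; the Beta(1,1) transition priors and the Poisson rate for opening new chains follow the usual IBP construction, but this exact parameterization is an assumption:

import numpy as np

rng = np.random.default_rng(0)

def sample_mibp(T, alpha=2.0):
    # Grow a T x M binary matrix: each column is a Markov chain of a latent
    # binary feature; new chains are opened IBP-style as rows are generated.
    chains = []                       # per-chain transition probs and history
    rows = []
    for t in range(T):
        row = []
        for c in chains:              # existing chains evolve over time
            p = c["p11"] if c["col"][-1] == 1 else c["p01"]
            c["col"].append(int(rng.random() < p))
            row.append(c["col"][-1])
        for _ in range(rng.poisson(alpha / (t + 1))):    # open new chains
            chains.append({"p01": rng.beta(1, 1), "p11": rng.beta(1, 1),
                           "col": [0] * t + [1]})
            row.append(1)
        rows.append(row)
    M = len(chains)                   # pad early rows: later chains were off
    return np.array([r + [0] * (M - len(r)) for r in rows])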
66 On the other hand, the iHMM represents a nonparametric extension of the hidden Markov model (HMM) (Rabiner, 1989) that allows performing inference on an infinite number of states K. [sent-191, score-0.225]
67 In general, depending on the value that was sampled in the previous step (t−1), a feature fm is sampled for the t-th component according to the probabilities P(Ftm = 1 | Ftm−1 = 1) and P(Ftm = 1 | Ftm−1 = 0). [sent-199, score-0.278]
68 , sT) the sequence of hidden states corresponding to the sequence of event mentions (y1, . [sent-205, score-0.876]
69 , K}, and each mention yt is represented by a sequence of latent features ⟨Ft1, Ft2, . [sent-211, score-0.224]
70 defined as πij = P(st = j | st−1 = i), and a mention yt is generated according to a likelihood model F that is parameterized by a state-dependent parameter φst (yt | st ∼ F(φst)). [sent-216, score-0.307]
71 ideas of slice sampling and dynamic programming for an efficient sampling of state trajectories. [sent-220, score-0.261]
72 For sampling the whole hidden state trajectory s, this algorithm employs a forward filtering-backward sampling technique. [sent-223, score-0.273]
73 In the forward step of our adapted beam sampler, for each mention yt, we sample features using the mIBP mechanism and the auxiliary variable ut ∼ Uniform (0, πst−1st ). [sent-224, score-0.289]
74 In the backward step, we first sample the event for the last state sT directly from P(sT | y1:T, u1:T) and then, for all t : T−1 . [sent-228, score-0.565]
75 , oK) defined as: ok = Σ_{t=1}^{T} Σ_{fm ∈ Bt} nmk. In this formula, nmk counts how many times the feature fm was sampled for the event k, and Bt stores a finite set of features for yt. [sent-236, score-0.926]
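Read literally, the count can be computed as below (names are illustrative):

def compute_ok(k, B, n, T):
    # o_k = sum over t = 1..T of sum over f_m in B_t of n_mk, where
    # n[(m, k)] records how often feature f_m was sampled for event k.
    return sum(n.get((m, k), 0) for t in range(T) for m in B[t])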
76 The mechanism for building a finite set of representative features for the mention yt is based on slice sampling (Neal, 2003). [sent-237, score-0.502]
77 Letting qm be the number of times the feature fm was sampled in the mIBP, and vt an auxiliary variable for yt such that vt ∼ Uniform(1, max{qm : Ftm = 1}), we define the finite feature set Bt for the observation yt as Bt = {fm : Ftm = 1 ∧ qm ≥ vt}. [sent-238, score-0.507]
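A direct sketch of this construction; F_t and q are illustrative names:

import random

def finite_feature_set(F_t, q):
    # F_t: set of features active for y_t (those with F_t^m = 1);
    # q[m]: how often feature f_m was sampled in the mIBP. The auxiliary
    # threshold v_t keeps only the most frequently sampled active features,
    # which makes B_t finite.
    v_t = random.uniform(1, max(q[m] for m in F_t))
    return {m for m in F_t if q[m] >= v_t}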
78 However, the utilization of the ACE corpus for the task of solving event coreference is limited because this resource provides only within-document event coreference annotations using a restricted set of event types such as LIFE, BUSINESS, CONFLICT, and JUSTICE. [sent-245, score-2.141]
79 Therefore, as a second dataset, we created the EventCorefBank (ECB) corpus6 to increase the diversity of event types and to be able to evaluate our models for both within- and cross-document event coreference resolution. [sent-246, score-1.353]
80 One important step in the creation process of this corpus consists in finding sets of related documents that describe the same seminal event such that the annotation of coreferential event mentions across documents is possible. [sent-247, score-1.512]
81 Evaluation For a more realistic approach, we not only trained the models on the manually annotated event mentions (i. [sent-251, score-0.829]
82 , true mentions), but also on all the possible mentions encoded in the two datasets. [sent-253, score-0.264]
83 To extract all event mentions, we ran the event identifier described in (Bejan, 2007). [sent-254, score-1.13]
84 The extracted event mentions (all mentions) were able to cover all the true mentions from both datasets. [sent-264, score-0.264]
85 In the evaluation process, we considered only the true mentions of the ACE test dataset, and the event mentions of the test sets derived from a 5-fold cross-validation scheme on the ECB dataset. [sent-268, score-1.093]
86 For evaluating the cross-document coreference annotations, we adopted the same approach as described in (Bagga and Baldwin, 1999) by merging all the documents from the same topic into a meta-document and then scoring this document as performed for within-document evaluation. [sent-269, score-0.267]
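A sketch of this evaluation step, assuming B3 as the within-document scorer; mention and chain ids are illustrative:

def merge_topic(docs):
    # docs: one list of (mention_id, chain_id) pairs per document; mention
    # ids are disambiguated with the document index while cross-document
    # chain ids are kept, yielding a single meta-document.
    return {(d, m): c for d, doc in enumerate(docs) for m, c in doc}

def b_cubed(gold, pred):
    # gold, pred: dicts mention -> chain id; returns (precision, recall).
    def score(a, b):
        total = 0.0
        for m in a:
            ca = {x for x in a if a[x] == a[m]}   # m's cluster under a
            cb = {x for x in b if b[x] == b[m]}   # m's cluster under b
            total += len(ca & cb) / len(ca)
        return total / len(a)
    return score(pred, gold), score(gold, pred)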
87 For both corpora, we considered a set of 132 feature types, where each feature type has on average 3900 distinct feature values. [sent-270, score-0.282]
88 Baselines We consider two baselines for event coreference resolution (rows 1&2 in Tables 2&3). [sent-271, score-0.848]
89 One baseline groups each event mention by its event class (BLeclass). [sent-272, score-1.265]
90 Therefore, for this baseline, we cluster mentions according to their corresponding EC feature value. [sent-273, score-0.358]
91 Similarly, the second baseline uses as grouping criteria for event mentions their corresponding WNS feature value (BLsyn). [sent-274, score-0.923]
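Both baselines reduce to single-feature clustering; a minimal sketch, with mentions represented as feature dicts:

from collections import defaultdict

def baseline_clusters(mentions, feature):
    # Group mentions sharing the same value of one feature;
    # feature = "EC" gives BLeclass, feature = "WNS" gives BLsyn.
    clusters = defaultdict(list)
    for m in mentions:
        clusters[m[feature]].append(m)
    return list(clusters.values())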
92 For instance, Figure 3(a) shows that the HDPflat model corresponding to row 7 in Table 3 converges in 350 iteration steps to a posterior distribution over event mentions from ACE with around 2000 latent events. [sent-283, score-0.898]
93 As listed in Table 4, all the iFHMM-iHMM models that used a feature sampling scheme significantly outperform the iFHMM-iHMMall model; this shows that all the sampling schemes considered in the iFHMM-iHMM framework are able to successfully filter out noisy and redundant feature values. [sent-343, score-0.414]
94 However, for this result, ground truth event mentions as well as a manually tuned coreference threshold were employed. [sent-346, score-1.052]
95 5 Error Analysis One frequent error occurs when a more complex form of semantic inference is needed to find a correspondence between two event mentions of the same individuated event. [sent-347, score-0.943]
96 This example also suggests the need for a better modeling of the discourse salience for event mentions. [sent-349, score-0.565]
97 Another common error is made when matching the semantic roles corresponding to coreferential event mentions. [sent-350, score-0.689]
98 Although we simulated entity coreference by using various semantic features, the task of matching participants of coreferential event mentions is not completely solved. [sent-351, score-1.176]
99 Similarly, many errors involve the properties of coreferential event mentions. [sent-354, score-0.565]
100 Our experiments for event coreference resolution showed that these models are able to solve real-data applications in which the feature and cluster numbers are treated as free parameters, and the selection of feature values is performed automatically. [sent-361, score-1.036]
wordName wordTfidf (topN-words)
[('event', 0.565), ('mentions', 0.264), ('coreference', 0.223), ('hdp', 0.179), ('ecb', 0.176), ('hdpflat', 0.161), ('mibp', 0.146), ('events', 0.137), ('mention', 0.135), ('bejan', 0.132), ('hdpstruct', 0.132), ('sampling', 0.113), ('buy', 0.111), ('observable', 0.11), ('davidson', 0.102), ('infinite', 0.095), ('feature', 0.094), ('gael', 0.094), ('yt', 0.089), ('ftm', 0.088), ('ifhmm', 0.088), ('coreferential', 0.083), ('ace', 0.083), ('nonparametric', 0.083), ('st', 0.083), ('cosmin', 0.073), ('individuated', 0.073), ('hl', 0.07), ('sampled', 0.068), ('frame', 0.065), ('resolution', 0.06), ('lidstone', 0.059), ('framenet', 0.053), ('features', 0.052), ('markov', 0.052), ('qm', 0.051), ('buyer', 0.051), ('bought', 0.051), ('teh', 0.05), ('bayesian', 0.048), ('fm', 0.048), ('adrian', 0.048), ('bagga', 0.048), ('hidden', 0.047), ('amd', 0.047), ('pustejovsky', 0.047), ('donald', 0.046), ('zoubin', 0.044), ('purchase', 0.044), ('beal', 0.044), ('compaq', 0.044), ('hwc', 0.044), ('ifhmmihmm', 0.044), ('wns', 0.044), ('document', 0.044), ('semantic', 0.041), ('finite', 0.041), ('bt', 0.041), ('hw', 0.041), ('van', 0.039), ('associated', 0.039), ('whye', 0.038), ('humphreys', 0.038), ('factorial', 0.038), ('quine', 0.038), ('ihmm', 0.038), ('hp', 0.038), ('variables', 0.038), ('sanda', 0.037), ('mechanism', 0.037), ('wordnet', 0.036), ('objects', 0.036), ('seminal', 0.035), ('slice', 0.035), ('unfiltered', 0.035), ('distribution', 0.035), ('vt', 0.034), ('unbounded', 0.034), ('beam', 0.034), ('clusters', 0.034), ('latent', 0.034), ('predicate', 0.033), ('indian', 0.033), ('chain', 0.033), ('gaizauskas', 0.033), ('timeml', 0.033), ('global', 0.032), ('billion', 0.032), ('index', 0.031), ('ghahramani', 0.031), ('ut', 0.031), ('dp', 0.031), ('properties', 0.03), ('fr', 0.029), ('buffet', 0.029), ('buying', 0.029), ('lepore', 0.029), ('lhl', 0.029), ('mclaughlin', 0.029), ('nmk', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
2 0.22882025 28 acl-2010-An Entity-Level Approach to Information Extraction
Author: Aria Haghighi ; Dan Klein
Abstract: We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%.
3 0.22800216 165 acl-2010-Learning Script Knowledge with Web Experiments
Author: Michaela Regneri ; Alexander Koller ; Manfred Pinkal
Abstract: We describe a novel approach to unsupervised learning of the events that make up a script, along with constraints on their temporal ordering. We collect naturallanguage descriptions of script-specific event sequences from volunteers over the Internet. Then we compute a graph representation of the script’s temporal structure using a multiple sequence alignment algorithm. The evaluation of our system shows that we outperform two informed baselines.
4 0.22561535 72 acl-2010-Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information
Author: Marta Recasens ; Eduard Hovy
Abstract: This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also ex- pose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.
5 0.20426847 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years
Author: Vincent Ng
Abstract: The research focus of computational coreference resolution has exhibited a shift from heuristic approaches to machine learning approaches in the past decade. This paper surveys the major milestones in supervised coreference research since its inception fifteen years ago.
6 0.19984902 106 acl-2010-Event-Based Hyperspace Analogue to Language for Query Expansion
7 0.18132792 233 acl-2010-The Same-Head Heuristic for Coreference
8 0.16504739 73 acl-2010-Coreference Resolution with Reconcile
9 0.12896004 225 acl-2010-Temporal Information Processing of a New Language: Fast Porting with Minimal Resources
10 0.12767388 33 acl-2010-Assessing the Role of Discourse References in Entailment Inference
11 0.098923415 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates
12 0.097453684 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing
13 0.09378273 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information
14 0.089032032 121 acl-2010-Generating Entailment Rules from FrameNet
15 0.082639873 158 acl-2010-Latent Variable Models of Selectional Preference
16 0.078443706 85 acl-2010-Detecting Experiences from Weblogs
17 0.078099847 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization
18 0.077132791 149 acl-2010-Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue
19 0.076320618 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
20 0.074245013 238 acl-2010-Towards Open-Domain Semantic Role Labeling
topicId topicWeight
[(0, -0.216), (1, 0.132), (2, 0.04), (3, -0.186), (4, -0.068), (5, 0.256), (6, -0.006), (7, -0.007), (8, 0.081), (9, 0.007), (10, -0.03), (11, -0.067), (12, 0.039), (13, -0.114), (14, 0.067), (15, -0.01), (16, -0.009), (17, 0.025), (18, -0.021), (19, -0.069), (20, 0.095), (21, -0.111), (22, -0.049), (23, -0.101), (24, 0.162), (25, -0.023), (26, 0.093), (27, 0.02), (28, 0.096), (29, -0.197), (30, 0.229), (31, 0.128), (32, 0.051), (33, 0.051), (34, 0.028), (35, 0.038), (36, 0.097), (37, 0.085), (38, 0.186), (39, -0.052), (40, -0.016), (41, 0.058), (42, 0.055), (43, -0.024), (44, 0.116), (45, -0.036), (46, 0.014), (47, 0.091), (48, -0.017), (49, 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.96134037 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
2 0.70390475 165 acl-2010-Learning Script Knowledge with Web Experiments
Author: Michaela Regneri ; Alexander Koller ; Manfred Pinkal
Abstract: We describe a novel approach to unsupervised learning of the events that make up a script, along with constraints on their temporal ordering. We collect naturallanguage descriptions of script-specific event sequences from volunteers over the Internet. Then we compute a graph representation of the script’s temporal structure using a multiple sequence alignment algorithm. The evaluation of our system shows that we outperform two informed baselines.
3 0.68874818 28 acl-2010-An Entity-Level Approach to Information Extraction
Author: Aria Haghighi ; Dan Klein
Abstract: We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%.
4 0.64224476 225 acl-2010-Temporal Information Processing of a New Language: Fast Porting with Minimal Resources
Author: Francisco Costa ; Antonio Branco
Abstract: We describe the semi-automatic adaptation of a TimeML annotated corpus from English to Portuguese, a language for which TimeML annotated data was not available yet. In order to validate this adaptation, we use the obtained data to replicate some results in the literature that used the original English data. The fact that comparable results are obtained indicates that our approach can be used successfully to rapidly create semantically annotated resources for new languages.
5 0.59383917 72 acl-2010-Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information
Author: Marta Recasens ; Eduard Hovy
Abstract: This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also ex- pose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.
6 0.56698841 106 acl-2010-Event-Based Hyperspace Analogue to Language for Query Expansion
7 0.51603711 233 acl-2010-The Same-Head Heuristic for Coreference
8 0.4975124 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years
9 0.49530163 73 acl-2010-Coreference Resolution with Reconcile
10 0.38173282 111 acl-2010-Extracting Sequences from the Web
11 0.37744313 139 acl-2010-Identifying Generic Noun Phrases
12 0.36447531 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation
13 0.35659355 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields
14 0.35330772 248 acl-2010-Unsupervised Ontology Induction from Text
15 0.35197201 85 acl-2010-Detecting Experiences from Weblogs
16 0.35113987 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet
17 0.34598184 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People
18 0.31795061 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates
19 0.31407303 63 acl-2010-Comparable Entity Mining from Comparative Questions
20 0.30445644 12 acl-2010-A Probabilistic Generative Model for an Intermediate Constituency-Dependency Representation
topicId topicWeight
[(14, 0.035), (17, 0.228), (25, 0.09), (33, 0.022), (39, 0.01), (42, 0.014), (44, 0.019), (58, 0.016), (59, 0.073), (72, 0.013), (73, 0.052), (78, 0.032), (80, 0.023), (83, 0.114), (84, 0.029), (98, 0.115)]
simIndex simValue paperId paperTitle
same-paper 1 0.81960046 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
2 0.75970298 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
Author: Baoxun Wang ; Xiaolong Wang ; Chengjie Sun ; Bingquan Liu ; Lin Sun
Abstract: Quantifying the semantic relevance between questions and their candidate answers is essential to answer detection in social media corpora. In this paper, a deep belief network is proposed to model the semantic relevance for question-answer pairs. Observing the textual similarity between the community-driven questionanswering (cQA) dataset and the forum dataset, we present a novel learning strategy to promote the performance of our method on the social community datasets without hand-annotating work. The experimental results show that our method outperforms the traditional approaches on both the cQA and the forum corpora.
3 0.73630404 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition
Author: Partha Pratim Talukdar ; Fernando Pereira
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.
4 0.66604191 71 acl-2010-Convolution Kernel over Packed Parse Forest
Author: Min Zhang ; Hui Zhang ; Haizhou Li
Abstract: This paper proposes a convolution forest kernel to effectively explore rich structured features embedded in a packed parse forest. As opposed to the convolution tree kernel, the proposed forest kernel does not have to commit to a single best parse tree, is thus able to explore very large object spaces and much more structured features embedded in a forest. This makes the proposed kernel more robust against parsing errors and data sparseness issues than the convolution tree kernel. The paper presents the formal definition of convolution forest kernel and also illustrates the computing algorithm to fast compute the proposed convolution forest kernel. Experimental results on two NLP applications, relation extraction and semantic role labeling, show that the proposed forest kernel significantly outperforms the baseline of the convolution tree kernel. 1
5 0.65777421 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields
Author: Jackie Chi Kit Cheung ; Gerald Penn
Abstract: One goal of natural language generation is to produce coherent text that presents information in a logical order. In this paper, we show that topological fields, which model high-level clausal structure, are an important component of local coherence in German. First, we show in a sentence ordering experiment that topological field information improves the entity grid model of Barzilay and Lapata (2008) more than grammatical role and simple clausal order information do, particularly when manual annotations of this information are not available. Then, we incorporate the model enhanced with topological fields into a natural language generation system that generates constituent orders for German text, and show that the added coherence component improves performance slightly, though not statistically significantly.
6 0.65300936 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
7 0.64854753 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
8 0.64196473 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."
9 0.64193261 169 acl-2010-Learning to Translate with Source and Target Syntax
10 0.64167047 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection
11 0.6406374 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
12 0.63974589 128 acl-2010-Grammar Prototyping and Testing with the LinGO Grammar Matrix Customization System
13 0.63951176 248 acl-2010-Unsupervised Ontology Induction from Text
14 0.63849658 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
15 0.63835323 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
16 0.63749129 39 acl-2010-Automatic Generation of Story Highlights
17 0.63652849 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
18 0.63641262 245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries
19 0.63626391 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
20 0.6353606 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses