acl acl2013 acl2013-267 knowledge-graph by maker-knowledge-mining

267 acl-2013-PARMA: A Predicate Argument Aligner


Source: pdf

Author: Travis Wolfe ; Benjamin Van Durme ; Mark Dredze ; Nicholas Andrews ; Charley Beller ; Chris Callison-Burch ; Jay DeYoung ; Justin Snyder ; Jonathan Weese ; Tan Xu ; Xuchen Yao

Abstract: We introduce PARMA, a system for cross-document, semantic predicate and argument alignment. Our system combines a number of linguistic resources familiar to researchers in areas such as recognizing textual entailment and question answering, integrating them into a simple discriminative model. PARMA achieves state of the art results on an existing and a new dataset. We suggest that previous efforts have focussed on data that is biased and too easy, and we provide a more difficult dataset based on translation data with a low baseline which we beat by 17% F1.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 PARMA achieves state of the art results on an existing and a new dataset. [sent-3, score-0.039]

2 We suggest that previous efforts have focussed on data that is biased and too easy, and we provide a more difficult dataset based on translation data with a low baseline which we beat by 17% F1. [sent-4, score-0.049]

3 1 Introduction A key step of the information extraction pipeline is entity disambiguation, in which discovered entities across many sentences and documents must be organized to represent real world entities. [sent-5, score-0.143]

4 The NLP community has a long history of entity disambiguation both within and across documents. [sent-6, score-0.107]

5 Commonly a situational predicate is taken to correspond to either an event or a state, lexically realized in verbs such as “elect” or nominalizations such as “election”. [sent-8, score-0.337]

6 Similar to entity coreference resolution, almost all of this work assumes unanchored mentions: predicate argument tuples are grouped together based on coreferent events. [sent-9, score-0.567]

7 The first work on event coreference dates back to Bagga and Baldwin (1999). [sent-10, score-0.232]

8 As with unanchored entity disambiguation, these methods rely on clustering methods and evaluation metrics. [sent-13, score-0.118]

9 Another view of predicate disambiguation seeks to link or align predicate argument tuples to an existing anchored resource containing references to events or actions, similar to anchored entity disambiguation (entity linking) (Dredze et al., 2010). [sent-14, score-0.603]

10 The most relevant, and perhaps only, work in this area is that of Roth and Frank (2012) who linked predicates across document pairs, measuring the F1 of aligned pairs. [sent-16, score-0.309]

11 Here we present PARMA, a new system for predicate argument alignment. [sent-17, score-0.303]

12 As opposed to Roth and Frank, PARMA is designed as a trainable platform for the incorporation of the sort of lexical semantic resources used in the related areas of Recognizing Textual Entailment (RTE) and Question Answering (QA). [sent-18, score-0.086]

13 We demonstrate the effectiveness of this approach by achieving state of the art performance on the data of Roth and Frank despite having little relevant training data. [sent-19, score-0.039]

14 In response, we evaluate on a new and more challenging dataset for predicate argument alignment derived from multiple translation data. [sent-21, score-0.478]

15 We release PARMA as a new framework for the incorporation and evaluation of new resources for predicate argument alignment. [sent-22, score-0.34]

16 2 PARMA PARMA (Predicate ARguMent Aligner) is a pipelined system with a wide variety of features used to align predicates and arguments in two documents. [sent-23, score-0.289]

17 Predicates are represented as mention spans and arguments are represented as coreference chains (sets of mention spans) provided by in-document coreference resolution systems such as the one included in the Stanford NLP toolkit. [sent-24, score-0.708]

18 Results indicated that the chains are of sufficient quality so as not to limit performance, though future work may relax this assumption. [sent-25, score-0.085]

19 Figure 1: Example of gold-standard alignment pairs from Roth and Frank’s data set and our data set created from the LDC’s Multiple Translation Corpora. [sent-29, score-0.13]

20 The RF data set exhibits high lexical overlap, where most of the alignments are between identical words like police-police and said-said. [sent-30, score-0.087]

21 The LDC MTC was constructed to increase lexical diversity, leading to more challenging alignments like veranda-balcony and tent-camp. [sent-31, score-0.183]

22 We refer to a predicate or an argument as an “item” with type predicate or argument. [sent-32, score-0.482]

23 An alignment between two documents is a subset of all pairs of items in either document with the same type. [sent-33, score-0.234]

24 We call the two documents being aligned the source document S and the target document T. [sent-34, score-0.324]

25 Items are referred to by their index, and a_ij is a binary variable representing an alignment between item i in S and item j in T. [sent-35, score-0.176]

26 A full alignment is an assignment A = {a_ij : i ∈ N_S, j ∈ N_T}, where N_S and N_T are the sets of item indices for S and T, respectively. [sent-36, score-0.09]

27 We train a logistic regression model on example alignments and maximize the likelihood of a document alignment under the assumption that the item alignments are independent. [sent-37, score-0.28]

28 Our objective is to maximize the log-likelihood of all p(S, T) with an L1 regularizer (with parameter λ). [sent-38, score-0.038]
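
Written out explicitly (a reconstruction from the prose above, not copied verbatim from the paper; f(i, j, S, T) is an assumed pair-feature function and σ the logistic sigmoid):

```latex
\max_{w}\; \sum_{(S,T)} \sum_{i \in N_S} \sum_{j \in N_T}
  \log p(a_{ij} \mid S, T; w) \;-\; \lambda \lVert w \rVert_1,
\qquad
p(a_{ij} = 1 \mid S, T; w) = \sigma\!\left(w^{\top} f(i, j, S, T)\right).
```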

29 After learning model parameters w by regularized maximum likelihood on training data, we introduce a threshold τ on alignment probabilities to get a classifier. [sent-39, score-0.09]

30 We perform line search on τ and choose the value that maximizes F1 on dev data. [sent-40, score-0.041]
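
A minimal sketch of this training-and-thresholding recipe, using scikit-learn for the L1-regularized logistic regression; the variables X_train, y_train, X_dev, and y_dev are illustrative placeholders, not artifacts of the paper:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def train_aligner(X_train, y_train, lam=1.0):
    # sklearn's C is the inverse regularization strength, so C = 1/lambda.
    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0 / lam)
    model.fit(X_train, y_train)
    return model

def tune_threshold(model, X_dev, y_dev):
    # Line search over tau: keep the alignment-probability cutoff
    # that maximizes F1 on the dev data.
    probs = model.predict_proba(X_dev)[:, 1]
    best_tau, best_f1 = 0.5, -1.0
    for tau in np.linspace(0.05, 0.95, 19):
        f1 = f1_score(y_dev, probs >= tau)
        if f1 > best_f1:
            best_tau, best_f1 = tau, f1
    return best_tau

# At test time, item pair (i, j) is aligned iff p(a_ij = 1 | S, T) >= tau.
```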

31 2.1 Features The focus of PARMA is the integration of a diverse range of features based on existing lexical semantic resources. [sent-43, score-0.049]

32 The following features cover the spectrum from high-precision to high-recall. (Note that type is not the same thing as part of speech: we allow nominal predicates like "death".) [sent-45, score-0.146]

33 Each feature has access to the proposed argument or predicate spans to be linked and the containing sentences as context. [sent-47, score-0.374]

34 For extra training data, we pool material from different datasets and use the multi-domain split feature space approach to learn dataset-specific behaviors (Daumé, 2007). [sent-49, score-0.086]
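
A sketch of the Daumé (2007) "frustratingly easy" feature-space split referenced here: every base feature fires once in a shared copy and once in a dataset-specific copy. The dict-based feature representation and names are illustrative assumptions:

```python
def augment_features(features, domain):
    # Duplicate each feature into a shared ("general") space and a
    # domain-specific space, so the learner can pick up both general
    # and dataset-specific weights.
    augmented = {}
    for name, value in features.items():
        augmented["general::" + name] = value
        augmented[domain + "::" + name] = value
    return augmented

# Example: a training pair drawn from the MTC portion of the pooled data.
print(augment_features({"same_lemma": 1.0}, domain="MTC"))
# {'general::same_lemma': 1.0, 'MTC::same_lemma': 1.0}
```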

35 Features in general are defined over mention spans or head tokens, but we split these features to create separate feature-spaces for predicates and arguments. [sent-50, score-0.3]

36 For argument coref chains, we heuristically choose a canonical mention to represent each chain, and some features only look at this canonical mention. [sent-51, score-0.436]

37 The canonical mention is chosen based on length (4), information about the head word (5), and position in the document (6). [sent-52, score-0.143]

38 In most cases, coref chains that are longer than one mention are proper nouns, and the canonical mention is the first and longest mention (outranking pronominal references and other name shortenings). [sent-53, score-0.335]

39 PPDB We use lexical features from the Paraphrase Database (PPDB) (Ganitkevitch et al., 2013). [sent-54, score-0.049]

40 We make use of the English lexical portion which contains over 7 million rules for rewriting terms like “planet” and “earth”. [sent-57, score-0.049]

41 Footnotes: (4) in tokens, not counting some words like determiners and auxiliary verbs; (5) like its part-of-speech tag and whether it was tagged as a named entity; (6) mentions that appear earlier in the document and earlier in a given sentence are given preference. [sent-59, score-0.202]

42 For each of these rule probabilities, which we treat as independent experts, we find all rules that match the head tokens of a given alignment and add features for the max and the harmonic mean of the log probabilities of the resulting rule set. [sent-60, score-0.09]
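
A sketch of turning a matched rule set into these two expert features; the input format (a list of per-rule log probabilities) is a hypothetical convenience, and the harmonic mean is taken literally over the log probabilities as the text states:

```python
import math

def ppdb_features(rule_log_probs):
    # rule_log_probs: log probabilities of every PPDB rule whose sides
    # match the two head tokens of a candidate alignment.
    if not rule_log_probs:
        return {"ppdb_max": 0.0, "ppdb_hmean": 0.0}  # no rule matched
    # Harmonic mean over the (negative) log probabilities; assumes all
    # rule probabilities are strictly below 1, so no log prob is 0.
    hmean = len(rule_log_probs) / sum(1.0 / lp for lp in rule_log_probs)
    return {"ppdb_max": max(rule_log_probs), "ppdb_hmean": hmean}

print(ppdb_features([math.log(0.4), math.log(0.1)]))
```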

43 FrameNet FrameNet is a lexical database based on Charles Fillmore’s Frame Semantics (Fillmore, 1976; Baker et al., 1998). [sent-61, score-0.089]

44 The database (and the theory) is organized around semantic frames that can be thought of as descriptions of events. [sent-63, score-0.106]

45 The Destroying frame, for instance, includes the frame elements Destroyer or Cause, and Undergoer. [sent-65, score-0.066]

46 Frames are related to other frames through inheritance and perspectivization. [sent-66, score-0.066]

47 For instance, the frames Commerce_buy and Commerce_sell (with respective lexical realizations "buy" and "sell") are both perspectives of Commerce_goods-transfer (no lexical realizations), which inherits from Transfer (with lexical realization "transfer"). [sent-67, score-0.288]

48 We compute a shortest path between headwords given edges (hypernym, hyponym, perspectivized parent and child) in FrameNet and bucket by distance to get features. [sent-68, score-0.069]
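
A sketch of this bucketed path-distance feature; frame_graph, an adjacency map from a frame to its neighbors under the listed edge types, is an assumed preprocessing product rather than the actual FrameNet API, and the bucket edges are illustrative:

```python
from collections import deque

def frame_distance(frame_graph, start, goal, max_depth=8):
    # Breadth-first search over frame-to-frame edges (inheritance and
    # perspectivization, in both directions); None if no path is found.
    if start == goal:
        return 0
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        frame, depth = frontier.popleft()
        if depth >= max_depth:
            continue
        for neighbor in frame_graph.get(frame, ()):
            if neighbor == goal:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return None

def bucket(dist):
    # Discretize the distance into a small set of binary feature names.
    if dist is None:
        return "frame_path:none"
    for edge in (1, 2, 4, 8):
        if dist <= edge:
            return "frame_path:<=%d" % edge
    return "frame_path:>8"
```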

49 TED Alignments Given two predicates or arguments in two sentences, we attempt to align the two sentences they appear in using a Tree Edit Distance (TED) model that aligns two dependency trees, based on the work described by Yao et al. (2013). [sent-70, score-0.289]

50 The TED model aligns one tree with the other using the dynamic programming algorithm of Zhang and Shasha (1989) with three predefined edits: deletion, insertion, and substitution, seeking a solution yielding the minimum edit cost. [sent-73, score-0.082]

51 Once we have built a tree alignment, we extract features for 1) whether the heads of the two phrases are aligned and 2) the count of how many tokens are aligned in both trees. [sent-74, score-0.108]

52 WordNet WordNet (Miller, 1995) is a database of information about words (synonyms, hypernyms, etc.). [sent-75, score-0.04]

53 For features, we find the shortest path linking the head words of two mentions using synonym, hypernym, hyponym, meronym, and holonym edges and bucket the length. [sent-80, score-0.108]
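
The same BFS-and-bucket recipe as in the FrameNet sketch applies here; the WordNet-specific part is the edge expansion, shown below with NLTK's WordNet interface (synonymy needs no edge, since two lemmas in the same synset already share a node):

```python
from nltk.corpus import wordnet as wn

def wordnet_neighbors(synset):
    # Synsets one edge away via hypernym, hyponym, meronym, and
    # holonym relations.
    return (synset.hypernyms() + synset.hyponyms()
            + synset.part_meronyms() + synset.member_meronyms()
            + synset.substance_meronyms()
            + synset.part_holonyms() + synset.member_holonyms()
            + synset.substance_holonyms())

# BFS from any synset of one head word to any synset of the other,
# then bucket the resulting path length as in the FrameNet feature.
starts = wn.synsets("veranda")
goals = set(wn.synsets("balcony"))
```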

54 String Transducer To represent similarity between arguments that are names, we use a stochastic edit distance model. [sent-81, score-0.209]

55 This stochastic string-to-string transducer has latent "edit" and "no edit" regions, where the latent regions allow the model to assign high probability to contiguous regions of edits (or no edits), which are typical between variations of person names. [sent-82, score-0.268]

56 In an edit region, parameters govern the relative probability of insertion, deletion, substitution, and copy operations. [sent-83, score-0.082]

57 Since in-domain name pairs were not available, we picked 10,000 entities at random from Wikipedia to estimate the transducer parameters. [sent-86, score-0.142]

58 The entity labels were used as weak supervision during EM, as in Andrews et al. (2012). [sent-87, score-0.058]

59 For a pair of mention spans, we compute the conditional log-likelihood of the two mentions going both ways, take the max, and then bucket to get binary features. [sent-89, score-0.191]

60 We duplicate these features with copies that only fire if both mentions are tagged as PER, ORG or LOC. [sent-90, score-0.039]

61 Extended Event Coreference Bank Based on the dataset of Bejan and Harabagiu (2010), Lee et al. [sent-94, score-0.049]

62 (2012) introduced the Extended Event Coreference Bank (EECB) to evaluate cross-document event coreference. [sent-95, score-0.086]

63 EECB provides document clusters, within which entities and events may corefer. [sent-96, score-0.142]

64 To produce source and target document pairs, we select the first document within every cluster as the source and each of the remaining documents as target documents (i.e., a cluster of n documents yields n - 1 source-target pairs). [sent-99, score-0.322]
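
A minimal sketch of this pairing scheme, assuming clusters is a list of per-cluster document lists (names illustrative):

```python
def eecb_pairs(clusters):
    # First document in each cluster is the source; every remaining
    # document is a target, giving n - 1 pairs for a cluster of size n.
    pairs = []
    for cluster in clusters:
        source = cluster[0]
        pairs.extend((source, target) for target in cluster[1:])
    return pairs
```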

65 Roth and Frank The only existing dataset for our task is from Roth and Frank (2012) (RF), who (footnote 7: https://github.com/cnap/anno-pipeline) [sent-103, score-0.085]

66 annotated documents from the English Gigaword Fifth Edition corpus (Parker et al., 2011). [sent-104, score-0.052]

67 The data was generated by clustering similar news stories from Gigaword using TF-IDF cosine similarity of their headlines. [sent-106, score-0.071]

68 This corpus is small, containing only 10 document pairs in the development set and 60 in the test set. [sent-107, score-0.149]

69 To increase the training size, we train PARMA with 150 randomly selected document pairs from both EECB and MTC, and the entire dev set from Roth and Frank, using multi-domain feature splitting. [sent-108, score-0.19]

70 We tuned the threshold τ on the Roth and Frank dev set, but chose the regularizer λ based on a grid search on a 5-fold version of the EECB dataset. [sent-109, score-0.079]

71 Multiple Translation Corpora We constructed a new predicate argument alignment dataset based on the LDC Multiple Translation Corpora (MTC),8 which consist of multiple English translations for foreign news articles. [sent-110, score-0.475]

72 Since these multiple translations are semantically equivalent, they provide a good resource for aligned predicate argument pairs. [sent-111, score-0.39]

73 However, finding good pairs is a challenge: we want pairs with significant overlap so that they have predicates and arguments that align, but not documents that are trivial rewrites of each other. [sent-112, score-0.485]

74 Roth and Frank selected document pairs based on clustering, meaning that the pairs had high lexical overlap, often resulting in minimal rewrites of each other. [sent-113, score-0.281]

75 To create a more challenging dataset, we selected document pairs from the multiple translations that minimize the lexical overlap (in English). [sent-115, score-0.34]

76 Because these are translations, we know that there are equivalent predicates and arguments in each pair, and that any lexical variation preserves meaning. [sent-116, score-0.286]

77 Therefore, we can select pairs with minimal lexical overlap in order to create a dataset that truly stresses lexically-based alignment systems. [sent-117, score-0.252]
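
A sketch of one way to pick the least-overlapping translation pair per article, scoring candidates by Jaccard similarity over bags of lemmas; the lemma-set representation and the per-article selection rule are assumptions for illustration, not the paper's exact procedure:

```python
from itertools import combinations

def jaccard(lemmas_a, lemmas_b):
    # Jaccard index over two sets of lemmas (0 = disjoint, 1 = identical).
    union = lemmas_a | lemmas_b
    return len(lemmas_a & lemmas_b) / len(union) if union else 0.0

def least_overlapping_pair(translations):
    # translations: one lemma set per English translation of an article.
    # Keep the pair of translations with the smallest lexical overlap.
    return min(combinations(translations, 2),
               key=lambda pair: jaccard(pair[0], pair[1]))
```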

78 Each document pair has a correspondence between sentences, and we run GIZA++ on these sentences to produce token-level alignments. [sent-118, score-0.109]

79 We take all aligned nouns as arguments and all aligned verbs (excluding be-verbs, light verbs, and reporting verbs) as predicates. [sent-119, score-0.234]

80 Red squares show the F1 for individual document pairs drawn from Roth and Frank’s data set, and black circles show F1 for our Multiple Translation Corpora test set. [sent-121, score-0.149]

81 The x-axis represents the cosine similarity between the document pairs. [sent-122, score-0.18]

82 On the RF data set, performance is correlated with lexical similarity. [sent-123, score-0.082]

83 This could be due to the fact that some of the documents in the RF sets are minor re-writes of the same newswire story, making them easy to align. [sent-125, score-0.052]

84 The number of substitutions we perform can vary the "relatedness" of the two documents in terms of the predicates and arguments that they talk about. [sent-127, score-0.289]

85 This reflects our expectation of real-world data, where we do not expect perfect overlap in predicates and arguments between a source and target document, as one would in translation data. [sent-128, score-0.31]

86 Lastly, we prune any document pairs that have more than 80 predicates or arguments or have a Jaccard index on bags of lemmas greater than 0. [sent-129, score-0.386]

87 Given an alignment to be scored A and a reference alignment B which contains SURE and POSSIBLE links, Bs and Bp respectively, precision and recall are: P = |A ∩ Bp| / |A|, R = |A ∩ Bs| / |Bs| (1). [Table 1: F1 of PARMA and the lemma baseline on the RF, EECB, and MTC data sets.] [sent-134, score-0.18]
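
A sketch of Equation 1's scoring, with alignments represented as sets of (i, j) index pairs and SURE assumed to be a subset of POSSIBLE:

```python
def alignment_prf(A, B_sure, B_possible):
    # Precision is measured against POSSIBLE links, recall against
    # SURE links; F1 is their harmonic mean.
    p = len(A & B_possible) / len(A) if A else 0.0
    r = len(A & B_sure) / len(B_sure) if B_sure else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    return p, r, f1
```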

88 Lemma baseline Following Roth and Frank, we include a lemma baseline, in which two predicates or arguments align if they have the same lemma (footnote 9). [sent-141, score-0.366]

89 4 Results On every dataset, PARMA significantly improves over the lemma baselines (Table 1). [sent-142, score-0.126]

90 On RF we also improve over the approach of Roth and Frank, the best published method for this task, making PARMA the state of the art system. [sent-143, score-0.039]

91 We also observe that MTC was more challenging than the other datasets, with a lower lemma baseline. [sent-146, score-0.113]

92 Figure 2 shows the correlation between document similarity and document F1 score for RF and MTC. [sent-147, score-0.254]

93 Additionally, the MTC dataset contains more data with low cosine similarity than RF does. [sent-149, score-0.12]

94 5 Conclusion PARMA achieves state of the art performance on three datasets for predicate argument alignment. [sent-150, score-0.379]

95 It builds on the development of lexical semantic resources and provides a platform for learning to utilize these resources. [sent-151, score-0.049]

96 (Footnote 9: We could not reproduce the lemma baseline from Roth and Frank, shown in Table 1, due to a difference in lemmatizers.) [sent-152, score-0.077]

97 Additionally, we show that task difficulty can be strongly tied to lexical similarity if the evaluation dataset is not chosen carefully, and this provides an artificially high baseline in previous work. [sent-155, score-0.134]

98 PARMA is robust to drops in lexical similarity and shows large improvements in those cases. [sent-156, score-0.085]

99 Aligning predicate argument structures in monolingual comparable texts: A new corpus for a new task. [sent-241, score-0.303]

100 Answer extraction as sequence tagging with tree edit distance. [sent-246, score-0.082]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('parma', 0.536), ('roth', 0.226), ('mtc', 0.208), ('rf', 0.204), ('frank', 0.2), ('predicate', 0.179), ('eecb', 0.149), ('ppdb', 0.149), ('predicates', 0.146), ('coreference', 0.146), ('argument', 0.124), ('document', 0.109), ('arguments', 0.091), ('alignment', 0.09), ('andrews', 0.087), ('event', 0.086), ('mention', 0.083), ('edit', 0.082), ('lemma', 0.077), ('overlap', 0.073), ('spans', 0.071), ('transducer', 0.069), ('bucket', 0.069), ('frames', 0.066), ('frame', 0.066), ('bejan', 0.065), ('framenet', 0.065), ('durme', 0.062), ('canonical', 0.06), ('police', 0.06), ('balcony', 0.06), ('coref', 0.06), ('unanchored', 0.06), ('entity', 0.058), ('edits', 0.058), ('ldc', 0.056), ('fillmore', 0.055), ('aligned', 0.054), ('documents', 0.052), ('align', 0.052), ('dredze', 0.051), ('dataset', 0.049), ('yao', 0.049), ('lexical', 0.049), ('chains', 0.049), ('xuchen', 0.049), ('crossdocument', 0.049), ('commerce', 0.049), ('alleged', 0.049), ('diplomatic', 0.049), ('disambiguation', 0.049), ('benjamin', 0.048), ('maryland', 0.048), ('regions', 0.047), ('parker', 0.046), ('arrested', 0.046), ('anchored', 0.046), ('bagga', 0.046), ('gigaword', 0.045), ('israeli', 0.043), ('rewrites', 0.043), ('bomb', 0.043), ('napoles', 0.043), ('item', 0.043), ('ted', 0.041), ('dev', 0.041), ('database', 0.04), ('realizations', 0.04), ('pairs', 0.04), ('mentions', 0.039), ('resolution', 0.039), ('bank', 0.039), ('art', 0.039), ('paraphrase', 0.038), ('alignments', 0.038), ('hyponym', 0.038), ('regularizer', 0.038), ('ganitkevitch', 0.038), ('harabagiu', 0.038), ('datasets', 0.037), ('incorporation', 0.037), ('bs', 0.037), ('lexically', 0.037), ('bp', 0.037), ('challenging', 0.036), ('similarity', 0.036), ('github', 0.036), ('verbs', 0.035), ('charles', 0.035), ('wordnet', 0.035), ('nicholas', 0.035), ('buy', 0.035), ('cosine', 0.035), ('rte', 0.034), ('van', 0.034), ('correlated', 0.033), ('aligner', 0.033), ('baker', 0.033), ('entities', 0.033), ('translations', 0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 267 acl-2013-PARMA: A Predicate Argument Aligner

Author: Travis Wolfe ; Benjamin Van Durme ; Mark Dredze ; Nicholas Andrews ; Charley Beller ; Chris Callison-Burch ; Jay DeYoung ; Justin Snyder ; Jonathan Weese ; Tan Xu ; Xuchen Yao

Abstract: We introduce PARMA, a system for cross-document, semantic predicate and argument alignment. Our system combines a number of linguistic resources familiar to researchers in areas such as recognizing textual entailment and question answering, integrating them into a simple discriminative model. PARMA achieves state of the art results on an existing and a new dataset. We suggest that previous efforts have focussed on data that is biased and too easy, and we provide a more difficult dataset based on translation data with a low baseline which we beat by 17% F1.

2 0.18196177 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates

Author: Tiziano Flati ; Roberto Navigli

Abstract: We present SPred, a novel method for the creation of large repositories of semantic predicates. We start from existing collocations to form lexical predicates (e.g., break ∗) and learn the semantic classes that best fit the ∗ argument. To do this, we extract all the occurrences in Wikipedia which match the predicate and abstract its arguments to general semantic classes (e.g., break BODY PART, break AGREEMENT, etc.). Our experiments show that we are able to create a large collection of semantic predicates from the Oxford Advanced Learner’s Dictionary with high precision and recall, and perform well against the most similar approach.

3 0.17183056 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction

Author: Peifeng Li ; Qiaoming Zhu ; Guodong Zhou

Abstract: As a paratactic language, sentence-level argument extraction in Chinese suffers much from the frequent occurrence of ellipsis with regard to inter-sentence arguments. To resolve such problem, this paper proposes a novel global argument inference model to explore specific relationships, such as Coreference, Sequence and Parallel, among relevant event mentions to recover those intersentence arguments in the sentence, discourse and document layers which represent the cohesion of an event or a topic. Evaluation on the ACE 2005 Chinese corpus justifies the effectiveness of our global argument inference model over a state-of-the-art baseline. 1

4 0.1557761 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling

Author: Egoitz Laparra ; German Rigau

Abstract: This paper presents a novel deterministic algorithm for implicit Semantic Role Labeling. The system exploits a very simple but relevant discursive property, the argument coherence over different instances of a predicate. The algorithm solves the implicit arguments sequentially, exploiting not only explicit but also the implicit arguments previously solved. In addition, we empirically demonstrate that the algorithm obtains very competitive and robust performances with respect to supervised approaches that require large amounts of costly training data.

5 0.15428641 9 acl-2013-A Lightweight and High Performance Monolingual Word Aligner

Author: Xuchen Yao ; Benjamin Van Durme ; Chris Callison-Burch ; Peter Clark

Abstract: Fast alignment is essential for many natural language tasks. But in the setting of monolingual alignment, previous work has not been able to align more than one sentence pair per second. We describe a discriminatively trained monolingual word aligner that uses a Conditional Random Field to globally decode the best alignment with features drawn from source and target sentences. Using just part-of-speech tags and WordNet as external resources, our aligner gives state-of-the-art result, while being an order-of-magnitude faster than the previous best performing system.

6 0.14116241 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features

7 0.13267528 314 acl-2013-Semantic Roles for String to Tree Machine Translation

8 0.13061987 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

9 0.13053526 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

10 0.12992661 27 acl-2013-A Two Level Model for Context Sensitive Inference Rules

11 0.1239128 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features

12 0.11803173 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

13 0.11644331 296 acl-2013-Recognizing Identical Events with Graph Kernels

14 0.10833964 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection

15 0.1080723 224 acl-2013-Learning to Extract International Relations from Political Context

16 0.10794783 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models

17 0.10620578 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data

18 0.10404044 139 acl-2013-Entity Linking for Tweets

19 0.095872261 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference

20 0.093438171 265 acl-2013-Outsourcing FrameNet to the Crowd


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.241), (1, 0.032), (2, 0.031), (3, -0.154), (4, -0.037), (5, 0.174), (6, -0.083), (7, 0.133), (8, -0.017), (9, 0.031), (10, 0.06), (11, -0.051), (12, 0.014), (13, 0.02), (14, 0.032), (15, -0.008), (16, 0.034), (17, 0.11), (18, 0.121), (19, 0.091), (20, -0.061), (21, 0.017), (22, -0.088), (23, -0.034), (24, 0.029), (25, 0.023), (26, -0.024), (27, 0.052), (28, 0.145), (29, 0.011), (30, -0.065), (31, -0.01), (32, -0.05), (33, 0.009), (34, -0.03), (35, -0.063), (36, -0.074), (37, 0.038), (38, 0.038), (39, 0.015), (40, -0.003), (41, -0.012), (42, 0.013), (43, 0.029), (44, 0.014), (45, 0.038), (46, -0.003), (47, -0.159), (48, 0.065), (49, 0.003)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92759573 267 acl-2013-PARMA: A Predicate Argument Aligner

Author: Travis Wolfe ; Benjamin Van Durme ; Mark Dredze ; Nicholas Andrews ; Charley Beller ; Chris Callison-Burch ; Jay DeYoung ; Justin Snyder ; Jonathan Weese ; Tan Xu ; Xuchen Yao

Abstract: We introduce PARMA, a system for cross-document, semantic predicate and argument alignment. Our system combines a number of linguistic resources familiar to researchers in areas such as recognizing textual entailment and question answering, integrating them into a simple discriminative model. PARMA achieves state of the art results on an existing and a new dataset. We suggest that previous efforts have focussed on data that is biased and too easy, and we provide a more difficult dataset based on translation data with a low baseline which we beat by 17% F1.

2 0.69687665 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling

Author: Egoitz Laparra ; German Rigau

Abstract: This paper presents a novel deterministic algorithm for implicit Semantic Role Labeling. The system exploits a very simple but relevant discursive property, the argument coherence over different instances of a predicate. The algorithm solves the implicit arguments sequentially, exploiting not only explicit but also the implicit arguments previously solved. In addition, we empirically demonstrate that the algorithm obtains very competitive and robust performances with respect to supervised approaches that require large amounts of costly training data.

3 0.66132617 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates

Author: Tiziano Flati ; Roberto Navigli

Abstract: We present SPred, a novel method for the creation of large repositories of semantic predicates. We start from existing collocations to form lexical predicates (e.g., break ∗) and learn the semantic classes that best fit the ∗ argument. To do this, we extract all the occurrences in Wikipedia which match the predicate and abstract its arguments to general semantic classes (e.g., break BODY PART, break AGREEMENT, etc.). Our experiments show that we are able to create a large collection of semantic predicates from the Oxford Advanced Learner’s Dictionary with high precision and recall, and perform well against the most similar approach.

4 0.63887322 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

Author: Zhenhua Tian ; Hengheng Xiang ; Ziqi Liu ; Qinghua Zheng

Abstract: This paper presents an unsupervised random walk approach to alleviate data sparsity for selectional preferences. Based on the measure of preferences between predicates and arguments, the model aggregates all the transitions from a given predicate to its nearby predicates, and propagates their argument preferences as the given predicate’s smoothed preferences. Experimental results show that this approach outperforms several state-of-the-art methods on the pseudo-disambiguation task, and it better correlates with human plausibility judgements.

5 0.57090515 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

Author: Sebastian Martschat

Abstract: We present an unsupervised model for coreference resolution that casts the problem as a clustering task in a directed labeled weighted multigraph. The model outperforms most systems participating in the English track of the CoNLL’ 12 shared task.

6 0.57013577 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features

7 0.55379885 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

8 0.54887491 314 acl-2013-Semantic Roles for String to Tree Machine Translation

9 0.54854912 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction

10 0.54377884 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution

11 0.5228706 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features

12 0.50480354 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection

13 0.50441384 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures

14 0.5012694 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference

15 0.48865297 205 acl-2013-Joint Apposition Extraction with Syntactic and Semantic Constraints

16 0.48051688 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models

17 0.47745514 224 acl-2013-Learning to Extract International Relations from Political Context

18 0.47197282 9 acl-2013-A Lightweight and High Performance Monolingual Word Aligner

19 0.47104785 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data

20 0.46656749 367 acl-2013-Universal Conceptual Cognitive Annotation (UCCA)


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.06), (6, 0.032), (11, 0.054), (14, 0.024), (15, 0.018), (24, 0.062), (26, 0.056), (28, 0.017), (35, 0.071), (42, 0.047), (48, 0.058), (60, 0.186), (64, 0.016), (70, 0.052), (88, 0.048), (90, 0.037), (95, 0.107)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.89949059 63 acl-2013-Automatic detection of deception in child-produced speech using syntactic complexity features

Author: Maria Yancheva ; Frank Rudzicz

Abstract: It is important that the testimony of children be admissible in court, especially given allegations of abuse. Unfortunately, children can be misled by interrogators or might offer false information, with dire consequences. In this work, we evaluate various parameterizations of five classifiers (including support vector machines, neural networks, and random forests) in deciphering truth from lies given transcripts of interviews with 198 victims of abuse between the ages of 4 and 7. These evaluations are performed using a novel set of syntactic features, including measures of complexity. Our results show that sentence length, the mean number of clauses per utterance, and the StajnerMitkov measure of complexity are highly informative syntactic features, that classification accuracy varies greatly by the age of the speaker, and that accuracy up to 91.7% can be achieved by support vector machines given a sufficient amount of data.

same-paper 2 0.83239675 267 acl-2013-PARMA: A Predicate Argument Aligner

Author: Travis Wolfe ; Benjamin Van Durme ; Mark Dredze ; Nicholas Andrews ; Charley Beller ; Chris Callison-Burch ; Jay DeYoung ; Justin Snyder ; Jonathan Weese ; Tan Xu ; Xuchen Yao

Abstract: We introduce PARMA, a system for cross-document, semantic predicate and argument alignment. Our system combines a number of linguistic resources familiar to researchers in areas such as recognizing textual entailment and question answering, integrating them into a simple discriminative model. PARMA achieves state of the art results on an existing and a new dataset. We suggest that previous efforts have focussed on data that is biased and too easy, and we provide a more difficult dataset based on translation data with a low baseline which we beat by 17% F1.

3 0.81408536 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art

Author: Peter A. Rankel ; John M. Conroy ; Hoa Trang Dang ; Ani Nenkova

Abstract: How good are automatic content metrics for news summary evaluation? Here we provide a detailed answer to this question, with a particular focus on assessing the ability of automatic evaluations to identify statistically significant differences present in manual evaluation of content. Using four years of data from the Text Analysis Conference, we analyze the performance of eight ROUGE variants in terms of accuracy, precision and recall in finding significantly different systems. Our experiments show that some of the neglected variants of ROUGE, based on higher order n-grams and syntactic dependencies, are most accurate across the years; the commonly used ROUGE-1 scores find too many significant differences between systems which manual evaluation would deem comparable. We also test combinations of ROUGE variants and find that they considerably improve the accuracy of automatic prediction.

4 0.71149921 97 acl-2013-Cross-lingual Projections between Languages from Different Families

Author: Mo Yu ; Tiejun Zhao ; Yalong Bai ; Hao Tian ; Dianhai Yu

Abstract: Cross-lingual projection methods can benefit from resource-rich languages to improve performances of NLP tasks in resources-scarce languages. However, these methods confronted the difficulty of syntactic differences between languages especially when the pair of languages varies greatly. To make the projection method well-generalize to diverse languages pairs, we enhance the projection method based on word alignments by introducing target-language word representations as features and proposing a novel noise removing method based on these word representations. Experiments showed that our methods improve the performances greatly on projections between English and Chinese.

5 0.71099329 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

Author: Emmanuel Lassalle ; Pascal Denis

Abstract: This paper proposes a new method for significantly improving the performance of pairwise coreference models. Given a set of indicators, our method learns how to best separate types of mention pairs into equivalence classes for which we construct distinct classification models. In effect, our approach finds an optimal feature space (derived from a base feature set and indicator set) for discriminating coreferential mention pairs. Although our approach explores a very large space of possible feature spaces, it remains tractable by exploiting the structure of the hierarchies built from the indicators. Our experiments on the CoNLL-2012 Shared Task English datasets (gold mentions) indicate that our method is robust relative to different clustering strategies and evaluation metrics, showing large and consistent improvements over a single pairwise model using the same base features. Our best system obtains a competitive average F1 of 67.2 over MUC, B3, and CEAF which, despite its simplicity, places it above the mean score of other systems on these datasets.

6 0.7107867 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain

7 0.71070695 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

8 0.71046245 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation

9 0.70953703 187 acl-2013-Identifying Opinion Subgroups in Arabic Online Discussions

10 0.70903152 78 acl-2013-Categorization of Turkish News Documents with Morphological Analysis

11 0.70812595 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification

12 0.70719635 117 acl-2013-Detecting Turnarounds in Sentiment Analysis: Thwarting

13 0.70706087 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

14 0.70546478 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction

15 0.70543486 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

16 0.7052685 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals

17 0.70461339 250 acl-2013-Models of Translation Competitions

18 0.70456439 137 acl-2013-Enlisting the Ghost: Modeling Empty Categories for Machine Translation

19 0.70448506 240 acl-2013-Microblogs as Parallel Corpora

20 0.70384574 81 acl-2013-Co-Regression for Cross-Language Review Rating Prediction