acl acl2010 acl2010-120 knowledge-graph by maker-knowledge-mining

120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification


Source: pdf

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. [sent-4, score-0.696]

2 1 Introduction: The distinction between core arguments (henceforth, cores) and adjuncts is included in most theories of argument structure (Dowty, 2000). [sent-8, score-0.87]

3 The distinction can be viewed syntactically, as one between obligatory and optional arguments, or semantically, as one between arguments whose meanings are predicate dependent and independent. [sent-9, score-0.643]

4 Adjuncts are optional arguments which, like adverbs, modify the meaning of the described event in a predictable or predicate-independent manner. [sent-11, score-0.283]

5 The marked argument is a core in 1 and an adjunct in 2 and 3. [sent-18, score-0.465]

6 Adjuncts form an independent semantic unit and their semantic role can often be inferred independently of the predicate (e. [sent-19, score-0.315]

7 Sometimes the same argument plays a different role in different sentences. [sent-29, score-0.343]

8 However, in “The troops are based [in the park]”, the same argument is obligatory, since being based requires a place to be based in. [sent-31, score-0.3]

9 Distinguishing between the two argument types has been discussed extensively in various formulations in the NLP literature, notably in PP attachment, semantic role labeling (SRL) and subcategorization acquisition. [sent-32, score-0.51]

10 The measures used are based on selectional preference, predicate-slot collocation and argument-slot collocation. [sent-40, score-0.345]

11 , 2005), obtaining roughly 70% accuracy when evaluated on the prepositional arguments and more than 80% for the entire argument set. [sent-42, score-0.773]

12 Its core labels are predicate specific, while adjunct (or modifiers under their terminology) labels are shared across predicates. [sent-52, score-0.359]

13 The organization of PropBank is based on the notion of diathesis alternations, which are (roughly) defined to be alternations between two subcategorization frames that preserve meaning or change it systematically. [sent-54, score-0.337]

14 Adjuncts are defined to be optional arguments appearing with a wide variety of verbs and frames. [sent-58, score-0.355]

15 , as arguments that do not change their place or slot when the frame undergoes an alternation. [sent-61, score-0.592]

16 Another difference is that FN does not allow any type of non-core argument to attach to a given frame. [sent-72, score-0.3]

17 However, in “He walked [into his office]”, the marked argument is tagged as a directional adjunct in PB but as a ‘Direction’ core in FN. [sent-78, score-0.519]

18 Under both schemes, non-cores are usually confined to a few specific semantic domains, notably time, place and manner, in contrast to cores that are not restricted in their scope of applicability. [sent-79, score-0.419]

19 Work in SRL does not tackle the core-adjunct task separately but as part of general argument classification. [sent-84, score-0.3]

20 The success of supervised methods stems from the fact that the predicate-slot combination (slot is represented in this paper by its preposition) strongly determines whether a given argument is an adjunct or a core (see Section 3. [sent-95, score-0.55]

21 In addition, supervised models utilize supervised parsers and POS taggers, while the current state-of-the-art in unsupervised parsing and POS tagging is considerably worse than their supervised counterparts. [sent-99, score-0.365]

22 The setup of (Grenager and Manning, 2006), who presented a Bayesian Network model for argument classification, is perhaps closest to ours. [sent-105, score-0.3]

23 Their work relied on a supervised parser and a rule-based argument identification (both during training and testing). [sent-106, score-0.433]

24 (2009) tackled the argument identification task alone and did not perform argument classification of any sort. [sent-110, score-0.683]

25 For instance, the results of (Hindle and Rooth, 1993) indicate that their PP attachment system works better for cores than for adjuncts. [sent-114, score-0.445]

26 Determining the allowable subcategorization frames for a given predicate necessarily involves separating its cores from its allowable adjuncts (which are not framed). [sent-128, score-1.034]

27 Second, the common approach to the task focuses on syntax and tries to identify the entire frame, rather than to tag each argument separately. [sent-133, score-0.347]

28 Consequently, the common evaluation focuses on the quality of the allowable frames acquired for each verb type, and not on the classification of specific arguments in a given corpus. [sent-135, score-0.42]

29 Villavicencio (2002) developed a classifier based on preposition selection and frequency information for modeling the distinction for locative prepositional phrases. [sent-140, score-0.417]

30 3 Algorithm: We are given a (predicate, argument) pair in a test sentence, and we need to determine whether the argument is a core or an adjunct. [sent-145, score-0.362]

31 3.1 Overview: Our algorithm utilizes statistics based on the (predicate, slot, argument head) (PSH) joint distribution (a slot is represented by its preposition). [sent-149, score-0.596]

32 Our results will show that the training data accounts well for the argument realization phenomena in the test set, despite the length bound on its sentences. [sent-154, score-0.3]

33 We define three measures, one quantifying the obligatoriness of the slot, another quantifying the selectional preference of the verb to the argument and a third that quantifies the association between the head word and the slot irrespective of the predicate (Section 3. [sent-158, score-1.241]

34 Non-prepositional arguments in English tend to be cores (e.g. [sent-166, score-0.627]

35 in more than 85% of the cases in PB sections 2–21), while prepositional arguments tend to be equally divided between cores and adjuncts. [sent-168, score-0.762]

36 3.2 Data Collection: The statistical measures used by our classifier are based on the (predicate, slot, argument head) (PSH) joint distribution. [sent-171, score-0.443]

37 A preposition is defined to be any word which is the first word of an argument and belongs to a prepositions cluster. [sent-179, score-0.52]

38 A sequence of words will be marked as an argument of the verb if it is a constituent that does not contain the verb (according to the unsupervised parse tree), whose parent is an ancestor of the verb. [sent-186, score-0.51]

39 Since the sentences in question are short, we consider every word which does not belong to a closed class cluster as a head word (an argument can have several head words). [sent-189, score-0.576]

40 Using these annotation layers, we traverse the corpus and extract every (predicate, slot, argument head) triplet. [sent-198, score-0.3]

41 In case an argument has several head words, each of them is considered as an independent sample. [sent-199, score-0.42]
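This collection step reduces the PSH statistics to frequency counts over (predicate, slot, head) triplets, with each head word of a multi-head argument counted as an independent sample. A minimal sketch, assuming a triplet stream already extracted from the unsupervised parses (the example triplets are hypothetical, not data from the paper):

```python
from collections import Counter

def collect_psh_counts(triplets):
    """Accumulate counts over (predicate, slot, argument-head) triplets.
    An argument with several head words contributes one triplet per head,
    each treated as an independent sample."""
    counts = Counter()
    for predicate, slot, head in triplets:
        counts[(predicate, slot, head)] += 1
    return counts

# Hypothetical triplets, as might be extracted from unsupervised parses:
counts = collect_psh_counts([
    ("base", "in", "park"),
    ("base", "in", "park"),
    ("walk", "into", "office"),
])
```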

42 Given a (predicate, prepositional argument) pair from the test set, we first tag and parse the argument using the unsupervised tools above5. [sent-204, score-0.592]

43 Each word in the argument is now represented by its word form (without lemmatization), its unsupervised POS tag and its depth in the parse tree of the argument. [sent-205, score-0.457]

44 The last two will be used to determine which are the head words of the argument (see below). [sent-206, score-0.42]

45 Since the seman- tics of cores is more predicate dependent than the semantics of adjuncts, we expect arguments for which the predicate has a strong preference (in a specific slot) to be cores. [sent-210, score-1.105]

46 It aims to quantify the likelihood that a certain argument appears in a certain slot of a predicate. [sent-212, score-0.596]

47 For a given predicate-slot pair (p, s), we define its preference for the argument head h to be: SP(p, s, h) = Σ_{h′ ∈ Heads} Pr(h′ | p, s) · sim(h, h′), where Pr(h | p, s) = N(p, s, h) / Σ_{h′} N(p, s, h′), and sim(h, h′) is a similarity measure between argument heads. [sent-216, score-1.332]

48 The similarity measure we use is based on the slot distributions of the arguments. [sent-225, score-0.328]

49 Each head word h is assigned a vector where each coordinate corresponds to a slot s. [sent-227, score-0.416]

50 Since arguments in the test set can be quite long, not every open class word in the argument is taken to be a head word. [sent-232, score-0.667]

51 Instead, only those appearing in the top level (depth = 1) of the argument under its unsupervised parse tree are taken. [sent-233, score-0.442]

52 The selectional preference of the whole argument is then defined to be the arithmetic mean of this measure over all of its head words. [sent-235, score-0.68]

53 If the argument has no head words under this definition or if none of the head words appeared in the training corpus, the selectional preference is undefined. [sent-236, score-0.768]
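A minimal sketch of the selectional preference measure under these definitions; the cosine over slot-count vectors is an assumed stand-in for the paper's slot-distribution similarity, and the toy counts are hypothetical:

```python
import math
from collections import Counter

def slot_vector(counts, head):
    """Slot-count vector of a head word, aggregated over all predicates."""
    vec = Counter()
    for (p, s, h), n in counts.items():
        if h == head:
            vec[s] += n
    return vec

def sim(counts, h1, h2):
    """Cosine similarity between the slot vectors of two head words
    (an assumed stand-in for the paper's slot-distribution similarity)."""
    v1, v2 = slot_vector(counts, h1), slot_vector(counts, h2)
    dot = sum(v1[s] * v2[s] for s in v1)
    n1 = math.sqrt(sum(x * x for x in v1.values()))
    n2 = math.sqrt(sum(x * x for x in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def selectional_preference(counts, p, s, h):
    """SP(p, s, h) = sum over h' of Pr(h' | p, s) * sim(h, h')."""
    ps = {h2: n for (p2, s2, h2), n in counts.items() if (p2, s2) == (p, s)}
    total = sum(ps.values())
    if total == 0:
        return None  # undefined: the (predicate, slot) pair is unseen
    return sum((n / total) * sim(counts, h, h2) for h2, n in ps.items())

counts = Counter({("eat", "with", "fork"): 2, ("eat", "with", "spoon"): 1})
sp = selectional_preference(counts, "eat", "with", "fork")
```

Returning None when the (predicate, slot) pair is unseen mirrors the "undefined" case in sentence 53.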

54 Since cores are obligatory, when a predicate persistently appears with an argument in a certain slot, the arguments in this slot tend to be cores. [sent-238, score-1.417]

55 We use the Pointwise Mutual Information measure (PMI) to capture the slot and the predicate’s collocation tendency. [sent-240, score-0.462]

56 In order not to bias the counts towards predicates which tend to take more arguments, we define here N(p, s) to be the number of times the (p, s) pair occurred in the training corpus, irrespective of the number of head words the argument had (and not e. [sent-242, score-0.42]
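The predicate-slot collocation measure can be sketched as standard PMI over the N(p, s) pair counts just defined; the counts below are hypothetical:

```python
import math
from collections import Counter

def predicate_slot_pmi(pair_counts, p, s):
    """PMI(p, s) = log( Pr(p, s) / (Pr(p) * Pr(s)) ), where N(p, s) counts
    each (predicate, slot) argument instance once, regardless of how many
    head words the argument had."""
    total = sum(pair_counts.values())
    n_p = sum(n for (p2, _), n in pair_counts.items() if p2 == p)
    n_s = sum(n for (_, s2), n in pair_counts.items() if s2 == s)
    n_ps = pair_counts[(p, s)]
    if n_ps == 0:
        return float("-inf")
    return math.log((n_ps * total) / (n_p * n_s))

# Hypothetical pair counts:
pair_counts = Counter({("base", "in"): 4, ("walk", "into"): 2, ("walk", "in"): 2})
score = predicate_slot_pmi(pair_counts, "base", "in")
```

A high PMI indicates that the predicate persistently takes this slot, i.e. that the slot tends to hold cores.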

57 Therefore, if an argument tends to appear in a certain slot in many of its instances, it is an indication that this argument tends to have a consistent semantic flavor in most of its instances. [sent-248, score-0.935]

58 In this case, the argument and the preposition can be viewed as forming a unit on their own, independent of the predicate with which they appear. [sent-249, score-0.61]

59 Let p, s, h be a predicate, a slot and a head word respectively. [sent-252, score-0.416]

60 We then use: AS(s, h) = 1 − Pr(s | h) = 1 − Σ_{p′} N(p′, s, h) / Σ_{p′, s′} N(p′, s′, h). We select the head words as we did with the selectional preference measure. [sent-253, score-0.378]

61 Again, the AS of the whole argument is defined to be the arithmetic mean of the measure over all of its head words. Thresholding. [sent-255, score-0.332]
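A sketch of the argument-slot association for a single head word, computed from the PSH counts (the `counts` mapping and toy values are assumptions for illustration); per the footnote, the probability is subtracted from 1 so that higher values correspond to cores:

```python
from collections import Counter

def argument_slot_association(counts, s, h):
    """AS(s, h) = 1 - Pr(s | h), estimated from PSH triplet counts.
    The subtraction from 1 makes higher values correspond to cores."""
    n_sh = sum(n for (_, s2, h2), n in counts.items() if (s2, h2) == (s, h))
    n_h = sum(n for (_, _, h2), n in counts.items() if h2 == h)
    if n_h == 0:
        return None  # undefined: the head word is unseen in training
    return 1.0 - n_sh / n_h

# Hypothetical counts: "office" appears in slot "into" 3 of 4 times.
counts = Counter({("walk", "into", "office"): 3, ("work", "in", "office"): 1})
as_score = argument_slot_association(counts, "into", "office")
```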

62 In order to turn these measures into classifiers, we set a threshold below which arguments are marked as adjuncts and above which as cores. [sent-256, score-0.485]

63 That is, we find the threshold which tags half of the arguments as cores and half as adjuncts. [sent-258, score-0.627]

64 This relies on the prior knowledge that prepositional arguments are roughly equally divided between cores and adjuncts7. [sent-259, score-0.797]
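The thresholding step above can be sketched as taking the median score, encoding the prior that prepositional arguments are roughly equally divided between cores and adjuncts (the tie-breaking convention here is an assumption):

```python
def median_threshold(scores):
    """Threshold at the median, so roughly half of the arguments are
    tagged as cores and half as adjuncts."""
    ordered = sorted(scores)
    return ordered[len(ordered) // 2]

def classify(score, threshold):
    """Scores at or above the threshold are cores, below are adjuncts."""
    return "core" if score >= threshold else "adjunct"

threshold = median_threshold([0.1, 0.4, 0.9, 0.2, 0.8])
labels = [classify(x, threshold) for x in [0.9, 0.1]]
```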

65 Each of the classifiers may either classify an argument as an adjunct, classify it as a core, or abstain. [sent-263, score-0.3]

66 In order to obtain a high-accuracy classifier, to be used for self-training below, the ensemble classifier only tags arguments for which none of the classifiers abstain. (Footnote 6: the conditional probability is subtracted from 1 so that higher values correspond to cores, as with the other measures.) [sent-264, score-0.549]

67 We observe that a predicate and a slot generally determine whether the argument is a core or an adjunct. [sent-272, score-0.852]

68 For instance, in our development data, a classifier which assigns all arguments that share a predicate and a slot their most common label yields 94. [sent-273, score-0.807]

69 We therefore apply the following procedure: (1) tag the training data with the ensemble classifier; (2) for each test sample x, if more than a fraction α of the training samples sharing the same predicate and slot with x are labeled as cores, tag x as a core. [sent-276, score-0.76]

70 Test samples which do not share a predicate and a slot with any training sample are considered out of coverage. [sent-278, score-0.49]

71 The parameter α is chosen so half of the arguments are tagged as cores and half as adjuncts. [sent-279, score-0.681]
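The self-training procedure of sentences 69-71 can be sketched as follows; in practice α is tuned so that half of the arguments come out tagged as cores, while here it is passed in directly, and the toy labels are hypothetical:

```python
def bootstrap_tag(train_tags, test_samples, alpha):
    """Self-training step: the training data has already been tagged by
    the high-accuracy ensemble; `train_tags` maps a (predicate, slot)
    pair to the list of its training labels.  A test sample is tagged
    'core' if more than a fraction alpha of the matching training samples
    are cores; samples with no matching (predicate, slot) pair in the
    training data are out of coverage."""
    result = []
    for pred, slot in test_samples:
        labels = train_tags.get((pred, slot))
        if not labels:
            result.append("out-of-coverage")
        else:
            core_ratio = labels.count("core") / len(labels)
            result.append("core" if core_ratio > alpha else "adjunct")
    return result

tags = bootstrap_tag(
    {("base", "in"): ["core", "core", "adjunct"]},
    [("base", "in"), ("walk", "into")],
    alpha=0.5,
)
```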

72 Cores were defined to be any argument bearing the labels 'A0' to 'A5', 'C-A0' to 'C-A5', or 'R-A0' to 'R-A5'. [sent-298, score-0.3]

73 The non-prepositional arguments include 145767 (87%) cores and 21767 (13%) adjuncts. [sent-306, score-0.627]

74 However, we do compare against a non-trivial baseline, which closely follows the rationale of cores as obligatory arguments. [sent-317, score-0.45]

75 Our Window Baseline tags a corpus using MXPOST and computes, for each predicate and preposition, the ratio between the number of times that the preposition appeared in a window of W words after the verb and the total number of times that the verb appeared. [sent-318, score-0.41]

76 If this number exceeds a certain threshold β, all arguments having that predicate and preposition are tagged as cores. [sent-319, score-0.611]
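A sketch of the Window Baseline described in sentences 75-76; the pre-tokenized sentences stand in for the MXPOST-tagged corpus, and the example verb, preposition, window size, and threshold are hypothetical:

```python
def window_baseline(sentences, verb, prep, w, beta):
    """Window Baseline: the ratio between the number of verb occurrences
    followed by `prep` within a window of `w` words and the total number
    of verb occurrences; if the ratio exceeds the threshold `beta`, all
    (verb, prep) arguments are tagged as cores."""
    verb_count = 0
    within_window = 0
    for sent in sentences:
        for i, token in enumerate(sent):
            if token == verb:
                verb_count += 1
                if prep in sent[i + 1:i + 1 + w]:
                    within_window += 1
    if verb_count == 0:
        return None  # verb unseen: no prediction
    return "core" if within_window / verb_count > beta else "adjunct"

label = window_baseline(
    [["the", "troops", "are", "based", "in", "the", "park"],
     ["he", "based", "his", "claim", "on", "facts"]],
    verb="based", prep="in", w=3, beta=0.4,
)
```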

77 Results for the ensemble classifier (prior to the bootstrapping stage) are presented in two variants. (Footnote 8: the first 15K arguments were used for the algorithm's development and therefore excluded from the evaluation.) [sent-325, score-0.493]

78 In one variant the ensemble is used to tag arguments for which all three measures give a prediction (the 'Ensemble(Intersection)' classifier), and in the other the ensemble tags all arguments for which at least one classifier gives a prediction (the 'Ensemble(Union)' classifier). [sent-329, score-1.036]

79 It uses roughly 100M arguments which were extracted from the web-crawling-based corpus of (Gabrilovich and Markovitch, 2005) and the British National Corpus (Burnard, 2000). [sent-337, score-0.282]

80 Head Dependence: the entropy of the predicate distribution given the slot and the head (following Merlo and Esteve Ferrer, 2006): HD(s, h) = −Σ_p Pr(p | s, h) · log Pr(p | s, h). Low entropy implies a core. [sent-346, score-0.61]
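The head dependence measure can be sketched directly from the PSH counts (the toy counts are hypothetical): a head that clings to a single predicate has zero entropy and so looks like a core, while a head spread evenly over predicates has high entropy.

```python
import math
from collections import Counter

def head_dependence(counts, s, h):
    """HD(s, h) = -sum over p of Pr(p | s, h) * log Pr(p | s, h):
    the entropy of the predicate distribution given the slot and the
    head.  Low entropy implies a core."""
    per_pred = {p: n for (p, s2, h2), n in counts.items() if (s2, h2) == (s, h)}
    total = sum(per_pred.values())
    if total == 0:
        return None  # (slot, head) pair unseen in training
    return -sum((n / total) * math.log(n / total) for n in per_pred.values())

# "park" only ever appears with "base": zero entropy.
hd_core = head_dependence(Counter({("base", "in", "park"): 3}), "in", "park")
# "park" spread evenly over two predicates: entropy log 2.
hd_adjunct = head_dependence(
    Counter({("play", "in", "park"): 1, ("walk", "in", "park"): 1}),
    "in", "park",
)
```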

81 Effective accuracy is defined to be the accuracy resulting from labeling each out-of-coverage argument with an adjunct label. [sent-375, score-0.568]

82 The sections of the table are (from left to right): selectional preference measures, predicate-slot measures, argument-slot measures and head dependence. [sent-396, score-0.421]

83 The great majority of non-prepositional arguments are cores (87% in the test set). [sent-401, score-0.627]

84 We therefore tag all non-prepositional arguments as cores and tag prepositional arguments using our model. [sent-402, score-0.856]

85 In order to minimize supervision, we distinguish between the prepositional and the nonprepositional arguments using Clark’s tagger. [sent-403, score-0.435]

86 Finally, we experiment on a scenario where even argument identification on the test set is not provided, but performed by the algorithm of (Abend et al. [sent-404, score-0.397]

87 Non-prepositional arguments are invariably tagged as cores and out-of-coverage prepositional arguments as adjuncts. [sent-408, score-1.116]

88 An unlabeled match is defined to be an argument that agrees in its boundaries with a gold standard argument, and a labeled match requires in addition that the arguments agree in their core/adjunct label. [sent-410, score-0.583]

89 This is an indication that the collocation between the argument and the preposition is more indicative of the core/adjunct label than the obligatoriness of the slot (as expressed by the predicate-slot collocation). [sent-415, score-0.899]

90 Table 3 : Unlabeled and labeled scores for the experiments using the unsupervised argument identification system of (Abend et al. [sent-429, score-0.458]

91 The obtained effective accuracy for the entire set of arguments, where the prepositional arguments are automatically identified, was 81. [sent-447, score-0.438]

92 Table 3 presents results of our experiments with the unsupervised argument identification model of (Abend et al. [sent-449, score-0.458]

93 The unlabeled scores reflect performance on argument identification alone, while the labeled scores reflect the joint performance of both the 2009 and our algorithms. [sent-451, score-0.384]

94 The accuracy of our model on the entire set (prepositional argument subset) of correctly identified arguments was 83. [sent-453, score-0.603]

95 6 Conclusion We presented a fully unsupervised algorithm for the classification of arguments into cores and adjuncts. [sent-462, score-0.772]

96 Since most non-prepositional arguments are cores, we focused on prepositional arguments, which are roughly equally divided between cores and adjuncts. [sent-463, score-0.797]

97 We also show that (somewhat surprisingly) an argument-slot collocation measure gives more accurate predictions than a predicate-slot collocation measure on this task. [sent-467, score-0.363]

98 We speculate the reason is that the head word disambiguates the preposition and that this disambiguation generally determines whether a prepositional argument is a core or an adjunct (somewhat independently of the predicate). [sent-468, score-0.836]

99 Current supervised SRL models tend to perform worse on adjuncts than on cores (Pradhan et al. [sent-472, score-0.63]

100 We believe a better understanding of the differences between cores and adjuncts may contribute to the development of better SRL techniques, in both its supervised and unsupervised variants. [sent-475, score-0.74]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('cores', 0.38), ('argument', 0.3), ('slot', 0.296), ('arguments', 0.247), ('predicate', 0.194), ('abend', 0.183), ('ensemble', 0.176), ('adjuncts', 0.165), ('selectional', 0.138), ('prepositional', 0.135), ('collocation', 0.134), ('subcategorization', 0.128), ('head', 0.12), ('sp', 0.118), ('preposition', 0.116), ('unsupervised', 0.11), ('prepositions', 0.104), ('adjunct', 0.103), ('distinction', 0.096), ('preference', 0.09), ('supervised', 0.085), ('omri', 0.084), ('pb', 0.08), ('frames', 0.079), ('alternations', 0.077), ('measures', 0.073), ('srl', 0.071), ('obligatory', 0.07), ('classifier', 0.07), ('rappoport', 0.068), ('sid', 0.066), ('core', 0.062), ('psh', 0.061), ('ari', 0.057), ('seginer', 0.056), ('accuracy', 0.056), ('tagged', 0.054), ('coverage', 0.053), ('reichart', 0.053), ('diathesis', 0.053), ('esteve', 0.053), ('nonprepositional', 0.053), ('obligatoriness', 0.053), ('villavicencio', 0.053), ('yuri', 0.053), ('zeman', 0.053), ('pr', 0.051), ('tagger', 0.05), ('verb', 0.05), ('clark', 0.049), ('scenario', 0.049), ('frame', 0.049), ('identification', 0.048), ('pos', 0.047), ('tag', 0.047), ('ferrer', 0.046), ('swier', 0.046), ('thesaurus', 0.045), ('park', 0.044), ('allowable', 0.044), ('role', 0.043), ('merlo', 0.042), ('deterioration', 0.042), ('verbs', 0.04), ('prototype', 0.04), ('roi', 0.04), ('semantic', 0.039), ('palmer', 0.039), ('pmi', 0.038), ('pp', 0.038), ('football', 0.037), ('modals', 0.037), ('closed', 0.036), ('induction', 0.036), ('optional', 0.036), ('unlabeled', 0.036), ('briscoe', 0.036), ('intersection', 0.035), ('tackled', 0.035), ('boukobza', 0.035), ('breiman', 0.035), ('cobuild', 0.035), ('hargraves', 0.035), ('prepnet', 0.035), ('willis', 0.035), ('roughly', 0.035), ('fully', 0.035), ('stevenson', 0.034), ('attachment', 0.034), ('propbank', 0.032), ('appearing', 0.032), ('measure', 0.032), ('works', 0.031), ('predictions', 0.031), ('sarkar', 0.031), ('erk', 0.031), ('pradhan', 0.031), ('colleague', 0.031), ('aline', 0.031), 
('schulte', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

2 0.26843637 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

3 0.24956872 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

Author: Fei Huang ; Alexander Yates

Abstract: Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domain of the training data. Semantic role labeling techniques are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than their performance on newswire text. We investigate techniques for building open-domain semantic role labeling systems that approach the ideal of a train-once, use-anywhere system. We leverage recently-developed techniques for learning representations of text using latent-variable language models, and extend these techniques to ones that provide the kinds of features that are useful for semantic role labeling. In experiments, our novel system reduces error by 16% relative to the previous state of the art on out-of-domain text.

4 0.24325421 158 acl-2010-Latent Variable Models of Selectional Preference

Author: Diarmuid O Seaghdha

Abstract: This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data.

5 0.23372741 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the stateof-the-art systems without applying any feature selection procedure.

6 0.21535669 216 acl-2010-Starting from Scratch in Semantic Role Labeling

7 0.20548297 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

8 0.20321807 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

9 0.19609946 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

10 0.18920232 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

11 0.18667176 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation

12 0.16864553 238 acl-2010-Towards Open-Domain Semantic Role Labeling

13 0.16837189 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery

14 0.15350765 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

15 0.14758414 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection

16 0.14462577 25 acl-2010-Adapting Self-Training for Semantic Role Labeling

17 0.14423612 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs

18 0.14010273 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

19 0.13689537 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

20 0.12421623 36 acl-2010-Automatic Collocation Suggestion in Academic Writing


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.287), (1, 0.161), (2, 0.32), (3, 0.133), (4, 0.083), (5, 0.003), (6, -0.148), (7, -0.005), (8, 0.102), (9, -0.098), (10, 0.127), (11, 0.088), (12, -0.025), (13, 0.033), (14, 0.198), (15, -0.026), (16, 0.046), (17, -0.02), (18, 0.058), (19, 0.097), (20, 0.02), (21, 0.088), (22, -0.027), (23, -0.051), (24, 0.055), (25, 0.106), (26, 0.069), (27, -0.03), (28, 0.05), (29, 0.059), (30, 0.018), (31, -0.065), (32, 0.06), (33, -0.011), (34, -0.058), (35, -0.071), (36, -0.058), (37, 0.026), (38, 0.025), (39, 0.04), (40, 0.021), (41, 0.001), (42, -0.08), (43, 0.008), (44, -0.057), (45, -0.017), (46, -0.005), (47, 0.039), (48, -0.021), (49, 0.008)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96313822 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

2 0.79012614 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

3 0.74682468 216 acl-2010-Starting from Scratch in Semantic Role Labeling

Author: Michael Connor ; Yael Gertner ; Cynthia Fisher ; Dan Roth

Abstract: A fundamental step in sentence comprehension involves assigning semantic roles to sentence constituents. To accomplish this, the listener must parse the sentence, find constituents that are candidate arguments, and assign semantic roles to those constituents. Each step depends on prior lexical and syntactic knowledge. Where do children learning their first languages begin in solving this problem? In this paper we focus on the parsing and argumentidentification steps that precede Semantic Role Labeling (SRL) training. We combine a simplified SRL with an unsupervised HMM part of speech tagger, and experiment with psycholinguisticallymotivated ways to label clusters resulting from the HMM so that they can be used to parse input for the SRL system. The results show that proposed shallow representations of sentence structure are robust to reductions in parsing accuracy, and that the contribution of alternative representations of sentence structure to successful semantic role labeling varies with the integrity of the parsing and argumentidentification stages.

4 0.71463484 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the stateof-the-art systems without applying any feature selection procedure.

5 0.66875112 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

6 0.66396093 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs

7 0.65091527 158 acl-2010-Latent Variable Models of Selectional Preference

8 0.64646411 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

9 0.64604247 238 acl-2010-Towards Open-Domain Semantic Role Labeling

10 0.63243896 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

11 0.58462948 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

12 0.53316456 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet

13 0.53113157 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

14 0.52943414 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation

15 0.50728762 60 acl-2010-Collocation Extraction beyond the Independence Assumption

16 0.50554842 130 acl-2010-Hard Constraints for Grammatical Function Labelling

17 0.49755609 25 acl-2010-Adapting Self-Training for Semantic Role Labeling

18 0.49422494 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

19 0.45020992 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

20 0.43409011 85 acl-2010-Detecting Experiences from Weblogs


similar papers computed by LDA model

LDA for this paper:

topicId topicWeight

[(14, 0.018), (25, 0.1), (39, 0.012), (42, 0.021), (44, 0.021), (59, 0.113), (64, 0.025), (73, 0.055), (78, 0.106), (80, 0.014), (83, 0.101), (84, 0.042), (97, 0.178), (98, 0.109)]
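The sparse `(topicId, topicWeight)` pairs above are the paper's LDA topic distribution; a "similar papers" ranking like the one below can be derived by comparing such distributions. The sketch below is illustrative only (not taken from this page's pipeline): it uses cosine similarity over sparse topic-weight dicts, and the candidate paper IDs and weights are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse topic distributions,
    each given as a {topicId: topicWeight} dict."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A few of the query paper's non-zero topics, copied from the list above.
query = {25: 0.1, 59: 0.113, 78: 0.106, 83: 0.101, 97: 0.178, 98: 0.109}

# Hypothetical candidate papers with their own sparse distributions.
candidates = {
    "paper_A": {25: 0.2, 59: 0.15, 97: 0.1},   # overlaps the query's topics
    "paper_B": {14: 0.3, 42: 0.25},            # no topic overlap
}

# Rank candidates by similarity to the query, most similar first.
ranking = sorted(candidates,
                 key=lambda p: cosine(query, candidates[p]),
                 reverse=True)
print(ranking)
```

Other distribution-comparison measures (e.g. Jensen-Shannon divergence) are also commonly used for LDA topic vectors; cosine is chosen here only for brevity.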

similar papers list:

simIndex simValue paperId paperTitle

1 0.8962279 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation

Author: Neil McIntyre ; Mirella Lapata

Abstract: In this paper we develop a story generator that leverages knowledge inherent in corpora without requiring extensive manual involvement. A key feature in our approach is the reliance on a story planner which we acquire automatically by recording events, their participants, and their precedence relationships in a training corpus. Contrary to previous work our system does not follow a generate-and-rank architecture. Instead, we employ evolutionary search techniques to explore the space of possible stories which we argue are well suited to the story generation task. Experiments on generating simple children’s stories show that our system outperforms previous data-driven approaches.

2 0.89279079 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages

Author: Steven Abney ; Steven Bird

Abstract: We present a grand challenge to build a corpus that will include all of the world’s languages, in a consistent structure that permits large-scale cross-linguistic processing, enabling the study of universal linguistics. The focal data types, bilingual texts and lexicons, relate each language to one of a set of reference languages. We propose that the ability to train systems to translate into and out of a given language be the yardstick for determining when we have successfully captured a language. We call on the computational linguistics community to begin work on this Universal Corpus, pursuing the many strands of activity described here, as their contribution to the global effort to document the world’s linguistic heritage before more languages fall silent.

same-paper 3 0.86957669 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

4 0.84932077 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood

Author: Matthias H. Heie ; Edward W. D. Whittaker ; Sadaoki Furui

Abstract: In this paper we demonstrate that there is a strong correlation between the Question Answering (QA) accuracy and the log-likelihood of the answer typing component of our statistical QA model. We exploit this observation in a clustering algorithm which optimizes QA accuracy by maximizing the log-likelihood of a set of question-and-answer pairs. Experimental results show that we achieve better QA accuracy using the resulting clusters than by using manually derived clusters.

5 0.77931833 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal

Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a word-sense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.

6 0.77810872 158 acl-2010-Latent Variable Models of Selectional Preference

7 0.77119589 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

8 0.76518685 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

9 0.76394486 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar

10 0.7606014 71 acl-2010-Convolution Kernel over Packed Parse Forest

11 0.75504285 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

12 0.75358033 130 acl-2010-Hard Constraints for Grammatical Function Labelling

13 0.75254923 169 acl-2010-Learning to Translate with Source and Target Syntax

14 0.74730676 248 acl-2010-Unsupervised Ontology Induction from Text

15 0.74499637 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

16 0.74224859 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

17 0.74105543 53 acl-2010-Blocked Inference in Bayesian Tree Substitution Grammars

18 0.74080372 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

19 0.73700988 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

20 0.7368443 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans