269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation


Source: pdf

Author: Lonneke van der Plas ; Paola Merlo ; James Henderson

Abstract: Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. We reach a labelled F-measure for predicates and arguments of only 4% and 9% points, respectively, lower than the upper bound from manual annotations.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. [sent-2, score-0.405]

2 Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. [sent-3, score-0.748]

3 In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. [sent-4, score-0.71]

4 Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. [sent-5, score-1.184]

5 We reach a labelled F-measure for predicates and arguments of only 4% and 9% points, respectively, lower than the upper bound from manual annotations. [sent-6, score-0.426]

6 One approach to addressing this problem is to develop methods that automatically generate annotated data by transferring annotations in parallel corpora from languages for which this information is available to languages for which these data are not available (Yarowsky et al. [sent-8, score-0.418]

7 Previous work on the cross-lingual transfer of semantic annotations (Padó, 2007; Basili et al. [sent-11, score-0.748]

8 has produced annotations of good quality for test sets that were carefully selected based on semantic ontologies on the source and target side. [sent-15, score-0.483]

9 It has been suggested that these annotations could be used to train semantic role labellers (Basili et al. [sent-16, score-0.481]

10 In this paper, we generate high-quality broad-coverage semantic annotations using an automatic approach that does not rely on a semantic ontology for the target language. [sent-18, score-0.75]

11 Furthermore, to our knowledge, we report the first results on using joint syntactic-semantic learning to improve the quality of the semantic annotations from automatic crosslingual transfer. [sent-19, score-0.617]

12 Results on correlations between syntax and semantics found in previous work (Merlo and van der Plas, 2009; Lang and Lapata, 2010) have led us to make use of the available syntactic annotations on the target language. [sent-20, score-0.747]

13 We use the semantic annotations resulting from cross-lingual transfer combined with syntactic annotations to train a joint syntactic-semantic parser for the target language, which, in turn, re-annotates the corpus (See Figure 1). [sent-21, score-1.261]

14 We show that the semantic annotations produced by this parser are of higher quality than the data on which it was trained. [sent-22, score-0.566]

15 Given our goal of producing broad-coverage annotations in a setting based on an aligned corpus, our choices of formal representation and of labelling scheme differ from previous work (Padó, 2007; Basili et al. [sent-23, score-0.423]

16 We choose a dependency representation both for the syntax and semantics because relations are expressed as direct arcs between words. [sent-25, score-0.238]

17 This representation allows cross-lingual transfer to use word-based alignments directly, eschewing the need for complex constituent-alignment algorithms. [sent-26, score-0.343]

18 , 2003) and is the preferred annotation scheme for a joint syntactic-semantic setting (Merlo and van der Plas, 2009). [sent-30, score-0.506]

19 (2007) showed that the PropBank annotation scheme can be used for languages other than English directly. [sent-32, score-0.226]

20 Recently, Wu and Fung (2009a; 2009b) also show that semantic roles help in statistical machine translation, capitalising on a study of the correspondence between English and Chinese which indicates that 84% of roles transfer directly, for PropBank-style annotations. [sent-36, score-0.9]

21 These results indicate high correspondence across languages at a shallow semantic level. [sent-37, score-0.329]

22 Based on these results, our transfer of semantic annotations from English sentences to their French translations is based on a very strong mapping hypothesis, adapted from the Direct Correspondence Assumption for syntactic dependency trees by Hwa et al. [sent-38, score-0.886]

23 The relationships which we transfer are semantic role dependencies and the properties are predicate senses. [sent-41, score-0.77]

24 We introduce one constraint to the direct semantic transfer. [sent-42, score-0.286]

25 Because the semantic annotations in the target language are limited to verbal predicates, we only transfer predicates to words the syntactic parser has tagged as a verb. [sent-43, score-1.112]
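The direct transfer with its verb constraint, as described in sentences 22 to 25, can be illustrated with a minimal sketch. It assumes one-to-one word alignments given as a dictionary from source to target token indices; the function name `transfer_annotations`, the data layout, and the PoS tag `"V"` are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def transfer_annotations(
    src_predicates: Dict[int, str],          # source token index -> predicate sense
    src_roles: List[Tuple[int, int, str]],   # (predicate index, argument index, role label)
    alignment: Dict[int, int],               # source token index -> target token index
    tgt_pos: List[str],                      # PoS tag of each target token
    verb_tags: Set[str] = frozenset({"V"}),  # hypothetical tag set for verbs
):
    """Project predicate senses and role arcs onto the target sentence."""
    tgt_predicates: Dict[int, str] = {}
    for s_idx, sense in src_predicates.items():
        t_idx = alignment.get(s_idx)
        # Constraint from the text: only words tagged as verbs receive a predicate.
        if t_idx is not None and tgt_pos[t_idx] in verb_tags:
            tgt_predicates[t_idx] = sense

    tgt_roles: List[Tuple[int, int, str]] = []
    for s_pred, s_arg, label in src_roles:
        t_pred = alignment.get(s_pred)
        t_arg = alignment.get(s_arg)
        # A role arc is kept only if both of its ends were transferred.
        if t_pred in tgt_predicates and t_arg is not None:
            tgt_roles.append((t_pred, t_arg, label))
    return tgt_predicates, tgt_roles


# Toy example: "Mary likes John" -> "Marie aime Jean" with a monotone alignment.
preds, roles = transfer_annotations(
    {1: "like.01"},
    [(1, 0, "A0"), (1, 2, "A1")],
    {0: 0, 1: 1, 2: 2},
    ["N", "V", "N"],
)
print(preds, roles)
```

If the aligned target word is not tagged as a verb, the predicate and all of its role arcs are dropped, which is one way missing predicate labels can arise in the transferred data.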

26 (2005), the direct correspondence assumption is a strong hypothesis that is useful to trigger a projection process, but will not work correctly for several cases. [sent-45, score-0.246]

27 We know from the annotation guidelines used to annotate the French gold sentences that all verbs, except modals and realisations of the verb être, should receive a predicate label. [sent-47, score-0.377]

28 We define a filter that removes sentences with missing predicate labels based on PoS-information in the French sentence. [sent-48, score-0.262]
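The filter in sentences 27 and 28 could look roughly like the sketch below. It assumes each French token is available as a (lemma, PoS tag) pair; the function name, the tag `"V"`, and the particular modal lemmas in `exempt_lemmas` are hypothetical, since the excerpt does not list them.

```python
from typing import FrozenSet, List, Set, Tuple


def has_complete_predicates(
    tokens: List[Tuple[str, str]],   # (lemma, PoS tag) per token
    predicate_indices: Set[int],     # tokens that received a predicate label
    verb_tag: str = "V",             # hypothetical verb tag
    # Hypothetical exemption list: "être" plus two modal lemmas chosen for illustration.
    exempt_lemmas: FrozenSet[str] = frozenset({"être", "pouvoir", "devoir"}),
) -> bool:
    """Return True if every non-exempt verb carries a predicate label."""
    for idx, (lemma, pos) in enumerate(tokens):
        if pos == verb_tag and lemma not in exempt_lemmas:
            if idx not in predicate_indices:
                return False
    return True


def filter_corpus(sentences):
    """Keep only (tokens, predicate_indices) pairs that pass the check."""
    return [s for s in sentences if has_complete_predicates(*s)]


corpus = [
    ([("il", "CL"), ("rester", "V"), ("ici", "ADV")], {1}),    # verb labelled: kept
    ([("il", "CL"), ("rester", "V"), ("ici", "ADV")], set()),  # verb unlabelled: removed
    ([("il", "CL"), ("être", "V"), ("ici", "ADV")], set()),    # exempt verb: kept
]
print(len(filter_corpus(corpus)))
```

The check relies only on PoS information in the target sentence, so it needs no gold semantic annotation.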

29 1 Learning joint syntactic-semantic structures We know from previous work that there is a strong correlation between syntax and semantics (Merlo and van der Plas, 2009), and that this correlation has been successfully applied for the unsupervised induction of semantic roles (Lang and Lapata, 2010). [sent-50, score-0.855]

30 However, previous work in machine translation leads us to believe that transferring the correlations between syntax and semantics across languages would be problematic due to argument-structure divergences (Dorr, 1994). [sent-51, score-0.44]

31 For example, the English verb like and the French verb plaire do not share correlations between syntax and semantics. [sent-52, score-0.318]

32 The verb like takes an A0 subject and an A1 direct object, whereas the verb plaire licences an A1 subject and an A0 indirect object. [sent-53, score-0.233]

33 We therefore transfer semantic roles cross-lingually based only on lexical alignments and add syntactic information after transfer. [sent-54, score-0.746]

34 In Figure 1, we see that cross-lingual transfer takes place at the semantic level, a level that is more abstract and known to port relatively well across languages, while the correlations with syntax, that are known to diverge cross-lingually, are learnt on the target language only. [sent-55, score-0.695]

35 We train a joint syntactic-semantic parser on the combination of the two linguistic levels that learns the correlations between these structures in the target language and is able to smooth out errors from automatic transfer. [sent-56, score-0.423]

36 3 Experiments We used two statistical parsers in our transfer of semantic annotations from English to French, one for syntactic parsing and one for joint syntactic-semantic parsing. [sent-57, score-0.985]

37 1 The statistical parsers For our syntactic-semantic parsing model, we use a freely-available parser (Henderson et al. [sent-60, score-0.169]

38 The probabilistic model is a joint generative model of syntactic and semantic dependencies that maximises the joint probability of the syntactic and semantic dependencies, while building two separate structures. [sent-63, score-0.71]

39 For the French syntactic parser, we used the dependency parser described in Titov and Henderson (2007). [sent-64, score-0.232]

40 We train the parser on the dependency version of the French Paris treebank (Candito et al. [sent-65, score-0.222]

41 2 Data To transfer semantic annotation from English to French, we used the Europarl corpus (Koehn, 2003). [sent-69, score-0.708]

42 Furthermore, because translation shifts are known to pose problems for the automatic projection of semantic roles across languages (Padó, 2007), we select only those parallel sentences in Europarl that are direct translations from English to French, or vice versa. [sent-72, score-0.703]

43 We use the automatic dependency conversion of the French Treebank into dependency format provided to us by Candito and Crabbé and described in Candito et al. [sent-77, score-0.123]

44 , 2005) and NomBank labels (Meyers, 2007) is used to train the syntactic-semantic parser described in Subsection 3. [sent-81, score-0.128]

45 3 Test sets For testing, we used the hand-annotated data described in (van der Plas et al. [sent-84, score-0.142]

46 One-thousand French sentences are extracted randomly from our parallel corpus without any constraints on the semantic parallelism of the sentences, unlike much previous work. [sent-86, score-0.354]

47 4 Results We evaluate our methods for automatic annotation generation twice: once after the transfer step, and once after joint syntactic-semantic learning. [sent-88, score-0.617]

48 The comparison of these two steps will tell us whether the joint syntactic-semantic parser is able to improve semantic annotations by learning from the syntactic annotations available. [sent-89, score-0.873]

49 Table 1 shows the results of automatically annotating French sentences with semantic role annotation. [sent-91, score-0.32]

50 Due to filtering, the test set for the transfer (filter) model is smaller and not directly comparable to the other three models. [sent-92, score-0.343]

51 The first set of columns of results reports labelling and identification of predicates and the second set of columns reports labelling and identification of arguments, respectively, for the predicates that are identified. [sent-94, score-0.692]

52 The first two rows show the results when applying direct semantic transfer. [sent-95, score-0.286]

53 Rows three and four show results when using the joint syntactic-semantic parser to re-annotate the sentences. [sent-96, score-0.214]

54 For both annotation models we show results when using the filter described in Section 2 and without the filter. [sent-97, score-0.212]

55 The most striking result that we can read from Table 1 is that the joint syntactic-semantic learning step results in large improvements, especially for argument labelling, where the F-measure increases from 54% to 65% for the unfiltered data. [sent-98, score-0.265]
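For readers unfamiliar with the metric, labelled precision, recall, and F-measure can be computed as in the sketch below. It assumes annotations are represented as sets of (predicate index, argument index, label) triples; this representation and the function name are illustrative assumptions, and the paper's own scorer is not described in this excerpt.

```python
def labelled_prf(gold, predicted):
    """Labelled precision, recall and F-measure over sets of annotation triples."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # a triple counts only if the label matches too
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


gold = {(1, 0, "A0"), (1, 2, "A1")}
pred = {(1, 0, "A0"), (1, 2, "A0")}  # second argument found, but with the wrong label
print(labelled_prf(gold, pred))      # (0.5, 0.5, 0.5)
```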

56 The parser is able to outperform the quality of the semantic data on which it was trained by using the information contained in the syntax. [sent-99, score-0.371]

57 This result is in accordance with results reported in Merlo and Van der Plas (2009) and Lang and Lapata (2010), where the authors find a high correlation between syntactic functions and PropBank semantic roles. [sent-100, score-0.448]

58 However, when training a parser on the annotations we see that filtering only results in better recall scores for predicate labelling. [sent-102, score-0.499]

59 This is not surprising given that the filters apply to completeness in predicate labelling specifically. [sent-103, score-0.375]

60 The improvements from joint syntactic-semantic learning for argument labelling are largest for the unfiltered setting, because the parser has access to larger amounts of data. [sent-104, score-0.59]

61 As an upper bound we take the inter-annotator agreement for manual annotation on a random set of 100 sentences (van der Plas et al. [sent-106, score-0.484]

62 The parser reaches an F-measure on predicate labelling of 55% when using filtered data, which is very close to the upper bound (59%). [sent-108, score-0.577]

63 The upper bound for argument inter-annotator agreement is an F-measure of 74%. [sent-109, score-0.219]

64 The parser trained on unfiltered data reaches an F-measure of 65%. [sent-110, score-0.199]

65 These results on unrestricted test sets and their comparison to manual annotation show that we are able to scale up cross-lingual semantic role annotation. [sent-111, score-0.516]

66 5 Discussion and error analysis A more detailed analysis of the distribution of improvements over the types of roles further strengthens the conclusion that the parser learns the correlations between syntax and semantics. [sent-112, score-0.457]

67 It is a well-known fact that there exists a strong correlation between syntactic function and semantic role for the A0 and A1 arguments: A0s are commonly mapped onto subjects and A1s are often realised as direct objects (Lang and Lapata, 2010). [sent-113, score-0.458]

68 It is therefore not surprising that the F-measure on these types of arguments increases by 12% and 15%, respectively, after joint syntactic-semantic learning. [sent-114, score-0.284]

69 With respect to predicate labelling, comparison of the output after transfer with the output after parsing (on the development set) shows how the parser smooths out transfer errors and how interlingual divergences can be solved by making use of the variations we find intra-lingually. [sent-117, score-1.109]

70 Figure 2: Differences in predicate-argument labelling after transfer and after parsing. [...] syntactic-semantic parser to the English sentence. [sent-125, score-0.709]

71 The second line shows the French translation and the predicate-argument structure as it is transferred cross-lingually following the method described in Section 2. [sent-126, score-0.137]

72 The first occurrence of services is aligned to the first occurrence of services in the English sentence and gets the A1 label. [sent-129, score-0.277]

73 The second occurrence of services gets no argument label, because there is no alignment between the C-A1 argument to, the head of the infinitival clause, and the French word services. [sent-130, score-0.362]

74 The third line shows the analysis resulting from the syntactic-semantic parser that has been trained on a corpus of French sentences labelled with automatically transferred annotations and syntactic annotations. [sent-131, score-0.587]

75 The parser has access to several labelled examples of the predicate-argument structure of rester, which in many other cases is translated with remain and has the same predicate-argument structure as rester. [sent-132, score-0.195]

76 Because the languages and annotation framework adopted in previous work are not directly comparable to ours, and their methods have been evaluated on restricted test sets, results are not strictly comparable. [sent-135, score-0.195]

77 But for completeness, recall that our best result for predicate identification is an F-measure of 55% accompanied with an F-measure of 60% for argument labelling. [sent-136, score-0.249]

78 Padó (2007) reports a 56% F-measure on transferring FrameNet roles, knowing the predicate, from an automatically parsed and semantically annotated English corpus. [sent-137, score-0.156]

79 Padó and Pitel (2007), transferring semantic annotation to French, report a best result of 57% F-measure for argument labelling given the predicate. [sent-138, score-0.761]

80 (2009), in an approach based on phrase-based machine translation to transfer FrameNet-like annotation from English to Italian, report 42% recall in identifying predicates and an aggregated 73% recall of identifying predicates and roles given these predicates. [sent-140, score-0.929]

81 They do not report an unaggregated number that can be compared to our 60% argument labelling. [sent-141, score-0.108]

82 (2009) by 11% using Hidden Markov Models to support the automatic semantic transfer. [sent-143, score-0.243]

83 Johansson and Nugues (2006) trained a FrameNet-based semantic role labeller for Swedish on annotations transferred cross-lingually from English parallel data. [sent-144, score-0.637]

84 They report 55% F-measure for argument labelling given the frame on 150 translated example sentences. [sent-145, score-0.305]

85 6 Conclusions In this paper, we have scaled up previous efforts of annotation by using an automatic approach to semantic annotation transfer in combination with a joint syntactic-semantic parsing architecture. [sent-146, score-1.023]

86 We propose a direct transfer method that requires neither manual intervention nor a semantic ontology for the target language. [sent-147, score-0.773]

87 This method leads to semantically annotated data of sufficient quality to train a syntactic-semantic parser that further improves the quality of the semantic annotation by joint learning of syntactic-semantic structures on the target language. [sent-148, score-0.69]

88 The labelled F-measure of the resulting annotations for predicates is only 4% point lower than the upper bound and the resulting annotations for arguments only 9%. [sent-149, score-0.774]

89 Cross-lingual alignment of FrameNet annotations through Hidden Markov Models. [sent-165, score-0.228]

90 Analyse syntaxique du français : des constituants aux dépendances. [sent-182, score-0.152]

91 Learning bilingual semantic frames: Shallow semantic parsing vs. [sent-206, score-0.461]

92 A latent variable model of synchronous parsing for syntactic and semantic dependencies. [sent-215, score-0.31]

93 Abstraction and generalisation in semantic role labels: PropBank, VerbNet or both? [sent-265, score-0.286]

94 Annotation guidelines for NomBank - noun argument structure for PropBank. [sent-270, score-0.108]

95 Adding semantic role annotation to a corpus of written Dutch. [sent-277, score-0.441]

96 Annotation précise du français en sémantique de rôles par projection cross-linguistique. [sent-294, score-0.175]

97 The Proposition Bank: An annotated corpus of semantic roles. [sent-308, score-0.21]

98 Online graph planarisation for synchronous parsing of semantic and syntactic dependencies. [sent-322, score-0.31]

99 Crosslingual validity of PropBank in the manual annotation of French. [sent-329, score-0.197]

100 Inducing multilingual text analysis tools via robust projection across aligned corpora. [sent-348, score-0.091]
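The sentScore values in the table above appear consistent with summing the tfidf weight of every whitespace-separated token in the sentence, counting repeats and matching case-sensitively. This is an inference from the numbers on this page, not a documented description of the tool that produced them. The sketch below uses a hypothetical function name and a small subset of the word weights listed on this page.

```python
def sentence_score(sentence: str, weights: dict) -> float:
    """Sum the weight of each whitespace token; unknown tokens contribute 0."""
    # Assumption: no lowercasing and no punctuation stripping, so "transfer."
    # does not match "transfer". This is what reproduces the scores on this page.
    return sum(weights.get(token, 0.0) for token in sentence.split())


# Subset of the tfidf weights listed on this page.
WEIGHTS = {"transfer": 0.343, "semantic": 0.21, "annotations": 0.195, "direct": 0.076}

s2 = ("Previous approaches to cross-lingual transfer of semantic annotations "
      "have addressed this problem with encouraging results on a small scale.")
s24 = "We introduce one constraint to the direct semantic transfer."

print(round(sentence_score(s2, WEIGHTS), 3))   # 0.748, the score listed for sentence 2
print(round(sentence_score(s24, WEIGHTS), 3))  # 0.286, the score listed for sentence 24
```

Under this reading, long sentences that repeat high-weight words such as "transfer" and "semantic" rank highest, which matches the ordering seen in the table.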


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('transfer', 0.343), ('plas', 0.25), ('french', 0.217), ('semantic', 0.21), ('labelling', 0.197), ('annotations', 0.195), ('merlo', 0.191), ('basili', 0.178), ('pad', 0.164), ('annotation', 0.155), ('der', 0.142), ('predicate', 0.141), ('roles', 0.134), ('predicates', 0.132), ('parser', 0.128), ('propbank', 0.117), ('services', 0.113), ('geneva', 0.11), ('argument', 0.108), ('hwa', 0.104), ('transferred', 0.104), ('lang', 0.101), ('henderson', 0.097), ('correlations', 0.097), ('van', 0.092), ('projection', 0.091), ('transferring', 0.091), ('candito', 0.087), ('joint', 0.086), ('correspondence', 0.079), ('xe', 0.078), ('direct', 0.076), ('role', 0.076), ('arguments', 0.074), ('titov', 0.072), ('unfiltered', 0.071), ('lapata', 0.07), ('des', 0.068), ('labelled', 0.067), ('xf', 0.065), ('syntax', 0.064), ('annesi', 0.063), ('cais', 0.063), ('lonneke', 0.063), ('monachesi', 0.063), ('plaire', 0.063), ('postaux', 0.063), ('rester', 0.063), ('divergences', 0.062), ('crosslingual', 0.06), ('fung', 0.06), ('syntactic', 0.059), ('parallelism', 0.058), ('filter', 0.057), ('ontology', 0.057), ('bound', 0.056), ('switzerland', 0.055), ('upper', 0.055), ('framenet', 0.054), ('semantics', 0.053), ('parallel', 0.052), ('rence', 0.051), ('yf', 0.051), ('smooths', 0.051), ('syntacticsemantic', 0.051), ('treebank', 0.049), ('abeill', 0.048), ('les', 0.048), ('verb', 0.047), ('europarl', 0.046), ('fran', 0.045), ('nombank', 0.045), ('crabb', 0.045), ('dependency', 0.045), ('english', 0.045), ('target', 0.045), ('manual', 0.042), ('parsing', 0.041), ('languages', 0.04), ('fillmore', 0.039), ('du', 0.039), ('wu', 0.039), ('correlation', 0.037), ('completeness', 0.037), ('filtering', 0.035), ('ye', 0.034), ('reports', 0.034), ('sentences', 0.034), ('learns', 0.034), ('unrestricted', 0.033), ('quality', 0.033), ('law', 0.033), ('translation', 0.033), ('alignment', 0.033), ('automatic', 0.033), ('johansson', 0.032), ('scheme', 0.031), ('parsed', 0.031), ('palmer', 0.03), ('removes', 0.03)]
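The page does not state how simValue is computed for the similar-papers lists. A common choice for comparing tfidf vectors is cosine similarity, and the same-paper value of roughly 1.0 is at least consistent with it. The sketch below is written under that assumption, with papers represented as sparse word-to-weight dictionaries; the function name and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Toy vectors; the first reuses three weights from the list above.
this_paper = {"transfer": 0.343, "semantic": 0.21, "annotations": 0.195}
other_paper = {"semantic": 0.3, "clustering": 0.4}  # hypothetical weights

print(cosine_similarity(this_paper, this_paper))   # close to 1.0, up to rounding
print(cosine_similarity(this_paper, other_paper))  # smaller: only "semantic" is shared
```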

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation

Author: Lonneke van der Plas ; Paola Merlo ; James Henderson

Abstract: Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. We reach a labelled F-measure for predicates and arguments of only 4% and 9% points, respectively, lower than the upper bound from manual annotations.

2 0.22975311 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we describe an unsupervised method for semantic role induction which holds promise for relieving the data acquisition bottleneck associated with supervised role labelers. We present an algorithm that iteratively splits and merges clusters representing semantic roles, thereby leading from an initial clustering to a final clustering of better quality. The method is simple, surprisingly effective, and allows to integrate linguistic knowledge transparently. By combining role induction with a rule-based component for argument identification we obtain an unsupervised end-to-end semantic role labeling system. Evaluation on the CoNLL 2008 benchmark dataset demonstrates that our method outperforms competitive unsupervised approaches by a wide margin.

3 0.17895155 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

Author: Ivan Titov ; Alexandre Klementiev

Abstract: We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical Pitman-Yor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the Metropolis-Hastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by using the induced semantic representation for the question answering task in the biomedical domain.

4 0.15210062 216 acl-2011-MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility based on semantic roles

Author: Chi-kiu Lo ; Dekai Wu

Abstract: We introduce a novel semi-automated metric, MEANT, that assesses translation utility by matching semantic role fillers, producing scores that correlate with human judgment as well as HTER but at much lower labor cost. As machine translation systems improve in lexical choice and fluency, the shortcomings of widespread n-gram based, fluency-oriented MT evaluation metrics such as BLEU, which fail to properly evaluate adequacy, become more apparent. But more accurate, nonautomatic adequacy-oriented MT evaluation metrics like HTER are highly labor-intensive, which bottlenecks the evaluation cycle. We first show that when using untrained monolingual readers to annotate semantic roles in MT output, the non-automatic version of the metric HMEANT achieves a 0.43 correlation coefficient with human adequacy judgments at the sentence level, far superior to BLEU at only 0.20, and equal to the far more expensive HTER. We then replace the human semantic role annotators with automatic shallow semantic parsing to further automate the evaluation metric, and show that even the semi-automated evaluation metric achieves a 0.34 correlation coefficient with human adequacy judgment, which is still about 80% as closely correlated as HTER despite an even lower labor cost for the evaluation procedure. The results show that our proposed metric is significantly better correlated with human judgment on adequacy than current widespread automatic evaluation metrics, while being much more cost effective than HTER.

5 0.14962822 167 acl-2011-Improving Dependency Parsing with Semantic Classes

Author: Eneko Agirre ; Kepa Bengoetxea ; Koldo Gojenola ; Joakim Nivre

Abstract: This paper presents the introduction of WordNet semantic classes in a dependency parser, obtaining improvements on the full Penn Treebank for the first time. We tried different combinations of some basic semantic classes and word sense disambiguation algorithms. Our experiments show that selecting the adequate combination of semantic features on development data is key for success. Given the basic nature of the semantic classes and word sense disambiguation algorithms used, we think there is ample room for future improvements.

6 0.13019001 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates

7 0.1061725 182 acl-2011-Joint Annotation of Search Queries

8 0.096041322 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing

9 0.094219409 52 acl-2011-Automatic Labelling of Topic Models

10 0.091750741 144 acl-2011-Global Learning of Typed Entailment Rules

11 0.090997756 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation

12 0.088300884 157 acl-2011-I Thou Thee, Thou Traitor: Predicting Formal vs. Informal Address in English Literature

13 0.08679802 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

14 0.086452246 48 acl-2011-Automatic Detection and Correction of Errors in Dependency Treebanks

15 0.085655071 240 acl-2011-ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation

16 0.084583364 79 acl-2011-Confidence Driven Unsupervised Semantic Parsing

17 0.082693651 259 acl-2011-Rare Word Translation Extraction from Aligned Comparable Documents

18 0.081696905 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules

19 0.079036884 70 acl-2011-Clustering Comparable Corpora For Bilingual Lexicon Extraction

20 0.077266984 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.23), (1, -0.038), (2, -0.072), (3, -0.08), (4, 0.029), (5, 0.015), (6, 0.098), (7, 0.055), (8, 0.072), (9, -0.075), (10, 0.067), (11, -0.073), (12, 0.069), (13, -0.032), (14, -0.065), (15, -0.061), (16, -0.059), (17, -0.117), (18, 0.019), (19, -0.043), (20, -0.055), (21, 0.133), (22, -0.162), (23, -0.028), (24, 0.067), (25, 0.034), (26, -0.147), (27, -0.135), (28, -0.024), (29, -0.081), (30, 0.052), (31, -0.073), (32, 0.001), (33, -0.027), (34, 0.002), (35, 0.035), (36, 0.062), (37, 0.02), (38, -0.137), (39, -0.088), (40, -0.011), (41, 0.122), (42, -0.067), (43, 0.021), (44, -0.051), (45, 0.018), (46, -0.058), (47, -0.073), (48, -0.015), (49, 0.093)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9641971 269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation

Author: Lonneke van der Plas ; Paola Merlo ; James Henderson

Abstract: Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. We reach a labelled F-measure for predicates and arguments of only 4% and 9% points, respectively, lower than the upper bound from manual annotations.

2 0.77831179 68 acl-2011-Classifying arguments by scheme

Author: Vanessa Wei Feng ; Graeme Hirst

Abstract: Argumentation schemes are structures or templates for various kinds of arguments. Given the text of an argument with premises and conclusion identified, we classify it as an instance of one of five common schemes, using features specific to each scheme. We achieve accuracies of 63–91% in one-against-others classification and 80–94% in pairwise classification (baseline = 50% in both cases).

3 0.76970768 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates

Author: Dipanjan Das ; Noah A. Smith

Abstract: We describe a new approach to disambiguating semantic frames evoked by lexical predicates previously unseen in a lexicon or annotated data. Our approach makes use of large amounts of unlabeled data in a graph-based semi-supervised learning framework. We construct a large graph where vertices correspond to potential predicates and use label propagation to learn possible semantic frames for new ones. The label-propagated graph is used within a frame-semantic parser and, for unknown predicates, results in over 15% absolute improvement in frame identification accuracy and over 13% absolute improvement in full frame-semantic parsing F1 score on a blind test set, over a state-of-the-art supervised baseline.

4 0.76788461 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we describe an unsupervised method for semantic role induction which holds promise for relieving the data acquisition bottleneck associated with supervised role labelers. We present an algorithm that iteratively splits and merges clusters representing semantic roles, thereby leading from an initial clustering to a final clustering of better quality. The method is simple, surprisingly effective, and allows to integrate linguistic knowledge transparently. By combining role induction with a rule-based component for argument identification we obtain an unsupervised end-to-end semantic role labeling system. Evaluation on the CoNLL 2008 benchmark dataset demonstrates that our method outperforms competitive unsupervised approaches by a wide margin.

5 0.72321987 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

Author: Ivan Titov ; Alexandre Klementiev

Abstract: We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical Pitman-Yor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the Metropolis-Hastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by using the induced semantic representation for the question answering task in the biomedical domain.

6 0.60489839 216 acl-2011-MEANT: An inexpensive, high-accuracy, semi-automatic metric for evaluating translation utility based on semantic roles

7 0.56020945 230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation

8 0.55166709 167 acl-2011-Improving Dependency Parsing with Semantic Classes

9 0.52915025 322 acl-2011-Unsupervised Learning of Semantic Relation Composition

10 0.50538868 214 acl-2011-Lost in Translation: Authorship Attribution using Frame Semantics

11 0.50097585 200 acl-2011-Learning Dependency-Based Compositional Semantics

12 0.49040079 79 acl-2011-Confidence Driven Unsupervised Semantic Parsing

13 0.48522061 84 acl-2011-Contrasting Opposing Views of News Articles on Contentious Issues

14 0.47932643 157 acl-2011-I Thou Thee, Thou Traitor: Predicting Formal vs. Informal Address in English Literature

15 0.45242152 138 acl-2011-French TimeBank: An ISO-TimeML Annotated Reference Corpus

16 0.43670574 229 acl-2011-NULEX: An Open-License Broad Coverage Lexicon

17 0.43656555 42 acl-2011-An Interface for Rapid Natural Language Processing Development in UIMA

18 0.43409684 293 acl-2011-Template-Based Information Extraction without the Templates

19 0.42957485 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules

20 0.42752266 320 acl-2011-Unsupervised Discovery of Domain-Specific Knowledge from Text


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.028), (15, 0.216), (17, 0.045), (26, 0.049), (31, 0.011), (37, 0.1), (39, 0.1), (41, 0.082), (55, 0.042), (59, 0.075), (72, 0.029), (91, 0.029), (96, 0.115)]
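
The topic-weight line above is a list of (topicId, topicWeight) pairs. As a minimal sketch, assuming the line can be read as a Python-style literal (the variable names below are illustrative and not part of the original page), the pairs can be loaded and summarised as follows:

```python
import ast

# Topic-weight line copied verbatim from the listing: (topicId, topicWeight) pairs.
raw = (
    "[(5, 0.028), (15, 0.216), (17, 0.045), (26, 0.049), (31, 0.011), "
    "(37, 0.1), (39, 0.1), (41, 0.082), (55, 0.042), (59, 0.075), "
    "(72, 0.029), (91, 0.029), (96, 0.115)]"
)

# Parse the literal safely and index the weights by topic id.
weights = dict(ast.literal_eval(raw))

# Topic with the largest weight for this paper.
dominant_topic = max(weights, key=weights.get)

# Sum of the weights that are actually listed.
total_weight = sum(weights.values())

print(dominant_topic)
print(round(total_weight, 3))
```

The listed weights sum to roughly 0.92 rather than 1, which suggests that low-weight topics were left out of the listing; the page itself does not state a cutoff.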

similar papers list:

simIndex simValue paperId paperTitle

1 0.93979573 138 acl-2011-French TimeBank: An ISO-TimeML Annotated Reference Corpus

Author: Andre Bittar ; Pascal Amsili ; Pascal Denis ; Laurence Danlos

Abstract: This article presents the main points in the creation of the French TimeBank (Bittar, 2010), a reference corpus annotated according to the ISO-TimeML standard for temporal annotation. A number of improvements were made to the markup language to deal with linguistic phenomena not yet covered by ISO-TimeML, including cross-language modifications and others specific to French. An automatic preannotation system was used to speed up the annotation process. A preliminary evaluation of the methodology adopted for this project yields positive results in terms of data quality and annotation time.

same-paper 2 0.79950809 269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation

Author: Lonneke van der Plas ; Paola Merlo ; James Henderson

Abstract: Broad-coverage semantic annotations for training statistical learners are only available for a handful of languages. Previous approaches to cross-lingual transfer of semantic annotations have addressed this problem with encouraging results on a small scale. In this paper, we scale up previous efforts by using an automatic approach to semantic annotation that does not rely on a semantic ontology for the target language. Moreover, we improve the quality of the transferred semantic annotations by using a joint syntactic-semantic parser that learns the correlations between syntax and semantics of the target language and smooths out the errors from automatic transfer. We reach a labelled F-measure for predicates and arguments of only 4% and 9% points, respectively, lower than the upper bound from manual annotations.

3 0.77205038 287 acl-2011-Structural Topic Model for Latent Topical Structure Analysis

Author: Hongning Wang ; Duo Zhang ; ChengXiang Zhai

Abstract: Topic models have been successfully applied to many document analysis tasks to discover topics embedded in text. However, existing topic models generally cannot capture the latent topical structures in documents. Since languages are intrinsically cohesive and coherent, modeling and discovering latent topical transition structures within documents would be beneficial for many text analysis tasks. In this work, we propose a new topic model, Structural Topic Model, which simultaneously discovers topics and reveals the latent topical structures in text through explicitly modeling topical transitions with a latent first-order Markov chain. Experiment results show that the proposed Structural Topic Model can effectively discover topical structures in text, and the identified structures significantly improve the performance of tasks such as sentence annotation and sentence ordering.

4 0.67055559 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we describe an unsupervised method for semantic role induction which holds promise for relieving the data acquisition bottleneck associated with supervised role labelers. We present an algorithm that iteratively splits and merges clusters representing semantic roles, thereby leading from an initial clustering to a final clustering of better quality. The method is simple, surprisingly effective, and allows linguistic knowledge to be integrated transparently. By combining role induction with a rule-based component for argument identification we obtain an unsupervised end-to-end semantic role labeling system. Evaluation on the CoNLL 2008 benchmark dataset demonstrates that our method outperforms competitive unsupervised approaches by a wide margin.

5 0.66704535 182 acl-2011-Joint Annotation of Search Queries

Author: Michael Bendersky ; W. Bruce Croft ; David A. Smith

Abstract: Marking up search queries with linguistic annotations such as part-of-speech tags, capitalization, and segmentation, is an important part of query processing and understanding in information retrieval systems. Due to their brevity and idiosyncratic structure, search queries pose a challenge to existing NLP tools. To address this challenge, we propose a probabilistic approach for performing joint query annotation. First, we derive a robust set of unsupervised independent annotations, using queries and pseudo-relevance feedback. Then, we stack additional classifiers on the independent annotations, and exploit the dependencies between them to further improve the accuracy, even with a very limited amount of available training data. We evaluate our method using a range of queries extracted from a web search log. Experimental results verify the effectiveness of our approach for both short keyword queries, and verbose natural language queries.

6 0.66383648 316 acl-2011-Unary Constraints for Efficient Context-Free Parsing

7 0.66327143 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

8 0.66217208 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

9 0.66194087 164 acl-2011-Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features

10 0.66171962 137 acl-2011-Fine-Grained Class Label Markup of Search Queries

11 0.66140079 97 acl-2011-Discovering Sociolinguistic Associations with Structured Sparsity

12 0.65819037 209 acl-2011-Lexically-Triggered Hidden Markov Models for Clinical Document Coding

13 0.6569131 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

14 0.65610814 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

15 0.65422356 192 acl-2011-Language-Independent Parsing with Empty Elements

16 0.653229 300 acl-2011-The Surprising Variance in Shortest-Derivation Parsing

17 0.65223879 170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization

18 0.65173042 178 acl-2011-Interactive Topic Modeling

19 0.65059853 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning

20 0.64988071 5 acl-2011-A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing