
95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers


Source: pdf

Author: Ryan McDonald ; Slav Petrov ; Keith Hall

Abstract: We present a simple method for transferring dependency parsers from source languages with labeled training data to target languages without labeled training data. We first demonstrate that delexicalized parsers can be directly transferred between languages, producing significantly higher accuracies than unsupervised parsers. We then use a constraint driven learning algorithm where constraints are drawn from parallel corpora to project the final parser. Unlike previous work on projecting syntactic resources, we show that simple methods for introducing multiple source languages can significantly improve the overall quality of the resulting parsers. The projected parsers from our system result in state-of-the-art performance when compared to previously studied unsupervised and projected parsing systems across eight different languages.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We present a simple method for transferring dependency parsers from source languages with labeled training data to target languages without labeled training data. [sent-2, score-0.987]

2 We first demonstrate that delexicalized parsers can be directly transferred between languages, producing significantly higher accuracies than unsupervised parsers. [sent-3, score-0.588]

3 The projected parsers from our system result in state-of-the-art performance when compared to previously studied unsupervised and projected parsing systems across eight different languages. [sent-6, score-0.735]

4 , 2010) and constructing parsers for new languages (Collins et al. [sent-18, score-0.32]

5 One major obstacle in building statistical parsers for new languages is that they often lack the manually annotated resources available for English. [sent-21, score-0.32]

6 studies that transfer parsers to new languages by projecting syntax across word alignments extracted from parallel corpora (Hwa et al. [sent-39, score-1.001]

7 In this work we present a method for creating dependency parsers for languages for which no labeled training data is available. [sent-43, score-0.42]

8 First, we train a source side English parser that, crucially, is delexicalized so that its predictions rely solely on the part-of-speech tags of the input sentence, in the same vein as Zeman and Resnik (2008). [sent-44, score-0.615]

9 We empirically show that directly transferring delexicalized models (i. [sent-45, score-0.41]

10 This result holds in the presence of both gold POS tags as well as automatic tags projected from English. [sent-48, score-0.443]

11 This emphasizes that even for languages with no syntactic resources, or possibly even no parallel data, simple transfer methods can already be more powerful than grammar induction systems. [sent-49, score-0.921]

12 Next, we use this delexicalized English parser to seed a perceptron learner for the target language. [sent-50, score-0.599]

13 The model is trained to update towards parses that are in high agreement with a source side English parse based on constraints drawn from alignments in the parallel data. [sent-51, score-0.297]

14 The resulting parser consistently improves on the directly transferred delexicalized parser, reducing relative errors by 8% on average, and as much as 18% on some languages. [sent-56, score-0.546]

15 Finally, we show that by transferring parsers from multiple source languages we can further reduce errors by 16% over the directly transferred English baseline. [sent-57, score-0.735]

16 , 2009) and grammar (Berg-Kirkpatrick and Klein, 2010; Cohen and Smith, 2009) induction, that shows that adding languages leads to improvements. [sent-59, score-0.266]

17 We present a comprehensive set of experiments on eight Indo-European languages for which a significant amount of parallel data exists. [sent-60, score-0.345]

18 2 Preliminaries In this paper we focus on transferring dependency parsers between languages. [sent-104, score-0.437]

19 A dependency parser takes a tokenized input sentence (optionally part-of-speech tagged) and produces a connected tree where directed arcs represent a syntactic head-modifier relationship. [sent-105, score-0.339]
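
To make the tree definition in the sentence above concrete, here is a minimal sketch of one common way to store a dependency tree: a list of head indices. The encoding (1-based tokens, 0 for an artificial root) and the function name is_dependency_tree are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List


def is_dependency_tree(heads: List[int]) -> bool:
    """Check that a head list encodes a well-formed dependency structure.

    heads[i] is the head of token i + 1 (tokens are 1-based) and 0 stands
    for an artificial root. This encoding is an illustrative assumption.
    The structure is accepted when every token reaches the root by
    following head arcs, without running into a cycle.
    """
    n = len(heads)
    for start in range(1, n + 1):
        seen = set()
        node = start
        while node != 0:
            head = heads[node - 1]
            if node in seen or not (0 <= head <= n):
                return False  # cycle or head index out of range
            seen.add(node)
            node = head
    return True


# "The dog barks": The -> dog, dog -> barks, barks -> root
print(is_dependency_tree([2, 3, 0]))
# Tokens 1 and 2 point at each other, so neither reaches the root
print(is_dependency_tree([2, 1, 0]))
```

Each arc runs from a head to its modifier, so a sentence of n tokens is fully described by n integers.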

20 Additionally, we use the following eight languages as both source and target languages: Danish (da), Dutch (nl), German (de), Greek (el), Italian (it), Portuguese (pt), Spanish (es) and Swedish (sv). [sent-118, score-0.441]

21 We focused on this subset of languages because they are Indo-European and a significant amount of parallel data exists for each language. [sent-120, score-0.273]

22 However, the restriction to Indo-European languages does make the results less conclusive when one wishes to transfer a parser from English to Chinese, for example. [sent-122, score-0.878]

23 Our approach relies on a consistent set of part-of-speech tags across languages and treebanks. [sent-125, score-0.314]

24 Like all treebank projection studies we require a corpus of parallel text for each pair of languages we study. [sent-131, score-0.427]

25 First, the parser is near state-of-the-art on English parsing benchmarks and second, and more importantly, the parser is extremely fast to train and run, making it easy to run a large number of experiments. [sent-144, score-0.469]

26 Preliminary experiments using a different dependency parser MSTParser (McDonald et al. [sent-145, score-0.309]

27 Furthermore, we evaluate with both gold-standard part-of-speech tags, as well as predicted part-of-speech tags from the projected part-of-speech tagger of Das and Petrov (2011). [sent-149, score-0.343]

28 3 Transferring from English To simplify discussion, we first focus on the most common instantiation of parser transfer in the literature: transferring from English to other languages. [sent-153, score-0.895]

29 1 Direct Transfer We start with the observation that discriminatively trained dependency parsers rely heavily on part-of-speech tagging features. [sent-156, score-0.275]

30 For example, when training and testing a parser on our English data, a parser with all features obtains an UAS of 89. [sent-157, score-0.472]

31 a delexicalized parser (a parser that only has non-lexical features) obtains an UAS of 82. [sent-163, score-0.684]

32 The key observation is that part-of-speech tags contain a significant amount of information for unlabeled dependency parsing. [sent-165, score-0.278]
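
The UAS figures quoted in these sentences are unlabeled attachment scores: the percentage of tokens whose predicted head matches the gold head. A minimal sketch of that computation follows. It assumes the head-list encoding of trees and counts every token; whether punctuation is excluded in the paper's evaluation is not stated in this summary, so that detail is left out.

```python
from typing import List


def uas(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Unlabeled attachment score in percent over a set of sentences.

    gold and pred hold one head list per sentence. All tokens are
    counted, which is a simplifying assumption of this sketch.
    """
    total = 0
    correct = 0
    for g, p in zip(gold, pred):
        if len(g) != len(p):
            raise ValueError("gold and predicted sentences differ in length")
        total += len(g)
        correct += sum(1 for a, b in zip(g, p) if a == b)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Two of three heads are correct in this toy sentence
print(round(uas([[2, 0, 2]], [[2, 0, 1]]), 2))
```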

33 This observation combined with our universal part-of-speech tagset, leads to the idea of direct transfer, i. [sent-166, score-0.272]

34 , directly parsing the target language with the source language parser without relying on parallel corpora. [sent-168, score-0.54]

35 Because we use a mapping of the treebank specific part-of-speech tags to a common tagset, the performance of such a system is easy to measure: simply parse the target language data set with a delexicalized parser trained on the source language data. [sent-170, score-0.814]
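
The two preprocessing steps that direct transfer relies on, mapping treebank-specific tags to a common tagset and dropping lexical items, can be sketched as below. TAG_MAP is a tiny illustrative subset written for this example, not the mapping used in the paper, and the tuple layout (word, tag, head) and the "X" fallback for unmapped tags are assumptions.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str, int]  # (word form, part-of-speech tag, head index)

# Illustrative subset of a treebank-specific -> common tagset mapping.
TAG_MAP: Dict[str, str] = {
    "DT": "DET",
    "NN": "NOUN",
    "NNS": "NOUN",
    "JJ": "ADJ",
    "VBZ": "VERB",
}


def delexicalize(sentence: List[Token], tag_map: Dict[str, str]) -> List[Token]:
    """Replace every word form with "_" and map tags to the common tagset.

    Tags missing from tag_map fall back to "X"; this fallback is an
    assumption of the sketch.
    """
    return [("_", tag_map.get(tag, "X"), head) for _, tag, head in sentence]


english = [("The", "DT", 2), ("dog", "NN", 3), ("barks", "VBZ", 0)]
print(delexicalize(english, TAG_MAP))
```

After this step, a source-language treebank and a target-language test set share one vocabulary of tags, so a parser trained on the former can be run on the latter unchanged.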

36 In the first, we assumed that the test set for each target language had gold part-of-speech tags, and in the second we used predicted part-of-speech tags from the projection tagger of Das and Petrov (2011), which also uses English as the source language. [sent-172, score-0.466]

37 We report results for both the English direct transfer parser (en-dir. [sent-174, score-0.865]

38 ) as well as a baseline unsupervised grammar induction system, the dependency model with valence (DMV) of Klein and Manning (2004), as obtained by the implementation of Ganchev et al. [sent-175, score-0.316]

39 From this table we can see that direct transfer is a very strong baseline and is over 20% absolute better than the DMV model for both gold and predicted POS tags. [sent-179, score-0.75]

40 Table 4, which we will discuss in more detail later, further shows that the direct transfer parser also significantly outperforms state-of-the-art unsupervised grammar induction models, but in a more limited setting of sentences of length less than 10. [sent-180, score-1.081]

41 sentences results in slightly lower accura- [...] to some degree, across languages and treebank standards. [sent-184, score-0.284]

42 2 Projected Transfer Unlike most language transfer systems for parsers, the direct transfer approach does not rely on projecting syntax across aligned parallel corpora (modulo the fact that non-gold tags come from a system that uses parallel corpora). [sent-192, score-1.587]

43 In this section we describe a simple mechanism for projecting from the direct transfer system using large amounts of parallel data in a similar vein to Hwa et al. [sent-193, score-0.818]

44 In our case, the weak signals come from aligned source and target sentences, and the agreement in their corresponding parses, which is similar to posterior regularization or the bilingual view of Smith and Smith (2004) and Burkett et al. [sent-200, score-0.335]

45 It starts by labeling a set of target language sentences with a parser, which in our case is the direct transfer parser from the previous section (line 1). [sent-203, score-0.961]

46 Next, it uses these parsed target sentences to ‘seed’ a new parser by training a parameter vector using the predicted parses as a gold standard via standard perceptron updates for J rounds (lines 3-6). [sent-204, score-0.488]

47 This generates a parser that emulates the direct transfer parser, but (footnote 5: This requires a transition-based parser with a beam greater than 1 to allow for ambiguity to be resolved at later stages.) [sent-205, score-1.074]

48 , $DP : x \to y$. Input: $X = \{x_i\}_{i=1}^{n}$: target language sentences; $P = \{(x_i^s, x_i^t, a_i)\}_{i=1}^{m}$: aligned source-target sentences; $DP_{delex}$: delexicalized source parser; $DP_{lex}$: lexicalized source parser. Algorithm: 1. [sent-208, score-0.966]

49 Let $y_t = \arg\max_{y_t \in Y_t} \mathrm{ALIGN}(y_s, y_t, a_i)$; $w = w + \phi(x_i, y_t) - \phi(x_i, y_i^1)$; return $DP^*$ such that $DP^*(x) = \arg\max_y w \cdot \phi(x, y)$. Figure 2: Perceptron-based learning algorithm for training a parser by seeding the model with a direct transfer parser and projecting constraints across parallel corpora. [sent-229, score-1.337]

50 The parser starts with non-random accuracies by emulating the direct transfer model and slowly tries to induce better parameters by selecting parses from its k-best list that are considered ‘good’ by some external metric. [sent-237, score-0.97]

51 However, since we seed the learner with the direct transfer parser, we bias the parameters to select parses that both align well and also have high scores under the direct transfer model. [sent-241, score-1.505]
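
The projection step described in these sentences (pick, from the model's k-best list, the target parse that agrees best with the source parse, then take a perceptron step toward it) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: align_score is a hypothetical stand-in for the paper's ALIGN function that simply counts source arcs preserved under the word alignment, and features uses only head-tag/modifier-tag pairs.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def features(tags: List[str], heads: List[int]) -> Counter:
    """Delexicalized arc features: (head tag, modifier tag) counts."""
    feats = Counter()
    for mod, head in enumerate(heads, start=1):
        head_tag = "ROOT" if head == 0 else tags[head - 1]
        feats[(head_tag, tags[mod - 1])] += 1
    return feats


def align_score(src_heads: List[int], tgt_heads: List[int],
                alignment: Dict[int, int]) -> int:
    """Hypothetical stand-in for ALIGN: number of source arcs whose
    aligned endpoints also form an arc in the target parse."""
    score = 0
    for s_mod, s_head in enumerate(src_heads, start=1):
        if s_head == 0 or s_mod not in alignment or s_head not in alignment:
            continue
        if tgt_heads[alignment[s_mod] - 1] == alignment[s_head]:
            score += 1
    return score


def projected_update(w, tags, kbest, src_heads, alignment):
    """One perceptron step toward the best-aligned parse in the k-best list.

    kbest is ordered by model score, so kbest[0] is the current 1-best.
    max() keeps the earliest candidate on ties, which favours parses the
    seeded model already prefers.
    """
    best = max(kbest, key=lambda y: align_score(src_heads, y, alignment))
    if best != kbest[0]:
        for feat, value in features(tags, best).items():
            w[feat] += value
        for feat, value in features(tags, kbest[0]).items():
            w[feat] -= value
    return best


w = defaultdict(float)
tags = ["DET", "NOUN", "VERB"]
kbest = [[3, 3, 0], [2, 3, 0]]          # model's 1-best first
chosen = projected_update(w, tags, kbest, [2, 3, 0], {1: 1, 2: 2, 3: 3})
print(chosen, dict(w))
```

Because the candidates come from the seeded model's own k-best list, the update can only move toward parses that the direct transfer model already scores highly.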

52 Since posterior regularization is closely related to constraint driven learning, this makes our algorithm also similar to the parser projection approach of Ganchev et al. [sent-314, score-0.433]

53 First, we bias our model towards the direct transfer model, which is already quite powerful. [sent-317, score-0.656]

54 After training the projected parser we average the parameters [...] a delexicalized English direct transfer parser (en-dir. [sent-325, score-1.479]

55 The parsers evaluated using predicted part-of-speech tags use the predicted tags at both training and testing time and are thus free of any target language specific resources. [sent-330, score-0.535]

56 Note that the results in Table 1 indicate that parsers using predicted part-of-speech tags are only slightly worse than the parsers using gold tags (about 2-3% absolute), showing that these methods are robust to tagging errors. [sent-341, score-0.576]

57 4 Multi-Source Transfer The previous section focused on transferring an English parser to a new target language. [sent-342, score-0.503]

58 Past studies have shown that for both part-of-speech tagging and grammar induction, learning with multiple comparable languages leads to improvements (Cohen and Smith, 2009; Snyder et al. [sent-346, score-0.266]

59 Each column represents which source language was used to train a delexicalized parser and each row represents which target language test data was used. [sent-430, score-0.609]

60 Bold numbers are when source equals target and underlined numbers are the single best UAS for a target language. [sent-431, score-0.284]

61 Table 2 shows the matrix of source-target language UAS for all nine languages we consider (the original eight target languages plus English). [sent-434, score-0.53]

62 , Portuguese as a source tends to parse target languages much better than Danish, and is also more amenable as a target testing language. [sent-440, score-0.496]

63 , the Romance languages (Spanish, Italian and Portuguese) tend to transfer well to one another. [sent-443, score-0.669]

64 In other words, the multi-source direct transfer parser for Danish will be trained by first concatenating the training corpora of the remaining eight languages, training a delexicalized parser on this data and then directly using this parser to analyze the Danish test data. [sent-450, score-1.607]
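
The leave-one-out construction in the sentence above amounts to concatenating every treebank except the target's. A minimal sketch, assuming the treebanks are already delexicalized and stored in a dictionary keyed by language code (both the container and the function name are hypothetical):

```python
from typing import Dict, List


def multi_source_training_data(treebanks: Dict[str, List[list]],
                               target: str) -> List[list]:
    """Concatenate all source treebanks, leaving out the target language."""
    return [sentence
            for lang, sentences in treebanks.items()
            if lang != target
            for sentence in sentences]


# Toy placeholder data; each inner list stands for one delexicalized sentence.
toy = {"en": [["s1"]], "da": [["s2"]], "pt": [["s3"], ["s4"]]}
print(multi_source_training_data(toy, "da"))
```

A delexicalized parser trained on the returned list is then applied directly to the target language test data.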

65 2 except that we use the multi-source direct transfer model to seed the algorithm instead of the English-only direct transfer model. [sent-452, score-1.354]

66 The first (best-source) is the direct transfer results for the oracle single-best source language per target language. [sent-455, score-0.844]

67 The second (avg-source) is the mean UAS over all source languages per target language. [sent-456, score-0.369]
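
The two reference points defined above, best-source and avg-source, are simple statistics over one row of the source-target UAS matrix. The sketch below computes both for a single target language while excluding the target itself as a source; the numbers in the example are made-up placeholders, not results from the paper.

```python
from typing import Dict, Tuple


def best_and_avg_source(uas_by_source: Dict[str, float],
                        target: str) -> Tuple[str, float, float]:
    """Return (best source language, its UAS, mean UAS over sources).

    The target language is excluded from the candidate sources.
    """
    others = {src: score for src, score in uas_by_source.items()
              if src != target}
    best = max(others, key=others.get)
    avg = sum(others.values()) / len(others)
    return best, others[best], avg


# Placeholder scores for one target language (not figures from the paper)
row_for_da = {"da": 80.0, "en": 50.0, "pt": 60.0}
print(best_and_avg_source(row_for_da, "da"))
```

Note that best-source is an oracle: choosing it requires gold target-language parses, which is why it serves as a reference point rather than a usable system.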

68 The resulting parsers are typically much more accurate than the English direct transfer system (Table 1). [sent-461, score-0.795]

69 On average, the multi-source direct transfer system reduces errors by 10% relative over the English-only direct transfer system. [sent-462, score-1.312]
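
Relative error reductions such as the one quoted above are computed on error rates (100 minus UAS), not on UAS directly. A small worked sketch, using made-up UAS values chosen only to make the arithmetic easy to follow:

```python
def relative_error_reduction(uas_before: float, uas_after: float) -> float:
    """Percentage of the original attachment errors that were removed."""
    err_before = 100.0 - uas_before
    err_after = 100.0 - uas_after
    if err_before <= 0:
        raise ValueError("baseline has no errors to reduce")
    return 100.0 * (err_before - err_after) / err_before


# Hypothetical example: going from 60.0 to 64.0 UAS cuts errors
# from 40.0 to 36.0, a 10% relative reduction.
print(relative_error_reduction(60.0, 64.0))
```

This is why a gain of a few absolute UAS points can correspond to a double-digit relative error reduction.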

70 An inspection of Table 2 shows that for these two languages English is a particularly good source training language. [sent-465, score-0.273]

71 Some languages see basically no change relative the multi-source direct transfer model, while some languages see modest to significant increases. [sent-467, score-1.018]

72 In particular, starting with an English-only direct transfer parser with 57. [sent-469, score-0.865]

73 0% UAS on average, by adding parallel corpora and multiple source languages we finish with parser having 63. [sent-470, score-0.574]

74 Thus, even this simple method of source model from the languages in Table 2 (excluding the target language). [sent-473, score-0.369]

75 avg-source is the mean UAS over the source models for the target (excluding target language). [sent-474, score-0.284]

76 5 Comparison Comparing unsupervised and parser projection systems is difficult as many publications use non-overlapping sets of languages or different evaluation criteria. [sent-482, score-0.528]

77 Second, we can generate USR results for all eight languages and not just for the languages that they report. [sent-489, score-0.434]

78 PGI: The phylogenetic grammar induction (PGI) model of Berg-Kirkpatrick and Klein (2010), in which the parameters of completely unsupervised DMV models for multiple languages are coupled via a phylogenetic prior. [sent-490, score-0.481]

79 (2009), in which a supervised English parser is used to generate constraints that are projected using a parallel corpus and used to regularize a target language parser. [sent-492, score-0.623]

80 The overall trends carry over from the full treebank setting to this reduced sentence length setup: the projected models outperform the direct transfer models and multisource transfer gives higher accuracy than transferring only from English. [sent-496, score-1.643]

81 Most previous work has assumed gold part-of-speech tags, but as the code for USR is publicly available we were able to train it using the same projected part-of-speech tags used in our models. [sent-497, score-0.341]

82 It is not surprising that a parser transferred from annotated resources does significantly better than unsupervised systems since it has much more information from which to learn. [sent-500, score-0.39]

83 For Spanish we can see that the multi-source direct transfer parser is better (75. [sent-503, score-0.865]

84 6%), and this is also true for the multi-source projected parser to three representative systems from related work. [sent-505, score-0.402]

85 We trained a multi-source direct transfer parser for Bulgarian which obtained a score of 72. [sent-516, score-0.865]

86 Thus, under identical conditions the direct transfer model obtains accuracies comparable to PR. [sent-527, score-0.766]

87 8% by directly transferring parsers from Italian or Portuguese respectively. [sent-539, score-0.337]

88 6 Discussion One fundamental point the above experiments illustrate is that even for languages for which no resources exist, simple methods for transferring parsers work remarkably well. [sent-540, score-0.518]

89 Using the CoNLL 2007 English data set for training, the English direct transfer model is 63. [sent-542, score-0.656]

90 one can transfer part-of-speech tags, then a large part of transferring unlabeled dependencies has been solved. [sent-548, score-0.764]

91 This observation should lead to a new baseline in unsupervised and projected grammar induction: the UAS of a delexicalized English parser. [sent-549, score-0.657]

92 Preliminary experiments for Arabic (ar), Chinese (zh), and Japanese (ja) suggest similar direct transfer methods are applicable. [sent-551, score-0.656]

93 6% for ar/zh/ja respectively, whereas an English direct transfer parser obtains 32. [sent-555, score-0.919]

94 In this setting only Indo-European languages are used as source data. [sent-562, score-0.273]

95 Thus, even across language groups direct transfer is a reasonable baseline. [sent-563, score-0.519]

96 This suggests that even better transfer models can be produced by separately weighting each of the sources depending on the target language, either weighting by hand, if we know the language group of the target language, or automatically, if we do not. [sent-571, score-0.68]

97 7 Conclusions We presented a simple, yet effective approach for projecting parsers from languages with labeled training data to languages without any labeled training data. [sent-574, score-0.571]

98 Central to our approach is the idea of delexicalizing the models, which combined with a standardized part-of-speech tagset allows us to directly transfer models between languages. [sent-575, score-0.614]

99 We then use a constraint driven learning algorithm to adapt the transferred parsers to the respective target language, obtaining an additional 16% error reduction on average in a multi-source setting. [sent-576, score-0.411]

100 Our final parsers achieve state-of-the-art accuracies on eight Indo-European languages, significantly outperforming previous unsupervised and projected systems. [sent-577, score-0.516]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('transfer', 0.488), ('delexicalized', 0.212), ('parser', 0.209), ('uas', 0.198), ('transferring', 0.198), ('projected', 0.193), ('languages', 0.181), ('direct', 0.168), ('ganchev', 0.166), ('parsers', 0.139), ('transferred', 0.125), ('usr', 0.108), ('zeman', 0.104), ('tags', 0.102), ('align', 0.102), ('smith', 0.1), ('dependency', 0.1), ('target', 0.096), ('source', 0.092), ('parallel', 0.092), ('tagset', 0.09), ('gaard', 0.09), ('greek', 0.085), ('dmv', 0.085), ('grammar', 0.085), ('nivre', 0.084), ('petrov', 0.084), ('english', 0.083), ('projection', 0.082), ('spanish', 0.081), ('danish', 0.075), ('induction', 0.075), ('treebank', 0.072), ('eight', 0.072), ('treebanks', 0.07), ('projecting', 0.07), ('universal', 0.068), ('cohen', 0.064), ('das', 0.061), ('hwa', 0.057), ('aligned', 0.056), ('unsupervised', 0.056), ('accuracies', 0.056), ('obtains', 0.054), ('portuguese', 0.054), ('buchholz', 0.052), ('parsing', 0.051), ('naseem', 0.051), ('resnik', 0.051), ('driven', 0.051), ('dp', 0.049), ('parses', 0.049), ('punc', 0.049), ('chang', 0.049), ('predicted', 0.048), ('regularization', 0.048), ('dutch', 0.047), ('hall', 0.047), ('gold', 0.046), ('marsi', 0.045), ('eisner', 0.045), ('mcdonald', 0.044), ('posterior', 0.043), ('phylogenetic', 0.042), ('pr', 0.042), ('seed', 0.042), ('altaic', 0.042), ('drafts', 0.042), ('xsm', 0.042), ('bulgarian', 0.04), ('concatenating', 0.04), ('adj', 0.04), ('spitkovsky', 0.04), ('perceptron', 0.04), ('unlabeled', 0.04), ('sv', 0.038), ('noun', 0.038), ('dependencies', 0.038), ('lengths', 0.037), ('yt', 0.037), ('europarl', 0.037), ('stack', 0.037), ('observation', 0.036), ('pgi', 0.036), ('standardized', 0.036), ('multisource', 0.036), ('semitic', 0.036), ('italian', 0.035), ('snyder', 0.034), ('klein', 0.033), ('constraints', 0.033), ('dca', 0.033), ('begun', 0.033), ('tences', 0.033), ('google', 0.032), ('pt', 0.031), ('parse', 0.031), ('alignment', 0.031), ('across', 0.031), ('arcs', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers


2 0.28352973 115 emnlp-2011-Relaxed Cross-lingual Projection of Constituent Syntax

Author: Wenbin Jiang ; Qun Liu ; Yajuan Lv

Abstract: We propose a relaxed correspondence assumption for cross-lingual projection of constituent syntax, which allows a supposed constituent of the target sentence to correspond to an unrestricted treelet in the source parse. Such a relaxed assumption fundamentally tolerates the syntactic non-isomorphism between languages, and enables us to learn the target-language-specific syntactic idiosyncrasy rather than a strained grammar directly projected from the source language syntax. Based on this assumption, a novel constituency projection method is also proposed in order to induce a projected constituent treebank from the source-parsed bilingual corpus. Experiments show that the parser trained on the projected treebank dramatically outperforms previous projected and unsupervised parsers.

3 0.22591098 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance

Author: Shay B. Cohen ; Dipanjan Das ; Noah A. Smith

Abstract: We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.

4 0.2065547 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives

Author: Keith Hall ; Ryan McDonald ; Jason Katz-Brown ; Michael Ringgaard

Abstract: We present an online learning algorithm for training parsers which allows for the inclusion of multiple objective functions. The primary example is the extension of a standard supervised parsing objective function with additional loss-functions, either based on intrinsic parsing quality or task-specific extrinsic measures of quality. Our empirical results show how this approach performs for two dependency parsing algorithms (graph-based and transition-based parsing) and how it achieves increased performance on multiple target tasks including reordering for machine translation and parser adaptation.

5 0.19573145 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Angel X. Chang ; Daniel Jurafsky

Abstract: We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags: requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus, 0.7% higher than using gold tags.

6 0.1744983 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

7 0.16286625 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

8 0.15578555 136 emnlp-2011-Training a Parser for Machine Translation Reordering

9 0.13875221 140 emnlp-2011-Universal Morphological Analysis using Structured Nearest Neighbor Prediction

10 0.12620574 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types

11 0.12559542 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

12 0.11752988 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

13 0.11426981 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

14 0.11354123 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus

15 0.096312664 118 emnlp-2011-SMT Helps Bitext Dependency Parsing

16 0.095101506 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering

17 0.094507001 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

18 0.093421839 3 emnlp-2011-A Correction Model for Word Alignments

19 0.085687444 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

20 0.085000463 121 emnlp-2011-Semi-supervised CCG Lexicon Extension


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.309), (1, 0.18), (2, -0.019), (3, 0.299), (4, -0.081), (5, 0.216), (6, -0.095), (7, 0.068), (8, -0.149), (9, 0.049), (10, -0.134), (11, 0.023), (12, 0.078), (13, 0.069), (14, 0.154), (15, 0.063), (16, 0.083), (17, -0.02), (18, -0.143), (19, -0.071), (20, 0.008), (21, -0.098), (22, 0.024), (23, 0.029), (24, 0.127), (25, 0.054), (26, 0.063), (27, -0.053), (28, -0.012), (29, -0.055), (30, 0.008), (31, 0.015), (32, 0.097), (33, 0.062), (34, -0.094), (35, -0.1), (36, -0.05), (37, -0.004), (38, 0.043), (39, -0.114), (40, -0.015), (41, -0.006), (42, -0.013), (43, -0.003), (44, -0.085), (45, -0.04), (46, -0.065), (47, 0.059), (48, -0.056), (49, 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96628284 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers


2 0.85840368 115 emnlp-2011-Relaxed Cross-lingual Projection of Constituent Syntax

Author: Wenbin Jiang ; Qun Liu ; Yajuan Lv

Abstract: We propose a relaxed correspondence assumption for cross-lingual projection of constituent syntax, which allows a supposed constituent of the target sentence to correspond to an unrestricted treelet in the source parse. Such a relaxed assumption fundamentally tolerates the syntactic non-isomorphism between languages, and enables us to learn the target-language-specific syntactic idiosyncrasy rather than a strained grammar directly projected from the source language syntax. Based on this assumption, a novel constituency projection method is also proposed in order to induce a projected constituent treebank from the source-parsed bilingual corpus. Experiments show that the parser trained on the projected treebank dramatically outperforms previous projected and unsupervised parsers.

3 0.72231883 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance

Author: Shay B. Cohen ; Dipanjan Das ; Noah A. Smith

Abstract: We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.

4 0.60053062 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Angel X. Chang ; Daniel Jurafsky

Abstract: We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags: requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus, 0.7% higher than using gold tags.

5 0.57715625 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus

Author: Emily M. Bender ; Dan Flickinger ; Stephan Oepen ; Yi Zhang

Abstract: In order to obtain a fine-grained evaluation of parser accuracy over naturally occurring text, we study 100 examples each of ten reasonably frequent linguistic phenomena, randomly selected from a parsed version of the English Wikipedia. We construct a corresponding set of gold-standard target dependencies for these 1000 sentences, operationalize mappings to these targets from seven state-of-the-art parsers, and evaluate the parsers against this data to measure their level of success in identifying these dependencies.

6 0.55892652 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives

7 0.54330051 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

8 0.50913978 140 emnlp-2011-Universal Morphological Analysis using Structured Nearest Neighbor Prediction

9 0.49820825 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types

10 0.48012829 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

11 0.46981645 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

12 0.4499386 118 emnlp-2011-SMT Helps Bitext Dependency Parsing

13 0.42987463 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

14 0.41918674 136 emnlp-2011-Training a Parser for Machine Translation Reordering

15 0.40100911 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

16 0.39379776 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering

17 0.38122231 16 emnlp-2011-Accurate Parsing with Compact Tree-Substitution Grammars: Double-DOP

18 0.36873576 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

19 0.36570451 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

20 0.35231486 121 emnlp-2011-Semi-supervised CCG Lexicon Extension


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.084), (36, 0.024), (37, 0.019), (45, 0.048), (53, 0.02), (54, 0.018), (57, 0.015), (62, 0.012), (64, 0.436), (66, 0.052), (69, 0.014), (79, 0.083), (82, 0.019), (90, 0.015), (96, 0.052), (98, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.9234944 45 emnlp-2011-Dual Decomposition with Many Overlapping Components

Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar

Abstract: Dual decomposition has been recently proposed as a way of combining complementary models, with a boost in predictive power. However, in cases where lightweight decompositions are not readily available (e.g., due to the presence of rich features or logical constraints), the original subgradient algorithm is inefficient. We sidestep that difficulty by adopting an augmented Lagrangian method that accelerates model consensus by regularizing towards the averaged votes. We show how first-order logical constraints can be handled efficiently, even though the corresponding subproblems are no longer combinatorial, and report experiments in dependency parsing, with state-of-the-art results.

same-paper 2 0.87844604 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers

Author: Ryan McDonald ; Slav Petrov ; Keith Hall

Abstract: We present a simple method for transferring dependency parsers from source languages with labeled training data to target languages without labeled training data. We first demonstrate that delexicalized parsers can be directly transferred between languages, producing significantly higher accuracies than unsupervised parsers. We then use a constraint driven learning algorithm where constraints are drawn from parallel corpora to project the final parser. Unlike previous work on projecting syntactic resources, we show that simple methods for introducing multiple source languages can significantly improve the overall quality of the resulting parsers. The projected parsers from our system result in state-of-the-art performance when compared to previously studied unsupervised and projected parsing systems across eight different languages.

3 0.86894351 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance

Author: Shay B. Cohen ; Dipanjan Das ; Noah A. Smith

Abstract: We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.

4 0.53382027 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction

Author: Sebastian Riedel ; Andrew McCallum

Abstract: Extracting biomedical events from literature has attracted much recent attention. The best-performing systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches is cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-the-art. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011 Infectious Diseases task.

5 0.51506013 122 emnlp-2011-Simple Effective Decipherment via Combinatorial Optimization

Author: Taylor Berg-Kirkpatrick ; Dan Klein

Abstract: We present a simple objective function that when optimized yields accurate solutions to both decipherment and cognate pair identification problems. The objective simultaneously scores a matching between two alphabets and a matching between two lexicons, each in a different language. We introduce a simple coordinate descent procedure that efficiently finds effective solutions to the resulting combinatorial optimization problem. Our system requires only a list of words in both languages as input, yet it competes with and surpasses several state-of-the-art systems that are both substantially more complex and make use of more information.

6 0.48671624 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

7 0.4793596 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

8 0.47417593 140 emnlp-2011-Universal Morphological Analysis using Structured Nearest Neighbor Prediction

9 0.47383127 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax

10 0.47103301 115 emnlp-2011-Relaxed Cross-lingual Projection of Constituent Syntax

11 0.4678483 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

12 0.44560561 136 emnlp-2011-Training a Parser for Machine Translation Reordering

13 0.44235894 77 emnlp-2011-Large-Scale Cognate Recovery

14 0.44024581 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion

15 0.43669465 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation

16 0.43237725 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

17 0.43094257 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

18 0.43039924 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

19 0.42997953 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model

20 0.4281646 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction