acl acl2012 acl2012-3 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Spence Green ; John DeNero
Abstract: When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes that are predicted during decoding for each translation hypothesis. For English-to-Arabic translation, our model yields a +1.04 BLEU average improvement over a state-of-the-art baseline. The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders.
Reference: text
sentIndex sentText sentNum sentScore
1 When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. [sent-2, score-0.7]
2 To address this issue, we present a target-side, class-based agreement model. [sent-3, score-0.299]
3 Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes that are predicted during decoding for each translation hypothesis. [sent-4, score-0.567]
4 The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders. [sent-7, score-0.192]
5 English is a weakly inflected language: it has a narrow verbal paradigm, restricted nominal inflection (plurals), and only the vestiges of a case system. [sent-9, score-0.182]
6 Consequently, translation into English—which accounts for much of the machine translation (MT) literature (Lopez, 2008)—often involves some amount of morpho-syntactic dimensionality reduction. [sent-10, score-0.494]
7 The correct translation is a monotone mapping of the input. [sent-14, score-0.247]
8 Although the translation has [sent-26, score-0.247]
9 Figure 1: Ungrammatical Arabic output of Google Translate for the English input "The car goes quickly." [sent-42, score-0.317]
10 The subject should agree with the verb in both gender and number, but the verb has masculine inflection. [sent-43, score-0.3]
11 This paper addresses the problem of generating text that conforms to morpho-syntactic agreement rules. [sent-46, score-0.299]
12 We address this shortcoming with an agreement model that scores sequences of fine-grained morphosyntactic classes. [sent-48, score-0.401]
13 Finally, agreement is promoted by scoring the predicted class sequences with a generative Markov model. [sent-53, score-0.538]
14 Unlike previous models for scoring syntactic relations, our model does not require bitext annotations, phrase table features, or decoder modifications. [sent-55, score-0.41]
15 Intuition might suggest that the standard n-gram language model (LM) is sufficient to handle agreement phenomena. [sent-59, score-0.354]
16 It has also been suggested that this setting requires morphological generation because the bitext may not contain all inflected variants (Minkov et al. [sent-65, score-0.363]
17 However, using lexical coverage experiments, we show that there is ample room for translation quality improvements through better selection of forms that already exist in the translation model. [sent-69, score-0.58]
18 1 Morpho-syntactic Agreement Morpho-syntactic agreement refers to a relationship between two sentence elements a and b that must have at least one matching grammatical feature. [sent-71, score-0.389]
19 In some languages, agreement affects the surface forms of the words. [sent-73, score-0.344]
20 When this nominal appears in the subject argument position, the verb-subject agreement relationship triggers feminine inflection of the verb. [sent-80, score-0.449]
21 Our model treats agreement as a sequence of scored, pairwise relations between adjacent words. [sent-81, score-0.41]
22 Of course, this assumption excludes some agreement phenomena, but it is sufficient for many common cases. [sent-82, score-0.299]
23 We focus on English-Arabic translation as an example of a translation direction that expresses substantially more morphological information in the target. [sent-83, score-0.602]
24 The agreement model scores sequences of morphosyntactic word classes, which express grammatical features relevant to agreement. [sent-85, score-0.534]
25 In some languages, agreement relations exist between bound morphemes, which are syntactically independent yet phonologically dependent morphemes. [sent-89, score-0.355]
26 Footnote 1: We use morpho-syntactic and grammatical agreement interchangeably, as is common in the literature. [sent-90, score-0.389]
27 Our model segments the raw token, tags each segment with a morpho-syntactic class (e. [sent-123, score-0.201]
28 Because the morphemes bear conflicting grammatical features and basic parts of speech (POS), we need to segment the token before we can evaluate agreement relations. [sent-128, score-0.611]
29 2 Segmentation is typically applied as a bitext preprocessing step, and there is a rich literature on the effect of different segmentation schemata on translation quality (Koehn and Knight, 2003; Habash and Sadat, 2006; El Kholy and Habash, 2012). [sent-129, score-0.421]
30 Unlike previous work, we segment each translation hypothesis as it is generated (i. [sent-130, score-0.348]
31 For example, it may be useful to count tokens with bound morphemes as a unit during phrase extraction, but to score segmented morphemes separately for agreement. [sent-134, score-0.243]
32 For this task we also train a standard CRF model on full sentences with gold classes and segmentation. [sent-151, score-0.167]
33 A translation derivation is a tuple ⟨e, f, a⟩ where e is the target, f is the source, and a is an alignment between the two. [sent-153, score-0.247]
34 The CRF tagging model predicts a target-side class sequence $\tau^{*}$: $\tau^{*} = \arg\max_{\tau} \sum_{i=1}^{I} \theta_{tag} \cdot \{\phi_o(\tau_i, i, e) + \phi_t(\tau_i, \tau_{i-1})\}$, where further notation is defined in Fig. [sent-154, score-0.211]
35 Set of Classes The tagger assigns morpho-syntactic classes, which are coarse POS categories refined with grammatical features such as gender and definiteness. [sent-157, score-0.406]
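The argmax over class sequences in sentence 34 is the kind of objective that Viterbi decoding solves. The following is a minimal sketch, not the authors' implementation: it decodes a complete sequence offline, whereas the paper runs inference incrementally during translation decoding. The functions `emit` and `trans` are hypothetical stand-ins for the weighted emission and transition feature scores, and the toy scores are invented for illustration.

```python
def viterbi(n, classes, emit, trans):
    """Return the highest-scoring class sequence of length n.

    emit(i, c) and trans(prev, c) are assumed to return additive scores
    (hypothetical stand-ins for the weighted CRF feature functions).
    """
    if n == 0:
        return []
    best = [{c: emit(0, c) for c in classes}]
    back = [{}]
    for i in range(1, n):
        scores, pointers = {}, {}
        for c in classes:
            # Best predecessor class for class c at position i.
            prev = max(classes, key=lambda p: best[i - 1][p] + trans(p, c))
            scores[c] = best[i - 1][prev] + trans(prev, c) + emit(i, c)
            pointers[c] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final class.
    last = max(classes, key=lambda c: best[-1][c])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy scores (invented): the emission score alone slightly prefers Adj+Masc
# at position 1, but the transition score rewards noun-adjective agreement.
CLASSES = ["Noun+Fem", "Adj+Fem", "Adj+Masc"]
EMIT = [{"Noun+Fem": 2.0}, {"Adj+Fem": 1.0, "Adj+Masc": 1.2}]


def emit(i, c):
    return EMIT[i].get(c, 0.0)


def trans(p, c):
    return 1.0 if (p, c) == ("Noun+Fem", "Adj+Fem") else 0.0


tags = viterbi(2, CLASSES, emit, trans)
```

In this toy setting the transition score outweighs the emission preference, so the decoded sequence agrees in gender.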
36 Many of these treebanks also contain per-token morphological annotations. [sent-161, score-0.192]
37 We restricted the set of classes to observed combinations in the training data, so the model implicitly disallows incoherent classes like “Verb+Def”. [sent-168, score-0.279]
38 Features The tagging CRF includes emission features φo that indicate a class τi appearing with various orthographic characteristics of the word sequence being tagged. [sent-169, score-0.245]
39 However, we conduct CRF inference in tandem with the translation decoding procedure (§3), creating an environment in which subsequent words of the observation are not available; the MT system has yet to generate the rest of the translation when the tagging features for a position are scored. [sent-171, score-0.709]
40 For example, the model learns that the Arabic class “Noun+Fem” is followed by “Adj+Fem” and not “Adj+Masc” (noun-adjective gender agreement). [sent-176, score-0.294]
41 4Case is also relevant to agreement in Arabic, but it is mostly indicated by diacritics, which are absent in unvocalized text. [sent-177, score-0.299]
42 4 Word Class Sequence Scoring The CRF tagger model defines a conditional distribution p(τ|e; θtag) for a class sequence τ given a sentence e and model parameters θtag. [sent-179, score-0.245]
43 Discriminative model scores have been used as MT features (Galley and Manning, 2009), but we obtained better results by scoring the 1-best class sequences with a generative model. [sent-182, score-0.337]
44 We trained a simple add-1 smoothed bigram language model over gold class sequences in the same treebank training data: $q(e) = p(\tau) = \prod_{i=1}^{I} p(\tau_i \mid \tau_{i-1})$. We chose a bigram model due to the aggressive recombination strategy in our phrase-based decoder. [sent-183, score-0.413]
45 In less restrictive decoders, higher order scoring models could be used to score longer-distance agreement relations. [sent-185, score-0.389]
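A minimal sketch of the add-1 smoothed bigram model over class sequences described in sentence 44 might look as follows. The start symbol `<s>`, the function names, and the three toy "gold" sequences are assumptions made for illustration; the paper trains on treebank class sequences.

```python
import math
from collections import Counter


def train_bigram(class_sequences):
    """Collect bigram and history counts over gold class sequences."""
    bigrams, histories, vocab = Counter(), Counter(), set()
    for seq in class_sequences:
        prev = "<s>"  # assumed sentence-start symbol
        for tag in seq:
            bigrams[(prev, tag)] += 1
            histories[prev] += 1
            vocab.add(tag)
            prev = tag
    return bigrams, histories, vocab


def log_prob(seq, model):
    """Add-1 smoothed log p(tau) = sum_i log p(tau_i | tau_{i-1})."""
    bigrams, histories, vocab = model
    v = len(vocab)
    total, prev = 0.0, "<s>"
    for tag in seq:
        total += math.log((bigrams[(prev, tag)] + 1) / (histories[prev] + v))
        prev = tag
    return total


# Toy class sequences (invented for illustration).
gold = [
    ["Noun+Fem", "Adj+Fem"],
    ["Noun+Masc", "Adj+Masc"],
    ["Noun+Fem", "Adj+Fem"],
]
model = train_bigram(gold)
agree = log_prob(["Noun+Fem", "Adj+Fem"], model)
disagree = log_prob(["Noun+Fem", "Adj+Masc"], model)
```

On this toy data the agreeing sequence receives the higher log-probability, which is the behavior the model relies on to promote agreement.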
46 We integrate the segmentation, tagging, and scoring models into a self-contained component in the translation decoder. [sent-186, score-0.337]
47 3 Inference during Translation Decoding Scoring the agreement model as part of translation decoding requires a novel inference procedure. [sent-187, score-0.719]
48 The beam search relies on three operations, two of which affect the agreement model: [sent-193, score-0.299]
49 • Extend a hypothesis with a new phrase pair • Recombine hypotheses with identical states We assume familiarity with these operations, which are described in detail in Och and Ney (2004). [sent-198, score-0.196]
50 2 Agreement Model Inference The class-based agreement model is implemented as a feature function hm in Eq. [sent-200, score-0.354]
51 The inputs are a translation hypothesis $e_1^I$, an index n distinguishing the prefix from the attachment, and a flag indicating if their concatenation is a goal hypothesis. [sent-204, score-0.304]
52 With a trigram language model, the state might be the last two words of the translation prefix. [sent-207, score-0.247]
53 As a result, two hypotheses with different full prefixes, and thus potentially different sequences of agreement relations, can be recombined. [sent-209, score-0.43]
54 The agreement model state is the last tagged segment ⟨s, t⟩ of the concatenated hypothesis. [sent-212, score-0.398]
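To illustrate how a compact per-feature state enables recombination, here is a simplified sketch. It assumes each hypothesis is a dictionary with a score, a language model state (the last two words, as for a trigram LM), and an agreement state (the last tagged segment and its class). The field names are hypothetical, and a real phrase-based decoder would also compare source coverage and other feature states.

```python
def recombine(hypotheses):
    """Keep only the best-scoring hypothesis for each combined state.

    Hypotheses with identical states are indistinguishable to future
    feature scoring, so the lower-scoring ones can be dropped.
    """
    best = {}
    for hyp in hypotheses:
        key = (hyp["lm_state"], hyp["agr_state"])
        if key not in best or hyp["score"] > best[key]["score"]:
            best[key] = hyp
    return list(best.values())


# Toy hypotheses (invented): the first two share both states despite
# differing earlier in the prefix, so only the better one survives.
hyps = [
    {"score": -4.2, "lm_state": ("b", "c"), "agr_state": ("c", "Adj+Fem")},
    {"score": -3.9, "lm_state": ("b", "c"), "agr_state": ("c", "Adj+Fem")},
    {"score": -5.0, "lm_state": ("b", "c"), "agr_state": ("c", "Adj+Masc")},
]
survivors = recombine(hyps)
```

Because the agreement state is only the last tagged segment, a bigram class model can be scored exactly after recombination, which is consistent with the paper's stated reason for choosing a bigram model.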
55 To accelerate tagger decoding in our experiments, we also used tagging dictionaries for frequently observed word types. [sent-216, score-0.26]
56 3 Translation Model Features The agreement model score is one decoder feature function. [sent-219, score-0.438]
57 To discriminate between hypotheses that might have the same number of raw tokens, but different underlying segmentations, we add a penalty equal to the length difference between the segmented and unsegmented attachments: $|\hat{s}_1^L| - |e_{n+1}^I|$. [sent-224, score-0.18]
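The length-difference penalty in sentence 57 can be sketched in a few lines. The segmenter here is a toy stand-in that splits on a `+` boundary marker, and the transliterated tokens are invented for illustration; the paper uses a trained character-level segmenter.

```python
def segmentation_penalty(attachment_tokens, segment):
    """Number of segments minus number of raw tokens in the attachment."""
    segments = [m for tok in attachment_tokens for m in segment(tok)]
    return len(segments) - len(attachment_tokens)


def toy_segment(token):
    # Hypothetical segmenter: morpheme boundaries are pre-marked with "+".
    return token.split("+")


# Two raw tokens; the first contains two bound morphemes plus a stem.
penalty = segmentation_penalty(["w+Al+syArp", "tsyr"], toy_segment)
```

Two attachments with the same number of raw tokens then receive different feature values when one of them contains more bound morphemes.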
58 4 Related Work We compare our class-based model to previous approaches to scoring syntactic relations in MT. [sent-225, score-0.245]
59 Williams and Koehn (2011) annotated German trees, and extracted translation rules from them. [sent-231, score-0.247]
60 In contrast, our class-based model does not require any manual rules and scores similar agreement phenomena as probabilistic sequences. [sent-233, score-0.354]
61 Factored Translation Models Factored translation models (Koehn and Hoang, 2007) facilitate a more data-oriented approach to agreement modeling. [sent-234, score-0.546]
62 Subotin (2011) recently extended factored translation models to hierarchical phrase-based translation and developed a discriminative model for predicting target-side morphology in English-Czech. [sent-244, score-0.687]
63 In contrast to these methods, our model does not affect phrase extraction and does not require annotated translation rules. [sent-246, score-0.357]
64 Och (1999) showed a method for inducing bilingual word classes that placed each phrase pair into a two-dimensional equivalence class. [sent-250, score-0.167]
65 Then they mixed the classes into a word-based LM. [sent-253, score-0.231]
66 It is unclear if their classes captured agreement information. [sent-255, score-0.411]
67 Target-Side Syntactic LMs Our agreement model is a form of syntactic LM, of which there is a long history of research, especially in speech processing. [sent-257, score-0.398]
68 In contrast, our class-based model encodes shallow syntactic information without a noticeable effect on decoding time. [sent-267, score-0.217]
69 Our model can be viewed as a way to score local syntactic relations without extensive decoder modifications. [sent-268, score-0.239]
70 The target-side structure enables scoring hypotheses with a trigram dependency LM. [sent-271, score-0.174]
71 5 Experiments We first evaluate the Arabic segmenter and tagger components independently, then provide English-Arabic translation quality results. [sent-272, score-0.423]
72 For training the tagger, we automatically converted the ATB morphological analyses to the fine-grained class set. [sent-290, score-0.21]
73 This includes next-character segmenter features and next-word tagger features. [sent-297, score-0.219]
74 2 Translation Quality Experimental Setup Our decoder is based on the phrase-based approach to translation (Och and Ney, 2004) and contains various feature functions including phrase relative frequency, word-level alignment statistics, and lexicalized re-ordering models (Tillmann, 2004; Och et al. [sent-304, score-0.386]
75 We trained the translation model on 502 million words of parallel text collected from a variety of sources, including the Web. [sent-345, score-0.302]
76 To reverse the translation direction for each data set, we chose the first English reference as the source and the Arabic as the reference. [sent-353, score-0.247]
77 The NIST sets come in two varieties: newswire (MT02-05) and mixed genre (MT06,08). [sent-354, score-0.248]
78 Newswire contains primarily Modern Standard Arabic (MSA), while the mixed genre data sets also contain transcribed speech and web text. [sent-355, score-0.238]
79 Since the ATB contains MSA, and significant lexical and syntactic differences may exist between MSA and the mixed genres, we achieved best results by tuning on MT04, the largest newswire set. [sent-356, score-0.262]
80 We evaluated translation quality with BLEU-4 (Papineni et al. [sent-357, score-0.247]
81 2 shows translation quality results on newswire, while Tbl. [sent-360, score-0.247]
82 Finally, +POS+Agr shows the class-based model with the fine-grained classes (e. [sent-366, score-0.167]
83 We realized smaller, yet statistically significant, gains on the mixed genre data sets. [sent-371, score-0.196]
84 Conversely, the mixed genre data sets contain more irregularities. [sent-376, score-0.238]
85 Example gloss: "Start the program music match (MusicMatch)". In these imperatives, there are no lexically marked agreement relations to score. [sent-403, score-0.355]
86 The ATB contains few examples like these, so our class-based model probably does not effectively discriminate between alternative hypotheses for these types of sentences. [sent-408, score-0.185]
87 Phrase Table Coverage In a standard phrase-based system, effective translation into a highly inflected target language requires that the phrase table contain the inflected word forms necessary to construct an output with correct agreement. [sent-409, score-0.694]
88 During development, we observed that the phrase table of our large-scale English-Arabic system did often contain the inflected forms that we desired the system to select. [sent-411, score-0.273]
89 In fact, correctly agreeing alternatives often appeared in n-best translation lists. [sent-412, score-0.247]
90 The statistics below report the token-level recall of reference unigrams: • Baseline system translation output: 44. [sent-414, score-0.247]
91 8% The bottom category includes all lexical items that the decoder could produce in a translation of the source. [sent-416, score-0.331]
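Token-level recall of reference unigrams, the statistic used in the coverage experiment above, could be computed roughly as follows. This sketch assumes clipped counts (each candidate token can match at most one reference token) and whitespace tokenization; the paper does not spell out these details here, so both are assumptions, and the example strings are invented.

```python
from collections import Counter


def unigram_recall(reference_tokens, candidate_tokens):
    """Fraction of reference tokens also found among the candidate tokens."""
    ref = Counter(reference_tokens)
    cand = Counter(candidate_tokens)
    # Clip each match at the number of times the candidate has the token.
    matched = sum(min(n, cand[w]) for w, n in ref.items())
    total = sum(ref.values())
    return matched / total if total else 0.0


reference = "the car goes quickly".split()
output = "the car go quickly".split()
recall = unigram_recall(reference, output)
```

The same function applies when `candidate_tokens` is the set of all target words reachable from the phrase table for a given source sentence, which gives the upper-bound category discussed in sentence 91.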
92 We randomly sampled 100 of these sentences and counted agreement errors of all types. [sent-422, score-0.299]
93 In our output, a frequent source of remaining errors was the case of so-called “deflected agreement”: inanimate plural nouns require feminine singular agreement with modifiers. [sent-426, score-0.545]
94 On the other hand, animate plural nouns require the sound plural, which is indicated by an appropriate masculine or feminine suffix. [sent-427, score-0.262]
95 The ATB does not contain animacy annotations, so our agreement model cannot discriminate between these two cases. [sent-437, score-0.481]
96 7 Conclusion and Outlook Our class-based agreement model improves translation quality by promoting local agreement, but with a minimal increase in decoding time and no additional storage requirements for the phrase table. [sent-439, score-0.774]
97 Nevertheless, we also showed an improvement, albeit less significant, on mixed genre evaluation sets. [sent-442, score-0.196]
98 However, our analysis has shown that for Arabic, these genres typically contain more Latin script and transliterated words, and thus there is less morphology to score. [sent-444, score-0.178]
99 A corpus for modeling morpho-syntactic agreement in Arabic: Gender, number and rationality. [sent-454, score-0.299]
100 Phrasal: A statistical machine translation toolkit for exploring new model features. [sent-519, score-0.302]
wordName wordTfidf (topN-words)
[('arabic', 0.335), ('agreement', 0.299), ('translation', 0.247), ('fem', 0.159), ('atb', 0.158), ('gender', 0.137), ('inflected', 0.131), ('mixed', 0.119), ('decoding', 0.118), ('mt', 0.115), ('habash', 0.114), ('classes', 0.112), ('morphological', 0.108), ('macherey', 0.106), ('musicmatch', 0.106), ('class', 0.102), ('feminine', 0.099), ('morphemes', 0.094), ('plural', 0.094), ('segmentation', 0.092), ('scoring', 0.09), ('grammatical', 0.09), ('segmenter', 0.088), ('tagger', 0.088), ('decoder', 0.084), ('hypotheses', 0.084), ('crf', 0.083), ('bitext', 0.082), ('och', 0.079), ('msa', 0.079), ('car', 0.079), ('genre', 0.077), ('morphology', 0.07), ('masculine', 0.069), ('factored', 0.068), ('genres', 0.066), ('lms', 0.065), ('sg', 0.063), ('brants', 0.061), ('koehn', 0.061), ('avg', 0.059), ('hypothesis', 0.057), ('relations', 0.056), ('phrase', 0.055), ('model', 0.055), ('bigram', 0.054), ('tagging', 0.054), ('uszkoreit', 0.053), ('alkuhlani', 0.053), ('eagr', 0.053), ('inanimate', 0.053), ('masc', 0.053), ('recombine', 0.053), ('yjj', 0.053), ('incremental', 0.053), ('newswire', 0.052), ('morpheme', 0.051), ('inflection', 0.051), ('unification', 0.051), ('penalty', 0.05), ('lm', 0.05), ('adj', 0.049), ('fraser', 0.049), ('latin', 0.049), ('coarse', 0.048), ('ccg', 0.047), ('sequences', 0.047), ('verb', 0.047), ('tuning', 0.047), ('avramidis', 0.046), ('kholy', 0.046), ('maamouri', 0.046), ('recombination', 0.046), ('targetside', 0.046), ('birch', 0.046), ('galley', 0.046), ('emission', 0.046), ('discriminate', 0.046), ('intrinsic', 0.046), ('forms', 0.045), ('della', 0.044), ('rambow', 0.044), ('syntactic', 0.044), ('segment', 0.044), ('green', 0.043), ('phrasebased', 0.043), ('pietra', 0.043), ('features', 0.043), ('def', 0.042), ('iob', 0.042), ('supertags', 0.042), ('whitespace', 0.042), ('character', 0.042), ('treebanks', 0.042), ('contain', 0.042), ('token', 0.041), ('coverage', 0.041), ('brown', 0.041), ('tag', 0.04), ('animacy', 0.039)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999839 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
2 0.29516953 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
Author: Emad Mohamed ; Behrang Mohit ; Kemal Oflazer
Abstract: We present a method for generating Colloquial Egyptian Arabic (CEA) from morphologically disambiguated Modern Standard Arabic (MSA). When used in POS tagging, this process improves the accuracy from 73.24% to 86.84% on unseen CEA text, and reduces the percentage of out-of-vocabulary words from 28.98% to 16.66%. The process holds promise for any NLP task targeting the dialectal varieties of Arabic; e.g., this approach may provide a cheap way to leverage MSA data and morphological resources to create resources for colloquial Arabic to English machine translation. It can also considerably speed up the annotation of Arabic dialects.
3 0.28338388 207 acl-2012-Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
Author: David Stallard ; Jacob Devlin ; Michael Kayser ; Yoong Keok Lee ; Regina Barzilay
Abstract: If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English MT system. We apply maximum marginal decoding to the unsupervised analyzer, and show that this yields the best published segmentation accuracy for Arabic, while also making segmentation output more stable. Our approach gives an 18% relative BLEU gain for Levantine dialectal Arabic. Furthermore, it gives higher gains for Modern Standard Arabic (MSA), as measured on NIST MT-08, than does MADA (Habash and Rambow, 2005), a leading supervised MSA segmenter.
4 0.24667746 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
Author: Kareem Darwish ; Ahmed Ali
Abstract: Due to Arabic’s morphological complexity, Arabic retrieval benefits greatly from morphological analysis – particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper we propose a model of character-level morphological transformation that is trained using Wikipedia hypertext to page title links. The use of our model yields statistically significant improvements in Arabic retrieval over the use of the best statistical stemming technique. The technique can potentially be applied to other languages.
5 0.21295908 140 acl-2012-Machine Translation without Words through Substring Alignment
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori ; Tatsuya Kawahara
Abstract: In this paper, we demonstrate that accurate machine translation is possible without the concept of “words,” treating MT as a problem of transformation between character strings. We achieve this result by applying phrasal inversion transduction grammar alignment techniques to character strings to train a character-based translation model, and using this in the phrase-based MT framework. We also propose a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment. In an evaluation, we demonstrate that character-based translation can achieve results that compare to word-based systems while effectively translating unknown and uncommon words over several language pairs.
6 0.21295215 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
7 0.19385935 141 acl-2012-Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
8 0.16697603 25 acl-2012-An Exploration of Forest-to-String Translation: Does Translation Help or Hurt Parsing?
9 0.16032027 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation
10 0.15867557 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
11 0.15263245 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
12 0.14902709 179 acl-2012-Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm
13 0.14898519 97 acl-2012-Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
14 0.13934048 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
15 0.13655941 128 acl-2012-Learning Better Rule Extraction with Translation Span Alignment
16 0.13214226 203 acl-2012-Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information
17 0.13144809 81 acl-2012-Enhancing Statistical Machine Translation with Character Alignment
18 0.12963466 122 acl-2012-Joint Evaluation of Morphological Segmentation and Syntactic Parsing
19 0.12557548 134 acl-2012-Learning to Find Translations and Transliterations on the Web
20 0.12505993 204 acl-2012-Translation Model Size Reduction for Hierarchical Phrase-based Statistical Machine Translation
topicId topicWeight
[(0, -0.388), (1, -0.231), (2, -0.016), (3, -0.015), (4, 0.098), (5, 0.132), (6, 0.084), (7, -0.218), (8, -0.001), (9, -0.005), (10, -0.165), (11, -0.073), (12, 0.136), (13, -0.151), (14, 0.021), (15, -0.079), (16, -0.234), (17, -0.055), (18, -0.115), (19, -0.079), (20, 0.096), (21, -0.02), (22, 0.018), (23, -0.017), (24, 0.069), (25, -0.08), (26, 0.046), (27, -0.062), (28, -0.056), (29, -0.003), (30, 0.031), (31, -0.046), (32, 0.011), (33, 0.064), (34, -0.02), (35, 0.023), (36, 0.008), (37, 0.041), (38, 0.024), (39, -0.018), (40, 0.041), (41, -0.006), (42, -0.02), (43, 0.037), (44, 0.029), (45, 0.006), (46, 0.002), (47, -0.026), (48, -0.006), (49, -0.01)]
simIndex simValue paperId paperTitle
same-paper 1 0.92141712 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
2 0.91775781 207 acl-2012-Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
3 0.86790866 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
4 0.80705363 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
5 0.55463129 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
Author: Tong Xiao ; Jingbo Zhu ; Hao Zhang ; Qiang Li
Abstract: We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierarchical phrase-based model, and various syntax-based models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers different choices of decoding algorithms, such as phrase-based decoding, decoding as parsing/tree-parsing and forest-based decoding. Moreover, several useful utilities were distributed with the toolkit, including a discriminative reordering model, a simple and fast language model, and an implementation of minimum error rate training for weight tuning.
6 0.55333406 122 acl-2012-Joint Evaluation of Morphological Segmentation and Syntactic Parsing
7 0.54558462 140 acl-2012-Machine Translation without Words through Substring Alignment
8 0.5415529 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
9 0.53453106 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
10 0.53428996 97 acl-2012-Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
11 0.53166151 141 acl-2012-Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
12 0.52837718 105 acl-2012-Head-Driven Hierarchical Phrase-based Translation
13 0.52411699 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
14 0.52116317 137 acl-2012-Lemmatisation as a Tagging Task
15 0.51320738 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation
16 0.50913846 67 acl-2012-Deciphering Foreign Language by Combining Language Models and Context Vectors
17 0.50186062 46 acl-2012-Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries
18 0.47742587 178 acl-2012-Sentence Simplification by Monolingual Machine Translation
19 0.47592887 13 acl-2012-A Graphical Interface for MT Evaluation and Error Analysis
20 0.47484615 81 acl-2012-Enhancing Statistical Machine Translation with Character Alignment
topicId topicWeight
[(25, 0.02), (26, 0.06), (28, 0.06), (30, 0.034), (37, 0.03), (39, 0.045), (55, 0.011), (57, 0.065), (74, 0.061), (82, 0.022), (83, 0.119), (84, 0.044), (85, 0.048), (90, 0.16), (92, 0.053), (94, 0.04), (99, 0.053)]
simIndex simValue paperId paperTitle
same-paper 1 0.8706755 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
2 0.84491664 83 acl-2012-Error Mining on Dependency Trees
Author: Claire Gardent ; Shashi Narayan
Abstract: In recent years, error mining approaches were developed to help identify the most likely sources of parsing failures in parsing systems using handcrafted grammars and lexicons. However, the techniques they use to enumerate and count n-grams build on the sequential nature of a text corpus and do not easily extend to structured data. In this paper, we propose an algorithm for mining trees and apply it to detect the most likely sources of generation failure. We show that this tree mining algorithm permits identifying not only errors in the generation system (grammar, lexicon) but also mismatches between the structures contained in the input and the input structures expected by our generator as well as a few idiosyncrasies/errors in the input data.
3 0.83909941 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
Author: Tong Xiao ; Jingbo Zhu ; Hao Zhang ; Qiang Li
Abstract: We present a new open source toolkit for phrase-based and syntax-based machine translation. The toolkit supports several state-of-the-art models developed in statistical machine translation, including the phrase-based model, the hierarchical phrase-based model, and various syntax-based models. The key innovation provided by the toolkit is that the decoder can work with various grammars and offers different choices of decoding algorithms, such as phrase-based decoding, decoding as parsing/tree-parsing and forest-based decoding. Moreover, several useful utilities were distributed with the toolkit, including a discriminative reordering model, a simple and fast language model, and an implementation of minimum error rate training for weight tuning.
Author: Patrick Simianer ; Stefan Riezler ; Chris Dyer
Abstract: With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evidence from machine learning indicates that increasing the training sample size results in better prediction. The goal of this paper is to show that this common wisdom can also be brought to bear upon SMT. We deploy local features for SCFG-based SMT that can be read off from rules at runtime, and present a learning algorithm that applies ℓ1/ℓ2 regularization for joint feature selection over distributed stochastic learning processes. We present experiments on learning on 1.5 million training sentences, and show significant improvements over tuning discriminative models on small development sets.
5 0.83415842 110 acl-2012-Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model
Author: William Yang Wang ; Elijah Mayfield ; Suresh Naidu ; Jeremiah Dittmar
Abstract: We propose a latent variable model to enhance historical analysis of large corpora. This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components in metadata, in a general way. To test this, we collect a corpus of slavery-related United States property law judgements sampled from the years 1730 to 1866. We study the language use in these legal cases, with a special focus on shifts in opinions on controversial topics across different regions. Because this is a longitudinal data set, we are also interested in understanding how these opinions change over the course of decades. We show that the joint learning scheme of our sparse mixed-effects model improves on other state-of-the-art generative and discriminative models on the region and time period identification tasks. Experiments show that our sparse mixed-effects model is more accurate quantitatively and qualitatively interesting, and that these improvements are robust across different parameter settings.
6 0.83112389 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
7 0.82119668 140 acl-2012-Machine Translation without Words through Substring Alignment
9 0.81920087 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
10 0.81642663 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
11 0.81594276 136 acl-2012-Learning to Translate with Multiple Objectives
12 0.8113988 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
13 0.80942416 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
14 0.80872291 45 acl-2012-Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging
15 0.80792648 178 acl-2012-Sentence Simplification by Monolingual Machine Translation
16 0.80754673 11 acl-2012-A Feature-Rich Constituent Context Model for Grammar Induction
17 0.80430847 217 acl-2012-Word Sense Disambiguation Improves Information Retrieval
18 0.80411905 54 acl-2012-Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages
19 0.80110204 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
20 0.80080134 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling