acl acl2012 acl2012-207 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David Stallard ; Jacob Devlin ; Michael Kayser ; Yoong Keok Lee ; Regina Barzilay
Abstract: If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English MT system. We apply maximum marginal decoding to the unsupervised analyzer, and show that this yields the best published segmentation accuracy for Arabic, while also making segmentation output more stable. Our approach gives an 18% relative BLEU gain for Levantine dialectal Arabic. Furthermore, it gives higher gains for Modern Standard Arabic (MSA), as measured on NIST MT-08, than does MADA (Habash and Rambow, 2005), a leading supervised MSA segmenter.
Reference: text
sentIndex sentText sentNum sentScore
1 If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. [sent-2, score-0.507]
2 In this paper, we compare performance gains for state-of-the-art supervised vs. [sent-3, score-0.101]
3 unsupervised morphological analyzers, using a state-of-the-art Arabic-to-English MT system. [sent-4, score-0.273]
4 We apply maximum marginal decoding to the unsupervised analyzer, and show that this yields the best published segmentation accuracy for Arabic, while also making segmentation output more stable. [sent-5, score-0.603]
5 Our approach gives an 18% relative BLEU gain for Levantine dialectal Arabic. [sent-6, score-0.214]
6 Furthermore, it gives higher gains for Modern Standard Arabic (MSA), as measured on NIST MT-08, than does MADA (Habash and Rambow, 2005), a leading supervised MSA segmenter. [sent-7, score-0.179]
7 1 Introduction If unsupervised morphological segmenters could approach the effectiveness of supervised ones, they would be a very attractive choice for improving machine translation (MT) performance in low-resource inflected languages. [sent-8, score-0.669]
8 An additional advantage of Arabic for study is the availability of high-quality supervised segmenters for MSA, such as MADA (Habash and [sent-10, score-0.256]
9 The MT gain for supervised MSA segmenters on dialect establishes a lower bound, which the unsupervised segmenter must exceed if it is to be useful for dialect. [sent-13, score-0.991]
10 And comparing the gain for supervised and unsupervised segmenters on MSA tells us how useful the unsupervised segmenter is, relative to the ideal case in which a supervised segmenter is available. [sent-14, score-1.509]
11 In this paper, we show that an unsupervised segmenter can in fact rival or surpass supervised MSA segmenters on MSA itself, while at the same time providing superior performance on dialect. [sent-15, score-0.778]
12 Specifically, we compare the state-of-the-art morphological analyzer of Lee et al. [sent-16, score-0.202]
13 (2011) with two leading supervised analyzers for MSA, MADA and Sakhr, each serving as an alternative preprocessor for a state-of-the-art statistical MT system (Shen et al. [sent-17, score-0.212]
14 We measure MSA performance on NIST MT-08 (NIST, 2010), and dialect performance on a Levantine dialect web corpus (Zbib et al. [sent-19, score-0.206]
15 We also show that it yields MT08 BLEU scores that are higher than those obtained with MADA, a leading supervised MSA segmenter. [sent-22, score-0.177]
16 For Levantine, the segmenter increases BLEU score by 18% over the unsegmented baseline. [sent-23, score-0.441]
17 © 2012 Association for Computational Linguistics, pages 322–327. 2 Related Work Machine translation systems that process highly inflected languages often incorporate morphological analysis. [sent-29, score-0.277]
18 Some of these approaches rely on morphological analysis for pre- and post-processing, while others modify the core of a translation system to incorporate morphological information (Habash, 2008; Luong et al. [sent-30, score-0.396]
19 For instance, factored translation models (Koehn and Hoang, 2007; Yang and Kirchhoff, 2006; Avramidis and Koehn, 2008) parametrize translation probabilities as factors encoding morphological features. [sent-32, score-0.33]
20 The approach we have taken in this paper is an instance of a segmented MT model, which divides the input into morphemes and uses the derived morphemes as a unit of translation (Sadat and Habash, 2006; Badr et al. [sent-33, score-0.182]
21 This is a mainstream architecture that has been shown to be effective when translating from a morphologically rich language. [sent-35, score-0.057]
22 A number of recent approaches have explored the use of unsupervised morphological analyzers for MT (Virpioja et al. [sent-36, score-0.338]
23 (2007) apply the unsupervised morphological segmenter Morfessor (Creutz and Lagus, 2007), and apply an existing MT system at the level of morphemes. [sent-39, score-0.684]
24 The system does not outperform the word baseline partially due to the insufficient accuracy of the automatic morphological analyzer. [sent-40, score-0.188]
25 The work of Mermer and Akın (2010) and Mermer and Saraclar (2011) attempts to integrate morphology and MT more closely than we do, by incorporating bilingual alignment probabilities into a Gibbs-sampled version of Morfessor for Turkish-to-English MT. [sent-41, score-0.061]
26 However, the bilingual strategy shows no gain over the monolingual version, and neither version is competitive for MT with a supervised Turkish morphological segmenter (Oflazer, 1993). [sent-42, score-0.758]
27 By contrast, the unsupervised analyzer we report on here yields MSA-to-English MT performance that equals or exceeds the performance obtained with a leading supervised MSA segmenter, MADA (Habash and Rambow, 2005). [sent-43, score-0.354]
28 3 Review of Lee Unsupervised Segmenter The segmenter of Lee et al. [sent-44, score-0.411]
29 Model 1 prefers small affix lexicons, and assumes that morphemes are drawn independently. [sent-47, score-0.095]
30 Model 2 generates a latent POS tag for each word type, conditioning the word’s affixes on the tag, thereby encouraging compatible affixes to be generated together. [sent-48, score-0.168]
31 Finally, Model 4 models morphosyntactic agreement with a transition probability distribution, encouraging adjacent tokens with the same endings to also have the same final suffix. [sent-50, score-0.03]
32 Johnson and Goldwater (2009) extend MM to Gibbs sampling by drawing a set of N independent Gibbs samples, and selecting for each word the most frequent segmentation found in them. [sent-52, score-0.177]
33 They found that MM improved segmentation accuracy over the mean, consistent with its maximization criterion. [sent-53, score-0.163]
34 First, MM dramatically reduces the output variance of Gibbs sampling (GS). [sent-55, score-0.147]
35 Table 1 documents the severity of this variance for the MT-08 lexicon, as measured by the average exact-match accuracy and segmentation F-measure between different runs. [sent-56, score-0.211]
36 By contrast the “MM” column of Table 1 shows that two different runs of MM, each derived by combining separate sets of 25 GS runs, agree on the segmentations of over 95% of the word tokens, a dramatic improvement in stability. [sent-58, score-0.057]
37 Second, MM reduces noise from the spurious affixes that the unsupervised segmenter induces for large lexicons. [sent-59, score-0.697]
38 As Table 2 shows, the segmenter [sent-60, score-0.411]
39 Table 1 (columns: Decoding, Level, Rec, Prec, F1, Acc; rows: Gibbs and MM at type and token level; cell values not recoverable): Comparison of agreement in outputs between 25 runs of Gibbs sampling vs. [sent-64, score-0.097]
40 We give the average segmentation recall, precision, F1-measure, and exact-match accuracy between outputs, at word-type and word-token levels. [sent-66, score-0.163]
41 Table 2 (columns: ATB GS; MT-08 GS, MM, Morf; cell values not recoverable): Affix statistics of unsupervised segmenters. [sent-67, score-0.111]
42 For the ATB lexicon, we show statistics for the Lee segmenter with regular Gibbs sampling (GS). [sent-68, score-0.451]
43 For the MT08 lexicon, we also show the output of the Lee segmenter with maximum marginal decoding (MM). [sent-69, score-0.546]
44 induces 130 prefixes and 261 suffixes for MT-08 (statistics for Morfessor are similar). [sent-71, score-0.085]
45 This phenomenon is fundamental to Bayesian nonparametric models, which expand indefinitely to fit the data they are given (Wasserman, 2006). [sent-72, score-0.035]
46 But MM helps to alleviate it, reducing unique prefixes and suffixes for MT-08 by 28% and 21%, respectively. [sent-73, score-0.044]
47 It also reduces the number of unique prefixes/suffixes which account for 95% of the prefix/suffix tokens (Top-95). [sent-74, score-0.035]
48 Finally, we find that in our setting, MM increases accuracy not just over the mean, but over even the best-scoring of the runs. [sent-75, score-0.026]
49 As shown in Table 3, MM increases segmentation F-measure from 86. [sent-76, score-0.137]
50 This exceeds the best published results on ATB (Naradowsky and Toutanova, 2011). [sent-79, score-0.054]
51 These results suggest that MM may be worth considering for other GS applications, not only for the accuracy improvements pointed out by Johnson and Goldwater (2009), but also for its potential to provide more stable and less noisy results. [sent-80, score-0.026]
52 Table 3: Segmentation F-scores on ATB dataset for Lee segmenter, shown for each Model level M1–M4 on the Arabic segmentation dataset used by (Poon et al. [sent-85, score-0.137]
53 , 2009): We give the mean, minimum, and maximum F-scores for 25 independent runs of Gibbs sampling, together with the F-score from running MM over that same set of runs. [sent-86, score-0.081]
54 We compare the Lee segmenter with the supervised MSA segmenter MADA, using its “D3” scheme. [sent-92, score-0.923]
55 We also compare with Sakhr, an intensively-engineered, supervised MSA segmenter which applies multiple NLP technologies to the segmentation problem, and which has given the best results for our MT system in previous work (Zbib et al. [sent-93, score-0.649]
56 We apply the appropriate segmenter to split words into morphemes, which we then treat as words for alignment and decoding. [sent-97, score-0.411]
57 (2011), we segment the test and training sets jointly, estimating separate translation models for each segmenter/dataset combination. [sent-99, score-0.072]
58 For dialect, we use a Levantine dialectal Arabic corpus collected from the web with 1. [sent-104, score-0.074]
59 As expected, Sakhr gives the best results for MSA. [sent-111, score-0.032]
60 Morfessor underperforms the other segmenters, perhaps because of its lower accuracy on Arabic, as reported by Poon et al. [sent-112, score-0.026]
61 The Lee segmenter gives the best results for Levantine, inducing valid Levantine affixes (e. [sent-114, score-0.512]
62 g “hAl+” for MSA’s “h*A-Al+”, English “this-the”) and yielding an 18% relative gain over the unsegmented baseline. [sent-115, score-0.138]
63 What is more surprising is that the Lee segmenter compares favorably with the supervised MSA segmenters on MSA itself. [sent-116, score-0.667]
64 In particular, the Lee segmenter with MM yields higher BLEU scores than does MADA, a leading supervised segmenter, while preserving almost the same performance as GS on dialect. [sent-117, score-0.588]
65 On Small MSA, it recoups 93% of even Sakhr’s gain. [sent-118, score-0.046]
66 By contrast, the Lee segmenter recoups only 79% of Sakhr’s gain on Full MSA. [sent-119, score-0.541]
67 This might result from the phenomenon alluded to in Section 4, where additional data sometimes degrades performance for unsupervised analyzers. [sent-120, score-0.111]
68 However, the Lee segmenter’s gain on Levantine (18%) is higher than its gain on Small MSA (13%), even though Levantine has more data (1. [sent-121, score-0.168]
69 This might be because dialect, being less standardized, has more orthographic and morphological variability, which unsupervised segmentation helps to resolve. [sent-125, score-0.41]
70 These experiments also show that while Model 4 gives the best F-score, Model 3 gives the best MT scores. [sent-126, score-0.064]
71 Comparison of Model 3 and 4 segmentations shows that Model 4 induces a much larger number of inflectional suffixes, especially the feminine singular suffix “-p”, which accounts for a plurality (16%) of the differences by token. [sent-127, score-0.041]
72 While such suffixes improve F-measure on the segmentation references, they do not correspond to any English lexical unit, and thus do not help alignment. [sent-128, score-0.181]
73 An interesting question is how much performance might be gained from a supervised segmenter that was as intensively engineered for dialect as Sakhr was for MSA. [sent-129, score-0.615]
74 38, for a relative gain of just 5% over the unsuper- Table 4 (columns: System, Small MSA, Full MSA, Lev Dial): BLEU scores for all experiments. [sent-132, score-0.108]
75 For each of these, the highest Lee segmenter score is in bold, with “+” if statistically significant vs. [sent-135, score-0.411]
76 Given the large engineering effort that would be required to achieve this gain, the unsupervised segmenter may be a more cost-effective choice for dialectal Arabic. [sent-139, score-0.596]
77 We thank Rabih Zbib for his help with interpreting Levantine Arabic segmentation output. [sent-147, score-0.137]
78 Enriching morphologically poor languages for statistical machine translation. [sent-150, score-0.057]
79 Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop. [sent-169, score-0.162]
80 Improving nonparametric bayesian inference: experiments on unsupervised word segmentation with adaptor grammars. [sent-177, score-0.283]
81 A hybrid morpheme-word representation for machine translation of morphologically rich languages. [sent-193, score-0.129]
82 Unsupervised search for the optimal segmentation for statistical machine translation. [sent-197, score-0.137]
83 Unsupervised Turkish morphological segmentation for statistical machine translation. [sent-202, score-0.333]
84 Unsupervised bilingual morpheme segmentation and alignment with context-rich hidden semi-Markov models. [sent-210, score-0.181]
85 A new string-to-dependency machine translation algorithm with a target dependency language model. [sent-240, score-0.072]
86 Morphology-aware statistical machine translation based on morphs induced in an unsupervised manner. [sent-245, score-0.183]
87 Phrase-based backoff models for machine translation of highly inflected languages. [sent-253, score-0.072]
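The maximum marginal (MM) decoding described in the extracted sentences above (draw N independent Gibbs samples, then keep for each word the segmentation found most often among them) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the representation of a sample as a dict from word to a "+"-separated segmentation string, the function names mm_decode and exact_match_agreement, and the toy strings are all assumptions made for this example. The agreement function is a type-level exact-match measure in the spirit of the Acc column of Table 1.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical representation: one Gibbs sample maps each word type
# to its segmentation, written with "+" between morphemes.
Lexicon = Dict[str, str]


def mm_decode(samples: List[Lexicon]) -> Lexicon:
    """Keep, for each word, the segmentation seen most often across samples."""
    votes: Dict[str, Counter] = {}
    for sample in samples:
        for word, segmentation in sample.items():
            votes.setdefault(word, Counter())[segmentation] += 1
    return {word: counts.most_common(1)[0][0] for word, counts in votes.items()}


def exact_match_agreement(a: Lexicon, b: Lexicon) -> float:
    """Fraction of shared word types that two lexicons segment identically."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[w] == b[w] for w in shared) / len(shared)


if __name__ == "__main__":
    # Toy samples (illustrative strings only, not gold segmentations).
    runs = [
        {"wktb": "w+ktb", "ktAbhm": "ktAb+hm"},
        {"wktb": "w+ktb", "ktAbhm": "ktAbh+m"},
        {"wktb": "wk+tb", "ktAbhm": "ktAb+hm"},
    ]
    print(mm_decode(runs))
    print(exact_match_agreement(runs[0], runs[1]))
```

To mimic the stability comparison reported for Table 1, one could call mm_decode on two disjoint sets of runs and pass the two results to exact_match_agreement; a token-level variant would weight each word type by its corpus frequency.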
wordName wordTfidf (topN-words)
[('msa', 0.464), ('segmenter', 0.411), ('levantine', 0.277), ('mm', 0.236), ('arabic', 0.18), ('morphological', 0.162), ('zbib', 0.162), ('mada', 0.161), ('segmenters', 0.155), ('segmentation', 0.137), ('mt', 0.131), ('sakhr', 0.116), ('unsupervised', 0.111), ('dialect', 0.103), ('mermer', 0.103), ('supervised', 0.101), ('lee', 0.101), ('rabih', 0.092), ('habash', 0.085), ('gain', 0.084), ('gs', 0.081), ('morfessor', 0.081), ('bleu', 0.074), ('creutz', 0.074), ('dialectal', 0.074), ('marginal', 0.072), ('translation', 0.072), ('virpioja', 0.069), ('affixes', 0.069), ('analyzers', 0.065), ('morphology', 0.061), ('clifton', 0.06), ('runs', 0.057), ('morphologically', 0.057), ('naradowsky', 0.055), ('morphemes', 0.055), ('atb', 0.052), ('nakov', 0.049), ('goldwater', 0.048), ('variance', 0.048), ('gibbs', 0.046), ('badr', 0.046), ('kayser', 0.046), ('luong', 0.046), ('makhoul', 0.046), ('recoups', 0.046), ('leading', 0.046), ('nist', 0.045), ('morpheme', 0.044), ('poon', 0.044), ('nizar', 0.044), ('turkish', 0.044), ('suffixes', 0.044), ('inflected', 0.043), ('ak', 0.043), ('induces', 0.041), ('johnson', 0.041), ('avramidis', 0.04), ('keok', 0.04), ('yoong', 0.04), ('devlin', 0.04), ('affix', 0.04), ('sampling', 0.04), ('analyzer', 0.04), ('decoding', 0.039), ('rambow', 0.038), ('lagus', 0.037), ('mathias', 0.037), ('sadat', 0.037), ('preslav', 0.037), ('stallard', 0.037), ('nonparametric', 0.035), ('reduces', 0.035), ('vised', 0.034), ('saraclar', 0.034), ('kun', 0.034), ('regina', 0.033), ('cos', 0.032), ('gives', 0.032), ('bbn', 0.031), ('lexicon', 0.03), ('noise', 0.03), ('yields', 0.03), ('unsegmented', 0.03), ('matsoukas', 0.03), ('spyros', 0.03), ('variability', 0.03), ('encouraging', 0.03), ('exceeds', 0.027), ('published', 0.027), ('exceed', 0.026), ('koehn', 0.026), ('accuracy', 0.026), ('sarkar', 0.026), ('jacob', 0.026), ('attractive', 0.025), ('maximum', 0.024), ('dramatically', 0.024), ('factored', 0.024), ('relative', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 207 acl-2012-Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
2 0.38247341 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
Author: Emad Mohamed ; Behrang Mohit ; Kemal Oflazer
Abstract: We present a method for generating Colloquial Egyptian Arabic (CEA) from morphologically disambiguated Modern Standard Arabic (MSA). When used in POS tagging, this process improves the accuracy from 73.24% to 86.84% on unseen CEA text, and reduces the percentage of out-of-vocabulary words from 28.98% to 16.66%. The process holds promise for any NLP task targeting the dialectal varieties of Arabic; e.g., this approach may provide a cheap way to leverage MSA data and morphological resources to create resources for colloquial Arabic to English machine translation. It can also considerably speed up the annotation of Arabic dialects.
3 0.28338388 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
Author: Spence Green ; John DeNero
Abstract: When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes that are predicted during decoding for each translation hypothesis. For English-to-Arabic translation, our model yields a +1.04 BLEU average improvement over a state-of-the-art baseline. The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders.
4 0.1561216 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
Author: Kareem Darwish ; Ahmed Ali
Abstract: Due to Arabic’s morphological complexity, Arabic retrieval benefits greatly from morphological analysis – particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper we propose a model of character-level morphological transformation that is trained using Wikipedia hypertext to page title links. The use of our model yields statistically significant improvements in Arabic retrieval over the use of the best statistical stemming technique. The technique can potentially be applied to other languages.
5 0.1219817 140 acl-2012-Machine Translation without Words through Substring Alignment
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori ; Tatsuya Kawahara
Abstract: In this paper, we demonstrate that accurate machine translation is possible without the concept of “words,” treating MT as a problem of transformation between character strings. We achieve this result by applying phrasal inversion transduction grammar alignment techniques to character strings to train a character-based translation model, and using this in the phrase-based MT framework. We also propose a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment. In an evaluation, we demonstrate that character-based translation can achieve results that compare to word-based systems while effectively translating unknown and uncommon words over several language pairs.
6 0.1005843 122 acl-2012-Joint Evaluation of Morphological Segmentation and Syntactic Parsing
7 0.094187357 141 acl-2012-Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
8 0.092768162 81 acl-2012-Enhancing Statistical Machine Translation with Character Alignment
9 0.0702646 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
10 0.069071092 46 acl-2012-Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries
11 0.065313607 41 acl-2012-Bootstrapping a Unified Model of Lexical and Phonetic Acquisition
12 0.065068975 210 acl-2012-Unsupervized Word Segmentation: the Case for Mandarin Chinese
13 0.064126149 179 acl-2012-Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm
14 0.062156897 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
15 0.061379269 119 acl-2012-Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chinese
16 0.060615964 13 acl-2012-A Graphical Interface for MT Evaluation and Error Analysis
17 0.060284372 143 acl-2012-Mixing Multiple Translation Models in Statistical Machine Translation
18 0.060254131 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
19 0.058317211 54 acl-2012-Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages
20 0.05822929 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
topicId topicWeight
[(0, -0.174), (1, -0.118), (2, 0.023), (3, -0.001), (4, 0.061), (5, 0.15), (6, 0.084), (7, -0.211), (8, -0.012), (9, -0.017), (10, -0.203), (11, -0.107), (12, 0.133), (13, -0.199), (14, 0.028), (15, -0.142), (16, -0.314), (17, -0.14), (18, -0.15), (19, 0.015), (20, 0.157), (21, -0.031), (22, 0.004), (23, -0.062), (24, 0.05), (25, -0.107), (26, 0.003), (27, -0.011), (28, -0.061), (29, -0.003), (30, -0.019), (31, -0.02), (32, 0.066), (33, 0.027), (34, -0.12), (35, 0.011), (36, 0.019), (37, -0.064), (38, -0.014), (39, 0.033), (40, 0.077), (41, -0.013), (42, 0.046), (43, 0.049), (44, 0.036), (45, -0.027), (46, -0.087), (47, 0.006), (48, 0.016), (49, -0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.93557769 207 acl-2012-Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
2 0.8829127 202 acl-2012-Transforming Standard Arabic to Colloquial Arabic
Author: Emad Mohamed ; Behrang Mohit ; Kemal Oflazer
Abstract: We present a method for generating Colloquial Egyptian Arabic (CEA) from morphologically disambiguated Modern Standard Arabic (MSA). When used in POS tagging, this process improves the accuracy from 73.24% to 86.84% on unseen CEA text, and reduces the percentage of out-of-vocabulary words from 28.98% to 16.66%. The process holds promise for any NLP task targeting the dialectal varieties of Arabic; e.g., this approach may provide a cheap way to leverage MSA data and morphological resources to create resources for colloquial Arabic to English machine translation. It can also considerably speed up the annotation of Arabic dialects.
3 0.7793377 27 acl-2012-Arabic Retrieval Revisited: Morphological Hole Filling
Author: Kareem Darwish ; Ahmed Ali
Abstract: Due to Arabic’s morphological complexity, Arabic retrieval benefits greatly from morphological analysis – particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper we propose a model of character-level morphological transformation that is trained using Wikipedia hypertext to page title links. The use of our model yields statistically significant improvements in Arabic retrieval over the use of the best statistical stemming technique. The technique can potentially be applied to other languages.
4 0.63400227 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
Author: Spence Green ; John DeNero
Abstract: When automatically translating from a weakly inflected source language like English to a target language with richer grammatical features such as gender and dual number, the output commonly contains morpho-syntactic agreement errors. To address this issue, we present a target-side, class-based agreement model. Agreement is promoted by scoring a sequence of fine-grained morpho-syntactic classes that are predicted during decoding for each translation hypothesis. For English-to-Arabic translation, our model yields a +1.04 BLEU average improvement over a state-of-the-art baseline. The model does not require bitext or phrase table annotations and can be easily implemented as a feature in many phrase-based decoders.
5 0.43222144 122 acl-2012-Joint Evaluation of Morphological Segmentation and Syntactic Parsing
Author: Reut Tsarfaty ; Joakim Nivre ; Evelina Andersson
Abstract: We present novel metrics for parse evaluation in joint segmentation and parsing scenarios where the gold sequence of terminals is not known in advance. The protocol uses distance-based metrics defined for the space of trees over lattices. Our metrics allow us to precisely quantify the performance gap between non-realistic parsing scenarios (assuming gold segmented and tagged input) and realistic ones (not assuming gold segmentation and tags). Our evaluation of segmentation and parsing for Modern Hebrew sheds new light on the performance of the best parsing systems to date in the different scenarios.
6 0.31355259 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
7 0.31065282 210 acl-2012-Unsupervized Word Segmentation: the Case for Mandarin Chinese
8 0.30590057 211 acl-2012-Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation
9 0.28551924 140 acl-2012-Machine Translation without Words through Substring Alignment
10 0.27528897 46 acl-2012-Character-Level Machine Translation Evaluation for Languages with Ambiguous Word Boundaries
11 0.27392581 136 acl-2012-Learning to Translate with Multiple Objectives
12 0.27361345 137 acl-2012-Lemmatisation as a Tagging Task
13 0.26422527 81 acl-2012-Enhancing Statistical Machine Translation with Character Alignment
14 0.23557548 179 acl-2012-Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm
15 0.22740962 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
16 0.22436942 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
17 0.22335239 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
18 0.21973242 141 acl-2012-Maximum Expected BLEU Training of Phrase and Lexicon Translation Models
19 0.21774237 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
20 0.21740894 41 acl-2012-Bootstrapping a Unified Model of Lexical and Phonetic Acquisition
topicId topicWeight
[(25, 0.019), (26, 0.023), (28, 0.053), (30, 0.012), (37, 0.032), (39, 0.039), (55, 0.318), (57, 0.043), (74, 0.038), (82, 0.012), (84, 0.048), (85, 0.055), (90, 0.121), (92, 0.047), (94, 0.018), (99, 0.037)]
simIndex simValue paperId paperTitle
same-paper 1 0.74282694 207 acl-2012-Unsupervised Morphology Rivals Supervised Morphology for Arabic MT
2 0.71958357 77 acl-2012-Ecological Evaluation of Persuasive Messages Using Google AdWords
Author: Marco Guerini ; Carlo Strapparava ; Oliviero Stock
Abstract: In recent years there has been a growing interest in crowdsourcing methodologies to be used in experimental research for NLP tasks. In particular, evaluation of systems and theories about persuasion is difficult to accommodate within existing frameworks. In this paper we present a new cheap and fast methodology that allows fast experiment building and evaluation with fully-automated analysis at a low cost. The central idea is exploiting existing commercial tools for advertising on the web, such as Google AdWords, to measure message impact in an ecological setting. The paper includes a description of the approach, tips for how to use AdWords for scientific research, and results of pilot experiments on the impact of affective text variations which confirm the effectiveness of the approach.
Author: Patrick Simianer ; Stefan Riezler ; Chris Dyer
Abstract: With a few exceptions, discriminative training in statistical machine translation (SMT) has been content with tuning weights for large feature sets on small development data. Evidence from machine learning indicates that increasing the training sample size results in better prediction. The goal of this paper is to show that this common wisdom can also be brought to bear upon SMT. We deploy local features for SCFG-based SMT that can be read off from rules at runtime, and present a learning algorithm that applies ℓ1/ℓ2 regularization for joint feature selection over distributed stochastic learning processes. We present experiments on learning on 1.5 million training sentences, and show significant improvements over tuning discriminative models on small development sets.
4 0.46154517 110 acl-2012-Historical Analysis of Legal Opinions with a Sparse Mixed-Effects Latent Variable Model
Author: William Yang Wang ; Elijah Mayfield ; Suresh Naidu ; Jeremiah Dittmar
Abstract: We propose a latent variable model to enhance historical analysis of large corpora. This work extends prior work in topic modelling by incorporating metadata, and the interactions between the components in metadata, in a general way. To test this, we collect a corpus of slavery-related United States property law judgements sampled from the years 1730 to 1866. We study the language use in these legal cases, with a special focus on shifts in opinions on controversial topics across different regions. Because this is a longitudinal data set, we are also interested in understanding how these opinions change over the course of decades. We show that the joint learning scheme of our sparse mixed-effects model improves on other state-of-the-art generative and discriminative models on the region and time period identification tasks. Experiments show that our sparse mixed-effects model is more accurate quantitatively and qualitatively interesting, and that these improvements are robust across different parameter settings.
5 0.4594633 140 acl-2012-Machine Translation without Words through Substring Alignment
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori ; Tatsuya Kawahara
Abstract: In this paper, we demonstrate that accurate machine translation is possible without the concept of “words,” treating MT as a problem of transformation between character strings. We achieve this result by applying phrasal inversion transduction grammar alignment techniques to character strings to train a character-based translation model, and using this in the phrase-based MT framework. We also propose a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment. In an evaluation, we demonstrate that character-based translation can achieve results that compare to word-based systems while effectively translating unknown and uncommon words over several language pairs.
6 0.45900416 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
7 0.45759845 3 acl-2012-A Class-Based Agreement Model for Generating Accurately Inflected Translations
8 0.4574168 158 acl-2012-PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
9 0.45706236 136 acl-2012-Learning to Translate with Multiple Objectives
10 0.45459571 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
11 0.45391837 63 acl-2012-Cross-lingual Parse Disambiguation based on Semantic Correspondence
12 0.45385018 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
13 0.45254144 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
14 0.45212555 217 acl-2012-Word Sense Disambiguation Improves Information Retrieval
15 0.45129746 148 acl-2012-Modified Distortion Matrices for Phrase-Based Statistical Machine Translation
16 0.45065549 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
17 0.44982961 97 acl-2012-Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
18 0.44958633 219 acl-2012-langid.py: An Off-the-shelf Language Identification Tool
19 0.44753617 45 acl-2012-Capturing Paradigmatic and Syntagmatic Lexical Relations: Towards Accurate Chinese Part-of-Speech Tagging
20 0.44634065 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation