emnlp emnlp2013 emnlp2013-186 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Victor Chahuneau ; Eva Schlinger ; Noah A. Smith ; Chris Dyer
Abstract: Translation into morphologically rich languages is an important but recalcitrant problem in MT. We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentencespecific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Translation into morphologically rich languages is an important but recalcitrant problem in MT. [sent-4, score-0.24]
2 First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. [sent-6, score-0.308]
3 Then, this model is used to create additional sentencespecific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. [sent-7, score-0.155]
4 Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. [sent-8, score-0.593]
5 We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili. [sent-9, score-0.118]
6 1 Introduction Machine translation into morphologically rich languages is challenging, due to lexical sparsity and the large variety of grammatical features expressed with morphology. [sent-10, score-0.456]
7 In this paper, we introduce a method that uses target language morphological grammars (either hand-crafted or learned without supervision) to address this challenge and demonstrate its effectiveness at improving translation from English into several morphologically rich target languages. [sent-11, score-0.766]
8 Our approach decomposes the process of producing a translation for a word (or phrase) into two steps. [sent-12, score-0.118]
9 First, a meaning-bearing stem is chosen and then an appropriate inflection is selected using a feature-rich discriminative model that conditions on the source context of the word being translated. [sent-13, score-1.009]
10 We first present our “translate-and-inflect” model for predicting lexical translations into morphologically rich languages given a source word and its context (§2). [sent-18, score-0.457]
11 relate surface forms to underlying ⟨stem, inflection⟩ pairs; we discuss how either a standard morphological analyzer or a simple Bayesian unsupervised analyzer can be used (§3). [sent-20, score-0.604]
12 After presenting the parameter estimation procedure for the inflection model (§4), we employ the translate-and-inflect model in an MT system. [sent-22, score-0.556]
13 We describe how we use our model to synthesize translation options (§5) and then evaluate translation quality on English–Russian, English–Hebrew, and English– [sent-23, score-0.236]
14 Swahili translation tasks, finding significant improvements in all language pairs (§6). [sent-25, score-0.118]
15 The input will be a sentence e in the source language (in this paper, always English) and any available linguistic analysis of e. [sent-29, score-0.105]
16 The output f will be composed of (i) a sequence of stems, each denoted σ, and (ii) one morphological inflection pattern for each stem, denoted µ. When the information is available, a stem σ is composed of a lemma and an inflectional class. [sent-30, score-1.308]
17 Throughout, we use Ωσ to denote the set of possible morphological inflection patterns for a given stem σ. [sent-31, score-1.145]
18 Ωσ might be defined by a grammar; our models restrict Ωσ to be the set of inflections observed anywhere in our monolingual or bilingual training data as a realization of σ. [sent-32, score-0.174]
19 There is a function that maps a stem σ and a morphological inflection µ to a target-language surface form f. [sent-34, score-1.214]
20 In some cases, such as our unsupervised approach in §3.2. [sent-35, score-0.052]
21 Our approach consists in defining a probabilistic model over target words f. [sent-41, score-0.069]
22 The model assumes independence between each target word f conditioned on the source sentence e and its aligned position i in this sentence. [sent-42, score-0.267]
23 This assumption is further relaxed in §5, when the model is integrated into the translation system. [sent-43, score-0.118]
24 We decompose the probability of generating each target word f in the following way: p(f | e, i) = p(σ | ei) · p(µ | σ, e, i), where the first factor generates the stem and the second the inflection. [sent-44, score-0.103]
25 Here, each stem is generated independently from a single aligned source word ei (Footnote 1: This prevents the model from generating words that would be difficult for the language model to reliably score.), but in practice we [sent-48, score-0.466]
26 use a standard phrase-based model to generate sequences of stems and only the inflection model operates word-by-word. [sent-51, score-0.677]
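To make this factorization concrete, here is a minimal sketch; the interfaces (`stem_table`, `inflection_model`) are hypothetical stand-ins for the phrase-based stem model and the discriminative inflection model, not the system's actual API:

```python
def word_translation_prob(stem_table, inflection_model, sigma, mu, e, i):
    """p(f | e, i) = p(sigma | e_i) * p(mu | sigma, e, i): pick a stem from
    the aligned source word, then pick an inflection conditioned on the
    full source context."""
    p_stem = stem_table.get((e[i], sigma), 0.0)           # stem factor p(sigma | e_i)
    p_infl = inflection_model(sigma, e, i).get(mu, 0.0)   # inflection factor (Eq. 1)
    return p_stem * p_infl
```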
27 2.1 Modeling Inflection In morphologically rich languages, each stem may be combined with one or more inflectional morphemes to express many different grammatical features (e.g., tense, number, or gender). [sent-54, score-0.756]
28 Since the inflectional morphology of a word generally expresses multiple grammatical features, we would like a model that naturally incorporates rich, possibly overlapping features in its representation of both the input (i.e., the source context) and the output (i.e., the target inflection). [sent-58, score-0.332]
29 We therefore use the following parametric form to model inflectional probabilities: u(µ, e, i) = exp[ϕ(e, i)⊤Wψ(µ) + ψ(µ)⊤Vψ(µ)], p(µ | σ, e, i) = u(µ, e, i) / Σµ′∈Ωσ u(µ′, e, i). [sent-63, score-0.119]
30 (1) Here, ϕ is an m-dimensional source context feature vector function, ψ is an n-dimensional morphology feature vector function, and W ∈ Rm×n and V ∈ Rn×n are parameter matrices. [sent-64, score-0.352]
31 As with the more familiar log-linear parametrization that is written with a single feature vector, single weight vector, and single bias vector, this model is linear in its parameters (it can be understood as working with a feature space that is the outer product of the two feature spaces). [sent-65, score-0.119]
32 However, using two feature vectors allows us to define overlapping features of both the input and the output, which is important for modeling morphology, in which output variables are naturally expressed as bundles of features. [sent-66, score-0.214]
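A minimal NumPy sketch of Eq. 1, assuming ϕ(e, i) and the ψ(µ) vectors for the candidate set Ωσ have already been computed; all names here are ours, chosen for illustration:

```python
import numpy as np

def inflection_distribution(phi, Psi, W, V):
    """p(mu | sigma, e, i) over Omega_sigma (Eq. 1).
    phi: (m,) source-context feature vector phi(e, i)
    Psi: (k, n) matrix whose rows are psi(mu) for the k candidate inflections
    W:   (m, n) and V: (n, n) parameter matrices."""
    # u(mu, e, i) = exp[phi^T W psi(mu) + psi(mu)^T V psi(mu)]
    scores = Psi @ (W.T @ phi) + np.einsum('kn,nm,km->k', Psi, V, Psi)
    scores -= scores.max()   # stabilize the exponentials
    u = np.exp(scores)
    return u / u.sum()       # normalize over Omega_sigma
```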
33 The correct inflection string for the observed Russian form in this particular training instance is µ = mis-sfm-e (equivalent to the more traditional morphological string: +MAIN+IND+PAST+SING+FEM+MEDIAL+PERF). [sent-69, score-0.836]
34 (Footnote: πi denotes the parent of the token in position i in the dependency tree, and πi → i the typed dependency link.) [sent-72, score-0.132]
35 Source Context Features: ϕ(e, i) In order to select the best inflection of a target-language word, given the source word it translates and the context of that source word, we seek to exploit as many features of the context as are available. [sent-74, score-0.844]
36 Consider the example in Fig. 1, where most of the inflection features of the Russian word (past tense, singular number, and feminine gender) can be inferred from the context of the English word it is aligned to. [sent-76, score-0.647]
37 Indeed, many grammatical functions expressed morphologically in Russian are expressed syntactically in English. [sent-77, score-0.25]
38 Fortunately, high-quality parsers and other linguistic analyzers are available for English. [sent-78, score-0.093]
39 On the source side, we apply the following processing steps: • Part-of-speech tagging with a CRF tagger trained on sections 02–21 of the Penn Treebank. [sent-79, score-0.105]
40 • Dependency parsing with a non-projective dependency parser (, 2010) trained on the Penn Treebank, to produce basic Stanford dependencies. [sent-81, score-0.065]
41 We then extract binary features from e using this information, by considering the aligned source word ei, its preceding and following words, and its syntactic neighbors. [sent-83, score-0.157]
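A hedged sketch of this feature extraction; the feature names and templates below are illustrative stand-ins, not the paper's exact templates:

```python
def source_context_features(tokens, pos, clusters, heads, deprels, i):
    """Binary features phi(e, i) from the aligned source word e_i, its
    linear neighbors, and its dependency neighbors."""
    feats = {'word=' + tokens[i], 'pos=' + pos[i], 'cluster=' + clusters[i]}
    if i > 0:
        feats.add('prev=' + tokens[i - 1])         # preceding word
    if i + 1 < len(tokens):
        feats.add('next=' + tokens[i + 1])         # following word
    p = heads[i]                                   # pi_i: parent index, -1 if root
    if p >= 0:
        feats.add('parent=' + tokens[p])
        feats.add('dep=' + deprels[i])             # typed link pi_i -> i
    return feats
```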
42 3 Morphological Grammars and Features We now describe how to obtain morphological analyses and convert them into feature vectors (ψ) for our target languages, Russian, Hebrew, and Swahili, using supervised and unsupervised methods. [sent-85, score-0.576]
43 3.1 Supervised Morphology The state of the art in morphological analysis uses unweighted morphological transduction rules (usually in the form of an FST) (Footnote 3: The entire monolingual data available for the translation task of the 8th ACL Workshop on Statistical Machine Translation was used.) [sent-87, score-0.678]
44 to produce candidate analyses for each word in a sentence and then statistical models to disambiguate among the analyses in context (Hakkani-Tür et al. [sent-88, score-0.2]
45 While this technique is capable of producing high-quality linguistic analyses, it is expensive to develop, requiring hand-crafted rule-based analyzers and annotated corpora to train the disambiguation models. [sent-92, score-0.093]
46 As a result, such analyzers are only available for a small number of languages, and, as a practical matter, each analyzer (which resulted from different development efforts) operates differently from the others. [sent-93, score-0.215]
47 We therefore focus on using supervised analysis for a single target language, Russian. [sent-94, score-0.111]
48 (2008), which produces for each word in context a lemma and a fixed-length morphological tag encoding the grammatical features. [sent-96, score-0.426]
49 We process the target side of the parallel data with this tool to obtain the information necessary to extract ⟨lemma, inflection⟩ pairs, from which we compute σ and morphological feature vectors ψ(µ). [sent-97, score-0.496]
50 Since a positional tag set is used, it is straightforward to convert each fixed-length tag into a feature vector by defining a binary feature for each key-value pair (e.g., gender=feminine). [sent-99, score-0.098]
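This expansion is mechanical; a sketch follows, where the key inventory is an assumption of ours (the actual positional tag set defines one attribute per position):

```python
def tag_to_features(tag, keys=('pos', 'gender', 'number', 'case', 'tense')):
    """Expand a fixed-length positional morphological tag into binary
    key=value features psi(mu); '-' marks an unset position."""
    return {k + '=' + v: 1 for k, v in zip(keys, tag) if v != '-'}

# tag_to_features('Vfs-p')
# -> {'pos=V': 1, 'gender=f': 1, 'number=s': 1, 'tense=p': 1}
```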
51 3.2 Unsupervised Morphology Since many languages into which we might want to translate do not have supervised morphological analyzers, we now turn to the question of how to generate morphological analyses and features using an unsupervised analyzer. [sent-103, score-0.809]
52 We hypothesize that perfect decomposition into rich linguistic structures may not be required for accurate generation of new inflected forms. [sent-104, score-0.122]
53 We will test this hypothesis by experimenting with a simple, unsupervised model of morphology that segments words into sequences of morphemes, assuming a (naïve) concatenative generation process and a single analysis per type. [sent-105, score-0.237]
54 Formally, we let M represent the set of all possible morphemes and define a regular grammar M∗MM∗ (i.e., zero or more prefix morphemes, then a stem, then zero or more suffix morphemes). [sent-108, score-0.115]
55 To infer the decomposition structure for the words in the target language, we assume that the vocabulary was generated by the following process (a code sketch follows the listed steps): [sent-111, score-0.069]
56 1. Sample length distribution parameters λp ∼ Beta(βp, γp) for prefix sequences and λs ∼ Beta(βs, γs) for suffix sequences. [sent-114, score-0.131]
57 2. Sample a vocabulary by creating each word type w using the following steps: (a) Sample affix sequence lengths: lp ∼ Geometric(λp); ls ∼ Geometric(λs). [sent-116, score-0.111]
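A sketch of the assumed generative process for one word type; uniform draws stand in for the Dirichlet-multinomial morpheme distributions of the full model:

```python
import random

def generate_word(prefixes, stems, suffixes, lam_p, lam_s):
    """Draw one word type from the M* M M* grammar: geometric numbers of
    prefixes and suffixes around a single stem."""
    lp = 0
    while random.random() < lam_p:   # lp ~ Geometric(lam_p), support {0, 1, ...}
        lp += 1
    ls = 0
    while random.random() < lam_s:   # ls ~ Geometric(lam_s)
        ls += 1
    parts = [random.choice(prefixes) for _ in range(lp)]
    parts.append(random.choice(stems))          # exactly one stem
    parts += [random.choice(suffixes) for _ in range(ls)]
    return '+'.join(parts)
```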
58 We use blocked Gibbs sampling to sample segmentations for each word in the training vocabulary. [sent-125, score-0.07]
59 Because of our particular choice of priors, it is possible to approximately decompose the posterior over the arcs of a compact finite-state machine. [sent-126, score-0.034]
60 This model is reminiscent of work on learning morphology using adaptor grammars (Johnson et al. [sent-128, score-0.198]
61 The inferred morphological grammar is very sensitive to the Dirichlet hyperparameters (αp, αs, ασ) and these are, in turn, sensitive to the number of types in the vocabulary. [sent-130, score-0.312]
62 Therefore, we selected them empirically to obtain a stem vocabulary size on the parallel data that is one-to-one with English. [sent-136, score-0.362]
63 For the unsupervised analyzer, we do not have a mapping from morphemes to structured morphological attributes; however, we can create features from the affix sequences obtained after morphological segmentation. [sent-143, score-0.768]
64 We produce binary features corresponding to the content of each potential affixation position (prefix or suffix) relative to the stem. [sent-144, score-0.129]
65 For example, the unsupervised analysis µ = wa+ki+wa+STEM of the Swahili word wakiwapiga will produce the following features: ψprefix[−3][wa](µ) = 1, ψprefix[−2][ki](µ) = 1, ψprefix[−1][wa](µ) = 1. [sent-150, score-0.085]
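A sketch of this feature extraction; the `STEM` placeholder and feature-name formatting are our own conventions:

```python
def affix_features(analysis):
    """psi(mu) features from an unsupervised segmentation such as
    'wa+ki+wa+STEM': one binary feature per affix position relative to the stem."""
    parts = analysis.split('+')
    s = parts.index('STEM')
    feats = {}
    for k, morph in enumerate(parts[:s]):               # prefixes at -s .. -1
        feats['prefix[%d][%s]' % (k - s, morph)] = 1
    for k, morph in enumerate(parts[s + 1:], start=1):  # suffixes at +1, +2, ...
        feats['suffix[%d][%s]' % (k, morph)] = 1
    return feats

# affix_features('wa+ki+wa+STEM')
# -> {'prefix[-3][wa]': 1, 'prefix[-2][ki]': 1, 'prefix[-1][wa]': 1}
```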
66 4 Inflection Model Parameter Estimation To set the parameters W and V of the inflection prediction model (Eq. [sent-151, score-0.556]
67 1), we use stochastic gradient descent to maximize the conditional log-likelihood of a training set consisting of pairs of source (English) sentence contextual features (ϕ) and target word inflectional features (ψ). [sent-152, score-0.293]
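A sketch of one stochastic-gradient step on the conditional log-likelihood for a single training instance, reusing the scoring from the earlier sketch; no regularization or learning-rate schedule is shown:

```python
import numpy as np

def sgd_step(W, V, phi, Psi, gold, lr=0.1):
    """One gradient step on log p(mu_gold | sigma, e, i); Psi stacks psi(mu)
    for the candidates in Omega_sigma and `gold` indexes the observed one."""
    scores = Psi @ (W.T @ phi) + np.einsum('kn,nm,km->k', Psi, V, Psi)
    p = np.exp(scores - scores.max())
    p /= p.sum()
    # d/dW log p = outer(phi, psi_gold - E_p[psi])
    W += lr * np.outer(phi, Psi[gold] - p @ Psi)
    # d/dV log p = outer(psi_gold, psi_gold) - E_p[outer(psi, psi)]
    V += lr * (np.outer(Psi[gold], Psi[gold])
               - np.einsum('k,kn,km->nm', p, Psi, Psi))
```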
68 The training instances are extracted from the word-aligned parallel corpus with the English side preprocessed as discussed in §2.2 [sent-153, score-0.091]
69 and the target side disambiguated as discussed in §3. [sent-154, score-0.11]
70 When morphological category information is available, as discussed in §3.1. [sent-155, score-0.334]
71 Statistics of the parallel corpora used to train the inflection model are summarized in Table 1. [sent-157, score-0.609]
72 It is important to note here that our richly parameterized model is trained on the full parallel training corpus, not just on a handful of development sentences (which are typically used to tune MT system parameters). [sent-158, score-0.08]
73 Despite this scale, training is simple: the inflection model is trained to discriminate among different inflectional paradigms, not over all possible target language sentences (Blunsom et al. [sent-159, score-0.744]
74 4.1 Intrinsic Evaluation Before considering the broader problem of integrating the inflection model into a machine translation system, we perform an artificial evaluation to verify that the model learns sensible source sentence–target inflection patterns. [sent-166, score-1.335]
75 To do so, we create an inflection test set as follows. [sent-167, score-0.556]
76 We preprocess the source (English) sentences exactly as during training (§2.2) [sent-168, score-0.105]
77 and, using the target-language morphological analyzer, we convert each aligned target word to ⟨stem, inflection⟩ pairs. [sent-169, score-0.324]
78 Using this data, we measure inflection quality using two measurements. (Footnote 5: Note that we are not evaluating the stem translation model.) (Table 2 residue: supervised Russian results by part of speech (A, N, V, M) and average.) [sent-175, score-0.983]
79 • the accuracy of predicting the inflection given the source word, its source context, and the target stem, and • the inflection model perplexity on the same set of test instances. [sent-186, score-1.428]
80 Additionally, we report the average number of possible inflections for each stem, an upper bound to the perplexity that indicates the inherent difficulty of the task. [sent-187, score-0.241]
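Both measurements reduce to a short loop over held-out instances; a sketch under the same assumed interfaces as the earlier code:

```python
import math

def intrinsic_eval(predict, instances):
    """Accuracy of the argmax inflection and model perplexity over
    (phi, Psi, gold) test instances; `predict` returns a NumPy
    distribution over Omega_sigma."""
    correct, loglik = 0, 0.0
    for phi, Psi, gold in instances:
        p = predict(phi, Psi)
        correct += int(p.argmax() == gold)   # 1-best accuracy
        loglik += math.log(p[gold])          # log-likelihood of the gold inflection
    accuracy = correct / len(instances)
    perplexity = math.exp(-loglik / len(instances))
    return accuracy, perplexity
```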
81 First, perplexity is substantially lower than the perplexity of a uniform model, indicating our model is overall quite effective at predicting inflections using source context only. [sent-190, score-0.524]
82 Second, in the supervised Russian results, we see that predicting the inflections of adjectives is relatively more difficult than for other parts-ofspeech. [sent-191, score-0.301]
83 Since adjectives agree with the nouns they modify in gender and case, and gender is an idiosyncratic feature of Russian nouns (and therefore not directly predictable from the English source), this difficulty is unsurprising. [sent-192, score-0.216]
84 4.2 Feature Ablation Our inflection model makes use of numerous feature types. [sent-197, score-0.585]
85 Table 3 explores the effect of removing different kinds of (source) features from the model, evaluated on predicting Russian inflections using supervised morphological grammars. [sent-198, score-0.584]
86 (Footnote 6: The models used in the feature ablation experiment were trained on fewer examples, resulting in overall lower accuracies than seen in Table 2, but the pattern of results is the relevant data point here.) The top rows show the effect of removing either linear or dependency context. [sent-200, score-0.16]
87 We see that both are necessary for good performance; however, removing dependency context substantially degrades performance of the model (we interpret this result as evidence that Russian morphological inflection captures grammatical relationships that would be expressed structurally in English). [sent-201, score-1.131]
88 The bottom four rows explore the effect of source language word representation. [sent-202, score-0.105]
89 The results indicate that lexical features are important for accurate prediction of inflection, and that POS tags and Brown clusters are likewise important, but they seem to capture similar information (removing one has little impact, but removing both substantially degrades performance). [sent-203, score-0.126]
90 Table 3: Feature ablation results for the supervised Russian classification experiments. [sent-204, score-0.089]
91 5 Synthetic Phrases We turn now to translation; recall that our translate-and-inflect model is used to augment the set of rules available to a conventional statistical machine translation decoder. [sent-205, score-0.179]
92 We refer to the phrases it produces as synthetic phrases. [sent-206, score-0.172]
93 Our baseline system is a standard hierarchical phrase-based translation model (Chiang, 2007). [sent-207, score-0.118]
94 Following Lopez (2007), the training data is compiled into an efficient binary representation which allows extraction of sentence-specific grammars just before decoding. [sent-208, score-0.048]
95 In our case, this also allows the creation of synthetic inflected phrases that are produced by conditioning on the sentence being translated. [sent-209, score-0.267]
96 To generate these synthetic phrases with new inflections possibly unseen in the parallel training [sent-210, score-0.399]
97 Figure 3: Examples of highly weighted features learned by the inflection model. [sent-211, score-0.556]
98 We selected a few frequent morphological features and show their top corresponding source context features. [sent-212, score-0.424]
99 data, we first construct an additional phrase-based translation model on the parallel corpus preprocessed to replace inflected surface words with their stems. [sent-213, score-0.266]
100 We then extract a set of non-gappy phrases for each sentence (e.g. [sent-214, score-0.043]
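To make the synthetic-phrase step concrete, here is a hedged sketch of how such phrases might be assembled from a stem-level phrase table and the inflection model; every interface here (`stem_phrases`, `choose_inflection`, `realize`) is an assumption of ours, not the system's actual API:

```python
def synthesize_phrases(stem_phrases, choose_inflection, realize, sentence):
    """For each stemmed target phrase proposed for a source span, re-inflect
    its stems in context and emit the result as a synthetic phrase pair."""
    synthetic = []
    for (start, end), stems in stem_phrases(sentence):
        surface = [realize(sigma, choose_inflection(sigma, sentence, start))
                   for sigma in stems]        # map (sigma, mu) -> surface form f
        synthetic.append(((start, end), ' '.join(surface)))
    return synthetic
```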
wordName wordTfidf (topN-words)
[('inflection', 0.556), ('stem', 0.309), ('russian', 0.29), ('morphological', 0.28), ('inflections', 0.174), ('morphology', 0.15), ('synthetic', 0.129), ('inflectional', 0.119), ('translation', 0.118), ('morphologically', 0.117), ('source', 0.105), ('inflectioni', 0.094), ('swahili', 0.094), ('analyzers', 0.093), ('analyzer', 0.089), ('prefixes', 0.083), ('morphemes', 0.083), ('target', 0.069), ('prefix', 0.067), ('perplexity', 0.067), ('rich', 0.065), ('analyses', 0.064), ('grammatical', 0.063), ('hstem', 0.063), ('plp', 0.063), ('sls', 0.063), ('languages', 0.058), ('hebrew', 0.057), ('inflected', 0.057), ('dir', 0.053), ('stems', 0.053), ('parallel', 0.053), ('aligned', 0.052), ('removing', 0.052), ('unsupervised', 0.052), ('wa', 0.051), ('numerals', 0.05), ('adjectives', 0.049), ('grammars', 0.048), ('english', 0.048), ('ablation', 0.047), ('suffixes', 0.046), ('tense', 0.045), ('lemma', 0.044), ('fst', 0.044), ('pg', 0.044), ('phrases', 0.043), ('supervised', 0.042), ('iin', 0.041), ('nin', 0.041), ('convert', 0.04), ('gender', 0.04), ('beta', 0.04), ('lp', 0.04), ('context', 0.039), ('ki', 0.038), ('degrades', 0.038), ('conditioning', 0.038), ('preprocessed', 0.038), ('affix', 0.038), ('translations', 0.037), ('substantially', 0.036), ('sample', 0.036), ('predicting', 0.036), ('sequences', 0.035), ('brown', 0.035), ('expressed', 0.035), ('ei', 0.034), ('segmentations', 0.034), ('decompose', 0.034), ('produce', 0.033), ('ls', 0.033), ('operates', 0.033), ('independently', 0.033), ('turn', 0.033), ('understood', 0.032), ('diagonal', 0.032), ('grammar', 0.032), ('dependency', 0.032), ('geometric', 0.032), ('past', 0.03), ('intrinsic', 0.03), ('im', 0.029), ('nouns', 0.029), ('feature', 0.029), ('suffix', 0.029), ('augment', 0.028), ('ithe', 0.027), ('mda', 0.027), ('ryn', 0.027), ('richly', 0.027), ('eisd', 0.027), ('definiteness', 0.027), ('vii', 0.027), ('gene', 0.027), ('remark', 0.027), ('vij', 0.027), ('aosr', 0.027), ('eof', 0.027), ('nand', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999946 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases
Author: Victor Chahuneau ; Eva Schlinger ; Noah A. Smith ; Chris Dyer
Abstract: Translation into morphologically rich languages is an important but recalcitrant problem in MT. We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentencespecific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.
2 0.31054062 30 emnlp-2013-Automatic Extraction of Morphological Lexicons from Morphologically Annotated Corpora
Author: Ramy Eskander ; Nizar Habash ; Owen Rambow
Abstract: We present a method for automatically learning inflectional classes and associated lemmas from morphologically annotated corpora. The method consists of a core languageindependent algorithm, which can be optimized for specific languages. The method is demonstrated on Egyptian Arabic and German, two morphologically rich languages. Our best method for Egyptian Arabic provides an error reduction of 55.6% over a simple baseline; our best method for German achieves a 66.7% error reduction.
3 0.27961382 19 emnlp-2013-Adaptor Grammars for Learning Non-Concatenative Morphology
Author: Jan A. Botha ; Phil Blunsom
Abstract: This paper contributes an approach for expressing non-concatenative morphological phenomena, such as stem derivation in Semitic languages, in terms of a mildly context-sensitive grammar formalism. This offers a convenient level of modelling abstraction while remaining computationally tractable. The nonparametric Bayesian framework of adaptor grammars is extended to this richer grammar formalism to propose a probabilistic model that can learn word segmentation and morpheme lexicons, including ones with discontiguous strings as elements, from unannotated data. Our experiments on Hebrew and three variants of Arabic data find that the additional expressiveness to capture roots and templates as atomic units improves the quality of concatenative segmentation and stem identification. We obtain 74% accuracy in identifying triliteral Hebrew roots, while performing morphological segmentation with an F1-score of 78.1.
4 0.27438387 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology
Author: Wolfgang Seeker ; Jonas Kuhn
Abstract: Morphology and syntax interact considerably in many languages and language processing should pay attention to these interdependencies. We analyze the effect of syntactic features when used in automatic morphology prediction on four typologically different languages. We show that predicting morphology for languages with highly ambiguous word forms profits from taking the syntactic context of words into account and results in state-of-the-art models.
5 0.25123927 83 emnlp-2013-Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
Author: Stella Frank ; Frank Keller ; Sharon Goldwater
Abstract: Children learn various levels of linguistic structure concurrently, yet most existing models of language acquisition deal with only a single level of structure, implicitly assuming a sequential learning process. Developing models that learn multiple levels simultaneously can provide important insights into how these levels might interact synergistically during learning. Here, we present a model that jointly induces syntactic categories and morphological segmentations by combining two well-known models for the individual tasks. We test on child-directed utterances in English and Spanish and compare to single-task baselines. In the morphologically poorer language (English), the model improves morphological segmentation, while in the morphologically richer language (Spanish), it leads to better syntactic categorization. These results provide further evidence that joint learning is useful, but also suggest that the benefits may be different for typologically different languages.
6 0.11437751 162 emnlp-2013-Russian Stress Prediction using Maximum Entropy Ranking
7 0.11051262 81 emnlp-2013-Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media
8 0.10205331 26 emnlp-2013-Assembling the Kazakh Language Corpus
9 0.10029335 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation
10 0.098237611 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation
11 0.096575789 70 emnlp-2013-Efficient Higher-Order CRFs for Morphological Tagging
12 0.076399453 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
13 0.073629774 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models
14 0.069681972 169 emnlp-2013-Semi-Supervised Representation Learning for Cross-Lingual Text Classification
15 0.06907548 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation
16 0.066339582 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization
17 0.065759569 201 emnlp-2013-What is Hidden among Translation Rules
18 0.064079836 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk
19 0.06378226 156 emnlp-2013-Recurrent Continuous Translation Models
20 0.063328348 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation
topicId topicWeight
[(0, -0.232), (1, -0.162), (2, -0.0), (3, -0.139), (4, -0.388), (5, -0.164), (6, -0.198), (7, -0.145), (8, 0.105), (9, -0.2), (10, 0.037), (11, -0.013), (12, -0.011), (13, -0.078), (14, -0.021), (15, -0.04), (16, 0.089), (17, 0.023), (18, -0.014), (19, -0.03), (20, -0.042), (21, -0.072), (22, -0.057), (23, -0.104), (24, -0.113), (25, 0.007), (26, 0.061), (27, -0.014), (28, -0.019), (29, 0.024), (30, -0.019), (31, 0.023), (32, -0.018), (33, 0.02), (34, 0.053), (35, 0.028), (36, -0.018), (37, -0.02), (38, 0.019), (39, -0.033), (40, -0.009), (41, -0.052), (42, 0.014), (43, -0.028), (44, 0.053), (45, -0.087), (46, -0.014), (47, -0.06), (48, 0.044), (49, 0.06)]
simIndex simValue paperId paperTitle
same-paper 1 0.92040092 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases
Author: Victor Chahuneau ; Eva Schlinger ; Noah A. Smith ; Chris Dyer
Abstract: Translation into morphologically rich languages is an important but recalcitrant problem in MT. We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentencespecific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.
2 0.90509373 30 emnlp-2013-Automatic Extraction of Morphological Lexicons from Morphologically Annotated Corpora
Author: Ramy Eskander ; Nizar Habash ; Owen Rambow
Abstract: We present a method for automatically learning inflectional classes and associated lemmas from morphologically annotated corpora. The method consists of a core languageindependent algorithm, which can be optimized for specific languages. The method is demonstrated on Egyptian Arabic and German, two morphologically rich languages. Our best method for Egyptian Arabic provides an error reduction of 55.6% over a simple baseline; our best method for German achieves a 66.7% error reduction.
3 0.8259536 19 emnlp-2013-Adaptor Grammars for Learning Non-Concatenative Morphology
Author: Jan A. Botha ; Phil Blunsom
Abstract: This paper contributes an approach for expressing non-concatenative morphological phenomena, such as stem derivation in Semitic languages, in terms of a mildly context-sensitive grammar formalism. This offers a convenient level of modelling abstraction while remaining computationally tractable. The nonparametric Bayesian framework of adaptor grammars is extended to this richer grammar formalism to propose a probabilistic model that can learn word segmentation and morpheme lexicons, including ones with discontiguous strings as elements, from unannotated data. Our experiments on Hebrew and three variants of Arabic data find that the additional expressiveness to capture roots and templates as atomic units improves the quality of concatenative segmentation and stem identification. We obtain 74% accuracy in identifying triliteral Hebrew roots, while performing morphological segmentation with an F1-score of 78.1.
4 0.77172542 83 emnlp-2013-Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
Author: Stella Frank ; Frank Keller ; Sharon Goldwater
Abstract: Children learn various levels of linguistic structure concurrently, yet most existing models of language acquisition deal with only a single level of structure, implicitly assuming a sequential learning process. Developing models that learn multiple levels simultaneously can provide important insights into how these levels might interact synergistically during learning. Here, we present a model that jointly induces syntactic categories and morphological segmentations by combining two well-known models for the individual tasks. We test on child-directed utterances in English and Spanish and compare to single-task baselines. In the morphologically poorer language (English), the model improves morphological segmentation, while in the morphologically richer language (Spanish), it leads to better syntactic categorization. These results provide further evidence that joint learning is useful, but also suggest that the benefits may be different for typologically different languages.
5 0.69659555 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology
Author: Wolfgang Seeker ; Jonas Kuhn
Abstract: Morphology and syntax interact considerably in many languages and language processing should pay attention to these interdependencies. We analyze the effect of syntactic features when used in automatic morphology prediction on four typologically different languages. We show that predicting morphology for languages with highly ambiguous word forms profits from taking the syntactic context of words into account and results in state-of-the-art models.
6 0.40621069 26 emnlp-2013-Assembling the Kazakh Language Corpus
7 0.37564299 70 emnlp-2013-Efficient Higher-Order CRFs for Morphological Tagging
8 0.36773816 162 emnlp-2013-Russian Stress Prediction using Maximum Entropy Ranking
9 0.31676948 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models
10 0.31321242 156 emnlp-2013-Recurrent Continuous Translation Models
11 0.31297362 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation
12 0.30097908 195 emnlp-2013-Unsupervised Spectral Learning of WCFG as Low-rank Matrix Completion
13 0.29463747 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation
14 0.2932933 201 emnlp-2013-What is Hidden among Translation Rules
15 0.29001081 187 emnlp-2013-Translation with Source Constituency and Dependency Trees
16 0.27719003 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding
17 0.27664083 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation
18 0.27380323 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models
19 0.26033267 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation
20 0.25128624 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk
topicId topicWeight
[(3, 0.019), (18, 0.02), (22, 0.02), (30, 0.086), (50, 0.031), (51, 0.147), (66, 0.484), (71, 0.018), (75, 0.029), (77, 0.04), (90, 0.015)]
simIndex simValue paperId paperTitle
1 0.9301393 109 emnlp-2013-Is Twitter A Better Corpus for Measuring Sentiment Similarity?
Author: Shi Feng ; Le Zhang ; Binyang Li ; Daling Wang ; Ge Yu ; Kam-Fai Wong
Abstract: Extensive experiments have validated the effectiveness of the corpus-based method for classifying the word’s sentiment polarity. However, no work is done for comparing different corpora in the polarity classification task. Nowadays, Twitter has aggregated huge amount of data that are full of people’s sentiments. In this paper, we empirically evaluate the performance of different corpora in sentiment similarity measurement, which is the fundamental task for word polarity classification. Experiment results show that the Twitter data can achieve a much better performance than the Google, Web1T and Wikipedia based methods.
same-paper 2 0.87336731 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases
Author: Victor Chahuneau ; Eva Schlinger ; Noah A. Smith ; Chris Dyer
Abstract: Translation into morphologically rich languages is an important but recalcitrant problem in MT. We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentencespecific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.
3 0.85128331 201 emnlp-2013-What is Hidden among Translation Rules
Author: Libin Shen ; Bowen Zhou
Abstract: Most of the machine translation systems rely on a large set of translation rules. These rules are treated as discrete and independent events. In this short paper, we propose a novel method to model rules as observed generation output of a compact hidden model, which leads to better generalization capability. We present a preliminary generative model to test this idea. Experimental results show about one point improvement on TER-BLEU over a strong baseline in Chinese-to-English translation.
4 0.68725228 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification
Author: Jun-Ping Ng ; Min-Yen Kan ; Ziheng Lin ; Wei Feng ; Bin Chen ; Jian Su ; Chew Lim Tan
Abstract: In this paper we classify the temporal relations between pairs of events on an article-wide basis. This is in contrast to much of the existing literature which focuses on just event pairs which are found within the same or adjacent sentences. To achieve this, we leverage on discourse analysis as we believe that it provides more useful semantic information than typical lexico-syntactic features. We propose the use of several discourse analysis frameworks, including 1) Rhetorical Structure Theory (RST), 2) PDTB-styled discourse relations, and 3) topical text segmentation. We explain how features derived from these frameworks can be effectively used with support vector machines (SVM) paired with convolution kernels. Experiments show that our proposal is effective in improving on the state-of-the-art significantly by as much as 16% in terms of F1, even if we only adopt less-than-perfect automatic discourse analyzers and parsers. Making use of more accurate discourse analysis can further boost gains to 35%.
5 0.56523222 81 emnlp-2013-Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media
Author: Svitlana Volkova ; Theresa Wilson ; David Yarowsky
Abstract: Different demographics, e.g., gender or age, can demonstrate substantial variation in their language use, particularly in informal contexts such as social media. In this paper we focus on learning gender differences in the use of subjective language in English, Spanish, and Russian Twitter data, and explore cross-cultural differences in emoticon and hashtag use for male and female users. We show that gender differences in subjective language can effectively be used to improve sentiment analysis, and in particular, polarity classification for Spanish and Russian. Our results show statistically significant relative F-measure improvement over the gender-independent baseline 1.5% and 1% for Russian, 2% and 0.5% for Spanish, and 2.5% and 5% for English for polarity and subjectivity classification.
6 0.56418353 143 emnlp-2013-Open Domain Targeted Sentiment
7 0.53861177 19 emnlp-2013-Adaptor Grammars for Learning Non-Concatenative Morphology
8 0.52907962 83 emnlp-2013-Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
9 0.52791774 30 emnlp-2013-Automatic Extraction of Morphological Lexicons from Morphologically Annotated Corpora
10 0.50981557 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology
11 0.50931942 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation
12 0.50907576 47 emnlp-2013-Collective Opinion Target Extraction in Chinese Microblogs
13 0.5090335 8 emnlp-2013-A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability
14 0.50499624 99 emnlp-2013-Implicit Feature Detection via a Constrained Topic Model and SVM
15 0.50073808 40 emnlp-2013-Breaking Out of Local Optima with Count Transforms and Model Recombination: A Study in Grammar Induction
16 0.49208251 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models
17 0.49000961 59 emnlp-2013-Deriving Adjectival Scales from Continuous Space Word Representations
18 0.48811194 156 emnlp-2013-Recurrent Continuous Translation Models
19 0.48523173 138 emnlp-2013-Naive Bayes Word Sense Induction
20 0.48392829 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization