emnlp emnlp2013 emnlp2013-103 knowledge-graph by maker-knowledge-mining

103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk


Source: pdf

Author: Xiaoning Zhu ; Zhongjun He ; Hua Wu ; Haifeng Wang ; Conghui Zhu ; Tiejun Zhao

Abstract: This paper proposes a novel approach that utilizes a machine learning method to improve pivot-based statistical machine translation (SMT). For language pairs with few bilingual data, a possible solution in pivot-based SMT using another language as a

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 However, one of the weaknesses is that some useful source-target translations cannot be generated if the corresponding source phrase and target phrase connect to different pivot phrases. [sent-8, score-1.238]

2 To alleviate the problem, we utilize Markov random walks to connect possible translation phrases between source and target language. [sent-9, score-0.613]

3 1 Introduction Statistical machine translation (SMT) uses bilingual corpora to build translation models. [sent-11, score-0.527]

4 The pivot language approach, which performs translation through a third language, provides a possible solution to the problem. [sent-16, score-0.781]

5 With a triangulation pivot approach, a source-target phrase table can be obtained by combining the source-pivot phrase table and the pivot-target phrase table. [sent-18, score-1.704]

6 However, one of the weaknesses is that some corresponding source and target phrase pairs cannot be generated, because they are connected to different pivot phrases (Cui et al. [sent-19, score-1.003]

7 As illustrated in Figure 1, since there is no direct translation between “很可口 henkekou” and “really delicious”, the triangulation method is unable to establish a relation between “很可口 henkekou” and the two Spanish phrases. [sent-21, score-0.642]

8 To solve this problem, we apply a Markov random walk method to the pivot-based SMT system. [sent-22, score-0.642]

9 For example, Brin and Page (1998) used random walk to discover potential relations between queries and documents for link analysis in information retrieval. [sent-24, score-0.677]

10 Analogous to link analysis, the aim of pivot-based translation is to discover potential translations between source and target language via the pivot language. [sent-25, score-0.987]

11 The goal of this paper is to extend the previous triangulation approach by exploring implicit translation relations using a random walk method. [sent-29, score-1.266]

12 We evaluated our approach on several translation tasks, including translations between European languages, Chinese-Spanish spoken language translation, and Chinese-Japanese translation with English as the pivot language. [sent-30, score-1.309]

13 We review the triangulation method for pivot-based machine translation in section 3. [sent-34, score-0.671]

14 , 2011), a source sentence is first translated into n pivot sentences via a source-pivot translation system, and then each pivot sentence is translated into m target sentences via a pivot-target translation system. [sent-41, score-1.772]

15 At each step (source to pivot and pivot to target), multiple translation outputs are generated, so a minimum Bayes-risk system combination method is often used to select the optimal sentence (González-Rubio et al. [sent-42, score-1.354]
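One simple way to realize the minimum Bayes-risk selection step mentioned above is to pick the hypothesis with the highest average similarity to the other candidates, using a smoothed sentence-level BLEU as the gain function. This is a hedged sketch of the general MBR idea, not the exact combination method of González-Rubio et al.; all function names are ours:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hyp, ref, max_n=4):
    """Add-one-smoothed sentence-level BLEU, used here as a similarity proxy."""
    hyp, ref = hyp.split(), ref.split()
    if not hyp or not ref:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((h & r).values())        # clipped n-gram matches
        total = max(sum(h.values()), 1)
        log_prec += math.log((overlap + 1) / (total + 1))
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(log_prec / max_n)

def mbr_select(hypotheses):
    """Pick the hypothesis with minimum expected risk, i.e. maximum
    average similarity to the other hypotheses."""
    best, best_gain = None, -1.0
    for h in hypotheses:
        gain = sum(sentence_bleu(h, r) for r in hypotheses if r is not h)
        if gain > best_gain:
            best, best_gain = h, gain
    return best
```

In practice the candidates would be the m×n target outputs of the two-step transfer pipeline, possibly weighted by their model scores.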

16 On one hand, the time cost is doubled; on the other hand, the translation error of the source-pivot translation system will be transferred to the pivot-target translation. [sent-46, score-0.466]

17 Synthetic Method: A synthetic method creates a synthetic source-target corpus using the source-pivot or the pivot-target translation model (Utiyama et al. [sent-47, score-0.571]

18 For example, we can translate each pivot sentence in the pivot-target corpus into the source language with a pivot-source model, and then combine the translated source sentences with the target sentences to obtain a synthetic source-target corpus, and vice versa. [sent-49, score-0.778]

19 However, it is difficult to build a high quality translation system with a corpus created by a machine translation system. [sent-50, score-0.486]

20 Triangulation Method: The triangulation method obtains a source-target model by combining the source-pivot and pivot-target translation models (Wu and Wang, 2007; Cohn and Lapata, 2007), and has been shown to work better than the other pivot approaches (Utiyama and Isahara, 2007). [sent-51, score-1.167]

21 As we mentioned earlier, the weakness of triangulation is that the corresponding source and target phrase pairs cannot be connected in the case that they connect to different pivot phrases. [sent-52, score-1.32]

22 Thus, a pivot model can be obtained by merging these two models. [sent-55, score-0.569]

23 In the translation model, the phrase translation probability and the lexical weight are language dependent, which will be introduced in the next two sub-sections. [sent-56, score-0.71]

24 3.1 Phrase Translation Probability The triangulation method assumes that there exist translations between phrase s and phrase p in the source and pivot languages, and between phrase p and phrase t in the pivot and target languages. [sent-58, score-2.41]

25 The phrase translation probability φ between the source and target languages is determined by the following model: φ(s|t) = Σ_p φ(s|p, t) φ(p|t) = Σ_p φ(s|p) φ(p|t) (1) [sent-59, score-0.654]
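Formula (1) above can be sketched directly: marginalize over the pivot phrases shared by the source-pivot and pivot-target tables. The toy table layout (dicts keyed by phrase pairs) and the example phrases are illustrative assumptions, not the paper's data structures:

```python
from collections import defaultdict

def triangulate(sp_table, pt_table):
    """Induce a source-target phrase table from source-pivot and
    pivot-target tables: phi(s|t) = sum_p phi(s|p) * phi(p|t).
    Each table maps (left_phrase, right_phrase) -> probability."""
    # index the source-pivot table by pivot phrase: p -> {s: phi(s|p)}
    s_given_p = defaultdict(dict)
    for (s, p), prob in sp_table.items():
        s_given_p[p][s] = prob
    st_table = defaultdict(float)
    for (p, t), prob_pt in pt_table.items():        # phi(p|t)
        for s, prob_sp in s_given_p.get(p, {}).items():
            st_table[(s, t)] += prob_sp * prob_pt   # sum over shared pivots
    return dict(st_table)
```

Note that a source-target pair whose phrases only connect to *different* pivot phrases never enters the induced table, which is exactly the weakness the random walk is meant to address.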

26 3.2 Lexical Weight Given a phrase pair (s, t) and a word alignment a between the source word positions i = 1, …, n and the target word positions j = 0, 1, …, m, the lexical weight of the phrase pair (s, t) can be calculated with the following formula (Koehn et al. [sent-60, score-0.701]

27 ω(s|t) = count(s, t) / Σ_{s'} count(s', t) (3) Thus the alignment a between the source phrase s and the target phrase t via the pivot phrase p is needed for computing the lexical weight. [sent-62, score-1.479]
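The required source-target alignment can be obtained by composing the source-pivot and pivot-target word alignments through shared pivot positions: source position i aligns to target position j iff both align to some common pivot position k. A minimal sketch, where alignments are sets of (i, j) position pairs and the helper name is ours:

```python
def compose_alignments(a_sp, a_pt):
    """Compose a source-pivot word alignment with a pivot-target one.
    (i, j) is in the result iff (i, k) is in a_sp and (k, j) is in a_pt
    for some pivot position k."""
    # index pivot-target links by pivot position
    pt_by_pivot = {}
    for k, j in a_pt:
        pt_by_pivot.setdefault(k, set()).add(j)
    return {(i, j) for i, k in a_sp for j in pt_by_pivot.get(k, set())}
```

With the composed alignment in hand, the lexical weight of the induced phrase pair can be computed with the usual Koehn-style formula.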

28 The triangulation method requires that both the source and target phrases connect to the same pivot phrase. [sent-64, score-1.127]

29 In order to alleviate this problem, we propose a random walk model, to discover the implicit relations among the source, pivot and target phrases. [sent-67, score-1.294]

30 4 Random Walks on Translation Graph For phrase-based SMT, all source-target phrase pairs are stored in a phrase table. [sent-68, score-0.488]

31 In our random walk approach, we first build a translation graph according to the phrase table. [sent-69, score-1.144]

32 A translation graph contains two types of nodes: source phrases and target phrases. [sent-70, score-0.625]

33 A source phrase s and a target phrase t are connected if there exists a phrase pair (s, t) in the phrase table. [sent-71, score-1.115]

34 The edge can be weighted according to translation probabilities or alignments in the phrase table. [sent-72, score-0.507]

35 For the pivot-based translation, the translation graph can be derived from the source-pivot phrase table and pivot-target phrase table. [sent-73, score-0.772]

36 Our random walk model is inspired by two works (Szummer and Jaakkola, 2002; Craswell and Szummer, 2007). [sent-74, score-0.617]

37 The general process of random walk can be described as follows: Let G = (V , E) be a directed graph with n vertices and m edges. [sent-75, score-0.647]

38 A random walk on G follows this process: start at a vertex v0, then choose and walk along a random neighbor v1, with v1 ∈ Γ(v0). [sent-77, score-1.234]

39 Let S be the set of source phrases, and P be the set of pivot phrases. [sent-79, score-0.62]

40 Let R represent the binary relations between source phrases and pivot phrases. [sent-82, score-0.699]

41 Then the 1-step translation relation Rik from node i to node k can be directly obtained from the phrase table. [sent-83, score-0.579]

42 …combining the SP and PT phrase tables through the triangulation method. [sent-89, score-0.626]

43 Figure 3: Some possible decoding processes of the random-walk-based pivot approach: (a) pivot without random walk; (b) random walk on the source-pivot side; (c) random walk on the pivot-target side; (d) random walk on both sides. [sent-91, score-3.276]

44 The □ stands for the source phrase (S), the ○ represents the pivot phrase (P), and the ◇ stands for the target phrase (T); the t-step transition probability is P_{t|0}(k|i) = [A^t]_{ik}. [sent-92, score-1.398]
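In a random walk with row-stochastic transition matrix A, the probability of reaching node k from node i in exactly t steps is the (i, k) entry of A^t. A minimal numpy sketch on a toy bipartite source/pivot graph (the node layout and probabilities are illustrative, not from the paper):

```python
import numpy as np

def t_step_probs(A, t):
    """P_{t|0}(k|i) = [A^t]_{ik}: the probability of walking from node i
    to node k in exactly t steps under transition matrix A."""
    return np.linalg.matrix_power(A, t)

# Toy bipartite graph: nodes 0-1 are source phrases, nodes 2-3 are pivots.
# Row i holds the 1-step transition probabilities out of node i.
A = np.array([
    [0.0, 0.0, 1.0, 0.0],   # source 0 -> pivot 2 only
    [0.0, 0.0, 0.5, 0.5],   # source 1 -> pivots 2 and 3
    [0.5, 0.5, 0.0, 0.0],   # pivot 2 -> sources 0 and 1
    [0.0, 1.0, 0.0, 0.0],   # pivot 3 -> source 1
])
```

On a bipartite source-pivot graph, odd-length walks end on the pivot side; here source 0 has no 1-step link to pivot 3, but a 3-step walk (0 → 2 → 1 → 3) reaches it, which is how new translation relations are discovered.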

45 4.1 Framework of Random Walk Approach The overall framework of random walk for pivot-based machine translation is shown in Figure 2. [sent-95, score-0.902]

46 Before using the random walk model, we have two phrase tables: the source-pivot phrase table (SP phrase table) and the pivot-target phrase table (PT phrase table). [sent-96, score-1.879]

47 After applying the random walk approach, we obtain two extended phrase tables: the extended source-pivot phrase table (S’P’ phrase table) and the extended pivot-target phrase table (P’T’ phrase table). [sent-97, score-1.939]

48 The goal of pivot-based SMT is to get a source-target phrase table (ST phrase table) via SP phrase table and PT phrase table. [sent-98, score-1.047]

49 Our random walk is applied to the SP phrase table or the PT phrase table separately. [sent-99, score-1.147]

50 In the next two sub-sections, we will explain how the phrase translation probabilities and lexical weights are obtained with the random walk model on the phrase table. [sent-100, score-1.156]

51 Figure 3 shows some possible decoding processes of random walk based pivot approach. [sent-101, score-1.165]

52 In Figure 3-a, the possible source-target phrase pair can be obtained directly via a pivot phrase, so no random walk model is needed. [sent-102, score-1.459]

53 In Figure 3-b and Figure 3-c, a candidate source-target phrase pair can be obtained by random walks on the source-pivot side or the pivot-target side. [sent-103, score-0.486]

54 4.2 Phrase Translation Probabilities For the translation probabilities, the binary relation R is given by the translation probabilities in the phrase table. [sent-106, score-0.763]

55 According to formula 5, the random walk sums up the probabilities of all paths of length t between nodes i and k. [sent-108, score-0.72]

56 Take the source-to-pivot phrase graph as an example; the matrix A contains s+p nodes (s source phrases and p pivot phrases) and represents the translation graph. [sent-109, score-1.201]

57 We can split the matrix A into four sub-matrices: A = [ 0_{s×s}  A_sp ; A_ps  0_{p×p} ] (7), where the sub-matrix A_sp = [p_ik]_{s×p} represents the translation probabilities from the source language to the pivot language, and A_ps represents the analogous probabilities from the pivot language to the source language. [sent-111, score-0.91]

58 The first step corresponds to the translation from the source language to the pivot language. [sent-114, score-0.853]

59 The matrix A is derived directly from the phrase table, and each element of the matrix corresponds to a translation rule in the phrase table. [sent-115, score-0.799]

60 Compared with the initial phrase table in Step 1, although the number of phrases does not increase, the number of relations between phrase pairs increases and more translation rules can be obtained. [sent-123, score-0.821]
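The way new translation rules emerge from the walk can be sketched as follows: build the source-pivot transition matrix from a toy SP table, combine the 1-step and 3-step walks (odd-length walks end on the pivot side, so both yield source-pivot scores), and read off pairs absent from the original table. The decay factor, threshold, and row normalization here are our illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

def extend_phrase_table(sp_pairs, sources, pivots, decay=0.5, threshold=1e-4):
    """Sketch of the extended S'P' table via 1-step + decayed 3-step walks."""
    s_idx = {s: i for i, s in enumerate(sources)}
    p_idx = {p: j for j, p in enumerate(pivots)}
    Asp = np.zeros((len(sources), len(pivots)))
    for (s, p), prob in sp_pairs.items():
        Asp[s_idx[s], p_idx[p]] = prob

    def norm(M):
        # row-normalize to transition probabilities; all-zero rows stay zero
        sums = M.sum(axis=1, keepdims=True)
        return np.divide(M, sums, out=np.zeros_like(M), where=sums > 0)

    T_sp, T_ps = norm(Asp), norm(Asp.T)
    # 3-step source->pivot walk: s -> p -> s' -> p'
    walk = T_sp + decay * (T_sp @ T_ps @ T_sp)
    extended = {}
    for s, i in s_idx.items():
        for p, j in p_idx.items():
            if walk[i, j] > threshold:
                extended[(s, p)] = walk[i, j]
    return extended
```

The resulting scores are unnormalized path sums; a real system would renormalize them into probabilities before plugging them into the triangulation step.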

61 4.3 Lexical Weights To build a translation graph, the two sets of phrase translation probabilities are represented in the phrase tables. [sent-126, score-1.004]

62 To deal with this, we conduct a random walk over word alignments to obtain a new alignment a after t steps. [sent-128, score-0.721]

63 Our translation system is an in-house phrase-based system using a log-linear framework including a phrase translation model, a language model, a lexicalized reordering model, a word penalty model and a phrase penalty model, which is analogous to Moses (Koehn et al. [sent-138, score-0.977]

64 The baseline system is the triangulation method based pivot approach (Wu and Wang, 2007). [sent-140, score-0.957]

65 We perform our experiments on different translation directions and via different pivot languages. [sent-151, score-0.81]

66 As the most widely used language in the world (Mydans, 2011), English was taken as the pivot language by default when carrying out experiments on different translation directions. [sent-152, score-0.781]

67 For translating Portuguese to Swedish, we also tried to perform our experiments via different pivot languages. [sent-153, score-0.621]

68 Training data for experiments using English as the pivot language. [sent-158, score-0.548]

69 Experiments on Different Translation Directions We build 180 pivot translation systems (including 90 baseline systems and 90 random-walk-based systems) using 10 source/target languages and 1 pivot language (English). [sent-177, score-2.048]

70 The baseline system was built following the traditional triangulation pivot approach. [sent-178, score-0.932]

71 Among the 90 language pairs, the random-walk-based approach is significantly better than the baseline system on 75 language pairs. [sent-196, score-0.64]

72 But our random walk approach can improve the performance of translations between different language families. [sent-206, score-0.648]

73 Experiments via Different Pivot Languages In addition to using English as the pivot language, we also try some other languages as the pivot language. [sent-210, score-1.184]

74 In this sub-section, experiments were carried out on translating Portuguese to Swedish via different pivot languages. [sent-211, score-0.621]

75 Table 5 summarizes the BLEU% scores of different pivot languages when translating from Portuguese to Swedish. [sent-212, score-0.592]

76 Similar to Table 4, our approach still achieves general improvements over the baseline system even if the pivot language has been changed. [sent-213, score-0.571]

77 From the table we can see that for most of the pivot languages, the random-walk-based approach gains more than 1 BLEU point over the baseline. [sent-214, score-1.186]

78 But when using Finnish as the pivot language, the improvement is only 0. [sent-215, score-0.548]

79 This phenomenon shows that the pivot language can also influence the performance of random walk approach. [sent-217, score-1.165]

80 One possible reason for the poor performance of Finnish as the pivot language is that Finnish belongs to the Uralic language family, while the other languages belong to the Indo-European family. [sent-218, score-0.63]

81 Thus, how to select the best pivot language is left for future work. [sent-220, score-0.548]

82 The problem with the random walk is that it leads to a larger phrase table with noise. [sent-221, score-0.882]

83 We used a naive pruning method that selects the top N phrase pairs in the phrase table. [sent-223, score-0.539]
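The naive top-N pruning can be sketched as keeping, for each source phrase, only its N most probable target phrases; applying it before the walk corresponds to pre-pruning and after the walk to post-pruning. The table layout and function name are our assumptions:

```python
from collections import defaultdict

def prune_top_n(phrase_table, n):
    """Keep only the n highest-probability target phrases per source phrase.
    phrase_table maps (source_phrase, target_phrase) -> probability."""
    by_source = defaultdict(list)
    for (s, t), prob in phrase_table.items():
        by_source[s].append((prob, t))
    pruned = {}
    for s, cands in by_source.items():
        # sort candidates by probability, descending, and keep the top n
        for prob, t in sorted(cands, reverse=True)[:n]:
            pruned[(s, t)] = prob
    return pruned
```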

84 For pre-pruning, we prune the SP phrase table and PT phrase table before applying random walks. [sent-225, score-0.649]

85 With a pre- and post-pruning method, the random walk approach is able to achieve further improvements. [sent-230, score-0.617]

86 34 on WMT06, WMT07 and WMT08 respectively, which is much better than the baseline and the random walk approach with pruning. [sent-234, score-0.64]

87 When adopting a post-pruning method, the performance of translation did not improve significantly over pre-pruning, but the size of the phrase table dropped to 69M, only about twice as large as that of the triangulation method. [sent-236, score-0.859]

88 Experimental results on translating from Portuguese to Swedish via different pivot languages. [sent-245, score-0.621]

89 In this sub-section, we consider translation from Chinese to Spanish with English as the pivot language. [sent-255, score-0.781]

90 A pivot task was included in IWSLT 2008, in which participants needed to translate Chinese to Spanish via English. [sent-257, score-0.577]

91 From the table we can find that the random walk model can achieve an absolute improvement of 0. [sent-279, score-0.638]

92 It means that our random walk approach is robust in the realistic scenario. [sent-283, score-0.617]

93 6 Discussions The random walk approach mainly improves the performance of pivot translation in two aspects: it reduces OOVs and provides more hypothesis phrases for decoding. [sent-284, score-1.445]

94 We count the OOVs when decoding with triangulation model and random walk model on IWSLT2008 data. [sent-290, score-0.978]

95 The statistics show that there are 11% OOVs when using the triangulation model, compared with 9. [sent-291, score-0.722]

96 In this case, the random walk translation is better than the baseline system. [sent-297, score-0.873]

97 Random walk is able to discover the hidden relations (hypothesis phrases) among source, pivot and target phrases. [sent-301, score-1.152]

98 7 Conclusion and Future Work In this paper, we proposed a random walk method to improve pivot-based statistical machine translation. [sent-304, score-0.662]

99 The random walk method can find implicit relations between phrases in the source and target languages. [sent-305, score-0.862]

100 Improving translation quality by discarding most of the phrase table. [sent-347, score-0.477]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('pivot', 0.548), ('walk', 0.498), ('triangulation', 0.361), ('phrase', 0.244), ('translation', 0.233), ('random', 0.119), ('finnish', 0.09), ('henkekou', 0.086), ('europarl', 0.076), ('pt', 0.076), ('source', 0.072), ('smt', 0.072), ('walks', 0.068), ('portuguese', 0.064), ('oov', 0.06), ('utiyama', 0.06), ('swedish', 0.06), ('languages', 0.059), ('bleu', 0.053), ('asp', 0.052), ('haifeng', 0.052), ('pivotbased', 0.052), ('zhentou', 0.052), ('alignment', 0.052), ('sp', 0.051), ('phrases', 0.047), ('target', 0.046), ('rik', 0.045), ('translating', 0.044), ('spanish', 0.044), ('iwslt', 0.044), ('wu', 0.043), ('formula', 0.043), ('bilingual', 0.041), ('oovs', 0.041), ('aps', 0.041), ('hua', 0.041), ('odd', 0.041), ('synthetic', 0.04), ('almohada', 0.034), ('craswell', 0.034), ('gij', 0.034), ('mydans', 0.034), ('quiero', 0.034), ('sourcepivot', 0.034), ('szummer', 0.034), ('uralic', 0.034), ('xiaoning', 0.034), ('koehn', 0.033), ('relations', 0.032), ('spoken', 0.031), ('translations', 0.031), ('probabilities', 0.03), ('delicious', 0.03), ('rij', 0.03), ('graph', 0.03), ('node', 0.03), ('via', 0.029), ('connect', 0.028), ('discover', 0.028), ('matrix', 0.027), ('years', 0.027), ('pages', 0.027), ('cohn', 0.027), ('chinese', 0.027), ('philipp', 0.026), ('operator', 0.026), ('yao', 0.026), ('pruning', 0.026), ('isahara', 0.025), ('weaknesses', 0.025), ('gonz', 0.025), ('method', 0.025), ('percentages', 0.024), ('trans', 0.024), ('masao', 0.024), ('bannard', 0.024), ('brin', 0.024), ('analogous', 0.023), ('relation', 0.023), ('baseline', 0.023), ('implicit', 0.023), ('transfer', 0.023), ('belongs', 0.023), ('european', 0.022), ('web', 0.022), ('cui', 0.022), ('harbin', 0.022), ('duh', 0.022), ('released', 0.021), ('table', 0.021), ('ito', 0.021), ('rw', 0.021), ('connected', 0.021), ('obtained', 0.021), ('build', 0.02), ('extended', 0.02), ('statistical', 0.02), ('xiang', 0.02), ('martin', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

Author: Xiaoning Zhu ; Zhongjun He ; Hua Wu ; Haifeng Wang ; Conghui Zhu ; Tiejun Zhao

Abstract: This paper proposes a novel approach that utilizes a machine learning method to improve pivot-based statistical machine translation (SMT). For language pairs with few bilingual data, a possible solution in pivot-based SMT using another language as a

2 0.14594474 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

Author: Xinyan Xiao ; Deyi Xiong

Abstract: Traditional synchronous grammar induction estimates parameters by maximizing likelihood, which only has a loose relation to translation quality. Alternatively, we propose a max-margin estimation approach to discriminatively inducing synchronous grammars for machine translation, which directly optimizes translation quality measured by BLEU. In the max-margin estimation of parameters, we only need to calculate Viterbi translations. This further facilitates the incorporation of various non-local features that are defined on the target side. We test the effectiveness of our max-margin estimation framework on a competitive hierarchical phrase-based system. Experiments show that our max-margin method significantly outperforms the traditional twostep pipeline for synchronous rule extraction by 1.3 BLEU points and is also better than previous max-likelihood estimation method.

3 0.13761781 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

Author: Joern Wuebker ; Stephan Peitz ; Felix Rietig ; Hermann Ney

Abstract: Automatically clustering words from a monolingual or bilingual training corpus into classes is a widely used technique in statistical natural language processing. We present a very simple and easy to implement method for using these word classes to improve translation quality. It can be applied across different machine translation paradigms and with arbitrary types of models. We show its efficacy on a small German→English and a larger French→German translation task with standard phrase-based and hierarchical phrase-based translation systems for a common set of models. Our results show that with word class models, the baseline can be improved by up to 1.4% BLEU and 1.0% TER on the French→German task and 0.3% BLEU and 1.1% TER on the German→English task.

4 0.13421221 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

Author: Zhongqiang Huang ; Jacob Devlin ; Rabih Zbib

Abstract: This paper describes a factored approach to incorporating soft source syntactic constraints into a hierarchical phrase-based translation system. In contrast to traditional approaches that directly introduce syntactic constraints to translation rules by explicitly decorating them with syntactic annotations, which often exacerbate the data sparsity problem and cause other problems, our approach keeps translation rules intact and factorizes the use of syntactic constraints through two separate models: 1) a syntax mismatch model that associates each nonterminal of a translation rule with a distribution of tags that is used to measure the degree of syntactic compatibility of the translation rule on source spans; 2) a syntax-based reordering model that predicts whether a pair of sibling constituents in the constituent parse tree of the source sentence should be reordered or not when translated to the target language. The features produced by both models are used as soft constraints to guide the translation process. Experiments on Chinese-English translation show that the proposed approach significantly improves a strong string-to-dependency translation system on multiple evaluation sets.

5 0.12288327 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

Author: Ann Irvine ; Chris Quirk ; Hal Daume III

Abstract: When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major challenge is the large number of out-of-vocabulary (OOV) and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable document pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method recovers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimizing the divergence from the OLD-domain distribution. Adding learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.

6 0.11723483 17 emnlp-2013-A Walk-Based Semantically Enriched Tree Kernel Over Distributed Word Representations

7 0.11596783 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning

8 0.10928002 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

9 0.10803863 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

10 0.098639324 169 emnlp-2013-Semi-Supervised Representation Learning for Cross-Lingual Text Classification

11 0.094249859 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

12 0.087838851 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

13 0.082253762 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

14 0.081994154 151 emnlp-2013-Paraphrasing 4 Microblog Normalization

15 0.081985228 201 emnlp-2013-What is Hidden among Translation Rules

16 0.079458274 167 emnlp-2013-Semi-Markov Phrase-Based Monolingual Alignment

17 0.079104714 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation

18 0.075376943 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

19 0.071485087 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings

20 0.071328431 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.205), (1, -0.214), (2, 0.043), (3, 0.021), (4, 0.121), (5, -0.031), (6, -0.044), (7, -0.006), (8, -0.001), (9, -0.102), (10, 0.053), (11, 0.018), (12, 0.093), (13, -0.003), (14, -0.004), (15, 0.028), (16, -0.003), (17, 0.038), (18, 0.039), (19, 0.02), (20, 0.017), (21, -0.066), (22, -0.017), (23, 0.02), (24, -0.068), (25, -0.083), (26, -0.104), (27, 0.035), (28, -0.043), (29, 0.001), (30, 0.049), (31, -0.005), (32, -0.009), (33, -0.006), (34, 0.051), (35, -0.12), (36, 0.072), (37, -0.021), (38, -0.077), (39, 0.044), (40, 0.062), (41, 0.088), (42, -0.028), (43, -0.035), (44, 0.036), (45, 0.156), (46, 0.006), (47, -0.083), (48, 0.008), (49, -0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93718243 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

Author: Xiaoning Zhu ; Zhongjun He ; Hua Wu ; Haifeng Wang ; Conghui Zhu ; Tiejun Zhao

Abstract: This paper proposes a novel approach that utilizes a machine learning method to improve pivot-based statistical machine translation (SMT). For language pairs with few bilingual data, a possible solution in pivot-based SMT using another language as a

2 0.69685501 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

Author: Joern Wuebker ; Stephan Peitz ; Felix Rietig ; Hermann Ney

Abstract: Automatically clustering words from a monolingual or bilingual training corpus into classes is a widely used technique in statistical natural language processing. We present a very simple and easy to implement method for using these word classes to improve translation quality. It can be applied across different machine translation paradigms and with arbitrary types of models. We show its efficacy on a small German→English and a larger French→German translation task with standard phrase-based and hierarchical phrase-based translation systems for a common set of models. Our results show that with word class models, the baseline can be improved by up to 1.4% BLEU and 1.0% TER on the French→German task and 0.3% BLEU and 1.1% TER on the German→English task.

3 0.68552977 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

Author: Jesus Gonzalez-Rubio ; Daniel Ortiz-Martinez ; Jose-Miguel Benedi ; Francisco Casacuberta

Abstract: Current automatic machine translation systems are not able to generate error-free translations and human intervention is often required to correct their output. Alternatively, an interactive framework that integrates the human knowledge into the translation process has been presented in previous works. Here, we describe a new interactive machine translation approach that is able to work with phrase-based and hierarchical translation models, and integrates error-correction all in a unified statistical framework. In our experiments, our approach outperforms previous interactive translation systems, and achieves estimated effort reductions of as much as 48% relative over a traditional post-edition system.

4 0.68450439 135 emnlp-2013-Monolingual Marginal Matching for Translation Model Adaptation

Author: Ann Irvine ; Chris Quirk ; Hal Daume III

Abstract: When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major challenge is the large number of out-of-vocabulary (OOV) and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable document pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method recovers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimizing the divergence from the OLD-domain distribution. Adding learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.

5 0.66659594 57 emnlp-2013-Dependency-Based Decipherment for Resource-Limited Machine Translation

Author: Qing Dou ; Kevin Knight

Abstract: We introduce dependency relations into deciphering foreign languages and show that dependency relations help improve the state-ofthe-art deciphering accuracy by over 500%. We learn a translation lexicon from large amounts of genuinely non parallel data with decipherment to improve a phrase-based machine translation system trained with limited parallel data. In experiments, we observe BLEU gains of 1.2 to 1.8 across three different test sets.

6 0.61255091 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

7 0.59217268 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

8 0.58815801 84 emnlp-2013-Factored Soft Source Syntactic Constraints for Hierarchical Machine Translation

9 0.57191807 127 emnlp-2013-Max-Margin Synchronous Grammar Induction for Machine Translation

10 0.56293273 201 emnlp-2013-What is Hidden among Translation Rules

11 0.54555148 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning

12 0.54313153 88 emnlp-2013-Flexible and Efficient Hypergraph Interactions for Joint Hierarchical and Forest-to-String Decoding

13 0.50684923 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

14 0.50419044 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

15 0.49688011 39 emnlp-2013-Boosting Cross-Language Retrieval by Learning Bilingual Phrase Associations from Relevance Rankings

16 0.49415424 156 emnlp-2013-Recurrent Continuous Translation Models

17 0.49380869 13 emnlp-2013-A Study on Bootstrapping Bilingual Vector Spaces from Non-Parallel Data (and Nothing Else)

18 0.47392929 151 emnlp-2013-Paraphrasing 4 Microblog Normalization

19 0.46407399 17 emnlp-2013-A Walk-Based Semantically Enriched Tree Kernel Over Distributed Word Representations

20 0.46354276 42 emnlp-2013-Building Specialized Bilingual Lexicons Using Large Scale Background Knowledge


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.022), (10, 0.013), (18, 0.035), (22, 0.061), (30, 0.11), (45, 0.019), (51, 0.161), (64, 0.285), (66, 0.043), (71, 0.018), (75, 0.038), (77, 0.054), (90, 0.012), (96, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.78062278 87 emnlp-2013-Fish Transporters and Miracle Homes: How Compositional Distributional Semantics can Help NP Parsing

Author: Angeliki Lazaridou ; Eva Maria Vecchi ; Marco Baroni

Abstract: In this work, we argue that measures that have been shown to quantify the degree of semantic plausibility of phrases, as obtained from their compositionally-derived distributional semantic representations, can resolve syntactic ambiguities. We exploit this idea to choose the correct parsing of NPs (e.g., (live fish) transporter rather than live (fish transporter)). We show that our plausibility cues outperform a strong baseline and significantly improve performance when used in combination with state-of-the-art features.
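The idea in this abstract can be sketched with a toy example: compose the word vectors of each candidate inner constituent and score the bracketing by how close the composed vector lies to a corpus-observed phrase vector. The 2-d vectors, additive composition, and cosine-based plausibility below are illustrative assumptions, not the paper's actual composition functions or measures.

```python
import math

def add_compose(u, v):
    # Simplest composition function: element-wise vector addition.
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy word and phrase vectors (illustrative values, not from the paper).
vec = {
    "live": [1.0, 0.2],
    "fish": [0.9, 0.4],
    "transporter": [0.1, 1.0],
    "live fish": [1.0, 0.3],          # observed corpus vector for the phrase
    "fish transporter": [0.2, 1.1],   # observed corpus vector for the phrase
}

def plausibility(inner_phrase, w1, w2):
    """Score a bracketing by how close the composed vector of its inner
    constituent is to that constituent's corpus-observed phrase vector."""
    return cosine(add_compose(vec[w1], vec[w2]), vec[inner_phrase])

left = plausibility("live fish", "live", "fish")                  # (live fish) transporter
right = plausibility("fish transporter", "fish", "transporter")   # live (fish transporter)
best = "(live fish) transporter" if left > right else "live (fish transporter)"
```

With these toy vectors the left bracketing wins, matching the paper's (live fish) transporter example; the paper itself uses richer composition models and several plausibility cues combined with parser features.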

same-paper 2 0.74793535 103 emnlp-2013-Improving Pivot-Based Statistical Machine Translation Using Random Walk

Author: Xiaoning Zhu ; Zhongjun He ; Hua Wu ; Haifeng Wang ; Conghui Zhu ; Tiejun Zhao

Abstract: This paper proposes a novel approach that utilizes a machine learning method to improve pivot-based statistical machine translation (SMT). For language pairs with few bilingual data, a possible solution in pivot-based SMT using another language as a
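The triangulation this paper builds on can be sketched as follows: a source-target phrase table is induced by marginalizing over pivot phrases, p(t|s) = Σ_p p(t|p)·p(p|s). The toy phrase pairs and probabilities below are illustrative assumptions; the final comment points at the coverage gap that the paper's random-walk extension is meant to close.

```python
from collections import defaultdict

def triangulate(src_pivot, pivot_tgt):
    """Combine a source-pivot and a pivot-target phrase table into a
    source-target table: p(t|s) = sum over pivot phrases p of p(p|s) * p(t|p)."""
    src_tgt = defaultdict(float)
    for (s, p), prob_sp in src_pivot.items():
        for (p2, t), prob_pt in pivot_tgt.items():
            if p == p2:
                src_tgt[(s, t)] += prob_sp * prob_pt
    return dict(src_tgt)

# Toy Chinese-English and English-Spanish tables (illustrative values).
src_pivot = {("hen keko", "delicious"): 0.6,
             ("hen keko", "really delicious"): 0.4}
pivot_tgt = {("delicious", "delicioso"): 1.0}

table = triangulate(src_pivot, pivot_tgt)
# ("hen keko", "delicioso") gets 0.6 * 1.0 = 0.6, but the probability mass
# routed through "really delicious" is lost because that pivot phrase has no
# target entry -- exactly the missing-connection problem the random walk on
# the source-pivot-target phrase graph is proposed to alleviate.
```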

3 0.72817045 175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.


4 0.59983611 56 emnlp-2013-Deep Learning for Chinese Word Segmentation and POS Tagging

Author: Xiaoqing Zheng ; Hanyang Chen ; Tianyu Xu

Abstract: This study explores the feasibility of performing Chinese word segmentation (CWS) and POS tagging by deep learning. We try to avoid task-specific feature engineering, and use deep layers of neural networks to discover features relevant to the tasks. We leverage large-scale unlabeled data to improve internal representation of Chinese characters, and use these improved representations to enhance supervised word segmentation and POS tagging models. Our networks achieved close to state-of-the-art performance with minimal computational cost. We also describe a perceptron-style algorithm for training the neural networks, as an alternative to the maximum-likelihood method, to speed up the training process and make the learning algorithm easier to implement.

5 0.59932292 157 emnlp-2013-Recursive Autoencoders for ITG-Based Translation

Author: Peng Li ; Yang Liu ; Maosong Sun

Abstract: While inversion transduction grammar (ITG) is well suited for modeling ordering shifts between languages, how to make the choice between the two reordering rules (i.e., straight and inverted) dependent on the actual blocks being merged remains a challenge. Unlike previous work that only uses boundary words, we propose to use recursive autoencoders to make full use of the entire merged blocks. The recursive autoencoders are capable of generating vector space representations for variable-sized phrases, which enable predicting orders that exploit syntactic and semantic information from a neural language modeling perspective. Experiments on the NIST 2008 dataset show that our system significantly improves over the MaxEnt classifier by 1.07 BLEU points.
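The composition step at the heart of this abstract can be sketched in a few lines: a recursive autoencoder maps the vectors of two merged blocks to a fixed-size parent vector via parent = tanh(W·[c1; c2] + b). The dimensionality, random weights, and toy block vectors below are my assumptions; the paper trains W and b with a reconstruction objective and feeds the resulting phrase vectors to the straight/inverted reordering predictor.

```python
import math
import random

random.seed(0)
DIM = 4

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# Encoder weights W (DIM x 2*DIM) and bias b (DIM), randomly initialized
# here; in the paper they are learned by minimizing reconstruction error.
W = [[random.uniform(-0.5, 0.5) for _ in range(2 * DIM)] for _ in range(DIM)]
b = [random.uniform(-0.5, 0.5) for _ in range(DIM)]

def compose(c1, c2):
    """One autoencoder step: parent = tanh(W [c1; c2] + b)."""
    h = matvec(W, c1 + c2)  # list concatenation = [c1; c2]
    return [math.tanh(hi + bi) for hi, bi in zip(h, b)]

# Merging two adjacent blocks bottom-up yields one fixed-size vector for a
# variable-sized phrase, usable as a feature for the reordering classifier.
left_block = [0.1, -0.2, 0.3, 0.0]
right_block = [0.4, 0.1, -0.1, 0.2]
phrase_vec = compose(left_block, right_block)
```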

6 0.59864116 187 emnlp-2013-Translation with Source Constituency and Dependency Trees

7 0.59702981 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models

8 0.59702259 15 emnlp-2013-A Systematic Exploration of Diversity in Machine Translation

9 0.59450781 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation

10 0.59210372 50 emnlp-2013-Combining PCFG-LA Models with Dual Decomposition: A Case Study with Function Labels and Binarization

11 0.59114826 104 emnlp-2013-Improving Statistical Machine Translation with Word Class Models

12 0.59003174 22 emnlp-2013-Anchor Graph: Global Reordering Contexts for Statistical Machine Translation

13 0.58993715 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction

14 0.58957511 13 emnlp-2013-A Study on Bootstrapping Bilingual Vector Spaces from Non-Parallel Data (and Nothing Else)

15 0.58800048 143 emnlp-2013-Open Domain Targeted Sentiment

16 0.58784986 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge

17 0.58693743 52 emnlp-2013-Converting Continuous-Space Language Models into N-Gram Language Models for Statistical Machine Translation

18 0.58528388 53 emnlp-2013-Cross-Lingual Discriminative Learning of Sequence Models with Posterior Regularization

19 0.58499068 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks

20 0.58427984 47 emnlp-2013-Collective Opinion Target Extraction in Chinese Microblogs