emnlp emnlp2010 emnlp2010-36 knowledge-graph by maker-knowledge-mining

36 emnlp-2010-Discriminative Word Alignment with a Function Word Reordering Model


Source: pdf

Author: Hendra Setiawan ; Chris Dyer ; Philip Resnik

Abstract: We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manually-aligned corpora. To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to state-of-the-art translation models. Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. [sent-3, score-1.568]

2 In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. [sent-4, score-1.036]

3 To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manually-aligned corpora. [sent-5, score-0.624]

4 To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. [sent-6, score-0.621]

5 By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to state-of-the-art translation models. [sent-7, score-1.13]

6 Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task. [sent-8, score-0.669]

7 1 Introduction In many Statistical Machine Translation (SMT) systems, alignment represents an important piece of information, from which translation rules are learnt. [sent-9, score-0.538]

8 However, while translation models have evolved from word-based to syntax-based modeling, the de facto alignment model remains word-based (Brown et al. [sent-10, score-0.63]

9 This gap between alignment modeling and translation modeling is clearly undesirable as it often generates tensions that would prevent the extraction of many useful translation rules (DeNero and Klein, 2007). [sent-13, score-0.764]

10 (2009) just to name a few, show that alignment models that bear closer resemblance to state-of-the-art translation models consistently yield not only better alignment quality but also improved translation quality. [sent-21, score-1.168]

11 In this paper, we follow this recent effort to narrow the gap between alignment model and translation model to improve translation quality. [sent-22, score-0.76]

12 Why is employing stronger reordering models more challenging in alignment than in translation? [sent-25, score-0.965]

13 One answer can be attributed to the fact that alignment points are unobserved in parallel text, thus so are their reorderings. [sent-26, score-0.435]

14 As such, introducing stronger reordering models often further exacerbates the computational complexity of inference over the model. [sent-27, score-0.529]

15 Some recent alignment models appeal to external linguistic knowledge, mostly by using monolingual syntactic parses (Cherry and Lin, 2006; Pauls et al. [sent-28, score-0.487]

16 To our knowledge, however, this approach has been used mainly to constrain reordering possibilities, or to add to the generalization ability of association-based scores, not to directly model reordering in the context of alignment. [sent-30, score-1.089]

17 In this paper, we introduce a new approach to improving the modeling of reordering in alignment. [sent-33, score-0.562]

18 Instead of relying on monolingual parses, we condition our reordering model on the behavior of function words and the phrases that surround them. [sent-34, score-0.739]

19 At a glance, our reordering model enumerates the function words on both source and target sides, modeling their reordering relative to their neighboring phrases, their neighboring function words, and the sentence boundaries. [sent-39, score-1.486]

20 Because the frequency of function words is high, we find that by predicting the reordering of function words accurately, the reordering of the remaining words improves in accuracy as well. [sent-40, score-1.2]

21 The parameters of our sub-models are estimated from manually-aligned corpora, leading the reordering model more directly toward reproducing human alignments, rather than maximizing the likelihood of unaligned training data. [sent-43, score-0.641]

22 combined with additional features in Section 4 to produce a single discriminative alignment model. [sent-64, score-0.453]

23 Section 5 describes a simple decoding algorithm to find the most probable alignment under the combined model, Section 6 describes the training of our discriminative model and Section 7 presents experimental results for the model using this algorithm. [sent-65, score-0.515]

24 Figure 1 shows an example of a Chinese-English sentence pair together with correct alignment points. [sent-68, score-0.407]

25 Predicting the alignment for this particular Chinese-English sentence pair is challenging, since the significantly different syntactic structures of these two languages lead to non-monotone reordering. [sent-69, score-0.407]

26 The central question that concerns us here is how to define and infer regularities that can be useful to predict alignment reorderings. [sent-71, score-0.475]

27 The approach we take here is supported by empirical results from a pilot study, conducted as an inquiry into the idea of focusing on function words to model alignment reordering, which we briefly describe. [sent-72, score-0.547]

28 Visually, an all-monotone phrase pair corresponds to a maximal block in the alignment matrix for which internal alignment points appear in monotone order from the top-left corner to the bottom-right corner. [sent-87, score-0.921]

29 The alignment configuration internal to all-monotone phrase pair blocks is, obviously, monotonic, which is a configuration that is effectively modeled by traditional alignment models. [sent-92, score-0.804]
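
The all-monotone criterion in sentence 28 is easy to state operationally. Below is a minimal Python sketch of that visual check, under the assumption that the block holds at most one alignment point per source word: sorting the block's points by source index must also put their target indices in increasing order, so the points run from the top-left corner to the bottom-right corner. The function name is hypothetical.

```python
def is_all_monotone(points):
    """True if the alignment points of a block appear in monotone order from
    the top-left to the bottom-right corner of the alignment matrix: sorting
    by source index must also sort the target indices. `points` is a set of
    (source_index, target_index) pairs, at most one per source word."""
    targets = [m for _, m in sorted(points)]
    return all(a < b for a, b in zip(targets, targets[1:]))

# A monotone block vs. a swapped (non-monotone) one.
assert is_all_monotone({(0, 0), (1, 1), (2, 2)})
assert not is_all_monotone({(0, 1), (1, 0)})
```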

30 On the other hand, the reordering between two adjacent blocks is the focus of our efforts since existing models are less effective at modeling non-monotonic alignment configurations. [sent-93, score-1.139]

31 Clearly, with such high coverage, function words are central in predicting non-monotone reordering in alignment. [sent-105, score-0.638]

32 3 Reordering with Function Words The reordering models we describe follow our previous work using function word models for translation (Setiawan et al. [sent-106, score-0.789]

33 The core hypothesis in this work is that function words provide robust clues to the reordering patterns of the phrases surrounding them. [sent-109, score-0.657]

34 This section provides a high level overview of our reordering model, which attempts to leverage this information. [sent-111, score-0.529]

35 To facilitate subsequent discussions, we introduce the notion of monolingual function word phrase FWi, which consists of the tuple (Yi, Li, Ri), where Yi is the i-th function word and Li,Ri are its left and right neighboring phrases, respectively. [sent-112, score-0.354]

36 Note that this notion of “phrase” is defined only for reordering purposes in our model, and does not necessarily correspond to a linguistic phrase. [sent-113, score-0.529]
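
To make the FWi tuple from sentence 35 concrete, here is a small sketch of a container for it; the class and field names are hypothetical, and the spans are plain index ranges rather than linguistic phrases, matching the caveat in sentence 36.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class FunctionWordPhrase:
    """FW_i = (Y_i, L_i, R_i): a function word together with the spans of its
    left and right neighboring phrases, used only for reordering purposes."""
    y: str                  # Y_i: the i-th function word
    left: Tuple[int, int]   # L_i: span of the left neighboring phrase
    right: Tuple[int, int]  # R_i: span of the right neighboring phrase
```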

37 The primary objective of our reordering model is to predict the projection of monolingual function word phrases from one language to the other, inferring bilingual function word phrase pairs FWi,S→T = (Yi,S→T, Li,S→T, Ri,S→T), which encode the two aforementioned pieces of information. [sent-117, score-0.933]

38 For instance, to estimate the spans of Li,S→T, Ri,S→T, our reordering model assumes that any span to the left of Yi,S is a possible Li,S and any span to the right of Yi,S is a possible Ri,S, deciding which is most probable via features, rather than committing to particular spans (e. [sent-121, score-0.668]

39 We only enforce one criterion on Li,S→T and Ri,S→T: they have to be the maximal alignment blocks satisfying the consistent heuristic (Och and Ney, 2004) that end or start with Yi,S→T on the source S side respectively. [sent-124, score-0.552]

40 Taking the decomposition of Li,S→T as a case in point, o(Li,S→T) describes the reordering of the left neighbor Li,S→T with respect to the function word Yi,S→T, while d(FWi−1,S→T) and b(⟨s⟩) probe the span of Li,S→T, i. [sent-126, score-0.694]

41 the reordering of the neighboring phrases of a function word, we employ the orientation model introduced by Setiawan et al. [sent-134, score-0.87]

42 Formally, this model takes the form of the probability distribution Pori(o(Li,S→T), o(Ri,S→T) | Yi,S→T), which conditions the reordering on the lexical identity of the function word alignment (but is independent of the lexical identity of its neighboring phrases). [sent-136, score-1.223]

43 In particular, o maps the reordering into one of the following four orientation values (borrowed from Nagata et al. [sent-137, score-0.6]

44 projections of the function word and the neighboring phrase are adjacent or separated by an intervening phrase. [sent-141, score-0.311]
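
Sentences 43-44 suggest that the orientation value combines two binary distinctions: whether the neighbor's projection keeps or reverses its order relative to the function word, and whether the two projections are adjacent or separated by a gap. The sketch below illustrates this for the left neighbor; the concrete labels (MA/MG/RA/RG, in the spirit of Nagata et al.'s scheme) and the half-open span convention are assumptions, and the right neighbor would be handled symmetrically.

```python
def orientation_left(left_span, fw_span):
    """Orientation of the left neighbor's target-side projection relative to
    the function word's projection. Spans are half-open (start, end) target
    index ranges. Labels: M/R = monotone/reverse, A/G = adjacent/gap."""
    l_start, l_end = left_span
    f_start, f_end = fw_span
    if l_end <= f_start:                      # still to the left: monotone
        return "MA" if l_end == f_start else "MG"
    if l_start >= f_end:                      # jumped to the right: reverse
        return "RA" if l_start == f_end else "RG"
    return None  # overlapping projections are not handled in this sketch

# The left neighbor lands two positions past the function word: reverse, gap.
assert orientation_left((7, 9), (3, 5)) == "RG"
```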

45 whether Li,S→T and Ri,S→T extend beyond the neighboring function word phrase pairs, we utilize the pairwise dominance model of Setiawan et al. [sent-144, score-0.425]

46 Formally, this model is similar to the pairwise dominance model, except that we use the sentence boundaries as the anchors instead of the neighboring phrase pairs. [sent-153, score-0.354]

47 As a result, there are six reordering models based on function words. [sent-162, score-0.529]

48 2 Prediction and Parameter Estimation Given FWi−1,S→T (and all other FWi′≠i,S→T), our reordering model has to decompose Li,S→T into (o(Li,S→T), d(FWi−1,S→T), b(⟨s⟩)); and Ri,S→T into (o(Ri,S→T), d(FWi+1,S→T), b(⟨/s⟩)) during prediction and parameter estimation. [sent-164, score-0.56]

49 In addition to the six reordering models, our model employs several association-based scores that look at alignments in isolation. [sent-185, score-0.739]

50 The use of this feature is widespread in recent alignment models, since it provides a relatively accurate initial prediction. [sent-194, score-0.407]

51 This feature encourages our alignment model to reuse alignment points that are part of the alignments created by the grow-diag-final heuristic, which we used as the baseline of our machine translation experiments. [sent-202, score-1.183]

52 The intuition is to penalize NULL alignments depending on word class, by assigning lower probability mass to unaligned content words than to unaligned function words. [sent-211, score-0.412]
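
As a concrete illustration of this word-class-sensitive NULL penalty, the sketch below counts unaligned tokens separately by class so that a discriminative model can weight the two counts differently; the function name and feature keys are hypothetical.

```python
def null_alignment_features(tokens, aligned_positions, function_words):
    """Count unaligned tokens by word class. A discriminative model can then
    assign a harsher weight to unaligned content words than to unaligned
    function words, implementing the intuition in sentence 52."""
    counts = {"null_content": 0, "null_function": 0}
    for pos, tok in enumerate(tokens):
        if pos in aligned_positions:
            continue
        key = "null_function" if tok in function_words else "null_content"
        counts[key] += 1
    return counts

# Example: "de" is an unaligned function word, "gou" an unaligned content word.
feats = null_alignment_features(["gou", "de"], set(), {"de"})
assert feats == {"null_content": 1, "null_function": 1}
```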

53 Note that with the exception of the alignment bonus feature (4), all features are uni-directional, and therefore we employ these features in both directions just as was done for the reordering models. [sent-213, score-0.936]

54 9, it is necessary to search different alignment configurations, and, because of the non-local dependencies in some of our features, it is not possible to use dynamic programming to perform this search efficiently. [sent-215, score-0.499]

55 We use a local search procedure which starts from some alignment (in our case, a symmetrized Model 4 alignment) and makes local changes to it. [sent-217, score-0.52]

56 1 Algorithm To find Â, our search algorithm starts with an initial alignment A(1) and iteratively draws a new set by making a few small changes to the current set. [sent-221, score-0.453]

57 For each step i = [1, n], with alignment A(i), a set of neighboring alignments N(A(i)) is induced by applying small transformations (discussed below) to the current alignment. [sent-222, score-0.697]

58 After n steps, the algorithm returns Amax as its approximation of Â. In the experiments reported below, we initialized A(1) with the Model 4 alignments symmetrized by using the grow-diag-final-and heuristic (Koehn et al. [sent-224, score-0.312]
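
Sentences 55-58 outline a stochastic hill-climbing loop. A minimal sketch follows; the greedy move rule, the sampled-neighborhood size, and all names are illustrative assumptions rather than the paper's exact procedure.

```python
import random

def local_search(initial, neighbors, score, n_steps=1000, sample_size=20):
    """Stochastic local search over alignments: start from an initial
    alignment (e.g. the symmetrized Model 4 alignment), sample a handful of
    neighboring alignments at each step, move to the best-scoring candidate,
    and remember the best alignment seen overall (A_max). Assumes the
    neighborhood returned by `neighbors` is never empty."""
    current = best = initial
    best_score = score(initial)
    for _ in range(n_steps):
        pool = list(neighbors(current))  # N(A^(i)) from the local transforms
        sampled = random.sample(pool, min(sample_size, len(pool)))
        current = max(sampled, key=score)  # stochastic greedy step
        current_score = score(current)
        if current_score > best_score:     # track A_max
            best, best_score = current, current_score
    return best
```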

59 2 Alignment Neighborhoods We now turn to a discussion of how the alignment neighborhoods used by our stochastic search algo- rithm are generated. [sent-228, score-0.527]

60 We define three local transformation operations that apply to single columns of the alignment grid (which represent all of the alignments to the lth source word), rows, or existing alignment points (l, m). [sent-229, score-1.053]

61 The ALIGN operator applies to the lth column of A and can either add an alignment point (l, m′) or move an existing one (including to null, thus deleting it). [sent-231, score-0.473]

62 ALIGNEXCLUSIVE adds an alignment point (l, m) and deletes all other points from row m. [sent-232, score-0.435]

63 Finally, the SWAP operator swaps (l, m) and (l′, m′), resulting in new alignment points (l, m′) and (l′, m). [sent-233, score-0.469]

64 By iterating over all columns l and rows m, the full alignment space A(S, T) can be explored. [sent-237, score-0.407]

65 Using only the ALIGN operator, it is possible to explore the full alignment space; however, using all three operators increases mobility. [sent-239, score-0.447]
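
Below is one possible reading of the three transforms from sentences 61-63 as set operations on alignment points, assuming ALIGN reassigns the whole lth column (at most one point per column in this sketch); the function names mirror the paper's operator names, but the implementation details are assumptions.

```python
def align(A, l, m_new):
    """ALIGN: reassign column l to target position m_new; m_new=None aligns
    the lth source word to NULL (deletes the column's alignment points)."""
    B = {(i, j) for (i, j) in A if i != l}  # clear column l
    if m_new is not None:
        B.add((l, m_new))
    return B

def align_exclusive(A, l, m):
    """ALIGNEXCLUSIVE: add (l, m) and delete all other points in row m."""
    return {(i, j) for (i, j) in A if j != m} | {(l, m)}

def swap(A, l, m, l2, m2):
    """SWAP: replace points (l, m) and (l2, m2) with (l, m2) and (l2, m)."""
    return (set(A) - {(l, m), (l2, m2)}) | {(l, m2), (l2, m)}
```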

66 The solid circles represent the new alignment points added to A(i+1). [sent-242, score-0.435]

67 6 Discriminative Training To set the model parameters θ, we used the minimum error rate training (MERT) algorithm (Och, 2003) to maximize the F-measure of the 1-best alignment of the model on a development set consisting of sentence pairs with manually generated alignments. [sent-243, score-0.511]

68 First, we can optimize F-measure of the alignments directly, which has been shown to correlate with translation quality in a downstream system (Fraser and Marcu, 2007b). [sent-250, score-0.31]
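
For reference, the alpha-weighted alignment F-measure being optimized can be computed over sets of alignment points as below, following the Fraser and Marcu formulation; the function name is illustrative.

```python
def alignment_f_measure(predicted, gold, alpha=0.5):
    """F(A, G; alpha) = 1 / (alpha / precision + (1 - alpha) / recall),
    with precision = |A & G| / |A| and recall = |A & G| / |G|.
    alpha = 0.5 reduces to the usual harmonic mean. Inputs are sets of
    (source, target) alignment points."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)

# Toy check: perfect agreement gives F = 1.
assert alignment_f_measure({(0, 0), (1, 2)}, {(0, 0), (1, 2)}) == 1.0
```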

69 Although MERT is a non-probabilistic optimizer, we explore the alignment space stochastically. [sent-255, score-0.407]

70 7 Experiments We evaluated our proposed alignment model intrinsically on an alignment task and extrinsically on a large-scale translation task, focusing on ChineseEnglish as the language pair. [sent-258, score-0.976]

71 The manually-aligned corpora are primarily used for training the reordering models and for discriminative training purposes. [sent-261, score-0.674]

72 We used cdec (Dyer et al., 2010), a fast implementation of hierarchical phrase-based translation models (Chiang, 2005), which represents a state-of-the-art translation system. [sent-263, score-0.319]

73 For the alignment experiments, we took the first 500 sentence pairs from the newswire genre of the manually-aligned corpora and used the first 250 sentences as the development set, with the remaining 250 as the test set. [sent-266, score-0.482]

74 To ensure blind experimentation, we excluded these sentence pairs from the training of the features, including the reordering models. [sent-267, score-0.571]

75 1 Alignment Quality We used GIZA++, the implementation of the defacto standard IBM alignment model, as our baseline alignment model. [sent-269, score-0.814]

76 We recorded the alignment quality of the test set as our baseline performance. [sent-273, score-0.407]

77 For our alignment model, we used the same set of training data. [sent-274, score-0.407]

78 To align the test set, we first tuned the weights of the features in our discriminative alignment model using minimum error rate training (MERT) (Och, 2003) with Fα=0. [sent-275, score-0.546]

79 Once tuned, we ran our aligner on the test set and measured the quality of the resulting alignment as the performance of our model. [sent-279, score-0.407]

80 1) for our discriminative reordering models with various features (lines 2-5) versus the baseline IBM word-based Model 4 symmetrized using the grow-diag-final-and heuristic. [sent-287, score-0.671]

81 Table 1 reports the results of our experiments, which are conducted in an incremental fashion primarily to highlight the role of reordering modeling. [sent-291, score-0.566]

82 In the second set of experiments, we added the reordering models into our discriminative model one by one, starting with the orientation models, then the pairwise dominance model and finally the borderwise dominance model, reported in lines +ori, +dom and +bdom respectively. [sent-296, score-1.103]

83 As shown, each additional reordering model provides a significant additional improvement. [sent-297, score-0.56]

84 The best result is obtained by employing all reordering models. [sent-298, score-0.529]

85 These results empirically confirm our hypothesis that we can improve alignment quality by employing reordering models that capture non-local reordering phenomena. [sent-299, score-1.494]

86 Table 2: The translation performance (BLEU) of hierarchical phrase-based translation trained on training data aligned by IBM Model 4 symmetrized with the grow-diag-final-and heuristic, versus being trained on alignments by our discriminative alignment model. [sent-314, score-1.02]

87 In our alignment model, we employed the whole set of reordering models, i. [sent-317, score-0.936]

88 As shown, our discriminative alignment model produces a consistent and significant improvement over the baseline IBM model 4 (p < 0. [sent-320, score-0.515]

89 8 Related Work The focus of our work is to strengthen the reordering component of alignment modeling. [sent-324, score-0.936]

90 Although the de facto standard, the IBM models do not generalize well in practice: the IBM approach employs a series of reordering models based on the word’s position, but reordering depends on syntactic context rather than absolute position in the sentence. [sent-325, score-1.148]

91 Over the years, there have been many proposals to improve these reordering models, most notably Vogel et al. [sent-326, score-0.529]

92 Alignment modeling is challenging because it often has to consider a prohibitively large alignment space. [sent-329, score-0.44]

93 Our reordering model is closely related to the model proposed by Zhang and Gildea (2005; 2006; 2007a), with respect to conditioning the reordering predictions on lexical items. [sent-337, score-1.12]

94 With respect to the focus on function words, our reordering model is closely related to the UALIGN system (Hermjakob, 2009). [sent-339, score-0.631]

95 We use the notion of function words to infer such regularities, resulting in several reordering models that are employed as features in a discriminative alignment model. [sent-342, score-1.112]

96 In particular, our models predict the reordering of function words by looking at their dependencies with respect to their neighboring phrases, their neighboring function words, and the sentence boundaries. [sent-343, score-0.922]

97 By capturing such long-distance dependencies, our proposed alignment model contributes to the effort to unify alignment and translation. [sent-344, score-0.845]

98 Our experiments demonstrate that our alignment approach achieves both its intrinsic and extrinsic goals. [sent-345, score-0.407]

99 Soft syntactic constraints for word alignment through discriminative training. [sent-369, score-0.453]

100 A clustered global phrase reordering model for statistical machine translation. [sent-430, score-0.61]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('reordering', 0.529), ('alignment', 0.407), ('fwi', 0.283), ('yi', 0.205), ('alignments', 0.179), ('setiawan', 0.151), ('translation', 0.131), ('dominance', 0.129), ('dontcare', 0.113), ('neighboring', 0.111), ('mert', 0.108), ('hsi', 0.097), ('hendra', 0.094), ('ibm', 0.091), ('unaligned', 0.081), ('adjacent', 0.079), ('borderwise', 0.075), ('orientation', 0.071), ('function', 0.071), ('symmetrized', 0.067), ('fraser', 0.065), ('align', 0.062), ('blocks', 0.062), ('phrases', 0.057), ('aligning', 0.057), ('alignexclusive', 0.057), ('bdom', 0.057), ('leftfirst', 0.057), ('pdom', 0.057), ('pori', 0.057), ('span', 0.054), ('monolingual', 0.051), ('blunsom', 0.05), ('phrase', 0.05), ('llr', 0.048), ('pauls', 0.048), ('side', 0.048), ('itg', 0.047), ('transduction', 0.047), ('discriminative', 0.046), ('search', 0.046), ('stochastic', 0.045), ('association', 0.042), ('pairs', 0.042), ('chinese', 0.042), ('inversion', 0.041), ('ma', 0.041), ('chineseenglish', 0.04), ('mm', 0.04), ('subscript', 0.04), ('operators', 0.04), ('decomposition', 0.04), ('giza', 0.038), ('allmonotone', 0.038), ('cdec', 0.038), ('discourage', 0.038), ('hillclimbing', 0.038), ('manuallyaligned', 0.038), ('mg', 0.038), ('nagata', 0.038), ('nonmonotone', 0.038), ('ualign', 0.038), ('pilot', 0.038), ('regularities', 0.038), ('thousand', 0.038), ('identity', 0.037), ('primarily', 0.037), ('null', 0.037), ('och', 0.036), ('sides', 0.035), ('marcu', 0.035), ('heuristic', 0.035), ('configuration', 0.034), ('operator', 0.034), ('corpora', 0.033), ('modeling', 0.033), ('pairwise', 0.033), ('facto', 0.032), ('borders', 0.032), ('lth', 0.032), ('resemblance', 0.032), ('dyer', 0.032), ('bilingual', 0.031), ('approximation', 0.031), ('vogel', 0.031), ('model', 0.031), ('singapore', 0.03), ('infer', 0.03), ('discriminatively', 0.03), ('denero', 0.03), ('gap', 0.029), ('monotone', 0.029), ('neighborhoods', 0.029), ('models', 0.029), ('suntec', 0.028), ('points', 0.028), ('hierarchical', 0.028), ('estimation', 0.028), ('subscripts', 0.027), ('swap', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 36 emnlp-2010-Discriminative Word Alignment with a Function Word Reordering Model

Author: Hendra Setiawan ; Chris Dyer ; Philip Resnik

Abstract: We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manually-aligned corpora. To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to state-of-the-art translation models. Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task.

2 0.31436443 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities

Author: Adria de Gispert ; Juan Pino ; William Byrne

Abstract: We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distributions provided by the HMM word-to-word alignment model. We define translation grammars progressively by adding classes of rules to a basic phrase-based system. We assess these grammars in terms of their expressive power, measured by their ability to align the parallel text from which their rules are extracted, and the quality of the translations they yield. In Chinese-to-English translation, we find that rule extraction from posteriors gives translation improvements. We also find that grammars with rules with only one nonterminal, when extracted from posteriors, can outperform more complex grammars extracted from Viterbi alignments. Finally, we show that the best way to exploit source-to-target and target-to-source alignment models is to build two separate systems and combine their output translation lattices.

3 0.26267874 76 emnlp-2010-Maximum Entropy Based Phrase Reordering for Hierarchical Phrase-Based Translation

Author: Zhongjun He ; Yao Meng ; Hao Yu

Abstract: Hierarchical phrase-based (HPB) translation provides a powerful mechanism to capture both short and long distance phrase reorderings. However, the phrase reorderings lack contextual information in conventional HPB systems. This paper proposes a context-dependent phrase reordering approach that uses the maximum entropy (MaxEnt) model to help the HPB decoder select appropriate reordering patterns. We classify translation rules into several reordering patterns, and build a MaxEnt model for each pattern based on various contextual features. We integrate the MaxEnt models into the HPB model. Experimental results show that our approach achieves significant improvements over a standard HPB system on large-scale translation tasks. On Chinese-to-English translation, the absolute improvements in BLEU (case-insensitive) range from 1.2 to 2.1.

4 0.24395779 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

Author: Jinxi Xu ; Antti-Veikko Rosti

Abstract: Word alignment plays a central role in statistical MT (SMT) since almost all SMT systems extract translation rules from word aligned parallel training data. While most SMT systems use unsupervised algorithms (e.g. GIZA++) for training word alignment, supervised methods, which exploit a small amount of human-aligned data, have become increasingly popular recently. This work empirically studies the performance of these two classes of alignment algorithms and explores strategies to combine them to improve overall system performance. We used two unsupervised aligners, GIZA++ and HMM, and one supervised aligner, ITG, in this study. To avoid language and genre specific conclusions, we ran experiments on test sets consisting of two language pairs (Chinese-to-English and Arabic-to-English) and two genres (newswire and weblog). Results show that the two classes of algorithms achieve the same level of MT performance. Modest improvements were achieved by taking the union of the translation grammars extracted from different alignments. Significant improvements (around 1.0 in BLEU) were achieved by combining outputs of different systems trained with different alignments. The improvements are consistent across languages and genres.

5 0.22269489 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

Author: Samuel Brody

Abstract: We reveal a previously unnoticed connection between dependency parsing and statistical machine translation (SMT), by formulating the dependency parsing task as a problem of word alignment. Furthermore, we show that two well known models for these respective tasks (DMV and the IBM models) share common modeling assumptions. This motivates us to develop an alignment-based framework for unsupervised dependency parsing. The framework (which will be made publicly available) is flexible, modular and easy to extend. Using this framework, we implement several algorithms based on the IBM alignment models, which prove surprisingly effective on the dependency parsing task, and demonstrate the potential of the alignment-based approach.

6 0.1631109 3 emnlp-2010-A Fast Fertility Hidden Markov Model for Word Alignment Using MCMC

7 0.16043217 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding

8 0.12371463 63 emnlp-2010-Improving Translation via Targeted Paraphrasing

9 0.11810278 50 emnlp-2010-Facilitating Translation Using Source Language Paraphrase Lattices

10 0.11371531 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

11 0.10941826 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

12 0.10823346 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning

13 0.10666285 47 emnlp-2010-Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation

14 0.10064247 5 emnlp-2010-A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

15 0.098060004 68 emnlp-2010-Joint Inference for Bilingual Semantic Role Labeling

16 0.094918653 39 emnlp-2010-EMNLP 044

17 0.08222767 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

18 0.079859197 99 emnlp-2010-Statistical Machine Translation with a Factorized Grammar

19 0.076552637 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

20 0.076155245 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.309), (1, -0.259), (2, 0.104), (3, -0.138), (4, 0.119), (5, -0.343), (6, -0.038), (7, -0.025), (8, -0.054), (9, 0.208), (10, -0.181), (11, 0.052), (12, 0.007), (13, -0.028), (14, -0.033), (15, 0.047), (16, 0.008), (17, -0.05), (18, -0.041), (19, -0.083), (20, 0.023), (21, 0.091), (22, 0.014), (23, -0.135), (24, 0.131), (25, -0.083), (26, -0.02), (27, -0.05), (28, 0.165), (29, 0.005), (30, -0.095), (31, -0.124), (32, 0.165), (33, 0.055), (34, -0.092), (35, 0.037), (36, -0.046), (37, 0.023), (38, -0.024), (39, -0.038), (40, 0.088), (41, -0.025), (42, 0.046), (43, -0.031), (44, 0.042), (45, 0.038), (46, -0.048), (47, -0.009), (48, -0.005), (49, -0.114)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96494108 36 emnlp-2010-Discriminative Word Alignment with a Function Word Reordering Model

Author: Hendra Setiawan ; Chris Dyer ; Philip Resnik

Abstract: We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manually-aligned corpora. To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to state-of-the-art translation models. Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task.

2 0.74869305 76 emnlp-2010-Maximum Entropy Based Phrase Reordering for Hierarchical Phrase-Based Translation

Author: Zhongjun He ; Yao Meng ; Hao Yu

Abstract: Hierarchical phrase-based (HPB) translation provides a powerful mechanism to capture both short and long distance phrase reorderings. However, the phrase reorderings lack contextual information in conventional HPB systems. This paper proposes a context-dependent phrase reordering approach that uses the maximum entropy (MaxEnt) model to help the HPB decoder select appropriate reordering patterns. We classify translation rules into several reordering patterns, and build a MaxEnt model for each pattern based on various contextual features. We integrate the MaxEnt models into the HPB model. Experimental results show that our approach achieves significant improvements over a standard HPB system on large-scale translation tasks. On Chinese-to-English translation, the absolute improvements in BLEU (case-insensitive) range from 1.2 to 2.1.

3 0.69141191 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

Author: Jinxi Xu ; Antti-Veikko Rosti

Abstract: Word alignment plays a central role in statistical MT (SMT) since almost all SMT systems extract translation rules from word aligned parallel training data. While most SMT systems use unsupervised algorithms (e.g. GIZA++) for training word alignment, supervised methods, which exploit a small amount of human-aligned data, have become increasingly popular recently. This work empirically studies the performance of these two classes of alignment algorithms and explores strategies to combine them to improve overall system performance. We used two unsupervised aligners, GIZA++ and HMM, and one supervised aligner, ITG, in this study. To avoid language and genre specific conclusions, we ran experiments on test sets consisting of two language pairs (Chinese-to-English and Arabic-to-English) and two genres (newswire and weblog). Results show that the two classes of algorithms achieve the same level of MT performance. Modest improvements were achieved by taking the union of the translation grammars extracted from different alignments. Significant improvements (around 1.0 in BLEU) were achieved by combining outputs of different systems trained with different alignments. The improvements are consistent across languages and genres.

4 0.68209046 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities

Author: Adria de Gispert ; Juan Pino ; William Byrne

Abstract: We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distributions provided by the HMM word-to-word alignment model. We define translation grammars progressively by adding classes of rules to a basic phrase-based system. We assess these grammars in terms of their expressive power, measured by their ability to align the parallel text from which their rules are extracted, and the quality of the translations they yield. In Chinese-to-English translation, we find that rule extraction from posteriors gives translation improvements. We also find that grammars with rules with only one nonterminal, when extracted from posteriors, can outperform more complex grammars extracted from Viterbi alignments. Finally, we show that the best way to exploit source-to-target and target-to-source alignment models is to build two separate systems and combine their output translation lattices.

5 0.56641203 3 emnlp-2010-A Fast Fertility Hidden Markov Model for Word Alignment Using MCMC

Author: Shaojun Zhao ; Daniel Gildea

Abstract: A word in one language can be translated to zero, one, or several words in other languages. Using word fertility features has been shown to be useful in building word alignment models for statistical machine translation. We built a fertility hidden Markov model by adding fertility to the hidden Markov model. This model not only achieves lower alignment error rate than the hidden Markov model, but also runs faster. It is similar in some ways to IBM Model 4, but is much easier to understand. We use Gibbs sampling for parameter estimation, which is more principled than the neighborhood method used in IBM Model 4.

6 0.50541049 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

7 0.47359532 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding

8 0.37485623 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

9 0.36128047 5 emnlp-2010-A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

10 0.34942907 68 emnlp-2010-Joint Inference for Bilingual Semantic Role Labeling

11 0.32855004 99 emnlp-2010-Statistical Machine Translation with a Factorized Grammar

12 0.31044275 50 emnlp-2010-Facilitating Translation Using Source Language Paraphrase Lattices

13 0.30322674 39 emnlp-2010-EMNLP 044

14 0.30225375 63 emnlp-2010-Improving Translation via Targeted Paraphrasing

15 0.29884747 47 emnlp-2010-Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation

16 0.28604233 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

17 0.27166599 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

18 0.25798836 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

19 0.2565375 22 emnlp-2010-Automatic Evaluation of Translation Quality for Distant Language Pairs

20 0.25064617 33 emnlp-2010-Cross Language Text Classification by Model Translation and Semi-Supervised Learning


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.012), (12, 0.025), (29, 0.091), (30, 0.027), (32, 0.018), (52, 0.472), (56, 0.064), (62, 0.014), (66, 0.083), (72, 0.042), (76, 0.032), (79, 0.011), (83, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.88459498 39 emnlp-2010-EMNLP 044

Author: George Foster

Abstract: We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines.

same-paper 2 0.86615562 36 emnlp-2010-Discriminative Word Alignment with a Function Word Reordering Model

Author: Hendra Setiawan ; Chris Dyer ; Philip Resnik

Abstract: We address the modeling, parameter estimation and search challenges that arise from the introduction of reordering models that capture non-local reordering in alignment modeling. In particular, we introduce several reordering models that utilize (pairs of) function words as contexts for alignment reordering. To address the parameter estimation challenge, we propose to estimate these reordering models from a relatively small amount of manually-aligned corpora. To address the search challenge, we devise an iterative local search algorithm that stochastically explores reordering possibilities. By capturing non-local reordering phenomena, our proposed alignment model bears a closer resemblance to state-of-the-art translation models. Empirical results show significant improvements in alignment quality as well as in translation performance over baselines in a large-scale Chinese-English translation task.

3 0.56638259 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

Author: Zhongqiang Huang ; Martin Cmejrek ; Bowen Zhou

Abstract: In this paper, we present a novel approach to enhance hierarchical phrase-based machine translation systems with linguistically motivated syntactic features. Rather than directly using treebank categories as in previous studies, we learn a set of linguistically-guided latent syntactic categories automatically from a source-side parsed, word-aligned parallel corpus, based on the hierarchical structure among phrase pairs as well as the syntactic structure of the source side. In our model, each X nonterminal in an SCFG rule is decorated with a real-valued feature vector computed based on its distribution of latent syntactic categories. These feature vectors are utilized at decoding time to measure the similarity between the syntactic analysis of the source side and the syntax of the SCFG rules that are applied to derive translations. Our approach maintains the advantages of hierarchical phrase-based translation systems while at the same time naturally incorporates soft syntactic constraints.

4 0.53752995 57 emnlp-2010-Hierarchical Phrase-Based Translation Grammars Extracted from Alignment Posterior Probabilities

Author: Adria de Gispert ; Juan Pino ; William Byrne

Abstract: We report on investigations into hierarchical phrase-based translation grammars based on rules extracted from posterior distributions over alignments of the parallel text. Rather than restrict rule extraction to a single alignment, such as Viterbi, we instead extract rules based on posterior distributions provided by the HMM word-to-word alignment model. We define translation grammars progressively by adding classes of rules to a basic phrase-based system. We assess these grammars in terms of their expressive power, measured by their ability to align the parallel text from which their rules are extracted, and the quality of the translations they yield. In Chinese-to-English translation, we find that rule extraction from posteriors gives translation improvements. We also find that grammars with rules with only one nonterminal, when extracted from posteriors, can outperform more complex grammars extracted from Viterbi alignments. Finally, we show that the best way to exploit source-to-target and target-to-source alignment models is to build two separate systems and combine their output translation lattices.

5 0.53671521 76 emnlp-2010-Maximum Entropy Based Phrase Reordering for Hierarchical Phrase-Based Translation

Author: Zhongjun He ; Yao Meng ; Hao Yu

Abstract: Hierarchical phrase-based (HPB) translation provides a powerful mechanism to capture both short and long distance phrase reorderings. However, the phrase reorderings lack contextual information in conventional HPB systems. This paper proposes a context-dependent phrase reordering approach that uses the maximum entropy (MaxEnt) model to help the HPB decoder select appropriate reordering patterns. We classify translation rules into several reordering patterns, and build a MaxEnt model for each pattern based on various contextual features. We integrate the MaxEnt models into the HPB model. Experimental results show that our approach achieves significant improvements over a standard HPB system on large-scale translation tasks. On Chinese-to-English translation, the absolute improvements in BLEU (case-insensitive) range from 1.2 to 2.1.

6 0.50973803 29 emnlp-2010-Combining Unsupervised and Supervised Alignments for MT: An Empirical Study

7 0.4997777 18 emnlp-2010-Assessing Phrase-Based Translation Models with Oracle Decoding

8 0.46895975 3 emnlp-2010-A Fast Fertility Hidden Markov Model for Word Alignment Using MCMC

9 0.46493956 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

10 0.44937477 5 emnlp-2010-A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

11 0.44449934 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

12 0.43939352 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

13 0.43469343 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

14 0.42873684 68 emnlp-2010-Joint Inference for Bilingual Semantic Role Labeling

15 0.41421351 47 emnlp-2010-Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation

16 0.41364375 104 emnlp-2010-The Necessity of Combining Adaptation Methods

17 0.41120347 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

18 0.40924591 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

19 0.40919796 80 emnlp-2010-Modeling Organization in Student Essays

20 0.40835187 87 emnlp-2010-Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space