acl acl2011 acl2011-69 knowledge-graph by maker-knowledge-mining

69 acl-2011-Clause Restructuring For SMT Not Absolutely Helpful


Source: pdf

Author: Susan Howlett ; Mark Dras

Abstract: There are a number of systems that use a syntax-based reordering step prior to phrase-based statistical MT. An early work proposing this idea showed improved translation performance, but subsequent work has had mixed results. Speculations as to cause have suggested the parser, the data, or other factors. We systematically investigate possible factors to give an initial answer to the question: Under what conditions does this use of syntax help PSMT?

Reference: text


Summary: the most important sentences generated by the tf-idf model

sentIndex sentText sentNum sentScore

1 Clause Restructuring For SMT Not Absolutely Helpful Susan Howlett and Mark Dras, Centre for Language Technology, Macquarie University, Sydney, Australia. [sent-1, score-0.024]

2 Abstract There are a number of systems that use a syntax-based reordering step prior to phrase-based statistical MT. [sent-4, score-0.196]

3 An early work proposing this idea showed improved translation performance, but subsequent work has had mixed results. [sent-5, score-0.081]

4 Speculations as to cause have suggested the parser, the data, or other factors. [sent-6, score-0.025]

5 We systematically investigate possible factors to give an initial answer to the question: Under what conditions does this use of syntax help PSMT? [sent-7, score-0.037]

6 1 Introduction Phrase-based statistical machine translation (PSMT) translates documents from one human language to another by dividing text into contiguous sequences of words (phrases), translating each, and finally reordering them according to a distortion model. [sent-8, score-0.335]
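
To make the distortion model concrete, here is a minimal, hypothetical Python sketch (the names and the exact penalty form are assumptions, not the formulation of any system discussed here): it scores an ordering of source phrase spans by how far the decoder jumps between consecutively translated phrases.

```python
# Toy distance-based distortion penalty of the kind used in standard
# phrase-based decoders. All names here are hypothetical.

def distortion_penalty(phrase_spans):
    """Sum of jump distances over an ordering of source phrase spans.

    phrase_spans: (start, end) source-side spans, listed in the order in
    which their translations are emitted. Monotone translation costs 0.
    """
    total, prev_end = 0, 0
    for start, end in phrase_spans:
        total += abs(start - prev_end)  # distance jumped from the last phrase
        prev_end = end
    return total

monotone = [(0, 2), (2, 4), (4, 6)]   # phrases translated left to right
jumping  = [(4, 6), (0, 2), (2, 4)]   # clause-final material translated first
print(distortion_penalty(monotone), distortion_penalty(jumping))  # 0 10
```

Because the penalty grows with jump distance, long-distance movements of the kind German–English translation needs are exactly what such a model discourages, which motivates doing the reordering before translation instead.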

7 The PSMT distortion model typically does not consider linguistic information, and as such encounters difficulty in language pairs that require specific long-distance reorderings, such as German–English. [sent-9, score-0.12]

8 Collins et al. (2005) address this problem by reordering German sentences to more closely parallel English word order, prior to translation by a PSMT system. [sent-11, score-0.243]

9 They find that this reordering-as-preprocessing approach results in a significant improvement in translation performance over the baseline. [sent-12, score-0.078]

10 However, there have been several other systems using the reordering-as-preprocessing approach, and they have met with mixed success. [sent-13, score-0.036]

11 We systematically explore possible explanations for these contradictory results, and conclude that, while reordering is helpful for some sentences, potential improvement can be eroded by many aspects of the PSMT system, independent of the reordering. [sent-14, score-0.301]

12 Second, the parse is used to permute the words according to some reordering rules, which may be automatically or manually determined. [sent-22, score-0.194]

13 Finally, a phrase-based SMT system is trained and tested using the reordered sentences as input, in place of the original sentences. [sent-23, score-0.301]
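
To make the pipeline of sentences 12–13 concrete, here is a minimal sketch under invented assumptions: a clause represented as tokens with simplified Tiger-style tags, and a single toy verb-movement rule in the spirit of (but not identical to) the hand-crafted Collins et al. rules, which operate on full parse trees.

```python
# Sketch of reordering-as-preprocessing with one invented rule. The real
# systems apply hand-crafted or learned rules to full parse trees; this
# toy version works on a flat tag sequence purely for illustration.

def reorder_clause(tokens, tags):
    """Move a clause-final finite verb to just after the subject (SB)."""
    if tags and tags[-1] == "V":
        verb = tokens[-1]
        subj_end = tags.index("SB") + 1 if "SB" in tags else 0
        return tokens[:subj_end] + [verb] + tokens[subj_end:-1]
    return tokens

# German subordinate clause: "dass er das Buch gelesen hat"
tokens = ["dass", "er", "das", "Buch", "gelesen", "hat"]
tags = ["KOUS", "SB", "ART", "NN", "VVPP", "V"]
print(" ".join(reorder_clause(tokens, tags)))
# -> "dass er hat das Buch gelesen", closer to English word order

# The PSMT system is then trained and tested on the reordered source side:
# reordered_corpus = [reorder_clause(t, g) for t, g in parsed_corpus]
```

Note that even this toy rule needs the function label SB; sentence 41 below points out that parser performance typically drops when such function labels have to be learnt.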

14 Xia and McCord (2004) (English-to-French translation, using automatically-extracted reordering rules) train on the Canadian Hansard. [sent-25, score-0.196]

15 On a Hansard test set, an improvement over the baseline was only seen if the translation system’s phrase table was restricted to phrases of length at most four. [sent-26, score-0.139]

16 On a news test set, the reordered system performed significantly better than the baseline regardless of the maximum length of phrases. [sent-27, score-0.385]

17 However, this improvement was only apparent with monotonic decoding; when using a distortion model, the difference disappeared. [sent-28, score-0.153]

18 Xia and McCord attribute the drop-off in performance on the Hansard set to similarity of training and test data. [sent-29, score-0.053]

19 Collins et al. (2005) (German-to-English) use six hand-crafted reordering rules targeting the placement of verbs, subjects, particles and negation. [sent-31, score-0.211]

20 They train and evaluate their system on Europarl text and obtain a BLEU score (Papineni et al., 2002). [sent-32, score-0.061]

21 A human evaluation confirms that reordered translations are generally (but not universally) better. [sent-36, score-0.238]

22 Xu et al. (2009) report significant improvements applying one set of hand-crafted rules to translation from English to each of five SOV languages. [sent-38, score-0.086]

23 (2007) (Chinese-to-English, hand-crafted rules) report a significant improvement over the baseline system on the NIST 2006 test set, using a distance-based distortion model. [sent-42, score-0.249]

24 Similar results are mentioned in passing for a lexicalised distortion model. [sent-43, score-0.175]

25 Also on news text, Habash (2007) (automatically-extracted rules, Arabic-to-English) reports a very large improvement when phrases are limited to length 1 and translation is monotonic. [sent-44, score-0.129]

26 However, allowing phrases up to 7 words in length or using a distance-based distortion model causes the difference in performance to disappear. [sent-45, score-0.12]

27 He also includes oracle experiments, in which each system outperforms the other on 40–50% of sentences, suggesting that reordering is useful for many sentences. [sent-47, score-0.305]
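
One common way to realise such an oracle (a sketch of the general idea, not necessarily the exact procedure of Habash or of Howlett and Dras) is per-sentence selection between the two systems' outputs by a sentence-level metric. A minimal version, assuming NLTK's BLEU implementation is available:

```python
# Per-sentence oracle between a baseline and a reordered system: keep
# whichever hypothesis scores higher against the reference. Inputs are
# lists of token lists; any sentence-level metric would work here.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smooth = SmoothingFunction().method1  # avoid zero scores on short sentences

def oracle_select(baseline_hyps, reordered_hyps, references):
    chosen, wins = [], 0
    for base, reord, ref in zip(baseline_hyps, reordered_hyps, references):
        b = sentence_bleu([ref], base, smoothing_function=smooth)
        r = sentence_bleu([ref], reord, smoothing_function=smooth)
        chosen.append(reord if r > b else base)
        wins += r > b
    return chosen, wins  # wins / len(chosen) ~ how often reordering helps
```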

28 Zwarts and Dras (2007) implement six rules for Dutch-to-English translation, analogous to those of Collins et al. [sent-48, score-0.041]

29 Considering only their baseline and reordered systems, the improvement is from 20. [sent-50, score-0.302]

30 8; they attribute their poor result to the parser used. [sent-52, score-0.112]

31 Howlett and Dras (2010) reimplement the Collins et al. [sent-53, score-0.024]

32 In addition to their main system, they give results for the baseline and reordered systems, training and testing on Europarl and news text. [sent-55, score-0.32]

33 04 for the reordered system, below the baseline at 20. [sent-58, score-0.269]

34 They explain their lower absolute scores as a consequence of the different test set, but do not explore the reversal in conclusion. [sent-60, score-0.054]

35 Like Habash (2007), Howlett and Dras (2010) include oracle experiments which demonstrate that the reordering is useful for some sentences. [sent-61, score-0.27]

36 Possible explanations for the difference are differences in the reordering process, from either parser performance or implementation of the rules, and differences in the translation process, including PSMT system setup and data used. [sent-64, score-0.438]

37 We examine parser performance in §3 and the remaining possibilities in §4–5. [sent-65, score-0.089]

38 directly comparable; see the text for details. [sent-85, score-0.024]

39 HD instead uses the Berkeley parser (Petrov et al. [sent-89, score-0.089]

40 Note that the CKK reordering requires not just category labels (e.g. [sent-93, score-0.17]

41 SB for subject); parser performance typically goes down when these are learnt, due to sparsity. [sent-97, score-0.089]

42 Dubey and Keller (2003) train and test on the Negra corpus, with 18,602 sentences for training, 1,000 development and 1,000 test, removing sentences longer than 40 words. [sent-99, score-0.112]

43 Petrov and Klein (2008) train and test the Berkeley parser on part of the Tiger corpus, with 20,894 sentences for training and 2,611 sentences for each of development and test, all at most 40 words long. [sent-100, score-0.201]

44 The parsing model used by HD is trained on the full Tiger corpus, unrestricted for length, with 38,020 sentences for training and 2,000 sentences for development. [sent-101, score-0.1]

45 The figures reported in Table 1 are the model’s performance on this development set. [sent-102, score-0.032]

46 From these figures, we conclude that sheer parser grunt is unlikely to be responsible for the discrepancy between CKK and HD. [sent-104, score-0.115]

47 It is possible that parser output differs qualitatively in some important way; parser figures alone do not reveal this. [sent-105, score-0.21]

48 Here, we reuse the HD parsing model, plus five further parsing models (described in the next sentence). Table 2 (Corpora used, and number of sentence pairs in each): CKK: Train 751,088, Test 2,000; WMT: Train europarl-v4 1,418,115, Tuning test2007 2,000 and news-test2008 2,051, Test test2008 2,000 and newstest2009 2,525. [sent-106, score-0.044]

49 The first is trained on the same data, lowercased; the next two use only 19,000 training sentences (for one model, lowercased); the fourth uses 9,500 sentences; the fifth only 3,800 sentences. [sent-108, score-0.051]

50 The 50% data models are closer to the amount of data available to CKK, and the 25% and 10% models are to investigate the effects of further reduced parser quality. [sent-109, score-0.089]

51 4 Experiments We conduct a number of experiments with the HD system to attempt to replicate the CKK and HD findings. [sent-110, score-0.069]

52 Each experiment is paired: the reordered system reuses the recasing and language models of its corresponding baseline system, to eliminate one source of possible variation. [sent-112, score-0.328]

53 Training the parser with less data affects only the reordered systems; for experiments using these models, the corresponding baselines (and thus the shared models) are not retrained. [sent-113, score-0.327]

54 4.1 System Variations CKK uses the PSMT system Pharaoh (Koehn et al., 2003). [sent-116, score-0.035]

55 HD, in contrast, uses its successor Moses (Koehn et al., 2007). [sent-117, score-0.04]

56 In itself, this should not cause a dramatic difference in performance, as the two systems perform similarly (Hoang and Koehn, 2008). [sent-119, score-0.025]

57 However, there are a number of other differences between the two systems. [sent-120, score-0.032]

58 Koehn et al. (2003) (and thus presumably CKK) use an unlexicalised distortion model, whereas HD uses a lexicalised model. [sent-122, score-0.175]

59 CKK does not include a tuning (minimum error rate training) phase, unlike HD. [sent-123, score-0.066]

60 Top row: full parsing model; second row: 50% parsing model. [sent-139, score-0.088]

61 4.2 Data A likely cause of the results difference between HD and CKK is the data used. [sent-144, score-0.025]

62 CKK used Europarl for training and test, while HD used Europarl and news for training, with news for tuning and test. [sent-145, score-0.168]

63 Our first experiment attempts to replicate CKK as closely as possible, using the CKK training and test data. [sent-146, score-0.064]

64 This data came already tokenized and lowercased; we thus skip tokenisation in preprocessing, use the lowercased parsing models, and skip tokenisation and casing steps in the PSMT system. [sent-147, score-0.332]

65 We try both the full data and 50% data parsing models. [sent-148, score-0.044]

66 To remain close to CKK, we use data from the 2009 Workshop, which provided Europarl sets for both training and development. [sent-150, score-0.05]

67 We use europarl-v4 for training, test2007 for tuning, and test2008 for testing. [sent-151, score-0.281]

68 We also run the 3-gram systems of this set with each of the reduced parser models. [sent-152, score-0.089]

69 We still train on europarl-v4 (diverging from HD), but substitute one or both of the tuning and test sets with those of HD: news-test2008 and newstest2009 from the 2010 Workshop. [sent-154, score-0.26]

70 For the language model, HD uses both Europarl and news text. [sent-155, score-0.051]

71 To remain close to CKK, we train our language models only on the Europarl training data, and thus use considerably less data than HD here. [sent-156, score-0.076]

72 (Fragment of Table 4 rows spilled into the extraction; cell values are not recoverable here. See the Table 4 caption in the next entry.) [sent-174, score-0.265]

73 Table 4: BLEU scores for each experiment on the Europarl test set. [sent-218, score-0.03]

74 Columns give: language model order, distortion model (distance, lexicalised), tuning data (none (–), Europarl, News), baseline BLEU score, reordered system BLEU score, performance increase, oracle BLEU score. [sent-219, score-0.59]

75 Comparing the scripts, we found that the NIST scores are always lower than multi-bleu's on test2008, but higher on newstest2009, with differences at most 0. [sent-221, score-0.261]
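
For reference, both scripts named above (multi-bleu and the NIST scorer) compute corpus-level BLEU; a simplified single-reference version is sketched below. Differences between the real scripts come largely from tokenisation and normalisation choices, which is consistent with the small discrepancies reported here.

```python
# Simplified single-reference corpus BLEU, to make concrete what the two
# scoring scripts compute; this is a sketch, not either script's exact logic.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hyps, refs, max_n=4):
    match, total = [0] * max_n, [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        hyp_len, ref_len = hyp_len + len(hyp), ref_len + len(ref)
        for n in range(1, max_n + 1):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            match[n - 1] += sum((h & r).values())     # clipped n-gram matches
            total[n - 1] += max(len(hyp) - n + 1, 0)  # hypothesis n-grams
    if hyp_len == 0 or 0 in match:
        return 0.0
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    brevity = min(1.0, math.exp(1 - ref_len / hyp_len))  # brevity penalty
    return brevity * math.exp(log_prec)
```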

76 5 Results Results for the first experiments, closely replicating CKK, are given in Table 3. [sent-224, score-0.042]

77 Thus the HD reimplementation is indeed close to the original CKK system. [sent-228, score-0.064]

78 Any qualitative differences in parser output not revealed by §3, in the implementation of the rules, or in the PSMT system, are thus producing only a small effect. [sent-229, score-0.169]

79 Results for the remaining experiments are given in Tables 4 and 5, which give results on the test2008 and newstest2009 test sets respectively, and Table 6, which gives results on the test2008 test set using the reduced parsing models. [sent-230, score-0.424]

80 We see that the choice of data can have a profound effect, nullifying or even reversing the overall result, even when the reordering system remains the same. [sent-231, score-0.229]

81 Genre differences are an obvious possibility, but we have demonstrated only a dependence on data set. [sent-232, score-0.032]

82 The other factors tested (language model order, lexicalisation of the distortion model, and use of a tuning phase) can all affect the overall performance. [sent-233, score-0.21]

83 (Fragment of Table 6 rows spilled into the extraction; cell values are not recoverable here. See the Table 6 caption in the next entry.) [sent-293, score-0.099]

84 Table 6: Results using the smaller parsing models. [sent-329, score-0.044]

85 Columns are as for Table 4 except LM removed (all are 3-gram), and parser data percentage (%) added. [sent-330, score-0.089]

86 Reducing the quality of the parsing model (by training on less data) also has a negative effect, but the drop must be substantial before it outweighs other factors. [sent-332, score-0.044]

87 In all cases, the oracle outperforms both baseline and reordered systems by a large margin. [sent-333, score-0.369]

88 Its selections show that, in changing test sets, the balance shifts from one system to the other, but both still contribute strongly. [sent-334, score-0.089]

89 This shows that improvements are possible across the board if one can correctly choose which sentences will benefit from reordering. [sent-335, score-0.028]

90 The reimplementation of this system by Howlett and Dras (2010) came to the opposite conclusion. [sent-338, score-0.101]

91 We have systematically varied several aspects of the Howlett and Dras (2010) system and reproduced results close to both papers, plus a full range in between. [sent-339, score-0.099]

92 Our results show that choices in the PSMT system can completely erode potential gains of the reordering preprocessing step, with the largest effect due to simple choice of data. [sent-340, score-0.244]

93 We have shown that a lack of overall improvement using reordering-as-preprocessing need not be due to the usual suspects, language pair and reordering process. [sent-341, score-0.242]

94 Significantly, our oracle experiments show that in all cases the reordering system does produce better translations for some sentences. [sent-342, score-0.305]

95 We conclude that effort is best directed at determining for which sentences the improvement will appear. [sent-343, score-0.087]
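
A natural follow-up to this conclusion, sketched here purely as a hypothetical (nothing like it appears in the paper, and the features are invented), is to train a classifier on the oracle's selections to predict which sentences the reordered system will win:

```python
# Hypothetical per-sentence selector trained on oracle labels: predict,
# from cheap source-side features, whether reordering will help.
from sklearn.linear_model import LogisticRegression

def features(src_tokens):
    # Invented features: length and a crude count of clause boundaries.
    return [len(src_tokens), src_tokens.count(",") + src_tokens.count("dass")]

def train_selector(src_sents, oracle_labels):
    """oracle_labels[i] is 1 if the reordered output won on sentence i."""
    X = [features(s) for s in src_sents]
    return LogisticRegression().fit(X, oracle_labels)

# At test time: reorder sentence s only if selector.predict([features(s)])[0] == 1.
```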

96 Improving a statistical MT system with automatically learned rewrite patterns. [sent-400, score-0.035]

97 Using a dependency parser to improve SMT for subject-object-verb languages. [sent-404, score-0.089]

98 Syntax-based word reordering in phrase-based statistical machine translation: Why does it work? [sent-408, score-0.17]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ckk', 0.549), ('hd', 0.401), ('psmt', 0.274), ('howlett', 0.247), ('reordered', 0.238), ('dras', 0.179), ('reordering', 0.17), ('europarl', 0.134), ('distortion', 0.12), ('oracle', 0.1), ('lex', 0.099), ('tiger', 0.097), ('collins', 0.095), ('parser', 0.089), ('dubey', 0.084), ('lowercased', 0.083), ('newste', 0.082), ('bleu', 0.077), ('negra', 0.067), ('dist', 0.067), ('tuning', 0.066), ('dm', 0.063), ('petrov', 0.057), ('st', 0.056), ('hansard', 0.055), ('lexicalised', 0.055), ('koehn', 0.051), ('news', 0.051), ('habash', 0.048), ('xia', 0.048), ('keller', 0.048), ('mccord', 0.048), ('zwarts', 0.048), ('columns', 0.048), ('translation', 0.045), ('tokenisation', 0.045), ('parsing', 0.044), ('moses', 0.043), ('german', 0.043), ('replicating', 0.042), ('skut', 0.042), ('philipp', 0.041), ('rules', 0.041), ('successor', 0.04), ('preprocessing', 0.039), ('lm', 0.039), ('brants', 0.039), ('systematically', 0.037), ('reimplementation', 0.037), ('mixed', 0.036), ('te', 0.035), ('explanations', 0.035), ('restructuring', 0.035), ('hoang', 0.035), ('system', 0.035), ('replicate', 0.034), ('improvement', 0.033), ('differences', 0.032), ('figures', 0.032), ('baseline', 0.031), ('skip', 0.031), ('nist', 0.03), ('test', 0.03), ('smt', 0.03), ('came', 0.029), ('papineni', 0.029), ('centre', 0.028), ('sentences', 0.028), ('close', 0.027), ('susan', 0.027), ('train', 0.026), ('phrasebased', 0.026), ('conclude', 0.026), ('summit', 0.025), ('cause', 0.025), ('slav', 0.025), ('sus', 0.024), ('tlheed', 0.024), ('reimplement', 0.024), ('casing', 0.024), ('automaticallyextracted', 0.024), ('diverging', 0.024), ('eofr', 0.024), ('feorre', 0.024), ('jaeho', 0.024), ('lexicalisation', 0.024), ('permute', 0.024), ('profound', 0.024), ('reuses', 0.024), ('reversal', 0.024), ('selections', 0.024), ('sov', 0.024), ('stefanie', 0.024), ('wojciech', 0.024), ('hieu', 0.024), ('attribute', 0.023), ('remain', 0.023), ('fifth', 0.023), ('scorer', 0.022), ('silvia', 0.022)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 69 acl-2011-Clause Restructuring For SMT Not Absolutely Helpful

Author: Susan Howlett ; Mark Dras

Abstract: There are a number of systems that use a syntax-based reordering step prior to phrase-based statistical MT. An early work proposing this idea showed improved translation performance, but subsequent work has had mixed results. Speculations as to cause have suggested the parser, the data, or other factors. We systematically investigate possible factors to give an initial answer to the question: Under what conditions does this use of syntax help PSMT?

2 0.1292493 266 acl-2011-Reordering with Source Language Collocations

Author: Zhanyi Liu ; Haifeng Wang ; Hua Wu ; Ting Liu ; Sheng Li

Abstract: This paper proposes a novel reordering model for statistical machine translation (SMT) by means of modeling the translation orders of the source language collocations. The model is learned from a word-aligned bilingual corpus where the collocated words in source sentences are automatically detected. During decoding, the model is employed to softly constrain the translation orders of the source language collocations, so as to constrain the translation orders of those source phrases containing these collocated words. The experimental results show that the proposed method significantly improves the translation quality, achieving the absolute improvements of 1.1~1.4 BLEU score over the baseline methods.

3 0.11886867 16 acl-2011-A Joint Sequence Translation Model with Integrated Reordering

Author: Nadir Durrani ; Helmut Schmid ; Alexander Fraser

Abstract: We present a novel machine translation model which models translation by a linear sequence of operations. In contrast to the “N-gram” model, this sequence includes not only translation but also reordering operations. Key ideas of our model are (i) a new reordering approach which better restricts the position to which a word or phrase can be moved, and is able to handle short and long distance reorderings in a unified way, and (ii) a joint sequence model for the translation and reordering probabilities which is more flexible than standard phrase-based MT. We observe statistically significant improvements in BLEU over Moses for German-to-English and Spanish-to-English tasks, and comparable results for a French-to-English task.

4 0.11171462 263 acl-2011-Reordering Constraint Based on Document-Level Context

Author: Takashi Onishi ; Masao Utiyama ; Eiichiro Sumita

Abstract: One problem with phrase-based statistical machine translation is the problem of longdistance reordering when translating between languages with different word orders, such as Japanese-English. In this paper, we propose a method of imposing reordering constraints using document-level context. As the documentlevel context, we use noun phrases which significantly occur in context documents containing source sentences. Given a source sentence, zones which cover the noun phrases are used as reordering constraints. Then, in decoding, reorderings which violate the zones are restricted. Experiment results for patent translation tasks show a significant improvement of 1.20% BLEU points in JapaneseEnglish translation and 1.41% BLEU points in English-Japanese translation.

5 0.10249069 264 acl-2011-Reordering Metrics for MT

Author: Alexandra Birch ; Miles Osborne

Abstract: One of the major challenges facing statistical machine translation is how to model differences in word order between languages. Although a great deal of research has focussed on this problem, progress is hampered by the lack of reliable metrics. Most current metrics are based on matching lexical items in the translation and the reference, and their ability to measure the quality of word order has not been demonstrated. This paper presents a novel metric, the LRscore, which explicitly measures the quality of word order by using permutation distance metrics. We show that the metric is more consistent with human judgements than other metrics, including the BLEU score. We also show that the LRscore can successfully be used as the objective function when training translation model parameters. Training with the LRscore leads to output which is preferred by humans. Moreover, the translations incur no penalty in terms of BLEU scores.

6 0.096362874 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

7 0.090376027 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words

8 0.087909795 247 acl-2011-Pre- and Postprocessing for Statistical Machine Translation into Germanic Languages

9 0.079810455 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

10 0.076889388 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

11 0.072303236 152 acl-2011-How Much Can We Gain from Supervised Word Alignment?

12 0.066507466 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction

13 0.065333679 265 acl-2011-Reordering Modeling using Weighted Alignment Matrices

14 0.064265057 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation

15 0.061438583 90 acl-2011-Crowdsourcing Translation: Professional Quality from Non-Professionals

16 0.06028058 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

17 0.057855021 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation

18 0.056347426 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

19 0.054807141 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

20 0.054055817 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.146), (1, -0.109), (2, 0.052), (3, 0.007), (4, 0.015), (5, 0.009), (6, 0.013), (7, 0.006), (8, 0.031), (9, -0.001), (10, 0.009), (11, -0.044), (12, -0.002), (13, -0.107), (14, -0.011), (15, 0.0), (16, -0.005), (17, 0.011), (18, -0.001), (19, -0.035), (20, -0.006), (21, 0.006), (22, -0.002), (23, -0.092), (24, -0.022), (25, 0.08), (26, 0.054), (27, -0.029), (28, -0.012), (29, 0.03), (30, 0.046), (31, -0.005), (32, 0.089), (33, -0.001), (34, 0.019), (35, -0.106), (36, -0.047), (37, -0.006), (38, -0.038), (39, 0.017), (40, 0.082), (41, -0.076), (42, -0.051), (43, 0.014), (44, -0.026), (45, 0.037), (46, -0.051), (47, 0.044), (48, -0.023), (49, -0.065)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.87008744 69 acl-2011-Clause Restructuring For SMT Not Absolutely Helpful

Author: Susan Howlett ; Mark Dras

Abstract: There are a number of systems that use a syntax-based reordering step prior to phrase-based statistical MT. An early work proposing this idea showed improved translation performance, but subsequent work has had mixed results. Speculations as to cause have suggested the parser, the data, or other factors. We systematically investigate possible factors to give an initial answer to the question: Under what conditions does this use of syntax help PSMT?

2 0.81905991 266 acl-2011-Reordering with Source Language Collocations

Author: Zhanyi Liu ; Haifeng Wang ; Hua Wu ; Ting Liu ; Sheng Li

Abstract: This paper proposes a novel reordering model for statistical machine translation (SMT) by means of modeling the translation orders of the source language collocations. The model is learned from a word-aligned bilingual corpus where the collocated words in source sentences are automatically detected. During decoding, the model is employed to softly constrain the translation orders of the source language collocations, so as to constrain the translation orders of those source phrases containing these collocated words. The experimental results show that the proposed method significantly improves the translation quality, achieving the absolute improvements of 1.1~1.4 BLEU score over the baseline methods.

3 0.80101562 263 acl-2011-Reordering Constraint Based on Document-Level Context

Author: Takashi Onishi ; Masao Utiyama ; Eiichiro Sumita

Abstract: One problem with phrase-based statistical machine translation is the problem of longdistance reordering when translating between languages with different word orders, such as Japanese-English. In this paper, we propose a method of imposing reordering constraints using document-level context. As the documentlevel context, we use noun phrases which significantly occur in context documents containing source sentences. Given a source sentence, zones which cover the noun phrases are used as reordering constraints. Then, in decoding, reorderings which violate the zones are restricted. Experiment results for patent translation tasks show a significant improvement of 1.20% BLEU points in JapaneseEnglish translation and 1.41% BLEU points in English-Japanese translation.

4 0.75921601 264 acl-2011-Reordering Metrics for MT

Author: Alexandra Birch ; Miles Osborne

Abstract: One of the major challenges facing statistical machine translation is how to model differences in word order between languages. Although a great deal of research has focussed on this problem, progress is hampered by the lack of reliable metrics. Most current metrics are based on matching lexical items in the translation and the reference, and their ability to measure the quality of word order has not been demonstrated. This paper presents a novel metric, the LRscore, which explicitly measures the quality of word order by using permutation distance metrics. We show that the metric is more consistent with human judgements than other metrics, including the BLEU score. We also show that the LRscore can successfully be used as the objective function when training translation model parameters. Training with the LRscore leads to output which is preferred by humans. Moreover, the translations incur no penalty in terms of BLEU scores.

5 0.75901794 16 acl-2011-A Joint Sequence Translation Model with Integrated Reordering

Author: Nadir Durrani ; Helmut Schmid ; Alexander Fraser

Abstract: We present a novel machine translation model which models translation by a linear sequence of operations. In contrast to the “N-gram” model, this sequence includes not only translation but also reordering operations. Key ideas of our model are (i) a new reordering approach which better restricts the position to which a word or phrase can be moved, and is able to handle short and long distance reorderings in a unified way, and (ii) a joint sequence model for the translation and reordering probabilities which is more flexible than standard phrase-based MT. We observe statistically significant improvements in BLEU over Moses for German-to-English and Spanish-to-English tasks, and comparable results for a French-to-English task.

6 0.65995985 247 acl-2011-Pre- and Postprocessing for Statistical Machine Translation into Germanic Languages

7 0.61187214 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

8 0.56059366 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

9 0.54538727 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation

10 0.54399669 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

11 0.54155213 60 acl-2011-Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability

12 0.52090091 265 acl-2011-Reordering Modeling using Weighted Alignment Matrices

13 0.51973182 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

14 0.48225358 104 acl-2011-Domain Adaptation for Machine Translation by Mining Unseen Words

15 0.47664952 151 acl-2011-Hindi to Punjabi Machine Translation System

16 0.47002915 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

17 0.46832484 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation

18 0.46808803 310 acl-2011-Translating from Morphologically Complex Languages: A Paraphrase-Based Approach

19 0.46694535 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

20 0.4655624 75 acl-2011-Combining Morpheme-based Machine Translation with Post-processing Morpheme Prediction


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.035), (17, 0.033), (26, 0.018), (37, 0.097), (39, 0.062), (41, 0.057), (53, 0.012), (55, 0.029), (59, 0.028), (71, 0.272), (72, 0.026), (91, 0.043), (96, 0.173), (97, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.85867691 273 acl-2011-Semantic Representation of Negation Using Focus Detection

Author: Eduardo Blanco ; Dan Moldovan

Abstract: Negation is present in all human languages and it is used to reverse the polarity of part of statements that are otherwise affirmative by default. A negated statement often carries positive implicit meaning, but to pinpoint the positive part from the negative part is rather difficult. This paper aims at thoroughly representing the semantics of negation by revealing implicit positive meaning. The proposed representation relies on focus of negation detection. For this, new annotation over PropBank and a learning algorithm are proposed.

2 0.812343 307 acl-2011-Towards Tracking Semantic Change by Visual Analytics

Author: Christian Rohrdantz ; Annette Hautli ; Thomas Mayer ; Miriam Butt ; Daniel A. Keim ; Frans Plank

Abstract: This paper presents a new approach to detecting and tracking changes in word meaning by visually modeling and representing diachronic development in word contexts. Previous studies have shown that computational models are capable of clustering and disambiguating senses, a more recent trend investigates whether changes in word meaning can be tracked by automatic methods. The aim of our study is to offer a new instrument for investigating the diachronic development of word senses in a way that allows for a better understanding of the nature of semantic change in general. For this purpose we combine techniques from the field of Visual Analytics with unsupervised methods from Natural Language Processing, allowing for an interactive visual exploration of semantic change.

same-paper 3 0.76374185 69 acl-2011-Clause Restructuring For SMT Not Absolutely Helpful

Author: Susan Howlett ; Mark Dras

Abstract: There are a number of systems that use a syntax-based reordering step prior to phrase-based statistical MT. An early work proposing this idea showed improved translation performance, but subsequent work has had mixed results. Speculations as to cause have suggested the parser, the data, or other factors. We systematically investigate possible factors to give an initial answer to the question: Under what conditions does this use of syntax help PSMT?

4 0.73592198 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

Author: Yee Seng Chan ; Dan Roth

Abstract: In this paper, we observe that there exists a second dimension to the relation extraction (RE) problem that is orthogonal to the relation type dimension. We show that most of these second dimensional structures are relatively constrained and not difficult to identify. We propose a novel algorithmic approach to RE that starts by first identifying these structures and then, within these, identifying the semantic type of the relation. In the real RE problem where relation arguments need to be identified, exploiting these structures also allows reducing pipelined propagated errors. We show that this RE framework provides significant improvement in RE performance.

5 0.63701594 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation

Author: Zhongguo Li

Abstract: Lots of Chinese characters are very productive in that they can form many structured words either as prefixes or as suffixes. Previous research in Chinese word segmentation mainly focused on identifying only the word boundaries without considering the rich internal structures of many words. In this paper we argue that this is unsatisfying in many ways, both practically and theoretically. Instead, we propose that word structures should be recovered in morphological analysis. An elegant approach for doing this is given and the result is shown to be promising enough for encouraging further effort in this direction. Our probability model is trained with the Penn Chinese Treebank and actually is able to parse both word and phrase structures in a unified way. 1 Why Parse Word Structures? Research in Chinese word segmentation has progressed tremendously in recent years, with state of the art performing at around 97% in precision and recall (Xue, 2003; Gao et al., 2005; Zhang and Clark, 2007; Li and Sun, 2009). However, virtually all these systems focus exclusively on recognizing the word boundaries, giving no consideration to the internal structures of many words. Though it has been the standard practice for many years, we argue that this paradigm is inadequate both in theory and in practice, for at least the following four reasons. The first reason is that if we confine our definition of word segmentation to the identification of word boundaries, then people tend to have divergent opinions as to whether a linguistic unit is a word or not (Sproat et al., 1996). This has led to many different annotation standards for Chinese word segmentation. Even worse, this could cause inconsistency in the same corpus. For instance, 䉂擌奒 'vice president' is considered to be one word in the Penn Chinese Treebank (Xue et al., 2005), but is split into two words by the Peking University corpus in the SIGHAN Bakeoffs (Sproat and Emerson, 2003). Meanwhile, 䉂䀓惼 'vice director' and 䉂䚲䡮 'deputy manager' are both segmented into two words in the same Penn Chinese Treebank. In fact, all these words are composed of the prefix 䉂 'vice' and a root word. Thus the structure of 䉂擌奒 'vice president' can be represented with the tree in Figure 1 ("Example of a word with internal structure"). Without a doubt, there is complete agreement on the correctness of this structure among native Chinese speakers. So if instead of annotating only word boundaries, we annotate the structures of every word, then the annotation tends to be more consistent and there could be less duplication of efforts in developing the expensive annotated corpus. (Footnote: Here it is necessary to add a note on terminology used in this paper. Since there is no universally accepted definition of the "word" concept in linguistics and especially in Chinese, whenever we use the term "word" we might mean a linguistic unit such as 䉂擌奒 'vice president' whose structure is shown as the tree in Figure 1, or we might mean a smaller unit such as 擌奒 'president' which is a substructure of that tree. Hopefully, the context will always make it clear what is being referred to with the term "word".) The second reason is that applications have different requirements for the granularity of words. Take the personal name 撱嗤吼 'Zhou Shuren' as an example.
It's considered to be one word in the Penn Chinese Treebank, but is segmented into a surname and a given name in the Peking University corpus. For some applications such as information extraction, the former segmentation is adequate, while for others like machine translation, the latter finer-grained output is preferable. If the analyzer can produce a structure as shown in Figure 4(a), then every application can extract what it needs from this tree. A solution with tree output like this is more elegant than approaches which try to meet the needs of different applications in post-processing (Gao et al., 2004). The third reason is that traditional word segmentation has problems in handling many phenomena in Chinese. For example, the telescopic compound 㦌撥怂惆 'universities, middle schools and primary schools' is in fact composed of three coordinating elements 㦌惆 'university', 撥惆 'middle school' and 怂惆 'primary school'. Regarding it as one flat word loses this important information. Another example is separable words like 扩扙 'swim'. With a linear segmentation, the meaning of 'swimming' as in 扩堑扙 'after swimming' cannot be properly represented, since 扩扙 'swim' will be segmented into discontinuous units. These language usages lie at the boundary between syntax and morphology, and are not uncommon in Chinese. They can be adequately represented with trees (Figure 2: "Example of telescopic compound (a) and separable word (b)"). The last reason why we should care about word structures is related to head-driven statistical parsers (Collins, 2003). To illustrate this, note that in the Penn Chinese Treebank, the word 戽䊂䠽吼 'English People' does not occur at all. Hence constituents headed by such words could cause some difficulty for head-driven models in which out-of-vocabulary words need to be treated specially both when they are generated and when they are conditioned upon. But this word is in turn headed by its suffix 吼 'people', and there are 2,233 such words in the Penn Chinese Treebank. If we annotate the structure of every compound containing this suffix (e.g. Figure 3), such data sparsity simply goes away.

6 0.63588333 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

7 0.6345166 28 acl-2011-A Statistical Tree Annotator and Its Applications

8 0.63372278 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

9 0.63265198 117 acl-2011-Entity Set Expansion using Topic information

10 0.63231087 137 acl-2011-Fine-Grained Class Label Markup of Search Queries

11 0.63199252 318 acl-2011-Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden Semi-Markov Models

12 0.63184857 300 acl-2011-The Surprising Variance in Shortest-Derivation Parsing

13 0.62984407 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

14 0.62947083 257 acl-2011-Question Detection in Spoken Conversations Using Textual Conversations

15 0.62946117 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

16 0.62889671 133 acl-2011-Extracting Social Power Relationships from Natural Language

17 0.6282981 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

18 0.62736517 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

19 0.62736511 327 acl-2011-Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment

20 0.62657416 187 acl-2011-Jointly Learning to Extract and Compress