acl acl2010 acl2010-88 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Shujie Liu ; Chi-Ho Li ; Ming Zhou
Abstract: While Inversion Transduction Grammar (ITG) has regained more and more attention in recent years, it still suffers from the major obstacle of speed. We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. Experiment results show that it is superior to all existing heuristics in ITG pruning. On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++.
Reference: text
sentIndex sentText sentNum sentScore
1 We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. [sent-6, score-0.406]
2 On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++. [sent-8, score-1.002]
3 It does synchronous parsing of two languages, with phrasal and word-level alignment as a by-product. [sent-10, score-0.275]
4 For this reason ITG has gained more and more attention recently in the word alignment community (Zhang and Gildea, 2005; Cherry and Lin, 2006; Haghighi et al. [sent-11, score-0.261]
5 Therefore all attempts at ITG alignment come with some pruning method. [sent-16, score-0.519]
6 (2009) do pruning based on the probabilities of links from a simpler alignment model (viz. [sent-18, score-0.656]
7 HMM); Zhang and Gildea (2005) propose Tic-tac-toe pruning, which is based on the Model 1 probabilities of word pairs inside and outside a pair of spans. [sent-19, score-0.251]
8 As all the principles behind these techniques contribute to good pruning decisions, it is tempting to incorporate all these features in ITG pruning. [sent-20, score-0.302]
9 In this paper, we propose a novel discriminative pruning framework for discriminative ITG. [sent-21, score-0.486]
10 The pruning model uses no more training data than the discriminative ITG parser itself, and it uses a log-linear model to integrate all features that help identify the correct span pair (like Model 1 probability and HMM posterior). [sent-22, score-0.8]
11 On top of the discriminative pruning method, we also propose a discriminative ITG alignment system using hierarchical phrase pairs. [sent-23, score-0.839]
12 In the following, some basic details on the ITG formalism and ITG parsing are first reviewed (Sections 2 and 3), followed by the definition of pruning in ITG (Section 4). [sent-24, score-0.312]
13 From the viewpoint of word alignment, the terminal unary rules provide the links of word pairs, whereas the binary rules represent the reordering factor. [sent-39, score-0.326]
14 Both ITG alignment ... [sent-46, score-0.241]
15 Secondly, the simple ITG leads to redundancy if word alignment is the sole purpose of applying ITG. [sent-50, score-0.28]
16 3 Basics of ITG Parsing: Based on the rules in normal form, ITG word alignment is done in a similar way to chart parsing (Wu, 1997). [sent-68, score-0.372]
17 The base step applies all relevant terminal unary rules to establish the links of word pairs. [sent-69, score-0.246]
18 The word pairs are then combined into span pairs in all possible ways. [sent-70, score-0.375]
19 Larger and larger span pairs are recursively built until the sentence pair is built. [sent-71, score-0.415]
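To make the construction concrete, here is a minimal Python sketch of this bottom-up procedure (not the paper's code; the span encoding, the absence of null links, and the flat hypothesis lists are simplifying assumptions):

    # Minimal sketch of bottom-up ITG chart parsing (assumed encoding:
    # a span pair is (f1, f2, e1, e2) with half-open intervals; null
    # alignment and normal-form category checks are omitted).
    def itg_parse(f_len, e_len, word_links):
        chart = {}  # span pair -> list of alignment hypotheses (link lists)
        # Base step: terminal unary rules establish word-pair links.
        for (fi, ei) in word_links:
            chart[(fi, fi + 1, ei, ei + 1)] = [[(fi, ei)]]
        # Recursive step: combine sub-span pairs by straight/inverted rules.
        for f_w in range(1, f_len + 1):
            for e_w in range(1, e_len + 1):
                for f1 in range(f_len - f_w + 1):
                    for e1 in range(e_len - e_w + 1):
                        f2, e2 = f1 + f_w, e1 + e_w
                        for fm in range(f1 + 1, f2):
                            for em in range(e1 + 1, e2):
                                straight = ((f1, fm, e1, em), (fm, f2, em, e2))
                                inverted = ((f1, fm, em, e2), (fm, f2, e1, em))
                                for left, right in (straight, inverted):
                                    for hl in chart.get(left, []):
                                        for hr in chart.get(right, []):
                                            chart.setdefault(
                                                (f1, f2, e1, e2), []).append(hl + hr)
        return chart.get((0, f_len, 0, e_len), [])

Even this stripped-down version exposes the speed obstacle: the nested loops over span pairs and split points yield the O(n^6) complexity of ITG parsing that the pruning framework targets.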
20 Each node (rectangle) represents a pair of foreign span (F-span) and English span (E-span), marked with a certain phrase category (the upper half of the rectangle), together with the associated alignment hypothesis (the lower half). [sent-73, score-1.071]
21 Each graph like Figure 1(a) shows only one derivation and also only one alignment hypothesis. [sent-74, score-0.241]
22 Each hypernode (rectangle) comprises both a span pair (upper half) and the list of possible alignment hypotheses (lower half) for that span pair. [sent-76, score-0.983]
23 The hyperedges show how larger span pairs are derived from smaller span pairs. [sent-77, score-0.584]
24 Note that a hypernode may have more than one alignment hypothesis, since a hypernode may be derived through more than one hyperedge (e. [sent-78, score-0.439]
25 Due to the use of normal form, the hypotheses of a span pair are different from each other. [sent-81, score-0.425]
26 4 Pruning in ITG Parsing: The ITG parsing framework has three levels of pruning: 1) To discard some unpromising span pairs; 2) To discard some unpromising F-spans and/or E-spans; 3) To discard some unpromising alignment hypotheses for a particular span pair. [sent-82, score-1.045]
27 (2008)) is very radical as it implies discarding too many span pairs. [sent-85, score-0.255]
28 It is empirically found to be highly harmful to alignment performance and therefore not adopted in this paper. [sent-86, score-0.241]
29 The third type of pruning is equivalent to minimizing the beam size of alignment hypotheses in each hypernode. [sent-87, score-0.634]
30 That is, during the bottom-up construction of the span pair repertoire, each span pair keeps only the best alignment hypothesis. [sent-89, score-0.929]
31 Once the complete parse tree is built, the k-best list of the topmost span is obtained by minimally expanding the lists of alignment hypotheses of a minimal number of span pairs. [sent-90, score-0.821]
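The hypothesis-level beam just described reduces to a per-hypernode truncation; a minimal sketch, assuming a scoring function score_hyp supplied by the alignment model:

    def prune_hypotheses(hypotheses, score_hyp, beam_size=1):
        # Keep only the top-scoring alignment hypotheses of a hypernode;
        # during bottom-up construction beam_size is 1, and the k-best
        # list of the topmost span is expanded only afterwards.
        return sorted(hypotheses, key=score_hyp, reverse=True)[:beam_size]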
32 The first type of pruning is equivalent to minimizing the number of hypernodes in a hypergraph. [sent-91, score-0.332]
33 The task of ITG pruning is defined in this paper as the first type of pruning; i. [sent-92, score-0.278]
34 The pruning method should maintain a balance between efficiency (run as quickly as possible) and performance (keep as many correct span pairs as possible). [sent-95, score-0.612]
35 A naïve approach is to have the pruning method output a score for a given span pair. [sent-98, score-0.559]
36 5 The DPDI Framework: DPDI, the discriminative pruning model proposed in this paper, assigns a score to a span pair ? [sent-100, score-0.752]
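Read concretely, the scorer is a plain log-linear model over span-pair features; the sketch below is an illustration only, with hypothetical feature functions and weights (the weights are tuned with MERT in the paper):

    def dpdi_score(span_pair, feature_funcs, weights):
        # Log-linear score of one span pair: weighted sum of feature
        # values such as Model 1 probability and HMM posterior.
        return sum(weights[name] * f(span_pair)
                   for name, f in feature_funcs.items())

    def prune_span_pairs(span_pairs, feature_funcs, weights, keep_n):
        # One possible pruning decision: keep the top-n span pairs by
        # score (the exact selection criterion is a simplification here).
        ranked = sorted(span_pairs,
                        key=lambda sp: dpdi_score(sp, feature_funcs, weights),
                        reverse=True)
        return ranked[:keep_n]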
37 5.1 Training Samples: Discriminative approaches to word alignment use manually annotated alignment for sentence pairs. [sent-141, score-0.523]
38 Discriminative pruning, however, handles not only a sentence pair but every possible span pair. [sent-142, score-0.365]
39 Rather than recruiting annotators for marking span pairs, we modify the parsing algorithm in Section 3 so as to produce span pair annotation out of sentence-level annotation. [sent-144, score-0.633]
40 If the sentence-level annotation satisfies the alignment constraints of ITG, then each F-span will have only one E-span in the parse tree. [sent-146, score-0.241]
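Under that condition, the E-span of an F-span can be recovered by projecting the golden links; a simplified sketch (the paper instead modifies the parser itself, and gold_links as a set of (f, e) index pairs is an assumed encoding):

    def project_espan(f1, f2, gold_links):
        # E-side indices linked to words inside the F-span [f1, f2).
        es = [e for (f, e) in gold_links if f1 <= f < f2]
        if not es:
            return None  # null-aligned F-span, handled separately
        return (min(es), max(es) + 1)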
41 Consider the example in Figure 2, where the golden links in the alignment annotation are ? [sent-149, score-0.363]
42 When such a situation happens, we calculate the product of the inside and outside probability of each alignment hypothesis of the span pair, based on the probabilities of the links from some simpler alignment model2. [sent-168, score-0.968]
43 The E-span with the most probable hypothesis is selected as the alignment of the F-span. [sent-169, score-0.27]
44 It should be noted that this automatic span pair annotation may violate some of the links in the original sentence-level alignment annotation. [sent-172, score-0.668]
45 Figure 3: An example of inside-out alignment. The training samples thus obtained are positive training samples. [sent-178, score-0.322]
46 Given an SMT system which produces, with ... (Footnote 2: The formulae of the inside and outside probability of a span pair will be elaborated in Section 5.) [sent-184, score-0.451]
47 , 2000) probabilities of the word pairs inside and outside a span pair ( ? [sent-348, score-0.506]
48 probability of the word pairs within the span pair): ? [sent-359, score-0.325]
49 probability of the word pairs outside the span pair): ? [sent-385, score-0.361]
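These two features might be computed along the following lines, assuming a Model 1 table t[f][e]; taking the best English partner per foreign word (max rather than sum) is our assumption, since the excerpt omits the exact formula:

    FLOOR = 1e-9  # smoothing floor for unseen word pairs (assumed)

    def model1_inside(f1, f2, e1, e2, f_words, e_words, t):
        # Product over foreign words inside the span pair of their best
        # Model 1 probability against English words inside the span pair.
        p = 1.0
        for f in f_words[f1:f2]:
            p *= max([t.get(f, {}).get(e, FLOOR)
                      for e in e_words[e1:e2]] or [FLOOR])
        return p

    def model1_outside(f1, f2, e1, e2, f_words, e_words, t):
        # The analogous product for the word pairs outside the span pair.
        out_e = e_words[:e1] + e_words[e2:]
        p = 1.0
        for f in f_words[:f1] + f_words[f2:]:
            p *= max([t.get(f, {}).get(e, FLOOR)
                      for e in out_e] or [FLOOR])
        return p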
50 The features are explained with the example of Figure 5, in which the span pair of interest is ? [sent-414, score-0.368]
51 The four links are produced by some simpler alignment model like HMM. [sent-419, score-0.351]
52 The feature value of the example span pair is (2*1)/(2+2)=0.5. [sent-455, score-0.344]
53 is the number of links which are inconsistent with the phrase pair according to some simpler alignment model (e. [sent-485, score-0.569]
54 is defined as the average ratio of foreign sentence length to English sentence length, and it is estimated to be around 1. [sent-514, score-0.225]
55 The rationale underlying this feature is that the ratio of span length should not deviate too much from the average ratio of sentence length. [sent-516, score-0.402]
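One concrete reading of this feature, assumed rather than taken from the paper, is the absolute log-deviation of the span ratio from the average ratio r:

    import math

    def length_ratio_feature(f1, f2, e1, e2, r):
        # 0 when the F/E span length ratio equals the corpus-level average
        # ratio r; grows as the span ratio deviates (spans assumed non-empty).
        return abs(math.log(((f2 - f1) / (e2 - e1)) / r))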
56 a phrase of the foreign sentence usually occupies roughly the same position as the equivalent English phrase. [sent-561, score-0.247]
57 The feature value for ... (Footnote 3: An inconsistent link connects a word within the phrase pair to some word outside the phrase pair.) [sent-562, score-0.42]
58 (2009) show that posterior probabilities from the HMM alignment model are useful for pruning. [sent-570, score-0.268]
59 Therefore, we design two new features by replacing the link count in link ratio and inconsistent link ratio with the sum of the link's posterior probability. [sent-571, score-0.349]
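The worked example above, (2*1)/(2+2)=0.5, suggests the form 2·(link count)/(F-span length + E-span length); the sketch below follows that reading, with the posterior variants replacing counts by summed HMM posteriors:

    def link_ratio(f1, f2, e1, e2, links):
        # links: (f, e) index pairs from a simpler model such as HMM.
        inside = sum(1 for (f, e) in links
                     if f1 <= f < f2 and e1 <= e < e2)
        return 2.0 * inside / ((f2 - f1) + (e2 - e1))

    def inconsistent_link_ratio(f1, f2, e1, e2, links):
        # An inconsistent link joins a word inside the span pair to a word
        # outside it (footnote 3): exactly one endpoint falls inside.
        bad = sum(1 for (f, e) in links
                  if (f1 <= f < f2) != (e1 <= e < e2))
        return 2.0 * bad / ((f2 - f1) + (e2 - e1))

    def posterior_link_ratio(f1, f2, e1, e2, posteriors):
        # Posterior-weighted variant: the count is replaced by the summed
        # HMM posteriors p(f, e), as the two new features describe.
        inside = sum(p for (f, e), p in posteriors.items()
                     if f1 <= f < f2 and e1 <= e < e2)
        return 2.0 * inside / ((f2 - f1) + (e2 - e1))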
60 6 The DITG Models: The discriminative ITG alignment can be conceived as a two-stage process. [sent-572, score-0.345]
61 In the second stage good alignment hypotheses are assigned to the span pairs selected by DPDI. [sent-574, score-0.59]
62 Another is DITG with hierarchical phrase pairs (henceforth HP-DITG), which relaxes the 1-to-1 constraint by adopting hierarchical phrase pairs in Chiang (2007). [sent-577, score-0.359]
63 Each model selects the best alignment hypotheses of each span pair, given a set of features. [sent-578, score-0.54]
64 The MERT module for DITG takes alignment F-score of a sentence pair as the performance measure. [sent-582, score-0.351]
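That per-sentence measure is ordinary link-level F-score; a minimal sketch over sets of (f, e) links:

    def alignment_fscore(predicted, gold):
        # Link-level F1 between predicted and gold alignment link sets.
        tp = len(predicted & gold)
        if tp == 0:
            return 0.0
        precision = tp / len(predicted)
        recall = tp / len(gold)
        return 2 * precision * recall / (precision + recall)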
65 6.1 Word-to-word DITG: The following features about alignment links are used in W-DITG: 1) Word pair translation probabilities trained from the HMM model (Vogel, et. [sent-586, score-0.457]
66 Wu (1997) proposes a bilingual segmentation grammar extending the terminal rules by including phrase pairs. [sent-600, score-0.219]
67 HP-DITG extends Cherry and Lin's approach by employing not only simple phrase pairs but also hierarchical phrase pairs (Chiang, 2007). [sent-604, score-0.281]
68 refer to the English and foreign sides of the i-th (simple/hierarchical) phrase pair respectively. [sent-614, score-0.294]
69 During parsing, each span pair not only examines all possible combinations of sub-span pairs using binary rules, but also checks whether its yield is exactly the same as that phrase pair. [sent-628, score-0.807]
70 If so, then the alignment links within the phrase pair (which are obtained in the standard phrase pair extraction procedure) are taken as an alternative alignment hypothesis of that span pair. [sent-629, score-1.165]
71 1" during parsing, each span pair checks if it contains the lexical anchors "of" and " 的", and if the remaining words in its yield can form two sub-span pairs which fit the reordering constraint among 的 的 ? [sent-644, score-0.472]
72 (Note that span pairs of any category in the ITG normal form grammar can substitute for ? [sent-647, score-0.372]
73 ) If both conditions hold, then the span pair is assigned an alignment hypothesis which combines the alignment links among the lexical anchors (? [sent-650, score-0.961]
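One way to realize this yield check against a (hierarchical) phrase pair is token-level pattern matching; the encoding below, with rule sides as lists of terminals and gap symbols X1/X2 and each gap covering at least one token, is a hypothetical simplification, not the paper's implementation:

    import re

    def side_pattern(symbols):
        # Compile one rule side (e.g. ['X1', 'of', 'X2']) into a regex;
        # each gap symbol must cover at least one token (an assumption).
        parts = [r'(?:\S+ )*\S+' if s in ('X1', 'X2') else re.escape(s)
                 for s in symbols]
        return re.compile('^' + ' '.join(parts) + '$')

    def yield_matches(span_words, symbols):
        # True if the span's yield fits the rule side, i.e. the lexical
        # anchors appear and the gaps can be filled by sub-spans.
        return bool(side_pattern(symbols).match(' '.join(span_words)))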
74 The rule probabilities and lexical weights in both English-to-foreign and foreign-to-English directions are estimated and taken as features, in addition to those features in W-DITG, in the discriminative model of alignment hypothesis selection. [sent-658, score-0.425]
75 7 Evaluation: DPDI is evaluated against the baselines of Tic-tac-toe (TTT) pruning (Zhang and Gildea, 2005) and Dynamic Program (DP) pruning (Haghighi et al. [sent-659, score-0.556]
76 Based on DPDI, HP-DITG is evaluated against the alignment systems GIZA++ and BITG. [sent-662, score-0.241]
77 We will first evaluate pruning regarding the pruning decisions themselves. [sent-665, score-0.556]
78 That is, the first evaluation metric, pruning error rate (henceforth PER), measures how many correct E-spans are discarded. [sent-666, score-0.343]
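PER itself takes only a few lines (a sketch; gold maps each F-span to its annotated E-span from Section 5.1):

    def pruning_error_rate(kept_pairs, gold):
        # Fraction of correct span pairs (gold E-span per F-span) that
        # the pruner discarded.
        discarded = sum(1 for f_span, e_span in gold.items()
                        if (f_span, e_span) not in kept_pairs)
        return discarded / len(gold)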
79 The major drawback of PER is that not all decisions in pruning impact alignment quality, since certain F-spans are of little use to the entire ITG parse tree. [sent-667, score-0.519]
80 An alternative criterion is the upper bound on alignment F-score, which essentially measures how many links in the annotated alignment can be kept in the ITG parse. [sent-668, score-0.677]
81 The upper bound of recall is the hit score divided by the total number of golden links. [sent-740, score-0.236]
82 The upper bound of alignment F-score can thus be calculated as well. [sent-750, score-0.353]
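From the hit count the bounds follow directly; the recall bound is as stated above, while using hit/n_output as the precision bound is our assumption, since the excerpt omits that term:

    def upper_bound_fscore(hit, n_gold, n_output):
        # Recall bound = hit / number of golden links (as defined above);
        # precision bound = hit / number of links the parse can output
        # (assumed); combine the two into an F-score bound.
        r = hit / n_gold
        p = hit / n_output
        return 2 * p * r / (p + r) if (p + r) > 0 else 0.0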
83 Finally, we also do end-to-end evaluation using both F-score in alignment and Bleu score in translation. [sent-751, score-0.267]
84 7.2 Experiment Data: Both discriminative pruning and alignment need training data and test data. [sent-754, score-0.644]
85 7.3 Small-scale Evaluation: The first set of experiments evaluates the performance of the three pruning methods using the small 241-sentence set. [sent-763, score-0.278]
86 Each pruning method is plugged in both W-DITG and HP-DITG. [sent-764, score-0.278]
87 IBM Model 1 and the HMM alignment model are re-implemented, as they are required by the three ITG pruning methods. [sent-765, score-0.519]
88 number of E-spans per F-span), although DPDI spends a bit more time (due to the more complicated model), DPDI makes far fewer incorrect pruning decisions than TTT. [sent-769, score-0.278]
89 (2009) performs much worse than the other two pruning methods. [sent-777, score-0.278]
90 A possible explanation is that better pruning not only speeds up the parsing/alignment process but also guides the search process to focus on the most promising region of the search space. [sent-787, score-0.3]
91 8 Conclusion and Future Work: This paper reviews word alignment through ITG parsing, and clarifies the problem of ITG pruning. [sent-791, score-0.261]
92 A discriminative pruning model and two discriminative ITG alignment systems are proposed. [sent-792, score-0.486]
93 The pruning model is shown to be superior to all existing ITG pruning methods, and the HP-DITG alignment system is shown to improve state-of-the-art alignment and translation quality. [sent-793, score-1.057]
94 As the success of HP-DITG illustrates the merit of hierarchical phrase pairs, in the future we should investigate more features on the relationship between span pairs and hierarchical phrase pairs. [sent-796, score-0.592]
95 Appendix: The Normal Form Grammar ... alignment F-score and Bleu score. [sent-798, score-0.241]
96 On the one hand, a good phrase pair often fails to be extracted due to a link inconsistent with the pair. [sent-801, score-0.275]
97 On the other hand, ITG pruning can be considered as phrase pair selection, and good ITG pruning like DPDI guides the subsequent ITG alignment process so that fewer links inconsistent with good phrase pairs are produced. [sent-802, score-0.278]
98 This also explains (in Tables 2 and 3) why DPDI with beam size 10 leads to higher Bleu than TTT with beam size 20, even though both pruning methods lead to roughly the same alignment F-score. [sent-803, score-0.649]
99 , and rule schemas (7) are unary rules for alignment to null. [sent-811, score-0.335]
100 ) If there are both English and foreign words linked to null, rule (5) ensures that those English words linked to null precede those foreign words linked to null. [sent-817, score-0.41]
wordName wordTfidf (topN-words)
[('itg', 0.642), ('dpdi', 0.388), ('pruning', 0.278), ('span', 0.255), ('alignment', 0.241), ('ditg', 0.145), ('foreign', 0.136), ('discriminative', 0.104), ('hypernode', 0.099), ('mert', 0.09), ('pair', 0.089), ('links', 0.083), ('phrase', 0.069), ('haghighi', 0.065), ('upper', 0.065), ('inconsistent', 0.06), ('hit', 0.059), ('link', 0.057), ('ttt', 0.057), ('inversion', 0.055), ('smt', 0.055), ('unary', 0.054), ('pairs', 0.05), ('terminal', 0.049), ('unpromising', 0.048), ('ratio', 0.047), ('bound', 0.047), ('beam', 0.045), ('transduction', 0.045), ('hypotheses', 0.044), ('hierarchical', 0.043), ('cherry', 0.043), ('interval', 0.043), ('hmm', 0.042), ('elaborated', 0.042), ('hypergraph', 0.041), ('rules', 0.04), ('samples', 0.039), ('golden', 0.039), ('normal', 0.037), ('rectangle', 0.036), ('error', 0.036), ('moore', 0.036), ('outside', 0.036), ('constraint', 0.035), ('linked', 0.035), ('dp', 0.034), ('parsing', 0.034), ('null', 0.033), ('chiang', 0.033), ('caonsdtdp', 0.032), ('appendix', 0.032), ('rationale', 0.032), ('zhang', 0.032), ('bilingual', 0.031), ('grammar', 0.03), ('denero', 0.029), ('inside', 0.029), ('hypothesis', 0.029), ('correct', 0.029), ('bleu', 0.029), ('aligns', 0.028), ('envelope', 0.028), ('hypernodes', 0.028), ('gildea', 0.028), ('probabilities', 0.027), ('simpler', 0.027), ('minimizing', 0.026), ('topmost', 0.026), ('score', 0.026), ('och', 0.025), ('discard', 0.024), ('harbin', 0.024), ('rge', 0.024), ('hyperedges', 0.024), ('basics', 0.024), ('inverted', 0.024), ('features', 0.024), ('rank', 0.024), ('henceforth', 0.023), ('anchors', 0.023), ('nist', 0.023), ('guides', 0.022), ('merits', 0.022), ('roughly', 0.021), ('half', 0.021), ('training', 0.021), ('sentence', 0.021), ('word', 0.02), ('reality', 0.02), ('hao', 0.02), ('reordering', 0.02), ('obstacle', 0.02), ('intervals', 0.02), ('translation', 0.019), ('robert', 0.019), ('wu', 0.019), ('liu', 0.019), ('leads', 0.019), ('vogel', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999994 88 acl-2010-Discriminative Pruning for Discriminative ITG Alignment
Author: Shujie Liu ; Chi-Ho Li ; Ming Zhou
Abstract: While Inversion Transduction Grammar (ITG) has regained more and more attention in recent years, it still suffers from the major obstacle of speed. We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. Experiment results show that it is superior to all existing heuristics in ITG pruning. On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++.
2 0.48013783 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
Author: John DeNero ; Dan Klein
Abstract: We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principle advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as providing up to a 1.4 improvement in BLEU score in Chinese-to-English translation experiments.
3 0.27734733 133 acl-2010-Hierarchical Search for Word Alignment
Author: Jason Riesa ; Daniel Marcu
Abstract: We present a simple yet powerful hierarchical search algorithm for automatic word alignment. Our algorithm induces a forest of alignments from which we can efficiently extract a ranked k-best list. We score a given alignment within the forest with a flexible, linear discriminative model incorporating hundreds of features, and trained on a relatively small amount of annotated data. We report results on Arabic-English word alignment and translation tasks. Our model outperforms a GIZA++ Model-4 baseline by 6.3 points in F-measure, yielding a 1.1 BLEU score increase over a state-of-the-art syntax-based machine translation system.
4 0.19125463 24 acl-2010-Active Learning-Based Elicitation for Semi-Supervised Word Alignment
Author: Vamshi Ambati ; Stephan Vogel ; Jaime Carbonell
Abstract: Semi-supervised word alignment aims to improve the accuracy of automatic word alignment by incorporating full or partial manual alignments. Motivated by standard active learning query sampling frameworks like uncertainty-, margin- and query-by-committee sampling we propose multiple query strategies for the alignment link selection task. Our experiments show that by active selection of uncertain and informative links, we reduce the overall manual effort involved in elicitation of alignment link data for training a semisupervised word aligner.
5 0.17399105 240 acl-2010-Training Phrase Translation Models with Leaving-One-Out
Author: Joern Wuebker ; Arne Mauser ; Hermann Ney
Abstract: Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with overfitting. We describe a novel leaving-one-out approach to prevent over-fitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task. In contrast to most previous work where phrase models were trained separately from other models used in translation, we include all components such as single word lexica and reordering models in training. Using this consistent training of phrase models we are able to achieve improvements of up to 1.4 points in BLEU. As a side effect, the phrase table size is reduced by more than 80%.
6 0.14374593 90 acl-2010-Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages
7 0.1424554 170 acl-2010-Letter-Phoneme Alignment: An Exploration
8 0.14119922 201 acl-2010-Pseudo-Word for Phrase-Based Machine Translation
9 0.12935045 110 acl-2010-Exploring Syntactic Structural Features for Sub-Tree Alignment Using Bilingual Tree Kernels
10 0.12674998 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation
11 0.12228099 147 acl-2010-Improving Statistical Machine Translation with Monolingual Collocation
12 0.12152546 262 acl-2010-Word Alignment with Synonym Regularization
13 0.10923846 265 acl-2010-cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
14 0.090879396 54 acl-2010-Boosting-Based System Combination for Machine Translation
15 0.087816633 102 acl-2010-Error Detection for Statistical Machine Translation Using Linguistic Features
16 0.086434402 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
17 0.082617342 99 acl-2010-Efficient Third-Order Dependency Parsers
18 0.077470608 118 acl-2010-Fine-Grained Tree-to-String Translation Rule Extraction
19 0.074942082 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
20 0.074205413 163 acl-2010-Learning Lexicalized Reordering Models from Reordering Graphs
topicId topicWeight
[(0, -0.211), (1, -0.292), (2, -0.024), (3, 0.01), (4, 0.043), (5, 0.074), (6, -0.145), (7, 0.095), (8, 0.102), (9, -0.109), (10, -0.121), (11, -0.131), (12, -0.151), (13, 0.099), (14, -0.059), (15, 0.027), (16, 0.003), (17, -0.003), (18, -0.099), (19, -0.046), (20, 0.065), (21, 0.087), (22, 0.062), (23, -0.037), (24, -0.074), (25, 0.07), (26, -0.028), (27, 0.103), (28, 0.015), (29, -0.012), (30, -0.009), (31, -0.015), (32, 0.026), (33, 0.085), (34, 0.084), (35, 0.072), (36, -0.046), (37, 0.084), (38, -0.018), (39, 0.02), (40, -0.022), (41, 0.139), (42, 0.028), (43, -0.066), (44, 0.015), (45, -0.01), (46, -0.034), (47, -0.045), (48, -0.06), (49, -0.013)]
simIndex simValue paperId paperTitle
same-paper 1 0.95068026 88 acl-2010-Discriminative Pruning for Discriminative ITG Alignment
Author: Shujie Liu ; Chi-Ho Li ; Ming Zhou
Abstract: While Inversion Transduction Grammar (ITG) has regained more and more attention in recent years, it still suffers from the major obstacle of speed. We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. Experiment results show that it is superior to all existing heuristics in ITG pruning. On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++.
2 0.88781059 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
Author: John DeNero ; Dan Klein
Abstract: We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principle advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as providing up to a 1.4 improvement in BLEU score in Chinese-to-English translation experiments.
3 0.79162878 133 acl-2010-Hierarchical Search for Word Alignment
Author: Jason Riesa ; Daniel Marcu
Abstract: We present a simple yet powerful hierarchical search algorithm for automatic word alignment. Our algorithm induces a forest of alignments from which we can efficiently extract a ranked k-best list. We score a given alignment within the forest with a flexible, linear discriminative model incorporating hundreds of features, and trained on a relatively small amount of annotated data. We report results on Arabic-English word alignment and translation tasks. Our model outperforms a GIZA++ Model-4 baseline by 6.3 points in F-measure, yielding a 1.1 BLEU score increase over a state-of-the-art syntax-based machine translation system.
4 0.74056822 90 acl-2010-Diversify and Combine: Improving Word Alignment for Machine Translation on Low-Resource Languages
Author: Bing Xiang ; Yonggang Deng ; Bowen Zhou
Abstract: We present a novel method to improve word alignment quality and eventually the translation performance by producing and combining complementary word alignments for low-resource languages. Instead of focusing on the improvement of a single set of word alignments, we generate multiple sets of diversified alignments based on different motivations, such as linguistic knowledge, morphology and heuristics. We demonstrate this approach on an English-to-Pashto translation task by combining the alignments obtained from syntactic reordering, stemming, and partial words. The combined alignment outperforms the baseline alignment, with significantly higher F-scores and better translation performance.
5 0.72848541 24 acl-2010-Active Learning-Based Elicitation for Semi-Supervised Word Alignment
Author: Vamshi Ambati ; Stephan Vogel ; Jaime Carbonell
Abstract: Semi-supervised word alignment aims to improve the accuracy of automatic word alignment by incorporating full or partial manual alignments. Motivated by standard active learning query sampling frameworks like uncertainty-, margin- and query-by-committee sampling we propose multiple query strategies for the alignment link selection task. Our experiments show that by active selection of uncertain and informative links, we reduce the overall manual effort involved in elicitation of alignment link data for training a semisupervised word aligner.
6 0.70894039 170 acl-2010-Letter-Phoneme Alignment: An Exploration
7 0.61909056 262 acl-2010-Word Alignment with Synonym Regularization
8 0.5795399 240 acl-2010-Training Phrase Translation Models with Leaving-One-Out
9 0.53042024 201 acl-2010-Pseudo-Word for Phrase-Based Machine Translation
10 0.51006365 147 acl-2010-Improving Statistical Machine Translation with Monolingual Collocation
11 0.49183315 265 acl-2010-cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
12 0.47714812 110 acl-2010-Exploring Syntactic Structural Features for Sub-Tree Alignment Using Bilingual Tree Kernels
13 0.38208732 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
14 0.35748428 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
15 0.33292842 102 acl-2010-Error Detection for Statistical Machine Translation Using Linguistic Features
16 0.33009553 199 acl-2010-Preferences versus Adaptation during Referring Expression Generation
17 0.32825473 263 acl-2010-Word Representations: A Simple and General Method for Semi-Supervised Learning
18 0.3168793 9 acl-2010-A Joint Rule Selection Model for Hierarchical Phrase-Based Translation
19 0.30967149 180 acl-2010-On Jointly Recognizing and Aligning Bilingual Named Entities
20 0.3089999 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure
topicId topicWeight
[(14, 0.015), (18, 0.011), (25, 0.062), (39, 0.014), (42, 0.016), (44, 0.011), (56, 0.165), (59, 0.185), (73, 0.062), (78, 0.022), (83, 0.098), (84, 0.021), (98, 0.197)]
simIndex simValue paperId paperTitle
1 0.97632289 64 acl-2010-Complexity Assumptions in Ontology Verbalisation
Author: Richard Power
Abstract: We describe the strategy currently pursued for verbalising OWL ontologies by sentences in Controlled Natural Language (i.e., combining generic rules for realising logical patterns with ontology-specific lexicons for realising atomic terms for individuals, classes, and properties) and argue that its success depends on assumptions about the complexity of terms and axioms in the ontology. We then show, through analysis of a corpus of ontologies, that although these assumptions could in principle be violated, they are overwhelmingly respected in practice by ontology developers.
2 0.92526549 37 acl-2010-Automatic Evaluation Method for Machine Translation Using Noun-Phrase Chunking
Author: Hiroshi Echizen-ya ; Kenji Araki
Abstract: As described in this paper, we propose a new automatic evaluation method for machine translation using noun-phrase chunking. Our method correctly determines the matching words between two sentences using corresponding noun phrases. Moreover, our method determines the similarity between two sentences in terms of the noun-phrase order of appearance. Evaluation experiments were conducted to calculate the correlation among human judgments, along with the scores produced using automatic evaluation methods for MT outputs obtained from the 12 machine translation systems in NTCIR7. Experimental results show that our method obtained the highest correlations among the methods in both sentence-level adequacy and fluency.
same-paper 3 0.91857278 88 acl-2010-Discriminative Pruning for Discriminative ITG Alignment
Author: Shujie Liu ; Chi-Ho Li ; Ming Zhou
Abstract: While Inversion Transduction Grammar (ITG) has regained more and more attention in recent years, it still suffers from the major obstacle of speed. We propose a discriminative ITG pruning framework using Minimum Error Rate Training and various features from previous work on ITG alignment. Experiment results show that it is superior to all existing heuristics in ITG pruning. On top of the pruning framework, we also propose a discriminative ITG alignment model using hierarchical phrase pairs, which improves both F-score and Bleu score over the baseline alignment system of GIZA++.
4 0.86691535 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
Author: John DeNero ; Dan Klein
Abstract: We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principle advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model gives improvements in alignment quality relative to state-of-the-art unsupervised and supervised baselines, as well as providing up to a 1.4 improvement in BLEU score in Chinese-to-English translation experiments.
Author: Decong Li ; Sujian Li ; Wenjie Li ; Wei Wang ; Weiguang Qu
Abstract: It is a fundamental and important task to extract key phrases from documents. Generally, phrases in a document are not independent in delivering the content of the document. In order to capture and make better use of their relationships in key phrase extraction, we suggest exploring the Wikipedia knowledge to model a document as a semantic network, where both n-ary and binary relationships among phrases are formulated. Based on a commonly accepted assumption that the title of a document is always elaborated to reflect the content of a document and consequently key phrases tend to have close semantics to the title, we propose a novel semi-supervised key phrase extraction approach in this paper by computing the phrase importance in the semantic network, through which the influence of title phrases is propagated to the other phrases iteratively. Experimental results demonstrate the remarkable performance of this approach.
6 0.85971534 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
8 0.85777932 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
9 0.85744536 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules
10 0.85656905 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
11 0.85612631 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
12 0.85608619 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences
13 0.8548007 54 acl-2010-Boosting-Based System Combination for Machine Translation
14 0.85469604 254 acl-2010-Using Speech to Reply to SMS Messages While Driving: An In-Car Simulator User Study
15 0.85235089 114 acl-2010-Faster Parsing by Supertagger Adaptation
16 0.85186338 96 acl-2010-Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-Of-Speech Tagging
17 0.85132754 9 acl-2010-A Joint Rule Selection Model for Hierarchical Phrase-Based Translation
18 0.85098279 51 acl-2010-Bilingual Sense Similarity for Statistical Machine Translation
19 0.85008621 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
20 0.84922016 156 acl-2010-Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems