emnlp emnlp2012 emnlp2012-109 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Shujie Liu ; Chi-Ho Li ; Mu Li ; Ming Zhou
Abstract: The training of most syntactic SMT approaches involves two essential components, word alignment and monolingual parser. In the current state of the art these two components are mutually independent, thus causing problems like lack of rule generalization, and violation of syntactic correspondence in translation rules. In this paper, we propose two ways of re-training monolingual parser with the target of maximizing the consistency between parse trees and alignment matrices. One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. We also propose an auxiliary method for boosting alignment quality, by symmetrizing alignment matrices with respect to parse trees. The best combination of these novel methods achieves 3 Bleu point gain in an IWSLT task and more than 1 Bleu point gain in NIST tasks. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract The training of most syntactic SMT approaches involves two essential components, word alignment and monolingual parser. [sent-5, score-0.534]
2 In the current state of the art these two components are mutually independent, thus causing problems like lack of rule generalization, and violation of syntactic correspondence in translation rules. [sent-6, score-0.319]
3 In this paper, we propose two ways of re-training monolingual parser with the target of maximizing the consistency between parse trees and alignment matrices. [sent-7, score-0.938]
4 One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. [sent-8, score-0.831]
5 We also propose an auxiliary method for boosting alignment quality, by symmetrizing alignment matrices with respect to parse trees. [sent-9, score-0.991]
6 In the current state of the art, the word aligner and the monolingual parser are trained and applied separately. [sent-13, score-0.323]
7 That is, some SL words yielded by an SL parse tree node may not be traced, via alignment links, to some TL words with a legitimate syntactic structure. [sent-15, score-0.798]
8 Many good translation rules may thus be filtered out by a good monolingual parser. [sent-17, score-0.288]
9 The minimal rules are extracted from a special kind of node, known as frontier nodes, on the TL parse tree. [sent-21, score-0.622]
10 The concept of frontier node can be illustrated by Figure 1, which shows two partial bilingual sentences with the corresponding TL sub-trees and word alignment links. [sent-22, score-0.932]
11 The TL words yielded by a TL parse node can be traced to the corresponding SL words through alignment links. [sent-23, score-0.637]
12 In the diagram, each parse node is represented by a rectangle, showing the phrase label, span, and complement span respectively. [sent-24, score-0.297]
13 A frontier node is a node whose span and complement span do not overlap with each other. [sent-30, score-0.564]
14 In the diagram, frontier nodes are grey in color. [sent-31, score-0.401]
15 Frontier nodes are the key in the SSMT model, as they identify the bilingual information which is consistent with both the parse tree and the alignment matrix. [sent-32, score-0.841]
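To make this definition concrete, here is a minimal sketch of frontier-node detection in Python, assuming a TL tree node object that exposes its children and the TL word positions it yields; yield_positions() and children are hypothetical names for illustration, not from the paper, and spans are treated as sets of aligned SL positions:

```python
# A minimal sketch of frontier-node detection; `yield_positions()` and
# `children` are hypothetical attributes of a hypothetical TL tree node class.
# `alignment` is a collection of (sl_index, tl_index) link pairs.

def source_span(node, alignment):
    """SL positions aligned to the TL words yielded by this node."""
    tl = set(node.yield_positions())
    return {s for (s, t) in alignment if t in tl}

def complement_span(node, alignment, n_tl_words):
    """SL positions aligned to TL words outside this node's yield."""
    outside = set(range(n_tl_words)) - set(node.yield_positions())
    return {s for (s, t) in alignment if t in outside}

def frontier_nodes(root, alignment, n_tl_words):
    """Collect nodes whose span and complement span do not overlap."""
    frontier, stack = [], [root]
    while stack:
        node = stack.pop()
        span = source_span(node, alignment)
        comp = complement_span(node, alignment, n_tl_words)
        if span and not (span & comp):  # non-empty span, no overlap
            frontier.append(node)
        stack.extend(node.children)
    return frontier
```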
16 Figure 1: two example partial bilingual sentences with word alignment and the syntactic tree for the target sentence. [sent-35, score-0.75]
17 Example (a) contains two erroneous links (in dashed lines), and the syntactic tree for the target sentence of example (b) is wrong. [sent-37, score-0.334]
18 The violation of syntactic structure by incorrect alignment links is shown by the two dashed links in Figure 1(a). [sent-38, score-0.553]
19 These two incorrect links hinder the extraction of a good minimal rule for "毡房" and that of a good composed rule for "牧民 的 NP(DT(the), NN(herdsmen), POS('s))". [sent-39, score-0.331]
20 By and large, incorrect alignment links lead to translation rules that are large in size, few in number, and poor in generalization ability (Fossum et al., 2008). [sent-40, score-0.776]
21 Note that in Figure 1(a), the parse tree is correct, and the incorrect alignment links might be fixed if the aligner takes the parse tree into consideration. [sent-44, score-1.168]
22 Similarly, in Figure 1(b) some parsing errors might be fixed if the parser takes into consideration the correct alignment links about "propaganda" and "lecture". [sent-45, score-0.715]
23 That is, alignment errors and parsing errors might be fixed if the word aligner and the parser are not mutually independent. [sent-46, score-0.678]
24 In this paper, we put more emphasis on the correction of parsing errors by exploiting alignment information. [sent-47, score-0.422]
25 The general approach is to re-train a parser with parse trees which are the most consistent with alignment matrices. [sent-48, score-0.827]
26 The first strategy is targeted self-training (Katz-Brown et al., 2011) with the simple evaluation function of frontier set size. [sent-50, score-0.352]
27 That is, the parser is re-trained with the parse trees which give rise to the largest number of frontier nodes. [sent-51, score-0.805]
28 The second strategy is to apply forced alignment (Wuebker et al., 2010) [sent-52, score-0.586]
29 to bilingual data and select the parse trees generated by our SSMT system for re-training the parser. [sent-53, score-0.422]
30 Besides, although we do not invent a new word aligner exploiting syntactic information, we propose a new method to symmetrize the alignment matrices of the two directions by taking the parse tree into consideration. [sent-54, score-0.833]
31 That is, a parser is trained with the target of maximizing the agreement between its decisions on syntactic structure and the decisions in the human-annotated parse trees. [sent-56, score-0.442]
32 As mentioned in Section 1, monolingual syntactic structure is not necessarily suitable for translation, and sometimes the bilingual information in word alignment may help the parser find out the correct structure. [sent-57, score-0.868]
33 Therefore, it is desirable if there is a way to re-train a parser with bilingual information. [sent-58, score-0.305]
34 What is needed includes a framework of parser re-training, and a data selection strategy that maximizes the consistency between parse tree and alignment matrix. [sent-59, score-0.804]
35 In standard self-training, the top-one parse trees produced by the current parser are taken as training data for the next round, and the training objective is still the correctness of monolingual syntactic structure. [sent-64, score-0.591]
36 For each sentence, the n-best parse trees from the current parser are re-ranked in accordance with this external evaluation function, and the top one of the re-ranked candidates is then selected as training data for the next round. [sent-66, score-0.463]
37 As shown by the example in Figure 1(b), an incorrect parse tree is likely to hinder the extraction of good translation rules, because the number of frontier nodes in the incorrect tree is in general smaller than that in the correct tree. [sent-68, score-1.095]
38 Although neither parse tree has the correct syntactic structure, the tree in Figure 2 has more frontier nodes, leads to more valid translation rules, and is therefore preferable. [sent-70, score-0.926]
39 Given a bilingual sentence, its alignment matrix, (Figure 2, the parse tree selected by TST-FS for the example in Figure 1(b), appears here; the rendered tree itself is not recoverable from the extraction) [sent-73, score-0.523]
40 and the N-best parse trees of the TL sentence, we calculate the number of frontier nodes for each parse tree, and re-rank the parse trees in descending order of that number. [sent-74, score-1.398]
41 The new top-one parse tree is selected as the training data for the next round of targeted self-training of the TL parser. [sent-75, score-0.415]
42 In the following we will call this approach targeted self-training with frontier-set-based evaluation (TST-FS). [sent-76, score-0.45]
43 This is because sometimes a parse tree with an extremely mistaken structure happens to match the alignment matrix perfectly, thereby giving rise to nearly the largest frontier set size. [sent-78, score-1.018]
44 It is empirically found that a 5-best list of parse trees is already sufficient to significantly improve translation performance. [sent-79, score-0.413]
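As a rough illustration of the TST-FS selection step, the following sketch re-ranks an n-best list by frontier-set size, reusing frontier_nodes() from the earlier sketch; parse_nbest() is a hypothetical stand-in for the parser's n-best interface, not an actual API:

```python
# A sketch of TST-FS data selection; `parse_nbest()` is hypothetical.

def select_tst_fs_tree(tl_sentence, alignment, n_tl_words, parser, n=5):
    """Pick the candidate parse with the most frontier nodes (n=5 per the text)."""
    candidates = parser.parse_nbest(tl_sentence, n)  # best-first list of trees
    scored = [(len(frontier_nodes(tree, alignment, n_tl_words)), -rank, tree)
              for rank, tree in enumerate(candidates)]
    scored.sort(key=lambda x: (x[0], x[1]), reverse=True)  # ties keep parser order
    return scored[0][2]  # the new top-1 becomes retraining data
```

The selected tree then replaces the parser's own top-1 as training data for the next self-training round.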
45 2 Forced Alignment-based Parser Re-Training (FA-PR) If we suspect that the parse tree from a monolingual parser is not appropriate for translation purposes, then it seems reasonable to consider using the parse tree produced by an SSMT system to re-train the parser. [sent-81, score-0.926]
46 A naïve idea is simply to run an SSMT system over some SL sentences and retrieve the by-product TL parse trees for re-training the monolingual parser. [sent-82, score-0.365]
47 The biggest problem of this naïve approach is that the translation by an MT system is often a 'weird' TL sentence, and thus the associated parse tree is of little use in improving the parser. [sent-83, score-0.435]
48 When applied to SSMT, given a bilingual sentence, it performs phrase segmentation of the SL side, parsing of the TL side, and word alignment of the bilingual sentence, using the full translation system as in decoding. [sent-86, score-0.834]
49 It finds the best decoding path that generates the TL side of the bilingual sentence, and the parse tree of the TL sentence is also obtained as a by-product. [sent-87, score-0.496]
50 The parse trees from forced alignment are suitable for re-training the monolingual parser. [sent-88, score-0.98]
51 Then perform forced alignment, using the SSMT system, of some bilingual data and obtain the parse trees as new training data for the parser. [sent-91, score-0.632]
52 The new parser can then be applied again to do the second round of forced alignment. [sent-92, score-0.415]
53 This iteration of forced alignment followed by parser re-training is kept going until some stopping criterion is met. [sent-93, score-0.798]
54 In the following we will call this approach forced-alignment-based parser re-training (FA-PR). [sent-94, score-0.744]
55 Use the parser to parse the target sentences of the training data, and build an SSMT system. [sent-96, score-0.374]
56 Perform forced alignment on the training data with the SSMT system to get parse trees for the target sentences of the training data. (The remaining steps of the algorithm listing re-train the parser on these trees and repeat, as described above.) [sent-97, score-0.924]
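These steps can be read as the following loop. This is a high-level sketch only: train_parser(), build_ssmt(), and forced_align() are hypothetical wrappers for the components described in the text, and the round cap stands in for the unspecified stopping criterion:

```python
# A high-level sketch of the FA-PR loop; all three helper names are hypothetical.

def fa_pr(bitext, treebank, max_rounds=3):
    """Iterate forced alignment and parser re-training."""
    parser = train_parser(treebank)  # seed parser from the treebank
    for _ in range(max_rounds):      # stand-in for the stopping criterion
        tl_trees = [parser.parse(tl) for (_sl, tl) in bitext]
        ssmt = build_ssmt(bitext, tl_trees)          # build the SSMT system
        fa_trees = [forced_align(ssmt, sl, tl) for (sl, tl) in bitext]
        fa_trees = [t for t in fa_trees if t is not None]  # FA can fail on long pairs
        parser = train_parser(treebank + fa_trees)   # re-train on FA parse trees
    return parser
```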
57 Forced alignment is guaranteed to obtain a parse tree if all translation rules are kept and no pruning is performed during decoding. [sent-102, score-0.909]
58 Yet in reality an average MT system applies pruning during translation model training and decoding, and a lot of translation rules will then be discarded. [sent-103, score-0.386]
59 In order to have more parse trees considered by forced alignment, we keep all translation rules and relax the pruning constraints in the decoder, viz. [sent-104, score-0.712]
60 Another measure to guarantee the existence of a decoding path in forced alignment is to allow part of an SL or TL sentence to translate to null. [sent-107, score-0.638]
61 We also add a null alignment for any span of the source and target sentences to handle the null translation scenario. [sent-109, score-0.751]
62 It is easy to add a null translation candidate for a span of the source sentence during decoding, but not easy for target spans. [sent-110, score-0.351]
63 The feature weights for the added null alignment are set to be very small, so as to avoid the competition with the normal candidates. [sent-112, score-0.466]
64 In order to generate normal trees without too many null-alignment sub-trees for the target sentence (such trees are not suitable for parser re-training), only target spans with fewer than 4 words can align to null, and such null-aligned sub-trees can be added no more than 3 times. [sent-113, score-0.951]
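A tiny sketch of the null-alignment guards just described; the limits (spans of fewer than 4 words, at most 3 null sub-trees) come from the text, while the weight value is only an illustrative assumption of what "very small" might mean:

```python
# Guards for null alignment during forced alignment; the weight is illustrative.

NULL_FEATURE_WEIGHT = -1e6  # "very small" so null candidates lose to normal ones

def may_add_null_subtree(tl_span_len, null_subtrees_so_far):
    """Permit a TL span to align to null only under the stated limits."""
    return tl_span_len < 4 and null_subtrees_so_far < 3
```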
65 With all the mentioned modifications of forced alignment, the partial target tree generated by forced alignment for the example in Figure 1(b) is shown in Figure 3. [sent-114, score-0.955]
66 Such an aligner produces one set of alignment matrices for the SL-to-TL direction and another set for the TL-to-SL direction. [sent-117, score-0.495]
67 Symmetrization refers to the combination of these two sets of alignment matrices. [sent-118, score-0.376]
68 Given a bilingual sentence and its two alignment matrices, one for each direction, IDG starts with all the links in their intersection. [sent-120, score-0.701]
69 step2: Select the one which can generate the biggest frontier set. (The formula of step2 and the content of step3 are not recoverable from the extraction.) [sent-128, score-0.379]
70 Given a parse tree of the TL side of the bilingual sentence, in each iteration IDSG considers the change of frontier set size caused by the addition of each remaining candidate link. (Figure 4, the alignment generated by IDSG for the example in Figure 1(a), appears here; its rendered content is not recoverable from the extraction.) [sent-132, score-1.245]
71 The link leading to the maximum number of frontier nodes is added (and removed from the candidate set). [sent-133, score-0.478]
72 In sum, IDSG adds links in an order which takes syntactic structure into consideration, and the link with the least violation of the syntactic structure is added first. [sent-135, score-0.382]
73 For the example in Figure 1(a), IDSG succeeds in discarding the two incorrect links, and produces the final alignment and frontier set as shown in Figure 4. [sent-136, score-0.795]
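Putting the IDSG description together, here is a hedged sketch: the intersection/union seeding fills in symbols dropped by the extraction (standard for grow-diag-style symmetrization), the stopping test (stop when no candidate link enlarges the frontier set) is an assumption, and any IDG-style neighborhood constraints are omitted; frontier_nodes() is the helper sketched earlier:

```python
# A sketch of IDSG symmetrization; seeding, stopping test, and the omission
# of neighborhood constraints are assumptions, as noted above.

def idsg(a_s2t, a_t2s, tl_tree, n_tl_words):
    """Greedily add the candidate link that maximizes the frontier set."""
    links = set(a_s2t) & set(a_t2s)                # start from the intersection
    candidates = (set(a_s2t) | set(a_t2s)) - links

    def frontier_size(link_set):
        return len(frontier_nodes(tl_tree, link_set, n_tl_words))

    while candidates:
        best = max(candidates, key=lambda l: frontier_size(links | {l}))
        if frontier_size(links | {best}) < frontier_size(links):
            break                                  # assumed stop: no link helps
        links.add(best)
        candidates.remove(best)
    return links
```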
74 2 Combining TST-FS/FA-PR and IDSG Parser re-training aims to improve the parser with the alignment matrix, while IDSG aims to improve the alignment matrix with the parse tree. [sent-139, score-1.145]
75 That is, we could either improve the alignment matrix by IDSG and then re-train the parser with the better alignments, or re-train the parser and then improve the alignment matrix with the better syntactic information. [sent-141, score-1.194]
76 4 Experiment In this section, we conduct experiments on a Chinese-to-English translation task to test our proposed methods of parser re-training and word alignment symmetrization. [sent-143, score-0.672]
77 To get the baseline of this setting, we run IDG to combine the bidirectional alignments generated by Giza++ (Och and Ney, 2003), and run the Berkeley parser (Petrov and Klein, 2007) to parse the target sentences. [sent-167, score-0.75]
78 3 Results of TST-FS/FA-PR The parser re-training strategies TST-FS and FA-PR are tested against two baselines: one is the default parser without any re-training and the other is standard self-training (SST). [sent-176, score-0.353]
79 It can be seen that just standard self-training does improve translation performance, as retraining on the TL side of bilingual data is a kind of domain adaptation (from WSJ to IWSLT/NIST). [sent-179, score-0.352]
80 This confirms the value of word alignment information in parser re-training. [sent-181, score-0.534]
81 Finally, the even larger improvement of FA-PR over TST-FS shows that merely increasing the number of frontier nodes is not enough. [sent-182, score-0.401]
82 Some frontier nodes are of poor quality, and the frontier nodes found in forced alignment are more suitable. [sent-183, score-1.388]
83 The forced alignment of bilingual training data does not obtain a full decoding path for every bilingual sentence. [sent-190, score-0.908]
84 In general, the longer the bilingual sentence, the less likely forced alignment is to succeed, which is why a lower proportion of the NIST data can be forced-aligned. [sent-193, score-0.733]
85 As shown by the results in Tables 5 and 6, IDSG enlarges the set of translation rules by more than 20%, thereby improving translation performance significantly. [sent-207, score-0.336]
86 As mentioned earlier, parser re-training and the new symmetrization method can be combined in two different ways, depending on the order of application. [sent-216, score-0.335]
87 5 Related Works There have been many attempts at improving word alignment with syntactic information (Cherry and Lin, 2006; DeNero and Klein, 2007; Hermjakob, 2009) and at improving the parser with alignment information (Burkett and Klein, 2008). [sent-221, score-1.018]
88 To improve the performance of syntactic machine translation, Huang and Knight (2006) proposed a method incorporating a handful of relabeling strategies to modify syntactic tree structures. [sent-223, score-0.31]
89 Ambati and Lavie (2008) restructured target parse trees to generate highly isomorphic target trees that preserve the syntactic boundaries of constituents aligned in the original parse trees. [sent-224, score-0.696]
90 Different from the previous work of modifying tree structures with post-processing methods, our methods try to learn a suitable grammar for string-to-tree SMT models, and directly produce trees which are consistent with word alignment matrices. [sent-228, score-0.632]
91 Instead of modifying the parse tree to improve machine translation performance, many methods were proposed to modify word alignment by taking the syntactic tree into consideration, including deleting incorrect word alignment links with a discriminative model (Fossum et al., 2008), [sent-229, score-1.516]
92 re-aligning sentence pairs using an EM method with the rules extracted from an initial alignment (Wang et al., 2010), [sent-230, score-0.46]
93 and removing ambiguous alignments of functional words with constraints from chunk-level information during rule extraction (Wu et al.). [sent-231, score-0.407]
94 Our major contribution is the strategies for re-training the parser with the bilingual information in alignment matrices. [sent-235, score-0.718]
95 Either of our proposals, targeted self-training with frontier set size as the evaluation function and forced-alignment-based re-training, is more effective than the baseline on the IWSLT data set. [sent-236, score-1.036]
96 As an auxiliary method, we also attempted to improve alignment matrices by a new symmetrization method. [sent-241, score-0.615]
97 In the future, we will explore more alternatives for integrating parsing information and alignment information, such as discriminative word alignment using many features from the parser. [sent-242, score-0.818]
98 Improving syntax-driven translation models by re-structuring divergent and non-isomorphic parse tree structures. [sent-245, score-0.433]
99 Soft syntactic constraints for word alignment through discriminative training. [sent-253, score-0.444]
100 Using syntax to improve word alignment precision for syntax-based machine translation. [sent-261, score-0.401]
wordName wordTfidf (topN-words)
[('alignment', 0.376), ('frontier', 0.352), ('idsg', 0.335), ('ssmt', 0.296), ('tl', 0.217), ('iwslt', 0.213), ('forced', 0.21), ('symmetrization', 0.177), ('parse', 0.177), ('parser', 0.158), ('idg', 0.158), ('nist', 0.151), ('bilingual', 0.147), ('translation', 0.138), ('links', 0.11), ('sl', 0.109), ('trees', 0.098), ('targeted', 0.098), ('tree', 0.093), ('monolingual', 0.09), ('propaganda', 0.079), ('aligner', 0.075), ('null', 0.068), ('syntactic', 0.068), ('incorrect', 0.067), ('span', 0.062), ('rules', 0.06), ('ambati', 0.059), ('hinder', 0.059), ('violation', 0.059), ('link', 0.055), ('wuebker', 0.051), ('selftraining', 0.051), ('lectures', 0.05), ('nodes', 0.049), ('smt', 0.048), ('bold', 0.048), ('round', 0.047), ('fossum', 0.046), ('matrices', 0.044), ('retraining', 0.04), ('target', 0.039), ('relabeling', 0.039), ('wsj', 0.038), ('nu', 0.038), ('strategies', 0.037), ('dialog', 0.036), ('kept', 0.036), ('bleu', 0.034), ('traced', 0.034), ('galley', 0.033), ('minimal', 0.033), ('och', 0.033), ('decoder', 0.031), ('knight', 0.031), ('rule', 0.031), ('harbin', 0.031), ('node', 0.03), ('external', 0.03), ('mt', 0.03), ('nn', 0.029), ('suitable', 0.029), ('matrix', 0.029), ('pruning', 0.029), ('petrov', 0.029), ('decoding', 0.028), ('np', 0.028), ('complement', 0.028), ('side', 0.027), ('partial', 0.027), ('burkett', 0.027), ('biggest', 0.027), ('parsing', 0.026), ('kevin', 0.026), ('consideration', 0.025), ('cherry', 0.025), ('syntax', 0.025), ('generalization', 0.025), ('sentence', 0.024), ('diagram', 0.024), ('kong', 0.024), ('franz', 0.023), ('mutually', 0.023), ('added', 0.022), ('klein', 0.021), ('lot', 0.021), ('denero', 0.02), ('hong', 0.02), ('yielded', 0.02), ('errors', 0.02), ('candidate', 0.02), ('rise', 0.02), ('hermann', 0.02), ('alternatives', 0.019), ('attempts', 0.019), ('iteration', 0.018), ('ibm', 0.018), ('auxiliary', 0.018), ('modifying', 0.018), ('consistent', 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
Author: Shujie Liu ; Chi-Ho Li ; Mu Li ; Ming Zhou
Abstract: The training of most syntactic SMT approaches involves two essential components, word alignment and monolingual parser. In the current state of the art these two components are mutually independent, thus causing problems like lack of rule generalization, and violation of syntactic correspondence in translation rules. In this paper, we propose two ways of re-training monolingual parser with the target of maximizing the consistency between parse trees and alignment matrices. One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. We also propose an auxiliary method for boosting alignment quality, by symmetrizing alignment matrices with respect to parse trees. The best combination of these novel methods achieves 3 Bleu point gain in an IWSLT task and more than 1 Bleu point gain in NIST tasks. 1
2 0.16867222 54 emnlp-2012-Forced Derivation Tree based Model Training to Statistical Machine Translation
Author: Nan Duan ; Mu Li ; Ming Zhou
Abstract: A forced derivation tree (FDT) of a sentence pair {f, e} denotes a derivation tree that can translate f into its accurate target translation e. In this paper, we present an approach that leverages structured knowledge contained in FDTs to train component models for statistical machine translation (SMT) systems. We first describe how to generate different FDTs for each sentence pair in the training corpus, and then present how to infer the optimal FDTs based on their derivation and alignment qualities. As the first step in this line of research, we verify the effectiveness of our approach in a BTG-based phrasal system, and propose four FDT-based component models. Experiments are carried out on large scale English-to-Japanese and Chinese-to-English translation tasks, and significant improvements are reported on both translation quality and alignment quality.
3 0.14401008 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
Author: David Burkett ; Dan Klein
Abstract: We describe a transformation-based learning method for learning a sequence of monolingual tree transformations that improve the agreement between constituent trees and word alignments in bilingual corpora. Using the manually annotated English Chinese Translation Treebank, we show how our method automatically discovers transformations that accommodate differences in English and Chinese syntax. Furthermore, when transformations are learned on automatically generated trees and alignments from the same domain as the training data for a syntactic MT system, the transformed trees achieve a 0.9 BLEU improvement over baseline trees.
4 0.12588772 1 emnlp-2012-A Bayesian Model for Learning SCFGs with Discontiguous Rules
Author: Abby Levenberg ; Chris Dyer ; Phil Blunsom
Abstract: We describe a nonparametric model and corresponding inference algorithm for learning Synchronous Context Free Grammar derivations for parallel text. The model employs a Pitman-Yor Process prior which uses a novel base distribution over synchronous grammar rules. Through both synthetic grammar induction and statistical machine translation experiments, we show that our model learns complex translational correspondences— including discontiguous, many-to-many alignments—and produces competitive translation results. Further, inference is efficient and we present results on significantly larger corpora than prior work.
5 0.11068217 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori
Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser’s derivation tree as a latent variable in a model that is trained to maximize reordering accuracy. We demonstrate that efficient large-margin training is possible by showing that two measures of reordering accuracy can be factored over the parse tree. Using this model in the pre-ordering framework results in significant gains in translation accuracy over standard phrasebased SMT and previously proposed unsupervised syntax induction methods.
6 0.11034419 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
7 0.1022202 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
8 0.09835048 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
9 0.097027443 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques
10 0.092920348 86 emnlp-2012-Locally Training the Log-Linear Model for SMT
11 0.091042802 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
12 0.089273207 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
13 0.085347965 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
14 0.080725096 35 emnlp-2012-Document-Wide Decoding for Phrase-Based Statistical Machine Translation
15 0.078303732 25 emnlp-2012-Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation
16 0.075837962 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
17 0.074628934 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
18 0.071849912 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
19 0.065560445 128 emnlp-2012-Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
20 0.063037269 39 emnlp-2012-Enlarging Paraphrase Collections through Generalization and Instantiation
topicId topicWeight
[(0, 0.227), (1, -0.204), (2, -0.172), (3, -0.039), (4, -0.013), (5, -0.036), (6, -0.067), (7, 0.058), (8, -0.078), (9, 0.014), (10, -0.043), (11, 0.092), (12, -0.031), (13, 0.148), (14, -0.058), (15, 0.076), (16, 0.052), (17, 0.014), (18, 0.016), (19, -0.039), (20, -0.025), (21, -0.054), (22, -0.043), (23, 0.045), (24, 0.017), (25, -0.008), (26, -0.176), (27, -0.102), (28, -0.026), (29, 0.009), (30, -0.136), (31, 0.068), (32, 0.032), (33, -0.085), (34, -0.047), (35, 0.142), (36, -0.027), (37, -0.05), (38, 0.017), (39, 0.013), (40, 0.008), (41, 0.106), (42, -0.132), (43, -0.006), (44, -0.052), (45, -0.132), (46, 0.249), (47, -0.054), (48, 0.125), (49, 0.119)]
simIndex simValue paperId paperTitle
same-paper 1 0.96404594 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
Author: Shujie Liu ; Chi-Ho Li ; Mu Li ; Ming Zhou
Abstract: The training of most syntactic SMT approaches involves two essential components, word alignment and monolingual parser. In the current state of the art these two components are mutually independent, thus causing problems like lack of rule generalization, and violation of syntactic correspondence in translation rules. In this paper, we propose two ways of re-training monolingual parser with the target of maximizing the consistency between parse trees and alignment matrices. One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. We also propose an auxiliary method for boosting alignment quality, by symmetrizing alignment matrices with respect to parse trees. The best combination of these novel methods achieves 3 Bleu point gain in an IWSLT task and more than 1 Bleu point gain in NIST tasks. 1
2 0.67871791 54 emnlp-2012-Forced Derivation Tree based Model Training to Statistical Machine Translation
Author: Nan Duan ; Mu Li ; Ming Zhou
Abstract: A forced derivation tree (FDT) of a sentence pair {f, e} denotes a derivation tree that can translate f into its accurate target translation e. In this paper, we present an approach that leverages structured knowledge contained in FDTs to train component models for statistical machine translation (SMT) systems. We first describe how to generate different FDTs for each sentence pair in the training corpus, and then present how to infer the optimal FDTs based on their derivation and alignment qualities. As the first step in this line of research, we verify the effectiveness of our approach in a BTG-based phrasal system, and propose four FDT-based component models. Experiments are carried out on large scale English-to-Japanese and Chinese-to-English translation tasks, and significant improvements are reported on both translation quality and alignment quality.
3 0.60367703 1 emnlp-2012-A Bayesian Model for Learning SCFGs with Discontiguous Rules
Author: Abby Levenberg ; Chris Dyer ; Phil Blunsom
Abstract: We describe a nonparametric model and corresponding inference algorithm for learning Synchronous Context Free Grammar derivations for parallel text. The model employs a Pitman-Yor Process prior which uses a novel base distribution over synchronous grammar rules. Through both synthetic grammar induction and statistical machine translation experiments, we show that our model learns complex translational correspondences— including discontiguous, many-to-many alignments—and produces competitive translation results. Further, inference is efficient and we present results on significantly larger corpora than prior work.
4 0.56644368 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
Author: David Burkett ; Dan Klein
Abstract: We describe a transformation-based learning method for learning a sequence of monolingual tree transformations that improve the agreement between constituent trees and word alignments in bilingual corpora. Using the manually annotated English Chinese Translation Treebank, we show how our method automatically discovers transformations that accommodate differences in English and Chinese syntax. Furthermore, when transformations are learned on automatically generated trees and alignments from the same domain as the training data for a syntactic MT system, the transformed trees achieve a 0.9 BLEU improvement over baseline trees.
5 0.52462852 86 emnlp-2012-Locally Training the Log-Linear Model for SMT
Author: Lemao Liu ; Hailong Cao ; Taro Watanabe ; Tiejun Zhao ; Mo Yu ; Conghui Zhu
Abstract: In statistical machine translation, minimum error rate training (MERT) is a standard method for tuning a single weight with regard to a given development data. However, due to the diversity and uneven distribution of source sentences, there are two problems suffered by this method. First, its performance is highly dependent on the choice of a development set, which may lead to an unstable performance for testing. Second, translations become inconsistent at the sentence level since tuning is performed globally on a document level. In this paper, we propose a novel local training method to address these two problems. Unlike a global training method, such as MERT, in which a single weight is learned and used for all the input sentences, we perform training and testing in one step by learning a sentence-wise weight for each input sentence. We propose efficient incremental training methods to put the local training into practice. In NIST Chinese-to-English translation tasks, our local training method significantly outperforms MERT with the maximal improvements up to 2.0 BLEU points, meanwhile its efficiency is comparable to that of the global method.
6 0.46686792 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
7 0.433649 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
8 0.42663023 25 emnlp-2012-Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation
9 0.40725946 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
10 0.39511982 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
11 0.38687459 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
12 0.38092124 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
13 0.38068581 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
14 0.36927205 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
15 0.3515749 128 emnlp-2012-Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models
16 0.33231637 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
17 0.32836285 27 emnlp-2012-Characterizing Stylistic Elements in Syntactic Structure
18 0.32484171 133 emnlp-2012-Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision
19 0.31647825 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
20 0.30488405 119 emnlp-2012-Spectral Dependency Parsing with Latent Variables
topicId topicWeight
[(2, 0.011), (11, 0.012), (16, 0.039), (25, 0.013), (34, 0.104), (38, 0.258), (41, 0.012), (60, 0.088), (63, 0.06), (64, 0.019), (65, 0.021), (70, 0.024), (74, 0.099), (76, 0.085), (79, 0.012), (80, 0.021), (86, 0.016)]
simIndex simValue paperId paperTitle
same-paper 1 0.77305812 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
Author: Shujie Liu ; Chi-Ho Li ; Mu Li ; Ming Zhou
Abstract: The training of most syntactic SMT approaches involves two essential components, word alignment and monolingual parser. In the current state of the art these two components are mutually independent, thus causing problems like lack of rule generalization, and violation of syntactic correspondence in translation rules. In this paper, we propose two ways of re-training monolingual parser with the target of maximizing the consistency between parse trees and alignment matrices. One is targeted self-training with a simple evaluation function; the other is based on training data selection from forced alignment of bilingual data. We also propose an auxiliary method for boosting alignment quality, by symmetrizing alignment matrices with respect to parse trees. The best combination of these novel methods achieves 3 Bleu point gain in an IWSLT task and more than 1 Bleu point gain in NIST tasks. 1
2 0.57640517 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
Author: Jayant Krishnamurthy ; Tom Mitchell
Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.
3 0.56462514 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
Author: Shafiq Joty ; Giuseppe Carenini ; Raymond Ng
Abstract: We propose a complete probabilistic discriminative framework for performing sentence-level discourse analysis. Our framework comprises a discourse segmenter, based on a binary classifier, and a discourse parser, which applies an optimal CKY-like parsing algorithm to probabilities inferred from a Dynamic Conditional Random Field. We show on two corpora that our approach outperforms the state-of-the-art, often by a wide margin.
4 0.55537885 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
5 0.55535972 130 emnlp-2012-Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
Author: Kewei Tu ; Vasant Honavar
Abstract: We introduce a novel approach named unambiguity regularization for unsupervised learning of probabilistic natural language grammars. The approach is based on the observation that natural language is remarkably unambiguous in the sense that only a tiny portion of the large number of possible parses of a natural language sentence are syntactically valid. We incorporate an inductive bias into grammar learning in favor of grammars that lead to unambiguous parses on natural language sentences. The resulting family of algorithms includes the expectation-maximization algorithm (EM) and its variant, Viterbi EM, as well as a so-called softmax-EM algorithm. The softmax-EM algorithm can be implemented with a simple and computationally efficient extension to standard EM. In our experiments of unsupervised dependency grammar learning, we show that unambiguity regularization is beneficial to learning, and in combination with annealing (of the regularization strength) and sparsity priors it leads to improvement over the current state of the art.
6 0.55384314 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
7 0.55185813 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
8 0.54955035 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
9 0.54886895 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
10 0.54806656 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
11 0.54721558 89 emnlp-2012-Mixed Membership Markov Models for Unsupervised Conversation Modeling
12 0.54698682 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
13 0.54635131 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
14 0.54350543 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
15 0.54332781 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
16 0.54306775 54 emnlp-2012-Forced Derivation Tree based Model Training to Statistical Machine Translation
17 0.5425787 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
18 0.54235703 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
19 0.54188836 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
20 0.5402934 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques