
166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation

Source: pdf

Author: Jingbo Zhu ; Tong Xiao

Abstract: To address the parse error issue for tree-to-string translation, this paper proposes a similarity-based decoding generation (SDG) solution by reconstructing similar source parse trees for decoding at the decoding time instead of taking multiple source parse trees as input for decoding. Experiments on Chinese-English translation demonstrated that our approach can achieve a significant improvement over the standard method, and has little impact on decoding speed in practice. Our approach is very easy to implement, and can be applied to other paradigms such as tree-to-tree models.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Improving Decoding Generalization for Tree-to-String Translation Jingbo Zhu Natural Language Processing Laboratory Northeastern University, Shenyang, China zhujingbo@mail. [sent-1, score-0.137]

2 cn Abstract To address the parse error issue for tree-to-string translation, this paper proposes a similarity-based decoding generation (SDG) solution by reconstructing similar source parse trees for decoding at the decoding time instead of taking multiple source parse trees as input for decoding. [sent-4, score-3.52]

3 Experiments on Chinese-English translation demonstrated that our approach can achieve a significant improvement over the standard method, and has little impact on decoding speed in practice. [sent-5, score-0.725]

4 Our approach is very easy to implement, and can be applied to other paradigms such as tree-to-tree models. [sent-6, score-0.15]

5 1 Introduction Among linguistically syntax-based statistical machine translation (SMT) approaches, the tree-to-string model (Huang et al. [sent-7, score-0.243]

6 2006) is the simplest and fastest, in which parse trees on source side are used for grammar extraction and decoding. [sent-9, score-0.928]

7 , Chinese) string c and its auto-parsed tree T1-best, the goal of typical tree-to-string SMT is to find a target (e. [sent-12, score-0.305]

8 , English) string e* by the following equation as $e^* = \arg\max_{e} \Pr(e \mid c, T_{1\text{-best}})$ (1), where Pr(e | c, T1-best) is the probability that e is the translation of the given source string c and its T1-best. [sent-14, score-0.614]

9 A typical tree-to-string decoder aims to search for the best derivation among all consistent derivations that convert source tree into a target-language Tong Xiao Natural Language Processing Laboratory Northeastern University, Shenyang, China xiaotong@mail. [sent-15, score-0.928]

10 We call this set of consistent derivations the tree-to-string search space. [sent-19, score-0.292]

11 Each derivation in the search space respects the source parse tree. [sent-20, score-0.815]

12 Parsing errors on source parse trees would cause negative effects on tree-to-string translation due to decoding on incorrect source parse trees. [sent-21, score-1.975]

13 2010) to generate is to utilize multiple parsers, which can improve the diversity among source parse trees in . [sent-24, score-1.002]

14 In this solution, the most representative work is the forest-based translation method (Mi et al. [sent-25, score-0.244]
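
The page does not document how sentScore is computed. A minimal sketch, assuming each score is the plain sum of the tf-idf weights of the sentence's whitespace-separated, case-sensitive tokens, reproduces the listed scores for sentences 10 and 11. The function name `sentence_score` and the `WEIGHTS` subset are illustrative, not part of the original tool; the weights themselves are copied from the word list on this page.

```python
# Subset of the per-paper tf-idf weights listed on this page.
WEIGHTS = {
    "call": 0.043, "consistent": 0.065, "derivations": 0.128,
    "search": 0.056, "space": 0.019, "derivation": 0.108,
    "respects": 0.118, "source": 0.166, "parse": 0.348, "tree": 0.075,
}

def sentence_score(sentence, weights):
    """Sum the weights of whitespace-separated tokens (assumed scoring rule).

    Tokens are matched exactly, so a token with trailing punctuation such as
    "tree." or "space." does not match and contributes nothing.
    """
    return sum(weights.get(token, 0.0) for token in sentence.split())

S10 = "We call this set of consistent derivations the tree-to-string search space."
S11 = "Each derivation in the search space respects the source parse tree."

print(round(sentence_score(S10, WEIGHTS), 3))  # 0.292
print(round(sentence_score(S11, WEIGHTS), 3))  # 0.815
```

Under this assumption a summary is simply the top-scoring sentences, which would explain why long sentences that repeat high-weight words such as "parse" and "decoding" rank first.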

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('parse', 0.348), ('shenyang', 0.31), ('decoding', 0.306), ('argmaxpr', 0.274), ('trees', 0.264), ('northeastern', 0.237), ('mai', 0.2), ('translation', 0.186), ('xiao', 0.166), ('source', 0.166), ('forest', 0.151), ('ingbo', 0.137), ('derivations', 0.128), ('jingbo', 0.118), ('respects', 0.118), ('fastest', 0.112), ('reconstructing', 0.112), ('smt', 0.112), ('derivation', 0.108), ('laboratory', 0.107), ('solution', 0.106), ('tong', 0.103), ('mi', 0.103), ('string', 0.102), ('paradigms', 0.1), ('china', 0.097), ('huang', 0.093), ('packed', 0.089), ('typical', 0.083), ('tree', 0.075), ('cn', 0.072), ('issue', 0.07), ('diversity', 0.068), ('zhu', 0.066), ('consistent', 0.065), ('proposes', 0.062), ('pr', 0.06), ('convert', 0.059), ('representative', 0.058), ('simplest', 0.058), ('address', 0.058), ('linguistically', 0.057), ('generalization', 0.056), ('decoder', 0.056), ('search', 0.056), ('among', 0.055), ('aims', 0.05), ('speed', 0.049), ('utilize', 0.049), ('effects', 0.047), ('implement', 0.047), ('error', 0.047), ('cause', 0.047), ('formally', 0.046), ('parsers', 0.045), ('demonstrated', 0.045), ('equation', 0.043), ('call', 0.043), ('incorrect', 0.043), ('denotes', 0.041), ('side', 0.038), ('effectively', 0.038), ('chinese', 0.037), ('easy', 0.036), ('little', 0.036), ('input', 0.036), ('liu', 0.035), ('zhang', 0.035), ('expressed', 0.035), ('alternative', 0.034), ('taking', 0.034), ('grammar', 0.033), ('impact', 0.032), ('produced', 0.031), ('multiple', 0.031), ('generation', 0.03), ('negative', 0.029), ('instead', 0.027), ('best', 0.027), ('improving', 0.027), ('errors', 0.025), ('goal', 0.025), ('short', 0.024), ('parsing', 0.023), ('achieve', 0.022), ('generate', 0.021), ('improvement', 0.021), ('extraction', 0.021), ('target', 0.02), ('space', 0.019), ('represent', 0.018), ('syntactic', 0.016), ('natural', 0.016), ('probability', 0.015), ('significant', 0.015), ('processing', 0.015), ('structure', 0.015), ('applied', 0.014), ('time', 0.013), ('standard', 0.013)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999982 166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation

Author: Jingbo Zhu ; Tong Xiao

2 0.23399195 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

Author: Xianchao Wu ; Takuya Matsuzaki ; Jun'ichi Tsujii

Abstract: In the present paper, we propose the effective usage of function words to generate generalized translation rules for forest-based translation. Given aligned forest-string pairs, we extract composed tree-to-string translation rules that account for multiple interpretations of both aligned and unaligned target function words. In order to constrain the exhaustive attachments of function words, we limit to bind them to the nearby syntactic chunks yielded by a target dependency parser. Therefore, the proposed approach can not only capture source-tree-to-target-chunk correspondences but can also use forest structures that compactly encode an exponential number of parse trees to properly generate target function words during decoding. Extensive experiments involving large-scale English-to-Japanese translation revealed a significant improvement of 1.8 points in BLEU score, as compared with a strong forest-to-string baseline system.

3 0.23003598 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation

Author: Nan Duan ; Mu Li ; Ming Zhou

Abstract: This paper presents hypothesis mixture decoding (HM decoding), a new decoding scheme that performs translation reconstruction using hypotheses generated by multiple translation systems. HM decoding involves two decoding stages: first, each component system decodes independently, with the explored search space kept for use in the next step; second, a new search space is constructed by composing existing hypotheses produced by all component systems using a set of rules provided by the HM decoder itself, and a new set of model independent features are used to seek the final best translation from this new search space. Few assumptions are made by our approach about the underlying component systems, enabling us to leverage SMT models based on arbitrary paradigms. We compare our approach with several related techniques, and demonstrate significant BLEU improvements in large-scale Chinese-to-English translation tasks.

4 0.19074725 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

Author: Bing Zhao ; Young-Suk Lee ; Xiaoqiang Luo ; Liu Li

Abstract: We propose a novel technique of learning how to transform the source parse trees to improve the translation qualities of syntax-based translation models using synchronous context-free grammars. We transform the source tree phrasal structure into a set of simpler structures, expose such decisions to the decoding process, and find the least expensive transformation operation to better model word reordering. In particular, we integrate synchronous binarizations, verb regrouping, removal of redundant parse nodes, and incorporate a few important features such as translation boundaries. We learn the structural preferences from the data in a generative framework. The syntax-based translation system integrating the proposed techniques outperforms the best Arabic-English unconstrained system in NIST08 evaluations by 1.3 absolute BLEU, which is statistically significant.

5 0.18241461 30 acl-2011-Adjoining Tree-to-String Translation

Author: Yang Liu ; Qun Liu ; Yajuan Lu

Abstract: We introduce synchronous tree adjoining grammars (TAG) into tree-to-string translation, which converts a source tree to a target string. Without reconstructing TAG derivations explicitly, our rule extraction algorithm directly learns tree-to-string rules from aligned Treebank-style trees. As tree-to-string translation casts decoding as a tree parsing problem rather than parsing, the decoder still runs fast when adjoining is included. Less than 2 times slower, the adjoining tree-to-string system improves translation quality by +0.7 BLEU over the baseline system only allowing for tree substitution on NIST Chinese-English test sets.

6 0.18091027 217 acl-2011-Machine Translation System Combination by Confusion Forest

7 0.18027309 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

8 0.17209639 61 acl-2011-Binarized Forest to String Translation

9 0.15214317 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

10 0.12304804 59 acl-2011-Better Automatic Treebank Conversion Using A Feature-Based Approach

11 0.12194211 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

12 0.11269854 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation

13 0.09375751 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

14 0.087630734 44 acl-2011-An exponential translation model for target language morphology

15 0.085988216 233 acl-2011-On-line Language Model Biasing for Statistical Machine Translation

16 0.085281849 106 acl-2011-Dual Decomposition for Natural Language Processing

17 0.085196644 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

18 0.084693633 235 acl-2011-Optimal and Syntactically-Informed Decoding for Monolingual Phrase-Based Alignment

19 0.082838006 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

20 0.082608774 116 acl-2011-Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers
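
The page does not say which measure produces simValue. A common choice for tf-idf vectors is cosine similarity, so the following is a sketch under that assumption rather than a description of the actual tool. The helper name `cosine_sparse` and the second vector are hypothetical; the first vector reuses a few weights from the word list above.

```python
import math

def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)

# A few weights copied from this paper's tf-idf list.
this_paper = {"parse": 0.348, "decoding": 0.306, "trees": 0.264, "forest": 0.151}
# Hypothetical vector for another paper, for illustration only.
other_paper = {"forest": 0.40, "translation": 0.30, "trees": 0.10}

print(cosine_sparse(this_paper, this_paper))   # very close to 1.0
print(cosine_sparse(this_paper, other_paper))  # between 0 and 1
```

With cosine similarity a document compared against itself scores 1 up to floating-point error, which would be consistent with the same-paper row showing 0.99999982 instead of exactly 1.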

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.192), (1, -0.216), (2, 0.113), (3, -0.055), (4, 0.046), (5, 0.01), (6, -0.2), (7, -0.06), (8, -0.009), (9, -0.021), (10, -0.05), (11, -0.048), (12, -0.008), (13, -0.103), (14, 0.037), (15, -0.004), (16, 0.004), (17, -0.038), (18, -0.013), (19, 0.017), (20, -0.049), (21, -0.037), (22, 0.088), (23, 0.084), (24, 0.014), (25, -0.081), (26, 0.045), (27, 0.059), (28, -0.011), (29, 0.016), (30, -0.071), (31, -0.012), (32, -0.007), (33, -0.054), (34, 0.031), (35, -0.009), (36, -0.116), (37, -0.092), (38, -0.077), (39, -0.15), (40, 0.043), (41, 0.072), (42, -0.083), (43, -0.025), (44, 0.094), (45, -0.03), (46, 0.033), (47, -0.048), (48, -0.019), (49, 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97171509 166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation

Author: Jingbo Zhu ; Tong Xiao

2 0.83349943 217 acl-2011-Machine Translation System Combination by Confusion Forest

Author: Taro Watanabe ; Eiichiro Sumita

Abstract: The state-of-the-art system combination method for machine translation (MT) is based on confusion networks constructed by aligning hypotheses with regard to word similarities. We introduce a novel system combination framework in which hypotheses are encoded as a confusion forest, a packed forest representing alternative trees. The forest is generated using syntactic consensus among parsed hypotheses: First, MT outputs are parsed. Second, a context free grammar is learned by extracting a set of rules that constitute the parse trees. Third, a packed forest is generated starting from the root symbol of the extracted grammar through non-terminal rewriting. The new hypothesis is produced by searching the best derivation in the forest. Experimental results on the WMT10 system combination shared task yield comparable performance to the conventional confusion network based method with smaller space.

3 0.78322619 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation

Author: Nan Duan ; Mu Li ; Ming Zhou

4 0.75645024 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

Author: Xianchao Wu ; Takuya Matsuzaki ; Jun'ichi Tsujii

5 0.72487879 30 acl-2011-Adjoining Tree-to-String Translation

Author: Yang Liu ; Qun Liu ; Yajuan Lu

6 0.72366571 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

7 0.71413797 220 acl-2011-Minimum Bayes-risk System Combination

8 0.66975015 268 acl-2011-Rule Markov Models for Fast Tree-to-String Translation

9 0.63320202 61 acl-2011-Binarized Forest to String Translation

10 0.61998707 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

11 0.61070347 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

12 0.60710335 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation

13 0.58504939 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

14 0.51284188 106 acl-2011-Dual Decomposition for Natural Language Processing

15 0.50849581 330 acl-2011-Using Derivation Trees for Treebank Error Detection

16 0.49234673 154 acl-2011-How to train your multi bottom-up tree transducer

17 0.49109358 28 acl-2011-A Statistical Tree Annotator and Its Applications

18 0.47460908 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

19 0.47074491 59 acl-2011-Better Automatic Treebank Conversion Using A Feature-Based Approach

20 0.46795368 116 acl-2011-Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(17, 0.08), (37, 0.099), (39, 0.095), (41, 0.048), (55, 0.01), (59, 0.018), (91, 0.024), (95, 0.251), (96, 0.266)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.85526013 166 acl-2011-Improving Decoding Generalization for Tree-to-String Translation

Author: Jingbo Zhu ; Tong Xiao

2 0.76390326 61 acl-2011-Binarized Forest to String Translation

Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu

Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forest-to-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-constituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-to-string decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.

3 0.76176661 18 acl-2011-A Latent Topic Extracting Method based on Events in a Document and its Application

Author: Risa Kitajima ; Ichiro Kobayashi

Abstract: Recently, several latent topic analysis methods such as LSI, pLSI, and LDA have been widely used for text analysis. However, those methods basically assign topics to words, but do not account for the events in a document. With this background, in this paper, we propose a latent topic extracting method which assigns topics to events. We also show that our proposed method is useful to generate a document summary based on a latent topic.

4 0.76158565 117 acl-2011-Entity Set Expansion using Topic information

Author: Kugatsu Sadamitsu ; Kuniko Saito ; Kenji Imamura ; Genichiro Kikui

Abstract: This paper proposes three modules based on latent topics of documents for alleviating “semantic drift” in bootstrapping entity set expansion. These new modules are added to a discriminative bootstrapping algorithm to realize topic feature generation, negative example selection and entity candidate pruning. In this study, we model latent topics with LDA (Latent Dirichlet Allocation) in an unsupervised way. Experiments show that the accuracy of the extracted entities is improved by 6.7 to 28.2% depending on the domain.

5 0.76098537 241 acl-2011-Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation

Author: Zhongguo Li

Abstract: Lots of Chinese characters are very productive in that they can form many structured words either as prefixes or as suffixes. Previous research in Chinese word segmentation mainly focused on identifying only the word boundaries without considering the rich internal structures of many words. In this paper we argue that this is unsatisfying in many ways, both practically and theoretically. Instead, we propose that word structures should be recovered in morphological analysis. An elegant approach for doing this is given and the result is shown to be promising enough for encouraging further effort in this direction. Our probability model is trained with the Penn Chinese Treebank and actually is able to parse both word and phrase structures in a unified way.

1 Why Parse Word Structures? Research in Chinese word segmentation has progressed tremendously in recent years, with state of the art performing at around 97% in precision and recall (Xue, 2003; Gao et al., 2005; Zhang and Clark, 2007; Li and Sun, 2009). However, virtually all these systems focus exclusively on recognizing the word boundaries, giving no consideration to the internal structures of many words. Though it has been the standard practice for many years, we argue that this paradigm is inadequate both in theory and in practice, for at least the following four reasons.

The first reason is that if we confine our definition of word segmentation to the identification of word boundaries, then people tend to have divergent opinions as to whether a linguistic unit is a word or not (Sproat et al., 1996). This has led to many different annotation standards for Chinese word segmentation. Even worse, this could cause inconsistency in the same corpus. For instance, 副总裁 ‘vice president’ is considered to be one word in the Penn Chinese Treebank (Xue et al., 2005), but is split into two words by the Peking University corpus in the SIGHAN Bakeoffs (Sproat and Emerson, 2003). Meanwhile, 副主任 ‘vice director’ and 副经理 ‘deputy manager’ are both segmented into two words in the same Penn Chinese Treebank. In fact, all these words are composed of the prefix 副 ‘vice’ and a root word. Thus the structure of 副总裁 ‘vice president’ can be represented with the tree in Figure 1. Without a doubt, there is complete agreement on the correctness of this structure among native Chinese speakers. So if instead of annotating only word boundaries, we annotate the structures of every word, then the annotation tends to be more consistent (footnote 1) and there could be less duplication of efforts in developing the expensive annotated corpus.

[Figure 1: Example of a word with internal structure.]

Footnote 1: Here it is necessary to add a note on terminology used in this paper. Since there is no universally accepted definition of the “word” concept in linguistics and especially in Chinese, whenever we use the term “word” we might mean a linguistic unit such as 副总裁 ‘vice president’ whose structure is shown as the tree in Figure 1, or we might mean a smaller unit such as 总裁 ‘president’ which is a substructure of that tree. Hopefully, the context will always make it clear what is being referred to with the term “word”.

The second reason is applications have different requirements for granularity of words. Take the personal name 周树人 ‘Zhou Shuren’ as an example. It’s considered to be one word in the Penn Chinese Treebank, but is segmented into a surname and a given name in the Peking University corpus. For some applications such as information extraction, the former segmentation is adequate, while for others like machine translation, the later finer-grained output is more preferable. If the analyzer can produce a structure as shown in Figure 4(a), then every application can extract what it needs from this tree. A solution with tree output like this is more elegant than approaches which try to meet the needs of different applications in post-processing (Gao et al., 2004).

The third reason is that traditional word segmentation has problems in handling many phenomena in Chinese. For example, the telescopic compound 大中小学 ‘universities, middle schools and primary schools’ is in fact composed of three coordinating elements 大学 ‘university’, 中学 ‘middle school’ and 小学 ‘primary school’. Regarding it as one flat word loses this important information. Another example is separable words like 游泳 ‘swim’. With a linear segmentation, the meaning of ‘swimming’ as in 游完泳 ‘after swimming’ cannot be properly represented, since 游泳 ‘swim’ will be segmented into discontinuous units. These language usages lie at the boundary between syntax and morphology, and are not uncommon in Chinese. They can be adequately represented with trees (Figure 2).

[Figure 2: Example of telescopic compound (a) and separable word (b).]

The last reason why we should care about word structures is related to head driven statistical parsers (Collins, 2003). To illustrate this, note that in the Penn Chinese Treebank, the word 英格兰人 ‘English People’ does not occur at all. Hence constituents headed by such words could cause some difficulty for head driven models in which out-of-vocabulary words need to be treated specially both when they are generated and when they are conditioned upon. But this word is in turn headed by its suffix 人 ‘people’, and there are 2,233 such words in Penn Chinese Treebank. If we annotate the structure of every compound containing this suffix (e.g. Figure 3), such data sparsity simply goes away.

Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 1405–1414, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics

6 0.76096493 28 acl-2011-A Statistical Tree Annotator and Its Applications

7 0.75979841 110 acl-2011-Effective Use of Function Words for Rule Generalization in Forest-Based Translation

8 0.75953817 155 acl-2011-Hypothesis Mixture Decoding for Statistical Machine Translation

9 0.7564373 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation

10 0.75197363 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

11 0.75175726 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

12 0.75043261 251 acl-2011-Probabilistic Document Modeling for Syntax Removal in Text Summarization

13 0.7497921 21 acl-2011-A Pilot Study of Opinion Summarization in Conversations

14 0.7497316 98 acl-2011-Discovery of Topically Coherent Sentences for Extractive Summarization

15 0.74963176 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

16 0.749277 11 acl-2011-A Fast and Accurate Method for Approximate String Search

17 0.74907184 169 acl-2011-Improving Question Recommendation by Exploiting Information Need

18 0.74895871 30 acl-2011-Adjoining Tree-to-String Translation

19 0.74741763 81 acl-2011-Consistent Translation using Discriminative Learning - A Translation Memory-inspired Approach

20 0.74735069 161 acl-2011-Identifying Word Translations from Comparable Corpora Using Latent Topic Models
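
The LSI and LDA sections above describe each paper as a list of (topicId, topicWeight) pairs and then rank other papers by a similarity value. The sketch below shows one way such a ranking could be produced, assuming cosine similarity over dense topic vectors and a 100-topic LDA model; neither assumption is stated on the page. The names `to_dense`, `cosine`, and `rank_similar` and the two candidate papers are hypothetical; only `THIS_PAPER_LDA` is copied from the page.

```python
import math

def to_dense(pairs, num_topics):
    """Expand sparse (topic_id, weight) pairs into a dense list of weights."""
    vec = [0.0] * num_topics
    for topic_id, weight in pairs:
        vec[topic_id] = weight
    return vec

def cosine(u, v):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def rank_similar(query_pairs, candidates, num_topics, k=3):
    """Return the k most similar candidates as (similarity, paper_id) tuples."""
    query = to_dense(query_pairs, num_topics)
    scored = [
        (cosine(query, to_dense(pairs, num_topics)), paper_id)
        for paper_id, pairs in candidates.items()
    ]
    return sorted(scored, reverse=True)[:k]

# LDA topic weights for this paper, copied from the page.
THIS_PAPER_LDA = [(17, 0.08), (37, 0.099), (39, 0.095), (41, 0.048), (55, 0.01),
                  (59, 0.018), (91, 0.024), (95, 0.251), (96, 0.266)]
# Hypothetical candidates, for illustration only.
CANDIDATES = {
    "paper-A": [(95, 0.3), (96, 0.2)],
    "paper-B": [(3, 0.5), (17, 0.05)],
}

print(rank_similar(THIS_PAPER_LDA, CANDIDATES, num_topics=100, k=2))
```

Because only nine topics carry weight for this paper, any candidate that shares the two dominant topics (95 and 96) scores highly whatever its subject. That may be why the LDA list places several unrelated papers within about 0.02 of each other.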