emnlp emnlp2011 emnlp2011-3 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: J. Scott McCarley ; Abraham Ittycheriah ; Salim Roukos ; Bing Xiang ; Jian-ming Xu
Abstract: Models of word alignment built as sequences of links have limited expressive power, but are easy to decode. Word aligners that model the alignment matrix can express arbitrary alignments, but are difficult to decode. We propose an alignment matrix model as a correction algorithm to an underlying sequence-based aligner. Then a greedy decoding algorithm enables the full expressive power of the alignment matrix formulation. Improved alignment performance is shown for all nine language pairs tested. The improved alignments also improved translation quality from Chinese to English and English to Italian.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Models of word alignment built as sequences of links have limited expressive power, but are easy to decode. [sent-7, score-0.595]
2 Word aligners that model the alignment matrix can express arbitrary alignments, but are difficult to decode. [sent-8, score-0.888]
3 We propose an alignment matrix model as a correction algorithm to an underlying sequence-based aligner. [sent-9, score-0.753]
4 Then a greedy decoding algorithm enables the full expressive power of the alignment matrix formulation. [sent-10, score-0.656]
5 Improved alignment performance is shown for all nine language pairs tested. [sent-11, score-0.506]
6 The improved alignments also improved translation quality from Chinese to English and English to Italian. [sent-12, score-0.543]
7 1 Introduction Word-level alignments of parallel text are crucial for enabling machine learning algorithms to fully utilize parallel corpora as training data. [sent-13, score-0.445]
8 Word alignments appear as hidden variables in IBM Models 1–5 (Brown et al. [sent-14, score-0.365]
9 Other notable applications of word alignments include cross-language projection of linguistic analyzers (such as POS taggers and named entity detectors), a subject which continues to be of interest. [sent-16, score-0.439]
10 , 2001), (Benajiba and Zitouni, 2010). The structure of the alignment model is tightly linked to the task of finding the optimal alignment. [sent-18, score-0.529]
11 Many alignment models are factorized in order to use dynamic programming and beam search for efficient marginalization and search. [sent-19, score-0.473]
12 An alignment model that jointly models all of the links in the entire sentence does not motivate a particular decoding order. [sent-23, score-0.682]
13 It simply assigns comparable scores to the alignment of the entire sentence, and may be used to rescore the top-N hypotheses of another aligner, or to decide whether heuristic perturbations to the output of an existing aligner constitute an improvement. [sent-24, score-0.85]
14 In this paper, we will show that by using an existing alignment as a starting point, we can make a significant improvement to the alignment by proposing a series of heuristic perturbations. [sent-26, score-1.08]
15 From any initial alignment configuration, these perturbations define a multitude of paths to the reference (gold) alignment. [sent-28, score-0.639]
16 Our model learns alignment moves that modify an initial alignment into the reference alignment. [sent-29, score-1.105]
17 Furthermore, the resulting model assigns a score to the alignment and thus could be used in numerous rescoring algorithms, such as top-N rescorers. [sent-30, score-0.473]
18 … work to choose alignment moves. [sent-33, score-0.473]
19 The alignment moves are sufficiently rich to reach arbitrary phrase-to-phrase alignments. [sent-35, score-0.585]
20 Since most of the features in the model are not language-specific, we are able to test the correction model easily on nine language pairs; our corrections improved the alignment quality compared to the input alignments in all nine. [sent-36, score-1.127]
21 This type of alignment model is not symmetric; interchanging source and target languages results in a different aligner. [sent-41, score-0.613]
22 This parameterization does not allow a target word to be linked to more than one source word, so some phrasal alignments are simply not considered. [sent-42, score-0.614]
23 Nevertheless, aligners that use this parameterization internally often incorporate various heuristics in order to augment their output with the disallowed alignments - for example, swapping source and target languages to obtain a second alignment (Koehn et al. [sent-44, score-1.319]
24 , 2006) and using posterior probabilities during alignment prediction even allows the model to see limited right context. [sent-47, score-0.473]
25 Another alignment combination strategy (Deng and Zhou, 2009) directly optimizes the size of the phrase table of a target MT system. [sent-48, score-0.537]
26 , 1996)) motivate a narrative where alignments are selected left-to-right and target words are then generated conditioned upon the alignment and the source words. [sent-50, score-1.009]
27 Discriminative models of alignment incorporate source and target words, as well as more linguistically motivated features into the prediction of alignment. [sent-52, score-0.613]
28 Examples include the maximum entropy model of (Ittycheriah and Roukos, 2005) or the conditional random field jointly normalized over the entire sequence of alignments of (Blunsom and Cohn, 2006). [sent-54, score-0.404]
29 3 Joint Models An alternate parameterization of alignment is the alignment matrix (Niehues and Vogel, 2008). [sent-55, score-1.126]
30 The alignment matrix A = {σij} is an l × m matrix of binary variables. [sent-62, score-0.727]
31 There is no constraint limiting the number of source tokens to which a target word is linked either; thus the binary matrix allows some alignments that cannot be modeled by the sequence parameterization. [sent-65, score-0.688]
32 All 2^{lm} binary matrices are potentially allowed in alignment matrix models. [sent-66, score-0.6]
33 This far exceeds (m + 1)^l, the number of alignments described by a comparable sequence model. [sent-68, score-0.365]
34 This parameterization is symmetric: if source and target are interchanged, then the alignment matrix is transposed. [sent-69, score-0.822]
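For concreteness, a quick back-of-the-envelope comparison of the two alignment spaces for a short sentence pair (the lengths here are illustrative, not from the paper):

```python
# Sizes of the two alignment spaces for a 5-word source and 5-word target.
l, m = 5, 5
matrix_alignments = 2 ** (l * m)    # all binary l-by-m matrices
sequence_alignments = (m + 1) ** l  # each of l words links to one of m words or NULL
print(matrix_alignments, sequence_alignments)  # 33554432 7776
```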
35 A straightforward approach to the alignment matrix is to build a log linear model (Liu et al. [sent-70, score-0.65]
36 (We continue to refer to “source” and “target” words only for consistency of notation - alignment models such as this are indifferent to the actual direction of translation. [sent-72, score-0.473]
37 ) The log linear model for the alignment (Liu et al. [sent-73, score-0.523]
38 Feature functions may depend upon any number of components σij of the alignment matrix A. [sent-79, score-0.6]
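The log linear form referred to here (following Liu et al.) can be written as below; this is a sketch of the standard formulation, with λ_k the learned feature weights and the denominator the partition function discussed next:

```latex
p(A \mid E, F) =
  \frac{\exp\bigl(\sum_k \lambda_k \phi_k(A, E, F)\bigr)}
       {\sum_{A'} \exp\bigl(\sum_k \lambda_k \phi_k(A', E, F)\bigr)}
```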
39 The sum over all alignments of a sentence pair (2^{lm} terms) in the partition function is computationally impractical except for very short sentences, and is rarely amenable to dynamic programming. [sent-80, score-0.406]
40 For example, the sum over all alignments may be restricted to a sum over the n-best list from other aligners (Liu et al. [sent-82, score-0.653]
41 This approximation was found to be inconsistent for small n unless the merged results of several aligners were used. [sent-84, score-0.288]
42 4 Alignment Correction Model In this section we describe a novel approach to word alignment, in which we train a log linear (maximum entropy) model of alignment by viewing it as a correction model that fixes the errors of an existing aligner. [sent-91, score-0.71]
43 We assume a priori that the aligner will start from an existing alignment of reasonable quality, and will attempt to apply a series of small changes to that alignment in order to correct it. [sent-92, score-1.205]
44 The aligner naturally consists of a move generator and a move selector. [sent-93, score-0.652]
45 The move generator perturbs an existing alignment A in order to create a set of candidate alignments Mt(A), all of which are nearby to A in the space of alignments. [sent-94, score-1.236]
46 We index the set of moves by the decoding step t to indicate that we generate entirely different (even non-overlapping) sets of moves at different steps t of the alignment prediction. [sent-95, score-0.753]
47 The move selector then chooses one of the alignments At+1 ∈ Mt(At), and proceeds iteratively: At+2 ∈ Mt+1 (At+1), etc. [sent-99, score-0.54]
48 Input: alignment A1; Output: improved alignment Afinal; for t = 1 → l do: generate moves: Mt(At); select move: At+1 ← argmax_{A ∈ Mt(At)} p(A | At, E, F); end for; Afinal ← Al+1 {repeat for source words} (Figure 1: pseudocode for alignment correction.) … target word is sufficient. [sent-107, score-1.798]
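A minimal Python sketch of the Figure 1 loop; generate_moves and score are hypothetical stand-ins for the move generator and the trained model p(A | At, E, F), not the authors' code:

```python
def correct_alignment(A, E, F, generate_moves, score):
    """Greedy alignment correction: at each step, perturb one slice
    (one row or column of the alignment matrix) and keep the best move."""
    n_steps = len(E) + len(F)  # one step per source word and per target word
    for t in range(n_steps):
        candidates = generate_moves(A, t, E, F)  # Mt(At): nearby alignments
        # Select At+1 = argmax over Mt(At) of p(A' | At, E, F).
        A = max(candidates, key=lambda cand: score(cand, A, E, F))
    return A
```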
49 1 Move generation Many different types of alignment perturbations are possible. [sent-109, score-0.592]
50 Here we restrict ourselves to a very simple move generator that changes the linkage of exactly one source word at a time, or exactly one target word at a time. [sent-110, score-0.472]
51 , 2008) considers deletion of links from an initial alignment (union of aligners) that is likely to overproduce links. [sent-115, score-0.595]
52 From the point of view of the alignment matrix, we consider changes to one row or one column (generically, one slice) of the alignment matrix. [sent-116, score-0.998]
53 At each step t, the move set Mt(At) is formed by choosing a slice of the current alignment matrix At, and generating all possible alignments from a few families of moves. [sent-117, score-1.189]
54 Then the move generator picks another slice and repeats. [sent-118, score-0.37]
55 The m + l slices are cycled in a fixed order: the first m slices correspond to source words (ordered according to a heuristic top-down traversal of the dependency parse tree if available), and the remaining l slices correspond to target words, similarly parse-ordered. [sent-119, score-0.287]
56 (Diagram: two 3×3 binary matrices over rows a, b, c and columns α, β, γ, showing a single link moved within one row.) • Move a link in row i: for one j and one j′ such that σij = 1 and σij′ = 0, make σij = 0 and σij′ = 1 (shown here for i = 1). [sent-123, score-0.269]
57 Some reference alignments are not reachable from the starting point. [sent-128, score-0.396]
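A sketch of the row-slice move family described above (keep, add, remove, or move a single link within one row); the list-of-lists matrix representation is illustrative:

```python
from copy import deepcopy

def row_slice_moves(A, i):
    """Enumerate perturbations of row i of a binary alignment matrix A."""
    moves = [A]                       # leaving the slice unchanged is a valid move
    m = len(A[i])
    for j in range(m):                # add or remove one link
        B = deepcopy(A)
        B[i][j] = 1 - B[i][j]
        moves.append(B)
    for j in range(m):                # move an existing link to an empty cell
        if A[i][j] == 1:
            for j2 in range(m):
                if A[i][j2] == 0:
                    B = deepcopy(A)
                    B[i][j], B[i][j2] = 0, 1
                    moves.append(B)
    return moves
```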
58 For the move generator considered in this paper, the summation in Eq. [sent-134, score-0.288]
59 The set of candidate alignments Mt(At) typically does not contain the reference (gold) alignment; we model the best alignment among a finite set of alternatives, rather than the correct alignment from among all possible alignments. [sent-136, score-1.347]
60 Note that if we extended our definition of perturbation to the limiting case that the alignment set included all possible alignments then we would clearly recover the standard log linear model of alignment. [sent-139, score-0.936]
61 3 Training Since the model is designed to predict perturbations to an alignment, it is trained from a collection of errorful alignments and corresponding reference sequences of aligner moves that reach the reference (gold) alignment. [sent-141, score-0.838]
62 We construct a training set from a collection of sentence pairs and reference alignments for training (A∗n, En, Fn), n = 1, …, N, as well as collections of corresponding “first pass” alignments A1n produced by another aligner. [sent-142, score-0.777]
63 For each n, we form a number of candidate alignment sets Mt(Atn), one for each source and target word. [sent-143, score-0.613]
64 For training purposes, the true alignment from the set is taken to be the one identical with A∗n in the slice targeted by the move generator at the current step. [sent-144, score-0.843]
65 Link-based features are those which decompose into a (linear) sum of alignment matrix elements σij. [sent-150, score-0.6]
66 As an example, if ei is the headword of ei′, and fj is the headword of fj′, then φ(A, E, F) = Σij σij σi′j′ (8) counts the number of times that a dependency relation in one language is preserved by alignment in the other language. [sent-159, score-0.719]
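A sketch of the Eq. (8) feature; src_head and tgt_head are assumed arrays giving the headword index of each word (−1 at the root) and are not names from the paper:

```python
def dependency_preservation(A, src_head, tgt_head):
    """Count linked pairs (i', j') whose headwords (i, j) are also linked,
    i.e. dependency relations preserved across the alignment (Eq. 8)."""
    count = 0
    for i_dep, row in enumerate(A):          # rows index one language's words
        for j_dep, linked in enumerate(row): # columns index the other language
            if not linked:
                continue
            i_head, j_head = src_head[i_dep], tgt_head[j_dep]
            if i_head >= 0 and j_head >= 0 and A[i_head][j_head]:
                count += 1
    return count
```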
67 After aligning a large unannotated parallel corpus with our aligner, we enumerate fully lexicalized geometrical features that can be extracted from the resulting alignments - these are entries in a phrase dictionary. [sent-163, score-0.54]
68 These features are tied, and treated as a single real-valued feature that fires during training and decoding phases if a set of hypothesized links matches the geometrical feature extracted from the unannotated data. [sent-164, score-0.313]
69 1 Arabic-English alignment results We trained the Arabic-English alignment system on 5125 sentences from Arabic-English treebanks (LDC2008E61, LDC2008E22) that had been annotated for word alignment. [sent-168, score-0.976]
70 IT=Italian, PT=Portuguese, JA=Japanese, RU=Russian, DE=German, ES=Spanish, FR=French. The training and test sets were decoded with three other aligners, so that the robustness of the correction model to different input alignments could be validated. [sent-179, score-0.518]
71 The three aligners were GIZA++ (Och and Ney, 2003) (with the MOSES (Koehn et al. [sent-180, score-0.288]
72 , 2007) postprocessing option -alignment grow-diag-final-and), the posterior HMM aligner of (Ge, 2004), and a maximum entropy sequential model (ME-seq) (Ittycheriah and Roukos, 2005). [sent-181, score-0.27]
73 ME-seq is our primary point of comparison: it is discriminatively trained (on the same training data), uses a rich set of features, and provides the best alignments of the three. [sent-182, score-0.395]
74 Three correction models were trained: corr(GIZA++) is trained to correct the alignments produced by GIZA++, corr(HMM) is trained to correct the alignments produced by the HMM aligner, and corr(ME-seq) is trained to correct the alignments produced by the ME-seq model. [sent-183, score-1.446]
75 In Table (1) we show results for our system correcting each of the aligners as measured in the usual recall, precision, and F-measure. [sent-184, score-0.288]
76 The resulting improvements in F-measure of the alignments produced by our models over their corresponding baselines are statistically significant (p < 10^{-4}, indicated by a ∗). [sent-185, score-0.365]
77 Statistical significance is tested by a Monte Carlo bootstrap (Efron and Tibshirani, 1986), sampling with replacement the difference in F-measure of the two systems' alignments of the same sentence pair. [sent-186, score-0.365]
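A sketch of the bootstrap test as described, assuming precomputed per-sentence F-measures for the two systems (the names and resample count are illustrative):

```python
import random

def bootstrap_p_value(f_new, f_base, n_resamples=10000, seed=0):
    """Estimate the probability that the F-measure gain of the new system
    disappears when sentence pairs are resampled with replacement."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(f_new, f_base)]  # per-sentence F differences
    n = len(diffs)
    no_gain = sum(
        1 for _ in range(n_resamples)
        if sum(rng.choice(diffs) for _ in range(n)) <= 0
    )
    return no_gain / n_resamples
```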
78 We also show cross-condition results in which a correction model trained to correct HMM alignments is applied to correct ME-seq alignments. [sent-188, score-0.62]
79 2 Chinese-English alignment results Table (2) presents results for Chinese-English word alignments. [sent-191, score-0.473]
80 For this language pair, reference parses were not available in our training set, so … (Footnote: we do not distinguish sure and possible links in our annotations; under this circumstance, alignment error rate (Och and Ney, 2003) is 1 − F.) [sent-194, score-0.684]
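With a single reference link set G (sure = possible), the footnoted identity follows from the standard definitions of Och and Ney (2003):

```latex
P = \frac{|A \cap G|}{|A|}, \qquad
R = \frac{|A \cap G|}{|G|}, \qquad
F = \frac{2PR}{P + R} = \frac{2|A \cap G|}{|A| + |G|}, \qquad
\mathrm{AER} = 1 - \frac{2|A \cap G|}{|A| + |G|} = 1 - F
```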
81 3 Additional language pairs Table (3) presents alignment results for seven other language pairs. [sent-201, score-0.473]
82 Separate alignment corrector models were trained for both directions of Italian ↔ English and Portuguese ↔ English. [sent-202, score-0.503]
83 Manual alignments for training and test data were annotated. [sent-205, score-0.365]
84 Our model obtained improved alignment F-measure in all language pairs, although the improvements were small for ES→EN and FR→EN, the language pairs for which the baseline accuracy was the highest. [sent-212, score-0.527]
85 We note that all of the comparison aligners had equivalent lexical information. [sent-221, score-0.288]
86 The correction model improved performance across all three of these link structures. [sent-238, score-0.329]
87 The single exception is that the number of 2−1 false alarms increased (Zh-En alignments), but in this case the first pass ME-seq alignment produced few false alarms because it simply proposed few links of this form. [sent-239, score-0.823]
88 5 Translation Impact We tested the impact of improved alignments on the performance of a phrase-based translation system (Ittycheriah and Roukos, 2007) for three language pairs. [sent-243, score-0.489]
89 Our alignment did not improve the performance of a mature Arabic to English translation system, but two notable successes were obtained: Chinese to English, and English to Italian. [sent-244, score-0.62]
90 It is well known that improved alignment performance does not always improve translation performance (Fraser and Marcu, 2007). [sent-245, score-0.597]
91 A mature machine translation system may incorporate alignments obtained from multiple aligners, or from both directions of an asymmetric aligner. [sent-246, score-0.472]
92 Translation performance further improved, by a smaller amount, using both ME-seq and corr(ME-seq) alignments during the training. [sent-253, score-0.365]
93 The improved alignments impacted the translation performance of the English to Italian translation system (table 7) even more strongly. [sent-254, score-0.559]
94 7 Conclusions A log linear model for the alignment matrix is used to guide systematic improvements to an existing aligner. [sent-260, score-0.684]
95 Our system models arbitrary alignment matrices and allows features that incorporate such information as correlations based on parse trees in both languages. [sent-261, score-0.473]
96 We train models to correct the errors of several existing aligners; we find the resulting models are robust to using different aligners as starting points. [sent-262, score-0.389]
97 Improvements in alignment F-measure, often significant improvements, show that our model successfully corrects input alignments from existing models in all nine language pairs tested. [sent-263, score-0.905]
98 The resulting Chinese-English and English-Italian word alignments also improved translation performance, especially on the English-Italian test, and notably on the particularly difficult subset of the Chinese sentences. [sent-264, score-0.489]
99 Using syntax to improve word alignment precision for syntax-based machine translation. [sent-305, score-0.473]
100 Discriminative word alignment with a function word reordering model. [sent-379, score-0.473]
wordName wordTfidf (topN-words)
[('alignment', 0.473), ('alignments', 0.365), ('aligners', 0.288), ('ij', 0.195), ('aligner', 0.189), ('move', 0.175), ('correction', 0.153), ('matrix', 0.127), ('links', 0.122), ('perturbations', 0.119), ('generator', 0.113), ('moves', 0.112), ('ittycheriah', 0.112), ('geometrical', 0.103), ('encorr', 0.095), ('corr', 0.086), ('alarms', 0.082), ('slice', 0.082), ('nearby', 0.076), ('source', 0.076), ('ei', 0.073), ('abc', 0.072), ('acb', 0.072), ('niehues', 0.072), ('mt', 0.071), ('hmm', 0.07), ('translation', 0.07), ('target', 0.064), ('giza', 0.064), ('abraham', 0.062), ('fj', 0.061), ('decoding', 0.056), ('linked', 0.056), ('headword', 0.056), ('slices', 0.056), ('improved', 0.054), ('italian', 0.053), ('parameterization', 0.053), ('row', 0.052), ('consisted', 0.051), ('log', 0.05), ('corrections', 0.049), ('families', 0.049), ('vogel', 0.048), ('afinal', 0.048), ('benajiba', 0.048), ('fossum', 0.048), ('meseq', 0.048), ('perturbation', 0.048), ('unlinked', 0.048), ('reference', 0.047), ('directionality', 0.046), ('arabic', 0.045), ('bleu', 0.044), ('liu', 0.044), ('linkage', 0.044), ('sequential', 0.042), ('link', 0.042), ('parses', 0.042), ('moore', 0.042), ('taskar', 0.042), ('partition', 0.041), ('efron', 0.041), ('subsampled', 0.041), ('chinese', 0.041), ('notable', 0.04), ('parallel', 0.04), ('entropy', 0.039), ('fraser', 0.037), ('loopy', 0.037), ('mature', 0.037), ('correct', 0.036), ('heuristic', 0.035), ('ayan', 0.034), ('deng', 0.034), ('setiawan', 0.034), ('improvement', 0.034), ('ney', 0.034), ('projection', 0.034), ('existing', 0.034), ('variety', 0.033), ('nine', 0.033), ('english', 0.032), ('false', 0.032), ('sampled', 0.032), ('pseudocode', 0.032), ('unannotated', 0.032), ('blunsom', 0.032), ('ibm', 0.032), ('roukos', 0.032), ('discriminative', 0.031), ('starting', 0.031), ('portuguese', 0.031), ('bin', 0.031), ('misses', 0.031), ('motivate', 0.031), ('xij', 0.031), ('trained', 0.03), ('symmetric', 0.029), ('fr', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999857 3 emnlp-2011-A Correction Model for Word Alignments
Author: J. Scott McCarley ; Abraham Ittycheriah ; Salim Roukos ; Bing Xiang ; Jian-ming Xu
Abstract: Models of word alignment built as sequences of links have limited expressive power, but are easy to decode. Word aligners that model the alignment matrix can express arbitrary alignments, but are difficult to decode. We propose an alignment matrix model as a correction algorithm to an underlying sequence-based aligner. Then a greedy decoding algorithm enables the full expressive power of the alignment matrix formulation. Improved alignment performance is shown for all nine language pairs tested. The improved alignments also improved translation quality from Chinese to English and English to Italian.
2 0.15255781 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
Author: Christos Christodoulopoulos ; Sharon Goldwater ; Mark Steedman
Abstract: In this paper we present a fully unsupervised syntactic class induction system formulated as a Bayesian multinomial mixture model, where each word type is constrained to belong to a single class. By using a mixture model rather than a sequence model (e.g., HMM), we are able to easily add multiple kinds of features, including those at both the type level (morphology features) and token level (context and alignment features, the latter from parallel corpora). Using only context features, our system yields results comparable to state-of-the-art, far better than a similar model without the one-class-per-type constraint. Using the additional features provides added benefit, and our final system outperforms the best published results on most of the 25 corpora tested.
3 0.14121906 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
Author: Karthik Visweswariah ; Rajakrishnan Rajkumar ; Ankur Gandhe ; Ananthakrishnan Ramanathan ; Jiri Navratil
Abstract: Preordering of source side sentences has proved to be useful in improving statistical machine translation. Most work has used a parser in the source language along with rules to map the source language word order into the target language word order. The requirement to have a source language parser is a major drawback, which we seek to overcome in this paper. Instead of using a parser and then using rules to order the source side sentence we learn a model that can directly reorder source side sentences to match target word order using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of a word immediately preceding another word. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing and show that it suffices to provide good quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi we show that our technique achieves better performance than a method that uses rules applied to the source side English parse.
4 0.12038259 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and UrduEnglish translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
5 0.11552298 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
Author: Amittai Axelrod ; Xiaodong He ; Jianfeng Gao
Abstract: We explore efficient domain adaptation for the task of statistical machine translation based on extracting sentences from a large general-domain parallel corpus that are most relevant to the target domain. These sentences may be selected with simple cross-entropy based methods, of which we present three. As these sentences are not themselves identical to the in-domain data, we call them pseudo in-domain subcorpora. These subcorpora, 1% the size of the original, can then be used to train small domain-adapted Statistical Machine Translation (SMT) systems which outperform systems trained on the entire corpus. Performance is further improved when we use these domain-adapted models in combination with a true in-domain model. The results show that more training data is not always better, and that best results are attained via proper domain-relevant data selection, as well as combining in- and general-domain systems during decoding.
6 0.11197305 125 emnlp-2011-Statistical Machine Translation with Local Language Models
7 0.10402872 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
8 0.10279492 38 emnlp-2011-Data-Driven Response Generation in Social Media
9 0.10034323 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
10 0.098810285 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
11 0.093421839 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers
12 0.092476398 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
13 0.091823839 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
14 0.091408081 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering
15 0.090404876 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use
16 0.086433358 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types
17 0.08467824 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
18 0.083298132 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
19 0.082087032 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
20 0.073780648 136 emnlp-2011-Training a Parser for Machine Translation Reordering
topicId topicWeight
[(0, 0.26), (1, 0.133), (2, 0.067), (3, -0.098), (4, -0.003), (5, -0.003), (6, 0.017), (7, 0.063), (8, -0.188), (9, -0.01), (10, -0.026), (11, -0.02), (12, 0.029), (13, 0.027), (14, -0.1), (15, 0.041), (16, -0.024), (17, 0.069), (18, -0.049), (19, -0.037), (20, -0.024), (21, 0.073), (22, -0.118), (23, 0.029), (24, -0.028), (25, 0.156), (26, 0.232), (27, -0.068), (28, 0.109), (29, 0.069), (30, 0.116), (31, 0.03), (32, 0.155), (33, -0.037), (34, 0.087), (35, -0.038), (36, -0.04), (37, 0.11), (38, -0.097), (39, 0.05), (40, -0.038), (41, 0.114), (42, 0.061), (43, 0.116), (44, 0.119), (45, 0.046), (46, -0.138), (47, 0.092), (48, -0.07), (49, -0.147)]
simIndex simValue paperId paperTitle
same-paper 1 0.96360195 3 emnlp-2011-A Correction Model for Word Alignments
Author: J. Scott McCarley ; Abraham Ittycheriah ; Salim Roukos ; Bing Xiang ; Jian-ming Xu
Abstract: Models of word alignment built as sequences of links have limited expressive power, but are easy to decode. Word aligners that model the alignment matrix can express arbitrary alignments, but are difficult to decode. We propose an alignment matrix model as a correction algorithm to an underlying sequence-based aligner. Then a greedy decoding algorithm enables the full expressive power of the alignment matrix formulation. Improved alignment performance is shown for all nine language pairs tested. The improved alignments also improved translation quality from Chinese to English and English to Italian.
2 0.60009366 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use
Author: Jette Viethen ; Robert Dale ; Markus Guhe
Abstract: Traditional computational approaches to referring expression generation operate in a deliberate manner, choosing the attributes to be included on the basis of their ability to distinguish the intended referent from its distractors. However, work in psycholinguistics suggests that speakers align their referring expressions with those used previously in the discourse, implying less deliberate choice and more subconscious reuse. This raises the question as to which is a more accurate characterisation of what people do. Using a corpus of dialogues containing 16,358 referring expressions, we explore this question via the generation of subsequent references in shared visual scenes. We use a machine learning approach to referring expression generation and demonstrate that incorporating features that correspond to the computational tradition does not match human referring behaviour as well as using features corresponding to the process of alignment. The results support the view that the traditional model of referring expression generation that is widely assumed in work on natural language generation may not in fact be correct; our analysis may also help explain the oft-observed redundancy found in human-produced referring expressions.
3 0.55552721 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
Author: Christos Christodoulopoulos ; Sharon Goldwater ; Mark Steedman
Abstract: In this paper we present a fully unsupervised syntactic class induction system formulated as a Bayesian multinomial mixture model, where each word type is constrained to belong to a single class. By using a mixture model rather than a sequence model (e.g., HMM), we are able to easily add multiple kinds of features, including those at both the type level (morphology features) and token level (context and alignment features, the latter from parallel corpora). Using only context features, our system yields results comparable to state-of-the-art, far better than a similar model without the one-class-per-type constraint. Using the additional features provides added benefit, and our final system outperforms the best published results on most of the 25 corpora tested.
4 0.49143615 38 emnlp-2011-Data-Driven Response Generation in Social Media
Author: Alan Ritter ; Colin Cherry ; William B. Dolan
Abstract: We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response.
5 0.46449155 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
Author: Karthik Visweswariah ; Rajakrishnan Rajkumar ; Ankur Gandhe ; Ananthakrishnan Ramanathan ; Jiri Navratil
Abstract: Preordering of source side sentences has proved to be useful in improving statistical machine translation. Most work has used a parser in the source language along with rules to map the source language word order into the target language word order. The requirement to have a source language parser is a major drawback, which we seek to overcome in this paper. Instead of using a parser and then using rules to order the source side sentence we learn a model that can directly reorder source side sentences to match target word order using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of a word immediately preceding another word. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing and show that it suffices to provide good quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi we show that our technique achieves better performance than a method that uses rules applied to the source side English parse.
6 0.44452631 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
7 0.44067627 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
8 0.43145311 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices
10 0.40128705 74 emnlp-2011-Inducing Sentence Structure from Parallel Corpora for Reordering
11 0.39913467 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
12 0.39692366 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
13 0.38417137 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
14 0.38038966 72 emnlp-2011-Improved Transliteration Mining Using Graph Reinforcement
15 0.37559974 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
16 0.36511958 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types
17 0.36489865 88 emnlp-2011-Linear Text Segmentation Using Affinity Propagation
18 0.36352855 125 emnlp-2011-Statistical Machine Translation with Local Language Models
19 0.35807896 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
20 0.3570174 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
topicId topicWeight
[(23, 0.108), (36, 0.036), (37, 0.028), (45, 0.058), (53, 0.036), (54, 0.033), (57, 0.011), (62, 0.019), (64, 0.033), (66, 0.042), (69, 0.044), (79, 0.058), (82, 0.014), (85, 0.023), (90, 0.01), (94, 0.314), (96, 0.035), (98, 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.76605737 3 emnlp-2011-A Correction Model for Word Alignments
Author: J. Scott McCarley ; Abraham Ittycheriah ; Salim Roukos ; Bing Xiang ; Jian-ming Xu
Abstract: Models of word alignment built as sequences of links have limited expressive power, but are easy to decode. Word aligners that model the alignment matrix can express arbitrary alignments, but are difficult to decode. We propose an alignment matrix model as a correction algorithm to an underlying sequence-based aligner. Then a greedy decoding algorithm enables the full expressive power of the alignment matrix formulation. Improved alignment performance is shown for all nine language pairs tested. The improved alignments also improved translation quality from Chinese to English and English to Italian.
2 0.64197248 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
Author: Weiwei Guo ; Mona Diab
Abstract: In this paper, we propose a novel topic model based on incorporating dictionary definitions. Traditional topic models treat words as surface strings without assuming predefined knowledge about word meaning. They infer topics only by observing surface word co-occurrence. However, the co-occurred words may not be semantically related in a manner that is relevant for topic coherence. Exploiting dictionary definitions explicitly in our model yields a better understanding of word semantics leading to better text modeling. We exploit WordNet as a lexical resource for sense definitions. We show that explicitly modeling word definitions helps improve performance significantly over the baseline for a text categorization task.
3 0.47684595 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
Author: Karthik Visweswariah ; Rajakrishnan Rajkumar ; Ankur Gandhe ; Ananthakrishnan Ramanathan ; Jiri Navratil
Abstract: Preordering of source side sentences has proved to be useful in improving statistical machine translation. Most work has used a parser in the source language along with rules to map the source language word order into the target language word order. The requirement to have a source language parser is a major drawback, which we seek to overcome in this paper. Instead of using a parser and then using rules to order the source side sentence we learn a model that can directly reorder source side sentences to match target word order using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of a word immediately preceding another word. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing and show that it suffices to provide good quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi we show that our technique achieves better performance than a method that uses rules applied to the source side English parse.
4 0.45885301 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and UrduEnglish translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
5 0.45269051 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
6 0.45073503 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
7 0.44968444 66 emnlp-2011-Hierarchical Phrase-based Translation Representations
8 0.44938302 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
9 0.44868946 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
10 0.44555259 136 emnlp-2011-Training a Parser for Machine Translation Reordering
11 0.44426605 112 emnlp-2011-Refining the Notions of Depth and Density in WordNet-based Semantic Similarity Measures
12 0.44345021 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
13 0.44310033 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
14 0.44104904 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
15 0.44101331 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
16 0.44081971 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions
17 0.44073161 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
18 0.4404082 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
19 0.44035247 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
20 0.43942976 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search