emnlp emnlp2011 emnlp2011-18 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xabier Saralegi ; Iker Manterola ; Inaki San Vicente
Abstract: An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries using B as pivot. However, polysemous pivot words often produce wrong translation candidates. This paper analyzes two methods for pruning wrong candidates: one based on exploiting the structure of the source dictionaries, and the other based on distributional similarity computed from comparable corpora. As both methods depend exclusively on easily available resources, they are well suited to less resourced languages. We studied whether these two techniques complement each other given that they are based on different paradigms. We also researched combining them by looking for the best adequacy depending on various application scenarios.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries using B as pivot. [sent-3, score-0.64]
2 However, polysemous pivot words often produce wrong translation candidates. [sent-4, score-0.555]
3 This paper analyzes two methods for pruning wrong candidates: one based on exploiting the structure of the source dictionaries, and the other based on distributional similarity computed from comparable corpora. [sent-5, score-0.259]
4 Therefore, less resourced languages (as well as less-common language pairs) could benefit from a method to reduce the costs of constructing bilingual dictionaries. [sent-13, score-0.316]
5 However, the presence of less resourced languages in these kinds of resources is still relatively limited (in Wikipedia, too). [sent-19, score-0.213]
6 Another way to create bilingual dictionaries is by using the most widespread languages (e. [sent-20, score-0.494]
7 ) as a bridge between less resourced languages, since most languages have some bilingual dictionary to/from a major language. [sent-25, score-0.511]
8 These pivot techniques allow new bilingual dictionaries to be built automatically. [sent-26, score-0.815]
9 The presence of polysemous or ambiguous words in any of the dictionaries involved may produce wrong translation pairs. [sent-28, score-0.543]
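As a minimal sketch of how such a pivot merge works, and how a polysemous pivot word introduces a wrong pair (Python; only iturri/tap/faucet/grifo come from the paper's own example, the remaining entries are illustrative):

```python
from collections import defaultdict

def merge_via_pivot(d_ab, d_bc):
    """Compose A->B and B->C dictionaries into a (noisy) A->C
    dictionary, using the shared language B as pivot."""
    d_ac = defaultdict(set)
    for src, pivots in d_ab.items():
        for p in pivots:
            # Every C-side translation of a pivot word becomes a candidate.
            d_ac[src].update(d_bc.get(p, set()))
    return dict(d_ac)

# Toy Basque->English and English->Spanish entries (illustrative only).
d_eu_en = {"iturri": {"tap", "faucet", "source"}}
d_en_es = {"tap": {"grifo", "golpecito"},
           "faucet": {"grifo"},
           "source": {"fuente", "origen"}}

print(merge_via_pivot(d_eu_en, d_en_es))
# {'iturri': {'grifo', 'golpecito', 'fuente', 'origen'}} -- 'golpecito'
# ("light knock") is a wrong candidate introduced by the polysemous
# pivot word 'tap'.
```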
10 However, each technique performs differently and produces dictionaries with different characteristics, such as different levels of coverage of entries and/or translations. [sent-33, score-0.519]
11 For example, a small dictionary containing the most basic vocabulary and the corresponding most frequent translations can be adequate for some IR and NLP tasks, tourism, or initial stages of language learning. [sent-35, score-0.433]
12 Alternatively, a dictionary which maximizes the vocabulary coverage is more oriented towards advanced users or translation services. [sent-36, score-0.351]
13 This paper deals with pruning wrong translations when building bilingual dictionaries by means of pivot techniques. [sent-41, score-1.006]
14 For this purpose, we studied the effect the attributes of the source dictionaries have on the performance of IC and DS-based methods, as well as the characteristics of the dictionaries produced. [sent-45, score-0.72]
15 The basis of the pivot technique is dealt with in the next section, and the state of the art in pivot techniques is reviewed in the third section. [sent-48, score-0.74]
16 It could be thought that these casualties are… (Figure 1: Ambiguity problem of the pivot technique.) [sent-56, score-0.37]
17 We merged a Basque-English dictionary composed of 17,672 entries and 43,021 pairs with an English-Spanish one composed of 16,326 entries and 38,128 pairs, and obtained a noisy Basque-Spanish dictionary comprising 14,000 entries and 104,165 pairs. [sent-59, score-1.021]
18 32% of these ambiguous entries contain incorrect translation equivalents (80,200 pairs out of 99,844). [sent-62, score-0.473]
19 The conclusion is that the transitive relation between words across languages cannot be assumed, because of the large number of ambiguous entries that dictionaries actually have. [sent-64, score-0.631]
20 So it is not possible to map entries and translation equivalents according to their corresponding senses. [sent-66, score-0.403]
21 As an alternative, most papers try to guide this mapping according to semantic distances extracted from the dictionaries themselves or from external resources such as corpora. [sent-67, score-0.369]
22 This consists of pairs of equivalents not identified in the pivot process because there is no pivot word, or else one of the equivalents is not present. [sent-69, score-0.965]
23 We will not be dealing with this issue in this work so that we can focus on the translation ambiguity problem. [sent-70, score-0.257]
24 Tanaka and Umemura (1994) worked with the structure of the source dictionaries and introduced the IC method, which measures the semantic distance between two words according to the number of pivot words they share. [sent-72, score-0.384]
25 However, not all the dictionaries provide this kind of information. [sent-76, score-0.304]
26 (2009) proposed using WordNet, only for the pivot language (for English in their case), to take advantage of all the semantic information that WordNet can provide. [sent-79, score-0.37]
27 (2009) researched the use of multiple languages as pivots, on the hypothesis that the more languages are used, the more evidence will be found for identifying translation equivalents. [sent-82, score-0.273]
28 (2008) used parallel corpora to estimate translation probabilities between possible translation pairs. [sent-85, score-0.256]
29 However, even if this strategy achieves the best results in the terminology extraction field, it is not adequate when less resourced languages are involved because parallel corpora are very scarce. [sent-87, score-0.282]
30 , 2008; Gamallo and Pichel, 2010) proposed methods to eliminate spurious translations using cross-lingual context or distributional similarity calculated from comparable corpora. [sent-89, score-0.238]
31 Other characteristics of the merged dictionaries like directionality (Paik et al. [sent-92, score-0.399]
32 The resources for building the new dictionary are two basic (no definitions, no senses) bilingual dictionaries (A-B, B-C) including source (A), target (C) and a pivot language (B), as well as a comparable corpus for the source-target (A-C) language pair. [sent-96, score-1.135]
33 In our experiments, the source and target languages are Basque and Spanish, respectively, and English is used for pivot purposes. [sent-99, score-0.472]
34 These dictionaries and pivot language were selected in order to be able to evaluate the results automatically. [sent-103, score-0.674]
35 During the evaluation we also used frequency information extracted from a parallel corpus, but then again, this corpus was not used during the dictionary building process, and therefore, it would not be used in a real application environment. [sent-104, score-0.324]
36 The two dictionaries mentioned in the previous section (Basque-English Deu→en and English-Spanish Den→es) were used to produce a new Basque-Spanish Deu→en→es dictionary. [sent-107, score-0.304]
37 We can observe that the ambiguity level of the entries (average number of translations per source word) is significant. [sent-111, score-0.379]
38 This produces more noise in the pivot process, but it also benefits IC due to the increase in pivot words. [sent-112, score-0.74]
39 1 Inverse consultation. IC uses the structure of the Da→b and Db→c source dictionaries to measure the similarity of meaning between a source word and a translation candidate. [sent-136, score-0.642]
40 To find suitable equivalents for a given entry, all target language translations of each pivot translation are looked up (e. [sent-139, score-0.693]
41 The number of common elements of the same language between SA and the translations or equivalences (E) obtained in the original direction (Da→b(s)) is used to measure the semantic distance between entries and corresponding translations. [sent-144, score-0.291]
42 If only one inverse dictionary is consulted, the method is called “one time inverse consultation” or IC1. [sent-146, score-0.339]
43 If n inverse dictionaries are consulted, the method is called “n time inverse consultation”. [sent-147, score-0.448]
44 In the same way, the number of common elements between SA and E is denoted as follows: δ(E, SA) = Σ_{x∈SA} δ(E, x) (1). IC asks for more than one pivot word between source word s and translation candidate t. [sent-150, score-0.536]
45 In our example: δ(Da→b(s), Dc→b(t)) > 1 (2) In general, this condition guarantees that pivot words belong to the same sense of the source word (e. [sent-151, score-0.454]
46 If two or more pivot words share a translation t in the Des→en dictionary (|tr(tc, Des→en)| > 1) (e. [sent-158, score-0.678]
47 We can conclude that entry s and candidate t are mutual translations because the hypothesis that "faucet" and "tap" are lexical variants of the same sense c is contrasted against two pieces of evidence. [sent-164, score-0.281]
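A sketch of the IC1 test in Python, under the assumption that each dictionary is a plain word-to-translations map (illustrative data reusing the faucet/tap example): δ counts shared pivot words as in Formula (1), and acceptance follows condition (2).

```python
def ic1_score(s, t, d_ab, d_cb):
    """delta(D_a->b(s), D_c->b(t)) from Formula (1): the number of
    pivot words shared by the source word s and the candidate t,
    with t looked up in the inverse dictionary D_c->b."""
    return len(d_ab.get(s, set()) & d_cb.get(t, set()))

def ic1_accepts(s, t, d_ab, d_cb):
    # Condition (2): more than one shared pivot word, i.e. at least
    # two lexical variants of the same sense in the pivot language.
    return ic1_score(s, t, d_ab, d_cb) > 1

d_eu_en = {"iturri": {"tap", "faucet", "source"}}
d_es_en = {"grifo": {"tap", "faucet"}, "golpecito": {"tap"}}

assert ic1_accepts("iturri", "grifo", d_eu_en, d_es_en)          # 2 pivots
assert not ic1_accepts("iturri", "golpecito", d_eu_en, d_es_en)  # only 1
```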
48 Specifically, IC needs several lexical variants in the pivot language for each entry sense in both dictionaries. [sent-166, score-0.532]
49 (b) p(|tr(tc, Dc→b)| > 1): Estimated by computing the average coverage of lexical variants in the pivot language for each entry in Dc→b. [sent-168, score-0.544]
50 So, in order to obtain a good performance with IC, the dictionaries used need to provide a high coverage of lexical variants per sense in the pivot language. [sent-170, score-0.8]
51 Average coverage of lexical variants in the pivot language was calculated for both dictionaries. [sent-172, score-0.493]
52 Only ambiguous entries were analyzed because they are the set of entries which IC must solve. [sent-174, score-0.426]
53 In the Deu→en dictionary more than 75% of senses have more than one lexical variant in the pivot language. [sent-175, score-0.6]
54 2 Distributional Similarity. DS has been used successfully for extracting bilingual terminology from comparable corpora. [sent-189, score-0.203]
55 The underlying idea is to identify as translation equivalents those words which show similar distributions or contexts across two corpora of different languages, assuming that this distributional similarity reflects semantic closeness. [sent-190, score-0.264]
56 This technique can be used for pruning wrong translations produced in a pivot-based dictionary building process (Kaji et al. [sent-192, score-0.429]
57 To resolve an ambiguous translation t of a source word s, both context representations must be accurate. [sent-205, score-0.22]
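A minimal sketch of DS-based pruning, assuming the source word's context vector has already been projected into the target language through a seed dictionary (the real method's context extraction and weighting are more elaborate; all counts are illustrative):

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse bag-of-words vectors."""
    num = sum(w * v[k] for k, w in u.items() if k in v)
    den = math.sqrt(sum(w * w for w in u.values())) * \
          math.sqrt(sum(w * w for w in v.values()))
    return num / den if den else 0.0

def prune_by_ds(src_ctx, cand_ctxs, threshold):
    """Keep only candidates whose context vector is similar enough
    to the (projected) context vector of the source word."""
    return {t for t, ctx in cand_ctxs.items()
            if cosine(src_ctx, ctx) >= threshold}

src_ctx = Counter({"agua": 5, "cocina": 3, "abrir": 2})  # contexts of 'iturri'
cand_ctxs = {"grifo": Counter({"agua": 6, "cocina": 2, "abrir": 1}),
             "golpecito": Counter({"puerta": 4, "suave": 2})}
print(prune_by_ds(src_ctx, cand_ctxs, threshold=0.3))  # {'grifo'}
```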
58 As we were not interested in dealing with missing translations, the reference for calculating recall was drawn up with respect to the intersection between the merged dictionary (Deu→en→es) and the reference dictionary (Deu→es). [sent-209, score-0.697]
59 It is better to deal effectively with frequent words and frequent translations than with rare ones. [sent-212, score-0.259]
60 Frequency of use of Basque words and frequency of source-target translation equivalent pairs were extracted respectively from the open domain monolingual corpus and the parallel corpus described in the previous section. [sent-213, score-0.285]
61 After analyzing the wrong pairs by hand, we observed that some of them corresponded to correct pairs not included in the reference dictionary. [sent-220, score-0.23]
62 Other wrong pairs comprise translation equivalents which have the same stem but different grammatical categories (e. [sent-225, score-0.319]
63 Precision starts to decline significantly when dealing with those entries over a minimum frequency of 10,000. [sent-231, score-0.454]
64 Figure 5: Recall results according to the minimum frequency of translation pairs. [sent-232, score-0.309]
65 However, only very few entries (234) reach that minimum frequency. [sent-233, score-0.21]
66 This could be due to the fact that frequent entries tend to have more translation variants (See Table 3). [sent-238, score-0.407]
67 The fact that there are too many candidates to solve would explain why the recall starts to decline when dealing with very frequent entries. [sent-239, score-0.336]
68 Recall according to frequency of pairs provides information about whether IC selects rare translations or the most probable ones (See Figure 5). [sent-241, score-0.288]
69 It must be noted that this recall is calculated with respect to the translation pairs of the merged dictionary Deu→en→es which appear in the parallel corpus (see section 4. [sent-242, score-0.535]
70 However, recall for pairs whose frequency is higher than 100 only reaches 0. [sent-245, score-0.204]
71 Even if the maximum recall is achieved for pairs whose frequency is above 40,000, it is not significant because they represent a minimal number (3 pairs). [sent-247, score-0.242]
72 The dictionary created by IC, or the unambiguous pairs, can be used as a reference for tuning the threshold in a robust way with respect to an evaluation score such as F-score. [sent-259, score-0.347]
73 In our experiments, thresholds estimated against the dictionary created by IC are very close to those calculated with respect to the whole reference dictionary (see Figure 6). [sent-260, score-0.578]
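A sketch of this tuning strategy: sweep candidate thresholds and keep the one whose surviving pairs maximize F-score against the reference, where the reference can be the IC-validated (or unambiguous) pairs instead of a gold dictionary (all data is illustrative):

```python
def f_score(proposed, reference):
    """Harmonic mean of precision and recall over sets of pairs."""
    tp = len(proposed & reference)
    if tp == 0:
        return 0.0
    p, r = tp / len(proposed), tp / len(reference)
    return 2 * p * r / (p + r)

def tune_threshold(scored_pairs, reference, thresholds):
    """scored_pairs: {(src, tgt): ds_similarity}. Returns the
    threshold whose surviving pairs best match the reference."""
    def kept(th):
        return {pair for pair, sim in scored_pairs.items() if sim >= th}
    return max(thresholds, key=lambda th: f_score(kept(th), reference))

scores = {("iturri", "grifo"): 0.96, ("iturri", "golpecito"): 0.05}
ic_reference = {("iturri", "grifo")}
print(tune_threshold(scores, ic_reference, [i / 10 for i in range(10)]))
```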
74 In all cases, precision is slightly better when dealing with frequent words (frequency > 20). [sent-268, score-0.241]
75 However, if global thresholds are used, performance starts to decline significantly when dealing with words whose frequency is above 1,000. [sent-271, score-0.346]
76 In this case, global thresholds seem to perform better because the most frequent entries are handled better. [sent-277, score-0.312]
77 We have plotted the recall according to the frequency of pairs calculated from a parallel corpus in order to analyze the performance of DS when dealing with frequent translation pairs (See Figure 5). [sent-285, score-0.624]
78 The performance decreases when dealing with pairs whose frequency is higher than 100. [sent-286, score-0.251]
79 This means that DS's performance is worse when dealing with the most common translation pairs. [sent-287, score-0.222]
80 The results show that DS rankings are worse when dealing with some words above a certain frequency threshold (e. [sent-289, score-0.253]
81 However, DS tips the scales in its favor if only entries with frequencies above 50 are considered and strict thresholds are used (TOP1, 0. [sent-301, score-0.242]
82 Even if strict thresholds are used, DS outperforms IC for all entries whose frequency is lower than 640. [sent-304, score-0.341]
83 Only when dealing with very frequent entries (frequency > 8,000) is IC's performance close to DS's, but these entries make up a very small group (234 entries). [sent-306, score-0.523]
84 In order to compare the recall with respect to the frequency of translation pairs under the same conditions, we have to select a threshold that provides a similar precision to IC. [sent-307, score-0.452]
85 Even if IC’s recall clearly surpasses DS’s when dealing with frequent translation pairs (frequency > 2,560), it only represents a minimal number of pairs (39). [sent-310, score-0.44]
86 For IC, that value is the number of shared pivot words (see Formula 1); for DS, it is the context similarity score. [sent-316, score-0.399]
87 In addition, if we want the dictionaries to cover the most common entries (e. [sent-328, score-0.476]
88 , in a basic dictionary for language learners) it is also interesting to look at wAvgF values because greater value is given to finding translations for the most frequent words. [sent-330, score-0.384]
89 On the other hand, if our objective is to build big dictionaries with high recall, it is better to look at the AvgF2 measure, which attaches more importance to recall. [sent-331, score-0.304]
90 The wAvgF measure is stricter than the others since it takes the frequency of entries into account. [sent-348, score-0.271]
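As a sketch of how such aggregate measures can be computed (the paper's exact definitions may differ in detail): AvgF2 averages a recall-oriented F-measure (beta = 2) over entries, while wAvgF weights each entry's F-score by the entry's frequency; all data below is illustrative.

```python
def f_beta(p, r, beta=1.0):
    """F-measure; beta = 2 weights recall higher than precision."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

def avg_f(per_entry_pr, beta=1.0):
    """Plain average of per-entry F; beta = 2 gives a recall-oriented AvgF2."""
    return sum(f_beta(p, r, beta) for p, r in per_entry_pr.values()) \
        / len(per_entry_pr)

def w_avg_f(per_entry_pr, freq):
    """Frequency-weighted average F (wAvgF): frequent entries count
    more, as in a basic dictionary for language learners."""
    total = sum(freq[e] for e in per_entry_pr)
    return sum(f_beta(p, r) * freq[e]
               for e, (p, r) in per_entry_pr.items()) / total

pr = {"iturri": (1.0, 0.5), "etxe": (0.8, 0.8)}   # per-entry (P, R)
freq = {"iturri": 120, "etxe": 900}               # corpus frequencies
print(avg_f(pr, beta=2.0), w_avg_f(pr, freq))
```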
91 7 Conclusions This paper has analyzed IC and DS for the task of pruning wrong translations from bilingual dictionaries built by means of pivot techniques. [sent-350, score-1.077]
92 After analyzing their strong and weak points, we have shown that IC requires dictionaries with a high ambiguity level and several lexical variants per entry sense. [sent-351, score-0.47]
93 Future experiments include contrasting these results with other dictionaries and language pairs. [sent-364, score-0.304]
94 Compiling bilingual lexicon entries from a non-parallel English-Chinese corpus. [sent-382, score-0.313]
95 Automatic generation of bilingual dictionaries using intermediary languages and comparable corpora. [sent-387, score-0.528]
96 Automatic construction of a Japanese-Chinese dictionary via English. [sent-399, score-0.222]
97 Automatic construction of a transfer dictionary considering directionality. [sent-413, score-0.222]
98 Linking English words in two bilingual dictionaries to generate another language pair dictionary. [sent-433, score-0.445]
99 Construction of a bilingual dictionary intermediated by a third language. [sent-442, score-0.336]
100 Building bilingual lexicons using lexical translation probabilities via pivot languages. [sent-446, score-0.624]
wordName wordTfidf (topN-words)
[('ic', 0.434), ('pivot', 0.37), ('dictionaries', 0.304), ('ds', 0.301), ('dictionary', 0.195), ('deu', 0.186), ('entries', 0.172), ('bilingual', 0.141), ('basque', 0.126), ('resourced', 0.126), ('translations', 0.119), ('translation', 0.113), ('dealing', 0.109), ('frequency', 0.099), ('equivalents', 0.091), ('consultation', 0.09), ('dc', 0.085), ('tanaka', 0.084), ('entry', 0.079), ('kaji', 0.078), ('en', 0.073), ('inverse', 0.072), ('faucet', 0.072), ('gamallo', 0.072), ('grifo', 0.072), ('iturri', 0.072), ('paik', 0.072), ('tap', 0.072), ('wrong', 0.072), ('thresholds', 0.07), ('frequent', 0.07), ('da', 0.067), ('tr', 0.066), ('precision', 0.062), ('recall', 0.062), ('characteristics', 0.059), ('bond', 0.058), ('des', 0.058), ('avgf', 0.054), ('pichel', 0.054), ('shezaf', 0.054), ('shirai', 0.054), ('umemura', 0.054), ('wavgf', 0.054), ('ambiguous', 0.054), ('source', 0.053), ('transitive', 0.052), ('variants', 0.052), ('sc', 0.05), ('adequate', 0.049), ('languages', 0.049), ('threshold', 0.045), ('pruning', 0.043), ('coverage', 0.043), ('pairs', 0.043), ('decline', 0.042), ('es', 0.039), ('resources', 0.038), ('minimum', 0.038), ('merged', 0.036), ('consulted', 0.036), ('corresponded', 0.036), ('dss', 0.036), ('elhuyar', 0.036), ('erdmann', 0.036), ('lcomb', 0.036), ('noised', 0.036), ('researched', 0.036), ('strictness', 0.036), ('tsunakawa', 0.036), ('reference', 0.036), ('senses', 0.035), ('spanish', 0.035), ('ambiguity', 0.035), ('conditions', 0.035), ('comparable', 0.034), ('sa', 0.034), ('variability', 0.034), ('id', 0.034), ('depending', 0.033), ('fulfill', 0.031), ('sense', 0.031), ('contexts', 0.031), ('parallel', 0.03), ('similarity', 0.029), ('analyzed', 0.028), ('terminology', 0.028), ('compiling', 0.028), ('istv', 0.028), ('pacling', 0.028), ('calculated', 0.028), ('respect', 0.028), ('distributional', 0.028), ('construction', 0.027), ('candidates', 0.027), ('according', 0.027), ('estimated', 0.026), ('starts', 0.026), ('establishing', 0.026), ('evidences', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999899 18 emnlp-2011-Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries
Author: Xabier Saralegi ; Iker Manterola ; Inaki San Vicente
Abstract: An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries using B as pivot. However, polysemous pivot words often produce wrong translation candidates. This paper analyzes two methods for pruning wrong candidates: one based on exploiting the structure of the source dictionaries, and the other based on distributional similarity computed from comparable corpora. As both methods depend exclusively on easily available resources, they are well suited to less resourced languages. We studied whether these two techniques complement each other given that they are based on different paradigms. We also researched combining them by looking for the best adequacy depending on various application scenarios.
2 0.097240463 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
Author: Chang Liu ; Daniel Dahlmeier ; Hwee Tou Ng
Abstract: Many machine translation evaluation metrics have been proposed after the seminal BLEU metric, and many among them have been found to consistently outperform BLEU, demonstrated by their better correlations with human judgment. It has long been the hope that by tuning machine translation systems against these new generation metrics, advances in automatic machine translation evaluation can lead directly to advances in automatic machine translation. However, to date there has been no unambiguous report that these new metrics can improve a state-of-theart machine translation system over its BLEUtuned baseline. In this paper, we demonstrate that tuning Joshua, a hierarchical phrase-based statistical machine translation system, with the TESLA metrics results in significantly better humanjudged translation quality than the BLEUtuned baseline. TESLA-M in particular is simple and performs well in practice on large datasets. We release all our implementation under an open source license. It is our hope that this work will encourage the machine translation community to finally move away from BLEU as the unquestioned default and to consider the new generation metrics when tuning their systems.
3 0.094618775 118 emnlp-2011-SMT Helps Bitext Dependency Parsing
Author: Wenliang Chen ; Jun'ichi Kazama ; Min Zhang ; Yoshimasa Tsuruoka ; Yujie Zhang ; Yiou Wang ; Kentaro Torisawa ; Haizhou Li
Abstract: We propose a method to improve the accuracy of parsing bilingual texts (bitexts) with the help of statistical machine translation (SMT) systems. Previous bitext parsing methods use human-annotated bilingual treebanks that are hard to obtain. Instead, our approach uses an auto-generated bilingual treebank to produce bilingual constraints. However, because the auto-generated bilingual treebank contains errors, the bilingual constraints are noisy. To overcome this problem, we use large-scale unannotated data to verify the constraints and design a set of effective bilingual features for parsing models based on the verified results. The experimental results show that our new parsers significantly outperform state-of-the-art baselines. Moreover, our approach is still able to provide improvement when we use a larger monolingual treebank that results in a much stronger baseline. Especially notable is that our approach can be used in a purely monolingual setting with the help of SMT.
4 0.088226691 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
Author: Amittai Axelrod ; Xiaodong He ; Jianfeng Gao
Abstract: We explore efficient domain adaptation for the task of statistical machine translation based on extracting sentences from a large general-domain parallel corpus that are most relevant to the target domain. These sentences may be selected with simple cross-entropy based methods, of which we present three. As these sentences are not themselves identical to the in-domain data, we call them pseudo in-domain subcorpora. These subcorpora, 1% the size of the original, can then be used to train small domain-adapted Statistical Machine Translation (SMT) systems which outperform systems trained on the entire corpus. Performance is further improved when we use these domain-adapted models in combination with a true in-domain model. The results show that more training data is not always better, and that best results are attained via proper domain-relevant data selection, as well as combining in- and general-domain systems during decoding.
5 0.08405678 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices
Author: Jagadeesh Jagarlamudi ; Raghavendra Udupa ; Hal Daume III ; Abhijit Bhole
Abstract: Mapping documents into an interlingual representation can help bridge the language barrier of cross-lingual corpora. Many existing approaches are based on word co-occurrences extracted from aligned training data, represented as a covariance matrix. In theory, such a covariance matrix should represent semantic equivalence, and should be highly sparse. Unfortunately, the presence of noise leads to dense covariance matrices which in turn leads to suboptimal document representations. In this paper, we explore techniques to recover the desired sparsity in covariance matrices in two ways. First, we explore word association measures and bilingual dictionaries to weigh the word pairs. Later, we explore different selection strategies to remove the noisy pairs based on the association scores. Our experimental results on the task of aligning comparable documents shows the efficacy of sparse covariance matrices on two data sets from two different language pairs.
6 0.082699224 25 emnlp-2011-Cache-based Document-level Statistical Machine Translation
7 0.071960755 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
8 0.068650968 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing
9 0.064406551 125 emnlp-2011-Statistical Machine Translation with Local Language Models
10 0.06139591 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
11 0.05467169 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
12 0.052450545 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
13 0.051638428 99 emnlp-2011-Non-parametric Bayesian Segmentation of Japanese Noun Phrases
14 0.051114526 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
15 0.049846131 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
16 0.049530853 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers
17 0.049128518 76 emnlp-2011-Language Models for Machine Translation: Original vs. Translated Texts
18 0.048853032 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
19 0.048813548 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
20 0.047729064 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
topicId topicWeight
[(0, 0.183), (1, 0.025), (2, 0.01), (3, -0.144), (4, 0.016), (5, -0.027), (6, -0.04), (7, 0.06), (8, -0.111), (9, 0.037), (10, 0.032), (11, 0.027), (12, 0.035), (13, 0.072), (14, 0.023), (15, 0.089), (16, 0.014), (17, -0.043), (18, -0.034), (19, -0.109), (20, -0.092), (21, -0.216), (22, 0.014), (23, 0.056), (24, 0.033), (25, -0.104), (26, 0.055), (27, 0.065), (28, -0.01), (29, 0.001), (30, -0.055), (31, 0.053), (32, -0.106), (33, 0.156), (34, 0.064), (35, 0.052), (36, 0.055), (37, -0.079), (38, 0.055), (39, 0.009), (40, 0.037), (41, 0.051), (42, -0.089), (43, 0.034), (44, -0.061), (45, 0.094), (46, 0.144), (47, -0.045), (48, 0.025), (49, -0.001)]
simIndex simValue paperId paperTitle
same-paper 1 0.950445 18 emnlp-2011-Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries
Author: Xabier Saralegi ; Iker Manterola ; Inaki San Vicente
Abstract: An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries using B as pivot. However, polysemous pivot words often produce wrong translation candidates. This paper analyzes two methods for pruning wrong candidates: one based on exploiting the structure of the source dictionaries, and the other based on distributional similarity computed from comparable corpora. As both methods depend exclusively on easily available resources, they are well suited to less resourced languages. We studied whether these two techniques complement each other given that they are based on different paradigms. We also researched combining them by looking for the best adequacy depending on various application scenarios.
2 0.63218713 25 emnlp-2011-Cache-based Document-level Statistical Machine Translation
Author: Zhengxian Gong ; Min Zhang ; Guodong Zhou
Abstract: Statistical machine translation systems are usually trained on a large amount of bilingual sentence pairs and translate one sentence at a time, ignoring document-level information. In this paper, we propose a cache-based approach to document-level translation. Since caches mainly depend on relevant data to supervise subsequent decisions, it is critical to fill the caches with highly-relevant data of a reasonable size. In this paper, we present three kinds of caches to store relevant document-level information: 1) a dynamic cache, which stores bilingual phrase pairs from the best translation hypotheses of previous sentences in the test document; 2) a static cache, which stores relevant bilingual phrase pairs extracted from similar bilingual document pairs (i.e. source documents similar to the test document and their corresponding target documents) in the training parallel corpus; 3) a topic cache, which stores the target-side topic words related to the test document in the source side. In particular, three new features are designed to explore various kinds of document-level information in the above three kinds of caches. Evaluation shows the effectiveness of our cache-based approach to document-level translation with a performance improvement of 0.81 in BLEU score over Moses. Especially, detailed analysis and discussion are presented to give new insights into document-level translation.
3 0.60953885 118 emnlp-2011-SMT Helps Bitext Dependency Parsing
Author: Wenliang Chen ; Jun'ichi Kazama ; Min Zhang ; Yoshimasa Tsuruoka ; Yujie Zhang ; Yiou Wang ; Kentaro Torisawa ; Haizhou Li
Abstract: We propose a method to improve the accuracy of parsing bilingual texts (bitexts) with the help of statistical machine translation (SMT) systems. Previous bitext parsing methods use human-annotated bilingual treebanks that are hard to obtain. Instead, our approach uses an auto-generated bilingual treebank to produce bilingual constraints. However, because the auto-generated bilingual treebank contains errors, the bilingual constraints are noisy. To overcome this problem, we use large-scale unannotated data to verify the constraints and design a set of effective bilingual features for parsing models based on the verified results. The experimental results show that our new parsers significantly outperform state-of-the-art baselines. Moreover, our approach is still able to provide improvement when we use a larger monolingual treebank that results in a much stronger baseline. Especially notable is that our approach can be used in a purely monolingual setting with the help of SMT.
4 0.58292234 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices
Author: Jagadeesh Jagarlamudi ; Raghavendra Udupa ; Hal Daume III ; Abhijit Bhole
Abstract: Mapping documents into an interlingual representation can help bridge the language barrier of cross-lingual corpora. Many existing approaches are based on word co-occurrences extracted from aligned training data, represented as a covariance matrix. In theory, such a covariance matrix should represent semantic equivalence, and should be highly sparse. Unfortunately, the presence of noise leads to dense covariance matrices which in turn leads to suboptimal document representations. In this paper, we explore techniques to recover the desired sparsity in covariance matrices in two ways. First, we explore word association measures and bilingual dictionaries to weigh the word pairs. Later, we explore different selection strategies to remove the noisy pairs based on the association scores. Our experimental results on the task of aligning comparable documents shows the efficacy of sparse covariance matrices on two data sets from two different language pairs.
5 0.51209396 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
Author: Amittai Axelrod ; Xiaodong He ; Jianfeng Gao
Abstract: We explore efficient domain adaptation for the task of statistical machine translation based on extracting sentences from a large general-domain parallel corpus that are most relevant to the target domain. These sentences may be selected with simple cross-entropy based methods, of which we present three. As these sentences are not themselves identical to the in-domain data, we call them pseudo in-domain subcorpora. These subcorpora, 1% the size of the original, can then be used to train small domain-adapted Statistical Machine Translation (SMT) systems which outperform systems trained on the entire corpus. Performance is further improved when we use these domain-adapted models in combination with a true in-domain model. The results show that more training data is not always better, and that best results are attained via proper domain-relevant data selection, as well as combining in- and general-domain systems during decoding.
6 0.50730801 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
7 0.43149462 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation
8 0.39767718 112 emnlp-2011-Refining the Notions of Depth and Density in WordNet-based Semantic Similarity Measures
9 0.37164363 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
10 0.3634547 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
11 0.3404974 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP
12 0.33680838 66 emnlp-2011-Hierarchical Phrase-based Translation Representations
13 0.32279286 76 emnlp-2011-Language Models for Machine Translation: Original vs. Translated Texts
14 0.31046477 148 emnlp-2011-Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation.
15 0.31042239 86 emnlp-2011-Lexical Co-occurrence, Statistical Significance, and Word Association
16 0.30312839 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition
17 0.30258778 36 emnlp-2011-Corroborating Text Evaluation Results with Heterogeneous Measures
18 0.30190301 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
19 0.29515243 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
20 0.29169074 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction
topicId topicWeight
[(23, 0.089), (36, 0.018), (37, 0.017), (45, 0.077), (53, 0.463), (54, 0.016), (57, 0.02), (62, 0.02), (64, 0.034), (66, 0.029), (69, 0.016), (79, 0.043), (82, 0.011), (90, 0.013), (96, 0.028), (98, 0.018)]
simIndex simValue paperId paperTitle
1 0.92013824 125 emnlp-2011-Statistical Machine Translation with Local Language Models
Author: Christof Monz
Abstract: Part-of-speech language modeling is commonly used as a component in statistical machine translation systems, but there is mixed evidence that its usage leads to significant improvements. We argue that its limited effectiveness is due to the lack of lexicalization. We introduce a new approach that builds a separate local language model for each word and part-of-speech pair. The resulting models lead to more context-sensitive probability distributions and we also exploit the fact that different local models are used to estimate the language model probability of each word during decoding. Our approach is evaluated for Arabic- and Chinese-to-English translation. We show that it leads to statistically significant improvements for multiple test sets and also across different genres, when compared against a competitive baseline and a system using a part-of-speech model.
same-paper 2 0.86364633 18 emnlp-2011-Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries
Author: Xabier Saralegi ; Iker Manterola ; Inaki San Vicente
Abstract: An A-C bilingual dictionary can be inferred by merging A-B and B-C dictionaries using B as pivot. However, polysemous pivot words often produce wrong translation candidates. This paper analyzes two methods for pruning wrong candidates: one based on exploiting the structure of the source dictionaries, and the other based on distributional similarity computed from comparable corpora. As both methods depend exclusively on easily available resources, they are well suited to less resourced languages. We studied whether these two techniques complement each other given that they are based on different paradigms. We also researched combining them by looking for the best adequacy depending on various application scenarios.
3 0.50068605 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
Author: Yang Gao ; Philipp Koehn ; Alexandra Birch
Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.
4 0.49100855 76 emnlp-2011-Language Models for Machine Translation: Original vs. Translated Texts
Author: Gennadi Lembersky ; Noam Ordan ; Shuly Wintner
Abstract: We investigate the differences between language models compiled from original target-language texts and those compiled from texts manually translated to the target language. Corroborating established observations of Translation Studies, we demonstrate that the latter are significantly better predictors of translated sentences than the former, and hence fit the reference set better. Furthermore, translated texts yield better language models for statistical machine translation than original texts.
5 0.48550394 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
Author: Chang Liu ; Daniel Dahlmeier ; Hwee Tou Ng
Abstract: Many machine translation evaluation metrics have been proposed after the seminal BLEU metric, and many among them have been found to consistently outperform BLEU, demonstrated by their better correlations with human judgment. It has long been the hope that by tuning machine translation systems against these new generation metrics, advances in automatic machine translation evaluation can lead directly to advances in automatic machine translation. However, to date there has been no unambiguous report that these new metrics can improve a state-of-theart machine translation system over its BLEUtuned baseline. In this paper, we demonstrate that tuning Joshua, a hierarchical phrase-based statistical machine translation system, with the TESLA metrics results in significantly better humanjudged translation quality than the BLEUtuned baseline. TESLA-M in particular is simple and performs well in practice on large datasets. We release all our implementation under an open source license. It is our hope that this work will encourage the machine translation community to finally move away from BLEU as the unquestioned default and to consider the new generation metrics when tuning their systems.
6 0.4515067 44 emnlp-2011-Domain Adaptation via Pseudo In-Domain Data Selection
7 0.44209141 25 emnlp-2011-Cache-based Document-level Statistical Machine Translation
8 0.4384771 66 emnlp-2011-Hierarchical Phrase-based Translation Representations
10 0.42687666 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
11 0.41868398 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
12 0.41311559 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
13 0.41252306 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing
14 0.40466303 3 emnlp-2011-A Correction Model for Word Alignments
15 0.40100706 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
16 0.40073648 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
17 0.39954579 46 emnlp-2011-Efficient Subsampling for Training Complex Language Models
18 0.39796543 38 emnlp-2011-Data-Driven Response Generation in Social Media
19 0.39550275 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
20 0.39540482 97 emnlp-2011-Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French