acl acl2011 acl2011-232 knowledge-graph by maker-knowledge-mining

232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars


Source: pdf

Author: Yun Huang ; Min Zhang ; Chew Lim Tan

Abstract: Machine transliteration is defined as automatic phonetic translation of names across languages. In this paper, we propose synchronous adaptor grammar, a novel nonparametric Bayesian learning approach, for machine transliteration. This model provides a general framework without heuristic or restriction to automatically learn syllable equivalents between languages. The proposed model outperforms the state-of-the-art EM-based model in the English to Chinese transliteration task.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Machine transliteration is defined as automatic phonetic translation of names across languages. [sent-10, score-0.455]

2 In this paper, we propose synchronous adaptor grammar, a novel nonparametric Bayesian learning approach, for machine transliteration. [sent-11, score-0.767]

3 This model provides a general framework without heuristic or restriction to automatically learn syllable equivalents between languages. [sent-12, score-0.293]

4 The proposed model outperforms the state-of-the-art EMbased model in the English to Chinese transliteration task. [sent-13, score-0.302]

5 1 Introduction Proper names are one source of OOV words in many NLP tasks, such as machine translation and cross-lingual information retrieval. [sent-14, score-0.167]

6 translation by preserving how words sound in both languages. [sent-17, score-0.146]

7 In general, machine transliteration is often modelled as monotonic machine translation (Rama and Gali, 2009; Finch and Sumita, 2009; Finch and Sumita, 2010) or with joint source-channel models (Li et al. [sent-18, score-0.554]

8 Syllable equivalents acquisition is a critical phase for all these models. [sent-21, score-0.166]

9 Traditional learning approaches aim to maximize the likelihood of training data by the Expectation-Maximization (EM) algorithm. [sent-22, score-0.043]

10 However, the EM algorithm may over-fit the training data by memorizing the whole training instances. [sent-23, score-0.072]

11 To avoid this problem, some approaches restrict that a single character in one language could be aligned to many characters of the other, but not vice versa (Li et al. [sent-24, score-0.089]

12 Heuristics are introduced to obtain many-to-many alignments by combining two directional one-to-many alignments (Rama and Gali, 2009). [sent-27, score-0.165]

13 Compared to maximum likelihood approaches, Bayesian models provide a systematic way to encode knowledge and infer compact structures. [sent-28, score-0.115]

14 They have been successfully applied to many machine learning tasks (Liu and Gildea, 2009; Zhang et al. [sent-29, score-0.036]

15 Among these models, Adaptor Grammars (AGs) provide a framework for defining nonparametric Bayesian models based on PCFGs (Johnson et al. [sent-32, score-0.132]

16 They introduce additional stochastic processes (named adaptors) allowing the expansion of an adapted symbol to depend on the expansion history. [sent-34, score-0.291]

17 Since many existing models could be viewed as special kinds of PCFG, adaptor grammars give general Bayesian extension to them. [sent-35, score-0.566]

18 AGs have been used in various NLP tasks such as topic modeling (Johnson, 2010), perspective modeling (Hardisty et al. [sent-36, score-0.036]

19 We also describe how transliteration could be modelled under this formalism. [sent-39, score-0.446]

20 It should be emphasized that the proposed method is language independent and heuristic-free. [sent-40, score-0.051]

21 Experiments show the proposed approach outperforms the strong EM-based baseline in the English to Chinese transliteration task. [sent-41, score-0.302]
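The ranked sentences above can in principle be reproduced with a simple tf-idf scorer. The sketch below is hypothetical (the mining pipeline behind this page is not published): each sentence is scored by the summed tf-idf weight of its words, so sentences with locally frequent but globally rare terms rank highest.

```python
import math
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the summed tf-idf weight of its words.

    tf  = word frequency within the sentence,
    idf = log(#sentences / #sentences containing the word).
    """
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Sentence-level document frequency: count each word once per sentence.
    df = Counter(w for toks in tokenized for w in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = sum((tf[w] / len(toks)) * math.log(n / df[w]) for w in tf)
        scores.append(score)
    return scores

# Illustrative sentences (not the actual extraction input).
sents = [
    "machine transliteration is phonetic translation of names",
    "we propose synchronous adaptor grammars for transliteration",
    "experiments show the proposed approach outperforms the baseline",
]
scores = tfidf_sentence_scores(sents)
```

Sorting sentences by these scores would produce a ranked summary list like the one above.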


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('adaptor', 0.43), ('transliteration', 0.302), ('synchronous', 0.205), ('ags', 0.203), ('rama', 0.177), ('samplesag', 0.177), ('samplescfg', 0.177), ('gali', 0.156), ('modelled', 0.144), ('bayesian', 0.136), ('grammars', 0.136), ('finch', 0.135), ('johnson', 0.134), ('nonparametric', 0.132), ('equivalents', 0.128), ('sumita', 0.123), ('syllable', 0.123), ('gs', 0.118), ('singapore', 0.101), ('draw', 0.101), ('adapted', 0.093), ('nonterminals', 0.083), ('rul', 0.078), ('multi', 0.078), ('ene', 0.078), ('ance', 0.078), ('adaptors', 0.078), ('ddi', 0.078), ('otefd', 0.078), ('return', 0.076), ('infocomm', 0.072), ('mzhang', 0.072), ('zn', 0.072), ('sof', 0.072), ('darwish', 0.072), ('memorizing', 0.072), ('yor', 0.072), ('hardisty', 0.072), ('ntr', 0.072), ('systemic', 0.072), ('rsa', 0.072), ('expansion', 0.069), ('pitman', 0.068), ('yun', 0.068), ('reddy', 0.068), ('na', 0.065), ('symbols', 0.062), ('nis', 0.061), ('hp', 0.061), ('tohfe', 0.061), ('symbol', 0.06), ('lingual', 0.059), ('otf', 0.059), ('directional', 0.057), ('monotonic', 0.055), ('cache', 0.055), ('names', 0.055), ('alignments', 0.054), ('pcfgs', 0.054), ('zi', 0.054), ('scfg', 0.054), ('translation', 0.053), ('oov', 0.052), ('versa', 0.052), ('yang', 0.052), ('em', 0.051), ('emphasized', 0.051), ('dir', 0.05), ('tb', 0.05), ('drive', 0.049), ('tu', 0.049), ('preserving', 0.048), ('chew', 0.047), ('goldwater', 0.046), ('blunsom', 0.046), ('ia', 0.045), ('phonetic', 0.045), ('tthhee', 0.045), ('sound', 0.045), ('likelihood', 0.043), ('bi', 0.042), ('indexed', 0.042), ('chinese', 0.042), ('restriction', 0.042), ('rt', 0.042), ('lim', 0.041), ('tt', 0.04), ('pcfg', 0.04), ('nonterminal', 0.04), ('tuple', 0.039), ('sample', 0.038), ('critical', 0.038), ('ss', 0.038), ('li', 0.038), ('grammar', 0.037), ('nt', 0.037), ('vice', 0.037), ('else', 0.037), ('ts', 0.037), ('tasks', 0.036)]
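The (word, weight) pairs above form a sparse tf-idf vector for this paper; similarity values like the simValue column in the lists below are typically cosine similarities between such vectors. A minimal sketch under that assumption (the example vectors are truncated and partly invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse tf-idf vectors (word -> weight)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Truncated, illustrative versions of sparse tf-idf vectors.
this_paper = {"adaptor": 0.43, "transliteration": 0.302, "synchronous": 0.205}
other_paper = {"transliteration": 0.4, "alignment": 0.3, "mining": 0.2}

self_sim = cosine(this_paper, this_paper)    # identical vectors score 1.0
cross_sim = cosine(this_paper, other_paper)  # partial overlap scores in (0, 1)
```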

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars


2 0.27143469 34 acl-2011-An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment

Author: Hassan Sajjad ; Alexander Fraser ; Helmut Schmid

Abstract: We propose a language-independent method for the automatic extraction of transliteration pairs from parallel corpora. In contrast to previous work, our method uses no form of supervision, and does not require linguistically informed preprocessing. We conduct experiments on data sets from the NEWS 2010 shared task on transliteration mining and achieve an F-measure of up to 92%, outperforming most of the semi-supervised systems that were submitted. We also apply our method to English/Hindi and English/Arabic parallel corpora and compare the results with manually built gold standards which mark transliterated word pairs. Finally, we integrate the transliteration module into the GIZA++ word aligner and evaluate it on two word alignment tasks achieving improvements in both precision and recall measured against gold standard word alignments.

3 0.18135329 197 acl-2011-Latent Class Transliteration based on Source Language Origin

Author: Masato Hagiwara ; Satoshi Sekine

Abstract: Transliteration, a rich source of proper noun spelling variations, is usually recognized by phonetic- or spelling-based models. However, a single model cannot deal with different words from different language origins, e.g., “get” in “piaget” and “target.” Li et al. (2007) propose a method which explicitly models and classifies the source language origins and switches transliteration models accordingly. This model, however, requires an explicitly tagged training set with language origins. We propose a novel method which models language origins as latent classes. The parameters are learned from a set of transliterated word pairs via the EM algorithm. The experimental results of the transliteration task of Western names to Japanese show that the proposed model can achieve higher accuracy compared to the conventional models without latent classes.

4 0.13331166 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation

Author: Coskun Mermer ; Murat Saraclar

Abstract: In this work, we compare the translation performance of word alignments obtained via Bayesian inference to those obtained via expectation-maximization (EM). We propose a Gibbs sampler for fully Bayesian inference in IBM Model 1, integrating over all possible parameter values in finding the alignment distribution. We show that Bayesian inference outperforms EM in all of the tested language pairs, domains and data set sizes, by up to 2.99 BLEU points. We also show that the proposed method effectively addresses the well-known rare word problem in EM-estimated models; and at the same time induces a much smaller dictionary of bilingual word-pairs.

5 0.1326457 43 acl-2011-An Unsupervised Model for Joint Phrase Alignment and Extraction

Author: Graham Neubig ; Taro Watanabe ; Eiichiro Sumita ; Shinsuke Mori ; Tatsuya Kawahara

Abstract: We present an unsupervised model for joint phrase alignment and extraction using nonparametric Bayesian methods and inversion transduction grammars (ITGs). The key contribution is that phrases of many granularities are included directly in the model through the use of a novel formulation that memorizes phrases generated not only by terminal, but also non-terminal symbols. This allows for a completely probabilistic model that is able to create a phrase table that achieves competitive accuracy on phrase-based machine translation tasks directly from unaligned sentence pairs. Experiments on several language pairs demonstrate that the proposed model matches the accuracy of traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of the original size.

6 0.12921135 153 acl-2011-How do you pronounce your name? Improving G2P with transliterations

7 0.12422217 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

8 0.1126963 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

9 0.081196941 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

10 0.078343861 296 acl-2011-Terminal-Aware Synchronous Binarization

11 0.078164406 61 acl-2011-Binarized Forest to String Translation

12 0.074345939 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

13 0.071510009 30 acl-2011-Adjoining Tree-to-String Translation

14 0.065921105 188 acl-2011-Judging Grammaticality with Tree Substitution Grammar Derivations

15 0.059214205 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

16 0.058505163 221 acl-2011-Model-Based Aligner Combination Using Dual Decomposition

17 0.054685313 290 acl-2011-Syntax-based Statistical Machine Translation using Tree Automata and Tree Transducers

18 0.054353114 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

19 0.053386517 49 acl-2011-Automatic Evaluation of Chinese Translation Output: Word-Level or Character-Level?

20 0.052163161 137 acl-2011-Fine-Grained Class Label Markup of Search Queries


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.137), (1, -0.094), (2, 0.043), (3, 0.027), (4, 0.013), (5, -0.022), (6, -0.061), (7, -0.012), (8, -0.048), (9, 0.079), (10, 0.007), (11, 0.067), (12, 0.023), (13, 0.176), (14, 0.052), (15, 0.005), (16, 0.068), (17, 0.105), (18, 0.255), (19, -0.06), (20, -0.154), (21, 0.19), (22, -0.117), (23, -0.076), (24, -0.099), (25, -0.036), (26, 0.026), (27, -0.051), (28, -0.004), (29, 0.043), (30, -0.069), (31, 0.05), (32, -0.04), (33, 0.05), (34, 0.008), (35, -0.034), (36, 0.056), (37, 0.012), (38, 0.058), (39, 0.056), (40, -0.012), (41, 0.046), (42, -0.016), (43, -0.035), (44, -0.033), (45, 0.068), (46, -0.049), (47, 0.005), (48, 0.016), (49, -0.007)]
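Topic weights like those above come from LSI, i.e. a truncated SVD of a term-document matrix: each paper is projected into a low-dimensional latent-topic space. A small self-contained sketch with a toy matrix (the terms, counts, and the choice of k = 2 topics are illustrative assumptions):

```python
import numpy as np

# Toy term-document count matrix: rows = terms, columns = papers.
# Terms and counts are invented for illustration.
X = np.array([
    [3.0, 2.0, 0.0, 0.0],  # "transliteration"
    [2.0, 3.0, 0.0, 1.0],  # "grammar"
    [0.0, 0.0, 3.0, 2.0],  # "parsing"
    [0.0, 1.0, 2.0, 3.0],  # "alignment"
])

# LSI: truncated SVD of the term-document matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2  # number of latent topics kept
# One row of k topic weights per paper, analogous to the list above.
topic_weights = (np.diag(s[:k]) @ Vt[:k]).T
```

Similar papers can then be ranked by comparing their rows of `topic_weights` instead of the much larger raw word vectors.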

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95008779 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars


2 0.81166762 34 acl-2011-An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment


3 0.80022901 197 acl-2011-Latent Class Transliteration based on Source Language Origin


4 0.69778627 153 acl-2011-How do you pronounce your name? Improving G2P with transliterations

Author: Aditya Bhargava ; Grzegorz Kondrak

Abstract: Grapheme-to-phoneme conversion (G2P) of names is an important and challenging problem. The correct pronunciation of a name is often reflected in its transliterations, which are expressed within a different phonological inventory. We investigate the problem of using transliterations to correct errors produced by state-of-the-art G2P systems. We present a novel re-ranking approach that incorporates a variety of score and n-gram features, in order to leverage transliterations from multiple languages. Our experiments demonstrate significant accuracy improvements when re-ranking is applied to n-best lists generated by three different G2P programs.

5 0.48608783 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars

Author: Mark-Jan Nederhof ; Giorgio Satta

Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.

6 0.42265978 43 acl-2011-An Unsupervised Model for Joint Phrase Alignment and Extraction

7 0.38972169 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

8 0.37189341 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar

9 0.36938825 57 acl-2011-Bayesian Word Alignment for Statistical Machine Translation

10 0.36367044 154 acl-2011-How to train your multi bottom-up tree transducer

11 0.34874922 151 acl-2011-Hindi to Punjabi Machine Translation System

12 0.3456783 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations

13 0.33900574 87 acl-2011-Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules

14 0.33535641 173 acl-2011-Insertion Operator for Bayesian Tree Substitution Grammars

15 0.33320713 11 acl-2011-A Fast and Accurate Method for Approximate String Search

16 0.32797223 325 acl-2011-Unsupervised Word Alignment with Arbitrary Features

17 0.32206935 93 acl-2011-Dealing with Spurious Ambiguity in Learning ITG-based Word Alignment

18 0.32147488 335 acl-2011-Why Initialization Matters for IBM Model 1: Multiple Optima and Non-Strict Convexity

19 0.31772316 15 acl-2011-A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

20 0.31771111 296 acl-2011-Terminal-Aware Synchronous Binarization


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.01), (17, 0.033), (37, 0.075), (39, 0.045), (41, 0.547), (55, 0.023), (59, 0.023), (72, 0.012), (91, 0.021), (96, 0.119)]
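The lda list above is a sparse (topicId, topicWeight) view of a document-topic distribution: only topics above some probability threshold are kept. A hypothetical sketch of how such a list might be derived and how two papers could then be compared by topic overlap (the distributions are made up; a real pipeline would obtain them from a trained LDA model):

```python
def sparse_topics(theta, threshold=0.01):
    """Keep only topics at or above a probability threshold,
    as (topicId, topicWeight) pairs like the list above."""
    return [(i, round(w, 3)) for i, w in enumerate(theta) if w >= threshold]

def dot_sim(a, b):
    """Similarity of two papers as the dot product of their sparse
    topic distributions: high when they concentrate on the same topics."""
    db = dict(b)
    return sum(w * db.get(t, 0.0) for t, w in a)

# Hypothetical document-topic distributions over 4 topics.
theta_a = [0.005, 0.55, 0.02, 0.425]
theta_b = [0.0, 0.60, 0.0, 0.40]
sa, sb = sparse_topics(theta_a), sparse_topics(theta_b)
sim = dot_sim(sa, sb)
```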

similar papers list:

simIndex simValue paperId paperTitle

1 0.95795041 83 acl-2011-Contrasting Multi-Lingual Prosodic Cues to Predict Verbal Feedback for Rapport

Author: Siwei Wang ; Gina-Anne Levow

Abstract: Verbal feedback is an important information source in establishing interactional rapport. However, predicting verbal feedback across languages is challenging due to language-specific differences, inter-speaker variation, and the relative sparseness and optionality of verbal feedback. In this paper, we employ an approach combining classifier weighting and SMOTE algorithm oversampling to improve verbal feedback prediction in Arabic, English, and Spanish dyadic conversations. This approach improves the prediction of verbal feedback, up to 6-fold, while maintaining a high overall accuracy. Analyzing highly weighted features highlights widespread use of pitch, with more varied use of intensity and duration.

2 0.95069015 328 acl-2011-Using Cross-Entity Inference to Improve Event Extraction

Author: Yu Hong ; Jianfeng Zhang ; Bin Ma ; Jianmin Yao ; Guodong Zhou ; Qiaoming Zhu

Abstract: Event extraction is the task of detecting certain specified types of events that are mentioned in the source language data. The state-of-the-art research on the task is transductive inference (e.g. cross-event inference). In this paper, we propose a new method of event extraction by well using cross-entity inference. In contrast to previous inference methods, we regard entity-type consistency as key feature to predict event mentions. We adopt this inference method to improve the traditional sentence-level event extraction system. Experiments show that we can get 8.6% gain in trigger (event) identification, and more than 11.8% gain for argument (role) classification in ACE event extraction.

3 0.93281502 219 acl-2011-Metagrammar engineering: Towards systematic exploration of implemented grammars

Author: Antske Fokkens

Abstract: When designing grammars of natural language, typically, more than one formal analysis can account for a given phenomenon. Moreover, because analyses interact, the choices made by the engineer influence the possibilities available in further grammar development. The order in which phenomena are treated may therefore have a major impact on the resulting grammar. This paper proposes to tackle this problem by using metagrammar development as a methodology for grammar engineering. I argue that metagrammar engineering as an approach facilitates the systematic exploration of grammars through comparison of competing analyses. The idea is illustrated through a comparative study of auxiliary structures in HPSG-based grammars for German and Dutch. Auxiliaries form a central phenomenon of German and Dutch and are likely to influence many components of the grammar. This study shows that a special auxiliary+verb construction significantly improves efficiency compared to the standard argument-composition analysis for both parsing and generation.

4 0.92498219 189 acl-2011-K-means Clustering with Feature Hashing

Author: Hajime Senuma

Abstract: One of the major problems of K-means is that one must use dense vectors for its centroids, and therefore it is infeasible to store such huge vectors in memory when the feature space is high-dimensional. We address this issue by using feature hashing (Weinberger et al., 2009), a dimension-reduction technique, which can reduce the size of dense vectors while retaining sparsity of sparse vectors. Our analysis gives theoretical motivation and justification for applying feature hashing to Kmeans, by showing how much will the objective of K-means be (additively) distorted. Furthermore, to empirically verify our method, we experimented on a document clustering task.

5 0.92248791 139 acl-2011-From Bilingual Dictionaries to Interlingual Document Representations

Author: Jagadeesh Jagarlamudi ; Hal Daume III ; Raghavendra Udupa

Abstract: Mapping documents into an interlingual representation can help bridge the language barrier of a cross-lingual corpus. Previous approaches use aligned documents as training data to learn an interlingual representation, making them sensitive to the domain of the training data. In this paper, we learn an interlingual representation in an unsupervised manner using only a bilingual dictionary. We first use the bilingual dictionary to find candidate document alignments and then use them to find an interlingual representation. Since the candidate alignments are noisy, we develop a robust learning algorithm to learn the interlingual representation. We show that bilingual dictionaries generalize to different domains better: our approach gives better performance than either a word by word translation method or Canonical Correlation Analysis (CCA) trained on a different domain.

6 0.92153424 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing

same-paper 7 0.92101222 232 acl-2011-Nonparametric Bayesian Machine Transliteration with Synchronous Adaptor Grammars

8 0.90937167 56 acl-2011-Bayesian Inference for Zodiac and Other Homophonic Ciphers

9 0.83972174 185 acl-2011-Joint Identification and Segmentation of Domain-Specific Dialogue Acts for Conversational Dialogue Systems

10 0.64331132 94 acl-2011-Deciphering Foreign Language

11 0.64156401 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction

12 0.62650526 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

13 0.62353396 223 acl-2011-Modeling Wisdom of Crowds Using Latent Mixture of Discriminative Experts

14 0.60858512 33 acl-2011-An Affect-Enriched Dialogue Act Classification Model for Task-Oriented Dialogue

15 0.59760642 12 acl-2011-A Generative Entity-Mention Model for Linking Entities with Knowledge Base

16 0.59528369 135 acl-2011-Faster and Smaller N-Gram Language Models

17 0.58837128 244 acl-2011-Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts

18 0.58095652 316 acl-2011-Unary Constraints for Efficient Context-Free Parsing

19 0.58026397 40 acl-2011-An Error Analysis of Relation Extraction in Social Media Documents

20 0.57760823 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing