acl acl2013 acl2013-10 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yang Feng ; Trevor Cohn
Abstract: Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modelling of phrase-internal translation and reordering. However, phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source. Our experiments on Chinese to English and Arabic to English translation show consistent improvements over competitive baselines, of up to +3.4 BLEU.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modelling of phrase-internal translation and reordering. [sent-2, score-1.335]
2 However, phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. [sent-3, score-0.047]
3 We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. [sent-4, score-0.593]
4 Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. [sent-5, score-1.158]
5 This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source. [sent-6, score-0.289]
6 Our experiments on Chinese to English and Arabic to English translation show consistent improvements over competitive baselines, of up to +3.4 BLEU. [sent-7, score-0.384]
7 1 Introduction Recent years have witnessed burgeoning development of statistical machine translation research, notably phrase-based (Koehn et al. [sent-9, score-0.5]
8 These approaches model sentence translation as a sequence of simple translation decisions, such as the application of a phrase translation in phrase-based methods or a grammar rule in syntax-based approaches. [sent-13, score-1.244]
9 In order to simplify modelling, most MT models make an independence assumption, stating that the translation decisions in a derivation are independent of one another. [sent-14, score-0.784]
10 This conflicts with the intuition behind phrase-based MT, namely that translation decisions should be dependent on context. [sent-15, score-0.776]
11 On one hand, the use of phrases can memorize local context and hence helps to generate better translation compared to word-based models (Brown et al. [sent-19, score-0.51]
12 On the other hand, this mechanism requires each phrase to be matched strictly and to be used as a whole, which precludes the use of discontinuous phrases and leads to poor generalisation to unseen data (where large phrases tend not to match). [sent-21, score-0.45]
13 In this paper we propose a new model to drop the independence assumption, by instead modelling correlations between translation decisions, which we use to induce translation derivations from aligned sentences (akin to word alignment). [sent-22, score-1.18]
14 We develop a Markov model over translation decisions, in which each decision is conditioned on the n most recent decisions. [sent-23, score-0.718]
15 Our approach employs a sophisticated Bayesian non-parametric prior, namely the hierarchical Pitman-Yor Process (Teh, 2006; Teh et al. [sent-24, score-0.155]
16 , 2006) to represent backoff from larger to smaller contexts. [sent-25, score-0.069]
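The backoff recursion that such a hierarchical prior induces can be sketched in a few lines. This is a deliberately minimal approximation, not the paper's model: it uses a single fixed discount, omits the Pitman-Yor strength parameter and seating arrangements, and bottoms out in a uniform base distribution; all names here are hypothetical.

```python
from collections import Counter

def py_backoff_prob(word, context, counts, vocab_size, discount=0.75):
    """Back-off probability in the spirit of hierarchical Pitman-Yor smoothing.

    counts maps a context tuple (oldest word first) to a Counter of
    following words. Mass discounted from observed events at the long
    context is redistributed to the distribution at the shorter context
    (context[1:] drops the most distant word); the base case is uniform.
    """
    if len(context) == 0:
        base = 1.0 / vocab_size
    else:
        base = py_backoff_prob(word, context[1:], counts, vocab_size, discount)
    c = counts.get(context, Counter())
    total = sum(c.values())
    if total == 0:
        return base  # nothing observed here: fall through to shorter context
    types = len(c)  # number of distinct continuations seen in this context
    return max(c[word] - discount, 0) / total + (discount * types / total) * base
```

By construction the discounted mass exactly equals the interpolation weight, so the probabilities sum to one over the vocabulary at every context length.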
17 As a result, we need only use very simple translation units – primarily single words – but can still describe complex multi-word units through correlations between their component translation decisions. [sent-26, score-1.052]
18 We further decompose the process of generating each target word into component factors: finishing the translation, jumping elsewhere in the source, emitting a target word and deciding the fertility of the source words. [sent-27, score-0.724]
19 enabling model parameters to be shared between similar translation decisions, thereby obtaining more reliable statistics and generalizing better from small training sets. [sent-29, score-0.493]
20 learning a much richer set of translation fragments, such as gapping phrases. [sent-31, score-0.492]
21 providing a unifying framework spanning word-based and phrase-based models of translation, while incorporating explicit translation … [sent-41, score-0.105]
22 We demonstrate our model on Chinese-English and Arabic-English translation datasets. [sent-44, score-0.431]
23 The model produces uniformly better translations than those of a competitive phrase-based baseline, amounting to an improvement of up to 3.4 BLEU. [sent-45, score-0.105]
24 2 Related Work Word based models have a long history in machine translation, starting with the venerable IBM translation models (Brown et al. [sent-47, score-0.434]
25 These models are still in wide-spread use today, albeit only as a preprocessing step for inferring word level alignments from sentence-aligned parallel corpora. [sent-50, score-0.128]
26 They combine a number of factors, including distortion and fertility, which have been shown to improve word-alignment and translation performance over simpler models. [sent-51, score-0.384]
27 Our approach is similar to these works, as we also develop a word-based model, and explicitly consider similar translation decisions, alignment jumps and fertility. [sent-52, score-0.702]
28 Together this results in a model with rich expressiveness that can still generalize well to unseen data. [sent-54, score-0.2]
29 (2011) propose a rule Markov model for a tree-to-string model which models correlations between pairs of minimal rules, and use Kneser-Ney smoothing to alleviate the problems of data sparsity. [sent-57, score-0.215]
30 (2011) develop a bilingual language model which incorporates words in the source and target languages to predict the next unit, which they use as a feature in a translation system. [sent-59, score-0.783]
31 (2012) who develop a novel estimation algorithm based around discriminative projection into continuous spaces. [sent-61, score-0.117]
32 (2011), who present a sequence model of translation including reordering. [sent-63, score-0.431]
33 Our work also uses bilingual information, using the source words as part of the conditioning context. [sent-64, score-0.195]
34 In contrast to these approaches which primarily address the decoding problem, we focus on the learning problem of inferring alignments from parallel sentences. [sent-65, score-0.177]
35 Additionally, we develop a full generative model using a Bayesian prior, and incorporate additional factors besides lexical items, namely jumps in the source and word fertility. [sent-66, score-0.642]
36 Another aspect of this paper is the implicit support for phrase-pairs that are discontinuous in the source language. [sent-67, score-0.178]
37 This idea has been developed explicitly in a number of previous approaches, in grammar based (Chiang, 2005) and phrase-based systems (Galley and Manning, 2010). [sent-68, score-0.049]
38 The latter is most similar to this paper, and shows that discontinuous phrases complement standard contiguous phrases, improving expressiveness and translation performance. [sent-69, score-0.628]
39 Unlike their work, here we develop a complementary approach by constructing a generative model which can induce these rich rules directly from sentence-aligned corpora. [sent-70, score-0.349]
40 3 Model Given a source sentence, our model infers a latent derivation which produces a target translation and simultaneously gives a word alignment between the source and the target. [sent-71, score-0.834]
41 We consider a process in which the target string is generated using a left-to-right order, similar to the decoding strategy used by phrase-based machine translation systems (Koehn et al. [sent-72, score-0.56]
42 During this process we maintain a position in the source sentence, which can jump around to allow for different sentence ordering in the target vs. the source. [sent-74, score-0.660]
43 In contrast to phrase-based models, we use words as our basic translation unit, rather than multi-word phrases. [sent-76, score-0.384]
44 Furthermore, we decompose the decisions involved in generating each target word into a number of separate factors, where each factor is modelled separately and conditioned on a rich history of recent translation decisions. [sent-77, score-1.016]
45 1 Markov Translation Our model generates the target translation left-to-right, word by word. [sent-79, score-0.546]
46 This generative process resembles the sequence of translation decisions considered by a standard MT decoder (Koehn et al. [sent-82, score-0.76]
47 Instead, source words can be skipped or repeatedly translated. [sent-84, score-0.171]
3), while also permitting inference using Gibbs sampling (§4). [sent-88, score-0.051]
Each of the three distributions (finish, jump and [sent-93, score-0.364]
emission) is drawn from a hierarchical Pitman-Yor Process prior, as described in Section 3. [sent-95, score-0.061]
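The left-to-right generation loop described above can be sketched as a small skeleton. This is a hypothetical illustration, not the paper's implementation: the three conditional distributions (finish, jump, emission), which the paper draws from Pitman-Yor-smoothed priors, are passed in here as plain callables.

```python
import random

def generate_target(src, p_finish, sample_jump, sample_emit, seed=0):
    """Generate a target sentence left-to-right, word by word.

    Each step makes three decisions: finish (stop generating?), jump
    (move to a new source position), and emission (produce a target
    word given the chosen source position).
    """
    rng = random.Random(seed)
    target, alignment = [], []
    a_prev = 0  # start before the first source word
    while True:
        if rng.random() < p_finish(target):
            break  # the finish decision ends the derivation
        a_curr = sample_jump(a_prev, len(src), rng)   # jump decision
        target.append(sample_emit(src, a_curr, rng))  # emission decision
        alignment.append(a_curr)
        a_prev = a_curr
    return target, alignment
```

Note that nothing in the loop forces each source position to be visited exactly once: positions may be skipped or revisited, which is exactly the freedom the text describes.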
51 The jump decision τi in Equation 1 demands further explanation. [sent-97, score-0.454]
Instead of modelling jump distances explicitly, which poses problems both for generalizing between sentences of different lengths and for parameter explosion, we consider a small handful of jump types based on the distance between the current source word ai and the previous source word ai−1, i.e., di = ai − ai−1. [sent-98, score-1.279]
We bin jumps into five types: a) insert; b) backward, if di < 0; c) stay, if di = 0; d) monotone, if di = 1; e) forward, if di > 1. [sent-101, score-0.584]
The special jump type insert handles null alignments, denoted ai = 0, which license spurious insertions in the target string. [sent-102, score-0.715]
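The five-way jump classification can be written out directly. This is a hypothetical helper (name and 1-based position convention are assumptions), with di = ai − ai−1 as in the text and ai = 0 denoting a null alignment:

```python
def jump_type(a_prev: int, a_curr: int) -> str:
    """Classify a jump between consecutive alignment positions.

    a_curr == 0 denotes a null alignment, which maps to the special
    'insert' type; otherwise the bin depends only on the signed
    distance d = a_curr - a_prev, never on its magnitude.
    """
    if a_curr == 0:
        return "insert"
    d = a_curr - a_prev
    if d < 0:
        return "backward"
    if d == 0:
        return "stay"
    if d == 1:
        return "monotone"
    return "forward"  # d > 1
```

Collapsing distances into these five bins is what lets the model generalize across sentence lengths: a backward jump of 2 and of 20 share parameters.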
wordName wordTfidf (topN-words)
[('translation', 0.384), ('jump', 0.364), ('decisions', 0.246), ('faii', 0.175), ('jumps', 0.152), ('modelling', 0.138), ('finish', 0.133), ('markov', 0.126), ('iy', 0.121), ('correlations', 0.121), ('source', 0.12), ('develop', 0.117), ('target', 0.115), ('ai', 0.111), ('di', 0.108), ('gapping', 0.108), ('yp', 0.102), ('decision', 0.09), ('fertility', 0.089), ('factors', 0.088), ('expressiveness', 0.083), ('discontinuous', 0.083), ('emission', 0.081), ('sheffield', 0.081), ('conditioned', 0.08), ('phrases', 0.078), ('conditioning', 0.075), ('bayesian', 0.074), ('decompose', 0.071), ('insert', 0.071), ('rich', 0.07), ('generative', 0.069), ('backoff', 0.069), ('inferring', 0.065), ('yi', 0.064), ('alignments', 0.063), ('generalizing', 0.062), ('process', 0.061), ('hierarchical', 0.061), ('cohn', 0.061), ('independence', 0.06), ('amounting', 0.058), ('burgeoning', 0.058), ('discontinous', 0.058), ('iil', 0.058), ('kre', 0.058), ('parameterisation', 0.058), ('tta', 0.058), ('unifying', 0.058), ('witnessed', 0.058), ('mechanism', 0.058), ('units', 0.057), ('koehn', 0.056), ('galley', 0.054), ('crego', 0.054), ('durrani', 0.054), ('endi', 0.054), ('fertilities', 0.054), ('ffie', 0.054), ('finishing', 0.054), ('generalisation', 0.054), ('iid', 0.054), ('ish', 0.054), ('licence', 0.054), ('pitmanyor', 0.054), ('precludes', 0.054), ('prior', 0.054), ('conflicts', 0.051), ('emit', 0.051), ('emitting', 0.051), ('fai', 0.051), ('finished', 0.051), ('ons', 0.051), ('permitting', 0.051), ('skipped', 0.051), ('history', 0.05), ('explicitly', 0.049), ('primarily', 0.049), ('teh', 0.049), ('namely', 0.049), ('memorize', 0.048), ('fal', 0.048), ('inserts', 0.048), ('jumping', 0.048), ('respects', 0.048), ('vaswani', 0.048), ('derivation', 0.048), ('model', 0.047), ('induce', 0.046), ('cont', 0.046), ('agenda', 0.046), ('akin', 0.046), ('hile', 0.046), ('iw', 0.046), ('je', 0.046), ('stating', 0.046), ('employs', 0.045), ('phrase', 0.045), ('brown', 0.044), ('explosion', 0.043)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
Author: Yang Feng ; Trevor Cohn
Abstract: Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modelling of phrase-internal translation and reordering. However, phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source. Our experiments on Chinese to English and Arabic to English translation show consistent improvements over competitive baselines, of up to +3.4 BLEU.
2 0.19676912 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
Author: Jiajun Zhang ; Chengqing Zong
Abstract: Currently, almost all of the statistical machine translation (SMT) models are trained with the parallel corpora in some specific domains. However, when it comes to a language pair or a different domain without any bilingual resources, the traditional SMT loses its power. Recently, some research works study the unsupervised SMT for inducing a simple word-based translation model from the monolingual corpora. It successfully bypasses the constraint of bitext for SMT and obtains a relatively promising result. In this paper, we take a step forward and propose a simple but effective method to induce a phrase-based model from the monolingual corpora given an automatically-induced translation lexicon or a manually-edited translation dictionary. We apply our method for the domain adaptation task and the extensive experiments show that our proposed method can substantially improve the translation quality. 1
3 0.1947455 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
Author: Trevor Cohn ; Gholamreza Haffari
Abstract: Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. This is problematic, as the systems are incapable of accurately modelling many translation phenomena that do not decompose into word-for-word translation. This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior. Overall this leads to a model which learns translations of entire sentences, while also learning their decomposition into smaller units (phrase-pairs) recursively, terminating at word translations. Our experiments on Arabic, Urdu and Farsi to English demonstrate improvements over competitive baseline systems.
4 0.18560202 307 acl-2013-Scalable Decipherment for Machine Translation via Hash Sampling
Author: Sujith Ravi
Abstract: In this paper, we propose a new Bayesian inference method to train statistical machine translation systems using only nonparallel corpora. Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models. In order to perform fast, efficient Bayesian inference in this framework, we then derive a hash sampling strategy that is inspired by the work of Ahmed et al. (2012). The new translation hash sampler enables us to scale elegantly to complex models (for the first time) and large vocabulary/corpora sizes. We show empirical results on the OPUS data—our method yields the best BLEU scores compared to existing approaches, while achieving significant computational speedups (several orders faster). We also report for the first time—BLEU score results for a large-scale MT task using only non-parallel data (EMEA corpus).
5 0.16348886 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
Author: Graham Neubig
Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on English-Japanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http://phontron.com/travatar
6 0.16311084 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
7 0.15243146 143 acl-2013-Exact Maximum Inference for the Fertility Hidden Markov Model
8 0.1470862 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation
9 0.14160648 181 acl-2013-Hierarchical Phrase Table Combination for Machine Translation
10 0.13827397 255 acl-2013-Name-aware Machine Translation
11 0.13223858 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages
12 0.1273853 388 acl-2013-Word Alignment Modeling with Context Dependent Deep Neural Network
13 0.12713888 374 acl-2013-Using Context Vectors in Improving a Machine Translation System with Bridge Language
14 0.12590292 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
15 0.12253701 314 acl-2013-Semantic Roles for String to Tree Machine Translation
16 0.12223263 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
17 0.12090355 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
18 0.11892907 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
19 0.11608132 355 acl-2013-TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain
20 0.11542309 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
topicId topicWeight
[(0, 0.254), (1, -0.191), (2, 0.223), (3, 0.126), (4, -0.025), (5, 0.033), (6, 0.006), (7, 0.01), (8, -0.01), (9, 0.048), (10, 0.012), (11, -0.029), (12, 0.014), (13, 0.008), (14, 0.003), (15, -0.047), (16, 0.021), (17, -0.011), (18, 0.01), (19, -0.045), (20, -0.076), (21, -0.021), (22, -0.017), (23, -0.02), (24, -0.074), (25, -0.051), (26, -0.03), (27, 0.015), (28, 0.052), (29, 0.054), (30, 0.009), (31, -0.042), (32, 0.11), (33, 0.046), (34, 0.002), (35, 0.055), (36, -0.017), (37, -0.026), (38, -0.039), (39, 0.01), (40, -0.056), (41, 0.048), (42, -0.043), (43, -0.043), (44, -0.03), (45, 0.011), (46, 0.01), (47, 0.076), (48, -0.007), (49, 0.043)]
simIndex simValue paperId paperTitle
same-paper 1 0.97087824 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
Author: Yang Feng ; Trevor Cohn
Abstract: Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modelling of phrase-internal translation and reordering. However, phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source. Our experiments on Chinese to English and Arabic to English translation show consistent improvements over competitive baselines, of up to +3.4 BLEU.
2 0.82605255 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
Author: Trevor Cohn ; Gholamreza Haffari
Abstract: Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. This is problematic, as the systems are incapable of accurately modelling many translation phenomena that do not decompose into word-for-word translation. This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior. Overall this leads to a model which learns translations of entire sentences, while also learning their decomposition into smaller units (phrase-pairs) recursively, terminating at word translations. Our experiments on Arabic, Urdu and Farsi to English demonstrate improvements over competitive baseline systems.
3 0.81645709 307 acl-2013-Scalable Decipherment for Machine Translation via Hash Sampling
Author: Sujith Ravi
Abstract: In this paper, we propose a new Bayesian inference method to train statistical machine translation systems using only nonparallel corpora. Following a probabilistic decipherment approach, we first introduce a new framework for decipherment training that is flexible enough to incorporate any number/type of features (besides simple bag-of-words) as side-information used for estimating translation models. In order to perform fast, efficient Bayesian inference in this framework, we then derive a hash sampling strategy that is inspired by the work of Ahmed et al. (2012). The new translation hash sampler enables us to scale elegantly to complex models (for the first time) and large vocabulary/corpora sizes. We show empirical results on the OPUS data—our method yields the best BLEU scores compared to existing approaches, while achieving significant computational speedups (several orders faster). We also report for the first time—BLEU score results for a large-scale MT task using only non-parallel data (EMEA corpus).
4 0.81332898 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
Author: Graham Neubig
Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on English-Japanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http://phontron.com/travatar
5 0.78249651 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
Author: Nadir Durrani ; Alexander Fraser ; Helmut Schmid ; Hieu Hoang ; Philipp Koehn
Abstract: The phrase-based and N-gram-based SMT frameworks complement each other. While the former is better able to memorize, the latter provides a more principled model that captures dependencies across phrasal boundaries. Some work has been done to combine insights from these two frameworks. A recent successful attempt showed the advantage of using phrasebased search on top of an N-gram-based model. We probe this question in the reverse direction by investigating whether integrating N-gram-based translation and reordering models into a phrase-based decoder helps overcome the problematic phrasal independence assumption. A large scale evaluation over 8 language pairs shows that performance does significantly improve.
6 0.77256984 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT
7 0.77223223 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation
8 0.76435149 305 acl-2013-SORT: An Interactive Source-Rewriting Tool for Improved Translation
9 0.7632162 64 acl-2013-Automatically Predicting Sentence Translation Difficulty
10 0.76183498 255 acl-2013-Name-aware Machine Translation
11 0.75967914 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages
12 0.75651866 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
13 0.75512534 355 acl-2013-TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain
14 0.75288045 180 acl-2013-Handling Ambiguities of Bilingual Predicate-Argument Structures for Statistical Machine Translation
15 0.74601722 201 acl-2013-Integrating Translation Memory into Phrase-Based Machine Translation during Decoding
16 0.74544626 312 acl-2013-Semantic Parsing as Machine Translation
17 0.74342602 16 acl-2013-A Novel Translation Framework Based on Rhetorical Structure Theory
18 0.72526526 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk
19 0.71575367 92 acl-2013-Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages
20 0.71340519 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
topicId topicWeight
[(0, 0.031), (6, 0.034), (11, 0.038), (24, 0.018), (26, 0.027), (35, 0.599), (42, 0.056), (48, 0.025), (70, 0.038), (88, 0.011), (90, 0.019), (95, 0.045)]
simIndex simValue paperId paperTitle
1 0.97405714 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
Author: Romain Deveaud ; Eric SanJuan ; Patrice Bellot
Abstract: The current topic modeling approaches for Information Retrieval do not allow to explicitly model query-oriented latent topics. More, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. We perform a first experimental evaluation using two major TREC test collections. Results show that retrieval performances tend to be better when using topics with higher semantic coherence.
2 0.97200191 278 acl-2013-Patient Experience in Online Support Forums: Modeling Interpersonal Interactions and Medication Use
Author: Annie Chen
Abstract: Though there has been substantial research concerning the extraction of information from clinical notes, to date there has been less work concerning the extraction of useful information from patient-generated content. Using a dataset comprised of online support group discussion content, this paper investigates two dimensions that may be important in the extraction of patient-generated experiences from text; significant individuals/groups and medication use. With regard to the former, the paper describes an approach involving the pairing of important figures (e.g. family, husbands, doctors, etc.) and affect, and suggests possible applications of such techniques to research concerning online social support, as well as integration into search interfaces for patients. Additionally, the paper demonstrates the extraction of side effects and sentiment at different phases in patient medication use, e.g. adoption, current use, discontinuation and switching, and demonstrates the utility of such an application for drug safety monitoring in online discussion forums. 1
3 0.96697307 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities
Author: Ndapandula Nakashole ; Tomasz Tylenda ; Gerhard Weikum
Abstract: Methods for information extraction (IE) and knowledge base (KB) construction have been intensively studied. However, a largely under-explored case is tapping into highly dynamic sources like news streams and social media, where new entities are continuously emerging. In this paper, we present a method for discovering and semantically typing newly emerging out-of-KB entities, thus improving the freshness and recall of ontology-based IE and improving the precision and semantic rigor of open IE. Our method is based on a probabilistic model that feeds weights into integer linear programs that leverage type signatures of relational phrases and type correlation or disjointness constraints. Our experimental evaluation, based on crowdsourced user studies, show our method performing significantly better than prior work.
4 0.9632538 76 acl-2013-Building and Evaluating a Distributional Memory for Croatian
Author: Jan Snajder ; Sebastian Pado ; Zeljko Agic
Abstract: We report on the first structured distributional semantic model for Croatian, DM.HR. It is constructed after the model of the English Distributional Memory (Baroni and Lenci, 2010), from a dependency-parsed Croatian web corpus, and covers about 2M lemmas. We give details on the linguistic processing and the design principles. An evaluation shows state-of-the-art performance on a semantic similarity task with particularly good performance on nouns. The resource is freely available.
same-paper 5 0.95972645 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference
Author: Yang Feng ; Trevor Cohn
Abstract: Most modern machine translation systems use phrase pairs as translation units, allowing for accurate modelling of phrase-internal translation and reordering. However, phrase-based approaches are much less able to model sentence level effects between different phrase-pairs. We propose a new model to address this imbalance, based on a word-based Markov model of translation which generates target translations left-to-right. Our model encodes word and phrase level phenomena by conditioning translation decisions on previous decisions and uses a hierarchical Pitman-Yor Process prior to provide dynamic adaptive smoothing. This mechanism implicitly supports not only traditional phrase pairs, but also gapping phrases which are non-consecutive in the source. Our experiments on Chinese to English and Arabic to English translation show consistent improvements over competitive baselines, of up to +3.4 BLEU.
6 0.95614547 311 acl-2013-Semantic Neighborhoods as Hypergraphs
7 0.95115304 32 acl-2013-A relatedness benchmark to test the role of determiners in compositional distributional semantics
8 0.94381726 122 acl-2013-Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners
9 0.79399359 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
10 0.777022 238 acl-2013-Measuring semantic content in distributional vectors
11 0.76862538 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics
12 0.76086503 58 acl-2013-Automated Collocation Suggestion for Japanese Second Language Learners
13 0.74695456 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
14 0.73182219 121 acl-2013-Discovering User Interactions in Ideological Discussions
15 0.72654301 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
16 0.71476811 219 acl-2013-Learning Entity Representation for Entity Disambiguation
17 0.71310204 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction
18 0.71275711 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
19 0.70832074 371 acl-2013-Unsupervised joke generation from big data
20 0.70800257 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models