acl acl2013 acl2013-312 knowledge-graph by maker-knowledge-mining

312 acl-2013-Semantic Parsing as Machine Translation


Source: pdf

Author: Jacob Andreas ; Andreas Vlachos ; Stephen Clark

Abstract: Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Semantic Parsing as Machine Translation Jacob Andreas, Andreas Vlachos, Stephen Clark; Computer Laboratory, University of Cambridge. [sent-1, score-0.074]

2 Abstract Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. [sent-7, score-0.126]

3 Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. [sent-8, score-0.308]

4 In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. [sent-9, score-0.164]

5 These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation. [sent-10, score-0.359]

6 1 Introduction Semantic parsing (SP) is the problem of transforming a natural language (NL) utterance into a machine-interpretable meaning representation (MR). [sent-11, score-0.19]

7 At least superficially, SP is simply a machine translation (MT) task: we transform an NL utterance in one language into a statement of another (un-natural) meaning representation language (MRL). [sent-18, score-0.259]

8 Indeed, successful semantic parsers often resemble MT systems in several important respects, including the use of word alignment models as a starting point for rule extraction (Wong and Mooney, 2006; Kwiatkowski et al. [sent-19, score-0.294]

9 , 2010) and the use of automata such as tree transducers (Jones et al. [sent-20, score-0.179]

10 Contrast this with ordinary MT, where varying degrees of wrongness are tolerated by human readers (and evaluation metrics). [sent-24, score-0.081]

11 To avoid producing malformed MRs, almost all of the existing research on SP has focused on developing models with richer structure than those commonly used for MT. [sent-25, score-0.065]

12 In this work we attempt to determine how accurate a semantic parser we can build by treating SP as a pure MT task, and describe pre- and postprocessing steps which allow structure to be preserved in the MT process. [sent-26, score-0.143]

13 Our contributions are as follows: We develop a semantic parser using off-the-shelf MT components, exploring phrase-based as well as hierarchical models. [sent-27, score-0.228]

14 Experiments with four languages on the popular GeoQuery corpus (Zelle, 1995) show that our parser is competitive with the state-of-the-art, in some cases achieving higher accuracy than recently introduced purpose-built semantic parsers. [sent-28, score-0.143]

15 Our approach also appears to require substantially less time to train than the two best-performing semantic parsers. [sent-29, score-0.102]

16 2 MT-based semantic parsing The input is a corpus of NL utterances paired with MRs. [sent-31, score-0.12]

17 In order to learn a semantic parser using MT we linearize the MRs, learn alignments between the MRL and the NL, extract translation rules, and learn a language model for the MRL. [sent-32, score-0.521]

18 We also specify a decoding procedure that will return structured MRs for an utterance during prediction. [sent-33, score-0.176]

19 (Figure 1 fragment: the EXTRACT (HIER) step yields hierarchical rules such as ⟨state, state1⟩ and ⟨state [X] texa, state1 [X] state1 stateid1 texas0⟩.) [sent-39, score-0.24]

20 Figure 1: Illustration of preprocessing and rule extraction. [sent-42, score-0.097]

21 Linearization We assume that the MRL is variable-free (that is, the meaning representation for each utterance is tree-shaped), noting that formalisms with variables, like the λ-calculus, can be mapped onto variable-free logical forms with combinatory logics (Curry et al. [sent-43, score-0.216]

22 In order to learn a semantic parser using MT we begin by converting these MRs to a form more similar to NL. [sent-45, score-0.192]

23 To do so, we simply take a preorder traversal of every functional form, and label every function with the number of arguments it takes. [sent-46, score-0.078]

24 After translation, recovery of the function is easy: if the arity of every function in the MRL is known, then every traversal uniquely specifies its corresponding tree. [sent-47, score-0.118]

25 Most importantly, it eliminates any possible ambiguity from the tree reconstruction which takes place during decoding: given any sequence of decorated MRL tokens, we can always reconstruct the corresponding tree structure (if one exists). [sent-50, score-0.214]
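The linearization and its inverse are easy to make concrete. The following Python is a minimal sketch of the idea only, with function names of our own choosing rather than anything from the authors' implementation; it assumes function symbols themselves contain no trailing digits.

```python
import re

def linearize(tree):
    """Preorder traversal of a variable-free MR, decorating each
    symbol with its arity: answer(state(stateid(texas))) becomes
    ['answer1', 'state1', 'stateid1', 'texas0']."""
    label, children = tree
    tokens = [f"{label}{len(children)}"]
    for child in children:
        tokens.extend(linearize(child))
    return tokens

def reconstruct(tokens):
    """Fold a decorated token sequence back into a tree. Returns None
    on malformed input (premature end or leftover tokens), so it also
    serves as the well-formedness check needed during decoding."""
    def parse(i):
        if i >= len(tokens):
            return None, i
        m = re.fullmatch(r"(.*?)(\d+)", tokens[i])
        if m is None:
            return None, i
        label, arity = m.group(1), int(m.group(2))
        children, j = [], i + 1
        for _ in range(arity):
            child, j = parse(j)
            if child is None:
                return None, j
            children.append(child)
        return (label, children), j
    tree, end = parse(0)
    return tree if tree is not None and end == len(tokens) else None
```

Because every token carries its arity, parse consumes exactly the right number of subtrees at each step, which is why a traversal uniquely specifies its tree.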

26 Alignment Following the linearization of the MRs, we find alignments between the MR tokens and the NL tokens using the IBM Model 4 (Brown et al. [sent-54, score-0.112]

27 Once the alignment algorithm is run in both directions (NL to MRL, MRL to NL), we symmetrize the resulting alignments to obtain a consensus many-to-many alignment (Och and Ney, 2000; Koehn et al. [sent-56, score-0.202]
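The two simplest symmetrization operators can be sketched as follows; this is an illustration under our own naming, not the GIZA++/Moses implementation, and the heuristics actually selected by cross-validation in this paper (such as grow-diag-final) interpolate between these two extremes.

```python
def symmetrize(nl_to_mrl, mrl_to_nl):
    """Combine two directional word alignments, each a set of links;
    nl_to_mrl holds (nl_index, mrl_index) pairs and mrl_to_nl holds
    (mrl_index, nl_index) pairs, flipped here into one orientation.
    The intersection gives sparse, high-precision one-to-one links;
    the union gives dense many-to-many links. Growing heuristics
    start from the intersection and add adjacent union links."""
    flipped = {(nl, mrl) for (mrl, nl) in mrl_to_nl}
    return nl_to_mrl & flipped, nl_to_mrl | flipped
```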

28 Rule extraction From the many-to-many alignment we need to extract a translation rule table, consisting of corresponding phrases in NL and MRL. [sent-58, score-0.272]

29 We consider a phrase-based translation model (Koehn et al. [sent-59, score-0.119]

30 Rules for the phrase-based model consist of pairs of aligned source and target sequences, while hierarchical rules are SCFG productions containing at most two instances of a single nonterminal symbol. [sent-61, score-0.145]

31 Note that both extraction algorithms can learn rules which a traditional tree-transducer-based approach cannot, for example the right-hand side [X] river1 all0 traverse1 [X], corresponding to a pair of disconnected tree fragments. [sent-62, score-0.171]

32 Language modeling In addition to translation rules learned from a parallel corpus, MT systems also rely on an n-gram language model for the target language, estimated from a (typically larger) monolingual corpus. [sent-68, score-0.179]

33 In the case of SP, such a monolingual corpus is rarely available, and we instead use the MRs available in the training data to learn a language model of the MRL. [sent-69, score-0.049]
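As a toy illustration of this step (not the authors' setup, which would use a standard smoothed n-gram toolkit), a bigram model over decorated MRL tokens can be estimated from the linearized training MRs alone:

```python
from collections import Counter

def train_bigram_lm(linearized_mrs):
    """Unsmoothed bigram probabilities over decorated MRL tokens,
    estimated from the training MRs (the only 'monolingual' MRL data
    available), e.g. p(state1 | answer1)."""
    context_counts, bigram_counts = Counter(), Counter()
    for tokens in linearized_mrs:
        padded = ["<s>"] + tokens + ["</s>"]
        context_counts.update(padded[:-1])
        bigram_counts.update(zip(padded[:-1], padded[1:]))
    return {bg: n / context_counts[bg[0]]
            for bg, n in bigram_counts.items()}
```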

34 Prediction Given a new NL utterance, we need to find the n best translations (i.e. [sent-71, score-0.04]

35 sequences of decorated MRL tokens) that maximize the weighted sum of the translation score (the probabilities of the translations according to the rule translation table) and the language model score, a process usually referred to as decoding. [sent-73, score-0.465]
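In its simplest two-feature form this objective can be written as follows (our notation; a full Moses model adds further features such as word and reordering penalties):

```latex
\hat{m} \;=\; \arg\max_{m}\;
  \lambda_{\mathrm{TM}} \log p_{\mathrm{TM}}(m \mid u)
  \;+\; \lambda_{\mathrm{LM}} \log p_{\mathrm{LM}}(m)
```

where u is the NL utterance, m ranges over sequences of decorated MRL tokens, and the weights λ are those tuned by cross-validation on the development set.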

36 Standard decoding procedures for MT produce an n-best list of all possible translations, but here we need to restrict ourselves to translations corresponding to well-formed MRs. [sent-74, score-0.082]

37 In principle this could be done by re-writing the beam search algorithm used in decoding to immediately discard malformed MRs; for the experiments in this paper we simply filter the regular n-best list until we find a well-formed MR. [sent-75, score-0.107]
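Given a reconstruction routine like the sketch above, the filter itself is only a few lines (again an illustrative sketch, not the authors' code):

```python
def first_well_formed(nbest_tokens):
    """Walk the decoder's n-best list (best-scoring first) and return
    the first candidate whose decorated tokens fold into a complete
    MR tree; None if no candidate in the list is well-formed."""
    for tokens in nbest_tokens:
        tree = reconstruct(tokens)  # from the linearization sketch above
        if tree is not None:
            return tree
    return None
```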

38 Finally, we insert the brackets according to the tree structure specified by the argument number labels. [sent-77, score-0.062]

39 All semantic parsers for GeoQuery we compare against also make use of NP lists (Jones et al. [sent-83, score-0.141]

40 In our experiments, the NP list was included by appending all entries as extra training sentences to the end of the training corpus of each language with 50 times the weight of regular training examples, to ensure that they are learned as translation rules. [sent-85, score-0.119]
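The simplest way to realize such a weighting is plain replication of the parallel entries. Whether the authors replicated lines or used a corpus-level weighting mechanism is not stated in this extract, so the sketch below is an assumption:

```python
def append_np_list(nl_corpus, mrl_corpus, np_entries, weight=50):
    """Append each (NL phrase, linearized MRL phrase) pair from the
    NP list to the parallel training corpus `weight` times, so that
    alignment and rule extraction treat it as 50 regular examples."""
    for nl_phrase, mrl_phrase in np_entries:
        nl_corpus.extend([nl_phrase] * weight)
        mrl_corpus.extend([mrl_phrase] * weight)
    return nl_corpus, mrl_corpus
```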

41 Evaluation for each utterance is performed by executing both the predicted and the gold standard MRs against the database and obtaining their respective answers. [sent-86, score-0.098]

42 Implementation In all experiments, we use the IBM Model 4 implementation from the GIZA++ toolkit (Och and Ney, 2000) for alignment, and the phrase-based and hierarchical models implemented in the Moses toolkit (Koehn et al. [sent-94, score-0.085]

43 The best symmetrization algorithm, translation and language model weights for each language are selected using cross-validation on the development set. [sent-96, score-0.119]

44 4 Results We first compare the results for the two translation rule extraction models, phrase-based and hierarchical (“MT-phrase” and “MT-hier” respectively in Table 1). [sent-99, score-0.301]

45 We find that the hierarchical model performs better in all languages apart from Greek, indicating that the long-range reorderings learned by a hierarchical translation system are useful for this task. [sent-100, score-0.289]

46 As expected, the performances are almost uniformly lower, but the parser still produces correct output for the majority of examples. [sent-103, score-0.073]

47 It is not evident, a priori, that this search procedure is guaranteed to find any well-formed outputs in reasonable time; to test the effect of this extra requirement we report results at increasing n-best search depths. (Table 1 fragment: per-language columns en, de, el, th; first row MT-phrase.) [sent-105, score-0.036]

48 In practice, increasing search depth in the n-best list from 1 to 50 results in a gain of no more than a percentage point or two, and we conclude that our filtering method is appropriate for the task. [sent-126, score-0.034]

49 We also compare the MT-based semantic parsers to several recently published ones: WASP (Wong and Mooney, 2006), which like the hierarchical model described here learns an SCFG to translate between NL and MRL; tsVB (Jones et al. [sent-127, score-0.263]

50 , 2012), which uses variational Bayesian inference to learn weights for a tree transducer; UBL (Kwiatkowski et al. [sent-128, score-0.152]

51 , 2010), which learns a CCG lexicon with semantic annotations; and hybridtree (Lu et al. [sent-129, score-0.18]

52 , 2008), which learns a synchronous generative model over variable-free MRs and NL strings. [sent-130, score-0.037]

53 In the results shown in Table 1 we observe that on English GeoQuery data, the hierarchical translation model achieves scores competitive with the state of the art, and in every language one of the MT systems achieves accuracy at least as good as a purpose-built semantic parser. [sent-131, score-0.365]

54 So in addition to competitive performance, the MT-based parser also appears to be considerably more efficient at training time than other parsers in the literature. [sent-135, score-0.179]

55 Like the present work, it uses GIZA++ alignments as a starting point for the rule extraction procedure, and algorithms reminiscent of those used in syntactic MT to extract rules. [sent-137, score-0.155]

56 tsVB also uses a piece of standard MT machinery, specifically tree transducers, which have been profitably employed for syntax-based machine translation (Maletti, 2010). [sent-138, score-0.213]

57 In that work, however, the usual MT parameter-estimation technique of simply counting the number of rule occurrences does not improve scores, and the authors instead resort to a variational inference procedure to acquire rule weights. [sent-139, score-0.271]

58 The present work is also the first we are aware of which uses phrase-based rather than tree-based machine translation techniques to learn a semantic parser. [sent-140, score-0.238]

59 It employs resolution procedures specific to the λ-calculus such as splitting and unification in order to generate rule templates. [sent-145, score-0.175]

60 Like other systems described, it uses GIZA alignments for initialization. [sent-146, score-0.058]

61 Other work which generalizes from variable-free meaning representations to λ-calculus expressions includes the natural language generation procedure described by Lu and Ng (2011). [sent-147, score-0.078]

62 UBL, like an MT system (and unlike most of the other systems discussed in this section), extracts rules at multiple levels of granularity by means of this splitting and unification procedure. [sent-148, score-0.138]

63 hybridtree similarly benefits from the introduction of multi-level rules composed from smaller rules, a process similar to the one used for creating phrase tables in a phrase-based MT system. [sent-149, score-0.133]

64 6 Discussion Our results validate the hypothesis that it is possible to adapt an ordinary MT system into a working semantic parser. [sent-150, score-0.119]

65 For this reason, we argue for the use of a machine translation baseline as a point of comparison for new methods. [sent-152, score-0.119]

66 The results also demonstrate the usefulness of two techniques which are crucial for successful MT, but which are not widely used in semantic parsing. [sent-153, score-0.07]

67 The second is the use of large, composed rules (rather than rules which trigger on only one lexical item, or on tree portions of limited depth (Lu et al. [sent-155, score-0.216]

68 7 Conclusions We have presented a semantic parser which uses techniques from machine translation to learn mappings from natural language to variable-free meaning representations. [sent-157, score-0.353]

69 The parser performs comparably to several recent purpose-built semantic parsers on the GeoQuery dataset, while training considerably faster than state-of-the-art systems. [sent-158, score-0.214]

70 Our experiments demonstrate the usefulness of several techniques which might be broadly applied to other semantic parsers, and provide an informative basis for future work. [sent-159, score-0.102]

71 Inducing probabilistic CCG grammars from logical form with higher-order unification. [sent-216, score-0.038]

72 A probabilistic forest-to-string model for language generation from typed lambda calculus expressions. [sent-224, score-0.102]

73 A generative model for parsing natural language to meaning representations. [sent-229, score-0.092]

74 Towards a theory of natural language interfaces to databases. [sent-241, score-0.036]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mrl', 0.453), ('mrs', 0.359), ('geoquery', 0.253), ('nl', 0.247), ('mt', 0.18), ('sp', 0.162), ('ubl', 0.13), ('translation', 0.119), ('curry', 0.11), ('texa', 0.11), ('tsvb', 0.11), ('jones', 0.1), ('utterance', 0.098), ('rule', 0.097), ('kwiatkowski', 0.095), ('decorated', 0.09), ('mr', 0.085), ('hierarchical', 0.085), ('thai', 0.084), ('lu', 0.079), ('cam', 0.074), ('parser', 0.073), ('cityid', 0.073), ('hybridtree', 0.073), ('wasp', 0.073), ('transducers', 0.072), ('arity', 0.072), ('parsers', 0.071), ('semantic', 0.07), ('andreas', 0.067), ('greek', 0.066), ('hstate', 0.065), ('malformed', 0.065), ('texas', 0.063), ('tree', 0.062), ('koehn', 0.062), ('rules', 0.06), ('alignments', 0.058), ('state', 0.056), ('zelle', 0.056), ('alignment', 0.056), ('calculus', 0.054), ('linearization', 0.054), ('linearize', 0.054), ('vlachos', 0.054), ('wong', 0.053), ('border', 0.051), ('parsing', 0.05), ('ordinary', 0.049), ('learn', 0.049), ('lambda', 0.048), ('goldwasser', 0.048), ('unification', 0.046), ('traversal', 0.046), ('automata', 0.045), ('och', 0.045), ('np', 0.044), ('scfg', 0.044), ('combinatory', 0.044), ('alexandra', 0.042), ('decoding', 0.042), ('minutes', 0.042), ('meaning', 0.042), ('giza', 0.041), ('popescu', 0.041), ('variational', 0.041), ('translations', 0.04), ('luke', 0.039), ('ccg', 0.038), ('learns', 0.037), ('bird', 0.036), ('interfaces', 0.036), ('procedure', 0.036), ('competitive', 0.035), ('laboratory', 0.035), ('sharon', 0.035), ('della', 0.035), ('edward', 0.035), ('depth', 0.034), ('mooney', 0.034), ('uk', 0.034), ('cambridge', 0.033), ('reilly', 0.032), ('logics', 0.032), ('jena', 0.032), ('tolerated', 0.032), ('santa', 0.032), ('symmetrize', 0.032), ('preorder', 0.032), ('spacebook', 0.032), ('prolog', 0.032), ('bestperforming', 0.032), ('churchill', 0.032), ('maletti', 0.032), ('mative', 0.032), ('profitably', 0.032), ('pietra', 0.032), ('hwee', 0.032), ('splitting', 0.032), ('tou', 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 312 acl-2013-Semantic Parsing as Machine Translation

Author: Jacob Andreas ; Andreas Vlachos ; Stephen Clark

Abstract: Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation.

2 0.15376621 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers

Author: Graham Neubig

Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on EnglishJapanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http : / /phont ron . com/t ravat ar

3 0.12862542 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning

Author: Joohyun Kim ; Raymond Mooney

Abstract: We adapt discriminative reranking to improve the performance of grounded language acquisition, specifically the task of learning to follow navigation instructions from observation. Unlike conventional reranking used in syntactic and semantic parsing, gold-standard reference trees are not naturally available in a grounded setting. Instead, we show how the weak supervision of response feedback (e.g. successful task completion) can be used as an alternative, experimentally demonstrating that its performance is comparable to training on gold-standard parse trees.

4 0.12815107 314 acl-2013-Semantic Roles for String to Tree Machine Translation

Author: Marzieh Bazrafshan ; Daniel Gildea

Abstract: We experiment with adding semantic role information to a string-to-tree machine translation system based on the rule extraction procedure of Galley et al. (2004). We compare methods based on augmenting the set of nonterminals by adding semantic role labels, and altering the rule extraction process to produce a separate set of rules for each predicate that encompass its entire predicate-argument structure. Our results demonstrate that the second approach is effective in increasing the quality of translations.

5 0.11526992 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti

Abstract: We present a new translation model integrating the shallow local multi bottomup tree transducer. We perform a largescale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German tlriannes olnati tohne tWasMk.T TA 2s0 an a Edndgitliisonha →l c Gonetrrmibauntion we make the developed software and complete tool-chain publicly available for further experimentation.

6 0.11410104 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

7 0.10969835 255 acl-2013-Name-aware Machine Translation

8 0.10951722 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference

9 0.10398301 181 acl-2013-Hierarchical Phrase Table Combination for Machine Translation

10 0.10367972 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT

11 0.10296196 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension

12 0.10277712 228 acl-2013-Leveraging Domain-Independent Information in Semantic Parsing

13 0.10219361 195 acl-2013-Improving machine translation by training against an automatic semantic frame based evaluation metric

14 0.10001923 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

15 0.099044211 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

16 0.095499493 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation

17 0.09318357 13 acl-2013-A New Syntactic Metric for Evaluation of Machine Translation

18 0.090694986 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics

19 0.088945255 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages

20 0.086837232 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.239), (1, -0.132), (2, 0.055), (3, 0.033), (4, -0.106), (5, 0.042), (6, 0.039), (7, -0.03), (8, 0.035), (9, 0.046), (10, -0.024), (11, 0.044), (12, 0.049), (13, 0.014), (14, 0.025), (15, -0.028), (16, 0.016), (17, -0.009), (18, 0.021), (19, -0.009), (20, -0.042), (21, -0.029), (22, -0.051), (23, 0.037), (24, -0.005), (25, -0.01), (26, -0.011), (27, 0.032), (28, 0.026), (29, 0.07), (30, 0.023), (31, -0.015), (32, 0.032), (33, -0.007), (34, -0.027), (35, 0.015), (36, -0.083), (37, 0.04), (38, 0.006), (39, 0.099), (40, -0.1), (41, 0.119), (42, 0.012), (43, -0.017), (44, 0.105), (45, 0.113), (46, -0.043), (47, -0.015), (48, 0.016), (49, -0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93125838 312 acl-2013-Semantic Parsing as Machine Translation

Author: Jacob Andreas ; Andreas Vlachos ; Stephen Clark

Abstract: Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation.

2 0.80873966 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

Author: Fabienne Braune ; Nina Seemann ; Daniel Quernheim ; Andreas Maletti

Abstract: We present a new translation model integrating the shallow local multi bottomup tree transducer. We perform a largescale empirical evaluation of our obtained system, which demonstrates that we significantly beat a realistic tree-to-tree baseline on the WMT 2009 English → German tlriannes olnati tohne tWasMk.T TA 2s0 an a Edndgitliisonha →l c Gonetrrmibauntion we make the developed software and complete tool-chain publicly available for further experimentation.

3 0.78567982 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers

Author: Graham Neubig

Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on EnglishJapanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http : / /phont ron . com/t ravat ar

4 0.74857396 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT

Author: Wenduan Xu ; Yue Zhang ; Philip Williams ; Philipp Koehn

Abstract: We present a context-sensitive chart pruning method for CKY-style MT decoding. Source phrases that are unlikely to have aligned target constituents are identified using sequence labellers learned from the parallel corpus, and speed-up is obtained by pruning corresponding chart cells. The proposed method is easy to implement, orthogonal to cube pruning and additive to its pruning power. On a full-scale Englishto-German experiment with a string-totree model, we obtain a speed-up of more than 60% over a strong baseline, with no loss in BLEU.

5 0.7423262 314 acl-2013-Semantic Roles for String to Tree Machine Translation

Author: Marzieh Bazrafshan ; Daniel Gildea

Abstract: We experiment with adding semantic role information to a string-to-tree machine translation system based on the rule extraction procedure of Galley et al. (2004). We compare methods based on augmenting the set of nonterminals by adding semantic role labels, and altering the rule extraction process to produce a separate set of rules for each predicate that encompass its entire predicate-argument structure. Our results demonstrate that the second approach is effective in increasing the quality of translations.

6 0.69573551 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning

7 0.66169631 165 acl-2013-General binarization for parsing and translation

8 0.65429038 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

9 0.65371108 330 acl-2013-Stem Translation with Affix-Based Rule Selection for Agglutinative Languages

10 0.65295172 163 acl-2013-From Natural Language Specifications to Program Input Parsers

11 0.6400106 13 acl-2013-A New Syntactic Metric for Evaluation of Machine Translation

12 0.63811851 16 acl-2013-A Novel Translation Framework Based on Rhetorical Structure Theory

13 0.63639402 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

14 0.63520527 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference

15 0.62792927 255 acl-2013-Name-aware Machine Translation

16 0.62747073 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment

17 0.62503678 64 acl-2013-Automatically Predicting Sentence Translation Difficulty

18 0.61452687 228 acl-2013-Leveraging Domain-Independent Information in Semantic Parsing

19 0.6090039 313 acl-2013-Semantic Parsing with Combinatory Categorial Grammars

20 0.60826099 285 acl-2013-Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.058), (6, 0.057), (11, 0.049), (24, 0.043), (26, 0.069), (28, 0.014), (35, 0.08), (42, 0.077), (48, 0.05), (64, 0.016), (70, 0.034), (76, 0.208), (88, 0.035), (90, 0.069), (95, 0.077)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.81456393 57 acl-2013-Arguments and Modifiers from the Learner's Perspective

Author: Leon Bergen ; Edward Gibson ; Timothy J. O'Donnell

Abstract: We present a model for inducing sentential argument structure, which distinguishes arguments from optional modifiers. We use this model to study whether representing an argument/modifier distinction helps in learning argument structure, and whether a linguistically-natural argument/modifier distinction can be induced from distributional data alone. Our results provide evidence for both hypotheses.

2 0.812837 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction

Author: Jianfeng Si ; Arjun Mukherjee ; Bing Liu ; Qing Li ; Huayi Li ; Xiaotie Deng

Abstract: This paper proposes a technique to leverage topic based sentiments from Twitter to help predict the stock market. We first utilize a continuous Dirichlet Process Mixture model to learn the daily topic set. Then, for each topic we derive its sentiment according to its opinion words distribution to build a sentiment time series. We then regress the stock index and the Twitter sentiment time series to predict the market. Experiments on real-life S&P100; Index show that our approach is effective and performs better than existing state-of-the-art non-topic based methods. 1

same-paper 3 0.80896568 312 acl-2013-Semantic Parsing as Machine Translation

Author: Jacob Andreas ; Andreas Vlachos ; Stephen Clark

Abstract: Semantic parsing is the problem of deriving a structured meaning representation from a natural language utterance. Here we approach it as a straightforward machine translation task, and demonstrate that standard machine translation components can be adapted into a semantic parser. In experiments on the multilingual GeoQuery corpus we find that our parser is competitive with the state of the art, and in some cases achieves higher accuracy than recently proposed purpose-built systems. These results support the use of machine translation methods as an informative baseline in semantic parsing evaluations, and suggest that research in semantic parsing could benefit from advances in machine translation.

4 0.67372227 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT

Author: Wenduan Xu ; Yue Zhang ; Philip Williams ; Philipp Koehn

Abstract: We present a context-sensitive chart pruning method for CKY-style MT decoding. Source phrases that are unlikely to have aligned target constituents are identified using sequence labellers learned from the parallel corpus, and speed-up is obtained by pruning corresponding chart cells. The proposed method is easy to implement, orthogonal to cube pruning and additive to its pruning power. On a full-scale Englishto-German experiment with a string-totree model, we obtain a speed-up of more than 60% over a strong baseline, with no loss in BLEU.

5 0.66764295 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

Author: Lu Wang ; Hema Raghavan ; Vittorio Castelli ; Radu Florian ; Claire Cardie

Abstract: We consider the problem of using sentence compression techniques to facilitate queryfocused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements in ROUGE-2 respectively) for the DUC 2006 and 2007 summarization task. ,

6 0.66581124 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation

7 0.66323012 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

8 0.66231984 250 acl-2013-Models of Translation Competitions

9 0.66229618 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

10 0.66197002 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

11 0.66189688 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models

12 0.66162777 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

13 0.6613239 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation

14 0.66058838 264 acl-2013-Online Relative Margin Maximization for Statistical Machine Translation

15 0.65997529 276 acl-2013-Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation

16 0.65668046 172 acl-2013-Graph-based Local Coherence Modeling

17 0.65662557 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

18 0.65612531 328 acl-2013-Stacking for Statistical Machine Translation

19 0.65569955 38 acl-2013-Additive Neural Networks for Statistical Machine Translation

20 0.65482128 248 acl-2013-Modelling Annotator Bias with Multi-task Gaussian Processes: An Application to Machine Translation Quality Estimation