
19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation


Source: pdf

Author: Yang Liu

Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract. We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. [sent-3, score-0.218]

2 As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. [sent-4, score-0.622]

3 To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. [sent-5, score-0.356]

4 In addition, it is straightforward to integrate n-gram language models into phrase-based decoders in which translation always grows left-to-right. [sent-11, score-0.156]

5 However, as phrase-based decoding usually casts translation as a string concatenation problem and permits arbitrary permutation, it proves to be NP-complete (Knight, 1999). [sent-13, score-0.266]

6 Moreover, syntax-based approaches often suffer from the rule coverage problem since syntactic constraints rule out a large portion of non-syntactic phrase pairs, which might help decoders generalize well to unseen data (Marcu et al. [sent-23, score-0.335]

7 As a result, incremental decoding with hierarchical structures has attracted increasing attention in recent years. [sent-26, score-0.356]

8 While some authors try to integrate syntax into phrase-based decoding (Galley and Manning, 2008; Galley and Manning, 2009; Feng et al. [sent-27, score-0.181]

9 While parsing algorithms can be used to parse partial translations in phrase-based decoding, the search space is significantly enlarged since there are exponentially many parse trees for exponentially many translations. [sent-32, score-0.479]

10 On the other hand, although target words can be generated left-to-right by altering the way of tree traversal in syntax-based models, it is still difficult to reach full rule coverage as compared with the phrase table. [sent-33, score-0.394]

11 Each translation rule is composed of a source phrase and a target phrase with a set of dependency arcs. [sent-37, score-0.481]

12 In this paper, we propose a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. [sent-40, score-0.218]

13 The basic unit of translation in our model is the string-to-dependency phrase pair, which consists of a phrase on the source side and a dependency structure on the target side. [sent-41, score-0.539]
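To make the basic translation unit concrete, here is a minimal record sketch of what a string-to-dependency phrase pair might hold; the class name, field layout, and index-based arc encoding are assumptions made for illustration, not the paper's representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StringToDepPhrasePair:
    """Hypothetical record for the basic translation unit: a source-side
    phrase paired with a target-side dependency structure."""
    source_phrase: tuple      # source-language words
    target_words: tuple       # target-language words
    target_arcs: frozenset    # (head_index, dependent_index) pairs within the phrase
    structure_type: str       # 'h' fixed, 'l' left floating, 'r' right floating
```

For instance, the pair covering "The President will" from Figure 1 would carry a left floating structure ('l'), since the head of that fragment ("visit") lies outside the phrase.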

14 The algorithm generates well-formed dependency structures for partial translations left-to-right using string-to-dependency phrase pairs. [sent-42, score-0.540]

15 full rule coverage: all phrase pairs, both syntactic and non-syntactic, can be used in our algorithm. [sent-48, score-0.152]

16 exploiting syntactic information: as the shift-reduce parsing algorithm generates target language dependency trees in decoding, dependency language models (Shen et al. [sent-54, score-0.72]

17 resolving local parsing ambiguity: as dependency trees for phrases are memorized in rules, our approach avoids resolving local parsing ambiguity and explores a smaller search space than parsing word-by-word on the fly in decoding (Galley and Manning, 2009). [sent-58, score-0.983]

18 2 Shift-Reduce Parsing for Phrase-based String-to-Dependency Translation. Figure 1 shows a training example consisting of a (romanized) Chinese sentence, an English dependency tree, and the word alignment between them. [sent-63, score-0.205]

19 As shown in Figure 1, each rule is composed of a source phrase and a target dependency structure. [sent-66, score-0.396]

20 (2008) divide dependency structures into two broad categories: 1. [sent-68, score-0.295]

21 well-formed: (a) fixed: the head is known or fixed. (Figure 2: Shift-reduce parsing with string-to-dependency phrase pairs.) [sent-69, score-0.292]

22 For each state, the algorithm maintains a stack to store items (i. [sent-70, score-0.353]

23 At each step, it chooses one action to extend a state: shift (S), reduce left (Rl), or reduce right (Rr). [sent-73, score-0.53]

24 The decoding process terminates when all source words are covered and there is a complete dependency tree in the stack. [sent-74, score-0.475]

25 We further distinguish between left and right floating structures according to the position of the head. [sent-79, score-0.551]

26 For example, as "The President will" is the left dependent of its head "visit", it is a left floating structure. [sent-80, score-0.520]

27 Each item s ∈ S is a well-formed dependency structure. [sent-88, score-0.283]

28 shift (S): move a target dependency structure onto the stack; [sent-93, score-0.414]

29 reduce left (Rl): combine the two items on the stack, st and st−1 (t ≥ 2), with the root of st as the head, and replace them with a combined item; [sent-94, score-0.601]

30 reduce right (Rr): combine the two items on the stack, st and st−1 (t ≥ 2), with the root of st−1 as the head, and replace them with a combined item. [sent-95, score-0.453]
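The three actions can be written down as a small transition-system sketch, shown below; the `Item` and `State` records and the arc bookkeeping are illustrative assumptions, not the paper's code. Note that, as sentence 32 below explains, no input queue is maintained, since the structure to be shifted is unknown in advance.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """A target dependency structure held on the stack (hypothetical record)."""
    root: str                       # root word of the structure
    words: tuple                    # target words, in left-to-right order
    arcs: frozenset = frozenset()   # (head, dependent) dependency arcs

@dataclass
class State:
    stack: list = field(default_factory=list)

def shift(state, item):
    """S: move a target dependency structure onto the stack."""
    state.stack.append(item)

def reduce_left(state):
    """Rl: combine the top two items st and st-1; the root of st is the head."""
    st, st1 = state.stack.pop(), state.stack.pop()
    arc = (st.root, st1.root)       # st-1's root attaches under st's root
    state.stack.append(Item(st.root, st1.words + st.words,
                            st.arcs | st1.arcs | {arc}))

def reduce_right(state):
    """Rr: combine the top two items st and st-1; the root of st-1 is the head."""
    st, st1 = state.stack.pop(), state.stack.pop()
    arc = (st1.root, st.root)       # st's root attaches under st-1's root
    state.stack.append(Item(st1.root, st1.words + st.words,
                            st.arcs | st1.arcs | {arc}))
```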

31 The decoding process terminates when all source words are covered and there is a complete dependency tree in the stack. [sent-96, score-0.475]

32 , 2009), our algorithm does not maintain a queue for remaining words of the input because the future dependency structure to be shifted is unknown in advance in the translation scenario. [sent-98, score-0.423]

33 st and st−1 are the top two items in the stack of a state. (Table 1 columns: st−1, st, legal action(s).) [sent-104, score-0.298]

34 We use “h” to denote fixed structure, “l” to denote left floating structure, and “r” to denote right floating structure. [sent-105, score-0.846]

35 It is easy to verify that the reduce left and reduce right actions are equivalent to the left adjoining and right adjoining operations defined by Shen et al. [sent-108, score-0.499]

36 They suffice to operate on well-formed structures and produce projective dependency parse trees. [sent-110, score-0.295]

37 Therefore, with dependency structures present in the stacks, it is possible to use dependency language models to encourage linguistically plausible phrase reordering. [sent-111, score-0.586]

38 3 A Maximum Entropy Based Shift-Reduce Parsing Model. Shift-reduce parsing is efficient but suffers from parsing errors caused by syntactic ambiguity. [sent-112, score-0.326]

39 Figure 3 shows two (partial) derivations for a dependency tree. [sent-113, score-0.205]

40 Consider the item on the top: the algorithm can either apply a shift action to move a new item or apply a reduce left action to obtain a bigger structure. [sent-114, score-0.691]

41 This is often referred to as a conflict in the shift-reduce dependency parsing literature (Huang et al. [sent-115, score-0.368]

42 Fortunately, if we distinguish between left and right floating structures, it is possible to rule out most conflicts. [sent-126, score-0.527]

43 Table 1 shows the relationship between conflicts, dependency structures and actions. [sent-127, score-0.295]

44 “h” stands for fixed structure, “l” for left floating structure, and “r” for right floating structure. [sent-133, score-0.846]

45 If the stack is empty, the only applicable action is shift. [sent-134, score-0.323]

46 If there is only one item in the stack and the item is either fixed or left floating, the only applicable action is shift. [sent-135, score-0.63]

47 Note that it is illegal to shift a right floating structure onto an empty stack because it will never be reduced. [sent-136, score-0.849]

48 If the stack contains at least two items, only “h+h” is ambiguous and the others are either unambiguous or illegal. [sent-137, score-0.229]

49 Therefore, we only need to focus on how to resolve conflicts for the “h+h” case (i. [sent-138, score-0.181]

50 , the top two items in a stack are both fixed structures). [sent-140, score-0.368]
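These constraints can be folded into a small legal-action function, sketched below in Python; only the cases stated in the prose above are encoded, and the remaining two-item entries of Table 1 (unambiguous or illegal) are left as a placeholder because the summary does not reproduce the table.

```python
def legal_actions(stack_types):
    """Legal actions for a stack whose item types are 'h' (fixed),
    'l' (left floating), or 'r' (right floating), encoding only the
    constraints stated in the text."""
    if not stack_types:
        return {"S"}                      # empty stack: shift is the only option
    if len(stack_types) == 1:
        # shifting a right floating structure onto an empty stack is
        # prohibited, since it could never be reduced
        return {"S"} if stack_types[0] in ("h", "l") else set()
    if (stack_types[-2], stack_types[-1]) == ("h", "h"):
        return {"S", "Rl", "Rr"}          # the one ambiguous case
    return table1_lookup(stack_types[-2], stack_types[-1])

def table1_lookup(prev, top):
    """Placeholder for the unambiguous-or-illegal entries of Table 1,
    which this summary does not spell out."""
    raise NotImplementedError("consult Table 1 of the paper")
```

The "h+h" case is then the only place where the maximum entropy model of Section 3 is consulted.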

51 Unfortunately, such an oracle turns out to be non-unique even for monolingual shift-reduce dependency parsing (Huang et al. [sent-148, score-0.368]

52 The situation for phrase-based shift-reduce parsing is aggravated because there are usually multiple ways of segmenting a sentence into phrases. [sent-150, score-0.163]

53 The graph begins with an empty state and ends with the given training example. [sent-153, score-0.149]

54 Each node v ∈ V corresponds to a state in the shift-reduce parsing process. [sent-155, score-0.242]

55 To build the derivation graph, our algorithm starts with an empty state and iteratively extends an unprocessed state until it reaches the completed state. [sent-158, score-0.342]
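A minimal sketch of this construction, assuming hypothetical `extend` and `is_complete` helpers, looks as follows; states are assumed hashable (e.g., tuples).

```python
from collections import deque

def build_derivation_graph(extend, is_complete):
    """Sketch of derivation-graph construction: begin with the empty
    state and iteratively extend unprocessed states until the completed
    state (the full training example) is reached. `extend(state)` is
    assumed to yield (action, new_state) pairs for the legal actions
    consistent with the training example; `is_complete(state)`
    recognizes the completed state. Both are hypothetical helpers."""
    start = ()                           # the empty state
    nodes, edges = {start}, []
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if is_complete(state):
            continue                     # the completed state ends the graph
        for action, new_state in extend(state):
            edges.append((state, action, new_state))
            if new_state not in nodes:
                nodes.add(new_state)
                queue.append(new_state)
    return nodes, edges
```

Training events for the maximum entropy model can then be read off the (state, action) edges of the graph.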

56 In practice, we extend deterministic shift-reduce parsing with beam search (Zhang and Clark, 2008; Huang et al. [sent-176, score-0.236]

57 As shown in Algorithm 1, the algorithm maintains a list of stacks V, and each stack groups states with the same number of accumulated actions (line 2). [sent-178, score-0.381]

58 The stack list V initializes with an empty state v0 (line 3). [sent-179, score-0.149]
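Sketched in Python, the stack organization of Algorithm 1 might look like this; `successors` and `score` are assumed helpers, and the beam size is an arbitrary choice here.

```python
import heapq

def beam_shift_reduce(init_state, max_actions, successors, score, beam_size=8):
    """Minimal beam-search sketch in the spirit of Algorithm 1: V is a
    list of stacks, and stack V[i] groups states that have accumulated
    exactly i actions; V initializes with the empty state v0."""
    V = [[] for _ in range(max_actions + 1)]
    V[0] = [init_state]                          # v0
    for i in range(max_actions):
        candidates = [s for state in V[i] for s in successors(state)]
        # keep only the beam_size highest-scoring extensions
        V[i + 1] = heapq.nlargest(beam_size, candidates, key=score)
    return V[max_actions]
```

Unlike Moses, which groups partial translations by the number of covered source words, the stacks here are organized by action count (see sentence 76 below).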

59 As the stack of a state keeps changing during the decoding process, the context information needed to calculate dependency language model and maximum entropy model probabilities (e. [sent-184, score-0.81]

60 Therefore, we use hypergraph reranking (Huang and Chiang, 2007; Huang, 2008), which proves to be effective for integrating non-local features into dynamic programming, to alleviate this problem. [sent-190, score-0.143]

61 In the second pass, we use the hypergraph reranking algorithm (Huang, 2008) to find promising translations using additional dependency features (i. [sent-195, score-0.448]

62 As a hypergraph is capable of storing exponentially many derivations compactly, the negative effect of propagating mistakes made in the first pass to the second pass can be minimized. [sent-198, score-0.441]
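Conceptually, the second pass is a Viterbi-style dynamic program over the first-pass hypergraph with the dependency features added in; the sketch below assumes generic node/edge attributes and is not the decoder's actual interface.

```python
def rerank_hypergraph(nodes_topo, first_pass_score, dep_score):
    """Second-pass sketch: Viterbi search over the first-pass hypergraph
    once the non-local dependency features are added. `nodes_topo` is
    assumed to list nodes in topological order, each carrying `in_edges`
    with `tails` (antecedent nodes); `first_pass_score` and `dep_score`
    (dependency LM plus maxent parsing features) are assumed helpers."""
    best, back = {}, {}
    for node in nodes_topo:
        if not node.in_edges:                    # axiom/leaf node
            best[node] = 0.0
            continue
        for edge in node.in_edges:
            s = (first_pass_score(edge) + dep_score(edge)
                 + sum(best[t] for t in edge.tails))
            if node not in best or s > best[node]:
                best[node], back[node] = s, edge
    return best, back        # trace `back` from the goal node for the 1-best
```

Because the hypergraph stores exponentially many derivations compactly, this rescoring pass can recover derivations the first pass did not rank highest.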

63 If an ill-formed structure has a single root, it can be treated as a (pseudo) fixed structure; otherwise it is transformed into one (pseudo) left floating structure and one (pseudo) right floating structure. [sent-201, score-0.922]
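A sketch of this conversion, with hypothetical helpers passed in: `roots_of` returns the roots of a structure, and `split_by_head_side` partitions a multi-rooted structure into the parts attaching to their missing head from the left and from the right.

```python
def normalize_ill_formed(structure, roots_of, split_by_head_side):
    """Sketch of the conversion described above; the helpers are
    hypothetical, not the paper's interface."""
    if len(roots_of(structure)) == 1:
        return [("h", structure)]                 # pseudo fixed structure
    left_part, right_part = split_by_head_side(structure)
    return [("l", left_part), ("r", right_part)]  # pseudo left/right floating
```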

64 5 Experiments We evaluated dependency our phrase-based string-to- system Chinese- translation English translation. [sent-203, score-0.29]

65 We used the Stanford parser (Klein and Manning, 2003) to get dependency trees for English sentences. [sent-208, score-0.258]

66 We used the SRILM toolkit (Stolcke, 2002) to train a language model. (Footnote 3: the first pass does not work like a phrase-based decoder because it yields dependency trees on the target side.) [sent-209, score-0.473]

67 , each action has a fixed probability of 1/3) is used to resolve “h+h” conflicts. [sent-212, score-0.219]

68 Table 3: Contribution of the maximum entropy shift-reduce parsing model. [sent-250, score-0.352]

69 Adding the dependency language model ("depLM") and the maximum entropy shift-reduce parsing model ("maxent") significantly improves BLEU and TER on the development set, both separately and jointly. [sent-252, score-0.523]

70 A 3-gram dependency language model was trained on the English dependency trees. [sent-254, score-0.41]
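For orientation, a dependency LM in the style of Shen et al. (2008) conditions each head's left and right dependent sequences, generated outward from the head, on up to n−1 previous symbols. The sketch below shows one way such training events might be collected; the tree format, the event layout, and the example fragment are illustrative assumptions.

```python
def dep_lm_events(tree, n=3):
    """Collect (history, next-word) training events for an n-gram
    dependency LM. `tree` is assumed to map each head word to a pair
    (left_dependents, right_dependents), each ordered outward from
    the head (a hypothetical format)."""
    events = []
    for head, (left_deps, right_deps) in tree.items():
        for seq in (left_deps, right_deps):
            context = (head,)
            for dep in seq:
                events.append((context[-(n - 1):], dep))
                context += (dep,)
    return events

# Illustrative fragment: head "visit" with left dependents "will" and
# "President", ordered outward from the head.
events = dep_lm_events({"visit": (("will", "President"), ())})
```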

71 We evaluated translation quality using uncased BLEU (Papineni et al. [sent-256, score-0.152]

72 Moses shares the same feature set with our system except for the dependency features. [sent-277, score-0.205]

73 For our system, we used Le Zhang's maximum entropy modeling toolkit to train the shift-reduce parsing model after extracting 32. [sent-282, score-0.279]

74 From the same training data, Moses extracted 103M bilingual phrases, the bottom-up string-to-dependency system extracted 587M string-to-dependency rules, and our system extracted 124M phrase-based dependency rules. [sent-288, score-0.245]

75 The gains in TER are much larger than those in BLEU because dependency language models do not model n-grams directly. [sent-298, score-0.205]

76 One possible reason is that our decoder organizes stacks with respect to actions, whereas Moses groups partial translations with the same number of covered source words in stacks. [sent-307, score-0.325]

77 In the second pass, our decoder reranks the hypergraph with additional dependency features. [sent-308, score-0.425]

78 We find that adding dependency language and maximum entropy shift-reduce models consistently brings significant improvements, both separately and jointly. [sent-309, score-0.321]

79 Our system consistently outperforms Moses in all cases, suggesting that adding dependency information helps improve phrase reordering. [sent-319, score-0.291]

80 They introduce maximum spanning tree (MST) parsing (McDonald et al. [sent-321, score-0.247]

81 One challenge is that MST parsing itself is not incremental, making it expensive to identify loops during hypothesis expansion. [sent-324, score-0.163]

82 On the contrary, shift-reduce parsing is naturally incremental and can be seamlessly integrated into left-to-right phrase-based decoding. [sent-325, score-0.321]

83 More importantly, in our work dependency trees are memorized for phrases rather than being generated word by word on the fly in decoding. [sent-326, score-0.313]

84 This treatment might not only reduce decoding complexity but also potentially resolve local parsing ambiguity. [sent-327, score-0.423]

85 Our decoding algorithm is similar to Gimpel and Smith (2011)'s lattice parsing algorithm as we divide decoding into two steps: hypergraph generation and hypergraph rescoring. [sent-328, score-0.921]

86 The major difference is that our hypergraph is not a phrasal lattice because each phrase pair is associated with a dependency structure on the target side. [sent-329, score-0.555]

87 In other words, our second pass is to find the Viterbi derivation with additional features rather than parsing the phrasal lattice. [sent-330, score-0.365]

88 In addition, their algorithm produces phrasal dependency parse trees, while the leaves of our dependency trees are words, so that dependency language models can be used directly. [sent-331, score-0.820]

89 Shift-reduce parsing has been successfully used in phrase-based decoding, but has been limited to adding structural constraints. [sent-332, score-0.344]

90 (2010) use shift-reduce parsing to impose ITG (Wu, 1997) constraints on phrase permutation. [sent-335, score-0.249]

91 Along another line, a number of authors have developed incremental algorithms for syntax-based models (Watanabe et al. [sent-337, score-0.145]

92 (2012) use dotted rules to change the tree traversal to generate target words left-to-right, either top-down or bottom-up. [sent-344, score-0.185]

93 7 Conclusion. We have presented a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. [sent-345, score-0.218]

94 The algorithm generates dependency structures incrementally using string-to-dependency phrase pairs. [sent-346, score-0.436]

95 , the uncovered source phrases) in the maximum entropy model to resolve conflicts. [sent-350, score-0.171]

96 It is also interesting to compare with applying word-based shift-reduce parsing to phrase-based decoding, similar to Galley and Manning (2009). [sent-352, score-0.344]

97 An efficient shift-reduce decoding algorithm for phrased-based machine translation. [sent-372, score-0.236]

98 A new string-to-dependency machine translation algorithm with a target dependency language model. [sent-509, score-0.384]

99 Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. [sent-533, score-0.163]

100 A tale of two parsers: investigating and combining graph-based and transition-based dependency parsing using beam search. [sent-542, score-0.368]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('floating', 0.315), ('stack', 0.229), ('visit', 0.218), ('dependency', 0.205), ('decoding', 0.181), ('parsing', 0.163), ('moses', 0.158), ('president', 0.158), ('huang', 0.148), ('shen', 0.145), ('hypergraph', 0.143), ('rl', 0.138), ('shift', 0.132), ('st', 0.132), ('conflicts', 0.126), ('galley', 0.112), ('wlc', 0.109), ('pass', 0.099), ('stacks', 0.097), ('action', 0.094), ('structures', 0.09), ('london', 0.089), ('phrase', 0.086), ('incremental', 0.085), ('translation', 0.085), ('feng', 0.085), ('bleu', 0.083), ('wh', 0.082), ('left', 0.081), ('rr', 0.08), ('reduce', 0.079), ('state', 0.079), ('item', 0.078), ('decoder', 0.077), ('entropy', 0.074), ('shiftreduce', 0.073), ('decoders', 0.071), ('empty', 0.07), ('fixed', 0.07), ('items', 0.069), ('reordering', 0.067), ('uncased', 0.067), ('rule', 0.066), ('liang', 0.066), ('watanabe', 0.065), ('right', 0.065), ('root', 0.065), ('koehn', 0.061), ('syntaxbased', 0.06), ('exponentially', 0.06), ('ter', 0.06), ('derivation', 0.059), ('partial', 0.059), ('tsinghua', 0.057), ('resolve', 0.055), ('algorithm', 0.055), ('memorized', 0.055), ('tlc', 0.055), ('transversal', 0.055), ('trc', 0.055), ('wrc', 0.055), ('haitao', 0.053), ('chiang', 0.053), ('trees', 0.053), ('pseudo', 0.052), ('dyer', 0.052), ('rules', 0.049), ('mst', 0.049), ('qun', 0.049), ('actions', 0.049), ('wst', 0.048), ('manning', 0.047), ('covered', 0.047), ('mi', 0.047), ('kevin', 0.046), ('coverage', 0.046), ('yang', 0.045), ('translations', 0.045), ('riezler', 0.045), ('nist', 0.045), ('illformed', 0.045), ('prohibited', 0.045), ('acl', 0.044), ('phrasal', 0.044), ('head', 0.043), ('maximum', 0.042), ('och', 0.042), ('tree', 0.042), ('hs', 0.042), ('capable', 0.04), ('bottomup', 0.04), ('maintain', 0.04), ('emnlp', 0.039), ('significantly', 0.039), ('target', 0.039), ('tag', 0.039), ('liu', 0.038), ('distortion', 0.038), ('compactly', 0.038), ('structure', 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

Author: Yang Liu

Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.

2 0.23711263 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing

Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu

Abstract: Shift-reduce dependency parsers give comparable accuracies to their chart-based counterparts, yet the best shift-reduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.

3 0.22304426 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers

Author: Yoav Goldberg ; Kai Zhao ; Liang Huang

Abstract: Beam search incremental parsers are accurate, but not as fast as they could be. We demonstrate that, contrary to popular belief, most current implementations of beam parsers in fact run in O(n2), rather than linear time, because each state-transition is actually implemented as an O(n) operation. We present an improved implementation, based on Tree Structured Stack (TSS), in which a transition is performed in O(1), resulting in a real linear-time algorithm, which is verified empirically. We further improve parsing speed by sharing feature-extraction and dot-product across beam items. Practically, our methods combined offer a speedup of ∼2x over strong baselines on Penn Treebank sentences, and are orders of magnitude faster on much longer sentences.

4 0.21959226 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers

Author: Graham Neubig

Abstract: In this paper we describe Travatar, a forest-to-string machine translation (MT) engine based on tree transducers. It provides an open-source C++ implementation for the entire forest-to-string MT pipeline, including rule extraction, tuning, decoding, and evaluation. There are a number of options for model training, and tuning includes advanced options such as hypergraph MERT, and training of sparse features through online learning. The training pipeline is modeled after that of the popular Moses decoder, so users familiar with Moses should be able to get started quickly. We perform a validation experiment of the decoder on English-Japanese machine translation, and find that it is possible to achieve greater accuracy than translation using phrase-based and hierarchical-phrase-based translation. As auxiliary results, we also compare different syntactic parsers and alignment techniques that we tested in the process of developing the decoder. Travatar is available under the LGPL at http://phontron.com/travatar

5 0.19323246 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy

Author: Francesco Sartorio ; Giorgio Satta ; Joakim Nivre

Abstract: We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. The new strategy allows the parser to postpone difficult decisions until the relevant information becomes available. The novel parser has a ∼12% error reduction in unlabeled attachment score over an arc-eager parser, with a slow-down factor of 2.8.

6 0.1872678 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

7 0.17924014 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT

8 0.17265865 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

9 0.17054038 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation

10 0.16110559 80 acl-2013-Chinese Parsing Exploiting Characters

11 0.1606279 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

12 0.15929997 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

13 0.15151854 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment

14 0.14886637 314 acl-2013-Semantic Roles for String to Tree Machine Translation

15 0.1354062 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation

16 0.13416544 181 acl-2013-Hierarchical Phrase Table Combination for Machine Translation

17 0.13162313 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

18 0.13128513 288 acl-2013-Punctuation Prediction with Transition-based Parsing

19 0.12650184 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

20 0.12468314 38 acl-2013-Additive Neural Networks for Statistical Machine Translation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.296), (1, -0.272), (2, -0.048), (3, 0.136), (4, -0.122), (5, 0.063), (6, 0.108), (7, -0.024), (8, 0.021), (9, -0.02), (10, -0.026), (11, 0.092), (12, -0.024), (13, 0.011), (14, 0.216), (15, 0.042), (16, -0.04), (17, 0.018), (18, -0.011), (19, 0.007), (20, -0.09), (21, 0.045), (22, 0.057), (23, -0.002), (24, -0.014), (25, 0.008), (26, -0.041), (27, -0.004), (28, -0.006), (29, 0.027), (30, 0.029), (31, -0.002), (32, -0.001), (33, -0.049), (34, -0.024), (35, -0.0), (36, -0.011), (37, 0.013), (38, -0.083), (39, -0.003), (40, 0.045), (41, 0.067), (42, 0.004), (43, 0.122), (44, 0.045), (45, 0.004), (46, -0.046), (47, -0.003), (48, -0.054), (49, 0.012)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96050715 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

Author: Yang Liu

Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.

2 0.77205795 133 acl-2013-Efficient Implementation of Beam-Search Incremental Parsers

Author: Yoav Goldberg ; Kai Zhao ; Liang Huang

Abstract: Beam search incremental parsers are accurate, but not as fast as they could be. We demonstrate that, contrary to popular belief, most current implementations of beam parsers in fact run in O(n2), rather than linear time, because each state-transition is actually implemented as an O(n) operation. We present an improved implementation, based on Tree Structured Stack (TSS), in which a transition is performed in O(1), resulting in a real linear-time algorithm, which is verified empirically. We further improve parsing speed by sharing feature-extraction and dot-product across beam items. Practically, our methods combined offer a speedup of ∼2x over strong baselines on Penn Treebank sentences, and are orders of magnitude faster on much longer sentences.

3 0.75536621 226 acl-2013-Learning to Prune: Context-Sensitive Pruning for Syntactic MT

Author: Wenduan Xu ; Yue Zhang ; Philip Williams ; Philipp Koehn

Abstract: We present a context-sensitive chart pruning method for CKY-style MT decoding. Source phrases that are unlikely to have aligned target constituents are identified using sequence labellers learned from the parallel corpus, and speed-up is obtained by pruning corresponding chart cells. The proposed method is easy to implement, orthogonal to cube pruning and additive to its pruning power. On a full-scale English-to-German experiment with a string-to-tree model, we obtain a speed-up of more than 60% over a strong baseline, with no loss in BLEU.

4 0.75046831 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy

Author: Francesco Sartorio ; Giorgio Satta ; Joakim Nivre

Abstract: We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. The new strategy allows the parser to postpone difficult decisions until the relevant information becomes available. The novel parser has a ∼12% error reduction in unlabeled attachment score over an arc-eager parser, with a slow-down factor of 2.8.

5 0.74140787 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing

Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu

Abstract: Shift-reduce dependency parsers give comparable accuracies to their chart-based counterparts, yet the best shift-reduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.

6 0.72660959 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

7 0.72316897 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing

8 0.7156238 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers

9 0.71403897 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

10 0.69188786 288 acl-2013-Punctuation Prediction with Transition-based Parsing

11 0.6866622 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation

12 0.66322082 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation

13 0.66120338 312 acl-2013-Semantic Parsing as Machine Translation

14 0.63040894 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts

15 0.61939865 13 acl-2013-A New Syntactic Metric for Evaluation of Machine Translation

16 0.60002071 200 acl-2013-Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation

17 0.59927052 328 acl-2013-Stacking for Statistical Machine Translation

18 0.59079742 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation

19 0.58918458 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?

20 0.58356071 165 acl-2013-General binarization for parsing and translation


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.042), (6, 0.034), (11, 0.065), (24, 0.021), (26, 0.04), (35, 0.045), (42, 0.079), (48, 0.026), (70, 0.414), (88, 0.014), (90, 0.065), (95, 0.068)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.96970773 384 acl-2013-Visual Features for Linguists: Basic image analysis techniques for multimodally-curious NLPers

Author: Elia Bruni ; Marco Baroni

Abstract: unkown-abstract

2 0.96725929 89 acl-2013-Computerized Analysis of a Verbal Fluency Test

Author: James O. Ryan ; Serguei Pakhomov ; Susan Marino ; Charles Bernick ; Sarah Banks

Abstract: We present a system for automated phonetic clustering analysis of cognitive tests of phonemic verbal fluency, on which one must name words starting with a specific letter (e.g., ‘F’) for one minute. Test responses are typically subjected to manual phonetic clustering analysis that is labor-intensive and subject to inter-rater variability. Our system provides an automated alternative. In a pilot study, we applied this system to tests of 55 novice and experienced professional fighters (boxers and mixed martial artists) and found that experienced fighters produced significantly longer chains of phonetically similar words, while no differences were found in the total number of words produced. These findings are preliminary, but strongly suggest that our system can be used to detect subtle signs of brain damage due to repetitive head trauma in individuals that are otherwise unimpaired.

3 0.9563235 296 acl-2013-Recognizing Identical Events with Graph Kernels

Author: Goran Glavas ; Jan Snajder

Abstract: Identifying news stories that discuss the same real-world events is important for news tracking and retrieval. Most existing approaches rely on the traditional vector space model. We propose an approach for recognizing identical real-world events based on a structured, event-oriented document representation. We structure documents as graphs of event mentions and use graph kernels to measure the similarity between document pairs. Our experiments indicate that the proposed graph-based approach can outperform the traditional vector space model, and is especially suitable for distinguishing between topically similar, yet non-identical events.

same-paper 4 0.95252562 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation

Author: Yang Liu

Abstract: We introduce a shift-reduce parsing algorithm for phrase-based string-to-dependency translation. As the algorithm generates dependency trees for partial translations left-to-right in decoding, it allows for efficient integration of both n-gram and dependency language models. To resolve conflicts in shift-reduce parsing, we propose a maximum entropy model trained on the derivation graph of training data. As our approach combines the merits of phrase-based and string-to-dependency models, it achieves significant improvements over the two baselines on the NIST Chinese-English datasets.

5 0.93684721 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs

Author: Shay B. Cohen ; Mark Johnson

Abstract: Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the “probabilities” of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of “almost everywhere tight grammars” and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.

6 0.91784471 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering

7 0.90181023 220 acl-2013-Learning Latent Personas of Film Characters

8 0.86382067 356 acl-2013-Transfer Learning Based Cross-lingual Knowledge Extraction for Wikipedia

9 0.73083663 153 acl-2013-Extracting Events with Informal Temporal References in Personal Histories in Online Communities

10 0.68720895 329 acl-2013-Statistical Machine Translation Improves Question Retrieval in Community Question Answering via Matrix Factorization

11 0.6838491 249 acl-2013-Models of Semantic Representation with Visual Attributes

12 0.66908324 80 acl-2013-Chinese Parsing Exploiting Characters

13 0.66678238 380 acl-2013-VSEM: An open library for visual semantics representation

14 0.66254604 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars

15 0.64777911 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus

16 0.62222683 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing

17 0.6221422 339 acl-2013-Temporal Signals Help Label Temporal Relations

18 0.61547178 165 acl-2013-General binarization for parsing and translation

19 0.61366069 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

20 0.61148769 168 acl-2013-Generating Recommendation Dialogs by Extracting Information from User Reviews