emnlp emnlp2011 emnlp2011-132 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yue Zhang ; Stephen Clark
Abstract: Machine-produced text often lacks grammaticality and fluency. This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. In a standard word ordering task, our system gives a BLEU score of 40.1, higher than the previous result of 33.7 achieved by a dependency-based system.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Machine-produced text often lacks grammaticality and fluency. [sent-5, score-0.236]
2 This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. [sent-6, score-0.236]
3 The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. [sent-7, score-0.304]
4 The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. [sent-8, score-0.196]
5 1 Introduction Machine-produced text, such as SMT output, often lacks grammaticality and fluency, especially when using n-gram language modelling (Knight, 2007). [sent-12, score-0.236]
6 Recent efforts have been made to improve grammaticality using local language models (Blackwood et al. [sent-13, score-0.236]
7 The input may also include words beyond the output of the base system, e. [sent-27, score-0.228]
8 The search algorithm is guided by perceptron training, which ensures that the explored path in the search space consists of highly probable hypotheses. [sent-34, score-0.366]
9 This framework of best-first search guided by learning is a general contribution of the paper, which could be applied to problems outside grammaticality improvement. [sent-35, score-0.432]
10 This problem is an instance of our general task formulation, but without any input constraints, or content word selection (since all input words are used). [sent-39, score-0.21]
11 the output of the base SMT system), some constraints can be given to specific input words, limiting their order or identifying them as an atomic phrase, for example. [sent-58, score-0.228]
12 The goal of the search algorithm is to find an optimal parse tree (including the surface string) among all that can be constructed via selecting and ordering a subset of words from the input multiset. [sent-59, score-0.349]
13 A hypothesis is expanded by applying CCG unary rules to the hypothesis, or by combining the hypothesis with existing hypotheses using CCG binary rules. [sent-65, score-0.379]
14 Each edge is a CCG constituent, spanning a sequence of words. [sent-71, score-0.23]
15 Similar to partial parses in a typical chart parser, edges have recursive structures. [sent-72, score-0.382]
16 Depending on the number of subedges, edges can be classified into leaf edges, unary edges and binary edges. [sent-73, score-0.556]
17 Existing edges are expanded to generate new edges via unary and binary CCG rules. [sent-75, score-0.62]
18 An edge that meets the output criteria is called a goal edge. [sent-76, score-0.337]
19 In the experiments of this paper, we define a goal edge as one that includes all input words the correct number of times. [sent-77, score-0.399]
20 The signature of an edge consists of the category label, surface string and head word of the constituent. [sent-78, score-0.349]
21 Two edges are equivalent if they share the same signature. [sent-79, score-0.226]
22 Given our feature definitions, a lower scoring edge with the same signature as a higher scoring edge cannot be part of the highest scoring derivation. [sent-80, score-0.46]
23 The number of words in the surface string of an edge is called the size of the edge. [sent-81, score-0.275]
24 Other important substructures of an edge include a bitvector and an array, which stores the indices of the input words that the edge contains. [sent-82, score-0.608]
25 Before two edges are combined using a binary CCG rule, an input check is performed to make sure that the total count for a word from the two edges does not exceed the count for that word in the input. [sent-83, score-0.557]
26 Intuitively, an edge can record the count of each unique input word it contains, and perform the input check in linear time. [sent-84, score-0.44]
27 However, since most input words typically occur once, they can be indexed and represented by a bitvector, which allows a constant time input check. [sent-85, score-0.21]
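To make this edge bookkeeping concrete, here is a minimal illustrative sketch in Python (not the authors' implementation). An Edge record carries the signature components (category label, surface string, head word) together with per-word usage counts, and check_input verifies that two edges can be combined without exceeding the input multiset; the names Edge and check_input are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Edge:
    category: str                 # CCG category label of the constituent
    surface: tuple                # surface string, as a tuple of words
    head: str                     # head word of the constituent
    counts: Counter = field(default_factory=Counter)  # input-word usage counts
    subedges: tuple = ()          # () for leaf edges, 1 or 2 sub-edges otherwise

    @property
    def signature(self):
        # Edges sharing a signature are equivalent; only the highest-scoring
        # one can appear in the highest-scoring derivation.
        return (self.category, self.surface, self.head)

    @property
    def size(self):
        return len(self.surface)

def check_input(e1: Edge, e2: Edge, input_counts: Counter) -> bool:
    """True iff combining e1 and e2 would not use any input word more often
    than it occurs in the input multiset.  When every input word occurs
    exactly once, the counts can be packed into a bitvector and this check
    becomes a constant-time bitwise test."""
    combined = e1.counts + e2.counts
    return all(combined[w] <= input_counts[w] for w in combined)
```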
28 In the best-first process, edges to be expanded are ordered by their scores, and stored in an agenda. [sent-87, score-0.29]
29 There are many ways in which edges could be ordered and compared. [sent-89, score-0.226]
30 Here the chart is organised as a set of beams, each containing a fixed number of edges with a particular size. [sent-90, score-0.382]
31 In each beam, edges are ordered by their scores, and low score edges are pruned. [sent-92, score-0.452]
32 In addition to pruning by the beam, only the highest scored edge is kept among all that share the same signature. [sent-93, score-0.282]
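The chart just described might be organised roughly as follows: one beam per edge size, with signature-based deduplication and pruning to a fixed beam width. This is an illustrative sketch that reuses the Edge sketch above; the Chart class and its methods are assumptions, not the paper's code.

```python
class Chart:
    """Edges grouped into beams by size; each beam keeps at most
    beam_width edges and at most one edge per signature."""

    def __init__(self, beam_width: int):
        self.beam_width = beam_width
        self.beams = {}                      # size -> list of (edge, score)

    def add(self, edge, score):
        beam = self.beams.setdefault(edge.size, [])
        for i, (other, other_score) in enumerate(beam):
            if other.signature == edge.signature:
                if score > other_score:      # keep only the best per signature
                    beam[i] = (edge, score)
                return
        beam.append((edge, score))
        beam.sort(key=lambda pair: pair[1], reverse=True)
        del beam[self.beam_width:]           # prune low-scoring edges

    def contains_equivalent(self, edge):
        # Equivalent edges share a surface string, hence a size,
        # so only one beam needs to be inspected.
        return any(other.signature == edge.signature
                   for other, _ in self.beams.get(edge.size, []))

    def all_edges(self):
        for beam in self.beams.values():
            yield from beam
```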
33 During initialization, the agenda (a) and chart (c) are cleared. [sent-96, score-0.355]
34 All candidate lexical categories are assigned to each input word, and the resulting leaf edges are put onto the agenda. [sent-97, score-0.438]
35 In the main loop, the best edge (e) is popped from the agenda. [sent-98, score-0.302]
36 If e, or any edge equivalent to e, is already in the chart, the loop continues without expanding e. [sent-100, score-0.298]
37 It can be proved that every edge in the chart must already have been combined with the equivalent edge, and therefore the expansion of e is unnecessary. [sent-101, score-0.386]
38 Edge e is first expanded by applying unary rules, and any new edges are put into a list (new). [sent-102, score-0.394]
39 Next, e is matched against each existing edge ˜e in the chart. [sent-103, score-0.23]
40 e and ˜e are combined in both possible orders, and any resulting edge is added to new. [sent-105, score-0.23]
41 At the end of each loop iteration, edges from new are added to the agenda, and new is cleared. [sent-106, score-0.226]
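The decoding loop just described can be pictured as below. This is a control-flow sketch only, under the simplifying assumption that edge scores are fixed when edges are pushed; make_leaf_edge, apply_unary_rules, combine, is_goal and default_output are hypothetical callables standing in for the grammar operations, the goal test and the fallback described next, and Edge, Chart and check_input refer to the sketches above.

```python
import heapq
import itertools
import time
from collections import Counter

def decode(words, lexicon, score, beam_width, time_limit,
           make_leaf_edge, apply_unary_rules, combine, is_goal, default_output):
    """Best-first construction of goal edges from a multiset of input words."""
    input_counts = Counter(words)
    agenda, tie = [], itertools.count()        # max-heap via negated scores
    chart, goals = Chart(beam_width), []

    # Initialization: one leaf edge per candidate lexical category per word.
    for w in words:
        for cat in lexicon[w]:
            leaf = make_leaf_edge(w, cat)
            heapq.heappush(agenda, (-score(leaf), next(tie), leaf))

    start = time.time()
    while agenda and time.time() - start < time_limit:
        neg, _, e = heapq.heappop(agenda)      # best-scoring edge first
        if chart.contains_equivalent(e):
            continue                           # expanding e cannot help
        if is_goal(e, input_counts):
            goals.append(e)
            continue
        new = list(apply_unary_rules(e))       # unary expansion
        for other, _ in chart.all_edges():
            if check_input(e, other, input_counts):
                new.extend(combine(e, other))  # both combination orders
                new.extend(combine(other, e))
        chart.add(e, -neg)
        for n in new:
            heapq.heappush(agenda, (-score(n), next(tie), n))
    return goals if goals else default_output(chart, words)
```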
42 For practical reasons we also include a timeout stopping condition. [sent-111, score-0.213]
43 If no goal edges are found before the timeout is reached, a default output is constructed by the following procedure. [sent-112, score-0.546]
44 First, if any two edges in the chart pass the input check, and the words they contain constitute the full input set, they are concatenated to form an output string. [sent-113, score-0.635]
45 Second, when no two edges in the chart meet the above condition, the largest edge in the chart is chosen. [sent-114, score-0.768]
46 Then edges in the chart are iterated over in largest-first order, with any edge ˜e that passes the input check with e concatenated to e and e updated accordingly. [sent-115, score-0.717]
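One illustrative reading of this fallback procedure, again reusing the Edge, Chart and check_input sketches above; it should not be taken as the authors' exact construction.

```python
from collections import Counter

def default_output(chart, words):
    """Fallback used when no goal edge is found before the timeout."""
    input_counts = Counter(words)
    edges = sorted((e for e, _ in chart.all_edges()),
                   key=lambda e: e.size, reverse=True)
    # First: two chart edges that pass the input check and together
    # contain the full input multiset are simply concatenated.
    for i, a in enumerate(edges):
        for b in edges[i + 1:]:
            if (check_input(a, b, input_counts)
                    and a.counts + b.counts == input_counts):
                return a.surface + b.surface
    # Otherwise: start from the largest edge and greedily append any
    # further edge that still passes the input check, largest first.
    if not edges:
        return tuple(words)                    # degenerate case: empty chart
    output, used = list(edges[0].surface), Counter(edges[0].counts)
    for e in edges[1:]:
        if all(used[w] + c <= input_counts[w] for w, c in e.counts.items()):
            output.extend(e.surface)
            used += e.counts
    return tuple(output)
```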
47 4 Model and Features We use a discriminative linear model to score edges, where the score of an edge e is calculated using the global feature vector Φ(e) and the parameter vector of the model. [sent-117, score-0.23]
48 It is computed incrementally as the edge is built. [sent-119, score-0.23]
49 So for any edge e, φ(e) consists of features from the top rule of the hierarchical structure of e. [sent-121, score-0.23]
50 The features in Table 1 represent patterns including the constituent label; the head word of the constituent; the size of the constituent; word, POS and lexical category N-grams resulting from a binary combination; and the unary and binary rules by which the constituent is constructed. [sent-123, score-0.381]
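Since the global feature vector of an edge is its own top-rule features plus the global vectors of its sub-edges, the score w · Φ(e) can be accumulated as edges are built. A minimal sketch, assuming a hypothetical local_features function that returns the top-rule feature counts of Table 1 as a sparse dict:

```python
def edge_score(edge, weights, local_features):
    """score(e) = w . Phi(e), with Phi(e) = phi(e) + sum of Phi(sub-edge).
    In practice the score is cached on each edge as it is constructed,
    so the recursion here is only conceptual."""
    s = sum(weights.get(f, 0.0) * v
            for f, v in local_features(edge).items())
    for sub in getattr(edge, "subedges", ()):
        s += edge_score(sub, weights, local_features)
    return s
```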
51 Second, it is also crucial to the speed of the search algorithm, since the best-first mechanism relies on a model to find goal hypotheses efficiently. [sent-127, score-0.212]
52 The training algorithm runs the decoder on each training example, updating the model when necessary, until the gold goal edge is recovered. [sent-130, score-0.363]
53 We choose to update parameters as soon as the best edge from the agenda is not a gold-standard edge. [sent-135, score-0.498]
54 The intuition is that all gold edges are forced to be above all non-gold edges on the agenda. [sent-136, score-0.521]
55 An alternative is to update when a gold-standard edge falls off the chart, which corresponds to the precondition for parameter updates of Daumé III and Marcu (2005). [sent-138, score-0.399]
56 Our updates lead both to correctness (edges in the chart are correct) and efficiency (correct edges are found at the first possible opportunity). [sent-140, score-0.436]
57 During a perceptron update, an incorrect prediction, corresponding to the current best edge in the agenda, is penalized, and the corresponding gold edge is rewarded. [sent-141, score-0.61]
58 However, in our scenario it is not obvious what the corresponding gold edge should be, and there are many ways in which the gold edge could be defined. [sent-142, score-0.598]
59 In practice we found that the simple strategy of selecting the lowest scored gold-standard edge in the agenda was effective, and the results presented in this paper are based on this method. [sent-144, score-0.481]
60 The first is to reinitialize the agenda and chart using the new model, and continue until the current training example is correctly predicted. [sent-146, score-0.412]
61 In order to achieve reasonable efficiency, we adopt a second approach, which is to continue training without reinitializing the agenda and chart. [sent-149, score-0.256]
62 Instead, only edges from the top of the agenda down to the lowest-scoring gold-standard edge are given new scores according to the new parameters. [sent-150, score-0.655]
63 The initialization is identical to the test search, except that the list of goal edges is not maintained. [sent-152, score-0.29]
64 In the main loop, the best edge e is popped off the agenda. [sent-153, score-0.302]
65 Only gold edges are pushed onto the chart throughout the training process. [sent-156, score-0.451]
66 When updating parameters, the current non-gold edge (e) is used as the negative example, and the smallest gold edge in the agenda (minGold) is used as the corresponding positive example. [sent-157, score-0.728]
67 Note that we do not use the global feature vector in the update, since only the constituent level parameter vectors are compatible for edges with different sizes. [sent-159, score-0.306]
68 After a parameter update, edges are rescored from the top of the agenda down to minGold. [sent-160, score-0.425]
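The update step described here can be sketched as follows. The negative example is the offending best non-gold edge, the positive example is the lowest-scoring gold edge still on the agenda (minGold), and, as noted above, the features used are the constituent-level (top-rule) features rather than the global vector; only the agenda entries from the top down to minGold are then rescored. All names are illustrative, and the sketch assumes at least one gold edge remains on the agenda.

```python
def guided_update(agenda_best_first, weights, phi, is_gold):
    """One guided-learning check over an agenda already ordered best-first.
    phi(e) is assumed to return the constituent-level (top-rule) feature
    dict of edge e; is_gold tests membership in the gold derivation."""
    best = agenda_best_first[0]
    if is_gold(best):
        return False                           # top of agenda is gold: no update
    gold = [e for e in agenda_best_first if is_gold(e)]
    min_gold = gold[-1]                        # lowest-scoring gold edge (minGold)
    for f, v in phi(min_gold).items():         # reward the gold edge
        weights[f] = weights.get(f, 0.0) + v
    for f, v in phi(best).items():             # penalise the offending edge
        weights[f] = weights.get(f, 0.0) - v
    # Only agenda entries from the top down to min_gold are then rescored
    # with the new weights; the remainder keep their old scores.
    return True
```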
69 In addition to the surface string, our system also produces the CCG parse given an input bag of words. [sent-174, score-0.214]
70 The quality of the parse tree can reflect both the grammaticality of the surface string and the quality of the trained grammar model. [sent-175, score-0.343]
71 Again because the word order can be different, we turn both the output and the gold-standard into a bag of word/category pairs, and calculate the percentage of matched pairs as the lexical category precision. [sent-179, score-0.224]
72 (2009) used base NPs from Penn Treebank annotation, while we extract base NPs from CCGbank by taking as base NPs the NPs that do not recursively contain other NPs. [sent-183, score-0.24]
73 The first (“dictionary”) is to assign all possible lexical categories to each input word from the dictionary. [sent-191, score-0.212]
74 The lexical category dictionary is built using the training data. (Table: Method, Timeout, BLEU, length ratio.) [sent-192, score-0.206]
75 In practice, it is often unnecessary to leave lexical category disambiguation completely to the grammaticality improvement system. [sent-212, score-0.353]
76 When it is reasonable to assume that the input sentence for the grammaticality improvement system is sufficiently fluent, a list of candidate lexical categories can be assigned automatically to each word via supertagging (Clark and Curran, 2007) on the input sequence. [sent-213, score-0.553]
77 We use the C&C supertagger to assign a set of probable lexical categories to each input word using the gold-standard order. [sent-214, score-0.212]
78 When the input is noisy, the accuracy of a supertagger tends to be lower than when the input is grammatical. [sent-215, score-0.359]
79 One way to address this problem is to allow the supertagger to produce a larger list of possible supertags for each input word, and leave the ambiguity to the grammaticality improvement system. [sent-216, score-0.299]
80 4 lexical categories for each input word in the development test (which is smaller than the dictionary case). [sent-231, score-0.301]
81 The table shows that the BLEU score of the grammaticality improvement system is higher when a supertagger is used, and the higher the β value, the better the BLEU score. [sent-236, score-0.236]
82 In practice, the β value should be set in accordance with the lack of grammaticality and fluency in the input. [sent-237, score-0.3]
83 When the timeout value increases, the BLEU score generally increases. [sent-243, score-0.213]
84 The main effect of a larger timeout is the increased possibility of a complete sentence being found. [sent-244, score-0.213]
85 When the timeout increases from 0.5s to 50s using the dictionary method, for example, the average output sentence length increases from 84% of the input length to 93%. [sent-246, score-0.237]
86 Table 3 shows the lexical category accuracies using the dictionary, and supertagger with different β levels. [sent-247, score-0.266]
87 The best lexical category accuracy of 77% is achieved when using a supertagger with a β level of 0.075. [sent-258, score-0.266]
88 075 for the majority of sentences, the accuracy of our grammaticality improvement system is much lower. [sent-261, score-0.236]
89 Second, the search algorithm often fails to find a goal hypothesis before timeout, and a default output that is less grammatical than a complete constituent is constructed. [sent-272, score-0.397]
90 075) for was a nonexecutive director of Rudolph Agnew and former chairman of Consolidated Gold Fields PLC this British industrial conglomerate 55 years old . [sent-287, score-0.677]
91 McDermott International Inc said its Babcock & Wilcox unit completed the sale of its Bailey Controls Operations for Finmeccanica S . [sent-294, score-0.673]
92 Note that through the use of a supertagger, we are no longer assuming that the input is a bag of words without any order, and therefore only the dictionary results are directly comparable with Wan et al. [sent-328, score-0.258]
93 Moreover, CCG allows us to reduce the ambiguity level of the search algorithm through the assignment of possible lexical categories to input words, which is useful when the input has a basic degree of fluency, as is often the case in a grammaticality improvement task. [sent-339, score-0.642]
94 1155 The use of perceptron learning to improve search has been proposed in guided learning for easy-first search (Shen et al. [sent-352, score-0.366]
95 In particular, LaSO updates parameters when all correct hypotheses are lost, but our algorithm makes an update as soon as the top item from the agenda is incorrect. [sent-356, score-0.381]
96 Given an incorrect hypothesis, LaSO finds the corresponding gold hypothesis for perceptron update by constructing its correct sibling. [sent-358, score-0.295]
97 In contrast, our algorithm takes the lowest scored gold hypothesis currently in the agenda to avoid updating parameters for hypotheses that may not have been constructed. [sent-359, score-0.455]
98 (2007), which maintains a queue of hypotheses during search, and performs learning to ensure that the highest scored hypothesis in the queue is correct. [sent-361, score-0.287]
99 8 Conclusion We proposed a grammaticality improvement system using CCG, and evaluated it using a standard input word ordering task. [sent-367, score-0.387]
100 Improving grammaticality in statistical sentence generation: Introducing a dependency spanning tree algorithm with an argument satisfaction model. [sent-472, score-0.236]
wordName wordTfidf (topN-words)
[('wan', 0.274), ('ccg', 0.253), ('grammaticality', 0.236), ('edge', 0.23), ('edges', 0.226), ('timeout', 0.213), ('agenda', 0.199), ('chart', 0.156), ('supertagger', 0.149), ('append', 0.133), ('nps', 0.123), ('ccgbank', 0.11), ('guided', 0.107), ('laso', 0.106), ('input', 0.105), ('unary', 0.104), ('bleu', 0.091), ('dictionary', 0.089), ('search', 0.089), ('cancombine', 0.085), ('mingold', 0.085), ('perceptron', 0.081), ('constituent', 0.08), ('base', 0.08), ('clark', 0.077), ('hypothesis', 0.076), ('fluent', 0.075), ('category', 0.074), ('rat', 0.072), ('popped', 0.072), ('update', 0.069), ('gold', 0.069), ('hockenmaier', 0.068), ('loop', 0.068), ('blackwood', 0.066), ('expanded', 0.064), ('fluency', 0.064), ('bag', 0.064), ('agnew', 0.064), ('asbe', 0.064), ('babcock', 0.064), ('bai', 0.064), ('brit', 0.064), ('ccani', 0.064), ('chai', 0.064), ('complet', 0.064), ('congl', 0.064), ('cox', 0.064), ('forme', 0.064), ('indust', 0.064), ('inme', 0.064), ('lds', 0.064), ('lph', 0.064), ('mcde', 0.064), ('nonexe', 0.064), ('ons', 0.064), ('ope', 0.064), ('plc', 0.064), ('rial', 0.064), ('rman', 0.064), ('rmott', 0.064), ('rnat', 0.064), ('rudo', 0.064), ('categories', 0.064), ('shen', 0.064), ('goal', 0.064), ('grammar', 0.062), ('cont', 0.062), ('hypotheses', 0.059), ('continue', 0.057), ('white', 0.055), ('ale', 0.055), ('ome', 0.055), ('onal', 0.055), ('popbest', 0.055), ('int', 0.055), ('updates', 0.054), ('scored', 0.052), ('curran', 0.05), ('queue', 0.05), ('ld', 0.048), ('precondition', 0.046), ('thi', 0.046), ('cons', 0.046), ('yue', 0.046), ('ordering', 0.046), ('grammatical', 0.045), ('surface', 0.045), ('steedman', 0.044), ('smt', 0.043), ('output', 0.043), ('inc', 0.043), ('ley', 0.043), ('lexical', 0.043), ('daum', 0.043), ('bitvector', 0.043), ('dat', 0.043), ('espinosa', 0.043), ('goaltest', 0.043), ('hypertagging', 0.043)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999997 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
Author: Yue Zhang ; Stephen Clark
Abstract: Machine-produced text often lacks grammaticality and fluency. This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. In a standard word ordering task, our system gives a BLEU score of 40.1, higher than the previous result of 33.7 achieved by a dependency-based system.
2 0.18695882 121 emnlp-2011-Semi-supervised CCG Lexicon Extension
Author: Emily Thomforde ; Mark Steedman
Abstract: This paper introduces Chart Inference (CI), an algorithm for deriving a CCG category for an unknown word from a partial parse chart. It is shown to be faster and more precise than a baseline brute-force method, and to achieve wider coverage than a rule-based system. In addition, we show the application of CI to a domain adaptation task for question words, which are largely missing in the Penn Treebank. When used in combination with self-training, CI increases the precision of the baseline StatCCG parser over subject-extraction questions by 50%. An error analysis shows that CI contributes to the increase by expanding the number of category types available to the parser, while self-training adjusts the counts.
3 0.12278558 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Author: Tom Kwiatkowski ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely related lexical content. In this paper, we introduce factored lexicons, which include both lexemes to model word meaning and templates to model systematic variation in word usage. We also present an algorithm for learning factored CCG lexicons, along with a probabilistic parse-selection model. Evaluations on benchmark datasets demonstrate that the approach learns highly accurate parsers, whose generalization performance benefits greatly from the lexical factoring.
4 0.12056709 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and Urdu-English translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
5 0.089853957 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing
Author: Prodromos Malakasiotis ; Ion Androutsopoulos
Abstract: We present a method that paraphrases a given sentence by first generating candidate paraphrases and then ranking (or classifying) them. The candidates are generated by applying existing paraphrasing rules extracted from parallel corpora. The ranking component considers not only the overall quality of the rules that produced each candidate, but also the extent to which they preserve grammaticality and meaning in the particular context of the input sentence, as well as the degree to which the candidate differs from the input. We experimented with both a Maximum Entropy classifier and an SVR ranker. Experimental results show that incorporating features from an existing paraphrase recognizer in the ranking component improves performance, and that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the input sentences. We also propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. The paper is accompanied by a paraphrasing dataset we constructed for evaluations of this kind.
6 0.086751871 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
7 0.084923185 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
8 0.083059765 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
9 0.080994509 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
10 0.080819383 125 emnlp-2011-Statistical Machine Translation with Local Language Models
11 0.079597227 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
12 0.077926233 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
13 0.077781469 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
14 0.073272809 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
15 0.073208682 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags
16 0.072733179 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser
17 0.072054587 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
18 0.07025519 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing
19 0.068004891 135 emnlp-2011-Timeline Generation through Evolutionary Trans-Temporal Summarization
20 0.065241642 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
topicId topicWeight
[(0, 0.239), (1, 0.074), (2, -0.001), (3, -0.002), (4, 0.032), (5, -0.067), (6, -0.111), (7, -0.013), (8, 0.138), (9, 0.005), (10, -0.094), (11, 0.031), (12, -0.141), (13, -0.05), (14, 0.097), (15, 0.025), (16, -0.009), (17, -0.093), (18, 0.114), (19, -0.043), (20, -0.085), (21, -0.021), (22, 0.117), (23, -0.112), (24, -0.148), (25, -0.076), (26, -0.046), (27, -0.062), (28, -0.23), (29, 0.042), (30, 0.122), (31, 0.111), (32, 0.076), (33, 0.253), (34, 0.038), (35, -0.108), (36, -0.077), (37, 0.121), (38, 0.108), (39, -0.022), (40, -0.121), (41, 0.051), (42, 0.073), (43, 0.029), (44, 0.058), (45, 0.034), (46, -0.071), (47, -0.021), (48, 0.088), (49, 0.084)]
simIndex simValue paperId paperTitle
same-paper 1 0.94575226 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
Author: Yue Zhang ; Stephen Clark
Abstract: Machine-produced text often lacks grammaticality and fluency. This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. In a standard word ordering task, our system gives a BLEU score of 40.1, higher than the previous result of 33.7 achieved by a dependency-based system.
2 0.8610782 121 emnlp-2011-Semi-supervised CCG Lexicon Extension
Author: Emily Thomforde ; Mark Steedman
Abstract: This paper introduces Chart Inference (CI), an algorithm for deriving a CCG category for an unknown word from a partial parse chart. It is shown to be faster and more precise than a baseline brute-force method, and to achieve wider coverage than a rule-based system. In addition, we show the application of CI to a domain adaptation task for question words, which are largely missing in the Penn Treebank. When used in combination with self-training, CI increases the precision of the baseline StatCCG parser over subject-extraction questions by 50%. An error analysis shows that CI contributes to the increase by expanding the number of category types available to the parser, while self-training adjusts the counts.
3 0.4881008 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Author: Tom Kwiatkowski ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely related lexical content. In this paper, we introduce factored lexicons, which include both lexemes to model word meaning and templates to model systematic variation in word usage. We also present an algorithm for learning factored CCG lexicons, along with a probabilistic parse-selection model. Evaluations on benchmark datasets demonstrate that the approach learns highly accurate parsers, whose generalization performance benefits greatly from the lexical factoring.
4 0.48067868 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
Author: Andrea Gesmundo ; James Henderson
Abstract: State of the art Tree Structures Prediction techniques rely on bottom-up decoding. These approaches allow the use of context-free features and bottom-up features. We discuss the limitations of mainstream techniques in solving common Natural Language Processing tasks. Then we devise a new framework that goes beyond Bottom-up Decoding, and that allows a better integration of contextual features. Furthermore we design a system that addresses these issues and we test it on Hierarchical Machine Translation, a well known tree structure prediction problem. The structure of the proposed system allows the incorporation of non-bottom-up features and relies on a more sophisticated decoding approach. We show that the proposed approach can find better translations using a smaller portion of the search space.
5 0.38120359 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
Author: Emily M. Bender ; Dan Flickinger ; Stephan Oepen ; Yi Zhang
Abstract: In order to obtain a fine-grained evaluation of parser accuracy over naturally occurring text, we study 100 examples each of ten reasonably frequent linguistic phenomena, randomly selected from a parsed version of the English Wikipedia. We construct a corresponding set of gold-standard target dependencies for these 1000 sentences, operationalize mappings to these targets from seven state-of-theart parsers, and evaluate the parsers against this data to measure their level of success in identifying these dependencies.
6 0.36326039 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
7 0.34727252 100 emnlp-2011-Optimal Search for Minimum Error Rate Training
8 0.31812873 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
9 0.29994589 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP
10 0.29306936 135 emnlp-2011-Timeline Generation through Evolutionary Trans-Temporal Summarization
11 0.28943455 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
12 0.28640413 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
13 0.28093415 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
14 0.27389213 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing
15 0.26879707 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
16 0.26355237 5 emnlp-2011-A Fast Re-scoring Strategy to Capture Long-Distance Dependencies
17 0.26190767 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags
18 0.26109874 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
19 0.25341141 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
20 0.25325534 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
topicId topicWeight
[(12, 0.316), (23, 0.097), (36, 0.037), (37, 0.031), (45, 0.069), (53, 0.035), (54, 0.033), (57, 0.018), (62, 0.018), (64, 0.014), (65, 0.012), (66, 0.029), (69, 0.024), (79, 0.092), (82, 0.016), (87, 0.011), (90, 0.015), (96, 0.063), (98, 0.015)]
simIndex simValue paperId paperTitle
same-paper 1 0.70250535 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
Author: Yue Zhang ; Stephen Clark
Abstract: Machine-produced text often lacks grammaticality and fluency. This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. In a standard word ordering task, our system gives a BLEU score of 40.1, higher than the previous result of 33.7 achieved by a dependency-based system.
2 0.67018408 138 emnlp-2011-Tuning as Ranking
Author: Mark Hopkins ; Jonathan May
Abstract: We offer a simple, effective, and scalable method for statistical machine translation parameter tuning based on the pairwise approach to ranking (Herbrich et al., 1999). Unlike the popular MERT algorithm (Och, 2003), our pairwise ranking optimization (PRO) method is not limited to a handful of parameters and can easily handle systems with thousands of features. Moreover, unlike recent approaches built upon the MIRA algorithm of Crammer and Singer (2003) (Watanabe et al., 2007; Chiang et al., 2008b), PRO is easy to implement. It uses off-the-shelf linear binary classifier software and can be built on top of an existing MERT framework in a matter of hours. We establish PRO’s scalability and effectiveness by comparing it to MERT and MIRA and demonstrate parity on both phrase-based and syntax-based systems in a variety of language pairs, using large scale data scenarios.
3 0.46041718 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Author: Tom Kwiatkowski ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely related lexical content. In this paper, we introduce factored lexicons, which include both lexemes to model word meaning and templates to model systematic variation in word usage. We also present an algorithm for learning factored CCG lexicons, along with a probabilistic parse-selection model. Evaluations on benchmark datasets demonstrate that the approach learns highly accurate parsers, whose generalization performance benefits greatly from the lexical factoring.
4 0.4565922 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
Author: Amit Dubey ; Frank Keller ; Patrick Sturt
Abstract: This paper introduces a psycholinguistic model of sentence processing which combines a Hidden Markov Model noun phrase chunker with a co-reference classifier. Both models are fully incremental and generative, giving probabilities of lexical elements conditional upon linguistic structure. This allows us to compute the information theoretic measure of surprisal, which is known to correlate with human processing effort. We evaluate our surprisal predictions on the Dundee corpus of eye-movement data and show that our model achieves a better fit with human reading times than a syntax-only model which does not have access to co-reference information.
5 0.45209116 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
Author: Chang Liu ; Daniel Dahlmeier ; Hwee Tou Ng
Abstract: Many machine translation evaluation metrics have been proposed after the seminal BLEU metric, and many among them have been found to consistently outperform BLEU, demonstrated by their better correlations with human judgment. It has long been the hope that by tuning machine translation systems against these new generation metrics, advances in automatic machine translation evaluation can lead directly to advances in automatic machine translation. However, to date there has been no unambiguous report that these new metrics can improve a state-of-the-art machine translation system over its BLEU-tuned baseline. In this paper, we demonstrate that tuning Joshua, a hierarchical phrase-based statistical machine translation system, with the TESLA metrics results in significantly better human-judged translation quality than the BLEU-tuned baseline. TESLA-M in particular is simple and performs well in practice on large datasets. We release all our implementation under an open source license. It is our hope that this work will encourage the machine translation community to finally move away from BLEU as the unquestioned default and to consider the new generation metrics when tuning their systems.
6 0.45016727 136 emnlp-2011-Training a Parser for Machine Translation Reordering
7 0.45014626 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
8 0.45003003 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases
9 0.44943574 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
10 0.44839472 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference
11 0.44801867 34 emnlp-2011-Corpus-Guided Sentence Generation of Natural Images
12 0.44667652 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
13 0.44485822 36 emnlp-2011-Corroborating Text Evaluation Results with Heterogeneous Measures
14 0.44467556 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
15 0.44354913 128 emnlp-2011-Structured Relation Discovery using Generative Models
16 0.44190955 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding
17 0.44181302 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
18 0.44138047 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
19 0.44037202 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
20 0.44009206 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation