emnlp emnlp2011 emnlp2011-32 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nikhil Dinesh ; Aravind Joshi ; Insup Lee
Abstract: The computation of logical form has been proposed as an intermediate step in the translation of sentences to logic. Logical form encodes the resolution of scope ambiguities. In this paper, we describe experiments on a modest-sized corpus of regulation annotated with a novel variant of logical form, called abstract syntax trees (ASTs). The main step in computing ASTs is to order scope-taking operators. A learning model for ranking is adapted for this ordering. We design features by studying the problem of comparing the scope of one operator to another. The scope comparisons are used to compute ASTs, with an F-score of 90.6% on the set of ordering decisions.
Reference: text
sentIndex sentText sentNum sentScore
1 We design features by studying the problem of comparing the scope of one operator to another. [sent-8, score-0.424]
2 The scope comparisons are used to compute ASTs, with an F-score of 90. [sent-9, score-0.232]
3 Just as a parse tree determines the constituent structure of a sentence, a logical form of a sentence represents one way of resolving scope ambiguities. [sent-12, score-0.332]
4 The level of logical form is an appealing layer of modularity; it allows us to take a step beyond parsing in studying scope phenomena, and yet, avoid the open problem of fully translating sentences to logic. [sent-13, score-0.358]
5 Data-driven analyses of scope have been of interest in psycholinguistics (Kurtzman and MacDonald, 1993) and more recently in NLP (Srinivasan and Yates, 2009). [sent-14, score-0.207]
6 In the related problem of translating database queries to logic, Zettlemoyer and Collins (2009) and Wong and Mooney (2007) consider the scope of adjectives in addition to determiners, for example the scope of “cheapest” in the noun phrase “the cheapest flights from Boston to New York”. [sent-19, score-0.464]
7 To our knowledge, empirical studies of scope have been restricted to phenomena between and within noun phrases. [sent-20, score-0.207]
8 In this paper, we describe experiments on a novel annotation of scope phenomena in regulatory texts: Section 610 of the Food and Drug Administration’s Code of Federal Regulations (FDA CFR). [sent-21, score-0.305]
9 1 Problems and Assumptions A key assumption of logical form is that the translation from language to logic is syntax-based. [sent-38, score-0.226]
10 The pre- and postconditions are expressed in a modal logic that we designed in prior work (Dinesh et al. [sent-59, score-0.239]
11 In describing the logical form, we will sketch how the logical form can be mapped to logic. [sent-61, score-0.25]
12 Given the assumptions about the logic, our goal is to transform a regulatory sentence into a structure that lets us determine: (I) the constituents of a sentence that contribute to the pre/postcondition, and (II) the scope of operators in the pre/postcondition. [sent-63, score-0.714]
13 The structures that we use are called abstract syntax trees (ASTs), which can be understood as a restricted kind of logical form for regulatory texts. [sent-64, score-0.249]
14 We design features by studying the problem of comparing the scope of one operator to another. [sent-69, score-0.424]
15 The pairwise scope comparisons are then used to compute ASTs, with an F-score of 90. [sent-70, score-0.232]
16 We then describe the corpus using statistics about operators in Section 4. [sent-74, score-0.409]
17 In Section 5, we describe experiments on comparing the scope of an operator to another. [sent-75, score-0.398]
18 We use the pairwise scope comparisons in Section 6 to compute the AST. [sent-76, score-0.207]
19 11: (1) A general safety test for the detection of extraneous toxic contaminants shall be performed on biological products intended for administration to humans. [sent-79, score-0.208]
20 We use the term nuclear scope to refer to the last (nth) argument of the operator, and the term restrictor to refer to any other argument. [sent-104, score-0.359]
21 We borrow these terms from the literature on quantifier scope for determiners (Heim and Kratzer, 1998, Chapter 7). [sent-105, score-0.413]
22 For example, the phrase “general safety test” is in the restrictor of the operator A, and the variable is in its nuclear scope. [sent-106, score-0.422]
23 Non-unary operators bind the variable displayed on the internal node. [sent-108, score-0.409]
24 Implicit operators are inserted when there is no overt word or phrase. [sent-110, score-0.409]
25 Similarly, we use the implicit operator Post to mark the position of the postcondition. [sent-113, score-0.274]
26 Given an AST for a sentence, we say that an operator oi scopes over oj, denoted oi ≫ oj, if oj appears in the nuclear scope of oi. [sent-116, score-0.602]
27 In addition, we say that the restrictor of oi scopes over oj, denoted R(oi) ≫ oj, if oj appears in the restrictor of oi. [sent-118, score-0.246]
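These two relations lend themselves to a direct recursive check over the tree. Below is a minimal sketch in Python; the Node class, its attribute names, and the helper functions are assumptions for illustration, not the authors' implementation.

    # Hypothetical AST node: the restrictors are all arguments but the last;
    # the nuclear scope is the last (nth) argument.
    class Node:
        def __init__(self, op, restrictors=(), nuclear=None):
            self.op = op                         # operator label, e.g. "A", "shall", "Post"
            self.restrictors = list(restrictors)
            self.nuclear = nuclear

    def contains(node, label):
        # True if an operator with this label occurs in the subtree.
        if node is None:
            return False
        if node.op == label:
            return True
        return (any(contains(r, label) for r in node.restrictors)
                or contains(node.nuclear, label))

    def scopes_over(oi, oj):
        # oi >> oj: oj appears in the nuclear scope of oi.
        return contains(oi.nuclear, oj.op)

    def restrictor_scopes_over(oi, oj):
        # R(oi) >> oj: oj appears in the restrictor of oi.
        return any(contains(r, oj.op) for r in oi.restrictors)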
28 We will assume as given a Processed Parse Tree (PPT) of a sentence, with the operators and their restrictors identified. [sent-121, score-0.447]
29 Given such a PPT, the AST is computed in two steps: (1) finding the preterminal at which an operator takes scope, and (2) ordering the operators associated with a preterminal. [sent-124, score-0.903]
30 The steps are described in reverse order, because in most cases, the operators associated with a preterminal are determined directly by syntactic attachment. [sent-128, score-0.624]
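As a rough sketch of this two-step organization (the ppt.preterminals and p.operators attributes and both helper functions are hypothetical names, not the paper's code):

    def attach_operators(p):
        # Step 1 default: an operator takes scope at its syntactic
        # attachment site; embedded operators may later be raised.
        return list(p.operators)

    def rank_operators(ops):
        # Step 2 placeholder: keep the surface order; a learned ranking
        # model would reorder here.
        return list(ops)

    def compute_ast(ppt):
        return {p: rank_operators(attach_operators(p)) for p in ppt.preterminals}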
31 A PPT provides the set of operators in a sentence, associated with their restrictors. [sent-138, score-0.409]
32 For example, the determiner “a” has the restrictor “general safety test”. [sent-139, score-0.278]
33 The phrase biological products has no explicit determiner associated with it, and the corresponding operator in the PPT is labeled “IMP” for implicit. [sent-140, score-0.405]
34 Except for the postcondition marker, annotator-specified implicit operators are not given in the PPT. [sent-142, score-0.59]
35 There are two main types of nodes in the PPT – operators and preterminals. [sent-143, score-0.409]
36 While this is true of example (1), embedded operators arise, for example, in the context of PP-modification of NPs and relative clauses. [sent-149, score-0.521]
37 In this work, the PPTs are obtained by removing all scope decisions from the AST. [sent-152, score-0.207]
38 To a first approximation, we start by removing all operators from the AST, and then, replace the corresponding variables by the operators. [sent-153, score-0.409]
39 Implicit unary operators (such as the postcondition marker) are placed at the start of the preterminal. [sent-154, score-0.507]
40 For example, we need to determine that the implicit determiner associated with biological products is universal, and hence, we have IMP ≫ Post. [sent-169, score-0.297]
41 A PPT τ is viewed as a set of preterminal nodes, and we will write (a) p ∈ τ to denote that p occurs in τ, and (b) |τ| to denote the number of preterminals in τ. [sent-172, score-0.274]
A preterminal p is viewed as an ordered set of operators p = (o1, . . . , o|p|). [sent-173, score-0.624]
43 For example, in Figure 2, the root preterminal p has |p| = 5, and the operators o1 = Post, o2 = A, o3 = shall, and so on. [sent-177, score-0.624]
44 An AST α contains a ranking of operators associated with each preterminal, denoted rα (p). [sent-178, score-0.436]
Let p = (o1, . . . , o5) be the root preterminal of the PPT in Figure 2. [sent-183, score-0.215]
For example, o^5_2 = A denotes that the determiner “A” appears second in the surface order (Figure 2) and fifth or lowest in the scope order (Figure 1). [sent-185, score-0.23]
47 Similarly, o^1_5 = IMP denotes that the implicit determiner appears fifth or last in the surface order (Figure 2) and first or highest in the scope order (Figure 1). [sent-186, score-0.398]
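In this notation the superscript records the scope rank and the subscript the surface position. A tiny worked check, using only the operators named in the text (the fourth surface position is unknown here and left as a placeholder):

    surface = ["Post", "A", "shall", "?", "IMP"]   # surface positions 1..5
    scope_rank = {"A": 5, "IMP": 1}                # 1 = highest scope, 5 = lowest
    # o^5_2 = A: second in surface order, fifth (lowest) in scope order
    assert surface[2 - 1] == "A" and scope_rank["A"] == 5
    # o^1_5 = IMP: fifth (last) in surface order, first (highest) in scope order
    assert surface[5 - 1] == "IMP" and scope_rank["IMP"] == 1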
48 , an operator or its variable located in the restrictor of another. [sent-197, score-0.305]
49 An embedded operator can either (a) take scope within the restrictor of the embedding operator, or (b) outscope the embedding operator. [sent-198, score-0.686]
50 To account for the second case, we need to determine whether it is appropriate to lift an embedded operator to a higher preterminal than the one to which it is associated syntactically. [sent-199, score-0.518]
51 The implicit determiner IMP in the PPT is interpreted as the existential determiner some in the AST. [sent-205, score-0.322]
52 The three operators are related as follows in the AST: any ≫ some and R(any) ≫ a, i. [sent-206, score-0.409]
e., any outscopes the implicit determiner, and a appears in the restrictor of any. [sent-208, score-0.248]
54 The important feature of this example is that the determiner “any” is syntactically embedded in the restrictor of IMP in the PPT (Figure 4), but it outscopes the implicit determiner in the AST (Figure 3). [sent-212, score-0.53]
55 As a result, the PPT in Figure 4 cannot be converted to the AST in Figure 3 simply by ranking sibling operators (as we did in the previous section). [sent-213, score-0.436]
56 The only allowed operation during this conversion is to raise an embedded operator to a higher preterminal. [sent-215, score-0.329]
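A sketch of that raise operation over hypothetical preterminal objects (illustrative only; the paper does not spell out its data structures):

    class Preterminal:
        def __init__(self, operators):
            self.operators = list(operators)

    def raise_operator(op, source, target):
        # Lift an embedded operator from the preterminal it is syntactically
        # associated with to a higher preterminal, where it is then ranked
        # against its new sibling operators.
        source.operators.remove(op)
        target.operators.append(op)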
57 The operators are divided into the following types: determiners (e. [sent-220, score-0.552]
58 De Re vs De Dicto: We narrow our focus to one part of the annotation, the de re vs de dicto distinction. [sent-233, score-0.394]
59 Informally, operators with de re scope occur in the precondition of the logical translation of a sentence, while those with de dicto scope occur in the postcondition. [sent-234, score-1.342]
60 For simplicity, we further restrict attention to operators that are siblings of the postcondition in the AST, and ignore the operators embedded in prepositional phrases and clauses, for example. [sent-236, score-1.059]
61 A (main clause) operator o is said to have de re scope iff it outscopes the postcondition marker (o ≫ Post). [sent-237, score-0.76]
62 Otherwise, the operator is said to have de dicto scope (Post ≫ o). [sent-238, score-0.423]
In example (1), the implicit determiner associated with “biological products” has de re scope, while all other operators in the sentence have de dicto scope. [sent-240, score-0.886]
64 An operator has de re scope iff it outscopes the postcondition marker. [sent-243, score-0.711]
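Given a scope ranking of a preterminal's operators from widest to narrowest, the distinction reduces to a position check against Post. A minimal sketch (the relative order of the de dicto operators below is illustrative):

    def scope_type(o, ranking):
        # de re iff o outscopes the postcondition marker: o >> Post.
        return "de re" if ranking.index(o) < ranking.index("Post") else "de dicto"

    ranking = ["IMP", "Post", "A", "shall"]   # IMP >> Post; A and shall do not outscope Post
    assert scope_type("IMP", ranking) == "de re"
    assert scope_type("A", ranking) == "de dicto"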
65 Table 1 shows the percentage of each type of operator that has de re scope. [sent-244, score-0.332]
Modal auxiliaries and negation are unambiguous with respect to this distinction, and always have de dicto scope. [sent-245, score-0.303]
Table 2: De Re scope distribution for determiners. [sent-250, score-0.207]
The guidelines for annotation were as follows: (a) universal determiners have de re scope, (b) existential determiners have de dicto scope, and (c) for other determiners, the annotator needs to decide whether a particular use is interpreted existentially or universally. [sent-260, score-0.808]
69 Table 2 shows the de re scope distribution for each of these subtypes. [sent-261, score-0.348]
70 As expected, universal and existential determiners are unambiguous, while ambiguous and deictic determiners show more variety. [sent-262, score-0.44]
71 Thus, to disambiguate between de re and de dicto interpretations for determiners, we need two types of features: (1) features to predict whether ambiguous and deictic determiners are universal or not, and (2) features to determine the type of implicit determiners. [sent-264, score-0.705]
72 In Table 2, we assume that the types of implicit determiners are given. [sent-265, score-0.226]
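One plausible encoding of these two feature groups for a classifier; every feature name here is an assumption for illustration, not a feature from the paper:

    def determiner_features(det, head_noun, is_implicit):
        # Group 1: lexical cues for whether an ambiguous/deictic determiner
        # is read universally; Group 2: cues for typing implicit determiners.
        return {
            "det=" + det.lower(): 1.0,
            "head=" + head_noun.lower(): 1.0,
            "implicit": 1.0 if is_implicit else 0.0,
        }

    print(determiner_features("any", "contaminants", False))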
Table 3: De Re scope distribution for VP modifiers. [sent-270, score-0.207]
Following the guidelines for annotation, the temporal and conditional modifiers are always de re, while the purpose modifiers and modifiers conveying references to laws are always de dicto. [sent-279, score-0.336]
5 Comparing the Scope of Operators We now consider a subproblem in computing the AST: comparing the scope of pairs of operators. [sent-280, score-0.207]
76 We begin by revisiting the de re-de dicto distinction from Section 4. [sent-283, score-0.29]
77 Our observations are triples x = (o, o′, τ) such that there is a preterminal p ∈ τ, {o, o′} ⊆ p, and o′ = Post. [sent-286, score-0.215]
78 In other words, we are considering operators (o) that are siblings of the postcondition marker (o′). [sent-287, score-0.587]
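Constructing these observations from a PPT could look like the following sketch (attribute names hypothetical):

    def make_observations(ppt):
        # One observation (o, Post, ppt) per operator that is a sibling
        # of the postcondition marker at some preterminal.
        obs = []
        for p in ppt.preterminals:
            if "Post" in p.operators:
                obs.extend((o, "Post", ppt) for o in p.operators if o != "Post")
        return obs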
79 The verb perform is frequent in the CFR, and its subject is typically given de dicto scope, as it is the main predicate of the sentence. [sent-293, score-0.253]
80 The majority class is de dicto when all operators are considered (the first row), and de re in all other rows. [sent-313, score-0.803]
81 From Table 4, we can conclude that the TYPE feature is useful in making the de re-de dicto distinction, and further gains are obtained by using ALL features. [sent-317, score-0.253]
In other words, the problem is to determine whether R(o) ≫ o′, i.e., whether a syntactically embedded operator remains scopally embedded, or whether it has inverse scope (see Section 3. [sent-323, score-0.51]
83 In most cases, the number of operators per preterminal is less than 7. [sent-350, score-0.624]
84 We enumerate the operators in the initial PPT, corresponding to an in-order traversal. [sent-355, score-0.33]
85 Note that τ, α and α∗ share the same set of preterminals, but may associate different operators with them. [sent-368, score-0.409]
86 We say that p is correct in α∗, if it is associated with the same set of operators as in α, and for all {o, o′} ⊆ p, we have o ≫ o′ w. [sent-369, score-0.409]
In other words, the preterminals are identical, both in terms of the set of operators and the ordering between pairs of operators. [sent-376, score-0.556]
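For a total order over a preterminal's operators, this notion of correctness amounts to exact sequence agreement; a sketch:

    def preterminal_correct(gold_order, predicted_order):
        # Same operator set and same ordering between every pair of
        # operators; for total orders this is sequence equality.
        return list(gold_order) == list(predicted_order)

    def preterminal_accuracy(gold, predicted):
        # gold/predicted: aligned lists of per-preterminal operator orders.
        hits = sum(preterminal_correct(g, s) for g, s in zip(gold, predicted))
        return hits / len(gold)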
88 While preterminal-level accuracy gives partial credit, it is still a little harsh, in the sense that an algorithm which makes one ordering mistake at a preterminal is penalized the same as an algorithm which makes multiple mistakes. [sent-377, score-0.303]
89 The set Pairs(p, α) consists of pairs of operators (o, o′) such that o and o′ are both associated with p in α, and o = o′ or o ≫ o′. [sent-380, score-0.409]
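Under this definition, pairwise precision, recall, and F-score fall out directly from the two orderings; a sketch assuming each preterminal's operators are listed from widest to narrowest scope:

    def pairs(ordering):
        # All ordered pairs (o, o') with o scoping over o'.
        return {(ordering[i], ordering[j])
                for i in range(len(ordering))
                for j in range(i + 1, len(ordering))}

    def pairwise_f1(gold, predicted):
        g, s = pairs(gold), pairs(predicted)
        if not g or not s:
            return 0.0
        prec, rec = len(g & s) / len(s), len(g & s) / len(g)
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0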
90 No Embedding – The AST is computed purely by reordering operators within a preterminal in the PPT. [sent-386, score-0.624]
(a) the order of operators in the AST respects the surface order, (b) TYPE – using only type and subtype information for the operators, and (c) ALL – using all the features described in Section 5. [sent-389, score-0.885]
92 As we saw in Section 5, in 95% of the cases, the embedded operators respect syntactic scope, and as a result, we obtain only modest gains from handling embedded operators. [sent-419, score-0.633]
93 The algorithm that handles embedded operators (ALL+) usually raises them from a single operator node (as in Figure 3) to a multi-operator node (as in Figure 5). [sent-424, score-0.712]
94 If it makes an incorrect decision to raise an operator, it takes a precision hit at the multi-operator node (because it has some false positives). [sent-425, score-0.217]
95 Furthermore, when we restrict attention to those preterminals with two or more operators in the PPT, the accuracy of ALL+ is 69. [sent-431, score-0.468]
96 8 Conclusions We described experiments on a modest-sized corpus of regulatory sentences, annotated with a novel variant of logical form, called abstract syntax trees (ASTs). [sent-461, score-0.249]
97 The main step in this conversion was to rank or order the operators at a preterminal. [sent-464, score-0.409]
98 More recently, Srinivasan and Yates (2009) showed how pragmatic information, for example “there are more people than cities”, can be leveraged for scope disambiguation. [sent-471, score-0.237]
99 Expanding the scope of the ATIS task: the ATIS-3 corpus. [sent-534, score-0.207]
100 Quantifier scope disambiguation using extracted pragmatic knowledge: Preliminary results. [sent-689, score-0.237]
wordName wordTfidf (topN-words)
[('ppt', 0.455), ('operators', 0.409), ('ast', 0.327), ('asts', 0.215), ('preterminal', 0.215), ('scope', 0.207), ('operator', 0.191), ('dicto', 0.177), ('determiners', 0.143), ('logical', 0.125), ('restrictor', 0.114), ('embedded', 0.112), ('logic', 0.101), ('postcondition', 0.098), ('regulatory', 0.098), ('ordering', 0.088), ('imp', 0.087), ('determiner', 0.085), ('implicit', 0.083), ('cfr', 0.079), ('safety', 0.079), ('biological', 0.077), ('oj', 0.076), ('dinesh', 0.076), ('de', 0.076), ('existential', 0.069), ('re', 0.065), ('conformance', 0.063), ('quantifier', 0.063), ('preterminals', 0.059), ('regulation', 0.054), ('modal', 0.054), ('products', 0.052), ('deictic', 0.051), ('fda', 0.051), ('outscopes', 0.051), ('ppts', 0.051), ('marker', 0.049), ('pre', 0.046), ('modifiers', 0.045), ('atis', 0.044), ('subtype', 0.044), ('subtypes', 0.044), ('security', 0.041), ('post', 0.04), ('geoquery', 0.039), ('governatori', 0.038), ('nuclear', 0.038), ('postconditions', 0.038), ('restrictors', 0.038), ('scopal', 0.038), ('sergot', 0.038), ('distinction', 0.037), ('zettlemoyer', 0.035), ('oi', 0.034), ('universal', 0.034), ('checking', 0.033), ('anaphoric', 0.033), ('srinivasan', 0.033), ('business', 0.031), ('embedding', 0.031), ('siblings', 0.031), ('pragmatic', 0.03), ('queries', 0.028), ('ranking', 0.027), ('negation', 0.027), ('vp', 0.027), ('studying', 0.026), ('bos', 0.026), ('raise', 0.026), ('syntax', 0.026), ('guidelines', 0.025), ('accomodate', 0.025), ('barth', 0.025), ('bio', 0.025), ('compilers', 0.025), ('dahl', 0.025), ('gensaf', 0.025), ('grosof', 0.025), ('heim', 0.025), ('kurtzman', 0.025), ('makinson', 0.025), ('manufacturer', 0.025), ('nstacgoep', 0.025), ('odi', 0.025), ('prod', 0.025), ('comparisons', 0.025), ('laws', 0.024), ('processed', 0.024), ('der', 0.023), ('iff', 0.023), ('auxiliaries', 0.023), ('surface', 0.023), ('boosting', 0.022), ('wong', 0.022), ('organization', 0.022), ('unprocessed', 0.022), ('cheapest', 0.022), ('harsh', 0.022), ('scopes', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999928 32 emnlp-2011-Computing Logical Form on Regulatory Texts
Author: Nikhil Dinesh ; Aravind Joshi ; Insup Lee
Abstract: The computation of logical form has been proposed as an intermediate step in the translation of sentences to logic. Logical form encodes the resolution of scope ambiguities. In this paper, we describe experiments on a modest-sized corpus of regulation annotated with a novel variant of logical form, called abstract syntax trees (ASTs). The main step in computing ASTs is to order scope-taking operators. A learning model for ranking is adapted for this ordering. We design features by studying the problem of comparing the scope of one operator to another. The scope comparisons are used to compute ASTs, with an F-score of 90.6% on the set of ordering decisions.
2 0.10023093 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
Author: Xinyan Xiao ; Yang Liu ; Qun Liu ; Shouxun Lin
Abstract: Although discriminative training guarantees to improve statistical machine translation by incorporating a large amount of overlapping features, it is hard to scale up to large data due to decoding complexity. We propose a new algorithm to generate translation forest of training data in linear time with the help of word alignment. Our algorithm also alleviates the oracle selection problem by ensuring that a forest always contains derivations that exactly yield the reference translation. With millions of features trained on 519K sentences in 0.03 second per sentence, our system achieves significant improvement by 0.84 BLEU over the baseline system on the NIST Chinese-English test sets.
Author: Wei Lu ; Hwee Tou Ng
Abstract: This paper describes a novel probabilistic approach for generating natural language sentences from their underlying semantics in the form of typed lambda calculus. The approach is built on top of a novel reduction-based weighted synchronous context free grammar formalism, which facilitates the transformation process from typed lambda calculus into natural language sentences. Sentences can then be generated based on such grammar rules with a log-linear model. To acquire such grammar rules automatically in an unsupervised manner, we also propose a novel approach with a generative model, which maps from sub-expressions of logical forms to word sequences in natural language sentences. Experiments on benchmark datasets for both English and Chinese generation tasks yield significant improvements over results obtained by two state-of-the-art machine translation models, in terms of both automatic metrics and human evaluation.
4 0.073869914 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Author: Tom Kwiatkowski ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely related lexical content. In this paper, we introduce factored lexicons, which include both lexemes to model word meaning and templates to model systematic variation in word usage. We also present an algorithm for learning factored CCG lexicons, along with a probabilistic parse-selection model. Evaluations on benchmark datasets demonstrate that the approach learns highly accurate parsers, whose generalization performance benefits greatly from the lexical factoring.
5 0.043635502 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
Author: Edward Grefenstette ; Mehrnoosh Sadrzadeh
Abstract: Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (2010) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model.
6 0.037681527 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions
7 0.037382282 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation
8 0.037259165 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation
9 0.036047403 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
10 0.035259251 24 emnlp-2011-Bootstrapping Semantic Parsers from Conversations
11 0.035159107 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser
12 0.033414304 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference
13 0.03178668 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
14 0.03088198 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
15 0.03041281 45 emnlp-2011-Dual Decomposition with Many Overlapping Components
16 0.030295989 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
17 0.030256197 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification
18 0.030192502 13 emnlp-2011-A Word Reordering Model for Improved Machine Translation
19 0.030068666 136 emnlp-2011-Training a Parser for Machine Translation Reordering
20 0.029160557 96 emnlp-2011-Multilayer Sequence Labeling
topicId topicWeight
[(0, 0.123), (1, 0.014), (2, -0.018), (3, 0.0), (4, 0.024), (5, -0.036), (6, -0.037), (7, -0.003), (8, 0.081), (9, -0.039), (10, -0.05), (11, 0.025), (12, -0.112), (13, -0.1), (14, -0.014), (15, -0.055), (16, 0.019), (17, -0.009), (18, 0.009), (19, 0.018), (20, -0.008), (21, 0.01), (22, -0.017), (23, 0.071), (24, -0.019), (25, -0.074), (26, 0.098), (27, 0.058), (28, 0.082), (29, -0.05), (30, -0.047), (31, -0.015), (32, -0.076), (33, -0.144), (34, 0.015), (35, -0.029), (36, 0.06), (37, 0.08), (38, -0.001), (39, 0.103), (40, 0.018), (41, -0.118), (42, -0.07), (43, 0.119), (44, -0.189), (45, 0.161), (46, -0.208), (47, 0.126), (48, -0.213), (49, -0.065)]
simIndex simValue paperId paperTitle
same-paper 1 0.94292736 32 emnlp-2011-Computing Logical Form on Regulatory Texts
Author: Nikhil Dinesh ; Aravind Joshi ; Insup Lee
Abstract: The computation of logical form has been proposed as an intermediate step in the translation of sentences to logic. Logical form encodes the resolution of scope ambiguities. In this paper, we describe experiments on a modest-sized corpus of regulation annotated with a novel variant of logical form, called abstract syntax trees (ASTs). The main step in computing ASTs is to order scope-taking operators. A learning model for ranking is adapted for this ordering. We design features by studying the problem of comparing the scope of one operator to another. The scope comparisons are used to compute ASTs, with an F-score of 90.6% on the set of ordering decisions.
2 0.44548321 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
Author: Xinyan Xiao ; Yang Liu ; Qun Liu ; Shouxun Lin
Abstract: Although discriminative training guarantees to improve statistical machine translation by incorporating a large amount of overlapping features, it is hard to scale up to large data due to decoding complexity. We propose a new algorithm to generate translation forest of training data in linear time with the help of word alignment. Our algorithm also alleviates the oracle selection problem by ensuring that a forest always contains derivations that exactly yield the reference translation. With millions of features trained on 519K sentences in 0.03 second per sentence, our system achieves significant improvement by 0.84 BLEU over the baseline system on the NIST Chinese-English test sets.
3 0.41430691 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions
Author: Wei Lu ; Hwee Tou Ng
Abstract: This paper describes a novel probabilistic approach for generating natural language sentences from their underlying semantics in the form of typed lambda calculus. The approach is built on top of a novel reduction-based weighted synchronous context free grammar formalism, which facilitates the transformation process from typed lambda calculus into natural language sentences. Sentences can then be generated based on such grammar rules with a log-linear model. To acquire such grammar rules automatically in an unsupervised manner, we also propose a novel approach with a generative model, which maps from sub-expressions of logical forms to word sequences in natural language sentences. Experiments on benchmark datasets for both English and Chinese generation tasks yield significant improvements over results obtained by two state-of-the-art machine translation models, in terms of both automatic metrics and human evaluation.
4 0.36266464 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Author: Tom Kwiatkowski ; Luke Zettlemoyer ; Sharon Goldwater ; Mark Steedman
Abstract: We consider the problem of learning factored probabilistic CCG grammars for semantic parsing from data containing sentences paired with logical-form meaning representations. Traditional CCG lexicons list lexical items that pair words and phrases with syntactic and semantic content. Such lexicons can be inefficient when words appear repeatedly with closely related lexical content. In this paper, we introduce factored lexicons, which include both lexemes to model word meaning and templates to model systematic variation in word usage. We also present an algorithm for learning factored CCG lexicons, along with a probabilistic parse-selection model. Evaluations on benchmark datasets demonstrate that the approach learns highly accurate parsers, whose generalization performance benefits greatly from the lexical factoring.
5 0.36162949 129 emnlp-2011-Structured Sparsity in Structured Prediction
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Linear models have enjoyed great success in structured prediction in NLP. While a lot of progress has been made on efficient training with several loss functions, the problem of endowing learners with a mechanism for feature selection is still unsolved. Common approaches employ ad hoc filtering or L1-regularization; both ignore the structure of the feature space, preventing practitioners from encoding structural prior knowledge. We fill this gap by adopting regularizers that promote structured sparsity, along with efficient algorithms to handle them. Experiments on three tasks (chunking, entity recognition, and dependency parsing) show gains in performance, compactness, and model interpretability.
6 0.28846878 34 emnlp-2011-Corpus-Guided Sentence Generation of Natural Images
7 0.27747723 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions
8 0.26820049 46 emnlp-2011-Efficient Subsampling for Training Complex Language Models
9 0.26099432 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use
10 0.25075182 27 emnlp-2011-Classifying Sentences as Speech Acts in Message Board Posts
11 0.24323954 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
12 0.22697647 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
13 0.21042705 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
14 0.19704702 3 emnlp-2011-A Correction Model for Word Alignments
15 0.19018193 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
16 0.18872176 45 emnlp-2011-Dual Decomposition with Many Overlapping Components
17 0.18842529 72 emnlp-2011-Improved Transliteration Mining Using Graph Reinforcement
18 0.18567882 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
19 0.18429412 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
20 0.17691605 110 emnlp-2011-Ranking Human and Machine Summarization Systems
topicId topicWeight
[(23, 0.068), (36, 0.023), (37, 0.018), (45, 0.048), (53, 0.015), (54, 0.039), (57, 0.012), (62, 0.491), (64, 0.025), (66, 0.018), (69, 0.017), (78, 0.014), (79, 0.038), (82, 0.019), (90, 0.021), (96, 0.022), (98, 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.86386138 32 emnlp-2011-Computing Logical Form on Regulatory Texts
Author: Nikhil Dinesh ; Aravind Joshi ; Insup Lee
Abstract: The computation of logical form has been proposed as an intermediate step in the translation of sentences to logic. Logical form encodes the resolution of scope ambiguities. In this paper, we describe experiments on a modestsized corpus of regulation annotated with a novel variant of logical form, called abstract syntax trees (ASTs). The main step in computing ASTs is to order scope-taking operators. A learning model for ranking is adapted for this ordering. We design features by studying the problem ofcomparing the scope ofone operator to another. The scope comparisons are used to compute ASTs, with an F-score of 90.6% on the set of ordering decisons.
2 0.85289901 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts
Author: Bryan Rink ; Sanda Harabagiu
Abstract: This paper presents a generative model for the automatic discovery of relations between entities in electronic medical records. The model discovers relation instances and their types by determining which context tokens express the relation. Additionally, the valid semantic classes for each type of relation are determined. We show that the model produces clusters of relation trigger words which better correspond with manually annotated relations than several existing clustering techniques. The discovered relations reveal some of the implicit semantic structure present in patient records.
3 0.75649536 100 emnlp-2011-Optimal Search for Minimum Error Rate Training
Author: Michel Galley ; Chris Quirk
Abstract: Minimum error rate training is a crucial component to many state-of-the-art NLP applications, such as machine translation and speech recognition. However, common evaluation functions such as BLEU or word error rate are generally highly non-convex and thus prone to search errors. In this paper, we present LP-MERT, an exact search algorithm for minimum error rate training that reaches the global optimum using a series of reductions to linear programming. Given a set of N-best lists produced from S input sentences, this algorithm finds a linear model that is globally optimal with respect to this set. We find that this algorithm is polynomial in N and in the size of the model, but exponential in S. We present extensions of this work that let us scale to reasonably large tuning sets (e.g., one thousand sentences), by either searching only promising regions of the parameter space, or by using a variant of LP-MERT that relies on a beam-search approximation. Experimental results show improvements over the standard Och algorithm.
4 0.47953013 114 emnlp-2011-Relation Extraction with Relation Topics
Author: Chang Wang ; James Fan ; Aditya Kalyanpur ; David Gondek
Abstract: This paper describes a novel approach to the semantic relation detection problem. Instead of relying only on the training instances for a new relation, we leverage the knowledge learned from previously trained relation detectors. Specifically, we detect a new semantic relation by projecting the new relation’s training instances onto a lower dimension topic space constructed from existing relation detectors through a three step process. First, we construct a large relation repository of more than 7,000 relations from Wikipedia. Second, we construct a set of non-redundant relation topics defined at multiple scales from the relation repository to characterize the existing relations. Similar to the topics defined over words, each relation topic is an interpretable multinomial distribution over the existing relations. Third, we integrate the relation topics in a kernel function, and use it together with SVM to construct detectors for new relations. The experimental results on Wikipedia and ACE data have confirmed that backgroundknowledge-based topics generated from the Wikipedia relation repository can significantly improve the performance over the state-of-theart relation detection approaches.
5 0.44891432 40 emnlp-2011-Discovering Relations between Noun Categories
Author: Thahir Mohamed ; Estevam Hruschka ; Tom Mitchell
Abstract: Traditional approaches to Relation Extraction from text require manually defining the relations to be extracted. We propose here an approach to automatically discovering relevant relations, given a large text corpus plus an initial ontology defining hundreds of noun categories (e.g., Athlete, Musician, Instrument). Our approach discovers frequently stated relations between pairs of these categories, using a two step process. For each pair of categories (e.g., Musician and Instrument) it first coclusters the text contexts that connect known instances of the two categories, generating a candidate relation for each resulting cluster. It then applies a trained classifier to determine which of these candidate relations is semantically valid. Our experiments apply this to a text corpus containing approximately 200 million web pages and an ontology containing 122 categories from the NELL system [Carlson et al., 2010b], producing a set of 781 proposed candidate relations, approximately half of which are semantically valid. We conclude this is a useful approach to semi-automatic extension of the ontology for large-scale information extraction systems such as NELL.
6 0.43053433 128 emnlp-2011-Structured Relation Discovery using Generative Models
7 0.4009628 70 emnlp-2011-Identifying Relations for Open Information Extraction
8 0.38243827 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week
9 0.36798421 138 emnlp-2011-Tuning as Ranking
10 0.35770151 94 emnlp-2011-Modelling Discourse Relations for Arabic
11 0.34660324 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
12 0.34647217 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
13 0.33734101 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction
14 0.33065981 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
15 0.32847455 97 emnlp-2011-Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French
17 0.3198435 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
18 0.31907734 116 emnlp-2011-Robust Disambiguation of Named Entities in Text
19 0.3185173 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns
20 0.31757659 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base