acl acl2013 acl2013-4 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber
Abstract: We propose a new variant of Tree-Adjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient grammar estimation and parsing algorithms. We perform parsing experiments on the Penn Treebank and draw comparisons to Tree-Substitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.
Reference: text
sentIndex sentText sentNum sentScore
1 We propose a new variant of Tree-Adjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. [sent-5, score-0.729]
2 In constituent-based parsing work, the prevailing technique to combat this divide between efficient models and real-world data has been to selectively strengthen the dependencies in a CFG by increasing the grammar size through methods such as symbol refinement (Petrov et al.). [sent-11, score-0.195]
3 This grammar derives the sentences from a quote of Isaac Asimov's: "I do not fear computers. I fear the lack of them." [sent-31, score-0.138]
4 We contrast these models and investigate the use of adjunction in the most probable derivations of the test corpus, demonstrating the superior modeling performance of OSTAG over TSG. [sent-38, score-0.504]
5 2 TAG and Variants: Here we provide a short history of the relevant work in related grammar formalisms, leading up to a definition of OSTAG. [sent-39, score-0.112]
6 The rules R can be thought of as elementary trees of depth 1, which are combined by substituting a derived tree rooted at a nonterminal X at a frontier (leaf) node of an elementary tree labeled with that same nonterminal. [sent-41, score-0.896]
7 The derived trees rooted at the start symbol S are taken to be the trees generated by the grammar. [sent-42, score-0.252]
8 2.1 Tree-Substitution Grammar: By generalizing CFG to allow elementary trees in R to be of depth greater than or equal to 1, we get the Tree-Substitution Grammar. [sent-44, score-0.17]
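To make the substitution operation concrete, here is a minimal sketch over a nested-tuple tree encoding. The encoding, the function name, and the convention that any leaf whose label matches the derived tree's root counts as a substitution site are our own illustrative assumptions, not code from the paper.

```python
# A tree is a nested tuple: (label, child, child, ...); a leaf is (label,).
def substitute(elementary, derived):
    """Replace the leftmost frontier node of `elementary` whose label
    matches the root label of `derived`; returns (new_tree, replaced?)."""
    label, *children = elementary
    if not children:  # frontier node: a substitution site if the labels match
        return (derived, True) if label == derived[0] else (elementary, False)
    out, done = [], False
    for child in children:
        if not done:
            child, done = substitute(child, derived)
        out.append(child)
    return (label, *out), done

# Example: substitute a derived NP tree into an elementary S tree.
s = ("S", ("NP",), ("VP", ("V", ("runs",))))
np = ("NP", ("DT", ("the",)), ("NN", ("dog",)))
tree, ok = substitute(s, np)  # the "NP" leaf is replaced by the derived tree
```

For a CFG the elementary trees have depth 1; a TSG simply lets `elementary` be deeper.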
9 TSG remains in the family of context-free grammars, as can be easily seen by the removal of the internal nodes in all elementary trees; what is left is a CFG that generates the same language. [sent-45, score-0.143]
10 As a reversible alternative that preserves the internal structure, annotation of each internal node with a unique index creates a large number of deterministic CFG rules that record the structure of the original elementary trees. [sent-46, score-0.437]
11 A more compact CFG representation can be obtained by marking each node in each elementary tree with a signature of its subtree. [sent-47, score-0.331]
12 This transform, presented by Goodman (2003), can rein in the grammar constant G, as the crucial CFG algorithms for a sentence of length n have complexity O(Gn^3). [sent-48, score-0.132]
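A sketch of the idea behind that transform, reusing the tuple encoding above: each internal node is named by a canonical string of its subtree, so identical internal structure across elementary trees collapses onto shared CFG nonterminals. The function names are ours, and a real implementation might hash the signature rather than carry the full string.

```python
def sig(tree):
    """Canonical signature of a subtree; identical subtrees share it."""
    label, *children = tree
    return label if not children else f"{label}({' '.join(map(sig, children))})"

def goodman_rules(tree, is_root=True, rules=None):
    """Emit one CFG rule per internal node of an elementary tree, over
    signature-marked nonterminals.  The root and the frontier nodes keep
    their plain symbols, so substitution still composes correctly."""
    if rules is None:
        rules = set()
    label, *children = tree
    if not children:
        return rules  # frontier node: no expansion rule
    lhs = label if is_root else sig(tree)
    rhs = tuple(c[0] if len(c) == 1 else sig(c) for c in children)
    rules.add((lhs, rhs))
    for c in children:
        goodman_rules(c, is_root=False, rules=rules)
    return rules
```

Because elementary trees that share a subtree now share the corresponding rules and nonterminals, the grammar constant G in the O(Gn^3) bound can shrink substantially.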
13 A simple probabilistic model for a TSG is a set of multinomials, one for each nonterminal in N corresponding to its possible substitutions in R. [sent-49, score-0.125]
14 A more flexible model allows a potentially infinite number of substitution rules using a Dirichlet Process (Cohn et al.). [sent-50, score-0.144]
15 This model has proven effective for grammar induction via Markov Chain Monte Carlo (MCMC), in which TSG derivations of the training set are repeatedly sampled to find frequently occurring elementary trees. [sent-52, score-0.281]
16 A straightforward technique for induction of a finite TSG is to perform this nonparametric induction and select the set of rules that appear in at least one sampled derivation at one or several of the final iterations. [sent-53, score-0.21]
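A sketch of that selection heuristic, assuming each sampled derivation is available as an iterable of tuple-encoded elementary trees:

```python
from collections import Counter, defaultdict

def induce_finite_tsg(sampled_derivations):
    """Keep every elementary tree that occurs in at least one sampled
    derivation (e.g., from the final MCMC iterations), and estimate one
    multinomial per root nonterminal by relative frequency."""
    counts = Counter(t for deriv in sampled_derivations for t in deriv)
    root_total = defaultdict(int)
    for tree, c in counts.items():
        root_total[tree[0]] += c  # tree[0] is the root symbol
    return {tree: c / root_total[tree[0]] for tree, c in counts.items()}
```

The support of the returned dictionary is the finite rule set; its values give the per-nonterminal multinomials of the simple model above.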
17 The adjunction site is circled, and the foot node of the auxiliary tree is denoted with an asterisk. [sent-56, score-0.96]
18 The OSTAG constraint would disallow further adjunction at the bold VP node only, as it is along the spine of the auxiliary tree. [sent-57, score-1.014]
19 TAG extends this formalism with a set of auxiliary trees A and the adjunction operation that governs their use. [sent-58, score-0.732]
20 An auxiliary tree α is an elementary tree containing a single distinguished nonterminal leaf, the foot node, with the same symbol as the root of α. [sent-59, score-0.756]
21 An auxiliary tree with root and foot node X can be adjoined into an internal node of an elementary tree labeled with X by splicing the auxiliary tree in at that internal node, as pictured in Figure 2. [sent-60, score-1.173]
22 We refer to the path between the root and foot nodes in an auxiliary tree as the spine of the tree. [sent-61, score-0.678]
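A minimal sketch of adjunction over the same tuple encoding, with the foot node marked by a trailing asterisk as in Figure 2; the addressing scheme and helper names are our own.

```python
def adjoin(tree, path, aux):
    """Splice auxiliary tree `aux` in at the internal node of `tree`
    addressed by `path` (a tuple of 0-based child indices): the auxiliary
    root takes that node's place, and the subtree formerly rooted there
    moves under the foot node (the unique leaf whose label ends in '*')."""
    if path:
        i, *rest = path
        label, *children = tree
        children[i] = adjoin(children[i], tuple(rest), aux)
        return (label, *children)
    assert aux[0] == tree[0], "auxiliary root must match the adjunction site"
    return _replace_foot(aux, tree)

def _replace_foot(aux, below):
    label, *children = aux
    if not children and label.endswith("*"):
        return below
    return (label, *[_replace_foot(c, below) for c in children])
```

The spine of `aux` is the path from its root down to the starred foot; OSTAG's single restriction is that no further adjunction may target the spine nodes below the root.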
23 As mentioned above, the added power afforded by adjunction comes at a serious price in time complexity. [sent-62, score-0.487]
24 However, a large effort in non-probabilistic grammar induction has been performed through manual annotation with the XTAG project (Doran et al.). [sent-64, score-0.152]
25 Schabes and Waters (1995) showed that by restricting the form of the auxiliary trees in A and the set of auxiliary trees that may adjoin at particular nodes, a TAG generates only context-free languages. [sent-68, score-0.548]
26 The TIG restriction on auxiliary trees states that the foot node must occur as either the leftmost or rightmost leaf node. [sent-69, score-0.517]
27 This introduces an important distinction between left, right, and wrapping auxiliary trees, of which only the first two are allowed in TIG. [sent-70, score-0.332]
28 Furthermore, TIG disallows adjunction of left auxiliary trees on the spines of right auxiliary trees, and vice versa. [sent-71, score-0.919]
29 This is to prevent the construction of wrapping auxiliary trees, whose removal is essential for the simplified complexity of TIG. [sent-72, score-0.31]
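The left/right/wrapping distinction can be read off the foot position; a sketch with the same starred-foot convention as above:

```python
def frontier(tree):
    label, *children = tree
    return [label] if not children else [l for c in children for l in frontier(c)]

def classify_auxiliary(aux):
    """Left, right, or wrapping status of an auxiliary tree.  TIG admits
    only the first two; OSTAG additionally admits wrapping trees."""
    leaves = frontier(aux)
    foot = next(i for i, l in enumerate(leaves) if l.endswith("*"))
    if foot == len(leaves) - 1:
        return "left"    # all added material sits to the left of the foot
    if foot == 0:
        return "right"   # all added material sits to the right of the foot
    return "wrapping"    # material on both sides of the foot
```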
30 Later work (Shindo et al., 2011; Yamangil and Shieber, 2012) was able to extend the non-parametric modeling of TSGs to TIG, providing methods for both modeling and grammar induction. [sent-76, score-0.112]
31 We allow arbitrary initial and auxiliary trees, and place only one restriction on adjunction: we disallow adjunction at any node on the spine of an auxiliary tree below the root (though we discuss relaxing that constraint in Section 4.2). [sent-79, score-1.339]
32 We refer to this variant as Off Spine TAG (OSTAG) and note that it allows the use of full wrapping rules, which are forbidden in TIG. [sent-81, score-0.184]
33 We propose a simple but empirically effective heuristic for grammar induction for our experiments on Penn Treebank data. [sent-84, score-0.152]
34 We take the nonterminals of the target CFG grammar to be nodes or pairs of nodes, elements of the set N + N × N. [sent-90, score-0.177]
35 Given two nodes η and η0, we notate a target nonterminal as η(η0). [sent-92, score-0.139]
36 Now for each tree τ and each interior node η in τ that is not on the spine of τ, with children η1, ..., ηk, we use a rule η → η1 · · · ηk (1); and for each node η0 at which an auxiliary tree τ is adjoinable, each spinal node η of τ with children η1, ..., ηk, of which ηs lies on the spine, yields a rule η(η0) → η1 · · · ηs(η0) · · · ηk (2). [sent-93, score-0.423]
37 Rules of type (1) handle the expansion of a node not on the spine of an auxiliary tree, and rules of type (2) a spinal node. [sent-97, score-0.738]
38 In addition, to initiate adjunction at any node η0 where a tree τ with root η is adjoinable, we use a rule η0 → η(η0) (3), and for the foot node ηf of τ, we use a rule ηf(η0) → η0 (4). The OSTAG constraint follows immediately from the structure of the rules of type (2). [sent-98, score-1.179]
39 Any child spine node ηs manifests as a CFG nonterminal ηs(η0). [sent-99, score-0.444]
40 If child spine nodes themselves allowed adjunction, we would need a type (3) rule of the form ηs(η0) → η00(ηs(η0)), whose nonterminals fall outside the set N + N × N. [sent-100, score-0.402]
41 This double counting arises from the spurious ambiguity between adjunction at a substitution site (before applying a type (5) rule, the rule type that performs substitution) versus the same adjunction at the root of the substituted initial tree (after applying the type (5) rule). [sent-105, score-1.173]
42 To avoid double-counting derivations, which can adversely affect probabilistic modeling, type (3) and type (4) rules in which the side with the unapplied symbol is a nonterminal leaf can be omitted. [sent-107, score-0.334]
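To summarize the construction, here is a sketch that emits rule types (1)-(4) for a single elementary tree, with nonterminals encoded as (tree name, Gorn address) node ids or (node id, site) pairs. Type (5) substitution rules, symbol checking, and the double-counting omission just described are elided, and all names are ours.

```python
def nodes(tree, addr=()):
    """Yield (Gorn address, subtree) pairs for a tuple-encoded tree."""
    yield addr, tree
    for i, child in enumerate(tree[1:]):
        yield from nodes(child, addr + (i + 1,))

def transform(name, tree, site, rules):
    """Emit CFG rules for elementary tree `tree` into the set `rules`.
    If `tree` is auxiliary, `site` is the node id of one adjunction site
    (call once per adjoinable site); for initial trees pass site=None."""
    foot = next((a for a, s in nodes(tree)
                 if len(s) == 1 and s[0].endswith("*")), None)
    spine = {foot[:i] for i in range(len(foot) + 1)} if site is not None else set()
    for addr, sub in nodes(tree):
        me = (name, addr)
        if len(sub) == 1:  # frontier node
            if site is not None and addr == foot:
                rules.add(((me, site), (site,)))      # type (4): resume the site
            continue
        kids = [(name, addr + (i + 1,)) for i in range(len(sub) - 1)]
        if addr not in spine:
            rules.add((me, tuple(kids)))              # type (1): off-spine node
        else:  # type (2): a spinal node carries the adjunction site downward
            rhs = tuple((k, site) if k[1] in spine else k for k in kids)
            rules.add(((me, site), rhs))
    if site is not None:
        rules.add((site, (((name, ()), site),)))      # type (3): begin adjunction
```

The OSTAG constraint is visible in the output: paired nonterminals (ηs, η0) are only ever produced by type (2) rules and no type (3) rule rewrites them, so no adjunction can start on the spine below the root.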
43 4.1 Example: The grammar of Figure 3(a) can be converted to a CFG by this method. [sent-109, score-0.112]
44 For the initial tree α, we have the following generated rules, with nodes notated by the tree name and a Gorn-number subscript: type (1) expansions of αε, α1, and α2, beginning with αε → α1 α2. [sent-112, score-0.295]
45 For the auxiliary trees β and γ, the transformation yields rules including a type (2) spinal rule with right-hand side b γ1(α2) b and type (4) foot rules γ1(α1) → α1 and γ1(α2) → α2. [sent-113, score-0.274]
46 The grammar of Figure 3(b) is simply a renaming of this grammar. [sent-121, score-0.112]
47 The ability to use adjunction allows expression of the same language with an OSTAG of only k + m elementary trees (Figure 4(b)). [sent-129, score-0.653]
48 First, OSTAG allows zero adjunctions on each node on the spine below the root of an auxiliary tree, but any non-zero finite bound on the number of adjunctions allowed on-spine would similarly limit generative capacity. [sent-133, score-0.801]
49 The tradeoff is in the grammar constant of the effective probabilistic CFG; an extension that allows k levels of on-spine adjunction has a grammar constant that is O(|N|^(k+2)). [sent-134, score-0.997]
50 Second, the OSTAG form of adjunction is consistent with the TIG form. [sent-135, score-0.458]
51 That is, we can extend OSTAG by allowing on-spine adjunction of left- or right-auxiliary trees in keeping with the TIG constraints without increasing generative capacity. [sent-136, score-0.545]
52 4.3 Probabilistic OSTAG: One major motivation for adherence to a context-free grammar formalism is the ability to employ algorithms designed for probabilistic CFGs, such as the CYK algorithm for parsing or the Inside-Outside algorithm for grammar estimation. [sent-138, score-0.363]
53 In this section we present a probabilistic model for an OSTAG grammar in PCFG form that can be used in such algorithms, and show that many parameters of this PCFG can be pooled or set equal to one and ignored. [sent-139, score-0.148]
54 References to rules of types (1-5) below refer to the CFG transformation rules defined in Section 3. [sent-140, score-0.151]
55 Furthermore, these rules employ a template in which the stored symbol appears on the left-hand side and in exactly one symbol on the right-hand side, where the spine of the auxiliary tree proceeds. [sent-144, score-0.693]
56 One deterministic rule exists for this template applied to each η, and so we may record only the template. [sent-145, score-0.14]
57 In order to perform CYK or IO, it is not even necessary to record the index in the right-hand side where the spine continues; these algorithms fill a chart bottom up and we can simply propagate the stored nonterminal up in the chart. [sent-146, score-0.366]
58 CFG rules of type (4) are also deterministic and do not require parameters. [sent-147, score-0.123]
59 All that is required is a check that a given symbol is adjoinable, which is true for all symbols except nonterminal leaves and applied symbols. [sent-149, score-0.143]
60 To avoid this we propose the following factorization for the probabilistic expansion of an off-spine node. [sent-152, score-0.25]
61 First, a decision is made as to whether a type (1) or (3) rule will be used; this corresponds to deciding if adjunction will or will not take place at the node. [sent-153, score-0.56]
62 If adjunction is rejected, then there is only one type (1) rule available, and so parameterization of type (1) rules is unnecessary. [sent-154, score-0.656]
63 By conditioning the probability of adjunction on varying amounts of information about the node, alternative models can easily be defined. [sent-156, score-0.458]
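A sketch of those Bernoulli parameterizations; the Site record, the table layout, and the 0.5 backoff for unseen keys are our assumptions (the paper estimates the adjunction probabilities with EM).

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Site:                         # hypothetical record for an off-spine node
    symbol: str                     # the node's nonterminal symbol
    child_symbols: Tuple[str, ...]  # symbols of its children, left to right
    goodman_index: str              # signature of the node's whole subtree

def adjoin_logprob(site: Site, adjoin: bool, model: str, table: dict) -> float:
    """log P(adjunction occurs at `site`) under three conditioning schemes:
    OSTAG1 uses the symbol alone, OSTAG2 adds the leftmost child's symbol,
    and OSTAG3 uses the full Goodman index."""
    if model == "OSTAG1":
        key = site.symbol
    elif model == "OSTAG2":
        key = (site.symbol, site.child_symbols[0])
    else:  # "OSTAG3"
        key = site.goodman_index
    p = table.get(key, 0.5)  # backoff for unseen keys (our assumption)
    return math.log(p if adjoin else 1.0 - p)
```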
64 We also applied the annotation from Klein and Manning (2003) that appends “-U” to each nonterminal node with a single child, drastically reducing the presence of looping unary chains. [sent-160, score-0.199]
65 To facilitate sister adjunction, we define our binarization scheme by example: the added nodes, indicated by @, record both the parent and the head child of the rule. [sent-163, score-0.147]
66 For instance, an NP rule headed by NN is binarized with @NN-NP nodes over the children DT, JJ, NN, and SBAR (see the sketch below). A compact TSG can be obtained automatically using the MCMC grammar induction technique of Cohn and Blunsom (2010), retaining all TSG rules that appear in at least one derivation after 1000 iterations of sampling. [sent-164, score-0.297]
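A sketch of such a head-recording binarization; the exact peeling order in the paper's flattened example is not fully recoverable, so the left-to-right order below is one plausible reading rather than the paper's definitive scheme.

```python
def binarize(parent, children, head):
    """Binarize an n-ary rule with @-nodes that name the parent and its
    head child, e.g. @NN-NP for an NP rule headed by NN."""
    if len(children) <= 2:
        return [(parent, tuple(children))]
    mid = f"@{children[head]}-{parent}"
    rules, rest = [(parent, (children[0], mid))], children[1:]
    while len(rest) > 2:
        rules.append((mid, (rest[0], mid)))
        rest = rest[1:]
    rules.append((mid, tuple(rest)))
    return rules

# binarize("NP", ["DT", "JJ", "NN", "SBAR"], head=2) yields:
#   NP -> DT @NN-NP ; @NN-NP -> JJ @NN-NP ; @NN-NP -> NN SBAR
```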
67 For the OSTAG models, we list the number of adjunctions in the MPD of the full test set, as well as the number of wrapping adjunctions. [sent-177, score-0.213]
68 To propose auxiliary trees, we consider each elementary tree and the possible auxiliary root and foot node pairs it contains. [sent-178, score-0.442]
69 We also include the TSG rule left behind when the adjunction of this auxiliary tree is removed. [sent-180, score-0.814]
70 With our full set of initial and auxiliary trees, we use EM and the PCFG reduction described above to estimate the parameters of an OSTAG. [sent-182, score-0.218]
71 We investigate three models for the probability of adjunction at a node. [sent-183, score-0.458]
72 The first conditions only on the node's symbol (OSTAG1); the second employs more parameters, conditioning on both the node's symbol and the symbol of its leftmost child (OSTAG2). [sent-185, score-0.139]
73 Furthermore, the various parameterizations of adjunction with OSTAG indicate that, at least in the case of the Penn Treebank, the finer-grained modeling of a full table of adjunction probabilities for each Goodman index (OSTAG3) overcomes the danger of sparse data estimates. [sent-189, score-0.936]
74 Not only does such a model lead to better parsing performance, but it uses adjunction more extensively than its more lightly parameterized alternatives. [sent-190, score-0.487]
75 The transformation can be relaxed, as discussed in Section 4.2, to allow any finite number of on-spine adjunctions without sacrificing context-free form. [sent-196, score-0.161]
76 However, the increase to the grammar constant quickly makes parsing with such models an arduous task. [sent-197, score-0.161]
77 To determine the effect of such a relaxation, we allow a single level of on-spine adjunction using the adjunction model of OSTAG1, and estimate this model with EM on the training data. [sent-198, score-0.916]
78 We parse sentences of length 40 or less in section 23 and observe that on-spine adjunction is never used in the MPD parses. [sent-199, score-0.458]
79 As an artifact of the English language, the majority have their foot node on the left spine and would also be usable by TIG, and so we discuss the instances of wrapping auxiliary trees in these derivations that are uniquely available to OSTAG. [sent-202, score-0.873]
80 We remove binarization for clarity and denote the foot node with an asterisk. [sent-203, score-0.236]
81 A frequent use of wrapping adjunction is to coordinate symbols such as quotes, parentheses, and dashes on both sides of a noun phrase. [sent-204, score-0.581]
82 One common wrapping auxiliary tree in our experiments is [NP " NP* " PP ]. This is used frequently in the news text of the Wall Street Journal for reported speech when avoiding a full quotation. [sent-205, score-0.409]
83 Another frequent wrapping rule, shown below, allows direct coordination between the contents of an appositive with the rest of the sentence. [sent-208, score-0.187]
84 The wrapping rule allows us to coordinate the verb "fell" with the pattern "X %". [sent-213, score-0.218]
85 These rules highlight the linguistic intuitions that back TAG; if their adjunction were undone, the remaining derivation would be a valid sentence that simply lacks the modifying structure of the auxiliary tree. [sent-215, score-0.751]
86 However, the MPD parses reveal that not all useful adjunctions conform to this paradigm, and that left-auxiliary trees that are not used for sister adjunction are susceptible to this behavior. [sent-216, score-0.688]
87 The most common such tree is used to create noun phrases such as "P&G's share of [the Japanese market]", "the House's repeal of [a law]", "Apple's family of [Macintosh Computers]", and "Canada's output of [crude oil]", by adjoining the shared unbracketed syntax onto the NP dominating the bracketed text. [sent-217, score-0.156]
88 Furthermore, in some cases removing the adjunction can leave a grammatically incorrect sentence, as in the third example where the noun phrase changes plurality. [sent-219, score-0.458]
89 While our grammar induction method is a crude (but effective) heuristic, we can still highlight the qualities of the more important auxiliary trees by examining aggregate statistics over the MPD parses, shown in Figure 6. [sent-220, score-0.426]
90 The use of left-auxiliary trees for sister adjunction is a clear trend, as is the predominant use of right-auxiliary trees for the complementary set of "regular" adjunctions, which is to be expected in a right-branching language such as English. [sent-221, score-0.685]
91 Figure 6: Statistics for MPD auxiliary trees using OSTAG3, shown as tokens (types):

              All           Wrap      Right        Left
    Total     3585 (1374)   41 (26)   1698 (518)   1846 (830)
    Sister    2851 (1180)   17 (11)   1109 (400)   1725 (769)
    Lex       2244 (990)    28 (19)   894 (299)    1322 (672)
    FLex      1028 (558)    7 (2)     835 (472)    186 (84)

[sent-222, score-0.274]
92 The columns indicate type of auxiliary tree and the rows correspond respectively to the full set found in the MPD, those that perform sister adjunction, those that are lexicalized, and those that are fully lexicalized. [sent-223, score-0.371]
93 Each cell shows the number of tokens followed by the number of types of auxiliary tree that fit its conditions. [sent-224, score-0.286]
94 6 Conclusion: The OSTAG variant of Tree-Adjoining Grammar is a simple weakly context-free formalism that still provides for all types of adjunction and is a bit more concise than TSG (quadratically so). [sent-226, score-0.521]
95 OSTAG provides an alternative to TIG as a context-free TAG variant that offers wrapping adjunction in exchange for recursive left/right spine adjunction. [sent-228, score-0.831]
96 The most important direction of future work for OSTAG is the development of a principled grammar induction model, perhaps using the same techniques that have been successfully applied to TSG and TIG. [sent-231, score-0.152]
97 Our system performs the CFG transform described above and optionally employs coarse-to-fine pruning and relaxed (finite) limits on the number of spine adjunctions. [sent-233, score-0.214]
98 As a TSG is simply a TAG without adjunction rules, our parser can easily be used as a TSG estimator and parser as well. [sent-234, score-0.458]
99 An empirical evaluation of probabilistic lexicalized tree insertion grammars. [sent-280, score-0.175]
100 Tree insertion grammar: a cubic-time, parsable formalism that lexicalizes context-free grammar without changing the trees produced. [sent-323, score-0.266]
wordName wordTfidf (topN-words)
[('ostag', 0.553), ('adjunction', 0.458), ('cfg', 0.255), ('tsg', 0.223), ('spine', 0.214), ('auxiliary', 0.187), ('mpd', 0.131), ('wrapping', 0.123), ('tig', 0.119), ('grammar', 0.112), ('node', 0.11), ('foot', 0.106), ('tree', 0.099), ('adjunctions', 0.09), ('nonterminal', 0.089), ('trees', 0.087), ('elementary', 0.083), ('tag', 0.082), ('rule', 0.07), ('rules', 0.064), ('np', 0.061), ('schabes', 0.061), ('gorn', 0.058), ('pretzels', 0.058), ('cfgs', 0.056), ('substitution', 0.055), ('symbol', 0.054), ('sister', 0.053), ('contextfree', 0.047), ('derivations', 0.046), ('cyk', 0.045), ('adjoinable', 0.044), ('joshi', 0.043), ('record', 0.043), ('pcfg', 0.042), ('derivation', 0.042), ('grammars', 0.041), ('vp', 0.041), ('induction', 0.04), ('insertion', 0.04), ('compact', 0.039), ('root', 0.039), ('appositive', 0.039), ('shindo', 0.039), ('adjoining', 0.037), ('probabilistic', 0.036), ('reversible', 0.036), ('yamangil', 0.036), ('variant', 0.036), ('shieber', 0.034), ('cohn', 0.033), ('nodes', 0.033), ('nonterminals', 0.032), ('type', 0.032), ('goodman', 0.032), ('reduction', 0.031), ('child', 0.031), ('afforded', 0.029), ('treesubstitution', 0.029), ('parsing', 0.029), ('deterministic', 0.027), ('formalism', 0.027), ('leaf', 0.027), ('internal', 0.027), ('fear', 0.026), ('hn', 0.026), ('elif', 0.026), ('lari', 0.026), ('notate', 0.026), ('providence', 0.026), ('wrap', 0.026), ('xtag', 0.026), ('charniak', 0.026), ('yves', 0.025), ('allows', 0.025), ('finite', 0.024), ('rooted', 0.024), ('disallow', 0.024), ('insideoutside', 0.024), ('isaac', 0.024), ('mohri', 0.024), ('ofn', 0.024), ('transformation', 0.023), ('prp', 0.022), ('treeadjoining', 0.022), ('allowed', 0.022), ('aravind', 0.022), ('penn', 0.021), ('em', 0.021), ('constraint', 0.021), ('doran', 0.021), ('frontier', 0.021), ('exactly', 0.021), ('dominating', 0.02), ('mcmc', 0.02), ('pal', 0.02), ('constant', 0.02), ('index', 0.02), ('bernoulli', 0.02), ('binarization', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999988 4 acl-2013-A Context Free TAG Variant
Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber
Abstract: We propose a new variant of Tree-Adjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient grammar estimation and parsing algorithms. We perform parsing experiments on the Penn Treebank and draw comparisons to Tree-Substitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.
2 0.53218663 261 acl-2013-Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
Author: Elif Yamangil ; Stuart M. Shieber
Abstract: In the line of research extending statistical parsing to more expressive grammar formalisms, we demonstrate for the first time the use of tree-adjoining grammars (TAG). We present a Bayesian nonparametric model for estimating a probabilistic TAG from a parsed corpus, along with novel block sampling methods and approximation transformations for TAG that allow efficient parsing. Our work shows performance improvements on the Penn Treebank and finds more compact yet linguistically rich representations of the data, but more importantly provides techniques in grammar transformation and statistical inference that make practical the use of these more expressive systems, thereby enabling further experimentation along these lines.
3 0.3011601 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
Author: Leon Bergen ; Edward Gibson ; Timothy J. O'Donnell
Abstract: We present a model for inducing sentential argument structure, which distinguishes arguments from optional modifiers. We use this model to study whether representing an argument/modifier distinction helps in learning argument structure, and whether a linguistically-natural argument/modifier distinction can be induced from distributional data alone. Our results provide evidence for both hypotheses.
4 0.20598443 357 acl-2013-Transfer Learning for Constituency-Based Grammars
Author: Yuan Zhang ; Regina Barzilay ; Amir Globerson
Abstract: In this paper, we consider the problem of cross-formalism transfer in parsing. We are interested in parsing constituency-based grammars such as HPSG and CCG using a small amount of data specific for the target formalism, and a large quantity of coarse CFG annotations from the Penn Treebank. While all of the target formalisms share a similar basic syntactic structure with Penn Treebank CFG, they also encode additional constraints and semantic features. To handle this apparent discrepancy, we design a probabilistic model that jointly generates CFG and target formalism parses. The model includes features of both parses, allowing transfer between the formalisms, while preserving parsing efficiency. We evaluate our approach on three constituency-based grammars: CCG, HPSG, and LFG, augmented with the Penn Treebank-1. Our experiments show that across all three formalisms, the target parsers significantly benefit from the coarse annotations.
5 0.14210929 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
Author: Matt Post ; Shane Bergsma
Abstract: Syntactic features are useful for many text classification tasks. Among these, tree kernels (Collins and Duffy, 2001) have been perhaps the most robust and effective syntactic tool, appealing for their empirical success, but also because they do not require an answer to the difficult question of which tree features to use for a given task. We compare tree kernels to different explicit sets of tree features on five diverse tasks, and find that explicit features often perform as well as tree kernels on accuracy and always in orders of magnitude less time, and with smaller models. Since explicit features are easy to generate and use (with publicly available tools), we suggest they should always be included as baseline comparisons in tree kernel method evaluations.
6 0.13441397 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy
7 0.10822773 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
8 0.10249523 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs
9 0.087153822 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
10 0.081370711 165 acl-2013-General binarization for parsing and translation
11 0.079074122 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
12 0.075603947 314 acl-2013-Semantic Roles for String to Tree Machine Translation
13 0.071168795 80 acl-2013-Chinese Parsing Exploiting Characters
14 0.07061106 311 acl-2013-Semantic Neighborhoods as Hypergraphs
15 0.069736712 275 acl-2013-Parsing with Compositional Vector Grammars
16 0.064024515 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
17 0.061117228 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning
18 0.059017751 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions
19 0.056587316 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
20 0.055695906 361 acl-2013-Travatar: A Forest-to-String Machine Translation Engine based on Tree Transducers
topicId topicWeight
[(0, 0.138), (1, -0.091), (2, -0.101), (3, -0.01), (4, -0.187), (5, 0.046), (6, 0.117), (7, -0.023), (8, 0.019), (9, -0.03), (10, 0.059), (11, 0.067), (12, 0.14), (13, -0.097), (14, -0.112), (15, -0.17), (16, 0.248), (17, 0.242), (18, -0.145), (19, 0.068), (20, 0.094), (21, 0.039), (22, 0.236), (23, 0.069), (24, -0.155), (25, -0.158), (26, 0.043), (27, -0.094), (28, 0.139), (29, 0.066), (30, -0.112), (31, -0.028), (32, -0.005), (33, -0.079), (34, -0.017), (35, -0.033), (36, 0.071), (37, 0.139), (38, 0.031), (39, -0.129), (40, 0.071), (41, -0.134), (42, -0.001), (43, 0.064), (44, -0.11), (45, 0.035), (46, 0.1), (47, -0.029), (48, 0.06), (49, -0.009)]
simIndex simValue paperId paperTitle
1 0.9634465 261 acl-2013-Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
same-paper 2 0.95201886 4 acl-2013-A Context Free TAG Variant
3 0.82353532 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
4 0.50308937 165 acl-2013-General binarization for parsing and translation
Author: Matthias Buchse ; Alexander Koller ; Heiko Vogler
Abstract: Binarization of grammars is crucial for improving the complexity and performance of parsing and translation. We present a versatile binarization algorithm that can be tailored to a number of grammar formalisms by simply varying a formal parameter. We apply our algorithm to binarizing tree-to-string transducers used in syntax-based machine translation.
5 0.45442349 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm's complexity, an optimization analogous to binarization of context-free grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
6 0.43971777 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs
7 0.42439693 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
8 0.3903352 357 acl-2013-Transfer Learning for Constituency-Based Grammars
9 0.3482258 311 acl-2013-Semantic Neighborhoods as Hypergraphs
10 0.32298303 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
11 0.30091637 275 acl-2013-Parsing with Compositional Vector Grammars
12 0.29177067 137 acl-2013-Enlisting the Ghost: Modeling Empty Categories for Machine Translation
13 0.2850922 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
14 0.27917963 161 acl-2013-Fluid Construction Grammar for Historical and Evolutionary Linguistics
15 0.27901655 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
16 0.26949838 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts
17 0.26634952 349 acl-2013-The mathematics of language learning
18 0.26082018 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning
19 0.25568256 280 acl-2013-Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation
20 0.25236356 331 acl-2013-Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing
topicId topicWeight
[(0, 0.04), (2, 0.118), (6, 0.019), (11, 0.087), (14, 0.042), (19, 0.135), (24, 0.025), (26, 0.06), (28, 0.011), (35, 0.097), (42, 0.036), (48, 0.037), (70, 0.056), (76, 0.017), (88, 0.03), (90, 0.047), (95, 0.041)]
simIndex simValue paperId paperTitle
same-paper 1 0.84755433 4 acl-2013-A Context Free TAG Variant
2 0.82463425 261 acl-2013-Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
3 0.77741265 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision
Author: Michael Speriosu ; Jason Baldridge
Abstract: Toponym resolvers identify the specific locations referred to by ambiguous placenames in text. Most resolvers are based on heuristics using spatial relationships between multiple toponyms in a document, or metadata such as population. This paper shows that text-driven disambiguation for toponyms is far more effective. We exploit document-level geotags to indirectly generate training instances for text classifiers for toponym resolution, and show that textual cues can be straightforwardly integrated with other commonly used ones. Results are given for both 19th century texts pertaining to the American Civil War and 20th century newswire articles.
4 0.76050532 126 acl-2013-Diverse Keyword Extraction from Conversations
Author: Maryam Habibi ; Andrei Popescu-Belis
Abstract: A new method for keyword extraction from conversations is introduced, which preserves the diversity of topics that are mentioned. Inspired from summarization, the method maximizes the coverage of topics that are recognized automatically in transcripts of conversation fragments. The method is evaluated on excerpts of the Fisher and AMI corpora, using a crowdsourcing platform to elicit comparative relevance judgments. The results demonstrate that the method outperforms two competitive baselines.
5 0.745206 73 acl-2013-Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions
Author: Xiaoming Lu ; Lei Xie ; Cheung-Chi Leung ; Bin Ma ; Haizhou Li
Abstract: We present an efficient approach for broadcast news story segmentation using a manifold learning algorithm on latent topic distributions. The latent topic distribution estimated by Latent Dirichlet Allocation (LDA) is used to represent each text block. We employ Laplacian Eigenmaps (LE) to project the latent topic distributions into low-dimensional semantic representations while preserving the intrinsic local geometric structure. We evaluate two approaches employing LDA and probabilistic latent semantic analysis (PLSA) distributions respectively. The effects of different amounts of training data and different numbers of latent topics on the two approaches are studied. Experimental results show that our proposed LDA-based approach can outperform the corresponding PLSA-based approach. The proposed approach provides the best performance with the highest F1-measure of 0.7860.
6 0.73428726 222 acl-2013-Learning Semantic Textual Similarity with Structural Representations
7 0.73373717 60 acl-2013-Automatic Coupling of Answer Extraction and Information Retrieval
8 0.71619004 275 acl-2013-Parsing with Compositional Vector Grammars
9 0.70648807 295 acl-2013-Real-World Semi-Supervised Learning of POS-Taggers for Low-Resource Languages
10 0.69295365 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
11 0.66192347 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
12 0.6588012 245 acl-2013-Modeling Human Inference Process for Textual Entailment Recognition
13 0.65781224 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
14 0.6537717 317 acl-2013-Sentence Level Dialect Identification in Arabic
15 0.65283722 31 acl-2013-A corpus-based evaluation method for Distributional Semantic Models
16 0.65000725 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
17 0.64626122 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
18 0.64574891 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
19 0.64568484 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
20 0.64546096 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering