acl acl2010 acl2010-236 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Adam Pauls ; Dan Klein ; Chris Quirk
Abstract: We propose a top-down algorithm for extracting k-best lists from a parser. Our algorithm, TKA∗ is a variant of the kbest A∗ (KA∗) algorithm of Pauls and Klein (2009). In contrast to KA∗, which performs an inside and outside pass before performing k-best extraction bottom up, TKA∗ performs only the inside pass before extracting k-best lists top down. TKA∗ maintains the same optimality and efficiency guarantees of KA∗, but is simpler to both specify and implement.
Reference: text
1 Introduction

Many situations call for a parser to return a k-best list of parses instead of a single best hypothesis. The k-best algorithm of Jiménez and Marzal (2000) and Huang and Chiang (2005), referred to hereafter as LAZY, operates by first performing an exhaustive Viterbi inside pass and then lazily extracting k-best lists in a top-down manner. The k-best A∗ algorithm of Pauls and Klein (2009), hereafter KA∗, computes Viterbi inside and outside scores before extracting k-best lists bottom-up. Because these additional passes are only partial, KA∗ can be significantly faster than LAZY, especially when a heuristic is used (Pauls and Klein, 2009). In this paper, we propose TKA∗, a top-down variant of KA∗ that, like LAZY, performs only an inside pass before extracting k-best lists top-down, but maintains the same optimality and efficiency guarantees as KA∗. This algorithm can be seen as a generalization of the lattice k-best algorithm of Soong and Huang (1991) to parsing. Because TKA∗ eliminates the outside pass from KA∗, TKA∗ is simpler both in implementation and specification.
2 Review

Because our algorithm is very similar to KA∗, which is in turn an extension of the (1-best) A∗ parsing algorithm of Klein and Manning (2003), we first introduce notation and review those two algorithms before presenting our new algorithm.
Without loss of generality, we assume Chomsky normal form: each non-terminal rule r in G has the form r = A → B C with weight wr. Inside derivations of an edge (A, i, j) are trees with root nonterminal A, spanning si · · · sj−1. The weight (negative log-probability) of the best (minimum) inside derivation for an edge e is called the Viterbi inside score β(e), and the weight of the best derivation of G → s0 · · · sn−1 containing a derivation of e is called the Viterbi outside score α(e). The goal of a k-best parsing algorithm is to compute the k best (minimum weight) inside derivations of the edge (G, 0, n).
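The code sketches below use a minimal, hypothetical plain-Python encoding of these definitions; the tuple shapes, dictionary layouts, and names (GOAL, grammar, beta) are our own illustrative choices, not notation from the paper.

```python
# A minimal encoding of the notation above, assuming CNF. An edge is a tuple
# (A, i, j); weights are negative log-probabilities, so smaller is better.
GOAL = "G"

# grammar[(B, C)] lists (A, w_r) for each binary rule r = A -> B C with weight w_r
grammar = {
    ("NP", "VP"): [("G", 0.1)],
    ("DT", "NN"): [("NP", 0.4)],
}

# beta maps an edge (A, i, j) to its Viterbi inside score; the k-best goal is
# the edge (GOAL, 0, n) spanning the whole sentence
def goal_edge(n):
    return (GOAL, 0, n)
```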
We formulate the algorithms in this paper in terms of prioritized weighted deduction rules (Shieber et al., 1995; Nederhof, 2003). A prioritized weighted deduction rule has the form

φ1 : w1, . . . , φn : wn --p(w1, . . . , wn)--> φ0 : g(w1, . . . , wn)

where φ1, . . . , φn are the antecedent items of the deduction rule and φ0 is the conclusion item. A deduction rule states that, given the antecedents φ1, . . . , φn with weights w1, . . . , wn, the conclusion φ0 can be formed with weight g(w1, . . . , wn) and priority p(w1, . . . , wn).
²While we present the algorithm specialized to parsing with a PCFG, this algorithm generalizes to a wide range of hypergraph search problems, as shown in Klein and Manning (2001).
Figure 1: Representations of the different types of items used in parsing. (d) An outside derivation item: Q(T_VP^G, 1, 2, {(NP, 2, n)}).
These deduction rules are "executed" within a generic agenda-driven algorithm, which constructs items in a prioritized fashion. The algorithm maintains an agenda (a priority queue of items), as well as a chart of items already processed. The fundamental operation of the algorithm is to pop the highest-priority item φ from the agenda, put it into the chart with its current weight, and apply deduction rules to form any items which can be built by combining φ with items already in the chart. When the resulting items are either new or have a weight smaller than an item's best score so far, they are put on the agenda with priority given by p(·).
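As a concrete illustration, here is a minimal sketch of this generic agenda loop; the function names and the lazy-deletion treatment of reprioritized items are our own implementation choices, not prescribed by the paper.

```python
import heapq

def agenda_run(axioms, deduce):
    """Generic agenda-driven deduction.
    axioms: iterable of (item, weight, priority) seeding the agenda.
    deduce(item, weight, chart): yields (new_item, weight, priority)
    conclusions formed by combining `item` with items already in the chart."""
    chart = {}    # item -> weight it had when popped (its final score)
    agenda = []   # priority queue ordered by priority, then weight
    for item, w, p in axioms:
        heapq.heappush(agenda, (p, w, item))
    while agenda:
        p, w, item = heapq.heappop(agenda)
        if item in chart:          # stale entry: item was already popped
            continue
        chart[item] = w            # pop: move item from agenda to chart
        for new_item, nw, np in deduce(item, w, chart):
            if new_item not in chart:
                heapq.heappush(agenda, (np, nw, new_item))
    return chart
```

The deduce callbacks sketched in the following sections plug into this loop, e.g. via functools.partial to bind the grammar and score tables.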
Because all antecedents must be constructed before a deduction rule is executed, we sometimes refer to a particular conclusion item as "waiting" on another item before it can be built.
2.2 A∗

A∗ parsing (Klein and Manning, 2003) is an algorithm for computing the 1-best parse of a sentence. A∗ operates on items called inside edge items I(A, i, j), which represent the many possible inside derivations of an edge (A, i, j). Inside edge items are constructed according to the IN deduction rule of Table 1. This deduction rule constructs inside edge items in a bottom-up fashion, combining items representing smaller edges I(B, i, k) and I(C, k, j) with a grammar rule r = A → B C to form a larger item I(A, i, j).
The weight of a newly constructed item I is given by the sum of the weights of the antecedent items and the grammar rule r, and its priority is given by its weight plus a heuristic h(A, i, j).
Figure 2: (a) An outside derivation item before expansion at the edge (VP, 1, 4). (b) A possible expansion of the item in (a) using the rule VP → VP NN.
For consistent and admissible heuristics h(·), this deduction rule guarantees that when an inside edge item is removed from the agenda, its current weight is its true Viterbi inside score. The heuristic h controls the speed of the algorithm. It can be shown that an edge e = (A, i, j) satisfying β(e) + h(A, i, j) > β(G, 0, n) will never be removed from the agenda, allowing some edges to be safely pruned during parsing. The more closely h(e) approximates the Viterbi outside cost α(e), the more items are pruned.
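A sketch of an IN-style deduce callback for this 1-best A∗ pass, in the hypothetical encoding assumed earlier; `heuristic` stands for any consistent, admissible estimate of the Viterbi outside cost.

```python
def in_deductions(popped, w, chart, grammar, heuristic):
    """IN rule: I(B, i, k) : w1 and I(C, k, j) : w2 combine with r = A -> B C
    to form I(A, i, j) with weight w1 + w2 + w_r and priority weight + h(A, i, j).
    `popped` is an inside edge item (X, i, j) just popped with weight w."""
    X, i, j = popped
    for (Y, a, b), w2 in list(chart.items()):
        if a == j:  # popped item is the left child: X spans (i, j), Y spans (j, b)
            for A, w_r in grammar.get((X, Y), ()):
                wn = w + w2 + w_r
                yield (A, i, b), wn, wn + heuristic(A, i, b)
        if b == i:  # popped item is the right child: Y spans (a, i), X spans (i, j)
            for A, w_r in grammar.get((Y, X), ()):
                wn = w2 + w + w_r
                yield (A, a, j), wn, wn + heuristic(A, a, j)
```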
2.3 KA∗

The use of inside edge items in A∗ exploits the optimal substructure property of derivations: since a best derivation of a larger edge is always composed of best derivations of smaller edges, it is only necessary to compute the best way of building a particular inside edge item. Thus, KA∗, the k-best extension of A∗, must search not in the space of inside edge items, but rather in the space of inside derivation items D(T_A, i, j), which represent specific derivations of the edge (A, i, j) using tree T_A. However, the number of inside derivation items is exponential in the length of the input sentence, and even with a very accurate heuristic, running A∗ directly in this space is not feasible. Fortunately, Pauls and Klein (2009) show that with a perfect heuristic, that is, h(e) = α(e) ∀e, A∗ search on inside derivation items will only remove items from the agenda that participate in the true k-best lists (up to ties).
Table 1: The deduction rules used in this paper. A superscript * indicates that the rule is used in KA∗, and a superscript † indicates that the rule is used in TKA∗. In OUT-D, the tree T_B^G is the tree T_A^G extended at (A, i, j) with rule r, FC is the list F with (C, l, j) prepended, and β(F) is Σ_{e∈F} β(e).
Outside items are built using the OUT-L and OUT-R deduction rules shown in Table 1. OUT-L and OUT-R combine, in a top-down fashion, an outside edge over a larger span and an inside edge over a smaller span to form a new outside edge over a smaller span. Because these rules make reference to inside edge items I(A, i, j), these items must also be built using the IN deduction rules from 1-best A∗. Outside edge items must thus wait until the necessary inside edge items have been built. The outside pass is initialized with the item O(G, 0, n) when the inside edge item I(G, 0, n) is popped from the agenda.
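A sketch of how OUT-L and OUT-R might look in the same encoding, assuming `beta` already holds the needed inside scores and `grammar_lhs[A]` lists rules (B, C, w_r) with left-hand side A; the "waiting" discipline is modeled by the membership tests, which let a rule fire only once its inside antecedent exists.

```python
def out_deductions(outside_edge, w, beta, grammar_lhs):
    """OUT-L: O(A, i, j) + I(C, l, j) -> O(B, i, l);
    OUT-R: O(A, i, j) + I(B, i, l) -> O(C, l, j).
    `w` is the outside score of (A, i, j); yields (outside edge, weight)."""
    A, i, j = outside_edge
    for B, C, w_r in grammar_lhs.get(A, ()):
        for l in range(i + 1, j):
            if (C, l, j) in beta:  # waits on the right sibling's inside edge item
                yield (B, i, l), w + w_r + beta[(C, l, j)]
            if (B, i, l) in beta:  # waits on the left sibling's inside edge item
                yield (C, l, j), w + w_r + beta[(B, i, l)]
```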
Once we have started populating outside scores using the outside deductions, we can initiate a search on inside derivation items.³ These items are built bottom-up using the IN-D deduction rule. The crucial element of this rule is that derivation items for a particular edge wait until the exact outside score of that edge has been computed. The algorithm terminates when k derivation items rooted at (G, 0, n) have been popped from the agenda.
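A schematic rendering of the IN-D step under the same assumptions; `alpha` maps an edge to its exact Viterbi outside score once the outside pass has computed it, and the derivation-item tuple shape is our own.

```python
def in_d(left, right, rule_weight, parent_edge, alpha):
    """IN-D: combine derivation items D(T_B, i, l) : w1 and D(T_C, l, j) : w2
    with rule r = A -> B C into D(T_A, i, j) : w1 + w2 + w_r, with priority
    weight + alpha(A, i, j). The rule fires only once alpha[parent_edge] is
    exact, i.e. the derivation item "waits" on the outside pass."""
    (tree_b, _, w1), (tree_c, _, w2) = left, right
    w = w1 + w2 + rule_weight
    item = ((parent_edge, tree_b, tree_c), parent_edge, w)
    return item, w + alpha[parent_edge]
```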
3 TKA∗

KA∗ efficiently explores the space of inside derivation items because it waits for the exact Viterbi outside cost before building each derivation item. However, these outside costs and associated deduction items are only auxiliary quantities used to guide the exploration of inside derivations: they allow KA∗ to prioritize currently constructed inside derivation items (i.e., constructed derivations of the goal) by their optimal completion costs. Outside costs are thus only necessary because we construct partial derivations bottom-up; if we constructed partial derivations in a top-down fashion, all we would need to compute optimal completion costs are Viterbi inside scores, and we could forget the outside pass.

³We stress that the order of computation is entirely specified by the deduction rules; we only speak about …
Inside edge items are constructed in the same way as KA∗, but once the inside edge item I(G, 0, n) has been discovered, TKA∗ begins building partial derivations from the goal outwards. We replace the inside derivation items of KA∗ with outside derivation items, which represent trees rooted at the goal and expanding downwards. These items bottom out in a list of edges called the frontier edges. When a frontier edge represents a single word in the input, i.e., is of the form (si, i, i + 1), we say that edge is complete. An outside derivation can be expanded by applying a rule to one of its incomplete frontier edges; see Figure 2. In the same way that inside derivation items wait on exact outside scores before being built, outside derivation items wait on the inside edge items of all frontier edges before they can be constructed.
Although building derivations top-down obviates the need for a 1-best outside pass, it raises a new issue. When building derivations bottom-up, the only way to expand a particular partial inside derivation is to combine it with another partial inside derivation to build a bigger tree. In contrast, an outside derivation item can be expanded anywhere along its frontier. Naively building derivations top-down would lead to a prohibitively large number of expansion choices. We solve this issue by always expanding the left-most incomplete frontier edge of an outside derivation item. We show the deduction rule OUT-D, which performs this expansion, in Table 1; an outside derivation item is depicted in Figure 1(d).
We denote an outside derivation item as Q(T_A^G, i, j, F), where T_A^G is a tree rooted at the goal with left-most incomplete edge (A, i, j), and F is the list of incomplete frontier edges excluding (A, i, j), ordered from left to right. Whenever the application of this rule "completes" the left-most edge, the next edge is removed from F and is used as the new point of expansion. Once all frontier edges are complete, the item represents a correctly scored derivation of the goal, explored in a pre-order traversal.
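A sketch of one OUT-D expansion in the running encoding; `beta` holds the exact inside scores from the completed inside pass, `grammar_lhs[A]` lists (B, C, w_r), and terminal expansions plus the complete-edge bookkeeping are elided for brevity.

```python
def out_d_expansions(item, grammar_lhs, beta):
    """Expand the left-most incomplete frontier edge (A, i, j) of an outside
    derivation item with every rule r = A -> B C and split point l. An item is
    (tree, leftmost_edge, frontier, weight); `tree` is the lazy expansion list
    and `weight` sums the rule weights applied so far."""
    tree, (A, i, j), frontier, w = item
    for B, C, w_r in grammar_lhs.get(A, ()):
        for l in range(i + 1, j):
            new_tree = (((A, i, j), (B, C), l), tree)  # head data + tail
            new_frontier = [(C, l, j)] + frontier      # F with (C, l, j) prepended
            new_item = (new_tree, (B, i, l), new_frontier, w + w_r)
            # optimal-completion priority: rule weights used so far plus the
            # exact Viterbi inside score of every incomplete frontier edge
            prio = w + w_r + beta[(B, i, l)] + sum(beta[e] for e in new_frontier)
            yield new_item, prio
```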
3.1 Correctness

It should be clear that expanding the left-most incomplete frontier edge first eventually explores the same set of derivations as expanding all frontier edges simultaneously. The only worry in fixing this canonical order is that we will somehow explore the Q items in an incorrect order, possibly building some complete derivation Q′C before a more optimal complete derivation QC. However, note that all items Q along the left-most construction of QC have priority equal to or better than any less optimal complete derivation Q′C. Therefore, when Q′C is enqueued, it will have lower priority than all Q; Q′C will therefore not be dequeued until all Q, and hence QC, have been built.
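The argument can be written as one chain of inequalities; this is our paraphrase in LaTeX, using β(F_Q) for the total inside score of Q's incomplete frontier.

```latex
% Every item Q on the left-most construction path of Q_C has, as its
% priority, a lower bound on the weight of its best completion Q_C:
\[
  p(Q) \;=\; w(Q) + \beta(F_Q) \;\le\; w(Q_C) \;\le\; w(Q'_C) \;=\; p(Q'_C),
\]
% so the less optimal complete derivation Q'_C cannot be dequeued before
% any such Q, and hence not before Q_C itself.
```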
3.2 Implementation Details

Building derivations bottom-up is convenient from an indexing point of view: since larger derivations are built from smaller ones, it is not necessary to construct the larger derivation from scratch. In order to keep the same efficiency when building trees top-down, a slightly different data structure is necessary. We represent top-down derivations as a lazy list of expansions. The top node T_G^G is an empty list, and whenever we expand an outside derivation item Q(T_A^G, i, j, F) with a rule r = A → B C and split point l, the resulting derivation T_B^G is a new list item with (r, l) as the head data, and T_A^G as its tail. The tree can be reconstructed later by recursively reconstructing the parent, and adding the edges (B, i, l) and (C, l, j) as children of (A, i, j).
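A sketch of that reconstruction for the lazy-list node shape used in the OUT-D sketch above; the dictionary output format is our own choice.

```python
def reconstruct(node, children=None):
    """Rebuild a map edge -> (left child edge, right child edge) from a lazy
    expansion list whose nodes have the form (((A, i, j), (B, C), l), tail),
    with None standing for the empty list at the goal T_G^G."""
    if children is None:
        children = {}
    if node is None:
        return children
    ((A, i, j), (B, C), l), tail = node
    reconstruct(tail, children)  # recursively reconstruct the parent first
    children[(A, i, j)] = ((B, i, l), (C, l, j))
    return children
```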
3.3 Advantages

Although our algorithm eliminates the 1-best outside pass of KA∗, in practice, even for k = 10⁴, the 1-best inside pass remains the overwhelming bottleneck (Pauls and Klein, 2009), and our modifications leave that pass unchanged. However, we argue that our algorithm is simpler to specify and implement. In terms of deduction rules, our algorithm eliminates the two outside deduction rules and replaces the IN-D rule with the OUT-D rule, bringing the total number of rules from four to two. The ease of specification translates directly into ease of implementation. In particular, if high-quality heuristics are not available, it is often more efficient to implement the 1-best inside pass as an exhaustive dynamic program, as in Huang and Chiang (2005).
4 Experiments

We have argued that TKA∗ is simpler than KA∗, but we do not expect it to do any more or less work than KA∗, modulo grammar-specific optimizations.
Therefore, we simply verify, like KA∗, that the additional work of extracting k-best lists with TKA∗ is negligible compared to the time spent building 1-best inside edges. We examined the time spent building 100-best lists for the same experimental setup as Pauls and Klein (2009).⁴
On 100 sentences, our implementation of TKA∗ constructed 3.46 billion items, of which about 2% were outside derivation items.
In other words, the cost of k-best extraction is dwarfed by the 1-best inside edge computation in both cases. The reason for the slight performance advantage of KA∗ is that our implementation of KA∗ uses lazy optimizations discussed in Pauls and Klein (2009), and while such optimizations could easily be incorporated in TKA∗, we have not yet done so in our implementation.
⁴… (2006), the former used to compute a heuristic for the latter, tested on sentences of length up to 25.
5 Conclusion

Our algorithm collapses the 1-best outside and bottom-up derivation passes of KA∗ into a single, top-down pass without sacrificing efficiency or optimality. This reduces the number of non-base-case deduction rules, making TKA∗ easier both to specify and implement.
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 236 acl-2010-Top-Down K-Best A* Parsing
2 0.45560482 131 acl-2010-Hierarchical A* Parsing with Bridge Outside Scores
Author: Adam Pauls ; Dan Klein
Abstract: Hierarchical A∗ (HA∗) uses of a hierarchy of coarse grammars to speed up parsing without sacrificing optimality. HA∗ prioritizes search in refined grammars using Viterbi outside costs computed in coarser grammars. We present Bridge Hierarchical A∗ (BHA∗), a modified Hierarchial A∗ algorithm which computes a novel outside cost called a bridge outside cost. These bridge costs mix finer outside scores with coarser inside scores, and thus constitute tighter heuristics than entirely coarse scores. We show that BHA∗ substantially outperforms HA∗ when the hierarchy contains only very coarse grammars, while achieving comparable performance on more refined hierarchies.
3 0.11151741 169 acl-2010-Learning to Translate with Source and Target Syntax
Author: David Chiang
Abstract: Statistical translation models that try to capture the recursive structure of language have been widely adopted over the last few years. These models make use of varying amounts of information from linguistic theory: some use none at all, some use information about the grammar of the target language, some use information about the grammar of the source language. But progress has been slower on translation models that are able to learn the relationship between the grammars of both the source and target language. We discuss the reasons why this has been a challenge, review existing attempts to meet this challenge, and show how some old and new ideas can be combined into a sim- ple approach that uses both source and target syntax for significant improvements in translation accuracy.
4 0.11058209 53 acl-2010-Blocked Inference in Bayesian Tree Substitution Grammars
Author: Trevor Cohn ; Phil Blunsom
Abstract: Learning a tree substitution grammar is very challenging due to derivational ambiguity. Our recent approach used a Bayesian non-parametric model to induce good derivations from treebanked input (Cohn et al., 2009), biasing towards small grammars composed of small generalisable productions. In this paper we present a novel training method for the model using a blocked Metropolis-Hastings sampler in place of the previous method’s local Gibbs sampler. The blocked sampler makes considerably larger moves than the local sampler and consequently con- verges in less time. A core component of the algorithm is a grammar transformation which represents an infinite tree substitution grammar in a finite context free grammar. This enables efficient blocked inference for training and also improves the parsing algorithm. Both algorithms are shown to improve parsing accuracy.
5 0.10551728 69 acl-2010-Constituency to Dependency Translation with Forests
Author: Haitao Mi ; Qun Liu
Abstract: Tree-to-string systems (and their forestbased extensions) have gained steady popularity thanks to their simplicity and efficiency, but there is a major limitation: they are unable to guarantee the grammaticality of the output, which is explicitly modeled in string-to-tree systems via targetside syntax. We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model, which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. Medium-scale experiments show an absolute and statistically significant improvement of +0.7 BLEU points over a state-of-the-art forest-based tree-to-string system even with fewer rules. This is also the first time that a treeto-tree model can surpass tree-to-string counterparts.
6 0.095332719 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
7 0.093054652 46 acl-2010-Bayesian Synchronous Tree-Substitution Grammar Induction and Its Application to Sentence Compression
8 0.091549754 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
9 0.082607903 255 acl-2010-Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization
10 0.08167278 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood
11 0.070764184 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation
12 0.070480354 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar
13 0.06760145 75 acl-2010-Correcting Errors in a Treebank Based on Synchronous Tree Substitution Grammar
14 0.067206867 118 acl-2010-Fine-Grained Tree-to-String Translation Rule Extraction
15 0.064458057 265 acl-2010-cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
16 0.063656233 133 acl-2010-Hierarchical Search for Word Alignment
17 0.061555255 21 acl-2010-A Tree Transducer Model for Synchronous Tree-Adjoining Grammars
18 0.052746184 200 acl-2010-Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
19 0.052101079 48 acl-2010-Better Filtration and Augmentation for Hierarchical Phrase-Based Translation Rules
20 0.05179492 83 acl-2010-Dependency Parsing and Projection Based on Word-Pair Classification