acl acl2011 acl2011-234 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Pierluigi Crescenzi ; Daniel Gildea ; Andrea Marino ; Gianluca Rossi ; Giorgio Satta
Abstract: We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
Reference: text
sentIndex sentText sentNum sentScore
1 di Sistemi e Informatica Universit a` di Firenze University of Rochester Universit a` di Firenze Gianluca Rossi Dip. [sent-4, score-0.354]
2 di Matematica Universit a` di Roma Tor Vergata Abstract We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. [sent-5, score-0.571]
3 A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. [sent-6, score-0.478]
4 We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing. [sent-7, score-0.31]
5 1 LCFRSs retain the fundamental property of CFGs that grammar nonterminals rewrite independently, but allow nonterminals to generate discontinuous phrases, that is, to generate more than one span in the string being produced. [sent-10, score-0.391]
6 di Ingegneria dell’Informazione Universit a` di Padova number of nonterminals on the right-hand side (rhs) of a rule, while fan-out is the number of spans of the string generated by the nonterminal in the lefthand side (lhs) of the rule. [sent-16, score-0.56]
7 General algorithms for parsing LCFRSs build a dynamic programming chart of recognized nonterminals bottom-up, in a manner analogous to the CKY algorithm for CFGs (Hopcroft and Ullman, 1979), but with time and space complexity that are dependent on the rank and fan-out of the grammar rules. [sent-19, score-0.522]
8 Whenever it is possible, binarization of LCFRS rules, or reduction of rank to two, is therefore important for parsing, as it reduces the time complexity needed for dynamic programming. [sent-20, score-0.278]
9 This has lead to a number of binarization algorithms for LCFRSs, as well as factorization algorithms that factor rules into new rules with smaller rank, with- out necessarily reducing rank all the way to two. [sent-21, score-0.3]
10 Gildea (2010) presents a related method for binarizing rules while keeping the time complexity of parsing as small as possible. [sent-32, score-0.305]
11 Gildea (201 1) shows that a polynomial time algorithm for factorizing LCFRSs in order to minimize time complexity would imply an improved approximation algorithm for the well-studied graph-theoretic property known as treewidth. [sent-34, score-0.248]
12 For SCFG, Satta and Peserico (2005) showed that the exponent in the time complexity of parsing algorithms must grow at least as fast as the square root of the rule rank, and Gildea and Sˇtefankovi cˇ (2007) tightened this bound to be linear in the rank. [sent-37, score-0.366]
13 (2009) mention that whether finding the optimal parsing strategy for an SCFG rule is NPhard is an important problem for future work. [sent-39, score-0.3]
14 In this paper, we investigate the problem of rule binarization for LCFRSs in the context of headdriven parsing strategies. [sent-40, score-0.274]
15 Head-driven strategies begin with one rhs symbol, and add one nonterminal at a time. [sent-41, score-0.277]
16 This rules out any factorization in which two subsets of nonterminals of size greater than one are combined in a single step. [sent-42, score-0.262]
17 However, the binarization used by Kallmeyer and Maier 451 (2010) simply proceeds left to right through the rule, without considering the impact of the parsing strategy on either time or space complexity. [sent-45, score-0.401]
18 We examine the question of whether we can efficiently find the strategy that minimizes either the time complexity or the space complexity of parsing. [sent-46, score-0.416]
19 We provide what is to our knowledge the first NP-hardness result for a grammar factorization problem, which we hope will aid in understanding parsing algorithms in general. [sent-56, score-0.255]
20 2 LCFRSs and parsing complexity × In this section we briefly introduce LCFRSs and define the problem of optimizing head-driven parsing complexity for these formalisms. [sent-57, score-0.458]
21 This is done by associating each production p of a grammar with a function g that takes as input the tuples generated by the nonterminals in p’s rhs, and rearranges their string components into a new tuple, possibly adding some alphabet symbols. [sent-63, score-0.491]
22 Example 1 g1(hx1,1 , x1,2i) = hx1,1x1,2i takes as input a tuple w(hitxh two strings a hnxd returns a tuple with a single string, obtained by concatenating the components in the input tuple. [sent-85, score-0.325]
23 g2 (hx1,1 , x1,2i) = hax1,1b, cx1,2di takes as input a tuple withi )tw =o strings and wraps kaersoun asd itnhpesuet strings w witihth symbols a, b, c, d ∈ V . [sent-86, score-0.227]
24 2 = A linear context-free rewriting system is a tuple G = (VN, VT, P, S), where VN and VT are finite, disjoint alphabets of nonterminal and terminal symbols, respectively. [sent-89, score-0.395]
25 Production (1) can be used to transform the r(g) string tuples generated by the nonterminals A1, . [sent-100, score-0.242]
26 For instance, hthavee string a =3b 3{ca3d3 is generated by means 452 fan-23 4out( A( A1 2∗⊕ A A 4s )t23)r∗a ⊕ te⊕gA( yA 34)2 ∗⊕ ⊕A A 31) 2 Figure 1: Some parsing strategies for production p in Example 3, and the associated maximum value for fan-out. [sent-110, score-0.418]
27 In the general case of LCFRSs, parsing of a production p as in (1) can be carried out in r(g) − 1 steps, collecting already baevai claarbrliee parses nfo rr( gn)on −ter 1minals A1, . [sent-119, score-0.29]
28 Any parsing strategy results in a complete parse of p, spanning f(p) = f(A) segments of w and represented by some item with 2f(A) indices. [sent-125, score-0.287]
29 It turns out that the best parsing strategy leads to fan-out 2. [sent-135, score-0.258]
30 2 The maximum fan-out f realized by a parsing strategy determines the space complexity of the parsing algorithm. [sent-136, score-0.516]
31 t Theh special cases poafc parsing bpalesxedity on C OF(G|ws and TAGs, this provides the wellknown space complexity of O(|w|2) and O(|w|4), respectively. [sent-141, score-0.258]
32 It can also be shown that, if a partial parse having fan-out f is obtained by means of the combination of two partial parses with fan-out f1 and f2, respectively, the resulting time complexity will be O(|w|f+f1+f2) (Seki et al. [sent-142, score-0.367]
33 As an example, in the case of parsing based on CFGs, nonterminals as well as partial parses all have fan- out one, resulting in the standard time complexity of O( |w |3) of dynamic programming methods. [sent-144, score-0.51]
34 When parsing with TAGs, we have to manipulate objects with fan-out two (in the worst case), resulting in time complexity of O(|w|6). [sent-145, score-0.269]
35 Optimizing the parsing complexity for a production means finding a parsing strategy that results in minimum space or time complexity. [sent-147, score-0.693]
36 In the MIN SPACE STRAT453 EGY problem one takes as input an LCFRS production p and an integer k, and must decide whether there exists a parsing strategy for p with maximum fan-out not larger than k. [sent-149, score-0.478]
37 In the MIN TIME STRATEGY problem one is given p and k as above and must decide whether there exists a parsing strategy for p such that, in any of its steps merging two partial parses with fan-out f1and f2 and resulting in a partial parse with fan-out f,the relation f+f1+f2 ≤ k holds. [sent-150, score-0.479]
38 In a head-driven strategy, one always starts parsing a production p from a fixed nonterminal in its rhs, called the head of p, and merges the remaining nonterminals one at a time with the partial parse containing the head. [sent-152, score-0.666]
39 Thus, under these strategies, the construction of partial parses that do not include the head is forbidden, and each parsing step involves at most one partial parse. [sent-153, score-0.345]
40 3 NP-completeness results For an LCFRS production p, let H be its head nonterminal, and let A1, . [sent-155, score-0.251]
41 A headdriven parsing strategy can be represented as a permutation π over the set [n] , prescribing that the nonhead nonterminals in p’s rhs should be merged with H in the order Aπ(1) , Aπ(2) , . [sent-159, score-0.552]
42 Given a graph M = (V, E) with set of vertices V and set of edges E, a linear arrangement of M is a bijective function h from V to [n] , where |V | = n. [sent-166, score-0.697]
43 The cutwidth of M at gap i∈ [n 1] and w|Vi |th = respect eto c a tliwniedatrh arrangement h i i∈s t[nhe− −n1u]m abnedr ofedges crossing the gap between the i-th vertex and its successor: − cw(M, h, i) = |{(u, v) ∈ E | h(u) ≤ i< h(v) }| . [sent-167, score-0.879]
44 The cutwidth of M is then defined as cw(M) = mhin i∈m[na−x1] cw(M,h,i) . [sent-169, score-0.399]
45 Theorem 1 The MIN SPACE STRATEGY problem restricted to head-driven parsing strategies is NPcomplete. [sent-172, score-0.248]
46 e loops dssou mnoet affect the value of the cutwidth and can therefore be removed. [sent-182, score-0.4]
47 Productionp has a head nonterminal H and a nonhead nonterminal Ai for each vertex vi ∈ V . [sent-184, score-0.517]
48 For each vi ∈ V , let E(vi) ⊆ E be the set of edges impinging on vi; t Ehu(sv |E(vi) | i bse et thhee degree of vi. [sent-189, score-0.231]
49 We let Ai generate a tuple )w|i tish tthweo string components for each ej ∈ E(vi). [sent-190, score-0.235]
50 Let vs and vt be the, endpoints of ei, with vs, vt ∈ V and s < t. [sent-205, score-0.229]
51 Observe that whenever edge ei impinges on vertex vj, then the left and right strings generated by Aj and associated with ei wrap around the string generated by H and associated with the same edge. [sent-207, score-0.465]
52 Example 4 Given the input graph of Figure 2, our reduction constructs the LCFRS production shown in Figure 3. [sent-209, score-0.249]
53 For each edge in the graph of Figure 2, we have a group of five spans in the production: one for the head nonterminal, and two spans for each of the two nonterminals corresponding to the edge’s endpoints. [sent-211, score-0.354]
54 2 Assume now some head-driven parsing strategy π for p. [sent-212, score-0.258]
55 For each i ∈ [n], we define Diπ to be the partial parse oeabcthain ie d∈ a [nfte],r step if nine π, consisting of the merge of nonterminals H, Aπ(1) , . [sent-213, score-0.286]
56 We observe that for any Diπ that includes or excludes both nonterminals As and At, the αj component in the definition of p is associated with a single string, and therefore contributes with a single unit to the fan-out of the partial parse. [sent-218, score-0.253]
57 On the other hand, if Diπ includes only one nonterminal between As and At, the αj component is associated with two strings and contributes with two units to the fan-out of the partial parse. [sent-219, score-0.278]
58 We can associate with π a linear arrangement hπ of M by letting hπ (vπ(i) ) = i, for each vi ∈ V . [sent-220, score-0.438]
59 To show that MIN SPACE STRATEGY is in NP, consider a nondeterministic algorithm that, given an LCFRS production p and an integer k, guesses a parsing strategy π for p, and tests whether f(Diπ) k for each i ∈ [n] . [sent-225, score-0.45]
60 Recall that we are now concerned with the quantity f1+ f2 + f, where f1is the fan-out of some partial parse D, f2 is the fan-out of a nonterminal A, and f is the fan out of the partial parse resulting from the merge of the two previous analyses. [sent-231, score-0.362]
61 Let M = (V, E) be some graph with |V | = n, and let h be a linear arrangement hfo wr tMh . [sent-233, score-0.473]
62 The modified cutwidth of M is defined as mcw(M) = mhin mi∈[anx] mcw(M,h,i) . [sent-235, score-0.478]
63 We strengthen this result below; recall that a cubic graph is a graph without self loops where each vertex has degree three. [sent-239, score-0.529]
64 For each vertex of G, G′ includes a component R(2n4, 8n4 +8). [sent-246, score-0.246]
65 Finally, for each edge (vi, vj) of G there is an edge in G′ connecting a degree 2 vertex in the component corresponding to the vertex vi with a degree 2 vertex in the component corresponding to the vertex vj. [sent-250, score-1.203]
66 t the modified cutwidth of R(r, c) is r − 1 whenever r ≥ 3 and c ≥ 4dtrh + f8 . [sent-254, score-0.447]
67 R They iasls ro −sh 1ow w thheante an optimal lainndear arrangement fheory G al′ shoa ss htohew fo thrmat depicted ianl Figure 6, where half of the vertex components are to the left of the H-shaped graph and all the other vertex components are to the right. [sent-255, score-0.908]
68 Modifying the vertex components All vertices x of degree 2 of the components corresponding to a vertex in G can be transformed into a vertex of degree 3 by adding five vertices x1, . [sent-259, score-1.3]
69 Observe that these five vertices can be positioned in the arrangement immediately after x in the order x1, x2, x5, x3, x4 (see the right part of the figure). [sent-263, score-0.51]
70 The resulting maximum modified cutwidth can increase by 2 in correspondence of vertex x5. [sent-264, score-0.651]
71 Since the vertices of these components, in the optimal arrangement, have modified cutwidth smaller than 456 n3 n2, an increase by 2 is still smaller than the maximum modified cutwidth of the entire graph, which is 3n4 O(n3). [sent-265, score-1.168]
72 2n4 + + + Modifying the middle bar of the H-shaped graph The vertices of degree 2 of this part of the graph can be modified as in the previous paragraph. [sent-266, score-0.619]
73 Indeed, in the optimal arrangement, these vertices have modified cutwidth smaller than 2n4 + 2n3 + n2, and an increase by 2 is still smaller than the maximum cutwidth of the entire graph. [sent-267, score-1.089]
74 Modifying the left/right columns of the H-shaped graph We replace the two copies of component R(3n4, 12n4 + 12) with two copies of the new component D(3n4, 24n4 + 16) shown in Figure 7, which is a cubic graph. [sent-268, score-0.276]
75 In order to prove that relation (2) still holds, it suffices to show that the modified cutwidth of the component D(r, c) is still r 1 wfiehde cneuvtweri r ≥ 3f tahned c c = 8orn + t1 6D. [sent-269, score-0.489]
76 We first observe that the linear arrangement obtained by visiting the vertices of D(r, c) from top to bottom and from left to right has modified cutwidth r −1. [sent-270, score-1.026]
77 es L ientt uos tw noow wsu pbrsoevets V1 a,n fdor V2 yw pithar |V1|, |V2 | ≥ 4r2, there exist at least r disjoint paths h be |Vtw|e,e|nV vertices of V1 and vertices of V2. [sent-272, score-0.583]
78 − • • Any row has (at least) one vertex in V1 and one vAenrytex ro iwn V2: (iant t lheiass case, i tv eisr easy t oV see there exist at least r disjoint paths between vertices of V1 and vertices of V2. [sent-274, score-0.787]
79 There exist at least 3r ‘mixed’ columns, that is, cTohleurmen esx iwsti atht (at least) one vede’rt cexol uinm V1 ,a tnhda one vertex in V2. [sent-275, score-0.232]
80 Again, it is easy to see that there exist at least r disjoint paths between vertices Figure 6: The optimal arrangement of G′. [sent-276, score-0.671]
81 of V1 and vertices of V2 (at least one path every three columns). [sent-277, score-0.26]
82 I)f =the r V1-columns interleave with the V2-columns, then there exist at least 2(r −1) disjoint paths between vertices aoft V1 atn 2d( rve−rt1ic)e dsi osjfo V2. [sent-284, score-0.351]
83 Otherwise, eanll vtheret V1columns precede or follow all the V2-columns (this corresponds to the optimal arrangement): in this case, there are r disjoint paths between vertices of V1 and vertices of V2. [sent-285, score-0.597]
84 Observe now that any linear arrangement partitions the set of vertices in D(r, c) into the sets V1, consisting of the first 4r2 vertices in the arrangement, and V2, consisting of all the remaining vertices. [sent-286, score-0.811]
85 We can now reduce from the MODIFIED CUTWIDTH problem for cubic graphs to the MIN TIME STRATEGY problem restricted to head-driven parsing strategies. [sent-292, score-0.246]
86 Theorem 2 The MIN TIME STRATEGYproblem restricted to head-driven parsing strategies is NPcomplete. [sent-293, score-0.248]
87 n the proof of Theorem 1, with rhs nonterminals H, A1, . [sent-300, score-0.283]
88 Assume now some head-driven parsing strategy π for p. [sent-305, score-0.258]
89 After parsing step i ∈ [n], we have a partial parse Diπ consisting otefp pth ie ∈ merge oef hnaovnete arm painrtaialsl H, Aπ(1) , . [sent-306, score-0.279]
90 Again, we associate with π a linear arrangement hπ of M by letting hπ (vπ(i) ) = i, for each vi ∈ V . [sent-312, score-0.438]
91 As in the proof of Theorem 1, the fan-out of∈ Diπ is then related to the cutwidth of the linear arrangement hπ of M at position iby f(Diπ) = |E| + cw(M, hπ, i) . [sent-313, score-0.781]
92 From the proof of Theorem 1, the fan-out of nonterminal Aπ(i) is twice the degree of vertex vπ(i), de- noted by |E(vπ(i)) |. [sent-314, score-0.43]
93 π, − The following general relation between cutwidth and modified cutwidth is rather intuitive: mcw(M,hπ,i) =21· [cw(M,hπ,i − 1) + − |E(vπ(i) ) | + cw(M, hπ, i)] . [sent-317, score-0.815]
94 We can thus conclude that there exists a head-driven parsing strategy for p with time complexity not greater than 2 · |E| + 9 + 2 · k = k′ if and only gifr mcw(M) ≤ k· . [sent-320, score-0.404]
95 It is now easy to see that the problem of finding a space- or time-optimal parsing strategy for a LCFRS production is NP-hard as well, and thus cannot be solved in polynomial (deterministic) time unless P = NP. [sent-324, score-0.466]
96 4 Concluding remarks Head-driven strategies are important in parsing based on LCFRSs, both in order to allow statistical modeling of head-modifier dependencies and in order to generalize the Markovization of CFG parsers 458 to parsers with discontinuous spans. [sent-325, score-0.248]
97 possible head-driven strategies for an LCFRS production with a head and n modifiers. [sent-327, score-0.261]
98 Choosing among these possible strategies affects both the time and the space complexity of parsing. [sent-328, score-0.269]
99 Thus the computational complexity of optimizing the choice of the parsing strategy for SCFGs is still an open problem. [sent-338, score-0.364]
100 This is in contrast to the findings of Gildea (201 1), which show that, for unrestricted parsing strategies, a polynomial time algorithm for minimizing parsing complexity would imply an improved approximation algorithm for finding the treewidth of general graphs. [sent-340, score-0.423]
wordName wordTfidf (topN-words)
[('cutwidth', 0.368), ('lcfrs', 0.322), ('lcfrss', 0.322), ('arrangement', 0.278), ('vertices', 0.232), ('vertex', 0.204), ('mcw', 0.154), ('production', 0.137), ('strategy', 0.135), ('nonterminals', 0.13), ('parsing', 0.123), ('di', 0.118), ('satta', 0.106), ('complexity', 0.106), ('giorgio', 0.1), ('vt', 0.1), ('nonterminal', 0.096), ('rewriting', 0.096), ('factorization', 0.096), ('strategies', 0.094), ('cw', 0.093), ('vi', 0.091), ('min', 0.09), ('rhs', 0.087), ('graph', 0.084), ('tuple', 0.081), ('partial', 0.081), ('modified', 0.079), ('headdriven', 0.077), ('maier', 0.077), ('binarization', 0.074), ('gildea', 0.074), ('cfgs', 0.072), ('linear', 0.069), ('proof', 0.066), ('string', 0.064), ('scfgs', 0.064), ('degree', 0.064), ('theorem', 0.063), ('guez', 0.062), ('kuhlmann', 0.062), ('kallmeyer', 0.061), ('makedon', 0.061), ('cubic', 0.061), ('strings', 0.059), ('vn', 0.059), ('synchronous', 0.059), ('rank', 0.058), ('integer', 0.055), ('disjoint', 0.053), ('cut', 0.049), ('tuples', 0.048), ('ei', 0.048), ('components', 0.048), ('bar', 0.047), ('columns', 0.047), ('informatica', 0.046), ('stags', 0.046), ('merge', 0.046), ('let', 0.042), ('edge', 0.042), ('optimal', 0.042), ('component', 0.042), ('tc', 0.04), ('time', 0.04), ('paths', 0.038), ('adjoining', 0.037), ('universit', 0.037), ('rules', 0.036), ('grammar', 0.036), ('edges', 0.034), ('spans', 0.034), ('bw', 0.033), ('loops', 0.032), ('graphs', 0.031), ('restricted', 0.031), ('polynomial', 0.031), ('bisection', 0.031), ('factorizing', 0.031), ('firenze', 0.031), ('mhin', 0.031), ('monien', 0.031), ('nphard', 0.031), ('rve', 0.031), ('sistemi', 0.031), ('tefankovi', 0.031), ('discontinuous', 0.031), ('parses', 0.03), ('head', 0.03), ('contextfree', 0.03), ('space', 0.029), ('logic', 0.029), ('vs', 0.029), ('middle', 0.029), ('finite', 0.029), ('productions', 0.029), ('crossing', 0.029), ('parse', 0.029), ('input', 0.028), ('least', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Author: Pierluigi Crescenzi ; Daniel Gildea ; Andrea Marino ; Gianluca Rossi ; Giorgio Satta
Abstract: We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
2 0.14112081 323 acl-2011-Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections
Author: Dipanjan Das ; Slav Petrov
Abstract: We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (BergKirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm.
3 0.13701476 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars
Author: Mark-Jan Nederhof ; Giorgio Satta
Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.
4 0.13660011 296 acl-2011-Terminal-Aware Synchronous Binarization
Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea
Abstract: We present an SCFG binarization algorithm that combines the strengths of early terminal matching on the source language side and early language model integration on the target language side. We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality.
5 0.1073759 61 acl-2011-Binarized Forest to String Translation
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forestto-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-consitituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-tostring decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.
6 0.094787583 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing
7 0.083278358 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
8 0.081179932 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation
9 0.080681421 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar
10 0.078901373 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
11 0.078302905 314 acl-2011-Typed Graph Models for Learning Latent Attributes from Names
12 0.077750213 206 acl-2011-Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations
13 0.071770258 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing
14 0.068504632 107 acl-2011-Dynamic Programming Algorithms for Transition-Based Dependency Parsers
15 0.068377383 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
16 0.066320166 30 acl-2011-Adjoining Tree-to-String Translation
17 0.06520766 316 acl-2011-Unary Constraints for Efficient Context-Free Parsing
18 0.06061992 221 acl-2011-Model-Based Aligner Combination Using Dual Decomposition
19 0.059459727 171 acl-2011-Incremental Syntactic Language Models for Phrase-based Translation
20 0.058020983 243 acl-2011-Partial Parsing from Bitext Projections
topicId topicWeight
[(0, 0.151), (1, -0.074), (2, -0.016), (3, -0.11), (4, -0.016), (5, -0.038), (6, -0.1), (7, 0.004), (8, -0.041), (9, -0.043), (10, 0.013), (11, 0.015), (12, 0.011), (13, 0.073), (14, -0.024), (15, -0.044), (16, 0.024), (17, 0.024), (18, 0.042), (19, -0.01), (20, 0.013), (21, -0.005), (22, 0.014), (23, -0.015), (24, -0.03), (25, -0.058), (26, -0.082), (27, -0.037), (28, 0.038), (29, -0.002), (30, 0.081), (31, 0.057), (32, 0.001), (33, 0.052), (34, 0.035), (35, 0.004), (36, 0.133), (37, 0.196), (38, -0.032), (39, 0.044), (40, -0.049), (41, -0.047), (42, 0.023), (43, 0.01), (44, 0.07), (45, -0.064), (46, -0.028), (47, -0.014), (48, -0.043), (49, -0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.92738092 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Author: Pierluigi Crescenzi ; Daniel Gildea ; Andrea Marino ; Gianluca Rossi ; Giorgio Satta
Abstract: We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
2 0.72307718 296 acl-2011-Terminal-Aware Synchronous Binarization
Author: Licheng Fang ; Tagyoung Chung ; Daniel Gildea
Abstract: We present an SCFG binarization algorithm that combines the strengths of early terminal matching on the source language side and early language model integration on the target language side. We also examine how different strategies of target-side terminal attachment during binarization can significantly affect translation quality.
3 0.65628892 250 acl-2011-Prefix Probability for Probabilistic Synchronous Context-Free Grammars
Author: Mark-Jan Nederhof ; Giorgio Satta
Abstract: We present a method for the computation of prefix probabilities for synchronous contextfree grammars. Our framework is fairly general and relies on the combination of a simple, novel grammar transformation and standard techniques to bring grammars into normal forms.
4 0.63110238 180 acl-2011-Issues Concerning Decoding with Synchronous Context-free Grammar
Author: Tagyoung Chung ; Licheng Fang ; Daniel Gildea
Abstract: We discuss some of the practical issues that arise from decoding with general synchronous context-free grammars. We examine problems caused by unary rules and we also examine how virtual nonterminals resulting from binarization can best be handled. We also investigate adding more flexibility to synchronous context-free grammars by adding glue rules and phrases.
5 0.52221924 61 acl-2011-Binarized Forest to String Translation
Author: Hao Zhang ; Licheng Fang ; Peng Xu ; Xiaoyun Wu
Abstract: Tree-to-string translation is syntax-aware and efficient but sensitive to parsing errors. Forestto-string translation approaches mitigate the risk of propagating parser errors into translation errors by considering a forest of alternative trees, as generated by a source language parser. We propose an alternative approach to generating forests that is based on combining sub-trees within the first best parse through binarization. Provably, our binarization forest can cover any non-consitituent phrases in a sentence but maintains the desirable property that for each span there is at most one nonterminal so that the grammar constant for decoding is relatively small. For the purpose of reducing search errors, we apply the synchronous binarization technique to forest-tostring decoding. Combining the two techniques, we show that using a fast shift-reduce parser we can achieve significant quality gains in NIST 2008 English-to-Chinese track (1.3 BLEU points over a phrase-based system, 0.8 BLEU points over a hierarchical phrase-based system). Consistent and significant gains are also shown in WMT 2010 in the English to German, French, Spanish and Czech tracks.
6 0.521312 323 acl-2011-Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections
7 0.5039416 303 acl-2011-Tier-based Strictly Local Constraints for Phonology
8 0.50356364 154 acl-2011-How to train your multi bottom-up tree transducer
9 0.4972674 107 acl-2011-Dynamic Programming Algorithms for Transition-Based Dependency Parsers
10 0.48477742 123 acl-2011-Exact Decoding of Syntactic Translation Models through Lagrangian Relaxation
11 0.46793577 314 acl-2011-Typed Graph Models for Learning Latent Attributes from Names
12 0.44952425 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates
13 0.42389759 243 acl-2011-Partial Parsing from Bitext Projections
14 0.41483724 342 acl-2011-full-for-print
15 0.40228912 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing
16 0.40190226 143 acl-2011-Getting the Most out of Transition-based Dependency Parsing
17 0.40110794 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing
18 0.39661652 11 acl-2011-A Fast and Accurate Method for Approximate String Search
19 0.39537394 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines
20 0.39517689 106 acl-2011-Dual Decomposition for Natural Language Processing
topicId topicWeight
[(4, 0.254), (5, 0.026), (17, 0.07), (26, 0.025), (37, 0.129), (39, 0.047), (41, 0.076), (55, 0.023), (59, 0.034), (62, 0.013), (72, 0.015), (91, 0.034), (96, 0.145)]
simIndex simValue paperId paperTitle
same-paper 1 0.79225135 234 acl-2011-Optimal Head-Driven Parsing Complexity for Linear Context-Free Rewriting Systems
Author: Pierluigi Crescenzi ; Daniel Gildea ; Andrea Marino ; Gianluca Rossi ; Giorgio Satta
Abstract: We study the problem offinding the best headdriven parsing strategy for Linear ContextFree Rewriting System productions. A headdriven strategy must begin with a specified righthand-side nonterminal (the head) and add the remaining nonterminals one at a time in any order. We show that it is NP-hard to find the best head-driven strategy in terms of either the time or space complexity of parsing.
2 0.66110969 111 acl-2011-Effects of Noun Phrase Bracketing in Dependency Parsing and Machine Translation
Author: Nathan Green
Abstract: Flat noun phrase structure was, up until recently, the standard in annotation for the Penn Treebanks. With the recent addition of internal noun phrase annotation, dependency parsing and applications down the NLP pipeline are likely affected. Some machine translation systems, such as TectoMT, use deep syntax as a language transfer layer. It is proposed that changes to the noun phrase dependency parse will have a cascading effect down the NLP pipeline and in the end, improve machine translation output, even with a reduction in parser accuracy that the noun phrase structure might cause. This paper examines this noun phrase structure’s effect on dependency parsing, in English, with a maximum spanning tree parser and shows a 2.43%, 0.23 Bleu score, improvement for English to Czech machine translation. .
3 0.66076612 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction
Author: Yee Seng Chan ; Dan Roth
Abstract: In this paper, we observe that there exists a second dimension to the relation extraction (RE) problem that is orthogonal to the relation type dimension. We show that most of these second dimensional structures are relatively constrained and not difficult to identify. We propose a novel algorithmic approach to RE that starts by first identifying these structures and then, within these, identifying the semantic type of the relation. In the real RE problem where relation arguments need to be identified, exploiting these structures also allows reducing pipelined propagated errors. We show that this RE framework provides significant improvement in RE performance.
4 0.6582073 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering
Author: Ang Sun ; Ralph Grishman ; Satoshi Sekine
Abstract: We present a simple semi-supervised relation extraction system with large-scale word clustering. We focus on systematically exploring the effectiveness of different cluster-based features. We also propose several statistical methods for selecting clusters at an appropriate level of granularity. When training on different sizes of data, our semi-supervised approach consistently outperformed a state-of-the-art supervised baseline system. 1
5 0.65449905 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction
Author: Shasha Liao ; Ralph Grishman
Abstract: Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not enough. In this paper, we present a novel self-training strategy, which uses Information Retrieval (IR) to collect a cluster of related documents as the resource for bootstrapping. Also, based on the particular characteristics of this corpus, global inference is applied to provide more confident and informative data selection. We compare this approach to self-training on a normal newswire corpus and show that IR can provide a better corpus for bootstrapping and that global inference can further improve instance selection. We obtain gains of 1.7% in trigger labeling and 2.3% in role labeling through IR and an additional 1.1% in trigger labeling and 1.3% in role labeling by applying global inference. 1
6 0.65353996 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
7 0.65292382 311 acl-2011-Translationese and Its Dialects
8 0.65239638 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations
9 0.65107334 73 acl-2011-Collective Classification of Congressional Floor-Debate Transcripts
10 0.65017915 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering
11 0.65015048 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation
12 0.64975715 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing
13 0.64804876 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers
14 0.64776486 34 acl-2011-An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment
15 0.64712548 309 acl-2011-Transition-based Dependency Parsing with Rich Non-local Features
16 0.64698488 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition
17 0.64622307 28 acl-2011-A Statistical Tree Annotator and Its Applications
18 0.64584076 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation
19 0.64468229 44 acl-2011-An exponential translation model for target language morphology
20 0.64456558 23 acl-2011-A Pronoun Anaphora Resolution System based on Factorial Hidden Markov Models