acl acl2013 acl2013-274 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm’s complexity, an optimization analogous to binarization of contextfree grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
Reference: text
sentIndex sentText sentNum sentScore
1 1 Introduction Hyperedge replacement grammar (HRG) is a context-free rewriting formalism for generating graphs (Drewes et al. [sent-5, score-0.403]
2 , 1997), and its synchronous counterpart can be used for transforming graphs to/from other graphs or trees. [sent-6, score-0.35]
3 A polynomial-time recognition algorithm for HRGs was described by Lautemann (1990), building on the work of Rozenberg and Welzl (1986) on boundary node label controlled grammars, and others have presented polynomial-time algorithms as well (Mazanek and Minas, 2008; Moot, 2008). [sent-10, score-0.62]
4 ” Indeed, the key step of the algorithm, which matches a rule against the input graph, is described at a very high level, so that it is not obvious (for a non-expert in graph algorithms) how to implement it. [sent-12, score-0.258]
5 The resulting algorithm is practical and is implemented as part of the open-source Bolinas toolkit for hyperedge replacement grammars. [sent-16, score-0.379]
6 The derivation starts with a single edge labeled with the nonterminal symbol S . [sent-22, score-0.45]
7 The first rewriting step replaces this edge with a subgraph, which we might read as “The 924 ProceedingSsof oifa, th Beu 5l1gsarti Aan,An uuaglu Mste 4e-ti9n2g 0 o1f3 t. [sent-23, score-0.328]
8 ” The second rewriting step replaces the X edge with another subgraph, which we might read as “The boy wants the girl to believe something (Y) involving both of them. [sent-27, score-0.579]
9 2 Definitions The graphs we use in this paper have edge labels, but no node labels; while node labels are intuitive for many graphs in NLP, using both node and edge labels complicates the definition of hyperedge grammar and algorithms. [sent-30, score-1.591]
10 All of our graphs are directed (ordered), as the purpose of most graph structures in NLP is to model dependencies between entities. [sent-31, score-0.303]
11 An edge-labeled, ordered hypergraph is a tuple H = hV, E, ℓi, where • • • V is a finite set of nodes E ⊆ V+ is a finite set of hyperedges, each of Ewh ⊆ich V connects one or more distinct nodes ℓ : E → C assigns a label (drawn from the ℓfin :ite E set C) to seiagcnhs edge. [sent-33, score-0.428]
12 For brevity we use the terms graph and hypergraph interchangeably, and similarly for edge and hyperedge. [sent-34, score-0.485]
13 A hypergraph fragment is a tuple hV, E, ℓ, Xi, where hV, E, ℓi is a hypergraph and hXV ∈ Vℓ,+X iis, a lhisetr eof h Vd,iEsti,nℓcit inso ade hs cpaelrlegdra pthhe exXter ∈nal V nodes. [sent-37, score-0.294]
14 The function of graph fragments in HRG is analogous to the right-hand sides of CFG rules and to elementary trees in tree adjoining grammars (Joshi and Schabes, 1997). [sent-38, score-0.465]
15 The external nodes indicate how to integrate a graph into another graph during a derivation, and are analogous to foot nodes. [sent-39, score-0.53]
16 Let e = (v1 · · · vk) be an edge in H with label A. [sent-46, score-0.233]
17 The graph language voef, a grammar lGo sisu rteh eo (possibly infinite) set of graphs H that have no edges with nonterminal labels such that S ⇒G∗H. [sent-54, score-0.56]
18 When a HRG rule (A → R) is applied to an edge e, nth ae mapping eof ( Aex t→erna Rl) )n iosde aps pilni e Rd to t ahen 925 Figure 2: Derivation of a hyperedge replacement grammar for a graph representing the meaning of “The boy wants the girl to believe that he wants her. [sent-55, score-1.232]
19 When writing grammar rules, we make this ordering explicit by writing the left hand side of a rule as an edge and indexing the external nodes of R on both sides, as shown in Figure 2. [sent-57, score-0.506]
20 HRG derivations are context-free in the sense that the applicability of each production depends on the nonterminal label of the replaced edge only. [sent-58, score-0.427]
21 This allows us to represent a derivation as a derivation tree, and sets of derivations of a graph as a derivation forest (which can in turn represented as hypergraphs). [sent-59, score-0.492]
22 Finally, let m be an arbitrary node of H called the marker node, whose usage will become clear below. [sent-75, score-0.279]
23 1 Representing subgraphs Just as CKY deals with substrings (i, j] of the input, the HRG parsing algorithm deals with edgeinduced subgraphs Iof the input. [sent-77, score-0.58]
24 A boundbe ary node of I a node in I is which is either a node with an edge in H\I or an external node. [sent-85, score-0.832]
25 A boundary edge dogf e I i ns an edge inn e Ix wterhniaclh nhoadse a boundary node as an endpoint. [sent-86, score-1.416]
26 The boundary representation of I the tuple hbn(I), be(I, v), m ∈ Ii, where is • bn(I) is the set of boundary nodes of I • be(I, v) be the set of boundary edges of v in I • (m ∈ I) is a flag indicating whether the (mmark ∈er In)od ise i sa i fnl a Ig. [sent-87, score-1.364]
27 The boundary representation of Isuffices to specify I compactly. [sent-88, score-0.382]
28 If I and I′ are two subgraphs of H with the same boundary representation, then I= I′. [sent-90, score-0.606]
29 Suppose I I′; without loss of generality, suppose that there is an edge e that is in I I′. [sent-98, score-0.233]
30 Let π be the shortest path \ (ignoring edge direction) that begins with e and ends with a boundary node. [sent-99, score-0.646]
31 All the edges along π must be in I I′, or else there would be a boundary \ nmoudset bine tihne I \mIiddle of π, and π would not be the shortest path from e to a boundary node. [sent-100, score-0.934]
32 Then, in particular, the last edge of π must be in I\I′. [sent-101, score-0.295]
33 Since pita ahrtaisc a boundary tn eoddgee as an endpoint, i It must be a boundary edge of I, but cannot be a boundary edge of I′, which is a contradiction. [sent-102, score-1.674]
34 < < , If two subgraphs are disjoint, we can use their boundary representations to compute the boundary representation of their union. [sent-104, score-0.988]
35 Let Iand J be two subgraphs whose edges are disjoint. [sent-106, score-0.301]
36 A node v is a boundary node of I ∪ J iff one of the following holds: (i) v is a boundary node of one subgraph but not the other (ii) v is a boundary node of both subgraphs, and has an edge which is not a boundary edge of either. [sent-107, score-2.872]
37 An edge is a boundary edge of I J iff it has a ∪ boundary sn aod beo of Id ∪ y J e as an endpoint a itnd h aiss a boundary edge of I or J J. [sent-108, score-1.909]
38 (⇒) v has an edge in either I J and an or edge e o (⇒uts)id ve hbaosth a n I aedndg e J. [sent-110, score-0.466]
39 i nTh eeithreeforr Ie oitr must d be a a boundary node of either I J. [sent-111, score-0.63]
40 Moreover, e is not or a boundary edge of either, satisfying condition (ii). [sent-112, score-0.65]
41 (⇐) Case (i): without loss of generality, assume v is( a boundary :n woidteh oofu t I. [sent-113, score-0.382]
42 l oIst sh oasf an edge e ,in a I, uamnde therefore in I J, and an edge e′ outside I, which ∪ < must aolrseo nb e I o ∪u tJs,id aen d J. [sent-114, score-0.528]
43 a nF eord e J (because I and J are disjoint), and if e′ ∈ J, then v would be a boundary node of J. [sent-115, score-0.568]
44 Therefore, he′e v I ∪ J, so v i sa a boundary node of I J. [sent-116, score-0.568]
45 Case (ii): v h ∪as J an edge ∪ ian b Io aunndd atrhyer neofdoree o If ∪ J, a. [sent-117, score-0.233]
46 < This result leads to Algorithm 1, which runs in time linear in the number of boundary nodes. [sent-120, score-0.382]
47 Algorithm 1 Compute the union of two disjoint subgraphs I J. [sent-121, score-0.328]
48 2 Treewidth Lautemann’s algorithm tries to match a rule against the input graph all at once. [sent-126, score-0.31]
49 Gildea has shown that this problem is related to 927 the notion of treewidth (Gildea, 2011), which we review briefly here. [sent-129, score-0.261]
50 A tree decomposition of a graph H = hV, Ei is a tree T, each of whose nodes η iHs as =so hcVia,Etedi iwsi ath t sets Vη ⊆ Vh oanfd w Eη ⊆ E, ewsit hη the following properties: 1. [sent-131, score-0.552]
51 The treewidth of H Tish tehe w dmthini omf aTl sw midtahx |oVf any tree decomposition of H. [sent-143, score-0.456]
52 A tree decomposition of a graph fragment hV, E, Xi is a tree decomposition of hV, Ei that has thVhe, Ea,dXdiiti iosn aal t property thpaots atillo nth oef e hVxt,eErnia tlh naot hdaess belong to Vη for some η. [sent-144, score-0.673]
53 This decomposition has width three, because its largest node has 4 elements. [sent-147, score-0.349]
54 In general, a tree has width one, and it can be shown that a graph has treewidth at most two iff it does not have the following graph as a minor (Bodlaender, 1997): η ∈ K4= Finding a tree decomposition with minimal width is in general NP-hard (Arnborg et al. [sent-148, score-1.049]
55 However, we find that for the graphs we are interested in in NLP applications, even a na¨ ıve algorithm gives tree decompositions of low width in practice: simply perform a depth-first traversal of the edges of the graph, forming a tree T. [sent-150, score-0.537]
56 Any tree decomposition can be converted into one which is nice in the following sense (simplified from Cygan et al. [sent-159, score-0.24]
57 Each tree node η must be one of: • A leaf node, such that Vη = ∅. [sent-161, score-0.34]
58 • • A unary node, which introduces exactly one edge e. [sent-162, score-0.264]
59 For each production (A → R) ∈ G, f bined a a HnRicGe tree decomposition onf ( RA a n→d cRa)ll ∈ ∈it G TR. [sent-167, score-0.23]
60 dTh ae n itrceeew treideth d eocfo Gm ioss tihtieo nm oafx Rim aunmd 928 treewidth of any right-hand side in G. [sent-168, score-0.261]
61 The properties of tree decomposition allow us to limit the number of boundary nodes of the partially-recognized rule. [sent-170, score-0.668]
62 More formally, let RDη be the subgraph of R induced by the union of Eη′ for all η′ equal to or dominated by η. [sent-171, score-0.232]
63 Let R be a graph fragment, and assume a tree decomposition of R. [sent-174, score-0.369]
64 All the boundary nodes of RDη belong to Vη ∩ Vparent(η). [sent-175, score-0.525]
65 Node v must have an edge in RDη and therefore in Rη′ for some η′ dominated by or equal to η. [sent-178, score-0.328]
66 Since the root node contains all the external nodes, by the running intersection property, both Vη and Vparent(η) must contain v as well. [sent-180, score-0.325]
67 Therefore there must be a tree node not dominated by or equal to η that contains this edge, and therefore v. [sent-182, score-0.373]
68 This result, in turn, will allow us to bound the complexity ofthe parsing algorithm in terms ofthe treewidth of G. [sent-185, score-0.4]
69 A passive item has the form [A, I,X], where X ∈ V+ is an explicit ordering [oAf ,thI,eX boundary Xnod ∈es of I. [sent-190, score-0.451]
70 aAtn Aac ⇒tive item has the form [A → R, η, I,φ], where • • • • (A → R) is a production of G η is a node of TR I a subgraph of H is φ is a bijection between the boundary nodes oφf i RDη bainjedc tthioonse b oetfw w I. [sent-193, score-0.848]
71 If we assume that the TR are nice, then the in- ference rules that generate active items follow the different types of nodes in a nice tree decomposition: • Leaf [A → R,η,∅,∅] • where η is a leaf node of TR. [sent-197, score-0.57]
72 In the Nonterminal, Terminal, and Binary rules, we form unions of subgraphs and unions of mappings. [sent-201, score-0.294]
73 When forming the union of two subgraphs, we require that the subgraphs be disjoint (however, see Section 3. [sent-202, score-0.328]
74 Using grammar (a), the recognition algorithm would incorrectly accept the graph (b) by assembling together the three overlapping fragments (c). [sent-208, score-0.283]
75 For the Binary inference rule, active items should be indexed by their tree node (η1 or η2). [sent-211, score-0.427]
76 4 The disjointness check A successful proof using the inference rules above builds an HRG derivation (comprising all the rewrites used by the Nonterminal rule) which derives a graph H′, as well as a graph isomorphism φ : H′ → H (the union of the mappings from all the items). [sent-215, score-0.713]
77 During inference, whenever we form the union of two subgraphs, we require that the subgraphs be disjoint. [sent-216, score-0.271]
78 This is a rather expensive operation: it can be done using only their boundary representations, but the best algorithm we are aware of is still quadratic in the number of boundary nodes. [sent-217, score-0.816]
79 However, we can replace the disjointness check with a weaker and faster check such that any derivation that merges two non-disjoint subgraphs will ultimately fail, and therefore the derived graph H′ is isomorphic to the input graph H′ as desired. [sent-221, score-0.851]
80 This weaker check is to require, when merging two subgraphs I J, that: and 1. [sent-222, score-0.261]
81 If m belongs to both I and J, it must be a boundary node of both. [sent-224, score-0.63]
82 But condition (2) guarantees that φ maps only one node to the marker m. [sent-229, score-0.263]
83 We can show this again by induction: if φI and φJ each map only one node to m, then φI∪φJ must map only one node to m, by a combinatio∪n φof condition (2) and the fact that the inference rules guarantee that φI, φJ, and φI ∪ φJ are one-to-one on boundary nodes. [sent-230, score-0.936]
84 Then we can show that, since m is recognized exactly once, the whole graph is also recognized exactly once. [sent-231, score-0.242]
85 2 Choose a path π (ignoring edge direction) from v to m. [sent-239, score-0.264]
86 5 Complexity The key to the efficiency of the algorithm is that the treewidth of G leads to a bound on the number of boundary nodes we must keep track of at any time. [sent-247, score-0.848]
87 Each inference rule mentions only boundary nodes of RDη or RDηi, all of which belong to Vη (by Proposition 3), so there are at most |Vη| ≤ k + 1 of them. [sent-250, score-0.648]
88 In the Nontermareina atl amnods Binary in kfe +re 1n coef rules, e Inach th boundary edge could belong to I J or neither. [sent-251, score-0.72]
89 For each active item [A → R, η, I,φ], every boundary node of RDη must belong to Vη ∩ Vparent(η) (by Proposition 3). [sent-254, score-0.716]
90 Therefore the numb∩erV of boundary nodes is at most k + 1 (but typically less), and the number of possible items is in O((2dn)k+1). [sent-255, score-0.549]
91 e F tree decomposition Ro,fR R∪,R∼′i )s,u wceh tchoant:• All the external nodes of both R and R′ belong to Vη ferorn some η. [sent-266, score-0.379]
92 In the synchronous parsing algorithm, passive items have the form [A, I,X, I′, X′] and active items have the form [A → R : R′, η, I,φ, I′, φ′]. [sent-269, score-0.387]
93 d The complexity of the parsing algorithm is again in O((3dn)k+1), where k is now the maxaimgauimn itnre eOw(i(3dth of the dependency graph as defined in this section. [sent-272, score-0.313]
94 In general, this treewidth will be greater than the treewidth of either the source or target side on its own, so that synchronous parsing is generally slower than standard parsing. [sent-273, score-0.654]
95 5 Conclusion Although Lautemann’s polynomial-time extension of CKY to HRGs has been known for some time, the desire to use graph grammars for large-scale NLP applications introduces some practical considerations not accounted for in Lautemann’s original presentation. [sent-274, score-0.244]
96 )k+ I1t) r space, Ow(h(e3re n is the numrbeeqru uoirfe enso Ode((s in the input graph, d is its maximum degree, and k is the maximum treewidth of the rule right-hand sides in the grammar. [sent-277, score-0.378]
97 Solving connectivity problems parameterized by treewidth in single exponential time. [sent-301, score-0.261]
98 The complexity of graph languages generated by hyperedge replacement. [sent-330, score-0.426]
99 Parsing of hyperedge replacement grammars with graph parser combinators. [sent-334, score-0.571]
100 Lambek grammars, tree adjoining grammars and hyperedge replacement grammars. [sent-339, score-0.489]
wordName wordTfidf (topN-words)
[('boundary', 0.382), ('hrg', 0.301), ('treewidth', 0.261), ('edge', 0.233), ('subgraphs', 0.224), ('hyperedge', 0.205), ('node', 0.186), ('graph', 0.174), ('hrgs', 0.14), ('lautemann', 0.14), ('graphs', 0.129), ('nonterminal', 0.123), ('replacement', 0.122), ('bn', 0.117), ('hv', 0.112), ('decomposition', 0.103), ('subgraph', 0.101), ('rewriting', 0.095), ('derivation', 0.094), ('girl', 0.092), ('synchronous', 0.092), ('tree', 0.092), ('nodes', 0.091), ('boy', 0.084), ('rule', 0.084), ('bolinas', 0.08), ('rd', 0.079), ('hypergraph', 0.078), ('edges', 0.077), ('items', 0.076), ('wants', 0.075), ('disjointness', 0.071), ('grammars', 0.07), ('passive', 0.069), ('must', 0.062), ('rozenberg', 0.062), ('proposition', 0.061), ('vparent', 0.06), ('width', 0.06), ('finite', 0.059), ('grammar', 0.057), ('disjoint', 0.057), ('fragment', 0.057), ('cky', 0.054), ('atl', 0.053), ('bijection', 0.053), ('belong', 0.052), ('algorithm', 0.052), ('let', 0.051), ('analogous', 0.05), ('tuple', 0.05), ('hn', 0.048), ('complexity', 0.047), ('union', 0.047), ('definition', 0.047), ('geoquery', 0.046), ('rules', 0.046), ('nice', 0.045), ('terminal', 0.044), ('marker', 0.042), ('connected', 0.041), ('external', 0.041), ('binarization', 0.041), ('arnborg', 0.04), ('cygan', 0.04), ('drewes', 0.04), ('edgeinduced', 0.04), ('gogate', 0.04), ('isomorphic', 0.04), ('mazanek', 0.04), ('pilipczuk', 0.04), ('treewidths', 0.04), ('parsing', 0.04), ('inference', 0.039), ('vk', 0.039), ('grzegorz', 0.038), ('productions', 0.038), ('check', 0.037), ('derivations', 0.036), ('jones', 0.036), ('intersection', 0.036), ('decompositions', 0.035), ('unions', 0.035), ('condition', 0.035), ('tang', 0.035), ('production', 0.035), ('generality', 0.034), ('recognized', 0.034), ('active', 0.034), ('sides', 0.033), ('dominated', 0.033), ('iff', 0.033), ('path', 0.031), ('unary', 0.031), ('shieber', 0.031), ('moritz', 0.031), ('isomorphism', 0.031), ('eof', 0.031), ('endpoint', 0.031), ('locally', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm’s complexity, an optimization analogous to binarization of contextfree grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
2 0.16098922 140 acl-2013-Evaluating Text Segmentation using Boundary Edit Distance
Author: Chris Fournier
Abstract: This work proposes a new segmentation evaluation metric, named boundary similarity (B), an inter-coder agreement coefficient adaptation, and a confusion-matrix for segmentation that are all based upon an adaptation of the boundary edit distance in Fournier and Inkpen (2012). Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credit for near misses between boundaries, but are biased towards segmentations containing few or tightly clustered boundaries. Despite S’s improvements, its normalization also produces cosmetically high values that overestimate agreement & performance, leading this work to propose a solution.
3 0.12627092 165 acl-2013-General binarization for parsing and translation
Author: Matthias Buchse ; Alexander Koller ; Heiko Vogler
Abstract: Binarization ofgrammars is crucial for improving the complexity and performance of parsing and translation. We present a versatile binarization algorithm that can be tailored to a number of grammar formalisms by simply varying a formal parameter. We apply our algorithm to binarizing tree-to-string transducers used in syntax-based machine translation.
4 0.12239814 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
Author: Qun Liu ; Zhaopeng Tu ; Shouxun Lin
Abstract: In this paper, we propose a novel compact representation called weighted bipartite hypergraph to exploit the fertility model, which plays a critical role in word alignment. However, estimating the probabilities of rules extracted from hypergraphs is an NP-complete problem, which is computationally infeasible. Therefore, we propose a divide-and-conquer strategy by decomposing a hypergraph into a set of independent subhypergraphs. The experiments show that our approach outperforms both 1-best and n-best alignments.
5 0.10822773 4 acl-2013-A Context Free TAG Variant
Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber
Abstract: We propose a new variant of TreeAdjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient gram- mar estimation and parsing algorithms. We perform parsing experiments the Penn Treebank and draw comparisons to TreeSubstitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.
6 0.10553849 19 acl-2013-A Shift-Reduce Parsing Algorithm for Phrase-based String-to-Dependency Translation
7 0.091809854 296 acl-2013-Recognizing Identical Events with Graph Kernels
8 0.091287501 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
9 0.09051697 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation
10 0.088125929 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
11 0.080448836 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
12 0.078200758 314 acl-2013-Semantic Roles for String to Tree Machine Translation
13 0.078163564 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs
14 0.077470899 37 acl-2013-Adaptive Parser-Centric Text Normalization
15 0.074470229 285 acl-2013-Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees
16 0.073770098 128 acl-2013-Does Korean defeat phonotactic word segmentation?
17 0.073385552 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy
18 0.072378218 357 acl-2013-Transfer Learning for Constituency-Based Grammars
19 0.069566071 261 acl-2013-Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
20 0.067324549 173 acl-2013-Graph-based Semi-Supervised Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging
topicId topicWeight
[(0, 0.162), (1, -0.06), (2, -0.062), (3, -0.008), (4, -0.061), (5, 0.051), (6, 0.047), (7, -0.006), (8, -0.032), (9, 0.007), (10, -0.003), (11, 0.025), (12, 0.052), (13, -0.094), (14, -0.015), (15, -0.061), (16, 0.129), (17, 0.146), (18, -0.024), (19, 0.023), (20, -0.046), (21, 0.066), (22, 0.088), (23, 0.063), (24, -0.088), (25, -0.044), (26, -0.051), (27, 0.073), (28, -0.012), (29, 0.03), (30, -0.027), (31, -0.038), (32, -0.084), (33, -0.006), (34, 0.106), (35, -0.086), (36, -0.086), (37, -0.002), (38, -0.021), (39, 0.057), (40, 0.052), (41, 0.067), (42, 0.131), (43, 0.053), (44, 0.027), (45, -0.05), (46, -0.128), (47, 0.026), (48, -0.093), (49, -0.066)]
simIndex simValue paperId paperTitle
same-paper 1 0.97535586 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm’s complexity, an optimization analogous to binarization of contextfree grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
2 0.7953549 165 acl-2013-General binarization for parsing and translation
Author: Matthias Buchse ; Alexander Koller ; Heiko Vogler
Abstract: Binarization ofgrammars is crucial for improving the complexity and performance of parsing and translation. We present a versatile binarization algorithm that can be tailored to a number of grammar formalisms by simply varying a formal parameter. We apply our algorithm to binarizing tree-to-string transducers used in syntax-based machine translation.
3 0.61485231 348 acl-2013-The effect of non-tightness on Bayesian estimation of PCFGs
Author: Shay B. Cohen ; Mark Johnson
Abstract: Probabilistic context-free grammars have the unusual property of not always defining tight distributions (i.e., the sum of the “probabilities” of the trees the grammar generates can be less than one). This paper reviews how this non-tightness can arise and discusses its impact on Bayesian estimation of PCFGs. We begin by presenting the notion of “almost everywhere tight grammars” and show that linear CFGs follow it. We then propose three different ways of reinterpreting non-tight PCFGs to make them tight, show that the Bayesian estimators in Johnson et al. (2007) are correct under one of them, and provide MCMC samplers for the other two. We conclude with a discussion of the impact of tightness empirically.
4 0.5805555 311 acl-2013-Semantic Neighborhoods as Hypergraphs
Author: Chris Quirk ; Pallavi Choudhury
Abstract: Ambiguity preserving representations such as lattices are very useful in a number of NLP tasks, including paraphrase generation, paraphrase recognition, and machine translation evaluation. Lattices compactly represent lexical variation, but word order variation leads to a combinatorial explosion of states. We advocate hypergraphs as compact representations for sets of utterances describing the same event or object. We present a method to construct hypergraphs from sets of utterances, and evaluate this method on a simple recognition task. Given a set of utterances that describe a single object or event, we construct such a hypergraph, and demonstrate that it can recognize novel descriptions of the same event with high accuracy.
5 0.55488521 4 acl-2013-A Context Free TAG Variant
Author: Ben Swanson ; Elif Yamangil ; Eugene Charniak ; Stuart Shieber
Abstract: We propose a new variant of TreeAdjoining Grammar that allows adjunction of full wrapping trees but still bears only context-free expressivity. We provide a transformation to context-free form, and a further reduction in probabilistic model size through factorization and pooling of parameters. This collapsed context-free form is used to implement efficient gram- mar estimation and parsing algorithms. We perform parsing experiments the Penn Treebank and draw comparisons to TreeSubstitution Grammars and between different variations in probabilistic model design. Examination of the most probable derivations reveals examples of the linguistically relevant structure that our variant makes possible.
6 0.52828747 260 acl-2013-Nonconvex Global Optimization for Latent-Variable Models
7 0.52508813 285 acl-2013-Propminer: A Workflow for Interactive Information Extraction and Exploration using Dependency Trees
8 0.49040276 161 acl-2013-Fluid Construction Grammar for Historical and Evolutionary Linguistics
9 0.46796027 326 acl-2013-Social Text Normalization using Contextual Graph Random Walks
10 0.45054743 261 acl-2013-Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
11 0.43756819 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
12 0.43443549 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
13 0.43338841 37 acl-2013-Adaptive Parser-Centric Text Normalization
14 0.43221971 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures
15 0.42681608 320 acl-2013-Shallow Local Multi-Bottom-up Tree Transducers in Statistical Machine Translation
16 0.42589012 182 acl-2013-High-quality Training Data Selection using Latent Topics for Graph-based Semi-supervised Learning
17 0.42434773 140 acl-2013-Evaluating Text Segmentation using Boundary Edit Distance
18 0.41043067 280 acl-2013-Plurality, Negation, and Quantification:Towards Comprehensive Quantifier Scope Disambiguation
19 0.4102861 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning
20 0.40493754 176 acl-2013-Grounded Unsupervised Semantic Parsing
topicId topicWeight
[(0, 0.047), (1, 0.238), (6, 0.047), (11, 0.077), (14, 0.019), (15, 0.018), (24, 0.039), (26, 0.062), (35, 0.066), (42, 0.033), (48, 0.044), (64, 0.012), (70, 0.086), (88, 0.026), (90, 0.049), (95, 0.055)]
simIndex simValue paperId paperTitle
same-paper 1 0.82915455 274 acl-2013-Parsing Graphs with Hyperedge Replacement Grammars
Author: David Chiang ; Jacob Andreas ; Daniel Bauer ; Karl Moritz Hermann ; Bevan Jones ; Kevin Knight
Abstract: Hyperedge replacement grammar (HRG) is a formalism for generating and transforming graphs that has potential applications in natural language understanding and generation. A recognition algorithm due to Lautemann is known to be polynomial-time for graphs that are connected and of bounded degree. We present a more precise characterization of the algorithm’s complexity, an optimization analogous to binarization of contextfree grammars, and some important implementation details, resulting in an algorithm that is practical for natural-language applications. The algorithm is part of Bolinas, a new software toolkit for HRG processing.
2 0.68135512 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
Author: Karl Moritz Hermann ; Phil Blunsom
Abstract: Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is a fundamental task of Natural Language Processing. In this paper we draw upon recent advances in the learning of vector space representations of sentential semantics and the transparent interface between syntax and semantics provided by Combinatory Categorial Grammar to introduce Combinatory Categorial Autoencoders. This model leverages the CCG combinatory operators to guide a non-linear transformation of meaning within a sentence. We use this model to learn high dimensional embeddings for sentences and evaluate them in a range of tasks, demonstrating that the incorporation of syntax allows a concise model to learn representations that are both effective and general.
3 0.58223033 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
Author: Oleg Rokhlenko ; Idan Szpektor
Abstract: We introduce the novel task of automatically generating questions that are relevant to a text but do not appear in it. One motivating example of its application is for increasing user engagement around news articles by suggesting relevant comparable questions, such as “is Beyonce a better singer than Madonna?”, for the user to answer. We present the first algorithm for the task, which consists of: (a) offline construction of a comparable question template database; (b) ranking of relevant templates to a given article; and (c) instantiation of templates only with entities in the article whose comparison under the template’s relation makes sense. We tested the suggestions generated by our algorithm via a Mechanical Turk experiment, which showed a significant improvement over the strongest baseline of more than 45% in all metrics.
4 0.58105308 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu
Abstract: Shift-reduce dependency parsers give comparable accuracies to their chartbased counterparts, yet the best shiftreduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam-search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
5 0.57530653 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
Author: Anthony Fader ; Luke Zettlemoyer ; Oren Etzioni
Abstract: We study question answering as a machine learning problem, and induce a function that maps open-domain questions to queries over a database of web extractions. Given a large, community-authored, question-paraphrase corpus, we demonstrate that it is possible to learn a semantic lexicon and linear ranking function without manually annotating questions. Our approach automatically generalizes a seed lexicon and includes a scalable, parallelized perceptron parameter estimation scheme. Experiments show that our approach more than quadruples the recall of the seed lexicon, with only an 8% loss in precision.
6 0.57485527 318 acl-2013-Sentiment Relevance
7 0.57465553 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions
8 0.57419425 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
10 0.57374102 275 acl-2013-Parsing with Compositional Vector Grammars
11 0.57323235 80 acl-2013-Chinese Parsing Exploiting Characters
12 0.57200688 333 acl-2013-Summarization Through Submodularity and Dispersion
13 0.56844646 250 acl-2013-Models of Translation Competitions
14 0.56841069 165 acl-2013-General binarization for parsing and translation
15 0.56829244 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
16 0.5678404 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
17 0.56687123 144 acl-2013-Explicit and Implicit Syntactic Features for Text Classification
18 0.56640977 369 acl-2013-Unsupervised Consonant-Vowel Prediction over Hundreds of Languages
19 0.56504768 224 acl-2013-Learning to Extract International Relations from Political Context
20 0.56478709 343 acl-2013-The Effect of Higher-Order Dependency Features in Discriminative Phrase-Structure Parsing