emnlp emnlp2012 emnlp2012-88 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Michael White ; Rajakrishnan Rajkumar
Abstract: Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. We find that with a state-of-the-art, comprehensive realization ranking model, dependency length minimization yields statistically significant improvements in BLEU scores and significantly reduces the number of heavy/light ordering errors. Through distributional analyses, we also show that with simpler ranking models, dependency length minimization can go overboard, too often sacrificing canonical word order to shorten dependencies, while richer models manage to better counterbalance the dependency length minimization preference against (sometimes) competing canonical word order preferences.
Reference: text
sentIndex sentText sentNum sentScore
1 Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. [sent-3, score-0.602]
2 In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. [sent-4, score-1.091]
3 We find that with a state-of-the-art, comprehensive realization ranking model, dependency length minimization yields statistically significant improvements in BLEU scores and significantly reduces the number of heavy/light ordering errors. [sent-5, score-1.033]
4 We demonstrate empirically using OpenCCG, our CCG-based (Steedman, 2000) surface realization system, the utility of a global feature encoding 244 the total dependency length of a given derivation. [sent-8, score-0.673]
5 Although other works in the realization literature have used phrase length or head-dependent distances in their models (Filippova and Strube, 2009; Velldal and Oepen, 2005; White and Rajkumar, 2009, i.a. [sent-9, score-0.497]
6 ), to the best of our knowledge, this paper is the first to use insights from the minimal dependency length theory directly and study their effects, both qualitatively and quantitatively. [sent-11, score-0.344]
7 3, the full model shifts "before" next to the verb, despite the NP "cheating" being very light, yielding a very confusing ordering given that "before" is meant to be intransitive. [sent-19, score-0.186]
8 The syntactic features in White & Rajkumar's (2009) realization ranking model are taken from Clark & Curran's (2007) normal form model. [sent-20, score-0.395]
10 Table 1: Examples of OpenCCG output with White & Rajkumar’s (2009) models—the first represents a successful case, the latter two egregious ordering errors (Table 3; see Section 3). [sent-31, score-0.262]
11 As such, the model takes into account the interaction of dependency length with derivation steps, but in essence our investigation does not consider the main effect of dependency length itself. [sent-33, score-0.688]
12 It is important to observe at this point that dependency length minimization is more of a preference than an optimization objective, one that must at times be balanced against other word order preferences. [sent-35, score-0.583]
13 A closer reading of Temperley’s (2007) study reveals that dependency length can sometimes run counter to many canonical word order choices. [sent-36, score-0.381]
14 Assuming that their parent head is the main verb of the sentence, a long-short sequence would minimize overall dependency length. [sent-38, score-0.182]
15 However, in 613 examples found in the Penn Treebank, the average length of the first adjunct was 3. [sent-39, score-0.198]
16 Gildea and Temperley (2007) also suggest that adverb placement might involve cases which go against dependency length minimization. [sent-45, score-0.344]
17 An examination of 295 legitimate long-short post-verbal constituent orders (counter to dependency length) from Section 00 of the Penn Treebank revealed that temporal adverb phrases are often involved in long-short orders, as shown in wsj 0075. [sent-46, score-0.427]
18 In our setup, the preference to minimize dependency length can be balanced by features capturing preferences for alternate choices (e.g. [sent-48, score-0.389]
19 the argument-adjunct distinction in our dependency ordering model, Table 4). [sent-50, score-0.322]
20 Via distributional analyses, we show that while simpler realization ranking models can go overboard in minimizing dependency length, richer models largely succeed in overcoming this issue, while still taking advantage of dependency length minimization to avoid egregious ordering errors. [sent-51, score-1.424]
21 1 Minimal Dependency Length Comprehension and corpus studies (Gibson, 1998; Gibson, 2000; Temperley, 2007) point to the tendency of production and comprehension systems to adhere to principles of dependency length minimization. [sent-53, score-0.491]
22 The idea of dependency length minimization is based on Gibson’s (1998) Dependency Locality Theory (DLT) of comprehension, which predicts that longer dependencies are more difficult to process. [sent-54, score-0.538]
23 Table 2: Counter-examples to dependency length minimization comprehension (Lewis et al. [sent-64, score-0.608]
24 Extending these ideas from comprehension, Temperley (2007) poses the question: Does language production reflect a preference for shorter dependencies as well, so as to facilitate comprehension? [sent-66, score-0.083]
25 By means of a study of Penn Treebank data, Temperley shows that English sentences do display a tendency to minimize the sum of all their head-dependent distances as illustrated by a variety of constructions. [sent-67, score-0.083]
26 Tily (2010) also applies insights from the above-cited papers to show that dependency length constitutes a significant pressure towards language change. [sent-69, score-0.344]
27 In head-final languages (e.g., Japanese), dependency length minimization results in the “long-short” constituent ordering in language production (Yamashita and Chang, 2001). [sent-72, score-0.795]
28 More generally, Hawkins's (1994; 2000) processing domains, dependency length minimization, and end-weight effects in constituent ordering (Wasow and Arnold, 2003) are all very closely related. [sent-73, score-0.757]
29 However, it would be very reductive to consider dependency length minimization as the sole factor in language production. [sent-75, score-0.538]
30 These other preferences are either correlated with dependency length or can override the minimal dependency length preference. [sent-77, score-0.688]
31 As Temperley (2007) suggests, a satisfactory model should combine insights from multiple approaches, a theme which we investigate in this work by means of a rich feature set adapted from the parsing and realization literature. [sent-84, score-0.291]
32 Our feature design has been inspired by the conclusions of the above-cited works pertaining to the role of dependency length minimization in syntactic choice in conjunction with other factors influencing constituent order. [sent-85, score-0.435]
33 OpenCCG is a parsing/generation library which includes a hybrid symbolic-statistical chart realizer (White, 2006; White and Rajkumar, 2009). [sent-92, score-0.092]
34 The input to the OpenCCG realizer is a semantic graph, where each node has a lexical predication and a set of semantic features; nodes are connected via dependency relations. [sent-93, score-0.274]
35 Alternative realizations are ranked using integrated n-gram or averaged perceptron scoring models. [sent-95, score-0.141]
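To make the ranking step concrete, the following Python sketch shows a generic averaged perceptron ranker of the kind described here. The (gold_index, candidates) input format and the feature dictionaries are hypothetical stand-ins for OpenCCG's internal representations, and the update rule is the textbook one rather than the paper's exact training regime.

```python
from collections import defaultdict

def train_averaged_perceptron(examples, epochs=10):
    """Train an averaged perceptron ranker over candidate realizations.

    `examples` is a list of (gold_index, candidates) pairs, where
    `candidates` is a list of feature dicts (feature name -> value)
    and `gold_index` picks the reference realization. Returns averaged
    feature weights.
    """
    weights = defaultdict(float)
    totals = defaultdict(float)  # running sums of weights, for averaging
    steps = 0
    for _ in range(epochs):
        for gold_index, candidates in examples:
            def score(feats):
                return sum(weights[k] * v for k, v in feats.items())
            best = max(range(len(candidates)), key=lambda i: score(candidates[i]))
            if best != gold_index:
                # Promote the gold realization's features, demote the winner's.
                for k, v in candidates[gold_index].items():
                    weights[k] += v
                for k, v in candidates[best].items():
                    weights[k] -= v
            steps += 1
            for k, v in weights.items():
                totals[k] += v
    return {k: v / steps for k, v in totals.items()}
```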
36 In the experiments reported below, the inputs are derived from the gold standard derivations in the CCGbank (Hockenmaier and Steedman, 2007), and the outputs are the highest-scoring realizations found during the realizer’s chart-based search. [sent-96, score-0.187]
37 3 Feature Design In the realm of paraphrasing using tree linearization, Kempen and Harbusch (2004) explore features which have later been appropriated into classification approaches for surface realization (Filippova and Strube, 2007). [sent-97, score-0.329]
38 Prominent features include in… (Footnote 1: The realizer can also be run using inputs derived from OpenCCG's parser, though informal experiments suggest that parse errors tend to decrease generation quality.) [sent-98, score-0.128]
39 In the case of ranking models for surface realization, by far the most comprehensive experiments involving linguistically motivated features are reported in the work of Cahill for German realization ranking (Cahill et al. [sent-100, score-0.457]
40 The feature sets explored in this paper extend those in previous work on realization ranking with OpenCCG using averaged perceptron models (White and Rajkumar, 2009; Rajkumar et al. [sent-103, score-0.389]
41 , 2009; Rajkumar and White, 2010) to include more comprehensive ordering features. [sent-104, score-0.14]
42 The inclusion of the DEPORD features is intended to yield a model with a set of ordering features as rich as that of Cahill and Forster's (2009) realization ranking model for German. [sent-106, score-0.495]
43 DEPLEN The total dependency length between all semantic heads and their dependents for a realization, where length is measured in intervening words, excluding punctuation. [sent-108, score-0.412]
44 For length purposes, collapsed named entities were counted as a single word in the experiments reported here. [sent-109, score-0.162]
45 , 2010); however, realization ranking accuracy was slightly worse than counting all non-punctuation words. [sent-112, score-0.355]
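As a concrete illustration of the DEPLEN computation just defined, here is a Python sketch; the token-index representation of head-dependent pairs is an assumption for illustration, not OpenCCG's actual data structure.

```python
import string

def is_punct(token):
    """True if a token consists solely of punctuation characters."""
    return all(ch in string.punctuation for ch in token)

def deplen(tokens, dependencies):
    """Total dependency length of a realization (the DEPLEN feature).

    `tokens` is the realized word sequence with named entities already
    collapsed into single tokens, as in the paper's experiments;
    `dependencies` is a list of (head_index, dependent_index) pairs
    linking semantic heads to their dependents. Length is counted in
    intervening non-punctuation words.
    """
    total = 0
    for head, dep in dependencies:
        lo, hi = sorted((head, dep))
        total += sum(1 for t in tokens[lo + 1:hi] if not is_punct(t))
    return total

# Toy illustration (indices into the token list are hypothetical):
tokens = ["he", "has", "not", "said", "before", "that", "it", "works", "."]
deps = [(3, 0), (3, 1), (3, 2), (3, 4), (3, 7)]
print(deplen(tokens, deps))  # prints 6: intervening words summed over all pairs
```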
46 + Rel1 + Rel2 ⟨NN, left, DT, JJ⟩ ⟨NN, left, Det, Mod⟩ Table 4: Basic head-dependent and sibling-dependent ordering features … of) Hockenmaier's (2003) generative syntactic model. [sent-134, score-0.14]
47 C&C NF DISTANCE The distance features from the C&C normal form model, where the distance between a head and its dependent is measured in intervening words, punctuation marks or verbs; caps of 3, 3 and 2 (resp.). [sent-138, score-0.084]
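A sketch of how such capped distance counts can be computed, assuming a simple (word, POS) token representation; the treatment of punctuation and the Penn Treebank verb tags are assumptions of this illustration rather than details taken from the C&C code.

```python
import string

def cc_distance_features(tagged_tokens, head, dep):
    """Capped distance features in the style of Clark & Curran's normal
    form model: intervening words, punctuation marks and verbs between
    a head and its dependent, capped at 3, 3 and 2 respectively.
    `tagged_tokens` is a list of (word, POS) pairs.
    """
    lo, hi = sorted((head, dep))
    between = tagged_tokens[lo + 1:hi]
    n_punct = sum(1 for w, _ in between
                  if all(ch in string.punctuation for ch in w))
    n_verbs = sum(1 for _, pos in between if pos.startswith("VB"))
    n_words = len(between) - n_punct
    return {"dist_words": min(n_words, 3),
            "dist_punct": min(n_punct, 3),
            "dist_verbs": min(n_verbs, 2)}
```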
48 DEPORD Several classes of features for ordering heads and dependents as well as sibling dependents on the same side of the head. [sent-140, score-0.228]
49 The basic features—using words, POS tags and dependency relations, grouped by the broad POS tag of the head—are shown in Table 4. [sent-141, score-0.182]
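The sketch below illustrates the shape of such features; the exact inventory is a guess at the scheme in Table 4 (e.g. ⟨NN, left, DT, JJ⟩ for adjacent left dependents of a nominal head), not a faithful reproduction of the paper's feature templates.

```python
def depord_features(head_pos, dependents):
    """Head-dependent and sibling ordering features in the spirit of
    Table 4. `dependents` lists the head's dependents in surface order
    as (side, pos, rel) triples, with side "left" or "right".
    """
    broad = head_pos[:2]  # broad POS class of the head, e.g. NN, VB
    feats = []
    for side, pos, rel in dependents:
        feats.append((broad, side, pos))  # head-dependent feature
        feats.append((broad, side, rel))  # variant with the dependency relation
    # Sibling features: adjacent dependents on the same side of the head.
    for (s1, p1, _), (s2, p2, _) in zip(dependents, dependents[1:]):
        if s1 == s2:
            feats.append((broad, s1, p1, p2))  # e.g. (NN, left, DT, JJ)
    return feats
```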
50 The three groups of models are designed to test the impact of the dependency length feature when added to feature sets of increasing complexity. [sent-149, score-0.344]
51 The second group is centered on DEPORD-NODIST, which contains all features except the dependency length feature and the distance features in Clark & Curran’s normal form model, which may indirectly capture some dependency length minimization preferences. [sent-151, score-0.922]
52 In the final group, DEPORD-NF contains all the features examined in this paper except the dependency length feature, while DEPLEN contains all the features including the dependency length feature. [sent-153, score-0.688]
53 Note that the learned weight of the total dependency length feature was negative in each case, as expected. [sent-154, score-0.344]
54 2 BLEU Results Following the usual practice in realization ranking, we first evaluate our results quantitatively using exact matches and BLEU (Papineni et al. [sent-160, score-0.291]
55 As with the dev set, the dependency length feature yielded a significant increase in BLEU scores for each comparison on the test set as well. [sent-225, score-0.344]
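The significance claims here rest on resampling tests (the paper credits Kevin Gimpel's resampling scripts in a footnote below). A minimal paired bootstrap sketch, one standard such procedure, follows; treating BLEU as a sum of aligned per-sentence scores is a simplification of this sketch, since corpus BLEU properly requires re-aggregating n-gram statistics within each resample.

```python
import random

def paired_bootstrap(scores_a, scores_b, n_resamples=10000, seed=0):
    """Paired bootstrap resampling test for system A over system B,
    given aligned per-sentence quality scores. Returns an approximate
    p-value: the fraction of resampled test sets on which A fails to
    beat B.
    """
    rng = random.Random(seed)
    n = len(scores_a)
    losses = 0
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # sample sentences with replacement
        if sum(scores_a[i] for i in idx) <= sum(scores_b[i] for i in idx):
            losses += 1
    return losses / n_resamples
```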
56 In particular, note that DEPLEN and DEPORD-NF agree on the best realization 81% of the time, while DEPLEN-NODIST and DEPORD-NODIST have 78. [sent-228, score-0.262]
57 4% agreement; by comparison, DEPORD-NODIST and GLOBAL only agree on the best realization 51. [sent-230, score-0.291]
58 3 Detailed Analyses The effect of the dependency length feature on the distribution of dependency lengths is illustrated in Table 8. [sent-233, score-0.526]
59 The table shows the mean of the total dependency length of each realized derivation compared to the corresponding gold standard derivation, as well as the number of derivations with greater and lower dependency length. (Footnote 3: Kudos to Kevin Gimpel for making his resampling scripts available from http://www.) [sent-234, score-0.344]
60 Table 9: Distribution of various kinds of post-verbal constituents in the development set (Section 00); 4692 gold cases considered. [sent-273, score-0.27]
61 According to paired t-tests, the mean dependency lengths for the DEPLEN-NODIST and DEPLEN models do not differ significantly from the gold standard. [sent-274, score-0.226]
62 In contrast, the mean dependency length of all the models that do not include the dependency length feature does differ significantly (p < 0. [sent-275, score-0.688]
63 Additionally, all these models have more realizations with dependency length greater than the gold standard, in comparison to the dependency length minimizing models; this shows the efficacy of the dependency length feature in approximating the gold standard. [sent-277, score-1.278]
64 Interestingly, the DEPLEN-GLOBAL model significantly undershoots the gold standard on mean dependency length, and has the most skewed distribution of sentences with greater vs. lower dependency length. [sent-278, score-0.226]
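The paired t-test analysis can be reproduced along the following lines, assuming scipy is available and that per-sentence dependency length totals are held in aligned lists; the representation is an assumption of this sketch.

```python
from scipy import stats

def compare_deplen(model_totals, gold_totals, alpha=0.05):
    """Paired t-test on per-sentence total dependency lengths, model
    vs. gold standard, mirroring the Table 8 analysis.
    """
    t_stat, p_value = stats.ttest_rel(model_totals, gold_totals)
    mean_diff = sum(m - g for m, g in zip(model_totals, gold_totals)) / len(gold_totals)
    return {"mean_diff": mean_diff, "t": t_stat, "p": p_value,
            "differs": p_value < alpha}
```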
65 Apart from studying dependency length directly, we also looked at one of the attested effects of dependency length minimization, viz. [sent-280, score-0.506]
66 the tendency to prefer short-long post-verbal constituents in production (Temperley, 2007). [sent-281, score-0.148]
67 Four kinds of constituents were found in the post-verbal domain. [sent-284, score-0.11]
68 For every verb, apart from single constituents and equal-length constituents, short-long and long-short sequences were also observed. [sent-285, score-0.233]
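A minimal sketch of how a verb's post-verbal pattern can be binned into these four categories from constituent lengths; comparing only the first two post-verbal constituents is a simplifying assumption, since the paper does not spell out its exact counting procedure.

```python
def classify_postverbal(constituent_lengths):
    """Classify a verb's post-verbal pattern into the four categories
    of Table 9 from the word lengths of its post-verbal constituents.
    """
    if len(constituent_lengths) < 2:
        return "single"
    first, second = constituent_lengths[:2]
    if first == second:
        return "equal"
    return "short-long" if first < second else "long-short"
```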
69 Table 9 demonstrates that for both the gold standard corpus and the realizer models, short-long constituents were more frequent than long-short or equal-length constituents. [sent-286, score-0.369]
70 Table 10: Distribution of heavy unequal constituents (length difference > 5) in Section 00; 4692 gold cases considered and significance tested against the gold standard using a χ-square test. This short-long preference is also reported by previous corpus studies of English (Temperley, 2007; Wasow and Arnold, 2003). [sent-307, score-0.232]
71 The figures reported here show the tendency of the DEPLEN* models to be closer to the gold standard than the other models, especially in the case of short-long constituents. [sent-308, score-0.083]
72 We also performed an analysis of relative constituent lengths focusing on light-heavy and heavy-light cases; specifically, we examined unequal-length constituent sequences where the length difference of the constituents was greater than 5, and the shorter constituent was under 5 words. [sent-309, score-0.668]
73 Using a χ-square test, the distribution of heavy unequal-length constituent counts in the DEPLEN-NODIST and DEPLEN models does not significantly differ from that of the gold standard. [sent-311, score-0.358]
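A sketch of such a χ-square comparison, assuming scipy; scaling the gold counts to the model's total yields the expected frequencies for the goodness-of-fit test.

```python
from scipy.stats import chisquare

def heavy_order_test(model_counts, gold_counts):
    """Chi-square goodness-of-fit test comparing a model's counts of
    heavy unequal constituent orders (e.g. [light-heavy, heavy-light])
    against the gold-standard distribution, as in Table 10.
    """
    scale = sum(model_counts) / sum(gold_counts)
    expected = [c * scale for c in gold_counts]  # expected under gold distribution
    stat, p = chisquare(model_counts, f_exp=expected)
    return stat, p
```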
74 4 Examples Table 11 shows examples of how the dependency length feature (DEPLEN) affects the output even in comparison to a model (DEPORD) with a rich set of discriminative syntactic and dependency ordering features, but no features directly targeting relative weight. [sent-314, score-0.666]
75 7, the dependency length model produces an exact match, while the DEPORD model fails to shift the short temporal adverbial "next year" next to the verb, leaving a confusingly repetitive "this year next year" at the end of the sentence. [sent-316, score-0.515]
76 1, the dependency length model produces a nearly exact match with just an equally acceptable inversion of "closely watching". [sent-318, score-0.344]
77 13, both models put the temporal modifier "on Thursday" in its canonical VP-final position, despite this order running counter to dependency length minimization. [sent-323, score-0.381]
78 2 shows a case where DEPORD is nearly an exact match (except for a missing comma), but the dependency length model fronts the PP "on the 12-member board", where it is grammatical but rather marked (and not motivated in the discourse context). [sent-325, score-0.344]
79 5 Interim Discussion The experiments show a consistent positive effect of the dependency length feature in improving BLEU scores and achieving a better match with the corpus distributions of dependency length and short/long constituent orders. [sent-327, score-0.767]
80 Intriguingly, there is some evidence that a negatively weighted total dependency length feature can go too far in minimizing dependency length, in the absence of other informative features to counterbalance it. [sent-329, score-0.623]
81 In particular, the DEPLEN-GLOBAL model in Table 8 has significantly lower dependency length than the corpus, but in the richer models with discriminative syntactic and dependency ordering features, there are no significant differences. [sent-330, score-0.666]
82 It may still be, though, that additional features are necessary to counteract the tendency towards dependency length minimization, for example to ensure that initial constituents play their intended role in establishing and continuing topics in discourse, as also observed in Table 11. [sent-331, score-0.454]
83 , claiming some success in its trade diplomacy , removed South Korea , Taiwan and Saudi Arabia from a list of countries it is closely watching for allegedly failing to honor U. [sent-340, score-0.259]
84 claiming some success in its trade diplomacy , removed South Korea Taiwan and Saudi Arabia from a list of countries it is watching closely for allegedly failing to honor U. [sent-345, score-0.259]
85 removed from a list of countries it is watching closely for allegedly failing to honor U. [sent-350, score-0.167]
86 patents , copyrights and other intellectual-property rights , claiming some success in its trade diplomacy , South Korea , Taiwan and Saudi Arabia . [sent-352, score-0.213]
87 8 DEPLEN: but he has not said before that the country wants half the debt forgiven . [sent-354, score-0.085]
88 DEPORD: but he not has said before ∅ the country wants half the debt forgiven . [sent-355, score-0.085]
89 Table 11: Examples of realized output for full models with and without the dependency length feature. [sent-378, score-0.344]
90 6 Targeted Human Evaluation To determine whether heavy-light ordering differences often represent ordering errors (including egregious ones), rather than simply representing acceptable variation, we conducted a targeted human evaluation on examples of this kind. [sent-379, score-0.44]
91 Inspecting the data ourselves, we found that many of the items did indeed involve egregious ordering errors that the DEPLEN* models managed to avoid. [sent-385, score-0.262]
92 5 Related Work As noted in the introduction, to the best of our knowledge this paper is the first to examine the impact of dependency length minimization on realization ranking. [sent-386, score-0.829]
93 's (2011) system on shared task shallow inputs (-S), which performs much better than their system on deep inputs (-D) that more closely resemble OpenCCG's. [sent-392, score-0.072]
94 6 Conclusions In this paper, we have investigated dependency length minimization in the context of realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. [sent-393, score-1.091]
95 Going beyond the BLEU metric, we also conducted a targeted human evaluation to confirm the utility of the dependency length feature in models of varying richness. [sent-425, score-0.382]
96 Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. [sent-438, score-0.118]
97 The types and distributions of errors in a wide coverage surface realizer evaluation. [sent-476, score-0.13]
98 Generating natural word orders in a semi-free word order language: Treebank-based linearization preferences for German. [sent-548, score-0.073]
99 Linguistically informed statistical models of constituent structure for ordering in sentence realization. [sent-597, score-0.219]
100 “Long before short” preference in the production of a head-final language. [sent-648, score-0.083]
wordName wordTfidf (topN-words)
[('sdcl', 0.35), ('deplen', 0.32), ('realization', 0.291), ('depord', 0.198), ('openccg', 0.198), ('temperley', 0.198), ('minimization', 0.194), ('dependency', 0.182), ('length', 0.162), ('ordering', 0.14), ('wsj', 0.132), ('egregious', 0.122), ('wasow', 0.122), ('rajkumar', 0.12), ('realizations', 0.107), ('np', 0.106), ('white', 0.106), ('cahill', 0.095), ('realizer', 0.092), ('npi', 0.091), ('arnold', 0.083), ('bleu', 0.079), ('constituent', 0.079), ('hockenmaier', 0.077), ('hawkins', 0.076), ('treasury', 0.076), ('constituents', 0.071), ('comprehension', 0.07), ('ccg', 0.07), ('ranking', 0.064), ('arabia', 0.061), ('filippova', 0.059), ('curran', 0.057), ('year', 0.057), ('animacy', 0.055), ('dlt', 0.052), ('taiwan', 0.052), ('gibson', 0.051), ('minimizing', 0.051), ('rajakrishnan', 0.047), ('allegedly', 0.046), ('anttila', 0.046), ('cheating', 0.046), ('claiming', 0.046), ('copyrights', 0.046), ('counterbalance', 0.046), ('diplomacy', 0.046), ('forgiven', 0.046), ('glauber', 0.046), ('hbought', 0.046), ('honor', 0.046), ('hvbd', 0.046), ('nakanishi', 0.046), ('rexinger', 0.046), ('undersecretary', 0.046), ('preference', 0.045), ('clark', 0.044), ('gold', 0.044), ('intervening', 0.044), ('distances', 0.044), ('dep', 0.044), ('korea', 0.044), ('dependents', 0.044), ('lewis', 0.041), ('bohnet', 0.041), ('categorial', 0.041), ('light', 0.041), ('normal', 0.04), ('circuit', 0.039), ('headword', 0.039), ('linearization', 0.039), ('postverbal', 0.039), ('aoife', 0.039), ('thursday', 0.039), ('patents', 0.039), ('saudi', 0.039), ('watching', 0.039), ('status', 0.039), ('tendency', 0.039), ('south', 0.039), ('wants', 0.039), ('surface', 0.038), ('production', 0.038), ('targeted', 0.038), ('heavy', 0.037), ('canonical', 0.037), ('inputs', 0.036), ('succeed', 0.036), ('unequal', 0.036), ('adjunct', 0.036), ('hogan', 0.036), ('combinatory', 0.036), ('csli', 0.036), ('ccgbank', 0.036), ('failing', 0.036), ('purchase', 0.036), ('executive', 0.036), ('rights', 0.036), ('orders', 0.034), ('perceptron', 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000005 88 emnlp-2012-Minimal Dependency Length in Realization Ranking
Author: Michael White ; Rajakrishnan Rajkumar
Abstract: Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. We find that with a state-of-the-art, comprehensive realization ranking model, dependency length minimization yields statistically significant improvements in BLEU scores and significantly reduces the number of heavy/light ordering errors. Through distributional analyses, we also show that with simpler ranking models, dependency length minimization can go overboard, too often sacrificing canonical word order to shorten dependencies, while richer models manage to better counterbalance the dependency length minimization preference against (sometimes) competing canonical word order preferences.
2 0.10495252 59 emnlp-2012-Generating Non-Projective Word Order in Statistical Linearization
Author: Bernd Bohnet ; Anders Bjorkelund ; Jonas Kuhn ; Wolfgang Seeker ; Sina Zarriess
Abstract: We propose a technique to generate non-projective word orders in an efficient statistical linearization system. Our approach predicts liftings of edges in an unordered syntactic tree by means of a classifier, and uses a projective algorithm for tree linearization. We obtain statistically significant improvements on six typologically different languages: English, German, Dutch, Danish, Hungarian, and Czech.
3 0.097739086 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
Author: Jonathan K. Kummerfeld ; David Hall ; James R. Curran ; Dan Klein
Abstract: Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for state-of-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
4 0.084270447 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
Author: Bernd Bohnet ; Joakim Nivre
Abstract: Most current dependency parsers presuppose that input words have been morphologically disambiguated using a part-of-speech tagger before parsing begins. We present a transition-based system for joint part-of-speech tagging and labeled dependency parsing with non-projective trees. Experimental evaluation on Chinese, Czech, English and German shows consistent improvements in both tagging and parsing accuracy when compared to a pipeline system, which lead to improved state-of-the-art results for all languages.
5 0.078276142 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
6 0.07434918 57 emnlp-2012-Generalized Higher-Order Dependency Parsing with Cube Pruning
7 0.070644408 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
8 0.067939863 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
9 0.066960774 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
10 0.059972208 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
11 0.059900232 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
12 0.059288844 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
13 0.048575994 80 emnlp-2012-Learning Verb Inference Rules from Linguistically-Motivated Evidence
14 0.046960808 46 emnlp-2012-Exploiting Reducibility in Unsupervised Dependency Parsing
15 0.046956502 127 emnlp-2012-Transforming Trees to Improve Syntactic Convergence
16 0.046276305 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
17 0.044931509 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
18 0.04362883 126 emnlp-2012-Training Factored PCFGs with Expectation Propagation
19 0.042579859 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
20 0.04249021 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
topicId topicWeight
[(0, 0.183), (1, -0.075), (2, 0.052), (3, -0.044), (4, 0.047), (5, 0.036), (6, -0.007), (7, 0.02), (8, -0.023), (9, 0.111), (10, -0.015), (11, 0.016), (12, -0.075), (13, 0.079), (14, 0.037), (15, -0.037), (16, 0.0), (17, -0.034), (18, -0.033), (19, 0.04), (20, 0.046), (21, -0.054), (22, 0.065), (23, 0.045), (24, -0.003), (25, 0.097), (26, 0.025), (27, -0.015), (28, 0.056), (29, -0.043), (30, 0.073), (31, -0.042), (32, -0.026), (33, -0.088), (34, 0.046), (35, -0.079), (36, -0.303), (37, 0.123), (38, 0.303), (39, 0.265), (40, 0.118), (41, -0.134), (42, -0.194), (43, -0.187), (44, -0.116), (45, -0.084), (46, 0.01), (47, -0.107), (48, -0.177), (49, -0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.93374091 88 emnlp-2012-Minimal Dependency Length in Realization Ranking
Author: Michael White ; Rajakrishnan Rajkumar
Abstract: Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. We find that with a state-of-the-art, comprehensive realization ranking model, dependency length minimization yields statistically significant improvements in BLEU scores and significantly reduces the number of heavy/light ordering errors. Through distributional analyses, we also show that with simpler ranking models, dependency length minimization can go overboard, too often sacrificing canonical word order to shorten dependencies, while richer models manage to better counterbalance the dependency length minimization preference against (sometimes) competing canonical word order preferences.
2 0.6067915 59 emnlp-2012-Generating Non-Projective Word Order in Statistical Linearization
Author: Bernd Bohnet ; Anders Bjorkelund ; Jonas Kuhn ; Wolfgang Seeker ; Sina Zarriess
Abstract: We propose a technique to generate non-projective word orders in an efficient statistical linearization system. Our approach predicts liftings of edges in an unordered syntactic tree by means of a classifier, and uses a projective algorithm for tree linearization. We obtain statistically significant improvements on six typologically different languages: English, German, Dutch, Danish, Hungarian, and Czech.
3 0.46719414 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
Author: Taylor Berg-Kirkpatrick ; David Burkett ; Dan Klein
Abstract: We investigate two aspects of the empirical behavior of paired significance tests for NLP systems. First, when one system appears to outperform another, how does significance level relate in practice to the magnitude of the gain, to the size of the test set, to the similarity of the systems, and so on? Is it true that for each task there is a gain which roughly implies significance? We explore these issues across a range of NLP tasks using both large collections of past systems’ outputs and variants of single systems. Next, once significance levels are computed, how well does the standard i.i.d. notion of significance hold up in practical settings where future distributions are neither independent nor identically distributed, such as across domains? We explore this question using a range of test set variations for constituency parsing.
4 0.37507135 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
Author: Jonathan K. Kummerfeld ; David Hall ; James R. Curran ; Dan Klein
Abstract: Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for state-of-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
6 0.28257307 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
7 0.28077394 46 emnlp-2012-Exploiting Reducibility in Unsupervised Dependency Parsing
8 0.27442065 53 emnlp-2012-First Order vs. Higher Order Modification in Distributional Semantics
9 0.2480918 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
10 0.24757765 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
11 0.23971564 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
12 0.23439856 22 emnlp-2012-Automatically Constructing a Normalisation Dictionary for Microblogs
13 0.2261963 57 emnlp-2012-Generalized Higher-Order Dependency Parsing with Cube Pruning
14 0.21915419 129 emnlp-2012-Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
15 0.21662886 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
16 0.21217713 21 emnlp-2012-Assessment of ESL Learners' Syntactic Competence Based on Similarity Measures
17 0.21109332 94 emnlp-2012-Multiple Aspect Summarization Using Integer Linear Programming
18 0.20672219 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
19 0.19904776 108 emnlp-2012-Probabilistic Finite State Machines for Regression-based MT Evaluation
20 0.19878304 55 emnlp-2012-Forest Reranking through Subtree Ranking
topicId topicWeight
[(2, 0.02), (16, 0.041), (25, 0.012), (29, 0.019), (34, 0.047), (45, 0.014), (60, 0.067), (63, 0.047), (64, 0.024), (65, 0.025), (74, 0.075), (76, 0.046), (79, 0.01), (80, 0.018), (81, 0.02), (82, 0.395), (86, 0.018), (95, 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.76910436 88 emnlp-2012-Minimal Dependency Length in Realization Ranking
Author: Michael White ; Rajakrishnan Rajkumar
Abstract: Comprehension and corpus studies have found that the tendency to minimize dependency length has a strong influence on constituent ordering choices. In this paper, we investigate dependency length minimization in the context of discriminative realization ranking, focusing on its potential to eliminate egregious ordering errors as well as better match the distributional characteristics of sentence orderings in news text. We find that with a state-of-the-art, comprehensive realization ranking model, dependency length minimization yields statistically significant improvements in BLEU scores and significantly reduces the number of heavy/light ordering errors. Through distributional analyses, we also show that with simpler ranking models, dependency length minimization can go overboard, too often sacrificing canonical word order to shorten dependencies, while richer models manage to better counterbalance the dependency length minimization preference against (sometimes) competing canonical word order preferences.
2 0.58308011 50 emnlp-2012-Extending Machine Translation Evaluation Metrics with Lexical Cohesion to Document Level
Author: Billy T. M. Wong ; Chunyu Kit
Abstract: This paper proposes the utilization of lexical cohesion to facilitate evaluation of machine translation at the document level. As a linguistic means to achieve text coherence, lexical cohesion ties sentences together into a meaningfully interwoven structure through words with the same or related meaning. A comparison between machine and human translation is conducted to illustrate one of their critical distinctions that human translators tend to use more cohesion devices than machine. Various ways to apply this feature to evaluate machinetranslated documents are presented, including one without reliance on reference translation. Experimental results show that incorporating this feature into sentence-level evaluation metrics can enhance their correlation with human judgements.
3 0.32552633 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
Author: Jayant Krishnamurthy ; Tom Mitchell
Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.
4 0.31602585 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
Author: Shafiq Joty ; Giuseppe Carenini ; Raymond Ng
Abstract: We propose a complete probabilistic discriminative framework for performing sentence-level discourse analysis. Our framework comprises a discourse segmenter, based on a binary classifier, and a discourse parser, which applies an optimal CKY-like parsing algorithm to probabilities inferred from a Dynamic Conditional Random Field. We show on two corpora that our approach outperforms the state-of-the-art, often by a wide margin.
5 0.31057683 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Daniel Jurafsky
Abstract: We present a new family of models for unsupervised parsing, Dependency and Boundary models, that use cues at constituent boundaries to inform head-outward dependency tree generation. We build on three intuitions that are explicit in phrase-structure grammars but only implicit in standard dependency formulations: (i) Distributions of words that occur at sentence boundaries such as English determiners resemble constituent edges. (ii) Punctuation at sentence boundaries further helps distinguish full sentences from fragments like headlines and titles, allowing us to model grammatical differences between complete and incomplete sentences. (iii) Sentence-internal punctuation boundaries help with longer-distance dependencies, since punctuation correlates with constituent edges. Our models induce state-of-the-art dependency grammars for many languages without special knowledge of optimal input sentence lengths or biased, manually-tuned initializers.
6 0.31006217 130 emnlp-2012-Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
7 0.30529225 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
8 0.30405298 14 emnlp-2012-A Weakly Supervised Model for Sentence-Level Semantic Orientation Analysis with Multiple Experts
9 0.30299017 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
10 0.30142438 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
11 0.29964778 122 emnlp-2012-Syntactic Surprisal Affects Spoken Word Duration in Conversational Contexts
12 0.29772884 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
13 0.29665062 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
14 0.29514375 81 emnlp-2012-Learning to Map into a Universal POS Tagset
15 0.29453522 20 emnlp-2012-Answering Opinion Questions on Products by Exploiting Hierarchical Organization of Consumer Reviews
16 0.29406905 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
17 0.29344362 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
18 0.29324716 120 emnlp-2012-Streaming Analysis of Discourse Participants
19 0.29267591 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
20 0.29257879 1 emnlp-2012-A Bayesian Model for Learning SCFGs with Discontiguous Rules