emnlp emnlp2012 emnlp2012-55 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Richard Farkas ; Helmut Schmid
Abstract: We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. This leads to better parameter optimization. Another chief advantage of the framework is that arbitrary learning to rank methods can be applied. We evaluated our reranking approach on German and English phrase structure parsing tasks and compared it to various state-of-the-art reranking approaches such as the perceptron-based forest reranker. The subtree ranking approach with a Maximum Entropy model significantly outperformed the other approaches.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. [sent-3, score-1.58]
2 For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. [sent-4, score-0.46]
3 Another chief advantage of the framework is that arbitrary learning to rank methods can be applied. [sent-6, score-0.119]
4 We evaluated our reranking approach on German and English phrase structure parsing tasks and compared it to various state-of-the-art reranking approaches such as the perceptron-based forest reranker. [sent-7, score-1.033]
5 The subtree ranking approach with a Maximum Entropy model significantly outperformed the other approaches. [sent-8, score-0.517]
6 The idea is to (re)rank candidates extracted by a base system exploiting a rich feature set and operating at a global (usually sentence) level. [sent-12, score-0.243]
7 Reranking achieved significant gains over the base system in many tasks because it has access to information/features which are not computable in the base system. [sent-13, score-0.14]
8 1038 (2006)) because the base system effectively and efficiently filters out many bad candidates and makes the problem tractable. [sent-16, score-0.151]
9 The standard approach for reranking is the n-best list ranking procedure, where the base system extracts its top n global-level candidates with associated goodness scores that define an initial ranking. [sent-17, score-0.529]
10 The bottleneck of this approach is the small number of candidates consid- ered. [sent-19, score-0.115]
11 Compared to n-best lists, packed parse forests encode more candidates in a compact way. [sent-20, score-0.32]
12 Forest reranking methods have been proposed, which can exploit the richer set of candidates and they have been successfully applied for phrase-structure (Huang, 2008), dependency (Hayashi et al. [sent-21, score-0.316]
13 The core of the algorithm is a beam-search based decoder operating on the packed forest in a bottom-up manner. [sent-24, score-0.617]
14 It follows the assumption that the feature values of the whole structure are the sum of the feature values of the local elements, and the features are designed for use with the perceptron update. [sent-25, score-0.231]
15 During training, it decodes the 1-best complete parse and then makes the perceptron update against the oracle parse, i.e. [sent-27, score-0.499]
16 the perceptron is trained at the global (sentence) level. [sent-29, score-0.234]
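A minimal sketch of such a global perceptron update, assuming the additive feature decomposition described above. The names (tree_features, local_features, the node iteration) are illustrative placeholders, not the authors' implementation.

from collections import Counter

def tree_features(tree_nodes, local_features):
    # additivity assumption: the feature vector of a whole (sub)tree is the
    # sum of the feature vectors of its local elements
    total = Counter()
    for node in tree_nodes:
        total.update(local_features(node))
    return total

def perceptron_update(weights, predicted_nodes, oracle_nodes, local_features, lr=1.0):
    # standard structured-perceptron update against the oracle parse
    pred_f = tree_features(predicted_nodes, local_features)
    gold_f = tree_features(oracle_nodes, local_features)
    for feat in set(pred_f) | set(gold_f):
        weights[feat] = weights.get(feat, 0.0) + lr * (gold_f[feat] - pred_f[feat])
    return weights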
17 We propose here a subtree ranker approach which can be regarded as a generalization of this forest reranking procedure. [sent-30, score-0.766]
19 In contrast to updating on a single (sub)tree per sentence using only the 1-best parse (perceptron-based forest reranking), the subtree ranker exploits subtrees of all sizes from a sentence and trains a (re)ranker utilising several derivations of the constituent in question. [sent-33, score-1.476]
20 During parsing we conduct a beam-search extraction by asking the ranker to select the k best subtrees among the possible candidates of each forest node. [sent-34, score-1.126]
21 The chief motivation for this approach is that in this way, training and prediction are carried out on similar local candidate lists which we expect to be favorable to the learning mechanism. [sent-35, score-0.194]
22 We empirically prove that the trained discriminative rankers benefit from having access to a larger amount of subtree candidates. [sent-36, score-0.46]
23 Moreover, in this framework any kind of learning to rank methods can be chosen as ranker, including pair-wise and list-wise classifiers (Li, 2011). [sent-37, score-0.103]
24 The contributions of this paper are the following: • We extend the perceptron-based forest rerankers to the subtree ranker forest reranking framework, which allows the perceptron update to be replaced by any kind of learning to rank procedure. [sent-38, score-2.303]
25 • We report experimental results on German and English phrase-structure parsing, comparing subtree rerankers to various other rerankers and showing a significant improvement over the perceptron-based forest reranker approach. [sent-39, score-1.343]
26 2 Related Work Our method is closely related to the work of Huang (2008), who introduced forest-based reranking for phrase structure parsing. [sent-40, score-0.235]
27 It has several advantages compared with the perceptron-based forest reranker. [sent-42, score-0.496]
28 While the perceptron is fast to train, other machine learning approaches usually outperform it. [sent-44, score-0.188]
29 Most of the existing learning to rank approaches are built on linear models and evaluate the candidates independently of each other (such as MaxEnt (Charniak and Johnson, 2005), SVMRank (Joachims, 2002), SoftRank (Guiver and Snelson, 2008)). [sent-45, score-0.137]
30 We believe that the real bottleneck of parsing applications is parsing time and not training time. [sent-47, score-0.168]
31 In theory, we can imagine learning to rank approaches which cannot be reduced to the individual scoring of candidates at prediction time, for instance a decision tree-based pairwise ranker. [sent-49, score-0.171]
32 Although such methods would also fit into the general subtree framework, they are not employed in practice (Li, 2011). [sent-50, score-0.397]
33 The subtree ranking approach is a generalization of the perceptron-based approach. [sent-51, score-0.54]
34 If the ranking algorithm is the Averaged Perceptron, the parsing algorithm reduces to perceptron-based forest parsing. [sent-52, score-0.706]
35 If the “selection strategy” utilizes the base system ranking and training starts with a filtering step which keeps only candidate sets from the root node of the forest, we get the offline version of the training procedure of the perceptron-based forest reranker of Huang (2008). [sent-53, score-1.58]
36 As our approach is based on local ranking (local update in the online learning literature), it is highly related to early update which looks for the first local decision point where the oracle parse falls out from the beam. [sent-54, score-0.684]
37 Early update was introduced by Collins and Roark (2004) for incremental parsing and adapted to forest reranking by Wang and Zong (2011). [sent-55, score-0.873]
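A rough sketch of the early-update idea under assumed hooks (beam_at_step, oracle_item_at_step, update); it illustrates the control flow only, not Wang and Zong's actual code.

def early_update_pass(num_steps, beam_at_step, oracle_item_at_step, update):
    # walk the decoding steps in order; at the first step where the oracle
    # item has fallen out of the beam, update against the oracle and stop
    for t in range(num_steps):
        beam = beam_at_step(t)
        oracle = oracle_item_at_step(t)
        if oracle not in beam:
            update(beam[0], oracle)   # update at this local decision point
            return t                  # training on this sentence ends early
    return num_steps                  # oracle stayed in the beam throughout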
38 Besides phrase structure parsing, the forest reranking approach was successfully applied for dependency parsing as well. [sent-56, score-0.798]
39 The topological order of the parse forest nodes can form the “sequence of choices” of Searn. [sent-61, score-0.655]
40 Neu and Szepesvári (2009) introduced top-down parsing Markov Decision Processes and experimented with several inverse reinforcement learning methods. [sent-64, score-0.094]
41 The forest reranking approaches are bottom-up parsers which would require a new (non-straightforward) definition of a corresponding Markov Decision Process. [sent-65, score-0.731]
42 3 Subtree Ranking-based Forest Reranking A packed parse forest is a compact representation of possible parses for a given sentence. [sent-66, score-0.667]
43 A forest has the structure of a hypergraph, whose nodes V are the elementary units of the underlying structured prediction problem and the hyperedges E are the possible deductive steps from the nodes. [sent-67, score-0.587]
44 In this framework nodes correspond to constituents spanning a certain scope of the input sentence and a hyperedge e links a parent node head(e) to its children tails(e) (i. [sent-69, score-0.233]
45 The forest is extracted from the chart of a base PCFG parser, usually employing a heavy pruning strategy. [sent-72, score-0.635]
46 Then the goal of a forest reranker is to find the best parse of the input sentence exploiting a feature representation of (sub)trees. [sent-73, score-0.804]
47 We sketch the parsing procedure of the subtree ranker in Algorithm 1. [sent-74, score-0.777]
48 It is a bottom-up beamsearch parser operating on the hypergraph. [sent-75, score-0.129]
49 At each node v we store the k best subtrees S(v) headed by the node. [sent-76, score-0.236]
50 The S(v) lists contain the k top-ranked subtrees by the ranker R among the candidates in the beam. [sent-77, score-0.62]
51 The set of candidate subtrees at a node is the union of the candidates at the different hyperedges. [sent-78, score-0.316]
52 The set of candidate subtrees at a certain hyperedge, in turn, is formed by the Cartesian product ⊗S(vi) of the k-best subtrees stored at the child nodes vi. [sent-79, score-0.201]
53 The final output of forest ranking is the 1-best subtree headed by the goal node S1(vgoal). [sent-80, score-1.079]
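A minimal sketch of this decoding loop (Algorithm 1), assuming a hypothetical forest interface where each node exposes its hyperedges and each hyperedge its child nodes, and where ranker scores a candidate subtree; this is an illustration, not the authors' code.

from itertools import product

def decode(nodes_bottom_up, goal_node, ranker, k=15):
    # nodes_bottom_up: forest nodes in bottom-up topological order; each node
    # has .hyperedges, and each hyperedge has .tails (its child nodes)
    S = {}                                        # node -> its k best subtrees
    for v in nodes_bottom_up:
        candidates = []
        for e in v.hyperedges:                    # every way of deriving v
            child_lists = [S[u] for u in e.tails]
            if child_lists:
                # Cartesian product of the k-best subtrees stored at the children
                for children in product(*child_lists):
                    candidates.append((e, children))
            else:                                 # terminal hyperedge, no children
                candidates.append((e, ()))
        # the ranker R selects the k top-ranked subtrees among the candidates
        S[v] = sorted(candidates, key=ranker, reverse=True)[:k]
    return S[goal_node][0]                        # 1-best subtree headed by the goal node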
54 Algorithm 2 depicts the training procedure of the subtree ranker. [sent-82, score-0.374]
55 As forests sometimes do not contain the gold standard tree, we extract an oracle tree instead, which is the closest derivable tree in the forest to the gold standard tree (Collins, 2000). [sent-83, score-0.917]
56 Then we optimize the parser for ranking the oracle tree at the top. [sent-84, score-0.384]
57 In Algorithm 2, we extract the oracle tree from the parses encoded in the forest ⟨V, E⟩i for the i-th training sentence, which is the tree with the highest F-score when compared to the gold standard tree yi. [sent-86, score-0.784]
58 For each of the training sentences we calculate the oracle subtrees for each node {Ov} of the corresponding parse forest. [sent-87, score-0.416]
59 We follow the dynamic programming approach of Huang (2008) for the extraction of the forest oracle. [sent-88, score-0.496]
60 The goal of this algorithm is to extract the full oracle tree, but as a side product it calculates the best possible subtree for all nodes including the nodes outside of the full oracle tree as well. [sent-89, score-0.789]
61 After computing the oracle subtrees, we crawl the forests bottom-up and extract a training instance ⟨C, Ov⟩ at each node v, which consists of the candidate set C and the oracle Ov at that node. [sent-90, score-0.286]
62 The creation of candidate lists is exactly the same as it was at parsing time. [sent-91, score-0.203]
63 Then we create training instances from each of the candidate lists and form the set of subtrees S(v) which is stored for candidate extraction at the higher levels of the forest (later steps in the training instance extraction). [sent-92, score-0.809]
64 A crucial design question is how to form the S(v) sets during training, which is the task of the selection strategy SS. [sent-93, score-0.119]
65 Algorithm 2 Subtree Ranker Training
Require: {⟨V, E⟩i, yi} for i = 1..N, SS selection strategy
T ← ∅
for all i ← 1..N do
  Oi ← oracle-extractor(⟨V, E⟩i, yi)
  for all v ∈ Vi in bottom-up topological order do
    C ← ∅
    for all e ∈ E, head(e) = v do
      C ← C ∪ (⊗S(vj)), vj ∈ tails(e)
    end for
    T ← T ∪ {⟨C, Ov⟩}
    S(v) ← SS(C, Ov)
  end for
end for
R ← train reranker(T)
return R [sent-98, score-0.129]
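The training-instance extraction of Algorithm 2 can be sketched in the same way, again under the assumed forest interface from the decoding sketch; oracle_subtrees and select (the selection strategy SS) are placeholders, not the authors' API.

from itertools import product

def extract_training_instances(nodes_bottom_up, oracle_subtrees, select, k=3):
    # crawl one forest bottom-up, emitting a <candidate set, oracle> pair per node;
    # `select` is the selection strategy SS that fills S(v) for the higher levels
    T, S = [], {}
    for v in nodes_bottom_up:
        C = []
        for e in v.hyperedges:
            child_lists = [S[u] for u in e.tails]
            for children in (product(*child_lists) if child_lists else [()]):
                C.append((e, children))
        T.append((C, oracle_subtrees[v]))         # training instance <C, O_v>
        S[v] = select(C, oracle_subtrees[v], k)   # S(v) <- SS(C, O_v)
    return T   # an arbitrary ranker R can then be trained on T offline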
66 One possible solution is to keep the k best oracle subtrees, i.e. [sent-99, score-0.162]
67 the k subtrees closest to the gold standard parse, which is analogous to using the gold standard labels in Maximum Entropy Markov Models for sequence labeling problems (we refer to this selection strategy as ’oracle subtree’ later on). [sent-101, score-0.273]
68 The problem with this solution is that training the rankers on the oracle subtrees potentially leads to suboptimal performance, as the outputs of the ranker at prediction time are noisy. [sent-102, score-0.73]
69 Note that this approach is not a classical beam-based decoding anymore as the “beam” is maintained according to the oracle parses and there is no model which influences that. [sent-103, score-0.212]
70 An alternative solution, beam-based decoding, is to use a ranker model to extract the S(v) set at training time as well. [sent-104, score-0.364]
71 In the general reranking approach, we assume that the ranking of the base parser is reliable. [sent-105, score-0.488]
72 So we store the k best subtrees according to the base system in S(v) (the ’base system ranking’ selection strategy). [sent-106, score-0.284]
73 Note that the general framework keeps this question open and lets the implementations define a selection strategy SS. [sent-107, score-0.127]
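Two possible selection strategies SS, sketched under hypothetical scoring hooks (similarity_to_gold, base_parser_score). Either could be plugged into the extraction sketch above, e.g. by fixing the scoring hook with functools.partial.

def select_oracle_subtrees(candidates, oracle, k, similarity_to_gold):
    # 'oracle subtree' strategy: keep the k candidates closest to the gold standard
    return sorted(candidates, key=similarity_to_gold, reverse=True)[:k]

def select_by_base_ranking(candidates, oracle, k, base_parser_score):
    # 'base system ranking' strategy: keep the k best candidates according to
    # the base parser's scores
    return sorted(candidates, key=base_parser_score, reverse=True)[:k]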
74 After extracting the training instances T, we can train an arbitrary ranker R offline. [sent-108, score-0.336]
75 Note that the extraction of candidate lists is exactly the same in Algorithms 1 and 2, while the creation of S(v) can be different. [sent-109, score-0.136]
76 As the grammatical functions of constituents are important from a downstream application point of view, especially in German, we also report PARSEVAL scores on the conflation of constituent labels and grammatical functions. [sent-112, score-0.145]
77 2 Implementation of the Generic Framework We investigate the Averaged Perceptron and a Maximum Entropy ranker as the reranker R in the subtree ranking framework. [sent-126, score-1.087]
78 The Maximum Entropy ranker model is optimized with a loss function which is the negative log conditional likelihood of the oracle trees relative to the candidate sets. [sent-127, score-0.553]
79 In the case of multiple oracles we optimize for the sum of the oracle trees’ posterior probabilities (Charniak and Johnson, 2005). [sent-128, score-0.162]
80 There is no need to compute the global normalization constant of the Maximum Entropy model because we only need the ranking and not the probabilities. [sent-131, score-0.189]
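A sketch of this objective for a single candidate set; candidate_scores are assumed model scores w·f(c) and oracle_indices mark the oracle tree(s). Only the local normalizer over the candidate set is computed, mirroring the point above that no global normalization constant is needed.

import math

def maxent_rank_loss(candidate_scores, oracle_indices):
    # negative log conditional likelihood of the oracle tree(s) relative to the
    # candidate set; with multiple oracles, their posterior probabilities are summed
    log_z = math.log(sum(math.exp(s) for s in candidate_scores))   # local normalizer only
    oracle_mass = sum(math.exp(candidate_scores[i]) for i in oracle_indices)
    return log_z - math.log(oracle_mass)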
81 Hence the difference is in how to train the ranker model. [sent-132, score-0.336]
82 3 Five Methods for Forest-based Reranking We conducted comparative experiments employing the proposed subtree ranking approach and state-of-the-art methods for forest reranking. [sent-135, score-1.055]
83 • The original perceptron-based forest reranker of Huang (2008) (’perceptron with global training’). [sent-137, score-0.776]
84 The same method employing the early-update mechanism instead of the global update. [sent-138, score-0.115]
85 Wang and Zong (2011) reported a significant gain using this update over the standard global update (’perceptron with early update’). [sent-139, score-0.231]
86 • Similar to learning a perceptron at the global level and then applying it at local decisions, we can train a Maximum Entropy ranker at the global level utilizing the n-best full parse candidates of the base parser, then use this model for local decision making. [sent-140, score-0.918]
87 So we train the standard n-best rerankers (Charniak and Johnson, 2005) and then apply them in the beamsearch-based Viterbi parser (’n-best list training’). [sent-141, score-0.126]
88 The subtree ranker method using the Averaged Perceptron reranker. [sent-144, score-0.71]
89 This differs from ’perceptron with global training’ as we conduct updates at every local decision point and we do offline training (’subtree ranking by AvgPer’). [sent-145, score-0.29]
90 The subtree ranker method using Maximum Entropy training (’subtree ranking by MaxEnt’). [sent-146, score-0.853]
91 For German the base parser and the reranker operate on the conflation of constituent labels and grammatical functions. [sent-153, score-0.438]
92 For English, we used the forest extraction and pruning code of Huang (2008). [sent-154, score-0.523]
93 The pruning removes hyperedges where the difference between the cost of the best derivation using this hyperedge and the cost of the globally best derivation is above some threshold. [sent-155, score-0.218]
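A sketch of this margin-based pruning criterion; best_cost_through (the cost of the best derivation using a given hyperedge) and the forest interface are assumed, not taken from the released pruning code.

def prune_hyperedges(hyperedges, best_cost_through, global_best_cost, threshold):
    # keep a hyperedge only if the best derivation passing through it is within
    # `threshold` of the globally best derivation's cost
    return [e for e in hyperedges
            if best_cost_through(e) - global_best_cost <= threshold]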
94 For German, we used the pruned parse forest of Bitpar (Schmid, 2004). [sent-156, score-0.57]
95 After computing the posterior probability of each hyperedge given the input sentence, Bitpar prunes the parse forest by deleting hyperedges whose posterior probability is below some threshold. [sent-157, score-0.761]
96 We employed an Averaged Perceptron (for ’perceptron with global training’, ’perceptron with early update’ and ’subtree ranking by AvgPer’) and a Maximum Entropy reranker (for ’subtree ranking by MaxEnt’ and ’n-best list training’). [sent-160, score-0.624]
97 For the perceptron reranker, we used the Joshua implementation. [sent-161, score-0.188]
98 For the Maximum Entropy reranker we used the RankMaxEnt implementation of the Mallet package (McCallum, 2002) modified to use the objective function of Charniak and Johnson (2005) and we optimized the L2 regularizer coefficient on the development set. [sent-163, score-0.234]
99 The beam size was set to 15 (the value suggested by Huang (2008)) during parsing and during the training of the ’perceptron with global training’ and ’perceptron with early update’ models. [sent-164, score-0.148]
100 We used k = 3 for training the ’subtree ranking by AvgPer’ and ’subtree ranking by MaxEnt’ rankers (see Section 5 for a discussion on this). [sent-165, score-0.372]
wordName wordTfidf (topN-words)
[('forest', 0.496), ('subtree', 0.374), ('ranker', 0.336), ('reranking', 0.235), ('reranker', 0.234), ('perceptron', 0.188), ('oracle', 0.162), ('subtrees', 0.146), ('ranking', 0.143), ('hyperedge', 0.126), ('german', 0.103), ('bitpar', 0.101), ('forests', 0.09), ('rankers', 0.086), ('rerankers', 0.086), ('candidates', 0.081), ('avgper', 0.075), ('packed', 0.075), ('update', 0.075), ('parse', 0.074), ('entropy', 0.072), ('base', 0.07), ('huang', 0.07), ('parsing', 0.067), ('searn', 0.065), ('tails', 0.065), ('hyperedges', 0.065), ('topological', 0.059), ('lists', 0.057), ('rank', 0.056), ('charniak', 0.055), ('candidate', 0.055), ('vi', 0.052), ('eii', 0.05), ('fvor', 0.05), ('hayashi', 0.05), ('hv', 0.05), ('ovi', 0.05), ('subcorpus', 0.05), ('szepesv', 0.05), ('vgoal', 0.05), ('schmid', 0.048), ('ov', 0.048), ('global', 0.046), ('ontonotes', 0.046), ('operating', 0.046), ('selection', 0.044), ('beamsearch', 0.043), ('endc', 0.043), ('conflation', 0.043), ('local', 0.043), ('employing', 0.042), ('johnson', 0.041), ('parser', 0.04), ('tree', 0.039), ('tiger', 0.039), ('vj', 0.039), ('neu', 0.039), ('farkas', 0.039), ('chief', 0.039), ('parseval', 0.039), ('maxent', 0.039), ('rc', 0.039), ('averaged', 0.037), ('pcfg', 0.036), ('zong', 0.036), ('fcor', 0.036), ('hc', 0.036), ('trajectories', 0.036), ('early', 0.035), ('decision', 0.034), ('bottleneck', 0.034), ('reward', 0.034), ('node', 0.034), ('regarded', 0.033), ('headed', 0.032), ('strategy', 0.031), ('brackets', 0.029), ('decoding', 0.028), ('sub', 0.028), ('keeps', 0.028), ('grammatical', 0.028), ('maximum', 0.027), ('pruning', 0.027), ('reinforcement', 0.027), ('updating', 0.027), ('nodes', 0.026), ('gold', 0.026), ('store', 0.024), ('creation', 0.024), ('offline', 0.024), ('framework', 0.024), ('collins', 0.023), ('kind', 0.023), ('generalization', 0.023), ('constituents', 0.023), ('constituent', 0.023), ('employed', 0.023), ('ends', 0.023), ('parses', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 55 emnlp-2012-Forest Reranking through Subtree Ranking
Author: Richard Farkas ; Helmut Schmid
Abstract: We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. This leads to better parameter optimization. Another chief advantage of the framework is that arbitrary learning to rank methods can be applied. We evaluated our reranking approach on German and English phrase structure parsing tasks and compared it to various state-of-the-art reranking approaches such as the perceptron-based forest reranker. The subtree ranking approach with a Maximum Entropy model significantly outperformed the other approaches.
2 0.13590595 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
Author: Jonathan K. Kummerfeld ; David Hall ; James R. Curran ; Dan Klein
Abstract: Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for state-of-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
3 0.10892491 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori
Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser’s derivation tree as a latent variable in a model that is trained to maximize reordering accuracy. We demonstrate that efficient large-margin training is possible by showing that two measures of reordering accuracy can be factored over the parse tree. Using this model in the pre-ordering framework results in significant gains in translation accuracy over standard phrasebased SMT and previously proposed unsupervised syntax induction methods.
4 0.10340659 57 emnlp-2012-Generalized Higher-Order Dependency Parsing with Cube Pruning
Author: Hao Zhang ; Ryan McDonald
Abstract: State-of-the-art graph-based parsers use features over higher-order dependencies that rely on decoding algorithms that are slow and difficult to generalize. On the other hand, transition-based dependency parsers can easily utilize such features without increasing the linear complexity of the shift-reduce system beyond a constant. In this paper, we attempt to address this imbalance for graph-based parsing by generalizing the Eisner (1996) algorithm to handle arbitrary features over higher-order dependencies. The generalization is at the cost of asymptotic efficiency. To account for this, cube pruning for decoding is utilized (Chiang, 2007). For the first time, label tuple and structural features such as valencies can be scored efficiently with third-order features in a graph-based parser. Our parser achieves the state-of-the-art unlabeled accuracy of 93.06% and labeled accuracy of 91.86% on the standard test set for English, at a faster speed than a reimplementation of the third-order model of Koo et al. (2010).
5 0.093656383 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
Author: Bernd Bohnet ; Joakim Nivre
Abstract: Most current dependency parsers presuppose that input words have been morphologically disambiguated using a part-of-speech tagger before parsing begins. We present a transition-based system for joint part-of-speech tagging and labeled dependency parsing with non-projective trees. Experimental evaluation on Chinese, Czech, English and German shows consistent improvements in both tagging and parsing accuracy when compared to a pipeline system, which lead to improved state-of-the-art results for all languages.
6 0.082012668 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
7 0.081203744 37 emnlp-2012-Dynamic Programming for Higher Order Parsing of Gap-Minding Trees
8 0.061909679 59 emnlp-2012-Generating Non-Projective Word Order in Statistical Linearization
9 0.061088555 78 emnlp-2012-Learning Lexicon Models from Search Logs for Query Expansion
10 0.060763866 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
11 0.058105297 46 emnlp-2012-Exploiting Reducibility in Unsupervised Dependency Parsing
12 0.058051307 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
13 0.057764228 112 emnlp-2012-Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge
14 0.056415401 5 emnlp-2012-A Discriminative Model for Query Spelling Correction with Latent Structural SVM
15 0.054852296 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
16 0.052878916 86 emnlp-2012-Locally Training the Log-Linear Model for SMT
17 0.052802842 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
18 0.046698846 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
19 0.045338117 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
20 0.044851635 11 emnlp-2012-A Systematic Comparison of Phrase Table Pruning Techniques
topicId topicWeight
[(0, 0.175), (1, -0.123), (2, 0.051), (3, -0.036), (4, 0.021), (5, -0.084), (6, -0.038), (7, 0.007), (8, -0.028), (9, 0.139), (10, 0.007), (11, -0.011), (12, 0.021), (13, 0.127), (14, 0.077), (15, -0.006), (16, 0.058), (17, 0.033), (18, 0.007), (19, -0.051), (20, 0.066), (21, -0.063), (22, -0.035), (23, -0.136), (24, 0.04), (25, -0.041), (26, 0.014), (27, 0.08), (28, -0.123), (29, -0.154), (30, -0.023), (31, 0.048), (32, 0.078), (33, 0.092), (34, -0.085), (35, -0.187), (36, -0.143), (37, 0.013), (38, 0.168), (39, -0.025), (40, -0.293), (41, -0.124), (42, 0.114), (43, 0.024), (44, 0.033), (45, 0.22), (46, 0.056), (47, 0.087), (48, 0.018), (49, -0.053)]
simIndex simValue paperId paperTitle
same-paper 1 0.9666965 55 emnlp-2012-Forest Reranking through Subtree Ranking
Author: Richard Farkas ; Helmut Schmid
Abstract: We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. This leads to better parameter optimization. Another chief advantage of the framework is that arbitrary learning to rank methods can be applied. We evaluated our reranking approach on German and English phrase structure parsing tasks and compared it to various state-of-the-art reranking approaches such as the perceptron-based forest reranker. The subtree ranking approach with a Maximum Entropy model significantly outperformed the other approaches.
2 0.59842277 105 emnlp-2012-Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output
Author: Jonathan K. Kummerfeld ; David Hall ; James R. Curran ; Dan Klein
Abstract: Constituency parser performance is primarily interpreted through a single metric, F-score on WSJ section 23, that conveys no linguistic information regarding the remaining errors. We classify errors within a set of linguistically meaningful types using tree transformations that repair groups of errors together. We use this analysis to answer a range of questions about parser behaviour, including what linguistic constructions are difficult for state-of-the-art parsers, what types of errors are being resolved by rerankers, and what types are introduced when parsing out-of-domain text.
3 0.37546557 57 emnlp-2012-Generalized Higher-Order Dependency Parsing with Cube Pruning
Author: Hao Zhang ; Ryan McDonald
Abstract: State-of-the-art graph-based parsers use features over higher-order dependencies that rely on decoding algorithms that are slow and difficult to generalize. On the other hand, transition-based dependency parsers can easily utilize such features without increasing the linear complexity of the shift-reduce system beyond a constant. In this paper, we attempt to address this imbalance for graph-based parsing by generalizing the Eisner (1996) algorithm to handle arbitrary features over higher-order dependencies. The generalization is at the cost of asymptotic efficiency. To account for this, cube pruning for decoding is utilized (Chiang, 2007). For the first time, label tuple and structural features such as valencies can be scored efficiently with third-order features in a graph-based parser. Our parser achieves the state-of-the-art unlabeled accuracy of 93.06% and labeled accuracy of 91.86% on the standard test set for English, at a faster speed than a reimplementation of the third-order model of Koo et al. (2010).
4 0.35405073 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
Author: Yang Feng ; Yang Liu ; Qun Liu ; Trevor Cohn
Abstract: Decoding algorithms for syntax based machine translation suffer from high computational complexity, a consequence of intersecting a language model with a context free grammar. Left-to-right decoding, which generates the target string in order, can improve decoding efficiency by simplifying the language model evaluation. This paper presents a novel left to right decoding algorithm for tree-to-string translation, using a bottom-up parsing strategy and dynamic future cost estimation for each partial translation. Our method outperforms previously published tree-to-string decoders, including a competing left-to-right method.
5 0.3393701 67 emnlp-2012-Inducing a Discriminative Parser to Optimize Machine Translation Reordering
Author: Graham Neubig ; Taro Watanabe ; Shinsuke Mori
Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser’s derivation tree as a latent variable in a model that is trained to maximize reordering accuracy. We demonstrate that efficient large-margin training is possible by showing that two measures of reordering accuracy can be factored over the parse tree. Using this model in the pre-ordering framework results in significant gains in translation accuracy over standard phrasebased SMT and previously proposed unsupervised syntax induction methods.
6 0.3186582 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
7 0.30738175 78 emnlp-2012-Learning Lexicon Models from Search Logs for Query Expansion
8 0.30612409 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
9 0.29372799 114 emnlp-2012-Revisiting the Predictability of Language: Response Completion in Social Media
10 0.28010792 7 emnlp-2012-A Novel Discriminative Framework for Sentence-Level Discourse Analysis
11 0.27727312 37 emnlp-2012-Dynamic Programming for Higher Order Parsing of Gap-Minding Trees
12 0.27268973 46 emnlp-2012-Exploiting Reducibility in Unsupervised Dependency Parsing
13 0.26233938 119 emnlp-2012-Spectral Dependency Parsing with Latent Variables
14 0.26135728 133 emnlp-2012-Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision
15 0.2448379 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
16 0.24344023 59 emnlp-2012-Generating Non-Projective Word Order in Statistical Linearization
17 0.24216913 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
18 0.21711364 88 emnlp-2012-Minimal Dependency Length in Realization Ranking
19 0.20810811 121 emnlp-2012-Supervised Text-based Geolocation Using Language Models on an Adaptive Grid
20 0.2040381 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings
topicId topicWeight
[(14, 0.011), (16, 0.067), (29, 0.013), (34, 0.073), (45, 0.011), (60, 0.103), (63, 0.041), (64, 0.017), (65, 0.022), (69, 0.366), (70, 0.019), (73, 0.02), (74, 0.044), (76, 0.045), (80, 0.023), (86, 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.76560444 55 emnlp-2012-Forest Reranking through Subtree Ranking
Author: Richard Farkas ; Helmut Schmid
Abstract: We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. This leads to better parameter optimization. Another chief advantage of the framework is that arbitrary learning to rank methods can be applied. We evaluated our reranking approach on German and English phrase structure parsing tasks and compared it to various state-of-the-art reranking approaches such as the perceptron-based forest reranker. The subtree ranking approach with a Maximum Entropy model significantly outperformed the other approaches.
2 0.39820513 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
3 0.39810893 123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon
Author: Greg Durrett ; Adam Pauls ; Dan Klein
Abstract: We consider the problem of using a bilingual dictionary to transfer lexico-syntactic information from a resource-rich source language to a resource-poor target language. In contrast to past work that used bitexts to transfer analyses of specific sentences at the token level, we instead use features to transfer the behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different lowresource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed).
Author: Bernd Bohnet ; Joakim Nivre
Abstract: Most current dependency parsers presuppose that input words have been morphologically disambiguated using a part-of-speech tagger before parsing begins. We present a transition-based system for joint part-of-speech tagging and labeled dependency parsing with non-projective trees. Experimental evaluation on Chinese, Czech, English and German shows consistent improvements in both tagging and parsing accuracy when compared to a pipeline system, which lead to improved state-of-the-art results for all languages.
5 0.39319199 70 emnlp-2012-Joint Chinese Word Segmentation, POS Tagging and Parsing
Author: Xian Qian ; Yang Liu
Abstract: In this paper, we propose a novel decoding algorithm for discriminative joint Chinese word segmentation, part-of-speech (POS) tagging, and parsing. Previous work often used a pipeline method Chinese word segmentation followed by POS tagging and parsing, which suffers from error propagation and is unable to leverage information in later modules for earlier components. In our approach, we train the three individual models separately during training, and incorporate them together in a unified framework during decoding. We extend the CYK parsing algorithm so that it can deal with word segmentation and POS tagging features. As far as we know, this is the first work on joint Chinese word segmentation, POS tagging and parsing. Our experimental results on Chinese Tree Bank 5 corpus show that our approach outperforms the state-of-the-art pipeline system. –
6 0.39037213 45 emnlp-2012-Exploiting Chunk-level Features to Improve Phrase Chunking
7 0.38928011 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
8 0.38815376 24 emnlp-2012-Biased Representation Learning for Domain Adaptation
9 0.38796484 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
10 0.38519451 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
11 0.38347438 18 emnlp-2012-An Empirical Investigation of Statistical Significance in NLP
12 0.3830843 109 emnlp-2012-Re-training Monolingual Parser Bilingually for Syntactic SMT
13 0.3813875 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
14 0.38129178 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
15 0.37963617 92 emnlp-2012-Multi-Domain Learning: When Do Domains Matter?
16 0.37845835 108 emnlp-2012-Probabilistic Finite State Machines for Regression-based MT Evaluation
17 0.37840459 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
18 0.37738904 42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation
19 0.37638283 131 emnlp-2012-Unified Dependency Parsing of Chinese Morphological and Syntactic Structures
20 0.37533239 5 emnlp-2012-A Discriminative Model for Query Spelling Correction with Latent Structural SVM