emnlp emnlp2011 emnlp2011-134 knowledge-graph by maker-knowledge-mining

134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests


Source: pdf

Author: Katsuhiko Hayashi ; Taro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: We propose a novel forest reranking algorithm for discriminative dependency parsing based on a variant of Eisner’s generative model. In our framework, we define two kinds of generative model for reranking. One is learned from training data offline and the other from a forest generated by a baseline parser on the fly. The final prediction in the reranking stage is performed using linear interpolation of these models and discriminative model. In order to efficiently train the model from and decode on a hypergraph data structure representing a forest, we apply extended inside/outside and Viterbi algorithms. Experimental results show that our proposed forest reranking algorithm achieves significant improvement when compared with conventional approaches.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We propose a novel forest reranking algorithm for discriminative dependency parsing based on a variant of Eisner’s generative model. [sent-7, score-0.997]

2 One is learned from training data offline and the other from a forest generated by a baseline parser on the fly. [sent-9, score-0.302]

3 The final prediction in the reranking stage is performed using linear interpolation of these models and discriminative model. [sent-10, score-0.531]

4 Experimental results show that our proposed forest reranking algorithm achieves significant improvement when compared with conventional approaches. [sent-12, score-0.732]

5 1 Introduction Recently, much research on statistical parsing has focused on k-best (or forest) reranking (Collins, 2000; Charniak and Johnson, 2005; Huang, 2008). [sent-13, score-0.464]

6 Typically, reranking methods first generate a list of top-k candidates (or a forest) from a baseline system, then rerank the candidates with arbitrary features that are intractable within the baseline system. [sent-14, score-0.568]

7 In the reranking framework, the baseline system is usually modeled with a generative model, and a discriminative model is used for reranking. [sent-15, score-0.648]

8 Sangati et al. (2009) reversed the usual order of the two models for dependency parsing by employing a generative model to rescore the k-best candidates provided by a discriminative model. [sent-17, score-0.409]

9 They use a variant of Eisner’s generative model C (Eisner, 1996b; Eisner, 1996a) for reranking and extend it to capture higher-order information than Eisner’s second-order generative model. [sent-18, score-0.703]

10 Their reranking model showed large improvements in dependency parsing accuracy. [sent-19, score-0.579]

11 In this paper, we propose a forest generative reranking algorithm, as opposed to Sangati et al. [sent-21, score-0.786]

12 Moreover, our reranking uses not only a generative model obtained from training data, but also a sentence specific generative model learned from a forest. [sent-24, score-0.724]

13 In the reranking stage, we use a linear combination of these models. [sent-25, score-0.443]
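As a rough illustration (not the authors' code), the final reranking score can be read as a weighted combination of the baseline discriminative score and the two generative log-probabilities (the offline model and the sentence-specific model learned from the forest). The function and weight names below are hypothetical placeholders; the interpolation weights would be tuned, e.g. with MERT as in the experiments.

    def rerank_score(disc_score, gen_logprob, var_logprob, lambdas):
        # disc_score: baseline discriminative score of the candidate tree
        # gen_logprob: log-probability under the offline generative model
        # var_logprob: log-probability under the sentence-specific (variational) model
        l_disc, l_gen, l_var = lambdas   # interpolation weights (tuned, e.g. by MERT)
        return l_disc * disc_score + l_gen * gen_logprob + l_var * var_logprob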

14 The model proposed in this paper is factored into third-order structures; therefore, its non-locality makes it difficult to perform the reranking with a standard 1-best Viterbi search. [sent-27, score-0.514]

15 This algorithm enables exact 1-best reranking without any approximation. [sent-29, score-0.451]

16 • To extend k-best to forest generative reranking. [sent-31, score-0.364]

17 • We introduce variational reranking, which is a combination approach of generative reranking and variational decoding (Li et al. [sent-32, score-1.646]

18 • To obtain the 1-best tree in the reranking stage, we use an exact dynamic programming (extended Viterbi) search over the forest. [sent-34, score-0.422]

19 Moreover, our variational reranking framework achieves consistent improvements compared to conventional approaches, such as simple k-best and forest-based generative reranking algorithms. [sent-38, score-1.33]

20 While there may be exponentially many dependency trees, the forest represents them in polynomial space. [sent-46, score-0.328]

21 A dependency forest (or tree) can be defined as a hypergraph data structure HG (Tu et al. [sent-47, score-0.428]

22 The hyperedge e is an incoming edge for saw1,8 and an outgoing edge for each of I1,2, girl3,5 and with5,8. [sent-73, score-0.277]

23 For notational brevity of the algorithmic description, we do not distinguish left and right tails in the definition, but our implementation implicitly distinguishes left tails tailsL(e) and right tails tailsR(e). [sent-85, score-0.632]

24 We define the set of incoming edges of a node v as IE(v) and the set of outgoing edges of a node v as OE(v). [sent-86, score-0.363]
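For concreteness, here is a minimal Python sketch of such a hypergraph container, keeping IE(v) and OE(v) as adjacency maps. Class and attribute names are illustrative, not taken from the authors' implementation (which additionally splits tails into tailsL(e) and tailsR(e)).

    from collections import defaultdict

    class Hyperedge:
        def __init__(self, head, tails):
            self.head = head          # head node of the hyperedge
            self.tails = tails        # dependent (tail) nodes, left and right merged

    class Hypergraph:
        def __init__(self):
            self.nodes = set()
            self.incoming = defaultdict(list)   # IE(v): hyperedges whose head is v
            self.outgoing = defaultdict(list)   # OE(v): hyperedges in which v is a tail

        def add_edge(self, head, tails):
            e = Hyperedge(head, tails)
            self.nodes.add(head)
            self.nodes.update(tails)
            self.incoming[head].append(e)
            for v in tails:
                self.outgoing[v].append(e)
            return e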

25 Like a variation of Eisner’s generative model C (Eisner, 1996b; Eisner, 1996a). (Footnote 1: In Figure 1, according to the custom of dependency tree description, the direction of a hyperedge is written as from head to tail nodes.) [sent-89, score-0.478]

26 First, we define a tri-sibling model whose context space consists of the head node, sibling node, tri-sibling node and direction of a node v: q1(v|C(v)) = q1(v|h, sib, tsib, dir) (4), where h, sib and tsib are the head, sibling and tri-sibling nodes of v, and dir is the direction of v from h. [sent-99, score-1.03]

27 Following Sangati et al. (2009), we define a grandsibling model whose context space consists of the head node, sibling node, grandparent node and direction of a node v. [sent-107, score-0.813]

28 This model is factored in a grandsibling structure shown in the right side of Figure 2. [sent-113, score-0.491]

29 The direct estimation of tri-sibling and grandsibling models from a corpus suffers from serious data sparseness issues. [sent-114, score-0.439]

30 Table 2: Reduction lists for tri-sibling and grandsibling models: wt(), w() and t() denote word plus POS-tag, word only, and POS-tag only for a node, respectively. [sent-123, score-0.444]
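One way to read Table 2 is as a back-off chain: if the fully lexicalized wt()-based context is unseen, fall back to progressively coarser reductions. The sketch below is only an assumed relative-frequency back-off, not necessarily the paper's exact estimator; all argument names are placeholders.

    def backoff_prob(joint_counts, context_counts, event, reduced_contexts, floor=1e-9):
        # reduced_contexts: the same context under successive reductions, ordered from
        # most specific (word+tag based) to most general (tag only), as in Table 2.
        for ctx in reduced_contexts:
            total = context_counts.get(ctx, 0)
            if total > 0:
                return joint_counts.get((event, ctx), 0) / total
        return floor  # no reduction of the context was ever observed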

31 Figure 2: The left side denotes tri-sibling structure and the right side denotes grandsibling structure. [sent-136, score-0.477]

32 On the other hand, the grandsibling model has non-local features because the grandparent is not factored in one hyperedge. [sent-143, score-0.532]

33 Our reranking models are generative versions of Koo and Collins (2010)’s third-order factorization model. [sent-145, score-0.552]

34 For a constituent parser, Huang (2008) applied cube pruning techniques to forest reranking with non-local features. [sent-149, score-0.737]

35 Our models share the same spirit as their third-order parsing algorithm, since the grandsibling model is similar to their model 1 in that it is factored in the grandsibling structure. [sent-154, score-0.97]

36 Line 4 references an outgoing edge e′ of node h from the set of outgoing edges OE(h). [sent-157, score-0.291]

37 tails(e) contains a node v, the sibling node sib and the tri-sibling node tsib of v; moreover, the head of e′ (head(e′)) is the grandparent of v and sib. [sent-158, score-0.904]

38 Thus, in line 5, we can capture tri-sibling and grandsibling information and compute the current inside estimate of Eq. [sent-159, score-0.448]

39 ⊗(d(v1, e), . . . , d(v|e|, e)) = ⊗_{i=1}^{|e|} d(vi, e) (9), where d(vi, e) denotes the current estimate of the best cost for the pair of node vi and hyperedge e. [sent-168, score-0.276]

40 Each ctsib and cgsib in lines 5 and 7 indicates the cost of the tri-sibling and grandsibling models. Algorithm 1 Exact DP-Search Algorithm(HG(x)): 1: for h ∈ V in bottom-up topological order do 2: for e ∈ IE(h) do 3: // tails(e) is {v1, . [sent-170, score-0.74]

41 We do not compute the cost of the grandsibling model when h is the top node because the top node has no outgoing edges. [sent-183, score-0.787]
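Read together, sentences 36-41 describe a bottom-up Viterbi search whose chart entries are (node, outgoing-edge) pairs, so that the grandparent head(e′) is visible when scoring the tails of e. Below is a hedged Python sketch under that reading; tsib_logcost and gsib_logcost are placeholder scoring functions, leaf initialization is implicit (log-score 0), and left/right tails are not distinguished.

    import math

    def exact_dp_search(hg, topo_order, tsib_logcost, gsib_logcost):
        # d[(v, e_out)]: best inside log-score of the subtree rooted at v, given that
        # v attaches to its head via hyperedge e_out (e_out in OE(v)).
        d = {}
        for h in topo_order:                          # bottom-up topological order over V
            for e in hg.incoming[h]:                  # e in IE(h): h together with its dependents
                # inside score of the dependents plus tri-sibling costs (local to e)
                inside = sum(d.get((v, e), 0.0) + tsib_logcost(v, e) for v in e.tails)
                for e_prime in (hg.outgoing[h] or [None]):   # top node has no outgoing edge
                    score = inside
                    if e_prime is not None:           # head(e_prime) is the grandparent of e.tails
                        score += sum(gsib_logcost(v, e, e_prime) for v in e.tails)
                    key = (h, e_prime)
                    if score > d.get(key, -math.inf):
                        d[key] = score
        return d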

42 Our search algorithm is based on the third-order parsing algorithm, but the search space has already been shrunk by a baseline parser’s k-best approximation and the forest pruning algorithm presented in the next section. [sent-186, score-0.466]

43 Therefore, the time efficiency of our reranking is unimpaired. [sent-187, score-0.422]

44 3 Forest Pruning Charniak and Johnson (2005) and Huang (2008) proposed forest pruning algorithms to reduce the size of a forest. [sent-189, score-0.293]

45 In our experiments, we use Charniak and Johnson (2005)’s forest pruning criterion because the variational model needs traditional inside/outside probabilities for its ML estimation. [sent-191, score-0.602]

46 Following Huang (2008), we also prune away nodes with all incoming and outgoing hyperedges pruned. [sent-194, score-0.292]
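A hedged sketch of such posterior-based pruning (in the spirit of Charniak and Johnson (2005) and Huang (2008), not a reproduction of their code): a hyperedge is kept if its inside-outside posterior lies within a threshold of the total forest score, and nodes whose hyperedges are all pruned are then dropped. The threshold, score tables, and edge weights here are placeholders.

    def prune_forest(hg, inside, outside, edge_logweight, total_logprob, threshold):
        kept_edges = set()
        for h in hg.nodes:
            for e in hg.incoming[h]:
                # posterior weight of e: outside of its head + inside of its tails + edge score
                post = outside[h] + sum(inside[v] for v in e.tails) + edge_logweight[e]
                if post >= total_logprob - threshold:
                    kept_edges.add(e)
        # prune away nodes with all incoming and outgoing hyperedges pruned
        kept_nodes = {v for v in hg.nodes
                      if any(e in kept_edges
                             for e in list(hg.incoming[v]) + list(hg.outgoing[v]))}
        return kept_nodes, kept_edges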

47 Li et al. (2009) proposed a variational decision rule that rescores candidates with an approximate distribution q ∈ Q. [sent-198, score-0.354]

48 Here, we propose to apply the variational decision rule to dependency parsing. [sent-202, score-0.404]

49 For dependency parsing, we can choose to model q∗ as the tri-sibling and grandsibling generative models in section 3. [sent-203, score-0.654]

50 This variational approximation framework can be applied to other tasks that involve collapsing spurious ambiguity, such as latent-variable parsing (Matsuzaki et al. [sent-207, score-0.351]

51 For each hyperedge e, we denote ctsib as the posterior weight for computing the expected count c1 of events in the tri-sibling model q1∗. [sent-218, score-0.332]

52 The expected count c2 needed for the estimation of the grandsibling model q2∗ is extracted in lines 7-15. [sent-220, score-0.49]

53 c2 for the grandsibling model must be extracted over two hyperedges e and e′ because it needs grandparent information. [sent-221, score-0.648]
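Conceptually, the sentence-specific models q1∗ and q2∗ are obtained by accumulating posterior-weighted (expected) counts of tri-sibling and grandsibling events over the forest and then normalizing. The sketch below assumes the posterior weights ctsib and cgsib have already been computed from inside/outside scores; the event-extraction helpers are hypothetical.

    from collections import defaultdict

    def expected_counts(hg, posterior_tsib, posterior_gsib, tsib_events, gsib_events):
        # c1: expected counts of tri-sibling events (weight ctsib, one per hyperedge e);
        # c2: expected counts of grandsibling events, collected over pairs (e, e')
        # because the grandparent is head(e').
        c1, c2 = defaultdict(float), defaultdict(float)
        for h in hg.nodes:
            for e in hg.incoming[h]:
                for event in tsib_events(e):                 # e.g. (v, h, sib, tsib, dir)
                    c1[event] += posterior_tsib[e]
                for e_prime in hg.outgoing[h]:               # head(e_prime) = grandparent
                    for event in gsib_events(e, e_prime):    # e.g. (v, h, sib, grandparent, dir)
                        c2[event] += posterior_gsib[(e, e_prime)]
        return c1, c2   # q1*, q2* follow by relative-frequency normalization of c1 and c2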

54 Figure 3: The relationship between the data size (the number of hyperedges per sentence) and oracle scores on development data: Forests encode candidates with high accuracy scores more compactly than k-best lists. [sent-223, score-0.283]

55 Li et al. (2009) assume n-gram locality of the forest to efficiently train the model, namely, the baseline n-gram model has a larger n than that of the variational n-gram model. [sent-229, score-0.572]

56 In our case, grandsibling locality is not embedded in the forest generated from the baseline parser. [sent-230, score-0.672]

57 Note that this framework is a combination of variational decoding and generative reranking. [sent-237, score-0.535]

58 Table 4: The statistics of forests and 20-best lists on development data: this shows the average number of hyperedges and nodes per sentence and oracle scores. [sent-239, score-0.388]

59 We obtain k-best lists and forests generated from the baseline discriminative model which has the same feature set as provided in (McDonald et al. [sent-247, score-0.299]

60 We also train a generative reranking model from the training data. [sent-252, score-0.573]

61 Our variational reranking does not need much time to train the model because the training is performed not over the training data (39,832 sentences) but over the development data (1,700 sentences). [sent-255, score-0.757]

62 After MERT was performed until convergence, the variational reranking finally achieved a 94. [sent-256, score-0.736]

63 Though the size of forests is smaller than that of k-best lists, the oracle scores of forests are much higher than those of k-best lists. [sent-275, score-0.366]

64 2 The Performance of Reranking First, we compare the performance of variational decoding with that of MBR decoding. [sent-277, score-0.384]

65 However, compared with baseline, the gains of variational and MBR decoding are small. [sent-280, score-0.384]

66 Second, we also compare the performance of variational reranking with k-best and forest generative reranking algorithms. [sent-281, score-1.496]

67 Table 6 shows that our variational reranking framework achieves the highest accuracy scores. [sent-282, score-0.756]

68 Unlike the decoding framework, reranking achieves significant improvements. [sent-283, score-0.543]

69 This result is intuitively reasonable because the reranking model obtained from training data has the ability to select a globally consistent candidate, while the variational approximate model obtained from a forest only supports selecting a locally consistent candidate. [sent-284, score-1.009]

70 On the other hand, the fact that variational reranking achieves the best results clearly indicates that the combination of the sentence-specific generative model and the model obtained from training data is successful in selecting a candidate from a forest that is appropriate both locally and globally. [sent-285, score-0.909]

71 66GHz Quad-Core Xeon) of the baseline k-best, generative reranking and variational reranking parsers (implemented in Java). [sent-287, score-1.291]

72 Table 7: The parsing time (CPU seconds per sentence) and accuracy scores of the baseline k-best, generative reranking and variational reranking parsers. [sent-298, score-1.333]

73 Table 8: The comparison of tri-sibling and grandsibling models: the grandsibling model outperforms the tri-sibling model. [sent-304, score-0.839]

74 This means that our reranking parser can parse sentences in reasonable time. [sent-307, score-0.461]

75 As described above, our variational reranking model achieves higher accuracy scores than the others. [sent-310, score-0.756]

76 To analyze the factors that improve accuracy scores, we further investigate whether variational reranking performs better with the tri-sibling or the grandsibling model. [sent-311, score-0.857]

77 Table 8 indicates that the grandsibling model achieves a larger gain than the tri-sibling model. [sent-312, score-0.455]

78 Table 9 shows examples whose accuracy scores are improved by the grandsibling model. [sent-313, score-0.409]

79 On the other hand, many errors still remain. Table 6: The comparison of the reranking frameworks: Generative means the k-best or forest reranking algorithm based on a generative model estimated from a corpus. [sent-315, score-1.258]

80 4 Comparison with Other Systems Table 10 shows the comparison of the performance of variational reranking (16-best forests) with that of other systems. [sent-327, score-0.71]

81 The model trained from unlabeled data can be easily incorporated into our reranking framework. [sent-332, score-0.443]

82 283 in section 23 from baseline and variational reranking parsers. [sent-336, score-0.739]

83 The underlined portions show the effect of the grandsibling model. [sent-337, score-0.409]

84 6 Related Work Collins (2000) and Charniak and Johnson (2005) proposed reranking algorithms for constituent parsers. [sent-338, score-0.451]

85 Huang (2008) extended it to a forest reranking algorithm with non-local features. [sent-339, score-0.685]

86 Our framework is for a dependency parser and the decoding in the reranking stage is done with an exact 1-best dynamic programming algorithm. [sent-340, score-0.731]

87 Sangati et al. (2009) proposed a k-best generative reranking algorithm for dependency parsing. [sent-342, score-0.675]

88 In this paper, we use a similar generative model, but combined with a variational model learned on the fly. [sent-343, score-0.439]

89 Their model 1 is defined by an enclosing grandsibling for each sibling or grandchild part used in Carreras (2007). [sent-346, score-0.475]

90 Our grandsibling model is similar to their model 1, but ours is defined as a generative model. [sent-347, score-0.581]

91 The decoding in the reranking stage is also similar to the parsing algorithm of their model 1. [sent-348, score-0.647]

92 In order to capture grandsibling factors, our decoding calculates inside probabilities not for the current head node alone but for each pair of the node and its outgoing edges. [sent-349, score-0.908]

93 Our variational model is inspired by the study of Li et al. [sent-354, score-0.309]

94 7 Conclusions In this paper, we propose a novel forest reranking algorithm for dependency parsing. [sent-357, score-0.779]

95 Our reranking algorithm is a combination approach of generative reranking and variational decoding. [sent-358, score-1.291]

96 The search in the reranking stage can be performed using a dynamic programming algorithm. [sent-359, score-0.558]

97 Our variational reranking is aimed at selecting a candidate from a forest that is correct both locally and globally. [sent-360, score-0.71]

98 Our experimental results show larger improvements than conventional approaches, such as k-best and forest generative reranking. [sent-361, score-0.386]

99 We plan to examine modeling more complex structures such as granduncle (Goldberg and Elhadad, 2010) or structures of higher order than third-order for reranking, which are computationally expensive for a baseline parser. [sent-364, score-0.472]

100 We also plan to incorporate semi-supervised learning into our framework, which may potentially improve our reranking performance. [sent-366, score-0.422]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('reranking', 0.422), ('grandsibling', 0.409), ('variational', 0.288), ('sib', 0.245), ('forest', 0.234), ('tails', 0.196), ('forests', 0.168), ('cgsib', 0.163), ('tsib', 0.163), ('hyperedges', 0.155), ('ctsib', 0.147), ('hyperedge', 0.14), ('generative', 0.13), ('dir', 0.123), ('node', 0.113), ('mbr', 0.113), ('hypergraph', 0.1), ('decoding', 0.096), ('dependency', 0.094), ('hg', 0.089), ('outgoing', 0.089), ('wrd', 0.083), ('eisner', 0.076), ('vr', 0.07), ('huang', 0.066), ('koo', 0.065), ('ggm', 0.065), ('vl', 0.065), ('sangati', 0.064), ('dist', 0.063), ('grandparent', 0.063), ('pruning', 0.059), ('viterbi', 0.052), ('head', 0.049), ('mert', 0.048), ('incoming', 0.048), ('ayr', 0.047), ('discriminative', 0.046), ('sibling', 0.045), ('candidates', 0.044), ('tail', 0.044), ('parsing', 0.042), ('titov', 0.04), ('factored', 0.039), ('parser', 0.039), ('inside', 0.039), ('stage', 0.037), ('semiring', 0.035), ('duo', 0.035), ('lists', 0.035), ('event', 0.034), ('kumar', 0.033), ('arqg', 0.033), ('aui', 0.033), ('tailsl', 0.033), ('tailsr', 0.033), ('variationxal', 0.033), ('usual', 0.032), ('henderson', 0.031), ('compactly', 0.031), ('tag', 0.031), ('charniak', 0.031), ('collins', 0.031), ('oracle', 0.03), ('lines', 0.03), ('estimation', 0.03), ('baseline', 0.029), ('iwpt', 0.029), ('algorithm', 0.029), ('mcdonald', 0.028), ('itb', 0.028), ('johnson', 0.028), ('ax', 0.027), ('oe', 0.026), ('performed', 0.026), ('reductions', 0.026), ('aryg', 0.026), ('nara', 0.026), ('achieves', 0.025), ('events', 0.024), ('carreras', 0.024), ('pe', 0.024), ('vc', 0.024), ('occuring', 0.024), ('encode', 0.023), ('japan', 0.023), ('globally', 0.023), ('denotes', 0.023), ('conventional', 0.022), ('decision', 0.022), ('dynamic', 0.022), ('cube', 0.022), ('search', 0.022), ('right', 0.022), ('loss', 0.021), ('top', 0.021), ('framework', 0.021), ('model', 0.021), ('goldberg', 0.021), ('fro', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests

Author: Katsuhiko Hayashi ; Taro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: We propose a novel forest reranking algorithm for discriminative dependency parsing based on a variant of Eisner’s generative model. In our framework, we define two kinds of generative model for reranking. One is learned from training data offline and the other from a forest generated by a baseline parser on the fly. The final prediction in the reranking stage is performed using linear interpolation of these models and discriminative model. In order to efficiently train the model from and decode on a hypergraph data structure representing a forest, we apply extended inside/outside and Viterbi algorithms. Experimental results show that our proposed forest reranking algorithm achieves significant improvement when compared with conventional approaches.

2 0.21551819 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding

Author: Marco Dinarelli ; Sophie Rosset

Abstract: Reranking models have been successfully applied to many tasks of Natural Language Processing. However, there are two aspects of this approach that need a deeper investigation: (i) Assessment of hypotheses generated for reranking at classification phase: baseline models generate a list of hypotheses and these are used for reranking without any assessment; (ii) Detection of cases where reranking models provide a worst result: the best hypothesis provided by the reranking model is assumed to be always the best result. In some cases the reranking model provides an incorrect hypothesis while the baseline best hypothesis is correct, especially when baseline models are accurate. In this paper we propose solutions for these two aspects: (i) a semantic inconsistency metric to select possibly more correct n-best hypotheses, from a large set generated by an SLU basiline model. The selected hypotheses are reranked applying a state-of-the-art model based on Partial Tree Kernels, which encode SLU hypotheses in Support Vector Machines with complex structured features; (ii) finally, we apply a decision strategy, based on confidence values, to select the final hypothesis between the first ranked hypothesis provided by the baseline SLU model and the first ranked hypothesis provided by the re-ranker. We show the effectiveness of these solutions presenting comparative results obtained reranking hypotheses generated by a very accurate Conditional Random Field model. We evaluate our approach on the French MEDIA corpus. The results show significant improvements with respect to current state-of-the-art and previous 1104 Sophie Rosset LIMSI-CNRS B.P. 133, 91403 Orsay Cedex France ro s set @ l ims i fr . re-ranking models.

3 0.20659178 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training

Author: Xinyan Xiao ; Yang Liu ; Qun Liu ; Shouxun Lin

Abstract: Although discriminative training guarantees to improve statistical machine translation by incorporating a large amount of overlapping features, it is hard to scale up to large data due to decoding complexity. We propose a new algorithm to generate translation forest of training data in linear time with the help of word alignment. Our algorithm also alleviates the oracle selection problem by ensuring that a forest always contains derivations that exactly yield the reference translation. With millions of features trained on 519K sentences in 0.03 second per sentence, our system achieves significant improvement by 0.84 BLEU over the baseline system on the NIST Chinese-English test sets.

4 0.12696101 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

Author: Zhenghua Li ; Min Zhang ; Wanxiang Che ; Ting Liu ; Wenliang Chen ; Haizhou Li

Abstract: Part-of-speech (POS) is an indispensable feature in dependency parsing. Current research usually models POS tagging and dependency parsing independently. This may suffer from error propagation problem. Our experiments show that parsing accuracy drops by about 6% when using automatic POS tags instead of gold ones. To solve this issue, this paper proposes a solution by jointly optimizing POS tagging and dependency parsing in a unique model. We design several joint models and their corresponding decoding algorithms to incorporate different feature sets. We further present an effective pruning strategy to reduce the search space of candidate POS tags, leading to significant improvement of parsing speed. Experimental results on Chinese Penn Treebank 5 show that our joint models significantly improve the state-of-the-art parsing accuracy by about 1.5%. Detailed analysis shows that the joint method is able to choose such POS tags that are more helpful and discriminative from parsing viewpoint. This is the fundamental reason of parsing accuracy improvement.

5 0.10095681 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

Author: Wei Lu ; Hwee Tou Ng

Abstract: This paper describes a novel probabilistic approach for generating natural language sentences from their underlying semantics in the form of typed lambda calculus. The approach is built on top of a novel reduction-based weighted synchronous context free grammar formalism, which facilitates the transformation process from typed lambda calculus into natural language sentences. Sentences can then be generated based on such grammar rules with a log-linear model. To acquire such grammar rules automatically in an unsupervised manner, we also propose a novel approach with a generative model, which maps from sub-expressions of logical forms to word sequences in natural language sentences. Experiments on benchmark datasets for both English and Chinese generation tasks yield significant improvements over results obtained by two state-of-the-art machine translation models, in terms of both automatic metrics and human evaluation.

6 0.088049889 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

7 0.085000128 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

8 0.084373116 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction

9 0.084209085 131 emnlp-2011-Syntactic Decision Tree LMs: Random Selection or Intelligent Design?

10 0.078450769 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

11 0.073201053 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types

12 0.07163538 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives

13 0.067101873 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

14 0.061319895 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

15 0.055655591 45 emnlp-2011-Dual Decomposition with Many Overlapping Components

16 0.055232301 66 emnlp-2011-Hierarchical Phrase-based Translation Representations

17 0.053805113 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation

18 0.052658204 100 emnlp-2011-Optimal Search for Minimum Error Rate Training

19 0.049893521 125 emnlp-2011-Statistical Machine Translation with Local Language Models

20 0.049888831 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.191), (1, 0.108), (2, -0.005), (3, 0.044), (4, -0.008), (5, -0.005), (6, -0.063), (7, -0.211), (8, 0.193), (9, -0.074), (10, 0.044), (11, 0.082), (12, -0.012), (13, 0.121), (14, -0.068), (15, -0.083), (16, 0.004), (17, -0.053), (18, 0.094), (19, 0.117), (20, 0.057), (21, 0.15), (22, 0.025), (23, 0.044), (24, 0.242), (25, -0.197), (26, 0.267), (27, 0.025), (28, 0.03), (29, -0.072), (30, 0.21), (31, -0.098), (32, -0.128), (33, 0.147), (34, -0.096), (35, 0.086), (36, -0.141), (37, 0.009), (38, -0.221), (39, -0.026), (40, -0.006), (41, 0.027), (42, -0.016), (43, 0.098), (44, -0.099), (45, -0.054), (46, 0.036), (47, -0.089), (48, 0.026), (49, 0.098)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95160848 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests

Author: Katsuhiko Hayashi ; Taro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: We propose a novel forest reranking algorithm for discriminative dependency parsing based on a variant of Eisner’s generative model. In our framework, we define two kinds of generative model for reranking. One is learned from training data offline and the other from a forest generated by a baseline parser on the fly. The final prediction in the reranking stage is performed using linear interpolation of these models and discriminative model. In order to efficiently train the model from and decode on a hypergraph data structure representing a forest, we apply extended inside/outside and Viterbi algorithms. Experimental results show that our proposed forest reranking algorithm achieves significant improvement when compared with conventional approaches.

2 0.70701969 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding

Author: Marco Dinarelli ; Sophie Rosset

Abstract: Reranking models have been successfully applied to many tasks of Natural Language Processing. However, there are two aspects of this approach that need a deeper investigation: (i) Assessment of hypotheses generated for reranking at classification phase: baseline models generate a list of hypotheses and these are used for reranking without any assessment; (ii) Detection of cases where reranking models provide a worst result: the best hypothesis provided by the reranking model is assumed to be always the best result. In some cases the reranking model provides an incorrect hypothesis while the baseline best hypothesis is correct, especially when baseline models are accurate. In this paper we propose solutions for these two aspects: (i) a semantic inconsistency metric to select possibly more correct n-best hypotheses, from a large set generated by an SLU basiline model. The selected hypotheses are reranked applying a state-of-the-art model based on Partial Tree Kernels, which encode SLU hypotheses in Support Vector Machines with complex structured features; (ii) finally, we apply a decision strategy, based on confidence values, to select the final hypothesis between the first ranked hypothesis provided by the baseline SLU model and the first ranked hypothesis provided by the re-ranker. We show the effectiveness of these solutions presenting comparative results obtained reranking hypotheses generated by a very accurate Conditional Random Field model. We evaluate our approach on the French MEDIA corpus. The results show significant improvements with respect to current state-of-the-art and previous 1104 Sophie Rosset LIMSI-CNRS B.P. 133, 91403 Orsay Cedex France ro s set @ l ims i fr . re-ranking models.

3 0.49884504 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training

Author: Xinyan Xiao ; Yang Liu ; Qun Liu ; Shouxun Lin

Abstract: Although discriminative training guarantees to improve statistical machine translation by incorporating a large amount of overlapping features, it is hard to scale up to large data due to decoding complexity. We propose a new algorithm to generate translation forest of training data in linear time with the help of word alignment. Our algorithm also alleviates the oracle selection problem by ensuring that a forest always contains derivations that exactly yield the reference translation. With millions of features trained on 519K sentences in 0.03 second per sentence, our system achieves significant improvement by 0.84 BLEU over the baseline system on the NIST Chinese-English test sets.

4 0.35392836 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction

Author: Andrea Gesmundo ; James Henderson

Abstract: State of the art Tree Structures Prediction techniques rely on bottom-up decoding. These approaches allow the use of context-free features and bottom-up features. We discuss the limitations of mainstream techniques in solving common Natural Language Processing tasks. Then we devise a new framework that goes beyond Bottom-up Decoding, and that allows a better integration of contextual features. Furthermore we design a system that addresses these issues and we test it on Hierarchical Machine Translation, a well known tree structure prediction problem. The structure of the proposed system allows the incorporation of non-bottom-up features and relies on a more sophisticated decoding approach. We show that the proposed approach can find bet- ter translations using a smaller portion of the search space.

5 0.34100038 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

Author: Sze-Meng Jojo Wong ; Mark Dras

Abstract: Attempts to profile authors according to their characteristics extracted from textual data, including native language, have drawn attention in recent years, via various machine learning approaches utilising mostly lexical features. Drawing on the idea of contrastive analysis, which postulates that syntactic errors in a text are to some extent influenced by the native language of an author, this paper explores the usefulness of syntactic features for native language identification. We take two types of parse substructure as features— horizontal slices of trees, and the more general feature schemas from discriminative parse reranking—and show that using this kind of syntactic feature results in an accuracy score in classification of seven native languages of around 80%, an error reduction of more than 30%.

6 0.3166858 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

7 0.31139272 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types

8 0.30144399 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

9 0.29138297 131 emnlp-2011-Syntactic Decision Tree LMs: Random Selection or Intelligent Design?

10 0.24639875 96 emnlp-2011-Multilayer Sequence Labeling

11 0.22837146 47 emnlp-2011-Efficient retrieval of tree translation examples for Syntax-Based Machine Translation

12 0.22756058 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

13 0.22444732 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

14 0.22370115 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives

15 0.21621117 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

16 0.20679721 100 emnlp-2011-Optimal Search for Minimum Error Rate Training

17 0.19753638 66 emnlp-2011-Hierarchical Phrase-based Translation Representations

18 0.19608282 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction

19 0.19287884 93 emnlp-2011-Minimum Imputed-Risk: Unsupervised Discriminative Training for Machine Translation

20 0.19086878 45 emnlp-2011-Dual Decomposition with Many Overlapping Components


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(11, 0.299), (15, 0.011), (23, 0.101), (36, 0.047), (37, 0.024), (45, 0.048), (53, 0.013), (54, 0.053), (57, 0.014), (62, 0.025), (64, 0.046), (66, 0.029), (69, 0.03), (79, 0.032), (82, 0.015), (87, 0.036), (90, 0.029), (96, 0.048), (98, 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.76306266 71 emnlp-2011-Identifying and Following Expert Investors in Stock Microblogs

Author: Roy Bar-Haim ; Elad Dinur ; Ronen Feldman ; Moshe Fresko ; Guy Goldstein

Abstract: Information published in online stock investment message boards, and more recently in stock microblogs, is considered highly valuable by many investors. Previous work focused on aggregation of sentiment from all users. However, in this work we show that it is beneficial to distinguish expert users from non-experts. We propose a general framework for identifying expert investors, and use it as a basis for several models that predict stock rise from stock microblogging messages (stock tweets). In particular, we present two methods that combine expert identification and per-user unsupervised learning. These methods were shown to achieve relatively high precision in predicting stock rise, and significantly outperform our baseline. In addition, our work provides an in-depth analysis of the content and potential usefulness of stock tweets.

same-paper 2 0.74729127 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests

Author: Katsuhiko Hayashi ; Taro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: We propose a novel forest reranking algorithm for discriminative dependency parsing based on a variant of Eisner’s generative model. In our framework, we define two kinds of generative model for reranking. One is learned from training data offline and the other from a forest generated by a baseline parser on the fly. The final prediction in the reranking stage is performed using linear interpolation of these models and discriminative model. In order to efficiently train the model from and decode on a hypergraph data structure representing a forest, we apply extended inside/outside and Viterbi algorithms. Experimental results show that our proposed forest reranking algorithm achieves significant improvement when compared with conventional approaches.

3 0.5647788 136 emnlp-2011-Training a Parser for Machine Translation Reordering

Author: Jason Katz-Brown ; Slav Petrov ; Ryan McDonald ; Franz Och ; David Talbot ; Hiroshi Ichikawa ; Masakazu Seno ; Hideto Kazawa

Abstract: We propose a simple training regime that can improve the extrinsic performance of a parser, given only a corpus of sentences and a way to automatically evaluate the extrinsic quality of a candidate parse. We apply our method to train parsers that excel when used as part of a reordering component in a statistical machine translation system. We use a corpus of weakly-labeled reference reorderings to guide parser training. Our best parsers contribute significant improvements in subjective translation quality while their intrinsic attachment scores typically regress.

4 0.44791847 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

Author: Kevin Gimpel ; Noah A. Smith

Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and UrduEnglish translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.

5 0.43735489 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction

Author: Sebastian Riedel ; Andrew McCallum

Abstract: Extracting biomedical events from literature has attracted much recent attention. The bestperforming systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches are cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-theart. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011Infectious Diseases task.

6 0.43530095 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

7 0.43156943 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax

8 0.43151 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

9 0.42943704 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

10 0.427367 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training

11 0.42602852 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

12 0.42587122 66 emnlp-2011-Hierarchical Phrase-based Translation Representations

13 0.42566362 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives

14 0.42482489 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction

15 0.42321709 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing

16 0.42298615 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference

17 0.42275125 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding

18 0.42241147 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

19 0.42228881 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation

20 0.42138076 138 emnlp-2011-Tuning as Ranking