acl acl2010 acl2010-60 knowledge-graph by maker-knowledge-mining

60 acl-2010-Collocation Extraction beyond the Independence Assumption


Source: pdf

Author: Gerlof Bouma

Abstract: In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. Expected probabilities are derived from automatically trained Aggregate Markov Models. On three collocation gold standards, we find the new association measures vary in their effectiveness.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. [sent-3, score-0.733]

2 We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. [sent-4, score-0.282]

3 On three collocation gold standards, we find the new association measures vary in their effectiveness. [sent-6, score-0.506]

4 1 Introduction Collocation extraction typically proceeds by scoring collocation candidates with an association measure, where high scores are taken to indicate likely collocationhood. [sent-7, score-0.429]

5 Two well-known such measures are pointwise mutual information (PMI) and mutual information (MI). [sent-8, score-0.194]

6 PMI(w1, w2) = log [ p(w1, w2) / (p(w1) p(w2)) ] (1); MI(w1, w2) = Σ_{x ∈ {w1, ¬w1}} Σ_{y ∈ {w2, ¬w2}} p(x, y) PMI(x, y) (2). PMI (1) is the logged ratio of the observed bigramme probability and the expected bigramme probability under independence of the two words in the combination. [sent-10, score-0.89]

7 MI (2) is the expected outcome of PMI, and measures how much information of the distribution of one word is contained in the distribution of the other. [sent-11, score-0.122]
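
The two measures can be sketched directly from a 2×2 contingency table of bigramme counts. The following Python is a minimal illustration; the function and variable names are ours, not from the paper:

```python
import math

def pmi(p_xy, p_x, p_y):
    """Pointwise mutual information: log of observed vs. expected-under-independence probability."""
    return math.log(p_xy / (p_x * p_y))

def mi(n11, n12, n21, n22):
    """Mutual information over the 2x2 table for (w1, not-w1) x (w2, not-w2).

    n11 = count(w1, w2), n12 = count(w1, not-w2),
    n21 = count(not-w1, w2), n22 = count(not-w1, not-w2).
    """
    n = n11 + n12 + n21 + n22
    total = 0.0
    for nxy, row, col in [
        (n11, n11 + n12, n11 + n21),
        (n12, n11 + n12, n12 + n22),
        (n21, n21 + n22, n11 + n21),
        (n22, n21 + n22, n12 + n22),
    ]:
        if nxy == 0:
            continue  # a zero-count cell contributes nothing
        p_xy, p_x, p_y = nxy / n, row / n, col / n
        total += p_xy * pmi(p_xy, p_x, p_y)
    return total

# Toy counts for a single bigramme type:
print(pmi(50 / 10000, 200 / 10000, 300 / 10000))  # PMI of (w1, w2)
print(mi(50, 150, 250, 9550))                      # MI over the full table
```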

8 PMI was introduced into the collocation extraction field by Church and Hanks (1990). [sent-12, score-0.391]

9 First, the observed occurrence probability pobs is compared to the expected occurrence probability pexp. [sent-15, score-0.15]

10 Secondly, the independence assumption underlies the estimation of pexp. [sent-16, score-0.207]

11 For instance, the bigramme ‘of the’ is uninteresting from a collocation extraction perspective, although it probably is amongst the most frequent bigrammes for any English corpus. [sent-18, score-0.725]

12 Looking at pobs and pexp together allows us to recognize these cases (see Manning and Schütze (1999) and Evert (2007) for more discussion). [sent-20, score-0.072]

13 The second aspect, the independence assumption in the estimation of pexp, is more problematic, however, even in the context of collocation extraction. [sent-21, score-0.546]

14 As Evert (2007, p42) notes, the assumption of “independence is extremely unrealistic,” because it ignores “a variety of syntactic, semantic and lexical restrictions.” [sent-22, score-0.056]

15 The independence assumption leads to an overestimated expectation, and ‘the the’ will need to be very frequent for it to show up as a likely collocation. [sent-26, score-0.207]

16 A less contrived example of how the independence assumption might mislead collocation extraction is when bigramme distribution is influenced by compositional, non-collocational, semantic dependencies. [sent-27, score-0.939]

17 Investigating adjective-noun combinations in a corpus, we might find that ‘beige cloth’ gets a high PMI, whereas ‘beige thought’ does not. [sent-28, score-0.154]

18 This does not make the former a collocation or multiword unit. [sent-29, score-0.438]

19 Rather, what we would measure is the tendency to use colours with visible things and not with abstract objects. [sent-30, score-0.039]

20 associations between words are real dependencies, but they need not be collocational in nature. [sent-33, score-0.078]

21 Because of the independence assumption, PMI and MI measure these syntactic and semantic associations just as much as they measure collocational association. [sent-34, score-0.229]

22 In this paper, we therefore experimentally investigate the use of a more informed pexp in the context of collocation extraction. [sent-35, score-0.636]

23 2 Aggregate Markov Models To replace pexp under independence, one might consider models with explicit linguistic information, such as a POS-tag bigramme model. [sent-36, score-0.606]

24 We might not know exactly what factors are needed to estimate pexp and even if we do, we might lack the resources to train the resulting models. [sent-39, score-0.352]

25 The only thing we know about estimating pexp is that we need more information than a unigramme model but less than a bigramme model (as this would make pobs/pexp uninformative). [sent-40, score-0.626]

26 In an AMM, bigramme probability is not directly modeled, but mediated by a hidden class variable c: pamm(w2|w1) = Σ_c p(c|w1) p(w2|c). (3) [sent-44, score-0.427]

27 The number of classes in an AMM determines the amount of dependency that can be captured. [sent-45, score-0.046]
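
As a sketch of Eq. (3), the AMM bigramme probability marginalizes the hidden class out of two smaller distributions. With the component distributions stored as nested dictionaries (a hypothetical layout, not prescribed by the paper), this might look like:

```python
def p_amm(w2, w1, p_c_given_w1, p_w2_given_c):
    """p_amm(w2 | w1) = sum over c of p(c | w1) * p(w2 | c)."""
    return sum(p_c_given_w1[w1][c] * p_w2_given_c[c].get(w2, 0.0)
               for c in p_c_given_w1[w1])

# Toy model with two hidden classes:
p_c_given_w1 = {"beige": {0: 0.9, 1: 0.1}}
p_w2_given_c = {0: {"cloth": 0.2, "thought": 0.01},
                1: {"cloth": 0.01, "thought": 0.2}}
print(p_amm("cloth", "beige", p_c_given_w1, p_w2_given_c))  # 0.9*0.2 + 0.1*0.01 = 0.181
```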

28 In the case of just one class, AMM is equivalent to a unigramme model. [sent-46, score-0.078]

29 AMMs become equivalent to the full bigramme model when the number of classes equals the size of the smallest of the vocabularies of the parts of the combination. [sent-47, score-0.389]

30 AMMs can be trained with EM, using no more information than one would need for ML bigramme probability estimates. [sent-49, score-0.335]

31 Their use in collocation extraction is to our knowledge novel. [sent-54, score-0.391]

32 according to: p(c|w1) ← Σ_w n(w1, w) p(c|w1, w) / Σ_{w, c'} n(w1, w) p(c'|w1, w) (4), p(w2|c) ← Σ_w n(w, w2) p(c|w, w2) / Σ_{w, w'} n(w, w') p(c|w, w') (5), where n(w1, w2) are bigramme counts and the posterior probability of a hidden category c is estimated by: p(c|w1, w2) = p(c|w1) p(w2|c) / Σ_{c'} p(c'|w1) p(w2|c'). (6) [sent-55, score-0.401]
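
A compact sketch of the EM procedure behind updates (4)–(6), written for a plain dictionary of bigramme counts; the function name, data layout, and fixed iteration count are illustrative assumptions rather than details from the paper:

```python
import random
from collections import defaultdict

def train_amm(bigram_counts, n_classes, n_iter=50, seed=0):
    """EM training of an Aggregate Markov Model: learns p(c|w1) and p(w2|c).

    bigram_counts: dict mapping (w1, w2) -> count n(w1, w2).
    """
    rng = random.Random(seed)
    w1s = {w1 for w1, _ in bigram_counts}
    w2s = {w2 for _, w2 in bigram_counts}

    def random_dist(keys):
        # Random initialization of the model vectors (cf. the EM runs in Sect. 3.3).
        weights = {k: rng.random() for k in keys}
        total = sum(weights.values())
        return {k: v / total for k, v in weights.items()}

    p_c_w1 = {w1: random_dist(range(n_classes)) for w1 in w1s}  # p(c | w1)
    p_w2_c = {c: random_dist(w2s) for c in range(n_classes)}    # p(w2 | c)

    for _ in range(n_iter):
        exp_c_w1 = defaultdict(float)  # expected counts for updating p(c | w1)
        exp_w2_c = defaultdict(float)  # expected counts for updating p(w2 | c)
        for (w1, w2), n in bigram_counts.items():
            # E-step, Eq. (6): posterior over the hidden class.
            joint = {c: p_c_w1[w1][c] * p_w2_c[c][w2] for c in range(n_classes)}
            z = sum(joint.values()) or 1.0
            for c, v in joint.items():
                exp_c_w1[(w1, c)] += n * v / z
                exp_w2_c[(c, w2)] += n * v / z
        # M-step, Eqs. (4) and (5): renormalize the expected counts.
        for w1 in w1s:
            z = sum(exp_c_w1[(w1, c)] for c in range(n_classes)) or 1.0
            p_c_w1[w1] = {c: exp_c_w1[(w1, c)] / z for c in range(n_classes)}
        for c in range(n_classes):
            z = sum(exp_w2_c[(c, w2)] for w2 in w2s) or 1.0
            p_w2_c[c] = {w2: exp_w2_c[(c, w2)] / z for w2 in w2s}
    return p_c_w1, p_w2_c
```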

33 The definition of the counterparts to (P)MI without the independence assumption, the AMM-ratio and AMM-divergence, is now straightforward: ramm(w1, w2) = log [ p(w1, w2) / (p(w1) pamm(w2|w1)) ] (7), damm(w1, w2) = Σ_{x ∈ {w1, ¬w1}} Σ_{y ∈ {w2, ¬w2}} p(x, y) ramm(x, y) (8). [sent-57, score-0.151]

34 The free parameter in these association measures is the number of hidden classes in the AMM, that is, the amount of dependency between the bigramme parts used to estimate pexp. [sent-58, score-0.555]
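
Equations (7) and (8) then score a candidate pair against the AMM-based expectation. A sketch, assuming the 2×2 cell probabilities have been computed elsewhere (e.g. from corpus counts and the trained model), with the expected cell probability p(x)·pamm(y|x) passed in directly:

```python
import math

def amm_ratio(p_obs_w1w2, p_w1, p_amm_w2_given_w1):
    """Eq. (7): log of the observed joint probability over the AMM-based expectation."""
    return math.log(p_obs_w1w2 / (p_w1 * p_amm_w2_given_w1))

def amm_divergence(p_obs, p_exp):
    """Eq. (8): expectation of the AMM-ratio over the 2x2 table
    {w1, not-w1} x {w2, not-w2}.

    p_obs[cell] is the observed joint probability of a cell,
    p_exp[cell] the AMM-based expected probability p(x) * p_amm(y | x).
    """
    return sum(p_obs[cell] * math.log(p_obs[cell] / p_exp[cell])
               for cell in p_obs if p_obs[cell] > 0)
```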

35 Note that AMM-ratio and AMM-divergence with one hidden class are equivalent to PMI and MI, respectively. [sent-59, score-0.122]

36 3.1 Data and procedure We apply AMM-ratio and AMM-divergence to three collocation gold standards. [sent-62, score-0.38]

37 The effectiveness of association measures in collocation extraction is measured by ranking collocation candidates according to the scores defined by the measures, and calculating the average precision of these lists against the gold standard annotation. [sent-63, score-0.923]
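
A small sketch of that evaluation step: rank candidates by their association score and compute average precision against the gold annotation (function names and the toy data are illustrative):

```python
def average_precision(scored_candidates, gold):
    """scored_candidates: list of (candidate, score); gold: set of true collocations."""
    ranked = sorted(scored_candidates, key=lambda cs: cs[1], reverse=True)
    hits, precisions = 0, []
    for rank, (cand, _) in enumerate(ranked, start=1):
        if cand in gold:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Toy usage:
scores = [("gelbe Karte", 3.2), ("of the", 0.1), ("beige cloth", 1.5)]
print(average_precision(scores, gold={"gelbe Karte"}))  # 1.0: the true collocation ranks first
```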

38 We consider the newly proposed AMM-based measures for a varying number of hidden categories. [sent-64, score-0.127]

39 The new measures are compared against two baselines: ranking by frequency (pobs) and random ordering. [sent-65, score-0.137]

40 Because AMM-ratio and -divergence with one hidden class boil down to PMI and MI (and thus log-likelihood ratio), the evaluation contains an implicit comparison with these canonical measures, too. [sent-66, score-0.092]

41 However, the results will not be state-of-the-art: for the datasets investigated below, there are more effective extraction methods based on supervised machine learning (Pecina, 2008). [sent-67, score-0.052]

42 The first gold standard used is the German adjective-noun dataset (Evert, 2008). [sent-68, score-0.041]

43 We used the bigramme frequency data included in the resource. [sent-71, score-0.362]

44 The second gold standard consists of 5,102 German PP-verb combinations, also sampled from newspaper texts (Krenn, 2008). [sent-73, score-0.064]

45 The data contains annotation for support verb constructions (FVGs) and figurative expressions. [sent-74, score-0.066]

46 This resource also comes with its own frequency data. [sent-75, score-0.049]

47 After frequency thresholding, AMMs are trained on 46k PPs, 7. [sent-76, score-0.049]

48 Third and last is the English verb-particle construction (VPC) gold standard (Baldwin, 2008), consisting of 3078 verb-particle pairs and annotation for transitive and intransitive idiomatic VPCs. [sent-78, score-0.138]

49 We extract frequency data from the BNC, following the methods described in Baldwin (2005). [sent-79, score-0.049]

50 For the transitive VPCs, we have 5k Vs, 35 particles and 54k pair types. [sent-83, score-0.106]

51 All our EM runs start with randomly initialized model vectors. [sent-84, score-0.035]

52 In Section 3.3 we discuss the impact of model variation due to this random factor. [sent-86, score-0.036]

53 3.2 Results German A-N collocations The top slice in Table 1 shows results for the three subtasks of the A-N dataset. [sent-88, score-0.193]

54 We see that using AMM-based pexp initially improves average precision, for each task and for both the ratio and the divergence measure. [sent-89, score-0.3]

55 At their maxima, the informed measures outperform both baselines as well as PMI and MI/log-likelihood ratio (# classes = 1). [sent-90, score-0.18]

56 It is likely that the drop in performance for the larger AMM-based measures is due to the AMMs learning the collocations themselves. [sent-92, score-0.194]

57 That is, the AMMs become rich enough to not only capture the broadly applicable distributional influences of syntax and semantics, but also provide accurate pexps for individual, distributionally deviant combinations like collocations. [sent-93, score-0.102]

58 An accurate pexp results in a low association score. [sent-94, score-0.303]

59 (2005), we take the 200 most frequent adjectives and assign them to the category that maximizes p(c|w1); likewise for nouns and p(w2|c). [sent-97, score-0.073]
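
This cluster inspection amounts to a hard assignment of each frequent word to its most probable hidden class; a one-liner in the same hypothetical dictionary layout as the sketches above:

```python
def assign_cluster(w1, p_c_given_w1):
    """Assign a first-part word to the hidden class c that maximizes p(c | w1)."""
    return max(p_c_given_w1[w1], key=p_c_given_w1[w1].get)

# e.g. assign_cluster("beige", p_c_given_w1) with the toy model above returns class 0
```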

60 Four selected clusters (out of 16) are given in Table 2. [sent-98, score-0.04]

61 The esoteric class 1 contains ordinal numbers and nouns that one typically uses those with, including references to temporal concepts. [sent-99, score-0.075]

62 Class 4 shows a group of adjectives denoting colours and/or political affiliations and a less coherent set of nouns, although the noun cluster can be understood if we consider individual adjectives that are associated with this class. [sent-101, score-0.14]

63 Our informal impression from looking at clusters is that this is a common situation: as a whole, a cluster cannot be easily characterized, although for subsets or individual pairs, one can get an intuition for why they are in the same class. [sent-102, score-0.07]

64 Unfortunately, we also see that some actual collocations are clustered in class 4, such as gelbe Karte ‘warning’ (lit. ‘yellow card’). [sent-103, score-0.159]

65 German PP-Verb collocations The second slice in Table 1 shows that, for both subtypes of PP-V collocation, better pexp-estimates lead to decreased average precision. [sent-106, score-0.164]

66 The most effective AMM-ratio and -divergence measures are those equivalent to (P)MI. [sent-107, score-0.118]

67 Apparently, the better pexps are unfortunate for the extraction of the type of collocations in this dataset. [sent-108, score-0.23]

68 The poor performance of PMI on these data (clearly below the frequency baseline) has been noticed before by Krenn and Evert (2001). [sent-109, score-0.049]

69 A possible explanation for the lack of improvement in the AMMs lies in the relatively high performing frequency baselines. [sent-110, score-0.049]

70 The frequency baseline for FVGs is five times the – (Footnote 2: An anonymous reviewer rightly warns against sketching an overly positive picture of the knowledge captured in the AMMs by only presenting a few clusters.) [sent-111, score-0.049]

71 However, the clustering performed here is only secondary to our main goal of improving collocation extraction. [sent-112, score-0.339]

72–74 (Table 1 layout residue: columns for # classes = 1, 2, 4, 8, 16; rows for A-N category 1, category 1–2 and category 1–3, PP-V figurative and FVG, and VPC intransitive and transitive, each with ramm and damm scores.)

75 Table 1: Average precision for AMM-based association measures and baselines on three datasets. [sent-250, score-0.177]

76 Since the AMMs provide a better fit for the more frequent pairs in the training data, they might end up providing too good pexp-estimates for the true collocations from the beginning. [sent-253, score-0.134]

77 Further investigation is needed to find out whether this situation can be ameliorated and, if not, whether we can systematically identify for what kind of collocation extraction tasks using better pexps is simply not a good idea. [sent-254, score-0.463]

78 English Verb-Particle constructions The last gold standard is the English VPC dataset, shown in the bottom slice of Table 1. [sent-255, score-0.131]

79 We can clearly see the effect of the largest AMMs approaching the full bigramme model as average precision here approaches the random baseline. [sent-257, score-0.339]

80 The VPC extraction task shows a difference between the two AMM-based measures: AMM-ratio does not improve at all, remaining below the frequency baseline. [sent-258, score-0.101]

81 AMM-divergence, however, shows a slight decrease in precision first, but ends up performing above the frequency baseline for the 8-class AMMs in both subtasks. [sent-259, score-0.075]

82 Table 3 shows four clusters of verbs and particles. [sent-260, score-0.04]

83 The large first cluster contains verbs that involve motion/displacement of the subject or object and associated particles, for instance walk about or push away. [sent-261, score-0.08]

84 Interestingly, the description of the gold standard gives exactly such cases as negatives, since they constitute compositional verb-particle constructions (Baldwin, 2008). [sent-262, score-0.1]

85 collocation extraction by decreasing the impact of verb-preposition associations that are due to PP-selecting verbs. [sent-264, score-0.43]

86 Class 4 shows a third type of distributional generalization: the verbs in this class are all frequently used in the passive. [sent-265, score-0.053]

87 3.3 Variation due to local optima We start each EM run with a random initialization of the model parameters. [sent-267, score-0.042]

88 Since EM finds local rather than global optima, each run may lead to different AMMs, which in turn will affect AMMbased collocation extraction. [sent-268, score-0.339]

89 ), so that a run with the same data and the same number of classes will always learn (almost) the same model. [sent-276, score-0.046]

90 On the assumption that an average over several runs will vary less than individual runs, we have also constructed a combined pexp by averaging over 40 pexps. [sent-277, score-0.356]
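
A minimal sketch of that combined estimator, assuming each EM run yields a callable that returns its expected bigramme probability:

```python
def combined_p_exp(w1, w2, p_exp_per_run):
    """Average the AMM-based expected probability over independently trained runs."""
    return sum(p_exp(w1, w2) for p_exp in p_exp_per_run) / len(p_exp_per_run)
```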

91 The last column (Table 4 layout residue: columns “Variation in avg precision”, with a min column, for A-N cat 1, cat 1–2 and cat 1–3, each split into ramm and damm) [sent-278, score-1.805]

92 Table 4: Variation on A-N data over 40 EM runs and result of combining pexps. [sent-314, score-0.035]

93 in Table 4 shows this combined estimator leads to good extraction results. [sent-315, score-0.052]

94 4 Conclusions In this paper, we have started to explore collocation extraction beyond the assumption of independence. [sent-316, score-0.447]

95 We have introduced two new association measures that do away with this assumption in the estimation of expected probabilities. [sent-317, score-0.216]

96 A possible obstacle in the adoption of AMMs in collocation extraction is that we have not provided any heuristic for setting the number of classes for the AMMs. [sent-320, score-0.437]

97 In general, considering these smaller models might suffice for tasks that have a fairly restricted definition of collocation candidate, like the tasks in our evaluation do. [sent-323, score-0.367]

98 Because AMM fitting is unsupervised, selecting a class size is in this respect no different from selecting a suitable association measure from the canon of existing measures. [sent-324, score-0.119]

99 Future research into association measures that are not based on the independence assumption will also include considering different EM variants and other automatically learnable models besides the AMMs used in this paper. [sent-325, score-0.333]

100 Finally, the idea of using an informed estimate of expected probability in an association measure need not be confined to (P)MI, as there are many other measures that employ expected probabilities. [sent-326, score-0.279]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('amms', 0.433), ('collocation', 0.339), ('bigramme', 0.313), ('ramm', 0.289), ('damm', 0.265), ('pexp', 0.265), ('amm', 0.217), ('independence', 0.151), ('pmi', 0.13), ('vpc', 0.12), ('collocations', 0.106), ('mi', 0.103), ('multiword', 0.099), ('measures', 0.088), ('mwe', 0.072), ('evert', 0.072), ('krenn', 0.072), ('pexps', 0.072), ('pobs', 0.072), ('german', 0.062), ('em', 0.059), ('particles', 0.058), ('slice', 0.058), ('aggregate', 0.057), ('assumption', 0.056), ('saul', 0.054), ('class', 0.053), ('extraction', 0.052), ('intransitive', 0.049), ('frequency', 0.049), ('transitive', 0.048), ('beige', 0.048), ('fvgs', 0.048), ('karte', 0.048), ('unigramme', 0.048), ('vpcs', 0.048), ('classes', 0.046), ('stefan', 0.042), ('auto', 0.042), ('brigitte', 0.042), ('memo', 0.042), ('optima', 0.042), ('gold', 0.041), ('mutual', 0.041), ('team', 0.04), ('baldwin', 0.04), ('clusters', 0.04), ('associations', 0.039), ('blitzer', 0.039), ('cat', 0.039), ('hidden', 0.039), ('fat', 0.039), ('yellow', 0.039), ('colours', 0.039), ('collocational', 0.039), ('yx', 0.039), ('association', 0.038), ('vs', 0.036), ('variation', 0.036), ('ratio', 0.035), ('runs', 0.035), ('expected', 0.034), ('figurative', 0.034), ('potsdam', 0.034), ('organisation', 0.034), ('expressions', 0.034), ('rooth', 0.033), ('particle', 0.033), ('constructions', 0.032), ('informed', 0.032), ('bouma', 0.031), ('estimate', 0.031), ('equivalent', 0.03), ('combinations', 0.03), ('cluster', 0.03), ('lrec', 0.03), ('subtasks', 0.029), ('card', 0.029), ('hofmann', 0.029), ('fitting', 0.028), ('might', 0.028), ('compositional', 0.027), ('timothy', 0.027), ('category', 0.027), ('precision', 0.026), ('walk', 0.025), ('baselines', 0.025), ('push', 0.025), ('markov', 0.024), ('pointwise', 0.024), ('car', 0.024), ('church', 0.024), ('pc', 0.024), ('adjectives', 0.024), ('newspaper', 0.023), ('denoting', 0.023), ('probability', 0.022), ('nouns', 0.022), ('uninteresting', 0.021), ('tce', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 60 acl-2010-Collocation Extraction beyond the Independence Assumption

Author: Gerlof Bouma

Abstract: In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. Expected probabilities are derived from automatically trained Aggregate Markov Models. On three collocation gold standards, we find the new association measures vary in their effectiveness.

2 0.27417192 36 acl-2010-Automatic Collocation Suggestion in Academic Writing

Author: Jian-Cheng Wu ; Yu-Chia Chang ; Teruko Mitamura ; Jason S. Chang

Abstract: In recent years, collocation has been widely acknowledged as an essential characteristic to distinguish native speakers from non-native speakers. Research on academic writing has also shown that collocations are not only common but serve a particularly important discourse function within the academic community. In our study, we propose a machine learning approach to implementing an online collocation writing assistant. We use a data-driven classifier to provide collocation suggestions to improve word choices, based on the result of classifica- tion. The system generates and ranks suggestions to assist learners’ collocation usages in their academic writing with satisfactory results. 1

3 0.2715292 147 acl-2010-Improving Statistical Machine Translation with Monolingual Collocation

Author: Zhanyi Liu ; Haifeng Wang ; Hua Wu ; Sheng Li

Abstract: This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT). We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. The experimental results show that our method improves the performance of both word alignment and translation quality significantly. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system. 1

4 0.084834956 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

5 0.060834378 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery

Author: Omri Abend ; Roi Reichart ; Ari Rappoport

Abstract: We present a novel fully unsupervised algorithm for POS induction from plain text, motivated by the cognitive notion of prototypes. The algorithm first identifies landmark clusters of words, serving as the cores of the induced POS categories. The rest of the words are subsequently mapped to these clusters. We utilize morphological and distributional representations computed in a fully unsupervised manner. We evaluate our algorithm on English and German, achieving the best reported results for this task.

6 0.059055887 216 acl-2010-Starting from Scratch in Semantic Role Labeling

7 0.056481529 158 acl-2010-Latent Variable Models of Selectional Preference

8 0.053437702 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

9 0.049843039 5 acl-2010-A Framework for Figurative Language Detection Based on Sense Differentiation

10 0.046423893 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

11 0.044708479 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs

12 0.042008717 145 acl-2010-Improving Arabic-to-English Statistical Machine Translation by Reordering Post-Verbal Subjects for Alignment

13 0.040949587 214 acl-2010-Sparsity in Dependency Grammar Induction

14 0.040805481 237 acl-2010-Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection

15 0.039645817 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

16 0.038530838 141 acl-2010-Identifying Text Polarity Using Random Walks

17 0.038479913 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging

18 0.037839893 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion

19 0.037498791 125 acl-2010-Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining

20 0.037281394 130 acl-2010-Hard Constraints for Grammatical Function Labelling


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.138), (1, 0.003), (2, -0.008), (3, 0.001), (4, 0.061), (5, 0.008), (6, -0.039), (7, 0.01), (8, 0.109), (9, -0.002), (10, -0.002), (11, 0.069), (12, -0.011), (13, 0.06), (14, 0.189), (15, 0.036), (16, -0.034), (17, -0.18), (18, 0.189), (19, 0.371), (20, -0.084), (21, 0.065), (22, -0.221), (23, 0.094), (24, -0.023), (25, 0.081), (26, 0.041), (27, -0.009), (28, -0.031), (29, -0.066), (30, 0.071), (31, 0.01), (32, 0.018), (33, 0.013), (34, -0.093), (35, 0.04), (36, -0.024), (37, -0.033), (38, 0.031), (39, -0.024), (40, 0.005), (41, -0.025), (42, 0.013), (43, -0.023), (44, 0.001), (45, 0.011), (46, -0.033), (47, -0.018), (48, 0.034), (49, 0.003)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91913998 60 acl-2010-Collocation Extraction beyond the Independence Assumption

Author: Gerlof Bouma

Abstract: In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. Expected probabilities are derived from automatically trained Aggregate Markov Models. On three collocation gold standards, we find the new association measures vary in their effectiveness.

2 0.91496181 36 acl-2010-Automatic Collocation Suggestion in Academic Writing

Author: Jian-Cheng Wu ; Yu-Chia Chang ; Teruko Mitamura ; Jason S. Chang

Abstract: In recent years, collocation has been widely acknowledged as an essential characteristic to distinguish native speakers from non-native speakers. Research on academic writing has also shown that collocations are not only common but serve a particularly important discourse function within the academic community. In our study, we propose a machine learning approach to implementing an online collocation writing assistant. We use a data-driven classifier to provide collocation suggestions to improve word choices, based on the result of classifica- tion. The system generates and ranks suggestions to assist learners’ collocation usages in their academic writing with satisfactory results. 1

3 0.59375924 147 acl-2010-Improving Statistical Machine Translation with Monolingual Collocation

Author: Zhanyi Liu ; Haifeng Wang ; Hua Wu ; Sheng Li

Abstract: This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT). We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. The experimental results show that our method improves the performance of both word alignment and translation quality significantly. As compared to baseline systems, we achieve absolute improvements of 2.40 BLEU score on a phrase-based SMT system and 1.76 BLEU score on a parsing-based SMT system. 1

4 0.34054795 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

5 0.26723295 5 acl-2010-A Framework for Figurative Language Detection Based on Sense Differentiation

Author: Daria Bogdanova

Abstract: Various text mining algorithms require the process offeature selection. High-level semantically rich features, such as figurative language uses, speech errors etc., are very promising for such problems as e.g. writing style detection, but automatic extraction of such features is a big challenge. In this paper, we propose a framework for figurative language use detection. This framework is based on the idea of sense differentiation. We describe two algorithms illustrating the mentioned idea. We show then how these algorithms work by applying them to Russian language data.

6 0.24035066 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery

7 0.23853128 166 acl-2010-Learning Word-Class Lattices for Definition and Hypernym Extraction

8 0.21607119 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs

9 0.21116559 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms

10 0.20880014 85 acl-2010-Detecting Experiences from Weblogs

11 0.20702139 139 acl-2010-Identifying Generic Noun Phrases

12 0.20480211 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection

13 0.2044642 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.

14 0.20241746 111 acl-2010-Extracting Sequences from the Web

15 0.19907449 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging

16 0.19659957 61 acl-2010-Combining Data and Mathematical Models of Language Change

17 0.19570309 248 acl-2010-Unsupervised Ontology Induction from Text

18 0.19217259 212 acl-2010-Simple Semi-Supervised Training of Part-Of-Speech Taggers

19 0.19214833 189 acl-2010-Optimizing Question Answering Accuracy by Maximizing Log-Likelihood

20 0.19161835 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(14, 0.019), (25, 0.06), (39, 0.011), (42, 0.026), (44, 0.015), (59, 0.106), (71, 0.011), (73, 0.039), (75, 0.293), (76, 0.017), (78, 0.048), (80, 0.014), (83, 0.118), (84, 0.043), (98, 0.092)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.81536722 175 acl-2010-Models of Metaphor in NLP

Author: Ekaterina Shutova

Abstract: Automatic processing of metaphor can be clearly divided into two subtasks: metaphor recognition (distinguishing between literal and metaphorical language in a text) and metaphor interpretation (identifying the intended literal meaning of a metaphorical expression). Both of them have been repeatedly addressed in NLP. This paper is the first comprehensive and systematic review of the existing computational models of metaphor, the issues of metaphor annotation in corpora and the available resources.

same-paper 2 0.76472235 60 acl-2010-Collocation Extraction beyond the Independence Assumption

Author: Gerlof Bouma

Abstract: In this paper we start to explore two-part collocation extraction association measures that do not estimate expected probabilities on the basis of the independence assumption. We propose two new measures based upon the well-known measures of mutual information and pointwise mutual information. Expected probabilities are derived from automatically trained Aggregate Markov Models. On three collocation gold standards, we find the new association measures vary in their effectiveness.

3 0.60427308 102 acl-2010-Error Detection for Statistical Machine Translation Using Linguistic Features

Author: Deyi Xiong ; Min Zhang ; Haizhou Li

Abstract: Automatic error detection is desired in the post-processing to improve machine translation quality. The previous work is largely based on confidence estimation using system-based features, such as word posterior probabilities calculated from Nbest lists or word lattices. We propose to incorporate two groups of linguistic features, which convey information from outside machine translation systems, into error detection: lexical and syntactic features. We use a maximum entropy classifier to predict translation errors by integrating word posterior probability feature and linguistic features. The experimental results show that 1) linguistic features alone outperform word posterior probability based confidence estimation in error detection; and 2) linguistic features can further provide complementary information when combined with word confidence scores, which collectively reduce the classification error rate by 18.52% and improve the F measure by 16.37%.

4 0.56712079 158 acl-2010-Latent Variable Models of Selectional Preference

Author: Diarmuid O Seaghdha

Abstract: This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data.

5 0.56206775 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years

Author: Vincent Ng

Abstract: The research focus of computational coreference resolution has exhibited a shift from heuristic approaches to machine learning approaches in the past decade. This paper surveys the major milestones in supervised coreference research since its inception fifteen years ago.

6 0.5584963 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

7 0.55772233 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

8 0.55658495 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

9 0.5560782 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

10 0.55446446 59 acl-2010-Cognitively Plausible Models of Human Language Processing

11 0.55434811 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar

12 0.5539341 195 acl-2010-Phylogenetic Grammar Induction

13 0.55047792 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser

14 0.55044544 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

15 0.55041695 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection

16 0.55014479 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

17 0.54998362 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

18 0.54994375 76 acl-2010-Creating Robust Supervised Classifiers via Web-Scale N-Gram Data

19 0.54908228 233 acl-2010-The Same-Head Heuristic for Coreference

20 0.54881388 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."