acl acl2010 acl2010-10 knowledge-graph by maker-knowledge-mining

10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences


Source: pdf

Author: Alan Ritter ; Mausam Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional classbased approaches, it produces humaninterpretable classes describing each relation’s preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-ofthe-art methods achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP’s effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al. ’s system (Pantel et al., 2007).

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. [sent-3, score-0.585]

2 For example, locations are likely to appear in the second argument of the relation X is headquartered in Y and companies or organizations in the first. [sent-13, score-0.278]

3 A large, high-quality database of preferences has the potential to improve the performance of a wide range of NLP tasks including semantic role labeling (Gildea and Jurafsky, 2002), pronoun resolution (Bergsma et al. [sent-14, score-0.345]

4 Resnik (1996) presented the earliest work in this area, describing an information-theoretic approach that inferred selectional preferences based on the WordNet hypernym hierarchy. [sent-18, score-0.777]

5 In this paper we describe a novel approach to computing selectional preferences by making use of unsupervised topic models. [sent-23, score-0.926]

6 Unsupervised topic models, such as latent Dirichlet allocation (LDA) (Blei et al. [sent-25, score-0.25]

7 For our problem these topics offer an intuitive interpretation: they represent the (latent) set of classes that store the preferences for the different relations. [sent-27, score-0.612]

8 Thus, topic models are a natural fit for modeling our relation data. [sent-28, score-0.325]

9 Thus, LDA-SP is able to capture information about the pairs of topics that commonly co-occur. [sent-32, score-0.218]

10 We run LDA-SP to compute preferences on a massive dataset of binary relations r(a1, a2) extracted from a large corpus. [sent-34, score-0.406]

11 3), as well as produce a repository of class-based preferences with a little manual effort as demonstrated in Section 4. [sent-41, score-0.374]

12 1 2 Previous Work Previous work on selectional preferences can be broken into four categories: class-based approaches (Resnik, 1996; Li and Abe, 1998; Clark and Weir, 2002; Pantel et al. [sent-45, score-0.765]

13 For each relation, some measure of the overlap between the classes and observed arguments is used to identify those that best describe the arguments. [sent-54, score-0.249]

14 These techniques produce a human-interpretable output, but often suffer in quality due to an incoherent taxonomy, inability to map arguments to a class (poor lexical coverage), and word sense ambiguity. [sent-55, score-0.248]

15 Of these, the similarity-based approaches make use of a distributional similarity measure between arguments and evaluate a heuristic scoring function: S_rel(arg) = Σ_{arg0 ∈ Seen(rel)} sim(arg, arg0) · wt_rel(arg0). [sent-57, score-1.066]
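As a concrete illustration of this scoring function, here is a hedged sketch in Python; the word vectors, seen-argument set, and weights are invented toy values, not anything taken from the paper or from Erk's system.

```python
import numpy as np

# Toy "distributional" vectors for a few candidate arguments (placeholders).
vectors = {
    "Microsoft": np.array([1.0, 0.1]),
    "Intel": np.array([0.9, 0.2]),
    "pizza": np.array([0.1, 1.0]),
}

def sim(a, b):
    """Cosine similarity between two argument vectors."""
    va, vb = vectors[a], vectors[b]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

def score(arg, seen, wt):
    """S_rel(arg) = sum over seen arguments arg0 of sim(arg, arg0) * wt_rel(arg0)."""
    return sum(sim(arg, arg0) * wt[arg0] for arg0 in seen)

seen_args = ["Microsoft", "Intel"]          # arguments observed with the relation
weights = {"Microsoft": 0.6, "Intel": 0.4}  # e.g., relative frequencies (toy values)
print(score("pizza", seen_args, weights))   # low score: poor fit to the relation
```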

16 These methods obtain better lexical coverage, but are unable to obtain any abstract representation of selectional preferences. [sent-62, score-0.42]

17 Rooth et al. (1999), in which each class corresponds to a multinomial over relations and arguments and EM is used to learn the parameters of the model. [sent-67, score-0.4]

18 In contrast, we use a LinkLDA framework in which each relation is associated with a corresponding multinomial distribution over classes, and each argu- ment is drawn from a class-specific distribution over words; LinkLDA captures co-occurrence of classes in the two arguments. [sent-68, score-0.425]

19 (2008) proposed the first discriminative approach to selectional preferences. [sent-72, score-0.42]

20 They automatically generated positive and negative examples by selecting arguments having high and low mutual information with the relation. [sent-74, score-0.25]

21 Ó Séaghdha (2010) proposes a series of LDA-style models for the task of computing selectional preferences. [sent-91, score-0.448]

22 This work learns selectional preferences between the following grammatical relations: verb-object, nounnoun, and adjective-noun. [sent-92, score-0.737]

23 In contrast, our work uses LinkLDA and focuses on modeling multiple arguments of a relation (e. [sent-95, score-0.28]

24 We present a series of topic models for the task of computing selectional preferences. [sent-99, score-0.637]

25 On the other hand, JointLDA, the model at the other extreme (Figure 1), assumes both arguments of a specific extraction are generated based on a single hidden variable z. [sent-102, score-0.262]

26 Our task is to compute, for each argument ai of each relation r, a set of usual argument values (noun phrases) that it takes. [sent-105, score-0.364]

27 For example, for the relation is headquartered in, the first argument set will include companies like Microsoft, Intel, and General Motors, and the second argument will favor locations like New York, California, and Seattle. [sent-106, score-0.406]

28 In the generative model for our data, each relation r has a corresponding multinomial over topics θr, drawn from a Dirichlet. [sent-110, score-0.513]

29 For each extraction, a hidden topic z is first picked according to θr, and then the observed argument a is chosen according to the multinomial βz. [sent-111, score-0.508]

30 Readers familiar with topic modeling terminology can understand our approach as follows: we treat each relation as a document whose contents consist of a bag of words corresponding to all the noun phrases observed as arguments of the relation in our corpus. [sent-112, score-0.577]

31 Formally, LDA generates each argument in the corpus of relations as follows: for each topic t = 1. [sent-113, score-0.406]
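To make that generative story concrete, the following is a minimal sketch of the plain-LDA variant, treating each relation as a document over its argument noun phrases; the topic count, vocabulary size, and Dirichlet hyperparameters are illustrative placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
T, V = 5, 100            # number of topics, vocabulary size (toy values)
alpha, eta = 0.1, 0.01   # Dirichlet hyperparameters (toy values)

# One multinomial over argument words per topic.
beta = rng.dirichlet(np.full(V, eta), size=T)

def generate_relation(num_args):
    """Generate the arguments observed for one relation."""
    theta_r = rng.dirichlet(np.full(T, alpha))  # per-relation topic distribution
    args = []
    for _ in range(num_args):
        z = rng.choice(T, p=theta_r)            # pick a hidden topic
        a = rng.choice(V, p=beta[z])            # pick an argument word from that topic
        args.append(a)
    return args

print(generate_relation(10))
```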

32 Clearly this is undesirable, as information about which topics one of the arguments favors can help inform the topics chosen for the other. [sent-127, score-0.608]

33 For example, class pairs such as (team, game), (politician, political issue) form much more plausible selectional preferences than, say, (team, political issue), (politician, game). [sent-128, score-0.913]

34 The key difference in JointLDA (versus LDA) is that instead of one, it maintains two sets of topics (latent distributions over words) denoted by β and γ, one for classes of each argument. [sent-132, score-0.338]

35 A topic id k represents a pair of topics, βk and γk, that co-occur in the arguments of extracted relations. [sent-133, score-0.361]

36 The hidden variable z = k indicates that the noun phrase for the first argument was drawn from the multinomial βk, and that the second argument was drawn from γk. [sent-135, score-0.541]
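A hedged sketch of this JointLDA story (toy sizes and hyperparameters, not the paper's configuration): a single hidden topic id generates both arguments, the first from βk and the second from γk.

```python
import numpy as np

rng = np.random.default_rng(4)
T, V = 5, 100
alpha, eta = 0.1, 0.01

beta = rng.dirichlet(np.full(V, eta), size=T)   # arg1 word distributions
gamma = rng.dirichlet(np.full(V, eta), size=T)  # arg2 word distributions

def generate_extraction_jointlda(theta_r):
    """Both arguments of one extraction share a single hidden topic z."""
    z = rng.choice(T, p=theta_r)
    a1 = rng.choice(V, p=beta[z])
    a2 = rng.choice(V, p=gamma[z])
    return a1, a2

theta_r = rng.dirichlet(np.full(T, alpha))
print([generate_extraction_jointlda(theta_r) for _ in range(5)])
```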

37 The per-relation distribution θr is a multinomial over the topic ids and represents the selectional preferences, both for arg1s and arg2s of a relation r. [sent-136, score-0.851]

38 Most notably, in JointLDA topics correspond to pairs of multinomials (βk, γk); this leads to a situation in which multiple redundant distributions are needed to represent the same underlying semantic class. [sent-138, score-0.324]

39 For example, consider the case where we need to represent the following selectional preferences for our corpus of relations: (person, location), (person, organization), and (person, crime). [sent-139, score-0.737]

40 Because JointLDA requires a separate pair of multinomials for each topic, it is forced to use 3 separate multinomials to represent the class person, rather than learning a single distribution representing person and choosing 3 different topics for a2. [sent-140, score-0.474]

41 LinkLDA is more flexible than JointLDA, allowing different topics to be chosen for a1 and a2; however, it still models the generation of topics from the same distribution for a given relation. [sent-143, score-0.505]

42 In particular, note that each ai is drawn from a different hidden topic zi; however, the zi's are drawn from the same distribution θr for a given relation r. [sent-147, score-0.489]
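For contrast with the JointLDA sketch above, here is a hedged sketch of the LinkLDA story just described: each extraction draws two hidden topics from the shared per-relation distribution θr, one feeding the β (arg1) topics and one the γ (arg2) topics. Again, all sizes and hyperparameters are toy placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
T, V = 5, 100
alpha, eta = 0.1, 0.01

beta = rng.dirichlet(np.full(V, eta), size=T)   # arg1 topics
gamma = rng.dirichlet(np.full(V, eta), size=T)  # arg2 topics

def generate_extraction_linklda(theta_r):
    """Generate one (a1, a2) pair for a relation with topic mixture theta_r."""
    z1 = rng.choice(T, p=theta_r)   # arg1 topic
    z2 = rng.choice(T, p=theta_r)   # arg2 topic, drawn from the same theta_r
    a1 = rng.choice(V, p=beta[z1])
    a2 = rng.choice(V, p=gamma[z2])
    return a1, a2

theta_r = rng.dirichlet(np.full(T, alpha))
print([generate_extraction_linklda(theta_r) for _ in range(5)])
```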

43 To facilitate learning related topic pairs between arguments we employ a sparse prior over the per-relation topic distributions. [sent-148, score-0.55]

44 Because a few topics are likely to be assigned most of the probability mass for a given relation, it is more likely (although not necessarily the case) that the same topic number k will be drawn for both arguments. [sent-149, score-0.609]

45 This allows one argument to help disambiguate the other in the case of ambiguous relation strings. [sent-152, score-0.236]

46 LinkLDA, however, is more flexible; rather than requiring both arguments to be generated from one of |Z| possible pairs of multinomials (βz, γz), LinkLDA allows the arguments of a given extraction to be generated from one of |Z|^2 possible pairs. [sent-153, score-0.4]

47 LinkLDA can thus re-use argument classes, choosing different combinations of topics for the arguments if it fits the data better. [sent-155, score-0.518]

48 4 Inference For all the models we use collapsed Gibbs sampling for inference in which each of the hidden variables (e. [sent-159, score-0.252]
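The sketch below shows what a collapsed Gibbs sweep looks like for the single-argument (plain LDA) case, resampling each hidden topic assignment from the count tables; it is only an illustration, not the authors' implementation, and the corpus, topic count, and hyperparameters are placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
corpus = [[0, 1, 2, 1], [2, 3, 3, 4], [0, 4, 4, 1]]  # argument word ids per relation (toy)
T, V = 3, 5
alpha, eta = 0.1, 0.01

# Count tables for the collapsed sampler.
n_rt = np.zeros((len(corpus), T))   # relation-topic counts
n_tw = np.zeros((T, V))             # topic-word counts
n_t = np.zeros(T)                   # topic totals
z = [[rng.integers(T) for _ in doc] for doc in corpus]  # random initialization

for r, doc in enumerate(corpus):
    for i, w in enumerate(doc):
        t = z[r][i]
        n_rt[r, t] += 1; n_tw[t, w] += 1; n_t[t] += 1

for _ in range(200):                       # Gibbs sweeps
    for r, doc in enumerate(corpus):
        for i, w in enumerate(doc):
            t = z[r][i]                    # remove the current assignment
            n_rt[r, t] -= 1; n_tw[t, w] -= 1; n_t[t] -= 1
            p = (n_rt[r] + alpha) * (n_tw[:, w] + eta) / (n_t + V * eta)
            t = rng.choice(T, p=p / p.sum())
            z[r][i] = t                    # add the new assignment back
            n_rt[r, t] += 1; n_tw[t, w] += 1; n_t[t] += 1

theta = (n_rt + alpha) / (n_rt.sum(axis=1, keepdims=True) + T * alpha)
print(np.round(theta, 2))                  # estimated per-relation topic mixtures
```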

49 First, they naturally model the class-based nature of selectional preferences, but don’t take a pre-defined set of classes as input. [sent-169, score-0.497]

50 Second, the models naturally handle ambiguous arguments, as they are able to assign different topics to the same phrase in different contexts. [sent-172, score-0.246]

51 Finally we note that, once a topic distribution has been learned over a set of training relations, one can efficiently apply inference to unseen relations (Yao et al.). [sent-177, score-0.435]

52 We perform three main experiments to assess the quality of the preferences obtained using topic models. [sent-179, score-0.506]

53 The first is a pseudo-disambiguation task (Section 4.2), which is a standard way to evaluate the quality of selectional preferences (Rooth et al.). [sent-181, score-0.737]

54 We use this experiment to compare the various topic models with one another, as well as the best model with known state-of-the-art approaches to selectional preferences. [sent-184, score-0.665]

55 Finally, we report on the quality of a large database of WordNet-based preferences obtained after manually associating our topics with WordNet classes (Section 4. [sent-187, score-0.612]

56 We inferred topic-argument and relation-topic multinomials (β, γ, and θ) on the generalization corpus by taking 5 samples at a lag of 50 after a burn in of 750 iterations. [sent-196, score-0.251]

57 Using multiple samples introduces the risk of topic drift due to lack of identifiability; however, we found this not to be a problem in practice. [sent-197, score-0.218]

58 During development we found that the topics tend to remain stable across multiple samples after sufficient burn in, and multiple samples improved performance. [sent-198, score-0.313]

59 Table 1 lists sample topics and highly ranked words for each (for both arguments), as well as relations favoring those topics. [sent-199, score-0.307]

60 In this pseudo-disambiguation experiment an observed tuple is paired with a pseudo-negative, which has both arguments randomly generated from the whole vocabulary (according to the corpus-wide distribution over arguments). [sent-204, score-0.28]

61 For each relation we collected all tuples containing arguments in the vocabulary. [sent-209, score-0.371]

62 Many of the most frequent relations have very weak selectional preferences, and thus provide little signal for inferring meaningful topics. [sent-211, score-0.542]

63 Table 1: for each topic t, the Arg1 and Arg2 columns show the most probable values according to the multinomial distributions for each argument (βt and γt), and the middle column shows relations which assign highest probability to t. [sent-213, score-0.264]

64 The middle column reports a few relations whose inferred topic distributions assign highest probability to t. [sent-214, score-0.39]

65 Next we used collapsed Gibbs sampling to infer a distribution over topics, θr, for each of the relations in the primary corpus (based solely on tuples in the training set) using the topics from the generalization corpus. [sent-218, score-0.596]

66 For each of the 500 observed tuples in the test set we generated a pseudo-negative tuple by randomly sampling two noun phrases from the distribution of NPs in both corpora. [sent-219, score-0.238]
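The following sketch shows one way such pseudo-negatives could be built: keep each relation string but resample both arguments from the corpus-wide noun-phrase distribution. The tuples and counts are invented toy data, not the paper's test set.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(3)
observed = [("is headquartered in", "Microsoft", "Seattle"),
            ("was born in", "Obama", "Hawaii")]
np_counts = Counter(["Microsoft", "Seattle", "Obama", "Hawaii", "pizza", "Tuesday"])

nps = list(np_counts)
probs = np.array([np_counts[w] for w in nps], dtype=float)
probs /= probs.sum()                        # corpus-wide distribution over noun phrases

def pseudo_negative(rel):
    """Keep the relation string, replace both arguments with random NPs."""
    a1, a2 = rng.choice(nps, size=2, p=probs)
    return (rel, a1, a2)

test_pairs = [(t, pseudo_negative(t[0])) for t in observed]
print(test_pairs)
```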

67 2 Prediction Our prediction system needs to determine whether a specific relation-argument pair is admissible according to the selectional preferences or is a random distractor (D). [sent-222, score-0.811]

68 We first compute the probability of observing a1 for the first argument of relation r given that it is not a distractor, P(a1 | r, ¬D), which we approximate by its probability given an estimate of the parameters inferred by our model, marginalizing over hidden topics t. [sent-224, score-0.614]
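A minimal sketch of that marginalization, P(a1 | r, ¬D) ≈ Σt θ̂r[t] β̂t[a1], assuming point estimates recovered from the Gibbs samples (the arrays below are illustrative placeholders, not learned values):

```python
import numpy as np

theta_hat = np.array([0.7, 0.2, 0.1])          # inferred topic mixture for relation r
beta_hat = np.array([[0.5, 0.3, 0.2],          # topic 0 over a toy 3-word vocabulary
                     [0.1, 0.8, 0.1],
                     [0.2, 0.2, 0.6]])

def arg1_score(word_id):
    """P(a1 = word | r, not a distractor), marginalized over topics."""
    return float(np.dot(theta_hat, beta_hat[:, word_id]))

print(arg1_score(0))   # higher scores mean the argument fits r's preferences better
```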

69 Figure 4: Comparison to similarity-based selectional preference systems. [sent-237, score-0.485]

70 In addition to superior performance in the selectional preference evaluation, LDA-SP also produces a set of coherent topics, which can be useful in their own right. [sent-247, score-0.451]

71 We choose the task of improving textual entailment by learning selectional preferences for inference rules and filtering inferences that do not respect them. [sent-260, score-1.003]

72 This application of selectional preferences was introduced by Pantel et al. [sent-261, score-0.737]

73 In order for an inference to be plausible, both relations must have similar selectional preferences, and further, the arguments must obey the selectional preferences of both the antecedent r1 and the consequent r2. [sent-268, score-1.636]

74 Pantel et al. (2007) made use of these intuitions by producing a set of class-based selectional preferences for each relation, then filtering out any inferences where the arguments were incompatible with the intersection of these preferences. [sent-270, score-1.11]

75 In contrast, we take a probabilistic approach, evaluating the quality of a specific inference by measuring the probability that the arguments in both the antecedent and the consequent were drawn from the same hidden topic in our model. [sent-271, score-0.672]

76 Note that this probability captures both the requirement that the antecedent and consequent have similar selectional preferences, and that the arguments from a particular instance of the rule's application match their overlap. [sent-272, score-0.752]

77 We use zri,j to denote the topic that generates the jth argument of relation ri. [sent-273, score-0.425]
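One way to turn this into a score, sketched below under the assumption of shared topic-word distributions and per-relation mixtures (the paper's exact estimator may differ), is to compute the topic posterior for a shared argument under both the antecedent and the consequent and sum the probability that the two agree.

```python
import numpy as np

def topic_posterior(theta_r, beta, word_id):
    """P(z = t | argument, relation), proportional to theta_r[t] * beta[t, word]."""
    p = theta_r * beta[:, word_id]
    return p / p.sum()

def same_topic_prob(theta_r1, theta_r2, beta, word_id):
    """Probability the argument's topic matches under antecedent and consequent."""
    p1 = topic_posterior(theta_r1, beta, word_id)
    p2 = topic_posterior(theta_r2, beta, word_id)
    return float(np.dot(p1, p2))

# Toy point estimates (placeholders, not learned parameters).
beta = np.array([[0.6, 0.3, 0.1],
                 [0.1, 0.2, 0.7]])
theta_antecedent = np.array([0.8, 0.2])
theta_consequent = np.array([0.7, 0.3])
print(same_topic_prob(theta_antecedent, theta_consequent, beta, word_id=0))
```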

78 In order to evaluate LDA-SP's ability to filter inferences based on selectional preferences, we need a set of inference rules between the relations in our corpus. [sent-279, score-1.026]

79 We first gathered all instances in the generalization corpus, and for each r(a1, a2) created a corresponding simple sentence by concatenating the arguments with the relation string between them. [sent-281, score-0.362]

80 We then automatically filtered out any rules which contained a negation, or for which the antecedent and consequent contained a pair of antonyms found in WordNet (this left us with 85 rules). [sent-287, score-0.222]

81 Figure 5: Precision and recall on the inference filtering task. [sent-302, score-0.221]

82 Finally we explore LDA-SP's ability to produce a repository of human-interpretable class-based selectional preferences. [sent-313, score-0.504]

83 As an example, for the relation was born in, we would like to infer that the plausible arguments include (person, location) and (person, date). [sent-314, score-0.308]

84 Since we already have a set of topics, our task reduces to mapping the inferred topics to an equivalent class in a taxonomy (e.g., WordNet). [sent-315, score-0.304]

85 Guided by the fact that we have a relatively small number of topics (600 total, 300 for each argument), we simply chose to label them manually. [sent-319, score-0.218]

86 By labeling this small number of topics we can infer class-based preferences for an arbitrary number of relations. [sent-320, score-0.563]

87 In particular, we applied a semi-automatic scheme to map topics to WordNet. [sent-321, score-0.218]

88 We then manually picked the class from the shortlist that best represented the top 20 arguments for a topic (similar to Table 1). [sent-323, score-0.485]
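The paper describes this mapping only at a high level; the sketch below shows one plausible way a WordNet shortlist could be produced with NLTK, by ranking hypernym ancestors of a topic's top arguments. The example arguments are invented, and the NLTK WordNet data must be installed (nltk.download('wordnet')).

```python
from collections import Counter
from nltk.corpus import wordnet as wn

def shortlist(top_args, k=5):
    """Rank WordNet classes by how many of the topic's top arguments they cover."""
    counts = Counter()
    for word in top_args:
        synsets = wn.synsets(word, pos=wn.NOUN)
        if not synsets:
            continue                      # poor lexical coverage: skip unknown words
        ancestors = {h for path in synsets[0].hypernym_paths() for h in path}
        counts.update(ancestors)
    return [s.name() for s, _ in counts.most_common(k)]

# Toy top arguments for a hypothetical "person"-like topic.
print(shortlist(["politician", "senator", "president", "governor"]))
```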

89 We marked all incoherent topics with a special symbol ∅. [sent-324, score-0.248]

90 For each argument of each relation we picked the top two topics according to frequency in the 5 Gibbs samples. [sent-328, score-0.362]

91 We then discarded any topics which were labeled with ∅; this resulted in a set of 236 predictions. [sent-329, score-0.218]

92 We contrast this with Pantel's repository, the only other released database of selectional preferences to our knowledge. [sent-333, score-0.737]

93 We evaluated the same 100 relations from his website and tagged the top 2 classes for each argument and evaluated the accuracy to be roughly 0. [sent-334, score-0.294]

94 We emphasize that tagging a pair of class-based preferences is a highly subjective task, so these results should be treated as preliminary. [sent-344, score-0.317]

95 We have presented an application of topic modeling to the problem of automatically computing selectional preferences. [sent-347, score-0.609]

96 Our method, LDA-SP, learns a distribution over topics for each relation while simultaneously grouping related words into these topics. [sent-348, score-0.396]

97 Finally, our repository of selectional preferences for 10,000 relations is available at http : / /www . [sent-353, score-0.883]

99 Predicting response to political blog posts with topic models. [sent-531, score-0.24]

100 Efficient methods for topic model inference on streaming document collections. [sent-538, score-0.276]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('selectional', 0.42), ('linklda', 0.369), ('preferences', 0.317), ('jointlda', 0.221), ('topics', 0.218), ('topic', 0.189), ('arguments', 0.172), ('bergsma', 0.129), ('argument', 0.128), ('pantel', 0.127), ('independentlda', 0.126), ('relation', 0.108), ('erosheva', 0.105), ('lda', 0.102), ('multinomial', 0.093), ('tuples', 0.091), ('relations', 0.089), ('inference', 0.087), ('generalization', 0.082), ('resnik', 0.08), ('classes', 0.077), ('erk', 0.074), ('rooth', 0.072), ('antecedent', 0.068), ('classbased', 0.068), ('inferences', 0.067), ('filtering', 0.066), ('drawn', 0.065), ('consequent', 0.063), ('multinomials', 0.063), ('hidden', 0.062), ('latent', 0.061), ('newman', 0.06), ('textrunner', 0.057), ('repository', 0.057), ('politician', 0.055), ('wordnet', 0.052), ('political', 0.051), ('mimno', 0.051), ('mutual', 0.05), ('gibbs', 0.05), ('class', 0.046), ('rules', 0.046), ('antonyms', 0.045), ('distributions', 0.043), ('yao', 0.043), ('person', 0.043), ('headquartered', 0.042), ('shortlist', 0.042), ('yano', 0.042), ('xt', 0.041), ('distribution', 0.041), ('inferred', 0.04), ('etzioni', 0.04), ('tuple', 0.039), ('sampling', 0.039), ('admissible', 0.037), ('burn', 0.037), ('distractor', 0.037), ('similarity', 0.036), ('jaccard', 0.036), ('aj', 0.036), ('collapsed', 0.036), ('picked', 0.036), ('blei', 0.034), ('dirichlet', 0.034), ('brody', 0.034), ('daume', 0.034), ('schubert', 0.034), ('recall', 0.034), ('inferring', 0.033), ('bayes', 0.032), ('banko', 0.032), ('elena', 0.032), ('reisinger', 0.032), ('preference', 0.031), ('incoherent', 0.03), ('zi', 0.03), ('samples', 0.029), ('unseen', 0.029), ('precision', 0.029), ('probability', 0.029), ('generative', 0.029), ('simultaneously', 0.029), ('dirt', 0.029), ('carlson', 0.029), ('durme', 0.029), ('kozareva', 0.029), ('retains', 0.029), ('approaches', 0.028), ('predictive', 0.028), ('models', 0.028), ('generated', 0.028), ('poor', 0.028), ('labeling', 0.028), ('plausible', 0.028), ('oren', 0.028), ('drawbacks', 0.027), ('interpretable', 0.027), ('mei', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999893 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

Author: Alan Ritter ; Mausam Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional classbased approaches, it produces humaninterpretable classes describing each relation’s preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-ofthe-art methods achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP’s effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al. ’s system (Pantel et al., 2007).

2 0.43075702 158 acl-2010-Latent Variable Models of Selectional Preference

Author: Diarmuid O Seaghdha

Abstract: This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data.

3 0.22385301 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

Author: Nathanael Chambers ; Dan Jurafsky

Abstract: This paper improves the use of pseudowords as an evaluation framework for selectional preferences. While pseudowords originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. A selectional preference model ranks a set of possible arguments for a verb by their semantic fit to the verb. Pseudo-words serve as a proxy evaluation for these decisions. The evaluation takes an argument of a verb like drive (e.g. car), pairs it with an alternative word (e.g. car/rock), and asks a model to identify the original. This paper studies two main aspects of pseudoword creation that affect performance results. (1) Pseudo-word evaluations often evaluate only a subset of the words. We show that selectional preferences should instead be evaluated on the data in its entirety. (2) Different approaches to selecting partner words can produce overly optimistic evaluations. We offer suggestions to address these factors and present a simple baseline that outperforms the state-ofthe-art by 13% absolute on a newspaper domain.

4 0.21307047 79 acl-2010-Cross-Lingual Latent Topic Extraction

Author: Duo Zhang ; Qiaozhu Mei ; ChengXiang Zhai

Abstract: Probabilistic latent topic models have recently enjoyed much success in extracting and analyzing latent topics in text in an unsupervised way. One common deficiency of existing topic models, though, is that they would not work well for extracting cross-lingual latent topics simply because words in different languages generally do not co-occur with each other. In this paper, we propose a way to incorporate a bilingual dictionary into a probabilistic topic model so that we can apply topic models to extract shared latent topics in text data of different languages. Specifically, we propose a new topic model called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) which extends the Proba- bilistic Latent Semantic Analysis (PLSA) model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary. Both qualitative and quantitative experimental results show that the PCLSA model can effectively extract cross-lingual latent topics from multilingual text data.

5 0.20422491 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

Author: Zornitsa Kozareva ; Eduard Hovy

Abstract: A challenging problem in open information extraction and text mining is the learning of the selectional restrictions of semantic relations. We propose a minimally supervised bootstrapping algorithm that uses a single seed and a recursive lexico-syntactic pattern to learn the arguments and the supertypes of a diverse set of semantic relations from the Web. We evaluate the performance of our algorithm on multiple semantic relations expressed using “verb”, “noun”, and “verb prep” lexico-syntactic patterns. Humanbased evaluation shows that the accuracy of the harvested information is about 90%. We also compare our results with existing knowledge base to outline the similarities and differences of the granularity and diversity of the harvested knowledge.

6 0.18920232 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

7 0.15371956 191 acl-2010-PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names

8 0.13216145 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

9 0.12292669 8 acl-2010-A Hybrid Hierarchical Model for Multi-Document Summarization

10 0.11991057 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

11 0.11322571 237 acl-2010-Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection

12 0.11146405 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs

13 0.10023397 181 acl-2010-On Learning Subtypes of the Part-Whole Relation: Do Not Mix Your Seeds

14 0.09984868 238 acl-2010-Towards Open-Domain Semantic Role Labeling

15 0.09735015 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

16 0.090972371 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

17 0.089817628 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

18 0.087517291 216 acl-2010-Starting from Scratch in Semantic Role Labeling

19 0.086819366 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs

20 0.086302176 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.253), (1, 0.164), (2, 0.075), (3, 0.048), (4, 0.119), (5, 0.025), (6, 0.001), (7, -0.043), (8, 0.119), (9, -0.289), (10, 0.013), (11, 0.014), (12, 0.168), (13, 0.054), (14, 0.36), (15, 0.01), (16, 0.037), (17, 0.016), (18, -0.071), (19, -0.212), (20, -0.045), (21, -0.01), (22, 0.038), (23, -0.035), (24, -0.133), (25, -0.003), (26, 0.019), (27, 0.029), (28, -0.131), (29, 0.086), (30, -0.062), (31, -0.028), (32, 0.03), (33, -0.01), (34, -0.098), (35, 0.004), (36, -0.048), (37, 0.023), (38, 0.054), (39, 0.093), (40, 0.104), (41, 0.029), (42, 0.036), (43, -0.002), (44, 0.056), (45, -0.049), (46, -0.015), (47, -0.01), (48, -0.104), (49, -0.004)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96464515 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

Author: Alan Ritter ; Mausam Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional classbased approaches, it produces humaninterpretable classes describing each relation’s preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-ofthe-art methods achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP’s effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al. ’s system (Pantel et al., 2007).

2 0.87124556 158 acl-2010-Latent Variable Models of Selectional Preference

Author: Diarmuid O Seaghdha

Abstract: This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data.

3 0.71490097 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

Author: Nathanael Chambers ; Dan Jurafsky

Abstract: This paper improves the use of pseudowords as an evaluation framework for selectional preferences. While pseudowords originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. A selectional preference model ranks a set of possible arguments for a verb by their semantic fit to the verb. Pseudo-words serve as a proxy evaluation for these decisions. The evaluation takes an argument of a verb like drive (e.g. car), pairs it with an alternative word (e.g. car/rock), and asks a model to identify the original. This paper studies two main aspects of pseudoword creation that affect performance results. (1) Pseudo-word evaluations often evaluate only a subset of the words. We show that selectional preferences should instead be evaluated on the data in its entirety. (2) Different approaches to selecting partner words can produce overly optimistic evaluations. We offer suggestions to address these factors and present a simple baseline that outperforms the state-ofthe-art by 13% absolute on a newspaper domain.

4 0.62042451 79 acl-2010-Cross-Lingual Latent Topic Extraction

Author: Duo Zhang ; Qiaozhu Mei ; ChengXiang Zhai

Abstract: Probabilistic latent topic models have recently enjoyed much success in extracting and analyzing latent topics in text in an unsupervised way. One common deficiency of existing topic models, though, is that they would not work well for extracting cross-lingual latent topics simply because words in different languages generally do not co-occur with each other. In this paper, we propose a way to incorporate a bilingual dictionary into a probabilistic topic model so that we can apply topic models to extract shared latent topics in text data of different languages. Specifically, we propose a new topic model called Probabilistic Cross-Lingual Latent Semantic Analysis (PCLSA) which extends the Proba- bilistic Latent Semantic Analysis (PLSA) model by regularizing its likelihood function with soft constraints defined based on a bilingual dictionary. Both qualitative and quantitative experimental results show that the PCLSA model can effectively extract cross-lingual latent topics from multilingual text data.

5 0.58329433 191 acl-2010-PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of Proper Names

Author: Mark Johnson

Abstract: This paper establishes a connection between two apparently very different kinds of probabilistic models. Latent Dirichlet Allocation (LDA) models are used as “topic models” to produce a lowdimensional representation of documents, while Probabilistic Context-Free Grammars (PCFGs) define distributions over trees. The paper begins by showing that LDA topic models can be viewed as a special kind of PCFG, so Bayesian inference for PCFGs can be used to infer Topic Models as well. Adaptor Grammars (AGs) are a hierarchical, non-parameteric Bayesian extension of PCFGs. Exploiting the close relationship between LDA and PCFGs just described, we propose two novel probabilistic models that combine insights from LDA and AG models. The first replaces the unigram component of LDA topic models with multi-word sequences or collocations generated by an AG. The second extension builds on the first one to learn aspects of the internal structure of proper names.

6 0.56547159 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

7 0.50456959 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs

8 0.50238389 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

9 0.41917372 8 acl-2010-A Hybrid Hierarchical Model for Multi-Document Summarization

10 0.40247694 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

11 0.39465526 181 acl-2010-On Learning Subtypes of the Part-Whole Relation: Do Not Mix Your Seeds

12 0.38448057 19 acl-2010-A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation

13 0.38263625 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

14 0.3737514 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

15 0.36613619 61 acl-2010-Combining Data and Mathematical Models of Language Change

16 0.35771561 34 acl-2010-Authorship Attribution Using Probabilistic Context-Free Grammars

17 0.3545641 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs

18 0.35062891 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

19 0.34941837 237 acl-2010-Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection

20 0.34706366 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(25, 0.063), (33, 0.014), (42, 0.019), (44, 0.012), (59, 0.101), (72, 0.012), (73, 0.04), (76, 0.014), (78, 0.366), (80, 0.011), (83, 0.099), (84, 0.032), (98, 0.133)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98921347 228 acl-2010-The Importance of Rule Restrictions in CCG

Author: Marco Kuhlmann ; Alexander Koller ; Giorgio Satta

Abstract: Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and crosslinguistic variation is isolated in the lexicon. In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG). Our result also carries over to a multi-modal extension of CCG.

2 0.96773714 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: ―Tree SRL system‖ is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set where 79% accuracy was obtained. 1

3 0.91214758 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing

Author: Amit Dubey

Abstract: Probabilistic models of sentence comprehension are increasingly relevant to questions concerning human language processing. However, such models are often limited to syntactic factors. This paper introduces a novel sentence processing model that consists of a parser augmented with a probabilistic logic-based model of coreference resolution, which allows us to simulate how context interacts with syntax in a reading task. Our simulations show that a Weakly Interactive cognitive architecture can explain data which had been provided as evidence for the Strongly Interactive hypothesis.

same-paper 4 0.88084394 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

Author: Alan Ritter ; Mausam Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional classbased approaches, it produces humaninterpretable classes describing each relation’s preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-ofthe-art methods achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP’s effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al. ’s system (Pantel et al., 2007).

5 0.82499421 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal

Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a wordsense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.

6 0.74935079 158 acl-2010-Latent Variable Models of Selectional Preference

7 0.74832785 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

8 0.72259867 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

9 0.70464337 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar

10 0.69647408 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

11 0.68906492 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation

12 0.68594772 130 acl-2010-Hard Constraints for Grammatical Function Labelling

13 0.68450958 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

14 0.68448114 21 acl-2010-A Tree Transducer Model for Synchronous Tree-Adjoining Grammars

15 0.6766904 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

16 0.67441452 107 acl-2010-Exemplar-Based Models for Word Meaning in Context

17 0.67348564 67 acl-2010-Computing Weakest Readings

18 0.67305869 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

19 0.671624 71 acl-2010-Convolution Kernel over Packed Parse Forest

20 0.66500348 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information