
84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues


Source: pdf

Author: Altaf Rahman ; Vincent Ng

Abstract: An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. Experimental results on a set of Switchboard dialogues reveal that (1) incorporating our proposed features into Nissim’s feature set enables our system to achieve state-of-the-art performance on information-status classification, and (2) the resulting information can be used to improve the performance of learning-based coreference resolvers.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. [sent-4, score-0.358]

2 We address the under-investigated problem of automatically determining the information status of discourse entities. [sent-5, score-0.34]

3 Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. [sent-6, score-0.962]

4 1 Introduction Information status is not a term unfamiliar to researchers working on discourse processing problems. [sent-8, score-0.314]

5 It describes the extent to which a discourse entity, which is typically a noun phrase (NP), is available to the hearer given the speaker’s assumptions about the hearer’s beliefs. [sent-9, score-0.227]

6 (2004), a discourse entity can be new, old, or mediated. [sent-11, score-0.231]

7 Information status is a subject that has received a lot of attention in theoretical linguistics (Halliday, 1976; Prince, 1981; Hajičová, 1984; Vallduví, 1992; Steedman, 2000). [sent-13, score-0.204]

8 Knowing the information status of discourse entities can potentially benefit many NLP applications. [sent-14, score-0.4]

9 Since new entities are by definition new to the hearer and therefore cannot refer to a previously-introduced NP, knowledge of information status could be used to improve anaphora resolution. [sent-18, score-0.449]

10 Despite the potential usefulness of information status in NLP tasks, there has been little work on learning the information status of discourse entities. [sent-19, score-0.54]

11 First, we describe a learning approach to the under-studied problem of determining the information status of discourse entities that extends Nissim’s (2006) feature set with two novel types of features: lexical features and structured features based on syntactic parse trees. [sent-23, score-0.632]

12 [page footer: ...Methods in Natural Language Processing, pages 1069–1080; ©2011 Association for Computational Linguistics] acquired knowledge of information status for coreference resolution. [sent-28, score-0.586]

13 1% in F-measure, and (2) learned knowledge of information status can be used to improve coreference resolvers by 1. [sent-31, score-0.586]

14 Finally, we evaluate the determination of information status as a standalone task and in the context of coreference resolution. [sent-38, score-0.665]

15 2 Old, New, and Mediated Entities Since the concepts of old, new, and mediated entities are not widely known to researchers working outside the area of discourse processing, in this section we will explain them in more detail. [sent-39, score-0.363]

16 The terms old and new information have meant a variety of things over the years (Allerton, 1978; Prince, 1981; Horn, 1986). [sent-40, score-0.237]

17 , their definitions are built upon Prince’s (1981), and the categorization into old, new, and mediated entities resembles those of Strube (1998) and Eckert and Strube (2001). [sent-45, score-0.227]

18 As mentioned before, an entity is old if it is both known to the hearer and has been mentioned in the conversation. [sent-47, score-0.501]

19 More precisely, an entity is old if (1) it is coreferential with an entity introduced earlier, (2) it is a generic pronoun, or (3) it is a personal pronoun referring to the dialogue participants. [sent-48, score-0.582]

20 In Example 1, my is an old entity because it is coreferent with I. [sent-52, score-0.358]

21 In Example 2, You is an old entity because it is a generic pronoun. [sent-53, score-0.358]

22 An entity is mediated if it has not been previously introduced in the conversation, but can be inferred from already-mentioned entities or is generally known to the hearer. [sent-55, score-0.374]

23 More specifically, an entity is mediated if (1) it is a generally known entity (e. [sent-56, score-0.409]

24 , an entity that is inferrable from a related entity mentioned earlier in the dialogue). [sent-60, score-0.266]

25 In Example 3a, by the time the hearer processes the second occurrence of the door, she has already had a mental entity corresponding to the door (after processing the first occurrence). [sent-64, score-0.342]

26 As a result, the second occurrence of the door is an old entity. [sent-65, score-0.341]

27 In Example 3b, on the other hand, the hearer is not assumed to have any mental representation of the door in question, but she can infer that the door she saw was part of Mary’s house. [sent-66, score-0.281]

28 Hence, this occurrence of the door is a mediated entity. [sent-67, score-0.245]

29 In general, an entity that is related to an earlier entity via a part-whole relation or a set-subset relation is mediated. [sent-68, score-0.266]

30 An entity is new if it has not been introduced in the dialogue and the hearer cannot infer it from previously mentioned entities. [sent-70, score-0.306]

31 The second occurrence can be labeled as old (because it is coreferential with an earlier entity) or mediated (because it is a generally known entity). [sent-74, score-0.485]

32 Finally, randomizing the instances does not allow us to apply learned knowledge of information status to coreference resolution, which needs to be performed for each dialogue. [sent-92, score-0.586]

33 4 Baseline System In this section, we describe our baseline system, which adopts a machine learning approach to determining the information status of a discourse entity. [sent-94, score-0.36]

34 For instance, if an NP, NPk, and a discourse entity that appears before it have the same string (full prev mention), then NPk is likely to be an old entity. [sent-105, score-0.498]

35 Mention time is the categorical version of full prev mention and therefore serves to detect old entities. [sent-106, score-0.29]

36 Partial prev mention is useful for detecting mediated entities, especially those that have a set-subset relation with a preceding entity. [sent-107, score-0.24]
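
The string-matching features described in sentences 34 to 36 can be illustrated with a minimal sketch. The function name, the token-list representation, and the use of plain token overlap as the "partial" match are assumptions made for illustration; the exact definitions in Nissim's feature set are not given in this summary.

```python
def prev_mention_features(np_tokens, preceding_nps):
    """Return (full_prev_mention, partial_prev_mention) for one NP.

    np_tokens: list of tokens of the NP under consideration.
    preceding_nps: list of token lists for NPs that appear earlier.
    Assumption: a partial match is any shared token (a crude stand-in).
    """
    target = [t.lower() for t in np_tokens]
    full = False
    partial = False
    for prev in preceding_nps:
        prev_lower = [t.lower() for t in prev]
        if prev_lower == target:
            # Identical string: evidence that the NP is an old entity.
            full = True
        elif set(prev_lower) & set(target):
            # Some shared material: weaker evidence, e.g. set-subset cases.
            partial = True
    return full, partial


print(prev_mention_features(["the", "door"], [["a", "red", "door"]]))
```

Because function words such as "the" also count as overlap here, a practical version would probably compare only head nouns or content words.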

37 The “NP length” feature is motivated by the observation that old entities tend to contain less lexical material than new entities. [sent-112, score-0.369]

38 To determine the information status of an NP in a test dialogue, we create an instance for it as during training and present it independently to the three binary SVM classifiers, each of which returns a real value representing the signed distance of the instance from the hyperplane. [sent-115, score-0.323]
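
A minimal sketch of how the three signed distances from sentence 38 could be turned into a single label. It assumes, as is standard in one-versus-rest setups, that the class whose classifier returns the largest value is chosen; this summary does not state the selection rule, and the function name is hypothetical.

```python
def predict_status(distances):
    """Pick a label from one-vs-rest outputs.

    distances: dict mapping each label ("old", "med", "new") to the signed
    distance returned by that label's binary classifier.
    Assumption: the label with the largest signed distance wins.
    """
    return max(distances, key=distances.get)


print(predict_status({"old": -0.3, "med": 0.8, "new": 0.1}))
```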

39 1 Lexical Features As discussed, an entity should be labeled as med if it has not been introduced in the dialogue but is generally known to a human. [sent-119, score-0.387]

40 Hence, it would be desirable to augment Nissim’s feature set with features that indicate whether an entity is generally known or not. [sent-122, score-0.216]

41 One way to do this is to (1) create a list of generally known entities, and then (2) create a binary feature that has the value True if and only if the entity under consideration appears in this list. [sent-123, score-0.218]
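
The list-lookup idea in sentence 41 amounts to a set-membership test. The sketch below is illustrative only: the entity list, the normalization by lowercasing, and the function name are assumptions.

```python
def generally_known_feature(np_text, known_entities):
    """Binary feature: True iff the NP appears in a list of generally
    known entities. Normalization (strip + lowercase) is an assumption."""
    return np_text.strip().lower() in known_entities


# Hypothetical list; a real one would have to be compiled separately.
known = {"china", "the sun"}
print(generally_known_feature("China", known))
```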

42 For instance, from the annotated data, the learner will learn that any instance of China cannot be labeled as new, and the decision of whether it should be an old entity or a med entity depends on whether it is coreferential with a previously-mentioned entity. [sent-135, score-0.77]

43 Specifically, given an instance corresponding to discourse entity e, we extract the substructure from the parse tree containing e as follows. [sent-161, score-0.327]

44 We (1) take the subtree rooted at Parent(n(e)), (2) replace each leaf node in this subtree with a node labeled X, (3) replace the subtree rooted at n(e) with a leaf node labeled Y, and (4) use the subtree rooted at Parent(n(e)) as the structured feature for the instance corresponding to e. [sent-163, score-0.218]
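
The four extraction steps in sentence 44 can be sketched as follows. The `Node` class, the helper names, and the bracketed output format are assumptions for illustration; the fallback used when n(e) has no parent is also an assumption, since the summary does not cover that case.

```python
class Node:
    """A minimal parse-tree node; leaves are nodes without children."""
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def find_parent(root, target):
    """Return the parent of `target` (matched by identity), or None."""
    for child in root.children:
        if child is target:
            return root
        found = find_parent(child, target)
        if found is not None:
            return found
    return None


def abstract(node, target):
    """Copy a subtree, turning n(e) into leaf Y and other leaves into X."""
    if node is target:
        return Node("Y")
    if not node.children:
        return Node("X")
    return Node(node.label, [abstract(c, target) for c in node.children])


def structured_feature(root, target):
    """Steps (1)-(4): abstract the subtree rooted at Parent(n(e))."""
    parent = find_parent(root, target)
    if parent is None:  # assumption: n(e) is the root, so use it directly
        parent = target
    return abstract(parent, target)


def to_bracket(node):
    """Render a tree in bracketed form for inspection."""
    if not node.children:
        return node.label
    inner = " ".join(to_bracket(c) for c in node.children)
    return "(" + node.label + " " + inner + ")"


# "I saw the door": the entity e is the NP "the door".
e = Node("NP", [Node("DT", [Node("the")]), Node("NN", [Node("door")])])
vp = Node("VP", [Node("VBD", [Node("saw")]), e])
tree = Node("S", [Node("NP", [Node("PRP", [Node("I")])]), vp])
print(to_bracket(structured_feature(tree, e)))  # (VP (VBD X) Y)
```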

45 Note that using two labels, X and Y, enables the kernel to distinguish the discourse entity under consideration from its context within this substructure. [sent-167, score-0.305]

46 We hope that our use of structured features for information-status classification can promote their use in discourse processing. [sent-180, score-0.222]

47 Specifically, we define and employ the following composite kernel: Kc(F1, F2) = K1(F1, F2) + K2(F1, F2), where F1 and F2 are the full set of features that represent the two entities under consideration, and K1 and K2 are the kernels we are combining. [sent-184, score-0.23]
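
The composite kernel in sentence 47 is a plain sum of two kernel values. In the sketch below the two component kernels are toy stand-ins (a dot product over flat features and an exact-match test on a bracketed tree string); they are assumptions for illustration and not the kernels used in the paper.

```python
def composite_kernel(k1, k2):
    """Build Kc(F1, F2) = K1(F1, F2) + K2(F1, F2)."""
    def kc(f1, f2):
        return k1(f1, f2) + k2(f1, f2)
    return kc


def flat_kernel(f1, f2):
    """Toy K1: dot product over the flat feature vectors."""
    return float(sum(a * b for a, b in zip(f1["flat"], f2["flat"])))


def tree_match_kernel(f1, f2):
    """Toy K2: 1.0 if the tree strings are identical, else 0.0.
    This is only a placeholder for a real tree kernel."""
    return 1.0 if f1["tree"] == f2["tree"] else 0.0


kc = composite_kernel(flat_kernel, tree_match_kernel)
a = {"flat": [1, 0, 2], "tree": "(VP (VBD X) Y)"}
b = {"flat": [1, 1, 1], "tree": "(VP (VBD X) Y)"}
print(kc(a, b))  # 3.0 from the flat part + 1.0 from the tree part = 4.0
```

Since a sum of two valid kernels is itself a valid kernel, the combined function can be handed to any kernel-based learner in place of either component.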

48 Note that while our results and Original Nissim’s are not directly comparable, the two systems are consistent in terms of the relative performance for the three classes: best for old and worst for new. [sent-231, score-0.237]

49 Since many new instances are misclassified, a natural question is: are these instances misclassified as old or med? [sent-233, score-0.305]

50 Similar questions can be raised for old and med, despite their substantially higher recall values than new. [sent-234, score-0.237]

51 As we can see, these numbers seem to suggest the “in-between” nature of mediated entities: when an old or new entity is misclassified, it is typically misclassified as med (rows 1 and 3); however, when a med entity is misclassified, it is equally likely to be misclassified as old and new (row 2). [sent-239, score-1.337]

52 These results are perhaps not surprising, since intuitively med entities bear some resemblance to both old and new entities. [sent-240, score-0.495]

53 For instance, the similarity between med and old stems from the fact that different instances of the same entity (e. [sent-241, score-0.53]

54 On the other hand, med and new are similar in that it may sometimes be difficult even for a human to determine whether certain entities should be labeled as med or new, since the decision depends on whether she believes these entities are generally known or not. [sent-244, score-0.542]

55 2 Relation to Anaphoricity Determination Anaphoricity determination refers to the task of determining whether an NP is anaphoric or not, where an NP is considered anaphoric if it is part of a (nonsingleton) coreference chain but is not the head of the chain (Ng and Cardie, 2002). [sent-246, score-0.689]
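
The definition of anaphoricity in sentence 55 can be read directly off a set of coreference chains. The sketch assumes each chain lists its mentions in textual order, so that the first element is the head of the chain; the function name and the mention identifiers are hypothetical.

```python
def anaphoric_mentions(chains):
    """Return the set of anaphoric NPs.

    chains: list of coreference chains, each a list of mention ids in
    textual order (assumption). An NP is anaphoric if its chain is
    non-singleton and the NP is not the first mention of that chain.
    """
    anaphoric = set()
    for chain in chains:
        if len(chain) > 1:
            anaphoric.update(chain[1:])
    return anaphoric


print(sorted(anaphoric_mentions([["I_1", "my_2", "me_5"], ["door_3"]])))
```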

56 Given this definition, anaphoricity determination bears resemblance to information-status classification. [sent-261, score-0.281]

57 For instance, an old entity is anaphoric, since it has been introduced earlier in the conversation and therefore has an antecedent. [sent-262, score-0.409]

58 Similarly, a new or med entity is non-anaphoric, since the entity has not been previously introduced in the conversation and therefore cannot have an antecedent. [sent-263, score-0.441]

59 There has been a lot of recent work on anaphoricity determination (e. [sent-264, score-0.281]

60 Given the similarity between this task and information-status classification, a natural question is: will the anaphoricity features previously developed by coreference researchers be helpful for information-status classification? [sent-268, score-0.628]

61 Results with the anaphoricity features are shown in Table 5. [sent-270, score-0.246]

62 Comparing each of Baseline+Ana and Baseline+Lexical+Ana with the corresponding experiments in Table 3, we see that the addition of anaphoricity features yields a mild performance improvement, which is consistent over all three classes. [sent-273, score-0.246]

63 However, comparing the last column of the two tables, we can see that in the [footnote 3: These 26 features are derived from those employed by Ng and Cardie’s (2002) anaphoricity determination system.] [sent-274, score-0.325]

64 presence of the structured features, the anaphoricity features do not contribute positively to overall performance. [sent-276, score-0.291]

65 Hence, in the coreference experiments in the next section, we will not employ anaphoricity features for information-status classification. [sent-277, score-0.675]

66 Since the 147 information-status annotated dialogues are also coreference annotated, we use them in our coreference evaluation. [sent-279, score-0.866]

67 To our knowledge, our work represents the first attempt to report coreference results on this dataset. [sent-280, score-0.382]

68 1 Coreference Models While the so-called mention-pair coreference model has dominated coreference research for more than a decade since its appearance in the mid-1990s, a number of new coreference models have been proposed in recent years. [sent-282, score-1.146]

69 Specifically, we create (1) a positive instance for each anaphoric NP NPk and its closest antecedent NPj ; and (2) a negative instance for NPk paired with each of the intervening NPs, NPj+1, NPj+2, . [sent-294, score-0.301]
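
The instance-creation scheme in sentence 69 can be sketched as below. The tuple layout `(candidate, anaphor, label)` and the dictionary that maps each anaphoric NP to its closest antecedent are assumptions made for illustration.

```python
def mention_pair_instances(nps, antecedent_of):
    """Create mention-pair training instances.

    nps: NP ids in textual order.
    antecedent_of: dict mapping each anaphoric NP to its closest antecedent.
    Returns (candidate, anaphor, label) triples: label 1 for the closest
    antecedent NPj, label 0 for each NP strictly between NPj and NPk.
    """
    position = {np_id: i for i, np_id in enumerate(nps)}
    instances = []
    for k, npk in enumerate(nps):
        npj = antecedent_of.get(npk)
        if npj is None:
            continue  # non-anaphoric NPs generate no instances
        j = position[npj]
        instances.append((npj, npk, 1))
        for between in nps[j + 1:k]:
            instances.append((between, npk, 0))
    return instances


print(mention_pair_instances(["a", "b", "c", "d"], {"d": "a"}))
```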

70 First, since each candidate antecedent for an NP to be resolved (henceforth an active NP) is considered independently of the others, this model only determines how good a candidate antecedent is relative to the active NP, but not how good a candidate antecedent is relative to other candidates. [sent-304, score-0.291]

71 Second, it has limitations in its expressiveness: the information extracted from the two NPs alone may not be sufficient for making a coreference decision. [sent-306, score-0.382]

72 Since the CR model ranks preceding clusters, a training instance i(cj, NPk) represents a preceding cluster cj and an anaphoric NP NPk. [sent-322, score-0.331]

73 We train a cluster ranker to jointly learn anaphoricity determination and coreference resolution using SVMlight’s ranker-learning algorithm. [sent-328, score-0.807]

74 Specifically, for each NP, NPk, we create a training instance between NPk and each preceding cluster cj using the features described above. [sent-329, score-0.227]

75 1 Experimental Setup The training/test split we use in the coreference experiments is the same as that in the information-status experiments. [sent-343, score-0.441]

76 Our decision to allow the coreference models to resolve only the old entities is motivated by the fact that med and new entities have not been previously introduced in the conversation and therefore do not have antecedents. [sent-345, score-0.99]
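
The restriction described in sentence 76 can be sketched as a filter applied before resolution. The function name and the label strings are assumptions; how NPs without a predicted label are treated is not specified in this summary, so here they are simply skipped.

```python
def candidates_for_resolution(nps, predicted_status):
    """Keep only the NPs predicted to be old; med and new NPs are left
    unresolved because they are assumed to have no antecedent.

    nps: NP ids in textual order.
    predicted_status: dict NP id -> "old" | "med" | "new".
    """
    return [np_id for np_id in nps if predicted_status.get(np_id) == "old"]


print(candidates_for_resolution(
    ["np1", "np2", "np3"], {"np1": "old", "np2": "new", "np3": "old"}))
```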

77 The NPs used by the coreference models are the same as those accessible to the information-status classifier. [sent-346, score-0.382]

78 We employ two scoring programs, B3 (Bagga and Baldwin, 1998) and φ3-CEAF (Luo, 2005), to score the output of a coreference model. [sent-347, score-0.429]
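
For readers unfamiliar with B3, the following is a minimal sketch of the metric as it is commonly defined: per-mention precision and recall averaged with equal weight over mentions. It assumes the key and response partitions cover the same mentions and is not the scoring program used by the authors.

```python
def b_cubed(key_chains, response_chains):
    """Return (precision, recall, F-measure) under B3.

    Assumption: every mention appears in exactly one key chain, and any
    mention missing from the response is treated as a singleton.
    """
    key_of = {m: frozenset(c) for c in key_chains for m in c}
    resp_of = {m: frozenset(c) for c in response_chains for m in c}
    precision = recall = 0.0
    for m, key in key_of.items():
        resp = resp_of.get(m, frozenset([m]))
        overlap = len(key & resp)
        precision += overlap / len(resp)
        recall += overlap / len(key)
    n = len(key_of)
    precision /= n
    recall /= n
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


p, r, f = b_cubed([["a", "b", "c"], ["d"]], [["a", "b"], ["c", "d"]])
print(round(p, 3), round(r, 3), round(f, 3))
```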

79 2 Results and Discussion As our baseline, we employ our coreference models to generate NP partitions on the test documents without using any knowledge of information status. [sent-358, score-0.429]

80 Next, we examine the impact of learned knowledge of information status on the performance of a coreference model. [sent-366, score-0.586]

81 Since knowledge of information status enables a coreference model to focus on resolving only the old entities, we hypothesize that the resulting model will have a higher precision than one that does not employ such knowledge. [sent-367, score-0.892]

82 Since we are employing knowledge of information status in a pipeline coreference architecture where information-status classification is performed prior to coreference resolution, errors made by the (upstream) information-status classifier may propagate to the (downstream) coreference system. [sent-369, score-1.445]

83 In particular, the higher the accuracy of information-status classification is, the more likely the F-measure of the downstream coreference model will improve. [sent-371, score-0.405]

84 Results of the coreference models employing knowledge provided by the three information-status classifiers are shown in rows 2–4 of Table 6. [sent-384, score-0.458]

85 As expected, B3 precision increases in comparison to the baseline, regardless of the coreference model and the scoring program. [sent-385, score-0.382]

86 In addition, employing knowledge of information status always improves coreference performance: F-measure scores increase by 1. [sent-386, score-0.625]

87 These results suggest that the three information-status classifiers have achieved the level of accuracy needed for the coreference models to improve. [sent-395, score-0.419]

88 On the other hand, it is somewhat surprising that the three information-status classifiers have yielded coreference systems that perform at essentially the same level of performance. [sent-396, score-0.419]

89 Finally, we investigate whether our coreference system could be improved if it had access to perfect knowledge of information status (taken directly from the gold-standard annotations). [sent-400, score-0.586]

90 This experiment will allow us to determine whether the usefulness of knowledge of information status for coreference resolution is limited by the accuracy in computing such knowledge. [sent-401, score-0.677]

91 As we can see, using perfect information-status knowledge yields a coreference system that improves on those that employ automatically acquired information-status knowledge by 1. [sent-403, score-0.382]

92 This indicates that the accuracy in computing such knowledge does play a role in determining its usefulness for coreference resolution. [sent-408, score-0.43]

93 8 Conclusions We examined the problem of automatically determining the information status of discourse entities in spoken dialogues. [sent-409, score-0.426]

94 In addition, we evaluated information-status classification in the context of coreference resolution, and showed that automatically acquired knowledge of information status can be profitably used to improve coreference systems. [sent-413, score-1.05]

95 Global, joint determination of anaphoricity and coreference resolution using integer programming. [sent-455, score-0.732]

96 A mention-synchronous coreference resolution algorithm based on the Bell tree. [sent-496, score-0.451]

97 Identifying anaphoric and non-anaphoric noun phrases to improve coreference resolution. [sent-512, score-0.483]

98 A machine learning approach to coreference resolution of noun phrases. [sent-544, score-0.451]

99 An entity-mention model for coreference resolution with inductive logic programming. [sent-572, score-0.451]

100 Global learning of noun phrase anaphoricity in coreference resolution via label propagation. [sent-580, score-0.653]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('nissim', 0.505), ('coreference', 0.382), ('npk', 0.317), ('old', 0.237), ('status', 0.204), ('anaphoricity', 0.202), ('med', 0.172), ('nps', 0.165), ('mediated', 0.141), ('np', 0.13), ('npj', 0.129), ('entity', 0.121), ('hearer', 0.117), ('discourse', 0.11), ('dialogues', 0.102), ('anaphoric', 0.101), ('ceaf', 0.094), ('entities', 0.086), ('door', 0.082), ('rahman', 0.082), ('antecedent', 0.081), ('determination', 0.079), ('mp', 0.076), ('switchboard', 0.071), ('resolution', 0.069), ('dialogue', 0.068), ('misclassified', 0.068), ('cj', 0.066), ('informationstatus', 0.059), ('cr', 0.053), ('kernel', 0.052), ('ranker', 0.051), ('instance', 0.048), ('employ', 0.047), ('preceding', 0.046), ('structured', 0.045), ('features', 0.044), ('anaphora', 0.042), ('xb', 0.04), ('ng', 0.039), ('employing', 0.039), ('classifiers', 0.037), ('learner', 0.036), ('coreferential', 0.035), ('prince', 0.035), ('classifier', 0.033), ('svm', 0.032), ('kernels', 0.03), ('prev', 0.03), ('denis', 0.03), ('henceforth', 0.029), ('specifically', 0.028), ('resolver', 0.027), ('pragmatics', 0.027), ('antecedents', 0.027), ('conversation', 0.027), ('parse', 0.027), ('vincent', 0.027), ('clusters', 0.026), ('determining', 0.026), ('relational', 0.026), ('known', 0.026), ('luo', 0.025), ('aone', 0.025), ('svmlight', 0.025), ('subtree', 0.025), ('feature', 0.025), ('cluster', 0.024), ('active', 0.024), ('earlier', 0.024), ('mention', 0.023), ('altaf', 0.023), ('assemble', 0.023), ('bean', 0.023), ('eckert', 0.023), ('kpj', 0.023), ('presupposition', 0.023), ('rpj', 0.023), ('svmmulticlass', 0.023), ('vallduv', 0.023), ('xiaoqiang', 0.023), ('classified', 0.023), ('classification', 0.023), ('create', 0.023), ('composite', 0.023), ('enables', 0.022), ('occurrence', 0.022), ('flat', 0.022), ('usefulness', 0.022), ('true', 0.022), ('convolution', 0.021), ('tree', 0.021), ('lexical', 0.021), ('aforementioned', 0.021), ('baseline', 0.02), ('calhoun', 0.02), ('ana', 0.02), ('vieira', 0.02), ('bagga', 0.02), ('malvina', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999928 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues

Author: Altaf Rahman ; Vincent Ng

Abstract: An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. Experimental results on a set of Switchboard dialogues reveal that (1) incorporating our proposed features into Nissim’s feature set enables our system to achieve state-of-the-art performance on information-status classification, and (2) the resulting information can be used to improve the performance of learning-based coreference resolvers.

2 0.10173098 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing

Author: Amit Dubey ; Frank Keller ; Patrick Sturt

Abstract: This paper introduces a psycholinguistic model of sentence processing which combines a Hidden Markov Model noun phrase chunker with a co-reference classifier. Both models are fully incremental and generative, giving probabilities of lexical elements conditional upon linguistic structure. This allows us to compute the information theoretic measure of surprisal, which is known to correlate with human processing effort. We evaluate our surprisal predictions on the Dundee corpus of eye-movement data and show that our model achieves a better fit with human reading times than a syntax-only model which does not have access to co-reference information.

3 0.098319225 38 emnlp-2011-Data-Driven Response Generation in Social Media

Author: Alan Ritter ; Colin Cherry ; William B. Dolan

Abstract: We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response.

4 0.094667211 128 emnlp-2011-Structured Relation Discovery using Generative Models

Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum

Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.

5 0.091809496 116 emnlp-2011-Robust Disambiguation of Named Entities in Text

Author: Johannes Hoffart ; Mohamed Amir Yosef ; Ilaria Bordino ; Hagen Furstenau ; Manfred Pinkal ; Marc Spaniol ; Bilyana Taneva ; Stefan Thater ; Gerhard Weikum

Abstract: Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.

6 0.080394909 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study

7 0.079655454 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

8 0.076977462 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities

9 0.065606371 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

10 0.06078371 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees

11 0.0550979 94 emnlp-2011-Modelling Discourse Relations for Arabic

12 0.053705264 60 emnlp-2011-Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation

13 0.053026736 105 emnlp-2011-Predicting Thread Discourse Structure over Technical Web Forums

14 0.051994149 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use

15 0.046480149 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search

16 0.046143547 92 emnlp-2011-Minimally Supervised Event Causality Identification

17 0.045096174 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking

18 0.044433784 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

19 0.043847166 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

20 0.041589923 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.148), (1, -0.087), (2, -0.058), (3, 0.023), (4, -0.057), (5, -0.143), (6, 0.015), (7, 0.047), (8, 0.017), (9, -0.134), (10, -0.038), (11, -0.063), (12, -0.078), (13, 0.121), (14, -0.062), (15, -0.084), (16, 0.125), (17, 0.043), (18, -0.036), (19, -0.098), (20, -0.183), (21, 0.084), (22, -0.025), (23, -0.152), (24, -0.027), (25, 0.142), (26, -0.062), (27, 0.045), (28, -0.034), (29, 0.09), (30, -0.023), (31, 0.007), (32, -0.016), (33, -0.028), (34, 0.055), (35, -0.066), (36, -0.088), (37, -0.039), (38, -0.088), (39, 0.028), (40, 0.071), (41, -0.143), (42, 0.091), (43, 0.114), (44, -0.094), (45, -0.021), (46, 0.059), (47, 0.197), (48, 0.124), (49, 0.193)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93294233 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues

Author: Altaf Rahman ; Vincent Ng

Abstract: An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. Experimental results on a set of Switchboard dialogues reveal that (1) incorporating our proposed features into Nissim’s feature set enables our system to achieve state-of-the-art performance on information-status classification, and (2) the resulting information can be used to improve the performance of learning-based coreference resolvers.

2 0.59098423 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing

Author: Amit Dubey ; Frank Keller ; Patrick Sturt

Abstract: This paper introduces a psycholinguistic model of sentence processing which combines a Hidden Markov Model noun phrase chunker with a co-reference classifier. Both models are fully incremental and generative, giving probabilities of lexical elements conditional upon linguistic structure. This allows us to compute the information theoretic measure of surprisal, which is known to correlate with human processing effort. We evaluate our surprisal predictions on the Dundee corpus of eye-movement data and show that our model achieves a better fit with human reading times than a syntax-only model which does not have access to co-reference information.

3 0.46056765 116 emnlp-2011-Robust Disambiguation of Named Entities in Text

Author: Johannes Hoffart ; Mohamed Amir Yosef ; Ilaria Bordino ; Hagen Furstenau ; Manfred Pinkal ; Marc Spaniol ; Bilyana Taneva ; Stefan Thater ; Gerhard Weikum

Abstract: Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.

4 0.42820257 38 emnlp-2011-Data-Driven Response Generation in Social Media

Author: Alan Ritter ; Colin Cherry ; William B. Dolan

Abstract: We present a data-driven approach to generating responses to Twitter status posts, based on phrase-based Statistical Machine Translation. We find that mapping conversational stimuli onto responses is more difficult than translating between languages, due to the wider range of possible responses, the larger fraction of unaligned words/phrases, and the presence of large phrase pairs whose alignment cannot be further decomposed. After addressing these challenges, we compare approaches based on SMT and Information Retrieval in a human evaluation. We show that SMT outperforms IR on this task, and its output is preferred over actual human responses in 15% of cases. As far as we are aware, this is the first work to investigate the use of phrase-based SMT to directly translate a linguistic stimulus into an appropriate response.

5 0.39662549 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study

Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni

Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms cotraining, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp

6 0.35328186 60 emnlp-2011-Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation

7 0.34116325 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities

8 0.31407237 128 emnlp-2011-Structured Relation Discovery using Generative Models

9 0.27846858 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

10 0.26705378 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees

11 0.26703858 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

12 0.26519093 105 emnlp-2011-Predicting Thread Discourse Structure over Technical Web Forums

13 0.26343557 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use

14 0.25659996 67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization

15 0.25638166 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

16 0.2556881 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition

17 0.25537413 94 emnlp-2011-Modelling Discourse Relations for Arabic

18 0.2550199 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction

19 0.25378245 106 emnlp-2011-Predicting a Scientific Community’s Response to an Article

20 0.2530838 12 emnlp-2011-A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.089), (27, 0.021), (36, 0.031), (37, 0.032), (45, 0.061), (53, 0.02), (54, 0.027), (57, 0.019), (62, 0.023), (64, 0.014), (66, 0.042), (69, 0.016), (75, 0.382), (79, 0.042), (82, 0.017), (87, 0.013), (96, 0.04), (98, 0.024)]
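
The list above is a sparse topic vector: each pair is a topic id and the weight the LDA model assigns to that topic for this paper. The page does not state which similarity measure produces the simValue column, so the following is only a minimal sketch of one common choice, cosine similarity over such sparse vectors. The function name `cosine_similarity` and the vector `other_paper` are hypothetical and introduced purely for illustration; the sketch is not expected to reproduce the simValue figures listed below.

```python
import math

# Sparse topic vector for this paper, copied from the topicId/topicWeight list above.
this_paper = [(23, 0.089), (27, 0.021), (36, 0.031), (37, 0.032), (45, 0.061),
              (53, 0.02), (54, 0.027), (57, 0.019), (62, 0.023), (64, 0.014),
              (66, 0.042), (69, 0.016), (75, 0.382), (79, 0.042), (82, 0.017),
              (87, 0.013), (96, 0.04), (98, 0.024)]

# Hypothetical second paper, invented for illustration only (not from the page).
other_paper = [(23, 0.05), (75, 0.40), (98, 0.10)]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse (topicId, topicWeight) lists."""
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb.get(topic, 0.0) for topic, w in wa.items())
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Assumption: an empty vector is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# The topic with the largest weight dominates any such comparison.
dominant_topic = max(this_paper, key=lambda pair: pair[1])

if __name__ == "__main__":
    print("dominant topic:", dominant_topic)
    print("similarity to hypothetical paper:",
          round(cosine_similarity(this_paper, other_paper), 4))
```

Because topic 75 carries by far the largest weight for this paper, any paper that also loads heavily on topic 75 would score high under a measure of this kind.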

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.68376023 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues

Author: Altaf Rahman ; Vincent Ng

Abstract: An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. Experimental results on a set of Switchboard dialogues reveal that (1) incorporating our proposed features into Nissim’s feature set enables our system to achieve state-of-the-art performance on information-status classification, and (2) the resulting information can be used to improve the performance of learning-based coreference resolvers.

2 0.54529238 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study

Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni

Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms cotraining, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp

3 0.34012285 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

Author: Kirk Roberts ; Sanda Harabagiu

Abstract: Metonymic language is a pervasive phenomenon. Metonymic type shifting, or argument type coercion, results in a selectional restriction violation where the argument’s semantic class differs from the class the predicate expects. In this paper we present an unsupervised method that learns the selectional restriction of arguments and enables the detection of argument coercion. This method also generates an enhanced probabilistic resolution of logical metonymies. The experimental results indicate substantial improvements in the detection of coercions and the ranking of metonymic interpretations.

4 0.33831745 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

Author: Kevin Gimpel ; Noah A. Smith

Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and Urdu-English translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.

5 0.33755499 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

Author: Christos Christodoulopoulos ; Sharon Goldwater ; Mark Steedman

Abstract: In this paper we present a fully unsupervised syntactic class induction system formulated as a Bayesian multinomial mixture model, where each word type is constrained to belong to a single class. By using a mixture model rather than a sequence model (e.g., HMM), we are able to easily add multiple kinds of features, including those at both the type level (morphology features) and token level (context and alignment features, the latter from parallel corpora). Using only context features, our system yields results comparable to state-of-the-art, far better than a similar model without the one-class-per-type constraint. Using the additional features provides added benefit, and our final system outperforms the best published results on most of the 25 corpora tested.

6 0.33607718 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

7 0.33522302 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

8 0.33452609 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing

9 0.33328792 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

10 0.33327317 35 emnlp-2011-Correcting Semantic Collocation Errors with L1-induced Paraphrases

11 0.3325685 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge

12 0.33250958 128 emnlp-2011-Structured Relation Discovery using Generative Models

13 0.3324793 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

14 0.33229852 136 emnlp-2011-Training a Parser for Machine Translation Reordering

15 0.33224607 68 emnlp-2011-Hypotheses Selection Criteria in a Reranking Framework for Spoken Language Understanding

16 0.33010921 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search

17 0.32850263 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation

18 0.32795331 65 emnlp-2011-Heuristic Search for Non-Bottom-Up Tree Structure Prediction

19 0.32767066 107 emnlp-2011-Probabilistic models of similarity in syntactic context

20 0.32729286 66 emnlp-2011-Hierarchical Phrase-based Translation Representations