emnlp emnlp2010 emnlp2010-21 knowledge-graph by maker-knowledge-mining

21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications


Source: pdf

Author: Eduardo Blanco ; Dan Moldovan

Abstract: This paper presents a method for the automatic discovery of MANNER relations from text. An extended definition of MANNER is proposed, including restrictions on the sorts of concepts that can be part of its domain and range. The connections with other relations and the lexico-syntactic patterns that encode MANNER are analyzed. A new feature set specialized on MANNER detection is depicted and justified. Experimental results show improvement over previous attempts to extract MANNER. Combinations of MANNER with other semantic relations are also discussed.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract This paper presents a method for the automatic discovery of MANNER relations from text. [sent-3, score-0.244]

2 An extended definition of MANNER is proposed, including restrictions on the sorts of concepts that can be part of its domain and range. [sent-4, score-0.193]

3 The connections with other relations and the lexico-syntactic patterns that encode MANNER are analyzed. [sent-5, score-0.288]

4 A new feature set specialized on MANNER detection is depicted and justified. [sent-6, score-0.156]

5 Combinations of MANNER with other semantic relations are also discussed. [sent-8, score-0.312]

6 1 Introduction Extracting semantic relations from text is an important step towards understanding the meaning of text. [sent-9, score-0.312]

7 Recently, there is a growing interest in text semantics (Màrquez et al. [sent-11, score-0.08]

8 An important semantic relation for many applications is the MANNER relation. [sent-13, score-0.186]

9 For example, quick delivery encodes a MANNER relation, since quick is the manner in which the delivery happened. [sent-15, score-0.74]

10 , and the text Through his spokesman, Obama sent a strong questions, it is useful to identify first the MANNER relations in text. [sent-18, score-0.2]

11 all and He started the company Consider the following example: The company said Mr. [sent-32, score-0.068]

12 assisted by Manfred Gingl, There are two MANNER relations in this sentence: the underlined chunks of text encode the way in which Mr. [sent-37, score-0.144]

13 2 Previous Work The extraction of semantic relations in general has caught the attention of several researchers. [sent-39, score-0.312]

14 Approaches to detect semantic relations usually focus on particular lexical and syntactic patterns. [sent-40, score-0.384]

15 Work has been done on detecting relations within noun phrases (Nulty, 2007), (footnote 1: Penn TreeBank, file wsj 0027, sentence 10.) [sent-45, score-0.292]

16 (2003) propose a set of features to extract MANNER exclusively from adverbial phrases and report a precision of 64. [sent-58, score-0.073]

17 MANNER is a semantic role, and all the works on the extraction of roles (Gildea and Jurafsky, 2002; Giuglea and Moschitti, 2006) extract MANNER as well. [sent-61, score-0.112]

18 The two most used semantic role annotation resources, FrameNet (Baker et al. [sent-64, score-0.193]

19 3 The Semantics of MANNER Relation Traditionally, a semantic relation is defined by stating the kind of connection linking two concepts. [sent-68, score-0.186]

20 For example, MANNER is loosely defined by the PropBank annotation guidelines2 as manner adverbs specify how an action is performed [. [sent-69, score-0.708]

21 ] manner should be used when an adverb be an answer to a question starting with ’how? [sent-72, score-0.739]

22 Nonetheless, to the best of our knowledge, semantic relations have been mostly defined stating only a vague definition. [sent-76, score-0.312]

23 Following (Helbig, 2005), we propose an extended definition for semantic relations, including 2http://verbs. [sent-77, score-0.141]

24 semantic restrictions for its domain and range. [sent-81, score-0.21]

25 These restrictions help decide which relation holds between a given pair of concepts. [sent-82, score-0.168]

26 A relation shall not hold between two concepts unless they belong to its domain and range. [sent-83, score-0.153]

27 1 MANNER Definition Formally, MANNER is represented as MNR(x, y), and it should be read x is the manner in which y happened. [sent-86, score-0.633]

28 In addition, DOMAIN(MNR) and RANGE(MNR) are the sets of sorts of concepts that can be part of the first and second argument. [sent-87, score-0.066]

29 Situations include events and states and can be expressed by verbs or nouns, e. [sent-89, score-0.081]

30 DOMAIN(MNR), namely x, is restricted to qualities (ql), non temporal abstract objects (ntao) and states (st). [sent-92, score-0.226]

31 They do not encode periods or points of time, such as week, or yesterday. [sent-96, score-0.088]

32 Unlike events, states are situations that do not imply a change in the concepts involved. [sent-98, score-0.122]

33 For more details about these semantic classes, refer to (Helbig, 2005). [sent-100, score-0.112]

34 These semantic restrictions on MANNER come after studying previous definitions and manual examination of hundreds of examples. [sent-101, score-0.17]

35 can be answered by with the hammer, and yet the hammer is not the MANNER but the INSTRUMENT of the broke event. [sent-107, score-0.116]

36 Other relations that may be confused with MANNER include AT-LOCATION and AT-TIME, as in [The dog jumped]x [over the fence]y and [John used to go]x [regularly]y. [sent-108, score-0.2]

37 A way of solving this ambiguity is by prioritizing the semantic relations among the possible candidates for a given pair of concepts. [sent-109, score-0.312]

38 This idea has one big disadvantage: the correct detection of MANNER relies on the detection of several other relations, a problem which has proven difficult and thus would unnecessarily introduce errors. [sent-112, score-0.092]

39 Using the proposed extended definition one may discard the false MANNER relations above. [sent-113, score-0.229]

40 Hammer is not a quality, non temporal abstract object or state (hammers are palpable objects), so by definition a relation of the form MNR(the hammer, y) shall not hold. [sent-114, score-0.199]

41 Similarly, fence and week do not fulfill the domain restriction, so MNR(over the fence, y) and MNR(every other week, y) are not valid either. [sent-115, score-0.126]
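
The domain and range check described in sentences 26 to 41 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Sort` enum, its `OTHER` catch-all, the toy `LEXICON`, and the function name `mnr_may_hold` are hypothetical. The excerpt states the domain restriction explicitly (qualities, non temporal abstract objects and states); treating RANGE(MNR) as the set of situations is an assumption made here.

```python
from enum import Enum


class Sort(Enum):
    """Semantic classes named in the text, plus a hypothetical catch-all."""
    SITUATION = "si"
    STATE = "st"
    QUALITY = "ql"
    NTAO = "ntao"    # non temporal abstract object
    OTHER = "other"  # hypothetical: palpable objects, time expressions, etc.


# DOMAIN(MNR) as stated in the text: qualities, ntao and states.
MNR_DOMAIN = frozenset({Sort.QUALITY, Sort.NTAO, Sort.STATE})
# RANGE(MNR): assumed here to be situations; the excerpt does not say so outright.
MNR_RANGE = frozenset({Sort.SITUATION})

# Toy lexicon for illustration only; a real system would derive the class
# of a head word from POS tags, WordNet hypernyms and NE types.
LEXICON = {
    "quick": Sort.QUALITY,
    "delivery": Sort.SITUATION,
    "broke": Sort.SITUATION,
    "hammer": Sort.OTHER,
    "fence": Sort.OTHER,
    "week": Sort.OTHER,
}


def mnr_may_hold(x_head: str, y_head: str) -> bool:
    """Return True if MNR(x, y) passes the domain and range restrictions."""
    x_sort = LEXICON.get(x_head, Sort.OTHER)
    y_sort = LEXICON.get(y_head, Sort.OTHER)
    return x_sort in MNR_DOMAIN and y_sort in MNR_RANGE


print(mnr_may_hold("quick", "delivery"))  # candidate is kept
print(mnr_may_hold("hammer", "broke"))    # candidate is discarded
```

The check only prunes candidates: passing it does not mean a MANNER relation holds, only that the pair is not ruled out by definition.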

42 Using the extended definition, since request is an event (it implies a change), MNR(by request, y) is discarded based on the domain and range restrictions. [sent-123, score-0.068]

43 4 Argument Extraction In order to implement domain and range restrictions, one needs to map words to the four proposed semantic classes: situations (si), states (st), qualities (ql) and non temporal abstract objects (ntao). [sent-124, score-0.47]

44 Then, the head is mapped into a semantic class using three sources of information: POS tags, WordNet hypernyms and named entity (NE) types. [sent-127, score-0.112]

45 The mapping also uses an automatically built list of verbs and nouns that encode events (verb events and noun events). [sent-139, score-0.156]

46 The procedure to map words into semantic classes has been evaluated on a subset of PropBank which was not used to define the mapping. [sent-140, score-0.143]

47 We syntactically parsed the sentences using Charniak’s parser and then performed argument detection by matching the trees to the syntactic patterns depicted in Section 5. [sent-142, score-0.151]

48 After mapping and enforcing domain and range constraints, the argument pairs were reduced to 11,724 (22. [sent-146, score-0.074]

49 The filtering does make mistakes, but the massive pruning mainly filters out potential relations that do not hold: it filters 77. [sent-150, score-0.2]

50 Table 2 shows the syntactic distribution of MANNER relation in PropBank. [sent-153, score-0.104]

51 MANNER relations are encoded in PropBank between a single node in the syntactic tree and a verb. [sent-154, score-0.46]

52 Syntactic annotation comes straight from the Penn TreeBank. [sent-156, score-0.066]

53 We only consider relations between a single node in the syntactic tree and a verb; MANNER relations expressed by trace chains identifying coreference and split arguments are ignored. [sent-157, score-0.228]

54 The vast majority of PPs encode either an AT-TIME or AT-LOCATION. [sent-176, score-0.088]

55 MANNER relations expressed by ADVPs are easier to detect since the adverb is a clear signal. [sent-177, score-0.348]

56 Adverbs ending in -ly are more likely to encode a MANNER. [sent-178, score-0.088]

57 Note that in both cases, the head of the NP contained in the PP encoding MANNER (conditions and music) belongs to ntao (Section 4). [sent-194, score-0.099]

58 Other prepositions, like with and like are more likely to encode a MANNER, but again it is not guaranteed. [sent-195, score-0.088]

59 First, all ADVPs and PPs whose parent node is a VP or S and encode a MANNER according to PropBank are extracted, yielding 3559 and 3499 positive instances respectively. [sent-200, score-0.147]

60 Because PropBank adds semantic role annotation on top of the Penn TreeBank, we have gold syntactic annotation for all instances. [sent-207, score-0.259]

61 2 Selecting features Selected features are derived from previous works on detecting semantic roles, namely (Gildea and Jurafsky, 2002) and the participating systems in specialized on MANNER detection. [sent-209, score-0.241]

62 These new features bring a significant improvement and are dependent on the phrase potentially encoding a MANNER. [sent-212, score-0.069]

63 Experimentation has shown that MANNER relations expressed by an ADVP are easier to detect than the ones expressed by a PP. [sent-213, score-0.242]

64 Some features are typical of semantic role labeling, but features adverb, dictionary and ends-with-ly are specialized to MANNER extraction from ADVPs. [sent-215, score-0.256]

65 The main adverb and verb are retrieved by selecting the last adverb or verb of a sequence. [sent-218, score-0.3]

66 For example, in more strongly, the main adverb is strongly, and in had been rescued the main verb is rescued. [sent-219, score-0.15]
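
The head selection rule in sentences 65 and 66 (take the last adverb or verb of a sequence) and the ends-with-ly feature can be sketched as below. This is an illustrative sketch under assumptions: the input is taken to be a list of (token, tag) pairs with Penn Treebank tags, where adverb tags start with `RB` and verb tags with `VB`; the function names are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]  # (token, Penn Treebank POS tag)


def last_with_tag_prefix(tagged: Tagged, prefix: str) -> Optional[str]:
    """Return the last token whose POS tag starts with the given prefix."""
    for token, tag in reversed(tagged):
        if tag.startswith(prefix):
            return token
    return None


def main_adverb(tagged: Tagged) -> Optional[str]:
    """Main adverb: the last adverb (RB, RBR, RBS) of the sequence."""
    return last_with_tag_prefix(tagged, "RB")


def main_verb(tagged: Tagged) -> Optional[str]:
    """Main verb: the last verb (VB, VBD, VBN, ...) of the sequence."""
    return last_with_tag_prefix(tagged, "VB")


def ends_with_ly(adverb: Optional[str]) -> bool:
    """Binary feature: does the main adverb end in -ly?"""
    return adverb is not None and adverb.lower().endswith("ly")


advp = [("more", "RBR"), ("strongly", "RB")]
vp = [("had", "VBD"), ("been", "VBN"), ("rescued", "VBN")]

print(main_adverb(advp))                # strongly
print(main_verb(vp))                    # rescued
print(ends_with_ly(main_adverb(advp)))  # True
```

Scanning from the right keeps the rule to a single pass and returns `None` when the phrase contains no adverb or verb, so callers can treat the feature as missing.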

67 Dictionary tests the presence of the adverb in a custom built dictionary which contains all lemmas for adverbs in WordNet whose gloss matches the regular expression in a . [sent-220, score-0.175]

68 Some features are typical of semantic role detection; we only provide a justification for the new features added. [sent-238, score-0.157]

69 PPs having quotes are more likely to encode a MANNER, the chunk of text between quotes being the manner. [sent-241, score-0.088]

70 For nouns, only non temporal abstract objects and states can encode a MANNER. [sent-247, score-0.268]

71 In that case, a PP starting with by is much more likely to encode an AGENT than a MANNER. [sent-251, score-0.088]

72 For example, named entity recognition and flags indicating the presence of AT-LOCATION and AT-TIME relations for the verb were tried, but they did not bring any significant improvement. [sent-264, score-0.285]

73 The three specialized features (3, 4 and 5) are responsible for an improvement of . [sent-271, score-0.069]

74 The novel features specialized in MANNER detection from PPs (in bold letters in Table 5) bring an improvement of 0. [sent-277, score-0.156]

75 2 Error Analysis The mapping of words to semantic classes is data-driven and decisions were taken so that the overall accuracy is high. [sent-291, score-0.143]

76 Given We want to [see]y the market from the inside, the underlined PP encodes a MANNER, but the mapping proposed (Table 1) does not map inside to ntao. [sent-293, score-0.103]

77 ], the underlined text encodes a MANNER and yet cohorts is subsumed by social group. [sent-297, score-0.138]

78 The model proposed for MANNER detection makes mistakes as well. [sent-300, score-0.078]

79 For ADVPs, if the main adverb has not been seen during training, chances of detecting MANNER are low. [sent-301, score-0.166]

80 ] (wsj 1017, MANNER and the has ardently 26) even though ardently is present in the dictionary and ends in -ly; For PPs, some errors are due to the PropBank annotation. [sent-311, score-0.07]

81 (wsj 2061, 57), the underas ARG2, even though it does encode a MANNER. [sent-313, score-0.088]

82 Unlike typical semantic role labelers, our features do not include rich syntactic information (e. [sent-344, score-0.187]

83 9 Composing MANNER with PURPOSE MANNER can combine with other semantic relations in order to reveal implicit relations that otherwise would be missed. [sent-357, score-0.512]

84 The basic idea is to compose MANNER with other relations in order to infer another MANNER. [sent-358, score-0.2]

85 A necessary condition for combining MANNER with another relation R is the compatibility of RANGE(MNR) with DOMAIN(R) or RANGE(R) with DOMAIN(MNR). [sent-359, score-0.074]

86 The extended definition (Section 3) allows one to quickly determine if two relations are compatible (Blanco et al. [sent-360, score-0.229]

87 ] the traders [place]y orders [via computers]MNR [to buy the basket of stocks 48). [sent-366, score-0.088]

88 PropBank states the basic annotation between brackets: via computers is the MANNER and to buy the basket [. [sent-367, score-0.216]

89 We propose to combine these two relations in order to come up with the new relation MNR(via computers, buy the basket [. [sent-371, score-0.362]

90 This relation is obvious when reading the sentence, so it is omitted by the writer. [sent-375, score-0.074]

91 However, any semantic representation of text needs as much semantics as possible explicitly stated. [sent-376, score-0.157]
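
The composition step in sentences 84 to 89 can be sketched as follows: when a MANNER and a PURPOSE attach to the same verb, a new MANNER is inferred between the manner and the purpose. This is a sketch under stated assumptions, not the authors' inference procedure. The `Relation` tuple, the label `PRP`, and the argument order PRP(purpose, verb) are hypothetical choices made for the example; the excerpt only shows the resulting relation MNR(via computers, buy the basket [...]).

```python
from typing import List, NamedTuple


class Relation(NamedTuple):
    name: str    # relation label, e.g. "MNR" or "PRP" (labels are illustrative)
    first: str   # first argument (the manner, or the purpose)
    second: str  # second argument (the verb the first argument attaches to)


def compose_manner_with_purpose(relations: List[Relation]) -> List[Relation]:
    """Infer MNR(x, z) from MNR(x, y) and PRP(z, y) sharing the same y."""
    manners = [r for r in relations if r.name == "MNR"]
    purposes = [r for r in relations if r.name == "PRP"]
    return [
        Relation("MNR", m.first, p.first)
        for m in manners
        for p in purposes
        if m.second == p.second
    ]


annotated = [
    Relation("MNR", "via computers", "place"),
    Relation("PRP", "buy the basket", "place"),
]

for inferred in compose_manner_with_purpose(annotated):
    print(inferred)
```

Only PURPOSE is considered as the second relation, so pairs such as MANNER with CAUSE, AT-LOCATION or AT-TIME produce no inference in this sketch.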

92 MANNER does not combine with relations such as CAUSE, AT-LOCATION or AT-TIME. [sent-392, score-0.2]

93 For example, given And they continue [anonymously]x,MNR [attacking]y CIA Director William Webster [for being too accommodating to the committee]z,CAU (wsj 0590, 27), there is no relation between x and z. [sent-393, score-0.074]

94 Our models specialize in detecting the most common pattern encoding MANNER. [sent-403, score-0.123]

95 We believe that each relation or role has its own unique characteristics and capturing them improves performance. [sent-405, score-0.119]

96 We have shown this fact for MANNER by examining examples, considering the kind of arguments that can be part of the domain and range, and considering theoretical works (Hawkins, 1999). [sent-406, score-0.068]

97 The combination of MANNER and PURPOSE opens up a novel paradigm to perform semantic inference. [sent-409, score-0.112]

98 We envision a layer of semantics using a small set of basic semantic relations and inference mechanisms on top of them to obtain more semantics on demand. [sent-410, score-0.402]

99 Combining semantic relations in order to obtain more relations is only one of the possible inference methods. [sent-411, score-0.386]

100 Introduction to the CoNLL-2005 shared task: semantic role labeling. [sent-426, score-0.157]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('manner', 0.633), ('mnr', 0.337), ('pps', 0.243), ('advps', 0.213), ('propbank', 0.205), ('relations', 0.2), ('wsj', 0.121), ('semantic', 0.112), ('adverb', 0.106), ('girju', 0.089), ('hammer', 0.089), ('encode', 0.088), ('relation', 0.074), ('ntao', 0.071), ('specialized', 0.069), ('detecting', 0.06), ('restrictions', 0.058), ('underlined', 0.056), ('vp', 0.054), ('basket', 0.053), ('blanco', 0.053), ('fence', 0.053), ('instrument', 0.053), ('pp', 0.052), ('davidov', 0.05), ('advp', 0.05), ('temporal', 0.05), ('states', 0.047), ('encodes', 0.047), ('non', 0.046), ('detection', 0.046), ('eduardo', 0.046), ('qualities', 0.046), ('szpakowicz', 0.046), ('semantics', 0.045), ('role', 0.045), ('computers', 0.045), ('verb', 0.044), ('discovery', 0.044), ('detect', 0.042), ('depicted', 0.041), ('bring', 0.041), ('adverbial', 0.041), ('domain', 0.04), ('concepts', 0.039), ('ne', 0.039), ('adverbs', 0.039), ('roxana', 0.038), ('objects', 0.037), ('annotation', 0.036), ('gildea', 0.036), ('situations', 0.036), ('holds', 0.036), ('ardently', 0.035), ('atlocation', 0.035), ('barker', 0.035), ('cohorts', 0.035), ('damn', 0.035), ('giuglea', 0.035), ('hawkins', 0.035), ('helbig', 0.035), ('hirano', 0.035), ('specialize', 0.035), ('stronach', 0.035), ('surpluses', 0.035), ('framenet', 0.035), ('buy', 0.035), ('rappoport', 0.035), ('arquez', 0.035), ('prp', 0.035), ('company', 0.034), ('argument', 0.034), ('events', 0.034), ('week', 0.033), ('moldovan', 0.033), ('phrases', 0.032), ('mistakes', 0.032), ('classes', 0.031), ('syntactic', 0.03), ('ionary', 0.03), ('dmitry', 0.03), ('delivery', 0.03), ('legislation', 0.03), ('ql', 0.03), ('sells', 0.03), ('stan', 0.03), ('straight', 0.03), ('charniak', 0.03), ('parent', 0.03), ('definition', 0.029), ('bank', 0.029), ('ari', 0.029), ('expressing', 0.029), ('instances', 0.029), ('palmer', 0.028), ('request', 0.028), ('arguments', 0.028), ('encoding', 0.028), ('broke', 0.027), ('sorts', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999803 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

Author: Eduardo Blanco ; Dan Moldovan

Abstract: This paper presents a method for the automatic discovery of MANNER relations from text. An extended definition of MANNER is proposed, including restrictions on the sorts of concepts that can be part of its domain and range. The connections with other relations and the lexico-syntactic patterns that encode MANNER are analyzed. A new feature set specialized on MANNER detection is depicted and justified. Experimental results show improvement over previous attempts to extract MANNER. Combinations of MANNER with other semantic relations are also discussed.

2 0.098678119 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

Author: Stephen Boxwell ; Dennis Mehay ; Chris Brew

Abstract: In many NLP systems, there is a unidirectional flow of information in which a parser supplies input to a semantic role labeler. In this paper, we build a system that allows information to flow in both directions. We make use of semantic role predictions in choosing a single-best parse. This process relies on an averaged perceptron model to distinguish likely semantic roles from erroneous ones. Our system penalizes parses that give rise to low-scoring semantic roles. To explore the consequences of this we perform two experiments. First, we use a baseline generative model to produce n-best parses, which are then re-ordered by our semantic model. Second, we use a modified version of our semantic role labeler to predict semantic roles at parse time. The performance of this modified labeler is weaker than that of our best full SRL, because it is restricted to features that can be computed directly from the parser’s packed chart. For both experiments, the resulting semantic predictions are then used to select parses. Finally, we feed the selected parses produced by each experiment to the full version of our semantic role labeler. We find that SRL performance can be improved over this baseline by selecting parses with likely semantic roles.

3 0.083916858 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

Author: Hugo Hernault ; Danushka Bollegala ; Mitsuru Ishizuka

Abstract: Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to beforehand create an extensive training corpus, which is a time-consuming and costly process. On the other hand, unlabeled data is abundant and cheap to collect. In this paper, we propose a novel semi-supervised method for discourse relation classification based on the analysis of cooccurring features in unlabeled data, which is then taken into account for extending the feature vectors given to a classifier. Our experimental results on the RST Discourse Treebank corpus and Penn Discourse Treebank indicate that the proposed method brings a significant improvement in classification accuracy and macro-average F-score when small training datasets are used. For instance, with training sets of c.a. 1000 labeled instances, the proposed method brings improvements in accuracy and macro-average F-score up to 50% compared to a baseline classifier. We believe that the proposed method is a first step towards detecting low-occurrence relations, which is useful for domains with a lack of annotated data.

4 0.078431457 20 emnlp-2010-Automatic Detection and Classification of Social Events

Author: Apoorv Agarwal ; Owen Rambow

Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.

5 0.068062991 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

Author: Quang Do ; Dan Roth

Abstract: Determining whether two terms in text have an ancestor relation (e.g. Toyota and car) or a sibling relation (e.g. Toyota and Honda) is an essential component of textual inference in NLP applications such as Question Answering, Summarization, and Recognizing Textual Entailment. Significant work has been done on developing stationary knowledge sources that could potentially support these tasks, but these resources often suffer from low coverage, noise, and are inflexible when needed to support terms that are not identical to those placed in them, making their use as general purpose background knowledge resources difficult. In this paper, rather than building a stationary hierarchical structure of terms and relations, we describe a system that, given two terms, determines the taxonomic relation between them using a machine learning-based approach that makes use of existing resources. Moreover, we develop a global constraint opti- mization inference process and use it to leverage an existing knowledge base also to enforce relational constraints among terms and thus improve the classifier predictions. Our experimental evaluation shows that our approach significantly outperforms other systems built upon existing well-known knowledge sources.

6 0.063707106 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

7 0.061271586 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

8 0.060817339 115 emnlp-2010-Uptraining for Accurate Deterministic Question Parsing

9 0.059913978 59 emnlp-2010-Identifying Functional Relations in Web Text

10 0.058430672 95 emnlp-2010-SRL-Based Verb Selection for ESL

11 0.057289124 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

12 0.054991331 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

13 0.054782167 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

14 0.05315882 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams

15 0.05222182 12 emnlp-2010-A Semi-Supervised Method to Learn and Construct Taxonomies Using the Web

16 0.052005947 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning

17 0.050588526 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

18 0.049462311 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

19 0.049315397 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions

20 0.047647983 68 emnlp-2010-Joint Inference for Bilingual Semantic Role Labeling


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.162), (1, 0.095), (2, 0.058), (3, 0.209), (4, 0.031), (5, -0.028), (6, 0.032), (7, -0.041), (8, 0.117), (9, 0.024), (10, 0.011), (11, -0.085), (12, 0.003), (13, -0.044), (14, 0.029), (15, 0.069), (16, -0.007), (17, 0.079), (18, -0.009), (19, 0.045), (20, 0.089), (21, 0.024), (22, 0.084), (23, 0.028), (24, -0.054), (25, -0.066), (26, -0.019), (27, 0.044), (28, 0.101), (29, 0.043), (30, 0.03), (31, 0.039), (32, 0.048), (33, 0.089), (34, 0.142), (35, 0.035), (36, -0.104), (37, -0.143), (38, 0.044), (39, 0.003), (40, -0.075), (41, -0.009), (42, 0.003), (43, -0.019), (44, -0.195), (45, 0.009), (46, -0.035), (47, -0.408), (48, -0.081), (49, -0.225)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97242606 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

Author: Eduardo Blanco ; Dan Moldovan

Abstract: This paper presents a method for the automatic discovery of MANNER relations from text. An extended definition of MANNER is proposed, including restrictions on the sorts of concepts that can be part of its domain and range. The connections with other relations and the lexico-syntactic patterns that encode MANNER are analyzed. A new feature set specialized on MANNER detection is depicted and justified. Experimental results show improvement over previous attempts to extract MANNER. Combinations of MANNER with other semantic relations are also discussed.

2 0.47105744 59 emnlp-2010-Identifying Functional Relations in Web Text

Author: Thomas Lin ; Mausam ; Oren Etzioni

Abstract: Determining whether a textual phrase denotes a functional relation (i.e., a relation that maps each domain element to a unique range element) is useful for numerous NLP tasks such as synonym resolution and contradiction detection. Previous work on this problem has relied on either counting methods or lexico-syntactic patterns. However, determining whether a relation is functional, by analyzing mentions of the relation in a corpus, is challenging due to ambiguity, synonymy, anaphora, and other linguistic phenomena. We present the LEIBNIZ system that overcomes these challenges by exploiting the synergy between the Web corpus and freelyavailable knowledge resources such as Freebase. It first computes multiple typedfunctionality scores, representing functionality of the relation phrase when its arguments are constrained to specific types. It then aggregates these scores to predict the global functionality for the phrase. LEIBNIZ outperforms previous work, increasing area under the precisionrecall curve from 0.61 to 0.88. We utilize LEIBNIZ to generate the first public repository of automatically-identified functional relations.

3 0.38164002 16 emnlp-2010-An Approach of Generating Personalized Views from Normalized Electronic Dictionaries : A Practical Experiment on Arabic Language

Author: Aida Khemakhem ; Bilel Gargouri ; Abdelmajid Ben Hamadou

Abstract: Electronic dictionaries covering all natural language levels are very relevant for the human use as well as for the automatic processing use, namely those constructed with respect to international standards. Such dictionaries are characterized by a complex structure and an important access time when using a querying system. However, the need of a user is generally limited to a part of such a dictionary according to his domain and expertise level which corresponds to a specialized dictionary. Given the importance of managing a unified dictionary and considering the personalized needs of users, we propose an approach for generating personalized views starting from a normalized dictionary with respect to Lexical Markup Framework LMF-ISO 24613 norm. This approach provides the re-use of already defined views for a community of users by managing their profiles information and promoting the materialization of the generated views. It is composed of four main steps: (i) the projection of data categories controlled by a set of constraints (related to the user‟s profiles), (ii) the selection of values with consistency checking, (iii) the automatic generation of the query‟s model and finally, (iv) the refinement of the view. The proposed approach was con- solidated by carrying out an experiment on an LMF normalized Arabic dictionary. 1

4 0.37984192 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

Author: Hugo Hernault ; Danushka Bollegala ; Mitsuru Ishizuka

Abstract: Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to beforehand create an extensive training corpus, which is a time-consuming and costly process. On the other hand, unlabeled data is abundant and cheap to collect. In this paper, we propose a novel semi-supervised method for discourse relation classification based on the analysis of cooccurring features in unlabeled data, which is then taken into account for extending the feature vectors given to a classifier. Our experimental results on the RST Discourse Treebank corpus and Penn Discourse Treebank indicate that the proposed method brings a significant improvement in classification accuracy and macro-average F-score when small training datasets are used. For instance, with training sets of c.a. 1000 labeled instances, the proposed method brings improvements in accuracy and macro-average F-score up to 50% compared to a baseline classifier. We believe that the proposed method is a first step towards detecting low-occurrence relations, which is useful for domains with a lack of annotated data.

5 0.36711967 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

Author: Amit Goyal ; Ellen Riloff ; Hal Daume III

Abstract: In the 1980s, plot units were proposed as a conceptual knowledge structure for representing and summarizing narrative stories. Our research explores whether current NLP technology can be used to automatically produce plot unit representations for narrative text. We create a system called AESOP that exploits a variety of existing resources to identify affect states and applies “projection rules” to map the affect states onto the characters in a story. We also use corpus-based techniques to generate a new type of affect knowledge base: verbs that impart positive or negative states onto their patients (e.g., being eaten is an undesirable state, but being fed is a desirable state). We harvest these “patient polarity verbs” from a Web corpus using two techniques: co-occurrence with Evil/Kind Agent patterns, and bootstrapping over conjunctions of verbs. We evaluate the plot unit representations produced by our system on a small collection of Aesop’s fables.

6 0.33233118 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

7 0.31738216 121 emnlp-2010-What a Parser Can Learn from a Semantic Role Labeler and Vice Versa

8 0.29475799 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams

9 0.27195954 20 emnlp-2010-Automatic Detection and Classification of Social Events

10 0.27062893 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

11 0.24657494 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

12 0.24548486 81 emnlp-2010-Modeling Perspective Using Adaptor Grammars

13 0.24152532 103 emnlp-2010-Tense Sense Disambiguation: A New Syntactic Polysemy Task

14 0.22936894 60 emnlp-2010-Improved Fully Unsupervised Parsing with Zoomed Learning

15 0.22839223 122 emnlp-2010-WikiWars: A New Corpus for Research on Temporal Expressions

16 0.21703276 95 emnlp-2010-SRL-Based Verb Selection for ESL

17 0.21660326 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

18 0.21235631 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

19 0.21166369 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

20 0.20963465 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.02), (10, 0.021), (12, 0.052), (29, 0.071), (30, 0.02), (32, 0.012), (52, 0.019), (56, 0.048), (62, 0.016), (66, 0.099), (72, 0.054), (76, 0.065), (82, 0.364), (87, 0.023), (89, 0.01)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.8175146 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

Author: Eduardo Blanco ; Dan Moldovan

Abstract: This paper presents a method for the automatic discovery of MANNER relations from text. An extended definition of MANNER is proposed, including restrictions on the sorts of concepts that can be part of its domain and range. The connections with other relations and the lexico-syntactic patterns that encode MANNER are analyzed. A new feature set specialized on MANNER detection is depicted and justified. Experimental results show improvement over previous attempts to extract MANNER. Combinations of MANNER with other semantic relations are also discussed.

2 0.73426557 55 emnlp-2010-Handling Noisy Queries in Cross Language FAQ Retrieval

Author: Danish Contractor ; Govind Kothari ; Tanveer Faruquie ; L V Subramaniam ; Sumit Negi

Abstract: Recent times have seen tremendous growth in mobile-based data services that allow people to use the Short Message Service (SMS) to access these data services. In a multilingual society it is essential that data services that were developed for a specific language also be made accessible through other local languages. In this paper, we present a service that allows a user to query a Frequently-Asked-Questions (FAQ) database built in a local language (Hindi) using noisy SMS English queries. The inherent noise in the SMS queries, along with the language mismatch, makes this a challenging problem. We handle these two problems by formulating the query similarity over FAQ questions as a combinatorial search problem where the search space consists of combinations of dictionary variations of the noisy query and its top-N translations. We demonstrate the effectiveness of our approach on a real-life dataset.

3 0.71842706 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

Author: Longhua Qian ; Guodong Zhou

Abstract: Seed sampling is critical in semi-supervised learning. This paper proposes a clustering-based stratified seed sampling approach to semi-supervised learning. First, various clustering algorithms are explored to partition the unlabeled instances into different strata with each stratum represented by a center. Then, diversity-motivated intra-stratum sampling is adopted to choose the center and additional instances from each stratum to form the unlabeled seed set for an oracle to annotate. Finally, the labeled seed set is fed into a bootstrapping procedure as the initial labeled data. We systematically evaluate our stratified bootstrapping approach in the semantic relation classification subtask of the ACE RDC (Relation Detection and Classification) task. In particular, we compare various clustering algorithms on the stratified bootstrapping performance. Experimental results on the ACE RDC 2004 corpus show that our clustering-based stratified bootstrapping approach achieves the best F1-score of 75.9 on the subtask of semantic relation classification, approaching the one with golden clustering.

4 0.43682262 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

Author: Hugo Hernault ; Danushka Bollegala ; Mitsuru Ishizuka

Abstract: Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to create an extensive training corpus beforehand, which is a time-consuming and costly process. On the other hand, unlabeled data is abundant and cheap to collect. In this paper, we propose a novel semi-supervised method for discourse relation classification based on the analysis of co-occurring features in unlabeled data, which is then taken into account for extending the feature vectors given to a classifier. Our experimental results on the RST Discourse Treebank corpus and Penn Discourse Treebank indicate that the proposed method brings a significant improvement in classification accuracy and macro-average F-score when small training datasets are used. For instance, with training sets of ca. 1000 labeled instances, the proposed method brings improvements in accuracy and macro-average F-score of up to 50% compared to a baseline classifier. We believe that the proposed method is a first step towards detecting low-occurrence relations, which is useful for domains with a lack of annotated data.

5 0.42917013 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

Author: Quang Do ; Dan Roth

Abstract: Determining whether two terms in text have an ancestor relation (e.g. Toyota and car) or a sibling relation (e.g. Toyota and Honda) is an essential component of textual inference in NLP applications such as Question Answering, Summarization, and Recognizing Textual Entailment. Significant work has been done on developing stationary knowledge sources that could potentially support these tasks, but these resources often suffer from low coverage, noise, and are inflexible when needed to support terms that are not identical to those placed in them, making their use as general purpose background knowledge resources difficult. In this paper, rather than building a stationary hierarchical structure of terms and relations, we describe a system that, given two terms, determines the taxonomic relation between them using a machine learning-based approach that makes use of existing resources. Moreover, we develop a global constraint optimization inference process and use it to leverage an existing knowledge base also to enforce relational constraints among terms and thus improve the classifier predictions. Our experimental evaluation shows that our approach significantly outperforms other systems built upon existing well-known knowledge sources.

6 0.41461447 20 emnlp-2010-Automatic Detection and Classification of Social Events

7 0.407148 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

8 0.39604828 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors

9 0.39536956 26 emnlp-2010-Classifying Dialogue Acts in One-on-One Live Chats

10 0.39345512 40 emnlp-2010-Effects of Empty Categories on Machine Translation

11 0.38465306 68 emnlp-2010-Joint Inference for Bilingual Semantic Role Labeling

12 0.38314584 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

13 0.38128555 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text

14 0.38062206 32 emnlp-2010-Context Comparison of Bursty Events in Web Search and Online Media

15 0.38042364 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

16 0.37779623 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

17 0.37559554 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

18 0.37451109 114 emnlp-2010-Unsupervised Parse Selection for HPSG

19 0.37437689 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

20 0.37354302 92 emnlp-2010-Predicting the Semantic Compositionality of Prefix Verbs