acl acl2010 acl2010-94 knowledge-graph by maker-knowledge-mining

94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling


Source: pdf

Author: Hector-Hugo Franco-Penya

Abstract: "Tree SRL system" is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set, where 79% accuracy was obtained.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: "Tree SRL system" is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. [sent-4, score-0.061]

2 The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. [sent-5, score-0.526]

3 The system was tested with the English CoNLL-2009 shared task data set where 79% accuracy was obtained. [sent-6, score-0.111]

4 1 Introduction: Semantic Role Labelling (SRL) is a natural language processing task which deals with semantic analysis at the sentence level. [sent-7, score-0.077]

5 SRL is the task of identifying arguments for a certain predicate and labelling them. [sent-8, score-0.712]

6 The arguments determine events such as "who", "whom", "where", etc., with reference to one predicate. [sent-11, score-0.125]

7 The possible semantic roles are pre-defined for each predicate. [sent-12, score-0.134]

8 It adds a semantic layer to the Penn TreeBank (Marcus et al., 1994) and defines a set of semantic roles for each predicate. [sent-19, score-0.211]

9 It is difficult to define universal semantic roles for all predicates. [sent-20, score-0.134]

10 That is why PropBank defines a set of semantic roles for each possible sense of each predicate (frame) [see a sample of the frame "raise" in the Figure 1 caption]. [sent-21, score-0.579]

11 The following table describes the number of sentences, sub-trees and labels contained in them, and the ratios of sub-trees per sentence and relations per sub-tree. [sent-28, score-0.197]

12 The four most frequent labels in the data set are: A1: 35%, A0: 20.72%. [sent-31, score-0.133]

13 PropBank was originally built using constituent tree structures, but here only the dependency tree structure version was used. [sent-34, score-0.516]

14 Note that dependency tree structures have labels on the arrows. [sent-35, score-0.406]

15 The tree distance algorithm cannot work with these labelled arrows, so each arrow label is moved to the child node as an extra label. [sent-36, score-0.886]
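
As a rough illustration, here is a minimal sketch of that preprocessing step in Python, assuming a recursive tree whose nodes expose hypothetical children, labels and arc_label attributes (none of these names come from the paper):

def push_arc_labels(node):
    # Fold each labelled arrow into its child node as an extra node label,
    # since the tree distance algorithm compares node labels only.
    for child in node.children:
        child.labels = list(child.labels) + [child.arc_label]
        push_arc_labels(child)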

16 The task performed by the Tree SRL system consists of labelling the relations (predicate arguments) which are assumed to be already identified. [sent-37, score-0.386]

17 3 Tree Distance: The tree distance algorithm has already been applied to text entailment (Kouylekov & Magnini, 2005) and question answering (Punyakanok et al., 2004; Emms, 2006) with positive results. [sent-38, score-0.435]

18 The main contribution of this piece of work to the SRL field is the inclusion of the tree distance algorithm into an SRL system, working with tree structures in contrast to the classical "feature extraction" and "classification". [sent-39, score-0.625]

19 Kim et al. (2009) developed a similar system for Information Extraction. [sent-40, score-0.137]

20 A two-sentence sample, in a dependency tree representation. [sent-43, score-0.292]

21 The square node represents the predicate that is going to be analyzed (there can be multiple predicates in a single sentence). [sent-47, score-0.538]

22 Semi-dotted arrows between a square node and an ellipse node represent a semantic relation. [sent-48, score-0.606]

23 This arrow has a semantic tag (A1, A2, A3 or A4). [sent-49, score-0.109]

24 The grey shadow contains all the nodes of the sub-tree for the "rose" predicate. [sent-50, score-0.395]

25 The dotted double arrows between the nodes of both sentences represent the tree distance alignment for both sub-trees. [sent-51, score-0.696]

26 In this particular case every single node is matched. [sent-52, score-0.179]

27 The advantage of this algorithm is that its computational cost is low. [sent-54, score-0.152]

28 The optimal matching depends on the defined atomic cost of matching two nodes. [sent-55, score-0.535]
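
To make the cost definitions in Section 6 concrete, the sketches below share a minimal node representation and atomic-cost interface; every name here is an assumption for illustration, not the paper's code:

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    # Illustrative node attributes (assumed names).
    lemma: str
    pos: str
    deprel: str              # dependency label moved down from the arc
    is_predicate: bool = False
    is_argument: bool = False

def atomic_cost(a: Optional[Node], b: Optional[Node]) -> float:
    # The tree distance algorithm charges atomic_cost(a, b) for matching
    # a with b, atomic_cost(a, None) for deleting a, and
    # atomic_cost(None, b) for inserting b.
    raise NotImplementedError  # concrete systems are sketched in Section 6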

29 4 Tree SRL system architecture: For the training and testing data sets, all possible sub-trees were extracted. [sent-56, score-0.105]

30 Then, using the tree distance algorithm, the test sub-trees are labelled using the training ones. [sent-58, score-0.595]

31 Finally, the predicted labels are assembled onto the original sentence from which the test sub-tree came. [sent-59, score-0.181]

32 A sub-tree extracted from a sentence contains a predicate node, all its argument nodes and all the ancestors up to the first common ancestor of all nodes. [sent-61, score-0.63]

33 The sub-tree extracted from the above sentence will contain the nodes "a1", "a2" and "p", all ancestors of "a1", "a2" and "p" up to the first common one, in this case node "u", which is also included in the sub-tree. [sent-65, score-0.227]

34 None of the white nodes are included in the sub-tree. [sent-66, score-0.158]
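
A minimal sketch of this extraction step, assuming each node carries a hypothetical parent pointer and nodes are hashable:

def ancestors(node):
    # Chain from the node itself up to the root, nearest first.
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain

def extract_subtree(predicate, arguments):
    # Keep the predicate, its argument nodes, and every ancestor up to
    # (and including) their first common ancestor.
    chains = [ancestors(n) for n in [predicate] + list(arguments)]
    common = set(chains[0]).intersection(*(set(c) for c in chains[1:]))
    lca = next(n for n in chains[0] if n in common)  # first common ancestor
    selected = set()
    for chain in chains:
        for node in chain:
            selected.add(node)
            if node is lca:
                break
    return selected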

35 5 Labelling: Suppose that in Figure 1 the bottom sentence is the query, where the grey shadow contains the sub-tree to be labelled, and the top sentence contains the sub-tree sample chosen to label the query. [sent-68, score-0.556]

36 Then, an alignment between the sample sub-tree and the query sub-tree suggests labelling the query sub-tree with A1, A2 and A3; the first two labels are right, but the last one is wrong: the true label A4 is predicted as A3. [sent-69, score-0.668]

37–38 Pseudocode of the labelling procedure (Figure 4): [sent-70, score-1.738] [sent-72, score-0.482]

    Input: a sub-tree T to be labelled
    Input: a list of alignments sorted by ascending tree distance
    Output: a labelled sub-tree

    foreach argument a in T do
        foreach alignment ali in the sorted list do
            if there is a semantic relation ali.function(a) then
                break loop
            end
        end
        label relation p-a with the label of the relation ali.function(a)

39 ali is an alignment between the sub-tree that has to be labelled and a sub-tree in the training dataset. [sent-76, score-0.341]
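
A minimal runnable sketch of that loop, assuming each alignment exposes a mapping dictionary from query nodes to training nodes and each training node knows its relation label (all hypothetical names):

def label_subtree(query_arguments, sorted_alignments):
    # Approach A: for each argument of the query sub-tree, walk the
    # alignments in ascending tree distance and transfer the first
    # available relation label.
    labels = {}
    for arg in query_arguments:
        for ali in sorted_alignments:
            src = ali.mapping.get(arg)   # training node aligned with arg
            if src is not None and src.relation is not None:
                labels[arg] = src.relation
                break                    # nearest informative neighbour wins
    return labels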

40 However, if the whole query is labelled using a single answer sample, the prediction is guaranteed to be consistent (no repeated argument labels). [sent-80, score-0.433]

41 Some possible ways to label the semantic relations using a sorted list of alignments (with each sub-tree of the training data set) are discussed ahead. [sent-81, score-0.421]

42 Each sub-tree contains one predicate and several semantic relations, one for each argument node. [sent-82, score-0.501]

43 1 Treating relations independently: In this sub-section, the neighbouring sub-trees for one relation of a sub-tree T refer to the nearest sub-trees whose match with T produces a match between the two predicate nodes and the two argument nodes. [sent-84, score-0.893]

44 A label from the nearest neighbour(s) can be transferred to T for labelling the relation. [sent-85, score-0.486]

45 The current implementation (Approach A), described in more detail in Figure 4, labels a relation using the first nearest neighbour from a list ordered by ascending tree distance. [sent-86, score-0.749]

46 If there are several nearest neighbours, the first one on the list is used. [sent-87, score-0.166]

47 This is a naive implementation of the k-NN algorithm where, in the case of multiple nearest neighbours, only one is used and the others are ignored. [sent-88, score-0.215]

48 A way to make it deterministic is to extend the parameter "k" when there are multiple cases at the same distance or a tie in the voting (Approach B). [sent-91, score-0.167]
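
A minimal sketch of such a tie-extended vote, assuming candidates are given as (distance, label) pairs (a hypothetical representation):

from collections import Counter

def knn_vote(candidates, k=1):
    # Approach B: extend k whenever several candidates sit at the same
    # distance, and widen the neighbourhood again if the vote is tied.
    candidates = sorted(candidates)          # ascending tree distance
    while k <= len(candidates):
        cutoff = candidates[k - 1][0]
        pool = [label for dist, label in candidates if dist <= cutoff]
        votes = Counter(pool).most_common(2)
        if len(votes) == 1 or votes[0][1] > votes[1][1]:
            return votes[0][0]               # clear majority winner
        k = len(pool) + 1                    # voting tie: widen k
    return None                              # tied even over all candidates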

49 2 Treating relations dependently: In this section, a sample refers to a sub-tree containing all arguments and their labels. [sent-93, score-0.263]

50 Some strategies can lead to non-consistent structures (core argument labels cannot appear twice in the same sub-tree). [sent-95, score-0.28]

51 This per-relation approach does not have any mechanism to keep the whole predicate structure consistent. [sent-97, score-0.394]

52 Another way is to find a sample that contains enough information to label the whole sub-tree (Approach C). [sent-98, score-0.216]

53 The limitation of this model is that the required sample may not exist or the tree distance may be very high, making those samples poor predictors. [sent-100, score-0.491]
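
A minimal sketch of Approach C, reusing the hypothetical alignment interface from the Approach A sketch above:

def label_with_single_sample(query_arguments, sorted_alignments):
    # Approach C: use the nearest alignment that can label every argument
    # at once, which guarantees a consistent label structure.
    for ali in sorted_alignments:
        sources = [ali.mapping.get(arg) for arg in query_arguments]
        if all(s is not None and s.relation is not None for s in sources):
            return {arg: s.relation
                    for arg, s in zip(query_arguments, sources)}
    return None  # no single training sample covers all the arguments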

54 The implemented method (Approach A) indirectly attempts to find a training sample sub-tree which contains labels for all the arguments of the predicate. [sent-101, score-0.366]

55 Tree distances for such samples are expected to be smaller than for sub-trees that do not have the information to label all the desired relations. [sent-102, score-0.373]

56 The system tries to get a consistent structure using a simple algorithm. [sent-103, score-0.061]

57 Only when using the nearest tree does not lead to labelling the whole structure are labels predicted using multiple samples, thereby risking the consistency of the structure. [sent-104, score-0.862]

58 Future implementations will rank possible candidate labels for each relation (probably using multiple samples). [sent-105, score-0.261]

59 A "joint scoring algorithm", which is commonly used (Màrquez et al., 2008), can be applied for consistency checking after finding the rank probability for all the argument labels for the same predicate (Approach D). [sent-106, score-0.588]

60 6 Experiments: the matching cost. The cost of matching two nodes is crucial to the performance of the system. [sent-107, score-0.654]

61 Different atomic measures (ways to measure the cost of matching two nodes) that were tested are explained ahead. [sent-108, score-0.439]

62 Results for experiments using these atomic measures are given in Table 2. [sent-109, score-0.191]

63 1 Binary system: For the binary system, the atomic cost of matching two nodes is one if the POS label or the dependency relation is different; otherwise the cost is zero. [sent-111, score-1.047]

64 The atomic cost of inserting or deleting a node is always one. [sent-112, score-0.605]
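
A minimal sketch of this cost on the hypothetical Node interface from the Tree Distance sketch above:

def binary_cost(a, b):
    # Binary system: zero iff POS and dependency relation both match;
    # insertion or deletion (one side is None) always costs one.
    if a is None or b is None:
        return 1.0
    return 0.0 if (a.pos == b.pos and a.deprel == b.deprel) else 1.0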

65 2 Ternary system: The next intuitive measure is how the system would perform with a ternary cost (ternary system). [sent-115, score-0.423]

66 The atomic cost is 0.5 if either the POS or the dependency relation differs, one if both differ, and zero in all other cases. [sent-116, score-0.657]
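
And a corresponding sketch of the ternary cost (the insert/delete cost is not stated in the text, so one is assumed here, as in the binary system):

def ternary_cost(a, b):
    # Ternary system: 0.5 per differing field (POS, dependency relation),
    # giving 0.0, 0.5 or 1.0 for a match.
    if a is None or b is None:
        return 1.0                      # assumed insert/delete cost
    return 0.5 * ((a.pos != b.pos) + (a.deprel != b.deprel))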

67 For this system, Table 2 shows a very similar accuracy to the binary one. [sent-117, score-0.092]

68 3 Hamming system: The atomic cost of matching two nodes is the sum of the following sub-costs: [sent-119, score-0.796]

69 0.25 if one node is a predicate but the other is not, or if both nodes are predicates but with different lemmas. [sent-126, score-0.745]

70 Note that the sum of all costs cannot be greater than one. [sent-128, score-0.091]

71 4 Predicate match system: The analysis of results for the previous systems shows that the accuracy is higher for the sub-trees that are labelled using sub-trees with the same predicate node. [sent-130, score-0.743]

72 Consequently, this strategy attempts to force the predicate to be the same. [sent-131, score-0.36]

73 In this system, the atomic cost of matching two nodes is the sum of the following sub-costs: [sent-132, score-0.735]

74 a sub-cost if one node is a predicate and the other is not, or if both nodes are predicates but with different lemmas. [sent-136, score-0.745]

75 5 Complex system: This strategy attempts to improve the accuracy by adding an extra label to the argument nodes and using it. [sent-139, score-0.506]

76 The atomic cost of matching two nodes is the sum of the following sub-costs: [sent-140, score-0.735]

77 0.1 for each different label (dependency relation, POS or lemma). [sent-141, score-0.194]

78 0.1 for each pair of different labels (dependency relation, POS or lemma). [sent-143, score-0.222]

79 0.4 if one node is a predicate and the other is not. [sent-145, score-0.505]

80 0.4 if both nodes are predicates and the lemma is different. [sent-147, score-0.312]

81 0.2 if one node is marked as an argument and the other is not, or one node is marked as a predicate and the other is not. [sent-148, score-0.782]

82 The atomic cost of deleting or inserting a node is two if the node is an argument or a predicate node, and one in any other case. [sent-149, score-1.387]
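
A minimal sketch of this cost on the same hypothetical Node interface; the decimal sub-cost values are reconstructed from the text, and the ambiguous "pair of different labels" item is omitted:

def complex_cost(a, b):
    # Complex system: sum of per-label and structural sub-costs.
    if a is None or b is None:
        node = a if b is None else b    # insertion or deletion
        return 2.0 if (node.is_predicate or node.is_argument) else 1.0
    cost = 0.0
    for x, y in ((a.deprel, b.deprel), (a.pos, b.pos), (a.lemma, b.lemma)):
        if x != y:
            cost += 0.1                 # each differing label
    if a.is_predicate != b.is_predicate:
        cost += 0.4                     # predicate vs non-predicate
    elif a.is_predicate and a.lemma != b.lemma:
        cost += 0.4                     # both predicates, lemmas differ
    if a.is_argument != b.is_argument:
        cost += 0.2                     # argument marking mismatch
    return cost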

83 The validation data set is added to the training data set when the system is labelling the evaluation data set. [sent-151, score-0.322]

84 Accuracy is measured as the percentage of semantic labels correctly predicted. [sent-155, score-0.21]

85 The implementation of the Tree SRL system takes several days to run a single experiment. [sent-156, score-0.061]

86 This makes it unviable to use the development data set for adjusting parameters, which is why, for the last three systems (Hamming, Predicate Match and Complex), accuracy over the development data set is not measured. [sent-157, score-0.097]

87 If the complexity of the cost is increased (Ternary), the number of cases with multiple nearest sub-trees is reduced. [sent-160, score-0.159]

88 The output of the system contains only five per cent inconsistent structures (Binary and Ternary), which is lower than expected. [sent-162, score-0.149]

89 Accuracy is higher for the relations where a sub-tree is labelled using a sub-tree sample which has the same predicate node. [sent-166, score-0.757]

90 This resulted in low accuracy for the predicted labels due to multiple nearest neighbours. [sent-170, score-0.39]

91 It is surprising that the Hamming measure reaches higher accuracy than the "predicate match" one, which uses more information, and also that the accuracies for the "Hamming", "Predicate Match" and "Complex" systems are very similar. [sent-171, score-0.224]

92 (Binary system) The accuracy results for multiple languages suggest that the size of the corpora has a strong influence on system performance. [sent-174, score-0.15]

93 This system does not identify arguments and does not perform predicate sense disambiguation. [sent-176, score-0.387]

94 8 Conclusion: The tree distance algorithm has been applied successfully to build an SRL system. [sent-177, score-0.352]

95 Future work will focus on improving the performance of the system by: a) extending the sub-trees so that they contain more contextual information, and b) using the different approaches to labelling semantic relations discussed in Section 5. [sent-178, score-0.307]

96 Also, the system will be expanded to identify arguments using a tree distance algorithm. [sent-179, score-0.538]

97 Evaluating the tasks of identifying the arguments and labelling the relations separately will assist in determining which systems to combine to create a hybrid system with better performance. [sent-180, score-0.511]

98 The CoNLL-2009 shared task: syntactic and semantic dependencies in multiple languages. [sent-191, score-0.116]

99 Fast algorithms for the unit cost editing distance between trees. [sent-239, score-0.328]

100 Simple fast algorithms for the editing distance between trees and related problems. [sent-251, score-0.176]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('predicate', 0.326), ('labelling', 0.261), ('labelled', 0.243), ('srl', 0.243), ('tree', 0.224), ('atomic', 0.191), ('foreach', 0.185), ('node', 0.179), ('nodes', 0.158), ('cost', 0.152), ('ternary', 0.149), ('labels', 0.133), ('distance', 0.128), ('arguments', 0.125), ('nearest', 0.12), ('arrows', 0.112), ('hamming', 0.112), ('emms', 0.111), ('label', 0.105), ('sub', 0.103), ('argument', 0.098), ('matching', 0.096), ('rquez', 0.089), ('relation', 0.089), ('shasha', 0.084), ('predicates', 0.082), ('semantic', 0.077), ('al', 0.076), ('minvalue', 0.074), ('shadow', 0.074), ('trinity', 0.074), ('sample', 0.074), ('ascending', 0.072), ('lemma', 0.072), ('dependency', 0.068), ('kaizhong', 0.065), ('neighbour', 0.065), ('samples', 0.065), ('relations', 0.064), ('match', 0.063), ('system', 0.061), ('usa', 0.061), ('kouylekov', 0.06), ('grey', 0.06), ('square', 0.059), ('roles', 0.057), ('costs', 0.056), ('neighbours', 0.056), ('ali', 0.056), ('query', 0.055), ('sorted', 0.054), ('dublin', 0.053), ('propbank', 0.052), ('accuracy', 0.05), ('alignments', 0.05), ('pos', 0.05), ('entailment', 0.05), ('structures', 0.049), ('editing', 0.048), ('thirteenth', 0.048), ('traversal', 0.048), ('ancestors', 0.048), ('predicted', 0.048), ('end', 0.047), ('adjusting', 0.047), ('list', 0.046), ('punyakanok', 0.045), ('dennis', 0.045), ('frame', 0.045), ('testing', 0.044), ('distances', 0.044), ('inserting', 0.042), ('binary', 0.042), ('alignment', 0.042), ('deleting', 0.041), ('haji', 0.04), ('delete', 0.04), ('multiple', 0.039), ('inconsistent', 0.039), ('martin', 0.038), ('whole', 0.037), ('college', 0.037), ('sum', 0.035), ('edit', 0.034), ('post', 0.034), ('role', 0.034), ('attempts', 0.034), ('answering', 0.033), ('treating', 0.033), ('jan', 0.033), ('bot', 0.032), ('nod', 0.032), ('arrow', 0.032), ('daniele', 0.032), ('neighbouring', 0.032), ('pighin', 0.032), ('rising', 0.032), ('surprising', 0.031), ('consistency', 0.031), ('zhang', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000006 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: "Tree SRL system" is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set, where 79% accuracy was obtained.

2 0.33119589 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

Author: Junhui Li ; Guodong Zhou ; Hwee Tou Ng

Abstract: This paper explores joint syntactic and semantic parsing of Chinese to further improve the performance of both syntactic and semantic parsing, in particular the performance of semantic parsing (in this paper, semantic role labeling). This is done from two levels. Firstly, an integrated parsing approach is proposed to integrate semantic parsing into the syntactic parsing process. Secondly, semantic information generated by semantic parsing is incorporated into the syntactic parsing model to better capture semantic information in syntactic parsing. Evaluation on Chinese TreeBank, Chinese PropBank, and Chinese NomBank shows that our integrated parsing approach outperforms the pipeline parsing approach on n-best parse trees, a natural extension of the widely used pipeline parsing approach on the top-best parse tree. Moreover, it shows that incorporating semantic role-related information into the syntactic parsing model significantly improves the performance of both syntactic parsing and semantic parsing. To our best knowledge, this is the first research on exploring syntactic parsing and semantic role labeling for both verbal and nominal predicates in an integrated way. 1

3 0.28649148 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

Author: Fei Huang ; Alexander Yates

Abstract: Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domain of the training data. Semantic role labeling techniques are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than their performance on newswire text. We investigate techniques for building open-domain semantic role labeling systems that approach the ideal of a train-once, use-anywhere system. We leverage recently-developed techniques for learning representations of text using latent-variable language models, and extend these techniques to ones that provide the kinds of features that are useful for semantic role labeling. In experiments, our novel system reduces error by 16% relative to the previous state of the art on out-of-domain text.

4 0.27041081 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the state-of-the-art systems without applying any feature selection procedure.

5 0.22270389 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

6 0.20746517 207 acl-2010-Semantics-Driven Shallow Parsing for Chinese Semantic Role Labeling

7 0.20139818 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

8 0.19834603 216 acl-2010-Starting from Scratch in Semantic Role Labeling

9 0.19609946 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

10 0.18365477 130 acl-2010-Hard Constraints for Grammatical Function Labelling

11 0.17331029 25 acl-2010-Adapting Self-Training for Semantic Role Labeling

12 0.16145581 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

13 0.16066292 238 acl-2010-Towards Open-Domain Semantic Role Labeling

14 0.15525448 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation

15 0.14976628 158 acl-2010-Latent Variable Models of Selectional Preference

16 0.12407634 30 acl-2010-An Open-Source Package for Recognizing Textual Entailment

17 0.11943664 133 acl-2010-Hierarchical Search for Word Alignment

18 0.11843517 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information

19 0.11602695 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation

20 0.11439568 71 acl-2010-Convolution Kernel over Packed Parse Forest


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.302), (1, 0.076), (2, 0.362), (3, 0.158), (4, -0.007), (5, 0.024), (6, -0.142), (7, 0.004), (8, -0.158), (9, -0.107), (10, 0.01), (11, -0.064), (12, -0.06), (13, -0.021), (14, -0.041), (15, -0.04), (16, 0.006), (17, 0.042), (18, -0.003), (19, 0.02), (20, -0.006), (21, -0.006), (22, 0.07), (23, -0.03), (24, 0.12), (25, 0.025), (26, 0.061), (27, -0.069), (28, 0.041), (29, 0.069), (30, -0.002), (31, -0.022), (32, -0.09), (33, 0.003), (34, -0.084), (35, 0.005), (36, 0.027), (37, -0.05), (38, 0.009), (39, -0.037), (40, -0.031), (41, -0.048), (42, 0.075), (43, 0.031), (44, -0.008), (45, 0.049), (46, 0.068), (47, -0.004), (48, 0.072), (49, 0.065)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9622156 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: "Tree SRL system" is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set, where 79% accuracy was obtained.

2 0.79363441 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

3 0.7931034 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the state-of-the-art systems without applying any feature selection procedure.

4 0.78109843 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

5 0.69448602 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

Author: Junhui Li ; Guodong Zhou ; Hwee Tou Ng

Abstract: This paper explores joint syntactic and semantic parsing of Chinese to further improve the performance of both syntactic and semantic parsing, in particular the performance of semantic parsing (in this paper, semantic role labeling). This is done from two levels. Firstly, an integrated parsing approach is proposed to integrate semantic parsing into the syntactic parsing process. Secondly, semantic information generated by semantic parsing is incorporated into the syntactic parsing model to better capture semantic information in syntactic parsing. Evaluation on Chinese TreeBank, Chinese PropBank, and Chinese NomBank shows that our integrated parsing approach outperforms the pipeline parsing approach on n-best parse trees, a natural extension of the widely used pipeline parsing approach on the top-best parse tree. Moreover, it shows that incorporating semantic role-related information into the syntactic parsing model significantly improves the performance of both syntactic parsing and semantic parsing. To our best knowledge, this is the first research on exploring syntactic parsing and semantic role labeling for both verbal and nominal predicates in an integrated way. 1

6 0.6913842 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

7 0.68471646 216 acl-2010-Starting from Scratch in Semantic Role Labeling

8 0.65862608 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

9 0.57747167 130 acl-2010-Hard Constraints for Grammatical Function Labelling

10 0.57150614 238 acl-2010-Towards Open-Domain Semantic Role Labeling

11 0.55976325 207 acl-2010-Semantics-Driven Shallow Parsing for Chinese Semantic Role Labeling

12 0.53106731 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

13 0.50527453 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation

14 0.47485894 25 acl-2010-Adapting Self-Training for Semantic Role Labeling

15 0.44639298 67 acl-2010-Computing Weakest Readings

16 0.43843243 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

17 0.4292171 127 acl-2010-Global Learning of Focused Entailment Graphs

18 0.42340058 21 acl-2010-A Tree Transducer Model for Synchronous Tree-Adjoining Grammars

19 0.40119353 158 acl-2010-Latent Variable Models of Selectional Preference

20 0.39699531 75 acl-2010-Correcting Errors in a Treebank Based on Synchronous Tree Substitution Grammar


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(14, 0.02), (25, 0.047), (42, 0.012), (59, 0.079), (73, 0.025), (78, 0.502), (80, 0.018), (83, 0.073), (84, 0.025), (98, 0.106)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98509771 228 acl-2010-The Importance of Rule Restrictions in CCG

Author: Marco Kuhlmann ; Alexander Koller ; Giorgio Satta

Abstract: Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and crosslinguistic variation is isolated in the lexicon. In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG). Our result also carries over to a multi-modal extension of CCG.

same-paper 2 0.90955269 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: "Tree SRL system" is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set, where 79% accuracy was obtained.

3 0.82489771 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing

Author: Amit Dubey

Abstract: Probabilistic models of sentence comprehension are increasingly relevant to questions concerning human language processing. However, such models are often limited to syntactic factors. This paper introduces a novel sentence processing model that consists of a parser augmented with a probabilistic logic-based model of coreference resolution, which allows us to simulate how context interacts with syntax in a reading task. Our simulations show that a Weakly Interactive cognitive architecture can explain data which had been provided as evidence for the Strongly Interactive hypothesis.

4 0.76705968 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

Author: Alan Ritter ; Mausam Mausam ; Oren Etzioni

Abstract: The computation of selectional preferences, the admissible argument values for a relation, is a well-known NLP task with broad applicability. We present LDA-SP, which utilizes LinkLDA (Erosheva et al., 2004) to model selectional preferences. By simultaneously inferring latent topics and topic distributions over relations, LDA-SP combines the benefits of previous approaches: like traditional class-based approaches, it produces human-interpretable classes describing each relation's preferences, but it is competitive with non-class-based methods in predictive power. We compare LDA-SP to several state-of-the-art methods, achieving an 85% increase in recall at 0.9 precision over mutual information (Erk, 2007). We also evaluate LDA-SP's effectiveness at filtering improper applications of inference rules, where we show substantial improvement over Pantel et al.'s system (Pantel et al., 2007).

5 0.69330686 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal

Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a wordsense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.

6 0.60574228 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

7 0.59970462 158 acl-2010-Latent Variable Models of Selectional Preference

8 0.58786315 21 acl-2010-A Tree Transducer Model for Synchronous Tree-Adjoining Grammars

9 0.58061528 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

10 0.56828076 23 acl-2010-Accurate Context-Free Parsing with Combinatory Categorial Grammar

11 0.54150724 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation

12 0.53901201 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

13 0.53367901 67 acl-2010-Computing Weakest Readings

14 0.52992535 130 acl-2010-Hard Constraints for Grammatical Function Labelling

15 0.52447307 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

16 0.52042913 107 acl-2010-Exemplar-Based Models for Word Meaning in Context

17 0.51495814 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

18 0.51418817 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

19 0.51076406 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information

20 0.51033342 71 acl-2010-Convolution Kernel over Packed Parse Forest