acl acl2010 acl2010-198 knowledge-graph by maker-knowledge-mining

198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning


Source: pdf

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. [sent-5, score-0.07]

2 This paper describes predicate argument structure analysis using transformation-based learning. [sent-6, score-0.603]

3 An advantage of transformation-based learning is the readability of learned rules. [sent-7, score-0.095]

4 A disadvantage is that the rule extraction procedure is time-consuming. [sent-8, score-0.169]

5 As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method. [sent-10, score-0.742]

6 1 Introduction Automatic predicate argument structure analysis (PAS) provides information on “who did what to whom” and is an important base tool for such various text processing tasks as machine translation, information extraction (Hirschman et al. [sent-11, score-0.603]

7 , 1999), question answering (Narayanan and Harabagiu, 2004; Shen and Lapata, 2007), and summarization (Melli et al. [sent-12, score-0.034]

8 Most recent approaches to predicate argument structure analysis are statistical machine learning methods such as support vector machines (SVMs)(Pradhan et al. [sent-14, score-0.603]

9 For predicate argument structure analysis, we have the following representative large corpora: FrameNet (Fillmore et al. [sent-16, score-0.603]

10 The construction of such large corpora is strenuous and time-consuming. [sent-28, score-0.072]

11 Additionally, maintaining high annotation consistency in such corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. [sent-29, score-0.09]

12 For example, in Japanese corpora, distinguishing true dative (or indirect object) arguments from time-type arguments is difficult because arguments of both types are often accompanied by the ‘ni’ case marker. [sent-30, score-0.857]

13 An advantage of such learning methods is that we can easily interpret the learned model. [sent-33, score-0.038]

14 The tasks in most previous research are such simple tagging tasks as part-of-speech tagging, insertion and deletion of parentheses in syntactic parsing, and chunking (Brill, 1995; Brill, 1993; Ramshaw and Marcus, 1995). [sent-34, score-0.036]

15 TBL can be slow, so we proposed an incremental training method to speed up the training. [sent-36, score-0.077]

16 From the experiments, we identified the annotation tendencies in the dataset. [sent-38, score-0.107]

17 Section 2 describes Japanese predicate structure, our graph expression of it, and our improved method. [sent-40, score-0.393]

18 2 Predicate argument structure and graph transformation learning First, we illustrate the structure of a Japanese sentence in Fig. [sent-42, score-0.433]

19 In Japanese, we can divide a sentence into bunsetsu phrases (BP). [sent-44, score-0.066]

20 Since predicates and arguments in Japanese are mainly annotated on the head content word in each BP, we can deal with BPs as candidates of predicates or arguments. [sent-50, score-0.242]

21 In our experiments, we mapped each BP to an argument candidate node of graphs. [sent-51, score-0.53]

22 We also mapped each predicate to a predicate node. [sent-52, score-0.632]

23 Each predicate-argument relation is identified by an edge between a predicate and an argument, and the argument type is mapped to the edge label. [sent-53, score-0.892]

24 In our experiments below, we defined five argument types: nominative (subjective), accusative (direct objective), dative (indirect objective), time, and location. [sent-54, score-0.795]

25 We use five transformation types: a) add or b) delete a predicate node, c) add or d) delete an edge between a predicate and an argument node, and e) change a label (= an argument type) to another label (Fig. [sent-55, score-1.413]

26 We interpret an edge labeled t between a predicate node and an argument candidate node as meaning that the predicate and the argument have a t-type relationship. [sent-57, score-1.5]
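The graph expression and the five transformation types above can be sketched as follows; this is an illustrative reconstruction, not the authors' code, and the class and method names are invented:

```python
# Illustrative sketch (not the authors' code) of the paper's graph expression:
# bunsetsu phrases (BPs) become argument candidate nodes, predicates become
# predicate nodes, and labeled edges carry the argument type.
class PASGraph:
    def __init__(self, n_bps):
        self.n_bps = n_bps   # argument candidate nodes, indexed 0..n_bps-1
        self.preds = set()   # BP indices currently marked as predicate nodes
        self.edges = {}      # (pred_bp, arg_bp) -> argument type label

    # The five transformation types:
    def add_pred(self, bp):                    # a) add a predicate node
        self.preds.add(bp)

    def delete_pred(self, bp):                 # b) delete a predicate node
        self.preds.discard(bp)                 #    and its outgoing edges
        self.edges = {k: v for k, v in self.edges.items() if k[0] != bp}

    def add_edge(self, pred, arg, label):      # c) add a labeled edge
        self.edges[(pred, arg)] = label

    def delete_edge(self, pred, arg):          # d) delete an edge
        self.edges.pop((pred, arg), None)

    def change_label(self, pred, arg, label):  # e) change an argument type
        if (pred, arg) in self.edges:
            self.edges[(pred, arg)] = label

# three BPs; BP 2 becomes a predicate taking BP 0 as a nominative argument,
# later relabeled as a time argument
g = PASGraph(3)
g.add_pred(2)
g.add_edge(2, 0, "nominative")
g.change_label(2, 0, "time")
```

An edge (2, 0) with label "time" then encodes "BP 0 is a time-type argument of the predicate in BP 2".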

27 Below we explain our learning strategy when we directly adapt the learning method to our graph expression of PASs. [sent-59, score-0.109]

28 After pre-processing, each text is mapped to an initial graph. [sent-61, score-0.064]

29 In our experiments, the initial graph has argument candidate nodes with corresponding BPs and no predicate nodes or edges. [sent-62, score-0.804]

30 [Figure 2: Transform types: (a) Add Pred Node, (b) Delete Pred Node, (c) Add Edge, (d) Delete Edge, (e) Change Edge Label; figure content garbled in extraction] [sent-63, score-0.037]

31 Next, comparing the current graphs with the gold standard graph structure in the training data, we find the different statuses of the nodes and edges among the graphs. [sent-67, score-0.334]

32 We extract such transformation rule candidates as ‘add node’ and ‘change edge label’ with constraints, including ‘the corresponding BP includes a verb’ and ‘the argument candidate and the predicate node have a syntactic dependency. [sent-68, score-1.034]

33 ’ The extractions are executed based on the rule templates given in advance. [sent-69, score-0.189]

34 Each extracted rule is evaluated for the current graphs, and error reduction is calculated. [sent-70, score-0.204]

35 The best rule for the reduction is selected as a new rule and inserted at the bottom of the current rule list. [sent-71, score-0.454]

36 The new rule is applied to the current graphs, which are thereby transformed into new graph structures. [sent-72, score-0.192]

37 This procedure is iterated until the total errors for the gold standard graphs become zero. [sent-73, score-0.159]

38 When the process is completed, the rule list is the final model. [sent-74, score-0.125]

39 In the test phase, we iteratively transform nodes and edges in the graphs mapped from the test data, based on rules in the model like decision lists. [sent-75, score-0.421]

40 The final graph after all rule applications is the system output of the PAS. [sent-76, score-0.225]
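The training loop (greedy selection of the rule with the largest error reduction, repeated until the gold graphs are matched) and the decision-list-style test phase can be sketched as below. The graph encoding (a frozenset of items such as `("pred", i)` or `("edge", pred, arg, label)`) and all names are assumptions for illustration:

```python
# Sketch of the TBL loop described above. A graph is a frozenset of items;
# a rule is a function mapping a graph to a transformed graph.
def errors(graph, gold):
    """Number of node/edge differences between a graph and its gold standard."""
    return len(graph.symmetric_difference(gold))

def tbl_train(graphs, golds, candidates):
    rule_list = []
    while sum(errors(g, y) for g, y in zip(graphs, golds)) > 0:
        # evaluate the error reduction of every candidate rule (the O(MN) step)
        best, best_gain = None, 0
        for rule in candidates:
            gain = sum(errors(g, y) - errors(rule(g), y)
                       for g, y in zip(graphs, golds))
            if gain > best_gain:
                best, best_gain = rule, gain
        if best is None:            # no candidate reduces errors; stop
            break
        rule_list.append(best)      # insert at the bottom of the rule list
        graphs = [best(g) for g in graphs]   # transform the current graphs
    return rule_list, graphs

def tbl_apply(graph, rule_list):
    """Test phase: apply the rules in order, like a decision list."""
    for rule in rule_list:
        graph = rule(graph)
    return graph

# toy example with one training graph
gold = frozenset({("pred", 1), ("edge", 1, 0, "nom")})
candidates = [
    lambda g: g | {("pred", 1)},
    lambda g: g | {("edge", 1, 0, "nom")},
    lambda g: g | {("edge", 1, 0, "time")},  # harmful rule, never selected
]
rule_list, trained = tbl_train([frozenset()], [gold], candidates)
```

The harmful third candidate has negative error reduction on this example, so the greedy selection never picks it.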

41 In this procedure, the calculation of error reduction is very time-consuming, because we have to check many constraints from the candidate rules for all training samples. [sent-77, score-0.31]

42 The calculation order is O(MN), where M is the number of articles and N is the number of candidate rules. [sent-78, score-0.181]

43 Additionally, an edge rule usually has three types of constraints: ‘pred node constraint,’ ‘argument candidate node constraint,’ and ‘relation constraint. [sent-79, score-0.48]

’ The number of combinations and extracted rules is much larger than that of the node rules. [sent-80, score-0.317]

45 Ramshaw and Marcus proposed an efficient index-based method for the calculation of error reduction (Ramshaw and Marcus, 1994). [sent-82, score-0.208]

46 However, in PAS tasks, we need to check the exclusiveness of the argument types (for example, a predicate argument structure does not have two nominative arguments), and we cannot directly use the method. [sent-83, score-1.104]
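The exclusiveness constraint mentioned here can be sketched as a small predicate over the edge set; the encoding (a dict from (predicate, argument) pairs to labels) is an invented illustration:

```python
# Sketch of the exclusiveness check: a predicate argument structure must not
# carry two arguments of an exclusive type such as nominative.
def violates_exclusiveness(edges, pred, label="nominative"):
    """True if `pred` has two or more arguments with the exclusive label."""
    labels = [lab for (p, _arg), lab in edges.items() if p == pred]
    return labels.count(label) > 1

ok_edges = {(1, 0): "nominative", (1, 2): "accusative"}
bad_edges = {(1, 0): "nominative", (1, 2): "nominative"}
```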

47 Jijkoun and de Rijke only used candidate rules that occur in the current and gold standard graphs and used SVM learning for constraint checks (Jijkoun and de Rijke, 2007). [sent-85, score-0.394]

48 This method is effective for achieving high accuracy; however, it loses the readability of the rules. [sent-86, score-0.057]

49 To reduce the calculations while maintaining readability, we propose an incremental method and describe its procedure below. [sent-88, score-0.177]

50 In this procedure, we first have PAS graphs for only one article. [sent-89, score-0.115]

51 After the total errors among the current and gold standard graphs become zero in the article, we proceed to the next article. [sent-90, score-0.115]

52 For the next article, we first adapt the rules learned from the previous article. [sent-91, score-0.155]

53 After that, we extract new rules from the two articles until the total errors for the articles become zero. [sent-92, score-0.251]

54 Additionally, we count the number of rule occurrences and only use rule candidates that occur more than once, because most rules seen only once harm the accuracy. [sent-94, score-0.437]

55 We save and use these rules again if the occurrence increases. [sent-95, score-0.117]
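The incremental order described in this passage can be sketched as follows; the occurrence-count filter on rule candidates is omitted for brevity, and the graph encoding (frozensets of items) and all names are invented:

```python
# Sketch of the incremental procedure: finish article 1, then apply the
# learned rules to article 2 before extracting further rules over both
# articles, and so on.
def errs(graph, gold):
    return len(graph.symmetric_difference(gold))

def greedy_rule(graphs, golds, candidates):
    """The candidate with the largest total error reduction, or None."""
    best, best_gain = None, 0
    for rule in candidates:
        gain = sum(errs(g, y) - errs(rule(g), y)
                   for g, y in zip(graphs, golds))
        if gain > best_gain:
            best, best_gain = rule, gain
    return best

def incremental_train(articles, golds, candidates):
    rule_list, seen = [], []
    for init, gold in zip(articles, golds):
        g = init
        for rule in rule_list:       # first adapt the rules learned so far
            g = rule(g)
        seen.append(g)
        seen_golds = golds[:len(seen)]
        # extract further rules until the articles seen so far are error-free
        while any(errs(x, y) for x, y in zip(seen, seen_golds)):
            rule = greedy_rule(seen, seen_golds, candidates)
            if rule is None:
                break
            rule_list.append(rule)
            seen = [rule(x) for x in seen]
    return rule_list
```

Because each greedy search only scores the articles processed so far, early rules are found on far fewer samples than in the batch loop, which is where the speed-up comes from.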

56 1 Experimental Settings We used the articles in the NAIST Text Corpus version 1. [sent-97, score-0.067]

57 , 2007) based on the Mainichi Shinbun Corpus (Mainichi, 1995), which were taken from news articles published in the Japanese Mainichi Shinbun newspaper. [sent-99, score-0.067]

58 We used articles published on January 1st for training examples and on January 3rd for test examples. [sent-100, score-0.067]

59 Three original argument types are defined in the NAIST Text Corpus: nominative (or subjective), accusative (or direct object), and dative (or indirect object). [sent-101, score-0.899]

60 For evaluation of the difficult annotation cases, we also added annotations for ‘time’ and ‘location’ types by ourselves. [sent-102, score-0.081]

61 After that, we adapted our incremental learning to the training data. [sent-105, score-0.077]

62 We used two constraint templates in Tables 2 and 3 for predicate nodes and edges when extracting the rule candidates. [sent-106, score-0.607]

63 In comparison, the original TBL cannot even extract one rule in a day. [sent-114, score-0.125]

64 The results of predicate and argument type predictions are shown in Table 4. [sent-115, score-0.672]

65 2) BSs containing a topic case marker (wa) are predicted to be nominative. [sent-117, score-0.049]

66 3) When the word sense category of the head word in a BS, taken from a Japanese ontology, belongs to a ‘time’ or ‘location’ category, the BS is predicted to be a ‘time’ or ‘location’ type argument. [sent-118, score-0.118]
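The two baseline heuristics above might look like this in code; the dict fields `marker` (case marker of the BS) and `category` (ontology category of the head word) are hypothetical stand-ins for the features described:

```python
# Sketch of baseline rules 2) and 3); field names are invented.
def baseline_label(bs):
    if bs.get("marker") == "wa":              # topic marker -> nominative
        return "nominative"
    if bs.get("category") in ("time", "location"):
        return bs["category"]                 # ontology category match
    return None                               # no baseline prediction
```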

67 This indicates that the first twenty rules are the main contributors to overall performance. [sent-123, score-0.234]

68 Next, we show the performance for every argument type in Table 5. [sent-125, score-0.388]

69 In this table, the performance of the dative and time types improved, even though they are difficult to distinguish. [sent-127, score-0.289]

70 On the other hand, the performance of the location type argument in our system is very low. [sent-128, score-0.482]

71 Our method learns rules so as to decrease the errors. [Table 2: Predicate node constraint templates; table content garbled in extraction] [sent-129, score-0.326]

72 To confirm this, we performed an experiment in which we gave the rules of the baseline system to our system as initial rules and subsequently performed our incremental learning. [sent-134, score-0.311]

73 The performance for the location type argument improved drastically. [sent-136, score-0.482]

74 However, the total performance on the arguments was below that of the original TBL. [sent-137, score-0.096]

75 Moreover, the ‘Base + TBL’ performance surpassed the baseline system. [sent-138, score-0.033]

76 This indicates that our system learned a reasonable model. [sent-139, score-0.038]

77 Finally, we show some interesting extracted rules in Fig. [sent-140, score-0.117]

78 The first rule stands for an expression where the sentence ends with the performance of something, which is often seen in Japanese newspaper articles. [sent-142, score-0.167]

79 The second and third rules represent that annotators of this dataset tend to annotate the time type when the semantic category of the argument is time, even if the argument looks like the dative. [sent-143, score-0.892]

80 [Figure 4: Examples of extracted rules; Table 5: Results for every argument type; content garbled in extraction]

81 4 Conclusion We performed experiments for Japanese predicate argument structure analysis using transformation-based learning and extracted rules that indicate the tendencies annotators have. [sent-169, score-0.935]

82 We presented an incremental procedure to speed up rule extraction. [sent-170, score-0.246]

83 The performance of PAS analysis improved, especially for the dative and time types, which are difficult to distinguish. [sent-171, score-0.242]

84 Moreover, when time expressions are attached to the ‘ni’ case, the learned model showed a tendency to annotate them as dative arguments in the used corpus. [sent-172, score-0.458]

85 Our method has potential for dative predictions and interpreting the tendencies of annotator inconsistencies. [sent-173, score-0.333]

86 Building a large lexical databank which provides deep semantics. [sent-188, score-0.033]

87 Description of SQUASH, the SFU question answering summary handler for the DUC-2005 summarization task. [sent-232, score-0.034]

88 Exploring the statistical derivation of transformational rule sequences for part-of-speech tagging. [sent-259, score-0.125]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('argument', 0.319), ('predicate', 0.284), ('japanese', 0.262), ('dative', 0.242), ('tbl', 0.227), ('bp', 0.154), ('bps', 0.151), ('bss', 0.151), ('mainichi', 0.151), ('pprreedd', 0.151), ('nominative', 0.135), ('pas', 0.133), ('naist', 0.133), ('rule', 0.125), ('rules', 0.117), ('ramshaw', 0.115), ('graphs', 0.115), ('shinbun', 0.114), ('accusative', 0.099), ('arguments', 0.096), ('location', 0.094), ('tendencies', 0.091), ('jijkoun', 0.085), ('node', 0.083), ('iida', 0.081), ('reduction', 0.079), ('edge', 0.078), ('incremental', 0.077), ('cabocha', 0.076), ('gda', 0.076), ('melli', 0.076), ('nnooddee', 0.076), ('transformationbased', 0.076), ('bs', 0.074), ('type', 0.069), ('articles', 0.067), ('graph', 0.067), ('brill', 0.067), ('bunsetsu', 0.066), ('candidate', 0.064), ('mapped', 0.064), ('templates', 0.064), ('constraint', 0.062), ('hirschman', 0.061), ('indirect', 0.057), ('readability', 0.057), ('january', 0.057), ('maintaining', 0.056), ('predicates', 0.056), ('nombank', 0.054), ('transform', 0.053), ('calculation', 0.05), ('kawahara', 0.049), ('predicted', 0.049), ('annotators', 0.048), ('transformation', 0.047), ('types', 0.047), ('ni', 0.046), ('lance', 0.046), ('pw', 0.046), ('kudo', 0.046), ('kyoto', 0.044), ('meyers', 0.044), ('yuji', 0.044), ('pradhan', 0.044), ('procedure', 0.044), ('narayanan', 0.043), ('annotate', 0.042), ('expression', 0.042), ('delete', 0.041), ('tendency', 0.04), ('learned', 0.038), ('fillmore', 0.038), ('cd', 0.037), ('subjective', 0.037), ('edges', 0.037), ('ab', 0.036), ('happen', 0.036), ('chunking', 0.036), ('jp', 0.036), ('nodes', 0.035), ('propbank', 0.035), ('ralph', 0.034), ('annotation', 0.034), ('answering', 0.034), ('candidates', 0.034), ('lied', 0.033), ('surpassed', 0.033), ('nsc', 0.033), ('guments', 0.033), ('wooters', 0.033), ('interrelated', 0.033), ('eword', 0.033), ('kashani', 0.033), ('paring', 0.033), ('databank', 0.033), ('irl', 0.033), ('aaki', 0.033), ('adaptations', 0.033), ('cially', 0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

2 0.26781395 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the stateof-the-art systems without applying any feature selection procedure.

3 0.25554651 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

4 0.20548297 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

5 0.20139818 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: ―Tree SRL system‖ is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set where 79% accuracy was obtained. 1

6 0.18852983 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

7 0.17377754 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

8 0.1657971 158 acl-2010-Latent Variable Models of Selectional Preference

9 0.15063609 216 acl-2010-Starting from Scratch in Semantic Role Labeling

10 0.1428615 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

11 0.11707387 130 acl-2010-Hard Constraints for Grammatical Function Labelling

12 0.11311346 238 acl-2010-Towards Open-Domain Semantic Role Labeling

13 0.10633658 118 acl-2010-Fine-Grained Tree-to-String Translation Rule Extraction

14 0.10611675 127 acl-2010-Global Learning of Focused Entailment Graphs

15 0.10574024 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation

16 0.098630734 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

17 0.092841648 207 acl-2010-Semantics-Driven Shallow Parsing for Chinese Semantic Role Labeling

18 0.091708899 163 acl-2010-Learning Lexicalized Reordering Models from Reordering Graphs

19 0.090099014 146 acl-2010-Improving Chinese Semantic Role Labeling with Rich Syntactic Features

20 0.089817628 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.245), (1, 0.099), (2, 0.254), (3, 0.116), (4, -0.0), (5, 0.017), (6, -0.087), (7, -0.03), (8, -0.096), (9, -0.097), (10, 0.086), (11, 0.022), (12, -0.022), (13, 0.027), (14, 0.143), (15, -0.053), (16, 0.053), (17, 0.048), (18, -0.087), (19, 0.002), (20, -0.025), (21, 0.058), (22, 0.077), (23, -0.183), (24, 0.108), (25, 0.092), (26, 0.065), (27, -0.072), (28, 0.149), (29, -0.003), (30, -0.02), (31, -0.012), (32, -0.111), (33, -0.062), (34, -0.133), (35, 0.075), (36, 0.045), (37, 0.032), (38, -0.042), (39, -0.056), (40, -0.003), (41, -0.095), (42, 0.018), (43, -0.022), (44, 0.039), (45, 0.047), (46, 0.047), (47, -0.081), (48, 0.082), (49, -0.053)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96275276 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

2 0.82740176 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

Author: Yotaro Watanabe ; Masayuki Asahara ; Yuji Matsumoto

Abstract: In predicate-argument structure analysis, it is important to capture non-local dependencies among arguments and interdependencies between the sense of a predicate and the semantic roles of its arguments. However, no existing approach explicitly handles both non-local dependencies and semantic dependencies between predicates and arguments. In this paper we propose a structured model that overcomes the limitation of existing approaches; the model captures both types of dependencies simultaneously by introducing four types of factors including a global factor type capturing non-local dependencies among arguments and a pairwise factor type capturing local dependencies between a predicate and an argument. In experiments the proposed model achieved competitive results compared to the stateof-the-art systems without applying any feature selection procedure.

3 0.79487348 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

Author: Matthew Gerber ; Joyce Chai

Abstract: Despite its substantial coverage, NomBank does not account for all withinsentence arguments and ignores extrasentential arguments altogether. These arguments, which we call implicit, are important to semantic processing, and their recovery could potentially benefit many NLP applications. We present a study of implicit arguments for a select group of frequent nominal predicates. We show that implicit arguments are pervasive for these predicates, adding 65% to the coverage of NomBank. We demonstrate the feasibility of recovering implicit arguments with a supervised classification model. Our results and analyses provide a baseline for future work on this emerging task.

4 0.72936267 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling

Author: Hector-Hugo Franco-Penya

Abstract: ―Tree SRL system‖ is a Semantic Role Labelling supervised system based on a tree-distance algorithm and a simple k-NN implementation. The novelty of the system lies in comparing the sentences as tree structures with multiple relations instead of extracting vectors of features for each relation and classifying them. The system was tested with the English CoNLL-2009 shared task data set where 79% accuracy was obtained. 1

5 0.65860981 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

Author: Omri Abend ; Ari Rappoport

Abstract: The core-adjunct argument distinction is a basic one in the theory of argument structure. The task of distinguishing between the two has strong relations to various basic NLP tasks such as syntactic parsing, semantic role labeling and subcategorization acquisition. This paper presents a novel unsupervised algorithm for the task that uses no supervised models, utilizing instead state-of-the-art syntactic induction algorithms. This is the first work to tackle this task in a fully unsupervised scenario.

6 0.54705942 130 acl-2010-Hard Constraints for Grammatical Function Labelling

7 0.50009406 216 acl-2010-Starting from Scratch in Semantic Role Labeling

8 0.48995581 158 acl-2010-Latent Variable Models of Selectional Preference

9 0.47682378 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

10 0.4464018 238 acl-2010-Towards Open-Domain Semantic Role Labeling

11 0.43893373 67 acl-2010-Computing Weakest Readings

12 0.43122792 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

13 0.4056159 127 acl-2010-Global Learning of Focused Entailment Graphs

14 0.39448798 84 acl-2010-Detecting Errors in Automatically-Parsed Dependency Relations

15 0.38474926 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

16 0.38228893 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

17 0.37936357 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

18 0.36657611 203 acl-2010-Rebanking CCGbank for Improved NP Interpretation

19 0.36446887 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

20 0.35903186 118 acl-2010-Fine-Grained Tree-to-String Translation Rule Extraction


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(25, 0.055), (33, 0.013), (39, 0.015), (42, 0.03), (44, 0.02), (54, 0.279), (59, 0.065), (72, 0.018), (73, 0.052), (78, 0.079), (80, 0.026), (83, 0.098), (84, 0.036), (98, 0.123)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.76431644 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning

Author: Hirotoshi Taira ; Sanae Fujita ; Masaaki Nagata

Abstract: Maintaining high annotation consistency in large corpora is crucial for statistical learning; however, such work is hard, especially for tasks containing semantic elements. This paper describes predicate argument structure analysis using transformation-based learning. An advantage of transformation-based learning is the readability of learned rules. A disadvantage is that the rule extraction procedure is time-consuming. We present incremental-based, transformation-based learning for semantic processing tasks. As an example, we deal with Japanese predicate argument analysis and show some tendencies of annotators for constructing a corpus with our method.

2 0.73141456 177 acl-2010-Multilingual Pseudo-Relevance Feedback: Performance Study of Assisting Languages

Author: Manoj Kumar Chinnakotla ; Karthik Raman ; Pushpak Bhattacharyya

Abstract: In a previous work of ours Chinnakotla et al. (2010) we introduced a novel framework for Pseudo-Relevance Feedback (PRF) called MultiPRF. Given a query in one language called Source, we used English as the Assisting Language to improve the performance of PRF for the source language. MulitiPRF showed remarkable improvement over plain Model Based Feedback (MBF) uniformly for 4 languages, viz., French, German, Hungarian and Finnish with English as the assisting language. This fact inspired us to study the effect of any source-assistant pair on MultiPRF performance from out of a set of languages with widely different characteristics, viz., Dutch, English, Finnish, French, German and Spanish. Carrying this further, we looked into the effect of using two assisting languages together on PRF. The present paper is a report of these investigations, their results and conclusions drawn therefrom. While performance improvement on MultiPRF is observed whatever the assisting language and whatever the source, observations are mixed when two assisting languages are used simultaneously. Interestingly, the performance improvement is more pronounced when the source and assisting languages are closely related, e.g., French and Spanish.

3 0.71238303 32 acl-2010-Arabic Named Entity Recognition: Using Features Extracted from Noisy Data

Author: Yassine Benajiba ; Imed Zitouni ; Mona Diab ; Paolo Rosso

Abstract: Building an accurate Named Entity Recognition (NER) system for languages with complex morphology is a challenging task. In this paper, we present research that explores the feature space using both gold and bootstrapped noisy features to build an improved highly accurate Arabic NER system. We bootstrap noisy features by projection from an Arabic-English parallel corpus that is automatically tagged with a baseline NER system. The feature space covers lexical, morphological, and syntactic features. The proposed approach yields an improvement of up to 1.64 F-measure (absolute).

4 0.70519376 164 acl-2010-Learning Phrase-Based Spelling Error Models from Clickthrough Data

Author: Xu Sun ; Jianfeng Gao ; Daniel Micol ; Chris Quirk

Abstract: This paper explores the use of clickthrough data for query spelling correction. First, large amounts of query-correction pairs are derived by analyzing users' query reformulation behavior encoded in the clickthrough data. Then, a phrase-based error model that accounts for the transformation probability between multi-term phrases is trained and integrated into a query speller system. Experiments are carried out on a human-labeled data set. Results show that the system using the phrase-based error model outperforms cantly its baseline systems. 1 signifi-

5 0.57633513 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models

Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal

Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a wordsense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.

6 0.57442206 158 acl-2010-Latent Variable Models of Selectional Preference

7 0.57195812 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese

8 0.57115257 71 acl-2010-Convolution Kernel over Packed Parse Forest

9 0.56772202 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

10 0.56664217 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition

11 0.56637347 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses

12 0.56239676 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification

13 0.56131601 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

14 0.5582819 248 acl-2010-Unsupervised Ontology Induction from Text

15 0.55679327 107 acl-2010-Exemplar-Based Models for Word Meaning in Context

16 0.55589712 130 acl-2010-Hard Constraints for Grammatical Function Labelling

17 0.55564684 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates

18 0.55563581 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans

19 0.554775 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

20 0.55477411 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information