emnlp emnlp2011 emnlp2011-67 knowledge-graph by maker-knowledge-mining

67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization


Source: pdf

Author: Lin Sun ; Anna Korhonen

Abstract: Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. Also Natural Language Processing (NLP) applications benefit from taxonomic classifications because they vary in terms of the granularity they require from a classification. We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it yields state-of-the-art performance also on a flat test set. We demonstrate how the method can be used to acquire novel classifications as well as to extend existing ones on the basis of some prior knowledge about the classification.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. [sent-4, score-0.774]

2 We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. [sent-6, score-0.211]

3 Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it yields state-of-the-art performance also on a flat test set. [sent-7, score-0.497]

4 1 Introduction A variety of verb classifications have been built to support NLP tasks. [sent-9, score-0.219]

5 Because verbs change their meaning and behaviour across domains, it is important to be able to tune existing classifications as well to build novel ones in a cost-effective manner, when required. [sent-21, score-0.234]

6 This graph-based, probabilistic clustering algorithm has some clear advantages over AGG (e.g. [sent-38, score-0.211]

7 it delays the decision on a verb’s cluster membership at any level until a full graph is available, minimising the problem of error propagation) and it has been shown to perform better than several other hierarchical clustering methods in recent comparisons (Yu et al. [sent-40, score-0.57]

8 number of clusters to be produced for each level of the hierarchy). [sent-47, score-0.253]

9 The second is the addition of soft constraints to guide the clustering performance (Vlachos et al. [sent-49, score-0.275]

10 When evaluated on a flat clustering task, HGFC outperforms AGG and performs very similarly to the best flat clustering method reported on the same test set (Sun and Korhonen, 2009). [sent-55, score-0.678]

11 When evaluated on a hierarchical task, HGFC performs considerably better than AGG at all levels of gold standard classification. [sent-56, score-0.261]

12 The unconstrained version can be used to acquire novel classifications from scratch while the constrained version can be used to extend existing ones with additional class members, classes and levels of hierarchy. [sent-59, score-0.553]

13 We used three gold standards (and corresponding test sets) extracted from these resources in our experiments: T1: The first gold standard is a flat gold standard which includes 13 classes appearing in Levin’s original taxonomy (Stevenson and Joanis, 2003). [sent-71, score-0.415]

14 We included this small gold standard in our experiments so that we could compare the flat version of our method against previously published methods. [sent-72, score-0.223]

15 T2: The second gold standard is a large, hierarchical gold standard which we extracted from VerbNet as follows: 1) We removed all the verbs that have less than 1000 occurrences in our corpus. [sent-75, score-0.343]

16 T3: The third gold standard is a subset of T2 where singular classes (top level classes which do not divide into subclasses) are removed. [sent-94, score-0.236]

17 This gold standard was constructed to enable proper evaluation of the constrained version of HGFC (introduced in the following section) where we want to compare the impact of constraints across several levels of classification. [sent-95, score-0.277]

18 We adopt for our experiments a set of features which have performed well in recent verb clustering works: A: Subcategorization frames (SCFs) and their relative frequencies with individual verbs. [sent-100, score-0.298]
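
As a concrete illustration of feature set A, here is a minimal sketch (assuming NumPy) that builds a verb-by-SCF matrix of relative frequencies; the verbs, frame inventory and counts are illustrative toy values, not the paper's data.

```python
# Feature set A (sketch): rows are verbs, columns are SCF types, and each
# row is normalised into relative frequencies. All names/counts are toy.
import numpy as np

verbs = ["give", "put", "sleep"]                 # hypothetical verbs
scf_counts = np.array([[30.0, 10.0,  0.0],       # per-verb counts of SCF types
                       [ 5.0, 20.0,  5.0],
                       [ 0.0,  2.0, 18.0]])
X = scf_counts / scf_counts.sum(axis=1, keepdims=True)  # relative frequencies
print(dict(zip(verbs, X.round(2))))
```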

19 2 Clustering We introduce the agglomerative clustering (AGG) and Hierarchical Graph Factorization Clustering (HGFC) methods in the following two subsections, respectively. [sent-112, score-0.258]

20 The subsequent two subsections present our extensions to HGFC: (i) automatically determining the cluster structure and (ii) adding soft constraints to guide clustering performance. [sent-113, score-0.37]

21 Agglomerative clustering: AGG is a method which treats each verb as a singleton cluster and then successively merges the two closest clusters until all the clusters have been merged into one. [sent-116, score-0.773]

22 We followed previous verb clustering works and cut the AGG hierarchy manually. [sent-125, score-0.386]
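
A minimal sketch of this AGG baseline using SciPy, assuming a precomputed verb feature matrix X (e.g. the SCF frequencies above); the average-linkage/cosine choice and the cut points are illustrative, not necessarily the paper's exact configuration.

```python
# AGG (sketch): start from singleton clusters, successively merge the two
# closest clusters, then cut the hierarchy at chosen cluster counts.
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree

X = np.random.default_rng(0).random((20, 50))      # 20 verbs x 50 features (toy)
Z = linkage(X, method="average", metric="cosine")  # the full merge tree
levels = cut_tree(Z, n_clusters=[3, 6, 12])        # manual cuts into 3 levels
print(levels.shape)                                # (20, 3): one label per level
```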

23 For example, in order to group clusters representing Levin classes 9. [sent-132, score-0.244]

24 2 Hierarchical Graph Factorization Clustering Our new method HGFC derives a probabilistic bipartite graph from the similarity matrix (Yu et al. [sent-157, score-0.233]

25 The local and global clustering structures are learned via the random walk properties of the graph. [sent-159, score-0.211]

26 Firstly, there is no error propagation because the decision on a verb’s membership at any level is delayed until the full bipartite graph is available and until a tree structure can be extracted from it by aggregating probabilistic information from all the levels. [sent-161, score-0.253]

27 Secondly, the bipartite graph enables the construction of a hierarchical structure without any intermediate classes. [sent-162, score-0.241]

28 W can be encoded by an undirected graph G (Figure 1(a)), where the verbs are mapped to vertices and W_ij is the edge weight between vertices i and j. [sent-168, score-0.257]

29 The graph G and the cluster structure can be represented by a bipartite graph K(V, U). [sent-169, score-0.29]

30 The matrix B denotes the n × m adjacency matrix, with b_ip being the connection weight between the vertex v_i and the cluster u_p. [sent-173, score-0.254]

31 Thus, B represents the connections between clusters at an upper and lower level of clustering. [sent-174, score-0.298]

32 A flat clustering algorithm can be induced by computing B. [sent-175, score-0.339]

33 The bipartite graph K also induces a similarity (W′) between v_i and v_j: w′_ij = Σ_{p=1..m} b_ip b_jp / λ_p = (B Λ^{-1} B^T)_ij, where Λ = diag(λ_1, …, λ_m). [sent-176, score-0.231]
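
A small numeric check of this identity (a sketch assuming NumPy, with λ_p taken as the degree of cluster u_p, i.e. the column sum of B, which matches the random-walk reading of the bipartite graph):

```python
# Induced similarity from the bipartite adjacency B (n verbs x m clusters):
# W' = B Λ^{-1} B^T, with λ_p = Σ_i b_ip (an assumption, see above).
import numpy as np

B = np.random.default_rng(1).random((6, 3))  # toy 6x3 adjacency matrix
lam = B.sum(axis=0)                          # λ_p as cluster degrees
W_induced = B @ np.diag(1.0 / lam) @ B.T     # (B Λ^{-1} B^T)_ij

assert np.allclose(W_induced, W_induced.T)   # symmetric by construction
```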

34 This is different from the linkage method where only the data from two clusters are considered. [sent-196, score-0.222]

35 Given the cluster similarity p(up, uq), we can construct a new graph G1 (Figure 1(d)) with the clusters U as vertices. [sent-197, score-0.392]

36 Algorithm 1 HGFC (Yu et al., 2006). Require: N verbs V, number of clusters m_l for L levels. Compute the similarity matrix W_0 from V; build the graph G_0 from W_0, and set m_0 ← n. For l = 1, 2, …, L: factorize G_{l-1} to obtain the bipartite graph K_l with the adjacency matrix B_l (eq. …). [sent-201, score-0.736]

37 …(D_{l-1} B_l)_{ip} (5). This method might not extract a consistent tree structure, because the cluster membership at the lower level does not constrain the upper level membership. [sent-217, score-0.326]
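
The loop of Algorithm 1 can be sketched as follows. The inner factorization here is a generic multiplicative-update symmetric NMF used as a stand-in for the exact divergence minimisation of Yu et al. (2006), and the coarser similarity is rebuilt as B^T D^{-1} B per the random-walk construction; cluster counts per level are illustrative.

```python
# HGFC level loop (sketch): factorize the current similarity into a
# non-negative bipartite adjacency, then build the next, coarser
# similarity over the clusters, and repeat for each level.
import numpy as np

def factorize(W, m, n_iter=200, eps=1e-9):
    """Fit W ≈ H H^T with H >= 0 (n x m); a generic symmetric-NMF stand-in."""
    H = np.random.default_rng(0).random((W.shape[0], m)) + eps
    for _ in range(n_iter):
        H *= (W @ H) / (H @ H.T @ H + eps)   # multiplicative update rule
    return H

def hgfc_levels(W0, cluster_sizes):
    """One pass over the levels; returns the adjacency-like H_l per level."""
    W, Hs = W0, []
    for m in cluster_sizes:                  # e.g. [12, 6, 3]: coarser upward
        H = factorize(W, m)
        Hs.append(H)
        d = H.sum(axis=1) + 1e-9             # vertex degrees D at this level
        W = H.T @ (H / d[:, None])           # next similarity: H^T D^{-1} H
    return Hs
```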

38 For example, where two verbs were grouped together at a lower level, they could belong to separate clusters at an upper level. [sent-219, score-0.337]

39 The new algorithm starts from the top level bipartite graph, and generates consistent labels for each level by taking into account the tree constraints set at upper levels. [sent-221, score-0.296]

40 Algorithm 2 Tree extraction algorithm for HGFC. Require: Given N, (B_l, m_l) on each level for L levels. On the top level L, collect the labels T_L (eq. …). [sent-222, score-0.211]
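
A hedged reading of Algorithm 2 as code: top-level labels are fixed first, and each lower level may only choose clusters whose strongest link in the between-level adjacency points to the verb's already-assigned parent. The membership matrices P and adjacencies B are assumed inputs (e.g. from eq. 5 and Algorithm 1); the paper's exact aggregation and tie-breaking may differ.

```python
# Top-down consistent tree extraction (sketch): forbid, at each lower
# level, any cluster whose parent disagrees with the verb's parent label.
import numpy as np

def extract_tree(P, B):
    """P[l]: verbs x clusters membership at level l (bottom-up order);
    B[l]: adjacency from level-l clusters to level-(l+1) clusters."""
    L = len(P)
    labels = [None] * L
    labels[L - 1] = P[L - 1].argmax(axis=1)    # top level: unconstrained argmax
    for l in range(L - 2, -1, -1):             # walk down the hierarchy
        parent_of = B[l].argmax(axis=1)        # each cluster's parent cluster
        scores = P[l].astype(float)
        bad = parent_of[None, :] != labels[l + 1][:, None]
        scores[bad] = -np.inf                  # rule out inconsistent choices
        labels[l] = scores.argmax(axis=1)
    return labels
```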

41 3 Automatically determining the number of clusters for HGFC: HGFC needs the number of levels and clusters at each level as input. [sent-232, score-0.528]

42 So the vertices at level l induce a similarity matrix of verbs after t-hop transitions. [sent-243, score-0.313]

43 The decaying similarity function captures the different scales of clustering structure in the data (Azran and Ghahramani, 2006b). [sent-244, score-0.298]

44 The upper levels would have a smaller number of clusters which represent a more global structure. [sent-245, score-0.32]

45 The number of levels and clusters at each level can thus be learned automatically. [sent-247, score-0.338]

46 We therefore propose a method that uses the decaying similarity function to learn the hierarchical clustering structure. [sent-248, score-0.409]

47 One simple modification to algorithm 1 is to set the number of clusters at level l (m_l) to be m_{l-1} − 1. [sent-249, score-0.253]

48 We start by treating each verb as a cluster at the bottom level. [sent-253, score-0.212]

49 The increasingly decaying similarity causes many clusters to have 0 members, especially at lower levels; these empty clusters are pruned in the tree extraction. [sent-255, score-0.347]

50 It is useful for learning novel verb classifications from scratch. [sent-259, score-0.219]

51 VerbNet) it may be desirable to guide the clustering performance on the basis of information that is already known. [sent-262, score-0.211]

52 We propose a constrained version of HGFC which makes use of labels at the bottom level to learn upper level classifications. [sent-263, score-0.303]

53 We modify the similarity matrix W as follows: if two verbs have different labels (l_i ≠ l_j), the similarity between them is decreased by a factor a, with a < 1. [sent-266, score-0.247]
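
This constraint is straightforward to express directly; a minimal sketch under the stated rule, with toy labels and a = 0.5 as the illustrative factor:

```python
# Soft constraints (sketch): scale down W for verb pairs whose
# bottom-level labels differ (l_i != l_j) by a factor a < 1.
import numpy as np

def constrain_similarity(W, labels, a=0.5):
    labels = np.asarray(labels)
    diff = labels[:, None] != labels[None, :]  # True where l_i != l_j
    W_c = W.copy()
    W_c[diff] *= a                             # decrease cross-class similarity
    return W_c
```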

54 4 Experimental evaluation: We applied the clustering methods introduced in section 3 to the test sets described in section 2 and evaluated them both quantitatively and qualitatively, as described in the subsequent sections. [sent-271, score-0.211]

55 ACC is the proportion of members of dominant clusters DOM-CLUSTi within all classes ci. [sent-274, score-0.279]

56 We used normalized mutual information (NMI) and F-Score (F) to evaluate hierarchical clustering results on T2 and T3. [sent-296, score-0.322]

57 The normalized variant of mutual information (MI) enables the comparison of clustering with different cluster numbers (Manning et al. [sent-299, score-0.306]

58 The number of verbs in a cluster K that take this class is denoted by nprevalent(K). [sent-304, score-0.225]

59 Therefore, we only report NMI when the number of classes in clustering and gold-standard is substantially different. [sent-307, score-0.265]
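
The two quantitative measures can be sketched as below: ACC follows the dominant-cluster definition given above, and NMI uses scikit-learn's implementation, which is one of several possible normalisations and not necessarily the paper's exact variant.

```python
# ACC (sketch): for each gold class c_i, count the members that fall into
# its dominant cluster DOM-CLUST_i, then divide by the number of verbs.
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def acc(cluster_labels, gold_labels):
    cluster_labels = np.asarray(cluster_labels)
    gold_labels = np.asarray(gold_labels)
    correct = 0
    for c in np.unique(gold_labels):              # each gold class c_i
        in_class = cluster_labels[gold_labels == c]
        correct += np.bincount(in_class).max()    # members in DOM-CLUST_i
    return correct / len(gold_labels)

clusters = [0, 0, 0, 1, 1, 2, 2, 2]               # toy clustering
gold     = [0, 0, 1, 1, 1, 2, 2, 0]               # toy gold classes
print(acc(clusters, gold))                        # 0.75
print(normalized_mutual_info_score(gold, clusters))
```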

60 Finally, we supplemented quantitative evaluation with qualitative evaluation of clusters produced by different methods. [sent-308, score-0.227]

61 The number of clusters (K) and levels (L) were inferred automatically for HGFC as described in section 3. [sent-312, score-0.275]

62 We also compared HGFC against the best reported clustering method on T1 to date, that of spectral clustering by Sun and Korhonen (2009). [sent-326, score-0.464]

63 When HGFC is forced to produce a flat clustering (a one-level tree only), it achieves an F of 52. [sent-331, score-0.437]

64 In the first set of experiments, we pre-defined the tree structure for HGFC by setting L to 3 and K at each level to be the K in the hierarchical gold standard. [sent-334, score-0.274]

65 The hierarchy produced by AGG was cut into 3 levels according to Ks in the gold standard. [sent-335, score-0.238]

66 In these tables, Nc is the number of clusters in HGFC clustering while Nl is the number of classes in the gold standard (the two do not always correspond perfectly because a few clusters have zero members). [sent-338, score-0.71]

67 Table 3 shows the results of both unconstrained and constrained versions of HGFC and those of AGG on the test set T3 (where singular classes are removed to enable proper evaluation of the constrained method). [sent-358, score-0.32]

68 Recall that the constrained version of HGFC learns the upper levels of classification on the basis of soft constraints set at the bottom level, as described earlier in section 3. [sent-360, score-0.359]

69 Yet, the relatively high result across all levels shows that the constrained version of HGFC can be employed as a useful method to extend the hierarchical structure of known classifications. [sent-364, score-0.298]

70 The scale of the clustering structure is more complete here than in the gold standards. [sent-375, score-0.276]

71 In the table, Nc indicates the number of clusters in the inferred tree, while Nl indicates the closest match to the number of classes in the gold standard. [sent-376, score-0.309]

72 This evaluation is not fully reliable because the match between the gold standard and the clustering is poor at some levels of hierarchy. [sent-377, score-0.361]

73 3 Qualitative evaluation To gain a better insight into the performance of HGFC, we conducted further qualitative analysis of the clusters the two versions of this method produced for T3. [sent-380, score-0.227]

74 We focussed on the top level of 11 clusters (in the evaluation against the hierarchical gold standard, see table 3) as the impact of soft constraints is the weakest for the constrained method at this level. [sent-381, score-0.621]

75 1) so that most clusters simply group lower level classes and their members together. [sent-386, score-0.342]

76 Three nearly clean clusters were produced which only include sub-classes of the same class (e.g. [sent-387, score-0.218]

77 In contrast, none of the clusters produced by the unconstrained HGFC represent a single VerbNet class. [sent-400, score-0.312]

78 A thorough Levin-style investigation, especially of the unconstrained method, would require looking at shared diathesis alternations between cluster members. [sent-410, score-0.295]

79 However, the analysis we conducted confirmed that the constrained method could indeed be used for extending known classifications, while the unconstrained method is more suitable for acquiring novel classifications from scratch. [sent-412, score-0.326]

80 5 Discussion and conclusion: We have introduced a new graph-based method, HGFC, for hierarchical verb clustering which avoids some of the problems (e.g. [sent-415, score-0.409]

81 The second involves adding soft constraints to guide the clustering performance, which is useful when aiming to extend existing classifications. [sent-420, score-0.275]

82 On a flat test set (T1), the unconstrained version of HGFC outperforms AGG and performs very similarly to the best current flat clustering method (spectral clustering) evaluated on the same dataset. [sent-422, score-0.619]

83 On the hierarchical test sets (T2 and T3), the unconstrained and constrained versions of HGFC outperform AGG clearly at all levels of classification. [sent-423, score-0.39]

84 The constrained version of HGFC detects the missing hierarchy from the existing gold standards with high accuracy. [sent-424, score-0.225]

85 When the number of clusters and levels is learned automatically, the unconstrained method produces a multi-level hierarchy. [sent-425, score-0.397]

86 Finally, the results from our qualitative evaluation show that both constrained and unconstrained versions of HGFC are capable of learning valuable novel information not included in the gold standards. [sent-427, score-0.296]

87 The previous work on Levin-style verb classification has mostly focussed on flat classifications using methods suitable for flat clustering (Schulte im Walde, 2006; Joanis et al. [sent-428, score-0.859]

88 However, some works have employed hierarchical clustering as a method to infer flat clustering. [sent-433, score-0.45]

89 For example, Schulte im Walde and Brew (2002) employed AGG to initialize the KMeans clustering for German verbs. [sent-434, score-0.269]

90 Stevenson and Joanis (2003) used AGG for flat clustering on T1. [sent-436, score-0.339]

91 AGG was also used by Ferrer (2004) who performed hierarchical clustering of 514 Spanish verbs. [sent-439, score-0.322]

92 The results were evaluated against a hierarchical gold standard resembling that of Levin’s classification in English (Vázquez et al. [sent-440, score-0.235]

93 Hierarchical clustering has also been performed for the related task of semantic verb classification. [sent-444, score-0.298]

94 The qualitative evaluation shows that the resulting clusters are very fine-grained. [sent-449, score-0.227]

95 Schulte im Walde (2008) performed hierarchical clustering of German verbs using human verb association as features and AGG as a method. [sent-450, score-0.569]

96 They focussed on two small collections of 56 and 104 verbs and evaluated the result against a flat gold standard extracted from GermaNet (Kunze and Lemnitzer, 2002) and German FrameNet (Erk et al. [sent-451, score-0.351]

97 It is likely that different levels of clustering require more or less specific selectional preferences. [sent-464, score-0.296]

98 One way to obtain the latter is hierarchical clustering of relevant noun data. [sent-465, score-0.322]

99 As for the constrained version of HGFC, we will conduct a larger scale experiment on the VerbNet data to investigate what kind of upper level hierarchy it can propose for this resource (which currently has over 100 top level classes). [sent-467, score-0.331]

100 Finally, we plan to compare HGFC to other hierarchical clustering methods that are relatively new to NLP but have proved promising in other fields, including Bayesian Hierarchical Clustering (Heller and Ghahramani, 2005; Teh et al. [sent-468, score-0.322]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('hgfc', 0.635), ('agg', 0.35), ('clustering', 0.211), ('clusters', 0.19), ('joanis', 0.155), ('classifications', 0.132), ('levin', 0.131), ('flat', 0.128), ('unconstrained', 0.122), ('hierarchical', 0.111), ('schulte', 0.104), ('verbnet', 0.103), ('verbs', 0.102), ('stevenson', 0.1), ('korhonen', 0.097), ('cluster', 0.095), ('nmi', 0.091), ('verb', 0.087), ('levels', 0.085), ('issn', 0.078), ('walde', 0.078), ('constrained', 0.072), ('yu', 0.067), ('gold', 0.065), ('bipartite', 0.065), ('graph', 0.065), ('scfs', 0.065), ('vlachos', 0.065), ('level', 0.063), ('sun', 0.061), ('matrix', 0.061), ('vi', 0.059), ('im', 0.058), ('hierarchy', 0.058), ('focussed', 0.056), ('classes', 0.054), ('azran', 0.052), ('radj', 0.052), ('isbn', 0.051), ('agglomerative', 0.047), ('vertices', 0.045), ('upper', 0.045), ('decaying', 0.045), ('wij', 0.044), ('anna', 0.042), ('similarity', 0.042), ('spectral', 0.042), ('acc', 0.04), ('zoubin', 0.04), ('bip', 0.039), ('scf', 0.039), ('uq', 0.039), ('soft', 0.039), ('taxonomy', 0.038), ('ml', 0.038), ('brew', 0.037), ('qualitative', 0.037), ('factorization', 0.036), ('members', 0.035), ('tree', 0.035), ('classification', 0.033), ('linkage', 0.032), ('parameterized', 0.032), ('bottom', 0.03), ('kipper', 0.03), ('version', 0.03), ('cut', 0.03), ('shi', 0.029), ('german', 0.028), ('class', 0.028), ('suzanne', 0.028), ('bl', 0.028), ('taxonomic', 0.028), ('vj', 0.028), ('framenet', 0.026), ('tl', 0.026), ('ghahramani', 0.026), ('ip', 0.026), ('alternations', 0.026), ('briscoe', 0.026), ('eaghdha', 0.026), ('style', 0.026), ('arik', 0.026), ('azquez', 0.026), ('bassiou', 0.026), ('diag', 0.026), ('diathesis', 0.026), ('djs', 0.026), ('ferrer', 0.026), ('heller', 0.026), ('hubert', 0.026), ('kunze', 0.026), ('minimise', 0.026), ('preiss', 0.026), ('vinh', 0.026), ('vpl', 0.026), ('whijt', 0.026), ('zapirain', 0.026), ('constraints', 0.025), ('membership', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization

Author: Lin Sun ; Anna Korhonen

Abstract: Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. Also Natural Language Processing (NLP) applications benefit from taxonomic classifications because they vary in terms of the granularity they require from a classification. We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it yields state-of-the-art performance also on a flat test set. We demonstrate how the method can be used to acquire novel classifications as well as to extend existing ones on the basis of some prior knowledge about the classification.

2 0.15189674 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we present a method for unsupervised semantic role induction which we formalize as a graph partitioning problem. Argument instances of a verb are represented as vertices in a graph whose edge weights quantify their role-semantic similarity. Graph partitioning is realized with an algorithm that iteratively assigns vertices to clusters based on the cluster assignments of neighboring vertices. Our method is algorithmically and conceptually simple, especially with respect to how problem-specific knowledge is incorporated into the model. Experimental results on the CoNLL 2008 benchmark dataset demonstrate that our model is competitive with other unsupervised approaches in terms of F1 whilst attaining significantly higher cluster purity.

3 0.12649404 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Angel X. Chang ; Daniel Jurafsky

Abstract: We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags — requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus — 0.7% higher than using gold tags.

4 0.08967559 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

Author: Bryan Rink ; Sanda Harabagiu

Abstract: This paper presents a generative model for the automatic discovery of relations between entities in electronic medical records. The model discovers relation instances and their types by determining which context tokens express the relation. Additionally, the valid semantic classes for each type of relation are determined. We show that the model produces clusters of relation trigger words which better correspond with manually annotated relations than several existing clustering techniques. The discovered relations reveal some of the implicit semantic structure present in patient records.

5 0.08751528 128 emnlp-2011-Structured Relation Discovery using Generative Models

Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum

Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine the effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.

6 0.05495831 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

7 0.054770626 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context

8 0.054139558 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

9 0.052389618 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

10 0.049861196 50 emnlp-2011-Evaluating Dependency Parsing: Robust and Heuristics-Free Cross-Annotation Evaluation

11 0.046196789 88 emnlp-2011-Linear Text Segmentation Using Affinity Propagation

12 0.046126176 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels

13 0.045304727 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

14 0.044433452 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling

15 0.043975256 107 emnlp-2011-Probabilistic models of similarity in syntactic context

16 0.04361048 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking

17 0.04267627 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics

18 0.041689828 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model

19 0.040270723 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

20 0.038878772 12 emnlp-2011-A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.152), (1, -0.061), (2, -0.124), (3, 0.046), (4, -0.0), (5, -0.025), (6, 0.001), (7, -0.01), (8, -0.011), (9, 0.082), (10, 0.009), (11, 0.002), (12, 0.069), (13, -0.053), (14, -0.005), (15, 0.164), (16, -0.069), (17, 0.222), (18, 0.015), (19, 0.117), (20, -0.04), (21, 0.124), (22, -0.066), (23, -0.003), (24, -0.037), (25, -0.024), (26, -0.099), (27, 0.019), (28, 0.051), (29, 0.182), (30, -0.098), (31, 0.103), (32, -0.115), (33, 0.108), (34, 0.007), (35, -0.198), (36, -0.119), (37, -0.087), (38, -0.097), (39, 0.173), (40, -0.075), (41, 0.036), (42, 0.031), (43, 0.154), (44, 0.016), (45, 0.059), (46, 0.05), (47, -0.074), (48, -0.041), (49, 0.138)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95152533 67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization

Author: Lin Sun ; Anna Korhonen

Abstract: Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. Also Natural Language Processing (NLP) applications benefit from taxonomic classifications because they vary in terms of the granularity they require from a classification. We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it yields state-of-the-art performance also on a flat test set. We demonstrate how the method can be used to acquire novel classifications as well as to extend existing ones on the basis of some prior knowledge about the classification.

2 0.64182937 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we present a method for unsupervised semantic role induction which we formalize as a graph partitioning problem. Argument instances of a verb are represented as vertices in a graph whose edge weights quantify their role-semantic similarity. Graph partitioning is realized with an algorithm that iteratively assigns vertices to clusters based on the cluster assignments of neighboring vertices. Our method is algorithmically and conceptually simple, especially with respect to how problem-specific knowledge is incorporated into the model. Experimental results on the CoNLL 2008 benchmark dataset demonstrate that our model is competitive with other unsupervised approaches in terms of F1 whilst attaining significantly higher cluster purity.

3 0.50611234 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Angel X. Chang ; Daniel Jurafsky

Abstract: We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags — requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus — 0.7% higher than using gold tags.

4 0.37528077 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

Author: Kirk Roberts ; Sanda Harabagiu

Abstract: Metonymic language is a pervasive phenomenon. Metonymic type shifting, or argument type coercion, results in a selectional restriction violation where the argument’s semantic class differs from the class the predicate expects. In this paper we present an unsupervised method that learns the selectional restriction of arguments and enables the detection of argument coercion. This method also generates an enhanced probabilistic resolution of logical metonymies. The experimental results indicate substantial improvements in the detection of coercions and the ranking of metonymic interpretations.

5 0.37004954 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics

Author: Joseph Reisinger ; Raymond Mooney

Abstract: Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superficial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM finds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.

6 0.33676884 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction

7 0.33529598 88 emnlp-2011-Linear Text Segmentation Using Affinity Propagation

8 0.31244284 26 emnlp-2011-Class Label Enhancement via Related Instances

9 0.30811232 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

10 0.265609 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge

11 0.25441617 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

12 0.25255701 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP

13 0.24859849 43 emnlp-2011-Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews

14 0.2471886 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model

15 0.24333473 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues

16 0.24253525 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

17 0.23216011 128 emnlp-2011-Structured Relation Discovery using Generative Models

18 0.23076349 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms

19 0.22425556 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation

20 0.21521589 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.065), (36, 0.016), (37, 0.032), (45, 0.051), (53, 0.01), (54, 0.407), (57, 0.017), (62, 0.017), (64, 0.018), (66, 0.07), (69, 0.02), (79, 0.035), (82, 0.018), (90, 0.02), (96, 0.088), (98, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.91358721 110 emnlp-2011-Ranking Human and Machine Summarization Systems

Author: Peter Rankel ; John Conroy ; Eric Slud ; Dianne O'Leary

Abstract: The Text Analysis Conference (TAC) ranks summarization systems by their average score over a collection of document sets. We investigate the statistical appropriateness of this score and propose an alternative that better distinguishes between human and machine evaluation systems.

same-paper 2 0.87250161 67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization

Author: Lin Sun ; Anna Korhonen

Abstract: Most previous research on verb clustering has focussed on acquiring flat classifications from corpus data, although many manually built classifications are taxonomic in nature. Also Natural Language Processing (NLP) applications benefit from taxonomic classifications because they vary in terms of the granularity they require from a classification. We introduce a new clustering method called Hierarchical Graph Factorization Clustering (HGFC) and extend it so that it is optimal for the task. Our results show that HGFC outperforms the frequently used agglomerative clustering on a hierarchical test set extracted from VerbNet, and that it yields state-of-the-art performance also on a flat test set. We demonstrate how the method can be used to acquire novel classifications as well as to extend existing ones on the basis of some prior knowledge about the classification.

3 0.85299629 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

Author: Wei Lu ; Hwee Tou Ng

Abstract: This paper describes a novel probabilistic approach for generating natural language sentences from their underlying semantics in the form of typed lambda calculus. The approach is built on top of a novel reduction-based weighted synchronous context free grammar formalism, which facilitates the transformation process from typed lambda calculus into natural language sentences. Sentences can then be generated based on such grammar rules with a log-linear model. To acquire such grammar rules automatically in an unsupervised manner, we also propose a novel approach with a generative model, which maps from sub-expressions of logical forms to word sequences in natural language sentences. Experiments on benchmark datasets for both English and Chinese generation tasks yield significant improvements over results obtained by two state-of-the-art machine translation models, in terms of both automatic metrics and human evaluation.

4 0.49167925 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

Author: Edward Grefenstette ; Mehrnoosh Sadrzadeh

Abstract: Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (2010) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model.

5 0.48183507 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation

Author: Yang Gao ; Philipp Koehn ; Alexandra Birch

Abstract: Long-distance reordering remains one of the biggest challenges facing machine translation. We derive soft constraints from the source dependency parsing to directly address the reordering problem for the hierarchical phrase-based model. Our approach significantly improves Chinese–English machine translation on a large-scale task by 0.84 BLEU points on average. Moreover, when we switch the tuning function from BLEU to the LRscore which promotes reordering, we observe total improvements of 1.21 BLEU, 1.30 LRscore and 3.36 TER over the baseline. On average our approach improves reordering precision and recall by 6.9 and 0.3 absolute points, respectively, and is found to be especially effective for long-distance reordering.

6 0.47610059 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

7 0.46655193 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation

8 0.46479395 87 emnlp-2011-Lexical Generalization in CCG Grammar Induction for Semantic Parsing

9 0.45663565 47 emnlp-2011-Efficient retrieval of tree translation examples for Syntax-Based Machine Translation

10 0.45420018 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests

11 0.44116572 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference

12 0.43642673 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax

13 0.43177214 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

14 0.43022031 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

15 0.42634466 97 emnlp-2011-Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French

16 0.42575371 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training

17 0.42348093 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing

18 0.42287591 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees

19 0.42002639 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

20 0.41620025 38 emnlp-2011-Data-Driven Response Generation in Social Media