acl acl2013 acl2013-119 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lin Sun ; Diana McCarthy ; Anna Korhonen
Abstract: Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
Reference: text
sentIndex sentText sentNum sentScore
1 Diathesis alternation approximation for verb clustering Lin Sun Greedy Intelligence Ltd Hangzhou, China l . [sent-1, score-0.442]
2 This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. [sent-4, score-0.824]
3 Our alternation-based approach is particularly adept at leveraging information from less frequent data. [sent-5, score-0.025]
4 1 Introduction Diathesis alternations (DAs) are regular alternations of the syntactic expression of verbal arguments, sometimes accompanied by a change in meaning. [sent-6, score-0.272]
5 Levin (1993)’s seminal book provides a manual inventory both of DAs and verb classes where membership is determined according to participation in these alternations. [sent-10, score-0.244]
6 ) can all take various DAs, such as the causative alternation, middle alternation and instrument subject alternation. [sent-16, score-0.127]
7 In computational linguistics, work inspired by Levin’s classification has exploited the link between syntax and semantics for producing classifications of verbs. [sent-17, score-0.093]
8 Such classifications are useful for a wide variety of purposes such as semantic role labelling (Gildea and Jurafsky, 2002), [sent-18, score-0.066]
9 predicting unseen syntax (Parisien and Stevenson, 2010), argument zoning (Guo et al. [sent-22, score-0.077]
10 While Levin’s classification can be extended manually (Kipper-Schuler, 2005), a large body of research has developed methods for automatic verb classification since such methods can be applied easily to other domains and languages. [sent-25, score-0.23]
11 Existing work on automatic classification relies largely on syntactic features such as subcategorization frames (SCFs) (Schulte im Walde, 2006; Sun and Korhonen, 2011; Vlachos et al. [sent-26, score-0.333]
12 There has also been some success incorporating selectional preferences (Sun and Korhonen, 2009). [sent-28, score-0.082]
13 Merlo and Stevenson (2001) used cues such as passive voice, animacy and syntactic frames coupled with the overlap of lexical fillers between the alternating slots to predict a 3-way classification (unergative, unaccusative and object-drop). [sent-32, score-0.332]
14 Joanis et al. (2008) used similar features to classify verbs on a much larger scale. [sent-34, score-0.086]
15 They classify up to 496 verbs using 11 different classifications each having between 2 and 14 classes. [sent-35, score-0.125]
16 Like Joanis et al. (2008), we seek to automatically classify verbs into a broad range of classes. [sent-40, score-0.059]
17 We include evidence of DA, but rather than manually selecting features attributed to specific alternations, we experiment with syntactic evidence for alternation approximation. [sent-42, score-0.274]
18 We use the verb clustering system presented in Sun and Korhonen (2009) because it achieves state-of-the-art results on several datasets, including those of Joanis et al. [sent-43, score-0.272]
19 , even without the additional boost in performance from the selectional preference data. [sent-44, score-0.046]
20 We are interested in the improvement that can be achieved in verb clustering using approximations of DAs, rather than in the DAs per se. [sent-45, score-0.295]
21 As such we make the simple assumption that if a pair of SCFs tends to occur with the same verbs, we have a potential occurrence of a DA. [sent-46, score-0.051]
22 Although this approximation can give rise to false positives (pairs of frames that co-occur frequently but are not a DA), we are nevertheless interested in investigating its potential usefulness for verb classification. [sent-47, score-0.422]
23 2 Diathesis Alternation Approximation A DA can be approximated by a pair of SCFs. [sent-49, score-0.028]
24 We parameterize frames involving prepositional phrases with the preposition. [sent-50, score-0.185]
25 Example SCFs for the verb “spray” are shown in Table 1. [sent-51, score-0.176]
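For illustration (Table 1 itself is not reproduced in this extraction, so these frame labels are indicative rather than the paper's exact inventory): preposition-parameterized SCFs for “spray” would include NP PP(on) (“spray paint on the wall”) and NP PP(with) (“spray the wall with paint”), the two variants of the locative alternation.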
26 The feature value of a single frame feature is the frequency of the SCF. [sent-52, score-0.285]
27 Given two frames f_v(i), f_v(j) of a verb v, they can be transformed into a feature pair (f_v(i), f_v(j)) as an approximation to a DA. [sent-53, score-2.577]
28 The feature value of the DA feature (f_v(i), f_v(j)) is approximated by the joint probability of the pair of frames p(f_v(i), f_v(j) | v), obtained by integrating over all the possible DAs. [sent-54, score-1.679]
29 The key assumption is that the joint probability of two SCFs has a strong correlation with a DA, on the grounds that the DA gives rise to both SCFs in the pair. [sent-55, score-0.041]
30 We use the DA feature (f_v(i), f_v(j)) with its value p(f_v(i), f_v(j) | v) as a new feature for verb clustering. [sent-56, score-1.642]
31 As a comparison point, we can ignore the DA and make a frame independence assumption. [sent-57, score-0.196]
32 The frame dependency is represented by a simple graphical model in Figure 1. [sent-59, score-0.167]
33 v represents a verb, a represents a DA, and f represents a specific frame out of M possible frames. In the data, the verb (v) and frames (f) are observed, and any underlying alternation (a) is hidden. [sent-61, score-0.84]
34 The aim is to approximate, not to detect, a DA, so a is summed out: $p(f_v(i), f_v(j) \mid v) = \sum_a p(f_v(i), f_v(j) \mid a) \cdot p(a \mid v)$ (2). In order to evaluate this sum, we use a relaxation: the sum in equation 2 is replaced with the maximum (max). [sent-62, score-0.067]
35 This is a reasonable relaxation, as a pair of frames rarely participates in more than one type of DA. [sent-63, score-0.213]
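Written out, the relaxation replaces the sum of equation 2 with a max (a sketch; the intermediate equations 3-4 are not shown here): $p(f_v(i), f_v(j) \mid v) \approx \max_a \, p(f_v(i), f_v(j) \mid a) \cdot p(a \mid v)$. Bounding this joint term by the smaller of the two frame frequencies then yields the min form below.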
36 So we end up with a simple form: $p(f_v(i), f_v(j) \mid v) \approx Z^{-1} \cdot \min(f_v(i), f_v(j))$ (5). The equation is intuitive: if f_v(i) occurs 40 times and f_v(j) 30 times, the DA between f_v(i) and f_v(j) occurs at most 30 times. [sent-66, score-0.067]
37 This upper bound value is used as the feature value. [sent-67, score-0.02]
38 The original feature vector f of dimension M is transformed into an M²-dimensional feature vector f̃. Table 2 shows the transformed feature space for “spray”. [sent-69, score-0.193]
39 The feature space matches our expectation well. [sent-70, score-0.043]
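As a minimal sketch of this computation (the frame labels and counts are illustrative, not the paper's actual Table 1/2 values), the min-based DA feature values can be derived directly from a verb's SCF counts:

```python
from itertools import combinations_with_replacement

def da_features(scf_freqs):
    """Approximate DA feature values from one verb's SCF frequencies.

    scf_freqs maps a preposition-parameterized SCF label to its corpus
    frequency for the verb. Each unordered frame pair (f_i, f_j), a frame
    paired with itself included, becomes one feature whose raw value is
    min(freq_i, freq_j): the upper bound on how often a DA linking the two
    frames can have occurred (equation 5 before the Z^-1 factor).
    Note: min is symmetric, so these M(M+1)/2 unordered pairs carry the
    same information as the M^2 ordered pairs described in the text.
    """
    raw = {
        (fi, fj): min(scf_freqs[fi], scf_freqs[fj])
        for fi, fj in combinations_with_replacement(sorted(scf_freqs), 2)
    }
    z = sum(raw.values())  # normalizing constant Z
    return {pair: v / z for pair, v in raw.items()} if z else raw

# Illustrative counts for "spray" (not the actual Table 1/2 values):
spray = {"NP": 40, "NP_PP(with)": 30, "PP(on)": 10}
for pair, value in sorted(da_features(spray).items(), key=lambda x: -x[1]):
    print(pair, round(value, 3))
```

Note that the self-pair ("PP(on)", "PP(on)") gets the frequency of "PP(on)" itself, which is why F3 implicitly contains F1, as discussed below.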
40 3 Experiments We evaluated this model by performing verb clustering experiments using three feature sets: F1: SCF parameterized with prepositions. [sent-72, score-0.315]
41 F2: The frame pair features built from F1 with the frame independence assumption (equation 1). [sent-74, score-0.441]
42 This feature is not a DA feature as it ignores the inter-dependency of the frames. [sent-75, score-0.086]
43 F3: The frame pair features (DAs) built from F1 with the frame dependency assumption (equation 4). [sent-76, score-0.412]
44 This is the DA feature which considers the correlation of the two frames which are generated from the alternation. [sent-77, score-0.228]
45 F3 implicitly includes F1, as a frame can pair with itself. [sent-78, score-0.195]
46 In the example in Table 2, the frame pair “PP(on) PP(on)” will always have the same value as the “PP(on)” frame in F1. [sent-79, score-0.362]
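A small sketch of the contrast between the two assumptions behind F2 and F3 (hypothetical helper; the counts are illustrative):

```python
def pair_value(freq_i, freq_j, total, dependent):
    """Value of one frame-pair feature under the two assumptions.

    dependent=False gives F2 (equation 1): the product of the frames'
    relative frequencies, as if they co-occurred only by chance.
    dependent=True gives F3 (equation 4/5, unnormalized): the min of the
    two counts, crediting a shared underlying alternation.
    """
    if dependent:
        return min(freq_i, freq_j)
    return (freq_i / total) * (freq_j / total)

# "NP_PP(with)" vs "PP(on)" for spray (illustrative counts, total = 80):
print(pair_value(30, 10, 80, dependent=False))  # F2: 0.046875
print(pair_value(30, 10, 80, dependent=True))   # F3: 10
```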
47 (2007) which classifies each corpus occurrence of a verb as a member of one of the 168 SCFs on the basis of grammatical relations identified by the RASP parser (Briscoe et al. [sent-81, score-0.07]
48 (Footnote 2: We did this so that F3 included the SCF features as well as the DA approximation features.) [sent-83, score-0.176]
49 We experimented with two datasets that have been used in prior work on verb clustering: the test sets 7-11 (3-14 classes) in Joanis et al. [sent-85, score-0.198]
50 We used the spectral clustering (SPEC) method and settings as in Sun and Korhonen (2009) but adopted the Bhattacharyya kernel (Jebara and Kondor, 2003) to improve the computational efficiency of the approach given the high dimensionality of the quadratic feature space. [sent-88, score-0.266]
51 $k_b(v, v') = \sum_{d=1}^{D} (v_d \, v'_d)^{1/2}$ (6) The mean-field bound of the Bhattacharyya kernel is very similar to the KL divergence kernel (Jebara et al. [sent-89, score-0.1]
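A sketch of this kernel over L1-normalized feature vectors (numpy used for brevity; the toy matrix is illustrative, not the paper's data):

```python
import numpy as np

def bhattacharyya_kernel(X):
    """Pairwise Bhattacharyya kernel matrix for the rows of X.

    X: (n_verbs, n_features) array of non-negative feature values. Each row
    is L1-normalized into a distribution, after which
    k_b(v, v') = sum_d (v_d * v'_d)^(1/2)  (equation 6)
    is simply the dot product of the element-wise square roots.
    """
    P = X / X.sum(axis=1, keepdims=True)
    S = np.sqrt(P)
    return S @ S.T

# Toy example: 3 verbs over 4 (pair) features; rows 0 and 1 are similar.
X = np.array([[4.0, 3.0, 1.0, 0.0],
              [5.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 5.0]])
print(np.round(bhattacharyya_kernel(X), 3))
```

The resulting matrix can be fed to an off-the-shelf spectral clustering implementation, e.g. scikit-learn's SpectralClustering(affinity='precomputed'), as a stand-in for the SPEC setup used here.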
52 , 2004) which is frequently used in verb clustering experiments (Korhonen et al. [sent-90, score-0.272]
53 To further reduce computational complexity, we restricted our scope to the more frequent features. [sent-92, score-0.025]
54 In the experiment described in this section we used the 50 most frequent features for the 3-6 way classifications (Joanis et al. [sent-93, score-0.118]
55 ’s test set 7-9) and 100 features for the 7-17 way classifications. [sent-94, score-0.027]
56 In the next section, we will demonstrate that F3 outperforms F1 regardless of the feature number setting. [sent-95, score-0.043]
57 The clustering results are evaluated using F-measure as in Sun and Korhonen (2009), which provides the harmonic mean of precision (P) and recall (R). P is calculated using modified purity, a global measure which evaluates the mean precision of clusters. [sent-97, score-0.096]
58 Each cluster (ki ∈ K) is associated with the gold-standard class to which the majority of its members belong. [sent-98, score-0.038]
59 The number of verbs in a cluster (ki) that take this class is denoted by n_prevalent(ki). [sent-99, score-0.059]
60 $P = \big(\sum_{k_i \in K :\, n_{\mathrm{prevalent}}(k_i) > 2} n_{\mathrm{prevalent}}(k_i)\big) / |\mathrm{verbs}|$ R is calculated using weighted class accuracy: the proportion of members of the dominant cluster DOM-CLUST_i within each of the gold-standard classes c_i ∈ C. [sent-100, score-0.046]
61 (Table 3 caption: … of F3 (pair of frames) and F1 (single frame) features with the Bhattacharyya kernel on the Joanis et al. datasets.) [sent-109, score-0.067]
62 $R = \big(\sum_{i=1}^{|C|} |\mathrm{verbs\ in\ DOM\text{-}CLUST}_i|\big) / |\mathrm{verbs}|$ The results are shown in Table 3. [sent-111, score-0.041]
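A sketch of these metrics as code (hypothetical data structures; the >2 threshold follows the modified-purity formula reconstructed above):

```python
from collections import Counter

def evaluate(clusters, gold):
    """clusters: {cluster_id: set(verbs)}; gold: {verb: class_label}."""
    n_verbs = len(gold)
    # Modified purity: sum n_prevalent(k_i) over clusters where it exceeds 2.
    p_sum = 0
    for members in clusters.values():
        n_prev = Counter(gold[v] for v in members).most_common(1)[0][1]
        if n_prev > 2:
            p_sum += n_prev
    P = p_sum / n_verbs
    # Weighted class accuracy: for each gold class, count the members of its
    # dominant cluster (the single cluster holding most of that class).
    r_sum = 0
    for c in set(gold.values()):
        class_verbs = {v for v, g in gold.items() if g == c}
        r_sum += max(len(class_verbs & m) for m in clusters.values())
    R = r_sum / n_verbs
    F = 2 * P * R / (P + R) if P + R else 0.0
    return P, R, F

# Toy run: the 2-member cluster is excluded from P by the >2 threshold.
clusters = {0: {"spray", "load", "cram"}, 1: {"break", "shatter"}}
gold = {"spray": "loc", "load": "loc", "cram": "loc",
        "break": "chg", "shatter": "chg"}
print(evaluate(clusters, gold))  # (0.6, 1.0, 0.75)
```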
63 This indicates that the frame independence assumption is a poor one. [sent-113, score-0.219]
64 This experiment shows, on two datasets, that DA features are clearly more effective than the frame features for verb clustering, even when relaxations are used. [sent-121, score-0.397]
65 The frequency-ranked features were added to the clustering one at a time, starting from the most frequent one. [sent-124, score-0.18]
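A sketch of that experimental loop (the cluster_and_score callable is a hypothetical stand-in for SPEC clustering plus F-measure evaluation):

```python
def feature_addition_curve(feature_matrix, feature_freqs, cluster_and_score):
    """Score clustering quality as features are added in frequency order.

    feature_matrix: {feature: {verb: value}} sparse feature columns.
    feature_freqs:  {feature: corpus frequency}, used only for the ranking.
    cluster_and_score: callable mapping a selected feature dict to an
    F-measure (a stand-in for SPEC clustering plus the evaluation above).
    """
    ranked = sorted(feature_freqs, key=feature_freqs.get, reverse=True)
    return [
        (k, cluster_and_score({f: feature_matrix[f] for f in ranked[:k]}))
        for k in range(1, len(ranked) + 1)
    ]
```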
66 F3 outperforms F1 clearly on all the feature number settings. [sent-126, score-0.043]
67 After adding some highly frequent frames (22 for test set 10 and 67 for test set 11), the performance for F1 is not further improved. [sent-127, score-0.21]
68 The performance of F3, in contrast, improves for almost all frames (including those of mid-range frequency), although to a lesser degree for low-frequency frames. [sent-128, score-0.032]
69 5 Related work Parisien and Stevenson (2010) introduced a hierarchical Bayesian model capable of learning verb alternations and constructions from syntactic input. [sent-129, score-0.296]
70 The focus was on modelling and explaining child alternation acquisition rather than on automatic verb classification. [sent-130, score-0.329]
71 Therefore, no quantitative evaluation of the clustering is reported, and the number of verbs under the novel verb generalization test is relatively small. [sent-131, score-0.312]
72 Parisien and … (Figure 2 caption: Comparison between frame features (F1) and DA features (F3) with different feature number settings.) [sent-132, score-0.264]
73 A fundamental difference is that we explicitly use a probability distribution over alternations (pairs of frames) to represent a verb, whereas they represent a verb by a distribution over the observed frames, similar to Vlachos et al. [sent-140, score-0.481]
74 6 Conclusion and Future work We have demonstrated the merits of using DAs for verb clustering compared to the SCF data from which they are derived on standard verb classi- fication datasets and when integrated in a stateof-the-art verb clustering system. [sent-143, score-0.742]
75 We have also demonstrated that the performance of frame features is dominated by the high-frequency frames. [sent-144, score-0.226]
76 In contrast, the DA features enable the mid-range-frequency frames to further improve performance. [sent-145, score-0.244]
77 In the future, we plan to evaluate the performance of DA features in a larger-scale experiment. [sent-146, score-0.027]
78 Due to the high dimensionality of the transformed feature space (quadratic in the size of the original feature space), we will need to improve the computational efficiency further, e. [sent-147, score-0.145]
79 g., via use of an unsupervised dimensionality reduction technique (Zhao and Liu, 2007). [sent-149, score-0.027]
80 Finally, we plan to supplement the DA feature with evidence from the slot fillers of the alternating slots, in the spirit of earlier work (McCarthy, 2000; Merlo and Stevenson, 2001; Joanis et al. [sent-152, score-0.129]
81 Unlike these previous works, we will use selectional preferences to generalize the argument heads but will do so using preferences from distributional data (Sun and Korhonen, 2009) rather than WordNet, and use all argument head data in all frames. [sent-154, score-0.209]
82 We envisage using the maximum average distributional similarity of the argument heads in any potentially alternating slots in a pair of co-occurring frames as a feature, just as we currently use the frequency of the less frequent co-occurring frame. [sent-155, score-0.4]
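A sketch of that envisaged feature (entirely hypothetical, since this is proposed future work: the slot extraction and the distributional similarity function sim are placeholders):

```python
def head_similarity_feature(slot_fillers_i, slot_fillers_j, sim):
    """Maximum average distributional similarity between potentially
    alternating slots of two co-occurring frames.

    slot_fillers_i / slot_fillers_j: {slot_name: [argument head lemmas]}.
    sim: callable giving the distributional similarity of two lemmas.
    For each candidate slot pairing, average the pairwise head similarity,
    then keep the best-matching pairing as the feature value.
    """
    best = 0.0
    for heads_i in slot_fillers_i.values():
        for heads_j in slot_fillers_j.values():
            pairs = [(a, b) for a in heads_i for b in heads_j]
            if not pairs:
                continue
            avg = sum(sim(a, b) for a, b in pairs) / len(pairs)
            best = max(best, avg)
    return best
```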
83 A weakly-supervised approach to argumentative zoning of scientific documents. [sent-177, score-0.061]
84 Acquiring lexical generalizations from corpora: A case study for diathesis alternations. [sent-212, score-0.154]
85 Using semantic preferences to identify verbal participation in role switching alter- nations. [sent-232, score-0.109]
86 Automatic verb classification based on statistical distributions of argument structure. [sent-239, score-0.238]
87 Generalizing between form and meaning using learned verb classes. [sent-249, score-0.176]
88 A system for large-scale acquisition of verbal, nominal and adjectival subcategorization frames from corpora. [sent-255, score-0.264]
89 Experiments on the automatic induction of German semantic verb classes. [sent-263, score-0.176]
90 Improving verb clustering with automatically acquired selectional preferences. [sent-275, score-0.318]
91 Unsupervised and constrained dirichlet process mixture models for verb clustering. [sent-299, score-0.176]
wordName wordTfidf (topN-words)
[('fv', 0.69), ('korhonen', 0.219), ('da', 0.211), ('joanis', 0.204), ('frames', 0.185), ('verb', 0.176), ('parisien', 0.167), ('frame', 0.167), ('diathesis', 0.154), ('stevenson', 0.147), ('scfs', 0.13), ('alternation', 0.127), ('alternations', 0.12), ('sun', 0.117), ('scf', 0.097), ('clustering', 0.096), ('sup', 0.081), ('das', 0.072), ('classifications', 0.066), ('schulte', 0.065), ('jebara', 0.065), ('merlo', 0.061), ('vlachos', 0.061), ('levin', 0.061), ('mccarthy', 0.055), ('subcategorization', 0.053), ('bhattacharyya', 0.052), ('ki', 0.046), ('selectional', 0.046), ('relaxation', 0.044), ('approximation', 0.043), ('feature', 0.043), ('kondor', 0.042), ('spray', 0.042), ('zoning', 0.042), ('im', 0.041), ('participation', 0.041), ('spectral', 0.041), ('kernel', 0.04), ('verbs', 0.04), ('briscoe', 0.039), ('alternating', 0.038), ('preferences', 0.036), ('slots', 0.036), ('argument', 0.035), ('walde', 0.034), ('tsang', 0.034), ('preiss', 0.032), ('brew', 0.032), ('verbal', 0.032), ('transformed', 0.032), ('frequency', 0.032), ('rasp', 0.031), ('shutova', 0.029), ('independence', 0.029), ('xd', 0.028), ('pair', 0.028), ('dimensionality', 0.027), ('features', 0.027), ('fillers', 0.027), ('classes', 0.027), ('classification', 0.027), ('acquisition', 0.026), ('frequent', 0.025), ('axis', 0.024), ('metaphor', 0.024), ('usa', 0.024), ('bayesian', 0.024), ('approximations', 0.023), ('occurs', 0.023), ('equation', 0.023), ('assumption', 0.023), ('morristown', 0.023), ('guo', 0.023), ('pp', 0.023), ('phd', 0.022), ('cook', 0.022), ('datasets', 0.022), ('slot', 0.021), ('behaviour', 0.021), ('heads', 0.021), ('bound', 0.02), ('pennsylvania', 0.02), ('diana', 0.02), ('chicago', 0.02), ('classify', 0.019), ('class', 0.019), ('quadratic', 0.019), ('requirement', 0.019), ('society', 0.019), ('ltd', 0.019), ('ahses', 0.019), ('argumentative', 0.019), ('ivne', 0.019), ('krymolowski', 0.019), ('postgraduate', 0.019), ('unaccusative', 0.019), ('rise', 0.018), ('nj', 0.018), ('gildea', 0.018)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 119 acl-2013-Diathesis alternation approximation for verb clustering
Author: Lin Sun ; Diana McCarthy ; Anna Korhonen
Abstract: Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
2 0.2743215 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
Author: Roi Reichart ; Anna Korhonen
Abstract: Subcategorization frames (SCFs), selectional preferences (SPs) and verb classes capture related aspects of the predicate-argument structure. We present the first unified framework for unsupervised learning of these three types of information. We show how to utilize Determinantal Point Processes (DPPs), elegant probabilistic models that are defined over the possible subsets of a given dataset and give higher probability mass to high quality and diverse subsets, for clustering. Our novel clustering algorithm constructs a joint SCF-DPP DPP kernel matrix and utilizes the efficient sampling algorithms of DPPs to cluster together verbs with similar SCFs and SPs. We evaluate the induced clusters in the context of the three tasks and show results that are superior to strong baselines for each.
3 0.11661205 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
Author: Aline Villavicencio ; Marco Idiart ; Robert Berwick ; Igor Malioutov
Abstract: Hierarchical Bayesian Models (HBMs) have been used with some success to capture empirically observed patterns of under- and overgeneralization in child language acquisition. However, as is well known, HBMs are “ideal” learning systems, assuming access to unlimited computational resources that may not be available to child language learners. Consequently, it remains crucial to carefully assess the use of HBMs along with alternative, possibly simpler, candidate models. This paper presents such an evaluation for a language acquisition domain where explicit HBMs have been proposed: the acquisition of English dative constructions. In particular, we present a detailed, empirically grounded model-selection comparison of HBMs vs. a simpler alternative based on clustering along with maximum likelihood estimation that we call linear competition learning (LCL). Our results demonstrate that LCL can match HBM model performance without incurring the high computational costs associated with HBMs.
4 0.11543331 265 acl-2013-Outsourcing FrameNet to the Crowd
Author: Marco Fossati ; Claudio Giuliano ; Sara Tonelli
Abstract: We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time.
5 0.10481297 310 acl-2013-Semantic Frames to Predict Stock Price Movement
Author: Boyi Xie ; Rebecca J. Passonneau ; Leon Wu ; German G. Creamer
Abstract: Semantic frames are a rich linguistic resource. There has been much work on semantic frame parsers, but less that applies them to general NLP problems. We address a task to predict change in stock price from financial news. Semantic frames help to generalize from specific sentences to scenarios, and to detect the (positive or negative) roles of specific companies. We introduce a novel tree representation, and use it to train predictive models with tree kernels using support vector machines. Our experiments test multiple text representations on two binary classification tasks, change of price and polarity. Experiments show that features derived from semantic frame parsing have significantly better performance across years on the polarity task.
6 0.087241635 224 acl-2013-Learning to Extract International Relations from Political Context
7 0.081243351 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
8 0.06721513 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
9 0.062903076 116 acl-2013-Detecting Metaphor by Contextual Analogy
10 0.060885608 186 acl-2013-Identifying English and Hungarian Light Verb Constructions: A Contrastive Approach
11 0.060434125 378 acl-2013-Using subcategorization knowledge to improve case prediction for translation to German
12 0.053375419 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
13 0.0524792 8 acl-2013-A Learner Corpus-based Approach to Verb Suggestion for ESL
14 0.050749548 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models
15 0.05052558 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
16 0.05018463 47 acl-2013-An Information Theoretic Approach to Bilingual Word Clustering
17 0.048280455 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
18 0.047325827 338 acl-2013-Task Alternation in Parallel Sentence Retrieval for Twitter Translation
19 0.047148988 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
20 0.045078702 267 acl-2013-PARMA: A Predicate Argument Aligner
topicId topicWeight
[(0, 0.113), (1, 0.026), (2, -0.003), (3, -0.07), (4, -0.047), (5, -0.047), (6, -0.044), (7, 0.072), (8, -0.0), (9, 0.002), (10, -0.013), (11, 0.009), (12, -0.005), (13, 0.022), (14, -0.062), (15, -0.037), (16, -0.012), (17, 0.02), (18, 0.149), (19, 0.033), (20, 0.092), (21, 0.003), (22, 0.06), (23, -0.057), (24, 0.138), (25, -0.032), (26, -0.093), (27, -0.034), (28, 0.074), (29, 0.083), (30, 0.04), (31, 0.007), (32, -0.04), (33, -0.093), (34, -0.162), (35, -0.006), (36, -0.183), (37, -0.181), (38, 0.038), (39, 0.114), (40, 0.057), (41, -0.041), (42, -0.015), (43, 0.111), (44, -0.123), (45, -0.012), (46, -0.047), (47, -0.004), (48, 0.044), (49, -0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.94289893 119 acl-2013-Diathesis alternation approximation for verb clustering
Author: Lin Sun ; Diana McCarthy ; Anna Korhonen
Abstract: Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
2 0.79758978 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
Author: Roi Reichart ; Anna Korhonen
Abstract: Subcategorization frames (SCFs), selectional preferences (SPs) and verb classes capture related aspects of the predicate-argument structure. We present the first unified framework for unsupervised learning of these three types of information. We show how to utilize Determinantal Point Processes (DPPs), elegant probabilistic models that are defined over the possible subsets of a given dataset and give higher probability mass to high quality and diverse subsets, for clustering. Our novel clustering algorithm constructs a joint SCF-DPP DPP kernel matrix and utilizes the efficient sampling algorithms of DPPs to cluster together verbs with similar SCFs and SPs. We evaluate the induced clusters in the context of the three tasks and show results that are superior to strong baselines for each.
3 0.70961875 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
Author: Aline Villavicencio ; Marco Idiart ; Robert Berwick ; Igor Malioutov
Abstract: Hierarchical Bayesian Models (HBMs) have been used with some success to capture empirically observed patterns of under- and overgeneralization in child language acquisition. However, as is well known, HBMs are “ideal” learning systems, assuming access to unlimited computational resources that may not be available to child language learners. Consequently, it remains crucial to carefully assess the use of HBMs along with alternative, possibly simpler, candidate models. This paper presents such an evaluation for a language acquisition domain where explicit HBMs have been proposed: the acquisition of English dative constructions. In particular, we present a detailed, empirically grounded model-selection comparison of HBMs vs. a simpler alternative based on clustering along with maximum likelihood estimation that we call linear competition learning (LCL). Our results demonstrate that LCL can match HBM model performance without incurring the high computational costs associated with HBMs.
4 0.57281095 310 acl-2013-Semantic Frames to Predict Stock Price Movement
Author: Boyi Xie ; Rebecca J. Passonneau ; Leon Wu ; German G. Creamer
Abstract: Semantic frames are a rich linguistic resource. There has been much work on semantic frame parsers, but less that applies them to general NLP problems. We address a task to predict change in stock price from financial news. Semantic frames help to generalize from specific sentences to scenarios, and to detect the (positive or negative) roles of specific companies. We introduce a novel tree representation, and use it to train predictive models with tree kernels using support vector machines. Our experiments test multiple text representations on two binary classification tasks, change of price and polarity. Experiments show that features derived from semantic frame parsing have significantly better performance across years on the polarity task.
5 0.56099212 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
Author: Jesse Dunietz ; Lori Levin ; Jaime Carbonell
Abstract: Lexical resources such as WordNet and VerbNet are widely used in a multitude of NLP tasks, as are annotated corpora such as treebanks. Often, the resources are used as-is, without question or examination. This practice risks missing significant performance gains and even entire techniques. This paper addresses the importance of resource quality through the lens of a challenging NLP task: detecting selectional preference violations. We present DAVID, a simple, lexical resource-based preference violation detector. With as-is lexical resources, DAVID achieves an F1-measure of just 28.27%. When the resource entries and parser outputs for a small sample are corrected, however, the F1-measure on that sample jumps from 40% to 61.54%, and performance on other examples rises, suggesting that the algorithm becomes practical given refined resources. More broadly, this paper shows that resource quality matters tremendously, sometimes even more than algorithmic improvements.
6 0.55825311 265 acl-2013-Outsourcing FrameNet to the Crowd
7 0.55558866 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
8 0.48991597 186 acl-2013-Identifying English and Hungarian Light Verb Constructions: A Contrastive Approach
9 0.43007743 224 acl-2013-Learning to Extract International Relations from Political Context
11 0.40678805 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
12 0.40461189 378 acl-2013-Using subcategorization knowledge to improve case prediction for translation to German
13 0.36834762 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation
14 0.35235417 231 acl-2013-Linggle: a Web-scale Linguistic Search Engine for Words in Context
15 0.33338419 8 acl-2013-A Learner Corpus-based Approach to Verb Suggestion for ESL
16 0.32918519 149 acl-2013-Exploring Word Order Universals: a Probabilistic Graphical Model Approach
17 0.31268927 349 acl-2013-The mathematics of language learning
18 0.31209174 220 acl-2013-Learning Latent Personas of Film Characters
19 0.30171975 29 acl-2013-A Visual Analytics System for Cluster Exploration
20 0.29519555 267 acl-2013-PARMA: A Predicate Argument Aligner
topicId topicWeight
[(0, 0.039), (6, 0.028), (11, 0.076), (15, 0.014), (17, 0.044), (18, 0.243), (24, 0.049), (26, 0.038), (28, 0.011), (35, 0.063), (42, 0.049), (48, 0.06), (64, 0.018), (70, 0.066), (88, 0.021), (90, 0.012), (95, 0.074)]
simIndex simValue paperId paperTitle
same-paper 1 0.78957355 119 acl-2013-Diathesis alternation approximation for verb clustering
Author: Lin Sun ; Diana McCarthy ; Anna Korhonen
Abstract: Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
2 0.78506339 287 acl-2013-Public Dialogue: Analysis of Tolerance in Online Discussions
Author: Arjun Mukherjee ; Vivek Venkataraman ; Bing Liu ; Sharon Meraz
Abstract: Social media platforms have enabled people to freely express their views and discuss issues of interest with others. While it is important to discover the topics in discussions, it is equally useful to mine the nature of such discussions or debates and the behavior of the participants. There are many questions that can be asked. One key question is whether the participants give reasoned arguments with justifiable claims via constructive debates or exhibit dogmatism and egotistic clashes of ideologies. The central idea of this question is tolerance, which is a key concept in the field of communications. In this work, we perform a computational study of tolerance in the context of online discussions. We aim to identify tolerant vs. intolerant participants and investigate how disagreement affects tolerance in discussions in a quantitative framework. To the best of our knowledge, this is the first such study. Our experiments using real-life discussions demonstrate the effectiveness of the proposed technique and also provide some key insights into the psycholinguistic phenomenon of tolerance in online discussions.
3 0.70780993 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
Author: Shafiq Joty ; Giuseppe Carenini ; Raymond Ng ; Yashar Mehdad
Abstract: We propose a novel approach for developing a two-stage document-level discourse parser. Our parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intrasentential parsing and the other for multisentential parsing. We present two approaches to combine these two stages of discourse parsing effectively. A set of empirical evaluations over two different datasets demonstrates that our discourse parser significantly outperforms the state-of-the-art, often by a wide margin.
4 0.58762282 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
Author: Roi Reichart ; Anna Korhonen
Abstract: Subcategorization frames (SCFs), selectional preferences (SPs) and verb classes capture related aspects of the predicate-argument structure. We present the first unified framework for unsupervised learning of these three types of information. We show how to utilize Determinantal Point Processes (DPPs), elegant probabilistic models that are defined over the possible subsets of a given dataset and give higher probability mass to high quality and diverse subsets, for clustering. Our novel clustering algorithm constructs a joint SCF-DPP DPP kernel matrix and utilizes the efficient sampling algorithms of DPPs to cluster together verbs with similar SCFs and SPs. We evaluate the induced clusters in the context of the three tasks and show results that are superior to strong baselines for each.
5 0.57714891 167 acl-2013-Generalizing Image Captions for Image-Text Parallel Corpus
Author: Polina Kuznetsova ; Vicente Ordonez ; Alexander Berg ; Tamara Berg ; Yejin Choi
Abstract: The ever growing amount of web images and their associated texts offers new opportunities for integrative models bridging natural language processing and computer vision. However, the potential benefits of such data are yet to be fully realized due to the complexity and noise in the alignment between image content and text. We address this challenge with contributions in two folds: first, we introduce the new task of image caption generalization, formulated as visually-guided sentence compression, and present an efficient algorithm based on dynamic beam search with dependency-based constraints. Second, we release a new large-scale corpus with 1 million image-caption pairs achieving tighter content alignment between images and text. Evaluation results show the intrinsic quality of the generalized captions and the extrinsic utility of the new image-text parallel corpus with respect to a concrete application of image caption transfer.
6 0.56958896 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
7 0.56607968 80 acl-2013-Chinese Parsing Exploiting Characters
8 0.56226975 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing
9 0.56206739 82 acl-2013-Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation
10 0.56188756 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation
11 0.56185144 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
12 0.56121606 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
13 0.56069672 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
14 0.56038743 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
16 0.55909091 7 acl-2013-A Lattice-based Framework for Joint Chinese Word Segmentation, POS Tagging and Parsing
17 0.55889654 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
18 0.55836278 275 acl-2013-Parsing with Compositional Vector Grammars
19 0.55761242 318 acl-2013-Sentiment Relevance
20 0.55737239 62 acl-2013-Automatic Term Ambiguity Detection