256 acl-2010-Vocabulary Choice as an Indicator of Perspective


Source: pdf

Author: Beata Beigman Klebanov ; Eyal Beigman ; Daniel Diermeier

Abstract: We establish the following characteristics of the task of perspective classification: (a) using term frequencies in a document does not improve classification achieved with absence/presence features; (b) for datasets allowing the relevant comparisons, a small number of top features is found to be as effective as the full feature set and indispensable for the best achieved performance, testifying to the existence of perspective-specific keywords. We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective. [sent-5, score-0.191]

2 1 Introduction We address the task of perspective classification. [sent-6, score-0.356]

3 , in having a meaningful interrelationship,” stressing the meaningful connectedness of one’s stances and pronouncements on possibly different issues. [sent-8, score-0.077]

4 Accordingly, one can talk about, say, opinion on a particular piece of proposed legislation on abortion within pro-choice or pro-life perspectives; in this case, perspective essentially boils down to opinion in a particular debate. [sent-9, score-0.902]

5 Holding the issue constant but relaxing the requirement of a debate on a specific document, we can consider writings from pro and con perspectives in, for example, the death penalty controversy over a period of time. [sent-10, score-0.238]

6 One can talk about perspectives of people on two sides of a conflict; this is not opposition or support for any particular proposal, but ideas about a highly related cluster of issues, such as Israeli and Palestinian perspectives on the conflict in all its manifestations. [sent-12, score-0.537]

7 In this article, we consider perspective at all four levels of abstraction. [sent-14, score-0.356]

8 We apply the same types of models to all, in order to discover any common properties of perspective classification. [sent-15, score-0.356]

9 We contrast it with text categorization and with opinion classification by employing models routinely used for such tasks. [sent-16, score-0.314]

10 Specifically, we consider models that use term frequencies as features (usually found to be superior for text categorization) and models that use term absence/presence (usually found to be superior for opinion classification). [sent-17, score-0.234]

11 We motivate our hypothesis that presence/absence features would be as good as or better than frequencies, and test it experimentally. [sent-18, score-0.063]

12 Secondly, we investigate the question of feature redundancy often observed in text categorization. [sent-19, score-0.146]

13 2 Vocabulary Selection A line of inquiry going back at least to Zipf strives to characterize word frequency distributions in texts and corpora; see Baayen (2001) for a survey. [sent-20, score-0.096]

14 One of the findings in this literature is that a multinomial (called “urn model” by Baayen) is not a good model for word frequency distributions. [sent-21, score-0.134]

15 Jansche (2003) describes a two-stage generation process: (1) Toss a z-biased coin; if it comes up heads, generate 0; if it comes up tails, (2) generate according to F(θ), where F(θ) is a negative binomial distribution and z is a parameter controlling the extent of zero-inflation. [sent-26, score-0.05]
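
The two-stage process described above can be sketched in a few lines. This is a minimal illustration rather than Jansche's implementation: it assumes that z is the probability of the coin coming up heads, and the parameter names r and p for the negative binomial, as well as the concrete values used, are hypothetical.

```python
import random


def sample_negative_binomial(r, p, rng):
    """Count failures before the r-th success, with success probability p."""
    failures = 0
    successes = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures


def sample_word_count(z, r, p, rng):
    """Stage 1: with probability z emit 0 (the word is not selected at all).
    Stage 2: otherwise draw the count from a negative binomial."""
    if rng.random() < z:
        return 0
    return sample_negative_binomial(r, p, rng)


# Illustrative parameters only; they are not taken from the paper.
rng = random.Random(0)
counts = [sample_word_count(z=0.9, r=2, p=0.5, rng=rng) for _ in range(10000)]
zero_share = sum(c == 0 for c in counts) / len(counts)
```

With a large z, most words get a count of zero regardless of the second stage, which is the zero-inflation the sentence above refers to.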

16 The postulation of two separate processes is effective for predicting word frequencies, but is there any meaning to the two processes? [sent-27, score-0.038]

17 The first process, deciding on the vocabulary, or word types, for the text: what is its function? [sent-28, score-0.052]

18 Jansche (2003) suggests that the zero-inflation component takes care of the multitude of vocabulary words that are not “on topic” for the given text, including taboo words, technical jargon, proper names. [sent-29, score-0.134]

19 Indeed, text segmentation studies show that tracing recurrence of words in a text permits topical segmentation (Hearst, 1997; Hoey, 1991). [sent-31, score-0.297]

20 Yet, if a person compares abortion to infanticide, are we content with describing this word as being merely “on topic,” that is, having a certain probability of occurrence once the topic of abortion comes up? [sent-32, score-0.678]

21 In fact, it is only likely to occur if the speaker holds a pro-life perspective, while a pro-choicer would avoid this term. [sent-33, score-0.069]

22 We therefore hypothesize that the choice of vocabulary is not only a matter of topic but also of perspective, while word recurrence has mainly to do with the topical composition of the text. [sent-34, score-0.357]

23 3 Data Partial Birth Abortion (PBA) debates: We use transcripts of the debates on the Partial Birth Abortion Ban Act on the floors of the US House and Senate in the 104th-108th Congresses (1995-2003). [sent-36, score-0.123]

24 Similar legislation was proposed multiple times, passed the legislatures, and, after having initially been vetoed by President Clinton, was signed into law by President Bush in 2003. [sent-37, score-0.101]

25 We use data from 278 legislators, with 669 speeches in all. [sent-38, score-0.101]

26 We take only one speech per speaker per year; since many serve multiple years, each speaker is represented with 1 to 5 speeches. [sent-39, score-0.138]

27 We perform 10-fold cross-validation splitting by speakers, so that all speeches by the same speaker are assigned to the same fold and testing is always inter-speaker. [sent-40, score-0.213]
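
One way to obtain such speaker-disjoint folds is sketched below. The round-robin assignment of speakers to folds is an assumption made for illustration (the summary does not say how speakers were distributed), and the function name `speaker_folds` is hypothetical.

```python
from collections import defaultdict


def speaker_folds(speech_speakers, n_folds=10):
    """Group speech indices into folds so that no speaker spans two folds.

    speech_speakers[i] is the speaker of speech i.
    """
    speakers = sorted(set(speech_speakers))
    # Assumed policy: deal speakers out to folds in round-robin order.
    fold_of_speaker = {s: i % n_folds for i, s in enumerate(speakers)}
    folds = defaultdict(list)
    for idx, speaker in enumerate(speech_speakers):
        folds[fold_of_speaker[speaker]].append(idx)
    return [folds[k] for k in range(n_folds)]


# Six speeches by four hypothetical speakers, split into two folds.
example = speaker_folds(["a", "b", "a", "c", "b", "d"], n_folds=2)
```

Because the split is made over speakers rather than speeches, every test speech comes from a speaker the model never saw in training.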

28 When deriving the label for perspective, it is important to differentiate between a particular legislation and a pro-choice / pro-life perspective. [sent-41, score-0.101]

29 We removed 22 legislators with a mixed record, that is, those who gave 20-60% support to one of the positions. [sent-49, score-0.135]

30 Death Penalty (DP) blogs: We use the University of Maryland Death Penalty Corpus (Greene and Resnik, 2009) of 1085 texts from a number of pro- and anti-death penalty websites. [sent-50, score-0.069]

31 We report 4-fold cross-validation (DP-4) using the folds in Greene and Resnik (2009), where training and testing data come from different websites for each of the sides, as well as 10-fold cross-validation performance on the entire corpus, irrespective of the site. [sent-51, score-0.116]

32 org by more than 200 different Israeli and Palestinian writers on issues related to the conflict. [sent-55, score-0.125]

33 James Moran, D-VA, as he changed his vote over the years. [sent-60, score-0.07]

34 For legislators rated by neither NRLC nor NARAL, we assumed the vote aligns with the perspective. [sent-61, score-0.205]

35 The 10-fold setting yields almost perfect performance, likely due to site-specific features beyond perspective per se; hence we do not use this setting in subsequent experiments. [sent-62, score-0.419]

36 son from either Arab or Western perspectives on Middle Eastern affairs in 2003-2009 from http://www. [sent-63, score-0.246]

37 The writers and interviewees on this site are usually former diplomats or government officials, academics, journalists, media and political analysts. [sent-66, score-0.188]

38 The specific issues cover a broad spectrum, including public life, politics, wars and conflicts, education, and trade relations in and between countries like Lebanon, Jordan, Iraq, Egypt, Yemen, Morocco, and Saudi Arabia, as well as their relations with the US and members of the European Union. [sent-67, score-0.053]

39 1 Pre-processing We are interested in perspective manifestations using common English vocabulary. [sent-69, score-0.356]

40 To avoid the possibility that artifacts such as names of senators or states drive the classification, we use as features words that contain only lowercase letters, possibly hyphenated. [sent-70, score-0.063]
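
A minimal sketch of such a filter follows. It assumes the text has already been split into tokens, and the exact pattern (one or more lowercase runs joined by single hyphens) is an interpretation of "possibly hyphenated", not the authors' published rule.

```python
import re

# Lowercase letters only, optionally joined by single internal hyphens.
LOWERCASE_WORD = re.compile(r"[a-z]+(?:-[a-z]+)*")


def select_features(tokens):
    """Keep tokens made only of lowercase letters, possibly hyphenated."""
    return [t for t in tokens if LOWERCASE_WORD.fullmatch(t)]


tokens = ["abortion", "Senate", "pro-life", "104th", "Ohio", "partial-birth", "-", "mr."]
kept = select_features(tokens)
```

Capitalized tokens such as names of senators or states are dropped, along with numbers and punctuation, which is the effect the sentence above describes.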

41 4 Models For generative models, we use two versions of Naive Bayes models, termed multi-variate Bernoulli (here, NB-BOOL) and multinomial (here, NB-COUNT), respectively, in McCallum and Nigam's (1998) study of event models for text categorization. [sent-73, score-0.206]

42 The first records presence/absence of a word in a text, while the second records the number of occurrences. [sent-74, score-0.076]
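
The difference between the two document representations can be shown on a toy example. This sketch covers only the feature extraction, not the Naive Bayes models themselves, and the function names are hypothetical.

```python
from collections import Counter


def count_features(tokens):
    """Count representation: number of occurrences of each word."""
    return dict(Counter(tokens))


def bool_features(tokens):
    """Boolean representation: 1 for every word present in the text."""
    return {t: 1 for t in set(tokens)}


tokens = "ban the ban on the procedure".split()
counts = count_features(tokens)
present = bool_features(tokens)
```

Both representations share the same vocabulary; they differ only in whether repeated use of a word adds information.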

43 McCallum and Nigam (1998) found NB-COUNT to do better than NB-BOOL for sufficiently large vocabulary sizes for text categorization by topic. [sent-75, score-0.289]

44 For discriminative models, we use linear SVM, with presence-absence, normalized frequency, and tfidf feature weighting. [sent-76, score-0.037]
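
The two frequency-based weightings might look as follows. The summary does not state which variants were used, so both definitions here are assumptions: normalized frequency is taken as count divided by document length, and tfidf as raw count times log(N/df).

```python
import math
from collections import Counter


def normalized_frequency(tokens):
    """Assumed definition: occurrences of a word divided by document length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()}


def tfidf(docs):
    """Assumed definition: raw term count times log(N / document frequency)."""
    n_docs = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [
        {t: c * math.log(n_docs / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


docs = [["ban", "ban", "procedure"], ["ban", "choice"]]
nf = normalized_frequency(docs[0])
weights = tfidf(docs)
```

Under this tfidf variant a word that occurs in every document receives weight zero, whereas the presence-absence weighting would still record it as 1.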

45 Both types of models are commonly used for text classification tasks. [sent-77, score-0.149]

46 (2006) use 4 We excluded Israeli, Turkish, Iranian, and Pakistani writers as not clearly representing either perspective. [sent-79, score-0.072]

47 We additionally removed words containing support, oppos, sustain, or overrid from the PBA data, so as not to inflate the performance on perspective classification due to the explicit reference to the upcoming vote. [sent-80, score-0.453]

48 NB-COUNT and SVM-NORMF for perspective classification; Pang et al. [sent-81, score-0.356]

49 (2008) all of the above for related tasks of movie review and political party classification. [sent-83, score-0.213]

50 6 5 Results Table 2 summarizes the cross-validation results for the four datasets discussed above. [sent-87, score-0.049]

51 We conclude that there is no evidence for the relevance of the frequency composition of the text for perspective classification, for all levels of venue- and topic-control, from the tightest (PBA debates) to the loosest (Western vs Arab authors on Middle Eastern affairs). [sent-97, score-0.596]

52 This result is a clear indication that perspective classification is quite different from text categorization by topic, where count-based features usually perform better than boolean features. [sent-98, score-0.727]

53 On the other hand, we have not 6 Parameter c controlling the trade-off between errors on training data and margin is op[...] with the grid c = {10^-6, 10^-5, [sent-99, score-0.05]

54 , 10^2}), since datasets are unbalanced (for example, there is a fold with 27%-73% split). [sent-107, score-0.049]

55 Here SVM-TFIDF is doing somewhat better than SVM-BOOL on one of the folds and much worse on two other folds; a paired t-test with just 4 pairs of observations does not detect a significant difference. [sent-108, score-0.16]

56 observed that boolean features are reliably better than count-based features, as reported for the sentiment classification task in the movie review domain (Pang et al. [sent-109, score-0.258]

57 We note the low performance on BL-I, which could testify to a low degree of lexical consolidation in the Arab vs Western perspectives (more on this below). [sent-111, score-0.476]

58 It is also possible that the small size of BL-I leads to overfitting and low accuracies. [sent-112, score-0.04]

59 However, a PBA subset with only 151 items (only 2002 and 2003 speeches) is still 96% classifiable, so size alone does not explain the low BL-I performance. [sent-113, score-0.04]

60 6 Consolidation of perspective We explore feature redundancy in perspective classification. [sent-114, score-0.806]

61 As a proxy of feature quality, we use the weight assigned to the feature by the SVM-BOOL model based on the training data. [sent-116, score-0.074]

62 Thus, to get the performance with N best features, we take the N/2 highest and lowest weight features, for the positive and negative classes, respectively, and retrain SVM-BOOL with these features only. [sent-117, score-0.063]
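
The selection step can be sketched as below, assuming the trained model's weights are available as a mapping from feature to weight. The function name and the example weights are hypothetical and are not results from the paper; retraining on the selected features is left out.

```python
def n_best_features(weights, n):
    """Return the n/2 most positive and n/2 most negative weight features."""
    half = n // 2
    if half == 0:
        return set()
    ranked = sorted(weights, key=weights.get)  # most negative weight first
    return set(ranked[:half]) | set(ranked[-half:])


# Made-up weights for illustration only.
weights = {
    "infanticide": 1.8,
    "unborn": 1.1,
    "the": 0.01,
    "procedure": -0.2,
    "women": -0.9,
    "choice": -1.5,
}
best = n_best_features(weights, 4)
```

Taking the two ends of the ranking keeps the strongest indicators of each class, so the retrained model sees keywords for both perspectives.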

63 Nbest shows the smallest N and its proportion out of all features for which the performance of SVM-BOOL with only the best N features is not significantly inferior (p1t>0. [sent-119, score-0.173]

64 No-Nbest shows the largest number N for which a model without N best features is not significantly inferior to the full model. [sent-121, score-0.11]

65 We observe that it is generally sufficient to use a small percentage of the available words to obtain the same classification accuracy as with the full feature set, even in high-accuracy cases such as PBA and BL. [sent-130, score-0.134]

66 The effectiveness of a small subset of features is consistent with the observation in the discourse analysis studies that rivals 8 We experimented with mutual-information-based feature selection as well, with generally worse results. [sent-131, score-0.156]

67 in long-lasting controversies tend to consolidate their vocabulary and signal their perspective with certain stigma words and banner words, that is, specific keywords used by a discourse community to implicate adversaries and to create sympathy with one's own perspective, respectively (Teubert, 2001). [sent-132, score-0.588]

68 Thus, in abortion debates, using infanticide as a synonym for abortion is a pro-life stigma. [sent-133, score-0.615]

69 Note that this does not mean the rest of the features are not informative for classification, only that they are redundant with respect to a small percentage of top weight features. [sent-134, score-0.063]

70 When N best features are eliminated, performance goes down significantly with even smaller N for PBA and BL datasets. [sent-135, score-0.063]

71 Thus, top features are not only effective, they are also crucial for accurate classification, as their discrimination capacity is not replicated by any of the other vocabulary words. [sent-136, score-0.197]

72 For DP and BL-I datasets, the results seem to suggest perspectives with more diffused keyword distribution (No-NBest figures are higher). [sent-138, score-0.198]

73 Better comparisons are needed in order to verify the hypothesis of low consolidation. [sent-140, score-0.04]

74 For example, Greene and Resnik (2009) reported higher classification accuracies for the DP-4 data using syntactic frames in which a selected group of words appeared, rather than mere presence/absence of the words. [sent-142, score-0.147]

75 A test of different perspectives based on statistical distribution divergence. [sent-188, score-0.198]

76 A comparison of event models for Naive Bayes text classification. [sent-197, score-0.052]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('perspective', 0.356), ('abortion', 0.269), ('pba', 0.269), ('perspectives', 0.198), ('jansche', 0.154), ('legislators', 0.135), ('vocabulary', 0.134), ('debates', 0.123), ('greene', 0.123), ('arab', 0.123), ('political', 0.116), ('folds', 0.116), ('western', 0.104), ('categorization', 0.103), ('speeches', 0.101), ('legislation', 0.101), ('consolidation', 0.101), ('classification', 0.097), ('vs', 0.097), ('israeli', 0.093), ('nigam', 0.087), ('death', 0.087), ('dp', 0.081), ('baayen', 0.078), ('bpd', 0.077), ('infanticide', 0.077), ('lemons', 0.077), ('naral', 0.077), ('nrlc', 0.077), ('stressing', 0.077), ('svmbool', 0.077), ('writers', 0.072), ('vote', 0.07), ('speaker', 0.069), ('penalty', 0.069), ('bhat', 0.067), ('palestinian', 0.067), ('recurrence', 0.067), ('tracing', 0.067), ('pang', 0.066), ('topic', 0.063), ('features', 0.063), ('opinion', 0.062), ('bitter', 0.062), ('beigman', 0.062), ('eastern', 0.062), ('topical', 0.059), ('mccallum', 0.058), ('redundancy', 0.057), ('frequencies', 0.057), ('frequency', 0.057), ('discourse', 0.056), ('boolean', 0.056), ('party', 0.055), ('resnik', 0.054), ('issues', 0.053), ('talk', 0.052), ('birth', 0.052), ('text', 0.052), ('svm', 0.051), ('naive', 0.05), ('life', 0.05), ('bl', 0.05), ('mere', 0.05), ('controlling', 0.05), ('baroni', 0.05), ('datasets', 0.049), ('affairs', 0.048), ('conflict', 0.048), ('relaxing', 0.048), ('president', 0.048), ('politics', 0.047), ('eibe', 0.047), ('inferior', 0.047), ('weka', 0.045), ('paired', 0.044), ('witten', 0.044), ('middle', 0.043), ('fold', 0.043), ('lin', 0.043), ('keywords', 0.042), ('movie', 0.042), ('sides', 0.041), ('low', 0.04), ('distributions', 0.039), ('findings', 0.039), ('kaufmann', 0.039), ('multinomial', 0.038), ('records', 0.038), ('processes', 0.038), ('feature', 0.037), ('yu', 0.036), ('ian', 0.036), ('composition', 0.034), ('stefan', 0.034), ('hall', 0.034), ('controversy', 0.034), ('igman', 0.034), ('legislatures', 0.034), ('musolff', 0.034)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999827 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

Author: Beata Beigman Klebanov ; Eyal Beigman ; Daniel Diermeier

Abstract: We establish the following characteristics of the task of perspective classification: (a) using term frequencies in a document does not improve classification achieved with absence/presence features; (b) for datasets allowing the relevant comparisons, a small number of top features is found to be as effective as the full feature set and indispensable for the best achieved performance, testifying to the existence of perspective-specific keywords. We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective.

2 0.09535446 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

Author: Georgios Paltoglou ; Mike Thelwall

Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.

3 0.086127281 22 acl-2010-A Unified Graph Model for Sentence-Based Opinion Retrieval

Author: Binyang Li ; Lanjun Zhou ; Shi Feng ; Kam-Fai Wong

Abstract: There is a growing research interest in opinion retrieval as on-line users’ opinions are becoming more and more popular in business, social networks, etc. Practically speaking, the goal of opinion retrieval is to retrieve documents, which entail opinions or comments, relevant to a target subject specified by the user’s query. A fundamental challenge in opinion retrieval is information representation. Existing research focuses on document-based approaches and documents are represented by bag-of-word. However, due to loss of contextual information, this representation fails to capture the associative information between an opinion and its corresponding target. It cannot distinguish different degrees of a sentiment word when associated with different targets. This in turn seriously affects opinion retrieval performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information between an opinion and its target. Additionally, we consider inter-sentence information to capture the relationships among the opinions on the same topic. Finally, the two types of information are combined in a unified graph-based model, which can effectively rank the documents. Compared with existing approaches, experimental results on the COAE08 dataset showed that our graph-based model achieved significant improvement.

4 0.073369473 6 acl-2010-A Game-Theoretic Model of Metaphorical Bargaining

Author: Beata Beigman Klebanov ; Eyal Beigman

Abstract: We present a game-theoretic model of bargaining over a metaphor in the context of political communication, find its equilibrium, and use it to rationalize observed linguistic behavior. We argue that game theory is well suited for modeling discourse as a dynamic resulting from a number of conflicting pressures, and suggest applications of interest to computational linguists.

5 0.070799939 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews

Author: Niklas Jakob ; Iryna Gurevych

Abstract: unkown-abstract

6 0.068595886 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

7 0.066709504 123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

8 0.066579282 158 acl-2010-Latent Variable Models of Selectional Preference

9 0.064575933 145 acl-2010-Improving Arabic-to-English Statistical Machine Translation by Reordering Post-Verbal Subjects for Alignment

10 0.064112782 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information

11 0.063513905 78 acl-2010-Cross-Language Text Classification Using Structural Correspondence Learning

12 0.059913654 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

13 0.057665691 85 acl-2010-Detecting Experiences from Weblogs

14 0.057424285 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure

15 0.056642499 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts

16 0.055176087 33 acl-2010-Assessing the Role of Discourse References in Entailment Inference

17 0.054975357 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

18 0.052034829 77 acl-2010-Cross-Language Document Summarization Based on Machine Translation Quality Prediction

19 0.050872244 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features

20 0.050783858 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.171), (1, 0.069), (2, -0.08), (3, 0.026), (4, -0.034), (5, -0.001), (6, -0.023), (7, 0.004), (8, 0.033), (9, -0.016), (10, -0.02), (11, 0.007), (12, 0.06), (13, 0.021), (14, 0.006), (15, -0.008), (16, -0.038), (17, 0.032), (18, 0.069), (19, -0.055), (20, 0.027), (21, -0.062), (22, 0.004), (23, -0.027), (24, -0.06), (25, -0.018), (26, 0.017), (27, 0.054), (28, 0.003), (29, 0.037), (30, 0.026), (31, 0.081), (32, 0.046), (33, 0.012), (34, -0.03), (35, 0.091), (36, -0.004), (37, 0.036), (38, -0.053), (39, 0.075), (40, 0.017), (41, 0.029), (42, -0.016), (43, -0.002), (44, -0.04), (45, -0.014), (46, -0.113), (47, -0.066), (48, -0.071), (49, -0.135)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91373694 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

Author: Beata Beigman Klebanov ; Eyal Beigman ; Daniel Diermeier

Abstract: We establish the following characteristics of the task of perspective classification: (a) using term frequencies in a document does not improve classification achieved with absence/presence features; (b) for datasets allowing the relevant comparisons, a small number of top features is found to be as effective as the full feature set and indispensable for the best achieved performance, testifying to the existence of perspective-specific keywords. We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective.

2 0.65593302 81 acl-2010-Decision Detection Using Hierarchical Graphical Models

Author: Trung H. Bui ; Stanley Peters

Abstract: We investigate hierarchical graphical models (HGMs) for automatically detecting decisions in multi-party discussions. Several types of dialogue act (DA) are distinguished on the basis of their roles in formulating decisions. HGMs enable us to model dependencies between observed features of discussions, decision DAs, and subdialogues that result in a decision. For the task of detecting decision regions, an HGM classifier was found to outperform non-hierarchical graphical models and support vector machines, raising the F1-score to 0.80 from 0.55.

3 0.51661479 246 acl-2010-Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure

Author: Minwoo Jeong ; Ivan Titov

Abstract: Documents often have inherently parallel structure: they may consist of a text and commentaries, or an abstract and a body, or parts presenting alternative views on the same problem. Revealing relations between the parts by jointly segmenting and predicting links between the segments, would help to visualize such documents and construct friendlier user interfaces. To address this problem, we propose an unsupervised Bayesian model for joint discourse segmentation and alignment. We apply our method to the “English as a second language” podcast dataset where each episode is composed of two parallel parts: a story and an explanatory lecture. The predicted topical links uncover hidden relations between the stories and the lectures. In this domain, our method achieves competitive results, rivaling those of a previously proposed supervised technique.

4 0.50017256 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis

Author: Georgios Paltoglou ; Mike Thelwall

Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.

5 0.49299204 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms

Author: Zhili Wu ; Katja Markert ; Serge Sharoff

Abstract: Prior use of machine learning in genre classification used a list of labels as classification categories. However, genre classes are often organised into hierarchies, e.g., covering the subgenres of fiction. In this paper we present a method of using the hierarchy of labels to improve the classification accuracy. As a testbed for this approach we use the Brown Corpus as well as a range of other corpora, including the BNC, HGC and Syracuse. The results are not encouraging: apart from the Brown corpus, the improvements of our structural classifier over the flat one are not statistically significant. We discuss the relation between structural learning performance and the visual and distributional balance of the label hierarchy, suggesting that only balanced hierarchies might profit from structural learning.

6 0.48963621 85 acl-2010-Detecting Experiences from Weblogs

7 0.47562423 34 acl-2010-Authorship Attribution Using Probabilistic Context-Free Grammars

8 0.47298384 151 acl-2010-Intelligent Selection of Language Model Training Data

9 0.47021416 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

10 0.46882477 76 acl-2010-Creating Robust Supervised Classifiers via Web-Scale N-Gram Data

11 0.45505038 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

12 0.45331958 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences

13 0.45255309 19 acl-2010-A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation

14 0.43862498 74 acl-2010-Correcting Errors in Speech Recognition with Articulatory Dynamics

15 0.43179986 197 acl-2010-Practical Very Large Scale CRFs

16 0.43175468 139 acl-2010-Identifying Generic Noun Phrases

17 0.43129319 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs

18 0.43093741 241 acl-2010-Transition-Based Parsing with Confidence-Weighted Classification

19 0.43033463 58 acl-2010-Classification of Feedback Expressions in Multimodal Data

20 0.43003672 61 acl-2010-Combining Data and Mathematical Models of Language Change


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(25, 0.026), (42, 0.014), (59, 0.059), (73, 0.035), (76, 0.011), (78, 0.018), (83, 0.672), (84, 0.02), (98, 0.074)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97672784 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

Author: Beata Beigman Klebanov ; Eyal Beigman ; Daniel Diermeier

Abstract: We establish the following characteristics of the task of perspective classification: (a) using term frequencies in a document does not improve classification achieved with absence/presence features; (b) for datasets allowing the relevant comparisons, a small number of top features is found to be as effective as the full feature set and indispensable for the best achieved performance, testifying to the existence of perspective-specific keywords. We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective.

2 0.96508038 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data

Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jurgen Ziegler

Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eye-tracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that full-scale context is mostly negligible, with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.

3 0.96249449 72 acl-2010-Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information

Author: Marta Recasens ; Eduard Hovy

Abstract: This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also expose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.

4 0.94666737 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

Author: Emily Pitler ; Annie Louis ; Ani Nenkova

Abstract: To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference information, and summarization specific features. Our best results are 90% accuracy for pairwise comparisons of competing systems over a test set of several inputs and 70% for ranking summaries of a specific input.

5 0.9442181 132 acl-2010-Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data

Author: Jenny Rose Finkel ; Christopher D. Manning

Abstract: One of the main obstacles to producing high quality joint models is the lack of jointly annotated data. Joint modeling of multiple natural language processing tasks outperforms single-task models learned from the same data, but still underperforms compared to single-task models learned on the more abundant quantities of available single-task annotated data. In this paper we present a novel model which makes use of additional single-task annotated data to improve the performance of a joint model. Our model utilizes a hierarchical prior to link the feature weights for shared features in several single-task models and the joint model. Experiments on joint parsing and named entity recognition, using the OntoNotes corpus, show that our hierarchical joint model can produce substantial gains over a joint model trained on only the jointly annotated data.

6 0.84861469 31 acl-2010-Annotation

7 0.81739062 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."

8 0.81017703 73 acl-2010-Coreference Resolution with Reconcile

9 0.74103987 81 acl-2010-Decision Detection Using Hierarchical Graphical Models

10 0.73132902 33 acl-2010-Assessing the Role of Discourse References in Entailment Inference

11 0.72334647 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years

12 0.71325046 112 acl-2010-Extracting Social Networks from Literary Fiction

13 0.70757931 32 acl-2010-Arabic Named Entity Recognition: Using Features Extracted from Noisy Data

14 0.69892991 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

15 0.69452411 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People

16 0.69446599 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

17 0.69054228 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information

18 0.68231136 122 acl-2010-Generating Fine-Grained Reviews of Songs from Album Reviews

19 0.67919111 197 acl-2010-Practical Very Large Scale CRFs

20 0.67896831 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection