
389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

Source: pdf

Author: Beata Beigman Klebanov ; Michael Flor

Abstract: We describe a new representation of the content vocabulary of a text we call word association profile that captures the proportions of highly associated, mildly associated, unassociated, and dis-associated pairs of words that co-exist in the given text. We illustrate the shape of the distribution and observe variation with genre and target audience. We present a study of the relationship between quality of writing and word association profiles. For a set of essays written by college graduates on a number of general topics, we show that the higher scoring essays tend to have higher percentages of both highly associated and dis-associated pairs, and lower percentages of mildly associated pairs of words. Finally, we use word association profiles to improve a system for automated scoring of essays.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We describe a new representation of the content vocabulary of a text we call word association profile that captures the proportions of highly associated, mildly associated, unassociated, and dis-associated pairs of words that co-exist in the given text. [sent-2, score-0.426]

2 We illustrate the shape of the distribution and observe variation with genre and target audience. [sent-3, score-0.148]

3 We present a study of the relationship between quality of writing and word association profiles. [sent-4, score-0.14]

4 For a set of essays written by college graduates on a number of general topics, we show that the higher scoring essays tend to have higher percentages of both highly associated and dis-associated pairs, and lower percentages of mildly associated pairs of words. [sent-5, score-1.622]

5 Finally, we use word association profiles to improve a system for automated scoring of essays. [sent-6, score-0.353]

6 However, little is known about typical profiles of texts in terms of co-occurrence behavior of their words. [sent-10, score-0.212]

7 , 2009; Eisenstein and Barzilay, 2008) tells us that texts contain some proportion of more highly associated word pairs (those in subsequent sentences within the same topical unit) and of less highly associated pairs (those in sentences from different topical units). [sent-13, score-0.57]

8 Yet, does each text have a different distribution of highly associated, mildly associated, unassociated, and dis-associated pairs of words, or do texts tend to strike a similar balance of these? [sent-14, score-0.33]

9 What are the proportions of the different levels of association, how much variation there exists, and are there systematic differences between various kinds of texts? [sent-15, score-0.084]

10 From the applied perspective, our interest is in quantifying differences between well-written and poorly written essays, for the purposes of automated scoring of essays. [sent-17, score-0.192]

11 We therefore concentrate on essay data for the main experiments reported in this paper, although some additional corpora will be used for illustration purposes. [sent-18, score-0.439]

12 Section 2 presents our methodology for building word association profiles for texts. [sent-20, score-0.207]

13 Section 3 illustrates the profiles for three corpora from different genres. [sent-21, score-0.135]

14 2 presents our study of the relationship between writing quality and patterns of word associations, with section 4. [sent-23, score-0.098]

15 5 showing the results of adding a feature based on word association profile to a state-of-art essay scoring system. [sent-24, score-0.681]

16 [1] Note that the classical approach to topical segmentation of texts, TextTiling (Hearst, 1997), uses only word repetitions. [sent-26, score-0.095]

17 ©2013 Association for Computational Linguistics, pages 1148–1158. 2 Methodology In order to describe the word association profile of a text, three decisions need to be made. [sent-30, score-0.15]

18 The second is which pairs of words in a text to consider when building a profile for the text; we opted for all pairs of content word types occurring in a text, irrespective of the distance between them. [sent-32, score-0.25]

19 The third decision is how to represent the co-occurrence profiles; we use a histogram where each bin represents the proportion of word pairs in the given interval of PMI values. [sent-34, score-0.211]

20 1 million word types and of 1,279 million word type pairs are efficiently compressed using the TrendStream technology (Flor, 2013), resulting in a database file of 4. [sent-42, score-0.131]

21 We store pairwise word associations as bigrams; since associations are unordered, only one of the orders is actually stored in the database. [sent-45, score-0.178]
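The unordered-storage idea in sentence 21 can be sketched as follows. This is a minimal illustration only, not the TrendStream implementation: the `pair_key` helper and the in-memory `PairStore` class are hypothetical names, and sorting the two words alphabetically is just one possible way to pick the single stored order.

```python
from typing import Dict, Tuple


def pair_key(x: str, y: str) -> Tuple[str, str]:
    """Return one canonical order so (x, y) and (y, x) share a single entry."""
    return (x, y) if x <= y else (y, x)


class PairStore:
    """Toy in-memory stand-in for a co-occurrence database (hypothetical)."""

    def __init__(self) -> None:
        self._counts: Dict[Tuple[str, str], int] = {}

    def add(self, x: str, y: str, count: int = 1) -> None:
        # Both orders of the pair accumulate into the same canonical key.
        key = pair_key(x, y)
        self._counts[key] = self._counts.get(key, 0) + count

    def get(self, x: str, y: str) -> int:
        return self._counts.get(pair_key(x, y), 0)

    def __len__(self) -> int:
        return len(self._counts)
```

Looking a pair up in either order returns the same count, and the store holds one entry per unordered pair, which roughly halves the storage compared with keeping both orders.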

22 We define WAP_T, a word association profile of a text T, as the distribution of PMI(x, y) for all pairs of content word types (x, y) ∈ T. [sent-53, score-0.251]
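The definition in sentence 22, combined with the histogram representation from sentence 19, can be sketched as below. This is a hedged illustration under several assumptions that are not in the source: `pmi_lookup` is a hypothetical caller-supplied function returning a PMI value or `None`, pairs without association data are skipped, the bin edges are supplied by the caller (the actual bin boundaries are truncated in the text above), and values outside the edges are clamped into the end bins.

```python
import bisect
from itertools import combinations
from typing import Callable, List, Optional, Sequence


def word_association_profile(
    word_types: Sequence[str],
    pmi_lookup: Callable[[str, str], Optional[float]],
    bin_edges: Sequence[float],
) -> List[float]:
    """Return the proportion of word-type pairs falling into each PMI bin."""
    edges = sorted(bin_edges)
    counts = [0] * (len(edges) - 1)
    total = 0
    # Every unordered pair of distinct word types, irrespective of distance.
    for x, y in combinations(sorted(set(word_types)), 2):
        value = pmi_lookup(x, y)
        if value is None:  # assumption: pairs with no association data are skipped
            continue
        # Assumption: values beyond the outer edges are clamped into the end bins.
        index = bisect.bisect_right(edges, value) - 1
        index = min(max(index, 0), len(counts) - 1)
        counts[index] += 1
        total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Toy PMI values, invented purely for illustration.
toy_pmi = {("instrument", "play"): 3.0, ("instrument", "recorder"): 0.5}
profile = word_association_profile(
    ["recorder", "play", "instrument", "play"],
    lambda x, y: toy_pmi.get((x, y)),
    bin_edges=[-1.0, 0.0, 1.0, 2.0, 4.0],
)
```

Because the profile is normalized by the number of pairs with known PMI, profiles of texts with different vocabulary sizes remain comparable.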

23 All pairs of word types for which the associations [sent-54, score-0.101]

24 67, while the rest of the bins contain word pairs (x, y) with −5 2. [sent-61, score-0.18]

25 5 We picked articles of 250 to 700 words in length, in order to keep the length of texts comparable to the essay data, while varying the genre; 770 such articles were found. [sent-67, score-0.516]

26 We observe that the shape of the distribution is similar to that of essay data, although WSJ articles tend to be less tight, on average, since the distribution in PMI<2. [sent-69, score-0.543]

27 The second additional corpus contains 140 literary texts written or adapted for readers in grades 3 and 4 in US schools (Sheehan et al. [sent-73, score-0.166]

28 [5] LDC93T3A in LDC catalogue. The average WAP for these texts is shown with a thin solid (purple) line with circular markers in Figure 1 (Grades 3-4). [sent-76, score-0.205]

29 These texts are much tighter than texts in the other two collections, as the distribution is shifted to the right. [sent-77, score-0.154]

30 17, holds 19% of all word pairs in these texts, more than twice the proportion in essays written by college graduates or in texts from the WSJ. [sent-79, score-0.939]

31 It is instructive to check whether the over-use of highly associated pairs is felt during reading. [sent-80, score-0.161]

32 These texts strike an adult reader as overly explicit, taking the space to state things that an adult reader would readily infer or assume. [sent-81, score-0.153]

33 In fact, these sentences almost seem like training sentences, the kind of sentences from which the associations between recorder and musical instrument, play, blowing can be learned. [sent-92, score-0.181]

34 To conclude the illustration, we observe that there are some broad similarities between the different corpora in terms of the distribution of pairs of word types. [sent-94, score-0.131]

35 Thus, texts seem to be mainly made of pairs of weakly associated words: about half the pairs of word types lie between PMI of 0. [sent-95, score-0.301]

36 The percentages of pairs at the low and the high ends of PMI differ with genre: writing for children favors the higher end, while typical Wall Street Journal writing favors the low end, relative to a corpus of essays on general topics written by college graduates. [sent-98, score-0.878]

37 Average for essays (a thick solid blue line), average for WSJ articles (a dashed orange line); average for Grades 3-4 corpus (a thin solid purple line with round markers). [sent-101, score-0.715]

38 Normal distribution is shown with a thin solid green line with asterisk markers. [sent-102, score-0.092]

39 Middle 70% of essays fall between the dotted lines. [sent-103, score-0.482]

40 we believe the illustration is suggestive, in that there is both constancy in writing for a similar purpose (observe the limited variation around the average that captures 70% of the essays) and variation with genre and target audience. [sent-104, score-0.242]

41 In what follows, we will explore more thoroughly the information provided by word association profiles regarding the quality of writing. [sent-105, score-0.207]

42 4 Application to Essay Scoring Texts written for a test and scored by relevant professionals is a setting where variation in text quality is expected. [sent-106, score-0.128]

43 In this section, we report our experiments with using WAPs to explore the variation in quality as quantified by essay scores. [sent-107, score-0.49]

44 1), then show the patterns of relationships between essay scores and word association profiles (section 4. [sent-109, score-0.677]

45 Finally, we report on an experiment where we significantly improve the performance of a very competitive, state-of-art system for automated scoring of essays, using a feature derived from WAP. [sent-111, score-0.146]

46 1 Data We consider two collections of essays written as responses in an analytical writing section of a high-stakes standardized test for graduate school admission; the time limit for essay composition was 45 minutes. [sent-113, score-1.035]

47 Essays were written in response to a prompt (essay question). [sent-114, score-0.097]

48 Example prompts are: “High-speed electronic communications media, such as electronic mail and television, tend to prevent meaningful and thoughtful communication” and “In the age of television, reading books is not as important as it once was. [sent-116, score-0.194]

49 ” The first collection (henceforth, setA) contains 8,899 essays written in response to nine different prompts, about 1,000 per prompt;[6] the per-prompt subsets will be termed setA-p1 through setA-p9. [sent-118, score-0.528]

50 Each essay in setA was scored by 1 to 4 human raters on a scale of 1 to 6; the majority of essays received 2 human scores. [sent-119, score-0.546]

51 Most essays thereby receive an integer score,[7] so the ranking of the essays is coarse. [sent-121, score-0.964]

52 [6] While we sampled exactly 1,000 essays per prompt, we removed empty responses, resulting in 975 to 1,000 essays per sample. [sent-123, score-0.964]

53 [7] as the two raters agree most of the time. The second collection (henceforth, setB) contains 400 essays, with 200 essays written on each of two prompts given as examples above (setB-p1 and setB-p2). [sent-124, score-0.755]

54 (2013), each essay was scored by 16 professional raters on a scale of 1 to 6, allowing plus and minus scores as well, quantified as 0. [sent-126, score-0.577]

55 This fine-grained scale resulted in higher mean pairwise inter-rater correlations than the traditional integer-only scale (r=0. [sent-129, score-0.078]

56 We use the average of 16 raters as the final grade for each essay. [sent-132, score-0.112]

57 This dataset provides a very fine-grained ranking of the essays, with almost no two essays getting exactly the same score. [sent-133, score-0.482]

58 13 (row titled 5 and column titled SetB p1) means that 13% of the essays in setB-p1 received scores that round to 5. [sent-138, score-0.575]

59 For setA, average, minimum, and maximum values across the nine prompts are shown. [sent-139, score-0.151]

60 The use of 16 raters seems to have moved the rounded scores towards the middle; however, the relative ranking of the essays is much more delicate in setB than in setA. [sent-144, score-0.626]

61 2 Essay Score vs WAP We calculated correlations between essay score and the proportion of word pairs in each of the 60 bins of the WAP histogram, separately for each of the prompts p1-p6 in setA. [sent-146, score-0.923]
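The per-bin analysis in sentence 61 can be sketched as follows. This is an illustrative sketch, assuming Pearson correlation is the statistic used for the per-bin analysis (the text names Pearson only for the final evaluation) and assuming each essay's profile is a list of per-bin proportions; the function names and the three toy profiles are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; NaN if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def bin_score_correlations(
    profiles: Sequence[Sequence[float]], scores: Sequence[float]
) -> List[float]:
    """Correlate essay scores with the proportion of pairs in each bin."""
    n_bins = len(profiles[0])
    return [pearson([p[b] for p in profiles], scores) for b in range(n_bins)]


# Three toy two-bin profiles and scores, invented for illustration.
toy_profiles = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
toy_scores = [1.0, 2.0, 3.0]
correlations = bin_score_correlations(toy_profiles, toy_scores)
```

Running this separately per prompt, as the text describes, yields one correlation curve per prompt over the bins.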

62 First, we observe that, perhaps contrary to expectation, the proportion of the highest values of PMI (the area to the right of PMI=4 in Figure 2) does not yield a consistent correlation with essay scores. [sent-151, score-0.579]

63 67 in Figure 2) produces a very consistent picture, with only two points out of 48 in that interval lacking significant positive correlation with essay score (p2 at PMI=3. [sent-155, score-0.474]

64 Next, observe the consistent negative correlations between essay score and the proportion of word pairs in bins PMI=0. [sent-157, score-0.802]

65 In what follows, we check whether the information about essay quality provided by WAP can be used to improve essay scoring. [sent-163, score-0.878]

66 We thank an anonymous reviewer for suggesting this direction, and leave a more detailed examination of the pairs in the highest-PMI bins to future work. [sent-169, score-0.15]

67 Figure 2: Correlations with essay score for various bins of the WAP histogram. [sent-172, score-0.518]

68 P1 to P6 correspond to the first 6 prompts in SetA. [sent-173, score-0.151]

69 3 Baseline As a baseline, we use e-rater (Attali and Burstein, 2006), a state-of-art essay scoring system developed at Educational Testing Service. [sent-175, score-0.531]

70 We use a generic e-rater model built at Educational Testing Service using essays across a variety of writing prompts, with no connection to the current project and its authors. [sent-180, score-0.55]

71 This is a very competitive baseline, as e-rater features explain more than 70% of the variation in essay scores on a relatively coarse scale (setA) and more than 80% of the variation in scores on a fine-grained scale (setB). [sent-186, score-0.603]

72 4 Adding WAP We define HAT high associative tightness as the percentage of word type pairs with 2. [sent-191, score-0.134]

73 33 0 area that had a positive correlation with essay score in the setA-p1 set. [sent-192, score-0.474]
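A feature of the HAT kind can be sketched as below. The PMI bounds of the interval are truncated in the sentences above, so this sketch takes `lower` and `upper` as caller-supplied parameters rather than guessing them; the function name and the half-open interval convention are assumptions.

```python
from typing import Iterable


def high_associative_tightness(
    pmi_values: Iterable[float], lower: float, upper: float
) -> float:
    """Percentage of word-type pairs whose PMI lies in [lower, upper)."""
    values = list(pmi_values)
    if not values:
        return 0.0
    hits = sum(1 for v in values if lower <= v < upper)
    return 100.0 * hits / len(values)
```

For example, with the toy values `[0.5, 2.5, 3.0, 5.0]` and hypothetical bounds `lower=2.0`, `upper=4.0`, two of the four pairs fall inside the interval.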

74 We note that the HAT feature is not correlated with essay length. [sent-199, score-0.439]

75 Essay length is not used as a feature in e-rater models, but it typically correlates strongly with the human essay score (at about r=0. [sent-200, score-0.439]

76 the proportion of mildly associated vocabulary in an essay, is indirectly captured by another feature or group of features already present in e-rater. [sent-207, score-0.186]

77 Likewise, a feature that calculates the average PMI for all pairs of content word types in the text failed to produce an improvement over the baseline for setA p1-p6. [sent-208, score-0.137]

78 The reason for this can be observed in Figure 2: The higher-scoring essays having more of both the low and the high PMI pairs leads to about the same average PMI as for the lower-scoring essays that have a higher concentration of values closer to the average PMI. [sent-209, score-1.107]
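The argument in sentence 78 can be made concrete with a toy example. The numbers below are invented solely to illustrate the point and are not data from the paper: two sets of PMI values share the same mean, yet one has far more mass at both extremes, so the mean alone cannot separate them while the share of mid-range pairs can.

```python
from statistics import mean
from typing import Sequence


def share_in_range(values: Sequence[float], lower: float, upper: float) -> float:
    """Fraction of values with lower <= value < upper."""
    return sum(1 for v in values if lower <= v < upper) / len(values)


# Invented toy PMI values, for illustration only.
concentrated = [1.0, 1.0, 1.0, 1.0]   # all pairs close to the average
two_tailed = [-1.0, 1.0, 1.0, 3.0]    # more low-PMI and more high-PMI pairs

same_mean = mean(concentrated) == mean(two_tailed)
mid_share_concentrated = share_in_range(concentrated, 0.0, 2.0)
mid_share_two_tailed = share_in_range(two_tailed, 0.0, 2.0)
```

Here both means equal 1.0, but the mid-range share differs, which is the kind of information a histogram-based feature retains and an average discards.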

79 5 Evaluation To evaluate the usefulness of WAP in improving automated scoring of essays, we estimate a linear regression model using the human score as a dependent variable (label) and e-rater score and the HAT as the two independent variables (features). [sent-211, score-0.182]
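The regression setup in sentence 79 can be sketched as follows. This is a minimal ordinary-least-squares sketch in pure Python, not the authors' code: the function names are hypothetical, the solver uses the normal equations with Gauss-Jordan elimination, and the example numbers are synthetic (generated from known weights so the fit can be checked).

```python
from typing import List, Sequence


def fit_linear(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Ordinary least squares; returns [intercept, weight_1, weight_2, ...]."""
    rows = [[1.0, *f] for f in features]  # prepend a constant term
    k = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("features are collinear; system is singular")
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Apply the fitted model to one (e-rater score, HAT) pair."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_row))


# Synthetic (e-rater score, HAT) rows; targets follow 1 + 0.5*erater + 2*hat.
toy_features = [(3.0, 0.10), (4.0, 0.20), (5.0, 0.15), (2.0, 0.30)]
toy_targets = [1 + 0.5 * e + 2 * h for e, h in toy_features]
toy_weights = fit_linear(toy_features, toy_targets)
```

Predictions from such a model on held-out essays would then be compared with human scores, for example by Pearson correlation, as the evaluation section describes.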

80 Note that e-rater itself is not trained on any of the data in setA and setB; we use the same e-rater model for all evaluations, a generic model that was pretrained on a large number of essays across different prompts. [sent-222, score-0.482]

81 Performance is measured using Pearson correlation with essay score. [sent-286, score-0.474]

82 9242, was observed for the setting with the highest baseline performance, suggesting that the HAT feature is still effective even after the delicate ranking of the essays revealed an exceptionally strong performance of e-rater. [sent-300, score-0.482]

83 5 Related Work Most of the attention in the computational linguistics research that deals with analysis of the lexis of texts has so far been paid to what in our terms would be the very high end of the word association profile. [sent-301, score-0.18]

84 Thus, following Halliday and Hasan (1976), Hoey (1991), and Morris and Hirst (1991), the notion of lexical cohesion has been used to capture repetitions of words and occurrence of words with related meanings in a text. [sent-302, score-0.108]

85 To our knowledge, lexical cohesion has not so far been used for automated scoring of essays. [sent-306, score-0.254]

86 Aspects related to the distribution of words in essays have been studied in relation to essay scoring. [sent-308, score-0.921]

87 (2004) compare sentences from certain discourse segments in an essay to determine their semantic similarity, such as comparing thesis statements to conclusions or thesis statements to essay prompts. [sent-320, score-0.878]

88 Our approach is different in that it does not measure the flow of the text, that is, the sequencing and repetition of the words, but rather assesses the choice of vocabulary as a whole. [sent-323, score-0.086]

89 Since (a) topics encapsulate clusters of highly associated words, and (b) topics for a given text are modeled as being chosen independently from each other, we expect a negative correlation between the number of topics in a document and the tightness of the word association profile of the text. [sent-328, score-0.413]

90 An alternative representation of word association profile would be a weighted graph, where the weights correspond to pairwise associations between words. [sent-329, score-0.224]
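The graph representation mentioned in sentence 90 can be sketched as below. This is only one possible encoding, assuming an undirected graph stored as nested dictionaries with PMI values as edge weights; `pmi_lookup` is a hypothetical caller-supplied function returning a PMI value or `None`, and the toy weights are invented.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Sequence


def build_association_graph(
    word_types: Sequence[str],
    pmi_lookup: Callable[[str, str], Optional[float]],
) -> Dict[str, Dict[str, float]]:
    """Undirected weighted graph: graph[x][y] == graph[y][x] == PMI(x, y)."""
    graph: Dict[str, Dict[str, float]] = {w: {} for w in set(word_types)}
    for x, y in combinations(sorted(graph), 2):
        weight = pmi_lookup(x, y)
        if weight is None:  # assumption: no edge when no association is known
            continue
        graph[x][y] = weight
        graph[y][x] = weight
    return graph


# Toy weights, invented for illustration.
toy_weights_table = {("play", "recorder"): 2.5}
toy_graph = build_association_graph(
    ["recorder", "play", "blowing"],
    lambda x, y: toy_weights_table.get((x, y)),
)
```

The histogram profile can be recovered from such a graph by binning its edge weights, whereas the graph additionally keeps track of which words participate in which associations.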

91 Steyvers and Tenenbaum (2005) analyze the graphs induced from large repositories like WordNet or databases of free associations, and find them to be scale-free and small-world; it is an open question whether word association graphs induced from book-length texts would exhibit similar properties. [sent-331, score-0.149]

92 We believe that word association profiles reflect the artwork that goes into using those internalized associations between words when creating a text, achieving the right mix of strong and weak, positive and negative associations. [sent-333, score-0.281]

93 We observed that the shape of the distribution is quite stable across various texts, with about half the pairs having a mild association; the allocation of pairs to the higher and the lower levels of association does vary across genres and target audiences. [sent-335, score-0.215]

94 We further presented a study of the relationship between quality of writing and word association profiles. [sent-336, score-0.14]

95 Finally, we demonstrated that the information provided by word association profiles leads to a significant improvement in a highly competitive, state-of-art essay scoring system that already measures various aspects of writing quality. [sent-339, score-0.844]

96 Scoring with the computer: Alternative procedures for improving reliability of holistic essay scoring. [sent-358, score-0.439]

97 Enhancing lexical cohesion measure with confidence measures, semantic relations and language model interpolation for multimedia spoken content topic segmentation. [sent-448, score-0.139]

98 Evaluation of text coherence for electronic essay scoring systems. [sent-521, score-0.588]

99 When do standard approaches for measuring vocabulary difficulty, syntactic complexity and referential cohesion yield biased estimates of text difficulty? [sent-561, score-0.109]

100 Select: A lexical cohesion based news story segmentation system. [sent-575, score-0.14]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('essays', 0.482), ('essay', 0.439), ('pmi', 0.273), ('seta', 0.233), ('wap', 0.169), ('prompts', 0.151), ('profiles', 0.135), ('setb', 0.132), ('hat', 0.127), ('mildly', 0.101), ('hoey', 0.094), ('scoring', 0.092), ('attali', 0.083), ('bins', 0.079), ('correlations', 0.078), ('profile', 0.078), ('texts', 0.077), ('cohesion', 0.076), ('raters', 0.076), ('proportion', 0.075), ('associations', 0.074), ('burstein', 0.072), ('pairs', 0.071), ('writing', 0.068), ('coherence', 0.057), ('thin', 0.056), ('automated', 0.054), ('repetition', 0.053), ('associated', 0.052), ('variation', 0.051), ('prompt', 0.051), ('graduates', 0.05), ('unassociated', 0.05), ('wsj', 0.049), ('written', 0.046), ('bullinaria', 0.046), ('television', 0.046), ('grades', 0.043), ('tend', 0.043), ('association', 0.042), ('jill', 0.041), ('percentages', 0.041), ('hirst', 0.04), ('barzilay', 0.039), ('chains', 0.038), ('adult', 0.038), ('highly', 0.038), ('blowing', 0.038), ('ercan', 0.038), ('marathe', 0.038), ('recorder', 0.038), ('sheehan', 0.038), ('silber', 0.038), ('stokes', 0.038), ('trendstream', 0.038), ('yoko', 0.038), ('rounded', 0.037), ('solid', 0.036), ('genre', 0.036), ('regression', 0.036), ('average', 0.036), ('topics', 0.035), ('correlation', 0.035), ('histogram', 0.035), ('vocabulary', 0.033), ('reshef', 0.033), ('futagi', 0.033), ('admission', 0.033), ('misra', 0.033), ('guinaudeau', 0.033), ('yigal', 0.033), ('purple', 0.033), ('flor', 0.033), ('devitt', 0.033), ('tightness', 0.033), ('proportions', 0.033), ('topical', 0.033), ('lexical', 0.032), ('church', 0.032), ('segmentation', 0.032), ('shape', 0.031), ('college', 0.031), ('creative', 0.031), ('scored', 0.031), ('topic', 0.031), ('mechanics', 0.031), ('titled', 0.031), ('lexis', 0.031), ('riedl', 0.031), ('texttiling', 0.031), ('musical', 0.031), ('higgins', 0.031), ('gruber', 0.031), ('scores', 0.031), ('observe', 0.03), ('turney', 0.03), ('word', 0.03), ('michael', 0.029), ('lund', 0.029), ('halliday', 0.029)]
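The page does not state how the similarity values in the lists that follow are computed. A common choice for comparing tf-idf vectors is cosine similarity, and the sketch below shows that measure under this assumption; the function name is hypothetical, and the vectors are represented as sparse word-to-weight dictionaries like the list above.

```python
import math
from typing import Dict


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# The two top-weighted words from the list above, used as a tiny vector.
this_paper = {"essays": 0.482, "essay": 0.439}
```

Under this measure a paper compared with itself scores 1 (up to rounding), and two papers with no shared weighted words score 0.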

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999869 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

Author: Beata Beigman Klebanov ; Michael Flor

2 0.47386113 246 acl-2013-Modeling Thesis Clarity in Student Essays

Author: Isaac Persing ; Vincent Ng

Abstract: Recently, researchers have begun exploring methods of scoring student essays with respect to particular dimensions of quality such as coherence, technical errors, and relevance to prompt, but there is relatively little work on modeling thesis clarity. We present a new annotated corpus and propose a learning-based approach to scoring essays along the thesis clarity dimension. Additionally, in order to provide more valuable feedback on why an essay is scored as it is, we propose a second learning-based approach to identifying what kinds of errors an essay has that may lower its thesis clarity score.

3 0.10440849 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation

Author: Guosheng Ben ; Deyi Xiong ; Zhiyang Teng ; Yajuan Lu ; Qun Liu

Abstract: In this paper, we propose a bilingual lexical cohesion trigger model to capture lexical cohesion for document-level machine translation. We integrate the model into hierarchical phrase-based machine translation and achieve an absolute improvement of 0.85 BLEU points on average over the baseline on NIST Chinese-English test sets.

4 0.086550102 172 acl-2013-Graph-based Local Coherence Modeling

Author: Camille Guinaudeau ; Michael Strube

Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.

5 0.085495397 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts

Author: Ryo Nagata ; Edward Whittaker

Abstract: Mother tongue interference is the phenomenon where linguistic systems of a mother tongue are transferred to another language. Although there has been plenty of work on mother tongue interference, very little is known about how strongly it is transferred to another language and about what relation there is across mother tongues. To address these questions, this paper explores and visualizes mother tongue interference preserved in English texts written by Indo-European language speakers. This paper further explores linguistic features that explain why certain relations are preserved in English writing, and which contribute to related tasks such as native language identification.

6 0.084834807 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?

7 0.082142949 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

8 0.079041846 31 acl-2013-A corpus-based evaluation method for Distributional Semantic Models

9 0.066347323 238 acl-2013-Measuring semantic content in distributional vectors

10 0.057512589 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

11 0.05384744 27 acl-2013-A Two Level Model for Context Sensitive Inference Rules

12 0.053163435 351 acl-2013-Topic Modeling Based Classification of Clinical Reports

13 0.053104732 41 acl-2013-Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation

14 0.053016651 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models

15 0.052318837 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

16 0.052279796 87 acl-2013-Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics

17 0.051386513 12 acl-2013-A New Set of Norms for Semantic Relatedness Measures

18 0.048580088 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation

19 0.048554949 73 acl-2013-Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions

20 0.048316423 121 acl-2013-Discovering User Interactions in Ideological Discussions

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.168), (1, 0.061), (2, 0.013), (3, -0.066), (4, 0.007), (5, -0.05), (6, 0.021), (7, 0.011), (8, -0.072), (9, 0.033), (10, -0.009), (11, 0.039), (12, -0.015), (13, 0.015), (14, -0.074), (15, 0.006), (16, -0.02), (17, -0.021), (18, -0.019), (19, -0.045), (20, 0.025), (21, 0.023), (22, 0.05), (23, -0.055), (24, -0.008), (25, 0.146), (26, 0.017), (27, 0.037), (28, -0.1), (29, 0.063), (30, -0.137), (31, -0.139), (32, -0.022), (33, -0.108), (34, -0.048), (35, -0.071), (36, -0.048), (37, 0.1), (38, 0.131), (39, 0.019), (40, 0.24), (41, 0.138), (42, -0.357), (43, -0.089), (44, 0.22), (45, 0.186), (46, 0.075), (47, 0.196), (48, 0.046), (49, -0.034)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.9193626 246 acl-2013-Modeling Thesis Clarity in Student Essays

Author: Isaac Persing ; Vincent Ng

same-paper 2 0.90960896 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

Author: Beata Beigman Klebanov ; Michael Flor

3 0.48776239 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

Author: Rebecca J. Passonneau ; Emily Chen ; Weiwei Guo ; Dolores Perin

Abstract: The pyramid method for content evaluation of automated summarizers produces scores that are shown to correlate well with manual scores used in educational assessment of students’ summaries. This motivates the development of a more accurate automated method to compute pyramid scores. Of three methods tested here, the one that performs best relies on latent semantics.

4 0.45965755 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts

Author: Ryo Nagata ; Edward Whittaker

5 0.42323947 31 acl-2013-A corpus-based evaluation method for Distributional Semantic Models

Author: Abdellah Fourtassi ; Emmanuel Dupoux

Abstract: Evaluation methods for Distributional Semantic Models typically rely on behaviorally derived gold standards. These methods are difficult to deploy in languages with scarce linguistic/behavioral resources. We introduce a corpus-based measure that evaluates the stability of the lexical semantic similarity space using a pseudo-synonym same-different detection task and no external resources. We show that it enables to predict two behavior-based measures across a range of parameters in a Latent Semantic Analysis model.

6 0.42022857 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation

7 0.37128517 1 acl-2013-"Let Everything Turn Well in Your Wife": Generation of Adult Humor Using Lexical Constraints

8 0.36864793 238 acl-2013-Measuring semantic content in distributional vectors

9 0.35912213 390 acl-2013-Word surprisal predicts N400 amplitude during reading

10 0.35380119 340 acl-2013-Text-Driven Toponym Resolution using Indirect Supervision

11 0.34708866 172 acl-2013-Graph-based Local Coherence Modeling

12 0.343539 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures

13 0.34322825 263 acl-2013-On the Predictability of Human Assessment: when Matrix Completion Meets NLP Evaluation

14 0.32361621 225 acl-2013-Learning to Order Natural Language Texts

15 0.32029963 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

16 0.31820884 122 acl-2013-Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners

17 0.31709635 76 acl-2013-Building and Evaluating a Distributional Memory for Croatian

18 0.31349698 21 acl-2013-A Statistical NLG Framework for Aggregated Planning and Realization

19 0.31000686 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning

20 0.30634362 89 acl-2013-Computerized Analysis of a Verbal Fluency Test

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.067), (6, 0.073), (11, 0.052), (14, 0.019), (15, 0.016), (24, 0.054), (26, 0.04), (28, 0.011), (35, 0.094), (42, 0.053), (48, 0.049), (70, 0.038), (71, 0.227), (88, 0.038), (90, 0.035), (95, 0.062)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.92123359 177 acl-2013-GuiTAR-based Pronominal Anaphora Resolution in Bengali

Author: Apurbalal Senapati ; Utpal Garain

Abstract: This paper attempts to use an off-the-shelf anaphora resolution (AR) system for Bengali. The language specific preprocessing modules of GuiTAR (v3.0.3) are identified and suitably designed for Bengali. Anaphora resolution module is also modified or replaced in order to realize different configurations of GuiTAR. Performance of each configuration is evaluated and experiment shows that the off-the-shelf AR system can be effectively used for Indic languages.

2 0.86805123 179 acl-2013-HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text

Author: Mohamed Amir Yosef ; Sandro Bauer ; Johannes Hoffart ; Marc Spaniol ; Gerhard Weikum

Abstract: Recent research has shown progress in achieving high-quality, very fine-grained type classification in hierarchical taxonomies. Within such a multi-level type hierarchy with several hundreds of types at different levels, many entities naturally belong to multiple types. In order to achieve high-precision in type classification, current approaches are either limited to certain domains or require time consuming multistage computations. As a consequence, existing systems are incapable of performing ad-hoc type classification on arbitrary input texts. In this demo, we present a novel Web-based tool that is able to perform domain independent entity type classification under real time conditions. Thanks to its efficient implementation and compacted feature representation, the system is able to process text inputs on-the-fly while still achieving equally high precision as leading state-of-the-art implementations. Our system offers an online interface where natural-language text can be inserted, which returns semantic type labels for entity mentions. Furthermore, the user interface allows users to explore the assigned types by visualizing and navigating along the type-hierarchy.

same-paper 3 0.8198263 389 acl-2013-Word Association Profiles and their Use for Automated Scoring of Essays

Author: Beata Beigman Klebanov ; Michael Flor

4 0.76781076 204 acl-2013-Iterative Transformation of Annotation Guidelines for Constituency Parsing

Author: Xiang Li ; Wenbin Jiang ; Yajuan Lu ; Qun Liu

Abstract: This paper presents an effective algorithm of annotation adaptation for constituency treebanks, which transforms a treebank from one annotation guideline to another with an iterative optimization procedure, thus to build a much larger treebank to train an enhanced parser without increasing model complexity. Experiments show that the transformed Tsinghua Chinese Treebank as additional training data brings significant improvement over the baseline trained on Penn Chinese Treebank only.

5 0.76151484 44 acl-2013-An Empirical Examination of Challenges in Chinese Parsing

Author: Jonathan K. Kummerfeld ; Daniel Tse ; James R. Curran ; Dan Klein

Abstract: Aspects of Chinese syntax result in a distinctive mix of parsing challenges. However, the contribution of individual sources of error to overall difficulty is not well understood. We conduct a comprehensive automatic analysis of error types made by Chinese parsers, covering a broad range of error types for large sets of sentences, enabling the first empirical ranking of Chinese error types by their performance impact. We also investigate which error types are resolved by using gold part-of-speech tags, showing that improving Chinese tagging only addresses certain error types, leaving substantial outstanding challenges.

6 0.63689011 205 acl-2013-Joint Apposition Extraction with Syntactic and Semantic Constraints

7 0.63279098 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model

8 0.62436587 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

9 0.62335467 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

10 0.61594313 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain

11 0.61551589 36 acl-2013-Adapting Discriminative Reranking to Grounded Language Learning

12 0.61551511 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

13 0.61537349 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization

14 0.61466289 52 acl-2013-Annotating named entities in clinical text by combining pre-annotation and active learning

15 0.61376536 172 acl-2013-Graph-based Local Coherence Modeling

16 0.61339861 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics

17 0.61311281 275 acl-2013-Parsing with Compositional Vector Grammars

18 0.61255753 250 acl-2013-Models of Translation Competitions

19 0.61053693 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval

20 0.60954934 99 acl-2013-Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation