acl acl2010 acl2010-4 knowledge-graph by maker-knowledge-mining

4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data


Source: pdf

Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jürgen Ziegler

Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eyetracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that fullscale context is mostly negligible – with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. [sent-2, score-0.502]

2 Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that fullscale context is mostly negligible – with the exception of semantic high-complexity cases. [sent-3, score-1.066]

3 We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. [sent-4, score-0.44]

4 Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones. [sent-5, score-1.156]

5 So the question comes up, how we can rationally manage these investments so that annotation campaigns are economically doable without loss in annotation quality. [sent-16, score-0.964]

6 This intentional selection bias stands in stark contrast to prevailing sampling approaches where annotation examples are randomly chosen. [sent-19, score-0.527]

7 When different approaches to AL are compared with each other, or with standard random sampling, in terms of annotation efficiency, up until now, the AL community assumed uniform annotation costs for each linguistic unit, e. [sent-20, score-1.115]

8 If uniformity does not hold and, hence, the number of annotated units does not indicate the true annotation efforts required for a specific sample, empirically more adequate cost models are needed. [sent-26, score-0.666]

9 Building predictive models for annotation costs has only been addressed in few studies for now (Ringger et al. [sent-27, score-0.579]

10 The proposed models are based on easy-to-determine, yet not so explanatory variables (such as the number of words to be annotated), indicating that accurate models of annotation costs remain a desideratum. [sent-31, score-0.665]

11 We here, alternatively, consider different classes of syntactic and semantic complexity that might affect the cognitive load during the annotation process. [sent-32, score-1.032]

12 However, structural complexity criteria do not translate directly into empirically justified cost measures and thus have to be taken with care. [sent-39, score-0.47]

13 Gaze duration and search time are then taken as empirical correlates of linguistic complexity and, hence, uncover the real costs. [sent-48, score-0.474]

14 We therefore consider eyetracking as a promising means to get a better understanding of the nature of the linguistic annotation processes with the ultimate goal of identifying predictive factors for annotation cost models. [sent-49, score-1.153]

15 We conclude with experiments which reveal that cognitively grounded models outperform simpler ones relative to cost prediction using annotation time as a cost measure. [sent-53, score-0.942]

16 2 Experimental Design In our study, we applied, for the first time ever to the best of our knowledge, eye-tracking to study the cognitive processes underlying the annotation of linguistic meta-data, named entities in particular. [sent-55, score-0.824]

17 We chose this rather simple setting because the participants in the experiment had no previous experience with document annotation and no serious linguistic background. [sent-62, score-0.776]

18 We triggered the annotation processes by giving our participants specific annotation examples. [sent-64, score-1.139]

19 An example consists of a text document having one single annotation phrase highlighted which then had to be semantically annotated with respect to named entity mentions. [sent-65, score-0.886]

20 The annotation task was defined such that the correct entity type had to be assigned to each word in the annotation phrase. [sent-66, score-1.071]

21 The chosen stimulus – an annotation example with one phrase highlighted for annotation – allows for an exact localization of the cognitive processes and annotation actions performed relative to that specific phrase. [sent-77, score-1.831]

22 1 Independent Variables We defined two measures for the complexity of the annotation examples: The syntactic complexity was given by the number of nodes in the constituent parse tree which are dominated by the annotation phrase (Szmrecsányi, 2004). [sent-79, score-1.794]
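
As an illustration of this syntactic complexity measure, here is a minimal Python sketch using NLTK, assuming the constituent parse is available as an nltk.Tree and the annotation phrase corresponds to one of its subtrees; the exact counting convention (whether the phrase node and its leaves are included) is an assumption, not taken from the paper.

```python
from nltk.tree import Tree

def syntactic_complexity(phrase_subtree: Tree) -> int:
    # Count every non-terminal dominated by the phrase node (including
    # the node itself) plus the dominated leaves; the paper's exact
    # counting convention may differ.
    nonterminals = sum(1 for _ in phrase_subtree.subtrees())
    return nonterminals + len(phrase_subtree.leaves())

# Hypothetical example: the annotation phrase "the Roselawn accident".
np_subtree = Tree.fromstring("(NP (DT the) (NNP Roselawn) (NN accident))")
print(syntactic_complexity(np_subtree))  # 4 non-terminals + 3 leaves = 7
```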

23 The semantic complexity of an annotation example is based on the inverse document frequency idf of the words in the annotation phrase according to a reference corpus. [sent-81, score-1.571]

24 2 We calculated the semantic complexity score of an annotation phrase as max_i idf(w_i), where w_i is the i-th word of the annotation phrase. [sent-82, score-1.449]
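
A small sketch of this score, assuming the reference corpus is available as a list of tokenized documents; the threshold value below is purely hypothetical, since the paper determined its threshold empirically and had the classification manually revised.

```python
import math
from collections import Counter

def idf_table(reference_corpus):
    # reference_corpus: list of tokenized documents (lists of strings).
    n_docs = len(reference_corpus)
    doc_freq = Counter(w for doc in reference_corpus for w in set(doc))
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}

def semantic_complexity(phrase_tokens, idf):
    # max_i idf(w_i) over the words of the annotation phrase; treating
    # words unseen in the reference corpus as maximally rare is our
    # assumption, not the paper's.
    fallback = max(idf.values(), default=0.0)
    return max((idf.get(w, fallback) for w in phrase_tokens), default=0.0)

SEM_THRESHOLD = 5.0  # hypothetical; the paper set its threshold empirically

def semantic_class(phrase_tokens, idf):
    return "high" if semantic_complexity(phrase_tokens, idf) > SEM_THRESHOLD else "low"
```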

25 Again, we empirically determined a threshold classifying annotation phrases as having either high or low semantic complexity. [sent-83, score-0.644]

26 Additionally, this automatically generated classification was manually checked and, if necessary, revised by two annotation experts. [sent-84, score-0.482]

27 For instance, if an annotation phrase contained a strong trigger (e. [sent-85, score-0.613]

28 , a social role or job title, as with “spokeswoman” in the annotation phrase “spokeswoman Arlene”), it was classified as a low-semantic-complexity item even though it might have been assigned a high inverse document frequency (due to the infrequent word “Arlene”). [sent-87, score-0.735]

29 In the document context condition the whole newspaper article was shown as annotation example, while in the sentence context condition only the sentence containing the annotation phrase was presented. [sent-93, score-1.466]

30 However, the availability of (too much) context might overload and distract annotators, with a presumably negative effect on annotation performance. [sent-101, score-0.606]

31 Hypothesis H2: The complexity of the annotation phrases determines the annotation performance. [sent-102, score-1.295]

32 The assumption is that high syntactic or semantic complexity significantly lowers the annotation performance. [sent-103, score-0.915]

33 2, on a ten-point scale with 1 = “poor” and 10 = “excellent”, self-assessed), but without any prior experience in annotation and without previous exposure to linguistic training. [sent-109, score-0.536]

34 3 Stimulus Material According to the above definition of complexity, we automatically preselected annotation examples characterized by either a low or a high degree of semantic and syntactic complexity. [sent-111, score-0.674]

35 After manual fine-tuning of the example set assuring an even distribution of entity types and syntactic correctness of the automatically derived annotation phrases, we finally selected 80 annotation examples for the experiment. [sent-112, score-1.195]

36 4 Experimental Apparatus and Procedure The annotation examples were presented in a custom-built tool and its user interface was kept as simple as possible not to distract the eye movements of the participants. [sent-114, score-0.817]

37 It merely contained one frame showing the text of the annotation example, with the annotation phrase being highlighted. [sent-115, score-1.095]

38 A blank screen was shown after each annotation example to reset the eyes and to allow a break, if needed. [sent-116, score-0.553]

39 The time the blank screen was shown was not counted as annotation time. [sent-117, score-0.591]

40 The 80 annotation examples were presented to all participants in the same randomized order, with a balanced distribution of the complexity classes. [sent-118, score-0.988]

41 The limitation on 80 annotation examples reduces the chances of errors due to fatigue or lack of attention that can be observed in long-lasting annotation activities. [sent-120, score-1.054]

42 All annotation examples were chosen in a way that they completely fitted on the screen (i. [sent-122, score-0.598]

43 The participants used a standard keyboard to assign the entity types for each word of the annotation example. [sent-126, score-0.805]

44 All but 5 keys were removed from the keyboard to avoid extra eye movements for finger coordination (three keys for the positive entity classes, one for the negative “no entity” class, and one to confirm the annotation). [sent-127, score-0.397]

45 Screen resolution was set to 1280 x 1024 px and the annotation examples were presented in the middle of the screen in a font size of 16 px and a line spacing of 5 px. [sent-131, score-0.69]

46 All participants were familiarized with the annotation task and the guidelines in a preexperimental workshop where they practiced annotations on various exercise examples (about 60 minutes). [sent-134, score-0.754]

47 Participants were instructed to focus more on annotation accuracy than on annotation time as we wanted to avoid random guessing. [sent-139, score-1.002]

48 Accordingly, as an extra incentive, we rewarded the three participants with the highest annotation accuracy with cinema vouchers. [sent-140, score-0.657]

49 None of the participants reported serious difficulties with the newspaper articles or annotation tool and all understood the annotation task very well. [sent-141, score-1.194]

50 3 Results We used a mixed-design analysis of variance (ANOVA) model to test the hypotheses, with the context condition as between-subjects factor and the two complexity classes as within-subject factors. [sent-142, score-0.494]
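
For readers who want to reproduce this kind of analysis, a rough sketch with pandas and pingouin follows. The file and column names are hypothetical, and pingouin's mixed_anova handles only one within-subject factor per call, so the two complexity factors would have to be tested in separate runs; this is a simplification of the paper's single mixed-design ANOVA.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format log: one row per participant x annotation example.
df = pd.read_csv("annotation_log.csv")
# expected columns: participant, context ("document"/"sentence"),
#                   sem ("low"/"high"), syn ("low"/"high"), time, errors

# Average over examples so each participant contributes one value per cell.
cell_means = (df.groupby(["participant", "context", "sem"], as_index=False)["time"]
                .mean())

aov = pg.mixed_anova(data=cell_means, dv="time", within="sem",
                     subject="participant", between="context")
print(aov.round(3))
```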

51 Surprisingly, on the total of 174 entity-critical words within the 80 annotation examples, we found exactly the same mean value of 30. [sent-150, score-0.482]

52 4 These results seem to suggest that it makes no difference (neither for annotation accuracy nor for time) whether or not annotators are shown textual context beyond the sentence that contains the annotation phrase. [sent-157, score-1.197]

53 , the smallest distance that separates fixations) of 30 px and excluded the first second (mainly used for orientation and identification of the annotation phrase). [sent-163, score-0.528]
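
The (truncated) sentence above mentions a 30 px threshold for separating fixations and the exclusion of the first second of each trial. The fixation detection itself was presumably done by the eye-tracking software; the following is only a rough stand-in showing what such preprocessing could look like.

```python
import math

def group_fixations(samples, radius_px=30.0, skip_first_s=1.0):
    """samples: iterable of (t_seconds, x_px, y_px) gaze samples for one trial.
    Drops the first second and starts a new fixation whenever the gaze moves
    farther than radius_px from the running fixation centre. Real pipelines
    use more elaborate dispersion- or velocity-based algorithms."""
    kept = [s for s in samples if s[0] >= skip_first_s]
    fixations, current = [], []
    for t, x, y in kept:
        if current:
            cx = sum(p[1] for p in current) / len(current)
            cy = sum(p[2] for p in current) / len(current)
            if math.hypot(x - cx, y - cy) > radius_px:
                fixations.append(current)
                current = []
        current.append((t, x, y))
    if current:
        fixations.append(current)
    return fixations
```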

54 Figure 1: Schematic visualization of the sub-areas of an annotation example. [sent-164, score-0.482]

55 9 seconds; concerning the annotation errors on the 174 entity-critical words, these ranged between 21 and 46 errors. [sent-170, score-0.527]

56 participants looked in the textual context above the annotation phrase embedding sentence, and even less perceived the context below (16%). [sent-171, score-1.043]

57 The sentence parts before and after the annotation phrase were, on the average, visited by one third (32% and 34%, respectively) of the participants. [sent-172, score-0.613]

58 This result can be explained by the fact that participants of the document-context condition used the context whenever they thought it might help, whereas participants of the sentence-context condition spent more time thinking about a correct answer, overall with the same result. [sent-177, score-0.641]

59 2 Testing Complexity Classes To test hypothesis H2 we also compared the average annotation time and the number of errors on entity-critical words for the complexity subsets (see Table 2). [sent-179, score-0.902]

60 79 for the annotation time in the document context condition, and t(9) = 1. [sent-192, score-0.668]

61 08 for the annotation errors in the sentence context condition. [sent-194, score-0.61]

62 [Table 2 caption] Annotation examples of each complexity class: number of entity-critical words, mean annotation time and standard deviations (SD), mean annotation errors, standard deviations, and error rates (number of errors divided by number of entity-critical words). [sent-197, score-1.415]

63 3 Context and Complexity We also examined whether the need for inspecting the context increases with the complexity of the annotation phrase. [sent-199, score-0.851]

64 Therefore, we analyzed the eye-tracking data in terms of the average number of fixations on the annotation phrase and on its embedding contexts for each complexity class (see Table 3). [sent-200, score-1.118]
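
A pandas sketch of this aggregation, assuming a hypothetical long-format fixation table in which every fixation is tagged with the sub-area it landed on and the complexity class of the example.

```python
import pandas as pd

fix = pd.read_csv("fixations.csv")
# hypothetical columns: participant, example, sem ("low"/"high"),
#                       syn ("low"/"high"), area ("phrase"/"context")

# Fixation counts per trial and sub-area ...
per_trial = (fix.groupby(["participant", "example", "sem", "syn", "area"])
                .size()
                .rename("n_fixations")
                .reset_index())

# ... averaged over trials for each complexity class, as in Table 3.
table3 = (per_trial.groupby(["sem", "syn", "area"])["n_fixations"]
                   .mean()
                   .unstack("area"))
print(table3.round(2))
```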

65 The values illustrate that while the number of fixations on the annotation phrase rises generally with both the semantic and the syntactic complexity, the number of fixations on the context rises only with semantic complexity. [sent-201, score-1.249]

66 The number of fixations on the context is nearly the same for the two subsets with low semantic complexity (sem-syn and sem-SYN, with 1. [sent-202, score-0.657]

67 5), while it is significantly higher for the two subsets with high semantic complexity (5. [sent-204, score-0.405]

68 [Table 3 caption, truncated] ...annotation phrase and context for the document condition and 20 annotation examples of each complexity class. [sent-213, score-1.177]

69 These results suggest that the need for context mainly depends on the semantic complexity of the annotation phrase, while it is less influenced by its syntactic complexity. [sent-214, score-0.998]

Figure 2: Annotation example with annotation phrase and the antecedent for “Roselawn” in the text (left), and gaze plot of one participant showing a scanning-for-coreference behavior (right). [sent-218, score-0.897]

71 Figure 2 shows a gaze plot for one participant that illustrates a scanning-for-coreference behavior we observed for several annotation phrases with high semantic complexity. [sent-220, score-0.879]

72 In this case, words were searched in the upper context, which according to their orthographic signals might refer to a named entity but which could not completely be resolved only relying on the information given by the annotation phrase itself and its embedding sentence. [sent-221, score-0.871]

73 This is the case for “Roselawn” in the annotation phrase “Roselawn accident”. [sent-222, score-0.613]

74 4 Cognitively Grounded Cost Modeling We now discuss whether the findings on dependent variables from our eye-tracking study are fruitful for actually modeling annotation costs. [sent-226, score-0.562]

75 Therefore, we learn a linear regression model with time (an operationalization of annotation costs) as the dependent variable. [sent-227, score-0.602]

76 We compare our ‘cognitive’ model against a baseline model which relies on some simple formal text features only, and test whether the newly introduced features help predict annotation costs more accurately. [sent-228, score-0.669]

77 8 Our cognitive model, however, makes additional use of features based on linguistic complexity, and includes syntactic and semantic criteria related to the annotation phrases. [sent-233, score-0.884]

78 To account for our findings that syntactic and semantic complexity correlates with annotation performance, we added three features based on syntactic, and two based on semantic complexity measures. [sent-236, score-1.065]

79 As for syntactic complexity, we use two measures based on structural complexity including (a) the number of nodes of a constituency parse tree which are dominated by the annotation phrase (cf. [sent-240, score-1.026]

80 Given a POS 2-gram model, which we learned from the automatically POS-tagged MUC7 corpus, the complexity of an annotation phrase is defined by sum_{i=2}^{n} P(POS_i | POS_{i-1}), where POS_i refers to the POS-tag of the i-th word of the annotation phrase. [sent-247, score-1.381]
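
A minimal sketch of this measure; the bigram probabilities are estimated by maximum likelihood from any POS-tagged corpus of (word, tag) pairs (the MUC7 tagging itself is not assumed to be available), and unseen bigrams simply contribute zero.

```python
from collections import Counter, defaultdict

def train_pos_bigrams(tagged_sentences):
    # tagged_sentences: iterable of sentences given as (word, tag) pairs.
    counts = defaultdict(Counter)
    for sent in tagged_sentences:
        tags = [tag for _, tag in sent]
        for prev, cur in zip(tags, tags[1:]):
            counts[prev][cur] += 1
    return {prev: {cur: c / sum(ctr.values()) for cur, c in ctr.items()}
            for prev, ctr in counts.items()}

def pos_bigram_complexity(phrase_tags, probs):
    # sum_{i=2}^{n} P(POS_i | POS_{i-1}) over the tags of the annotation phrase.
    return sum(probs.get(prev, {}).get(cur, 0.0)
               for prev, cur in zip(phrase_tags, phrase_tags[1:]))
```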

81 As far as the quantification of semantic complexity is concerned, we use (a) the inverse document frequency idf(w_i) of each word w_i (cf. [sent-250, score-0.476]

82 , the number of meanings contained in WORDNET,9 within an annotation phrase. [sent-254, score-0.482]

83 We consider the maximum ambiguity of the words within the annotation phrase as the overall ambiguity of the respective annotation phrase. [sent-255, score-1.221]
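
A sketch of this ambiguity feature with NLTK's WordNet interface (the WordNet data has to be downloaded once via nltk.download('wordnet')); treating out-of-vocabulary words as contributing zero senses is our assumption.

```python
from nltk.corpus import wordnet as wn

def phrase_ambiguity(phrase_tokens):
    # Maximum number of WordNet senses over the words of the phrase,
    # taken as the phrase's overall ambiguity.
    return max((len(wn.synsets(w)) for w in phrase_tokens), default=0)

print(phrase_ambiguity(["Roselawn", "accident"]))  # sense count of "accident"; "Roselawn" contributes 0
```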

84 This measure is based on the assumption that annotation phrases with higher semantic ambiguity are harder to annotate than low-ambiguity ones. [sent-256, score-0.639]

85 Both often cannot be resolved locally so that annotators need to consult the context of an annotation chunk (cf. [sent-259, score-0.676]

86 Thus, we also added features providing information whether the annotation phrases contain entity-critical words which may denote the referent of an antecedent of an anaphoric relation. [sent-262, score-0.572]

87 In the same vein, we checked whether an annotation phrase contains expressions which can function as an abbreviation by virtue of their orthographical appearance, e. [sent-263, score-0.659]
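
Two binary features of this kind could be sketched as follows; the orthographic abbreviation pattern and the notion of "entity-critical" (here a caller-supplied predicate) are assumptions for illustration, not the paper's exact criteria.

```python
import re
from collections import Counter

def entity_word_reused(phrase_tokens, document_tokens, is_entity_critical):
    # True if an entity-critical word of the phrase also occurs elsewhere in
    # the document (a potential antecedent or later mention). Assumes the
    # phrase's own tokens are part of document_tokens.
    doc_counts = Counter(document_tokens)
    return any(is_entity_critical(tok) and doc_counts[tok] > 1
               for tok in phrase_tokens)

def contains_abbreviation(phrase_tokens):
    # True if some token looks like an abbreviation by its orthography
    # (all caps, possibly with digits, dots, '&' or '-').
    return any(re.fullmatch(r"[A-Z][A-Z0-9.&-]+", tok) is not None
               for tok in phrase_tokens)
```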

88 Since our participants were sometimes scanning for entity-critical words, we also added features providing information on the number of entity-critical words within the annotation phrase. [sent-266, score-0.74]

89 [Table 4 fragment] ...ambiguity; general linguistic complexity: Flesch-Kincaid Readability Score. Semantics (3): test whether entity-critical word in annotation phrase is used in document (preceding or following current phrase); test whether phrase contains an abbreviation. Table 4: Features for cost modeling. [sent-271, score-1.044]

90 2 Evaluation To test how well annotation costs can be modeled by the features described above, we used the MUC7T corpus, a re-annotation of the MUC7 corpus (Tomanek and Hahn, 2010). [sent-273, score-0.624]

91 These time tags indicate the time it took to annotate the respective phrase for named entity mentions of the types person, location, and organization. [sent-275, score-0.452]

92 We learned a simple linear regression model with the annotation time as dependent variable and the features described above as independent variables. [sent-278, score-0.647]
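
A sketch of such a comparison with scikit-learn; the synthetic feature matrices below merely stand in for the baseline features and the extended "cognitive" feature set computed on MUC7T, and cross-validated R^2 is used as a convenient stand-in for the paper's evaluation protocol.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def mean_r2(X, y, cv=10):
    # Cross-validated R^2 of a linear regression of annotation time on X.
    return cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2").mean()

# Synthetic stand-in data: 2 baseline features vs. 5 additional complexity
# features (replace with the real MUC7T feature tables).
rng = np.random.default_rng(0)
n = 200
X_base = rng.normal(size=(n, 2))
X_extra = rng.normal(size=(n, 5))
y = 2.0 * X_base[:, 0] + 1.5 * X_extra[:, 0] + rng.normal(scale=0.5, size=n)
X_cognitive = np.hstack([X_base, X_extra])

print("baseline  R^2:", round(mean_r2(X_base, y), 3))
print("cognitive R^2:", round(mean_r2(X_cognitive, y), 3))
```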

93 These numbers clearly demonstrate that annotation costs are more adequately modelled by the additional features we identified through our eye-tracking study. [sent-295, score-0.624]

94 We tested two main hypotheses – one relating to the amount of contextual information being used for annotation decisions, the other relating to different degrees of syntactic and semantic complexity of expressions that had to be annotated. [sent-302, score-0.96]

95 the exception of tackling high-complexity semantic cases and resolving co-references) and that annotation performance correlates with semantic and syntactic complexity. [sent-377, score-0.734]

96 Instead, we aimed at testing whether the findings from our eye-tracking study can be exploited to model annotation costs more accurately. [sent-386, score-0.579]

97 Estimating annotation cost for active learning in a multi-annotator environment. [sent-403, score-0.676]

98 Investigating the effects of selective sampling on the annotation task. [sent-419, score-0.482]

99 Assessing the costs of machine-assisted corpus annotation through a user study. [sent-443, score-0.579]

100 Annotation time stamps: Temporal metadata from the linguistic annotation process. [sent-470, score-0.574]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('annotation', 0.482), ('complexity', 0.286), ('gaze', 0.175), ('participants', 0.175), ('fixations', 0.169), ('cognitive', 0.156), ('cnps', 0.14), ('cost', 0.135), ('phrase', 0.131), ('eye', 0.129), ('tomanek', 0.123), ('movements', 0.12), ('annotators', 0.111), ('entity', 0.107), ('costs', 0.097), ('cognitively', 0.095), ('roselawn', 0.093), ('settles', 0.093), ('condition', 0.085), ('context', 0.083), ('compl', 0.082), ('jena', 0.082), ('annotator', 0.08), ('syntactic', 0.079), ('hahn', 0.075), ('screen', 0.071), ('posi', 0.07), ('szmrecs', 0.07), ('semantic', 0.068), ('participant', 0.066), ('document', 0.065), ('rayner', 0.063), ('frazier', 0.061), ('ringger', 0.061), ('udo', 0.061), ('named', 0.059), ('active', 0.059), ('duration', 0.059), ('inverse', 0.057), ('grounded', 0.057), ('lyn', 0.056), ('stimulus', 0.056), ('newspaper', 0.055), ('linguistic', 0.054), ('annotations', 0.052), ('subsets', 0.051), ('embedding', 0.05), ('explanatory', 0.05), ('token', 0.049), ('empirically', 0.049), ('universit', 0.049), ('dominated', 0.048), ('arlene', 0.047), ('arora', 0.047), ('duisburg', 0.047), ('nyi', 0.047), ('observational', 0.047), ('spokeswoman', 0.047), ('abbreviation', 0.046), ('garden', 0.046), ('px', 0.046), ('errors', 0.045), ('features', 0.045), ('characters', 0.045), ('hypotheses', 0.045), ('phrases', 0.045), ('examples', 0.045), ('dependent', 0.044), ('accordingly', 0.044), ('ambiguity', 0.044), ('behavior', 0.043), ('highlighted', 0.042), ('katrin', 0.042), ('orthographic', 0.042), ('took', 0.041), ('behavioral', 0.041), ('roark', 0.041), ('apparatus', 0.041), ('distract', 0.041), ('hachey', 0.041), ('keyboard', 0.041), ('synt', 0.041), ('traxler', 0.041), ('classes', 0.04), ('textual', 0.039), ('keith', 0.039), ('regression', 0.038), ('critical', 0.038), ('time', 0.038), ('respective', 0.038), ('deviations', 0.037), ('hardness', 0.037), ('germany', 0.037), ('correlates', 0.037), ('variables', 0.036), ('ambiguities', 0.036), ('entities', 0.035), ('readability', 0.035), ('sd', 0.035), ('alphanumeric', 0.035)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999791 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data

Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jürgen Ziegler

Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eyetracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that fullscale context is mostly negligible – with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.

2 0.26862425 31 acl-2010-Annotation

Author: Eduard Hovy

Abstract: unknown-abstract

3 0.16049971 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure

Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller

Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.

4 0.14922057 59 acl-2010-Cognitively Plausible Models of Human Language Processing

Author: Frank Keller

Abstract: We pose the development of cognitively plausible models of human language processing as a challenge for computational linguistics. Existing models can only deal with isolated phenomena (e.g., garden paths) on small, specifically selected data sets. The challenge is to build models that integrate multiple aspects of human language processing at the syntactic, semantic, and discourse level. Like human language processing, these models should be incremental, predictive, broad coverage, and robust to noise. This challenge can only be met if standardized data sets and evaluation measures are developed.

5 0.12655458 57 acl-2010-Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

Author: Michael Bloodgood ; Chris Callison-Burch

Abstract: We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order of magnitude increase in performance rates of improvement.

6 0.11237069 13 acl-2010-A Rational Model of Eye Movement Control in Reading

7 0.10856993 240 acl-2010-Training Phrase Translation Models with Leaving-One-Out

8 0.10436736 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

9 0.10097852 58 acl-2010-Classification of Feedback Expressions in Multimodal Data

10 0.095763773 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing

11 0.094666608 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser

12 0.090285845 173 acl-2010-Modeling Norms of Turn-Taking in Multi-Party Conversation

13 0.084224187 132 acl-2010-Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data

14 0.083989218 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing

15 0.08380542 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."

16 0.079165235 75 acl-2010-Correcting Errors in a Treebank Based on Synchronous Tree Substitution Grammar

17 0.078030407 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People

18 0.076619089 15 acl-2010-A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network

19 0.075197831 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

20 0.072326057 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.255), (1, 0.079), (2, -0.014), (3, -0.093), (4, -0.041), (5, -0.014), (6, -0.025), (7, -0.012), (8, -0.004), (9, 0.018), (10, -0.064), (11, 0.072), (12, 0.061), (13, 0.147), (14, -0.147), (15, 0.127), (16, 0.026), (17, 0.072), (18, 0.057), (19, 0.095), (20, -0.093), (21, 0.076), (22, 0.147), (23, 0.045), (24, -0.052), (25, -0.079), (26, 0.107), (27, 0.09), (28, 0.199), (29, -0.195), (30, -0.07), (31, -0.035), (32, -0.023), (33, -0.018), (34, -0.192), (35, 0.001), (36, 0.136), (37, 0.053), (38, 0.068), (39, 0.008), (40, 0.204), (41, -0.009), (42, -0.048), (43, 0.063), (44, 0.001), (45, -0.002), (46, 0.121), (47, 0.03), (48, 0.028), (49, -0.01)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98256236 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data

Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jurgen Ziegler

Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eyetracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that fullscale context is mostly negligible – with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.

2 0.83213896 31 acl-2010-Annotation

Author: Eduard Hovy

Abstract: unknown-abstract

3 0.74217874 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People

Author: Nancy Ide ; Collin Baker ; Christiane Fellbaum ; Rebecca Passonneau

Abstract: The Manually Annotated Sub-Corpus (MASC) project provides data and annotations to serve as the base for a communitywide annotation effort of a subset of the American National Corpus. The MASC infrastructure enables the incorporation of contributed annotations into a single, usable format that can then be analyzed as it is or ported to any of a variety of other formats. MASC includes data from a much wider variety of genres than existing multiply-annotated corpora of English, and the project is committed to a fully open model of distribution, without restriction, for all data and annotations produced or contributed. As such, MASC is the first large-scale, open, communitybased effort to create much needed language resources for NLP. This paper describes the MASC project, its corpus and annotations, and serves as a call for contributions of data and annotations from the language processing community.

4 0.70219111 57 acl-2010-Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

Author: Michael Bloodgood ; Chris Callison-Burch

Abstract: We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order of magnitude increase in performance rates of improvement.

5 0.60112858 58 acl-2010-Classification of Feedback Expressions in Multimodal Data

Author: Costanza Navarretta ; Patrizia Paggio

Abstract: This paper addresses the issue of how linguistic feedback expressions, prosody and head gestures, i.e. head movements and face expressions, relate to one another in a collection of eight video-recorded Danish map-task dialogues. The study shows that in these data, prosodic features and head gestures significantly improve automatic classification of dialogue act labels for linguistic expressions of feedback.

6 0.54413891 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages

7 0.50993425 59 acl-2010-Cognitively Plausible Models of Human Language Processing

8 0.50457466 13 acl-2010-A Rational Model of Eye Movement Control in Reading

9 0.49815145 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser

10 0.45352721 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure

11 0.45101002 139 acl-2010-Identifying Generic Noun Phrases

12 0.43837139 140 acl-2010-Identifying Non-Explicit Citing Sentences for Citation-Based Summarization.

13 0.4349961 259 acl-2010-WebLicht: Web-Based LRT Services for German

14 0.43232605 208 acl-2010-Sentence and Expression Level Annotation of Opinions in User-Generated Discourse

15 0.42605901 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images

16 0.42083573 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."

17 0.41918415 112 acl-2010-Extracting Social Networks from Literary Fiction

18 0.41098842 253 acl-2010-Using Smaller Constituents Rather Than Sentences in Active Learning for Japanese Dependency Parsing

19 0.39081419 19 acl-2010-A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation

20 0.39023665 29 acl-2010-An Exact A* Method for Deciphering Letter-Substitution Ciphers


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(14, 0.012), (25, 0.038), (33, 0.012), (39, 0.014), (42, 0.019), (59, 0.069), (73, 0.033), (78, 0.026), (83, 0.554), (84, 0.044), (98, 0.088)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98609227 256 acl-2010-Vocabulary Choice as an Indicator of Perspective

Author: Beata Beigman Klebanov ; Eyal Beigman ; Daniel Diermeier

Abstract: We establish the following characteristics of the task of perspective classification: (a) using term frequencies in a document does not improve classification achieved with absence/presence features; (b) for datasets allowing the relevant comparisons, a small number of top features is found to be as effective as the full feature set and indispensable for the best achieved performance, testifying to the existence of perspective-specific keywords. We relate our findings to research on word frequency distributions and to discourse analytic studies of perspective.

same-paper 2 0.97950989 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data

Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jurgen Ziegler

Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eyetracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that fullscale context is mostly negligible – with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.

3 0.97671872 72 acl-2010-Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information

Author: Marta Recasens ; Eduard Hovy

Abstract: This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also ex- pose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.

4 0.96412289 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization

Author: Emily Pitler ; Annie Louis ; Ani Nenkova

Abstract: To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference informa- tion, and summarization specific features. Our best results are 90% accuracy for pairwise comparisons of competing systems over a test set of several inputs and 70% for ranking summaries of a specific input.

5 0.96136069 132 acl-2010-Hierarchical Joint Learning: Improving Joint Parsing and Named Entity Recognition with Non-Jointly Labeled Data

Author: Jenny Rose Finkel ; Christopher D. Manning

Abstract: One of the main obstacles to producing high quality joint models is the lack of jointly annotated data. Joint modeling of multiple natural language processing tasks outperforms single-task models learned from the same data, but still underperforms compared to single-task models learned on the more abundant quantities of available single-task annotated data. In this paper we present a novel model which makes use of additional single-task annotated data to improve the performance of a joint model. Our model utilizes a hierarchical prior to link the feature weights for shared features in several single-task models and the joint model. Experiments on joint parsing and named entity recognition, using the OntoNotes corpus, show that our hierarchical joint model can produce substantial gains over a joint model trained on only the jointly annotated data.

6 0.87295252 31 acl-2010-Annotation

7 0.85424072 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."

8 0.84455854 73 acl-2010-Coreference Resolution with Reconcile

9 0.77788484 81 acl-2010-Decision Detection Using Hierarchical Graphical Models

10 0.76963413 33 acl-2010-Assessing the Role of Discourse References in Entailment Inference

11 0.76677001 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years

12 0.75698143 112 acl-2010-Extracting Social Networks from Literary Fiction

13 0.74986577 32 acl-2010-Arabic Named Entity Recognition: Using Features Extracted from Noisy Data

14 0.74289 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields

15 0.74075556 134 acl-2010-Hierarchical Sequential Learning for Extracting Opinions and Their Attributes

16 0.73785919 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People

17 0.73611766 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information

18 0.7258203 122 acl-2010-Generating Fine-Grained Reviews of Songs from Album Reviews

19 0.72574759 197 acl-2010-Practical Very Large Scale CRFs

20 0.72538912 252 acl-2010-Using Parse Features for Preposition Selection and Error Detection