acl acl2010 acl2010-13 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Klinton Bicknell ; Roger Levy
Abstract: A number of results in the study of real-time sentence comprehension have been explained by computational models as resulting from the rational use of probabilistic linguistic information. Many times, these hypotheses have been tested in reading by linking predictions about relative word difficulty to word-aggregated eye tracking measures such as go-past time. In this paper, we extend these results by asking to what extent reading is well-modeled as rational behavior at a finer level of analysis, predicting not aggregate measures, but the duration and location of each fixation. We present a new rational model of eye movement control in reading, the central assumption of which is that eye movement decisions are made to obtain noisy visual information as the reader performs Bayesian inference on the identities of the words in the sentence. As a case study, we present two simulations demonstrating that the model gives a rational explanation for between-word regressions.
Reference: text
sentIndex sentText sentNum sentScore
1 A number of results in the study of real-time sentence comprehension have been explained by computational models as resulting from the rational use of probabilistic linguistic information. [sent-3, score-0.292]
2 Many times, these hypotheses have been tested in reading by linking predictions about relative word difficulty to word-aggregated eye tracking measures such as go-past time. [sent-4, score-0.586]
3 In this paper, we extend these results by asking to what extent reading is well-modeled as rational behavior at a finer level of analysis, predicting not aggregate measures, but the duration and location of each fixation. [sent-5, score-0.557]
4 We present a new rational model of eye movement control in reading, the central assumption of which is that eye movement decisions are made to obtain noisy visual information as the reader performs Bayesian inference on the identities of the words in the sentence. [sent-6, score-1.509]
5 As a case study, we present two simulations demonstrating that the model gives a rational explanation for between-word regressions. [sent-7, score-0.416]
6 To the extent that the behavior of these models looks like human behavior, it suggests that humans are making rational use of all the information available to them in language processing. [sent-14, score-0.3]
7 In this paper, we present a new rational model of eye movement control in reading, the central assumption of which is that eye movement decisions are made to obtain noisy visual information, which the reader uses in Bayesian inference about the form and structure of the sentence. [sent-25, score-1.581]
8 As a case study, we show that this model gives a rational explanation for between-word regressions. [sent-26, score-0.352]
9 In Section 2, we briefly describe the leading models of eye movements in reading, and in Section 3, we describe how these models account for between-word regressions and the intuition behind our model’s account of them. [sent-27, score-0.495]
10 Section 4 describes the model and its implementation, and Sections 5–6 describe two simulations we performed with the model, comparing behavioral policies that make regressions to those that do not. [sent-28, score-0.509]
11 In Simulation 1, we show that specific regressive policies outperform specific non-regressive policies, and in Simulation 2, we use optimization to directly find optimal policies for three performance measures. [sent-29, score-0.458]
12 The results show that the regressive policies outperform non-regressive policies across a wide range of performance measures, demonstrating that our model predicts that making between-word regressions is a rational strategy for reading. [sent-30, score-0.919]
13 The two most successful models of eye movements in reading are E-Z Reader (Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle et al., 2009) and SWIFT (Engbert et al., 2005). [sent-31, score-1.12]
14 While both of these models provide a good fit to eye tracking data from reading, neither model asks the higher-level question of what a rational solution to the problem would look like. [sent-37, score-0.568]
15 The Mr. Chips model simplifies the problem of reading in a number of ways: First, it uses a unigram model as its language model, and thus fails to use any information in the linguistic context to help with word identification. [sent-41, score-0.379]
16 The larger problem, however, is that each of these models uses an unrealistic model of visual input, which obtains absolute knowledge of the characters in its visual window. [sent-45, score-0.744]
17 Thus, there is no reason for the model to spend longer on one fixation than another, and the model only makes predictions for where saccades are targeted, and not how long fixations last. [sent-46, score-0.408]
18–19 Reichle and Laurent (2006) presented a rational model that overcame the limitations of Mr. Chips to produce predictions for both fixation durations and locations, focusing on the ways in which eye movement behavior is an adaptive response to the particular constraints of the task of reading. [sent-47, score-0.315] [sent-48, score-0.549]
20 In this paper, we present another rational model of eye movement control in reading that, like Reichle and Laurent, makes predictions for fixation durations and locations, but which focuses instead on the dynamics of word identification at the core of the task of reading. [sent-50, score-1.16]
21 Specifically, our model identifies the words in a sentence by performing Bayesian inference combining noisy input from a realistic visual model with a language model that takes context into account. [sent-51, score-0.664]
22 In this paper, we use our model to provide a novel explanation for between-word regressive saccades. [sent-52, score-0.440]
23 In reading, about 10–15% of saccades are regressive movements from right-to-left (or to previous lines). [sent-53, score-0.325]
24 That is, the eyes move backwards to a previous word because they accidentally landed further forward than intended due to motor error. [sent-59, score-0.312]
25 From the present perspective, however, it is unclear how it could be rational to move past an unidentified word and decide to revisit it only much later. [sent-67, score-0.338]
26 Here, we suggest a new explanation for between-word regressions that arises as a result of word identification processes (unlike that of E-Z Reader) and can be understood as rational (unlike that of SWIFT). [sent-68, score-0.516]
27 Thus, it is possible that later parts of a sentence can cause a reader’s confidence in the identity of the previous regions to fall. [sent-70, score-0.352]
28 In these cases, a rational way to respond might be to make a between-word regressive saccade to get more visual information about the (now) low-confidence previous region. [sent-71, score-0.921]
29 To illustrate this idea, consider the case of a language composed of just two strings, AB and BA, and assume that the eyes can only get noisy information about the identity of one character at a time. [sent-72, score-0.529]
30 After obtaining a little information about the identity of the first character, the reader may be reasonably confident that its identity is A and move on to obtaining visual input about the second character. [sent-73, score-0.933]
31 If the first noisy input about the second character also indicates that it is probably A, then the normative probability that the first character is A (and thus a rational reader’s confidence in its identity) will fall. [sent-74, score-0.732]
32 This simple example just illustrates the point that if a reader is combining noisy visual information with a language model, then confidence in previous regions will sometimes fall. [sent-75, score-0.682]
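To make this concrete, here is a minimal numeric sketch of the two-string example; the 0.7/0.3 sample likelihoods are illustrative assumptions, not parameters from the paper.

```python
# Minimal sketch of the AB/BA example: a uniform prior over the two
# sentences, updated by noisy per-character samples. Values are illustrative.

PRIOR = {"AB": 0.5, "BA": 0.5}

def update(belief, char_index, likelihood):
    """Bayesian update of sentence beliefs from one noisy character sample.
    `likelihood` maps a candidate true letter to p(sample | letter)."""
    unnorm = {s: p * likelihood[s[char_index]] for s, p in belief.items()}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

# A sample about character 1 weakly favouring 'A':
belief = update(PRIOR, 0, {"A": 0.7, "B": 0.3})
print(belief["AB"])  # 0.7: reasonably confident the first character is A

# A sample about character 2 that also favours 'A'. Since only AB and BA
# exist, evidence that character 2 is A is evidence against AB, so confidence
# in the first character being A falls:
belief = update(belief, 1, {"A": 0.7, "B": 0.3})
print(belief["AB"])  # back to 0.5: confidence in the previous region dropped
```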
33 The second option is to read left-to-right relatively more quickly, and then make occasional right-to-left regressions in the cases where probability in previous regions falls. [sent-80, score-0.312]
34 In this paper, we present two simulations suggesting that, when using a rational model to read natural language, the best strategies for coping with the problem of confidence dropping about previous regions involve making between-word regressions, for any tradeoff between speed and accuracy. [sent-81, score-0.615]
35 In the next section, we present the details of our model of reading and its implementation, and then we present our two simulations in the sections following. [sent-82, score-0.382]
36 Specifically, the model begins reading with a prior distribution over possible identities of a sentence given by its language model. [sent-84, score-0.409]
37 On the basis of that distribution, the model decides whether or not to move its eyes (and if so where to move them to) and obtains noisy visual input about the sentence at the eyes’ position. [sent-85, score-0.878]
38–39 This framework is unique among models of eye movement control in reading (except Mr. Chips) in having a fully explicit model of how visual input is used to discriminate word identity. [sent-88, score-0.652] [sent-89, score-0.445]
40 The hope in our approach is that the influence of these key factors on the eye movement record will fall out as a natural consequence of rational behavior itself. [sent-91, score-0.643]
41 In our framework, in contrast, we would expect such an effect to emerge as a byproduct of Bayesian inference: words with high prior probability (conditional on preceding fixations) will require less visual input to be reliably identified. [sent-94, score-0.384]
42 In the remainder of this section, we present the details of the formalization of the reading problem we used for the simulations reported in this paper: actions, noisy visual input, the control policy, and belief update. [sent-96, score-0.368]
43 If, on the ith timestep, the model chooses option (a), the timestep advances to i + 1 and another sample of visual input is obtained around the current position. [sent-104, score-0.52]
44 If the model chooses option (c), the reading immediately ends. [sent-105, score-0.318]
45 On the next time step, visual input is obtained around ℓ_i and another decision is made. [sent-107, score-0.384]
46 The motor error for saccades follows the form of random error used by all major models of eye movements in reading: the landing position ℓ_i is normally distributed around the intended target t_i, with standard deviation given by a linear function of the intended distance: ℓ_i ∼ N(t_i, σ(d_i)²), where d_i is the intended saccade distance. (In the terminology of the literature, the model has only random motor error (variance), not systematic error (bias).) [sent-108, score-0.425]
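As a hedged illustration of this motor-error assumption, the sketch below samples a landing position whose standard deviation is linear in the intended distance; the intercept and slope values are placeholders, not the paper's fitted parameters.

```python
import random

def landing_position(target, current, sd_intercept=0.87, sd_slope=0.084):
    """Sample a landing position l_i ~ N(t_i, sd^2), where sd is a linear
    function of the intended saccade distance |t_i - l_{i-1}|. The intercept
    and slope here are placeholder values, not the paper's parameters."""
    sd = sd_intercept + sd_slope * abs(target - current)
    return random.gauss(target, sd)

# e.g. a saccade aimed 7 characters right of the current fixation:
# landing_position(target=10.0, current=3.0)
```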
47 As stated earlier, the role of noisy visual input in our model is as the likelihood term in a Bayesian inference about sentence form and identity. [sent-117, score-0.926]
48 We assume that the components of visual input at each character position are conditionally independent given sentence identity, so that if w_j denotes letter j of the sentence and I(j) denotes the component of visual input associated with that letter, then we can decompose p(I|w) as ∏_j p(I(j)|w_j). [sent-120, score-0.691]
49 The visual input obtained from an individual fixation can thus be summarized as a vector of likelihoods p(I(j)|w_j), as shown in Figure 1. [sent-122, score-0.622]
50 Figure 1: Peripheral and foveal visual input in the model. [sent-151, score-0.474]
51 In peripheral vision, the letter/whitespace distinction is veridical, but no information about letter identity is obtained. [sent-155, score-0.38]
52 As in the real visual system, our visual acuity function decreases with retinal eccentricity; we follow the SWIFT model in assuming that the spatial distribution of visual processing rate follows an asymmetric Gaussian with σL = 2. [sent-158, score-0.781]
53 If ε denotes a character’s eccentricity in characters from the center of fixation, then the proportion of the total processing rate at that eccentricity, λ(ε), is given by integrating the asymmetric Gaussian over a character width centered on that position: λ(ε) = ∫_{ε−.5}^{ε+.5} A(x) dx, where A denotes the asymmetric Gaussian density. [sent-161, score-0.369]
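A minimal sketch of this rate computation, assuming SWIFT-style asymmetry parameters; the σL value is truncated in the text above, so the numbers below are our assumption, not the paper's.

```python
import math

# Assumed asymmetry parameters in character units (treat as placeholders).
SIGMA_L, SIGMA_R = 2.41, 3.74

def _cdf(x):
    """CDF of the asymmetric Gaussian processing-rate distribution: a left
    half-Gaussian with sd SIGMA_L and a right half with sd SIGMA_R, scaled
    to form one continuous density."""
    z = SIGMA_L + SIGMA_R
    if x <= 0:
        return (SIGMA_L / z) * (1.0 + math.erf(x / (SIGMA_L * math.sqrt(2.0))))
    return SIGMA_L / z + (SIGMA_R / z) * math.erf(x / (SIGMA_R * math.sqrt(2.0)))

def rate_proportion(eps):
    """lambda(eps): the asymmetric Gaussian integrated over one character
    width centred on eccentricity eps (eps > 0 is right of fixation)."""
    return _cdf(eps + 0.5) - _cdf(eps - 0.5)

# rate_proportion(0.0) gives the centrally fixated character's share.
```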
54 From this distribution, we derive two types of visual input, peripheral input giving word boundary information and foveal input giving information about letter identity. [sent-166, score-0.754]
55–56 In our model, any eccentricity with a processing rate proportion λ(ε) at least 0.5% of the rate proportion for the centrally fixated character (ε ∈ [−7, 12]) yields peripheral visual input, defined as veridical word boundary information, indicating whether each character is a letter or a space. [sent-169, score-0.501] [sent-170, score-0.598]
57 This roughly corresponds to empirical estimates that humans obtain useful information in reading from about 19 characters, more from the right of fixation than the left (Rayner, 1998). [sent-171, score-0.374]
58 Hence in Figure 1, for example, left-peripheral visual input can be represented as veridical knowledge of the initial whitespace (denoted d) and a uniform distribution over the 26 letters of English for the letter a. [sent-172, score-0.525]
59 This threshold of 1% roughly corresponds to estimates that readers get information useful for letter identification from about 4 characters to the left and 8 to the right of fixation (Rayner, 1998). [sent-176, score-0.326]
60 In our model, each letter is equally confusable with all others, following Norris (2006, 2009), but ignoring work on letter confusability (which could be added to future model revisions; Engel, Dougherty, & Jones, 1973; Geyer, 1977). [sent-177, score-0.373]
61 If the eccentricity of the jth character on the tth timestep is outside of foveal input, or the character is a space, the inner term of Equation 3 is 0 or 1. [sent-189, score-0.584]
62 If the sample was from a letter in foveal input (ε ∈ [−5, 8]), it is the probability of sampling I_t(j) from the multivariate Gaussian N(w_j, ΛΣ(ε_j^t)). [sent-190, score-0.291]
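A rough sketch of this inner likelihood term, simplified to an isotropic Gaussian over one-hot letter vectors whose variance shrinks as the processing rate grows; the `base_var` scale and the isotropic covariance are our simplifying assumptions, not the paper's specification.

```python
import math

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def letter_likelihoods(sample, rate, base_var=1.0):
    """Sketch of the inner likelihood term. Each candidate letter is a
    one-hot vector over the alphabet; the noisy sample is assumed drawn from
    an isotropic Gaussian centred on the true letter's vector, with variance
    base_var / rate. Returns unnormalised p(sample | letter) for every
    letter; with this setup all letters are equally confusable, as in the
    model described above."""
    var = base_var / max(rate, 1e-9)
    liks = {}
    for i, letter in enumerate(ALPHABET):
        sq_dist = sum((s - (1.0 if j == i else 0.0)) ** 2
                      for j, s in enumerate(sample))
        liks[letter] = math.exp(-sq_dist / (2.0 * var))
    return liks
```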
63 The model uses a simple policy to decide between actions, based on the marginal probability m of the most likely character in each position. [sent-192, score-0.404]
64 Figure 2: Values of m for a 6-character sentence under which a model fixating position 3 would take each of its four actions, for particular values of α and β. [sent-216, score-0.608]
65 If the value of this statistic for the current position of the eyes, m(ℓ_i), is less than a parameter α, the model chooses to continue fixating the current position (2a). [sent-222, score-0.401]
66 Otherwise, if the value of m(j) is less than β for some leftward position j < ℓ_i, the model initiates a saccade to the closest such position (2b). [sent-223, score-0.303]
67 Finally, if no such positions exist to the right, the model stops reading the sentence (2d). [sent-226, score-0.356]
68 Intuitively, then, the model reads by making a rightward sweep to bring its confidence in each character up to α, but pauses to move left if confidence in a previous character falls below β. [sent-227, score-0.579]
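A sketch of this four-way policy over per-position confidences m(j). The rightward case (2c) is not spelled out in the excerpted sentences, so its target (n characters past the first uncertain position, consistent with the role of n described below) is an assumption.

```python
def choose_action(m, pos, alpha, beta, n=1):
    """Sketch of the control policy over per-position confidences m[j].
    Returns ('fixate', pos)  (2a) keep fixating,
            ('left', j)      (2b) regress to the nearest j with m[j] < beta,
            ('right', j)     (2c) move on (target is an assumption),
            ('stop', None)   (2d) stop reading."""
    if m[pos] < alpha:
        return ("fixate", pos)
    left = [j for j in range(pos) if m[j] < beta]
    if left:
        return ("left", max(left))  # closest low-confidence position leftward
    right = [j for j in range(pos + 1, len(m)) if m[j] < alpha]
    if right:
        # Assumed: aim n characters past the first uncertain position, so the
        # visual field is not centred on that character itself.
        return ("right", min(right) + n)
    return ("stop", None)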
69 To perform belief update given a new visual input, we create a new wFSA to represent the likelihood of each character from the sample. [sent-230, score-0.498]
70–71 For example, the first and second states in the chain are connected by 27 (or fewer) arcs, which emit each of the possible characters for w_1 along with their respective likelihoods given the visual input (as in the inner term of Equation 3). (The role of n is to ensure that the model does not center its visual field on the first uncertain character.) [sent-233, score-0.385] [sent-235, score-0.419]
72 With the description of our model in place, we next describe the first simulation, in which we used the model to test the hypothesis that making regressions is a rational way to cope with falling confidence in previous regions. [sent-238, score-0.807]
73 In terms of our model’s policy parameters α and β described above, non-regressive policies are exactly those with β = 0, and a policy that is faster on the left-to-right pass but does make regressions is one with a lower value of α but a non-zero β. [sent-240, score-0.619]
74 Thus, we tested the performance of our model on the reading of a corpus of text typical of that used in reading experiments, at a range of reasonable non-regressive policies, as well as a set of regressive policies with lower α and positive β. [sent-241, score-0.856]
75 This was done in order to facilitate simple composition with the visual likelihood wFSA defined over characters. [sent-275, score-0.324]
76 For each policy we tested, we measured the average number of timesteps it took to read the sentences, as well as the average (natural) log probability of the correct sentence identity under the model’s beliefs after reading ended (‘Accuracy’). [sent-285, score-0.829]
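A schematic of this evaluation loop; `reader` is a hypothetical stand-in for a function that runs the stochastic model once on a sentence and returns the timesteps taken and the final log probability of the true identity.

```python
import statistics

def evaluate_policy(alpha, beta, sentences, reader, n_runs=50):
    """Sketch of the Simulation 1 evaluation: average timesteps per sentence
    ('speed') and average log probability of the true sentence identity under
    the model's final beliefs ('Accuracy')."""
    times, logps = [], []
    for sentence in sentences:
        for _ in range(n_runs):  # average over runs: the model is stochastic
            t, logp = reader(sentence, alpha, beta)
            times.append(t)
            logps.append(logp)
    return statistics.mean(times), statistics.mean(logps)
```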
77 As shown in the graph, for each non-regressive policy (the circles), there is a regressive policy that outperforms it, both in terms of average number of timesteps taken to read (further to the left) and the average log probability of the sentence identity (higher). [sent-287, score-0.87]
78 Figure 3: Mean number of timesteps taken to read a sentence and the (natural) log probability of the true identity of the sentence (‘Accuracy’) for a range of values of α and β. [sent-288, score-0.495]
79 For each non-regressive policy (β = 0), there is a policy with a lower α and higher β that achieves better accuracy in less time. [sent-290, score-0.296]
80 Thus, for a range of policies, these results suggest that making regressions when confidence about previous regions falls is a rational reader strategy, in that it appears to lead to better performance in terms of both speed and accuracy. [sent-291, score-0.786]
81 In Simulation 2, we perform a more direct test, using optimization techniques, of the idea that making regressions is a rational response to the problem of confidence about previous regions falling. [sent-292, score-0.6]
82 Searching directly for optimal values of α and β for our stochastic reading model is difficult, because each evaluation of the model with a particular set of parameters produces a different result. [sent-308, score-0.606]
83 In addition, we see that the average results of reading at these parameter values are also as we would expect, with T and L going up as γ goes down. [sent-341, score-0.29]
84 This provides more evidence that whatever the particular performance measure used, policies making regressive saccades when confidence in previous regions falls perform better than those that do not. [sent-361, score-0.525]
85 That may at first seem surprising, since the model’s policy is to fixate a region until its confidence becomes greater than α and then return if it falls below β. [sent-364, score-0.259]
86 In this paper, we presented a model that performs Bayesian inference on the identity of a sentence, combining a language model with noisy information about letter identities from a realistic visual input model. [sent-369, score-0.919]
87 On the basis of these inferences, it uses a simple policy to determine how long to continue fixating the current position and where to fixate next, using information about where the model is uncertain about the sentence’s identity. [sent-370, score-0.355]
88 As such, it constitutes a rational model of eye movement control in reading, extending the insights from previous results about rationality in language comprehension. [sent-371, score-0.71]
89 The results of two simulations using this model support a novel explanation for between-word regressive saccades in reading: that they are used to gather visual input about previous regions when confidence about them falls. [sent-372, score-0.94]
90 Rational eye movements in reading combining uncertainty about previous words with contextual probability. [sent-403, score-0.596]
91 Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam sentence corpus. [sent-414, score-0.295]
92 Contextual effects on word perception and eye movements during reading. [sent-433, score-0.303]
93 A dynamical model of saccade generation in reading based on spatially distributed lexical processing. [sent-444, score-0.436]
94 A noisy-channel model of rational human sentence comprehension under uncertain input. [sent-542, score-0.353]
95 A Bayesian model predicts human parse preference and reading time in sentence processing. [sent-585, score-0.356]
96 Eye movements in reading and information processing: 20 years of research. [sent-610, score-0.342]
97 Toward a model of eye movement control in reading. [sent-626, score-0.456]
98 Using E-Z Reader to model the effects of higher level language processing on eye movements during reading. [sent-640, score-0.364]
99 Comparing naming, lexical decision, and eye fixation times: Word frequency effects and individual differences. [sent-649, score-0.335]
100 Integration of visual and linguistic information in spoken language comprehension. [sent-674, score-0.324]
wordName wordTfidf (topN-words)
[('visual', 0.324), ('reading', 0.257), ('rational', 0.254), ('eye', 0.218), ('regressions', 0.192), ('eyes', 0.168), ('wfsa', 0.165), ('identity', 0.16), ('rayner', 0.152), ('regressive', 0.15), ('timesteps', 0.15), ('policy', 0.148), ('reader', 0.145), ('reichle', 0.144), ('character', 0.142), ('letter', 0.141), ('swift', 0.132), ('policies', 0.131), ('movement', 0.125), ('levy', 0.123), ('saccade', 0.118), ('fixation', 0.117), ('chips', 0.105), ('engbert', 0.092), ('foveal', 0.09), ('saccades', 0.09), ('movements', 0.085), ('simulation', 0.085), ('move', 0.084), ('regions', 0.079), ('peripheral', 0.079), ('confidence', 0.075), ('bicknell', 0.075), ('eccentricity', 0.075), ('timestep', 0.075), ('wj', 0.066), ('bayesian', 0.066), ('kliegl', 0.064), ('simulations', 0.064), ('position', 0.062), ('model', 0.061), ('motor', 0.06), ('klitz', 0.06), ('legge', 0.06), ('tjan', 0.06), ('wfsas', 0.06), ('input', 0.06), ('noisy', 0.059), ('identities', 0.053), ('fixated', 0.052), ('jaeger', 0.052), ('pollatsek', 0.052), ('control', 0.052), ('psychological', 0.05), ('fixating', 0.048), ('actions', 0.047), ('optimal', 0.046), ('behavior', 0.046), ('genzel', 0.045), ('schilling', 0.045), ('predictions', 0.043), ('mohri', 0.043), ('rate', 0.042), ('read', 0.041), ('laurent', 0.041), ('speed', 0.041), ('norris', 0.039), ('sentence', 0.038), ('explanation', 0.037), ('bigrams', 0.037), ('hale', 0.036), ('uncertainty', 0.036), ('fixations', 0.036), ('fixate', 0.036), ('tracking', 0.035), ('bnc', 0.035), ('cognitive', 0.035), ('log', 0.035), ('characters', 0.035), ('values', 0.033), ('difficulty', 0.033), ('keller', 0.033), ('identification', 0.033), ('belief', 0.032), ('diego', 0.032), ('bigram', 0.03), ('acuity', 0.03), ('blasko', 0.03), ('chumbley', 0.03), ('confusability', 0.03), ('connine', 0.03), ('dougherty', 0.03), ('ehrlich', 0.03), ('engel', 0.03), ('exogenous', 0.03), ('gel', 0.03), ('geyer', 0.03), ('hooven', 0.03), ('longtin', 0.03), ('mcconnell', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 13 acl-2010-A Rational Model of Eye Movement Control in Reading
Author: Klinton Bicknell ; Roger Levy
Abstract: A number of results in the study of real-time sentence comprehension have been explained by computational models as resulting from the rational use of probabilistic linguistic information. Many times, these hypotheses have been tested in reading by linking predictions about relative word difficulty to word-aggregated eye tracking measures such as go-past time. In this paper, we extend these results by asking to what extent reading is well-modeled as rational behavior at a finer level of analysis, predicting not aggregate measures, but the duration and location of each fixation. We present a new rational model of eye movement control in reading, the central assumption of which is that eye movement decisions are made to obtain noisy visual information as the reader performs Bayesian inference on the identities of the words in the sentence. As a case study, we present two simulations demonstrating that the model gives a rational explanation for between-word regressions.
2 0.19795726 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller
Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.
3 0.12962863 59 acl-2010-Cognitively Plausible Models of Human Language Processing
Author: Frank Keller
Abstract: We pose the development of cognitively plausible models of human language processing as a challenge for computational linguistics. Existing models can only deal with isolated phenomena (e.g., garden paths) on small, specifically selected data sets. The challenge is to build models that integrate multiple aspects of human language processing at the syntactic, semantic, and discourse level. Like human language processing, these models should be incremental, predictive, broad coverage, and robust to noise. This challenge can only be met if standardized data sets and evaluation measures are developed.
4 0.11270616 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing
Author: Amit Dubey
Abstract: Probabilistic models of sentence comprehension are increasingly relevant to questions concerning human language processing. However, such models are often limited to syntactic factors. This paper introduces a novel sentence processing model that consists of a parser augmented with a probabilistic logic-based model of coreference resolution, which allows us to simulate how context interacts with syntax in a reading task. Our simulations show that a Weakly Interactive cognitive architecture can explain data which had been provided as evidence for the Strongly Interactive hypothesis.
5 0.11237069 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data
Author: Katrin Tomanek ; Udo Hahn ; Steffen Lohmann ; Jurgen Ziegler
Abstract: We report on an experiment to track complex decision points in linguistic metadata annotation where the decision behavior of annotators is observed with an eye-tracking device. As experimental conditions we investigate different forms of textual context and linguistic complexity classes relative to syntax and semantics. Our data renders evidence that annotation performance depends on the semantic and syntactic complexity of the decision points and, more interestingly, indicates that full-scale context is mostly negligible, with the exception of semantic high-complexity cases. We then induce from this observational data a cognitively grounded cost model of linguistic meta-data annotations and compare it with existing non-cognitive models. Our data reveals that the cognitively founded model explains annotation costs (expressed in annotation time) more adequately than non-cognitive ones.
6 0.10072853 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
7 0.10011999 167 acl-2010-Learning to Adapt to Unknown Users: Referring Expression Generation in Spoken Dialogue Systems
8 0.069197856 29 acl-2010-An Exact A* Method for Deciphering Letter-Substitution Ciphers
9 0.069148652 202 acl-2010-Reading between the Lines: Learning to Map High-Level Instructions to Commands
10 0.06319692 112 acl-2010-Extracting Social Networks from Literary Fiction
11 0.061359711 187 acl-2010-Optimising Information Presentation for Spoken Dialogue Systems
12 0.06069576 158 acl-2010-Latent Variable Models of Selectional Preference
13 0.057915553 170 acl-2010-Letter-Phoneme Alignment: An Exploration
14 0.056463614 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
15 0.0555471 242 acl-2010-Tree-Based Deterministic Dependency Parsing - An Application to Nivre's Method -
16 0.052998159 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms
17 0.052988797 102 acl-2010-Error Detection for Statistical Machine Translation Using Linguistic Features
18 0.052090086 168 acl-2010-Learning to Follow Navigational Directions
19 0.051851392 142 acl-2010-Importance-Driven Turn-Bidding for Spoken Dialogue Systems
20 0.051513243 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification
topicId topicWeight
[(0, -0.155), (1, 0.045), (2, -0.017), (3, -0.111), (4, -0.025), (5, -0.097), (6, -0.051), (7, -0.019), (8, 0.09), (9, 0.008), (10, -0.083), (11, 0.048), (12, 0.134), (13, 0.127), (14, -0.122), (15, 0.095), (16, 0.039), (17, 0.033), (18, -0.106), (19, 0.052), (20, -0.103), (21, 0.028), (22, 0.015), (23, 0.071), (24, 0.023), (25, -0.02), (26, -0.007), (27, -0.118), (28, 0.028), (29, 0.056), (30, -0.005), (31, -0.029), (32, 0.09), (33, 0.0), (34, -0.047), (35, -0.033), (36, 0.035), (37, 0.096), (38, 0.034), (39, -0.083), (40, 0.037), (41, 0.035), (42, 0.044), (43, -0.049), (44, -0.029), (45, 0.025), (46, 0.045), (47, -0.034), (48, 0.065), (49, 0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.93038243 13 acl-2010-A Rational Model of Eye Movement Control in Reading
Author: Klinton Bicknell ; Roger Levy
Abstract: A number of results in the study of realtime sentence comprehension have been explained by computational models as resulting from the rational use of probabilistic linguistic information. Many times, these hypotheses have been tested in reading by linking predictions about relative word difficulty to word-aggregated eye tracking measures such as go-past time. In this paper, we extend these results by asking to what extent reading is well-modeled as rational behavior at a finer level of analysis, predicting not aggregate measures, but the duration and location of each fixation. We present a new rational model of eye movement control in reading, the central assumption of which is that eye move- ment decisions are made to obtain noisy visual information as the reader performs Bayesian inference on the identities of the words in the sentence. As a case study, we present two simulations demonstrating that the model gives a rational explanation for between-word regressions.
2 0.78685051 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
Author: Stephen Wu ; Asaf Bachrach ; Carlos Cardenas ; William Schuler
Abstract: Hierarchical HMM (HHMM) parsers make promising cognitive models: while they use a bounded model of working memory and pursue incremental hypotheses in parallel, they still achieve parsing accuracies competitive with chart-based techniques. This paper aims to validate that a right-corner HHMM parser is also able to produce complexity metrics, which quantify a reader’s incremental difficulty in understanding a sentence. Besides defining standard metrics in the HHMM framework, a new metric, embedding difference, is also proposed, which tests the hypothesis that HHMM store elements represents syntactic working memory. Results show that HHMM surprisal outperforms all other evaluated metrics in predicting reading times, and that embedding difference makes a significant, independent contribution.
3 0.71089828 59 acl-2010-Cognitively Plausible Models of Human Language Processing
Author: Frank Keller
Abstract: We pose the development of cognitively plausible models of human language processing as a challenge for computational linguistics. Existing models can only deal with isolated phenomena (e.g., garden paths) on small, specifically selected data sets. The challenge is to build models that integrate multiple aspects of human language processing at the syntactic, semantic, and discourse level. Like human language processing, these models should be incremental, predictive, broad coverage, and robust to noise. This challenge can only be met if standardized data sets and evaluation measures are developed.
4 0.70790058 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller
Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.
5 0.62068546 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing
Author: Amit Dubey
Abstract: Probabilistic models of sentence comprehension are increasingly relevant to questions concerning human language processing. However, such models are often limited to syntactic factors. This paper introduces a novel sentence processing model that consists of a parser augmented with a probabilistic logic-based model of coreference resolution, which allows us to simulate how context interacts with syntax in a reading task. Our simulations show that a Weakly Interactive cognitive architecture can explain data which had been provided as evidence for the Strongly Interactive hypothesis.
6 0.51888645 29 acl-2010-An Exact A* Method for Deciphering Letter-Substitution Ciphers
7 0.4681372 61 acl-2010-Combining Data and Mathematical Models of Language Change
8 0.45727533 173 acl-2010-Modeling Norms of Turn-Taking in Multi-Party Conversation
9 0.45412868 116 acl-2010-Finding Cognate Groups Using Phylogenies
10 0.4352982 16 acl-2010-A Statistical Model for Lost Language Decipherment
11 0.43008339 157 acl-2010-Last but Definitely Not Least: On the Role of the Last Sentence in Automatic Polarity-Classification
12 0.42768109 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data
14 0.40098095 202 acl-2010-Reading between the Lines: Learning to Map High-Level Instructions to Commands
15 0.39943475 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
16 0.39222607 168 acl-2010-Learning to Follow Navigational Directions
17 0.38304129 74 acl-2010-Correcting Errors in Speech Recognition with Articulatory Dynamics
18 0.38242462 137 acl-2010-How Spoken Language Corpora Can Refine Current Speech Motor Training Methodologies
19 0.38119689 35 acl-2010-Automated Planning for Situated Natural Language Generation
20 0.3710843 68 acl-2010-Conditional Random Fields for Word Hyphenation
topicId topicWeight
[(14, 0.026), (25, 0.044), (37, 0.017), (39, 0.014), (42, 0.023), (59, 0.067), (73, 0.06), (76, 0.012), (78, 0.046), (80, 0.017), (83, 0.133), (84, 0.084), (87, 0.268), (98, 0.103)]
simIndex simValue paperId paperTitle
same-paper 1 0.77610159 13 acl-2010-A Rational Model of Eye Movement Control in Reading
Author: Klinton Bicknell ; Roger Levy
Abstract: A number of results in the study of real-time sentence comprehension have been explained by computational models as resulting from the rational use of probabilistic linguistic information. Many times, these hypotheses have been tested in reading by linking predictions about relative word difficulty to word-aggregated eye tracking measures such as go-past time. In this paper, we extend these results by asking to what extent reading is well-modeled as rational behavior at a finer level of analysis, predicting not aggregate measures, but the duration and location of each fixation. We present a new rational model of eye movement control in reading, the central assumption of which is that eye movement decisions are made to obtain noisy visual information as the reader performs Bayesian inference on the identities of the words in the sentence. As a case study, we present two simulations demonstrating that the model gives a rational explanation for between-word regressions.
2 0.74990445 166 acl-2010-Learning Word-Class Lattices for Definition and Hypernym Extraction
Author: Roberto Navigli ; Paola Velardi
Abstract: Definition extraction is the task of automatically identifying definitional sentences within texts. The task has proven useful in many research areas including ontology learning, relation extraction and question answering. However, current approaches, mostly focused on lexico-syntactic patterns, suffer from both low recall and precision, as definitional sentences occur in highly variable syntactic structures. In this paper, we propose Word-Class Lattices (WCLs), a generalization of word lattices that we use to model textual definitions. Lattices are learned from a dataset of definitions from Wikipedia. Our method is applied to the task of definition and hypernym extraction and compares favorably to other pattern generalization methods proposed in the literature.
3 0.61571789 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis
Author: Georgios Paltoglou ; Mike Thelwall
Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.
4 0.59744227 59 acl-2010-Cognitively Plausible Models of Human Language Processing
Author: Frank Keller
Abstract: We pose the development of cognitively plausible models of human language processing as a challenge for computational linguistics. Existing models can only deal with isolated phenomena (e.g., garden paths) on small, specifically selected data sets. The challenge is to build models that integrate multiple aspects of human language processing at the syntactic, semantic, and discourse level. Like human language processing, these models should be incremental, predictive, broad coverage, and robust to noise. This challenge can only be met if standardized data sets and evaluation measures are developed.
5 0.59618032 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
Author: Stephen Wu ; Asaf Bachrach ; Carlos Cardenas ; William Schuler
Abstract: Hierarchical HMM (HHMM) parsers make promising cognitive models: while they use a bounded model of working memory and pursue incremental hypotheses in parallel, they still achieve parsing accuracies competitive with chart-based techniques. This paper aims to validate that a right-corner HHMM parser is also able to produce complexity metrics, which quantify a reader’s incremental difficulty in understanding a sentence. Besides defining standard metrics in the HHMM framework, a new metric, embedding difference, is also proposed, which tests the hypothesis that HHMM store elements represents syntactic working memory. Results show that HHMM surprisal outperforms all other evaluated metrics in predicting reading times, and that embedding difference makes a significant, independent contribution.
6 0.59166551 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."
7 0.58581603 158 acl-2010-Latent Variable Models of Selectional Preference
8 0.58452505 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields
9 0.58398241 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
10 0.58068025 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
11 0.58064091 216 acl-2010-Starting from Scratch in Semantic Role Labeling
12 0.5804435 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
13 0.57821208 195 acl-2010-Phylogenetic Grammar Induction
14 0.57246405 155 acl-2010-Kernel Based Discourse Relation Recognition with Temporal Ordering Information
15 0.57171696 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People
16 0.57049543 39 acl-2010-Automatic Generation of Story Highlights
17 0.57021594 73 acl-2010-Coreference Resolution with Reconcile
18 0.56989753 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses
19 0.56986785 112 acl-2010-Extracting Social Networks from Literary Fiction
20 0.56958854 251 acl-2010-Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews