nips nips2008 nips2008-124 knowledge-graph by maker-knowledge-mining

124 nips-2008-Load and Attentional Bayes


Source: pdf

Author: Peter Dayan

Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. [sent-4, score-0.096]

2 A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. [sent-5, score-0.082]

3 1 Introduction It was some fifty years after James (1950)’s famously poetic description of our capacities for attention that more analytically-directed experiments began, based originally on dichotic listening (Cherry, 1953). [sent-8, score-0.182]

4 Various forms, interpretations and conflicts about these three tasks have permeated the field of attention ever since (Driver, 2001; Paschler, 1998), driven by different notions of the computational tasks and constraints at hand. [sent-10, score-0.18]

5 The experiments in dichotic listening coincided with the quickly burgeoning realization that mathematical concepts from Shannonian information theory would be very helpful for understanding biological information processing. [sent-11, score-0.086]

6 One central concept in information theory is that of a limited capacity channel, and Broadbent (1958) adopted this as a formal basis for understanding the necessity for, and hence the nature of, selection. [sent-12, score-0.056]

7 James (1986) (based on influential experiments on distractor processing such as Eriksen and Eriksen, 1974), which suggests that the smaller the attentional focus, the more intense it can somehow be, given that the limited capacity is ‘spread’ over a smaller area. [sent-16, score-0.662]

8 However, of course, late selection makes little sense from a limited capacity viewpoint; and short of a theory of what controls the degree of attenuation of irrelevant stimuli, Treisman (1960)’s idea is hard to falsify. [sent-17, score-0.222]

9 To reiterate, the attentional load hypothesis, although an attractive formalization of attenuation, suggests that the brain is unable on easy tasks to exclude information that is known to be irrelevant. [sent-19, score-0.786]

10 It therefore involves an arguably infelicitous combination of sophisticated attentional shaping (as to what can be attended in high-load situations) with inept control. [sent-20, score-0.295]

11 Although the Bayesian revolution in cognitive science has had a huge impact over modern views of sensory processing (see, for instance, Rao et al. [sent-21, score-0.045]

12 This is despite the many other computational models of attention (see Itti and Koch, 2001; Zhaoping, 2006). [sent-24, score-0.096]

13 Indeed, Whiteley and Sahani (2008) have suggested that this lacuna arises from a focus on optimal Bayesian inference in the face of small numbers of objects in the focus of attention, rather than the necessity of using approximate methods in the light of realistic, cluttered, complex scenes. [sent-25, score-0.099]

14 They acknowledge that there is a critical limited resource coming from the existence of neurons with large receptive fields into which experimenters slot multiple sensory objects, some relevant, some irrelevant. [sent-28, score-0.197]

15 Probabilistically-correct inference should then implement selection, when data that is known to be irrelevant is excluded to the advantage of the relevant information (eg Dayan and Zemel, 1999; Palmer, 1994). [sent-29, score-0.091]

16 However, in other circumstances, it will be appropriate to take advantage of the information about the target that is available in the neurons with large fields, even if this means allowing some influence on the final decisions from distractors. [sent-30, score-0.115]

17 Here, we build a Bayesian-inspired account of key data used to argue for the attentional load hypothesis (based on an extension of Yu et al. [sent-31, score-0.77]

18 Subjects had to report the identity of a target letter that was either an ‘X’ or an ‘N’ (here, the former) presented in one of eight locations arranged in a circle around the fixation point. [sent-35, score-0.18]

19 There was also a distractor letter in the further periphery (the larger ‘N’) which was either compatible (ie the same as the target), incompatible (as here, the opposite of the target), or, in so-called neutral trials, a different letter altogether. [sent-37, score-0.784]

20 Figure 1A is a high-load condition, in that there are irrelevant non-targets in the remaining 7 positions around the circle. [sent-39, score-0.054]

21 Figure 1C is a critical control, called the degraded low-load condition, and was actually the main topic of Lavie and de Fockert (2003). [sent-41, score-0.13]

22 In this, the difficulty of the sensory processing was increased (by making the target smaller and dimmer) without changing the attentional (ie selectional) load. [sent-42, score-0.429]

23 Figure 1D shows the mean reaction times (RTs) for these conditions for the three sorts of distractor (RTs suffice here, since there was no speed accuracy tradeoff at work in the different conditions; data not shown). [sent-43, score-0.372]

24 The central finding about attentional load is that the distractor exerted a significant effect over target processing only in the low load case – that is, an incompatible distractor slowed down the RTs compared with a neutral distractor for the low load case but not the high load case. [sent-45, score-3.656]

25 Figure 1: The attentional load task, from Lavie and de Fockert (2003). [sent-46, score-0.744]

26 Subjects had to judge whether a target letter in the central circle around fixation was ‘N’ or ‘X’ in the face of a compatible, incompatible (shown) or neutral distractor. [sent-47, score-0.43]

27 C) degraded low-load condition with no non-targets but a smaller (not shown) and darker target. [sent-50, score-0.138]

28 Since, in the degraded low-load case the RTs were slower but the influence of the distractor was if anything greater, this could not just be a function of the processing time or difficulty. [sent-53, score-0.438]

29 It is apparent that compatible distractors were of almost no help in any case, whereas incompatible distractors were harmful. [sent-56, score-0.435]

30 3 The Bayesian model The data in figure 1 pose the question for normative modeling as to why the distractor would corrupt processing of the target in the easy, low-load, case, but not the difficult, high-load case. [sent-57, score-0.491]

31 No normative account could simply assume that extra data ‘leak’ through in the low-load condition (which is the attentional load hypothesis) if the subjects have the ability to fashion attention far more finely in other cases, such as that of high load. [sent-58, score-0.978]

32 In this case, normative processing will combine information from all the receptive fields, with Bayesian inference and marginalization exactly eliminating any substantial impact from those that are useless or confusing. [sent-60, score-0.16]

33 In the high load case, the proximal non-target stimuli have the effect of adding so much extra noise to the units with large receptive fields compared with their signal about the target, that only the smallest receptive fields will be substantially useful. [sent-61, score-0.838]

34 This implies that the distractor will exert little influence. [sent-62, score-0.376]

35 In the low load case, large receptive fields that also include the distractor will be usefully informative about the target, and so the distractor will exert an influence. [sent-63, score-1.343]

36 Note that this happens automatically through inference – indeed to make this point starkly, there is no explicit attentional control signal in our model whatsoever, only inference and marginalization. [sent-64, score-0.343]

37
                    low load              high load
                  n    t    n    d      n    t    n    d
   neutral        0   +c    0    0     +1   +c   -1    0
   incompatible   0   +c    0   -1     +1   +c   -1   -1
   compatible     0   +c    0   +1     +1   +c   -1   +1
Table 1: Our version of the task. [sent-66, score-0.912]

38 Each display consists of four stimulus positions labelled n for the non-targets; t for the target (shown in the table, though not the display, as being boxed); and d for the distractor, which is relatively far from the target. [sent-68, score-0.115]

39 The target takes the values ±c, where c acts like a contrast; subjects have to report its sign. [sent-69, score-0.148]

40 The distractor can be 0 (neutral) or ±1; and is compatible if it has the same sign as the target (and conversely, incompatible). [sent-70, score-0.609]

41 3) only being run for the case of low load, as in figure 1D. [sent-74, score-0.041]

42 The target takes the value ±c; subjects have to report its sign. [sent-77, score-0.148]

43 The distractor can be neutral (0) or have the same sign as (compatible) or a different sign from (incompatible) the target. [sent-78, score-0.529]

44 In the low load condition, the non-target units are 0; in the high load, one is +1; the other is −1, making them balanced, but confusing, because they lead to excess noise. [sent-79, score-0.651]
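The displays in table 1 can be generated by a small helper function (a sketch for illustration only; the function name, the value of c, and the encoding are ours, not the paper's):

```python
c = 1.0  # target contrast (illustrative value; the paper treats c as a free parameter)

def stimulus(load, distractor, target_sign=+1):
    """Return the values at the four positions [n, t, n, d] for one display."""
    # non-targets are 0 under low load; balanced +1/-1 under high load
    n1, n2 = (0.0, 0.0) if load == "low" else (+1.0, -1.0)
    t = target_sign * c
    d = {"neutral": 0.0,
         "compatible": float(target_sign),
         "incompatible": float(-target_sign)}[distractor]
    return [n1, t, n2, d]

# e.g. the high-load incompatible display with a positive target:
print(stimulus("high", "incompatible"))  # [1.0, 1.0, -1.0, -1.0]
```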

45 We assume that the subject performs inference about the sign of the target based on noisy observations created by a generative model. [sent-81, score-0.232]

46 In the generative model, the values in table 1 amount to hidden structure, which, as in Yu et al. [sent-82, score-0.07]

47 (2008), is mapped and mixed through various receptive fields to provide the noisy input to a Bayesian recognition model. [sent-83, score-0.084]

48 The job of the recognition model is to calculate the posterior probability of the various hidden settings given data, and, by marginalizing (summing) out all the hidden settings apart from the state of the target, report on its sign. [sent-84, score-0.06]

49 Figure 2A shows the generative model, indicating the receptive fields (RFs) associated with this mixing. [sent-85, score-0.124]

50 We consider 8 topographically-mapped units, 4 with small RFs covering only a single input (the generative weights are just the identity map); and 4 with large RFs (in which the inputs are mixed together more holistically). [sent-86, score-0.066]

51 For simplicity, we treat the distractor as equidistant from the target and non-target input, partially modeling the fact that it can be in different locations. [sent-88, score-0.452]

52 We assume a crude form of signal-dependent noise; it is this that makes the non-target stimuli so devastating. [sent-89, score-0.06]
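The observation process just described — four inputs mapped to eight units with small and large RFs, corrupted by a crude signal-dependent noise — can be sketched as follows. The particular weights and noise constants are our assumptions, chosen only so that balanced non-targets cancel in the large-RF means while inflating their noise:

```python
import numpy as np

# Small RFs: identity map over the four positions [n, t, n, d].
W_small = np.eye(4)
# Large RFs: each pools several positions, with equal weight on the two
# non-target positions, so balanced +1/-1 non-targets cancel in the mean.
W_large = np.array([[0.3, 0.4, 0.3, 0.0],
                    [0.2, 0.4, 0.2, 0.2],
                    [0.2, 0.3, 0.2, 0.3],
                    [0.1, 0.3, 0.1, 0.5]])
W = np.vstack([W_small, W_large])   # 8 x 4 generative weights

def observe(x, base_sigma=0.3, rng=None):
    """One noisy sample of the 8 units; the std of each unit grows with the
    total pooled signal magnitude (a crude form of signal-dependent noise)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x = np.asarray(x, float)
    return W @ x + (base_sigma + np.abs(W) @ np.abs(x)) * rng.standard_normal(8)
```

With x_low = [0, c, 0, d] and x_high = [+1, c, −1, d], the large-RF means coincide while the high-load standard deviations are larger — the property that drives the load effect in this account.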

53 Figure 2B shows the means and standard deviations arising from the generative model for the 8 units (one per column) for the six conditions in table 1 (rows from top to bottom – low load: neutral, incompatible, compatible; then high load: neutral, incompatible, compatible). [sent-90, score-0.268]

54 The means associated with the small and large RF target units show the lack of bias from the non-targets in the high-load condition; and for the large RF case, the bias associated with the distractor. [sent-92, score-0.221]

55 The standard deviations play the most critical role in the model, defining what it means for the nontarget stimuli, when present, to make inference difficult. [sent-93, score-0.118]

56 In the high load case, the units with the large RFs are assumed to have very high standard deviations, coming from a crude form of signal-dependent noise. [sent-95, score-0.678]

57 This captures the relative uselessness of these large RFs in the high load condition. [sent-96, score-0.504]

58 [Figure 2, panel A: generative weights from the inputs (n, t, d) to units 1–4 (small RFs) and units 5–8 (large RFs); panel B: the mean and input standard deviation of each unit under low and high attentional load for the incompatible, neutral and compatible conditions.] Figure 2: The generative model. [sent-98, score-0.705]

59 A) In the model, the four input units, representing non-targets, the target and the distractor, are assumed to generate 8 input units which fall into two groups, with small and large receptive fields (RFs). [sent-99, score-0.305]

60 B) These plots show the means and standard deviations in the generative model associated with the 8 input units for the low and high load cases shown in table 1 (in raster scan order). [sent-102, score-0.743]

61 The means for the large RFs (based on the weights in A) are unaffected by the load; the standard deviations for the units with large receptive fields are much higher in the high load condition. [sent-103, score-0.746]

62 Standard deviations are affected by a coarse form of signal-dependent noise. [sent-104, score-0.052]

63 In all cases, a new sample from the generative model is provided at each time step; the noise corrupting each of the observed units is assumed to be Gaussian, and independent across units and over time. [sent-105, score-0.278]

64 (2008), it is necessary to perform inference over all the possible values of the hidden variables (all the possible values of the hidden structure²), then marginalizing out all the variables apart from the target itself. [sent-108, score-0.212]

65 Inference proceeds until a threshold of 0.9 is reached on the probability that the target is either positive or negative (reporting whichever one is more likely). [sent-110, score-0.151]

66 There was also a probability of 0.01 per step of stopping the accumulation and reporting whichever sign of target has a higher probability (guessing randomly if this probability is 0.5). [sent-112, score-0.191]
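The sequential recognition scheme described here — sample observations, accumulate log-likelihoods over the hidden settings, marginalize out everything except the target's sign, and stop at a posterior threshold of 0.9 or with probability 0.01 per step — might be sketched as follows. This is a rough illustration, not the paper's implementation; the weights, noise law, and contrast value are our assumptions:

```python
import numpy as np

# Generative weights for 8 units over the positions [n, t, n, d]:
# 4 small RFs (identity) and 4 large RFs pooling several positions,
# with equal weight on the two non-target positions.
W = np.vstack([np.eye(4),
               [[0.3, 0.4, 0.3, 0.0],
                [0.2, 0.4, 0.2, 0.2],
                [0.2, 0.3, 0.2, 0.3],
                [0.1, 0.3, 0.1, 0.5]]])

def sigma(x):
    """Crude signal-dependent noise: std grows with pooled signal size."""
    return 0.3 + np.abs(W) @ np.abs(np.asarray(x, float))

def run_trial(non_targets, distractor, target_sign=+1, c=1.0,
              seed=0, max_steps=2000):
    """One trial: accumulate Gaussian log-likelihoods over the hidden
    settings (target sign x distractor value), marginalize out the
    distractor, and stop at 0.9 posterior or with prob. 0.01 per step."""
    rng = np.random.default_rng(seed)
    true_x = np.array([non_targets[0], target_sign * c,
                       non_targets[1], distractor], float)
    hyps = [np.array([non_targets[0], s * c, non_targets[1], d])
            for s in (+1, -1) for d in (0.0, +1.0, -1.0)]
    signs = np.array([+1] * 3 + [-1] * 3)
    logp = np.zeros(6)                         # flat prior over hypotheses
    for step in range(1, max_steps + 1):
        y = W @ true_x + sigma(true_x) * rng.standard_normal(8)
        for i, h in enumerate(hyps):           # Gaussian log-likelihoods
            m, s2 = W @ h, sigma(h) ** 2
            logp[i] += -0.5 * np.sum((y - m) ** 2 / s2 + np.log(s2))
        p = np.exp(logp - logp.max())
        p /= p.sum()
        p_pos = p[signs > 0].sum()             # marginalize the distractor
        if max(p_pos, 1 - p_pos) >= 0.9 or rng.random() < 0.01:
            break
    return (+1 if p_pos >= 0.5 else -1), step
```

High-load trials correspond to non_targets=(+1, -1); their larger signal-dependent noise makes the large-RF units less informative, so on this account the distractor's influence on RTs should shrink, qualitatively as in figure 3.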

67 This general pattern of results is robust to many different parameter values; though it is possible (by reducing c) to make inference take very much longer still in the degraded low load condition whilst maintaining and boosting the effect of high load. [sent-122, score-0.72]

68 In the low load case, the lack of non-targets means that the inputs based on the large RFs are usefully informative about the target, and therefore automatically play a key role in posterior inference. [sent-125, score-0.572]

69 Since these inputs are also influenced by the distractor, there is an RT effect. (Footnote 2: in fact, also including the possibility of a degraded high-load case.) [sent-126, score-0.127]

70 [Figure 3, panel A: RTs in steps; panel B: error rates; bars for incompatible, neutral and compatible distractors in the low-load, high-load and degraded low-load conditions.] Figure 3: Results. [sent-131, score-1.637]

71 A) Mean RTs (steps of inference) for correct choices in each of the 9 cases (since the target is equally often positive and negative, we averaged over these cases). [sent-132, score-0.115]

72 There was a probability of 0.01 per step that inference would terminate early with whichever response was more probable. [sent-135, score-0.098]

73 However, in the high load case, the non-target stimuli are closer to the target and exert substantial influence over the noise corrupting the large RF units associated with it (and no net signal). [sent-139, score-0.85]

74 This makes these large RF units relatively poor sources of information about the target. [sent-140, score-0.106]

75 Thus the smaller RF units, which are not affected by the distractor, are relied upon instead. [sent-141, score-0.106]

76 The compatible distractor is helpful to a lesser extent than the incompatible one is harmful, for a couple of reasons. [sent-145, score-0.592]

77 Second, compared with a neutral distractor, the compatible distractor increases the (signal-dependent) noise associated with the units with large RFs, reducing their informativeness about the target. [sent-147, score-0.672]

78 4 Discussion In this paper, we have shown how to account for key results used to argue for an attentional load hypothesis. [sent-148, score-0.77]

79 Our model involves simple Bayesian inference based on a generative process recognizing the existence of small and large receptive fields. [sent-149, score-0.161]

80 That is, the model does not employ an explicit attentional mechanism in inference which has the capacity to downplay some input units over others. [sent-155, score-0.468]

81 It would be interesting to design neurophysiological experiments to probe the form of online selection at work in the attentional load tasks. [sent-158, score-0.744]

82 (2008) is that the model here includes RFs of different sizes, whereas in that model, the distractors were always close to the target. [sent-160, score-0.09]

83 Further, the two neutral conditions here (no distractor, and low load) were not modeled in the earlier study. [sent-161, score-0.153]

84 (2008) suggested that the anterior cingulate might monitor conflict between the cases of compatible and incompatible distractors as part of an approximate inference strategy. [sent-163, score-0.414]

85 The assumptions of large RFs and their high standard deviations in the high load condition are certainly rather simplistic. [sent-165, score-0.622]

86 The attentional load theory has been applied to many tasks (including the regular Eriksen task, Eriksen and Eriksen, 1974) as well as the one here. [sent-168, score-0.786]

87 Perhaps the most significant lacuna is that, as in the Eriksen task, we assumed that the subjects knew the location of the target in the stimulus array, whereas in the real experiment, this had to be inferred from the letters in the circle of targets close to fixation (figure 1A). [sent-170, score-0.203]

88 It would also be worth extending the current model to the much wider range of other tasks used to explore the effects of attentional load (such as Forster and Lavie, 2008). [sent-173, score-0.811]

89 In conclusion, we have suggested a particular rationale for an attenuation theory of attention, which puts together the three tasks suggested at the outset for dichotic listening. [sent-174, score-0.25]

90 The key resource limitation is the restricted number, and therefore necessarily broad tuning, of RFs; the normative response to this makes attenuation and combination kissing cousins. [sent-176, score-0.123]

91 A selective review of selective attention research from the past century. [sent-206, score-0.294]

92 The locus of interference in the perception of simultaneous stimuli. [sent-210, score-0.064]

93 Effects of noise-letters on identification of a target letter in a nonsearch task. [sent-215, score-0.155]

94 Visual attention within and around the field of focal attention: a zoom lens model. [sent-223, score-0.096]

95 Failures to ignore entirely irrelevant distractors: the role of load. [sent-228, score-0.054]

96 Contrasting effects of sensory limits and capacity limits in visual selective attention. [sent-248, score-0.262]

97 Perceptual load as a major determinant of the locus of selection in visual attention. [sent-253, score-0.512]

98 Selective attention gates visual processing in the extrastriate cortex. [sent-262, score-0.133]

99 Set-size effects in visual search: the effect of attention is independent of the stimulus for simple tasks. [sent-290, score-0.158]

100 Theoretical understanding of the early visual processes by data compression and data selection. [sent-354, score-0.062]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('load', 0.475), ('distractor', 0.337), ('rfs', 0.314), ('lavie', 0.27), ('attentional', 0.269), ('eriksen', 0.225), ('fockert', 0.15), ('incompatible', 0.138), ('compatible', 0.117), ('target', 0.115), ('neutral', 0.112), ('units', 0.106), ('degraded', 0.101), ('rts', 0.101), ('selective', 0.099), ('attention', 0.096), ('distractors', 0.09), ('yu', 0.086), ('attenuation', 0.084), ('receptive', 0.084), ('treisman', 0.075), ('rf', 0.074), ('dayan', 0.07), ('rev', 0.067), ('itti', 0.066), ('stimuli', 0.06), ('baldwin', 0.06), ('desimone', 0.06), ('dichotic', 0.06), ('duncan', 0.06), ('psychophys', 0.06), ('capacity', 0.056), ('irrelevant', 0.054), ('psychol', 0.053), ('deviations', 0.052), ('elds', 0.051), ('percept', 0.048), ('deutsch', 0.048), ('sensory', 0.045), ('streams', 0.045), ('bobrow', 0.045), ('broadbent', 0.045), ('navalpakkam', 0.045), ('norman', 0.045), ('whiteley', 0.045), ('zhaoping', 0.045), ('palmer', 0.042), ('tasks', 0.042), ('low', 0.041), ('mozer', 0.04), ('letter', 0.04), ('generative', 0.04), ('sign', 0.04), ('exert', 0.039), ('coming', 0.039), ('normative', 0.039), ('visual', 0.037), ('condition', 0.037), ('inference', 0.037), ('gure', 0.036), ('whichever', 0.036), ('driver', 0.036), ('reaction', 0.035), ('zemel', 0.035), ('eg', 0.034), ('interference', 0.034), ('subjects', 0.033), ('stream', 0.033), ('james', 0.033), ('xation', 0.032), ('suggested', 0.032), ('hidden', 0.03), ('chelazzi', 0.03), ('confusing', 0.03), ('forster', 0.03), ('inco', 0.03), ('lacuna', 0.03), ('moran', 0.03), ('neut', 0.03), ('paschler', 0.03), ('shaw', 0.03), ('tsal', 0.03), ('usefully', 0.03), ('perception', 0.03), ('psychology', 0.03), ('high', 0.029), ('critical', 0.029), ('late', 0.028), ('perceptual', 0.028), ('ict', 0.027), ('listening', 0.026), ('leak', 0.026), ('annu', 0.026), ('attended', 0.026), ('corrupting', 0.026), ('argue', 0.026), ('inputs', 0.026), ('circle', 0.025), ('effects', 0.025), ('early', 0.025)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 124 nips-2008-Load and Attentional Bayes

Author: Peter Dayan

Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data. 1

2 0.10836364 146 nips-2008-Multi-task Gaussian Process Learning of Robot Inverse Dynamics

Author: Christopher Williams, Stefan Klanke, Sethu Vijayakumar, Kian M. Chai

Abstract: The inverse dynamics problem for a robotic manipulator is to compute the torques needed at the joints to drive it along a given trajectory; it is beneficial to be able to learn this function for adaptive control. A robotic manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-task learning problem. By placing independent Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process prior for handling multiple loads, where the inter-task similarity depends on the underlying inertial parameters. Experiments demonstrate that this multi-task formulation is effective in sharing information among the various loads, and generally improves performance over either learning only on single tasks or pooling the data over all tasks. 1

3 0.079382278 66 nips-2008-Dynamic visual attention: searching for coding length increments

Author: Xiaodi Hou, Liqing Zhang

Abstract: A visual attention system should respond placidly when common stimuli are presented, while at the same time keep alert to anomalous visual inputs. In this paper, a dynamic visual attention model based on the rarity of features is proposed. We introduce the Incremental Coding Length (ICL) to measure the perspective entropy gain of each feature. The objective of our model is to maximize the entropy of the sampled visual features. In order to optimize energy consumption, the limit amount of energy of the system is re-distributed amongst features according to their Incremental Coding Length. By selecting features with large coding length increments, the computational system can achieve attention selectivity in both static and dynamic scenes. We demonstrate that the proposed model achieves superior accuracy in comparison to mainstream approaches in static saliency map generation. Moreover, we also show that our model captures several less-reported dynamic visual search behaviors, such as attentional swing and inhibition of return. 1

4 0.070937604 231 nips-2008-Temporal Dynamics of Cognitive Control

Author: Jeremy Reynolds, Michael C. Mozer

Abstract: Cognitive control refers to the flexible deployment of memory and attention in response to task demands and current goals. Control is often studied experimentally by presenting sequences of stimuli, some demanding a response, and others modulating the stimulus-response mapping. In these tasks, participants must maintain information about the current stimulus-response mapping in working memory. Prominent theories of cognitive control use recurrent neural nets to implement working memory, and optimize memory utilization via reinforcement learning. We present a novel perspective on cognitive control in which working memory representations are intrinsically probabilistic, and control operations that maintain and update working memory are dynamically determined via probabilistic inference. We show that our model provides a parsimonious account of behavioral and neuroimaging data, and suggest that it offers an elegant conceptualization of control in which behavior can be cast as optimal, subject to limitations on learning and the rate of information processing. Moreover, our model provides insight into how task instructions can be directly translated into appropriate behavior and then efficiently refined with subsequent task experience. 1

5 0.066506945 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters

Author: Matt Jones, Sachiko Kinoshita, Michael C. Mozer

Abstract: In most cognitive and motor tasks, speed-accuracy tradeoffs are observed: Individuals can respond slowly and accurately, or quickly yet be prone to errors. Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. The model assumes that responding is based on the posterior distribution over which response is correct, conditioned on the accumulated evidence. We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. Trial-by-trial tracking of difficulty thus leads to sequential effects in speed and accuracy. Simulations show the model explains a variety of phenomena in human speeded decision making. We argue this passive statistical mechanism provides a more elegant and parsimonious account than extant theories based on elaborate control structures. 1

6 0.058655072 123 nips-2008-Linear Classification and Selective Sampling Under Low Noise Conditions

7 0.054473687 206 nips-2008-Sequential effects: Superstition or rational behavior?

8 0.05037554 241 nips-2008-Transfer Learning by Distribution Matching for Targeted Advertising

9 0.049367912 109 nips-2008-Interpreting the neural code with Formal Concept Analysis

10 0.047561601 10 nips-2008-A rational model of preference learning and choice prediction by children

11 0.046502713 67 nips-2008-Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance

12 0.045545619 92 nips-2008-Generative versus discriminative training of RBMs for classification of fMRI images

13 0.042045653 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds

14 0.041607961 65 nips-2008-Domain Adaptation with Multiple Sources

15 0.041362185 187 nips-2008-Psychiatry: Insights into depression through normative decision-making models

16 0.041175496 118 nips-2008-Learning Transformational Invariants from Natural Movies

17 0.040934436 160 nips-2008-On Computational Power and the Order-Chaos Phase Transition in Reservoir Computing

18 0.03920719 121 nips-2008-Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement

19 0.038487747 19 nips-2008-An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis

20 0.037487783 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.112), (1, 0.028), (2, 0.072), (3, -0.005), (4, 0.003), (5, 0.026), (6, -0.042), (7, 0.04), (8, 0.079), (9, 0.055), (10, -0.004), (11, 0.102), (12, -0.076), (13, 0.029), (14, -0.012), (15, 0.055), (16, -0.012), (17, -0.021), (18, -0.012), (19, -0.006), (20, -0.02), (21, -0.041), (22, 0.014), (23, 0.008), (24, -0.018), (25, 0.014), (26, 0.028), (27, 0.04), (28, -0.026), (29, 0.053), (30, -0.038), (31, 0.04), (32, -0.013), (33, 0.063), (34, 0.024), (35, -0.035), (36, -0.059), (37, 0.046), (38, 0.064), (39, 0.035), (40, 0.036), (41, -0.002), (42, 0.001), (43, 0.023), (44, 0.007), (45, -0.084), (46, -0.036), (47, -0.065), (48, 0.171), (49, -0.093)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92692107 124 nips-2008-Load and Attentional Bayes

Author: Peter Dayan

Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data. 1

2 0.62562305 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters

Author: Matt Jones, Sachiko Kinoshita, Michael C. Mozer

Abstract: In most cognitive and motor tasks, speed-accuracy tradeoffs are observed: Individuals can respond slowly and accurately, or quickly yet be prone to errors. Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. The model assumes that responding is based on the posterior distribution over which response is correct, conditioned on the accumulated evidence. We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. Trial-by-trial tracking of difficulty thus leads to sequential effects in speed and accuracy. Simulations show the model explains a variety of phenomena in human speeded decision making. We argue this passive statistical mechanism provides a more elegant and parsimonious account than extant theories based on elaborate control structures. 1

3 0.57620382 231 nips-2008-Temporal Dynamics of Cognitive Control

Author: Jeremy Reynolds, Michael C. Mozer

Abstract: Cognitive control refers to the flexible deployment of memory and attention in response to task demands and current goals. Control is often studied experimentally by presenting sequences of stimuli, some demanding a response, and others modulating the stimulus-response mapping. In these tasks, participants must maintain information about the current stimulus-response mapping in working memory. Prominent theories of cognitive control use recurrent neural nets to implement working memory, and optimize memory utilization via reinforcement learning. We present a novel perspective on cognitive control in which working memory representations are intrinsically probabilistic, and control operations that maintain and update working memory are dynamically determined via probabilistic inference. We show that our model provides a parsimonious account of behavioral and neuroimaging data, and suggest that it offers an elegant conceptualization of control in which behavior can be cast as optimal, subject to limitations on learning and the rate of information processing. Moreover, our model provides insight into how task instructions can be directly translated into appropriate behavior and then efficiently refined with subsequent task experience.

4 0.53552997 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues

Author: Rama Natarajan, Iain Murray, Ladan Shams, Richard S. Zemel

Abstract: We explore a recently proposed mixture model approach to understanding interactions between conflicting sensory cues. Alternative model formulations, differing in their sensory noise models and inference methods, are compared based on their fit to experimental data. Heavy-tailed sensory likelihoods yield a better description of the subjects’ response behavior than standard Gaussian noise models. We study the underlying cause for this result, and then present several testable predictions of these models.
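For reference, the standard Gaussian noise model that the heavy-tailed alternatives are compared against combines cues by precision weighting. A minimal sketch of that baseline (this is the textbook Gaussian fusion rule, not the paper's mixture model; names are ours):

```python
def fuse_gaussian(mu1, var1, mu2, var2):
    """Precision-weighted fusion of two Gaussian cues: the standard
    'optimal' cue-combination rule that heavy-tailed models generalise."""
    w1 = (1.0 / var1) / (1.0 / var1 + 1.0 / var2)
    fused_mean = w1 * mu1 + (1.0 - w1) * mu2
    fused_var = 1.0 / (1.0 / var1 + 1.0 / var2)
    return fused_mean, fused_var

# A reliable cue (small variance) dominates a noisy, conflicting one:
mean, var = fuse_gaussian(0.0, 0.1, 10.0, 1.0)
print(mean)  # pulled strongly toward the reliable cue at 0
```

Under large cue conflict, this rule always averages; the heavy-tailed likelihoods in the abstract instead allow near-complete vetoing of one cue, which is one reason they fit response data better.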

5 0.51803297 66 nips-2008-Dynamic visual attention: searching for coding length increments

Author: Xiaodi Hou, Liqing Zhang

Abstract: A visual attention system should respond placidly when common stimuli are presented, while at the same time keep alert to anomalous visual inputs. In this paper, a dynamic visual attention model based on the rarity of features is proposed. We introduce the Incremental Coding Length (ICL) to measure the perspective entropy gain of each feature. The objective of our model is to maximize the entropy of the sampled visual features. In order to optimize energy consumption, the limited amount of energy of the system is re-distributed amongst features according to their Incremental Coding Length. By selecting features with large coding length increments, the computational system can achieve attention selectivity in both static and dynamic scenes. We demonstrate that the proposed model achieves superior accuracy in comparison to mainstream approaches in static saliency map generation. Moreover, we also show that our model captures several less-reported dynamic visual search behaviors, such as attentional swing and inhibition of return.
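A toy illustration of the rarity principle behind this model (this is not the paper's exact Incremental Coding Length, which is defined over feature activity distributions; the allocation rule and names here are our simplification):

```python
import math

def energy_allocation(feature_probs, budget=1.0):
    """Give each feature a share of a fixed energy budget in proportion
    to its coding length (surprisal): rarer features get more energy."""
    lengths = [-math.log(p) for p in feature_probs]  # surprisal per feature
    total = sum(lengths)
    return [budget * length / total for length in lengths]

alloc = energy_allocation([0.7, 0.2, 0.1])
print(alloc.index(max(alloc)))  # 2 -> the rarest feature gets the most energy
```

The paper's model additionally updates these quantities incrementally over time, which is what yields dynamic behaviors such as inhibition of return.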

6 0.50552338 38 nips-2008-Bio-inspired Real Time Sensory Map Realignment in a Robotic Barn Owl

7 0.47223163 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds

8 0.42061707 121 nips-2008-Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement

9 0.42057282 3 nips-2008-A Massively Parallel Digital Learning Processor

10 0.41645694 187 nips-2008-Psychiatry: Insights into depression through normative decision-making models

11 0.4061029 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation

12 0.39925465 146 nips-2008-Multi-task Gaussian Process Learning of Robot Inverse Dynamics

13 0.39686009 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning

14 0.38632953 19 nips-2008-An Empirical Analysis of Domain Adaptation Algorithms for Genomic Sequence Analysis

15 0.38376394 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction

16 0.37907517 23 nips-2008-An ideal observer model of infant object perception

17 0.37437236 7 nips-2008-A computational model of hippocampal function in trace conditioning

18 0.37396628 67 nips-2008-Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance

19 0.36771828 10 nips-2008-A rational model of preference learning and choice prediction by children

20 0.35386628 148 nips-2008-Natural Image Denoising with Convolutional Networks


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(6, 0.047), (7, 0.062), (12, 0.023), (15, 0.022), (28, 0.126), (57, 0.051), (59, 0.024), (63, 0.019), (71, 0.015), (74, 0.013), (77, 0.038), (78, 0.453), (83, 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.82921308 52 nips-2008-Correlated Bigram LSA for Unsupervised Language Model Adaptation

Author: Yik-cheung Tam, Tanja Schultz

Abstract: We present a correlated bigram LSA approach for unsupervised LM adaptation for automatic speech recognition. The model is trained using efficient variational EM and smoothed using the proposed fractional Kneser-Ney smoothing which handles fractional counts. We address the scalability issue to large training corpora via bootstrapping of bigram LSA from unigram LSA. For LM adaptation, unigram and bigram LSA are integrated into the background N-gram LM via marginal adaptation and linear interpolation respectively. Experimental results on the Mandarin RT04 test set show that applying unigram and bigram LSA together yields 6%–8% relative perplexity reduction and 2.5% relative character error rate reduction, which is statistically significant compared to applying only unigram LSA. On the large-scale evaluation on Arabic, 3% relative word error rate reduction is achieved, which is also statistically significant.
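The linear-interpolation step mentioned for combining the adapted component with the background N-gram LM can be sketched as follows (the mixing weight here is a hypothetical value; in practice it would be tuned on held-out data):

```python
def interpolate_lm(p_background, p_adapted, lam=0.3):
    """Linear interpolation of a background N-gram probability with an
    adaptation-model probability; lam is the adaptation weight."""
    return lam * p_adapted + (1.0 - lam) * p_background

# Interpolating two proper distributions yields a proper distribution,
# shifted toward words the adaptation model favours.
background = {"a": 0.5, "b": 0.3, "c": 0.2}
adapted = {"a": 0.1, "b": 0.1, "c": 0.8}
mixed = {w: interpolate_lm(background[w], adapted[w]) for w in background}
print(round(sum(mixed.values()), 6))  # 1.0
```

The paper's other integration route, marginal adaptation, reweights the background LM multiplicatively rather than mixing probabilities additively.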

2 0.80503845 111 nips-2008-Kernel Change-point Analysis

Author: Zaïd Harchaoui, Eric Moulines, Francis R. Bach

Abstract: We introduce a kernel-based method for change-point analysis within a sequence of temporal observations. Change-point analysis of an unlabelled sample of observations consists in, first, testing whether a change in the distribution occurs within the sample, and second, if a change occurs, estimating the change-point instant after which the distribution of the observations switches from one distribution to another different distribution. We propose a test statistic based upon the maximum kernel Fisher discriminant ratio as a measure of homogeneity between segments. We derive its limiting distribution under the null hypothesis (no change occurs), and establish the consistency under the alternative hypothesis (a change occurs). This allows us to build a statistical hypothesis testing procedure for testing the presence of a change-point, with a prescribed false-alarm probability and detection probability tending to one in the large-sample setting. If a change actually occurs, the test statistic also yields an estimator of the change-point location. Promising experimental results in temporal segmentation of mental tasks from BCI data and pop song indexation are presented.
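A much-simplified sketch of the scanning idea, using a biased squared-MMD statistic as a stand-in for the paper's kernel Fisher discriminant ratio (function names, the RBF bandwidth, and the margin are our choices):

```python
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    # Gaussian kernel matrix between two 1-D samples.
    d = x[:, None] - y[None, :]
    return np.exp(-(d ** 2) / (2.0 * bandwidth ** 2))

def mmd2(x, y):
    # Biased estimate of the squared maximum mean discrepancy,
    # a simple kernel measure of (in)homogeneity between segments.
    return rbf_kernel(x, x).mean() + rbf_kernel(y, y).mean() - 2.0 * rbf_kernel(x, y).mean()

def estimate_changepoint(seq, margin=5):
    # Scan candidate split points; the least homogeneous split wins.
    seq = np.asarray(seq, dtype=float)
    scores = [mmd2(seq[:t], seq[t:]) for t in range(margin, len(seq) - margin)]
    return margin + int(np.argmax(scores))

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(3.0, 1.0, 50)])
print(estimate_changepoint(data))  # should land near the true change at index 50
```

The paper's statistic additionally comes with a limiting null distribution, which is what permits calibrated hypothesis testing rather than just point estimation.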

same-paper 3 0.7960605 124 nips-2008-Load and Attentional Bayes

Author: Peter Dayan

Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data.

4 0.72695476 82 nips-2008-Fast Computation of Posterior Mode in Multi-Level Hierarchical Models

Author: Liang Zhang, Deepak Agarwal

Abstract: Multi-level hierarchical models provide an attractive framework for incorporating correlations induced in a response variable that is organized hierarchically. Model fitting is challenging, especially for a hierarchy with a large number of nodes. We provide a novel algorithm based on a multi-scale Kalman filter that is both scalable and easy to implement. For Gaussian response, we show our method provides the maximum a-posteriori (MAP) parameter estimates; for non-Gaussian response, parameter estimation is performed through a Laplace approximation. However, the Laplace approximation provides biased parameter estimates, which are corrected through a parametric bootstrap procedure. We illustrate the method through simulation studies and analyses of real-world data sets in health care and online advertising.
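The shrinkage such a hierarchy induces at a single level can be sketched with the standard conjugate Gaussian update (a simplification of ours, not the paper's multi-scale Kalman filter; names and values are illustrative):

```python
def group_map(y_bar, n, parent_mean, tau2, sigma2):
    """MAP of one group's mean given n observations averaging y_bar,
    a N(parent_mean, tau2) prior from the parent node, and N(., sigma2)
    observation noise; the usual precision-weighted combination."""
    precision = n / sigma2 + 1.0 / tau2
    return (n * y_bar / sigma2 + parent_mean / tau2) / precision

# With a tight prior the estimate stays near the parent's mean;
# with a vague prior it moves toward the group's own data.
print(group_map(5.0, n=10, parent_mean=0.0, tau2=0.001, sigma2=1.0))  # near 0
print(group_map(5.0, n=10, parent_mean=0.0, tau2=100.0, sigma2=1.0))  # near 5
```

The paper's contribution is doing this kind of message passing efficiently over a large tree, and correcting the bias the Laplace approximation introduces for non-Gaussian responses.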

5 0.43228757 216 nips-2008-Sparse probabilistic projections

Author: Cédric Archambeau, Francis R. Bach

Abstract: We present a generative model for performing sparse probabilistic projections, which includes sparse principal component analysis and sparse canonical correlation analysis as special cases. Sparsity is enforced by means of automatic relevance determination or by imposing appropriate prior distributions, such as generalised hyperbolic distributions. We derive a variational Expectation-Maximisation algorithm for the estimation of the hyperparameters and show that our novel probabilistic approach compares favourably to existing techniques. We illustrate how the proposed method can be applied in the context of cryptoanalysis as a preprocessing tool for the construction of template attacks.

6 0.40912458 147 nips-2008-Multiscale Random Fields with Application to Contour Grouping

7 0.40729398 36 nips-2008-Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree

8 0.40603805 98 nips-2008-Hierarchical Semi-Markov Conditional Random Fields for Recursive Sequential Data

9 0.40274209 20 nips-2008-An Extended Level Method for Efficient Multiple Kernel Learning

10 0.40068221 169 nips-2008-Online Models for Content Optimization

11 0.38766116 66 nips-2008-Dynamic visual attention: searching for coding length increments

12 0.38061881 15 nips-2008-Adaptive Martingale Boosting

13 0.3772707 10 nips-2008-A rational model of preference learning and choice prediction by children

14 0.37680796 142 nips-2008-Multi-Level Active Prediction of Useful Image Annotations for Recognition

15 0.37674198 4 nips-2008-A Scalable Hierarchical Distributed Language Model

16 0.37659085 248 nips-2008-Using matrices to model symbolic relationship

17 0.37527841 205 nips-2008-Semi-supervised Learning with Weakly-Related Unlabeled Data : Towards Better Text Categorization

18 0.37280244 231 nips-2008-Temporal Dynamics of Cognitive Control

19 0.37110949 79 nips-2008-Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning

20 0.36985502 127 nips-2008-Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction