nips nips2005 nips2005-149 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Vidhya Navalpakkam, Laurent Itti
Abstract: Survival in the natural world demands the selection of relevant visual cues to rapidly and reliably guide attention towards prey and predators in cluttered environments. We investigate whether our visual system selects cues that guide search in an optimal manner. We formally obtain the optimal cue selection strategy by maximizing the signal to noise ratio (SNR) between a search target and surrounding distractors. This optimal strategy successfully accounts for several phenomena in visual search behavior, including the effect of target-distractor discriminability, uncertainty in target’s features, distractor heterogeneity, and linear separability. Furthermore, the theory generates a new prediction, which we verify through psychophysical experiments with human subjects. Our results provide direct experimental evidence that humans select visual cues so as to maximize SNR between the targets and surrounding clutter.
Reference: text
sentIndex sentText sentNum sentScore
1 Optimal cue selection strategy Vidhya Navalpakkam Department of Computer Science USC, Los Angeles navalpak@usc.edu [sent-1, score-0.312]
2 Abstract Survival in the natural world demands the selection of relevant visual cues to rapidly and reliably guide attention towards prey and predators in cluttered environments. [sent-3, score-0.338]
3 We investigate whether our visual system selects cues that guide search in an optimal manner. [sent-4, score-0.465]
4 We formally obtain the optimal cue selection strategy by maximizing the signal to noise ratio (SNR) between a search target and surrounding distractors. [sent-5, score-0.923]
5 This optimal strategy successfully accounts for several phenomena in visual search behavior, including the effect of target-distractor discriminability, uncertainty in target’s features, distractor heterogeneity, and linear separability. [sent-6, score-0.702]
6 Furthermore, the theory generates a new prediction, which we verify through psychophysical experiments with human subjects. [sent-7, score-0.024]
7 Our results provide direct experimental evidence that humans select visual cues so as to maximize SNR between the targets and surrounding clutter. [sent-8, score-0.255]
8 1 Introduction Detecting a yellow tiger among distracting foliage in different shades of yellow and brown requires efficient top-down strategies that select relevant visual cues to enable rapid and reliable detection of the target among several distractors. [sent-9, score-0.777]
9 For simple scenarios such as searching for a red target, the Guided Search theory [17] predicts that search efficiency can be improved by boosting the red feature in a top-down manner. [sent-10, score-0.305]
10 But for more complex and natural scenarios such as detecting a tiger in the jungle or looking for a face in a crowd, finding the optimum amount of top-down enhancement to be applied to each low-level feature dimension encoded by the early visual system is non-trivial. [sent-11, score-0.247]
11 It must not only consider features present in the target, but also those present in the distractors. [sent-12, score-0.035]
12 In this paper, we formally obtain the optimal cue selection strategy and investigate whether our visual system has evolved to deploy it. [sent-13, score-0.525]
13 In section 2, we formulate cue selection as an optimization problem where the relevant goal is to maximize the signal to noise ratio (SNR) of the saliency map, so that the target becomes most salient and quickly draws attention, thereby minimizing search time. [sent-14, score-0.997]
14 In section 4, we describe the design and analysis of psychophysics experiments to test new, counter-intuitive predictions of the theory. [sent-16, score-0.042]
15 The results of our study suggest that humans deploy optimal cue selection strategies to detect targets in cluttered and distracting environments. [sent-17, score-0.446]
16 2 Formalizing visual search as an optimization problem To quickly find a target among distractors, we wish to maximize the salience of the target relative to the distractors. [sent-18, score-1.309]
17 Thus we can define the signal to noise ratio (SNR) as the ratio of salience of the target to the distractors. [sent-19, score-0.738]
18 Assuming that visual cues or features are encoded by populations of neurons in early visual areas, we define the optimal cue selection strategy as the best choice of neural response gain that maximizes the signal to noise ratio (SNR). [sent-20, score-1.027]
19 In the rest of this section, we formally obtain the optimal choice of gain in neural responses that will maximize SNR. [sent-21, score-0.149]
20 Hence, we express SNR as the ratio of expected salience of the target over expected salience of the distractors, with the expectation taken over all possible target and distractor locations, their features and spatial configurations, and over several repeated trials. [sent-23, score-1.616]
21 $\mathrm{SNR} = \frac{\text{Mean salience of the target}}{\text{Mean salience of the distractors}}$. Search array and its stimuli: Let search array A be a two-dimensional display that consists of one target T and several distractors $D_j$ (j = 1. [sent-24, score-1.924]
22 Let the display be divided into an invisible N × N grid, with one item occurring at each cell (x, y) in the grid. [sent-28, score-0.089]
23 Let the color, contrast, orientation and other target parameters θT be chosen from a distribution P (θ|T ). [sent-29, score-0.323]
24 Similarly, for each distractor Dj , let its parameters θDj be sampled independently from a distribution P (θ|D). [sent-30, score-0.223]
25 Thus, search array A has a fixed choice of target and distractor parameters. [sent-31, score-0.815]
26 Next, the spatial configuration C is decided by a random permutation of some assignment of the target and distractors to the $N^2$ cells in A (such that there is exactly one item in each cell). [sent-32, score-0.76]
27 Thus, for a given search array A, the spatial configuration as well as stimulus parameters are fixed. [sent-33, score-0.336]
28 Finally, given a choice of parameter θ and its spatial location (x, y), we generate an image pattern R(θ) (a set of pixels and their values) and embed it at location (x, y) in search array A. [sent-34, score-0.359]
29 Saliency computation: Let the input search array A be processed by a population of neurons with gaussian tuning curves tuned to different stimulus parameters such as µ1 , µ2 , . [sent-36, score-0.523]
30 The output of this early visual processing stage is used to compute saliency maps si (x, y, A) of search array A, that consist of the visual salience at every location (x, y) for feature-values µi (i = 1. [sent-40, score-1.026]
31 Let si (x, y, A) be combined linearly to form S(x, y, A), the overall salience at location (x, y). [sent-44, score-0.376]
32 Further, assuming a multiplicative gain $g_i$ on the ith saliency map, we obtain: $S(x, y, A) = \sum_i g_i\, s_i(x, y, A)$ (1). Salience of the target and distractors: Let $S_T(A)$ be a random variable representing the salience of the target T in search array A. [sent-45, score-1.88]
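The linear combination in eqn. 1 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `combine_saliency`, the nested-list representation of the maps, and the toy 2 × 2 values are all assumptions introduced here.

```python
def combine_saliency(maps, gains):
    """Overall salience S(x, y) = sum_i g_i * s_i(x, y) (eqn. 1).

    `maps` is a list of equally sized 2-D saliency maps (lists of rows),
    `gains` holds one multiplicative gain per map.
    """
    if len(maps) != len(gains):
        raise ValueError("need exactly one gain per saliency map")
    rows, cols = len(maps[0]), len(maps[0][0])
    total = [[0.0] * cols for _ in range(rows)]
    for g, s in zip(gains, maps):
        for y in range(rows):
            for x in range(cols):
                total[y][x] += g * s[y][x]
    return total


# Hypothetical 2x2 display with two feature maps (values are illustrative only).
s1 = [[1.0, 0.0], [0.0, 0.0]]   # map responding to the item at the top-left cell
s2 = [[0.0, 2.0], [2.0, 2.0]]   # map responding to the other three items
S = combine_saliency([s1, s2], gains=[1.5, 0.5])
```

Raising the gain on the first map and lowering it on the second makes the top-left item the most salient location in `S`, which is the effect the gains are meant to achieve for the target.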
33 To factor out the variability due to internal noise η, we consider Eη [ST (A)], which is the mean salience of the target over repeated identical presentations of A. [sent-46, score-0.666]
34 Further, let EC [ST (A)] be the mean salience of the target averaged over all spatial configurations of a given set of target and distractor parameters. [sent-47, score-1.216]
35 Similarly, Eθ|T [ST (A)] is the mean salience of the target over all target parameters. [sent-48, score-0.961]
36 The mean salience of the target combined over several repeated presentations of the search array A (to factor out internal noise η), over all spatial configurations C, and over all choices of target parameters θ|T is given below. [sent-49, score-1.29]
37 Further, since η, C and θ are independent random variables, we can rewrite the joint expectation as follows: $E[S_T(A)] = E_{\theta|T}[E_C[E_\eta[S_T(A)]]]$ (2). Let $S_D(A)$ represent the mean salience of distractors $D_j$ (j = 1. [sent-50, score-0.664]
38 Similar to computing the mean salience of the target, we find the mean salience of distractors over all η, C and θ|D. [sent-54, score-0.979]
39 $S_D(A) = E_{D_j}[s_{iD_j}(A)]$ (3), $E[S_D(A)] = E_{\theta|D}[E_C[E_\eta[S_D(A)]]]$ (4). SNR and its optimization: The additive salience and multiplicative gain hypothesis in eqn. [sent-55, score-0.37]
40 $\mathrm{SNR}_i = \frac{E_{\Theta|T}[E_C[E_\eta[s_{iT}(A)]]]}{E_{\Theta|D}[E_C[E_\eta[s_{iD}(A)]]]}$ (10). The sign of the derivative $\left.\frac{d}{dg_i}\mathrm{SNR}\right|_{g_i=1}$ tells us whether $g_i$ should be increased, decreased or maintained at the baseline activation 1 in order to maximize SNR. [sent-58, score-0.507]
41 $\frac{\mathrm{SNR}_i}{\mathrm{SNR}} < 1 \Rightarrow \frac{d}{dg_i}\mathrm{SNR} < 0 \Rightarrow$ SNR increases as $g_i$ decreases $\Rightarrow g_i < 1$ (11)
$\frac{\mathrm{SNR}_i}{\mathrm{SNR}} = 1 \Rightarrow \frac{d}{dg_i}\mathrm{SNR} = 0 \Rightarrow$ SNR does not change with $g_i$ $\Rightarrow g_i = 1$ (12)
$\frac{\mathrm{SNR}_i}{\mathrm{SNR}} > 1 \Rightarrow \frac{d}{dg_i}\mathrm{SNR} > 0 \Rightarrow$ SNR increases as $g_i$ increases $\Rightarrow g_i > 1$ (13)
Thus, we obtain an intuitive result that $g_i$ increases as $\mathrm{SNR}_i$ increases. [sent-59, score-1.641]
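The sign rule in eqns. 11-13 can be checked numerically. The sketch below assumes that the overall SNR is the ratio of the gain-weighted mean target salience to the gain-weighted mean distractor salience (which follows from eqn. 1 and the definition of SNR above); the function names and the example means are hypothetical.

```python
def snr(gains, target_means, distractor_means):
    """Overall SNR: gain-weighted target salience over gain-weighted distractor salience."""
    num = sum(g * t for g, t in zip(gains, target_means))
    den = sum(g * d for g, d in zip(gains, distractor_means))
    return num / den


def gain_direction(i, target_means, distractor_means):
    """Return +1, 0 or -1 depending on whether SNR_i is above, at or below
    the overall SNR at baseline gains (eqns. 11-13)."""
    baseline = snr([1.0] * len(target_means), target_means, distractor_means)
    snr_i = target_means[i] / distractor_means[i]
    return (snr_i > baseline) - (snr_i < baseline)


def numeric_derivative(i, target_means, distractor_means, eps=1e-6):
    """Finite-difference estimate of d SNR / d g_i at g = 1."""
    gains = [1.0] * len(target_means)
    base = snr(gains, target_means, distractor_means)
    gains[i] += eps
    return (snr(gains, target_means, distractor_means) - base) / eps


# Hypothetical mean saliences for two feature maps (illustrative values only).
target_means = [4.0, 1.0]
distractor_means = [1.0, 2.0]
directions = [gain_direction(i, target_means, distractor_means) for i in range(2)]
```

Here the first map has SNR_i = 4, above the baseline SNR of 5/3, so its gain should go up; the second has SNR_i = 0.5, so its gain should go down. The finite-difference derivative has the same sign in both cases.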
42 3 Predictions of the optimal cue selection strategy To understand the implications of biasing features according to the optimal cue selection strategy, we simulate a simple model of early visual cortex. [sent-62, score-0.86]
43 We assume that each feature dimension is encoded by a population of neurons with overlapping gaussian tuning curves that are broadly tuned to different features in that dimension. [sent-63, score-0.324]
44 Let fi (θ) represent the tuning curve of the ith neuron in a population of broadly tuned neurons with overlapping tuning curves. [sent-64, score-0.404]
45 Let the tuning width σ and amplitude a be equal for all neurons, and µi represent the preferred stimulus parameter (or feature) of the ith neuron. [sent-65, score-0.137]
46 $f_i(\theta) = \frac{a}{\sigma} \exp\left(-\frac{(\theta - \mu_i)^2}{2\sigma^2}\right)$ (16). Let $r(\Theta(x, y, A)) = \{r_1(\Theta(x, y, A))$. [sent-66, score-0.057]
47 rn (Θ(x, y, A))} be the population response to a stimulus parameter Θ(x, y, A) at a location (x, y) in search array A, where ri refers to the response of the ith neuron and n is the total number of neurons in the population. [sent-69, score-0.737]
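As a rough illustration of the population described above, the following sketch evaluates Gaussian tuning curves of the form in eqn. 16 for a bank of neurons. The prefactor is read as a/σ from the extracted equation; the tuning width, amplitude and the grid of preferred values are assumptions chosen only for the example.

```python
import math


def tuning_curve(theta, mu, sigma=10.0, a=1.0):
    """Mean response of a neuron with preferred value `mu` to stimulus `theta` (eqn. 16)."""
    return (a / sigma) * math.exp(-((theta - mu) ** 2) / (2.0 * sigma ** 2))


def population_mean_response(theta, preferred, sigma=10.0, a=1.0):
    """Mean responses f_1(theta), ..., f_n(theta) of the whole population."""
    return [tuning_curve(theta, mu, sigma, a) for mu in preferred]


# Hypothetical population: preferred orientations every 5 degrees from 0 to 180.
preferred = list(range(0, 181, 5))
rates = population_mean_response(55.0, preferred)
best = preferred[rates.index(max(rates))]   # neuron whose preferred value matches the stimulus
```

Because all neurons share σ and a, the neuron whose preferred value equals the stimulus responds most strongly, and responses fall off smoothly for neighbouring neurons.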
48 Let the neural response ri (Θ(x, y, A)) be a Poisson random variable. [sent-70, score-0.166]
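49 $P(r_i(\Theta(x, y, A)) = z) = \mathcal{P}_{f_i(\Theta(x, y, A))}(z)$ (17), where $\mathcal{P}_\lambda$ denotes the Poisson distribution with mean λ. For simplicity, let’s assume that the local neural response $r_i(\Theta(x, y, A))$ is a measure of salience $s_i(x, y, A)$. [sent-71, score-0.513]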
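A Poisson response whose mean is set by the tuning curve can be simulated with the standard library alone. The sampler below uses Knuth's multiplication method, which is a swapped-in textbook technique rather than anything specified in the text; the rate of 4.0 and the seed are arbitrary values for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson-distributed spike count with mean `rate`.

    Knuth's multiplication method; adequate for the small rates used here.
    """
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


# Hypothetical mean response f_i(theta) = 4.0; repeated presentations average out the noise.
rng = random.Random(0)
counts = [sample_poisson(4.0, rng) for _ in range(20000)]
mean_count = sum(counts) / len(counts)
```

Averaging over many repeated presentations recovers the tuning-curve value, which is why the expectation over internal noise η reduces the mean response to $f_i(\theta)$.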
50 2, 4, 10, 16, 17, we can derive the mean salience of the target and distractor, and use it to compute $\mathrm{SNR}_i$. [sent-73, score-0.638]
51 $s_i(x, y, A) = r_i(\Theta(x, y, A))$ (18), $E[s_{iT}(A)] = E_{\theta|T}[f_i(\theta)]$ (19), $E[s_{iD}(A)] = E_{\theta|D}[f_i(\theta)]$ (20), $\mathrm{SNR}_i = \frac{E_{\theta|T}[f_i(\theta)]}{E_{\theta|D}[f_i(\theta)]}$ (21). Finally, the gains $g_i$ on each saliency map can be found using eqn. [sent-74, score-0.538]
52 Thus, for a given distribution of stimulus parameters for the target P(θ|T) and distractors P(θ|D), we simulate the above model of early visual cortex, compute salience of target and distractors, compute $\mathrm{SNR}_i$ and obtain $g_i$. [sent-76, score-1.695]
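The simulation loop just described can be sketched as follows. $\mathrm{SNR}_i$ is computed as in eqn. 21 for discrete stimulus distributions. The gain equation itself is not recoverable from this text, so `normalized_gains` uses an assumed rule (each $\mathrm{SNR}_i$ divided by the population average) that merely respects the sign rule of eqns. 11-13; the distributions, σ and the grid of preferred values are likewise hypothetical.

```python
import math


def tuning_curve(theta, mu, sigma, a=1.0):
    """Gaussian tuning curve of eqn. 16."""
    return (a / sigma) * math.exp(-((theta - mu) ** 2) / (2.0 * sigma ** 2))


def expected_response(mu, distribution, sigma):
    """E_theta[f_i(theta)] for a discrete distribution {theta: probability}."""
    return sum(p * tuning_curve(theta, mu, sigma) for theta, p in distribution.items())


def per_neuron_snr(preferred, p_target, p_distractor, sigma):
    """SNR_i = E_{theta|T}[f_i] / E_{theta|D}[f_i] for every neuron (eqn. 21)."""
    return [
        expected_response(mu, p_target, sigma) / expected_response(mu, p_distractor, sigma)
        for mu in preferred
    ]


def normalized_gains(snrs):
    """ASSUMPTION: gain = SNR_i / mean(SNR); > 1 exactly when SNR_i is above average."""
    mean_snr = sum(snrs) / len(snrs)
    return [s / mean_snr for s in snrs]


# Hypothetical example: known target at 90, two distractor types at 30 and 150.
preferred = list(range(0, 181, 10))
p_target = {90.0: 1.0}
p_distractor = {30.0: 0.5, 150.0: 0.5}
snrs = per_neuron_snr(preferred, p_target, p_distractor, sigma=20.0)
gains = normalized_gains(snrs)
```

In this toy setting the neuron tuned to the target value receives the largest gain, while neurons tuned to either distractor value are pushed below the baseline of 1.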
53 In the rest of this section, we plot the distribution of optimal choice of gains gi for an exhaustive list of conditions where knowledge of the target and distractors varies from complete certainty to uncertainty. [sent-77, score-0.979]
54 Unknown target and distractors: In the trivial case where there is no knowledge of the target and distractors, all cues are equally relevant and the optimal choice of gains is the same as baseline activation (unity). [sent-78, score-0.902]
55 This prediction is consistent with visual search experiments that observe slow search when the target and distractors are unknown due to reversal between trials [1, 2]. [sent-80, score-1.172]
56 Search for a known target: During search for a known target, the optimal strategy predicts that SNR can be maximised by boosting neurons according to how strongly they respond to the target feature (as shown in figure 1, predicted SNR is 12. [sent-81, score-0.834]
57 Thus, a neuron that is optimally tuned to the target feature receives maximal gain. [sent-83, score-0.55]
58 This prediction is consistent with single unit recordings on feature-based attention which show that the gain in neural response depends on the similarity between the neuron’s preferred feature and the target feature [3, 4]. [sent-84, score-0.56]
59 Role of uncertainty in target features: When there’s uncertainty in the target’s features, i. [sent-85, score-0.373]
60 , when the target’s parameter assumes multiple values according to some probability distribution P(θ|T), the optimal strategy predicts that SNR decreases, leading to a slower search (as shown in figure 1, SNR decreases from 12. [sent-87, score-0.376]
61 This result is consistent with psychophysics experiments which suggest that better knowledge of the target leads to faster search [5, 6]. [sent-89, score-0.537]
62 Distractor heterogeneity: While searching for an unknown target among known distractors, the optimal strategy predicts that SNR can be maximised by suppressing the neurons tuned to the distractors (see figure 1). [sent-90, score-1.121]
63 But as we increase distractor heterogeneity or the number of distractor types, it predicts a decrease in SNR (from 36 dB to 17 dB, figure 1). [sent-91, score-0.584]
64 Discriminability between target and distractors: Several experiments and theories have studied the effect of target-distractor discriminability [10]-[17]. [sent-93, score-0.443]
65 The optimal cue selection strategy also shows that if the target and distractors are very different or highly discriminable, SNR is high and the search is efficient (SNR = 51. [sent-94, score-1.222]
66 Otherwise, if they are similar and not well separated in feature space, SNR is low and the search is hard (SNR = 16. [sent-96, score-0.218]
67 Moreover, during search for a less discriminable target from distractors, the optimal strategy predicts that the neuron optimally tuned to the target may not be boosted maximally. [sent-98, score-1.299]
68 Instead, a neuron that is sub-optimally tuned to the target and farther away from the distractors receives maximal gain. [sent-99, score-0.827]
69 This new and counterintuitive prediction is tested by visual search experiments described in the next section. [sent-100, score-0.283]
70 Linear separability effect: The optimal strategy also predicts the linear separability effect [18, 19] which suggests that when the target and distractors are less discriminable, search is easier if the target and distractors can be separated by a line in feature space (see figure 1). [sent-101, score-1.878]
71 , search for the smallest or largest item is faster than search for a medium-sized item in the display)[20], chromaticity and luminance [21, 19], and orientation [22, 23]. [sent-104, score-0.456]
72 4 Testing new predictions of the optimal cue selection strategy In this section, we describe the design and analysis of psychophysics experiments to verify the counter-intuitive prediction mentioned in the previous section, i. [sent-105, score-0.444]
73 , during searching for a target that is less discriminable from the distractors, a neuron that is sub-optimally tuned to the target’s feature will be boosted more than a neuron that is optimally tuned to the target’s feature. [sent-107, score-0.827]
74 1 Design of psychophysics experiments Our experiments are designed in two phases: phase 1 to set up the top-down bias and phase 2 to measure the bias. [sent-109, score-0.138]
75 Phase 1 - Setup the top-down bias: Subjects perform the primary task T1 which is a visual search for the target among distractors. [sent-110, score-0.643]
76 This task sets the top-down bias on cues so that the target becomes the most salient item in the display, thus accelerating target detection. [sent-111, score-0.882]
77 Subjects are trained on T1 trials until their performance stabilises with at least 80% accuracy. [sent-112, score-0.045]
78 They are instructed to find the target (55° tilt) among several distractors (50° tilt). [sent-113, score-0.709]
79 The target and distractors are the same for all T1 trials. [sent-114, score-0.672]
80 To avoid false reports (which may occur due to boredom or lack of attention) and to verify that subjects indeed find the target, we introduce a novel no-cheat scheme as follows: After finding the target among distractors, subjects press any key. [sent-115, score-0.556]
81 Following the key press, we briefly (120 ms) flash a grid of fine-print random numbers and ask subjects to report the number at the target’s location. [sent-116, score-0.048]
82 Thus, the top-down bias is set up by performing T1 trials. [sent-118, score-0.036]
83 For each of the four subjects, the number of reports on the steepest (80°), relevant (60°), target (55°) and distractor (50°) cues are shown in these bar plots. [sent-122, score-0.831]
84 As predicted by the theory, a paired t-test reveals that the number of reports on the relevant cue is significantly higher (p < 0.05) [sent-123, score-0.298]
85 than the number of reports on the target, distractor and steepest cues, as indicated by the blue star. [sent-124, score-0.355]
86 Phase 2 - Measure the top-down bias: To measure the top-down bias generated by the above task, we randomly insert T2 trials in between T1 trials. [sent-125, score-0.081]
87 Our theory predicts that during search for the target (55°) among distractors (50°), the most relevant cue will be around 60° and not 55°. [sent-126, score-1.164]
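The direction of this prediction can be illustrated with the per-neuron ratio of eqn. 21 for a single known target and a single known distractor. The sketch below is a minimal illustration under assumed values (σ = 10, unit amplitude), not a reproduction of the authors' simulation. With identical, noiseless Gaussian tuning curves the ratio keeps growing as the preferred orientation moves away from the distractor, so this sketch shows only that a neuron tuned to 60° beats one tuned to 55°; it does not by itself locate the peak near 60°, which depends on model details not recoverable from this text.

```python
import math


def tuning_curve(theta, mu, sigma=10.0, a=1.0):
    """Gaussian tuning curve of eqn. 16 (sigma and a are assumed values)."""
    return (a / sigma) * math.exp(-((theta - mu) ** 2) / (2.0 * sigma ** 2))


def snr_for_neuron(mu, target=55.0, distractor=50.0, sigma=10.0):
    """SNR_i for a neuron preferring `mu`, with one target and one distractor value."""
    return tuning_curve(target, mu, sigma) / tuning_curve(distractor, mu, sigma)


# Three of the cue orientations used in the experiment (degrees).
cues = {"distractor": 50.0, "target": 55.0, "relevant": 60.0}
snr_by_cue = {name: snr_for_neuron(mu) for name, mu in cues.items()}
```

The neuron tuned exactly to the target responds best to the target but also responds strongly to the nearby distractor; the neuron tuned to 60° gives up a little target response in exchange for a much weaker distractor response, and so has the higher ratio.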
88 To test this, we briefly (200 ms) flash four cues: steepest (S, 80°), relevant as predicted by our theory (R, 60°), target (T, 55°) and distractor (D, 50°). [sent-127, score-0.755]
89 A cue that is biased more appears more salient, attracts a saccade, and gets reported. [sent-128, score-0.185]
90 In other words, the greater the top-down bias on a cue, the higher the number of its reports. [sent-129, score-0.036]
91 According to our theory, there should be higher number of reports on R than T. [sent-130, score-0.076]
92 As mentioned earlier, each subject received training on T1 trials for a few days until the performance (search speed) stabilised with at least 80% accuracy. [sent-133, score-0.045]
93 Finally, each subject performed 10 blocks of 50 trials each, with T2 trials randomly inserted in between T1 trials. [sent-135, score-0.09]
94 2 Results For each of the four subjects, we extracted the number of reports on the steepest ($N_S$), relevant ($N_R$), target ($N_T$) and distractor ($N_D$) cues, for each block. [sent-137, score-0.715]
95 As predicted by the theory, we found a significantly higher number of reports on the relevant cue than the target cue. [sent-140, score-0.621]
96 5 Discussion In this paper, we have investigated whether our visual system has evolved to use optimal top-down strategies to select relevant cues that quickly and reliably detect the target in distracting environments. [sent-141, score-0.775]
97 We formally obtained the optimal cue selection strategy where cues are chosen such that the signal-to-noise ratio (SNR) of the saliency map is maximized, thus maximizing the target’s salience relative to the distractors. [sent-142, score-1.012]
98 The resulting optimal strategy is to boost a cue or feature if it provides higher signal-to-noise ratio than average. [sent-143, score-0.448]
99 Our study complements the recent work on optimal eye movement strategies [24]. [sent-145, score-0.098]
100 Thus, both optimal cue selection and saccade generation are necessary for optimal visual search. [sent-147, score-0.524]
wordName wordTfidf (topN-words)
[('sn', 0.434), ('distractors', 0.349), ('target', 0.323), ('salience', 0.315), ('distractor', 0.223), ('gi', 0.204), ('cue', 0.185), ('ec', 0.177), ('search', 0.172), ('saliency', 0.124), ('cues', 0.116), ('ri', 0.112), ('visual', 0.111), ('array', 0.097), ('tuned', 0.093), ('sit', 0.093), ('discriminability', 0.092), ('sid', 0.089), ('percept', 0.085), ('heterogeneity', 0.077), ('strategy', 0.077), ('reports', 0.076), ('dgi', 0.071), ('discriminable', 0.071), ('optimal', 0.066), ('db', 0.064), ('neuron', 0.062), ('predicts', 0.061), ('neurons', 0.058), ('fi', 0.057), ('steepest', 0.056), ('item', 0.056), ('sd', 0.056), ('gain', 0.055), ('response', 0.054), ('distracting', 0.053), ('hum', 0.053), ('ratio', 0.05), ('dj', 0.05), ('selection', 0.05), ('subjects', 0.048), ('wolfe', 0.047), ('psychophys', 0.046), ('saccade', 0.046), ('feature', 0.046), ('trials', 0.045), ('st', 0.045), ('separability', 0.042), ('psychophysics', 0.042), ('psychol', 0.039), ('relevant', 0.037), ('among', 0.037), ('gains', 0.037), ('bias', 0.036), ('preferred', 0.036), ('bauer', 0.036), ('deploy', 0.036), ('jolicoeur', 0.036), ('sjd', 0.036), ('stimulus', 0.035), ('tuning', 0.035), ('early', 0.035), ('features', 0.035), ('gure', 0.034), ('display', 0.033), ('population', 0.033), ('spatial', 0.032), ('strategies', 0.032), ('si', 0.032), ('ith', 0.031), ('maximised', 0.031), ('usc', 0.031), ('tiger', 0.031), ('apr', 0.031), ('treisman', 0.031), ('res', 0.03), ('phase', 0.03), ('map', 0.029), ('location', 0.029), ('gj', 0.029), ('salient', 0.028), ('effect', 0.028), ('vision', 0.028), ('vis', 0.028), ('ash', 0.028), ('presentations', 0.028), ('feb', 0.028), ('jun', 0.028), ('maximize', 0.028), ('tilt', 0.026), ('optimally', 0.026), ('searching', 0.026), ('uncertainty', 0.025), ('boosted', 0.025), ('gurations', 0.024), ('encoded', 0.024), ('verify', 0.024), ('cluttered', 0.024), ('boost', 0.024), ('nr', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999934 149 nips-2005-Optimal cue selection strategy
2 0.23149174 194 nips-2005-Top-Down Control of Visual Attention: A Rational Account
Author: Michael Shettel, Shaun Vecera, Michael C. Mozer
Abstract: Theories of visual attention commonly posit that early parallel processes extract conspicuous features such as color contrast and motion from the visual field. These features are then combined into a saliency map, and attention is directed to the most salient regions first. Top-down attentional control is achieved by modulating the contribution of different feature types to the saliency map. A key source of data concerning attentional control comes from behavioral studies in which the effect of recent experience is examined as individuals repeatedly perform a perceptual discrimination task (e.g., “what shape is the odd-colored object?”). The robust finding is that repetition of features of recent trials (e.g., target color) facilitates performance. We view this facilitation as an adaptation to the statistical structure of the environment. We propose a probabilistic model of the environment that is updated after each trial. Under the assumption that attentional control operates so as to make performance more efficient for more likely environmental states, we obtain parsimonious explanations for data from four different experiments. Further, our model provides a rational explanation for why the influence of past experience on attentional control is short lived. 1 INTRODUCTION The brain does not have the computational capacity to fully process the massive quantity of information provided by the eyes. Selective attention operates to filter the spatiotemporal stream to a manageable quantity. Key to understanding the nature of attention is discovering the algorithm governing selection, i.e., understanding what information will be selected and what will be suppressed. Selection is influenced by attributes of the spatiotemporal stream, often referred to as bottom-up contributions to attention. For example, attention is drawn to abrupt onsets, motion, and regions of high contrast in brightness and color. 
Most theories of attention posit that some visual information processing is performed preattentively and in parallel across the visual field. This processing extracts primitive visual features such as color and motion, which provide the bottom-up cues for attentional guidance. However, attention is not driven willy-nilly by these cues. The deployment of attention can be modulated by task instructions, current goals, and domain knowledge, collectively referred to as top-down contributions to attention. How do bottom-up and top-down contributions to attention interact? Most psychologically and neurobiologically motivated models propose a very similar architecture in which information from bottom-up and top-down sources combines in a saliency (or activation) map (e.g., Itti et al., 1998; Koch & Ullman, 1985; Mozer, 1991; Wolfe, 1994). The saliency map indicates, for each location in the visual field, the relative importance of that location. Attention is drawn to the most salient locations first. Figure 1 sketches the basic architecture that incorporates bottom-up and top-down contributions to the saliency map. The visual image is analyzed to extract maps of primitive features such as color and orientation. Associated with each location in a map is a scalar response or activation indicating the presence of a particular feature. [FIGURE 1. An attentional saliency map constructed from bottom-up and top-down information.] [FIGURE 2. Sample display from Experiment 1 of Maljkovic and Nakayama (1994).] Most models assume that responses are stronger at locations with high local feature contrast, consistent with neurophysiological data, e.g., the response of a red feature detector to a red object is stronger if the object is surrounded by green objects. The saliency map is obtained by taking a sum of bottom-up activations from the feature maps.
The bottom-up activations are modulated by a top-down gain that specifies the contribution of a particular map to saliency in the current task and environment. Wolfe (1994) describes a heuristic algorithm for determining appropriate gains in a visual search task, where the goal is to detect a target object among distractor objects. Wolfe proposes that maps encoding features that discriminate between target and distractors have higher gains, and to be consistent with the data, he proposes limits on the magnitude of gain modulation and the number of gains that can be modulated. More recently, Wolfe et al. (2003) have been explicit in proposing optimization as a principle for setting gains given the task definition and stimulus environment. One aspect of optimizing attentional control involves configuring the attentional system to perform a given task; for example, in a visual search task for a red vertical target among green vertical and red horizontal distractors, the task definition should result in a higher gain for red and vertical feature maps than for other feature maps. However, there is a more subtle form of gain modulation, which depends on the statistics of display environments. For example, if green vertical distractors predominate, then red is a better discriminative cue than vertical; and if red horizontal distractors predominate, then vertical is a better discriminative cue than red. In this paper, we propose a model that encodes statistics of the environment in order to allow for optimization of attentional control to the structure of the environment. Our model is designed to address a key set of behavioral data, which we describe next. 1.1 Attentional priming phenomena Psychological studies involve a sequence of experimental trials that begin with a stimulus presentation and end with a response from the human participant. Typically, trial order is randomized, and the context preceding a trial is ignored. 
However, in sequential studies, performance is examined on one trial contingent on the past history of trials. These sequential studies explore how experience influences future performance. Consider the sequential attentional task of Maljkovic and Nakayama (1994). On each trial, the stimulus display (Figure 2) consists of three notched diamonds, one a singleton in color, either green among red or red among green. The task is to report whether the singleton diamond, referred to as the target, is notched on the left or the right. The task is easy because the singleton pops out, i.e., the time to locate the singleton does not depend on the number of diamonds in the display. Nonetheless, the response time significantly depends on the sequence of trials leading up to the current trial: If the target is the same color on the current trial as on the previous trial, response time is roughly 100 ms faster than if the target is a different color on the current trial. Considering that response times are on the order of 700 ms, this effect, which we term attentional priming, is gigantic in the scheme of psychological phenomena. 2 ATTENTIONAL CONTROL AS ADAPTATION TO THE STATISTICS OF THE ENVIRONMENT We interpret the phenomenon of attentional priming via a particular perspective on attentional control, which can be summarized in two bullets. • The perceptual system dynamically constructs a probabilistic model of the environment based on its past experience. • Control parameters of the attentional system are tuned so as to optimize performance under the current environmental model. The primary focus of this paper is the environmental model, but we first discuss the nature of performance optimization. The role of attention is to make processing of some stimuli more efficient, and consequently, the processing of other stimuli less efficient.
For example, if the gain on the red feature map is turned up, processing will be efficient for red items, but competition from red items will reduce the efficiency for green items. Thus, optimal control should tune the system for the most likely states of the world by minimizing an objective function such as: $J(g) = \sum_e P(e)\, RT_g(e)$ (1) where g is a vector of top-down gains, e is an index over environmental states, P(.) is the probability of an environmental state, and $RT_g(.)$ is the expected response time, assuming a constant error rate, to the environmental state under gains g. Determining the optimal gains is a challenge because every gain setting will result in facilitation of responses to some environmental states but hindrance of responses to other states. The optimal control problem could be solved via direct reinforcement learning, but the rapidity of human learning makes this possibility unlikely: In a variety of experimental tasks, evidence suggests that adaptation to a new task or environment can occur in just one or two trials (e.g., Rogers & Monsell, 1996). Model-based reinforcement learning is an attractive alternative, because given a model, optimization can occur without further experience in the real world. Although the number of real-world trials necessary to achieve a given level of performance is comparable for direct and model-based reinforcement learning in stationary environments (Kearns & Singh, 1999), naturalistic environments can be viewed as highly nonstationary. In such a situation, the framework we suggest is well motivated: After each experience, the environment model is updated. The updated environmental model is then used to retune the attentional system. In this paper, we propose a particular model of the environment suitable for visual search tasks. Rather than explicitly modeling the optimization of attentional control by setting gains, we assume that the optimization process will serve to minimize Equation 1.
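The objective in Equation 1 of this abstract is a probability-weighted average of response times, which the following sketch evaluates for two candidate gain settings. The state names, probabilities and response times are made-up placeholder values, not data from either paper, and the dictionary layout is an assumption made for the illustration.

```python
def expected_response_time(prob, rt):
    """J(g) = sum_e P(e) * RT_g(e) for one gain setting.

    `prob` maps each environmental state to its probability,
    `rt` maps each state to the expected response time under that gain setting.
    """
    return sum(prob[e] * rt[e] for e in prob)


# Hypothetical environment in which red targets are four times as likely as green ones.
prob = {"red_target": 0.8, "green_target": 0.2}

# Hypothetical response times (ms) under two candidate gain settings.
rt_by_gain_setting = {
    "favor_red": {"red_target": 600.0, "green_target": 700.0},
    "favor_green": {"red_target": 700.0, "green_target": 600.0},
}

costs = {g: expected_response_time(prob, rt) for g, rt in rt_by_gain_setting.items()}
best_setting = min(costs, key=costs.get)
```

Each setting helps one state and hurts the other, so the better choice is the one that favors the more probable state, which is the trade-off the passage describes.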
Because any gain adjustment will facilitate performance in some environmental states and hinder performance in others, an optimized control system should obtain faster reaction times for more probable environmental states. This assumption allows us to explain experimental results in a minimal, parsimonious framework.
3 MODELING THE ENVIRONMENT
Focusing on the domain of visual search, we characterize the environment in terms of a probability distribution over configurations of target and distractor features. We distinguish three classes of features: defining, reported, and irrelevant. To explain these terms, consider the task of searching a display of size-varying, colored, notched diamonds (Figure 2), with the task of detecting the singleton in color and judging the notch location. Color is the defining feature, notch location is the reported feature, and size is an irrelevant feature. To simplify the exposition, we treat all features as having discrete values, an assumption which is true of the experimental tasks we model. We begin by considering displays containing a single target and a single distractor, and shortly generalize to multidistractor displays. We use the framework of Bayesian networks to characterize the environment. Each feature of the target and distractor is a discrete random variable, e.g., Tcolor for target color and Dnotch for the location of the notch on the distractor. The Bayes net encodes the probability distribution over environmental states; in our working example, this distribution is P(Tcolor, Tsize, Tnotch, Dcolor, Dsize, Dnotch). The structure of the Bayes net specifies the relationships among the features. The simplest model one could consider would be to treat the features as independent, illustrated in Figure 3a for the singleton-color search task. The opposite extreme would be the full joint distribution, which could be represented by a lookup table indexed by the six features, or by the cascading Bayes net architecture in Figure 3b.
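To make the gap between these two extremes concrete, one can count free parameters. The sketch below assumes every feature is binary and represents a Bayes net as a mapping from each variable to its parents; the helper `count_free_parameters` and the particular cascade ordering are illustrative assumptions (any ordering of a fully connected cascade gives the same count).

```python
def count_free_parameters(parents, cardinality=2):
    """Count free parameters of a discrete Bayes net.

    parents maps each variable to the list of its parent variables.
    Each node needs (cardinality - 1) numbers per joint setting of its parents.
    """
    return sum(
        (cardinality - 1) * cardinality ** len(pars)
        for pars in parents.values()
    )


features = ["Tcolor", "Dcolor", "Tsize", "Tnotch", "Dsize", "Dnotch"]

# Feature-independence model: no node has parents.
independent = {f: [] for f in features}

# Cascading model: each node depends on all nodes before it (assumed ordering).
cascade = {f: features[:i] for i, f in enumerate(features)}

n_independent = count_free_parameters(independent)  # 6
n_full_joint = count_free_parameters(cascade)       # 63, i.e. 2**6 - 1
print(n_independent, n_full_joint)
```

The same helper can be applied to any intermediate dependency structure by listing each variable's parents.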
The architecture we propose, which we'll refer to as the dominance model (Figure 3c), has an intermediate dependency structure, and expresses the joint distribution as:
P(Tcolor) P(Dcolor | Tcolor) P(Tsize | Tcolor) P(Tnotch | Tcolor) P(Dsize | Dcolor) P(Dnotch | Dcolor).
The structured model is constructed based on three rules.
1. The defining feature of the target is at the root of the tree.
2. The defining feature of the distractor is conditionally dependent on the defining feature of the target. We refer to this rule as dominance of the target over the distractor.
3. The reported and irrelevant features of target (distractor) are conditionally dependent on the defining feature of the target (distractor). We refer to this rule as dominance of the defining feature over nondefining features.
As we will demonstrate, the dominance model produces a parsimonious account of a wide range of experimental data.
FIGURE 3. Three models of a visual-search environment with colored, notched, size-varying diamonds. (a) feature-independence model; (b) full-joint model; (c) dominance model.
3.1 Updating the environment model
The model's parameters are the conditional distributions embodied in the links. In the example of Figure 3c with binary random variables, the model has 11 parameters. However, these parameters are determined by the environment: To be adaptive in nonstationary environments, the model must be updated following each experienced state. We propose a simple exponentially weighted averaging approach. For two variables V and W with observed values v and w on trial t, a conditional distribution, $P_t(V = u \mid W = w) = \delta_{uv}$, is defined, where δ is the Kronecker delta.
The distribution representing the environment E following trial t, denoted $P^E_t$, is then updated as follows:
$$P^E_t(V = u \mid W = w) = \alpha P^E_{t-1}(V = u \mid W = w) + (1 - \alpha) P_t(V = u \mid W = w) \qquad (2)$$
for all u, where α is a memory constant. Note that no update is performed for values of W other than w. An analogous update is performed for unconditional distributions. How the model is initialized, i.e., specifying $P^E_0$, is irrelevant, because in all experimental tasks that we model, participants begin the experiment with many dozens of practice trials. Data are not collected during practice trials. Consequently, any transient effects of $P^E_0$ do not impact the results. In our simulations, we begin with a uniform distribution for $P^E_0$, and include practice trials as in the human studies. Thus far, we've assumed a single target and a single distractor. The experiments that we model involve multiple distractors. The simple extension we require to handle multiple distractors is to define a frequentist probability for each distractor feature V, $P_t(V = v \mid W = w) = C_{vw} / C_w$, where $C_{vw}$ is the count of co-occurrences of feature values v and w among the distractors, and $C_w$ is the count of w. Our model is extremely simple. Given a description of the visual search task and environment, the model has only a single degree of freedom, α. In all simulations, we fix α = 0.75; however, the choice of α does not qualitatively impact any result.
4 SIMULATIONS
In this section, we show that the model can explain a range of data from four different experiments examining attentional priming. All experiments measure response times of participants. On each trial, the model can be used to obtain a probability of the display configuration (the environmental state) on that trial, given the history of trials to that point.
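The update rule and the trial-probability computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class names `CondTable` and `DominanceModel`, the dictionary-based trial format, and the choice to score a display using one target and one distractor are assumptions made here. The update follows Equation 2 with uniform initialization, and distractor distributions on each trial use the frequentist counts $C_{vw} / C_w$.

```python
import math
from collections import defaultdict

ALPHA = 0.75  # memory constant; the value the text reports for all simulations


class CondTable:
    """P(V | W) over discrete values, updated by Equation 2.

    Use w=None for an unconditional distribution.
    """

    def __init__(self, values, alpha=ALPHA):
        self.values = list(values)
        self.alpha = alpha
        self.rows = {}  # one row per observed w, created lazily as uniform

    def _row(self, w):
        if w not in self.rows:
            uniform = 1.0 / len(self.values)
            self.rows[w] = {v: uniform for v in self.values}
        return self.rows[w]

    def prob(self, v, w=None):
        return self._row(w)[v]

    def update(self, trial_dist, w=None):
        """trial_dist maps value -> P_t(V = value | W = w); only row w changes."""
        row = self._row(w)
        for u in self.values:
            row[u] = self.alpha * row[u] + (1 - self.alpha) * trial_dist.get(u, 0.0)


class DominanceModel:
    """Dominance factorization: the target's defining feature is the root."""

    def __init__(self, values, defining, alpha=ALPHA):
        self.defining = defining
        self.others = [f for f in values if f != defining]
        self.t_def = CondTable(values[defining], alpha)  # P(T_def)
        self.d_def = CondTable(values[defining], alpha)  # P(D_def | T_def)
        self.t_other = {f: CondTable(values[f], alpha) for f in self.others}  # P(T_f | T_def)
        self.d_other = {f: CondTable(values[f], alpha) for f in self.others}  # P(D_f | D_def)

    def log_prob(self, target, distractor):
        """Log probability of a one-target, one-distractor configuration."""
        t, d = target[self.defining], distractor[self.defining]
        lp = math.log(self.t_def.prob(t)) + math.log(self.d_def.prob(d, w=t))
        for f in self.others:
            lp += math.log(self.t_other[f].prob(target[f], w=t))
            lp += math.log(self.d_other[f].prob(distractor[f], w=d))
        return lp

    def update(self, target, distractors):
        t = target[self.defining]
        # Target features are observed exactly (Kronecker delta distributions).
        self.t_def.update({t: 1.0})
        for f in self.others:
            self.t_other[f].update({target[f]: 1.0}, w=t)
        # Distractor features use frequentist counts over the display.
        d_counts = defaultdict(int)
        for d in distractors:
            d_counts[d[self.defining]] += 1
        n = len(distractors)
        self.d_def.update({v: c / n for v, c in d_counts.items()}, w=t)
        for f in self.others:
            co = defaultdict(lambda: defaultdict(int))
            for d in distractors:
                co[d[self.defining]][d[f]] += 1
            for w, counts in co.items():
                self.d_other[f].update(
                    {v: c / d_counts[w] for v, c in counts.items()}, w=w
                )


# Usage: three red-target trials, then score a repeat versus a color switch.
values = {"color": ["red", "green"], "notch": ["left", "right"]}
model = DominanceModel(values, defining="color")
target = {"color": "red", "notch": "left"}
distractors = [
    {"color": "green", "notch": "right"},
    {"color": "green", "notch": "left"},
]
for _ in range(3):
    model.update(target, distractors)

same = model.log_prob(target, {"color": "green", "notch": "right"})
diff = model.log_prob(
    {"color": "green", "notch": "left"}, {"color": "red", "notch": "right"}
)
print(same > diff)
```

After a few repetitions of a red target, the repeated configuration receives a higher log probability than a color switch, which is the quantity the simulations relate to response time.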
Our critical assumption, as motivated earlier, is that response times monotonically decrease with increasing probability, indicating that visual information processing is better configured for more likely environmental states. The particular relationship we assume is that response times are linear in log probability. This assumption yields long response time tails, as are observed in all human studies.
4.1 Maljkovic and Nakayama (1994, Experiment 5)
In this experiment, participants were asked to search for a singleton in color in a display of three red or green diamonds. Each diamond was notched on either the left or right side, and the task was to report the side of the notch on the color singleton. The well-practiced participants made very few errors. Reaction time (RT) was examined as a function of whether the target on a given trial is the same or different color as the target on the trial n steps back or ahead. Figure 4 shows the results, with the human RTs in the left panel and the simulation log probabilities in the right panel. The horizontal axis represents n. Both graphs show the same outcome: repetition of target color facilitates performance. This influence lasts only for a half dozen trials, with an exponentially decreasing influence further into the past. In the model, this decreasing influence is due to the exponential decay of recent history (Equation 2). Figure 4 also shows that, as expected, the future has no influence on the current trial.
4.2 Maljkovic and Nakayama (1994, Experiment 8)
In the previous experiment, it is impossible to determine whether facilitation is due to repetition of the target's color or the distractor's color, because the display contains only two colors, and therefore repetition of target color implies repetition of distractor color.
To unconfound these two potential factors, an experiment like the previous one was conducted using four distinct colors, allowing one to examine the effect of repeating the target color while varying the distractor color, and vice versa. The sequence of trials was composed of subsequences of up to six consecutive trials with either the target or distractor color held constant while the other color was varied trial to trial. Following each subsequence, both target and distractors were changed. Figure 5 shows that for both humans and the simulation, performance improves toward an asymptote as the number of target and distractor repetitions increases; in the model, the asymptote is due to the probability of the repeated color in the environment model approaching 1.0. The performance improvement is greater for target than distractor repetition; in the model, this difference is due to the dominance of the defining feature of the target over the defining feature of the distractor.
4.3 Huang, Holcombe, and Pashler (2004, Experiment 1)
Huang et al. (2004) and Hillstrom (2000) conducted studies to determine whether repetitions of one feature facilitate performance independently of repetitions of another feature. In the Huang et al. study, participants searched for a singleton in size in a display consisting of lines that were short and long, slanted left or right, and colored white or black. The reported feature was target slant. Slant, size, and color were uncorrelated. Huang et al. discovered that repeating an irrelevant feature (color or orientation) facilitated performance, but only when the defining feature (size) was repeated. As shown in Figure 6, the model replicates human performance, due to the dominance of the defining feature over the reported and irrelevant features.
4.4 Wolfe, Butcher, Lee, and Hyde (2003, Experiment 1)
In an empirical tour-de-force, Wolfe et al. (2003) explored singleton search over a range of environments.
The task is to detect the presence or absence of a singleton in displays consisting of colored (red or green), oriented (horizontal or vertical) lines. Target-absent trials were used primarily to ensure participants were searching the display. The experiment examined seven experimental conditions, which varied in the amount of uncertainty as to the target identity. The essential conditions, from least to most uncertainty, are: blocked (e.g., target always red vertical among green horizontals), mixed feature (e.g., target always a color singleton), mixed dimension (e.g., target either red or vertical), and fully mixed (target could be red, green, vertical, or horizontal). With this design, one can ascertain how uncertainty in the environment and in the target definition influences task difficulty.
FIGURE 4. Experiment 5 of Maljkovic and Nakayama (1994): performance on a given trial conditional on the color of the target on a previous or subsequent trial. Human data is from subject KN. [Panels: Human data, Reaction Time (msec) vs. Relative Trial Number; Simulation, log(P(trial)) vs. Relative Trial Number. Curves: Same Color, Different Color.]
FIGURE 5. Experiment 8 of Maljkovic and Nakayama (1994). (left panel) human data, average of subjects KN and SS; (right panel) simulation. [Axes: Reaction Time (msec) and log(P(trial)) vs. Order in Sequence. Curves: Target Same, Distractors Same.]
FIGURE 6. Experiment 1 of Huang, Holcombe, & Pashler (2004). (left panel) human data; (right panel) simulation. [Axes: Reaction Time (msec) and log(P(trial)) for Color Repeat vs. Color Alternate. Curves: Size Repeat, Size Alternate.]
Because the defining feature in this experiment could be either color or orientation, we modeled the environment with two Bayes nets (one color dominant and one orientation dominant) and performed model averaging. A comparison of Figures 7a and 7b shows a correspondence between human RTs and model predictions. Less uncertainty in the environment leads to more efficient performance. One interesting result from the model is its prediction that the mixed-feature condition is easier than the fully-mixed condition; that is, search is more efficient when the dimension (i.e., color vs. orientation) of the singleton is known, even though the model has no abstract representation of feature dimensions, only feature values.
4.5 Optimal adaptation constant
In all simulations so far, we fixed the memory constant. From the human data, it is clear that memory for recent experience is relatively short lived, on the order of a half dozen trials (e.g., left panel of Figure 4). In this section we provide a rational argument for the short duration of memory in attentional control. Figure 7c shows mean negative log probability in each condition of the Wolfe et al. (2003) experiment, as a function of α. To assess these probabilities, for each experimental condition, the model was initialized so that all of the conditional distributions were uniform, and then a block of trials was run. Log probability for all trials in the block was averaged. The negative log probability (y axis of the figure) is a measure of the model's misprediction of the next trial in the sequence. For complex environments, such as the fully-mixed condition, a small memory constant is detrimental: With rapid memory decay, the effective history of trials is a high-variance sample of the distribution of environmental states.
For simple environments, a large memory constant is detrimental: With slow memory decay, the model does not transition quickly from the initial environmental model to one that reflects the statistics of a new environment. Thus, the memory constant is constrained by being large enough that the environment model can hold on to sufficient history to represent complex environments, and by being small enough that the model adapts quickly to novel environments. If the conditions in Wolfe et al. give some indication of the range of naturalistic environments an agent encounters, we have a rational account of why attentional priming is so short lived. Whether priming lasts 2 trials or 20, the surprising empirical result is that it does not last 200 or 2000 trials. Our rational argument provides a rough insight into this finding.
FIGURE 7. (a) Human data for Wolfe et al. (2003), Experiment 1; (b) simulation; (c) misprediction of model (i.e., lower y value = better) as a function of α for five experimental conditions. [Panels (a) and (b): reaction time (msec) and log(P(trial)) vs. target type (red or vert, red and vert) for the fully mixed, mixed feature, mixed dimension, and blocked conditions. Panel (c): log(P(trial)) vs. Memory Constant for Blocked Red or Vertical, Blocked Red and Vertical, Mixed Feature, Mixed Dimension, and Fully Mixed.]
5 DISCUSSION
The psychological literature contains two opposing accounts of attentional priming and its relation to attentional control. Huang et al. (2004) and Hillstrom (2000) propose an episodic account in which a distinct memory trace, representing the complete configuration of features in the display, is laid down for each trial, and priming depends on configural similarity of the current trial to previous trials. Alternatively, Maljkovic and Nakayama (1994) and Wolfe et al.
(2003) propose a feature-strengthening account in which detection of a feature on one trial increases its ability to attract attention on subsequent trials, and priming is proportional to the number of overlapping features from one trial to the next. The episodic account corresponds roughly to the full joint model (Figure 3b), and the feature-strengthening account corresponds roughly to the independence model (Figure 3a). Neither account is adequate to explain the range of data we presented. However, an intermediate account, the dominance model (Figure 3c), is not only sufficient, but it offers a parsimonious, rational explanation. Beyond the model’s basic assumptions, it has only one free parameter, and can explain results from diverse experimental paradigms. The model makes a further theoretical contribution. Wolfe et al. distinguish the environments in their experiment in terms of the amount of top-down control available, implying that different mechanisms might be operating in different environments. However, in our account, top-down control is not some substance distributed in different amounts depending on the nature of the environment. Our account treats all environments uniformly, relying on attentional control to adapt to the environment at hand. We conclude with two limitations of the present work. First, our account presumes a particular network architecture, instead of a more elegant Bayesian approach that specifies priors over architectures, and performs automatic model selection via the sequence of trials. We did explore such a Bayesian approach, but it was unable to explain the data. Second, at least one finding in the literature is problematic for the model. Hillstrom (2000) occasionally finds that RTs slow when an irrelevant target feature is repeated but the defining target feature is not. However, because this effect is observed only in some experiments, it is likely that any model would require elaboration to explain the variability. 
ACKNOWLEDGEMENTS
We thank Jeremy Wolfe for providing the raw data from his experiment for reanalysis. This research was funded by NSF BCS Award 0339103.
REFERENCES
Huang, L., Holcombe, A. O., & Pashler, H. (2004). Repetition priming in visual search: Episodic retrieval, not feature priming. Memory & Cognition, 32, 12–20.
Hillstrom, A. P. (2000). Repetition effects in visual search. Perception & Psychophysics, 62, 800–817.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Analysis & Machine Intelligence, 20, 1254–1259.
Kearns, M., & Singh, S. (1999). Finite-sample convergence rates for Q-learning and indirect algorithms. In Advances in Neural Information Processing Systems 11 (pp. 996–1002). Cambridge, MA: MIT Press.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4, 219–227.
Maljkovic, V., & Nakayama, K. (1994). Priming of pop-out: I. Role of features. Mem. & Cognition, 22, 657–672.
Mozer, M. C. (1991). The perception of multiple objects: A connectionist approach. Cambridge, MA: MIT Press.
Rogers, R. D., & Monsell, S. (1995). The cost of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231.
Wolfe, J. M. (1994). Guided Search 2.0: A Revised Model of Visual Search. Psych. Bull. & Rev., 1, 202–238.
Wolfe, J. M., Butcher, S. J., Lee, C., & Hyde, M. (2003). Changing your mind: on the contributions of top-down and bottom-up guidance in visual search for feature singletons. Journal of Exptl. Psychology: Human Perception & Performance, 29, 483–502.
3 0.22919014 141 nips-2005-Norepinephrine and Neural Interrupts
Author: Peter Dayan, Angela J. Yu
Abstract: Experimental data indicate that norepinephrine is critically involved in aspects of vigilance and attention. Previously, we considered the function of this neuromodulatory system on a time scale of minutes and longer, and suggested that it signals global uncertainty arising from gross changes in environmental contingencies. However, norepinephrine is also known to be activated phasically by familiar stimuli in well-learned tasks. Here, we extend our uncertainty-based treatment of norepinephrine to this phasic mode, proposing that it is involved in the detection and reaction to state uncertainty within a task. This role of norepinephrine can be understood through the metaphor of neural interrupts.
4 0.12963137 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
Author: Wei Zhang, Hyejin Yang, Dimitris Samaras, Gregory J. Zelinsky
Abstract: We present a computational model of human eye movements in an object class detection task. The model combines state-of-the-art computer vision object class detection methods (SIFT features trained using AdaBoost) with a biologically plausible model of human eye movement to produce a sequence of simulated fixations, culminating with the acquisition of a target. We validated the model by comparing its behavior to the behavior of human observers performing the identical object class detection task (looking for a teddy bear among visually complex nontarget objects). We found considerable agreement between the model and human data in multiple eye movement measures, including number of fixations, cumulative probability of fixating the target, and scanpath distance.
5 0.12955244 169 nips-2005-Saliency Based on Information Maximization
Author: Neil Bruce, John Tsotsos
Abstract: A model of bottom-up overt attention is proposed based on the principle of maximizing information sampled from a scene. The proposed operation is based on Shannon's self-information measure and is achieved in a neural circuit, which is demonstrated as having close ties with the circuitry existent in the primate visual cortex. It is further shown that the proposed saliency measure may be extended to address issues that currently elude explanation in the domain of saliency based models. Results on natural images are compared with experimental eye tracking data revealing the efficacy of the model in predicting the deployment of overt attention as compared with existing efforts.
6 0.11426675 146 nips-2005-On the Accuracy of Bounded Rationality: How Far from Optimal Is Fast and Frugal?
7 0.10861152 109 nips-2005-Learning Cue-Invariant Visual Responses
8 0.1078311 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
9 0.087674402 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
10 0.079969823 3 nips-2005-A Bayesian Framework for Tilt Perception and Confidence
11 0.07218872 41 nips-2005-Coarse sample complexity bounds for active learning
12 0.070198596 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
13 0.069910631 26 nips-2005-An exploration-exploitation model based on norepinepherine and dopamine activity
14 0.06682808 181 nips-2005-Spiking Inputs to a Winner-take-all Network
15 0.066762939 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects
16 0.066201873 132 nips-2005-Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity
17 0.066081017 38 nips-2005-Beyond Gaussian Processes: On the Distributions of Infinite Networks
18 0.062684953 166 nips-2005-Robust Fisher Discriminant Analysis
19 0.060443044 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
20 0.060143553 12 nips-2005-A PAC-Bayes approach to the Set Covering Machine
topicId topicWeight
[(0, 0.178), (1, -0.133), (2, 0.026), (3, 0.15), (4, -0.001), (5, 0.222), (6, -0.031), (7, -0.121), (8, -0.239), (9, 0.156), (10, 0.114), (11, -0.003), (12, -0.08), (13, 0.045), (14, -0.145), (15, 0.038), (16, 0.003), (17, 0.041), (18, 0.063), (19, -0.215), (20, 0.049), (21, -0.107), (22, -0.099), (23, 0.103), (24, 0.063), (25, -0.125), (26, 0.055), (27, 0.044), (28, 0.056), (29, 0.113), (30, 0.078), (31, 0.018), (32, 0.085), (33, 0.072), (34, -0.038), (35, -0.041), (36, -0.122), (37, -0.08), (38, 0.091), (39, -0.108), (40, 0.016), (41, -0.022), (42, 0.136), (43, -0.006), (44, 0.171), (45, 0.005), (46, 0.006), (47, 0.026), (48, 0.075), (49, 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.98543477 149 nips-2005-Optimal cue selection strategy
Author: Vidhya Navalpakkam, Laurent Itti
Abstract: Survival in the natural world demands the selection of relevant visual cues to rapidly and reliably guide attention towards prey and predators in cluttered environments. We investigate whether our visual system selects cues that guide search in an optimal manner. We formally obtain the optimal cue selection strategy by maximizing the signal to noise ratio (SNR) between a search target and surrounding distractors. This optimal strategy successfully accounts for several phenomena in visual search behavior, including the effect of target-distractor discriminability, uncertainty in target's features, distractor heterogeneity, and linear separability. Furthermore, the theory generates a new prediction, which we verify through psychophysical experiments with human subjects. Our results provide direct experimental evidence that humans select visual cues so as to maximize SNR between the targets and surrounding clutter.
2 0.82109326 194 nips-2005-Top-Down Control of Visual Attention: A Rational Account
Author: Michael Shettel, Shaun Vecera, Michael C. Mozer
Abstract: Theories of visual attention commonly posit that early parallel processes extract conspicuous features such as color contrast and motion from the visual field. These features are then combined into a saliency map, and attention is directed to the most salient regions first. Top-down attentional control is achieved by modulating the contribution of different feature types to the saliency map. A key source of data concerning attentional control comes from behavioral studies in which the effect of recent experience is examined as individuals repeatedly perform a perceptual discrimination task (e.g., “what shape is the odd-colored object?”). The robust finding is that repetition of features of recent trials (e.g., target color) facilitates performance. We view this facilitation as an adaptation to the statistical structure of the environment. We propose a probabilistic model of the environment that is updated after each trial. Under the assumption that attentional control operates so as to make performance more efficient for more likely environmental states, we obtain parsimonious explanations for data from four different experiments. Further, our model provides a rational explanation for why the influence of past experience on attentional control is short lived.
1 INTRODUCTION
The brain does not have the computational capacity to fully process the massive quantity of information provided by the eyes. Selective attention operates to filter the spatiotemporal stream to a manageable quantity. Key to understanding the nature of attention is discovering the algorithm governing selection, i.e., understanding what information will be selected and what will be suppressed. Selection is influenced by attributes of the spatiotemporal stream, often referred to as bottom-up contributions to attention. For example, attention is drawn to abrupt onsets, motion, and regions of high contrast in brightness and color.
Most theories of attention posit that some visual information processing is performed preattentively and in parallel across the visual field. This processing extracts primitive visual features such as color and motion, which provide the bottom-up cues for attentional guidance. However, attention is not driven willy-nilly by these cues. The deployment of attention can be modulated by task instructions, current goals, and domain knowledge, collectively referred to as top-down contributions to attention. How do bottom-up and top-down contributions to attention interact? Most psychologically and neurobiologically motivated models propose a very similar architecture in which information from bottom-up and top-down sources combines in a saliency (or activation) map (e.g., Itti et al., 1998; Koch & Ullman, 1985; Mozer, 1991; Wolfe, 1994). The saliency map indicates, for each location in the visual field, the relative importance of that location. Attention is drawn to the most salient locations first. Figure 1 sketches the basic architecture that incorporates bottom-up and top-down contributions to the saliency map. The visual image is analyzed to extract maps of primitive features such as color and orientation. Associated with each location in a map is a scalar response or activation indicating the presence of a particular feature. Most models assume that responses are stronger at locations with high local feature contrast, consistent with neurophysiological data, e.g., the response of a red feature detector to a red object is stronger if the object is surrounded by green objects. The saliency map is obtained by taking a sum of bottom-up activations from the feature maps.
FIGURE 1. An attentional saliency map constructed from bottom-up and top-down information. [Diagram labels: visual image; primitive feature maps (horizontal, vertical, green, red); bottom-up activations; top-down gains; saliency map.]
FIGURE 2. Sample display from Experiment 1 of Maljkovic and Nakayama (1994).
The bottom-up activations are modulated by a top-down gain that specifies the contribution of a particular map to saliency in the current task and environment. Wolfe (1994) describes a heuristic algorithm for determining appropriate gains in a visual search task, where the goal is to detect a target object among distractor objects. Wolfe proposes that maps encoding features that discriminate between target and distractors have higher gains, and to be consistent with the data, he proposes limits on the magnitude of gain modulation and the number of gains that can be modulated. More recently, Wolfe et al. (2003) have been explicit in proposing optimization as a principle for setting gains given the task definition and stimulus environment. One aspect of optimizing attentional control involves configuring the attentional system to perform a given task; for example, in a visual search task for a red vertical target among green vertical and red horizontal distractors, the task definition should result in a higher gain for red and vertical feature maps than for other feature maps. However, there is a more subtle form of gain modulation, which depends on the statistics of display environments. For example, if green vertical distractors predominate, then red is a better discriminative cue than vertical; and if red horizontal distractors predominate, then vertical is a better discriminative cue than red. In this paper, we propose a model that encodes statistics of the environment in order to allow for optimization of attentional control to the structure of the environment. Our model is designed to address a key set of behavioral data, which we describe next.
1.1 Attentional priming phenomena
Psychological studies involve a sequence of experimental trials that begin with a stimulus presentation and end with a response from the human participant. Typically, trial order is randomized, and the context preceding a trial is ignored.
However, in sequential studies, performance is examined on one trial contingent on the past history of trials. These sequential studies explore how experience influences future performance. Consider the sequential attentional task of Maljkovic and Nakayama (1994). On each trial, the stimulus display (Figure 2) consists of three notched diamonds, one a singleton in color: either green among red or red among green. The task is to report whether the singleton diamond, referred to as the target, is notched on the left or the right. The task is easy because the singleton pops out, i.e., the time to locate the singleton does not depend on the number of diamonds in the display. Nonetheless, the response time significantly depends on the sequence of trials leading up to the current trial: If the target is the same color on the current trial as on the previous trial, response time is roughly 100 ms faster than if the target is a different color on the current trial. Considering that response times are on the order of 700 ms, this effect, which we term attentional priming, is gigantic in the scheme of psychological phenomena.
2 ATTENTIONAL CONTROL AS ADAPTATION TO THE STATISTICS OF THE ENVIRONMENT
We interpret the phenomenon of attentional priming via a particular perspective on attentional control, which can be summarized in two bullets.
• The perceptual system dynamically constructs a probabilistic model of the environment based on its past experience.
• Control parameters of the attentional system are tuned so as to optimize performance under the current environmental model.
The primary focus of this paper is the environmental model, but we first discuss the nature of performance optimization. The role of attention is to make processing of some stimuli more efficient, and consequently, the processing of other stimuli less efficient.
For example, if the gain on the red feature map is turned up, processing will be efficient for red items, but competition from red items will reduce the efficiency for green items. Thus, optimal control should tune the system for the most likely states of the world by minimizing an objective function such as: $J(g) = \sum_{e} P(e)\, RT_g(e)$ (1) where g is a vector of top-down gains, e is an index over environmental states, P(·) is the probability of an environmental state, and RT_g(·) is the expected response time (assuming a constant error rate) to the environmental state under gains g. Determining the optimal gains is a challenge because every gain setting will result in facilitation of responses to some environmental states but hindrance of responses to other states. The optimal control problem could be solved via direct reinforcement learning, but the rapidity of human learning makes this possibility unlikely: In a variety of experimental tasks, evidence suggests that adaptation to a new task or environment can occur in just one or two trials (e.g., Rogers & Monsell, 1995). Model-based reinforcement learning is an attractive alternative, because given a model, optimization can occur without further experience in the real world. Although the number of real-world trials necessary to achieve a given level of performance is comparable for direct and model-based reinforcement learning in stationary environments (Kearns & Singh, 1999), naturalistic environments can be viewed as highly nonstationary. In such a situation, the framework we suggest is well motivated: After each experience, the environment model is updated. The updated environmental model is then used to retune the attentional system. In this paper, we propose a particular model of the environment suitable for visual search tasks. Rather than explicitly modeling the optimization of attentional control by setting gains, we assume that the optimization process will serve to minimize Equation 1.
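As a rough illustration of Equation 1, the sketch below evaluates the objective for two candidate gain settings over a two-state environment. The state names, the probabilities, the response times, and the `expected_rt` helper are all illustrative assumptions; none of these numbers come from the paper.

```python
# Hypothetical illustration of Equation 1: J(g) = sum_e P(e) * RT_g(e).
# States, probabilities, and response times below are made-up placeholders.

def expected_rt(p_env, rt_under_gains):
    """Return the probability-weighted mean response time over states."""
    return sum(p_env[e] * rt_under_gains[e] for e in p_env)

# Two environmental states: the target is red, or the target is green.
p_env = {"red_target": 0.8, "green_target": 0.2}

# Assumed response times (ms) under two candidate gain settings.
rt_red_gain = {"red_target": 600.0, "green_target": 700.0}    # gains favor red
rt_green_gain = {"red_target": 700.0, "green_target": 600.0}  # gains favor green

j_red = expected_rt(p_env, rt_red_gain)      # 0.8 * 600 + 0.2 * 700
j_green = expected_rt(p_env, rt_green_gain)  # 0.8 * 700 + 0.2 * 600

# The setting with the lower objective is preferred.
best = "red" if j_red < j_green else "green"
```

Under these assumed numbers the red-favoring gains win because red targets are the more probable state, which is the intuition that faster responses should go to more probable states.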
Because any gain adjustment will facilitate performance in some environmental states and hinder performance in others, an optimized control system should obtain faster reaction times for more probable environmental states. This assumption allows us to explain experimental results in a minimal, parsimonious framework. 3 MODELING THE ENVIRONMENT Focusing on the domain of visual search, we characterize the environment in terms of a probability distribution over configurations of target and distractor features. We distinguish three classes of features: defining, reported, and irrelevant. To explain these terms, consider the task of searching a display of size-varying, colored, notched diamonds (Figure 2), with the task of detecting the singleton in color and judging the notch location. Color is the defining feature, notch location is the reported feature, and size is an irrelevant feature. To simplify the exposition, we treat all features as having discrete values, an assumption which is true of the experimental tasks we model. We begin by considering displays containing a single target and a single distractor, and shortly generalize to multidistractor displays. We use the framework of Bayesian networks to characterize the environment. Each feature of the target and distractor is a discrete random variable, e.g., Tcolor for target color and Dnotch for the location of the notch on the distractor. The Bayes net encodes the probability distribution over environmental states; in our working example, this distribution is P(Tcolor, Tsize, Tnotch, Dcolor, Dsize, Dnotch). The structure of the Bayes net specifies the relationships among the features. The simplest model one could consider would be to treat the features as independent, illustrated in Figure 3a for the singleton-color search task. The opposite extreme would be the full joint distribution, which could be represented by a lookup table indexed by the six features, or by the cascading Bayes net architecture in Figure 3b.
The architecture we propose, which we'll refer to as the dominance model (Figure 3c), has an intermediate dependency structure, and expresses the joint distribution as: P(Tcolor) P(Dcolor | Tcolor) P(Tsize | Tcolor) P(Tnotch | Tcolor) P(Dsize | Dcolor) P(Dnotch | Dcolor). The structured model is constructed based on three rules. 1. The defining feature of the target is at the root of the tree. 2. The defining feature of the distractor is conditionally dependent on the defining feature of the target. We refer to this rule as dominance of the target over the distractor. 3. The reported and irrelevant features of target (distractor) are conditionally dependent on the defining feature of the target (distractor). We refer to this rule as dominance of the defining feature over nondefining features. As we will demonstrate, the dominance model produces a parsimonious account of a wide range of experimental data.
[FIGURE 3. Three models of a visual-search environment with colored, notched, size-varying diamonds. (a) feature-independence model; (b) full-joint model; (c) dominance model.]
3.1 Updating the environment model The model's parameters are the conditional distributions embodied in the links. In the example of Figure 3c with binary random variables, the model has 11 parameters. However, these parameters are determined by the environment: To be adaptive in nonstationary environments, the model must be updated following each experienced state. We propose a simple exponentially weighted averaging approach. For two variables V and W with observed values v and w on trial t, a conditional distribution, $P_t(V = u \mid W = w) = \delta_{uv}$, is defined, where δ is the Kronecker delta.
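The dominance factorization for the single-target, single-distractor case can be sketched as follows, assuming binary features and uniform conditional tables. Following rule 3, the distractor's nondefining features are conditioned on the distractor's color. The dictionary layout and the names `PARENT`, `uniform_tables`, `joint_probability`, and `count_free_parameters` are illustrative choices, not part of the paper.

```python
from itertools import product

# Parent of each variable in the dominance model (None marks the root).
PARENT = {
    "Tcolor": None,
    "Dcolor": "Tcolor",
    "Tsize": "Tcolor",
    "Tnotch": "Tcolor",
    "Dsize": "Dcolor",
    "Dnotch": "Dcolor",
}
VALUES = (0, 1)  # binary features, as in the Figure 3c example


def uniform_tables():
    """Build one uniform distribution per variable and parent value."""
    tables = {}
    for var, parent in PARENT.items():
        contexts = [None] if parent is None else list(VALUES)
        tables[var] = {
            ctx: {v: 1.0 / len(VALUES) for v in VALUES} for ctx in contexts
        }
    return tables


def joint_probability(tables, state):
    """Multiply the conditional probabilities along the tree."""
    p = 1.0
    for var, parent in PARENT.items():
        ctx = None if parent is None else state[parent]
        p *= tables[var][ctx][state[var]]
    return p


def count_free_parameters(tables):
    """Each distribution over k values has k - 1 free parameters."""
    return sum(
        len(dist) - 1 for table in tables.values() for dist in table.values()
    )


tables = uniform_tables()
n_params = count_free_parameters(tables)

# Sanity check: the factorized joint sums to one over all 2**6 states.
total = sum(
    joint_probability(tables, dict(zip(PARENT, combo)))
    for combo in product(VALUES, repeat=len(PARENT))
)
```

With binary features the count is one parameter for the root plus two for each of the five child variables, which matches the 11 parameters quoted for Figure 3c.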
The distribution representing the environment E following trial t, denoted $P^E_t$, is then updated as follows: $P^E_t(V = u \mid W = w) = \alpha P^E_{t-1}(V = u \mid W = w) + (1 - \alpha) P_t(V = u \mid W = w)$ (2) for all u, where α is a memory constant. Note that no update is performed for values of W other than w. An analogous update is performed for unconditional distributions. How the model is initialized (i.e., the choice of $P^E_0$) is irrelevant, because in all experimental tasks that we model, participants begin the experiment with many dozens of practice trials. Data is not collected during practice trials. Consequently, any transient effects of $P^E_0$ do not impact the results. In our simulations, we begin with a uniform distribution for $P^E_0$, and include practice trials as in the human studies. Thus far, we've assumed a single target and a single distractor. The experiments that we model involve multiple distractors. The simple extension we require to handle multiple distractors is to define a frequentist probability for each distractor feature V, $P_t(V = v \mid W = w) = C_{vw} / C_w$, where $C_{vw}$ is the count of co-occurrences of feature values v and w among the distractors, and $C_w$ is the count of w. Our model is extremely simple. Given a description of the visual search task and environment, the model has only a single degree of freedom, α. In all simulations, we fix α = 0.75; however, the choice of α does not qualitatively impact any result. 4 SIMULATIONS In this section, we show that the model can explain a range of data from four different experiments examining attentional priming. All experiments measure response times of participants. On each trial, the model can be used to obtain a probability of the display configuration (the environmental state) on that trial, given the history of trials to that point.
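The update of Equation 2 and the frequentist distractor distribution can be sketched as below for a single parent value w. This is a minimal sketch: the helper names `one_hot`, `from_counts`, and `blend` are hypothetical, and the caller is assumed to have already selected the observations that share the same parent value.

```python
def one_hot(values, observed):
    """Single-trial distribution for one observed value (Kronecker delta)."""
    return {u: 1.0 if u == observed else 0.0 for u in values}


def from_counts(values, observations):
    """Single-trial distribution C_vw / C_w from several distractors.

    `observations` holds the feature values of the distractors that share
    the same parent value w, so its length plays the role of C_w.
    """
    n = len(observations)
    return {u: observations.count(u) / n for u in values}


def blend(env_dist, trial_dist, alpha=0.75):
    """Equation 2: keep a fraction alpha of the old belief, add the rest."""
    return {u: alpha * env_dist[u] + (1.0 - alpha) * trial_dist[u]
            for u in env_dist}


colors = ("red", "green")
belief = {"red": 0.5, "green": 0.5}  # uniform starting distribution

# Three consecutive trials with a red target.
for _ in range(3):
    belief = blend(belief, one_hot(colors, "red"))

# One trial with three distractors, two small and one large.
sizes = ("small", "large")
size_trial = from_counts(sizes, ["small", "small", "large"])
```

After the three red trials the belief in red rises from 0.5 to 0.625, 0.71875, and then 0.7890625, moving toward 1.0 at a rate set by α.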
Our critical assumption—as motivated earlier—is that response times monotonically decrease with increasing probability, indicating that visual information processing is better configured for more likely environmental states. The particular relationship we assume is that response times are linear in log probability. This assumption yields long response time tails, as are observed in all human studies. 4.1 Maljkovic and Nakayama (1994, Experiment 5) In this experiment, participants were asked to search for a singleton in color in a display of three red or green diamonds. Each diamond was notched on either the left or right side, and the task was to report the side of the notch on the color singleton. The well-practiced participants made very few errors. Reaction time (RT) was examined as a function of whether the target on a given trial is the same or different color as the target on trial n steps back or ahead. Figure 4 shows the results, with the human RTs in the left panel and the simulation log probabilities in the right panel. The horizontal axis represents n. Both graphs show the same outcome: repetition of target color facilitates performance. This influence lasts only for a half dozen trials, with an exponentially decreasing influence further into the past. In the model, this decreasing influence is due to the exponential decay of recent history (Equation 2). Figure 4 also shows that—as expected—the future has no influence on the current trial. 4.2 Maljkovic and Nakayama (1994, Experiment 8) In the previous experiment, it is impossible to determine whether facilitation is due to repetition of the target’s color or the distractor’s color, because the display contains only two colors, and therefore repetition of target color implies repetition of distractor color. 
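To make the link between trial probability and response time concrete, the sketch below tracks only the target-color distribution over a short sequence and maps each trial's probability to a response time that is linear in log probability. The intercept and slope are arbitrary placeholders (they are not reported in the text), and reducing the model to a single variable is a simplifying assumption.

```python
import math


def predicted_rt(p_trial, intercept=550.0, slope=40.0):
    """Response time assumed linear in log probability.

    The intercept and slope are placeholder values chosen for illustration.
    """
    return intercept - slope * math.log(p_trial)


def run_color_sequence(colors, alpha=0.75):
    """Return the model's probability of each target color before seeing it."""
    belief = {"red": 0.5, "green": 0.5}
    probs = []
    for color in colors:
        probs.append(belief[color])
        # Exponentially weighted update toward the observed color.
        belief = {
            u: alpha * p + (1.0 - alpha) * (1.0 if u == color else 0.0)
            for u, p in belief.items()
        }
    return probs


probs = run_color_sequence(["red", "red", "red", "green"])
rts = [predicted_rt(p) for p in probs]
```

In this toy run the predicted response time falls across the three repeated red trials and jumps on the switch to green, which is the qualitative pattern of repetition facilitation described for Experiment 5.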
To unconfound these two potential factors, an experiment like the previous one was conducted using four distinct colors, allowing one to examine the effect of repeating the target color while varying the distractor color, and vice versa. The sequence of trials was composed of subsequences of up-to-six consecutive trials with either the target or distractor color held constant while the other color was varied trial to trial. Following each subsequence, both target and distractors were changed. Figure 5 shows that for both humans and the simulation, performance improves toward an asymptote as the number of target and distractor repetitions increases; in the model, the asymptote is due to the probability of the repeated color in the environment model approaching 1.0. The performance improvement is greater for target than distractor repetition; in the model, this difference is due to the dominance of the defining feature of the target over the defining feature of the distractor. 4.3 Huang, Holcombe, and Pashler (2004, Experiment 1) Huang et al. (2004) and Hillstrom (2000) conducted studies to determine whether repetitions of one feature facilitate performance independently of repetitions of another feature. In the Huang et al. study, participants searched for a singleton in size in a display consisting of lines that were short and long, slanted left or right, and colored white or black. The reported feature was target slant. Slant, size, and color were uncorrelated. Huang et al. discovered that repeating an irrelevant feature (color or orientation) facilitated performance, but only when the defining feature (size) was repeated. As shown in Figure 6, the model replicates human performance, due to the dominance of the defining feature over the reported and irrelevant features. 4.4 Wolfe, Butcher, Lee, and Hyde (2003, Experiment 1) In an empirical tour-de-force, Wolfe et al. (2003) explored singleton search over a range of environments.
The task is to detect the presence or absence of a singleton in displays consisting of colored (red or green), oriented (horizontal or vertical) lines. Target-absent trials were used primarily to ensure participants were searching the display. The experiment examined seven experimental conditions, which varied in the amount of uncertainty as to the target identity. The essential conditions, from least to most uncertainty, are: blocked (e.g., target always red vertical among green horizontals), mixed feature (e.g., target always a color singleton), mixed dimension (e.g., target either red or vertical), and fully mixed (target could be red, green, vertical, or horizontal). With this design, one can ascertain how uncertainty in the environment and in the target definition influence task difficulty.
[FIGURE 4. Experiment 5 of Maljkovic and Nakayama (1994): performance on a given trial conditional on the color of the target on a previous or subsequent trial. Human data is from subject KN. Left panel: human data, reaction time (msec); right panel: simulation, log(P(trial)). Horizontal axis: relative trial number (past and future). Curves: same color, different color.]
[FIGURE 5. Experiment 8 of Maljkovic and Nakayama (1994). (left panel) human data, average of subjects KN and SS, reaction time (msec); (right panel) simulation, log(P(trial)). Horizontal axis: order in sequence. Curves: target same, distractors same.]
[FIGURE 6. Experiment 1 of Huang, Holcombe, & Pashler (2004). (left panel) human data, reaction time (msec); (right panel) simulation, log(P(trial)). Horizontal axis: color repeat vs. color alternate. Curves: size repeat, size alternate.]
Because the defining feature in this experiment could be either color or orientation, we modeled the environment with two Bayes nets (one color dominant and one orientation dominant) and performed model averaging. A comparison of Figures 7a and 7b shows a correspondence between human RTs and model predictions. Less uncertainty in the environment leads to more efficient performance. One interesting result from the model is its prediction that the mixed-feature condition is easier than the fully-mixed condition; that is, search is more efficient when the dimension (i.e., color vs. orientation) of the singleton is known, even though the model has no abstract representation of feature dimensions, only feature values. 4.5 Optimal adaptation constant In all simulations so far, we fixed the memory constant. From the human data, it is clear that memory for recent experience is relatively short lived, on the order of a half dozen trials (e.g., left panel of Figure 4). In this section we provide a rational argument for the short duration of memory in attentional control. Figure 7c shows mean negative log probability in each condition of the Wolfe et al. (2003) experiment, as a function of α. To assess these probabilities, for each experimental condition, the model was initialized so that all of the conditional distributions were uniform, and then a block of trials was run. Log probability for all trials in the block was averaged. The negative log probability (y axis of the Figure) is a measure of the model's misprediction of the next trial in the sequence. For complex environments, such as the fully-mixed condition, a small memory constant is detrimental: With rapid memory decay, the effective history of trials is a high-variance sample of the distribution of environmental states.
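The misprediction measure used in this section, mean negative log probability as a function of α, can be sketched on a toy problem. The sketch tracks a single four-valued variable rather than the full Bayes nets, and the two trial sequences (a constant one and a seeded random one) are stand-ins for the blocked and fully-mixed conditions, not the actual experimental sequences.

```python
import math
import random


def mean_neg_log_prob(sequence, values, alpha):
    """Average misprediction of the next trial under memory constant alpha."""
    belief = {v: 1.0 / len(values) for v in values}  # uniform initialization
    total = 0.0
    for obs in sequence:
        total -= math.log(belief[obs])
        # Exponentially weighted update toward the observed value.
        belief = {
            v: alpha * p + (1.0 - alpha) * (1.0 if v == obs else 0.0)
            for v, p in belief.items()
        }
    return total / len(sequence)


values = ["red", "green", "vertical", "horizontal"]
rng = random.Random(0)  # fixed seed so the toy run is repeatable

blocked = ["red"] * 200                            # simple environment
mixed = [rng.choice(values) for _ in range(200)]   # complex environment

results = {
    (name, alpha): mean_neg_log_prob(seq, values, alpha)
    for name, seq in (("blocked", blocked), ("mixed", mixed))
    for alpha in (0.5, 0.75, 0.95)
}
```

In this toy setting the constant sequence is predicted better with a small α, because the model leaves its uniform starting point quickly, while the random sequence is predicted better with a large α, because a longer history gives a lower-variance estimate of the value frequencies.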
For simple environments, a large memory constant is detrimental: With slow memory decay, the model does not transition quickly from the initial environmental model to one that reflects the statistics of a new environment. Thus, the memory constant is constrained by being large enough that the environment model can hold on to sufficient history to represent complex environments, and by being small enough that the model adapts quickly to novel environments. If the conditions in Wolfe et al. give some indication of the range of naturalistic environments an agent encounters, we have a rational account of why attentional priming is so short lived. Whether priming lasts 2 trials or 20, the surprising empirical result is that it does not last 200 or 2000 trials. Our rational argument provides a rough insight into this finding.
[FIGURE 7. (a) Human data for Wolfe et al. (2003), Experiment 1; (b) simulation; (c) misprediction of model (i.e., lower y value = better) as a function of α for five experimental conditions. Panels (a) and (b): reaction time (msec) and log(P(trial)) by target type (red or vert, red and vert) for the blocked, mixed dimension, mixed feature, and fully mixed conditions. Panel (c): log(P(trial)) against memory constant (0 to 0.98) for blocked red or vertical, blocked red and vertical, mixed feature, mixed dimension, and fully mixed.]
5 DISCUSSION The psychological literature contains two opposing accounts of attentional priming and its relation to attentional control. Huang et al. (2004) and Hillstrom (2000) propose an episodic account in which a distinct memory trace, representing the complete configuration of features in the display, is laid down for each trial, and priming depends on configural similarity of the current trial to previous trials. Alternatively, Maljkovic and Nakayama (1994) and Wolfe et al.
(2003) propose a feature-strengthening account in which detection of a feature on one trial increases its ability to attract attention on subsequent trials, and priming is proportional to the number of overlapping features from one trial to the next. The episodic account corresponds roughly to the full joint model (Figure 3b), and the feature-strengthening account corresponds roughly to the independence model (Figure 3a). Neither account is adequate to explain the range of data we presented. However, an intermediate account, the dominance model (Figure 3c), is not only sufficient, but it offers a parsimonious, rational explanation. Beyond the model’s basic assumptions, it has only one free parameter, and can explain results from diverse experimental paradigms. The model makes a further theoretical contribution. Wolfe et al. distinguish the environments in their experiment in terms of the amount of top-down control available, implying that different mechanisms might be operating in different environments. However, in our account, top-down control is not some substance distributed in different amounts depending on the nature of the environment. Our account treats all environments uniformly, relying on attentional control to adapt to the environment at hand. We conclude with two limitations of the present work. First, our account presumes a particular network architecture, instead of a more elegant Bayesian approach that specifies priors over architectures, and performs automatic model selection via the sequence of trials. We did explore such a Bayesian approach, but it was unable to explain the data. Second, at least one finding in the literature is problematic for the model. Hillstrom (2000) occasionally finds that RTs slow when an irrelevant target feature is repeated but the defining target feature is not. However, because this effect is observed only in some experiments, it is likely that any model would require elaboration to explain the variability. 
ACKNOWLEDGEMENTS We thank Jeremy Wolfe for providing the raw data from his experiment for reanalysis. This research was funded by NSF BCS Award 0339103.
REFERENCES
Hillstrom, A. P. (2000). Repetition effects in visual search. Perception & Psychophysics, 62, 800–817.
Huang, L., Holcombe, A. O., & Pashler, H. (2004). Repetition priming in visual search: Episodic retrieval, not feature priming. Memory & Cognition, 32, 12–20.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 1254–1259.
Kearns, M., & Singh, S. (1999). Finite-sample convergence rates for Q-learning and indirect algorithms. In Advances in Neural Information Processing Systems 11 (pp. 996–1002). Cambridge, MA: MIT Press.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4, 219–227.
Maljkovic, V., & Nakayama, K. (1994). Priming of pop-out: I. Role of features. Memory & Cognition, 22, 657–672.
Mozer, M. C. (1991). The perception of multiple objects: A connectionist approach. Cambridge, MA: MIT Press.
Rogers, R. D., & Monsell, S. (1995). The cost of a predictable switch between simple cognitive tasks. Journal of Experimental Psychology: General, 124, 207–231.
Wolfe, J. M. (1994). Guided Search 2.0: A revised model of visual search. Psychonomic Bulletin & Review, 1, 202–238.
Wolfe, J. M., Butcher, S. J., Lee, C., & Hyde, M. (2003). Changing your mind: On the contributions of top-down and bottom-up guidance in visual search for feature singletons. Journal of Experimental Psychology: Human Perception and Performance, 29, 483–502.
3 0.68138677 141 nips-2005-Norepinephrine and Neural Interrupts
Author: Peter Dayan, Angela J. Yu
Abstract: Experimental data indicate that norepinephrine is critically involved in aspects of vigilance and attention. Previously, we considered the function of this neuromodulatory system on a time scale of minutes and longer, and suggested that it signals global uncertainty arising from gross changes in environmental contingencies. However, norepinephrine is also known to be activated phasically by familiar stimuli in well-learned tasks. Here, we extend our uncertainty-based treatment of norepinephrine to this phasic mode, proposing that it is involved in the detection and reaction to state uncertainty within a task. This role of norepinephrine can be understood through the metaphor of neural interrupts.
4 0.55583125 146 nips-2005-On the Accuracy of Bounded Rationality: How Far from Optimal Is Fast and Frugal?
Author: Michael Schmitt, Laura Martignon
Abstract: Fast and frugal heuristics are well studied models of bounded rationality. Psychological research has proposed the take-the-best heuristic as a successful strategy in decision making with limited resources. Take-the-best searches for a sufficiently good ordering of cues (features) in a task where objects are to be compared lexicographically. We investigate the complexity of the problem of approximating optimal cue permutations for lexicographic strategies. We show that no efficient algorithm can approximate the optimum to within any constant factor, if P ≠ NP. We further consider a greedy approach for building lexicographic strategies and derive tight bounds for the performance ratio of a new and simple algorithm. This algorithm is proven to perform better than take-the-best.
5 0.49401334 169 nips-2005-Saliency Based on Information Maximization
Author: Neil Bruce, John Tsotsos
Abstract: A model of bottom-up overt attention is proposed based on the principle of maximizing information sampled from a scene. The proposed operation is based on Shannon's self-information measure and is achieved in a neural circuit, which is demonstrated as having close ties with the circuitry existent in the primate visual cortex. It is further shown that the proposed saliency measure may be extended to address issues that currently elude explanation in the domain of saliency based models. Results on natural images are compared with experimental eye tracking data revealing the efficacy of the model in predicting the deployment of overt attention as compared with existing efforts.
6 0.48964524 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
7 0.43545637 176 nips-2005-Silicon growth cones map silicon retina
8 0.42218229 3 nips-2005-A Bayesian Framework for Tilt Perception and Confidence
9 0.38417161 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
10 0.36642468 132 nips-2005-Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity
11 0.33894253 26 nips-2005-An exploration-exploitation model based on norepinepherine and dopamine activity
12 0.32792598 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
13 0.30939826 109 nips-2005-Learning Cue-Invariant Visual Responses
14 0.29920179 51 nips-2005-Correcting sample selection bias in maximum entropy density estimation
15 0.29766598 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
16 0.27888477 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects
17 0.26179197 18 nips-2005-Active Learning For Identifying Function Threshold Boundaries
18 0.25347528 180 nips-2005-Spectral Bounds for Sparse PCA: Exact and Greedy Algorithms
19 0.25101042 41 nips-2005-Coarse sample complexity bounds for active learning
20 0.230354 203 nips-2005-Visual Encoding with Jittering Eyes
topicId topicWeight
[(3, 0.046), (10, 0.038), (11, 0.011), (25, 0.018), (27, 0.03), (31, 0.029), (34, 0.071), (39, 0.077), (55, 0.032), (57, 0.014), (60, 0.023), (69, 0.079), (70, 0.309), (73, 0.033), (88, 0.04), (91, 0.048)]
simIndex simValue paperId paperTitle
same-paper 1 0.81081086 149 nips-2005-Optimal cue selection strategy
Author: Vidhya Navalpakkam, Laurent Itti
Abstract: Survival in the natural world demands the selection of relevant visual cues to rapidly and reliably guide attention towards prey and predators in cluttered environments. We investigate whether our visual system selects cues that guide search in an optimal manner. We formally obtain the optimal cue selection strategy by maximizing the signal to noise ratio (SNR) between a search target and surrounding distractors. This optimal strategy successfully accounts for several phenomena in visual search behavior, including the effect of target-distractor discriminability, uncertainty in target's features, distractor heterogeneity, and linear separability. Furthermore, the theory generates a new prediction, which we verify through psychophysical experiments with human subjects. Our results provide direct experimental evidence that humans select visual cues so as to maximize SNR between the targets and surrounding clutter.
2 0.7214607 99 nips-2005-Integrate-and-Fire models with adaptation are good enough
Author: Renaud Jolivet, Alexander Rauch, Hans-rudolf Lüscher, Wulfram Gerstner
Abstract: Integrate-and-Fire-type models are usually criticized because of their simplicity. On the other hand, the Integrate-and-Fire model is the basis of most of the theoretical studies on spiking neuron models. Here, we develop a sequential procedure to quantitatively evaluate an equivalent Integrate-and-Fire-type model based on intracellular recordings of cortical pyramidal neurons. We find that the resulting effective model is sufficient to predict the spike train of the real pyramidal neuron with high accuracy. In in vivo-like regimes, predicted and recorded traces are almost indistinguishable and a significant part of the spikes can be predicted at the correct timing. Slow processes like spike-frequency adaptation are shown to be a key feature in this context since they are necessary for the model to connect between different driving regimes.
3 0.44337341 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
Author: Robert A. Legenstein, Wolfgang Maass
Abstract: We investigate under what conditions a neuron can learn by experimentally supported rules for spike timing dependent plasticity (STDP) to predict the arrival times of strong "teacher inputs" to the same neuron. It turns out that in contrast to the famous Perceptron Convergence Theorem, which predicts convergence of the perceptron learning rule for a simplified neuron model whenever a stable solution exists, no equally strong convergence guarantee can be given for spiking neurons with STDP. But we derive a criterion on the statistical dependency structure of input spike trains which characterizes exactly when learning with STDP will converge on average for a simple model of a spiking neuron. This criterion is reminiscent of the linear separability criterion of the Perceptron Convergence Theorem, but it applies here to the rows of a correlation matrix related to the spike inputs. In addition we show through computer simulations for more realistic neuron models that the resulting analytically predicted positive learning results not only hold for the common interpretation of STDP where STDP changes the weights of synapses, but also for a more realistic interpretation suggested by experimental data where STDP modulates the initial release probability of dynamic synapses.
4 0.42104322 14 nips-2005-A Probabilistic Interpretation of SVMs with an Application to Unbalanced Classification
Author: Yves Grandvalet, Johnny Mariethoz, Samy Bengio
Abstract: In this paper, we show that the hinge loss can be interpreted as the neg-log-likelihood of a semi-parametric model of posterior probabilities. From this point of view, SVMs represent the parametric component of a semi-parametric model fitted by a maximum a posteriori estimation procedure. This connection makes it possible to derive a mapping from SVM scores to estimated posterior probabilities. Unlike previous proposals, the suggested mapping is interval-valued, providing a set of posterior probabilities compatible with each SVM score. This framework offers a new way to adapt the SVM optimization problem to unbalanced classification, when decisions result in unequal (asymmetric) losses. Experiments show improvements over state-of-the-art procedures.
5 0.41852254 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
Author: Gregory Zelinsky, Wei Zhang, Bing Yu, Xin Chen, Dimitris Samaras
Abstract: To investigate how top-down (TD) and bottom-up (BU) information is weighted in the guidance of human search behavior, we manipulated the proportions of BU and TD components in a saliency-based model. The model is biologically plausible and implements an artificial retina and a neuronal population code. The BU component is based on feature-contrast. The TD component is defined by a feature-template match to a stored target representation. We compared the model's behavior at different mixtures of TD and BU components to the eye movement behavior of human observers performing the identical search task. We found that a purely TD model provides a much closer match to human behavior than any mixture model using BU information. Only when biological constraints are removed (e.g., eliminating the retina) did a BU/TD mixture model begin to approximate human behavior.
6 0.41228265 169 nips-2005-Saliency Based on Information Maximization
7 0.40688756 96 nips-2005-Inference with Minimal Communication: a Decision-Theoretic Variational Approach
8 0.40556988 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
9 0.40553284 181 nips-2005-Spiking Inputs to a Winner-take-all Network
10 0.39573672 200 nips-2005-Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery
11 0.39477697 98 nips-2005-Infinite latent feature models and the Indian buffet process
12 0.39156127 58 nips-2005-Divergences, surrogate loss functions and experimental design
13 0.39060789 177 nips-2005-Size Regularized Cut for Data Clustering
14 0.39015743 28 nips-2005-Analyzing Auditory Neurons by Learning Distance Functions
15 0.39008057 30 nips-2005-Assessing Approximations for Gaussian Process Classification
16 0.38875648 90 nips-2005-Hot Coupling: A Particle Approach to Inference and Normalization on Pairwise Undirected Graphs
17 0.38863665 34 nips-2005-Bayesian Surprise Attracts Human Attention
18 0.38841647 72 nips-2005-Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation
19 0.38742611 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
20 0.38718328 132 nips-2005-Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity