nips nips2007 nips2007-85 knowledge-graph by maker-knowledge-mining

85 nips-2007-Experience-Guided Search: A Theory of Attentional Control


Source: pdf

Author: David Baldwin, Michael C. Mozer

Abstract: People perform a remarkable range of tasks that require search of the visual environment for a target item among distractors. The Guided Search model (Wolfe, 1994, 2007), or GS, is perhaps the best developed psychological account of human visual search. To prioritize search, GS assigns saliency to locations in the visual field. Saliency is a linear combination of activations from retinotopic maps representing primitive visual features. GS includes heuristics for setting the gain coefficient associated with each map. Variants of GS have formalized the notion of optimization as a principle of attentional control (e.g., Baldwin & Mozer, 2006; Cave, 1999; Navalpakkam & Itti, 2006; Rao et al., 2002), but every GS-like model must be ’dumbed down’ to match human data, e.g., by corrupting the saliency map with noise and by imposing arbitrary restrictions on gain modulation. We propose a principled probabilistic formulation of GS, called Experience-Guided Search (EGS), based on a generative model of the environment that makes three claims: (1) Feature detectors produce Poisson spike trains whose rates are conditioned on feature type and whether the feature belongs to a target or distractor; (2) the environment and/or task is nonstationary and can change over a sequence of trials; and (3) a prior specifies that features are more likely to be present for target than for distractors. Through experience, EGS infers latent environment variables that determine the gains for guiding search. Control is thus cast as probabilistic inference, not optimization. We show that EGS can replicate a range of human data from visual search, including data that GS does not address. 1

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 People perform a remarkable range of tasks that require search of the visual environment for a target item among distractors. [sent-4, score-0.556]

2 To prioritize search, GS assigns saliency to locations in the visual field. [sent-6, score-0.374]

3 Saliency is a linear combination of activations from retinotopic maps representing primitive visual features. [sent-7, score-0.21]

4 GS-like models must be 'dumbed down' to match human data, e.g., by corrupting the saliency map with noise and by imposing arbitrary restrictions on gain modulation. [sent-14, score-0.34]

5 The flexibility of the human visual system stems from the endogenous (or internal) control of attention, which allows for processing resources to be directed to task-relevant regions and objects in the visual field. [sent-21, score-0.273]

6 To what sort of features of the visual environment can attention be directed? [sent-23, score-0.23]

7 Visual search has traditionally been studied in the laboratory using cluttered stimulus displays containing artificial objects. [sent-25, score-0.177]

8 For example, an experimental task might be to search for a red vertical line segment—the target—among green verticals and red horizontals—the distractors. [sent-27, score-0.541]

9 Performance is typically evaluated as the response latency to detect the presence or absence of a target with high accuracy. [sent-28, score-0.22]

10 An inefficient search can often require an additional 25–35 ms/item (or more, if eye movements are required). [sent-30, score-0.179]

11 Many computational models of visual search have been proposed to explain data from the burgeoning experimental literature. [sent-31, score-0.217]

12 As depicted in Figure 1, GS posits that primitive visual features are detected across the retina in parallel along dimensions such as color and orientation, yielding a set of feature activity maps. [sent-37, score-0.296]

13 The bottom-up activations from all feature maps are combined to form a saliency map in which activation at a location indicates the priority of that location for the task at hand. [sent-43, score-0.613]

14 GS supposes that response latency is linear in the number of locations that need to be searched before a target is found. [sent-45, score-0.267]

15 (The model includes rules for terminating search if no target is found after a reasonable amount of effort.) [sent-46, score-0.278]

16 Consider the task of searching for a red vertical bar among green vertical bars and red horizontal bars. [sent-47, score-0.575]

17 Ideally, attention should be drawn to red and vertical bars, not to green or horizontal bars. [sent-48, score-0.304]

18 To allow for guidance of attention, GS posits that a weight or top-down gain is associated with each feature map, and the contribution of a given feature map to the saliency map is scaled by its gain. [sent-49, score-0.508]

19 If the gains on the red and vertical maps are set to 1, and the gains on green and horizontal maps are set to 0, then a target (red vertical) will have two units of activation in the saliency map, whereas each distractor (red horizontal or green vertical) will have only one unit of activation. [sent-53, score-1.188]
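To make the gain arithmetic above concrete, here is a minimal sketch of the gain-weighted saliency computation; the binary feature values and the item set are illustrative choices, not taken from the paper.

```python
# Gain-weighted saliency for the conjunction-search example above.
# Feature activations are idealized as 0/1; real feature maps are graded.
items = {
    "target (red vertical)":       {"red": 1, "green": 0, "vertical": 1, "horizontal": 0},
    "distractor (green vertical)": {"red": 0, "green": 1, "vertical": 1, "horizontal": 0},
    "distractor (red horizontal)": {"red": 1, "green": 0, "vertical": 0, "horizontal": 1},
}
gains = {"red": 1.0, "green": 0.0, "vertical": 1.0, "horizontal": 0.0}

for name, features in items.items():
    saliency = sum(gains[f] * a for f, a in features.items())
    print(f"{name}: saliency = {saliency}")
# target (red vertical): saliency = 2.0
# distractor (green vertical): saliency = 1.0
# distractor (red horizontal): saliency = 1.0
```

Without noise, the target would always rank first; the noise corruption discussed below is what brings the model's search efficiency down to human levels.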

20 Because the target is the most salient item and GS assumes that response time is monotonically related to the saliency ranking of the target, the target should be located quickly, in a time independent of the number of distractors. [sent-54, score-0.666]

21 In contrast, human response times increase linearly with the number of distractors in conjunction search. [sent-55, score-0.321]

22 To reduce search efficiency, GS assumes noise corruption of the saliency map. [sent-56, score-0.428]

23 Baldwin and Mozer (2006) also require noise corruption for the same reason, although the corruption is applied to the low-level feature representation, not the saliency map. [sent-58, score-0.402]

24 To further reduce search efficiency, GS includes a complex set of rules that limit gain control. [sent-60, score-0.18]

25 For example, gain modulation is allowed for only one feature map per dimension. [sent-61, score-0.192]

26 Baldwin and Mozer (2006) impose the restriction $\sum_i |g_i - 1| < c$, where $g_i$ is the gain of feature map $i$ and $c$ is a constant. [sent-65, score-0.181]

27 Gain tuning is cast as an optimization problem: the goal of the model is to adjust the gains so as to maximize the target saliency relative to distractor saliency for the task at hand. [sent-68, score-0.946]

28 Baldwin and Mozer (2006) define the criterion in terms of the target saliency ranking. [sent-69, score-0.388]

29 Navalpakkam and Itti (2006) use the expected target to distractor saliency ratio. [sent-70, score-0.611]

30 Although limitations on gain modulation might be neurobiologically rationalized, a more elegant account would characterize these limitations in terms of trade-offs: constraints on gain modulation may limit performance, but they yield certain benefits. [sent-77, score-0.207]

31 (3) In GS, attentional control is achieved by tuning gains to optimize performance. [sent-79, score-0.183]

32 In contrast, our model is designed to infer the structure of its environment through experience, and gain modulation is a byproduct of this inference. [sent-80, score-0.178]

33 Our approach begins with the premise that attention is fundamentally task based: a location in the visual field is salient if a target is likely at that location. [sent-82, score-0.386]

34 We define saliency as the target probability, $P(T_x = 1 \mid F_x)$, where $F_x$ is the local feature activity vector at retinal location $x$ and $T_x$ is a binary random variable indicating whether location $x$ contains a target. [sent-83, score-0.608]

35 (2006) and Zhang and Cottrell (submitted) have also suggested that saliency should reflect target probability, although they propose approaches to computing the target probability very different from ours. [sent-85, score-0.535]

36 Our approach is to compute the target probability using statistics obtained from recent experience performing the task. [sent-86, score-0.168]

37 We propose a generative model of the environment in which $F_{xi}$ is a binomial random variable, $F_{xi} \mid \{T_x = t, \rho\} \sim \mathrm{Binomial}(\rho_{it}, n)$, where a spike rate $\rho_{it}$ is associated with feature $i$ for target ($t = 1$) and nontarget ($t = 0$) items. [sent-91, score-0.37]
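As a sanity check on this generative story, the sketch below draws feature activations from the stated binomial model; the spike rates and the interval count $n$ are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 20  # number of time intervals per trial (hypothetical value)
# rho[i, t]: spike rate of feature i for item type t (0 = distractor, 1 = target).
# Illustrative rates, not fit to any data.
rho = np.array([[0.2, 0.8],   # feature 0: much more active for targets
                [0.5, 0.5]])  # feature 1: uninformative

def sample_activations(is_target: int) -> np.ndarray:
    """Draw F_xi ~ Binomial(rho_it, n) for every feature i at one location."""
    return rng.binomial(n, rho[:, is_target])

print("target location:    ", sample_activations(1))
print("distractor location:", sample_activations(0))
```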

38 Because of the logistic relationship, $P(T_x \mid F_x, \rho)$ is monotonic in $r_x + \frac{n}{2} s_x$. [sent-96, score-0.24]

39 Consequently, if attentional priority is given to locations in order of their target probability, $P(T_x \mid F_x, \rho)$, then it is equivalent to rank using $r_x + \frac{n}{2} s_x$. [sent-97, score-0.434]

40 Further, if we assume that the target is equally likely in any location, then $r_x$ is constant across locations, and $s_x$ can substitute for $P(T_x \mid F_x, \rho)$ as an equivalent measure of saliency. [sent-98, score-0.387]
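One way to see where this monotonicity could come from — our reconstruction under a Gaussian approximation to the binomial likelihood, not a derivation reproduced in this summary — is:

$$\log \frac{P(T_x = 1 \mid F_x, \rho)}{P(T_x = 0 \mid F_x, \rho)} \;\approx\; r_x + \frac{n}{2} \sum_i \left[ \frac{(\tilde{f}_{xi} - \rho_{i0})^2}{\rho_{i0}(1 - \rho_{i0})} - \frac{(\tilde{f}_{xi} - \rho_{i1})^2}{\rho_{i1}(1 - \rho_{i1})} \right],$$

where $r_x$ collects the prior log odds. On this reading, the bracketed sum plays the role of $s_x$: it rewards distance from the distractor statistics, penalizes distance from the target statistics, scales each term by the binomial variance $\rho(1-\rho)$, and is quadratic in $\tilde{f}_{xi}$ — all of which matches the description of $s_x$ in the sentences that follow.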

41 Saliency at a location increases if feature $i$'s activity is distant from the mean activity observed in the past for a distractor ($\rho_{i0}$) and decreases if feature $i$'s activity is distant from the mean activity observed in the past for a target ($\rho_{i1}$). [sent-100, score-0.728]

42 These saliency increases (decreases) are scaled by the variance of the distractor (target) activities, such that high-variance features have less impact on saliency. [sent-101, score-0.49]
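Equation 2 itself is not reproduced in this summary, so the sketch below implements the verbal description above (and the Gaussian-approximation reading sketched earlier): squared distance from the distractor statistics minus squared distance from the target statistics, each term down-weighted by its variance. All numbers are hypothetical.

```python
import numpy as np

def saliency(f, mu0, var0, mu1, var1):
    """s_x as described above: activity far from the distractor statistics
    raises saliency, activity far from the target statistics lowers it, and
    high-variance features contribute less. A reading of the text, not the
    paper's exact Equation 2."""
    return float(np.sum((f - mu0) ** 2 / var0 - (f - mu1) ** 2 / var1))

# Hypothetical per-feature statistics: means of past distractor (0) and
# target (1) activations for two features, with binomial-style variances.
mu0, mu1 = np.array([0.2, 0.5]), np.array([0.8, 0.5])
var0 = mu0 * (1 - mu0)  # rho * (1 - rho)
var1 = mu1 * (1 - mu1)

print(saliency(np.array([0.8, 0.5]), mu0, var0, mu1, var1))  # target-like: high
print(saliency(np.array([0.2, 0.5]), mu0, var0, mu1, var1))  # distractor-like: low
```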

43 Expanding the numerator terms in the definition of $s_x$ (Equation 2), we observe that $s_x$ can be written as a linear combination of terms involving the feature activities, $\tilde{f}_{xi}$, and the squared activities, $\tilde{f}_{xi}^2$ (along with a constant term that can be ignored for ranking by saliency). [sent-102, score-1.132]

44 The saliency measure $s_x$ in EGS is thus quite similar to the saliency measure in GS, $s^{\mathrm{GS}}_x = \sum_i g_i \tilde{f}_{xi}$. [sent-103, score-1.034]

45 The differences are: first, EGS incorporates quadratic terms; and second, the gain coefficients of EGS are not free parameters but are derived from statistics of targets and distractors in the current task and stimulus environment. [sent-104, score-0.357]

46 1 Uncertainty in the Environment Statistics The model parameters, ρ, could be maximum likelihood estimates obtained by observing target and distractor activations over a series of trials. [sent-107, score-0.424]

47 That is, suppose that each item in the display is identified as a target or distractor. [sent-108, score-0.273]

48 The set of activations of feature i at all locations containing a target could be used to estimate ρi1 , and likewise with locations containing a distractor to estimate ρi0 . [sent-109, score-0.59]

49 Because feature spike rates lie in $[0, 1]$, we define $\rho_{it}$ as a beta random variable, $\rho_{it} \sim \mathrm{Beta}(\alpha_{it}, \beta_{it})$. [sent-111, score-0.161]

50 For example, in the absence of any task experience, a conservative assumption is that all feature activations are predictive of a target. [sent-113, score-0.165]

51 When $\alpha_{it}$ and $\beta_{it}$ are large, the distribution of $\rho_{it}$ is sharply peaked, and $s_x$ approaches $\bar{s}_x$, the saliency computed with the point estimate $\bar{\rho}_{it} = \alpha_{it}/(\alpha_{it} + \beta_{it})$. [sent-121, score-0.388]

52 When this condition is satisfied, ranking by $\bar{s}_x$ is equivalent to ranking by $P(T_x \mid F_x)$. [sent-122, score-0.238]

53 Indeed, in our simulations, we find that defining saliency as either $s_x$ or $\bar{s}_x$ yields similar results, reinforcing the robustness of our approach. [sent-124, score-0.629]

54 We assume that following a trial, each item in the display has been identified as either a target or distractor. [sent-128, score-0.273]

55 We earlier characterized $F_{xi}$ as a binomial random variable reflecting a spike count; that is, during $n$ time intervals, $f_{xi}$ spikes are observed. [sent-131, score-0.391]

56 Given the prior distribution $\rho_{it} \sim \mathrm{Beta}(\alpha_{it}, \beta_{it})$, the posterior is $\rho_{it} \mid F_{xi} \sim \mathrm{Beta}(\alpha_{it} + f_{xi}, \beta_{it} + n - f_{xi})$. [sent-133, score-0.65]

57 When all locations are considered, the resulting posterior is $\rho_{it} \mid F_i \sim \mathrm{Beta}\bigl(\alpha_{it} + \sum_{x \in \chi_t} \tilde{f}_{xi},\ \beta_{it} + \sum_{x \in \chi_t} (1 - \tilde{f}_{xi})\bigr)$ (Equation 4), where $F_i$ is feature map $i$ and $\chi_t$ is the set of locations containing elements of type $t$. [sent-135, score-0.843]

58 We thus consider a switching model of the environment, which specifies that with probability $\lambda$ the environment changes and all accumulated evidence should be discarded. [sent-138, score-0.19]

59 The update rule we use is therefore $\rho_{it} \mid F_i \sim \mathrm{Beta}\bigl(\lambda \alpha^0_{it} + (1 - \lambda)\,\alpha_{it} + \sum_{x \in \chi_t} \tilde{f}_{xi},\ \lambda \beta^0_{it} + (1 - \lambda)\,\beta_{it} + \sum_{x \in \chi_t} (1 - \tilde{f}_{xi})\bigr)$ (Equation 5), where $\alpha^0_{it}$ and $\beta^0_{it}$ are the prior (reset) values. [sent-143, score-0.65]
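A minimal sketch of the conjugate update in Equations 4–5, under the reconstruction above (in particular, treating $\alpha^0_{it}$, $\beta^0_{it}$ as the prior values toward which the statistics decay); the array layout and the default priors are our choices, not the authors'.

```python
import numpy as np

def update_beta_stats(alpha, beta, f, is_target, lam, alpha0=1.0, beta0=1.0):
    """One-trial update of the Beta statistics, per Equations 4-5 as
    reconstructed above. `alpha`, `beta`: arrays of shape (features, 2),
    column t = 0 for distractors, t = 1 for targets. `f`: (locations,
    features) array of rates in [0, 1]; `is_target`: boolean mask over
    locations; `lam`: switching probability; `alpha0`, `beta0`: reset
    priors (placeholder values). A sketch, not the authors' code."""
    for t, mask in enumerate([~is_target, is_target]):
        ev_a = f[mask].sum(axis=0)          # sum of f~_xi over type-t locations
        ev_b = (1.0 - f[mask]).sum(axis=0)  # sum of (1 - f~_xi)
        alpha[:, t] = lam * alpha0 + (1 - lam) * alpha[:, t] + ev_a
        beta[:, t]  = lam * beta0  + (1 - lam) * beta[:, t]  + ev_b
    return alpha, beta
# The point estimate used for saliency is then alpha / (alpha + beta).
```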

60 Simulation Methodology. We present a step-by-step description of how the model runs to simulate experimental subjects performing a visual search task. [sent-144, score-0.217]

61 (1) Feature extraction is performed on the display to obtain firing rates $\tilde{f}_{xi}$ for each location $x$ and feature type $i$. [sent-148, score-0.528]

62 (2) Saliency, $\bar{s}_x$, is computed for each location according to Equation 3. [sent-149, score-0.248]

63 (3) The saliency rank of each location is assessed, and the number of locations that need to be searched in order to identify the target is assumed to be equal to the target rank. [sent-150, score-0.636]

64 (4) Following each trial, the target and distractor statistics, $\alpha_{it}$ and $\beta_{it}$, are updated according to Equation 5. [sent-152, score-0.37]
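Putting the four steps together, one trial of the simulation might look like the following sketch, reusing the `saliency()` and `update_beta_stats()` helpers sketched earlier; all parameter values are hypothetical.

```python
import numpy as np

def beta_moments(a, b):
    """Mean and variance of Beta(a, b), used as the rho statistics."""
    return a / (a + b), a * b / ((a + b) ** 2 * (a + b + 1))

def run_trial(display, target_index, alpha, beta, lam=0.05):
    """Steps (1)-(4) above for one trial. `display`: (locations, features)
    array of rates f~_xi, assumed already extracted (step 1)."""
    mu0, var0 = beta_moments(alpha[:, 0], beta[:, 0])  # distractor statistics
    mu1, var1 = beta_moments(alpha[:, 1], beta[:, 1])  # target statistics
    # (2) saliency of each location from the current statistics.
    s = np.array([saliency(f, mu0, var0, mu1, var1) for f in display])
    # (3) locations searched = saliency rank of the target (1 = most salient).
    rank = 1 + int(np.sum(s > s[target_index]))
    # (4) update the statistics for the next trial (Equation 5).
    is_target = np.zeros(len(display), dtype=bool)
    is_target[target_index] = True
    alpha, beta = update_beta_stats(alpha, beta, display, is_target, lam)
    return rank, alpha, beta
```

Note that this sketch feeds Beta-posterior variances into the saliency function rather than the binomial $\rho(1-\rho)$; either choice fits the hedged reading above.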

65 Because we are focused on the issue of attentional control, we wanted to sidestep other issues, such as feature extraction. [sent-156, score-0.164]

66 We treat the activation produced by GS for feature $i$ at location $x$ as the firing rate $\tilde{f}_{xi}$ needed to simulate EGS. [sent-161, score-0.479]

67 Like GS, the response time of EGS is linear in the target ranking. [sent-162, score-0.18]

68 To implement EGS, we simply removed much of the complexity of GS—including the distinction between bottom-up and top-down weights, heuristics for setting the weights, and the injection of high-amplitude noise into the saliency map—and replaced it with Equations 3 and 5. [sent-171, score-0.264]

69 The latter condition yields $E[\rho_{i1}] > E[\rho_{i0}]$ for all $i$, and corresponds to the bias that features are more likely to be present for a target than for a distractor. [sent-180, score-0.173]

70 The first four involve displays of a homogeneous color, and search for a target orientation among distractors of different orientations. [sent-192, score-0.584]

71 Task A explores search for a vertical (defined as 0◦) target among homogeneous distractors of a different orientation. [sent-193, score-0.662]

72 The graph plots the slope of the line relating display size to response latency, as a function of the distractor orientation. [sent-194, score-0.364]

73 Task B explores search for a target among two types of distractors as a function of display size. [sent-196, score-0.609]

74 The distractors are 100◦ apart, and the target is 40◦ and 60◦ from the distractors, but in one case the target differs from the distractors in that it is the only nearly vertical item, allowing pop out via the vertical feature detector. [sent-197, score-0.988]

75 Task C examines search efficiency for a target among heterogeneous distractors, for two target orientations and two degrees of target-distractor similarity. [sent-199, score-0.457]

76 Search is more efficient when the target and distractors are dissimilar. [sent-200, score-0.345]

77 Task D explores an asymmetry in search: it is more efficient to find a tilted bar among verticals than a vertical bar among tilted ones. [sent-202, score-0.285]

78 This effect arises from the same mechanism that yielded efficient search in task B: a unique feature is highly activated when the target is tilted but not when it is vertical. [sent-203, score-0.415]

79 In EGS, search is better guided to features that are present than to features that are absent, due to the $\rho$ priors. [sent-204, score-0.272]

80 The target is a red vertical among green vertical and red tilted distractors. [sent-206, score-0.634]

81 Both distractor environments yield inefficient search, but—consistent with human data—conjunction searches can vary in their relative difficulty. [sent-210, score-0.267]

82 Task F examines search efficiency for a red vertical among red 60◦ and yellow vertical distractors, as a function of the ratio of the two distractor types. [sent-211, score-0.768]

83 The result shows that search can be guided: response times become faster as either the target color or target orientation becomes sparse, because a relatively unique feature serves as a reliable cue to the target. [sent-212, score-0.607]

84 Figure 3a depicts how EGS adapts differently for the extreme conditions in which the distractors are mostly vertical (dark bars) or mostly red (light bars). [sent-213, score-0.455]

85 When distractors are mostly vertical, the red feature is a better cue, and vice versa. [sent-216, score-0.39]

86 To summarize, EGS predicts the key factors in visual search that determine search efficiency. [sent-227, score-0.348]

87 Most efficient search is for a target defined by the presence of a single categorical feature among homogeneous distractors that do not share the categorical feature. [sent-228, score-0.666]

88 Least efficient search is for target and distractors that share features… [sent-229, score-0.52]

89 …(e.g., T among L’s, or red verticals among red horizontals and green verticals) and/or when distractors are heterogeneous. [sent-231, score-0.552]

90 This experiment, which oddly has never been modeled by GS, involves search for a conjunction target defined by a triple of features. [sent-233, score-0.324]

91 The target might be presented among heterogeneous distractors that share two features with it, such as a big red horizontal bar, or distractors that share only one feature with it, such as a small green vertical bar. [sent-236, score-0.98]

92 Performance in these two conditions, denoted T3-D2 and T3-D1, respectively, is compared to performance in a standard conjunction search task, denoted T2-D1, involving targets defined by two features and sharing one feature with each distractor. [sent-237, score-0.298]

93 Wolfe et al. reasoned that if search can be guided, saliency at a location should be proportional to the number of target-relevant features at that location, and the ratio of target to distractor salience should be $x/y$ in condition Tx-Dy. [sent-239, score-0.822]

94 Because x > y, the target is always more salient than any distractor, but GS assumes less efficient search due to noise corruption of the saliency map, thereby predicting search slopes that are inversely related to x/y. [sent-240, score-0.768]
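The $x/y$ reasoning can be checked with a line of arithmetic; the condition definitions are as stated above.

```python
# Target-to-distractor salience ratio x/y per condition Tx-Dy: x target-
# relevant features at the target, y shared features at each distractor.
conditions = {"T2-D1": (2, 1), "T3-D1": (3, 1), "T3-D2": (3, 2)}
for name, (x, y) in sorted(conditions.items(), key=lambda kv: -kv[1][0] / kv[1][1]):
    print(f"{name}: salience ratio = {x / y:.2f}")
# T3-D1: salience ratio = 3.00  <- flattest predicted search slope
# T2-D1: salience ratio = 2.00
# T3-D2: salience ratio = 1.50  <- steepest predicted search slope
```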

95 The human data show exactly this pattern, producing almost flat search slopes for T3-D1. [sent-241, score-0.21]

[Figure 3: (a) learned gains for the vertical and red features when distractors are mostly vertical (dark bars) vs. mostly red (light bars); (b) EGS reaction time as a function of display size on the triple-conjunction task of Wolfe, Cave, & Franzel (1989), for conditions T3-D2, T2-D1, and T3-D1.]

98 Discussion. We presented a model, EGS, that guides visual search via statistics collected over the course of experience in a task environment. [sent-252, score-0.316]

99 If the environment can change from one trial to the next, the cognitive system does well not to turn up gains on one feature dimension at the expense of other feature dimensions. [sent-261, score-0.314]

100 The result is a sensible trade-off: attentional control can be rapidly tuned as the task or environment changes, but this flexibility restricts EGS’s search efficiency when the task and environment remain constant. [sent-262, score-0.507]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('egs', 0.453), ('gs', 0.425), ('fxi', 0.325), ('saliency', 0.241), ('distractor', 0.223), ('distractors', 0.198), ('sx', 0.194), ('tx', 0.192), ('wolfe', 0.157), ('target', 0.147), ('search', 0.131), ('vertical', 0.104), ('fx', 0.102), ('msec', 0.101), ('attentional', 0.092), ('guided', 0.089), ('red', 0.087), ('mozer', 0.086), ('visual', 0.086), ('environment', 0.085), ('baldwin', 0.081), ('display', 0.077), ('feature', 0.072), ('cave', 0.07), ('itti', 0.069), ('rt', 0.06), ('navalpakkam', 0.055), ('gains', 0.055), ('activations', 0.054), ('location', 0.054), ('beta', 0.05), ('gain', 0.049), ('item', 0.049), ('green', 0.047), ('locations', 0.047), ('verticals', 0.046), ('conjunction', 0.046), ('rx', 0.046), ('modulation', 0.044), ('human', 0.044), ('maps', 0.044), ('latency', 0.04), ('activity', 0.04), ('task', 0.039), ('spike', 0.039), ('control', 0.036), ('franzel', 0.035), ('sandon', 0.035), ('slopes', 0.035), ('gi', 0.033), ('mostly', 0.033), ('attention', 0.033), ('response', 0.033), ('horizontal', 0.033), ('corruption', 0.033), ('among', 0.032), ('slope', 0.031), ('orientation', 0.03), ('trial', 0.03), ('simulation', 0.029), ('activation', 0.028), ('binomial', 0.027), ('map', 0.027), ('salient', 0.027), ('color', 0.026), ('stimulus', 0.026), ('tasks', 0.026), ('features', 0.026), ('tilted', 0.026), ('primitive', 0.026), ('homogeneous', 0.026), ('movements', 0.025), ('explores', 0.024), ('activities', 0.024), ('six', 0.024), ('cottrell', 0.023), ('executive', 0.023), ('horizontals', 0.023), ('nondeterminism', 0.023), ('eye', 0.023), ('targets', 0.023), ('noise', 0.023), ('free', 0.022), ('ranking', 0.022), ('directed', 0.021), ('categorical', 0.021), ('experience', 0.021), ('bars', 0.021), ('cue', 0.021), ('submitted', 0.021), ('elegant', 0.021), ('bar', 0.021), ('posits', 0.02), ('replication', 0.02), ('displays', 0.02), ('evidence', 0.02), ('trials', 0.02), ('pop', 0.018), ('consequently', 0.018), ('share', 0.018)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0 85 nips-2007-Experience-Guided Search: A Theory of Attentional Control

Author: David Baldwin, Michael C. Mozer


2 0.24811758 202 nips-2007-The discriminant center-surround hypothesis for bottom-up saliency

Author: Dashan Gao, Vijay Mahadevan, Nuno Vasconcelos

Abstract: The classical hypothesis, that bottom-up saliency is a center-surround process, is combined with a more recent hypothesis that all saliency decisions are optimal in a decision-theoretic sense. The combined hypothesis is denoted as discriminant center-surround saliency, and the corresponding optimal saliency architecture is derived. This architecture equates the saliency of each image location to the discriminant power of a set of features with respect to the classification problem that opposes stimuli at center and surround, at that location. It is shown that the resulting saliency detector makes accurate quantitative predictions for various aspects of the psychophysics of human saliency, including non-linear properties beyond the reach of previous saliency models. Furthermore, it is shown that discriminant center-surround saliency can be easily generalized to various stimulus modalities (such as color, orientation and motion), and provides optimal solutions for many other saliency problems of interest for computer vision. Optimal solutions, under this hypothesis, are derived for a number of the former (including static natural images, dense motion fields, and even dynamic textures), and applied to a number of the latter (the prediction of human eye fixations, motion-based saliency in the presence of ego-motion, and motion-based saliency in the presence of highly dynamic backgrounds). In result, discriminant saliency is shown to predict eye fixations better than previous models, and produces background subtraction algorithms that outperform the state-of-the-art in computer vision. 1

3 0.1578909 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection

Author: Moran Cerf, Jonathan Harel, Wolfgang Einhaeuser, Christof Koch

Abstract: Under natural viewing conditions, human observers shift their gaze to allocate processing resources to subsets of the visual input. Many computational models try to predict such voluntary eye and attentional shifts. Although the important role of high level stimulus properties (e.g., semantic information) in search stands undisputed, most models are based on low-level image properties. We here demonstrate that a combined model of face detection and low-level saliency significantly outperforms a low-level model in predicting locations humans fixate on, based on eye-movement recordings of humans observing photographs of natural scenes, most of which contained at least one person. Observers, even when not instructed to look for anything particular, fixate on a face with a probability of over 80% within their first two fixations; furthermore, they exhibit more similar scanpaths when faces are present. Remarkably, our model’s predictive performance in images that do not contain faces is not impaired, and is even improved in some cases by spurious face detector responses. 1

4 0.064051196 110 nips-2007-Learning Bounds for Domain Adaptation

Author: John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, Jennifer Wortman

Abstract: Empirical risk minimization offers well-known learning guarantees when training and test data come from the same domain. In the real world, though, we often wish to adapt a classifier from a source domain with a large amount of training data to different target domain with very little training data. In this work we give uniform convergence bounds for algorithms that minimize a convex combination of source and target empirical risk. The bounds explicitly model the inherent trade-off between training on a large but inaccurate source data set and a small but accurate target training set. Our theory also gives results when we have multiple source domains, each of which may have a different number of instances, and we exhibit cases in which minimizing a non-uniform combination of source risks can achieve much lower target error than standard empirical risk minimization. 1

5 0.061815865 154 nips-2007-Predicting Brain States from fMRI Data: Incremental Functional Principal Component Regression

Author: Sennay Ghebreab, Arnold Smeulders, Pieter Adriaans

Abstract: We propose a method for reconstruction of human brain states directly from functional neuroimaging data. The method extends the traditional multivariate regression analysis of discretized fMRI data to the domain of stochastic functional measurements, facilitating evaluation of brain responses to complex stimuli and boosting the power of functional imaging. The method searches for sets of voxel time courses that optimize a multivariate functional linear model in terms of R2 statistic. Population based incremental learning is used to identify spatially distributed brain responses to complex stimuli without attempting to localize function first. Variation in hemodynamic lag across brain areas and among subjects is taken into account by voxel-wise non-linear registration of stimulus pattern to fMRI data. Application of the method on an international test benchmark for prediction of naturalistic stimuli from new and unknown fMRI data shows that the method successfully uncovers spatially distributed parts of the brain that are highly predictive of a given stimulus. 1

6 0.051489245 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events

7 0.048008192 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding

8 0.044887222 36 nips-2007-Better than least squares: comparison of objective functions for estimating linear-nonlinear models

9 0.043196782 74 nips-2007-EEG-Based Brain-Computer Interaction: Improved Accuracy by Automatic Single-Trial Error Detection

10 0.041426715 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data

11 0.040997524 33 nips-2007-Bayesian Inference for Spiking Neuron Models with a Sparsity Prior

12 0.039887778 124 nips-2007-Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning

13 0.039610699 104 nips-2007-Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes

14 0.039574165 140 nips-2007-Neural characterization in partially observed populations of spiking neurons

15 0.038812742 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data

16 0.037825219 177 nips-2007-Simplified Rules and Theoretical Analysis for Information Bottleneck Optimization and PCA with Spiking Neurons

17 0.037393458 125 nips-2007-Markov Chain Monte Carlo with People

18 0.036935225 164 nips-2007-Receptive Fields without Spike-Triggering

19 0.035701141 64 nips-2007-Cooled and Relaxed Survey Propagation for MRFs

20 0.035201218 100 nips-2007-Hippocampal Contributions to Control: The Third Way


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.124), (1, 0.042), (2, 0.059), (3, -0.039), (4, 0.001), (5, 0.158), (6, 0.044), (7, 0.13), (8, -0.012), (9, -0.107), (10, -0.084), (11, -0.079), (12, -0.011), (13, 0.102), (14, -0.193), (15, -0.041), (16, 0.055), (17, 0.133), (18, 0.035), (19, -0.0), (20, -0.037), (21, -0.243), (22, -0.088), (23, -0.107), (24, 0.173), (25, 0.032), (26, 0.044), (27, 0.058), (28, -0.019), (29, -0.038), (30, 0.08), (31, -0.018), (32, -0.028), (33, -0.011), (34, -0.063), (35, -0.021), (36, 0.039), (37, 0.039), (38, -0.01), (39, -0.033), (40, -0.049), (41, -0.009), (42, -0.093), (43, -0.002), (44, -0.029), (45, 0.126), (46, -0.037), (47, -0.003), (48, -0.041), (49, -0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93806154 85 nips-2007-Experience-Guided Search: A Theory of Attentional Control

Author: David Baldwin, Michael C. Mozer


2 0.91792136 202 nips-2007-The discriminant center-surround hypothesis for bottom-up saliency

Author: Dashan Gao, Vijay Mahadevan, Nuno Vasconcelos


3 0.75278527 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection

Author: Moran Cerf, Jonathan Harel, Wolfgang Einhaeuser, Christof Koch


4 0.35992786 57 nips-2007-Congruence between model and human attention reveals unique signatures of critical visual events

Author: Robert Peters, Laurent Itti

Abstract: Current computational models of bottom-up and top-down components of attention are predictive of eye movements across a range of stimuli and of simple, fixed visual tasks (such as visual search for a target among distractors). However, to date there exists no computational framework which can reliably mimic human gaze behavior in more complex environments and tasks, such as driving a vehicle through traffic. Here, we develop a hybrid computational/behavioral framework, combining simple models for bottom-up salience and top-down relevance, and looking for changes in the predictive power of these components at different critical event times during 4.7 hours (500,000 video frames) of observers playing car racing and flight combat video games. This approach is motivated by our observation that the predictive strengths of the salience and relevance models exhibit reliable temporal signatures during critical event windows in the task sequence—for example, when the game player directly engages an enemy plane in a flight combat game, the predictive strength of the salience model increases significantly, while that of the relevance model decreases significantly. Our new framework combines these temporal signatures to implement several event detectors. Critically, we find that an event detector based on fused behavioral and stimulus information (in the form of the model’s predictive strength) is much stronger than detectors based on behavioral information alone (eye position) or image information alone (model prediction maps). This approach to event detection, based on eye tracking combined with computational models applied to the visual input, may have useful applications as a less-invasive alternative to other event detection approaches based on neural signatures derived from EEG or fMRI recordings. 1

5 0.28144166 3 nips-2007-A Bayesian Model of Conditioned Perception

Author: Alan Stocker, Eero P. Simoncelli

Abstract: unknown-abstract

6 0.22494632 89 nips-2007-Feature Selection Methods for Improving Protein Structure Prediction with Rosetta

7 0.22075857 35 nips-2007-Bayesian binning beats approximate alternatives: estimating peri-stimulus time histograms

8 0.21950102 114 nips-2007-Learning and using relational theories

9 0.21205263 110 nips-2007-Learning Bounds for Domain Adaptation

10 0.20341878 17 nips-2007-A neural network implementing optimal state estimation based on dynamic spike train decoding

11 0.2000303 167 nips-2007-Regulator Discovery from Gene Expression Time Series of Malaria Parasites: a Hierachical Approach

12 0.19039348 211 nips-2007-Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data

13 0.18992533 6 nips-2007-A General Boosting Method and its Application to Learning Ranking Functions for Web Search

14 0.18651542 83 nips-2007-Evaluating Search Engines by Modeling the Relationship Between Relevance and Clicks

15 0.18634009 36 nips-2007-Better than least squares: comparison of objective functions for estimating linear-nonlinear models

16 0.1854202 100 nips-2007-Hippocampal Contributions to Control: The Third Way

17 0.18156743 205 nips-2007-Theoretical Analysis of Learning with Reward-Modulated Spike-Timing-Dependent Plasticity

18 0.18059927 203 nips-2007-The rat as particle filter

19 0.18032087 104 nips-2007-Inferring Neural Firing Rates from Spike Trains Using Gaussian Processes

20 0.17759019 93 nips-2007-GRIFT: A graphical model for inferring visual classification features from human data


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.037), (13, 0.022), (16, 0.026), (18, 0.026), (19, 0.013), (21, 0.045), (34, 0.023), (35, 0.016), (47, 0.095), (83, 0.068), (90, 0.511)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93075919 85 nips-2007-Experience-Guided Search: A Theory of Attentional Control

Author: David Baldwin, Michael C. Mozer


2 0.9181819 8 nips-2007-A New View of Automatic Relevance Determination

Author: David P. Wipf, Srikantan S. Nagarajan

Abstract: Automatic relevance determination (ARD) and the closely-related sparse Bayesian learning (SBL) framework are effective tools for pruning large numbers of irrelevant features leading to a sparse explanatory subset. However, popular update rules used for ARD are either difficult to extend to more general problems of interest or are characterized by non-ideal convergence properties. Moreover, it remains unclear exactly how ARD relates to more traditional MAP estimation-based methods for learning sparse representations (e.g., the Lasso). This paper furnishes an alternative means of expressing the ARD cost function using auxiliary functions that naturally addresses both of these issues. First, the proposed reformulation of ARD can naturally be optimized by solving a series of re-weighted 1 problems. The result is an efficient, extensible algorithm that can be implemented using standard convex programming toolboxes and is guaranteed to converge to a local minimum (or saddle point). Secondly, the analysis reveals that ARD is exactly equivalent to performing standard MAP estimation in weight space using a particular feature- and noise-dependent, non-factorial weight prior. We then demonstrate that this implicit prior maintains several desirable advantages over conventional priors with respect to feature selection. Overall these results suggest alternative cost functions and update procedures for selecting features and promoting sparse solutions in a variety of general situations. In particular, the methodology readily extends to handle problems such as non-negative sparse coding and covariance component estimation. 1

3 0.88741505 119 nips-2007-Learning with Tree-Averaged Densities and Distributions

Author: Sergey Kirshner

Abstract: We utilize the ensemble of trees framework, a tractable mixture over superexponential number of tree-structured distributions [1], to develop a new model for multivariate density estimation. The model is based on a construction of treestructured copulas – multivariate distributions with uniform on [0, 1] marginals. By averaging over all possible tree structures, the new model can approximate distributions with complex variable dependencies. We propose an EM algorithm to estimate the parameters for these tree-averaged models for both the real-valued and the categorical case. Based on the tree-averaged framework, we propose a new model for joint precipitation amounts data on networks of rain stations. 1

4 0.86907887 184 nips-2007-Stability Bounds for Non-i.i.d. Processes

Author: Mehryar Mohri, Afshin Rostamizadeh

Abstract: The notion of algorithmic stability has been used effectively in the past to derive tight generalization bounds. A key advantage of these bounds is that they are designed for specific learning algorithms, exploiting their particular properties. But, as in much of learning theory, existing stability analyses and bounds apply only in the scenario where the samples are independently and identically distributed (i.i.d.). In many machine learning applications, however, this assumption does not hold. The observations received by the learning algorithm often have some inherent temporal dependence, which is clear in system diagnosis or time series prediction problems. This paper studies the scenario where the observations are drawn from a stationary mixing sequence, which implies a dependence between observations that weaken over time. It proves novel stability-based generalization bounds that hold even with this more general setting. These bounds strictly generalize the bounds given in the i.i.d. case. It also illustrates their application in the case of several general classes of learning algorithms, including Support Vector Regression and Kernel Ridge Regression.

5 0.80183291 182 nips-2007-Sparse deep belief net model for visual area V2

Author: Honglak Lee, Chaitanya Ekanadham, Andrew Y. Ng

Abstract: Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical, or “deep,” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This paper presents an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks of Hinton et al. (2006). We learn two layers of nodes in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations of the first layer responses in the data. Specifically, it picks up both colinear (“contour”) features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex “corner” features matches well with the results from the Ito & Komatsu’s study of biological V2 responses. This suggests that our sparse variant of deep belief networks holds promise for modeling more higher-order features. 1

6 0.56970394 202 nips-2007-The discriminant center-surround hypothesis for bottom-up saliency

7 0.54683471 66 nips-2007-Density Estimation under Independent Similarly Distributed Sampling Assumptions

8 0.52882195 156 nips-2007-Predictive Matrix-Variate t Models

9 0.50189245 96 nips-2007-Heterogeneous Component Analysis

10 0.49892458 128 nips-2007-Message Passing for Max-weight Independent Set

11 0.49324435 113 nips-2007-Learning Visual Attributes

12 0.4925186 122 nips-2007-Locality and low-dimensions in the prediction of natural experience from fMRI

13 0.48927689 79 nips-2007-Efficient multiple hyperparameter learning for log-linear models

14 0.48619062 155 nips-2007-Predicting human gaze using low-level saliency combined with face detection

15 0.48531452 140 nips-2007-Neural characterization in partially observed populations of spiking neurons

16 0.48347825 185 nips-2007-Stable Dual Dynamic Programming

17 0.48109984 63 nips-2007-Convex Relaxations of Latent Variable Training

18 0.47734144 82 nips-2007-Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization

19 0.47472462 163 nips-2007-Receding Horizon Differential Dynamic Programming

20 0.47357959 51 nips-2007-Comparing Bayesian models for multisensory cue combination without mandatory integration