nips nips2005 nips2005-203 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Michele Rucci
Abstract: Under natural viewing conditions, small movements of the eye and body prevent the maintenance of a steady direction of gaze. It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. However, it is unclear whether the physiological self-motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. This study examines the impact of fixational instability on the statistics of visual input to the retina and on the structure of neural activity in the early visual system. Fixational instability introduces fluctuations in the retinal input signals that, in the presence of natural images, lack spatial correlations. These input fluctuations strongly influence neural activity in a model of the LGN. They decorrelate cell responses, even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counter-balance the power-law spectrum of natural images. A decorrelation of neural activity has been proposed to be beneficial for discarding statistical redundancies in the input signals. Fixational instability might, therefore, contribute to establishing efficient representations of natural stimuli.
Reference: text
sentIndex sentText sentNum sentScore
1 Under natural viewing conditions, small movements of the eye and body prevent the maintenance of a steady direction of gaze. [sent-3, score-0.3]
2 It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. [sent-4, score-0.11]
3 However, it is unclear whether the physiological self-motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. [sent-5, score-0.542]
4 This study examines the impact of fixational instability on the statistics of visual input to the retina and on the structure of neural activity in the early visual system. [sent-6, score-1.026]
5 Fixational instability introduces fluctuations in the retinal input signals that, in the presence of natural images, lack spatial correlations. [sent-7, score-0.817]
6 These input fluctuations strongly influence neural activity in a model of the LGN. [sent-8, score-0.173]
7 They decorrelate cell responses, even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counter-balance the power-law spectrum of natural images. [sent-9, score-0.311]
8 A decorrelation of neural activity has been proposed to be beneficial for discarding statistical redundancies in the input signals. [sent-10, score-0.205]
9 Fixational instability might, therefore, contribute to establishing efficient representations of natural stimuli. [sent-11, score-0.533]
10 1 Introduction Models of the visual system often examine steady-state levels of neural activity during presentations of visual stimuli. [sent-12, score-0.408]
11 It is difficult, however, to envision how such steady-states could occur under natural viewing conditions, given that the projection of the visual scene on the retina is never stationary. [sent-13, score-0.348]
12 Indeed, the physiological instability of visual fixation keeps the retinal image in permanent motion even during the brief periods in between saccades. [sent-14, score-0.796]
13 Several sources cause this constant jittering of the eye. [sent-15, score-0.083]
14 Fixational eye movements, of which we are not aware, alternate small saccades with periods of drifts, even when subjects are instructed to maintain steady fixation [8]. [sent-16, score-0.17]
15 Following macroscopic redirection of gaze, other small eye movements, such as corrective saccades and post-saccadic drifts, are likely to occur. [sent-17, score-0.129]
16 Furthermore, outside of the controlled conditions of a laboratory, when the head is not constrained by a bite bar, movements of the body, as well as imperfections in the vestibulo-ocular reflex, significantly amplify the motion of the retinal image. [sent-18, score-0.194]
17 Given this constant jitter, it is remarkable that the brain is capable of constructing a stable percept, as fixational instability moves the stimulus by an amount that should be clearly visible (see, for example, [7]). [sent-22, score-0.498]
18 It is often claimed that small saccades are necessary to refresh neuronal responses and prevent the disappearance of a stationary scene, a claim that has remained controversial given the brief durations of natural visual fixation (reviewed in [16]). [sent-24, score-0.381]
19 Yet, recent theoretical proposals [1, 11] have claimed that fixational instability plays a more central role in the acquisition and neural encoding of visual information than that of simply refreshing neural activity. [sent-25, score-0.686]
20 Consistent with the ideas of these proposals, neurophysiological investigations have shown that fixational eye movements strongly influence the activity of neurons in several areas of the monkey’s brain [5, 14, 6]. [sent-26, score-0.36]
21 Furthermore, modeling studies that simulated neural responses during free-viewing suggest that fixational instability profoundly affects the statistics of thalamic [13] and thalamocortical activity [10]. [sent-27, score-0.627]
22 Instead of regarding the jitter of visual fixation as necessary for refreshing neuronal responses, it is argued that the self-motion of the retinal image is essential for properly structuring neural activity in the early visual system into a format that is suitable for processing at later stages. [sent-29, score-0.634]
23 It is proposed that fixational instability is part of a strategy of acquisition of visual information that enables compact visual representations in the presence of natural visual input. [sent-30, score-1.054]
24 2 Neural decorrelation and fixational instability It is a long-standing proposal that an important function of early visual processing is the removal of part of the redundancy that characterizes natural visual input [3]. [sent-31, score-0.949]
25 While several methods exist for eliminating input redundancies, a possible approach is the removal of pairwise correlations between the intensity values of nearby pixels [2]. [sent-33, score-0.089]
26 Elimination of these spatial correlations allows efficient representations in which neuronal responses tend to be less statistically dependent. [sent-34, score-0.169]
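The removal of pairwise correlations described above can be illustrated with a generic whitening transform. The following is a minimal sketch, not the method of [2] or of this paper: it assumes ZCA whitening applied to a hypothetical set of four correlated "pixel" values, and the names `whiten`, `mixing` and `pixels` are illustrative only.

```python
import numpy as np

def whiten(samples):
    """ZCA-whiten the columns of `samples` (shape: n_samples x n_features)."""
    centered = samples - samples.mean(axis=0)
    cov = np.cov(centered, rowvar=False)           # pairwise covariances between features
    eigval, eigvec = np.linalg.eigh(cov)           # cov is symmetric, so eigh applies
    transform = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return centered @ transform

rng = np.random.default_rng(0)
# Hypothetical mixing matrix that makes neighbouring "pixels" correlated.
mixing = np.array([[1.0, 0.8, 0.5, 0.2],
                   [0.0, 1.0, 0.8, 0.5],
                   [0.0, 0.0, 1.0, 0.8],
                   [0.0, 0.0, 0.0, 1.0]])
pixels = rng.standard_normal((5000, 4)) @ mixing
cov_before = np.cov(pixels, rowvar=False)          # clearly non-zero off-diagonal terms
decorrelated = whiten(pixels)
cov_after = np.cov(decorrelated, rowvar=False)     # close to the identity matrix
```

After whitening, the off-diagonal covariances vanish, which is the sense in which responses become "less statistically dependent" at second order.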
27 According to the theory described in this paper, fixational instability contributes to decorrelating the responses of cells in the retina and the LGN during viewing of natural scenes. [sent-35, score-0.765]
28 The first factor (Section 2.1) is the spatially uncorrelated input signal that occurs when natural scenes are scanned by jittering eyes. [sent-38, score-0.383]
29 The second factor is an amplification of this spatially uncorrelated input, which is mediated by cell response characteristics. [sent-39, score-0.166]
30 Section 2.2 examines the interaction between the dynamics of fixational instability and the temporal characteristics of neurons in the Lateral Geniculate Nucleus (LGN), the main relay of visual information to the cortex. [sent-41, score-0.778]
31 2 allows an analytical estimation of the power spectrum of the signal entering the eye during the self-motion of the retinal image. [sent-45, score-0.38]
32 2, the retinal input $S(\mathbf{x}, t)$ can be approximated by the sum of two contributions, $I$ and $\tilde{I}$, its power spectrum $R_{SS}$ consists of three terms: $R_{SS}(\mathbf{u}, w) \approx R_{II} + R_{\tilde{I}\tilde{I}} + 2R_{I\tilde{I}}$, where $\mathbf{u}$ and $w$ represent, respectively, spatial and temporal frequency. [sent-47, score-0.423]
33 Fixational instability can be modeled as an ergodic process with zero mean and uncorrelated components along the two axes. [sent-48, score-0.547]
34 4 compensates for the scaling invariance of natural images. [sent-56, score-0.073]
35 That is, since for natural images $R_{II}(\mathbf{u}) \propto |\mathbf{u}|^{-2}$, the product $R_{II}(\mathbf{u})|\mathbf{u}|^2$ whitens $R_{II}$ by producing a power spectrum $R_{\tilde{I}\tilde{I}}$ that remains virtually constant at all spatial frequencies. [sent-57, score-0.348]
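The whitening argument can be checked numerically. The sketch below assumes an idealized power-law image spectrum with exponent exactly -2 and arbitrary frequency units; the function name `whitened_spectrum` is hypothetical, and real image spectra only approximate this form.

```python
import numpy as np

def whitened_spectrum(u, exponent=-2.0):
    """Return R_II(u) * |u|^2 for an idealized spectrum R_II(u) = |u|^exponent."""
    r_ii = np.abs(u) ** exponent       # idealized natural-image power spectrum
    return r_ii * np.abs(u) ** 2       # multiplication by |u|^2 from the jitter term

u = np.logspace(-1, 1.5, 50)           # positive spatial frequencies (arbitrary units)
flat = whitened_spectrum(u)            # constant when the exponent is -2
tilted = whitened_spectrum(u, -3.0)    # not constant for a different exponent
```

Only the exponent -2 yields a flat product; other exponents leave a residual tilt, so the whitening depends on this specific property of the input.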
36 2.2 Influence of fixational instability on neural activity This section analyzes the structure of correlated activity during fixational instability in a model of the LGN. [sent-59, score-1.155]
37 To delineate the important elements of the theory, we consider linear approximations of geniculate responses provided by space-time separable kernels. [sent-60, score-0.278]
38 Results are, however, general, and the outcomes of simulations with space-time inseparable kernels and different levels of rectification (the most prominent nonlinear behavior of parvocellular geniculate neurons) can be found in [13, 10]. [sent-62, score-0.235]
39 Mean instantaneous firing rates were estimated on the basis of the convolution between the input $I$ and the cell spatiotemporal kernel $h_\alpha$: $\alpha(t) = h_\alpha(\mathbf{x}, t) \star I(\mathbf{x}, t) = \int_0^t \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_\alpha(x', y', t')\, I(x - x', y - y', t - t')\, dx'\, dy'\, dt'$, where $h_\alpha(\mathbf{x}, t) = g_\alpha(t) f_\alpha(\mathbf{x})$. [sent-63, score-0.107]
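A space-time separable kernel lets the convolution be carried out as a spatial filter followed by a causal temporal filter. The sketch below assumes one spatial dimension and discrete samples; `separable_response`, `f_x` and `g_t` are hypothetical names, and the kernel values are placeholders rather than the profiles used in the paper.

```python
import numpy as np

def separable_response(stimulus, f_x, g_t):
    """Filter stimulus[x, t] with h(x, t) = g(t) f(x): space first, then causal time."""
    n_t = stimulus.shape[1]
    # Spatial convolution along x for every time sample.
    spatial = np.apply_along_axis(
        lambda col: np.convolve(col, f_x, mode="same"), 0, stimulus)
    # Causal temporal convolution along t, truncated to the stimulus duration.
    return np.apply_along_axis(
        lambda row: np.convolve(row, g_t)[:n_t], 1, spatial)

f_x = np.array([-0.2, 0.3, 1.0, 0.3, -0.2])        # hypothetical centre-surround profile
g_t = np.array([0.0, 0.6, 1.0, 0.2, -0.5, -0.3])   # hypothetical biphasic time course
stimulus = np.zeros((21, 30))
stimulus[10, 0] = 1.0                              # impulse at x = 10, t = 0
response = separable_response(stimulus, f_x, g_t)
```

For an impulse input, the response is the outer product of the spatial and temporal profiles centred on the impulse location, which is what separability means.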
40 The graph compares the power spectrum of natural images ($R_{II}$) to the dynamic power spectrum introduced by fixational instability ($R_{\tilde{I}\tilde{I}}$). [sent-65, score-0.917]
41 The two curves represent radial averages evaluated over 15 pictures of natural scenes. [sent-66, score-0.095]
42 The temporal kernel gα (t) possessed a biphasic profile with positive peak at 50 ms, negative peak at 75 ms, and overall duration of less than 200 ms [4]. [sent-69, score-0.106]
43 3 and separation of spatial and temporal elements yield: $R_{\alpha\alpha} \approx |G_\alpha|^2 |F_\alpha|^2 R_{II} + |G_\alpha|^2 |F_\alpha|^2 R_{\tilde{I}\tilde{I}} = R^S_{\alpha\alpha} + R^D_{\alpha\alpha}$ (7), where $F_\alpha(\mathbf{u})$ and $G_\alpha(w)$ represent the Fourier transforms of the spatial and temporal kernels. [sent-74, score-0.256]
44 Eq. 7 shows that, similar to the retinal input, the power spectrum of geniculate activity can also be approximated by the sum of two separate elements. [sent-76, score-0.568]
45 The first term, $R^S_{\alpha\alpha}$, is determined by the power spectrum of the stimulus and the characteristics of geniculate cells but does not depend on the motion of the eye during the acquisition of visual information. [sent-78, score-0.77]
46 Eq. 8 shows that fixational instability adds the term $c^D_{\alpha\alpha}$ to the pattern of correlated activity $c^S_{\alpha\alpha}$ that would be obtained with presentation of the same set of stimuli without the self-motion of the eye. [sent-82, score-0.673]
47 With presentation of pictures of natural scenes, $R_{II}(w) = 2\pi\delta(w)$, and the two input signals $R^S_{\alpha\alpha}$ and $R^D_{\alpha\alpha}$ provide, respectively, a static and a dynamic contribution to the spatiotemporal correlation of geniculate activity. [sent-83, score-0.556]
48 8 gives a correlation pattern: $\hat{c}^D_{\alpha\alpha}(\mathbf{x}) = k_D\, \mathcal{F}_S^{-1}\{|F_\alpha|^2 R_{II}(\mathbf{u})|\mathbf{u}|^2\}$ (10), where $k_D = \mathcal{F}_T^{-1}\{|G_\alpha(w)|^2 R_{\xi\xi}(w)\}\big|_{t=0}$ is a constant given by the temporal dynamics of cell response and fixational instability. [sent-88, score-0.155]
49 Since in natural images, most power is concentrated at low spatial frequencies, the uncorrelated fluctuations in the input signals generated by fixational instability have small amplitudes. [sent-91, score-0.833]
50 However, geniculate cells tend to respond more strongly to changing stimuli than stationary ones, and kD is larger than kS . [sent-93, score-0.343]
51 Therefore, the small input modulations introduced by fixational instability are amplified by the dynamics of geniculate cells. [sent-94, score-0.76]
52 Fig. 2 shows the structure of correlated activity in the model when images of natural scenes are examined in the presence of fixational instability. [sent-96, score-0.342]
53 In this example, fixational instability was assumed to possess Gaussian temporal correlation, Rξξ (w), with standard deviation σT = 22 ms and amplitude σS = 12 arcmin. [sent-97, score-0.592]
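A jitter trace with these statistics can be simulated by smoothing white noise. This is a sketch under stated assumptions, not the paper's simulation code: it takes "Gaussian temporal correlation" to mean a Gaussian autocorrelation in time with standard deviation σT, takes σS to be the standard deviation of eye position along each axis, and uses a hypothetical function name `gaussian_jitter` with a 1 ms time step.

```python
import numpy as np

def gaussian_jitter(n_samples, dt_ms=1.0, sigma_t_ms=22.0, sigma_s_arcmin=12.0, seed=0):
    """Zero-mean 2-D jitter with independent axes and Gaussian autocorrelation."""
    rng = np.random.default_rng(seed)
    # Smoothing white noise with a Gaussian of std sigma_t / sqrt(2) gives an
    # autocorrelation that is Gaussian with std sigma_t.
    k_std = sigma_t_ms / np.sqrt(2.0) / dt_ms
    half = int(np.ceil(5 * k_std))
    t = np.arange(-half, half + 1)
    kernel = np.exp(-t**2 / (2 * k_std**2))
    trace = np.empty((2, n_samples))
    for axis in range(2):                          # horizontal and vertical components
        white = rng.standard_normal(n_samples + 2 * half)
        smooth = np.convolve(white, kernel, mode="valid")
        smooth -= smooth.mean()                    # enforce zero mean
        trace[axis] = sigma_s_arcmin * smooth / smooth.std()
    return trace

xi = gaussian_jitter(200_000)                      # 200 s of jitter at 1 ms resolution
lag = 22                                           # samples, i.e. one sigma_t
auto_corr = np.corrcoef(xi[0, :-lag], xi[0, lag:])[0, 1]
cross_corr = np.corrcoef(xi[0], xi[1])[0, 1]
```

At a lag of one σT the autocorrelation of each component should be near exp(-1/2), while the two axes remain approximately uncorrelated.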
54 Fig. 2 also shows the patterns of correlation produced by the two components $\hat{c}^S_{\alpha\alpha}$ and $\hat{c}^D_{\alpha\alpha}$. [sent-100, score-0.077]
55 Whereas $\hat{c}^S_{\alpha\alpha}$ was strongly influenced by the broad spatial correlations of natural images, $\hat{c}^D_{\alpha\alpha}$, due to its dependence on the whitened power spectrum $R_{\tilde{I}\tilde{I}}$, was determined exclusively by cell receptive fields. [sent-101, score-0.423]
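The dependence on receptive fields alone can be made concrete. If the whitened input spectrum is treated as a constant, the inverse spatial transform in Eq. 10 reduces to the inverse transform of the squared magnitude of the receptive-field spectrum, which by the Wiener-Khinchin relation is the autocorrelation of the receptive field. The sketch below assumes one spatial dimension, circular boundaries and a hypothetical difference-of-Gaussians profile; `dynamic_correlation` is an illustrative name.

```python
import numpy as np

def dynamic_correlation(f_alpha, k_d=1.0):
    """Sketch of Eq. 10 with the whitened input spectrum set to 1."""
    spectrum = np.fft.fft(f_alpha)
    return k_d * np.real(np.fft.ifft(np.abs(spectrum) ** 2))

x = np.arange(-32, 32)
# Hypothetical difference-of-Gaussians receptive field (centre minus surround).
f_alpha = np.exp(-x**2 / (2 * 2.0**2)) - 0.5 * np.exp(-x**2 / (2 * 6.0**2))
c_d = dynamic_correlation(f_alpha)

# Direct circular autocorrelation of the receptive field, for comparison.
direct = np.array([np.sum(f_alpha * np.roll(f_alpha, -s)) for s in range(len(x))])
```

The two computations agree, so under this assumption the dynamic correlation pattern extends no further than the overlap between receptive fields.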
56 11 when natural images are examined in the presence of fixational instability. [sent-110, score-0.172]
57 The three curves represent the total level of correlation (Total), the correlation $\hat{c}^S_{\alpha\alpha}(\mathbf{x})$ that would be present if the same images were examined in the absence of fixational instability (Static), and the contribution $\hat{c}^D_{\alpha\alpha}(\mathbf{x})$ of fixational instability (Dynamic). [sent-111, score-1.116]
58 Data are radial averages evaluated over pairs of cells with the same separation ||x|| between their receptive fields. [sent-112, score-0.121]
59 presentation of natural images and for various parameters of fixational instability. [sent-113, score-0.15]
60 Fig. 3 (a) shows the effect of varying the spatial amplitude of the retinal jitter. [sent-115, score-0.202]
61 As shown in Fig. 3 (a), the larger the instability of visual fixation, the larger the contribution of the dynamic term $\hat{c}^D_{\alpha\alpha}$ with respect to $\hat{c}^S_{\alpha\alpha}$. [sent-119, score-0.658]
62 Except for very small values of σS, ρDS is larger than one, indicating that $\hat{c}^D_{\alpha\alpha}$ influences the structure of correlated activity more strongly than $\hat{c}^S_{\alpha\alpha}$. [sent-120, score-0.128]
63 Fig. 3 (b) shows the impact of varying σT, which defines the temporal window over which fixational jitter is correlated. [sent-122, score-0.085]
64 For a range of σT corresponding to intervals shorter than the typical duration of visual fixation, $\hat{c}^D_{\alpha\alpha}$ is significantly larger than $\hat{c}^S_{\alpha\alpha}$. [sent-124, score-0.188]
65 Thus, fixational instability strongly influences correlated activity in the model when it moves the direction of gaze within a range of a few arcmin and is correlated over a fraction of the duration of visual fixation. [sent-125, score-0.934]
66 This range of parameters is consistent with the instability of fixation observed in primates. [sent-126, score-0.46]
67 3 Conclusions It has been proposed that neurons in the early visual system decorrelate their responses to natural stimuli, an operation that is believed to be beneficial for the encoding of visual information [2]. [sent-127, score-0.569]
68 The original claim, which was based on psychophysical measurements of human contrast sensitivity, relies on an inverse proportionality between the spatial response characteristics of retinal and geniculate neurons and the structure of natural images. [sent-128, score-0.534]
69 However, data from neurophysiological recordings have clearly shown that neurons in the retina and the LGN respond significantly to low spatial frequencies, in a way that is not compatible with the requirements of Atick and Redlich’s proposal. [sent-129, score-0.237]
70 During natural viewing, input signals to the retina depend not only on the stimulus, but also on the physiological instability of visual fixation. [sent-130, score-0.866]
71 The results of this study show that when natural scenes are examined [sent-131, score-0.13]
72 Figure 3: Influence of the characteristics of fixational instability on the patterns of correlated activity during presentation of natural images (panel (b): horizontal axis σT, in ms). [sent-134, score-0.734]
73 Fixational instability was assumed to possess a Gaussian correlation with standard deviation σT and amplitude σS. [sent-137, score-0.571]
74 with jittering eyes, as occurs under natural viewing conditions, fixational instability tends to decorrelate cell responses even if the contrast sensitivity functions of individual neurons do not counterbalance the power spectrum of visual input. [sent-140, score-1.201]
75 The first component is the presence of a spatially uncorrelated input signal during presentation of natural visual stimuli ($R_{\tilde{I}\tilde{I}}$ in Eq. [sent-142, score-0.529]
76 This input signal is a direct consequence of the scale invariance of natural images. [sent-144, score-0.14]
77 It is a property of natural images that, although the intensity values of nearby pixels tend to be correlated, changes in intensity around pairs of pixels are uncorrelated. [sent-145, score-0.184]
78 In a spatial grating, for example, intensity changes at any two locations are highly correlated. [sent-147, score-0.096]
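The contrast between natural images and gratings has a simple one-dimensional analogue. The sketch below is an illustration under stated assumptions rather than an analysis of real images: a random walk stands in for a natural image because its power spectrum also falls off as the inverse square of frequency, a sinusoid with an arbitrary period of 64 samples stands in for the grating, and `lagged_correlation` is a hypothetical helper.

```python
import numpy as np

def lagged_correlation(signal, lag):
    """Pearson correlation between signal[x] and signal[x + lag]."""
    return np.corrcoef(signal[:-lag], signal[lag:])[0, 1]

rng = np.random.default_rng(0)
n, lag = 100_000, 10

# 1-D stand-in for a natural image: a random walk (power spectrum ~ 1/u^2).
walk = np.cumsum(rng.standard_normal(n))
# 1-D grating with a period of 64 samples.
grating = np.sin(2 * np.pi * np.arange(n) / 64.0)

walk_value_corr = lagged_correlation(walk, lag)                 # intensities: strongly correlated
walk_change_corr = lagged_correlation(np.diff(walk), lag)       # changes: close to zero
grating_change_corr = lagged_correlation(np.diff(grating), lag) # changes: still correlated
```

Intensity values of the random walk are strongly correlated at a separation of 10 samples while its one-sample changes are not; for the grating, the changes remain correlated, in line with the statement above.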
79 During the instability of visual fixation, neurons receive input from the small regions of the visual field covered by the jittering of their receptive fields. [sent-148, score-0.998]
80 In the presence of natural images, although the inputs to cells with nearby receptive fields are on average correlated, the fluctuations in these input signals produced by fixational instability are not correlated. [sent-149, score-0.775]
81 Fixational instability appears to be tuned to the statistics of natural images, as it introduces a spatially uncorrelated signal only in the presence of visual input with a power spectrum that declines as $u^{-2}$ with spatial frequency. [sent-150, score-1.147]
82 The second element of the theory is the neuronal amplification of the spatially uncorrelated input signal introduced by the self-motion of the retinal image. [sent-151, score-0.327]
83 This amplification originates from the interaction between the dynamics of fixational instability and the temporal sensitivity of geniculate units. [sent-152, score-0.754]
84 Since $R_{\tilde{I}\tilde{I}}$ attenuates the low spatial frequencies of the stimulus, it tends to possess less power than $R_{II}$. [sent-153, score-0.188]
85 11, the contributions of the two input signals are modulated by the multiplicative terms kS and kD , which depend on the temporal characteristics of cell responses (both kS and kD ) and fixational instability (kD only). [sent-155, score-0.714]
86 Since geniculate neurons respond more strongly to changing stimuli than to stationary ones, kD tends to be higher than kS . [sent-156, score-0.35]
87 Correspondingly, in a linear model of the LGN, units are highly sensitive to the uncorrelated fluctuations in the input signals produced by fixational instability. [sent-157, score-0.188]
88 The theory summarized in this study is consistent with the strong modulations of neural responses observed during fixational eye movements [5, 14, 6], as well as with the results of recent psychophysical experiments aimed at investigating perceptual influences of fixational instability [12, 9]. [sent-158, score-0.714]
89 It should be observed that, since patterns of correlations were evaluated via Fourier analysis, this study implicitly assumed a steady-state condition of visual fixation. [sent-159, score-0.156]
90 Further work is needed to extend the proposed theory in order to take into account time-varying natural stimuli and the nonstationary regime produced by the occurrence of saccades. [sent-160, score-0.137]
91 Dynamics of primate P retinal ganglion cells: Responses to chromatic and achromatic stimuli. [sent-189, score-0.131]
92 Microsaccades differentially modulate neural activity in the striate and extrastriate visual cortex. [sent-198, score-0.252]
93 The function of bursts of spikes during visual fixation in the awake primate lateral geniculate nucleus and primary visual cortex. [sent-209, score-0.587]
94 Decorrelation of neural activity during fixational instability: Possible implications for the refinement of V1 receptive fields. [sent-237, score-0.165]
95 Fixational instability and natural image statistics: Implications for early visual representations. [sent-243, score-0.712]
96 Contributions of fixational eye movements to the discrimination of briefly presented stimuli. [sent-248, score-0.159]
97 Modeling LGN responses during free-viewing: A possible role of microscopic eye movements in the refinement of cortical orientation selectivity. [sent-256, score-0.23]
98 Selective activation of visual cortex neurons by fixational eye movements: Implications for neural coding. [sent-264, score-0.298]
99 Effects of aging on the primate visual system: spatial and temporal processing by lateral geniculate neurons in young adult and old rhesus monkeys. [sent-279, score-0.574]
100 The role of eye movements in the detection of contrast and spatial detail. [sent-288, score-0.231]
wordName wordTfidf (topN-words)
[('xational', 0.607), ('instability', 0.46), ('rii', 0.228), ('geniculate', 0.207), ('visual', 0.156), ('xation', 0.153), ('cd', 0.153), ('fixational', 0.11), ('rucci', 0.11), ('retinal', 0.105), ('spectrum', 0.096), ('activity', 0.096), ('cs', 0.096), ('eye', 0.093), ('ks', 0.088), ('lgn', 0.088), ('uncorrelated', 0.087), ('jittering', 0.083), ('kd', 0.082), ('ri', 0.082), ('natural', 0.073), ('spatial', 0.072), ('responses', 0.071), ('retina', 0.07), ('movements', 0.066), ('rss', 0.066), ('power', 0.064), ('fs', 0.055), ('correlation', 0.053), ('fourier', 0.053), ('neurons', 0.049), ('receptive', 0.049), ('viewing', 0.049), ('input', 0.045), ('jitter', 0.044), ('correlated', 0.043), ('images', 0.043), ('cells', 0.042), ('spatially', 0.042), ('arcmin', 0.041), ('decorrelate', 0.041), ('temporal', 0.041), ('uctuations', 0.04), ('stimuli', 0.04), ('stimulus', 0.038), ('ampli', 0.038), ('cell', 0.037), ('saccades', 0.036), ('decorrelation', 0.036), ('presentation', 0.034), ('ms', 0.033), ('possess', 0.033), ('signals', 0.032), ('duration', 0.032), ('strongly', 0.032), ('scenes', 0.031), ('gaze', 0.031), ('physiological', 0.03), ('presence', 0.03), ('separation', 0.03), ('ds', 0.029), ('uence', 0.028), ('characteristics', 0.028), ('parvocellular', 0.028), ('redundancies', 0.028), ('refreshing', 0.028), ('primate', 0.026), ('examined', 0.026), ('transform', 0.026), ('neuronal', 0.026), ('spatiotemporal', 0.025), ('amplitude', 0.025), ('neurophysiological', 0.024), ('atick', 0.024), ('drifts', 0.024), ('modulations', 0.024), ('intensity', 0.024), ('dynamics', 0.024), ('produced', 0.024), ('static', 0.023), ('lateral', 0.023), ('uences', 0.023), ('motion', 0.023), ('acquisition', 0.023), ('early', 0.023), ('signal', 0.022), ('sensitivity', 0.022), ('periods', 0.022), ('respond', 0.022), ('pictures', 0.022), ('contribution', 0.021), ('dynamic', 0.021), ('boston', 0.02), ('examines', 0.02), ('implications', 0.02), ('nearby', 0.02), ('claimed', 0.019), ('nucleus', 0.019), 
('steady', 0.019), ('frequencies', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 203 nips-2005-Visual Encoding with Jittering Eyes
Author: Michele Rucci
2 0.10804006 28 nips-2005-Analyzing Auditory Neurons by Learning Distance Functions
Author: Inna Weiner, Tomer Hertz, Israel Nelken, Daphna Weinshall
Abstract: We present a novel approach to the characterization of complex sensory neurons. One of the main goals of characterizing sensory neurons is to characterize dimensions in stimulus space to which the neurons are highly sensitive (causing large gradients in the neural responses) or alternatively dimensions in stimulus space to which the neuronal responses are invariant (defining iso-response manifolds). We formulate this problem as that of learning a geometry on stimulus space that is compatible with the neural responses: the distance between stimuli should be large when the responses they evoke are very different, and small when the responses they evoke are similar. Here we show how to successfully train such distance functions using a rather limited amount of information. The data consisted of the responses of neurons in primary auditory cortex (A1) of anesthetized cats to 32 stimuli derived from natural sounds. For each neuron, a subset of all pairs of stimuli was selected such that the responses of the two stimuli in a pair were either very similar or very dissimilar. The distance function was trained to fit these constraints. The resulting distance functions generalized to predict the distances between the responses of a test stimulus and the trained stimuli.
3 0.087980233 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
Author: Gregory Zelinsky, Wei Zhang, Bing Yu, Xin Chen, Dimitris Samaras
Abstract: To investigate how top-down (TD) and bottom-up (BU) information is weighted in the guidance of human search behavior, we manipulated the proportions of BU and TD components in a saliency-based model. The model is biologically plausible and implements an artificial retina and a neuronal population code. The BU component is based on feature-contrast. The TD component is defined by a feature-template match to a stored target representation. We compared the model’s behavior at different mixtures of TD and BU components to the eye movement behavior of human observers performing the identical search task. We found that a purely TD model provides a much closer match to human behavior than any mixture model using BU information. Only when biological constraints are removed (e.g., eliminating the retina) did a BU/TD mixture model begin to approximate human behavior.
4 0.087618075 134 nips-2005-Neural mechanisms of contrast dependent receptive field size in V1
Author: Jim Wielaard, Paul Sajda
Abstract: Based on a large scale spiking neuron model of the input layers 4Cα and β of macaque, we identify neural mechanisms for the observed contrast dependent receptive field size of V1 cells. We observe a rich variety of mechanisms for the phenomenon and analyze them based on the relative gain of excitatory and inhibitory synaptic inputs. We observe an average growth in the spatial extent of excitation and inhibition for low contrast, as predicted from phenomenological models. However, contrary to phenomenological models, our simulation results suggest this is neither sufficient nor necessary to explain the phenomenon.
5 0.085018732 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
Author: Wei Zhang, Hyejin Yang, Dimitris Samaras, Gregory J. Zelinsky
Abstract: We present a computational model of human eye movements in an object class detection task. The model combines state-of-the-art computer vision object class detection methods (SIFT features trained using AdaBoost) with a biologically plausible model of human eye movement to produce a sequence of simulated fixations, culminating with the acquisition of a target. We validated the model by comparing its behavior to the behavior of human observers performing the identical object class detection task (looking for a teddy bear among visually complex nontarget objects). We found considerable agreement between the model and human data in multiple eye movement measures, including number of fixations, cumulative probability of fixating the target, and scanpath distance.
6 0.082717218 109 nips-2005-Learning Cue-Invariant Visual Responses
7 0.067459308 101 nips-2005-Is Early Vision Optimized for Extracting Higher-order Dependencies?
8 0.060834408 169 nips-2005-Saliency Based on Information Maximization
9 0.058863189 141 nips-2005-Norepinephrine and Neural Interrupts
10 0.053243868 170 nips-2005-Scaling Laws in Natural Scenes and the Inference of 3D Shape
11 0.050399061 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects
12 0.050295986 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks
13 0.050265588 34 nips-2005-Bayesian Surprise Attracts Human Attention
14 0.049652584 29 nips-2005-Analyzing Coupled Brain Sources: Distinguishing True from Spurious Interaction
15 0.049303707 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
16 0.048633173 149 nips-2005-Optimal cue selection strategy
17 0.047675394 188 nips-2005-Temporally changing synaptic plasticity
18 0.047031201 129 nips-2005-Modeling Neural Population Spiking Activity with Gibbs Distributions
19 0.04570131 94 nips-2005-Identifying Distributed Object Representations in Human Extrastriate Visual Cortex
20 0.041985229 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
topicId topicWeight
[(0, 0.115), (1, -0.119), (2, -0.009), (3, 0.125), (4, -0.026), (5, 0.089), (6, -0.03), (7, -0.086), (8, -0.057), (9, 0.031), (10, 0.012), (11, -0.028), (12, -0.027), (13, -0.07), (14, -0.069), (15, -0.01), (16, 0.061), (17, -0.056), (18, -0.003), (19, 0.085), (20, -0.095), (21, -0.046), (22, 0.012), (23, -0.047), (24, 0.057), (25, -0.05), (26, -0.012), (27, -0.071), (28, -0.013), (29, -0.077), (30, 0.035), (31, 0.04), (32, 0.05), (33, -0.074), (34, 0.128), (35, -0.048), (36, 0.089), (37, 0.024), (38, 0.025), (39, 0.079), (40, -0.032), (41, 0.012), (42, -0.13), (43, -0.102), (44, -0.062), (45, 0.042), (46, 0.038), (47, -0.051), (48, 0.003), (49, -0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.95688468 203 nips-2005-Visual Encoding with Jittering Eyes
Author: Michele Rucci
2 0.60009199 28 nips-2005-Analyzing Auditory Neurons by Learning Distance Functions
Author: Inna Weiner, Tomer Hertz, Israel Nelken, Daphna Weinshall
Abstract: We present a novel approach to the characterization of complex sensory neurons. One of the main goals of characterizing sensory neurons is to characterize dimensions in stimulus space to which the neurons are highly sensitive (causing large gradients in the neural responses) or alternatively dimensions in stimulus space to which the neuronal responses are invariant (defining iso-response manifolds). We formulate this problem as that of learning a geometry on stimulus space that is compatible with the neural responses: the distance between stimuli should be large when the responses they evoke are very different, and small when the responses they evoke are similar. Here we show how to successfully train such distance functions using a rather limited amount of information. The data consisted of the responses of neurons in primary auditory cortex (A1) of anesthetized cats to 32 stimuli derived from natural sounds. For each neuron, a subset of all pairs of stimuli was selected such that the responses of the two stimuli in a pair were either very similar or very dissimilar. The distance function was trained to fit these constraints. The resulting distance functions generalized to predict the distances between the responses of a test stimulus and the trained stimuli.
3 0.56252337 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
Author: Gregory Zelinsky, Wei Zhang, Bing Yu, Xin Chen, Dimitris Samaras
Abstract: To investigate how top-down (TD) and bottom-up (BU) information is weighted in the guidance of human search behavior, we manipulated the proportions of BU and TD components in a saliency-based model. The model is biologically plausible and implements an artificial retina and a neuronal population code. The BU component is based on feature contrast. The TD component is defined by a feature-template match to a stored target representation. We compared the model’s behavior at different mixtures of TD and BU components to the eye movement behavior of human observers performing the identical search task. We found that a purely TD model provides a much closer match to human behavior than any mixture model using BU information. Only when biological constraints are removed (e.g., eliminating the retina) did a BU/TD mixture model begin to approximate human behavior.
4 0.51442981 109 nips-2005-Learning Cue-Invariant Visual Responses
Author: Jarmo Hurri
Abstract: Multiple visual cues are used by the visual system to analyze a scene; achromatic cues include luminance, texture, contrast and motion. Single-cell recordings have shown that the mammalian visual cortex contains neurons that respond similarly to scene structure (e.g., orientation of a boundary), regardless of the cue type conveying this information. This paper shows that cue-invariant response properties of simple- and complex-type cells can be learned from natural image data in an unsupervised manner. In order to do this, we also extend a previous conceptual model of cue invariance so that it can be applied to model simple- and complex-cell responses. Our results relate cue-invariant response properties to natural image statistics, thereby showing how the statistical modeling approach can be used to model processing beyond the elemental response properties of visual neurons. This work also demonstrates how to learn, from natural image data, more sophisticated feature detectors than those based on changes in mean luminance, thereby paving the way for new data-driven approaches to image processing and computer vision.
5 0.48777497 34 nips-2005-Bayesian Surprise Attracts Human Attention
Author: Laurent Itti, Pierre F. Baldi
Abstract: The concept of surprise is central to sensory processing, adaptation, learning, and attention. Yet, no widely-accepted mathematical theory currently exists to quantitatively characterize surprise elicited by a stimulus or event, for observers that range from single neurons to complex natural or engineered systems. We describe a formal Bayesian definition of surprise that is the only consistent formulation under minimal axiomatic assumptions. Surprise quantifies how data affects a natural or artificial observer, by measuring the difference between posterior and prior beliefs of the observer. Using this framework we measure the extent to which humans direct their gaze towards surprising items while watching television and video games. We find that subjects are strongly attracted towards surprising locations, with 72% of all human gaze shifts directed towards locations more surprising than the average, a figure which rises to 84% when considering only gaze targets simultaneously selected by all subjects. The resulting theory of surprise is applicable across different spatio-temporal scales, modalities, and levels of abstraction.
Life is full of surprises, ranging from a great Christmas gift or a new magic trick, to wardrobe malfunctions, reckless drivers, terrorist attacks, and tsunami waves. Key to survival is our ability to rapidly attend to, identify, and learn from surprising events, to decide on present and future courses of action [1]. Yet, little theoretical and computational understanding exists of the very essence of surprise, as evidenced by the absence from our everyday vocabulary of a quantitative unit of surprise: qualities such as the “wow factor” have remained vague and elusive to mathematical analysis. Informal correlates of surprise exist at nearly all stages of neural processing. In sensory neuroscience, it has been suggested that only the unexpected at one stage is transmitted to the next stage [2].
Hence, sensory cortex may have evolved to adapt to, to predict, and to quiet down the expected statistical regularities of the world [3, 4, 5, 6], focusing instead on events that are unpredictable or surprising. Electrophysiological evidence for this early sensory emphasis onto surprising stimuli exists from studies of adaptation in visual [7, 8, 4, 9], olfactory [10, 11], and auditory cortices [12], subcortical structures like the LGN [13], and even retinal ganglion cells [14, 15] and cochlear hair cells [16]: neural response greatly attenuates with repeated or prolonged exposure to an initially novel stimulus. Surprise and novelty are also central to learning and memory formation [1], to the point that surprise is believed to be a necessary trigger for associative learning [17, 18], as supported by mounting evidence for a role of the hippocampus as a novelty detector [19, 20, 21]. Finally, seeking novelty is a well-identified human character trait, with possible association with the dopamine D4 receptor gene [22, 23, 24]. In the Bayesian framework, we develop the only consistent theory of surprise, in terms of the difference between the posterior and prior distributions of beliefs of an observer over the available class of models or hypotheses about the world. We show that this definition derived from first principles presents key advantages over more ad-hoc formulations, typically relying on detecting outlier stimuli. Armed with this new framework, we provide direct experimental evidence that surprise best characterizes what attracts human gaze in large amounts of natural video stimuli. We here extend a recent pilot study [25], adding more comprehensive theory, large-scale human data collection, and additional analysis.
1 Theory
Bayesian Definition of Surprise.
We propose that surprise is a general concept, which can be derived from first principles and formalized across spatio-temporal scales, sensory modalities, and, more generally, data types and data sources. Two elements are essential for a principled definition of surprise. First, surprise can exist only in the presence of uncertainty, which can arise from intrinsic stochasticity, missing information, or limited computing resources. A world that is purely deterministic and predictable in real-time for a given observer contains no surprises. Second, surprise can only be defined in a relative, subjective, manner and is related to the expectations of the observer, be it a single synapse, neuronal circuit, organism, or computer device. The same data may carry different amounts of surprise for different observers, or even for the same observer taken at different times. In probability and decision theory it can be shown that the only consistent and optimal way for modeling and reasoning about uncertainty is provided by the Bayesian theory of probability [26, 27, 28]. Furthermore, in the Bayesian framework, probabilities correspond to subjective degrees of belief in hypotheses or models which are updated, as data is acquired, using Bayes’ theorem as the fundamental tool for transforming prior belief distributions into posterior belief distributions. Therefore, within the same optimal framework, the only consistent definition of surprise must involve: (1) probabilistic concepts to cope with uncertainty; and (2) prior and posterior distributions to capture subjective expectations. Consistent with this Bayesian approach, the background information of an observer is captured by his/her/its prior probability distribution $\{P(M)\}_{M \in \mathcal{M}}$ over the hypotheses or models $M$ in a model space $\mathcal{M}$.
Given this prior distribution of beliefs, the fundamental effect of a new data observation $D$ on the observer is to change the prior distribution $\{P(M)\}_{M \in \mathcal{M}}$ into the posterior distribution $\{P(M|D)\}_{M \in \mathcal{M}}$ via Bayes’ theorem, whereby
$$\forall M \in \mathcal{M}, \quad P(M|D) = \frac{P(D|M)}{P(D)} \, P(M). \qquad (1)$$
In this framework, the new data observation $D$ carries no surprise if it leaves the observer beliefs unaffected, that is, if the posterior is identical to the prior; conversely, $D$ is surprising if the posterior distribution resulting from observing $D$ significantly differs from the prior distribution. Therefore we formally measure surprise elicited by data as some distance measure between the posterior and prior distributions. This is best done using the relative entropy or Kullback-Leibler (KL) divergence [29]. Thus, surprise is defined by the average of the log-odd ratio:
$$S(D, \mathcal{M}) = KL(P(M|D), P(M)) = \int_{\mathcal{M}} P(M|D) \log \frac{P(M|D)}{P(M)} \, dM \qquad (2)$$
taken with respect to the posterior distribution over the model class $\mathcal{M}$. Note that KL is not symmetric but has well-known theoretical advantages, including invariance with respect to reparameterizations. A unit of surprise, a “wow”, may then be defined for a single model $M$ as the amount of surprise corresponding to a two-fold variation between $P(M|D)$ and $P(M)$, i.e., as $\log P(M|D)/P(M)$ (with log taken in base 2), with the total number of wows experienced for all models obtained through the integration in eq. 2.
Figure 1: Computing surprise in early sensory neurons. (a) Prior data observations, tuning preferences, and top-down influences contribute to shaping a set of “prior beliefs” a neuron may have over a class of internal models or hypotheses about the world. For instance, $\mathcal{M}$ may be a set of Poisson processes parameterized by the rate $\lambda$, with $\{P(M)\}_{M \in \mathcal{M}} = \{P(\lambda)\}_{\lambda \in \mathbb{R}^{+*}}$ the prior distribution of beliefs about which Poisson models well describe the world as sensed by the neuron. New data $D$ updates the prior into the posterior using Bayes’ theorem. Surprise quantifies the difference between the posterior and prior distributions over the model class $\mathcal{M}$. The remaining panels detail how surprise differs from conventional model fitting and outlier-based novelty. (b) In standard iterative Bayesian model fitting, at every iteration $N$, incoming data $D_N$ is used to update the prior $\{P(M|D_1, D_2, ..., D_{N-1})\}_{M \in \mathcal{M}}$ into the posterior $\{P(M|D_1, D_2, ..., D_N)\}_{M \in \mathcal{M}}$. Freezing this learning at a given iteration, one then picks the currently best model, usually using either a maximum likelihood criterion, or a maximum a posteriori one (yielding $M_{MAP}$ shown). (c) This best model is used for a number of tasks at the current iteration, including outlier-based novelty detection. New data is then considered novel at that instant if it has low likelihood for the best model (e.g., $D_N^b$ is more novel than $D_N^a$). This focus onto the single best model presents obvious limitations, especially in situations where other models are nearly as good (e.g., $M^*$ in panel (b) is entirely ignored during standard novelty computation). One palliative solution is to consider mixture models, or simply $P(D)$, but this just amounts to shifting the problem into a different model class. (d) Surprise directly addresses this problem by simultaneously considering all models and by measuring how data changes the observer’s distribution of beliefs from $\{P(M|D_1, D_2, ..., D_{N-1})\}_{M \in \mathcal{M}}$ to $\{P(M|D_1, D_2, ..., D_N)\}_{M \in \mathcal{M}}$ over the entire model class $\mathcal{M}$ (orange shaded area).
Surprise and outlier detection. Outlier detection based on the likelihood $P(D|M_{best})$ of $D$ given a single best model $M_{best}$ is at best an approximation to surprise and, in some cases, is misleading. Consider, for instance, a case where $D$ has very small probability both for a model or hypothesis $M$ and for a single alternative hypothesis $\overline{M}$.
Although $D$ is a strong outlier, it carries very little information regarding whether $M$ or $\overline{M}$ is the better model, and therefore very little surprise. Thus an outlier detection method would strongly focus attentional resources onto $D$, although $D$ is a false positive, in the sense that it carries no useful information for discriminating between the two alternative hypotheses $M$ and $\overline{M}$. Figure 1 further illustrates this disconnect between outlier detection and surprise.
2 Human experiments
To test the surprise hypothesis (that surprise attracts human attention and gaze in natural scenes) we recorded eye movements from eight naïve observers (three females and five males, ages 23-32, normal or corrected-to-normal vision). Each watched a subset from 50 videoclips totaling over 25 minutes of playtime (46,489 video frames, 640 × 480, 60.27 Hz, mean screen luminance 30 cd/m², room 4 cd/m², viewing distance 80 cm, field of view 28° × 21°). Clips comprised outdoors daytime and nighttime scenes of crowded environments, video games, and television broadcast including news, sports, and commercials. Right-eye position was tracked with a 240 Hz video-based device (ISCAN RK-464), with methods as described previously [30]. Two hundred calibrated eye movement traces (10,192 saccades) were analyzed, corresponding to four distinct observers for each of the 50 clips. Figure 2 shows sample scanpaths for one videoclip. To characterize image regions selected by participants, we process videoclips through computational metrics that output a topographic dynamic master response map, assigning in real-time a response value to every input location. A good master map would highlight, more than expected by chance, locations gazed to by observers. To score each metric we hence sample, at onset of every human saccade, master map activity around the saccade’s future endpoint, and around a uniformly random endpoint (random sampling was repeated 100 times to evaluate variability).
We quantify differences between histograms of master map samples collected from human and random saccades using again the Kullback-Leibler (KL) distance: metrics which better predict human scanpaths exhibit higher distances from random as, typically, observers non-uniformly gaze towards a minority of regions with highest metric responses while avoiding a majority of regions with low metric responses. This approach presents several advantages over simpler scoring schemes [31, 32], including agnosticity to putative mechanisms for generating saccades and the fact that applying any continuous nonlinearity to master map values would not affect scoring.
Figure 2: (a) Sample eye movement traces from four observers (squares denote saccade endpoints). (b) Our data exhibits high inter-individual overlap, shown here with the locations where one human saccade endpoint was nearby (≈ 5°) one (white squares), two (cyan squares), or all three (black squares) other humans. (c) A metric where the master map was created from the three eye movement traces other than that being tested yields an upper-bound KL score, computed by comparing the histograms of metric values at human (narrow blue bars) and random (wider green bars) saccade targets. Indeed, this metric’s map was very sparse (many random saccades landing on locations with near-zero response), yet humans preferentially saccaded towards the three active hotspots corresponding to the eye positions of three other humans (many human saccades landing on locations with near-unity responses).
Experimental results. We test six computational metrics, encompassing and extending the state-of-the-art found in previous studies. The first three quantify static image properties (local intensity variance in 16 × 16 image patches [31]; local oriented edge density as measured with Gabor filters [33]; and local Shannon entropy in 16 × 16 image patches [34]).
The remaining three metrics are more sensitive to dynamic events (local motion [33]; outlier-based saliency [33]; and surprise [25]). For all metrics, we find that humans are significantly attracted by image regions with higher metric responses. However, the static metrics typically respond vigorously at numerous visual locations (Figure 3), hence they are poorly specific and yield relatively low KL scores between humans and random. The metrics sensitive to motion, outliers, and surprising events, in comparison, yield sparser maps and higher KL scores. The surprise metric of interest here quantifies low-level surprise in image patches over space and time, and at this point does not account for high-level or cognitive beliefs of our human observers. Rather, it assumes a family of simple models for image patches, each processed through 72 early feature detectors sensitive to color, orientation, motion, etc., and computes surprise from shifts in the distribution of beliefs about which models better describe the patches (see [25] and [35] for details). We find that the surprise metric significantly outperforms all other computational metrics ($p < 10^{-100}$ or better on t-tests for equality of KL scores), scoring nearly 20% better than the second-best metric (saliency) and 60% better than the best static metric (entropy). Surprising stimuli often substantially differ from simple feature outliers; for example, a continually blinking light on a static background elicits sustained flicker due to its locally outlier temporal dynamics but is only surprising for a moment. Similarly, a shower of randomly-colored pixels continually excites all low-level feature detectors but rapidly becomes unsurprising.
Strongest attractors of human attention. Clearly, in our and previous eye-tracking experiments, in some situations potentially interesting targets were more numerous than in others.
With many possible targets, different observers may orient towards different locations, making it more difficult for a single metric to accurately predict all observers. Hence we consider (Figure 4) subsets of human saccades where at least two, three, or all four observers simultaneously agreed on a gaze target. Observers could have agreed based on bottom-up factors (e.g., only one location had interesting visual appearance at that time), top-down factors (e.g., only one object was of current cognitive interest), or both (e.g., a single cognitively interesting object was present which also had distinctive appearance). Irrespective of the cause for agreement, it indicates consolidated belief that a location was attractive. While the KL scores of all metrics improved when progressively focusing onto only those locations, dynamic metrics improved more steeply, indicating that stimuli which more reliably attracted all observers carried more motion, saliency, and surprise. Surprise remained significantly the best metric to characterize these agreed-upon attractors of human gaze ($p < 10^{-100}$ or better on t-tests for equality of KL scores). Overall, surprise explained the greatest fraction of human saccades, indicating that humans are significantly attracted towards surprising locations in video displays. Over 72% of all human saccades were targeted to locations predicted to be more surprising than on average. When only considering saccades where two, three, or four observers agreed on a common gaze target, this figure rose to 76%, 80%, and 84%, respectively.
Figure 3: (a) Sample video frames, with corresponding human saccades and predictions from the entropy, surprise, and human-derived metrics. Entropy maps, like intensity variance and orientation maps, exhibited many locations with high responses, hence had low specificity and were poorly discriminative.
In contrast, motion, saliency, and surprise maps were much sparser and more specific, with surprise significantly more often on target. For three example frames (first column), saccades from one subject are shown (arrows) with corresponding apertures over which master map activity at the saccade endpoint was sampled (circles). (b) KL scores for these metrics indicate significantly different performance levels, and a strict ranking of variance < orientation < entropy < motion < saliency < surprise < human-derived. KL scores were computed by comparing the number of human saccades landing onto each given range of master map values (narrow blue bars) to the number of random saccades hitting the same range (wider green bars). A score of zero would indicate equality between the human and random histograms, i.e., humans did not tend to hit various master map values any differently from expected by chance, or, the master map could not predict human saccades better than random saccades. Among the six computational metrics tested in total, surprise performed best, in that surprising locations were relatively few yet reliably gazed to by humans.
Figure 4: KL scores when considering only saccades where at least one (all 10,192 saccades), two (7,948 saccades), three (5,565 saccades), or all four (2,951 saccades) humans agreed on a common gaze location, for the static (a) and dynamic metrics (b). Static metrics improved substantially when progressively focusing onto saccades with stronger inter-observer agreement (average slope 0.56 ± 0.37 percent KL score units per 1,000 pruned saccades). Hence, when humans agreed on a location, they also tended to be more reliably predicted by the metrics. Furthermore, dynamic metrics improved 4.5 times more steeply (slope 2.44 ± 0.37), suggesting a stronger role of dynamic events in attracting human attention.
Surprising events were significantly the strongest (t-tests for equality of KL scores between surprise and other metrics, $p < 10^{-100}$).
3 Discussion
While previous research has shown with either static scenes or dynamic synthetic stimuli that humans preferentially fixate regions of high entropy [34], contrast [31], saliency [32], flicker [36], or motion [37], our data provide direct experimental evidence that humans fixate surprising locations even more reliably. These conclusions were made possible by developing new tools to quantify what attracts human gaze over space and time in dynamic natural scenes. Surprise explained best where humans look when considering all saccades, and even more so when restricting the analysis to only those saccades for which human observers tended to agree. Surprise hence represents an inexpensive, easily computable approximation to human attentional allocation. In the absence of quantitative tools to measure surprise, most experimental and modeling work to date has adopted the approximation that novel events are surprising, and has focused on experimental scenarios which are simple enough to ensure an overlap between informal notions of novelty and surprise: for example, a stimulus is novel during testing if it has not been seen during training [9]. Our definition opens new avenues for more sophisticated experiments, where surprise elicited by different stimuli can be precisely compared and calibrated, yielding predictions at the single-unit as well as behavioral levels. The definition of surprise, as the distance between the posterior and prior distributions of beliefs over models, is entirely general and readily applicable to the analysis of auditory, olfactory, gustatory, or somatosensory data.
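As a minimal illustration of eq. 1 and eq. 2 for a finite model class, the following Python sketch computes the posterior and the resulting surprise in base-2 units (wows). The function names `posterior` and `surprise_wows`, the two-hypothesis priors, and the likelihood values are assumptions made for this example; they are not taken from the authors' toolkit. The sketch also reproduces the outlier argument: data that are equally improbable under both hypotheses leave the beliefs unchanged and therefore carry no surprise.

```python
import math


def posterior(prior, likelihoods):
    """Bayes' theorem (eq. 1) over a finite model class.

    prior[i] is P(M_i); likelihoods[i] is P(D | M_i).
    Assumes the evidence P(D) is non-zero.
    """
    evidence = sum(p * l for p, l in zip(prior, likelihoods))  # P(D)
    return [p * l / evidence for p, l in zip(prior, likelihoods)]


def surprise_wows(prior, likelihoods):
    """Surprise (eq. 2) as KL(posterior, prior), with the integral
    replaced by a sum over models and the log taken in base 2."""
    post = posterior(prior, likelihoods)
    # Terms with zero posterior mass contribute nothing to the sum.
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)


# Hypothetical two-hypothesis example with equal prior beliefs.
prior = [0.5, 0.5]

# D is a strong outlier under both hypotheses, but equally so:
# the posterior equals the prior, so the surprise is zero.
outlier_only = surprise_wows(prior, [1e-6, 1e-6])

# D is far more likely under the first hypothesis: beliefs shift,
# so the surprise is positive.
discriminating = surprise_wows(prior, [0.9, 0.1])

print(outlier_only, discriminating)
```

In this sketch the integral of eq. 2 becomes a sum over models; a continuous model class, such as the family of Poisson rates mentioned in Figure 1, would need a closed-form or numerical integral instead.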
While here we have focused on behavior rather than detailed biophysical implementation, it is worth noting that detecting surprise in neural spike trains does not require semantic understanding of the data carried by the spike trains, and thus could provide guiding signals during self-organization and development of sensory areas. At higher processing levels, top-down cues and task demands are known to combine with stimulus novelty in capturing attention and triggering learning [1, 38], ideas which may now be formalized and quantified in terms of priors, posteriors, and surprise. Surprise, indeed, inherently depends on uncertainty and on prior beliefs. Hence surprise theory can further be tested and utilized in experiments where the prior is biased, for example by top-down instructions or prior exposures to stimuli [38]. In addition, simple surprise-based behavioral measures such as the eye-tracking one used here may prove useful for early diagnosis of human conditions including autism and attention-deficit hyperactivity disorder, as well as for quantitative comparison between humans and animals which may have lower or different priors, including monkeys, frogs, and flies. Beyond sensory biology, computable surprise could guide the development of data mining and compression systems (giving more bits to surprising regions of interest), to find surprising agents in crowds, surprising sentences in books or speeches, surprising sequences in genomes, surprising medical symptoms, surprising odors in airport luggage racks, surprising documents on the world-wide-web, or to design surprising advertisements.
Acknowledgments: Supported by HFSP, NSF and NGA (L.I.), NIH and NSF (P.B.). We thank UCI’s Institute for Genomics and Bioinformatics and USC’s Center for High Performance Computing and Communications (www.usc.edu/hpcc) for access to their computing clusters.
References
[1] Ranganath, C. & Rainer, G. Nat Rev Neurosci 4, 193–202 (2003).
[2] Rao, R. P. & Ballard, D. H. Nat Neurosci 2, 79–87 (1999).
[3] Olshausen, B. A. & Field, D. J. Nature 381, 607–609 (1996).
[4] Müller, J. R., Metha, A. B., Krauskopf, J. & Lennie, P. Science 285, 1405–1408 (1999).
[5] Dragoi, V., Sharma, J., Miller, E. K. & Sur, M. Nat Neurosci 5, 883–891 (2002).
[6] David, S. V., Vinje, W. E. & Gallant, J. L. J Neurosci 24, 6991–7006 (2004).
[7] Maffei, L., Fiorentini, A. & Bisti, S. Science 182, 1036–1038 (1973).
[8] Movshon, J. A. & Lennie, P. Nature 278, 850–852 (1979).
[9] Fecteau, J. H. & Munoz, D. P. Nat Rev Neurosci 4, 435–443 (2003).
[10] Kurahashi, T. & Menini, A. Nature 385, 725–729 (1997).
[11] Bradley, J., Bonigk, W., Yau, K. W. & Frings, S. Nat Neurosci 7, 705–710 (2004).
[12] Ulanovsky, N., Las, L. & Nelken, I. Nat Neurosci 6, 391–398 (2003).
[13] Solomon, S. G., Peirce, J. W., Dhruv, N. T. & Lennie, P. Neuron 42, 155–162 (2004).
[14] Smirnakis, S. M., Berry, M. J. et al. Nature 386, 69–73 (1997).
[15] Brown, S. P. & Masland, R. H. Nat Neurosci 4, 44–51 (2001).
[16] Kennedy, H. J., Evans, M. G. et al. Nat Neurosci 6, 832–836 (2003).
[17] Schultz, W. & Dickinson, A. Annu Rev Neurosci 23, 473–500 (2000).
[18] Fletcher, P. C., Anderson, J. M., Shanks, D. R. et al. Nat Neurosci 4, 1043–1048 (2001).
[19] Knight, R. Nature 383, 256–259 (1996).
[20] Stern, C. E., Corkin, S., Gonzalez, R. G. et al. Proc Natl Acad Sci U S A 93, 8660–8665 (1996).
[21] Li, S., Cullen, W. K., Anwyl, R. & Rowan, M. J. Nat Neurosci 6, 526–531 (2003).
[22] Ebstein, R. P., Novick, O., Umansky, R. et al. Nat Genet 12, 78–80 (1996).
[23] Benjamin, J., Li, L. et al. Nat Genet 12, 81–84 (1996).
[24] Lusher, J. M., Chandler, C. & Ball, D. Mol Psychiatry 6, 497–499 (2001).
[25] Itti, L. & Baldi, P. In Proc. IEEE CVPR. San Diego, CA (2005 in press).
[26] Cox, R. T. Am. J. Phys. 14, 1–13 (1964).
[27] Savage, L. J. The foundations of statistics (Dover, New York, 1972). (First Edition in 1954).
[28] Jaynes, E. T. Probability Theory. The Logic of Science (Cambridge University Press, 2003).
[29] Kullback, S. Information Theory and Statistics (Wiley, New York, 1959).
[30] Itti, L. Visual Cognition (2005 in press).
[31] Reinagel, P. & Zador, A. M. Network 10, 341–350 (1999).
[32] Parkhurst, D., Law, K. & Niebur, E. Vision Res 42, 107–123 (2002).
[33] Itti, L. & Koch, C. Nat Rev Neurosci 2, 194–203 (2001).
[34] Privitera, C. M. & Stark, L. W. IEEE Trans Patt Anal Mach Intell 22, 970–982 (2000).
[35] All source code for all metrics is freely available at http://iLab.usc.edu/toolkit/.
[36] Theeuwes, J. Percept Psychophys 57, 637–644 (1995).
[37] Abrams, R. A. & Christ, S. E. Psychol Sci 14, 427–432 (2003).
[38] Wolfe, J. M. & Horowitz, T. S. Nat Rev Neurosci 5, 495–501 (2004).
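The scoring scheme used in the human experiments of the paper above (histograms of master map values sampled at human and at random saccade endpoints, compared with a KL distance) might be sketched as follows. The bin count, the assumption that map values lie in [0, 1], the small smoothing constant, the direction of the divergence, and the function names are all choices made for this illustration, since the text does not specify them.

```python
import math


def histogram(values, n_bins=10):
    """Normalized histogram of map values assumed to lie in [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        # Clamp the bin index so that v == 1.0 falls in the last bin.
        idx = max(0, min(int(v * n_bins), n_bins - 1))
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def kl_score(human_values, random_values, n_bins=10, eps=1e-9):
    """KL distance (base 2) from the random histogram to the human one.

    eps is a hypothetical smoothing constant that keeps the ratio
    finite when a bin is empty in the random histogram.
    """
    h = histogram(human_values, n_bins)
    r = histogram(random_values, n_bins)
    return sum(
        hk * math.log2((hk + eps) / (rk + eps))
        for hk, rk in zip(h, r)
        if hk > 0
    )


# Made-up samples: human saccades land mostly on high map values,
# random saccades are spread evenly over the range of map values.
human = [0.95] * 8 + [0.05] * 2
rand = [i / 10 + 0.05 for i in range(10)]

print(kl_score(human, rand))   # well above zero
print(kl_score(rand, rand))    # identical histograms give zero
```

A score of zero corresponds to identical human and random histograms, matching the interpretation given in the Figure 3 caption. Fixed-width bins are a simplification of this sketch and are not claimed to match the authors' implementation.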
6 0.48668087 134 nips-2005-Neural mechanisms of contrast dependent receptive field size in V1
7 0.48409379 94 nips-2005-Identifying Distributed Object Representations in Human Extrastriate Visual Cortex
8 0.48043689 169 nips-2005-Saliency Based on Information Maximization
9 0.43373454 129 nips-2005-Modeling Neural Population Spiking Activity with Gibbs Distributions
10 0.3949939 101 nips-2005-Is Early Vision Optimized for Extracting Higher-order Dependencies?
11 0.37822112 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
12 0.3542679 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
13 0.34637439 174 nips-2005-Separation of Music Signals by Harmonic Structure Modeling
14 0.34094822 93 nips-2005-Ideal Observers for Detecting Motion: Correspondence Noise
15 0.3392103 29 nips-2005-Analyzing Coupled Brain Sources: Distinguishing True from Spurious Interaction
16 0.33630681 7 nips-2005-A Cortically-Plausible Inverse Problem Solving Method Applied to Recognizing Static and Kinematic 3D Objects
17 0.33117145 176 nips-2005-Silicon growth cones map silicon retina
18 0.32677567 130 nips-2005-Modeling Neuronal Interactivity using Dynamic Bayesian Networks
19 0.30916679 183 nips-2005-Stimulus Evoked Independent Factor Analysis of MEG Data with Large Background Activity
20 0.30601957 165 nips-2005-Response Analysis of Neuronal Population with Synaptic Depression
topicId topicWeight
[(3, 0.03), (10, 0.033), (27, 0.055), (31, 0.041), (34, 0.03), (39, 0.07), (55, 0.017), (57, 0.03), (60, 0.036), (65, 0.022), (68, 0.31), (69, 0.054), (73, 0.024), (77, 0.013), (88, 0.065), (91, 0.031), (93, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.79636252 203 nips-2005-Visual Encoding with Jittering Eyes
Author: Michele Rucci
Abstract: Under natural viewing conditions, small movements of the eye and body prevent the maintenance of a steady direction of gaze. It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. However, it is unclear whether the physiological self-motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. This study examines the impact of fixational instability on the statistics of visual input to the retina and on the structure of neural activity in the early visual system. Fixational instability introduces fluctuations in the retinal input signals that, in the presence of natural images, lack spatial correlations. These input fluctuations strongly influence neural activity in a model of the LGN. They decorrelate cell responses, even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counter-balance the power-law spectrum of natural images. A decorrelation of neural activity has been proposed to be beneficial for discarding statistical redundancies in the input signals. Fixational instability might, therefore, contribute to establishing efficient representations of natural stimuli.
2 0.44774225 154 nips-2005-Preconditioner Approximations for Probabilistic Graphical Models
Author: John D. Lafferty, Pradeep K. Ravikumar
Abstract: We present a family of approximation techniques for probabilistic graphical models, based on the use of graphical preconditioners developed in the scientific computing literature. Our framework yields rigorous upper and lower bounds on event probabilities and the log partition function of undirected graphical models, using non-iterative procedures that have low time complexity. As in mean field approaches, the approximations are built upon tractable subgraphs; however, we recast the problem of optimizing the tractable distribution parameters and approximate inference in terms of the well-studied linear systems problem of obtaining a good matrix preconditioner. Experiments are presented that compare the new approximation schemes to variational methods.
3 0.40134802 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
Author: Robert A. Legenstein, Wolfgang Maass
Abstract: We investigate under what conditions a neuron can learn by experimentally supported rules for spike timing dependent plasticity (STDP) to predict the arrival times of strong “teacher inputs” to the same neuron. It turns out that in contrast to the famous Perceptron Convergence Theorem, which predicts convergence of the perceptron learning rule for a simplified neuron model whenever a stable solution exists, no equally strong convergence guarantee can be given for spiking neurons with STDP. But we derive a criterion on the statistical dependency structure of input spike trains which characterizes exactly when learning with STDP will converge on average for a simple model of a spiking neuron. This criterion is reminiscent of the linear separability criterion of the Perceptron Convergence Theorem, but it applies here to the rows of a correlation matrix related to the spike inputs. In addition we show through computer simulations for more realistic neuron models that the resulting analytically predicted positive learning results not only hold for the common interpretation of STDP where STDP changes the weights of synapses, but also for a more realistic interpretation suggested by experimental data where STDP modulates the initial release probability of dynamic synapses.
4 0.38490313 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search
Author: Gregory Zelinsky, Wei Zhang, Bing Yu, Xin Chen, Dimitris Samaras
Abstract: To investigate how top-down (TD) and bottom-up (BU) information is weighted in the guidance of human search behavior, we manipulated the proportions of BU and TD components in a saliency-based model. The model is biologically plausible and implements an artificial retina and a neuronal population code. The BU component is based on feature contrast. The TD component is defined by a feature-template match to a stored target representation. We compared the model’s behavior at different mixtures of TD and BU components to the eye movement behavior of human observers performing the identical search task. We found that a purely TD model provides a much closer match to human behavior than any mixture model using BU information. Only when biological constraints are removed (e.g., eliminating the retina) did a BU/TD mixture model begin to approximate human behavior.
5 0.3715409 181 nips-2005-Spiking Inputs to a Winner-take-all Network
Author: Matthias Oster, Shih-Chii Liu
Abstract: Recurrent networks that perform a winner-take-all computation have been studied extensively. Although some of these studies include spiking networks, they consider only analog input rates. We present results of this winner-take-all computation on a network of integrate-and-fire neurons which receives spike trains as inputs. We show how we can configure the connectivity in the network so that the winner is selected after a pre-determined number of input spikes. We discuss spiking inputs with both regular frequencies and Poisson-distributed rates. The robustness of the computation was tested by implementing the winner-take-all network on an analog VLSI array of 64 integrate-and-fire neurons which have an innate variance in their operating parameters.
6 0.36606476 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
7 0.36385834 26 nips-2005-An exploration-exploitation model based on norepinepherine and dopamine activity
8 0.36219254 5 nips-2005-A Computational Model of Eye Movements during Object Class Detection
9 0.36063105 109 nips-2005-Learning Cue-Invariant Visual Responses
10 0.35286471 63 nips-2005-Efficient Unsupervised Learning for Localization and Detection in Object Categories
11 0.35118344 94 nips-2005-Identifying Distributed Object Representations in Human Extrastriate Visual Cortex
12 0.35109776 149 nips-2005-Optimal cue selection strategy
13 0.35085517 30 nips-2005-Assessing Approximations for Gaussian Process Classification
14 0.35078529 141 nips-2005-Norepinephrine and Neural Interrupts
15 0.34939894 169 nips-2005-Saliency Based on Information Maximization
16 0.3492403 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
17 0.34891441 155 nips-2005-Predicting EMG Data from M1 Neurons with Variational Bayesian Least Squares
18 0.34661022 28 nips-2005-Analyzing Auditory Neurons by Learning Distance Functions
19 0.34501445 35 nips-2005-Bayesian model learning in human visual perception
20 0.34484378 99 nips-2005-Integrate-and-Fire models with adaptation are good enough