nips nips2000 nips2000-8 knowledge-graph by maker-knowledge-mining

8 nips-2000-A New Model of Spatial Representation in Multimodal Brain Areas


Source: pdf

Author: Sophie Denève, Jean-René Duhamel, Alexandre Pouget

Abstract: Most models of spatial representations in the cortex assume cells with limited receptive fields that are defined in a particular egocentric frame of reference. However, cells outside of primary sensory cortex are either gain modulated by postural input or partially shifting. We show that solving classical spatial tasks, like sensory prediction, multi-sensory integration, sensory-motor transformation and motor control, requires more complicated intermediate representations that are not invariant in one frame of reference. We present an iterative basis function map that performs these spatial tasks optimally with gain-modulated and partially shifting units, and test it against neurophysiological and neuropsychological data. In order to perform an action directed toward an object, it is necessary to have a representation of its spatial location. The brain must be able to use spatial cues coming from different modalities (e.g. vision, audition, touch, proprioception), combine them to infer the position of the object, and compute the appropriate movement. These cues are in different frames of reference corresponding to different sensory or motor modalities. Visual inputs are primarily encoded in retinotopic maps, auditory inputs are encoded in head-centered maps and tactile cues are encoded in skin-centered maps. Going from one frame of reference to the other might seem easy. For example, the head-centered position of an object can be approximated by the sum of its retinotopic position and the eye position. However, positions are represented by population codes in the brain, and computing a head-centered map from a retinotopic map is a more complex computation than the underlying sum. Moreover, as we get closer to sensory-motor areas it seems reasonable to assume [Figure 1: Response of a VIP cell to visual stimuli appearing in different parts of the screen, for three different eye positions. The level of grey represents the frequency of discharge (in spikes per second). The white cross is the fixation point (the head is fixed). The cell's receptive field moves with the eyes, but only partially: here the receptive-field shift is 60% of the total gaze shift. Moreover, this cell is gain modulated by eye position (adapted from Duhamel et al.).] that the representations should be useful for sensory-motor transformations, rather than encode an

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A new model of spatial representations in multimodal brain areas. [sent-2, score-0.264]

2 Abstract: Most models of spatial representations in the cortex assume cells with limited receptive fields that are defined in a particular egocentric frame of reference. [sent-13, score-0.67]

3 However, cells outside of primary sensory cortex are either gain modulated by postural input or partially shifting. [sent-14, score-0.935]

4 We show that solving classical spatial tasks, like sensory prediction, multi-sensory integration, sensory-motor transformation and motor control requires more complicated intermediate representations that are not invariant in one frame of reference. [sent-15, score-0.653]

5 We present an iterative basis function map that performs these spatial tasks optimally with gain-modulated and partially shifting units, and test it against neurophysiological and neuropsychological data. [sent-16, score-0.766]

6 In order to perform an action directed toward an object, it is necessary to have a representation of its spatial location. [sent-17, score-0.173]

7 The brain must be able to use spatial cues coming from different modalities (e.g. [sent-18, score-0.317]

8 vision, audition, touch, proprioception), combine them to infer the position of the object, and compute the appropriate movement. [sent-20, score-0.295]

9 These cues are in different frames of reference corresponding to different sensory or motor modalities. [sent-21, score-0.429]

10 Visual inputs are primarily encoded in retinotopic maps, auditory inputs are encoded in head-centered maps and tactile cues are encoded in skin-centered maps. [sent-22, score-1.052]

11 Going from one frame of reference to the other might seem easy. [sent-23, score-0.143]

12 For example, the head-centered position of an object can be approximated by the sum of its retinotopic position and the eye position. [sent-24, score-1.244]

13 However, positions are represented by population codes in the brain, and computing a head-centered map from a retinotopic map is a more complex computation than the underlying sum. [sent-25, score-0.667]
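To make the population-code point concrete, here is a minimal sketch (our illustration, not code from the paper; the grid, tuning width and gain are assumptions) of how a position is encoded as a hill of activity over units with Gaussian tuning curves, and why the head-centered hill cannot be obtained by an element-wise sum of the retinal and eye-position hills:

```python
import numpy as np

def population_code(pos, preferred, sigma=8.0, gain=1.0):
    # Response of units with Gaussian tuning curves to a stimulus at `pos`.
    return gain * np.exp(-(pos - preferred) ** 2 / (2.0 * sigma ** 2))

preferred = np.linspace(-40, 40, 81)        # hypothetical grid of preferred positions (deg)
r_hill = population_code(10.0, preferred)   # retinotopic hill peaking at R = 10 deg
e_hill = population_code(-5.0, preferred)   # eye-position hill peaking at E = -5 deg

# The head-centered position is H = R + E = 5 deg, but no element-wise sum of
# the two activity vectors yields the corresponding head-centered hill; the
# transform has to be computed, e.g. with the basis function map described below.
```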

14 Moreover, as we get closer to sensory-motor areas it seems reasonable to assume... Figure 1: Response of a VIP cell to visual stimuli appearing in different parts of the screen, for three different eye positions. [sent-26, score-0.724]

15 The cell's receptive field is moving with the eyes, but only partially. [sent-29, score-0.195]

16 Here the receptive field shift is 60% of the total gaze shift. [sent-30, score-0.318]

17 Moreover, this cell is gain modulated by eye position (adapted from Duhamel et al.). [sent-31, score-0.941]

18 According to the linear model, space is always represented in the sensory and sensory-motor cortex in one particular egocentric frame of reference. [sent-33, score-0.315]

19 This process is mediated by cells whose receptive fields are anchored to a particular body part. [sent-34, score-0.344]

20 In this view, spatial cues coming from different modalities should all be remapped at some point into a common frame of reference, which can in turn be used to compute motor maps (for reaching, grasping, etc.). [sent-35, score-0.576]

21 The linear model was challenged when cells truly invariant in one modality failed to be found in parietal areas. [sent-36, score-0.435]

22 Andersen et al., for example, found retinotopic cells that were gain modulated by eye position in LIP [1], but none of these cells had head-centered receptive fields. [sent-37, score-1.632]

23 Subsequent studies confirmed that gain modulation by eye position is a very general phenomenon in the cortex, whereas truly head-centered or arm-centered cells have rarely been reported. [sent-38, score-0.785]

24 Duhamel et al. found cells that were neither eye-centered nor head-centered, but whose receptive fields were partially moving with the eyes [2]. [sent-40, score-0.807]

25 As a consequence, the receptive fields appeared to be moving both in the retinotopic and head-centered frames of reference (see figure 1). [sent-41, score-0.716]

26 The amount of shift with gaze varied from cell to cell, and was continuously distributed between 0% (head-centered) and 100% (retinotopic). [sent-42, score-0.323]

27 Partially shifting cells were also found for auditory targets in LIP [5] and in the superior colliculus [3]. [sent-43, score-0.52]

28 We present an interconnected network that can perform multi-directional coordinate and sensory-motor transforms by using intermediate basis function units. [sent-45, score-0.495]

29 These intermediate units are gain modulated by eye position, have partially shifting receptive fields and, as a result, represent space in a mixture of frames of reference. [sent-46, score-1.058]

30 They provide a new model of spatial representations in multimodal areas, according to which cell responses are determined not solely by the position of the stimulus in a particular egocentric frame of reference, but by the interactions between the dominant input modalities. [sent-47, score-0.989]

31 1 Sensory predictions and sensory-motor transformations with distributed population codes We will focus on the eye/head system which deals with two frames of reference (retinotopic and head-centered) and one postural input (the eye position). [sent-48, score-0.91]

32 Sensory predictions consist of anticipating a stimulus in one sensory modality from a stimulus originating from the same location, but in another sensory modality. [sent-49, score-0.574]

33 Predictions of auditory stimuli from visual stimuli, for example, require the computation of a head-centered map from a retinotopic map. [sent-50, score-0.923]

34 In addition we suppose that cells are organized topographically in each layer, so that a stimulus at position R with eye position E will give rise to a hill of activity peaking at position R on the retinotopic map and E on the eye-position map. [sent-53, score-2.586]

35 We wish to compute a head-centered map where cell responses are described by head-centered Gaussian tuning curves Bh(H - Hk), where H is the head-centered position and Hk the preferred position. [sent-54, score-0.654]

36 2 Basis function map To solve this problem we could use an intermediate neural layer that implements a product between visual and postural tuning curves [4]. [sent-58, score-0.86]

37 Products of Gaussians are basis functions, and thus a population of retinotopic cells gain modulated by eye position, whose responses are described by Br(R - Ri)Be(E - Ej), implements a basis function map of R and E. [sent-59, score-1.419]

38 Any function f(R, E) can be approximated by a linear combination of these cells' responses: f(R, E) = Σ_ij w_ij Br(R - Ri) Be(E - Ej). (1) [sent-60, score-0.158]

39 In particular, a head-centered map is a function of retinotopic position and eye position and can be computed very easily from the basis function map (by a simple linear combination). [sent-61, score-1.604]

40 Even more importantly, any sensory-motor transform can be implemented by feedforward weights coming from the basis function layer. [sent-62, score-0.151]

41 The basis function map itself can be readily implemented from a retinotopic map and an eye position map, by connecting each unit with one visual cell and one eye position cell, and computing a product between these two inputs [4]. [sent-63, score-2.255]
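A sketch of this construction, under stated assumptions (the one-degree grids, the tuning width, and the hand-picked readout weights are ours, not taken from the paper): the basis function layer is the outer product of a retinotopic and an eye-position tuning curve, and the head-centered map of equation (1) is read out linearly from it.

```python
import numpy as np

def gauss(x, c, sigma=8.0):
    # Gaussian tuning curve evaluated at x for preferred value(s) c.
    return np.exp(-(x - c) ** 2 / (2.0 * sigma ** 2))

centers = np.arange(-40.0, 41.0)   # hypothetical 1-deg grid of preferred values
R, E = 10.0, -5.0                  # retinal stimulus position and eye position

# Basis function layer: unit (i, j) responds as Br(R - Ri) * Be(E - Ej),
# i.e. a retinotopic tuning curve gain-modulated by eye position.
bf = np.outer(gauss(R, centers), gauss(E, centers))

# Head-centered readout, as in equation (1): a linear combination of the
# basis responses. Unit (i, j) projects to head-centered unit k in
# proportion to how close Ri + Ej is to Hk (a simple hand-picked choice).
H_centers = np.arange(-80.0, 81.0)
W = gauss(centers[:, None, None] + centers[None, :, None],
          H_centers[None, None, :])
head_map = np.tensordot(bf, W, axes=([0, 1], [0, 1]))
print(H_centers[np.argmax(head_map)])  # peaks near H = R + E = 5
```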

42 Similarly, another basis function map could be implemented by making the product between auditory and postural tuning curves, Be(E - Ej)Bh(H - Hk), in order to predict the position of a visual cue from the sound it makes, or to compute reaching toward auditory cues. [sent-64, score-1.534]

43 However, it would be better to combine these two basis function maps in a common architecture, especially if we want to integrate visual and auditory inputs or implement motor feedback to sensory representations, both of which require a multi-directional computation. [sent-65, score-0.842]

44 This ensures that the basis function units can use the two sensory maps as both input and output. [sent-67, score-0.343]

45 We implemented this idea in an interconnected neural network that non-linearly combines visual, auditory and postural inputs in an intermediate layer (the basis function map), which in turn is used to reconstruct the activities on the auditory, visual, and postural layers. [sent-68, score-1.411]

46 This network is completely symmetric, similarly processing visual, postural and auditory inputs. [sent-69, score-0.585]

47 It converges to stable hills of activity on the three neural maps that simultaneously give the retinotopic position, head-centered position, and the eye position in the input (see figure 2A), performing multi-directional sensory prediction. [sent-70, score-1.444]

48 For this reason, we called this model an iterative basis function network. [sent-71, score-0.15]

49 3 The iterative basis function network The network is represented in figure 2A. [sent-72, score-0.254]

50 It has four layers: three visible one-dimensional layers (visual, auditory and postural) and a two-dimensional hidden layer. [sent-73, score-0.332]

51 The three input layers are not directly connected to one another, but they are all interconnected with the hidden layer. [sent-74, score-0.189]

52 We denote by Wr, Wh, We the respective weights connecting the retinotopic, head-centered and eye-position layers to the hidden layer. [sent-79, score-0.69]

53 Note that with these weights, the hidden unit (l, m) is maximally connected to unit l in the retinotopic layer, m in the eye-position layer, and l + m in the head-centered layer. [sent-82, score-1.013]
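The following sketch builds such connection matrices, assuming Gaussian connection profiles and a hypothetical layer size (neither is specified in the extracted text):

```python
import numpy as np

N = 41  # units per one-dimensional visible layer (an assumption)

def gauss_profile(delta, sigma=2.0):
    # Gaussian connection profile as a function of the index offset.
    return np.exp(-delta ** 2 / (2.0 * sigma ** 2))

l = np.arange(N)[:, None, None]           # hidden index along the retinotopic axis
m = np.arange(N)[None, :, None]           # hidden index along the eye-position axis
kv = np.arange(N)[None, None, :]          # index of a visible unit (visual/eye layers)
kh = np.arange(2 * N - 1)[None, None, :]  # index of a head-centered unit

# Hidden unit (l, m) connects most strongly to retinotopic unit l,
# eye-position unit m, and head-centered unit l + m, with a Gaussian
# fall-off around each peak.
Wr = np.broadcast_to(gauss_profile(l - kv), (N, N, N))  # constant along the m axis
We = np.broadcast_to(gauss_profile(m - kv), (N, N, N))  # constant along the l axis
Wh = gauss_profile(l + m - kh)                          # shape (N, N, 2N - 1)
```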

54 [Figure 2 residue: only the legend and axis labels are recoverable, namely the gaze conditions G = 100 and G = -100, the axis label "Preferred head centered location", and a schematic of the weight matrix Wh.] [sent-86, score-0.216]

55 The intermediate cells look like partially shifting cells in VIP. [sent-95, score-0.673]

56 B: An intermediate cell's response properties when one varies the ratio Zr/Zh of modality dominance (the relative strength of the weights). [sent-96, score-0.403]

57 The gain of the shift varies from 0 to 1 depending on the relative strength of Wh (the auditory weights) and Wr (the visual weights). [sent-97, score-0.563]

58 Activities on the input layers are pooled linearly onto the intermediate layer, according to the connection matrices. [sent-100, score-0.413]

59 The resulting activities on the intermediate layer are then sent back to the input layers, through the symmetric connections, and in turn squared and normalized. [sent-102, score-0.352]

60 The inputs are modeled by bell-shaped distributions of activity clamped on the input layers at time 0. [sent-103, score-0.255]
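One iteration of this relaxation might look as follows. This is a schematic reading of the pool, square and normalize scheme, reusing the connection matrices from the previous sketch; the normalization constants are chosen arbitrarily rather than taken from the paper.

```python
import numpy as np

def square_normalize(a, S=0.1, mu=0.002):
    # Squaring followed by divisive normalization (S and mu are assumed values).
    a2 = a ** 2
    return a2 / (S + mu * a2.sum())

def step(r_vis, r_eye, r_head, Wr, We, Wh):
    # Pool the three visible layers linearly onto the hidden (basis) layer.
    h = (np.tensordot(Wr, r_vis, axes=([2], [0]))
         + np.tensordot(We, r_eye, axes=([2], [0]))
         + np.tensordot(Wh, r_head, axes=([2], [0])))
    h = square_normalize(h)
    # Send hidden activity back through the symmetric connections, then
    # square and normalize each visible layer.
    r_vis = square_normalize(np.tensordot(Wr, h, axes=([0, 1], [0, 1])))
    r_eye = square_normalize(np.tensordot(We, h, axes=([0, 1], [0, 1])))
    r_head = square_normalize(np.tensordot(Wh, h, axes=([0, 1], [0, 1])))
    return r_vis, r_eye, r_head, h

# Hypothetical usage: clamp bell-shaped hills on the visual and eye-position
# layers at time 0, leave the head-centered layer silent, and iterate.
r_vis = np.exp(-(np.arange(N) - 25.0) ** 2 / 8.0)
r_eye = np.exp(-(np.arange(N) - 15.0) ** 2 / 8.0)
r_head = np.zeros(2 * N - 1)
for _ in range(30):
    r_vis, r_eye, r_head, h = step(r_vis, r_eye, r_head, Wr, We, Wh)
print(np.argmax(r_head))  # near 25 + 15 = 40, the head-centered position
```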

61 The amplitude of these initial hills of activity represents the contrasts of the stimuli. [sent-104, score-0.157]

62 A purely visual stimulus, for example, would have an auditory contrast of 0 on the head-centered layer. [sent-105, score-0.38]

63 Except for very low contrasts in all modalities, the network converges toward non-zero stable states when provided with visual, auditory, or bimodal input. [sent-106, score-0.218]

64 These stable states are hills of activity on the visual, auditory and postural layers, such that the position of the hill on the head-centered layer is the sum of the positions of the hills on the visual and postural layers. [sent-107, score-2.488]

65 When provided with visual and postural input, the network predicts the auditory position of the stimulus. [sent-108, score-1.022]

66 When provided with auditory and postural input, the retinotopic position can be read from the position of the stable hill on the visual layer. [sent-109, score-1.76]

67 Thus, the network is automatically doing coordinate transforms in both directions. [sent-110, score-0.143]
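Reading a position out of a stable hill can be done, for instance, with a center-of-mass decoder; the choice of decoder below is ours, the extracted text does not prescribe one.

```python
import numpy as np

def decode(hill, preferred):
    # Center-of-mass readout of the position encoded by a hill of activity.
    return float((hill * preferred).sum() / hill.sum())

preferred = np.linspace(-80, 80, 161)
hill = np.exp(-(preferred - 5.0) ** 2 / (2 * 8.0 ** 2))  # stable hill at H = 5
print(decode(hill, preferred))  # -> 5.0
```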

68 4 Spatial representation in the intermediate layer The cells in the intermediate layer provide a multimodal representation of space that we can characterize and compare to neurophysiological data. [sent-112, score-0.895]

69 We will focus on the unit's response after the network has reached its stable state. [sent-113, score-0.169]

70 The final state depends only on the position encoded in the input, which implies that the unit's responses are identical regardless of the input modality (visual, auditory or bimodal). [sent-114, score-0.808]

71 The receptive fields in different modalities are spatially congruent, like the receptive fields of most multimodal cells in the brain. [sent-115, score-0.67]

72 In figure 2B, we plotted for different eye positions the activity of an intermediate cell as a function of the retinotopic position of the stimulus. [sent-116, score-1.354]

73 Note that because of the symmetries in the network, all the other intermediate cells' responses are translated versions of this one. [sent-117, score-0.408]

74 The critical parameter that governs the intermediate representation is the ratio Zr/Zh, which defines the relative strength of the visual and auditory weights. [sent-118, score-0.608]

75 When neither the visual nor the auditory representation dominates (that is, when Zr/Zh = 1, see figure 2B, top panel), the intermediate cell's receptive field on the retina shifts with the eyes, but it does not shift as much as the eyes do. [sent-120, score-1.038]

76 This is a partially shifting cell, gain modulated by eye position. [sent-121, score-0.636]

77 The amount of receptive field shift with the gaze is 50%. [sent-122, score-0.318]

78 In fact we found that this cell's response was very close to a product of Gaussians of retinotopic position, head-centered position and eye position, thus implementing the basis function we already proposed as a solution to the multi-directional computation problem. [sent-123, score-1.087]

79 This cell looks very much like a one-dimensional version of the particular VIP cell plotted in figure 1A. [sent-124, score-0.332]

80 There is a continuum between a gain-modulated retinotopic cell for a high value of x (0% shift, figure 2B, middle panel) and a gain-modulated head-centered cell for a low value of x (100% shift, figure 2B, bottom panel). [sent-126, score-1.043]

81 This behavior is easy to understand: an intermediate cell receives tuned retinotopic, head-centered and eye position inputs. [sent-127, score-0.991]
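The continuum of shifts can be reproduced in a toy calculation (our construction, not the paper's derivation): treat an intermediate cell's response as a product of a retinotopic and a head-centered Gaussian, with exponents zr and zh standing in for the relative weight strengths, and measure how far the retinal receptive-field peak moves for a given gaze shift.

```python
import numpy as np

def retinal_rf_peak(E, zr, zh, Ri=0.0, Hk=0.0, sigma=8.0):
    # Retinal position at which the product of the two tuning curves peaks.
    R = np.linspace(-60, 60, 2401)
    resp = (np.exp(-zr * (R - Ri) ** 2 / (2 * sigma ** 2))
            * np.exp(-zh * (R + E - Hk) ** 2 / (2 * sigma ** 2)))
    return R[np.argmax(resp)]

for zr, zh in [(1.0, 1.0), (4.0, 1.0), (1.0, 4.0)]:
    # Shift of the retinal receptive-field peak for a 40-deg gaze shift.
    shift = (retinal_rf_peak(-20.0, zr, zh) - retinal_rf_peak(20.0, zr, zh)) / 40.0
    print(f"zr/zh = {zr / zh:.2f}: RF shift = {100 * shift:.0f}% of the gaze shift")

# Prints 50%, 20% and 80%: equal weights give a half shift, and the shift
# moves toward 0% (retinotopic) or 100% (head-centered) as one modality
# comes to dominate, matching the continuum described in the text.
```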

82 Thus, the whole distribution of shifts found in VIP could belong to an iterative basis function map with varying ratio between visual and "head-centered" weights. [sent-129, score-0.399]

83 On the other hand, if one modality dominates in all cells (e.g. [sent-131, score-0.296]

84 in LIP for vision), we can predict that the distribution of responses will be displaced toward the frame of reference of this modality. [sent-133, score-0.244]

85 5 Lesion of the iterative basis function map In order to link the intermediate representation with spatial representations in the human parietal cortex, we studied the consequences of a lesion to this network. [sent-134, score-0.806]

86 Unilateral right parietal lesions result in a syndrome called hemineglect: the patient is slower to react to, and has difficulty detecting, stimuli in the contralesional space. [sent-135, score-0.191]

87 This is usually coupled with extinction of leftward stimuli by rightward stimuli. [sent-136, score-0.259]

88 Two striking characteristics of hemineglect are that it is usually expressed in a mixture of frames of reference, challenging the view that the parietal cortex is a mosaic of areas devoted to spatial processing in different frames of reference, and that extinction is cross-modal. [sent-137, score-0.464]

89 For example, tactile stimuli can be extinguished by visual stimuli, suggesting that the lesioned spatial representations are themselves multimodal. [sent-139, score-0.452]

90 We modeled a right parietal lesion by implementing a gradient of units in the intermediate layer, so that there are more cells tuned to contralateral retinotopic (visual) and contralateral head-centered (auditory) positions. [sent-140, score-1.009]
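A sketch of this lesion manipulation (the gradient shape and all constants are assumptions): scale the gain of each hidden unit so that units tuned to contralesional (left) positions are progressively under-represented, in the retinotopic and the head-centered frame at once.

```python
import numpy as np

N = 41
l = np.arange(N)[:, None]    # retinotopic preference index (left to right)
m = np.arange(N)[None, :]    # eye-position preference index
head_pref = l + m            # head-centered preference index (0 .. 2N - 2)

# Gain gradient: weakest for leftmost preferences in both frames.
lesion_gain = ((0.2 + 0.8 * l / (N - 1))
               * (0.2 + 0.8 * head_pref / (2 * N - 2)))  # shape (N, N)

# In the relaxation sketched earlier, the pooled hidden activity would be
# multiplied by this gradient before the squaring-normalization step:
#     h = square_normalize(lesion_gain * h_pooled)
# With two simultaneous stimuli, divisive normalization then lets the less
# attenuated, rightward hill win the competition and suppress the leftward
# one: cross-modal extinction, whatever modality carried each stimulus.
```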

91 Thus the network "neglected" stimuli in a mixture of frames of reference: The severity of neglect gradually increased from right to left both in retinotopic and head-centered coordinates. [sent-143, score-0.556]

92 Furthermore, when we entered two simultaneous inputs to the network, we observed that the leftward stimulus was always extinguished by the rightward stimulus (the final stable state reflected only the rightward stimulus), regardless of the modality. [sent-144, score-0.471]

93 Thus we obtained extinction of auditory stimuli by visual stimuli, and vice-versa. [sent-145, score-0.542]

94 In our model, these two aspects of neglect (mixture of frames of reference and cross-modal extinction) can be explained by a lesion in only one multimodal brain area. [sent-146, score-0.316]

95 In this case, the implementation of motor control (the feedback from the motor representations to the sensory representations) will lead to intermediate cells that partially shift in the sensory as well as the motor frame of reference. [sent-148, score-1.233]

96 Iterative basis function maps provide a new model of spatial representations and processing that can be applied to neurophysiological and neuropsychological data. [sent-151, score-0.398]

97 Eye position effect on visual memory and saccade-related activity in areas LIP and 7a of macaque. [sent-158, score-0.515]

98 Spatial invariance of visual receptive fields in parietal cortex. [sent-165, score-0.436]

99 Spatial transformations in the parietal cortex using basis functions. [sent-176, score-0.298]

100 Modulation by the eye position of auditory responses of macaque area LIP in an auditory memory saccade task. [sent-182, score-1.129]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('retinotopic', 0.353), ('eye', 0.301), ('postural', 0.295), ('position', 0.295), ('auditory', 0.238), ('intermediate', 0.193), ('cell', 0.166), ('cells', 0.158), ('visual', 0.142), ('sensory', 0.141), ('modality', 0.138), ('receptive', 0.129), ('vip', 0.118), ('parietal', 0.108), ('map', 0.107), ('shift', 0.106), ('modulated', 0.102), ('basis', 0.1), ('motor', 0.099), ('spatial', 0.094), ('layers', 0.094), ('layer', 0.086), ('shifting', 0.085), ('stimuli', 0.083), ('stable', 0.079), ('duhamel', 0.079), ('extinction', 0.079), ('gain', 0.077), ('modalities', 0.077), ('stimulus', 0.077), ('wh', 0.076), ('reference', 0.075), ('partially', 0.071), ('frame', 0.068), ('frames', 0.068), ('bh', 0.068), ('hills', 0.068), ('rochester', 0.068), ('lip', 0.067), ('maps', 0.066), ('population', 0.064), ('hill', 0.063), ('multimodal', 0.063), ('lesion', 0.061), ('headcentered', 0.059), ('interconnected', 0.059), ('tactile', 0.059), ('representations', 0.058), ('fields', 0.057), ('responses', 0.057), ('eyes', 0.057), ('inputs', 0.056), ('cortex', 0.055), ('coordinate', 0.054), ('network', 0.052), ('coming', 0.051), ('egocentric', 0.051), ('gaze', 0.051), ('leftward', 0.051), ('bt', 0.051), ('iterative', 0.05), ('brain', 0.049), ('activity', 0.046), ('neurophysiological', 0.046), ('rightward', 0.046), ('cues', 0.046), ('head', 0.046), ('encoded', 0.044), ('toward', 0.044), ('contrasts', 0.043), ('colliculus', 0.039), ('extinguished', 0.039), ('hemineglect', 0.039), ('pooled', 0.039), ('hk', 0.038), ('cue', 0.038), ('ej', 0.038), ('response', 0.038), ('transforms', 0.037), ('tuning', 0.037), ('activities', 0.037), ('input', 0.036), ('tuned', 0.036), ('codes', 0.036), ('ri', 0.035), ('representation', 0.035), ('transformations', 0.035), ('moving', 0.034), ('andersen', 0.034), ('contralateral', 0.034), ('dominance', 0.034), ('neuropsychological', 0.034), ('il', 0.034), ('modeled', 0.032), ('unit', 0.032), ('field', 0.032), ('areas', 0.032), ('panel', 0.032), ('connection', 0.031), ('truly', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 8 nips-2000-A New Model of Spatial Representation in Multimodal Brain Areas

Author: Sophie Denève, Jean-René Duhamel, Alexandre Pouget


2 0.20665129 101 nips-2000-Place Cells and Spatial Navigation Based on 2D Visual Feature Extraction, Path Integration, and Reinforcement Learning

Author: Angelo Arleo, Fabrizio Smeraldi, Stéphane Hug, Wulfram Gerstner

Abstract: We model hippocampal place cells and head-direction cells by combining allothetic (visual) and idiothetic (proprioceptive) stimuli. Visual input, provided by a video camera on a miniature robot, is preprocessed by a set of Gabor filters on 31 nodes of a log-polar retinotopic graph. Unsupervised Hebbian learning is employed to incrementally build a population of localized overlapping place fields. Place cells serve as basis functions for reinforcement learning. Experimental results for goal-oriented navigation of a mobile robot are presented.

3 0.18380463 87 nips-2000-Modelling Spatial Recall, Mental Imagery and Neglect

Author: Suzanna Becker, Neil Burgess

Abstract: We present a computational model of the neural mechanisms in the parietal and temporal lobes that support spatial navigation, recall of scenes and imagery of the products of recall. Long term representations are stored in the hippocampus, and are associated with local spatial and object-related features in the parahippocampal region. Viewer-centered representations are dynamically generated from long term memory in the parietal part of the model. The model thereby simulates recall and imagery of locations and objects in complex environments. After parietal damage, the model exhibits hemispatial neglect in mental imagery that rotates with the imagined perspective of the observer, as in the famous Milan Square experiment [1]. Our model makes novel predictions for the neural representations in the parahippocampal and parietal regions and for behavior in healthy volunteers and neuropsychological patients.

4 0.17916915 89 nips-2000-Natural Sound Statistics and Divisive Normalization in the Auditory System

Author: Odelia Schwartz, Eero P. Simoncelli

Abstract: We explore the statistical properties of natural sound stimuli preprocessed with a bank of linear filters. The responses of such filters exhibit a striking form of statistical dependency, in which the response variance of each filter grows with the response amplitude of filters tuned for nearby frequencies. These dependencies may be substantially reduced using an operation known as divisive normalization, in which the response of each filter is divided by a weighted sum of the rectified responses of other filters. The weights may be chosen to maximize the independence of the normalized responses for an ensemble of natural sounds. We demonstrate that the resulting model accounts for nonlinearities in the response characteristics of the auditory nerve, by comparing model simulations to electrophysiological recordings. In previous work (NIPS, 1998) we demonstrated that an analogous model derived from the statistics of natural images accounts for non-linear properties of neurons in primary visual cortex. Thus, divisive normalization appears to be a generic mechanism for eliminating a type of statistical dependency that is prevalent in natural signals of different modalities.

5 0.14123167 40 nips-2000-Dendritic Compartmentalization Could Underlie Competition and Attentional Biasing of Simultaneous Visual Stimuli

Author: Kevin A. Archie, Bartlett W. Mel

Abstract: Neurons in area V4 have relatively large receptive fields (RFs), so multiple visual features are simultaneously

6 0.13370278 10 nips-2000-A Productive, Systematic Framework for the Representation of Visual Structure

7 0.12600543 102 nips-2000-Position Variance, Recurrence and Perceptual Learning

8 0.099519089 45 nips-2000-Emergence of Movement Sensitive Neurons' Properties by Learning a Sparse Code for Natural Moving Images

9 0.093685001 19 nips-2000-Adaptive Object Representation with Hierarchically-Distributed Memory Sites

10 0.089741334 88 nips-2000-Multiple Timescales of Adaptation in a Neural Code

11 0.083642364 42 nips-2000-Divisive and Subtractive Mask Effects: Linking Psychophysics and Biophysics

12 0.075545825 109 nips-2000-Redundancy and Dimensionality Reduction in Sparse-Distributed Representations of Natural Objects in Terms of Their Local Features

13 0.066704914 34 nips-2000-Competition and Arbors in Ocular Dominance

14 0.066205382 104 nips-2000-Processing of Time Series by Neural Circuits with Biologically Realistic Synaptic Dynamics

15 0.065572314 32 nips-2000-Color Opponency Constitutes a Sparse Representation for the Chromatic Structure of Natural Scenes

16 0.063626185 118 nips-2000-Smart Vision Chip Fabricated Using Three Dimensional Integration Technology

17 0.061663095 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications

18 0.061457548 147 nips-2000-Who Does What? A Novel Algorithm to Determine Function Localization

19 0.061197586 43 nips-2000-Dopamine Bonuses

20 0.05831784 96 nips-2000-One Microphone Source Separation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.182), (1, -0.227), (2, -0.099), (3, 0.07), (4, -0.066), (5, 0.042), (6, 0.166), (7, -0.158), (8, 0.312), (9, -0.103), (10, 0.004), (11, 0.148), (12, -0.124), (13, 0.058), (14, 0.055), (15, -0.011), (16, 0.056), (17, 0.044), (18, 0.035), (19, 0.017), (20, -0.013), (21, 0.109), (22, 0.031), (23, 0.073), (24, -0.056), (25, -0.136), (26, -0.127), (27, 0.003), (28, 0.005), (29, -0.087), (30, 0.01), (31, -0.062), (32, 0.057), (33, 0.034), (34, 0.026), (35, -0.021), (36, 0.011), (37, -0.025), (38, 0.018), (39, -0.062), (40, -0.038), (41, 0.005), (42, -0.005), (43, -0.073), (44, -0.039), (45, -0.037), (46, -0.055), (47, 0.184), (48, -0.137), (49, 0.008)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98364568 8 nips-2000-A New Model of Spatial Representation in Multimodal Brain Areas

Author: Sophie Denève, Jean-René Duhamel, Alexandre Pouget


2 0.80486208 101 nips-2000-Place Cells and Spatial Navigation Based on 2D Visual Feature Extraction, Path Integration, and Reinforcement Learning

Author: Angelo Arleo, Fabrizio Smeraldi, Stéphane Hug, Wulfram Gerstner


3 0.78042239 87 nips-2000-Modelling Spatial Recall, Mental Imagery and Neglect

Author: Suzanna Becker, Neil Burgess


4 0.51036358 10 nips-2000-A Productive, Systematic Framework for the Representation of Visual Structure

Author: Shimon Edelman, Nathan Intrator

Abstract: We describe a unified framework for the understanding of structure representation in primate vision. A model derived from this framework is shown to be effectively systematic in that it has the ability to interpret and associate together objects that are related through a rearrangement of common

5 0.50257045 40 nips-2000-Dendritic Compartmentalization Could Underlie Competition and Attentional Biasing of Simultaneous Visual Stimuli

Author: Kevin A. Archie, Bartlett W. Mel


6 0.49523056 89 nips-2000-Natural Sound Statistics and Divisive Normalization in the Auditory System

7 0.43508038 42 nips-2000-Divisive and Subtractive Mask Effects: Linking Psychophysics and Biophysics

8 0.38766205 102 nips-2000-Position Variance, Recurrence and Perceptual Learning

9 0.36431256 109 nips-2000-Redundancy and Dimensionality Reduction in Sparse-Distributed Representations of Natural Objects in Terms of Their Local Features

10 0.31402871 19 nips-2000-Adaptive Object Representation with Hierarchically-Distributed Memory Sites

11 0.29531071 88 nips-2000-Multiple Timescales of Adaptation in a Neural Code

12 0.25709823 125 nips-2000-Stability and Noise in Biochemical Switches

13 0.24617086 118 nips-2000-Smart Vision Chip Fabricated Using Three Dimensional Integration Technology

14 0.24225174 65 nips-2000-Higher-Order Statistical Properties Arising from the Non-Stationarity of Natural Signals

15 0.24058487 45 nips-2000-Emergence of Movement Sensitive Neurons' Properties by Learning a Sparse Code for Natural Moving Images

16 0.23435245 15 nips-2000-Accumulator Networks: Suitors of Local Probability Propagation

17 0.23417053 56 nips-2000-Foundations for a Circuit Complexity Theory of Sensory Processing

18 0.23092102 32 nips-2000-Color Opponency Constitutes a Sparse Representation for the Chromatic Structure of Natural Scenes

19 0.2225997 34 nips-2000-Competition and Arbors in Ocular Dominance

20 0.21542405 141 nips-2000-Universality and Individuality in a Neural Code


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.338), (4, 0.024), (10, 0.022), (17, 0.118), (33, 0.023), (36, 0.017), (38, 0.036), (42, 0.068), (55, 0.04), (62, 0.034), (65, 0.038), (67, 0.046), (76, 0.035), (81, 0.024), (90, 0.015), (97, 0.013)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.82571542 8 nips-2000-A New Model of Spatial Representation in Multimodal Brain Areas

Author: Sophie Denève, Jean-René Duhamel, Alexandre Pouget


2 0.6559428 60 nips-2000-Gaussianization

Author: Scott Saobing Chen, Ramesh A. Gopinath

Abstract: High dimensional data modeling is difficult mainly because the so-called

3 0.58355695 52 nips-2000-Fast Training of Support Vector Classifiers

Author: Fernando Pérez-Cruz, Pedro Luis Alarcón-Diana, Angel Navia-Vázquez, Antonio Artés-Rodríguez

Abstract: In this communication we present a new algorithm for solving Support Vector Classifiers (SVC) with large training data sets. The new algorithm is based on an Iterative Re-Weighted Least Squares procedure which is used to optimize the SVc. Moreover, a novel sample selection strategy for the working set is presented, which randomly chooses the working set among the training samples that do not fulfill the stopping criteria. The validity of both proposals, the optimization procedure and sample selection strategy, is shown by means of computer experiments using well-known data sets. 1 INTRODUCTION The Support Vector Classifier (SVC) is a powerful tool to solve pattern recognition problems [13, 14] in such a way that the solution is completely described as a linear combination of several training samples, named the Support Vectors. The training procedure for solving the SVC is usually based on Quadratic Programming (QP) which presents some inherent limitations, mainly the computational complexity and memory requirements for large training data sets. This problem is typically avoided by dividing the QP problem into sets of smaller ones [6, 1, 7, 11], that are iteratively solved in order to reach the SVC solution for the whole set of training samples. These schemes rely on an optimizing engine, QP, and in the sample selection strategy for each sub-problem, in order to obtain a fast solution for the SVC. An Iterative Re-Weighted Least Squares (IRWLS) procedure has already been proposed as an alternative solver for the SVC [10] and the Support Vector Regressor [9], being computationally efficient in absolute terms. In this communication, we will show that the IRWLS algorithm can replace the QP one in any chunking scheme in order to find the SVC solution for large training data sets. Moreover, we consider that the strategy to decide which training samples must j oin the working set is critical to reduce the total number of iterations needed to attain the SVC solution, and the runtime complexity as a consequence. To aim for this issue, the computer program SV cradit have been developed so as to solve the SVC for large training data sets using IRWLS procedure and fixed-size working sets. The paper is organized as follows. In Section 2, we start by giving a summary of the IRWLS procedure for SVC and explain how it can be incorporated to a chunking scheme to obtain an overall implementation which efficiently deals with large training data sets. We present in Section 3 a novel strategy to make up the working set. Section 4 shows the capabilities of the new implementation and they are compared with the fastest available SVC implementation, SV Mlight [6]. We end with some concluding remarks. 2 IRWLS-SVC In order to solve classification problems, the SVC has to minimize Lp = ~llwI12+CLei- LJliei- LQi(Yi(¢(xifw+b)-l+ei) (1) i i i with respectto w, band ei and maximize it with respectto Qi and Jli, subject to Qi, Jli ~ 0, where ¢(.) is a nonlinear transformation (usually unknown) to a higher dimensional space and C is a penalization factor. The solution to (1) is defined by the Karush-Kuhn-Tucker (KKT) conditions [2]. For further details on the SVC, one can refer to the tutorial survey by Burges [2] and to the work ofVapnik [13, 14]. In order to obtain an IRWLS procedure we will first need to rearrange (1) in such a way that the terms depending on ei can be removed because, at the solution C - Qi - Jli = 0 Vi (one of the KKT conditions [2]) must hold. 
Lp = 1 Qi(l- Yi(¢T(Xi)W + b)) 211wl12 + L i = (2) where The weighted least square nature of (2) can be understood if ei is defined as the error on each sample and ai as its associated weight, where! IIwl1 2 is a regularizing functional. The minimization of (2) cannot be accomplished in a single step because ai = ai(ei), and we need to apply an IRWLS procedure [4], summarized below in tree steps: 1. Considering the ai fixed, minimize (2). 2. Recalculate ai from the solution on step 1. 3. Repeat until convergence. In order to work with Reproducing Kernels in Hilbert Space (RKHS), as the QP procedure does, we require that w = Ei (JiYi¢(Xi) and in order to obtain a non-zero b, that Ei {JiYi = O. Substituting them into (2), its minimum with respect to {Ji and b for a fixed set of ai is found by solving the following linear equation system l (3) IThe detailed description of the steps needed to obtain (3) from (2) can be found in [10]. where y = [Yl, Y2, ... Yn]T (4) 'r/i,j = 1, ... ,n 'r/i,j = 1, ... ,n (H)ij = YiYj¢T(Xi)¢(Xj) = YiyjK(Xi,Xj) (Da)ij = aio[i - j] 13 = [,81, ,82, ... (5) (6) (7) , ,8n]T and 0[·] is the discrete impulse function. Finally, the dependency of ai upon the Lagrange multipliers is eliminated using the KKT conditions, obtaining a, ai 2.1 ={~ ei Yi' eiYi < Yt.et. > - ° ° (8) IRWLS ALGORITHMIC IMPLEMENTATION The SVC solution with the IRWLS procedure can be simplified by dividing the training samples into three sets. The first set, SI, contains the training samples verifying < ,8i < C, which have to be determined by solving (3). The second one, S2, includes every training sample whose,8i = 0. And the last one, S3, is made up of the training samples whose ,8i = C. This division in sets is fully justified in [10]. The IRWLS-SVC algorithm is shown in Table 1. ° 0. Initialization: SI will contain every training sample, S2 = 0 and S3 = 0. Compute H. e_a = y, f3_a = 0, b_a = 0, G 13 = Gin, a = 1 and G b3 = G bi n . 1 Solve [ (H)Sb S1 + D(al S1 . =° = e-lt a, 3. ai = { ~ (13) S2 2. e ° 1[ (Y)Sl (f3)Sl ] (y ) ~1 b and (13) Ss = C DyH(f3 - f3_a) - (b - b_a)1 =[1- G 13 ] G b3 ' °. eiYi < e- _ > O'r/Z E SI U S2 U S3 tYt 4. Sets reordering: a. Move every sample in S3 with eiYi < to S2. b. Move every sample in SI with ,8i = C to S3. c. Move every sample in SI with ai = to S2 . d. Move every sample in S2 with ai :I to SI. 5. e_a = e, f3_a = 13, G 13 = (H)Sl,SS (f3)ss + (G in )Sl' b-lt = band Gb3 = -y~s (f3)ss + Gbin · 6. Go to step 1 and repeat until convergence. ei Yi ' ° ° ° Table 1: IRWLS-SVC algorithm. The IRWLS-SVC procedure has to be slightly modified in order to be used inside a chunk:ing scheme as the one proposed in [8, 6], such that it can be directly applied in the one proposed in [1]. A chunking scheme is needed to solve the SVC whenever H is too large to fit into memory. In those cases, several SVC with a reduced set of training samples are iteratively solved until the solution for the whole set is found. The samples are divide into a working set, Sw, which is solved as a full SVC problem, and an inactive set, Sin. If there are support vectors in the inactive set, as it might be, the inactive set modifies the IRWLSSVC procedure, adding a contribution to the independent term in the linear equation system (3) . Those support vectors in S in can be seen as anchored samples in S3, because their ,8i is not zero and can not be modified by the IRWLS procedure. 
Then, such contribution (Gin and G bin ) will be calculated as G 13 and G b3 are (Table 1, 5th step), before calling the IRWLS-SVC algorithm. We have already modified the IRWLS-SVC in Table 1 to consider Gin and G bin , which must be set to zero if the Hessian matrix, H, fits into memory for the whole set of training samples. The resolution of the SVC for large training data sets, employing as minimization engine the IRWLS procedure, is summarized in the following steps: 1. Select the samples that will form the working set. 2. Construct Gin = (H)Sw,Sin (f3)s.n and G bin = -yIin (f3)Sin 3. Solve the IRWLS-SVC procedure, following the steps in Table 1. 4. Compute the error of every training sample. 5. If the stopping conditions Yiei < C eiYi> -c leiYil < C 'Vii 'Vii 'Vii (Ji = 0 (Ji = C 0 < (Ji < C (9) (10) (11) are fulfilled, the SVC solution has been reached. The stopping conditions are the ones proposed in [6] and C must be a small value around 10 - 3 , a full discussion concerning this topic can be found in [6]. 3 SAMPLE SELECTION STRATEGY The selection of the training samples that will constitute the working set in each iteration is the most critical decision in any chunking scheme, because such decision is directly involved in the number of IRWLS-SVC (or QP-SVC) procedures to be called and in the number of reproducing kernel evaluations to be made, which are, by far, the two most time consuming operations in any chunking schemes. In order to solve the SVC efficiently, we first need to define a candidate set of training samples to form the working set in each iteration. The candidate set will be made up, as it could not be otherwise, with all the training samples that violate the stopping conditions (9)-(11); and we will also add all those training samples that satisfy condition (11) but a small variation on their error will make them violate such condition. The strategies to select the working set are as numerous as the number of problems to be solved, but one can think three different simple strategies: • Select those samples which do not fulfill the stopping criteria and present the largest Iei I values. • Select those samples which do not fulfill the stopping criteria and present the smallest Iei I values. • Select them randomly from the ones that do not fulfill the stopping conditions. The first strategy seems the more natural one and it was proposed in [6]. If the largest leil samples are selected we guanrantee that attained solution gives the greatest step towards the solution of (1). But if the step is too large, which usually happens, it will cause the solution in each iteration and the (Ji values to oscillate around its optimal value. The magnitude of this effect is directly proportional to the value of C and q (size of the working set), so in the case ofsmall C (C < 10) and low q (q < 20) it would be less noticeable. The second one is the most conservative strategy because we will be moving towards the solution of (1) with small steps. Its drawback is readily discerned if the starting point is inappropriate, needing too many iterations to reach the SVC solution. The last strategy, which has been implemented together with the IRWLS-SVC procedure, is a mid-point between the other two, but if the number of samples whose 0 < (3i < C increases above q there might be some iterations where we will make no progress (working set is only made up of the training samples that fulfill the stopping condition in (11)). 
3 SAMPLE SELECTION STRATEGY

The selection of the training samples that will constitute the working set in each iteration is the most critical decision in any chunking scheme, because this decision directly determines the number of IRWLS-SVC (or QP-SVC) procedures to be called and the number of reproducing kernel evaluations to be made, which are by far the two most time-consuming operations in any chunking scheme. In order to solve the SVC efficiently, we first need to define a candidate set of training samples from which the working set is formed in each iteration. The candidate set is made up of all the training samples that violate the stopping conditions (9)-(11), plus all those training samples that satisfy condition (11) but for which a small variation of their error would make them violate it. The strategies for selecting the working set are as numerous as the problems to be solved, but three simple ones come to mind:

• Select those samples which do not fulfill the stopping criteria and present the largest |e_i| values.
• Select those samples which do not fulfill the stopping criteria and present the smallest |e_i| values.
• Select them randomly from among the ones that do not fulfill the stopping conditions.

The first strategy seems the most natural one, and it is the one proposed in [6]. If the samples with the largest |e_i| are selected, we guarantee that the attained solution gives the greatest step towards the solution of (1). But if the step is too large, which usually happens, it causes the solution in each iteration, and the β_i values, to oscillate around their optimal values. The magnitude of this effect is directly proportional to the values of C and q (the size of the working set), so for small C (C < 10) and low q (q < 20) it is less noticeable. The second strategy is the most conservative one, because we move towards the solution of (1) in small steps. Its drawback is readily discerned if the starting point is inappropriate: too many iterations are then needed to reach the SVC solution. The last strategy, which has been implemented together with the IRWLS-SVC procedure, is a mid-point between the other two; however, if the number of samples with 0 < β_i < C grows above q, there may be iterations in which no progress is made, because the working set is then made up only of training samples that fulfill stopping condition (11). This situation is easily avoided by introducing, per class, one sample that violates each of the stopping conditions. Finally, if the cardinality of the candidate set is less than q, the working set is completed with those samples that do fulfill the stopping conditions and present the least |e_i|. In summary, the proposed sample selection strategy is² (see the sketch after this list):

1. Construct the candidate set S_c from those samples that do not fulfill the stopping conditions (9) and (10), and those samples whose β obeys 0 < β_i < C.
2. If |S_c| < q, go to step 5.
3. Choose a sample per class that violates each one of the stopping conditions and move them from S_c to the working set S_w.
4. Choose randomly q − |S_w| samples from S_c and move them to S_w. Go to step 6.
5. Move every sample from S_c to S_w, and complete S_w with the q − |S_w| samples that fulfill the stopping conditions (9) and (10) and present the lowest |e_i| values.
6. Go on, obtaining G_in and G_bin.

²In what follows, |·| represents absolute value for numbers and cardinality for sets.
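A compact, illustrative rendering of these six steps is given below. The name select_working_set is hypothetical, the per-class bookkeeping of step 3 is handled in the simplest possible way, and ey is assumed to hold e_i y_i for every sample (note that |e_i y_i| = |e_i|, since y_i = ±1).

```python
import numpy as np

def select_working_set(ey, beta, y, C, q, eps=1e-3, rng=None):
    """Steps 1-6 of the sample selection strategy (illustrative sketch)."""
    rng = rng or np.random.default_rng()
    viol9 = (beta == 0.0) & (ey >= eps)              # violators of (9)
    viol10 = (beta == C) & (ey <= -eps)              # violators of (10)
    # Step 1: candidate set = violators of (9)-(10) plus 0 < beta_i < C.
    Sc = np.flatnonzero(viol9 | viol10 | ((beta > 0.0) & (beta < C)))
    Sw = []
    if len(Sc) >= q:                                 # Step 2: else go to step 5
        # Step 3: one violator of each condition per class, when available.
        for viol in (viol9, viol10):
            for cls in (-1.0, 1.0):
                idx = np.flatnonzero(viol & (y == cls))
                if len(idx) > 0 and int(idx[0]) not in Sw:
                    Sw.append(int(idx[0]))
        # Step 4: fill up to q samples at random from the rest of S_c.
        rest = np.setdiff1d(Sc, Sw)
        pick = rng.choice(rest, size=max(0, min(q - len(Sw), len(rest))),
                          replace=False)
        Sw.extend(int(i) for i in pick)
    else:
        # Step 5: take all of S_c, then the smallest |e_i| among the rest.
        Sw = [int(i) for i in Sc]
        rest = np.setdiff1d(np.arange(len(y)), Sc)
        order = rest[np.argsort(np.abs(ey[rest]))]
        Sw.extend(int(i) for i in order[: q - len(Sw)])
    return np.array(sorted(Sw))      # Step 6: the caller now forms G_in, G_bin
```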
4 BENCHMARK FOR THE IRWLS-SVC

We have prepared two different experiments to test both the IRWLS procedure and the sample selection strategy for solving the SVC. The first one compares the IRWLS against QP, and the second one compares the sample selection strategy, together with the IRWLS, against a complete SVC solving procedure, SVMlight.

In the first trial, we replaced the LOQO interior point optimizer used by SVMlight version 3.02 [5] with the IRWLS-SVC procedure in Table 1, to compare both optimizing engines under an equal sample selection strategy. The comparison was made on a Pentium III-450 MHz with 128 MB running Windows 98, and the programs were compiled using Microsoft Developer 6.0. In Table 2 we show the results for two data sets: the first one, containing 4781 training samples, needs most of its CPU resources to compute the RKHS, while the second one, containing 2175 training samples, uses most of its CPU resources to solve the SVC for each S_w; q indicates the size of the working set.

       Adult4 (4781 samples)                Splice (2175 samples)
       CPU time         Optimize time       CPU time         Optimize time
q      LOQO    IRWLS    LOQO    IRWLS       LOQO    IRWLS    LOQO    IRWLS
20     21.25   20.70    0.61    0.39        46.19   30.76    21.94   4.77
40     20.60   19.22    1.01    0.17        71.34   24.93    46.26   8.07
70     21.15   18.72    2.30    0.46        53.77   20.32    34.24   7.72

Table 2: CPU time indicates the time consumed, in seconds, by the whole procedure; Optimize time indicates the time consumed, in seconds, by the LOQO or IRWLS optimizer alone.

The value of C has been set to 1 and 1000, respectively, and a Radial Basis Function (RBF) RKHS [2] has been employed, with its parameter σ set, respectively, to 10 and 70. As can be seen, SVMlight with IRWLS is significantly faster than with the LOQO procedure in all cases. The kernel cache size has been set to 64 MB for both data sets and both procedures. The results in Table 2 validate the IRWLS procedure as the faster SVC solver.

For the second trial, we have compiled a computer program that uses the IRWLS-SVC procedure and the working set selection of Section 3; we will refer to it as svcradit from now on. We have borrowed the chunking and shrinking ideas from SVMlight [6] for our program. Several data sets have been used to test the two programs. The Adult and Web data sets were obtained from J. Platt's web page http://research.microsoft.com/~jplatt/smo.html; the Gauss-M data set is a two-dimensional classification problem proposed in [3] to test neural networks, comprising one Gaussian random variable per class, with the two classes highly overlapping. The Banana, Diabetes and Splice data sets were obtained from Gunnar Rätsch's web page http://svm.first.gmd.de/~raetsch. The selection of C and of the RKHS was done as indicated in [11] for the Adult and Web data sets, and as indicated at http://svm.first.gmd.de/~raetsch for the Banana, Diabetes and Splice data sets. In Table 3 we show the runtime for each data set, where the value of q has been selected as the one that minimizes it.

Database   Dim   N Sampl.   C       σ    SV      q radit   q light   CPU radit   CPU light
Adult6     123   11221      1       10   4477    150       130       118.20      124.46
Adult9     123   32562      1       10   12181   100       100       1093.29     1097.09
Adult1     123   1605       1000    10   630     150       70        25.98       113.54
Web1       300   2477       5       10   224     100       40        2.42        2.36
Web7       300   24693      5       10   1444    70        40        158.13      124.57
Gauss-M    2     4000       1       1    1736    150       40        12.69       48.28
Gauss-M    2     4000       100     1    1516    70        10        61.68       3053.20
Banana     2     400        316.2   1    80      10        10        0.33        0.77
Banana     2     4900       316.2   1    1084    10        10        22.46       1786.56
Diabetes   8     768        10      2    409     70        40        2.41        6.04
Splice     69    2175       1000    70   525     10        20        14.06       49.19

Table 3: Runtime, in seconds, on several data sets when solved with svcradit (radit for short) and SVMlight (light for short).

One can appreciate that svcradit is faster than SVMlight for most data sets. For the Web data sets, the only ones on which SVMlight is slightly faster, the value of C is low and most training samples end up as support vectors with β_i < C. In such cases the best strategy is to take the largest step towards the solution in every iteration, as SVMlight does [6], because the β_i of most training samples will not be affected by the β_j values of the other training samples. But as the value of C increases, the svcradit sample selection strategy becomes much more appropriate than the one used in SVMlight.

5 CONCLUSIONS

In this communication a new algorithm for solving the SVC for large training data sets has been presented. Its two major contributions concern the optimizing engine and the sample selection strategy. An IRWLS procedure is used to solve the SVC in each step; it is much faster than the usual QP procedure and simpler to implement, because its most difficult step is the solution of a linear equation system, which can easily be obtained by means of LU decomposition [12]. Random working set selection from the samples not fulfilling the KKT conditions is the best option when the working set is large, because it reduces the number of chunks to be solved. This strategy benefits from the IRWLS procedure, which makes it practical to work with large training data sets. All these modifications have been brought together in the svcradit solving procedure, publicly available at http://svm.tsc.uc3m.es/.

6 ACKNOWLEDGEMENTS

We are sincerely grateful to Thorsten Joachims, who allowed and encouraged us to use his SVMlight to test our IRWLS procedure; the comparisons could not have been properly done otherwise.

References

[1] B. E. Boser, I. M. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In 5th Annual Workshop on Computational Learning Theory, Pittsburgh, U.S.A., 1992.
[2] C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
[3] S. Haykin. Neural Networks: A Comprehensive Foundation. Prentice-Hall, 1994.
[4] P. W. Holland and R. E. Welsch. Robust regression using iteratively reweighted least squares. Communications in Statistics - Theory and Methods, A6(9):813-827, 1977.
[5] T. Joachims. http://www-ai.informatik.uni-dortmund.de/forschung/verfahren/svmlight/svmlight.eng.html. Technical report, University of Dortmund, Informatik, AI-Unit Collaborative Research Center on 'Complexity Reduction in Multivariate Data', 1998.
[6] T. Joachims. Making large scale SVM learning practical. In Advances in Kernel Methods - Support Vector Learning, B. Schölkopf, C. J. C. Burges and A. J. Smola, editors, pages 169-184. M.I.T. Press, 1999.
[7] E. Osuna, R. Freund, and F. Girosi. An improved training algorithm for support vector machines. In Proc. of the 1997 IEEE Workshop on Neural Networks for Signal Processing, pages 276-285, Amelia Island, U.S.A., 1997.
[8] E. Osuna and F. Girosi. Reducing the run-time complexity of support vector machines. In ICPR'98, Brisbane, Australia, August 1998.
[9] F. Perez-Cruz, A. Navia-Vazquez

4 0.3975035 42 nips-2000-Divisive and Subtractive Mask Effects: Linking Psychophysics and Biophysics

Author: Barbara Zenger, Christof Koch

Abstract: We describe an analogy between psychophysically measured effects in contrast masking and the behavior of a simple integrate-and-fire neuron that receives time-modulated inhibition. In the psychophysical experiments, we tested observers' ability to discriminate contrasts of peripheral Gabor patches in the presence of collinear Gabor flankers. The data reveal a complex interaction pattern that we account for by assuming that flankers provide divisive inhibition to the target unit for low target contrasts, but subtractive inhibition to the target unit for higher target contrasts. A similar switch from divisive to subtractive inhibition is observed in an integrate-and-fire unit that receives inhibition modulated in time such that the cell spends part of the time in a high-inhibition state and part of the time in a low-inhibition state. The similarity between the effects suggests that one may cause the other. The biophysical model makes testable predictions for physiological single-cell recordings. 1 Psychophysics Visual images of Gabor patches are thought to excite a small and specific subset of neurons in the primary visual cortex and beyond. By psychophysically measuring in humans the contrast detection and discrimination thresholds of peripheral Gabor patches, one can estimate the sensitivity of this subset of neurons. Furthermore, spatial interactions between different neuronal populations can be probed by testing the effects of additional Gabor patches (masks) on performance. Such experiments have revealed a highly configuration-specific pattern of excitatory and inhibitory spatial interactions [1, 2]. 1.1 Methods Two vertical Gabor patches with a spatial frequency of 4 cyc/deg were presented at 4 deg eccentricity left and right of fixation, and observers had to report which patch had the higher contrast (spatial 2AFC). In the

5 0.39028633 88 nips-2000-Multiple Timescales of Adaptation in a Neural Code

Author: Adrienne L. Fairhall, Geoffrey D. Lewen, William Bialek, Robert R. de Ruyter van Steveninck

Abstract: Many neural systems extend their dynamic range by adaptation. We examine the timescales of adaptation in the context of dynamically modulated rapidly-varying stimuli, and demonstrate in the fly visual system that adaptation to the statistical ensemble of the stimulus dynamically maximizes information transmission about the time-dependent stimulus. Further, while the rate response has long transients, the adaptation takes place on timescales consistent with optimal variance estimation.

6 0.38922721 101 nips-2000-Place Cells and Spatial Navigation Based on 2D Visual Feature Extraction, Path Integration, and Reinforcement Learning

7 0.38319302 10 nips-2000-A Productive, Systematic Framework for the Representation of Visual Structure

8 0.38283476 40 nips-2000-Dendritic Compartmentalization Could Underlie Competition and Attentional Biasing of Simultaneous Visual Stimuli

9 0.3776992 45 nips-2000-Emergence of Movement Sensitive Neurons' Properties by Learning a Sparse Code for Natural Moving Images

10 0.3732501 146 nips-2000-What Can a Single Neuron Compute?

11 0.37127033 107 nips-2000-Rate-coded Restricted Boltzmann Machines for Face Recognition

12 0.37125391 95 nips-2000-On a Connection between Kernel PCA and Metric Multidimensional Scaling

13 0.36982173 2 nips-2000-A Comparison of Image Processing Techniques for Visual Speech Recognition Applications

14 0.36679566 104 nips-2000-Processing of Time Series by Neural Circuits with Biologically Realistic Synaptic Dynamics

15 0.36487824 122 nips-2000-Sparse Representation for Gaussian Process Models

16 0.36309534 130 nips-2000-Text Classification using String Kernels

17 0.36301062 102 nips-2000-Position Variance, Recurrence and Perceptual Learning

18 0.36176041 92 nips-2000-Occam's Razor

19 0.36014631 49 nips-2000-Explaining Away in Weight Space

20 0.35941321 74 nips-2000-Kernel Expansions with Unlabeled Examples