nips nips2002 nips2002-206 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Robert A. Jacobs, Melissa Dominguez
Abstract: We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were “developmental models” in the sense that the nature of their input changed during the course of training. They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (i.e. it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model’s superior performance. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
Reference: text
sentIndex sentText sentNum sentScore
1 We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. [sent-6, score-0.732]
2 Four models were trained to estimate motion velocities in sequences of visual images. [sent-7, score-0.411]
3 They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. [sent-9, score-0.449]
4 it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. [sent-12, score-0.754]
5 We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. [sent-16, score-0.819]
6 The idea that visual development can aid visual learning is a viable hypothesis in need of further study. [sent-17, score-0.387]
7 In previous work, we studied the effects of different types of developmental sequences on the performances of systems trained to estimate the binocular disparities present in pairs of visual images [2]. [sent-25, score-0.916]
8 These filters are widely used to model the binocular sensitivities of simple and complex cells in primary visual cortex of primates [3]. [sent-30, score-0.651]
9 Based on local patches of the right-eye and left-eye images, each filter acted as a disparity feature detector at a coarse, medium, or fine scale depending on whether the filter was tuned to a low, medium, or high spatial frequency, respectively. [sent-31, score-0.471]
10 The outputs of the binocular energy filters were the inputs to this network. [sent-33, score-0.426]
11 The network was trained to estimate the disparity of the object which was defined as the amount that the object was shifted between the right-eye and left-eye images. [sent-34, score-0.619]
12 A non-developmental system was compared to three developmental systems. [sent-35, score-0.567]
13 The network of the non-developmental system received the outputs of all binocular energy filters throughout the entire training period. [sent-36, score-0.829]
14 The networks of the developmental systems, in contrast, were trained in three stages. [sent-37, score-0.548]
15 The network of the coarse-to-multiscale system received the outputs of binocular energy filters tuned to a low spatial frequency during the first training stage. [sent-38, score-1.254]
16 It received the outputs of filters tuned to low and medium spatial frequencies during the second training stage, and it received the outputs of all filters during the third training stage. [sent-39, score-1.49]
17 This network received the outputs of filters tuned to a high frequency during the first training stage, and the outputs of medium and then low frequency filters were added during subsequent stages. [sent-41, score-1.169]
18 The network of the random developmental model was also trained in stages, though its inputs were chosen at random at each stage and, thus, were not organized by spatial frequency content. [sent-42, score-1.096]
19 The results show that the coarse-to-multiscale and fine-to-multiscale systems consistently outperformed the non-developmental and random developmental systems. [sent-43, score-0.599]
20 The fact that they outperformed the non-developmental system is important because it demonstrates that models that undergo a developmental maturation can acquire a more advanced perceptual ability than models that do not. [sent-44, score-0.658]
21 The fact that they outperformed the random developmental system is important because this demonstrates that not all developmental sequences can be expected to provide performance benefits. [sent-45, score-1.204]
22 At a more general level, these results suggest that the idea that visual development aids visual learning is a viable hypothesis in need of further study. [sent-48, score-0.407]
23 This paper studies this hypothesis in the context of visual motion velocity estimation. [sent-49, score-0.599]
24 Our simulations show that the tasks of disparity estimation and velocity estimation yield similar, though not identical, patterns of results. [sent-50, score-0.417]
25 Although a developmental approach to the velocity estimation task is shown to be beneficial, it is not the case that all developmental progressions that lead to performance advantages on the disparity estimation task also lead to advantages on the velocity estimation task. [sent-51, score-1.817]
26 In particular, a coarse-to-multiscale developmental system outperformed non-developmental and random developmental systems on the velocity estimation task, but a fine-to-multiscale system did not. [sent-52, score-1.496]
27 We hypothesize that the performance advantage of the coarse-to-multiscale system relative to the fine-to-multiscale system is due to the fact that the coarse-to-multiscale system learned to make greater use of motion energy filters tuned to a low spatial frequency. [sent-53, score-0.833]
28 Analyses suggest that coarse-scale motion features are more informative for the velocity estimation task than fine-scale features. [sent-54, score-0.591]
29 2 Developmental and Non-developmental Systems The structure of the developmental and non-developmental systems was as follows. [sent-55, score-0.508]
30 The input to each system was a sequence of 88 retinal images where each image was a one-dimensional array 40 pixels in length. [sent-56, score-0.383]
31 As described below, this sequence depicted an object moving at a constant velocity in front of a stationary background. [sent-57, score-0.532]
32 The sequence of retinal images was filtered using motion energy filters. [sent-61, score-0.457]
33 Based on neurophysiological results, Adelson and Bergen [4] proposed motion energy filters as a way of modeling the motion sensitivities of simple and complex cells in primary visual cortex. [sent-62, score-1.011]
34 In this case, motion energy filters are two-dimensional filters which extract motion information in local patches of the spatiotemporal space. [sent-64, score-0.63]
35 A quadrature pair of such functions with even and odd phases tuned to leftward (-) and rightward (+) directions of motion is given by [equation not recoverable from the extracted text]. [sent-66, score-0.462]
36 The ratio of temporal to spatial frequency determines the orientation of a Gabor function in the spatiotemporal space which, in turn, determines the velocity sensitivity of the function. [sent-69, score-0.341]
37 The activities of simple cells with even and odd phases are summed in order to form the activity of a complex cell. [sent-71, score-0.442]
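The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard energy-model form (a Gaussian-windowed cosine and sine pair whose responses are squared before being summed), whereas the text only states that the even and odd activities are summed. The function names and the parameters `fx`, `ft`, `sx`, `st` are hypothetical.

```python
import math

def gabor_pair(x, t, fx, ft, sx, st):
    """Even and odd spatiotemporal Gabor values at offset (x, t).

    fx, ft: spatial and temporal frequencies (cycles per pixel, per frame).
    sx, st: standard deviations of the Gaussian envelope.
    The carrier phase fx*x - ft*t is constant along x = (ft/fx)*t, so a
    positive ft tunes the pair to rightward motion at velocity ft/fx.
    """
    envelope = math.exp(-x * x / (2 * sx * sx) - t * t / (2 * st * st))
    phase = 2 * math.pi * (fx * x - ft * t)
    return envelope * math.cos(phase), envelope * math.sin(phase)

def motion_energy(seq, x0, t0, fx, ft, sx, st):
    """Complex-cell response centered at pixel x0, frame t0.

    seq[t][x] is the luminance of pixel x in frame t. Squaring each
    simple-cell response before summing is an assumption borrowed from
    the usual energy model.
    """
    even = odd = 0.0
    for t, frame in enumerate(seq):
        for x, lum in enumerate(frame):
            g_even, g_odd = gabor_pair(x - x0, t - t0, fx, ft, sx, st)
            even += lum * g_even
            odd += lum * g_odd
    return even * even + odd * odd
```

With a grating drifting rightward at `ft / fx` pixels per frame, the rightward-tuned cell should respond far more strongly than the same cell with the sign of `ft` flipped.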
38 Fifteen complex cells corresponding to three spatial frequencies and five temporal frequencies were centered at each receptive-field location. [sent-77, score-0.733]
39 The spatial and temporal frequencies were each separated by an octave. [sent-78, score-0.309]
40 Temporal frequencies were chosen so that the set of cells at each spatial frequency had the same pattern of velocity tunings. [sent-79, score-0.838]
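One way to read the three sentences above is that each spatial frequency band shares the same set of preferred velocities, with the temporal frequency of each cell set to velocity times spatial frequency. The sketch below builds such a grid. The base values `base_fx` and `base_velocity` are placeholders chosen for illustration; the extracted text does not give the actual frequencies.

```python
def tuning_grid(base_fx=1 / 32, base_velocity=0.25, n_spatial=3, n_temporal=5):
    """Return 15 (fx, ft, velocity) tunings, octave-spaced in both axes.

    Within one spatial band, temporal frequencies double from cell to
    cell, so preferred velocities ft / fx also double. Every band gets
    the same velocity set, which is the property the text describes.
    base_fx and base_velocity are hypothetical values.
    """
    grid = []
    for i in range(n_spatial):
        fx = base_fx * 2 ** i          # spatial frequencies one octave apart
        for j in range(n_temporal):
            velocity = base_velocity * 2 ** j
            grid.append({"fx": fx, "ft": velocity * fx, "velocity": velocity})
    return grid
```

Calling `tuning_grid()` yields three bands of five cells each, and every band covers the velocities 0.25, 0.5, 1, 2, and 4 pixels per frame under these placeholder values.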
41 All cells were tuned to rightward motion because we restricted our data sets to only include objects that were moving to the right. [sent-89, score-0.724]
42 A cell’s spatial and temporal standard deviations were set to be inversely proportional to its spatial and temporal frequencies, respectively. [sent-90, score-0.45]
43 The outputs of the complex cells within each spatial frequency band were normalized using a softmax nonlinearity. [sent-91, score-0.773]
44 Consequently, complex cells tended to respond to relative contrast in the image sequence rather than absolute contrast [5] [6]. [sent-92, score-0.433]
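The normalization step can be sketched as a softmax applied separately to each spatial frequency band. The text does not say exactly which cells share a normalization pool (for example, all cells in a band or only those at one location), so the grouping below, a dictionary keyed by band name, is an assumption.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of cell outputs."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def normalize_by_band(responses):
    """Apply softmax within each band; responses maps band -> outputs.

    The pooling of cells per band is a hypothetical choice made for
    this sketch.
    """
    return {band: softmax(vals) for band, vals in responses.items()}
```

Because the softmax output depends only on differences between the inputs, adding the same offset to every cell in a band leaves the normalized activities unchanged.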
45 The normalized outputs of the complex cells were the inputs to an artificial neural network. [sent-93, score-0.564]
46 The network had 1200 input units (the complex cells had 80 receptive-field locations and there were 15 cells at each location). [sent-94, score-0.754]
47 The connectivity of the hidden units was set so that each group had a limited receptive field, and so that neighboring groups had overlapping receptive fields. [sent-96, score-0.417]
48 A group of hidden units received inputs from thirty-two receptive field locations at the complex cell level, and the receptive fields of neighboring groups overlapped by eight receptive-field locations. [sent-97, score-0.846]
49 The output layer consisted of a single linear unit; this unit’s output was an estimate of the object velocity depicted in the sequence of retinal images. [sent-99, score-0.535]
50 Weight sharing was implemented at the hidden unit level so that corresponding units within each group of hidden units had the same incoming and outgoing weight values, and so that a hidden unit had the same set of weight values from each receptive field location at the complex unit level. [sent-101, score-0.504]
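The receptive-field layout of the hidden groups can be illustrated with a small helper. It assumes the groups tile the 80 receptive-field locations left to right without wraparound, which the extracted text does not state; under that assumption, a field of 32 locations and an overlap of 8 give a stride of 24.

```python
def group_windows(n_locations=80, field=32, overlap=8):
    """Half-open (start, stop) location ranges seen by each hidden group.

    Neighboring windows share `overlap` locations. Tiling without
    wraparound is an assumption of this sketch.
    """
    stride = field - overlap
    return [(s, s + field) for s in range(0, n_locations - field + 1, stride)]
```

With the default values this produces three windows that exactly cover locations 0 through 79. Under the weight sharing described above, each window would be processed with the same set of hidden-unit weights.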
51 Three developmental systems and one non-developmental system were simulated. [sent-108, score-0.567]
52 The coarse-to-multiscale system, or model C2M, was trained using a coarse-to-multiscale developmental sequence which was implemented as follows. [sent-109, score-0.636]
53 During the first stage, the neural network portion of the model only received the outputs of complex cells tuned to the low spatial frequency (the outputs of other complex cells were set to zero). [sent-111, score-1.826]
54 During the second stage, the network received the outputs of complex cells tuned to low and medium spatial frequencies; it received the outputs of all complex cells during the third stage. [sent-112, score-1.997]
55 The training of the fine-to-multiscale system, or model F2M, was identical to that of model C2M except that its training used a fine-to-multiscale developmental sequence. [sent-113, score-0.734]
56 During the first stage of training, its network received the outputs of complex cells tuned to the high spatial frequency. [sent-114, score-1.228]
57 This network received the outputs of complex cells tuned to high and medium spatial frequencies during the second stage, and received the outputs of all complex cells during the third stage. [sent-115, score-2.036]
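The staged inputs for models C2M and F2M can be sketched as a masking step applied before the network sees the complex-cell outputs. The band labels and the function name are hypothetical; the zeroing of inactive cells follows the description of the first training stage.

```python
# Bands that are active in stages 1, 2, and 3 of each developmental model.
SCHEDULES = {
    "C2M": [{"low"}, {"low", "medium"}, {"low", "medium", "high"}],
    "F2M": [{"high"}, {"high", "medium"}, {"high", "medium", "low"}],
}

def mask_inputs(outputs, bands, model, stage):
    """Zero the outputs of cells whose band is not yet active.

    outputs: complex-cell outputs; bands: band label of each cell;
    stage: 1, 2, or 3.
    """
    active = SCHEDULES[model][stage - 1]
    return [o if b in active else 0.0 for o, b in zip(outputs, bands)]
```

For example, `mask_inputs([1.0, 2.0, 3.0], ["low", "medium", "high"], "C2M", 1)` keeps only the low-frequency output, while stage 3 of either model passes all outputs through unchanged.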
58 The training of the random developmental system, or model RD, also used a developmental sequence, though this sequence was generated randomly and, thus, was not based on the spatial frequency tunings of the complex cells. [sent-116, score-1.621]
59 The collection of complex cells was randomly partitioned into three equal-sized subsets with the constraint that each subset included one-third of the cells at each receptive-field location. [sent-117, score-0.637]
60 During the first stage of training, the neural network portion of the model only received the outputs of the complex cells in the first subset. [sent-118, score-0.982]
61 It received the outputs of the cells in the first and second subsets during the second stage of training, and received the outputs of all complex cells during the third stage. [sent-119, score-1.506]
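The random partition used by model RD can be sketched as follows. With 15 cells at each location, the constraint means each subset holds 5 cells per location. Whether the authors drew an independent shuffle at each location, as done here, is not stated, so treat that detail and the function name as assumptions.

```python
import random

def random_partition(n_locations=80, cells_per_location=15, n_subsets=3, seed=0):
    """Assign each cell a subset label 0..n_subsets-1.

    Every location receives an equal number of cells from each subset.
    Returns assignment[location][cell] -> subset label.
    """
    rng = random.Random(seed)
    per_subset = cells_per_location // n_subsets
    assignment = []
    for _location in range(n_locations):
        labels = [s for s in range(n_subsets) for _copy in range(per_subset)]
        rng.shuffle(labels)  # independent shuffle per location (assumption)
        assignment.append(labels)
    return assignment
```

At training stage k (1, 2, or 3), the active cells would be those whose label is less than k, with the remaining outputs set to zero.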
62 In contrast, the training period of the non-developmental system, or model ND, was not divided into separate stages; its neural network received the outputs of all complex cells throughout the entire training period. [sent-120, score-1.021]
63 Figure 1: Ten frames of an image sequence from the solid object data set (top) and ten frames of an image sequence from the noisy object data set (bottom). [sent-121, score-1.202]
64 In all cases the images were gray scale with luminance values between 0 and 1, and motion velocities were rightward with magnitudes between 0 and 4 pixels per time frame. [sent-123, score-0.485]
65 In the solid object data set, images consisted of a moving light or dark object in front of a stationary gray background. [sent-125, score-0.534]
66 The size of the object was randomly chosen to be an integer between 6 and 12 pixels, its initial location was a randomly chosen pixel on the retina, and its velocity was randomly chosen to be a real value between 0 and 4 pixels per time frame. [sent-132, score-0.706]
67 The top portion of Figure 1 gives an example of ten frames of an image sequence from the solid object data set. [sent-134, score-0.461]
68 The labels for the developmental models C2M, F2M, and RD include a number. [sent-137, score-0.508]
69 Recall that the training of these models was divided into three training stages (or developmental stages). [sent-138, score-0.726]
70 The number in the label gives the length of developmental stages 1 and 2 (the length of developmental stage 3 can be calculated using the fact that the entire training period lasted 100 iterations). [sent-139, score-1.269]
71 For example, the label ‘C2M-5’ corresponds to a version of model C2M in which the [sent-140, score-0.317]
72 first stage was 5 iterations, the second stage was 5 iterations, and the third stage was 90 iterations. [sent-145, score-0.393]
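The labeling convention can be made concrete with a small helper that recovers the three stage lengths from a label, using the 100-iteration training period stated above. The function name is hypothetical.

```python
def stage_lengths(label, total_iterations=100):
    """Split a label such as 'C2M-5' into its model name and stage lengths.

    The number gives the length of stages 1 and 2; stage 3 takes the
    remaining iterations.
    """
    model, _, number = label.partition("-")
    first = int(number)
    return model, (first, first, total_iterations - 2 * first)
```

For the label in the example, `stage_lengths("C2M-5")` gives stage lengths of 5, 5, and 90 iterations.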
73 The images in the second data set, referred to as the noisy object data set, were meant to resemble random-dot kinematograms frequently used in behavioral experiments. [sent-160, score-0.301]
74 Images contained a noisy object which was moving to the right and a noisy background which was stationary. [sent-161, score-0.335]
75 The gray-scale values of the object pixels and the background pixels were set to random numbers between 0 and 1. [sent-162, score-0.392]
76 The size of the object was randomly chosen to be an integer between 6 and 12 pixels, its initial location was a randomly chosen pixel on the retina, and its velocity was randomly chosen to be an integer between 0 and 4 pixels per time frame. [sent-163, score-0.706]
77 As before, the task was to map an image sequence to an estimate of an object velocity. [sent-164, score-0.324]
78 The bottom portion of Figure 1 gives an example of ten frames of an image sequence from the noisy object data set. [sent-165, score-0.467]
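A data item of the noisy object kind can be sketched from the description above. Several details are assumptions made only so the example runs: the object wraps around the edge of the retina, its texture is fixed across frames, and the default of ten frames mirrors Figure 1 rather than the full sequence length. The function name is hypothetical.

```python
import random

def noisy_object_sequence(n_frames=10, n_pixels=40, seed=None):
    """Generate one image sequence and its target velocity.

    A random-texture object of 6 to 12 pixels moves rightward at an
    integer velocity of 0 to 4 pixels per frame over a stationary
    random background. Wraparound at the retina edge is an assumption.
    """
    rng = random.Random(seed)
    background = [rng.random() for _ in range(n_pixels)]
    size = rng.randint(6, 12)
    texture = [rng.random() for _ in range(size)]
    start = rng.randrange(n_pixels)
    velocity = rng.randint(0, 4)
    frames = []
    for t in range(n_frames):
        frame = list(background)
        for i, lum in enumerate(texture):
            frame[(start + velocity * t + i) % n_pixels] = lum
        frames.append(frame)
    return frames, velocity
```

The returned velocity is the regression target that the network's single linear output unit would be trained to estimate.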
79 Simulation results described in Jacobs and Dominguez [8] suggest that coarse-scale motion features are more informative for the velocity estimation task than fine-scale features. [sent-192, score-0.591]
80 For example, networks that received only the outputs of complex cells tuned to a low spatial frequency consistently outperformed networks that received only the outputs of mid frequency complex cells or only the outputs of high frequency complex cells. [sent-193, score-2.426]
81 First, complex cells tuned to the lowest spatial frequency have the largest receptive fields. [sent-195, score-0.887]
82 This type of reasoning also applies to the activities of complex cells with receptive fields in the spatiotemporal domain. [sent-197, score-0.651]
83 That is, there is comparatively less ambiguity in the activities of complex cells with larger receptive fields than in the activities of cells with smaller receptive fields. [sent-198, score-1.035]
84 Because cells tuned to a low spatial frequency tend to have larger receptive fields than cells tuned to a high spatial frequency, low frequency tuned cells tend to be more reliable for the purposes of motion velocity estimation. [sent-199, score-2.479]
85 Second, model C2M may have benefited from the fact that complex cells with large, overlapping receptive fields provide a high resolution coarse-code of the spatiotemporal space [10]-[12]. [sent-200, score-0.661]
86 This code could provide model C2M with accurate information as to the location of the moving object at each moment in time. [sent-201, score-0.314]
87 For example, the activities of the population of these cells may have coded with high accuracy the fact that the moving object was at one location at one time and at another location at a later time. [sent-202, score-0.366]
88 If so, the model’s neural network could have easily learned to accurately estimate the object velocity. [sent-203, score-0.792]
89 In contrast, model F2M, for example, received early in training only the outputs of complex cells with smaller, less-overlapping receptive fields. [sent-205, score-1.018]
90 The activities of a population of these cells form a lower resolution coarse-code of the spatiotemporal space. [sent-206, score-0.475]
91 Condition (1) allowed a system to combine and compare input features at an early training stage without the need to compensate for the fact that these features could be at different spatial scales. [sent-208, score-0.481]
92 If condition (2) was satisfied, when a system received inputs at a new spatial scale, it was close to a scale with which the system was already familiar. [sent-209, score-0.603]
93 Although not described here (see Jacobs and Dominguez [8]), we tested how important it is, on the motion velocity estimation task, for the resolution of a system’s inputs to progress in an orderly fashion from one scale to a neighboring scale. [sent-210, score-0.812]
94 The results suggest that this factor is moderately important, but not highly important, for a developmental system learning to estimate motion velocities. [sent-211, score-0.823]
95 Overall, it is more important for a system to receive the outputs of the low spatial frequency complex cells as early in training as possible. [sent-212, score-1.01]
96 Based on the entire set of simulations, we conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. [sent-213, score-0.819]
97 The idea that visual development can aid visual learning is a viable hypothesis in need of further study. [sent-214, score-0.387]
98 (2003) Developmental constraints aid the acquisition of binocular disparity sensitivities. [sent-224, score-0.318]
99 (1994) Filter selection model for motion segmentation and velocity integration. [sent-247, score-0.493]
100 (2003) Visual development and the acquisition of motion velocity sensitivities. [sent-262, score-0.552]
wordName wordTfidf (topN-words)
[('developmental', 0.508), ('cells', 0.253), ('velocity', 0.235), ('motion', 0.228), ('received', 0.211), ('object', 0.196), ('spatial', 0.171), ('outputs', 0.167), ('tuned', 0.144), ('receptive', 0.137), ('binocular', 0.134), ('stage', 0.118), ('disparity', 0.11), ('dominguez', 0.107), ('spatiotemporal', 0.106), ('visual', 0.105), ('lters', 0.104), ('pixels', 0.098), ('frequency', 0.095), ('rmse', 0.093), ('outperformed', 0.091), ('complex', 0.087), ('medium', 0.085), ('jacobs', 0.085), ('frequencies', 0.084), ('training', 0.083), ('network', 0.077), ('rochester', 0.074), ('activities', 0.068), ('energy', 0.068), ('adelson', 0.068), ('rd', 0.065), ('development', 0.06), ('system', 0.059), ('sequence', 0.058), ('images', 0.057), ('elds', 0.057), ('inputs', 0.057), ('rightward', 0.056), ('items', 0.055), ('temporal', 0.054), ('units', 0.053), ('stages', 0.052), ('eld', 0.052), ('frames', 0.051), ('orderly', 0.051), ('early', 0.05), ('version', 0.049), ('noisy', 0.048), ('resolution', 0.048), ('neighboring', 0.047), ('scale', 0.046), ('retinal', 0.046), ('location', 0.045), ('low', 0.045), ('aid', 0.045), ('suitably', 0.045), ('randomly', 0.044), ('hidden', 0.043), ('cell', 0.043), ('moving', 0.043), ('bergen', 0.043), ('fifteen', 0.043), ('progressed', 0.043), ('progression', 0.043), ('progressions', 0.043), ('twodimensional', 0.043), ('sensitivities', 0.042), ('retina', 0.042), ('solid', 0.042), ('viable', 0.041), ('ten', 0.04), ('trained', 0.04), ('portion', 0.039), ('third', 0.039), ('sequences', 0.038), ('aids', 0.037), ('exposed', 0.037), ('tunings', 0.037), ('estimation', 0.036), ('image', 0.035), ('task', 0.035), ('disparities', 0.034), ('odd', 0.034), ('tend', 0.033), ('smaller', 0.032), ('nd', 0.032), ('locations', 0.031), ('versions', 0.031), ('hypothesis', 0.031), ('model', 0.03), ('array', 0.03), ('throughout', 0.03), ('bene', 0.029), ('fashion', 0.029), ('acquisition', 0.029), ('informative', 0.029), ('error', 0.028), ('simulation', 0.028), ('suggest', 
0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
Author: Alistair Bray, Dominique Martinez
Abstract: In Slow Feature Analysis (SFA [1]), it has been demonstrated that high-order invariant properties can be extracted by projecting inputs into a nonlinear space and computing the slowest changing features in this space; this has been proposed as a simple general model for learning nonlinear invariances in the visual system. However, this method is highly constrained by the curse of dimensionality which limits it to simple theoretical simulations. This paper demonstrates that by using a different but closely-related objective function for extracting slowly varying features ([2, 3]), and then exploiting the kernel trick, this curse can be avoided. Using this new method we show that both the complex cell properties of translation invariance and disparity coding can be learnt simultaneously from natural images when complex cells are driven by simple cells also learnt from the image. The notion of maximising an objective function based upon the temporal predictability of output has been progressively applied in modelling the development of invariances in the visual system. Földiák used it indirectly via a Hebbian trace rule for modelling the development of translation invariance in complex cells [4] (closely related to many other models [5,6,7]); this rule has been used to maximise invariance as one component of a hierarchical system for object and face recognition [8]. On the other hand, similar functions have been maximised directly in networks for extracting linear [2] and nonlinear [9, 1] visual invariances. Direct maximisation of such functions has recently been used to model complex cells [10] and as an alternative to maximising sparseness/independence in modelling simple cells [11]. Slow Feature Analysis [1] combines many of the best properties of these methods to provide a good general nonlinear model. That is, it uses an objective function that minimises the first-order temporal derivative of the outputs; it provides a closed-form solution which maximises this function by projecting inputs into a nonlinear space; it exploits sphering (or PCA-whitening) of the data to ensure that all outputs have unit variance and are uncorrelated. However, the method suffers from the curse of dimensionality in that the nonlinear feature space soon becomes very large as the input dimension grows, and yet this feature space must be represented explicitly in order for the essential sphering to occur. The alternative that we propose here is to use the objective function of Stone [2, 9], that maximises output variance over a long period whilst minimising variance over a shorter period; in the linear case, this can be implemented by a biologically plausible mixture of Hebbian and anti-Hebbian learning on the same synapses [2]. In recent work, Stone has proposed a closed-form solution for maximising this function in the linear domain of blind source separation that does not involve data-sphering. This paper describes how this method can be kernelised. The use of the
3 0.19236346 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
Author: Elad Schneidman, William Bialek, Michael J. Berry II
Abstract: A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured by the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells.
4 0.19068567 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach
Author: Jakob Heinzle, Alan Stocker
Abstract: We report a system that classifies and can learn to classify patterns of visual motion on-line. The complete system is described by the dynamics of its physical network architectures. The combination of the following properties makes the system novel: Firstly, the front-end of the system consists of an aVLSI optical flow chip that collectively computes 2-D global visual motion in real-time [1]. Secondly, the complexity of the classification task is significantly reduced by mapping the continuous motion trajectories to sequences of 'motion events'. And thirdly, all the network structures are simple and, with the exception of the optical flow chip, based on a Winner-Take-All (WTA) architecture. We demonstrate the application of the proposed generic system for a contactless man-machine interface that allows letters to be written by visual motion. Given the low complexity of the system, its robustness and the already existing front-end, a complete aVLSI system-on-chip implementation is realistic, allowing various applications in mobile electronic devices.
5 0.16192466 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
Author: Jarmo Hurri, Aapo Hyvärinen
Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.
6 0.13779213 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
7 0.13152447 10 nips-2002-A Model for Learning Variance Components of Natural Images
9 0.11285195 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
10 0.11221537 153 nips-2002-Neural Decoding of Cursor Motion Using a Kalman Filter
11 0.10564883 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
12 0.095039733 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
13 0.092934385 74 nips-2002-Dynamic Structure Super-Resolution
14 0.090690196 39 nips-2002-Bayesian Image Super-Resolution
15 0.087039091 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
16 0.079799742 73 nips-2002-Dynamic Bayesian Networks with Deterministic Latent Tables
17 0.078974664 2 nips-2002-A Bilinear Model for Sparse Coding
18 0.078192301 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
19 0.076011509 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
20 0.075978078 105 nips-2002-How to Combine Color and Shape Information for 3D Object Recognition: Kernels do the Trick
topicId topicWeight
[(0, -0.222), (1, 0.133), (2, 0.018), (3, 0.207), (4, 0.003), (5, -0.016), (6, 0.112), (7, -0.014), (8, 0.071), (9, 0.037), (10, -0.033), (11, 0.082), (12, 0.046), (13, 0.089), (14, -0.106), (15, 0.122), (16, 0.286), (17, -0.002), (18, -0.071), (19, -0.043), (20, 0.073), (21, 0.067), (22, 0.212), (23, -0.107), (24, -0.13), (25, -0.029), (26, -0.055), (27, 0.046), (28, -0.172), (29, 0.055), (30, 0.01), (31, 0.042), (32, 0.074), (33, -0.062), (34, -0.023), (35, 0.008), (36, -0.089), (37, 0.044), (38, -0.011), (39, 0.024), (40, 0.008), (41, 0.023), (42, -0.041), (43, -0.048), (44, -0.073), (45, 0.033), (46, -0.097), (47, 0.038), (48, -0.006), (49, 0.042)]
simIndex simValue paperId paperTitle
same-paper 1 0.97163153 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
Author: Robert A. Jacobs, Melissa Dominguez
Abstract: We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were “developmental models” in the sense that the nature of their input changed during the course of training. They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (i.e. it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model’s superior performance. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
Author: Alistair Bray, Dominique Martinez
Abstract: In Slow Feature Analysis (SFA [1]), it has been demonstrated that high-order invariant properties can be extracted by projecting inputs into a nonlinear space and computing the slowest changing features in this space; this has been proposed as a simple general model for learning nonlinear invariances in the visual system. However, this method is highly constrained by the curse of dimensionality which limits it to simple theoretical simulations. This paper demonstrates that by using a different but closely-related objective function for extracting slowly varying features ([2, 3]), and then exploiting the kernel trick, this curse can be avoided. Using this new method we show that both the complex cell properties of translation invariance and disparity coding can be learnt simultaneously from natural images when complex cells are driven by simple cells also learnt from the image. The notion of maximising an objective function based upon the temporal predictability of output has been progressively applied in modelling the development of invariances in the visual system. Földiák used it indirectly via a Hebbian trace rule for modelling the development of translation invariance in complex cells [4] (closely related to many other models [5,6,7]); this rule has been used to maximise invariance as one component of a hierarchical system for object and face recognition [8]. On the other hand, similar functions have been maximised directly in networks for extracting linear [2] and nonlinear [9, 1] visual invariances. Direct maximisation of such functions has recently been used to model complex cells [10] and as an alternative to maximising sparseness/independence in modelling simple cells [11]. Slow Feature Analysis [1] combines many of the best properties of these methods to provide a good general nonlinear model.
That is, it uses an objective function that minimises the first-order temporal derivative of the outputs; it provides a closed-form solution which maximises this function by projecting inputs into a nonlinear space; it exploits sphering (or PCA-whitening) of the data to ensure that all outputs have unit variance and are uncorrelated. However, the method suffers from the curse of dimensionality in that the nonlinear feature space soon becomes very large as the input dimension grows, and yet this feature space must be represented explicitly in order for the essential sphering to occur. The alternative that we propose here is to use the objective function of Stone [2, 9], that maximises output variance over a long period whilst minimising variance over a shorter period; in the linear case, this can be implemented by a biologically plausible mixture of Hebbian and anti-Hebbian learning on the same synapses [2]. In recent work, Stone has proposed a closed-form solution for maximising this function in the linear domain of blind source separation that does not involve data-sphering. This paper describes how this method can be kernelised. The use of the
Author: Terry Elliott, Jörg Kramer
Abstract: A neurotrophic model for the co-development of topography and ocular dominance columns in the primary visual cortex has recently been proposed. In the present work, we test this model by driving it with the output of a pair of neuronal vision sensors stimulated by disparate moving patterns. We show that the temporal correlations in the spike trains generated by the two sensors elicit the development of refined topography and ocular dominance columns, even in the presence of significant amounts of spontaneous activity and fixed-pattern noise in the sensors.
4 0.69281995 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
Author: Jarmo Hurri, Aapo Hyvärinen
Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.
5 0.59097588 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
Author: Elad Schneidman, William Bialek, Michael J. Berry II
Abstract: A population of neurons typically exhibits a broad diversity of responses to sensory inputs. The intuitive notion of functional classification is that cells can be clustered so that most of the diversity is captured by the identity of the clusters rather than by individuals within clusters. We show how this intuition can be made precise using information theory, without any need to introduce a metric on the space of stimuli or responses. Applied to the retinal ganglion cells of the salamander, this approach recovers classical results, but also provides clear evidence for subclasses beyond those identified previously. Further, we find that each of the ganglion cells is functionally unique, and that even within the same subclass only a few spikes are needed to reliably distinguish between cells.
6 0.58644491 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach
7 0.48733857 172 nips-2002-Recovering Articulated Model Topology from Observed Rigid Motion
8 0.45623076 22 nips-2002-Adaptive Nonlinear System Identification with Echo State Networks
9 0.42464471 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip
10 0.40498492 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
11 0.39633948 153 nips-2002-Neural Decoding of Cursor Motion Using a Kalman Filter
12 0.35575971 18 nips-2002-Adaptation and Unsupervised Learning
13 0.34800312 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
14 0.34788206 122 nips-2002-Learning About Multiple Objects in Images: Factorial Learning without Factorial Search
15 0.34258166 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning
16 0.33055085 2 nips-2002-A Bilinear Model for Sparse Coding
17 0.32488883 200 nips-2002-Topographic Map Formation by Silicon Growth Cones
18 0.32368279 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
19 0.31043369 87 nips-2002-Fast Transformation-Invariant Factor Analysis
20 0.30791339 160 nips-2002-Optoelectronic Implementation of a FitzHugh-Nagumo Neural Model
topicId topicWeight
[(11, 0.028), (23, 0.025), (41, 0.011), (42, 0.048), (54, 0.107), (55, 0.065), (64, 0.353), (67, 0.011), (68, 0.04), (74, 0.124), (92, 0.019), (98, 0.075)]
simIndex simValue paperId paperTitle
same-paper 1 0.84408647 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities
Author: Robert A. Jacobs, Melissa Dominguez
Abstract: We consider the hypothesis that systems learning aspects of visual perception may benefit from the use of suitably designed developmental progressions during training. Four models were trained to estimate motion velocities in sequences of visual images. Three of the models were “developmental models” in the sense that the nature of their input changed during the course of training. They received a relatively impoverished visual input early in training, and the quality of this input improved as training progressed. One model used a coarse-to-multiscale developmental progression (i.e. it received coarse-scale motion features early in training and finer-scale features were added to its input as training progressed), another model used a fine-to-multiscale progression, and the third model used a random progression. The final model was nondevelopmental in the sense that the nature of its input remained the same throughout the training period. The simulation results show that the coarse-to-multiscale model performed best. Hypotheses are offered to account for this model’s superior performance. We conclude that suitably designed developmental sequences can be useful to systems learning to estimate motion velocities. The idea that visual development can aid visual learning is a viable hypothesis in need of further study.
2 0.73358274 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
Author: Patrik O. Hoyer, Aapo Hyvärinen
Abstract: The responses of cortical sensory neurons are notoriously variable, with the number of spikes evoked by identical stimuli varying significantly from trial to trial. This variability is most often interpreted as ‘noise’, purely detrimental to the sensory system. In this paper, we propose an alternative view in which the variability is related to the uncertainty, about world parameters, which is inherent in the sensory stimulus. Specifically, the responses of a population of neurons are interpreted as stochastic samples from the posterior distribution in a latent variable model. In addition to giving theoretical arguments supporting such a representational scheme, we provide simulations suggesting how some aspects of response variability might be understood in this framework.
3 0.71285427 117 nips-2002-Intrinsic Dimension Estimation Using Packing Numbers
Author: Balázs Kégl
Abstract: We propose a new algorithm to estimate the intrinsic dimension of data sets. The method is based on geometric properties of the data and requires neither parametric assumptions on the data generating model nor input parameters to set. The method is compared to a similar, widely used algorithm from the same family of geometric techniques. Experiments show that our method is more robust in terms of the data generating distribution and more reliable in the presence of noise.
4 0.58320475 193 nips-2002-Temporal Coherence, Natural Image Sequences, and the Visual Cortex
Author: Jarmo Hurri, Aapo Hyvärinen
Abstract: We show that two important properties of the primary visual cortex emerge when the principle of temporal coherence is applied to natural image sequences. The properties are simple-cell-like receptive fields and complex-cell-like pooling of simple cell outputs, which emerge when we apply two different approaches to temporal coherence. In the first approach we extract receptive fields whose outputs are as temporally coherent as possible. This approach yields simple-cell-like receptive fields (oriented, localized, multiscale). Thus, temporal coherence is an alternative to sparse coding in modeling the emergence of simple cell receptive fields. The second approach is based on a two-layer statistical generative model of natural image sequences. In addition to modeling the temporal coherence of individual simple cells, this model includes inter-cell temporal dependencies. Estimation of this model from natural data yields both simple-cell-like receptive fields, and complex-cell-like pooling of simple cell outputs. In this completely unsupervised learning, both layers of the generative model are estimated simultaneously from scratch. This is a significant improvement on earlier statistical models of early vision, where only one layer has been learned, and others have been fixed a priori.
5 0.54784024 2 nips-2002-A Bilinear Model for Sparse Coding
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformation-invariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
6 0.5382663 10 nips-2002-A Model for Learning Variance Components of Natural Images
7 0.52524972 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex
8 0.52398396 43 nips-2002-Binary Coding in Auditory Cortex
9 0.50853407 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
10 0.49503189 29 nips-2002-Analysis of Information in Speech Based on MANOVA
11 0.49480605 173 nips-2002-Recovering Intrinsic Images from a Single Image
12 0.49466747 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
13 0.49229768 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions
14 0.4904986 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
15 0.48540002 147 nips-2002-Monaural Speech Separation
16 0.48024455 66 nips-2002-Developing Topography and Ocular Dominance Using Two aVLSI Vision Sensors and a Neurotrophic Model of Plasticity
17 0.47572505 199 nips-2002-Timing and Partial Observability in the Dopamine System
18 0.47519055 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach
19 0.47367084 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture
20 0.47192913 126 nips-2002-Learning Sparse Multiscale Image Representations
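This page does not say how the simValue column is derived from the (topicId, topicWeight) pairs. One common choice for ranking documents by topic vectors is cosine similarity, and the sketch below illustrates that idea only. The function name `cosine_similarity` and the `other_paper` vector are hypothetical and not taken from this page, and the sketch makes no attempt to reproduce the listed scores.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as (topicId, topicWeight) pairs.

    Assumption: this is an illustrative metric, not necessarily the one used
    to produce the simValue column on this page.
    """
    da, db = dict(a), dict(b)
    # Only topic ids present in both vectors contribute to the dot product.
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an empty vector as dissimilar to everything
    return dot / (norm_a * norm_b)


# Sparse topic vector listed for this paper in the topicId/topicWeight row.
this_paper = [(11, 0.028), (23, 0.025), (41, 0.011), (42, 0.048),
              (54, 0.107), (55, 0.065), (64, 0.353), (67, 0.011),
              (68, 0.04), (74, 0.124), (92, 0.019), (98, 0.075)]

# Hypothetical vector for another paper, made up for illustration.
other_paper = [(54, 0.2), (64, 0.1), (70, 0.3)]

score = cosine_similarity(this_paper, other_paper)
```

Under this metric a vector compared with itself scores 1 and vectors with no shared topic ids score 0. The same-paper rows above list values below 1, so the scoring actually used for this page evidently differs in some detail.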