nips nips2005 nips2005-61 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Anna Levina, Michael Herrmann
Abstract: There is experimental evidence that cortical neurons show avalanche activity with the intensity of firing events being distributed as a power-law. We present a biologically plausible extension of a neural network which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract There is experimental evidence that cortical neurons show avalanche activity with the intensity of firing events being distributed as a power-law. [sent-5, score-0.718]
2 We present a biologically plausible extension of a neural network which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. [sent-6, score-0.74]
3 Because it is unlikely that the specific parameter values at which the critical behavior occurs are attained by chance, the question arises as to what mechanisms may tune the parameters towards the critical state. [sent-8, score-0.43]
4 Furthermore, it is known that criticality brings about optimal computational capabilities [10], improves mixing, and enhances the sensitivity to unpredictable stimuli [5]. [sent-9, score-0.222]
5 Therefore, it is interesting to search for mechanisms that entail criticality in biological systems, for example in the nervous tissue. [sent-10, score-0.222]
6 In [6] a simple model of a fully connected neural network of non-leaky integrate-and-fire neurons was studied. [sent-11, score-0.118]
7 This study not only presented the first example of a globally coupled system that shows criticality, but also predicted the critical exponent as well as some extra-critical dynamical phenomena, which were later observed in experimental studies. [sent-12, score-0.41]
8 Recently, Beggs and Plenz [3] studied the propagation of spontaneous neuronal activity in slices of rat cortex and neuronal cultures using multi-electrode arrays. [sent-13, score-0.122]
9 Thereby, they found avalanche-like activity where the avalanche sizes were distributed according to a power-law with an exponent of -3/2. [sent-14, score-0.772]
10 The network in [6] consisted of a set of N identical threshold elements characterized by the membrane potential u ≥ 0 and was driven by a slowly delivered random input. [sent-17, score-0.335]
11 When the potential exceeds a threshold θ = 1, the neuron spikes and relaxes. [sent-18, score-0.218]
12 All connections in the network are described by a single parameter α representing the evoked synaptic potential which a spiking neuron transmits to all postsynaptic neurons. [sent-19, score-0.474]
13 The system is driven by a slowly delivered random input. [sent-20, score-0.172]
14 The simplicity of that model allows analytical consideration: an explicit formula for the probability distribution of the avalanche size, depending on the parameter α, was derived. [sent-21, score-0.688]
15 Only at an externally well-tuned critical value α = αcr did the distribution take the form of a power-law, namely with an exponent of precisely -3/2 (in the limit of a large system). [sent-23, score-0.454]
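To make the setup of [6] concrete, the following minimal sketch (not taken from the paper; it assumes a fully connected network with uniform coupling α/N, threshold θ = 1 and uniform random external drive, and all function and variable names are ours) simulates the static model and collects avalanche sizes:

import numpy as np

def simulate_static_model(N=100, alpha=0.9, n_avalanches=10_000, seed=0):
    # Sketch of the static model of [6]: N non-leaky integrate-and-fire units,
    # threshold 1, uniform coupling alpha/N, slow random external drive.
    rng = np.random.default_rng(seed)
    h = rng.uniform(0.0, 1.0, N)
    sizes = []
    for _ in range(n_avalanches):
        # slow drive: charge randomly chosen units until one crosses threshold
        while not np.any(h >= 1.0):
            h[rng.integers(N)] += rng.uniform(0.0, 1.0)
        size = 0
        while np.any(h >= 1.0):
            spiking = h >= 1.0
            n_spikes = int(spiking.sum())
            size += n_spikes
            h[spiking] -= 1.0            # reset, keeping the supra-threshold part
            h += n_spikes * alpha / N    # every unit receives alpha/N per spike
        sizes.append(size)
    return np.array(sizes)

The distribution of the returned sizes can then be compared against the -3/2 power-law discussed here.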
16 The term critical will be applied here also to finite systems. [sent-24, score-0.175]
17 True criticality requires the thermodynamic limit N → ∞; here we consider approximate power-law behavior characterized by an exponent and an error that describes the remaining deviation from the best-matching power-law. [sent-25, score-0.491]
18 From Fig. 1 (a-c) it is visible that the system may also exhibit other types of behavior, such as small avalanches with a finite mean (even in the thermodynamic limit) at α < αcr. [sent-29, score-0.667]
19 On the other hand, at α > αcr the distribution becomes non-monotonic, which indicates that avalanches of the size of the system occur frequently. [sent-30, score-0.477]
20 Generally speaking, in order to drive the system towards criticality it therefore suffices to decrease the large avalanches and to enhance the small ones. [sent-31, score-0.696]
21 Most interestingly, synaptic connections among real neurons show a similar tendency which thus deserves further study. [sent-32, score-0.302]
22 We will consider the standard model of short-term dynamics in synaptic efficacies [11, 13] and thereafter discuss several numerically determined quantities. [sent-33, score-0.305]
23 Our studies imply that dynamical synapses may indeed support the self-organization of the neural activity towards criticality in a small homogeneous neural system. [sent-34, score-0.218]
24 2 The model We are considering a network of integrate-and-fire neurons with dynamical synapses. [sent-35, score-0.187]
25 Each synapse is described by two parameters: the amount of available neurotransmitter and the fraction of it that is ready to be used at the next synaptic event. [sent-36, score-0.246]
26 Both parameters change in time depending on the state of the presynaptic neuron. [sent-37, score-0.137]
27 Such a system keeps a long memory of previous events and is known to exert a regulatory effect on the network dynamics, which will turn out to be beneficial. [sent-38, score-0.11]
28 Our approach is based on the model of dynamical synapses, which was shown by Tsodyks and Markram to reliably reproduce the synaptic responses between pyramidal neurons [11, 13]. [sent-39, score-0.396]
29 Consider a set of N integrate-and-fire neurons characterized by a membrane potential hi ≥ 0, and two connectivity parameters for each synapse: Ji,j ≥ 0, ui,j ∈ [0, 1]. [sent-40, score-0.445]
30 The parameter Ji,j characterizes the number of available vesicles on the presynaptic side of the connection from neuron j to neuron i. [sent-41, score-0.39]
31 Each spike uses up a portion of the resources of the presynaptic neuron; hence, at the next synaptic event fewer transmitters will be available, i.e., Ji,j is reduced. [sent-42, score-0.355]
32 Between spikes, the vesicles slowly recover on a timescale τ1. [sent-45, score-0.191]
33 The parameter ui,j denotes the actual fraction of vesicles on the presynaptic side of the connection from neuron j to neuron i, which will be used in the synaptic transmission. [sent-46, score-0.613]
34 When a spike arrives at the presynaptic side j, it causes an increase of u i,j . [sent-47, score-0.111]
35 Between spikes, ui,j slowly decreases towards zero on a timescale τ2. [sent-48, score-0.122]
36 The combined effect of Ji,j and ui,j results in the facilitation or depression of the synapse. [sent-49, score-0.188]
37 The dynamics of a membrane potential hi consists of the integration of excitatory postsynaptic currents over all synapses of the neuron and the slowly delivered random input. [sent-50, score-0.725]
38 When the membrane potential exceeds threshold, the neuron emits a spike and hi resets to a smaller value. [sent-51, score-0.465]
39 Figure 1: Probability distributions of avalanche sizes P (L, N, α). [sent-64, score-0.635]
40 In (a-c) the solid lines and symbols denote the numerical results for the avalanche size distributions, dashed lines show the best matching power-law. [sent-69, score-0.628]
41 Here the curves are temporal averages over 10^6 avalanches with N = 100, u0 = 0. [sent-70, score-0.501]
42 The presented curves are temporal averages over 10^6 avalanches with N = 200, u0 = 0. [sent-77, score-0.501]
43 Presented curves are temporal averages over 10^7 avalanches with N = 200, u0 = 0. [sent-101, score-0.501]
44 Note that for a network of 200 units, the absolute critical exponent is smaller than the large-system limit γ = −1.5. [sent-103, score-0.419]
45 Because synaptic values are essentially determined presynaptically, we assume that all synapses of a neuron are identical, i.e. [sent-109, score-0.425]
46 Jj , uj are used instead of Ji,j and ui,j respectively. [sent-111, score-0.124]
47 The system is initialized with arbitrary values hi ∈ [0, 1), i = 1, . . . , N. [sent-112, score-0.223]
48 Depending on the state of the system at time t, the i-th element receives external input Ii^ext(t) or internal input Ii^int(t) from other neural elements. [sent-116, score-0.31]
49 If the activation exceeds the threshold, it is reset but retains the supra-threshold portion h̃i(t + 1) − 1 of the membrane potential. [sent-119, score-0.329]
50 The external input Ii^ext(t) is a random amount c·ξ, received by a randomly chosen neuron. [sent-120, score-0.178]
51 The external input is considered to be delivered slowly compared to the internal relaxation dynamics (which corresponds to τsep ≫ 1), i.e. [sent-122, score-0.356]
52 This corresponds to an infinite separation of the time scales of external driving and avalanche dynamics discussed in the literature on self-organized criticality [12, 14]. [sent-125, score-1.006]
53 The present results, however, are not affected by a continuous external input even during the avalanches. [sent-126, score-0.111]
54 For the fit, avalanches of a size larger than 1 and smaller than N/2 have been used. [sent-144, score-0.46]
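The fitting procedure itself is not spelled out in this excerpt; one simple possibility, shown below as an illustrative sketch (the least-squares-on-log-log choice and all names are ours), respects the stated range 1 < L < N/2 and also returns the deviation from the best-matching power-law mentioned earlier:

import numpy as np

def fit_power_law(sizes, N):
    # Estimate the exponent gamma of P(L) ~ L**gamma from sampled avalanche
    # sizes, using only avalanches larger than 1 and smaller than N/2.
    sizes = np.asarray(sizes)
    sizes = sizes[(sizes > 1) & (sizes < N / 2)]
    values, counts = np.unique(sizes, return_counts=True)
    log_l = np.log10(values)
    log_p = np.log10(counts / counts.sum())
    gamma, intercept = np.polyfit(log_l, log_p, 1)
    # root-mean-square deviation of the empirical distribution from the fit
    deviation = np.sqrt(np.mean((log_p - (gamma * log_l + intercept)) ** 2))
    return gamma, deviation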
55 The sharp transition of the dynamical model is again interesting; it occurs because the facilitation strength surpasses a critical level. [sent-148, score-0.498]
56 We will consider c = J0 , thus an external input is comparable with the typical internal input. [sent-154, score-0.151]
57 The internal input Ii^int(t) is given by Ii^int(t) = Σ_{j ∈ M(t−1)} Jj(t) uj(t). [sent-155, score-0.292]
58 The system is initialized with ui = u0, Ji = J0, where J0 = α/(N u0) and α is the connection strength parameter. [sent-156, score-0.112]
59 Similar to the membrane potential dynamics, we can distinguish two situations: either there were supra-threshold neurons at the previous moment of time or not. [sent-157, score-0.153]
60 uj(t + 1) = { ũj(t) − (1/τ2) u0 ũj(t) · δ_{|M(t)|,0}   if h̃j(t) < 1;   ũj(t) + u0 (1 − ũj(t))   if h̃j(t) ≥ 1 }   (6)
Jj(t + 1) = { Jj(t) + (1/τ1) (J0 − Jj(t)) · δ_{|M(t)|,0}   if h̃j(t) < 1;   Jj(t) (1 − ũj(t))   if h̃j(t) ≥ 1 }   (7)
Thus, we have a model with parameters α, u0, τ1, τ2 and N. [sent-158, score-1.462]
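As a reading aid, the following sketch implements one time step of the update rules as reconstructed above. It is not the authors' code: the exact form of the between-avalanche recovery terms should be checked against the original Eqs. (6) and (7), the external input assumes ξ uniform in [0, 1), and all function and variable names are ours.

import numpy as np

def step(h, u, J, u0, J0, tau1, tau2, c, rng):
    # One discrete time step; h are membrane potentials, u and J are the
    # per-neuron (presynaptic) synaptic variables. Arrays are updated in place.
    spiking = h >= 1.0
    n_spikes = int(spiking.sum())              # |M(t)|
    if n_spikes == 0:
        # between avalanches: slow external drive and slow synaptic recovery
        h[rng.integers(len(h))] += c * rng.uniform(0.0, 1.0)
        u -= u0 * u / tau2                     # u relaxes on timescale tau2 (cf. Eq. 6)
        J += (J0 - J) / tau1                   # J recovers towards J0 (cf. Eq. 7)
    else:
        # avalanche step: spiking neurons transmit J_j * u_j to every neuron
        h += (J[spiking] * u[spiking]).sum()
        h[spiking] -= 1.0                      # reset, keep the supra-threshold portion
        J[spiking] *= 1.0 - u[spiking]         # depression: vesicle depletion (cf. Eq. 7)
        u[spiking] += u0 * (1.0 - u[spiking])  # facilitation (cf. Eq. 6)
    return n_spikes

Iterating this step and recording the returned |M(t)| sequence yields the avalanche statistics analyzed below.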
61 The dependence on N has been studied in [6], where it was found that the critical parameter of the distribution scales as αcr = 1 − N^(−1/2). [sent-160, score-0.226]
62 In the same way, the exponent will be smaller in modulus than -3/2 for finite systems. [sent-161, score-0.129]
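As a quick numerical illustration of this finite-size scaling (using only the formula αcr = 1 − N^(−1/2) quoted above):

for N in (100, 200, 1000):
    print(N, round(1.0 - N ** -0.5, 3))   # 100 -> 0.9, 200 -> 0.929, 1000 -> 0.968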
64 Figure 4: Average synaptic efficacy and deviation from a power-law for the parameter α varied from 0. [sent-180, score-0.252]
65 If at time t0 an element receives an external input and fires, then an avalanche starts and |M (t0 )| = 1. [sent-185, score-0.741]
66 The system is globally coupled, such that during an avalanche all elements receive internal input including the unstable elements themselves. [sent-186, score-0.749]
67 The avalanche duration D ≥ 0 is defined to be the smallest integer for which the stopping condition |M(t0 + D)| = 0 is satisfied. [sent-187, score-0.599]
68 The avalanche size L is given by L = Σ_{k=0}^{D−1} |M(t0 + k)|. [sent-188, score-0.628]
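These definitions translate directly into code; a small helper (ours, not from the paper) operating on the recorded sequence of |M(t)| values from avalanche onset onwards:

def avalanche_duration_and_size(active_counts):
    # active_counts: |M(t0)|, |M(t0+1)|, ... for one avalanche, with |M(t0)| = 1.
    duration, size = 0, 0
    for m in active_counts:
        if m == 0:             # stopping condition |M(t0 + D)| = 0
            break
        duration += 1
        size += m              # L = sum over k = 0 .. D-1 of |M(t0 + k)|
    return duration, size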
69 The subject of our interest is the probability distribution of avalanche size P (L, N, α) depending on the parameter α. [sent-189, score-0.688]
70 4 Results Similarly as in the model of [6], we considered the avalanche size distribution for different values of α, cf. [sent-190, score-0.628]
71 For small values of α, subcritical avalanche-size distributions are observed. [sent-194, score-0.155]
72 The subcriticality is characterized by the negligible number of avalanches of a size close to the system size. [sent-195, score-0.512]
73 For αcr, the system has an avalanche distribution with approximate power-law behavior for L in a range from 1 almost up to the size of the system, where an exponential cut-off is observed (Fig. [sent-196, score-0.706]
74 Above the critical value αcr, avalanche size distributions become non-monotonic (Fig. [sent-198, score-0.958]
75 Such supra-critical curves have a minimum at an intermediate avalanche size. [sent-200, score-0.635]
76 There is a sharp transition from the subcritical to the critical regime, followed by a long critical region where the distribution of avalanche sizes stays close to a power-law. [sent-201, score-1.269]
77 For a system of 200 neurons this transition is shown in Fig. [sent-202, score-0.178]
78 At this point the transition from the subcritical to the critical regime occurs. [sent-208, score-0.425]
79 Figure 5: Difference between the synaptic efficacy after and before an avalanche, averaged over all synapses. [sent-220, score-0.931]
80 Presented curves are temporal averages over 10^6 avalanches with N = 100, u0 = 0. [sent-222, score-0.501]
81 The average synaptic efficacy σ = ⟨σi⟩, with σi = Ji ui, is determined by taking the average over all neurons participating in an avalanche. [sent-225, score-0.344]
82 This average shows the mean input which neurons receive at each step of an avalanche. [sent-226, score-0.131]
83 This characteristic quantity undergoes a sharp transition together with the avalanche distribution, cf. [sent-227, score-0.529]
84 It is equal to the average EPSP which all postsynaptic neurons will receive after presynaptic neuron spikes. [sent-231, score-0.346]
85 The transition from a subcritical to a critical regime happens when σ jumps into the vicinity of αcr /N of the previous model (for N = 100 and αcr = 0. [sent-232, score-0.452]
86 When α is large, then the synaptic efficacy is high and, hence, avalanches are large and intervals between them are small. [sent-235, score-0.622]
87 The depression during the avalanche dominates facilitation and decreases the synaptic efficacy, and vice versa. [sent-236, score-1.039]
88 Thus, the synaptic dynamics stabilizes the network to remain near the critical value for a large interval of parameters α. [sent-238, score-0.541]
89 Fig. 4 shows the averaged effect of an avalanche for different values of the parameter α. [sent-240, score-0.65]
90 For α > αcr , depression during the avalanche is stronger than facilitation and avalanches on average decrease synaptic efficacy. [sent-241, score-1.434]
91 When α is very small, the effect of facilitation is washed out during the inter-avalanche period where synaptic parameters return to the resting state. [sent-242, score-0.42]
92 Fig. 5 shows the difference ∆σ = σafter − σbefore between the average synaptic efficacies after and before the avalanche, depending on the parameter α. [sent-244, score-0.903]
93 If it is smaller than zero, synapses are depressed. [sent-246, score-0.141]
94 For small values of the parameter α, avalanches lead to facilitation, while for large values of α avalanches depress the synapses. [sent-247, score-0.827]
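In a simulation, σ and ∆σ can be measured directly; a sketch (array and function names are ours) assuming the per-neuron variables Ji and ui introduced above:

import numpy as np

def synaptic_efficacy(J, u, participants):
    # sigma: mean of J_i * u_i over the neurons participating in the avalanche
    return float(np.mean(J[participants] * u[participants]))

# sigma_before = synaptic_efficacy(J, u, participants)   # just before the avalanche
# ... run the avalanche, which updates J and u ...
# sigma_after = synaptic_efficacy(J, u, participants)
# delta_sigma = sigma_after - sigma_before   # > 0: net facilitation, < 0: net depression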
95 In the limit N → ∞, the synaptic dynamics should be rescaled such that the maximum amount of transmitter available at time t, divided by the average avalanche size, converges to a value which scales as 1 − N^(−1/2). [sent-248, score-1.003]
96 In this way, if the average avalanche size is smaller than the critical one, synapses will essentially be enhanced; otherwise they will experience depression. [sent-249, score-0.79]
97 5 Conclusion We presented a simple, biologically plausible complement to a network model of non-leaky integrate-and-fire neurons which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. [sent-251, score-0.819]
98 In previous studies [6] we showed that the simplest model, with only one parameter α characterizing the synaptic efficacy of all synapses, exhibits subcritical, critical and supra-critical regimes with a continuous transition from one to another, depending on the parameter α. [sent-252, score-0.88]
99 These main classes are also present here but the region of critical behavior is immensely enlarged. [sent-253, score-0.204]
100 Both models have a power-law distribution with an exponent approximately equal to -3/2, although the exponent is somewhat smaller in modulus for small network sizes. [sent-254, score-0.265]
wordName wordTfidf (topN-words)
[('avalanche', 0.599), ('avalanches', 0.399), ('synaptic', 0.223), ('criticality', 0.222), ('critical', 0.175), ('hi', 0.174), ('subcritical', 0.155), ('cr', 0.155), ('facilitation', 0.124), ('uj', 0.124), ('jj', 0.116), ('synapses', 0.109), ('exponent', 0.097), ('neuron', 0.093), ('cacy', 0.091), ('presynaptic', 0.084), ('dynamics', 0.082), ('external', 0.081), ('neurons', 0.079), ('membrane', 0.074), ('dynamical', 0.069), ('beggs', 0.067), ('ext', 0.067), ('vesicles', 0.067), ('sp', 0.066), ('delivered', 0.062), ('slowly', 0.061), ('transition', 0.05), ('system', 0.049), ('int', 0.049), ('deviation', 0.046), ('regime', 0.045), ('cnorm', 0.044), ('feder', 0.044), ('averages', 0.042), ('tj', 0.042), ('depression', 0.042), ('neuronal', 0.041), ('ii', 0.041), ('sharp', 0.041), ('activity', 0.04), ('internal', 0.04), ('network', 0.039), ('strength', 0.039), ('undergoes', 0.039), ('postsynaptic', 0.038), ('sizes', 0.036), ('curves', 0.036), ('bak', 0.035), ('timescale', 0.035), ('thermodynamic', 0.035), ('ttingen', 0.035), ('characterized', 0.035), ('cacies', 0.033), ('tsodyks', 0.033), ('anna', 0.033), ('exceeds', 0.033), ('smaller', 0.032), ('potential', 0.032), ('threshold', 0.032), ('depending', 0.031), ('receive', 0.031), ('element', 0.031), ('input', 0.03), ('exhibits', 0.03), ('markram', 0.029), ('resting', 0.029), ('neurosci', 0.029), ('regimes', 0.029), ('size', 0.029), ('connectivity', 0.029), ('behavior', 0.029), ('parameter', 0.029), ('spikes', 0.028), ('vicinity', 0.027), ('spike', 0.027), ('activation', 0.027), ('limit', 0.027), ('decrease', 0.026), ('dominates', 0.025), ('pyramidal', 0.025), ('grey', 0.025), ('game', 0.025), ('ji', 0.025), ('temporal', 0.024), ('ef', 0.024), ('submitted', 0.024), ('connection', 0.024), ('log', 0.023), ('synapse', 0.023), ('diverse', 0.023), ('effect', 0.022), ('biologically', 0.022), ('scales', 0.022), ('parameters', 0.022), ('plausible', 0.021), ('portion', 0.021), ('average', 0.021), ('spiking', 0.02), ('coupled', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 61 nips-2005-Dynamical Synapses Give Rise to a Power-Law Distribution of Neuronal Avalanches
Author: Anna Levina, Michael Herrmann
Abstract: There is experimental evidence that cortical neurons show avalanche activity with the intensity of firing events being distributed as a power-law. We present a biologically plausible extension of a neural network which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. 1
2 0.15293784 181 nips-2005-Spiking Inputs to a Winner-take-all Network
Author: Matthias Oster, Shih-Chii Liu
Abstract: Recurrent networks that perform a winner-take-all computation have been studied extensively. Although some of these studies include spiking networks, they consider only analog input rates. We present results of this winner-take-all computation on a network of integrate-and-fire neurons which receives spike trains as inputs. We show how we can configure the connectivity in the network so that the winner is selected after a pre-determined number of input spikes. We discuss spiking inputs with both regular frequencies and Poisson-distributed rates. The robustness of the computation was tested by implementing the winner-take-all network on an analog VLSI array of 64 integrate-and-fire neurons which have an innate variance in their operating parameters. 1
3 0.11604302 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
Author: Robert A. Legenstein, Wolfgang Maass
Abstract: We investigate under what conditions a neuron can learn by experimentally supported rules for spike timing dependent plasticity (STDP) to predict the arrival times of strong “teacher inputs” to the same neuron. It turns out that in contrast to the famous Perceptron Convergence Theorem, which predicts convergence of the perceptron learning rule for a simplified neuron model whenever a stable solution exists, no equally strong convergence guarantee can be given for spiking neurons with STDP. But we derive a criterion on the statistical dependency structure of input spike trains which characterizes exactly when learning with STDP will converge on average for a simple model of a spiking neuron. This criterion is reminiscent of the linear separability criterion of the Perceptron Convergence Theorem, but it applies here to the rows of a correlation matrix related to the spike inputs. In addition we show through computer simulations for more realistic neuron models that the resulting analytically predicted positive learning results not only hold for the common interpretation of STDP where STDP changes the weights of synapses, but also for a more realistic interpretation suggested by experimental data where STDP modulates the initial release probability of dynamic synapses. 1
4 0.11346497 188 nips-2005-Temporally changing synaptic plasticity
Author: Minija Tamosiunaite, Bernd Porr, Florentin Wörgötter
Abstract: Recent experimental results suggest that dendritic and back-propagating spikes can influence synaptic plasticity in different ways [1]. In this study we investigate how these signals could temporally interact at dendrites leading to changing plasticity properties at local synapse clusters. Similar to a previous study [2], we employ a differential Hebbian plasticity rule to emulate spike-timing dependent plasticity. We use dendritic (D-) and back-propagating (BP-) spikes as post-synaptic signals in the learning rule and investigate how their interaction will influence plasticity. We will analyze a situation where synapse plasticity characteristics change in the course of time, depending on the type of post-synaptic activity momentarily elicited. Starting with weak synapses, which only elicit local D-spikes, a slow, unspecific growth process is induced. As soon as the soma begins to spike this process is replaced by fast synaptic changes as the consequence of the much stronger and sharper BP-spike, which now dominates the plasticity rule. This way a winner-take-all-mechanism emerges in a two-stage process, enhancing the best-correlated inputs. These results suggest that synaptic plasticity is a temporal changing process by which the computational properties of dendrites or complete neurons can be substantially augmented. 1
Author: Misha Ahrens, Liam Paninski, Quentin J. Huys
Abstract: Our understanding of the input-output function of single cells has been substantially advanced by biophysically accurate multi-compartmental models. The large number of parameters needing hand tuning in these models has, however, somewhat hampered their applicability and interpretability. Here we propose a simple and well-founded method for automatic estimation of many of these key parameters: 1) the spatial distribution of channel densities on the cell’s membrane; 2) the spatiotemporal pattern of synaptic input; 3) the channels’ reversal potentials; 4) the intercompartmental conductances; and 5) the noise level in each compartment. We assume experimental access to: a) the spatiotemporal voltage signal in the dendrite (or some contiguous subpart thereof, e.g. via voltage sensitive imaging techniques), b) an approximate kinetic description of the channels and synapses present in each compartment, and c) the morphology of the part of the neuron under investigation. The key observation is that, given data a)-c), all of the parameters 1)-4) may be simultaneously inferred by a version of constrained linear regression; this regression, in turn, is efficiently solved using standard algorithms, without any “local minima” problems despite the large number of parameters and complex dynamics. The noise level 5) may also be estimated by standard techniques. We demonstrate the method’s accuracy on several model datasets, and describe techniques for quantifying the uncertainty in our estimates. 1
6 0.10140388 118 nips-2005-Learning in Silicon: Timing is Everything
7 0.099314667 99 nips-2005-Integrate-and-Fire models with adaptation are good enough
8 0.089114331 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks
9 0.080759563 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
10 0.075984024 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
11 0.070535347 39 nips-2005-Beyond Pair-Based STDP: a Phenomenological Rule for Spike Triplet and Frequency Effects
12 0.065044723 148 nips-2005-Online Discovery and Learning of Predictive State Representations
13 0.062984869 64 nips-2005-Efficient estimation of hidden state dynamics from spike trains
14 0.061411984 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks
15 0.059403829 50 nips-2005-Convex Neural Networks
16 0.058510177 141 nips-2005-Norepinephrine and Neural Interrupts
17 0.058036726 165 nips-2005-Response Analysis of Neuronal Population with Synaptic Depression
18 0.050525881 134 nips-2005-Neural mechanisms of contrast dependent receptive field size in V1
19 0.049862102 40 nips-2005-CMOL CrossNets: Possible Neuromorphic Nanoelectronic Circuits
20 0.045207046 27 nips-2005-Analysis of Spectral Kernel Design based Semi-supervised Learning
topicId topicWeight
[(0, 0.135), (1, -0.211), (2, -0.046), (3, -0.088), (4, -0.006), (5, -0.009), (6, -0.01), (7, 0.014), (8, -0.012), (9, 0.025), (10, 0.012), (11, 0.077), (12, 0.004), (13, -0.056), (14, 0.013), (15, -0.013), (16, 0.016), (17, 0.085), (18, 0.068), (19, -0.029), (20, 0.033), (21, 0.016), (22, 0.012), (23, -0.028), (24, -0.067), (25, -0.014), (26, 0.004), (27, -0.071), (28, 0.09), (29, 0.068), (30, 0.068), (31, -0.053), (32, -0.143), (33, -0.06), (34, 0.048), (35, 0.054), (36, 0.053), (37, -0.042), (38, -0.071), (39, 0.044), (40, -0.093), (41, 0.039), (42, -0.054), (43, -0.094), (44, -0.002), (45, -0.016), (46, 0.022), (47, 0.077), (48, 0.207), (49, -0.064)]
simIndex simValue paperId paperTitle
same-paper 1 0.94607884 61 nips-2005-Dynamical Synapses Give Rise to a Power-Law Distribution of Neuronal Avalanches
Author: Anna Levina, Michael Herrmann
Abstract: There is experimental evidence that cortical neurons show avalanche activity with the intensity of firing events being distributed as a power-law. We present a biologically plausible extension of a neural network which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. 1
2 0.66424066 106 nips-2005-Large-scale biophysical parameter estimation in single neurons via constrained linear regression
Author: Misha Ahrens, Liam Paninski, Quentin J. Huys
Abstract: Our understanding of the input-output function of single cells has been substantially advanced by biophysically accurate multi-compartmental models. The large number of parameters needing hand tuning in these models has, however, somewhat hampered their applicability and interpretability. Here we propose a simple and well-founded method for automatic estimation of many of these key parameters: 1) the spatial distribution of channel densities on the cell’s membrane; 2) the spatiotemporal pattern of synaptic input; 3) the channels’ reversal potentials; 4) the intercompartmental conductances; and 5) the noise level in each compartment. We assume experimental access to: a) the spatiotemporal voltage signal in the dendrite (or some contiguous subpart thereof, e.g. via voltage sensitive imaging techniques), b) an approximate kinetic description of the channels and synapses present in each compartment, and c) the morphology of the part of the neuron under investigation. The key observation is that, given data a)-c), all of the parameters 1)-4) may be simultaneously inferred by a version of constrained linear regression; this regression, in turn, is efficiently solved using standard algorithms, without any “local minima” problems despite the large number of parameters and complex dynamics. The noise level 5) may also be estimated by standard techniques. We demonstrate the method’s accuracy on several model datasets, and describe techniques for quantifying the uncertainty in our estimates. 1
3 0.55505639 165 nips-2005-Response Analysis of Neuronal Population with Synaptic Depression
Author: Wentao Huang, Licheng Jiao, Shan Tan, Maoguo Gong
Abstract: In this paper, we aim at analyzing the characteristic of neuronal population responses to instantaneous or time-dependent inputs and the role of synapses in neural information processing. We have derived an evolution equation of the membrane potential density function with synaptic depression, and obtain the formulas for analytically computing the response of the instantaneous fire rate. Through a technical analysis, we arrive at several significant conclusions: The background inputs play an important role in information processing and act as a switch between temporal integration and coincidence detection. The role of synapses can be regarded as a spatio-temporal filter; it is important in neural information processing for the spatial distribution of synapses and the spatial and temporal relation of inputs. The instantaneous input frequency can affect the response amplitude and phase delay. 1
4 0.52720195 40 nips-2005-CMOL CrossNets: Possible Neuromorphic Nanoelectronic Circuits
Author: Jung Hoon Lee, Xiaolong Ma, Konstantin K. Likharev
Abstract: Hybrid “CMOL” integrated circuits, combining CMOS subsystem with nanowire crossbars and simple two-terminal nanodevices, promise to extend the exponential Moore-Law development of microelectronics into the sub-10-nm range. We are developing neuromorphic network (“CrossNet”) architectures for this future technology, in which neural cell bodies are implemented in CMOS, nanowires are used as axons and dendrites, while nanodevices (bistable latching switches) are used as elementary synapses. We have shown how CrossNets may be trained to perform pattern recovery and classification despite the limitations imposed by the CMOL hardware. Preliminary estimates have shown that CMOL CrossNets may be extremely dense (~10^7 cells per cm2) and operate approximately a million times faster than biological neural networks, at manageable power consumption. In conclusion, we discuss in brief possible short-term and long-term applications of the emerging technology. 1
5 0.51868755 188 nips-2005-Temporally changing synaptic plasticity
Author: Minija Tamosiunaite, Bernd Porr, Florentin Wörgötter
Abstract: Recent experimental results suggest that dendritic and back-propagating spikes can influence synaptic plasticity in different ways [1]. In this study we investigate how these signals could temporally interact at dendrites leading to changing plasticity properties at local synapse clusters. Similar to a previous study [2], we employ a differential Hebbian plasticity rule to emulate spike-timing dependent plasticity. We use dendritic (D-) and back-propagating (BP-) spikes as post-synaptic signals in the learning rule and investigate how their interaction will influence plasticity. We will analyze a situation where synapse plasticity characteristics change in the course of time, depending on the type of post-synaptic activity momentarily elicited. Starting with weak synapses, which only elicit local D-spikes, a slow, unspecific growth process is induced. As soon as the soma begins to spike this process is replaced by fast synaptic changes as the consequence of the much stronger and sharper BP-spike, which now dominates the plasticity rule. This way a winner-take-all-mechanism emerges in a two-stage process, enhancing the best-correlated inputs. These results suggest that synaptic plasticity is a temporal changing process by which the computational properties of dendrites or complete neurons can be substantially augmented. 1
6 0.51814711 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks
7 0.49039373 73 nips-2005-Fast biped walking with a reflexive controller and real-time policy searching
8 0.48183066 181 nips-2005-Spiking Inputs to a Winner-take-all Network
9 0.48032099 99 nips-2005-Integrate-and-Fire models with adaptation are good enough
10 0.45162779 118 nips-2005-Learning in Silicon: Timing is Everything
11 0.39817709 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity
12 0.39697409 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
13 0.3734284 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks
14 0.36941451 6 nips-2005-A Connectionist Model for Constructive Modal Reasoning
15 0.32599232 148 nips-2005-Online Discovery and Learning of Predictive State Representations
16 0.3200407 32 nips-2005-Augmented Rescorla-Wagner and Maximum Likelihood Estimation
17 0.31294984 68 nips-2005-Factorial Switching Kalman Filters for Condition Monitoring in Neonatal Intensive Care
18 0.30812782 64 nips-2005-Efficient estimation of hidden state dynamics from spike trains
19 0.29270807 62 nips-2005-Efficient Estimation of OOMs
20 0.28414688 39 nips-2005-Beyond Pair-Based STDP: a Phenomenological Rule for Spike Triplet and Frequency Effects
topicId topicWeight
[(3, 0.046), (10, 0.067), (11, 0.014), (27, 0.016), (31, 0.044), (34, 0.045), (39, 0.011), (41, 0.012), (44, 0.014), (52, 0.297), (55, 0.037), (57, 0.059), (69, 0.057), (73, 0.035), (77, 0.013), (88, 0.056), (91, 0.073)]
simIndex simValue paperId paperTitle
same-paper 1 0.79886717 61 nips-2005-Dynamical Synapses Give Rise to a Power-Law Distribution of Neuronal Avalanches
Author: Anna Levina, Michael Herrmann
Abstract: There is experimental evidence that cortical neurons show avalanche activity with the intensity of firing events being distributed as a power-law. We present a biologically plausible extension of a neural network which exhibits a power-law avalanche distribution for a wide range of connectivity parameters. 1
2 0.66369164 43 nips-2005-Comparing the Effects of Different Weight Distributions on Finding Sparse Representations
Author: Bhaskar D. Rao, David P. Wipf
Abstract: Given a redundant dictionary of basis vectors (or atoms), our goal is to find maximally sparse representations of signals. Previously, we have argued that a sparse Bayesian learning (SBL) framework is particularly well-suited for this task, showing that it has far fewer local minima than other Bayesian-inspired strategies. In this paper, we provide further evidence for this claim by proving a restricted equivalence condition, based on the distribution of the nonzero generating model weights, whereby the SBL solution will equal the maximally sparse representation. We also prove that if these nonzero weights are drawn from an approximate Jeffreys prior, then with probability approaching one, our equivalence condition is satisfied. Finally, we motivate the worst-case scenario for SBL and demonstrate that it is still better than the most widely used sparse representation algorithms. These include Basis Pursuit (BP), which is based on a convex relaxation of the ℓ0 (quasi)-norm, and Orthogonal Matching Pursuit (OMP), a simple greedy strategy that iteratively selects basis vectors most aligned with the current residual. 1
3 0.61256409 77 nips-2005-From Lasso regression to Feature vector machine
Author: Fan Li, Yiming Yang, Eric P. Xing
Abstract: Lasso regression tends to assign zero weights to most irrelevant or redundant features, and hence is a promising technique for feature selection. Its limitation, however, is that it only offers solutions to linear models. Kernel machines with feature scaling techniques have been studied for feature selection with non-linear models. However, such approaches require to solve hard non-convex optimization problems. This paper proposes a new approach named the Feature Vector Machine (FVM). It reformulates the standard Lasso regression into a form isomorphic to SVM, and this form can be easily extended for feature selection with non-linear models by introducing kernels defined on feature vectors. FVM generates sparse solutions in the nonlinear feature space and it is much more tractable compared to feature scaling kernel machines. Our experiments with FVM on simulated data show encouraging results in identifying the small number of dominating features that are non-linearly correlated to the response, a task the standard Lasso fails to complete.
4 0.60128617 45 nips-2005-Conditional Visual Tracking in Kernel Space
Author: Cristian Sminchisescu, Atul Kanaujia, Zhiguo Li, Dimitris Metaxas
Abstract: We present a conditional temporal probabilistic framework for reconstructing 3D human motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional kernel induced non-linear state spaces. Our methodology (kBME) combines kernel PCA-based non-linear dimensionality reduction (kPCA) and Conditional Bayesian Mixture of Experts (BME) in order to learn complex multivalued predictors between observations and model hidden states. This is necessary for accurate, inverse, visual perception inferences, where several probable, distant 3D solutions exist due to noise or the uncertainty of monocular perspective projection. Low-dimensional models are appropriate because many visual processes exhibit strong non-linear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical model in order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm positively compares with techniques based on regression, Kernel Dependency Estimation (KDE) or PCA alone, and gives results competitive to those of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences. 1 Introduction and Related Work We consider the problem of inferring 3D articulated human motion from monocular video. This research topic has applications for scene understanding including human-computer interfaces, markerless human motion capture, entertainment and surveillance. A monocular approach is relevant because in real-world settings the human body parts are rarely completely observed even when using multiple cameras. This is due to occlusions from other people or objects in the scene. A robust system has to necessarily deal with incomplete, ambiguous and uncertain measurements. Methods for 3D human motion reconstruction can be classified as generative and discriminative. They both require a state representation, namely a 3D human model with kinematics (joint angles) or shape (surfaces or joint positions), and they both use a set of image features as observations for state inference. The computational goal in both cases is the conditional distribution for the model state given image observations. Generative model-based approaches [6, 16, 14, 13] have been demonstrated to flexibly reconstruct complex unknown human motions and to naturally handle problem constraints. However it is difficult to construct reliable observation likelihoods due to the complexity of modeling human appearance. This varies widely due to different clothing and deformation, body proportions or lighting conditions. Besides being somewhat indirect, the generative approach further imposes strict conditional independence assumptions on the temporal observations given the states in order to ensure computational tractability. Due to these factors inference is expensive and produces highly multimodal state distributions [6, 16, 13]. Generative inference algorithms require complex annealing schedules [6, 13] or systematic non-linear search for local optima [16] in order to ensure continuing tracking.
These difficulties motivate the advent of a complementary class of discriminative algorithms [10, 12, 18, 2] that approximate the state conditional directly, in order to simplify inference. However, inverse, observation-to-state multivalued mappings are difficult to learn (see e.g. fig. 1a) and a probabilistic temporal setting is necessary. In an earlier paper [15] we introduced a probabilistic discriminative framework for human motion reconstruction. Because the method operates in the originally selected state and observation spaces, which can be task generic, therefore redundant and often high-dimensional, inference is more expensive and can be less robust. Figure 1: (a, Left) Example of 180° ambiguity in predicting 3D human poses from silhouette image features (center). It is essential that multiple plausible solutions (e.g. F1 and F2) are correctly represented and tracked over time. A single state predictor will either average the distant solutions or zig-zag between them, see also tables 1 and 2. (b, Right) A conditional chain model. The local distributions p(yt|yt−1, zt) or p(yt|zt) are learned as in fig. 2. For inference, the predicted local state conditional is recursively combined with the filtered prior, cf. (1). To summarize, reconstructing 3D human motion in a conditional temporal framework poses the following difficulties: (i) The mapping between temporal observations and states is multivalued (i.e. the local conditional distributions to be learned are multimodal), therefore it cannot be accurately represented using global function approximations. (ii) Human models have multivariate, high-dimensional continuous states of 50 or more human joint angles. The temporal state conditionals are multimodal, which makes efficient Kalman filtering algorithms inapplicable. General inference methods (particle filters, mixtures) have to be used instead, but these are expensive for high-dimensional models (e.g. when reconstructing the motion of several people that operate in a joint state space). (iii) The components of the human state and of the silhouette observation vector exhibit strong correlations, because many repetitive human activities like walking or running have low intrinsic dimensionality. It appears wasteful to work with high-dimensional states of 50+ joint angles. Even if the space were truly high-dimensional, predicting correlated state dimensions independently may still be suboptimal. In this paper we present a conditional temporal estimation algorithm that restricts visual inference to low-dimensional, kernel induced state spaces. To exploit correlations among observations and among state variables, we model the local, temporal conditional distributions using ideas from Kernel PCA [11, 19] and conditional mixture modeling [7, 5], here adapted to produce multiple probabilistic predictions. The corresponding predictor is referred to as a Conditional Bayesian Mixture of Low-dimensional Kernel-Induced Experts (kBME). By integrating it within a conditional graphical model framework (fig. 1b), we can exploit temporal constraints probabilistically. We demonstrate that this methodology is effective for reconstructing the 3D motion of multiple people in monocular video. Our contribution w.r.t.
[15] is a probabilistic conditional inference framework that operates over non-linear, kernel-induced low-dimensional state spaces, and a set of experiments (on both real and artificial image sequences) that show how the proposed framework positively compares with powerful predictors based on KDE, PCA, or with the high-dimensional models of [15] at a fraction of their cost. 2 Probabilistic Inference in a Kernel Induced State Space We work with conditional graphical models with a chain structure [9], as shown in fig. 1b. These have continuous temporal states yt, t = 1 . . . T, and observations zt. For compactness, we denote joint states Yt = (y1, y2, . . . , yt) or joint observations Zt = (z1, . . . , zt). Learning and inference are based on local conditionals: p(yt|zt) and p(yt|yt−1, zt), with yt and zt being low-dimensional, kernel induced representations of some initial model having state xt and observation rt. We obtain zt, yt from rt, xt using kernel PCA [11, 19]. Inference is performed in a low-dimensional, non-linear, kernel induced latent state space (see fig. 1b, fig. 2 and (1)). For display or error reporting, we compute the original conditional p(x|r), or a temporally filtered version p(xt|Rt), Rt = (r1, r2, . . . , rt), using a learned pre-image state map [3]. 2.1 Density Propagation for Continuous Conditional Chains For online filtering, we compute the optimal distribution p(yt|Zt) for the state yt, conditioned on observations Zt up to time t. The filtered density can be recursively derived as: p(yt|Zt) = ∫ p(yt|yt−1, zt) p(yt−1|Zt−1) dyt−1 (1) We compute (1) using a conditional mixture for p(yt|yt−1, zt) (a Bayesian mixture of experts, cf. §2.2) and the prior p(yt−1|Zt−1), each having, say, M components. We integrate the M² pairwise products of Gaussians analytically. The means of the expanded posterior are clustered and the centers are used to initialize a reduced M-component Kullback-Leibler approximation that is refined using gradient descent [15]. The propagation rule (1) is similar to the one used for discrete state labels [9], but here we work with multivariate continuous state spaces and represent the local multimodal state conditionals using kBME (fig. 2), and not log-linear models [9] (these would require intractable normalization). This complex continuous model rules out inference based on Kalman filtering or dynamic programming [9]. 2.2 Learning Bayesian Mixtures over Kernel Induced State Spaces (kBME) In order to model conditional mappings between low-dimensional non-linear spaces we rely on kernel dimensionality reduction and conditional mixture predictors. The authors of KDE [19] propose a powerful structured unimodal predictor. This works by decorrelating the output using kernel PCA and learning a ridge regressor between the input and each decorrelated output dimension. Our procedure is also based on kernel PCA but takes into account the structure of the studied visual problem, where both inputs and outputs are likely to be low-dimensional and the mapping between them multivalued. The output variables xi are projected onto the column vectors of the principal space in order to obtain their principal coordinates yi. [Figure 2 diagram omitted: the observation r ∈ R ⊂ R^r and the state x ∈ X ⊂ R^x are mapped by the kernel maps Φr and Φx into feature spaces Fr and Fx; the conditional p(y|z) is learned between the principal subspaces P(Fr) and P(Fx) obtained by kernel PCA, and x ≈ PreImage(y) recovers the state, so that p(x|r) ≈ p(x|y).] Figure 2: The learned low-dimensional predictor, kBME, for computing p(x|r) ≡ p(xt|rt), ∀t.
(We similarly learn p(xt|xt−1, rt), with input (x, r) instead of r; here we illustrate only p(x|r) for clarity.) The input r and the output x are decorrelated using Kernel PCA to obtain z and y respectively. The kernels used for the input and output are Φr and Φx, with induced feature spaces Fr and Fx, respectively. Their principal subspaces obtained by kernel PCA are denoted by P(Fr) and P(Fx), respectively. A conditional Bayesian mixture of experts p(y|z) is learned using the low-dimensional representation (z, y). Using learned local conditionals of the form p(yt|zt) or p(yt|yt−1, zt), temporal inference can be efficiently performed in a low-dimensional kernel induced state space (see e.g. (1) and fig. 1b). For visualization and error measurement, the filtered density, e.g. p(yt|Zt), can be mapped back to p(xt|Rt) using the pre-image, cf. (3). A similar procedure is performed on the inputs ri to obtain zi. In order to relate the reduced feature spaces of z and y (P(Fr) and P(Fx)), we estimate a probability distribution over mappings from training pairs (zi, yi). We use a conditional Bayesian mixture of experts (BME) [7, 5] in order to account for ambiguity when mapping similar, possibly identical reduced feature inputs to very different feature outputs, as common in our problem (fig. 1a). This gives a model that is a conditional mixture of low-dimensional kernel-induced experts (kBME): p(y|z) = Σj=1..M g(z|δj) N(y|Wj z, Λj) (2) where g(z|δj) is a softmax function parameterized by δj and (Wj, Λj) are the parameters and the output covariance of expert j, here a linear regressor. As in many Bayesian settings [17, 5], the weights of the experts and of the gates, Wj and δj, are controlled by hierarchical priors, typically Gaussians with 0 mean, having inverse variance hyperparameters controlled by a second level of Gamma distributions. We learn this model using a double-loop EM and employ ML-II type approximations [8, 17] with greedy (weight) subset selection [17, 15]. Finally, the kBME algorithm requires the computation of pre-images in order to recover the state distribution x from its image y ∈ P(Fx). This is a closed-form computation for polynomial kernels of odd degree. For more general kernels, optimization or learning (regression-based) methods are necessary [3]. Following [3, 19], we use a sparse Bayesian kernel regressor to learn the pre-image. This is based on training data (xi, yi): p(x|y) = N(x|A Φy(y), Ω) (3) with parameters and covariances (A, Ω). Since temporal inference is performed in the low-dimensional kernel induced state space, the pre-image function needs to be calculated only for visualizing results or for the purpose of error reporting. Propagating the result from the reduced feature space P(Fx) to the output space X produces a Gaussian mixture with M elements, having coefficients g(z|δj) and components N(x|A Φy(Wj z), A JΦy Λj JΦyᵀ Aᵀ + Ω), where JΦy is the Jacobian of the mapping Φy.
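The conditional in (2) is a gated mixture of linear-Gaussian experts, so evaluating it is straightforward. The following NumPy/SciPy sketch assumes softmax gates of the form g(z|δj) ∝ exp(δj·z) and uses made-up parameter values rather than ones learned with the hierarchical priors and double-loop EM described above.

# Numerical sketch of the kBME conditional p(y|z) in (2).
import numpy as np
from scipy.stats import multivariate_normal

def kbme_density(y, z, deltas, Ws, Lambdas):
    """p(y|z) = sum_j g(z|delta_j) N(y | W_j z, Lambda_j)."""
    scores = np.array([d @ z for d in deltas])
    gates = np.exp(scores - scores.max())
    gates /= gates.sum()                         # softmax gating g(z|delta_j)
    return sum(g * multivariate_normal.pdf(y, mean=W @ z, cov=L)
               for g, W, L in zip(gates, Ws, Lambdas))

# toy setup: M = 2 experts, 3-d reduced input z, 2-d reduced output y
rng = np.random.default_rng(0)
M, dz, dy = 2, 3, 2
deltas = [rng.standard_normal(dz) for _ in range(M)]
Ws = [rng.standard_normal((dy, dz)) for _ in range(M)]
Lambdas = [np.eye(dy) * 0.1 for _ in range(M)]
z = rng.standard_normal(dz)
print(kbme_density(np.zeros(dy), z, deltas, Ws, Lambdas))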
Training Set and Model State Representation: For training we gather pairs of 3D human poses together with their image projections, here silhouettes, using the graphics package Maya. We use realistically rendered computer graphics human surface models which we animate using human motion capture [1]. Our original human representation (x) is based on articulated skeletons with spherical joints and has 56 skeletal d.o.f. including global translation. The database consists of 8000 samples of human activities including walking, running, turns, jumps, gestures in conversations, quarreling and pantomime. Image Descriptors: We work with image silhouettes obtained using statistical background subtraction (with foreground and background models). Silhouettes are informative for pose estimation although prone to ambiguities (e.g. the left / right limb assignment in side views) or occasional lack of observability of some of the d.o.f. (e.g. 180° ambiguities in the global azimuthal orientation for frontal views, e.g. fig. 1a). These are multiplied by intrinsic forward / backward monocular ambiguities [16]. As observations r, we use shape contexts extracted on the silhouette [4] (5 radial, 12 angular bins, size range 1/8 to 3 on log scale). The features are computed at different scales and sizes for points sampled on the silhouette. To work in a common coordinate system, we cluster all features in the training set into K = 50 clusters. To compute the representation of a new shape feature (a point on the silhouette), we 'project' onto the common basis by (inverse distance) weighted voting into the cluster centers. To obtain the representation (r) for a new silhouette we regularly sample 200 points on it and add all their feature vectors into a feature histogram (a small code sketch of this soft-voting step is given below). Because the representation uses overlapping features of the observation, the elements of the descriptor are not independent. However, a conditional temporal framework (fig. 1b) flexibly accommodates this. For experiments, we use Gaussian kernels for the joint angle feature space and dot product kernels for the observation feature space. We learn state conditionals for p(yt|zt) and p(yt|yt−1, zt) using 6 dimensions for the joint angle kernel induced state space and 25 dimensions for the observation induced feature space, respectively. In fig. 3b we show an evaluation of the efficacy of our kBME predictor for different dimensions in the joint angle kernel induced state space (the observation feature space dimension is here 50). On the analyzed dancing sequence, which involves complex motions of the arms and the legs, the non-linear model significantly outperforms alternative PCA methods and gives good predictions for compact, low-dimensional models (see the running-times footnote below). In tables 1 and 2, as well as fig. 4, we perform quantitative experiments on artificially rendered silhouettes. 3D ground truth joint angles are available and this allows a more systematic evaluation. Footnote (running times): On a Pentium 4 PC (3 GHz, 2 GB RAM), a full dimensional BME model with 5 experts takes 802s to train p(xt|xt−1, rt), whereas a kBME (including the pre-image) takes 95s to train p(yt|yt−1, zt). The prediction time is 13.7s for BME and 8.7s (including the pre-image cost of 1.04s) for kBME. The integration in (1) takes 2.67s for BME and 0.31s for kBME. The speed-up for kBME is significant and likely to increase with original models having higher dimensionality.
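A minimal sketch of the descriptor construction outlined above: cluster training shape contexts into K = 50 centers, then soft-vote each sampled point's feature into the centers with inverse-distance weights. The k-means call, the feature dimensionality and the helper names are illustrative assumptions; the shape-context extraction itself is assumed to be provided elsewhere.

# Bag-of-features silhouette descriptor via inverse-distance soft voting.
import numpy as np
from scipy.cluster.vq import kmeans2

def build_codebook(train_features, K=50, seed=0):
    centers, _ = kmeans2(train_features, K, minit="++", seed=seed)
    return centers

def silhouette_descriptor(point_features, centers, eps=1e-6):
    """point_features: (n_points, d) shape contexts sampled on one silhouette."""
    hist = np.zeros(len(centers))
    for f in point_features:
        d = np.linalg.norm(centers - f, axis=1)
        w = 1.0 / (d + eps)                    # inverse-distance voting weights
        hist += w / w.sum()
    return hist / len(point_features)

# usage: descriptor r for a new silhouette with 200 sampled points (toy data)
rng = np.random.default_rng(0)
codebook = build_codebook(rng.standard_normal((5000, 60)))
r = silhouette_descriptor(rng.standard_normal((200, 60)), codebook)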
[Figure 3 plots omitted: (a) number of clusters vs. degree of multimodality; (b) prediction error vs. number of state dimensions for kBME, KDE_RVM, PCA_BME and PCA_RVM.] Figure 3: (a, Left) Analysis of 'multimodality' for a training set. The input zt dimension is 25, the output yt dimension is 6, both reduced using kPCA. We cluster independently in (yt−1, zt) and yt using many clusters (2100) to simulate small input perturbations and we histogram the yt clusters falling within each cluster in (yt−1, zt). This gives intuition on the degree of ambiguity in modeling p(yt|yt−1, zt) for small perturbations in the input. (b, Right) Evaluation of dimensionality reduction methods for an artificial dancing sequence (models trained on 300 samples). kBME is our model of §2.2, whereas KDE-RVM is a KDE model learned with a Relevance Vector Machine (RVM) [17] feature space map. PCA-BME and PCA-RVM are models where the mappings between feature spaces (obtained using PCA) are learned using a BME and an RVM. The non-linearity is significant. Kernel-based methods outperform PCA and give low prediction error for 5-6d models. Notice that the kernelized low-dimensional models generally outperform the PCA ones. At the same time, they give results competitive with those of high-dimensional BME predictors, while being lower-dimensional and therefore significantly less expensive for inference, e.g. the integral in (1). In fig. 5 and fig. 6 we show human motion reconstruction results for two real image sequences. Fig. 5 shows the good quality reconstruction of a person performing an agile jump. (Given the missing observations in a side view, 3D inference for the occluded body parts would not be possible without using prior knowledge!) For this sequence we do inference using conditionals having 5 modes and reduced 6d states. We initialize tracking using p(yt|zt), whereas for inference we use p(yt|yt−1, zt) within (1). In the second sequence in fig. 6, we simultaneously reconstruct the motion of two people mimicking domestic activities, namely washing a window and picking an object. Here we do inference over a product, 12-dimensional state space consisting of the joint 6d state of each person. We obtain good 3D reconstruction results using only 5 hypotheses. Note, however, that the results are not perfect: there are small errors in the elbow and the bending of the knee for the subject on the left, and in the different wrist orientations for the subject on the right. This reflects the bias of our training set.
Model     Walk and turn   Conversation   Run and turn left
KDE-RR    10.46           7.95           5.22
RVM       4.95            4.96           5.02
KDE-RVM   7.57            6.31           6.25
BME       4.27            4.15           5.01
kBME      4.69            4.79           4.92
Table 1: Comparison of average joint angle prediction error for different models. All kPCA-based models use 6 output dimensions. Testing is done on 100 video frames for each sequence; the inputs are artificially generated silhouettes not in the training set. 3D joint angle ground truth is used for evaluation. KDE-RR is a KDE model with ridge regression (RR) for the feature space mapping; KDE-RVM uses an RVM. BME uses a Bayesian mixture of experts with no dimensionality reduction. kBME is our proposed model. kPCA-based methods use kernel regressors to compute pre-images.
[Figure 4 plots omitted: (a) histogram of how often the expert ranked k-th most probable is closest to the ground truth, per expert number; (b) histograms of how probability mass moves from each previously most probable expert (1st-5th) to the current expert.] Figure 4: (a, Left) Histogram showing the accuracy of various expert predictors: how many times the expert ranked as the k-th most probable by the model (horizontal axis) is closest to the ground truth. The model is consistent (the most probable expert indeed is the most accurate most frequently), but occasionally less probable experts are better. (b, Right) Histograms show the dynamics of p(yt|yt−1, zt), i.e. how the probability mass is redistributed among experts between two successive time steps, in a conversation sequence.
Model     Walk and turn back   Run and turn
KDE-RR    7.59                 17.7
RVM       6.9                  16.8
KDE-RVM   7.15                 16.08
BME       3.6                  8.2
kBME      3.72                 8.01
Table 2: Joint angle prediction error computed for two complex sequences with walks, runs and turns, thus more ambiguity (100 frames). Models have 6 state dimensions. Unimodal predictors average competing solutions. kBME has significantly lower error. Figure 5: Reconstruction of a jump (selected frames). Top: original image sequence. Middle: extracted silhouettes. Bottom: 3D reconstruction seen from a synthetic viewpoint. Figure 6: Reconstructing the activities of 2 people operating in a 12-d state space (each person has its own 6d state). Top: original image sequence. Bottom: 3D reconstruction seen from a synthetic viewpoint. 4 Conclusion We have presented a probabilistic framework for conditional inference in latent kernel-induced low-dimensional state spaces. Our approach has the following properties: (a) Accounts for non-linear correlations among input or output variables, by using kernel non-linear dimensionality reduction (kPCA); (b) Learns probability distributions over mappings between low-dimensional state spaces using a conditional Bayesian mixture of experts, as required for accurate prediction. In the resulting low-dimensional kBME predictor, ambiguities and multiple solutions common in visual, inverse perception problems are accurately represented. (c) Works in a continuous, conditional temporal probabilistic setting and offers a formal management of uncertainty. We show comparisons that demonstrate how the proposed approach outperforms regression, PCA or KDE alone for reconstructing 3D human motion in monocular video. In future work we will investigate scaling aspects for large training sets and alternative structured prediction methods. References [1] CMU Human Motion DataBase. Online at http://mocap.cs.cmu.edu/search.html, 2003. [2] A. Agarwal and B. Triggs. 3D human pose from silhouettes by Relevance Vector Regression. In CVPR, 2004. [3] G. Bakir, J. Weston, and B. Schölkopf. Learning to find pre-images. In NIPS, 2004. [4] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. PAMI, 24, 2002. [5] C. Bishop and M. Svensen. Bayesian mixtures of experts. In UAI, 2003. [6] J. Deutscher, A. Blake, and I. Reid. Articulated Body Motion Capture by Annealed Particle Filtering. In CVPR, 2000. [7] M. Jordan and R. Jacobs. Hierarchical mixtures of experts and the EM algorithm. Neural Computation, (6):181–214, 1994. [8] D. MacKay. Bayesian interpolation. Neural Computation, 4(5):720–736, 1992. [9] A. McCallum, D. Freitag, and F. Pereira. Maximum entropy Markov models for information extraction and segmentation.
In ICML, 2000. [10] R. Rosales and S. Sclaroff. Learning Body Pose Via Specialized Maps. In NIPS, 2002. [11] B. Schölkopf, A. Smola, and K. Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10:1299–1319, 1998. [12] G. Shakhnarovich, P. Viola, and T. Darrell. Fast Pose Estimation with Parameter Sensitive Hashing. In ICCV, 2003. [13] L. Sigal, S. Bhatia, S. Roth, M. Black, and M. Isard. Tracking Loose-limbed People. In CVPR, 2004. [14] C. Sminchisescu and A. Jepson. Generative Modeling for Continuous Non-Linearly Embedded Visual Inference. In ICML, pages 759–766, Banff, 2004. [15] C. Sminchisescu, A. Kanaujia, Z. Li, and D. Metaxas. Discriminative Density Propagation for 3D Human Motion Estimation. In CVPR, 2005. [16] C. Sminchisescu and B. Triggs. Kinematic Jump Processes for Monocular 3D Human Tracking. In CVPR, volume 1, pages 69–76, Madison, 2003. [17] M. Tipping. Sparse Bayesian learning and the Relevance Vector Machine. JMLR, 2001. [18] C. Tomasi, S. Petrov, and A. Sastry. 3D tracking = classification + interpolation. In ICCV, 2003. [19] J. Weston, O. Chapelle, A. Elisseeff, B. Schölkopf, and V. Vapnik. Kernel dependency estimation. In NIPS, 2002.
5 0.44134322 181 nips-2005-Spiking Inputs to a Winner-take-all Network
Author: Matthias Oster, Shih-Chii Liu
Abstract: Recurrent networks that perform a winner-take-all computation have been studied extensively. Although some of these studies include spiking networks, they consider only analog input rates. We present results of this winner-take-all computation on a network of integrate-and-fire neurons which receives spike trains as inputs. We show how we can configure the connectivity in the network so that the winner is selected after a pre-determined number of input spikes. We discuss spiking inputs with both regular frequencies and Poisson-distributed rates. The robustness of the computation was tested by implementing the winner-take-all network on an analog VLSI array of 64 integrate-and-fire neurons which have an innate variance in their operating parameters. 1
6 0.429735 96 nips-2005-Inference with Minimal Communication: a Decision-Theoretic Variational Approach
7 0.42392236 11 nips-2005-A Hierarchical Compositional System for Rapid Object Detection
8 0.42336616 106 nips-2005-Large-scale biophysical parameter estimation in single neurons via constrained linear regression
9 0.42211354 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models
10 0.42100075 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks
11 0.42048186 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity
12 0.41858625 90 nips-2005-Hot Coupling: A Particle Approach to Inference and Normalization on Pairwise Undirected Graphs
13 0.41057956 197 nips-2005-Unbiased Estimator of Shape Parameter for Spiking Irregularities under Changing Environments
14 0.41034314 32 nips-2005-Augmented Rescorla-Wagner and Maximum Likelihood Estimation
15 0.41006178 201 nips-2005-Variational Bayesian Stochastic Complexity of Mixture Models
16 0.40917626 30 nips-2005-Assessing Approximations for Gaussian Process Classification
17 0.40714866 46 nips-2005-Consensus Propagation
18 0.40459508 49 nips-2005-Convergence and Consistency of Regularized Boosting Algorithms with Stationary B-Mixing Observations
19 0.40346459 177 nips-2005-Size Regularized Cut for Data Clustering
20 0.4021478 74 nips-2005-Faster Rates in Regression via Active Learning