nips nips2002 nips2002-200 knowledge-graph by maker-knowledge-mining

200 nips-2002-Topographic Map Formation by Silicon Growth Cones


Source: pdf

Author: Brian Taba, Kwabena A. Boahen

Abstract: We describe a self-configuring neuromorphic chip that uses a model of activity-dependent axon remodeling to automatically wire topographic maps based solely on input correlations. Axons are guided by growth cones, which are modeled in analog VLSI for the first time. Growth cones migrate up neurotropin gradients, which are represented by charge diffusing in transistor channels. Virtual axons move by rerouting address-events. We refined an initially gross topographic projection by simulating retinal wave input. 1 Neuromorphic Systems Neuromorphic engineers are attempting to match the computational efficiency of biological systems by morphing neurocircuitry into silicon circuits [1]. One of the most detailed implementations to date is the silicon retina described in [2]. This chip comprises thirteen different cell types, each of which must be individually and painstakingly wired. While this circuit-level approach has been very successful in sensory systems, it is less helpful when modeling largely unelucidated and exceedingly plastic higher processing centers in cortex. Instead of an explicit blueprint for every cortical area, what is needed is a developmental rule that can wire complex circuits from minimal specifications. One candidate is the famous

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 We describe a self-configuring neuromorphic chip that uses a model of activity-dependent axon remodeling to automatically wire topographic maps based solely on input correlations. [sent-3, score-0.757]

2 Axons are guided by growth cones, which are modeled in analog VLSI for the first time. [sent-4, score-0.194]

3 Growth cones migrate up neurotropin gradients, which are represented by charge diffusing in transistor channels. [sent-5, score-0.931]

4 We refined an initially gross topographic projection by simulating retinal wave input. [sent-7, score-0.302]

5 1 Neuromorphic Systems Neuromorphic engineers are attempting to match the computational efficiency of biological systems by morphing neurocircuitry into silicon circuits [1]. [sent-8, score-0.161]

6 This chip comprises thirteen different cell types, each of which must be individually and painstakingly wired. [sent-10, score-0.137]

7 Instead of an explicit blueprint for every cortical area, what is needed is a developmental rule that can wire complex circuits from minimal specifications. [sent-12, score-0.194]

8 One candidate is the famous "cells that fire together wire together" rule, which strengthens excitatory connections between coactive presynaptic and postsynaptic cells. [sent-13, score-0.495]

9 We implemented a self-rewiring scheme of this type in silicon, taking our cue from axon remodeling during development. [sent-14, score-0.426]

10 2 Growth Cones During development, the brain wires axons into a myriad of topographic projections between regions. [sent-15, score-0.243]

11 These gross topographic projections are refined and maintained by subsequent neuronal spike activity, and can reroute post… (Figure 1). [sent-17, score-0.209]

12 Postsynaptic activity is transmitted to the next layer (up arrows) and releases neurotropin into the extracellular medium (down arrows). [sent-18, score-0.71]

13 Presynaptic activity excites postsynaptic dendrites (up arrows) and triggers neurotropin uptake by active growth cones (down arrows). [sent-20, score-1.272]

14 Each growth cone samples the neurotropin concentration at several spatial locations, measuring the gradient across the axon terminal. [sent-21, score-1.252]

15 In such cases, axons abandon obsolete territory and invade more promising targets [3]. [sent-26, score-0.144]

16 An axon grows by adding membrane and microtubule segments to its distal tip, an amoeboid body called a growth cone. [sent-27, score-0.592]

17 Growth cones extend and retract fingers of cytoplasm called filopodia, which are sensitive to local levels of guidance chemicals in the surrounding medium. [sent-28, score-0.196]

18 Candidate guidance chemicals include BDNF and NO, whose release can be triggered by action potentials in the target neuron [4]. [sent-29, score-0.129]

19 Our learning rule is based on an activity-derived diffusive chemical that guides growth cone migration. [sent-30, score-0.336]

20 In our model, this neurotropin is released by spiking neurons and diffuses in the extracellular medium until scavenged by glia or bound by growth cones (Figure 1A). [sent-31, score-1.013]

21 An active growth cone compares amounts of neurotropin bound to each of its filopodia in order to measure the local gradient (Figure 1B). [sent-32, score-0.924]

22 The growth cone then moves up the gradient, dragging the axon behind it. [sent-33, score-0.701]

23 Since neurotropin is released by postsynaptic activity and axon migration is driven by presynaptic activity, this rule translates temporal coincidence into spatial coincidence (Figure 1C). [sent-34, score-1.473]
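
A concrete reading of this rule, as a toy sketch rather than the chip's circuit: assume full site occupancy, let release be deposited directly at a cell's current terminal site (collapsing the postsynaptic neuron into one step), and take hex_neighbors as an assumed helper that lists the sites adjacent to a given site.

    # Toy rendering of the rule (not the chip circuit). `terminals` maps each
    # presynaptic cell to its current postsynaptic site and `lattice` maps sites
    # to neurotropin levels.
    def simulate_epoch(terminals, lattice, coactive_cells, hex_neighbors):
        # Postsynaptic release: coactive cells deposit neurotropin where they project.
        for cell in coactive_cells:
            site = terminals[cell]
            lattice[site] = lattice.get(site, 0.0) + 1.0
        # Presynaptic uptake: each active growth cone climbs the local gradient,
        # moving by swapping sites so synaptic density stays constant.
        for cell in coactive_cells:
            site = terminals[cell]
            candidates = [site] + hex_neighbors(site)
            best = max(candidates, key=lambda s: lattice.get(s, 0.0))
            if best != site:
                occupant = next((c for c, s in terminals.items() if s == best), None)
                if occupant is not None:
                    terminals[occupant] = site
                terminals[cell] = best
        # Activity-independent decay of extracellular neurotropin.
        for s in lattice:
            lattice[s] *= 0.5

Cells that fire in the same wave deposit neurotropin at overlapping sites, so over repeated epochs their terminals drift toward one another, which is the temporal-to-spatial translation described above.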

24 For topographic map formation, this migration rule requires temporal correlations in the presynaptic plane to reflect neighborhood relations. [sent-35, score-0.461]

25 We supply such correlations by simulating retinal waves, spontaneous bursts of action potentials that sweep across the ganglion cell layer in the developing mammalian retina. [sent-36, score-0.148]

26 Retinal waves start at random locations and spread over a limited domain before fading away, eventually tiling the entire retinal plane [5]. [sent-37, score-0.204]

27 Axons participating in the same retinal… (Figure 2). [sent-38, score-0.104]

28 Axon terminal (AT) and neuron (N) circuits are arrayed hexagonally, surrounded by a continuous charge-diffusing lattice. [sent-43, score-0.251]

29 An active axon terminal (ATx,y) excites the three adjacent neurons and its growth cone samples neurotropin from four adjacent lattice nodes. [sent-44, score-1.627]

30 The growth cone sends the measured gradient direction off-chip (VGCx,y). An active postsynaptic neuron (Nx,y) releases neurotropin into the six surrounding lattice nodes and sends its spike off-chip. [sent-45, score-1.361]

31 Presynaptic neurons send spikes to the lookup table (LUT), which routes them to axon terminal coordinates (AT) on-chip. [sent-48, score-0.661]

32 wave migrate to the same postsynaptic neighborhood, since neurotropin concentration is maximized when every cell that fires at the same time releases neurotropin at the same place. [sent-52, score-1.404]

33 To prevent all of the axons from collapsing onto a single postsynaptic target, we enforce a strictly constant synaptic density. [sent-53, score-0.375]

34 We have a fixed number of synaptic sites, each of which can be occupied by one and only one presynaptic afferent. [sent-54, score-0.286]

35 An axon terminal moves from one synaptic site to another by swapping places with the axon already occupying the desired location. [sent-55, score-1.049]

36 3 System Architecture We have fabricated and tested a first-generation neurotropin chip, Neurotrope 1, that implements retrograde transmission of a diffusive factor from postsynaptic neurons to presynaptic afferents (Figure 2A). [sent-57, score-1.023]

37 µm process, and includes a 40 x 20 array of growth cones interleaved with a 20 x 20 array of neurons. [sent-61, score-0.433]

38 Postsynaptic activity gates neurotropin release (left box) and presynaptic activity gates neurotropin uptake (right box). [sent-63, score-1.463]

39 The core of the chip consists of an array of axon terminals that target a second array of neurons, all surrounded by a monolithic pFET channel laid out as a hexagonal lattice, representing a two-dimensional extracellular medium. [sent-66, score-0.875]

40 An activated axon terminal generates postsynaptic potentials in all the fixed-radius dendritic arbors that span its location, as modeled by a diffusor network [8]. [sent-67, score-0.772]

41 Once the membrane potential crosses a threshold, the neuron fires, transmitting its coordinates off-chip and simultaneously releasing neurotropin, represented as charge spreading within the lattice. [sent-68, score-0.272]

42 Neurotropin diffuses spatially until removed by either an activity-independent leak current or an active axon terminal. [sent-69, score-0.501]
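
A rough software analogue of this charge dynamics, on an axial-coordinate hexagonal lattice with made-up rate constants; it mirrors the three processes named here (diffusion, activity-independent leak, uptake by active terminals) but is not calibrated to the chip.

    # Discrete-time sketch of neurotropin on a hexagonal lattice (axial coordinates).
    # `conc` maps every lattice node to its concentration; nodes off the lattice are
    # treated as absorbing. Rate constants are illustrative, not measured values.
    HEX_DIRS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

    def step(conc, releasing_nodes, uptaking_nodes, d=0.1, leak=0.02, uptake=0.3, q=1.0):
        new = {}
        for node, c in conc.items():
            neighbors = [(node[0] + dq, node[1] + dr) for dq, dr in HEX_DIRS]
            inflow = d * sum(conc.get(n, 0.0) for n in neighbors)
            c_next = c + inflow - 6 * d * c - leak * c      # diffusion plus leak
            if node in uptaking_nodes:                      # active axon terminal drains charge
                c_next -= uptake * c
            new[node] = max(c_next, 0.0)
        for node in releasing_nodes:                        # spiking neuron injects charge
            new[node] = new.get(node, 0.0) + q
        return new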

43 An axon terminal senses the local extracellular neurotropin gradient by draining charge from its own node on the hexagonal lattice and from the three immediately adjacent nodes. [sent-70, score-1.628]

44 The winner of this latency competition transmits a set of coordinates that uniquely identify the location and direction of the measured gradient. [sent-72, score-0.265]

45 We use the neuron circuit described in [9] to integrate neurotropin as well as dendritic potentials. [sent-73, score-0.695]

46 Coordinates transmitted off-chip thus fall into two categories: neuron spikes that are routed through the LUT, and gradient directions that are used to update entries in the LUT. [sent-74, score-0.23]

47 An axon migrates simply by looking up the entry in the table corresponding to the site it wants to occupy and swapping that address with that of its current location. [sent-75, score-0.471]
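
The bookkeeping behind such a swap can be pictured as follows; the class and method names are invented for illustration, and the only assumptions are those stated in the text (the LUT maps presynaptic addresses to axon-terminal coordinates, and migration exchanges two entries).

    class AxonRoutingTable:
        """Illustrative LUT: routes presynaptic spikes and realizes migration as swaps."""

        def __init__(self, initial_map):
            self.site_of = dict(initial_map)                        # presynaptic address -> site
            self.cell_at = {s: c for c, s in self.site_of.items()}  # site -> presynaptic address

        def route(self, presyn_address):
            # A presynaptic spike is steered to the coordinates of its axon terminal.
            return self.site_of[presyn_address]

        def swap(self, site_a, site_b):
            # The terminal at site_a wants site_b: exchange the two entries so the
            # one-afferent-per-site invariant is preserved.
            cell_a, cell_b = self.cell_at[site_a], self.cell_at[site_b]
            self.site_of[cell_a], self.site_of[cell_b] = site_b, site_a
            self.cell_at[site_a], self.cell_at[site_b] = cell_b, cell_a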

48 Thus, although the physical axon terminal circuits are immobilized in silicon, the virtual axons are free to move within the postsynaptic plane. [sent-77, score-0.967]

49 1 Neurotropin circuit Neurotropin in the extracellular medium is represented by charge in the hexagonal charge-diffusing lattice M1 (Figure 3). [sent-79, score-0.622]

50 VCDL sets the maximum amount of charge M1 can hold. [sent-80, score-0.182]

51 The total charge in M1 is determined by circuits that implement… (Figure 4: Latency competition circuit diagram). [sent-81, score-0.522]

52 A growth cone integrates neurotropin samples from its own location (right box) and the three neighboring locations (left three boxes). [sent-82, score-0.933]

53 The first location to accumulate a threshold of charge resets its three competitors and signals its identity off-chip. [sent-83, score-0.323]

54 Postsynaptic activity triggers neurotropin release, as implemented by the circuit in the left box of Figure 3. [sent-86, score-0.778]

55 Spikes from any of the three neighboring postsynaptic neurons pull Cspost to ground, opening M7 and discharging C/post through M4 and M5. [sent-87, score-0.247]

56 As C/post falls, M6 opens, establishing a transient path from Vdd to M1 that injects charge into the hexagonal lattice. [sent-88, score-0.345]

57 Upon termination of the postsynaptic spike, Cspost and C/post are recharged by decay currents through M2 and M3. [sent-89, score-0.211]

58 permitting C/post to integrate several postsynaptic spikes and facilitate charge injection if spikes arrive in a burst rather than singly. [sent-91, score-0.491]
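
The burst facilitation can be caricatured with a single leaky variable standing in for the voltage on C/post and a threshold standing in for M6; the constants below are arbitrary and the two-capacitor structure of the real circuit is collapsed into one state variable.

    # Each postsynaptic spike pushes a leaky state variable; charge is injected only
    # while it exceeds a threshold, so a burst injects far more than isolated spikes.
    def injected_charge(spike_times, dt=1e-4, t_end=0.06,
                        jump=0.4, tau=10e-3, theta=0.6):
        slow, injected = 0.0, 0.0
        spikes = {round(t / dt) for t in spike_times}
        for k in range(int(t_end / dt)):
            if k in spikes:
                slow += jump                            # spike discharges C/post a bit further
            slow -= dt * slow / tau                     # recovery current recharges C/post
            injected += dt * max(slow - theta, 0.0)     # injection only past the threshold
        return injected

    print(injected_charge([0.001, 0.003, 0.005]) >      # a burst of three spikes
          injected_charge([0.001, 0.020, 0.040]))       # the same spikes spread out -> True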

59 Presynaptic activity triggers neurotropin uptake, as implemented by the circuit in the right box of Figure 3. [sent-93, score-0.778]

60 Charge is removed from the hexagonal lattice by a facilitation circuit similar to that used for postsynaptic release. [sent-94, score-0.597]

61 A presynaptic spike targeted to the axon terminal pulls Cspre to ground through M24. [sent-95, score-0.856]

62 in turn, drains charge from C/pre through M21 and M22. [sent-97, score-0.182]

63 C/pre removes charge from the hexagonal lattice through M14, up to a limit set by M13, which prevents the hexagonal lattice from being completely drained in order to avoid charge trapping. [sent-98, score-0.939]

64 Depending on presynaptic activation, up to four axon terminals may sample a fraction of this current through M15-M18; the remainder is shunted to ground through M19 in order to prevent a single presynaptic event from exerting undue influence on gradient measurements. [sent-100, score-1.082]

65 The current sampled by the axon terminal at its own site is gated by ~sample0, which is pulled low by a presynaptic spike through M26 and subsequently recovers through M25. [sent-101, score-0.897]

66 Identical circuits in the other axon terminals generate signals ~sample1, ~sample2, and ~sample3. [sent-102, score-0.564]

67 Sample currents I0, I1, I2, and I3 are routed to latency competition circuits in the four adjacent axon terminals. [sent-103, score-0.681]

68 Randomly centered patches of active retinal cells (left) excite cortical targets (right). [sent-106, score-0.219]

69 Density plot of a single mobile growth cone initialized in a static topographic projection. [sent-108, score-0.43]

70 2 Latency competition circuit Each axon terminal measures the local neurotropin gradient by sampling a fraction of the neurotropin present at its own site, location 0, and the three immediately adjacent nodes on the hexagonal lattice, locations 1-3. [sent-113, score-2.086]

71 Charge drained from the hexagonal lattice at these four sites is integrated on a separate capacitor for each location. [sent-114, score-0.352]

72 The first capacitor to reach the threshold voltage wins the race, resetting itself and all of its competitors and signaling its victory off-chip. [sent-115, score-0.124]
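
In software terms the race reads as below; the per-step charge samples are synthetic inputs, and the tie-breaking rule is an assumption, since the analog circuit resolves order continuously.

    # Latency competition as a race of four integrate-to-threshold accumulators.
    # `samples` is a sequence of length-4 tuples of charge increments for
    # locations 0..3 (the terminal's own site and the three adjacent nodes).
    def latency_winner(samples, threshold=1.0):
        acc = [0.0, 0.0, 0.0, 0.0]
        for t, increments in enumerate(samples):
            for loc in range(4):
                acc[loc] += increments[loc]
            leaders = [loc for loc in range(4) if acc[loc] >= threshold]
            if leaders:
                # First to threshold wins; in the circuit it also resets its competitors.
                winner = max(leaders, key=lambda loc: acc[loc])   # assumed tie-break
                return winner, t        # location 0 means "stay put", 1-3 request a swap
        return None, len(samples)       # no location reached threshold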

73 In the circuit that samples neurotropin from location 1 (left box of Figure 4), charge pulses I1 arrive through diode M1 and accumulate on capacitor C1 in an integrate-and-fire circuit described in [9]. [sent-116, score-1.091]

74 During the time that ~so1 is low, the other three capacitors are shunted to ground by GRST, preventing late arrivals from corrupting the declared gradient measurement before it has been transmitted off-chip. [sent-118, score-0.151]

75 C1 being reset releases GRST to relax to ground through M24 with a decay time determined by Vgrst. C1 is also reset if the neighboring axon terminal initiates a swap. [sent-119, score-0.717]

76 GRST01 is pulled low if either the axon terminal at location 1 decides to move to location 0 or the axon terminal at location 0 decides to move to location 1. [sent-120, score-1.477]

77 The accumulated neurotropin samples at both locations become obsolete after the exchange, and are therefore discarded when GRST is pulled high through MS. [sent-121, score-0.638]

78 Identical circuits sample neurotropin from locations 2 and 3 (center two boxes of Figure 4). [sent-122, score-0.646]

79 If C0 (right box of Figure 4) wins the latency competition, the axon terminal decides that its current location is optimal and therefore no action is required. [sent-123, score-0.779]

80 Thus, the location 0 circuit is identical to those of locations 1-3 except that the inverted spike is fed directly back to the reset transistor M20 instead of to a communication circuit. [sent-125, score-0.368]

81 4 Results We drove the chip with a sequence of randomly centered patches of presynaptic activity meant to simulate retinal waves. [sent-127, score-0.539]

82 Each patch consisted of 19 adjacent presynaptic cells: a randomly selected presynaptic cell and its nearest, next-nearest, … [sent-128, score-0.621]

83 Axon terminals in the postsynaptic plane (right) are dyed according to the presynaptic coordinates of their cell body (left). [sent-153, score-0.664]

84 Map error in units of average postsynaptic distance between axon terminals of presynaptic neighbors. [sent-159, score-0.938]

85 and third-nearest presynaptic neighbors on a hexagonal grid (Figure 5A). [sent-161, score-0.401]
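
On an axial-coordinate hexagonal grid this 19-cell patch is exactly the set of cells within hex distance 2 of the chosen center (1 + 6 + 12 cells), so a patch generator can be sketched as below; truncation of patches near the array edge is an assumption not spelled out in the text.

    import random

    # Hexagonal (cube) distance between two cells given in axial coordinates.
    def hex_distance(a, b):
        dq, dr = a[0] - b[0], a[1] - b[1]
        return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

    # A simulated retinal-wave patch: a random center plus every cell within hex
    # distance 2, i.e. the center, its 6 nearest, and its 12 next- and
    # third-nearest neighbors.
    def random_patch(grid_cells):
        center = random.choice(grid_cells)
        return [c for c in grid_cells if hex_distance(c, center) <= 2]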

86 Every patch participant generated a burst of 8192 spikes, which were routed to the appropriate axon terminal circuit according to the connectivity map stored in the CAM. [sent-162, score-0.821]

87 Over 800 min, the single mobile growth cone wandered within the cortical area of the patch (Figure 5B), suggesting that the patch radius limits maximum sustainable topography even in the ideal case. [sent-165, score-0.492]

88 We opted for a fanout of 1 and full synaptic site occupancy, so 480 presynaptic cells projected axons to 480 synaptic sites. [sent-167, score-0.49]

89 (One side of the neuron array exhibited enhanced excitability, apparently due to noise on the power rails, so the 320 synaptic sites on that side were abandoned.) [sent-168, score-0.13]

90 The perturbed connectivity map preserved a loose global bias, representing the formation of a coarse topographic projection from activity-independent cues. [sent-169, score-0.294]

91 After approximately 12000 patches, a refined topographic projection reemerged (Figure 6A,B). [sent-171, score-0.198]

92 To investigate the dynamics of topographic refinement, we defined the error for a single presynaptic cell to be the average of the postsynaptic distances between the axon terminals projected by the cell body and its three immediate presynaptic neighbors. [sent-172, score-1.391]

93 A cell in a perfectly topographic projection would therefore have unit error. [sent-173, score-0.21]
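
Stated as code, under the assumptions that terminal_of maps each presynaptic cell to its terminal's postsynaptic coordinates, presyn_neighbors returns a cell's three immediate presynaptic neighbors, and postsynaptic distance is Euclidean distance in units of the site spacing:

    import math

    def cell_error(cell, terminal_of, presyn_neighbors):
        # Average postsynaptic distance between this cell's axon terminal and the
        # terminals of its three immediate presynaptic neighbors.
        dists = [math.dist(terminal_of[cell], terminal_of[n]) for n in presyn_neighbors(cell)]
        return sum(dists) / len(dists)

    def map_error(cells, terminal_of, presyn_neighbors):
        # Averaged over all cells; a perfect topographic projection, in which
        # presynaptic neighbors occupy adjacent postsynaptic sites, gives unit error.
        return sum(cell_error(c, terminal_of, presyn_neighbors) for c in cells) / len(cells)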

94 The error drops quickly at the beginning of the evolution as local clumps of correlated axon terminals crystallize. [sent-174, score-0.489]

95 Further refinement requires the disassembly of locally topographic crystals that happened to nucleate in a globally inconvenient location. [sent-175, score-0.166]

96 5 Discussion Our results demonstrate the feasibility of a spike-based neuromorphic learning system based on principles of developmental plasticity. [sent-178, score-0.104]

97 This neurotropin chip lends itself readily to more ambitious multichip systems incorporating silicon retinae that could be used to automatically wire ocular dominance columns and orientation-selectivity maps when driven by spatiotemporal correlations among neurons of different origin (e. [sent-179, score-0.766]

98 Axon weights saturate at neurotrophin-rich locations and vanish at neurotrophin-starved locations, pruning a dense initial arbor until only the final circuit remains [10]. [sent-183, score-0.209]

99 By contrast, in our chemotaxis model, a handful of growth cone-guided wires rearrange themselves by moving through locations at which they had no initial presence. [sent-184, score-0.26]

100 Zaghloul (2002) A silicon implementation of a novel model for retinal processing. [sent-199, score-0.19]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('neurotropin', 0.505), ('axon', 0.398), ('presynaptic', 0.238), ('postsynaptic', 0.211), ('growth', 0.194), ('charge', 0.182), ('cones', 0.163), ('hexagonal', 0.163), ('terminal', 0.132), ('topographic', 0.127), ('axons', 0.116), ('circuit', 0.115), ('cone', 0.109), ('lattice', 0.108), ('retinal', 0.104), ('vdd', 0.099), ('chip', 0.093), ('terminals', 0.091), ('silicon', 0.086), ('grst', 0.082), ('circuits', 0.075), ('lut', 0.071), ('box', 0.067), ('locations', 0.066), ('neuromorphic', 0.065), ('patch', 0.061), ('activity', 0.06), ('location', 0.059), ('competition', 0.057), ('swap', 0.057), ('releases', 0.057), ('routed', 0.057), ('latency', 0.054), ('extracellular', 0.054), ('release', 0.052), ('spike', 0.05), ('cspost', 0.049), ('migrate', 0.049), ('sol', 0.049), ('transmits', 0.049), ('spikes', 0.049), ('boahen', 0.048), ('capacitor', 0.048), ('synaptic', 0.048), ('gradient', 0.046), ('reset', 0.046), ('coordinates', 0.046), ('wire', 0.046), ('cell', 0.044), ('neuron', 0.044), ('patches', 0.044), ('uptake', 0.043), ('competitors', 0.043), ('adjacent', 0.04), ('site', 0.04), ('pulled', 0.039), ('developmental', 0.039), ('resets', 0.039), ('refinement', 0.039), ('projection', 0.039), ('array', 0.038), ('ground', 0.038), ('active', 0.037), ('coarse', 0.037), ('neurons', 0.036), ('decides', 0.036), ('move', 0.035), ('cortical', 0.034), ('transmitted', 0.034), ('vm', 0.034), ('plane', 0.034), ('arrows', 0.033), ('activityindependent', 0.033), ('chemicals', 0.033), ('diffuses', 0.033), ('diffusive', 0.033), ('drained', 0.033), ('filopodia', 0.033), ('fires', 0.033), ('kwabena', 0.033), ('migration', 0.033), ('shunted', 0.033), ('sustainable', 0.033), ('swapping', 0.033), ('vgc', 0.033), ('wins', 0.033), ('refined', 0.032), ('transistor', 0.032), ('sp', 0.031), ('triggers', 0.031), ('dendritic', 0.031), ('map', 0.029), ('connectivity', 0.029), ('excites', 0.028), ('remodeling', 0.028), ('arbor', 0.028), ('wiring', 0.028), ('obsolete', 0.028), ('released', 0.028), ('race', 0.028)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 200 nips-2002-Topographic Map Formation by Silicon Growth Cones

Author: Brian Taba, Kwabena A. Boahen

Abstract: We describe a self-configuring neuromorphic chip that uses a model of activity-dependent axon remodeling to automatically wire topographic maps based solely on input correlations. Axons are guided by growth cones, which are modeled in analog VLSI for the first time. Growth cones migrate up neurotropin gradients, which are represented by charge diffusing in transistor channels. Virtual axons move by rerouting address-events. We refined an initially gross topographic projection by simulating retinal wave input. 1 Neuromorphic Systems Neuromorphic engineers are attempting to match the computational efficiency of biological systems by morphing neurocircuitry into silicon circuits [1]. One of the most detailed implementations to date is the silicon retina described in [2] . This chip comprises thirteen different cell types, each of which must be individually and painstakingly wired. While this circuit-level approach has been very successful in sensory systems, it is less helpful when modeling largely unelucidated and exceedingly plastic higher processing centers in cortex. Instead of an explicit blueprint for every cortical area, what is needed is a developmental rule that can wire complex circuits from minimal specifications. One candidate is the famous

2 0.20462838 102 nips-2002-Hidden Markov Model of Cortical Synaptic Plasticity: Derivation of the Learning Rule

Author: Michael Eisele, Kenneth D. Miller

Abstract: Cortical synaptic plasticity depends on the relative timing of pre- and postsynaptic spikes and also on the temporal pattern of presynaptic spikes and of postsynaptic spikes. We study the hypothesis that cortical synaptic plasticity does not associate individual spikes, but rather whole firing episodes, and depends only on when these episodes start and how long they last, but as little as possible on the timing of individual spikes. Here we present the mathematical background for such a study. Standard methods from hidden Markov models are used to define what “firing episodes” are. Estimating the probability of being in such an episode requires not only the knowledge of past spikes, but also of future spikes. We show how to construct a causal learning rule, which depends only on past spikes, but associates pre- and postsynaptic firing episodes as if it also knew future spikes. We also show that this learning rule agrees with some features of synaptic plasticity in superficial layers of rat visual cortex (Froemke and Dan, Nature 416:433, 2002).

3 0.18910135 186 nips-2002-Spike Timing-Dependent Plasticity in the Address Domain

Author: R. J. Vogelstein, Francesco Tenore, Ralf Philipp, Miriam S. Adlerstein, David H. Goldberg, Gert Cauwenberghs

Abstract: Address-event representation (AER), originally proposed as a means to communicate sparse neural events between neuromorphic chips, has proven efficient in implementing large-scale networks with arbitrary, configurable synaptic connectivity. In this work, we further extend the functionality of AER to implement arbitrary, configurable synaptic plasticity in the address domain. As proof of concept, we implement a biologically inspired form of spike timing-dependent plasticity (STDP) based on relative timing of events in an AER framework. Experimental results from an analog VLSI integrate-and-fire network demonstrate address domain learning in a task that requires neurons to group correlated inputs.

4 0.13133185 50 nips-2002-Circuit Model of Short-Term Synaptic Dynamics

Author: Shih-Chii Liu, Malte Boegershausen, Pascal Suter

Abstract: We describe a model of short-term synaptic depression that is derived from a silicon circuit implementation. The dynamics of this circuit model are similar to the dynamics of some present theoretical models of short-term depression except that the recovery dynamics of the variable describing the depression is nonlinear and it also depends on the presynaptic frequency. The equations describing the steady-state and transient responses of this synaptic model fit the experimental results obtained from a fabricated silicon network consisting of leaky integrate-and-fire neurons and different types of synapses. We also show experimental data demonstrating the possible computational roles of depression. One possible role of a depressing synapse is that the input can quickly bring the neuron up to threshold when the membrane potential is close to the resting potential.

5 0.11262815 154 nips-2002-Neuromorphic Bisable VLSI Synapses with Spike-Timing-Dependent Plasticity

Author: Giacomo Indiveri

Abstract: We present analog neuromorphic circuits for implementing bistable synapses with spike-timing-dependent plasticity (STDP) properties. In these types of synapses, the short-term dynamics of the synaptic efficacies are governed by the relative timing of the pre- and post-synaptic spikes, while on long time scales the efficacies tend asymptotically to either a potentiated state or to a depressed one. We fabricated a prototype VLSI chip containing a network of integrate and fire neurons interconnected via bistable STDP synapses. Test results from this chip demonstrate the synapse’s STDP learning properties, and its long-term bistable characteristics.

6 0.098233879 47 nips-2002-Branching Law for Axons

7 0.09766461 66 nips-2002-Developing Topography and Ocular Dominance Using Two aVLSI Vision Sensors and a Neurotrophic Model of Plasticity

8 0.096061192 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip

9 0.090827622 76 nips-2002-Dynamical Constraints on Computing with Spike Timing in the Cortex

10 0.085344188 23 nips-2002-Adaptive Quantization and Density Estimation in Silicon

11 0.079822823 180 nips-2002-Selectivity and Metaplasticity in a Unified Calcium-Dependent Model

12 0.076844193 62 nips-2002-Coulomb Classifiers: Generalizing Support Vector Machines via an Analogy to Electrostatic Systems

13 0.075400233 91 nips-2002-Field-Programmable Learning Arrays

14 0.073803566 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits

15 0.070384622 171 nips-2002-Reconstructing Stimulus-Driven Neural Networks from Spike Times

16 0.068199269 129 nips-2002-Learning in Spiking Neural Assemblies

17 0.066941068 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons

18 0.064082317 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities

19 0.062474076 43 nips-2002-Binary Coding in Auditory Cortex

20 0.054150846 57 nips-2002-Concurrent Object Recognition and Segmentation by Graph Partitioning


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.125), (1, 0.185), (2, 0.013), (3, -0.059), (4, 0.045), (5, 0.212), (6, 0.149), (7, -0.026), (8, -0.011), (9, -0.041), (10, 0.073), (11, 0.002), (12, -0.027), (13, 0.035), (14, 0.099), (15, 0.058), (16, 0.014), (17, -0.046), (18, 0.05), (19, -0.028), (20, 0.021), (21, 0.006), (22, 0.018), (23, -0.038), (24, -0.079), (25, -0.061), (26, 0.015), (27, -0.04), (28, -0.042), (29, -0.157), (30, 0.033), (31, -0.056), (32, -0.002), (33, -0.037), (34, -0.195), (35, 0.026), (36, -0.022), (37, -0.147), (38, 0.083), (39, 0.13), (40, -0.057), (41, 0.115), (42, -0.032), (43, -0.085), (44, -0.038), (45, 0.065), (46, -0.14), (47, 0.064), (48, 0.129), (49, -0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97592694 200 nips-2002-Topographic Map Formation by Silicon Growth Cones

Author: Brian Taba, Kwabena A. Boahen

Abstract: We describe a self-configuring neuromorphic chip that uses a model of activity-dependent axon remodeling to automatically wire topographic maps based solely on input correlations. Axons are guided by growth cones, which are modeled in analog VLSI for the first time. Growth cones migrate up neurotropin gradients, which are represented by charge diffusing in transistor channels. Virtual axons move by rerouting address-events. We refined an initially gross topographic projection by simulating retinal wave input. 1 Neuromorphic Systems Neuromorphic engineers are attempting to match the computational efficiency of biological systems by morphing neurocircuitry into silicon circuits [1]. One of the most detailed implementations to date is the silicon retina described in [2] . This chip comprises thirteen different cell types, each of which must be individually and painstakingly wired. While this circuit-level approach has been very successful in sensory systems, it is less helpful when modeling largely unelucidated and exceedingly plastic higher processing centers in cortex. Instead of an explicit blueprint for every cortical area, what is needed is a developmental rule that can wire complex circuits from minimal specifications. One candidate is the famous

2 0.7159642 186 nips-2002-Spike Timing-Dependent Plasticity in the Address Domain

Author: R. J. Vogelstein, Francesco Tenore, Ralf Philipp, Miriam S. Adlerstein, David H. Goldberg, Gert Cauwenberghs

Abstract: Address-event representation (AER), originally proposed as a means to communicate sparse neural events between neuromorphic chips, has proven efficient in implementing large-scale networks with arbitrary, configurable synaptic connectivity. In this work, we further extend the functionality of AER to implement arbitrary, configurable synaptic plasticity in the address domain. As proof of concept, we implement a biologically inspired form of spike timing-dependent plasticity (STDP) based on relative timing of events in an AER framework. Experimental results from an analog VLSI integrate-and-fire network demonstrate address domain learning in a task that requires neurons to group correlated inputs.

3 0.592942 102 nips-2002-Hidden Markov Model of Cortical Synaptic Plasticity: Derivation of the Learning Rule

Author: Michael Eisele, Kenneth D. Miller

Abstract: Cortical synaptic plasticity depends on the relative timing of pre- and postsynaptic spikes and also on the temporal pattern of presynaptic spikes and of postsynaptic spikes. We study the hypothesis that cortical synaptic plasticity does not associate individual spikes, but rather whole firing episodes, and depends only on when these episodes start and how long they last, but as little as possible on the timing of individual spikes. Here we present the mathematical background for such a study. Standard methods from hidden Markov models are used to define what “firing episodes” are. Estimating the probability of being in such an episode requires not only the knowledge of past spikes, but also of future spikes. We show how to construct a causal learning rule, which depends only on past spikes, but associates pre- and postsynaptic firing episodes as if it also knew future spikes. We also show that this learning rule agrees with some features of synaptic plasticity in superficial layers of rat visual cortex (Froemke and Dan, Nature 416:433, 2002).

4 0.57200438 154 nips-2002-Neuromorphic Bisable VLSI Synapses with Spike-Timing-Dependent Plasticity

Author: Giacomo Indiveri

Abstract: We present analog neuromorphic circuits for implementing bistable synapses with spike-timing-dependent plasticity (STDP) properties. In these types of synapses, the short-term dynamics of the synaptic efficacies are governed by the relative timing of the pre- and post-synaptic spikes, while on long time scales the efficacies tend asymptotically to either a potentiated state or to a depressed one. We fabricated a prototype VLSI chip containing a network of integrate and fire neurons interconnected via bistable STDP synapses. Test results from this chip demonstrate the synapse’s STDP learning properties, and its long-term bistable characteristics.

5 0.49754074 66 nips-2002-Developing Topography and Ocular Dominance Using Two aVLSI Vision Sensors and a Neurotrophic Model of Plasticity

Author: Terry Elliott, Jörg Kramer

Abstract: A neurotrophic model for the co-development of topography and ocular dominance columns in the primary visual cortex has recently been proposed. In the present work, we test this model by driving it with the output of a pair of neuronal vision sensors stimulated by disparate moving patterns. We show that the temporal correlations in the spike trains generated by the two sensors elicit the development of refined topography and ocular dominance columns, even in the presence of significant amounts of spontaneous activity and fixed-pattern noise in the sensors.

6 0.47170541 47 nips-2002-Branching Law for Axons

7 0.45023644 180 nips-2002-Selectivity and Metaplasticity in a Unified Calcium-Dependent Model

8 0.40469748 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip

9 0.39633828 50 nips-2002-Circuit Model of Short-Term Synaptic Dynamics

10 0.37161481 91 nips-2002-Field-Programmable Learning Arrays

11 0.31228134 23 nips-2002-Adaptive Quantization and Density Estimation in Silicon

12 0.29394248 76 nips-2002-Dynamical Constraints on Computing with Spike Timing in the Cortex

13 0.2592763 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits

14 0.22574539 206 nips-2002-Visual Development Aids the Acquisition of Motion Velocity Sensitivities

15 0.22038063 67 nips-2002-Discriminative Binaural Sound Localization

16 0.20531395 190 nips-2002-Stochastic Neighbor Embedding

17 0.18484342 62 nips-2002-Coulomb Classifiers: Generalizing Support Vector Machines via an Analogy to Electrostatic Systems

18 0.17654237 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons

19 0.17388044 43 nips-2002-Binary Coding in Auditory Cortex

20 0.17096673 60 nips-2002-Convergence Properties of Some Spike-Triggered Analysis Techniques


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.013), (42, 0.047), (54, 0.076), (55, 0.021), (58, 0.022), (67, 0.021), (68, 0.05), (74, 0.095), (83, 0.428), (92, 0.013), (98, 0.076)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.91738886 200 nips-2002-Topographic Map Formation by Silicon Growth Cones

Author: Brian Taba, Kwabena A. Boahen

Abstract: We describe a self-configuring neuromorphic chip that uses a model of activity-dependent axon remodeling to automatically wire topographic maps based solely on input correlations. Axons are guided by growth cones, which are modeled in analog VLSI for the first time. Growth cones migrate up neurotropin gradients, which are represented by charge diffusing in transistor channels. Virtual axons move by rerouting address-events. We refined an initially gross topographic projection by simulating retinal wave input. 1 Neuromorphic Systems Neuromorphic engineers are attempting to match the computational efficiency of biological systems by morphing neurocircuitry into silicon circuits [1]. One of the most detailed implementations to date is the silicon retina described in [2] . This chip comprises thirteen different cell types, each of which must be individually and painstakingly wired. While this circuit-level approach has been very successful in sensory systems, it is less helpful when modeling largely unelucidated and exceedingly plastic higher processing centers in cortex. Instead of an explicit blueprint for every cortical area, what is needed is a developmental rule that can wire complex circuits from minimal specifications. One candidate is the famous

2 0.89539194 91 nips-2002-Field-Programmable Learning Arrays

Author: Seth Bridges, Miguel Figueroa, Chris Diorio, Daniel J. Hsu

Abstract: This paper introduces the Field-Programmable Learning Array, a new paradigm for rapid prototyping of learning primitives and machinelearning algorithms in silicon. The FPLA is a mixed-signal counterpart to the all-digital Field-Programmable Gate Array in that it enables rapid prototyping of algorithms in hardware. Unlike the FPGA, the FPLA is targeted directly for machine learning by providing local, parallel, online analog learning using floating-gate MOS synapse transistors. We present a prototype FPLA chip comprising an array of reconfigurable computational blocks and local interconnect. We demonstrate the viability of this architecture by mapping several learning circuits onto the prototype chip.

3 0.72775447 168 nips-2002-Real-Time Monitoring of Complex Industrial Processes with Particle Filters

Author: Rubén Morales-menéndez, Nando D. Freitas, David Poole

Abstract: This paper discusses the application of particle filtering algorithms to fault diagnosis in complex industrial processes. We consider two ubiquitous processes: an industrial dryer and a level tank. For these applications, we compared three particle filtering variants: standard particle filtering, Rao-Blackwellised particle filtering and a version of RaoBlackwellised particle filtering that does one-step look-ahead to select good sampling regions. We show that the overhead of the extra processing per particle of the more sophisticated methods is more than compensated by the decrease in error and variance.

4 0.66955549 130 nips-2002-Learning in Zero-Sum Team Markov Games Using Factored Value Functions

Author: Michail G. Lagoudakis, Ronald Parr

Abstract: We present a new method for learning good strategies in zero-sum Markov games in which each side is composed of multiple agents collaborating against an opposing team of agents. Our method requires full observability and communication during learning, but the learned policies can be executed in a distributed manner. The value function is represented as a factored linear architecture and its structure determines the necessary computational resources and communication bandwidth. This approach permits a tradeoff between simple representations with little or no communication between agents and complex, computationally intensive representations with extensive coordination between agents. Thus, we provide a principled means of using approximation to combat the exponential blowup in the joint action space of the participants. The approach is demonstrated with an example that shows the efficiency gains over naive enumeration.

5 0.53925216 23 nips-2002-Adaptive Quantization and Density Estimation in Silicon

Author: Seth Bridges, Miguel Figueroa, Chris Diorio, Daniel J. Hsu

Abstract: We present the bump mixture model, a statistical model for analog data where the probabilistic semantics, inference, and learning rules derive from low-level transistor behavior. The bump mixture model relies on translinear circuits to perform probabilistic inference, and floating-gate devices to perform adaptation. This system is low power, asynchronous, and fully parallel, and supports various on-chip learning algorithms. In addition, the mixture model can perform several tasks such as probability estimation, vector quantization, classification, and clustering. We tested a fabricated system on clustering, quantization, and classification of handwritten digits and show performance comparable to the E-M algorithm on mixtures of Gaussians.

6 0.45892316 186 nips-2002-Spike Timing-Dependent Plasticity in the Address Domain

7 0.44058281 154 nips-2002-Neuromorphic Bisable VLSI Synapses with Spike-Timing-Dependent Plasticity

8 0.43275782 50 nips-2002-Circuit Model of Short-Term Synaptic Dynamics

9 0.41299376 177 nips-2002-Retinal Processing Emulation in a Programmable 2-Layer Analog Array Processor CMOS Chip

10 0.36566848 175 nips-2002-Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games

11 0.36518472 4 nips-2002-A Differential Semantics for Jointree Algorithms

12 0.36079201 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits

13 0.33957601 37 nips-2002-Automatic Derivation of Statistical Algorithms: The EM Family and Beyond

14 0.33378932 66 nips-2002-Developing Topography and Ocular Dominance Using Two aVLSI Vision Sensors and a Neurotrophic Model of Plasticity

15 0.33098525 51 nips-2002-Classifying Patterns of Visual Motion - a Neuromorphic Approach

16 0.33082789 173 nips-2002-Recovering Intrinsic Images from a Single Image

17 0.32991511 132 nips-2002-Learning to Detect Natural Image Boundaries Using Brightness and Texture

18 0.32810262 5 nips-2002-A Digital Antennal Lobe for Pattern Equalization: Analysis and Design

19 0.32800993 74 nips-2002-Dynamic Structure Super-Resolution

20 0.32742622 124 nips-2002-Learning Graphical Models with Mercer Kernels