2.5 Introducing time

Although time reared its head in the last section, it appeared by the back door, as it were, and was not intrinsic to the dynamics of the unit: we could choose not to integrate or, equivalently, set N=1. The way to model the temporal summation of PSPs at the axon hillock is to use the rate of change of the activation, rather than the activation itself, as the fundamental defining quantity. A full treatment requires a branch of mathematics known as the calculus, but the resulting behaviour may be described in a reasonably straightforward way. We shall, however, adopt the calculus notation dx/dt for the rate of change of a quantity x. It cannot be overemphasized that this is to be read as a single symbolic entity, "dx/dt", and not as dx divided by dt. To avoid confusion with the previous notation it is necessary to introduce another symbol for the weighted sum of inputs, so we define

s = Σ_i w_i x_i   (2.5)

The rate of change of the activation, da/dt, is then defined by

da/dt = -αa + βs   (2.6)

where α (alpha) and β (beta) are positive constants. The first term gives rise to activation decay, while the second represents the input from the other units. As usual, the output y is given by the sigmoid of the activation, y = σ(a). A unit like this is sometimes known as a leaky integrator, for reasons that will become apparent shortly.

There is an exact physical analogue for the leaky integrator with which we are all familiar. Consider a tank of water that has a narrow outlet near the base and that is also being fed by a hose or tap, as shown in Figure 2.9 (we might think of a bathtub with a smaller drainage hole than is usual). Let the rate at which the water is flowing through the hose be s litres per minute and let the depth of water be a. If the outlet were plugged, the rate of change of water level would be proportional to s,
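The dynamics of (2.6) are easy to explore numerically. The sketch below integrates da/dt = -αa + βs with a simple Euler step; the constants, step size and input value are illustrative assumptions, not taken from the text:

```python
# Euler integration of the leaky-integrator equation da/dt = -alpha*a + beta*s.
# alpha, beta, the step size dt and the input s are illustrative choices.
alpha, beta = 1.0, 1.0
dt = 0.01

def euler_step(a, s):
    """Advance the activation a by one Euler step of size dt under input s."""
    return a + dt * (-alpha * a + beta * s)

a = 0.0
for _ in range(1000):          # hold a constant input s = 1 for 10 time units
    a = euler_step(a, 1.0)

print(round(a, 3))             # → 1.0, the equilibrium value (beta/alpha)*s
```

With equal constants the activation settles at the input value s, just as the water-tank analogy below suggests.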
or da/dt = βs, where β is a constant. Now suppose there is no inflow, but the outlet is working. The rate at which water leaves is directly proportional to the water pressure at the outlet, which is, in turn, proportional to the depth of water in the tank. Thus, the rate of water emission may be written as αa litres per minute, where α is some constant. The water level is now decreasing, so its rate of change is now negative and we have da/dt = -αa. If both hose and outlet are functioning then da/dt is the sum of contributions from both, and its governing equation is just the same as that for the neural activation in (2.6). During the subsequent discussion it might be worthwhile referring back to this analogy if the reader has any doubts about what is taking place.

Figure 2.9 Water tank analogy for leaky integrators.

Returning to the neural model, the activation can be negative or positive (whereas the water level is always positive in the tank). Thus, on putting s=0, so that the unit has no external input, there are two cases:

(a) a>0. Then da/dt<0. That is, the rate of change is negative, signifying a decrease of a with time.

(b) a<0. Then da/dt>0. That is, the rate of change is positive, signifying an increase of a with time.

These are illustrated in Figure 2.10, in which the left and right sides correspond to cases (a) and (b) respectively. In both instances the activity gradually approaches its resting value of zero. It is this decay process that leads to the "leaky" part of the unit's name. In a TLU or semilinear node, if we withdraw input, the activity immediately becomes zero. In the new model, however, the unit has a kind of short-term memory of its previous input before it was withdrawn. Thus, if this was negative, the activation remains negative for a while afterwards, with a corresponding condition holding for recently withdrawn positive input.
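Both decay cases can be checked with the same kind of Euler scheme: setting s=0 in (2.6) leaves only the decay term. The constants below are again illustrative assumptions:

```python
# With no input (s = 0), equation (2.6) reduces to da/dt = -alpha*a,
# so the activation decays toward zero from either sign.
# alpha and the step size dt are illustrative choices.
alpha, dt = 1.0, 0.01

def decay(a, steps):
    """Run the input-free leaky integrator for the given number of steps."""
    for _ in range(steps):
        a += dt * (-alpha * a)
    return a

pos = decay(2.0, 500)    # case (a): a > 0, so da/dt < 0 and a falls toward 0
neg = decay(-2.0, 500)   # case (b): a < 0, so da/dt > 0 and a rises toward 0
```

Note that the activation only approaches zero; it never overshoots, so a recently negative activation stays negative for a while, giving the short-term memory described above.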
Figure 2.10 Activation decay in leaky integrator.

Suppose now that we start with activation zero and no input, and supply a constant input s=1 for a time t before withdrawing it again. The activation resulting from this is shown in Figure 2.11. The activation starts to increase but does so rather sluggishly. After s is taken down to zero, a decays in the way described above. If s had been maintained long enough, then a would have eventually reached a constant value. To see what this is we put da/dt=0, since this is a statement of there being no rate of change of a, and a is constant at some equilibrium value a_eqm. Putting da/dt=0 in (2.6) gives

a_eqm = (β/α)s   (2.7)

that is, a constant fraction of s. If α=β then a_eqm=s. The speed at which a can respond to an input change may be characterized by the time taken to reach some fraction of a_eqm (0.75a_eqm, say) and is called the rise-time.

Figure 2.11 Input pulse to leaky integrator.

Suppose now that a further input pulse is presented soon after the first has been withdrawn. The new behaviour is shown in Figure 2.12. Now the activation starts to pick up again as the second input signal is delivered and, since a has not had
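Equation (2.7) and the rise-time can be illustrated with the same Euler scheme. Here α≠β is chosen so that the equilibrium is a genuine fraction of s; the particular constants are assumptions for illustration only:

```python
# Equilibrium a_eqm = (beta/alpha)*s from (2.7), and an empirical rise-time:
# the simulated time taken to reach 0.75*a_eqm. Constants are illustrative.
alpha, beta, dt, s = 2.0, 1.0, 0.01, 1.0

a_eqm = (beta / alpha) * s      # equilibrium value: 0.5, half the input level

a, steps = 0.0, 0
while a < 0.75 * a_eqm:         # integrate until 75% of equilibrium is reached
    a += dt * (-alpha * a + beta * s)
    steps += 1

rise_time = steps * dt          # close to the analytic value ln(4)/alpha ≈ 0.69
```

Doubling both α and β would leave a_eqm unchanged but halve the rise-time, which anticipates the limiting argument in the next paragraph.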
time to decay to its resting value in the interim, the peak value obtained this time is larger than before. Thus the two signals interact with each other and there is temporal summation or integration (the "integrator" part of the unit's name). In a TLU, the activation would, of course, just be equal to s. The values of the constants α and β govern the decay rate and rise-time respectively and, as they are increased, the decay rate increases and the rise-time falls. Keeping α=β and letting both become very large therefore allows a to rise and fall very quickly and to reach equilibrium at s. As these constants are increased further, the resulting behaviour of a becomes indistinguishable from that of a TLU, which can therefore be thought of as a special case of the leaky integrator with very large constants α, β (and, of course, a very steep sigmoid).

Leaky integrators find their main application in self-organizing nets (Ch. 8). They have been studied extensively by Stephen Grossberg, who provides a review in Grossberg (1988). What Grossberg calls the "additive STM model" is essentially the same as that developed here, but he also goes on to describe another, the "shunting STM" neuron, which is rather different.

This completes our first foray into the realm of artificial neurons. It is adequate for most of the material in the rest of this book but, to round out the story, Chapter 10 introduces some alternative structures.

Figure 2.12 Leaky-integrator activation (solid line) for two square input pulses (dashed line).
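Temporal summation as in Figure 2.12 can be reproduced by driving the same Euler scheme with two square pulses. The pulse timings and constants below are illustrative assumptions:

```python
# Two identical input pulses in quick succession: the second peak is higher
# because the activation has not decayed to rest between the pulses.
# alpha, beta, dt and the pulse timings are illustrative choices.
alpha, beta, dt = 1.0, 1.0, 0.01

def pulses(t):
    """Unit input during [0, 1) and [1.5, 2.5); zero otherwise."""
    return 1.0 if (0.0 <= t < 1.0) or (1.5 <= t < 2.5) else 0.0

a, t = 0.0, 0.0
peak1 = peak2 = 0.0
while t < 3.0:
    a += dt * (-alpha * a + beta * pulses(t))
    if t < 1.5:
        peak1 = max(peak1, a)   # peak reached during the first pulse
    else:
        peak2 = max(peak2, a)   # peak reached during the second pulse
    t += dt
```

Here peak2 exceeds peak1, which is precisely the temporal summation described above: the second pulse builds on the residue of the first.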
2.6 Summary

The function of real neurons is extremely complex. However, the essential information-processing attributes may be summarized as follows. A neuron receives input signals from many other (afferent) neurons. Each such signal is modulated (by the synaptic mechanism) from the voltage spike of an action potential into a continuously variable (graded) postsynaptic potential (PSP). PSPs are integrated by the dendritic arbors over both space (many synaptic inputs) and time (PSPs do not decay to zero instantaneously). PSPs may be excitatory or inhibitory and their integrated result is a change in the membrane potential at the axon hillock, which may serve to depolarize (excite or activate) or hyperpolarize (inhibit) the neuron. The dynamics of the membrane under these changes are complex but may be described in many instances by supposing that there is a membrane-potential threshold, beyond which an action potential is generated and below which no such event takes place. The train of action potentials constitutes the neural "output". They travel away from the cell body along the axon until they reach axon terminals (at synapses), upon which the cycle of events is initiated once again. Information is encoded in many ways in neurons but a common method is to make use of the frequency or rate of production of action potentials.

The integration of signals over space may be modelled using a linear weighted sum of inputs. Synaptic action is then supposed to be equivalent to multiplication by a weight. The TLU models the action potential by a simple threshold mechanism that allows two signal levels (0 or 1). The rate of firing may be represented directly in a semilinear node by allowing a continuous-valued output or (in the stochastic variant) by using this value as a probability for the production of signal pulses. Integration over time is catered for in the leaky-integrator model.
All artificial neurons show robust behaviour under degradation of input signals and hardware failure.