A Computational Model of Auditory Processing for Sound Localization Diana Stan, Michael Reed Department of Mathematics, Duke University. Abstract. AVCN and MNTB Output Curves; LSO Inputs.
A Computational Model of Auditory Processing for Sound Localization
Diana Stan, Michael Reed
Department of Mathematics, Duke University
AVCN and MNTB Output Curves; LSO Inputs
The auditory system is such that certain topological regions respond to a certain frequency much better than all others; this is shown through tuning curves. Thus, when thinking of our model, we can study only one such “frequency slab”, as we assume that all distinct tonotopic regions make computations in the same way. We can view the neurons of the three participatory brain parts as being columns for ease, and then assign cell numbers to neurons going up the columns. We assume that both the AVCN and MNTB have 40 cells project onto the LSO per tonotopic region, and that the LSO has 60 cells with which to make connections. We then assume that even numbered cells from the AVCN and MNTB make 8 synapses with target cells in the LSO, and that odd numbered cells make 7 synapses. Thus, each LSO cell receives 37 or 38 synapses as inputs. We have created the model so that all parameters could be changed if desired. Of course, each of these neurons must have a threshold at which to fire: we have made the assumption for our model the thresholds of AVCN neurons increase monotonically as cell number increases, but that the thresholds of MNTB neurons decrease monotonically as cell number increases, as shown below, to account for this clear, everday phenomenon: if a sound is a louderin, say the right ear, it is coming from the right side of the head.
We have formulated and worked with a model for azimuthal sound localization, as done by the Lateral Superior Olive (LSO), a nucleus in the brain. The inputs to our model are the intensity in decibels at each ear, as the LSO does sound localization based upon Interaural Intensity Difference (IID). The output is independent of absolute intensity, and is instead dependent upon IID. The output is the number of LSO cells (out of a total of 60) that fire, which can be translated into the degree on the horizon from which the sound originated, up to 3 degrees for high frequency sounds. If IID is 0, this means that 30 of 60 LSO cells are firing, then this indicates that the sound is coming from straight ahead, at 0 degrees. When 0 LSO cells fire, and this indicates that the sound is coming from -90 degrees, on the ipsilateral side, since the LSO receives inhibitory input from the medial nucleus of the trapezoid body (MNTB) on the ipsilateral side, whereas it receives excitatory input from the anterior ventral cochlear nucleus (AVCN) on the contralateral side. Similarly, when all 60 LSO cells fire, and this indicates that the sound is coming from +90 degrees, on the contralateral side. Our model is shown to adapt to synapse death and randomness in synaptic connections between the AVCN, MNTB, and LSO. The LSO cell number at which cell firing in spikes per second increases linearly with an increase in IID, since the cells in the AVCN and MNTB are arranged in columns so that the firing thresholds of the cells increase monotonically. This model can be adapted as a teaching tool.
This curve shows the firing rate of the MNTB as a function of cell number. The curve for the AVCN at the same intensity level is precisely the mirror image.
The figure on the left shows how the model responds to extreme synaptic death. Even though the summed excitatory and inhibitory curves are rather distorted, LSO cell number at which firing rate becomes 0 spikes/sec is the same, and thus the LSO can still do azimuthal sound localization even in cases of synaptic death. The figure on the right shows the case of uneven synaptic death, here more AVCN synapses are removed, and thus the model responds a bit poorly, as LSO cell number at which firing ceases decreases, since the model assumes IID has decreased.
The input to the model is the intensity level in decibels at each ear. The LSO computes based upon InterauralIntensity Difference (IID), the difference of the two.
This figure shows the case of cell death, when 3 cells are removed from the AVCN and MNTB. The model responds similarly to above, as LSO cell number has decreased. Yet in both cases, the model is still fairly accurate, and the LSO can still encode azimuthal degree fairly well.
Shown on the left is the output of the model set up in a deterministic way: each AVCN and MNTB neuron project onto specific LSO neurons. On the right, the set up is more random, and the LSO neurons pick with which AVCN and MNTB neurons to make synapses using a modified normal distribution. The two curves are very similar.
The LSO works in conjunction with the Medial Superior Olive (MSO); the LSO doing calculations based upon intensity differences in sound between the two ears, and the MSO doing calculations based upon time arrival differences of sound to each ear. Thus, the LSO is able to compute easier with high frequency sounds, whereas the MSO does better with lower frequency sounds. Both can handle all frequencies, however.
We have developed a model for azimuthal sound localization as computed by the LSO.
We have shown the model to be robust in terms of synapse death and randomness in the connections, and responds reasonably well to cell death.
We have shown the model to be independent of actual intensities, and to instead be dependent upon interaural intensity differences.
A teaching tool has been designed and can be viewed at: www.people.duke.edu/~dms60.
Further work: Examining projections from the LSO onto the Inferior Colliculus.
This figure shows the models dependence on difference in intensities. Even though the absolute intensities are different, the IID is the same, and thus the same number of LSO cells fire as above.
The equation for the firing rate of the ith AVCN or MNTB neuron is:
S = B + V * [(I - Ti) / Ki + (I - Ti)],
where I - Ti = I - Ti, I ≥ Ti; I - Ti = 0, I < Ti
Similarly, the equation for the firing rate of the jth LSO neuron is:
S’ = 0, if Ej – Ij < 0
S’ = B + V * [((Ej – Ij) – Tj) / Kj+ ((Ej – Ij) - Tj)],
Here, S and S’ are respective firing rates in spikes per second; B is the firing rate of a neuron when the inputs are below its threshold, set to 10 spikes/sec in the model; V is the maximum “plateau” firing rate, here set to be 300 spikes/sec; Ki and Kj are half-saturation constants,set at 10 decibels; Ej is the summed excitatory input to the LSO in spikes/sec; Ij is the summed inhibitory input to the LSO in spikes/sec; Ti and Tj are the thresholds of the ith and jth AVCN, MNTB, and LSO neurons, Ti range from 0.5 to 20, stepping up by 0.5, Tj are all 0.
LSO output curve: example of behavior when intensities at ears are not equal. Note how LSO cell number at which firing rate becomes 0 has increased, as IID has increased.
Shown above is the circuitry of the LSO. Blue connections are excitatory, whereas pink are inhibitory. Connections are made “tonotopically”, that is areas that respond well to a certain frequency connect to areas that respond well to the same frequency.
Michael Reed, Jacob Blum. J. Acoust. Soc. Am. Volume 88, Issue 3, pp. 1442-1453 (1990)