Models of Neural Encoding in the Auditory System. Robert Turetsky (rjt72@columbia.edu, LabROSA); Raul Rodriguez-Esteban (raul@ee.columbia.edu, Comet Lab, we hope). Time Encoding – Prof. Lazar, Spring 2003
Talk Overview • The problem of hearing • Physiology: Transduction in the ear • Biological Signal Processing • Sound Profile Extraction • Pitch Detection • Spatial Localization • Conclusion
Audition: The fifth sense • Hearing was the last of our senses to evolve • Detects pressure changes in air: motion and contact • We can hear things that we can't see: • Movement of large objects at a great distance • Motion of objects when vision is occluded or in darkness • Communication (e.g. speech, music)
Analogy: Bregman’s Lake “Imagine two narrow channels dug up from the edge of a lake, with handkerchiefs stretched across each one. Looking only at the motion of the handkerchiefs, you are to answer questions such as: How many boats are there on the lake and where are they?” (after Bregman’90)
Talk Overview • The problem of hearing • Physiology: Transduction in the ear • Biological Signal Processing • Conclusion
Dataflow: Perception to Cognition • Outer ear: pressure waves collected • Middle ear: pressure -> mechanical energy -> hydraulic energy • Inner ear: hydraulic energy -> neural impulses • Midbrain: feature extraction • Cortex: ???
The Cochlea as Transducer • The thickness of the BM varies along its length, setting a resonant frequency at each place: spatial coding along the BM • The vibrating BM triggers hair cells, and a neuron fires 'near' the peak of its resonant frequency: temporal coding (<= 4 kHz) • Firing is probabilistic – a neuron may fire at the peak or somewhere near it • ~3500 hair cells in humans
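To make the temporal-coding bullet concrete, here is a minimal sketch (our illustration, not part of the original slides) of probabilistic, phase-locked firing: the per-sample spike probability follows the half-wave rectified BM displacement, so spikes cluster near, but not exactly at, waveform peaks. The function name and the peak_rate parameter are assumptions.

```python
import numpy as np

def phase_locked_spikes(bm_displacement, fs, peak_rate=300.0, seed=0):
    """Toy temporal-coding model: per-sample spike probability is
    proportional to the half-wave rectified BM displacement, so spikes
    tend to occur near (but not exactly at) waveform peaks."""
    rng = np.random.default_rng(seed)
    drive = np.maximum(bm_displacement, 0.0)          # half-wave rectification
    drive = drive / (drive.max() + 1e-12)             # normalize to [0, 1]
    p_spike = peak_rate * drive / fs                  # per-sample firing probability
    return np.flatnonzero(rng.random(len(drive)) < p_spike) / fs  # spike times (s)

# Example: 1 kHz tone sampled at 16 kHz -- spikes phase-lock to positive peaks
fs = 16000
t = np.arange(0, 0.05, 1 / fs)
spike_times = phase_locked_spikes(np.sin(2 * np.pi * 1000 * t), fs)
```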
The Cochlea as Filterbank • Groups of neurons are connected to hair cells which respond to different frequencies/areas of BM • Approximately log-scale responses made up of 40-150 neurons per band (Fleisher) • Evidence of Fourier-like analysis in inner ear -> filtering
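As a rough sketch of the filterbank view (our simplification, not the authors' model), the snippet below passes a signal through log-spaced Butterworth bandpass channels standing in for places along the BM; gammatone filters would be the more biologically faithful choice. The half-octave bandwidths and 32 bands are arbitrary illustrative choices, and fs must be high enough that the top band stays below Nyquist.

```python
import numpy as np
from scipy.signal import butter, lfilter

def log_spaced_filterbank(x, fs, n_bands=32, f_lo=100.0, f_hi=6000.0):
    """Filter x through n_bands bandpass filters with log-spaced center
    frequencies, a crude stand-in for the cochlea's place-frequency map."""
    centers = np.geomspace(f_lo, f_hi, n_bands)
    outputs = []
    for fc in centers:
        lo, hi = fc / 2 ** 0.25, fc * 2 ** 0.25      # half-octave-wide band
        b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="bandpass")
        outputs.append(lfilter(b, a, x))
    return centers, np.vstack(outputs)               # (n_bands,), (n_bands, len(x))
```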
Auditory Nerve: Where to next? • Sounds must be processed: grouping, identification, localization • Hard/impossible to probe living auditory cortex non-invasively • Mathematical models of function (must be biologically plausible)
Mathematical Models of Neural Processing in the Auditory System • The problem of hearing • Physiology: Transduction in the ear • Biological Signal Processing • Sound Profile Extraction • Pitch Detection • Spatial Localization • Conclusion
Early Audition: Block Diagram • [Block diagram: signal -> Cochlea (filterbank) -> Auditory Nerve -> sound profile extraction (timbre), frequency periodicity (pitch), temporal periodicity (meter), and spatial localization (location) -> to intermediate auditory system]
Model of the Cochlea: freq. response • Usually a filterbank (log, Mel scales) • Filterbank model does not capture all of the information: onset detection, intensity?
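For reference, a small sketch of the Mel warping mentioned above, using the common formula mel = 2595 log10(1 + f/700); the band count and frequency range are illustrative assumptions.

```python
import numpy as np

def mel_centers(n_bands=40, f_lo=0.0, f_hi=8000.0):
    """Center frequencies (Hz) spaced uniformly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(f_lo), hz_to_mel(f_hi), n_bands)
    return mel_to_hz(mels)
```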
Model of Peak Detector in Cochlea • [Figures: input and output waveforms of the peak detector]
Model of Peak Detector in Cochlea • [Figure: time encoding via high-pass filtering]
Model of Peak Detector in Cochlea • [Figure: response to a sum of tones with π/2 and π delays]
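A toy version of the peak-detector figures above (a sketch under our own assumptions, not the model in the slides): each channel emits an event at every suprathreshold local maximum of its output, so the waveform is represented purely by event times.

```python
import numpy as np

def peak_detector(x, fs, threshold=0.1):
    """Emit an event at every local maximum of x above threshold,
    a toy version of time-encoding the signal by its peaks."""
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:]) & (x[1:-1] > threshold)
    return (np.flatnonzero(is_peak) + 1) / fs        # event times in seconds

# Example: peaks of a 200 Hz tone arrive once per period (every 5 ms)
fs = 8000
t = np.arange(0, 0.05, 1 / fs)
print(peak_detector(np.sin(2 * np.pi * 200 * t), fs))
```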
Coincidence Detection • [Figure: comparative timing mechanism]
Coincidence Detection • [Figure: coincidence detection circuits in avians vs. mammals]
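To illustrate coincidence detection in the localization context, here is a sketch of a Jeffress-style comparison (our simplification, not the circuit on the slides): count left/right spike pairs that fall within a short window after applying a candidate interaural delay, and take the delay with the most coincidences as the ITD estimate. The window size and delay range are assumptions.

```python
import numpy as np

def coincidences(left_spikes, right_spikes, delay, window=50e-6):
    """Number of left/right spike pairs within `window` seconds of each
    other after delaying the right ear by `delay` seconds."""
    shifted = np.asarray(right_spikes) + delay
    d = np.abs(np.asarray(left_spikes)[:, None] - shifted[None, :])
    return int((d < window).sum())

def estimate_itd(left_spikes, right_spikes, max_itd=700e-6, step=25e-6):
    """Scan candidate interaural delays; the best one plays the role of
    the 'winning' coincidence detector in a delay-line model."""
    delays = np.arange(-max_itd, max_itd + step, step)
    counts = [coincidences(left_spikes, right_spikes, d) for d in delays]
    return delays[int(np.argmax(counts))]
```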
Mathematical Models of Neural Processing in the Auditory System • The problem of hearing • Physiology: Transduction in the ear • Biological Signal Processing • Sound Profile Extraction • Pitch Detection • Spatial Localization • Conclusion
The problem of pitch perception • Three types of pitch percepts: • Spectral: Pitch evoked from sinusoidal signals • Periodicity (incl. missing fundamental): Low order, spectrally resolved harmonic tone complexes (e.g. notes from musical instruments) • Residue: Assuming a pitch from unresolved high order harmonics (aka virtual pitch) • [Figure: log spectrum of a pitched sound] • Resolving multiple voices?
Theories of Periodic Pitch Perception • Place Theory (Helmholtz): Pure tone vibrates specific area of BM. • Problem: Complex tone w/missing fundamental does not induce vibration at fundamental (Licklider’s experiments) • Spectral pattern recognition: Response from cochlear filterbank compared against templates of all possible fundamentals • Problem: Where are the templates stored? • Not learned (infants have sense of pitch) • Does not account for residue, missing fundamental
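For concreteness, a toy sketch of the spectral pattern-recognition idea above (our illustration only): score each candidate fundamental by summing spectral energy at its first few harmonics, i.e. a crude harmonic template or sieve. All names and parameters are assumptions.

```python
import numpy as np

def template_pitch(spectrum, freqs, f0_candidates, n_harmonics=8):
    """Score each candidate F0 by summing spectral energy at its first
    few harmonics (a crude harmonic template / sieve)."""
    scores = []
    for f0 in f0_candidates:
        harmonics = f0 * np.arange(1, n_harmonics + 1)
        idx = np.searchsorted(freqs, harmonics)      # nearest spectral bins
        idx = idx[idx < len(freqs)]                  # drop harmonics above range
        scores.append(spectrum[idx].sum())
    return f0_candidates[int(np.argmax(scores))]
```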
Theories of Periodic Pitch Perception • Temporal (e.g. Meddis and Hewitt): each channel is processed independently and then summed together, e.g. summary autocorrelation • Problem: No physiological evidence for the mechanism • Dual Model: Spatio-temporal processing makes use of both BM location and the frequency of AN firing • THE PLAN: Improve the spatial aspect of the model • Develop a biologically plausible algorithm that can generate pitch templates • Account for residue, missing fundamental
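A simplified sketch of the temporal (Meddis and Hewitt style) approach in the first bullet, assuming filterbank outputs are already available: half-wave rectify each channel, autocorrelate, sum across channels, and read the pitch period off the largest peak in a plausible lag range. This is our reduction of the idea, not the authors' implementation.

```python
import numpy as np

def summary_autocorrelation_pitch(channels, fs, f_min=60.0, f_max=500.0):
    """channels: (n_bands, n_samples) filterbank outputs.
    Returns the pitch (Hz) from the peak of the summary autocorrelation."""
    rect = np.maximum(channels, 0.0)                 # half-wave rectify each channel
    n = rect.shape[1]
    acf = np.array([np.correlate(c, c, mode="full")[n - 1:] for c in rect])
    summary = acf.sum(axis=0)                        # sum across channels
    lo, hi = int(fs / f_max), int(fs / f_min)        # search plausible pitch lags
    lag = lo + int(np.argmax(summary[lo:hi]))
    return fs / lag
```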
The Shamma and Klein model • Spectral sharpening: Lateral Inhibition • Temporal sharpening: Enhance synchrony between channels • Coincidence Matrix: compare responses from all channels across the array
Shamma: System Block Diagram • [Block diagram: signal s(t) -> filterbank h(n;1), h(n;2), ..., h(n;X) -> hair cells -> spectral sharpening -> temporal sharpening -> coincidence detection]
Shamma: Lateral Inhibition • Analogous to simple/complex cells in the visual system • Goal is to enhance salient frequency peaks • Can be modeled as a difference across neighboring frequency channels • Probably integral to ASA (Auditory Scene Analysis)
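A minimal sketch of the "difference across frequency" bullet (our assumption about the simplest possible form): difference adjacent channels and half-wave rectify, which sharpens peaks in the spectral profile.

```python
import numpy as np

def lateral_inhibition(channels):
    """channels: (n_bands, n_samples). Approximate a lateral inhibitory
    network by differencing adjacent channels, then half-wave rectifying,
    which sharpens peaks in the spectral profile."""
    diff = np.diff(channels, axis=0)                 # difference across frequency
    return np.maximum(diff, 0.0)                     # keep only positive differences
```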
Coincidence Matrix: Creating harmonic templates • [Figures: channel responses after LIN, after temporal enhancement, and the resulting coincidence matrix] • Templates created regardless of input (e.g. harmonics, noise, click track)
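A rough, illustrative reduction of the coincidence-matrix step (not the Shamma and Klein code): correlate every pair of sharpened channel outputs at zero lag; channels whose center frequencies are harmonically related respond coherently to broadband input, so harmonic-template-like structure appears in the matrix regardless of the specific stimulus.

```python
import numpy as np

def coincidence_matrix(channels):
    """channels: (n_bands, n_samples) sharpened filterbank outputs.
    Returns the (n_bands, n_bands) matrix of zero-lag correlations
    between all channel pairs."""
    x = channels - channels.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(x, axis=1) + 1e-12
    return (x @ x.T) / np.outer(norms, norms)        # normalized coincidence counts
```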
What’s next: Scene Analysis • Model of visual system (Sajda 1995) shows that retinotopically patterned LINs can detect Gestalt rules like good continuity • Maybe the same exists in the AN?
Mathematical Models of Neural Processing in the Auditory System • The problem of hearing • Physiology: Transduction in the ear • Biological Signal Processing • Sound Profile Extraction • Pitch Detection • Spatial Localization • Conclusion
Spectro-Temporal Response • The most important physical correlate of timbre • Superposition principle
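As a computable stand-in for the spectro-temporal response (a sketch, not the physiological model), the snippet below computes a magnitude spectrogram, whose joint time-frequency pattern is the usual physical correlate of timbre; the window and hop sizes are arbitrary, and the input is assumed to be at least one window long.

```python
import numpy as np

def spectrogram(x, fs, win_len=512, hop=128):
    """Magnitude STFT of x: rows are frequency bins, columns are frames."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T     # (win_len // 2 + 1, n_frames)
```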
Spectro-Temporal Response • [Figure: rate-selectivity units in the MGB (medial geniculate body)]
Selected References • A. S. Bregman. Auditory Scene Analysis. MIT Press, 1990. • D. P. W. Ellis. Lecture notes from Speech and Audio Processing and Recognition, 2002. • B. Gold and N. Morgan. Speech and Audio Signal Processing, 2000. • D. Oertel et al. Detection of synchrony in the activity of auditory nerve fibers by octopus cells of the mammalian cochlear nucleus, 2000. • S. Shamma and D. Klein. The case of the missing pitch templates: How harmonic templates emerge in the early auditory system, 1999. • S. Shamma. On the role of space and time in auditory processing, 2001.