SPEECH PROCESSING FOR BINAURAL HEARING AIDS Dr P. C. Pandey EE Dept., IIT Bombay Feb’03
R&D activities in SPI Lab, EE Dept, IIT Bombay
• Speech & hearing
• Healthcare instrumentation
• Impedance cardiography
• Industrial instrumentation
Speech & hearing
• Speech processing for improving perception by persons with sensorineural hearing loss:
  - Consonantal enhancement (with Prof SD Agashe)
  - Binaural dichotic presentation
• Vocal tract shape estimation for speech training of deaf children
• Speech synthesis and study of phonemic features using HNM
• Cancellation of background noise in alaryngeal speech using spectral subtraction
Healthcare instrumentation
• Low cost diagnostic audiometer
• Impedance glottograph for voice pitch
• Impedance cardiograph for sports medicine
• Intravenous drip rate indicator
• Communicator for children with cerebral palsy (with Prof GG Ray)
• Non-invasive ultrasonic thermometry system (with Prof T Anjaneyulu)
• Myoelectric hand (with Prof SR Devasahayam & R Lal)
Impedance cardiography
• Signal processing for improving the estimation of stroke volume from the impedance cardiogram
Industrial instrumentation
• Noninvasive measurement of single-phase fluid flow using an ultrasonic cross-correlation technique (with Prof T Anjaneyulu)
• Online measurement of dielectric dissipation factor for condition monitoring of high-voltage insulation (with Prof SV Kulkarni)
Speech Processing for Binaural Hearing Aids
Hearing system
• Outer ear, middle ear, inner ear, cochlear nerve, brain
Hearing impairments
• Conductive
• Sensorineural
• Central
• Functional
Sensory aids for the hearing impaired
• Hearing aids
• Cochlear prosthesis
• Visual & tactile aids
Causes of sensorineural loss
• Loss of sensory hair cells in cochlea
• Degeneration of auditory nerve fibers
Characteristics of sensorineural loss
• Frequency dependent shifts in hearing thresholds
• Reduced dynamic range, loudness recruitment
• Poor frequency selectivity & increased spectral masking
• Reduced temporal resolution & increased temporal masking
Effects of increased spectral masking
• Smearing of spectral peaks and valleys due to broader auditory filters
• Reduction of internal spectral contrast
• Reduced discrimination of consonantal place feature
Effects of increased temporal masking
• Forward and backward masking of weak segments by strong ones
• Reduced ability to discriminate sub-phonemic segments like noise bursts, voice-onset-time, and formant transitions
Speech processing for dichotic presentation for binaural hearing aids to reduce the effects of masking
• Masking takes place at the peripheral level of the auditory system
• Information from the two ears gets integrated at higher levels in the perception process
• Binaural dichotic presentation for persons with bilateral residual hearing:
  - Speech signal split in a complementary form
  - Signal components likely to mask each other presented to different ears
  - Information integrated at higher levels, for better speech perception
Binaural dichotic presentation schemes
• Spectral splitting: filtering by 2 complementary comb filters; better place reception
• Temporal splitting: gating by 2 complementary fading functions; better duration reception
• Combined splitting: processing by 2 time-varying comb filters
All the sensory cells of the basilar membrane get periodic relaxation from stimulation: better perception of consonantal duration, place, and other features
Temporal splitting with trapezoidal fading
[Figure: block diagram in which the input s(n) is gated by the fading functions w1(n) and w2(n) to give s1(n) and s2(n); plots of w1(n) and w2(n) against n, with durations marked N, L, and M.]
Temporal splitting of the signal for dichotic presentation using w1(n) and w2(n)
• Inter-aural switching period = 20 ms
• Duty cycle = 70%
• Transition durations = 0, 1, 2, 3 ms
• Inter-aural fading with trapezoidal transition and inter-aural overlap
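
The fading functions above can be sketched as a pair of periodic trapezoids. This is only an illustration under assumptions that the slides do not state: a 10 kHz sampling rate, linear ramps, a duty cycle measured over the whole non-zero span (ramps included), and w2(n) taken as w1(n) delayed by half the switching period. The function names are hypothetical.

```python
def trapezoidal_window(n_samples, fs=10000, period_ms=20.0, duty=0.7,
                       transition_ms=2.0, offset_ms=0.0):
    """Periodic trapezoidal fading function with values in [0, 1].

    Assumptions (not from the slides): linear ramps, and `duty` counts the
    whole non-zero span of each period, ramps included.
    """
    period = round(period_ms * fs / 1000)    # samples per switching period
    on = round(duty * period)                # non-zero samples per period
    ramp = round(transition_ms * fs / 1000)  # samples in each transition
    offset = round(offset_ms * fs / 1000)
    window = []
    for n in range(n_samples):
        p = (n - offset) % period            # position inside the period
        if p >= on:
            window.append(0.0)               # ear is switched off
        elif p < ramp:
            window.append((p + 1) / (ramp + 1))   # fade in
        elif p >= on - ramp:
            window.append((on - p) / (ramp + 1))  # fade out
        else:
            window.append(1.0)               # plateau
    return window


def temporal_split(signal, fs=10000, **kwargs):
    """Gate `signal` into two channels, the second delayed by half a period."""
    half_period_ms = kwargs.get("period_ms", 20.0) / 2
    w1 = trapezoidal_window(len(signal), fs, **kwargs)
    w2 = trapezoidal_window(len(signal), fs, offset_ms=half_period_ms, **kwargs)
    s1 = [x * w for x, w in zip(signal, w1)]
    s2 = [x * w for x, w in zip(signal, w2)]
    return s1, s2
```

With a 70% duty cycle and a half-period delay, the two windows overlap for part of each period, which corresponds to the inter-aural overlap mentioned on the slide. Setting `transition_ms=0` gives plain rectangular gating.
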
Investigations with spectral splitting
• Comb filters based on auditory filter bandwidths: 18 bands over 5 kHz, 256-coefficient linear-phase filters designed using the frequency sampling technique
• Listening tests with hearing-impaired subjects: improvement in response time, recognition scores, & reception of place feature
• Better results with perceptually balanced filters: 1 dB ripple, 30 dB attenuation, 4-6 dB crossover
• Filters with personalized frequency response: overall improvement, but not particularly for place
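
A basic frequency-sampling design of a complementary comb filter pair might look like the sketch below. Several things here are assumptions rather than details from the slides: the 10 kHz sampling rate, the band edges (commonly tabulated critical-band edges, with the last band stretched to 5 kHz), the choice of which ear gets the odd-numbered bands, and all function names. It is a single-pass design with ideal 0/1 gains, so it will not meet the ripple, attenuation, or crossover figures quoted above, which would need transition samples tuned iteratively.

```python
import cmath
import math

# Assumed band edges in Hz; the slides do not list the edges actually used.
BAND_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 5000]


def comb_gains(n_taps, fs, edges, keep_odd_bands):
    """Desired gain (0 or 1) at DFT bins k = 0 .. n_taps/2 - 1."""
    gains = []
    for k in range(n_taps // 2):
        f = k * fs / n_taps
        band = next(i for i in range(len(edges) - 1)
                    if edges[i] <= f < edges[i + 1])
        is_odd_band = band % 2 == 0          # bands are numbered 1, 2, 3, ...
        gains.append(1.0 if is_odd_band == keep_odd_bands else 0.0)
    return gains


def frequency_sampling_fir(gains):
    """Even-length linear-phase FIR matching `gains` at the DFT bins.

    The gain at the Nyquist bin is forced to zero, as required for an
    even-length symmetric impulse response.
    """
    n_taps = 2 * len(gains)
    centre = (n_taps - 1) / 2
    h = []
    for n in range(n_taps):
        acc = gains[0]
        for k in range(1, len(gains)):
            acc += 2 * gains[k] * math.cos(2 * math.pi * k * (n - centre) / n_taps)
        h.append(acc / n_taps)
    return h


def design_comb_pair(fs=10000, n_taps=256, edges=BAND_EDGES_HZ):
    """Return (h_left, h_right): odd bands to one ear, even bands to the other."""
    h_left = frequency_sampling_fir(comb_gains(n_taps, fs, edges, True))
    h_right = frequency_sampling_fir(comb_gains(n_taps, fs, edges, False))
    return h_left, h_right


def magnitude_at_bin(h, k):
    """Magnitude of the frequency response of `h` at DFT bin k."""
    n_taps = len(h)
    return abs(sum(c * cmath.exp(-2j * cmath.pi * k * n / n_taps)
                   for n, c in enumerate(h)))
```

The response matches the desired gains exactly only at the 256 sampled frequencies; between them it ripples, which is why narrow low-frequency bands (about 100 Hz wide against a bin spacing of about 39 Hz) are the hardest to realize.
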
Combined splitting with time-varying filters
[Figure: block diagram in which s(n) feeds time-varying comb filter 1 (output s1(n)) and time-varying comb filter 2 (output s2(n)), each cycling through m sets of filter coefficients; the sets are sketched as magnitude versus frequency.]
Sweep cycle duration = 20 ms. With m shiftings, each pair of comb filters processes for 20/m ms.
[Figure: intensity (dB, 0 to -40) over frequency (0 to 5 kHz) and time (0 to 30 ms), panels (a) and (b).]
An idealized representation of the magnitude response of the pair of time-varying comb filters using 4 shiftings for (a) the left ear and (b) the right ear.
[Figure: magnitude (dB) versus normalized frequency, with curves labelled 1 to 4.]
Time-varying comb filters
• Set of linear-phase 256-coefficient FIR filters with pre-calculated coefficients (designed by iterative use of the frequency sampling technique)
• Comb filter responses optimized for minimum perceived spectral distortion: low passband ripple and high stopband attenuation, with inter-band crossover gains adjusted for loudness balance
• Passband ripple < 1 dB
• Stopband attenuation > 30 dB
• Gain at inter-band crossovers: -4 to -6 dB
• Sweep cycle duration: 20 ms
• Number of shiftings: 2, 4, 8, 16
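
The timing of the sweep can be illustrated with a small scheduling sketch: which pre-calculated coefficient set each ear's filter would use at sample n. The 10 kHz sampling rate, the 0-based set indices, and in particular the half-cycle offset between the two ears are assumptions made for illustration; the function names are hypothetical.

```python
def dwell_ms(m, sweep_ms=20.0):
    """Time for which one pair of comb filters is active, in ms."""
    return sweep_ms / m


def active_filter_sets(n, fs, m, sweep_ms=20.0):
    """Return (left_set, right_set) indices, 0 .. m-1, in use at sample n.

    Assumption: the right-ear filter runs half a sweep cycle ahead of the
    left-ear filter, so the two ears always use different coefficient sets.
    """
    sweep = round(sweep_ms * fs / 1000)  # samples per sweep cycle
    step = (n % sweep) * m // sweep      # current step within the cycle
    left_set = step
    right_set = (step + m // 2) % m
    return left_set, right_set
```

For the shiftings listed above, `dwell_ms` gives 10, 5, 2.5, and 1.25 ms for m = 2, 4, 8, and 16 respectively.
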
Listening tests for evaluation of the schemes
Test material
• Closed set of 12 VCV syllables, formed with consonants /p, t, k, b, d, g, m, n, s, z, f, v/ and vowel /a/
Subjects & listening condition
• Normal-hearing subjects with loss simulated by Gaussian noise with short-time (~10 ms) SNRs of 6 to -15 dB, presented at the most comfortable level (MCL, 70–75 dB SPL)
• Hearing-impaired subjects with bilateral sensorineural loss, presented at MCL
Performance measurement
• Response time statistics
• Stimulus-response confusion matrix
• Recognition scores
• Relative information transmission of consonantal features
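
The last three measures can all be derived from a stimulus-response confusion matrix. The sketch below assumes the usual information-theoretic definition (mutual information between stimulus and response, divided by the stimulus entropy) and pools the matrix by feature value before computing it; the slides do not give the formulas used, and the function names and the tiny example matrix are hypothetical.

```python
import math


def recognition_score(conf):
    """Percentage of responses on the diagonal (rows: stimuli, cols: responses)."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return 100.0 * correct / total


def relative_information(conf):
    """Mutual information between stimulus and response over stimulus entropy."""
    total = sum(sum(row) for row in conf)
    p_stim = [sum(row) / total for row in conf]
    p_resp = [sum(row[j] for row in conf) / total for j in range(len(conf[0]))]
    mutual = 0.0
    for i, row in enumerate(conf):
        for j, count in enumerate(row):
            if count:
                p = count / total
                mutual += p * math.log2(p / (p_stim[i] * p_resp[j]))
    entropy = -sum(p * math.log2(p) for p in p_stim if p > 0)
    return mutual / entropy  # undefined if only one stimulus class was presented


def collapse_by_feature(conf, labels, feature):
    """Pool rows and columns of `conf` that share the same feature value."""
    groups = sorted(set(feature[label] for label in labels))
    index = {g: k for k, g in enumerate(groups)}
    pooled = [[0] * len(groups) for _ in groups]
    for i, stim in enumerate(labels):
        for j, resp in enumerate(labels):
            pooled[index[feature[stim]]][index[feature[resp]]] += conf[i][j]
    return pooled


# Hypothetical example: 4 consonants, confusions only within a voicing class.
labels = ["p", "t", "b", "d"]
voicing = {"p": "unvoiced", "t": "unvoiced", "b": "voiced", "d": "voiced"}
conf = [[5, 5, 0, 0],
        [5, 5, 0, 0],
        [0, 0, 5, 5],
        [0, 0, 5, 5]]
voicing_info = relative_information(collapse_by_feature(conf, labels, voicing))
```

In this made-up matrix only half the responses are correct, yet voicing is transmitted perfectly, which is the kind of distinction the feature-wise analysis is meant to expose.
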
Listening test set-up
[Figure: block diagram. A PC with PCL-208 D/A ports outputs s1(t) and s2(t), each through a lowpass filter and audio amplifier, to the subject in an acoustically isolated chamber; a subject terminal is connected to the PC over RS232C.]
Conclusions
• All three schemes improve response time, recognition scores, & relative information transmission, both overall and for various speech features.
• Extent of improvement with a scheme is related to the nature of the loss:
  - Severe high-frequency hearing loss: max. improvement with temporal splitting (17.9%)
  - Symmetrical low-frequency hearing loss and symmetrical sloping high-frequency hearing loss: max. improvement with spectral splitting (17.5%) & combined splitting with 8 shiftings (20.5%)
  - Asymmetrical high-frequency loss: temporal splitting (7.6%) & combined splitting (7.6%)
(contd.)
• Spectral splitting more effective in reducing perceptual load.
• Overall max. improvement in recognition scores with combined splitting with 8 shiftings.
• Temporal splitting mainly improved the duration feature perception.
• Spectral splitting mainly improved the place feature perception.
• Combined splitting with 8 shiftings improved perception of both duration and place.
• Reception of the relatively robust consonantal features (voicing, manner, and nasality) not adversely affected by splitting.
• Personalized filter response gives additional improvement.
Next
• Listening tests with a larger number of subjects to establish the relationship between processing parameters & nature of loss.
• Individualized multi-band compression.
• Implementation of the processing schemes as part of wearable hearing aids, with personalized parameter setting.
• Effect of binaural dichotic listening on non-speech signals & source localization to be investigated.
• Investigations combining consonant enhancement with dichotic presentation.