
Interrupted speech perception


Presentation Transcript


  1. Interrupted speech perception Su-Hyun Jin, Ph.D. University of Texas & Peggy B. Nelson, Ph.D. University of Minnesota

  2. Introduction • In everyday life, speech is not equally intelligible over time because of background noise • Background noise is typically fluctuating rather than steady in time, amplitude, and frequency • Normal-hearing (NH) listeners can take advantage of the fluctuating nature of noise • Masking release: the improvement in speech recognition in fluctuating noise compared to performance in steady noise • The listener's ability to use the momentary decreases (dips) in fluctuating noise, resolving the background fluctuations in order to extract speech information (Dubno et al., 2002; Nelson et al., 2003; Jin & Nelson, 2006)
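
The masking-release measure used throughout these slides is simply the difference between recognition scores in fluctuating and steady noise at the same SNR. A minimal sketch of that arithmetic follows; the function name and the example scores are illustrative only, not values from the study.

```python
def masking_release(score_fluctuating, score_steady):
    """Masking release (MR): improvement in percent-correct speech
    recognition in fluctuating (gated) noise relative to steady noise."""
    return score_fluctuating - score_steady

# Illustrative values only (not data from the study): a listener scoring
# 70% correct in gated noise and 20% in steady noise shows 50 points of MR.
print(masking_release(70.0, 20.0))  # -> 50.0
```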

  3. Speech and Noise (NH) [Figure: quiet speech; speech in noise; speech in fluctuating noise]

  4. Introduction • Effect of noise on hearing-impaired (HI) speech perception • Little or no release from masking in fluctuating noise, even in listeners with mild hearing loss (Bacon et al., 1998; Dubno et al., 2002) • Speech recognition in steady noise was close to normal • Some listeners were significantly worse than normal in fluctuating noise • Speech perception in fluctuating noise may therefore provide a more sensitive measure of impairment due to hearing loss

  5. Jin & Nelson (2006) • Investigated the relationship between the amount of masking release (MR) and hearing sensitivity and temporal resolution in NH and HI listeners • Sentence and consonant recognition in quiet, steady noise, and fluctuating noise • Hearing sensitivity and forward masking

  6. Jin & Nelson (2006): Amplification and shaping • Two-stage process to amplify speech and noise for HI listeners • Shaping applied based on the half-gain rule to compensate for the hearing loss configuration • Overall amplification added to bring listeners to maximum sentence recognition (90% or better) in quiet • Process applied to speech and noise for each HI listener individually
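
A rough sketch of the two-stage amplification described above, assuming the half-gain rule means frequency-shaped gain equal to half the hearing loss at each audiometric frequency. The audiogram and the overall-gain value below are hypothetical; in the study, the overall gain was tuned individually until quiet sentence recognition reached 90% or better.

```python
import numpy as np

# Hypothetical audiogram (thresholds in dB HL) for one HI listener.
freqs_hz   = np.array([250, 500, 1000, 2000, 4000, 8000])
thresholds = np.array([ 20,  30,   40,   55,   65,   70])

# Stage 1: half-gain rule -- frequency-shaped gain equal to half the loss.
shaped_gain_db = thresholds / 2.0

# Stage 2: overall gain raised until quiet sentence recognition is >= 90%
# for that listener (the 10 dB used here is a placeholder).
overall_gain_db = 10.0

total_gain_db = shaped_gain_db + overall_gain_db
for f, g in zip(freqs_hz, total_gain_db):
    print(f"{f:>5} Hz: {g:.1f} dB gain")
```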

  7. Jin & Nelson (2006) • Role of hearing sensitivity • Listeners with sensorineural hearing loss showed reduced hearing sensitivity • The effect of noise is more detrimental to HI listeners than to NH listeners because they already have reduced redundancy of the speech signal in quiet (Van Tasell, 1993) • Role of temporal resolution • Compared to NH listeners, HI listeners are more affected by non-simultaneous maskers • Strong correlation between masking release and forward-masked thresholds (Dubno et al., 2002) • Syllable recognition in fluctuating noise may be associated with age-related increases in forward-masked thresholds

  8. Result 1: Sentence recognition. Percent correct keyword identification at -5 dB SNR

  9. Result 1: Sentence recognition. Masking release at -5 dB SNR

  10. Result 1: Speech recognition for both sentence and syllable identification • In quiet and steady noise • There were no significant differences between the NH and HI groups (p > 0.177) • In gated noise • Significant improvements in gated noise (over steady noise) were seen for both NH and HI listeners • NH listeners could take advantage of the gating to regain about 80% of their quiet performance, whereas HI listeners regained only 15% to 55%

  11. Result 1b: Percent correct syllable identification and MR at -5 dB SNR

  12. Result 2: Forward-masked thresholds • Masked thresholds for HI listeners were higher than those for NH listeners • Those with lower thresholds in quiet showed better thresholds in the presence of noise • The slopes of the recovery functions were shallower for HI than for NH listeners • HI listeners with close-to-normal recovery functions showed a relatively larger amount of masking release in fluctuating noise

  13. Discussion • Relation between MR and other measures • High negative correlation (r ≈ -0.8) between the amount of MR and the results of forward masking for both the NH and HI groups • When the results of the HI listeners alone were analyzed, only a few factors retained a strong correlation with MR: hearing thresholds at 0.5 and 1.0 kHz, and forward-masked thresholds at 2 kHz • The MR for sentence recognition and for CV syllable recognition were analyzed separately

  14. Result: Regression analyses • The strength of the relationship between MR and the predictors: • For sentence recognition in gated noise, hearing sensitivity at low-to-mid frequencies (0.5 and 1 kHz) accounted for a substantial proportion of the variance in the MR • For consonant recognition, forward-masked thresholds contributed primarily to the variance in the MR

  15. Discussion • HI listeners who performed close to normal in quiet and in steady noise still showed reduced masking release for both sentence and consonant recognition in gated noise • The overall pattern of masking release measured using sentences and using CV syllables was similar

  16. Discussion • Several factors seemed to contribute to the MR • Hearing sensitivity at low-to-mid frequencies (0.5 kHz and 1.0 kHz) was strongly related to the MR for sentence recognition but not for syllable identification • Forward-masked thresholds were more strongly related to the MR for CV syllable identification • Consistent with Dubno et al. (2002)

  17. Follow-up • Continued to investigate additional factors that might contribute to the reduced sentence recognition in fluctuating noise for HI listeners • Auditory integration • Frequency resolution

  18. Follow-up • Role of spectral resolution • Hearing impairment is often associated with reduced frequency selectivity • Reduced auditory representation of the spectral peaks and valleys in speech (Miller et al., 1997) • Cochlear implant listeners • Had little MR for sentence recognition in fluctuating noise (Nelson et al., 2003) • Are known to have normal-like temporal resolution (Nelson & Donaldson, 2001) but limited spectral resolution • Reduced MR may therefore be related to broader auditory filters

  19. Follow-up • Role of auditory integration • Understanding speech in real life requires a listener to analyze complex sounds and separate the acoustic characteristics of the input signal from background noise • This process is known as auditory stream segregation (Bregman, 1990) • Listeners seem able to segregate sounds into streams when they differ in spectral cues (Rose & Moore) or in harmonicity and fundamental frequency (F0) (Qin & Oxenham, 2003)

  20. Follow-up • Two tasks • Auditory filter characteristics • Auditory integration: interrupted sentence recognition • Participants: the same NH and HI listeners who participated in Jin & Nelson (2006) • Analysis • Examine the relationship between the MR from Jin & Nelson (2006) and the auditory filter shapes and interrupted speech recognition

  21. Follow-up • Auditory filter characteristics • Used the filter-shape equation of Patterson et al. (1982) • Estimated the equivalent rectangular bandwidths (ERBs) and slopes (p) of the auditory filters at 2000 and 4000 Hz • Interrupted IEEE sentence recognition (silent gaps) • IEEE sentences were gated at rates of 1, 2, 4, 8, and 16 Hz • There was no noise (no masking) • Depending on the gating frequency, whole words or only parts of words were available to listeners • Percent correct keyword identification was recorded
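
Two hedged sketches of the measures on this slide. The roex(p) weighting and the ERB = 4·fc/p relation follow Patterson et al. (1982); the gating function assumes a 50% duty-cycle square-wave interruption, which the slide does not state explicitly.

```python
import numpy as np

def roex_weight(f, fc, p):
    """Rounded-exponential (roex) auditory-filter weighting,
    W(g) = (1 + p*g) * exp(-p*g), with g = |f - fc| / fc."""
    g = np.abs(f - fc) / fc
    return (1.0 + p * g) * np.exp(-p * g)

def erb_from_slope(fc, p):
    """Equivalent rectangular bandwidth of a symmetric roex(p) filter."""
    return 4.0 * fc / p

def interrupt_with_silence(x, fs, rate_hz, duty=0.5):
    """Replace portions of a sentence with silence using a square-wave
    gate at rate_hz (50% duty cycle assumed, not specified on the slide)."""
    t = np.arange(len(x)) / fs
    gate = (np.mod(t * rate_hz, 1.0) < duty).astype(x.dtype)
    return x * gate

# Example: a 2-kHz filter with slope p = 25 has an ERB of 320 Hz.
print(erb_from_slope(2000.0, 25.0))
```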

  22. Speech in gated noise vs. interrupted speech (silent gaps) [Figure: fluctuating noise at 8 Hz; interrupted sentence at 8 Hz]

  23. Follow-up Result 1: Frequency resolution • Compared to the NH group, HI listeners showed greater ERBs and shallower slopes at both 2 kHz and 4 kHz • The average ERBs for HI listeners (at both 2 kHz and 4 kHz) were about 2.2-2.7 times those of the NH group

  24. Follow-up Result 2: Interrupted sentence recognition • The average percent correct scores of the NH group at each gating frequency were higher than the scores of the HI listeners • The relationship between sentence recognition in gated noise and interrupted sentence recognition was significant for NH and HI listeners combined (r ≈ 0.8) • When the scores of the HI listeners alone were compared, the correlation remained strong (r ≥ 0.8)

  25. Follow-up Result 2: Interrupted sentence recognition

  26. Discussion • Several factors seemed to contribute to the MR • Hearing sensitivity at low-to-mid frequencies (0.5 kHz and 1.0 kHz) as well as the auditory filter shapes at 2 and 4 kHz were strongly related to the MR for sentence recognition • Understanding speech interrupted either by noise or by silence might require a similar underlying integration process • Percent correct scores for interrupted sentence recognition and for sentence recognition in gated noise were strongly correlated • Hearing sensitivity at the low-to-mid frequencies and the ERBs were significant predictors of both the MR and interrupted sentence recognition

  27. Follow-up 2: Interrupted sentence recognition • The degree of MR in sentence recognition seemed to be correlated with low-to-mid-frequency hearing sensitivity • Gaëtan & Christophe (2002) found that older listeners with mild hearing loss placed more perceptual weight on a mid-frequency band (750-1750 Hz) than NH listeners • Qin & Oxenham (2003) suggested that a strong pitch cue is important for a listener to segregate speech from noise • Low-to-mid-frequency information may be more important for HI listeners when understanding sentences in competing noise • Reducing low-frequency gain to improve comfort in noise may therefore have unwanted consequences for HI listeners

  28. Follow-up 2 • Task: understanding IEEE sentences • in quiet • in steady noise • in gated noise • interrupted with silent gaps • Participants • 10 young adult NH listeners

  29. Follow-up 2 • Stimuli • Both speech and noise were either unprocessed (natural) or processed through 4 different filters • Band-pass (bp) filter cutoff frequencies: 0.5-0.75 kHz, 0.75-1 kHz, 1-2 kHz, and 2-3 kHz
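
A minimal sketch of the band-pass processing described above, assuming a zero-phase Butterworth filter; the filter type, order, and sampling rate are assumptions, not details given in the slides.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(x, fs, lo_hz, hi_hz, order=4):
    """Band-pass filter a speech or noise waveform (assumed implementation)."""
    sos = butter(order, [lo_hz, hi_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

# The four band-pass conditions listed above.
bands = [(500, 750), (750, 1000), (1000, 2000), (2000, 3000)]
fs = 22050                      # assumed sampling rate
speech = np.random.randn(fs)    # placeholder standing in for an IEEE sentence
filtered = {f"{lo}-{hi} Hz": bandpass(speech, fs, lo, hi) for lo, hi in bands}
```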

  30. Follow-up 2 • Speech • IEEE sentences spoken by ten talkers • Presented at a comfortable level (70-75 dB SPL) for each NH listener • Noise • Shaped to the long-term spectrum of speech, presented at an SNR of -5 dB • Steady or fluctuating • Gating frequency: 8 Hz • The RMS amplitude of the noise was adjusted relative to the RMS of the target speech
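
A small sketch of the RMS-based level adjustment described above: the noise is scaled so that the speech-to-noise ratio equals the target SNR. The waveforms are placeholders, not the actual stimuli.

```python
import numpy as np

def rms(x):
    """Root-mean-square amplitude of a waveform."""
    return np.sqrt(np.mean(np.square(x)))

def scale_noise_to_snr(speech, noise, snr_db):
    """Scale noise so that 20*log10(rms(speech)/rms(noise)) == snr_db."""
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    return noise * (target_noise_rms / rms(noise))

# Placeholder waveforms mixed at the -5 dB SNR used in these conditions.
fs = 22050
speech = np.random.randn(2 * fs)
noise = np.random.randn(2 * fs)
mixture = speech + scale_noise_to_snr(speech, noise, -5.0)
```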

  31. Results

  32. Results • In quiet, performance was similar across filter conditions (above 90%) except for the band-pass condition at 1-2 kHz • In steady noise at -5 dB SNR, performance was quite low (below 5%) • except when only mid-to-high frequency information was audible (bp 2-3 kHz) • In gated noise, performance was significantly better than in steady noise • NH listeners were able to use the limited spectral information in the dips of the noise to separate speech from noise • except for bp 1-2 kHz • poorer performance in both quiet and steady noise compared to the other filtering conditions

  33. Percent correct keyword identification and masking release [Figure]

  34. Keyword identification of interrupted sentences vs. masking release [Figure]

  35. Results • Performance in interrupted speech recognition showed a similar pattern to performance in gated noise • If only limited spectral information is available, performance is low in both gated noise and interrupted speech (for example, bp 1-2 kHz) • With more spectral cues available in the dips, performance was better in both conditions

  36. Discussion • NH listeners were able to use limited spectral information to understand speech in quiet and in gated noise • Using high-frequency cues in filtered speech (e.g., bp 2-3 kHz), NH listeners were better able to segregate speech from gated noise and showed a greater amount of masking release • High-frequency cues may be more important than low-frequency cues for speech recognition in complex noise backgrounds for NH listeners • These cues would be less available to HI listeners

  37. Discussion • Similar performance for recognition of speech in gated noise and recognition of gated (interrupted) speech • Understanding interrupted speech is dominated by the information in the dips (limited information in the dips = limited masking release) • This suggests that masking release is determined by listeners' ability to decode the information in the dips (consistent with Kwon & Turner, 2001)

  38. Future directions • Investigate recognition of narrow band-pass filtered speech by HI listeners • Different degrees and configurations of hearing loss • Compare the perceptual weight functions of NH and HI listeners using sentence recognition in different types of noise • Implications
