Acoustics Research Institute Austrian Academy of Sciences • Additivity of auditory masking • using Gaussian-shaped tones • aLaback, B., aBalazs, P., aToupin, G., bNecciari, T., bSavel, S., bMeunier, S., bYstad, S., and bKronland-Martinet, R. • aAcoustics Research Institute, Austrian Acad. of Sciences, Austria • bLaboratoire de Mécanique et d'Acoustique, CNRS Marseille, France • MULAC Meeting, Vienna • Sept 24rd, 2008 • Bernhard.Laback@oeaw.ac.at • http://www.kfs.oeaw.ac.at
Motivation • Both temporal and frequency masking have been studied extensively in the literature • Very little is known about their interaction, i.e., masking in the time-frequency domain • An accompanying study (Necciari et al., this conference) presents data on time-frequency time-frequency masking caused by a Gaussian-shaped tone pulse (“Gaussian”) • Our aim is to study the additivity of masking from multiple Gaussian maskers • Taken together, these data may serve as a basis to model time-frequency masking in complex signals
Tim-frequency masking frequency time
Tim-frequency masking frequency time
Tim-frequency masking frequency time
Outline • 3 steps: • Additivity of temporal masking • Additivity of frequency masking • Additivity of time-frequency masking (not presented today)
Experiment design • Both signal and maskers are Gaussian-windowed tones: with Γ: gamma factor: (Γ = α.f0), where f0 is the tone frequency and α the shape factor • Equivalent rectangular bandwidth ( Γ): 600 Hz • Equivalent rectangular duration: 1.7 ms • Good properties of Gaussian in time-frequency domain: • Minimal spread in time-frequency • Gaussian shape in both time and frequency • A study by van Schijndel et al. (1999) has shown that Gaussian-windowed tones with an appropriate alpha factor may fit the auditory time-frequency window.
Experiment design • Procedure: • 3 interval - 3 AFC (oddity task) • Adaptive procedure: 3 down - 1 up rule (estimates the 79.4% threshold) • 12 turnarounds, the last 8 used to calculate the threshold • Stepsize: 5 dB, halved after 2 turnarounds • Repeated measurements to have at least three stable values • Presented in blocks of equivalent number of maskers • Five subjects, normal hearing according to standard audiometric tests
Additivity of temporal maskingDesign • Frequency (target and maskers): 4000 Hz • Four maskers with time shifts: -24, -16, -8, +8 ms • Maskers nearly equally effective (iterative approach) • Amount of masking: 8 dB • Combinations: “M2-M3”, “M3-M4”, “M1-M2-M3”, “M2-M3-M4”, “M1-M2-M3-M4” M1 M2 M3 T M4 time (ms) -24 0 -16 -8 +8 Δt
Waveform of four maskers at equally effective levels(target at masked threshold for single masker) M4 T M3 M2 M1
p << 0.05 p << 0.05 p << 0.05 p >> 0.05 Additivity of temporal maskingAverage results over five subjects Error bars: 95% confidence intervals p >> 0.05 Empty symbols: measured data Filled symbols: linear additivity model
Additivity of temporal maskingAverage results over five subjects Error bars: 95% confidence intervals
Summary of temporal masking data(average) • No difference between forward and backward maskers • Amount of masking increases with number of maskers: • 2 maskers vs. 1 masker: + 18 dB (p << 0.05) • 3 maskers vs. 2 maskers: + 5 dB (p << 0.05) • 4 maskers vs. 3 maskers: + 11 dB (p << 0.05) • Amount of excess masking (nonlinear additivity) increases with number of maskers • 2 maskers: 14 dB • 3 maskers: 17 dB • 4 maskers: 26 dB • Results qualitatively consistent with literature data using stimuli with no or little temporal overlap of maskers
Additivity of frequency maskingDesign • Target frequency: 5611 Hz • Four simultaneous maskers with frequency separations: -7, -5, -3, +3 erbs • Maskers nearly equally effective • Amount of masking: 8 dB • Combinations: as for temporal masking M1 M2 M3 T M4 Frequency(erb) -7 0 -5 -3 +3 Δf
Additivity of frequency maskingDesign • Cochlear distortions (combination tones) could be detection cues • Therefore, lowpass-filtered background noise was added • The most critical condition (M3+T) wastested with/without noise on two subjects • No difference in threshold: so finally NO masking noise!
Additivity of frequency maskingAverage results over five subjects Error bars: 95% CI Empty symbols: measured data Filled symbols: linear additivity model
Summary of frequency masking data(average) • Amount of masking depends on maskers involved: • M2-M3 vs. single: 3 dB (p < 0.05) • M3-M4 vs. single: 15 dB (p << 0.05) • M1-M2-M3 vs. M2-M3: 5 dB (p < 0.05) • M2-M3-M4 vs. M3-M4: 0 dB (p > 0.05) • M2-M3-M4 vs. M2-M3: 14 dB (p << 0.05) • M1-M2-M3-M4 vs. M1-M2-M3: 9 dB (p << 0.05) • M1-M2-M3-M4 vs. M2-M3-M4: 0 dB (p > 0.05) • Excess masking (nonlinear additivity) mainly occurring when higher-frequency masker (M4) included • Pairs: 2-3: 0 dB, 3-4:15 dB • Triples: 1-2-3: 5 dB, 2-3-4: 13 dB • Quadruple: 14 dB
Waveform of four maskers at equally effective levels(target at masked threshold for single masker) M2 M3 M1 M4 T Maskers M1,M2, and M3 overlap with each other, but not with M4
Discussion and Conclusions • Strong excess masking for Gaussian maskers if they are physically non-overlapping • Amount of excess masking increases monotonically with number of non-overlapping maskers • Excess masking is thought to be related to the compressivity of BM vibration (e.g. Humes and Jesteadt, 1989) • Thus, our Gaussians seem to be subject to BM compression, even though they are rather short (ERD = 1.7 ms) • This is consistent with the physiological finding that the BM starts to be highly compressive already 0.5 to 0.7 ms after the onset of a signal (Recio et al., 1998)
Modeling of Results Linear Energy Summation Model • Assumption: Masked threshold proportional to masker energy at out put of integrator stage • Combining two equally effective maskers A and B should produce X + 3 dB of masking • Valid for completely overlapping maskers Nonlinear Model • Assumption: Compressive nonlinearity in auditory system is preceding the integrator stage • Combining maskers A and B results in more than linear additivity (excess masking) • Valid for non-overlapping maskers
Modeling of Results • General form: where • MA, B: Amount of masking produced by maskers A or B MAB: Amount of masking produced by the combination of maskers A and B J: Compressive nonlinearity in peripheral auditory processing
Modeling of Results • Power-law model (Lutfi, 1980): • for p = 1: linear model • for p < 1: compressive model MTX: Masked threshold of masker X • Modified Power-law model (Humes et al., 1989): • Threshold in quiet (QT) considered as “internal noise”
Start with Temporal Masking: → perfect masker separation • Power Model: • best fit for p = 0.2 • Mean error: 1.9 dB • Modified power model: • Prediction always too low
Include Correction for Quiet Threshold: -7 dB • Power Model: • Mean error: 1.9 dB • Modified power model: • Mean error: 1.6 dB Why correction required? → Probably, absolute thresholds for Gaussians are no good approximation for internal noise
Spectral Masking: Using same p-value (0.2) and threshold correction • Power Model: • Good fit only for M3M4 (non-overlapping) • Modified power model: • too high predictions Adjustment of parameters required!
Some questions • Can we derive appropriate p-values from amount of overlap between maskers? • Can the (modified) power model be included into the Gabor-Multiplier framework to predict time-frequency masking effects for complex signals?
Acknowledgements • We would like to thank • the subjects for their patience • Piotr Majdak for providing support in the development of the software for the experiments • Work partly supported by WTZ (project AMADEUS) and WWTF (project MULAC)
Time-frequency conditions frequency time