

  1. The importance of the direct to reverberant ratio in the perception of distance, localization, clarity, and envelopment - or - Measuring Auditory Engagement - or - Near/Far David Griesinger Consultant Cambridge MA USA www.DavidGriesinger.com

  2. Introduction • Part one of this talk will consist of: • 1. A description of the sonic perception of near/far and its relevance to music and drama • 2. A plea for acoustic designs that utilize this perception to engage the audience by making the music so exciting and accessible that they listen closely – instead of sitting back and letting it wash by. • 3. A proposal that engagement is encouraged both by low perceived sonic distance, and the ease of localizing the azimuth of musicians in an ensemble. • 4. A proposal that ease of localization can be used as a proxy for measuring the engagement of an acoustic scene. • 5. The development and testing of an impulse response based measure for ease of localization. • The measure is based on how our hearing processes Syllables or Notes and thus involves a double integral. • The measure integrates the log of sound pressure, not the pressure itself. • It includes both lateral and medial reflections. • 6. A proposal that envelopment is also enhanced by the presence of direct sound. • Part two of this talk will describe the implications of enhancing engagement in both large and small halls. • Experiments in a particular small hall will be discussed which reveal the usefulness of the new measure. • The ability to perceive direct sound (sound that travels to the listener without reflection) is the key to localization, perceived distance, engagement, and envelopment.

  3. Warning! • This talk contains concepts that contradict deeply held convictions. • I propose that reflections (often early reflections) in the time range of 10 to 100ms often reduce clarity, envelopment, and engagement. • Whether they are lateral reflections or not! • These detrimental effects are easy to demonstrate, and I will attempt to do so. • I am NOT saying all early reflections are bad! • The ability to detect direct sound in the presence of reverberation is frequency dependent, and frequencies above 700Hz are particularly important. • The critical issue is the amount of early energy and its time delay. If the energy above 700Hz is below a critical threshold, early energy and late reverberation can enhance the listening experience. • Often reflectors directed into the audience, which absorbs the first-order reflection, have the effect of reducing the early energy above 700Hz in other areas of the hall – with beneficial results. • Reflectors placed near certain instruments can reduce disturbing late echoes, or reinforce low frequencies without increasing the energy at 700Hz. • The major point of the talk is clear: The ability to perceive direct sound in a large majority of seats is a vital component of a great hall. • And this perception requires close attention to the amount of reflected energy in the first 100ms after the direct sound.

  4. Near/Far • The apparent closeness of a sound source is a fundamental perception for all of us. • We can tell instantly if a person talking is within a few feet of us, or further away – and this perception has survival value. • The perception of “Near” depends critically on our ability to perceive the direct sound – the sound that travels to the listener without reflecting. • Surprisingly, in a theater or hall it is possible to perceive the performers as both acoustically close to the listener and enveloped by the hall. • The best halls (Boston Symphony Hall, Concertgebouw, the front half of the Musikverein) provide both, but many, perhaps most, provide only reverberation. • Harmonic coherence of speech and music is a principal cue for perceiving near and far. • The audio examples in the click box above show the decrease in apparent distance caused by increasing amounts of harmonic coherence. • Note that all of the examples have high intelligibility – but their emotional effect is quite different. • This perception correlates with musical clarity and the ability to localize sound sources.

  5. Neural model Analysis – direct sound

  6. Neural Analysis “ten” with 88ms reflections

  7. Neural Analysis “ten” with 133ms reflections

  8. Neural model Analysis – direct sound

  9. Add reverb at 2s RT -10dB D/R

  10. Add reverb at 1s RT -10dB D/R

  11. A slide from Asbjørn Krokstad - IoA, NAS Oslo 2008 [With permission] To succeed: [in bringing new audience into concert halls…] ENGAGING “Interesting” “Nice” [We need to make the sonic impression of a concert engage the audience – not just the visual and social perceptions. Especially since audiences are increasingly accustomed to recordings!]

  12. ENGAGEMENT, not NICE • At the IOA conference in Oslo, Asbjørn Krokstad (a musician, conductor, and Norway’s best-known acoustician) gave a lecture where he insisted that acousticians needed to provide engagement, not just pleasant music. • And not just for drama and opera, but for chamber music and symphony too. • At the end of the lecture he showed a picture of the Teatro Colón in Buenos Aires, Argentina. “Is this the concert hall of the future?” he asked. • This hall is not a shoebox, but a large semicircular theater with a high ceiling. It ranks at the top in Beranek’s surveys, and the reverberation time is 1.6 seconds occupied. • Krokstad may have conducted there. • Engagement requires the independent perception of the direct sound. • We must learn how to provide this essential element in halls. • I have been fortunate to hear several of the live broadcasts of the Metropolitan Opera in a good theater. For example, the performance of Salome: • The sound was harsh and dry – radio mikes coupled to directional loudspeakers. But you could hear every syllable of Mattila’s impeccable German. The performance was totally gripping! • This is the dramatic and sonic experience audiences increasingly demand.

  13. What is “Auditory Engagement”? • “Engagement” is the perception that you are not just watching a scene from a distance, but present in the middle of it. • Thus lack of distance is a critical component of presence. • Auditory engagement is the perception that you are acoustically close to the sound sources. • Distance is perceived directly through harmonic coherence – but experiments to directly measure it with subjects are difficult. However it correlates both with the ability to localize sound sources, and with the perception of presence, or musical clarity. • To perceive presence you must be able to localize sound sources nearly all the time, • and be able to distinguish them from one another nearly all the time. • Clear localization and the ability to hear most of the notes are key components of audience engagement. • Although particularly important in drama and opera, it should be (and often is not) a part of the emotional experience of music. • Being able to hear all the notes and localize the players draws the audience into the performance. They don’t just watch it. • This view of clarity is different from the one that equates clarity with intelligibility. Perhaps we need a new word for it.

  14. Barron on Localization – “Raising the Roof”, Nature Vol. 453, 12 June 2008 • “Much remains to be discovered about how our ears and brains process sound reflections. Understanding this has been complicated, for instance, by our remarkable ability to work out where a sound is coming from. This ability, called localization, works even when the sound arriving directly from the source represents only a small proportion of the total sound we receive, perhaps only 5% at the back of a concert hall. • [Without a visual reference precise localization is frequently not possible at this level of direct sound. With a visual reference we perceive what we do not hear.] • “Usually we are listening to speech or music, which have short elements such as syllables or notes that vary with time. Our brains use this time-varying information to extract where the initial sound comes from”. • [but to do this we MUST be able to detect and process the direct sound!] • “The downside of this localization is that, in effect, our hearing suppresses awareness of sound reflections. We notice early sound reflections but are often not conscious of their effects - such as making sound seem clearer than it would be otherwise.” [italics added] • [or less clear, as I believe is often the case. Barron is equating “clear” with “intelligibility” – but that is different from engagement. “I would rather the audience not hear the words than have the actors sound far away” – said a well known drama director in Copenhagen.]

  15. Experiment for threshold of Azimuth Detection in halls A model is constructed with a source position on the left, and another source on the right. The source signal alternates between the left and the right position. When the d/r is less than about minus 13dB both sources are perceived in the middle. The subject varies the d/r, and reports the value of d/r that separates the two sources by half the actual angle. This is the threshold value for azimuth detection for this model. (Above this threshold the subject also reports a decrease in subjective distance)

  16. Threshold for azimuth detection as a function of frequency and initial delay As the time gap between notes increases (allowing reverberation to decay) the threshold goes down. To duplicate the actual perception in small halls I need a 50ms gap between notes. As the time gap between the direct sound and the reverberation increases, the threshold for azimuth detection goes down. (the d/r scale on this old slide is arbitrary)

  17. An important caveat! • All these thresholds were measured without visual cues • The author has found that in a concert (with occasional visual input) instruments (such as a string quartet) are perceived as clearly localized and spread. • When I record the sound with probes at my own eardrums, and play it back through calibrated earphones the sound seems highly accurate, but localization often disappears! • Without visual cues when the d/r is below threshold the individual instruments are localized and spread when they play solo, but collapse to the center when they play together. • My brain will not allow me to detect this collapse when I am in the concert hall – even if I close my eyes most of the time! • With eyes closed it is more difficult to separate the sounds of the individuals, such as the second violin and the viola. This difficulty persists in the binaural recording.

  18. Localization • For this paper we assume sound sources are localized by the direct sound. • In some cases localization is aided by early reflections – but these vary strongly from seat to seat, and are too complex to consider here. • For localization to be successful the direct sound must be perceived. • Prompt strong reflections can – and do – mask the direct sound. • Let’s propose that the brain detects the loudness of – and the presence of – sounds by integrating nerve firings over a period of time. • If the integrated nerve firings from the direct sound exceed the integrated nerve firings from the reflections inside this time window, the direct sound will be perceived – and localized. • We can calculate the threshold of perception by double integrating the impulse response over a fixed time window.

  19. The ear perceives notes – not the impulse response itself. • Here is a graph of the ipsilateral binaural impulse response from spatially diffuse exponentially decaying white noise with an onset time of 5ms and an RT of 1 second. This is NOT a note, and NOT what the ear hears! • To visualize what the ear hears, we must convolve this with a sound. • Let’s use a 200ms constant level as an example. • The nerve firings from the direct component of this note have a constant rate for the duration of the sound. • The nerve firings from the reverberant component steadily build up until the note ceases and then slowly stop as the sound decays. (Model values for D/R = -10dB. RT = 2s: C80 = 3.5dB, C50 = 2.2dB, IACC80 = .24; RT = 1s: C80 = 6.4dB, C50 = 4.1dB, IACC80 = .20)
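
A minimal Matlab sketch of the synthesis described above, for illustration only: a single-channel decaying-noise tail with a 5 ms onset and a 1 s RT, a direct sound 10 dB below the total reverberant energy (the D/R used on the next slides), and a 200 ms constant-level note convolved with the result. The author's actual stimuli were spatially diffuse binaural renderings (see slide 27); all names and the exact onset handling here are assumptions.

% Illustrative sketch only - not the author's stimulus generator.
sr      = 44100;                      % sample rate
rt      = 1.0;                        % reverberation time, seconds
onset   = round(0.005*sr);            % 5 ms onset delay before the tail
len     = round(1.5*rt*sr);           % length of the synthetic tail
t       = (0:len-1)'/sr;
env     = 10.^(-3*t/rt);              % amplitude envelope: -60 dB at t = rt
rvb     = [zeros(onset,1); randn(len,1).*env];   % decaying white-noise tail
rvb     = rvb/sqrt(sum(rvb.^2));      % normalize total reverberant energy to 1
direct  = zeros(size(rvb));
direct(1) = 10^(-10/20);              % direct sound at D/R = -10 dB
ir      = direct + rvb;               % toy single-channel impulse response

note    = ones(round(0.2*sr),1);      % 200 ms constant-level excitation
heard   = conv(ir, note);             % what reaches the ear for this note
% plot(20*log10(abs(heard) + eps)) shows the reverberant component building
% up for the duration of the note and then decaying, as described above.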

  20. Direct and reverberation for d/r = -10dB, and RT = 1s The blue line shows the rate of nerve firings for a constant direct sound 10dB less than the total reverberation energy. The red line shows the rate of nerve firings for the reverberation, which builds up for the duration of the note. The black line shows a time window (100ms) over which to integrate the two rates. In this example the area in light blue is larger than the area in pink, so the direct sound is inaudible.

  21. Direct and build-up RT = 2s If we hold the d/r constant, when the reverberation time is two seconds it takes longer for the reverberation to build up, so the light blue area decreases, while the pink area stays constant. This makes the direct sound more audible. In a large hall the time delay between the direct sound and the reverberation also increases, further reducing the area in light blue. The direct sound would be even more audible.
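
The area comparison in the two figures above can be written out as a toy calculation. This is only a sketch of the picture, assuming the same simplified firing model (constant rate for the direct sound, log of the energy build-up for the reverberation, nothing below a 20 dB floor); the full measure follows on slides 22-24, and the time-gap value here is an assumption.

% Toy version of the slide 20/21 area comparison - not the full LOC measure.
sr    = 44100;
D     = round(0.100*sr);              % 100 ms integration window, in samples
dr_dB = -10;                          % direct level re total reverberant energy
rt    = 1.0;                          % try 1.0 vs 2.0 seconds
gap   = round(0.010*sr);              % assumed gap before the reverberation

len = round(1.5*rt*sr);               % full length of the toy reverb envelope
t   = (0:len-1)'/sr;
env = [zeros(gap,1); 10.^(-3*t(1:len-gap)/rt)];  % delayed decaying envelope
env = env/sqrt(sum(env.^2));                     % total reverb energy = 1

S = 20;                                          % dynamic range of the firings
pink_area = D*(S + dr_dB);                       % direct: constant rate over D

build_dB = S + 10*log10(cumsum(env(1:D).^2) + eps);  % reverberant build-up
build_dB(build_dB < 0) = 0;                      % no negative firing rates
blue_area = sum(build_dB);

direct_audible = pink_area > blue_area           % the criterion sketched above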

  22. Equation for Localizability – 700 to 4000Hz • We can use this simple model to derive an equation that expresses the ease of perceiving the direction of direct sound as a decibel value. p(t) is the sound pressure of the ipsilateral channel of a binaural impulse response. With the previous simple assumptions, we propose the threshold for detection would be 0dB, and clear localization would occur at a localizability value of +3dB. • Where D is the window width (~ 0.1s), and S is a scale factor: • Localizability (LOC) in dB = (see the reconstruction below) • The scale factor S and the window width D interact to set the slope of the threshold as a function of added time delay. The values I have chosen (100ms and -20dB) fit my personal data. An extra factor of -1.5dB is added to match my personal thresholds. S is the zero nerve firing line in the previous two slides. It is 20dB below the maximum loudness. POS means ignore the negative values for the sum of S and the cumulative log pressure.
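
The equation itself was an image on the original slide and is missing from this transcript. The following reconstruction is inferred from the Matlab code on slide 24 (p(t) is the band-limited ipsilateral impulse response, normalized as in that code) and should be checked against the author's published form:

S = 20\,\mathrm{dB} - 10\log_{10}\!\int_{0}^{\infty} p^{2}(t)\,dt

\mathrm{LOC} = S - 1.5\,\mathrm{dB} + 10\log_{10}\!\int_{0}^{0.005} p^{2}(t)\,dt \;-\; \frac{1}{D}\int_{0.005}^{D} \mathrm{POS}\!\left[\, S + 10\log_{10}\!\int_{0.005}^{\tau} p^{2}(t)\,dt \,\right] d\tau

Here D ≈ 0.1 s, the 0.005 s limit separates the direct sound from the reflections, and POS[x] = max(x, 0).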

  23. Some explanation of the equation • The equation as written in the previous slide simply calculates the ratios of the pink and blue areas shown in the previous pictures. • The first integral on the left in LOC is the “pink” area – the sum of the nerve firings for the direct sound. This area is the product of the normalized sound pressure times the length of the window D. • However here we have divided through by D – so this factor is not shown. • The next two integrals represent the total nerve firings for the reverberation – the “blue” area. • Since we have divided by D, a factor of 1/D is included at the beginning. • The second of the two integrals is the physical sum of the sound pressure that would exist if the impulse response was convolved with a steady excitation. The first integral finds the area under this curve. In the second integral we have excluded the direct sound – assuming this will be in the first 5 milliseconds. • The limits of the integrals have been adjusted to account for this exclusion. Thus the second integral goes from .005 seconds to the end, and the first integral is from zero to the window width minus .005. • I have included the -1.5dB adjustment for my personal thresholds.

  24. Matlab code for LOC
% load in a .wav file containing a binaural impulse response – filter it
% and truncate the beginning
upper_scale = 20;                    % 20dB range for firings
box_length = round(100*sr/1000);     % proposed box length - try 100ms
early_time = round(5*sr/1000);
D = box_length;                      % the window width
ir_left  = data1;                    % the binaural IR
ir_right = data2;
clear data1 data2

% filter the IRs
wb = [2*1000/sr 2*4000/sr];
[b a] = ellip(3,2,30,wb);
ir_left  = filter(b,a,ir_left);
ir_right = filter(b,a,ir_right);

% find the start of the direct sound and truncate
% (the fixed threshold of 500 assumes integer-scaled wav data)
for il = 1:0.1*sr
    if abs(ir_left(il)) > 500
        break
    end
    if abs(ir_right(il)) > 500
        break
    end
end
ir_left(1:il)  = [];
ir_right(1:il) = [];

% ir_left is an ipsilateral binaural impulse response,
% truncated to start at zero and filtered to 1000-4000Hz.
% early_time is 5ms in samples, D is 100ms in samples.
% here starts the equation on the slide:
S = 20 - 10*log10(sum(ir_left.^2));
early = 10*log10(sum(ir_left(1:early_time).^2));

% the cumsum represents the build-up in energy when the IR is
% excited by a steady tone:
ln = length(ir_left);
log_rvb = 10*log10(cumsum(ir_left(early_time:ln).^2));

% look at positive values of S+log_rvb only
for ix = 1:ln-early_time
    if S + log_rvb(ix) < 0
        log_rvb(ix) = -S;
    end
end

LOC = S - 1.5 + early - (1/D)*sum(S + log_rvb(1:D-early_time))

  25. Use of the localization equation • Like RT or C80, LOC uses a measured impulse response as an input, with the direct sound starting at time zero. This is the only data a user needs to supply. • The measure is calibrated for a front-facing binaural impulse response. • An omnidirectional impulse response will give lower values of LOC for the same seat position, due to the lack of head shadowing. • The localization equation appears more complex than most current measures for room acoustics, but it has a simple, physiologically based interpretation. • It is the ratio, in dB, of the number of nerve firings received by the brain from the direct sound in a 100ms window to the number of nerve firings received from all reflections in the same time period. • It contains three experimentally based parameters: the window width D, the dynamic range of the nerve channels S, and the time window for separating direct sound from reflections (5ms). These parameters are not intended to be adjustable without further experimental work. • Matlab code for calculating LOC is simple, and available from the author.
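
As a usage sketch (illustrative, not the author's distributed code), the slide-24 calculation can be wrapped as a function for one ipsilateral channel. The function and file names here are hypothetical; the 700 Hz lower band edge follows the slide titles, whereas the slide-24 script uses 1000 Hz.

% Hypothetical wrapper around the slide-24 calculation (sketch only).
function loc = loc_from_ir(ir, sr)
    % bandpass per slides 22 and 27 (the slide-24 script uses 1000-4000 Hz)
    [b, a] = ellip(3, 2, 30, [2*700/sr 2*4000/sr]);
    ir = filter(b, a, ir(:));
    ir = ir(find(abs(ir) > max(abs(ir))/100, 1):end);  % crude start trim
    D  = round(0.100*sr);            % 100 ms window
    e5 = round(0.005*sr);            % 5 ms direct-sound region
    S  = 20 - 10*log10(sum(ir.^2));
    early   = 10*log10(sum(ir(1:e5).^2));
    log_rvb = 10*log10(cumsum(ir(e5:end).^2));
    pos     = max(S + log_rvb, 0);   % POS(): ignore negative firing rates
    loc     = S - 1.5 + early - (1/D)*sum(pos(1:D-e5));
end
% Example call (file name hypothetical); per the text, about 0 dB is the
% proposed detection threshold and +3 dB indicates clear localization:
%   [data, sr] = audioread('seat_binaural_ir.wav');
%   loc_left = loc_from_ir(data(:,1), sr)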

  26. Interpretation of LOC • LOC was developed and verified as a method for predicting when a sound will be accurately localized when the direct sound is much lower in total energy than the sum of all reflections. • Like C80, IACC80, and similar measures, LOC is based on a time window that begins with the onset of the direct sound. • In practice, whether syllables or notes are affected by any of these measures depends on the rise time (onset time) of the sound. • If the sound starts gradually the precise moment of onset becomes indeterminate, and separating direct sound from reflections becomes impossible. • Thus LOC – and other such measures – are accurately predictive only for signals with sharp onsets. • Additionally, if the direct sound from a note or syllable is masked by reverberation from a previous sound, the direct sound will not be audible. • LOC predicts the audibility of the direct sound for a syllable or note with a rapid rise-time when there is sufficient freedom from masking from previous sounds. • Although musical signals often do not meet these criteria, in practice there are enough occasions that do meet the criteria that the LOC equation is useful. • Remember that for the purposes of this talk Localization is only a proxy for the main goal – predicting when the direct sound is sufficiently audible to produce engagement. • Preliminary results suggest LOC achieves this goal.

  27. Localization Equation Setup • The Localization Equation was developed and tested using binaural impulse responses generated with the author’s own HRTFs. • The source position was 15 degrees to the left (and right) of center. Only the ipsilateral channel was analyzed. • Male speech alternated from left to right with a time gap of 400ms, to allow for complete decay of the reverberation between each word. • The reverberation was generated using an independent decaying noise signal convolved with each of 54 HRTFs spaced equally around the listening position. • The HRTFs were equalized so that the azimuth zero elevation zero HRTF was flat from 40Hz to about 4kHz. The elevation notch at 7.8kHz was not equalized away, but was left in place. • Playback was done through headphones equalized to match a loudspeaker placed in front of the listener – again not equalizing away the 7.8kHz notch from the listener’s frontal HRTF of the loudspeaker. • Because my data show that the perception of both localization and near/far is mostly a high frequency phenomenon, the impulse response was bandpass filtered between 700Hz and 4000Hz before being analyzed for localization. • If a measured binaural impulse response is used as an input, care should be taken to ensure that the dummy head is equalized as described above. • Because of the importance of upward masking in localization, if the low frequencies in the room signal are significantly stronger than those in the frequency range from 700 to 4000Hz, localization is likely to be poorer than the equation would predict.

  28. Comments on LOC • LOC is based on the LOG of the build-up of reverberant energy. • This follows directly from the physiological model. • Current measures integrate the sound energy rather than the log of sound energy. But our physiology works differently. One of the consequences is that reflections that arrive early have more influence than reflections that arrive later. • As energy builds up additional reflections are not counted as strongly. • Reflections later than 100ms are ignored in calculating LOC. • This is very different from C80 or C50, which count the earliest reflections as part of the direct sound, and compare the energy sum to the energy sum of all the later reverberation. • In a small hall most of the energy arrives before 80ms regardless of the relative strength of the direct sound, so C80 and C50 are usually high. • But small halls can have high C80 or C50, poor localization, and a lack of clarity. • LOC depends strongly on the delay between the direct sound and the build-up of the reverberation. • Late reverberation does not impair localization of short notes. • The principal difference between the localizability in small halls and large halls is the rate at which reflected energy builds up after the start of a note. • LOC is NOT related to EDT – even if Jordan’s original definition of EDT is used. • EDT is relatively independent of the initial time delay. • When D/R < -10dB, EDT and RT are the same, as there is insufficient direct sound to be detected in a reverse integrated impulse response. • LOC correlates with IACC80 – but IACC is not sensitive to medial reflections. • IACC is sensitive to the sum of reflected energy – not the log of energy, and thus is insensitive to when the reflections arrive.
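
For reference, the conventional clarity measures contrasted with LOC above are simple energy ratios. A short sketch, assuming an impulse response whose first sample is the direct sound (file name illustrative):

% Standard C50/C80 energy ratios, for comparison with LOC. These integrate
% energy (not the log of energy) and lump the earliest reflections in with
% the direct sound.
[ir, sr] = audioread('seat_ir.wav');   % illustrative file name
ir  = ir(:,1);
n50 = round(0.050*sr);
n80 = round(0.080*sr);
C50 = 10*log10(sum(ir(1:n50).^2) / sum(ir(n50+1:end).^2))
C80 = 10*log10(sum(ir(1:n80).^2) / sum(ir(n80+1:end).^2))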

  29. Tests with speech A speech signal was convolved with a pair of binaural impulse responses, such that the sound appears to come from ±15 degrees from the front. Then a fully spatially diffuse reverberation was added, in such a way that the D/R, the RT, and the time delay before the reverberation onset could be varied.
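
A simplified sketch of this kind of test signal, with the D/R, RT, and onset delay exposed as parameters. It is not the author's HRTF-based renderer (which used 54 equalized HRTFs, per slide 27): here the direct path is left in the center and the tail is plain uncorrelated decaying noise, and all names and values are illustrative.

% Simplified test-signal sketch: dry speech plus a diffuse-ish tail whose
% D/R, RT, and onset delay can be varied. Not the author's actual setup.
[speech, sr] = audioread('dry_speech.wav');   % illustrative file name
speech = speech(:,1);

dr_dB = -12;                          % direct-to-reverberant ratio to test
rt    = 1.0;                          % reverberation time, seconds
delay = 0.020;                        % gap before the reverberation onset

len  = round(1.5*rt*sr);
t    = (0:len-1)'/sr;
env  = 10.^(-3*t/rt);
tail = randn(len,2).*[env env];       % independent L/R decaying noise
tail = tail/sqrt(sum(tail(:).^2)/2);  % mean per-channel energy = 1
tail = [zeros(round(delay*sr),2); tail];

direct = [speech speech];             % direct path (center; the +-15 degree
                                      % binaural IRs of the real test omitted)
rev    = [conv(speech, tail(:,1)) conv(speech, tail(:,2))];
pad    = zeros(size(rev,1) - size(direct,1), 2);
mix    = 10^(dr_dB/20)*[direct; pad] + rev;   % roughly sets the D/R
mix    = mix/max(abs(mix(:)));        % normalize to avoid clipping
% soundsc(mix, sr)                    % listen to the result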

  30. Broadband Speech Data Blue – experimental thresholds for the alternating speech with a 1 second reverb time. Red – the threshold predicted by the localization equation. Black – experimental thresholds for RT = 2 seconds. Cyan – thresholds predicted by the localization equation.

  31. Threshold Data from Other Subjects – 1s RT Blue – new data using absence of any localization as a criterion for threshold. Red – the author’s previous data based on a half-angle criterion. • Seven subjects participated in a threshold experiment at Kyushu University. • In these experiments the threshold was defined by the extinction of localization, not by the reduction of angle by a factor of two. • Consequently the thresholds are lower than they were in my previous experiment, and they have more variation. • However, the data is consistent to within 3dB.

  32. Threshold data in Japan, 2s RT • When the RT was raised to 2s the subjects had great difficulty with determining the point of extinction, which appeared to be defined differently for each subject. • There is clearly more spread in the data, and for some subjects the effect of added delay is reduced. • The criterion of reducing the apparent separation by a factor of two seems to give more reliable results. Cyan – the author’s data with a half-angle criterion for threshold.

  33. Tests with Music – and the difference between localization and engagement • The gaps between words in the speech selection were deliberately chosen so there would be no masking of the direct sound from reverberation. This is NOT the case in real music. • Tapio Lokki kindly made anechoic music recordings available on the web. I used the violin1, violin2, cello, and viola tracks from the Mozart selection to form a string quartet. After a lot of noise reduction and balancing it worked quite well. • In music the direct sound of succeeding notes is frequently masked by reverberation from the previous notes. • When you first listen to the string quartet at low values of D/R localization is impossible, and all the instruments clump together in the middle of the sound field. • But if the value of the localization equation is above 0dB the localization is not always masked, and given time the brain can localize each instrument. Succeeding notes with the same timbre are localized to the correct position. Thus given time, about two minutes for me, the presence equation predicts the localization threshold. • But it does NOT predict the sense of engagement. You can localize sounds (sort of sometimes) but the music is not clear, and the instruments seem far away. (Here is where we need harmonic coherence.) • A value of the LOC equation of +3dB does predict engagement!

  34. Difficulties with the music tests • Because localization of sound sources with music depends so strongly on masking, experiments to determine localization threshold and the threshold of engagement are difficult to perform. • When you first start to listen the localization threshold is as much as 5dB higher than will be achieved after a few minutes of listening. • This is why many (even most) concert halls can give the impression of localization, but lack the sense of engagement. • I found that the adaptation process could be speeded by turning off the reverberation and just listening for 10 seconds or so to the direct sound alone. This teaches the brain where to expect to hear the sound of each instrument. When you turn the reverberation on, sounds of the same timbre will be perceived in the correct location. • The same process occurs in concerts where the visual image is present. The eyes train the brain where to expect each sound – and this is where we hear it. • But such a visually constructed sonic image DOES NOT produce the impression of engagement!

  35. Results of music experiments • I have a lot of data on the music experiments – because of the adaptation problem it is not as consistent as I would like. • But the results are easy to summarize: • Sufficient localization and musical clarity result for the Mozart string quartet at values of the localization equation of +3dB or higher. • These values are very seldom achieved in modern concert halls (or opera houses.) They ARE achieved in Boston Symphony Hall over a wide range of seats, and in a number of other old houses. • The reasons for the lack of success in modern halls will be discussed in the remainder of this talk • Old opera houses (with their surplus of velvet) achieve these values easily – but lack the late reverberation which is so popular these days. • Some opera fans – including myself – would rather have the dramatic intensity of the old halls, even without the reverberation. • This is the sound for which the operas were written.

  36. Direct sound and Envelopment • Recent work by the author, both in experiments with several subjects and in live lecture demonstrations with loudspeakers, has shown that the sense of both reverberance and envelopment increases when the direct sound is audible. • Where there is no perceivable direct sound the sound can be reverberant, but comes from the front. • When the direct sound is above the threshold of localization the reverberation becomes louder and more spacious. • Envelopment and reverberance are created by late energy – at least 100ms after the direct sound. • When the direct sound is inaudible the brain cannot perceive when a sound has started. • So effectively the time between the onset of the direct sound and the reverberation is reduced, and less reverberation is heard. • In the absence of direct sound syllabic sound sources (speech, woodwinds, brass, solo instruments of all kinds) are perceived as in front of the listener, even if reflections come from all around. • The brain will not allow a singer (for example) to be perceived as coming from all around the listener. • In addition, Barron has shown that reverberation is always stronger in front of a hall than in the rear – so in most seats sound decays are perceived as frontal. • But when direct sound is separately perceived, the brain can create two separate sound streams, one for the direct sound (the foreground) and one for the reverberation (the background). • A background sound stream is perceived as both louder and more enveloping than the reverberation in a single combined sound stream.

  37. Part 2 - Main Points • The ability to hear the Direct Sound – as measured by LOC – is a vital component of the sound quality in a great hall. • The ability to separately perceive the direct sound when the D/R is less than 0dB requires time. When the d/r ratio is low there must be sufficient time between the arrival of the direct sound and the build-up of the reverberation if engagement is to be perceived. • Hall shape does not scale • Our ability to perceive the direct sound – and thus localization, engagement, and envelopment – depends on the direct to reverberant ratio (d/r), and on the rate that reverberation builds up with time. • Both the direct to reverberant ratio (d/r) and the rate of build-up change as the hall size scales – but human hearing (and the properties of music) do not change. • A hall shape that provides good localization in a high percentage of 2000 seats may produce a much lower percentage of great seats if it is scaled to 1000 seats. • And a minuscule number of great seats if it is scaled to 500 seats.

  38. Diffusing elements do not scale • The audibility of direct sound, and thus the perceptions of both localization and engagement, is frequency dependent. Frequencies above 700Hz are particularly important. • Frequency dependent diffusing elements can cause the D/R to vary with frequency in ways that improve direct sound audibility. • The best halls (Boston, Amsterdam, Vienna) all have ceiling and side wall elements with box shape and a depth of ~0.4m. • These elements tend to send frequencies above 700Hz back toward the orchestra and the floor, where they are absorbed. (The absorption only occurs in occupied halls – so the effect will not show up in unoccupied measurements!) • The result is a lower early and late reverberant level above 700Hz in the rear of the hall. • This increases the D/R for the rear seats, and improves engagement. • The LOC equation is sensitive to all reflections in a 100ms window, which will include many second-order reflections, especially in small halls. • Replacing these elements with smooth curves or with smaller size features does not achieve the same result. • Some evidence of this effect can be seen in RT and IACC80 measurements when the hall and stage are occupied. • Measurements in Boston Symphony Hall (BSH) above 1000Hz show a clear double slope that is not visible at 500Hz. • The hall has high engagement in at least 70% of the seats.

  39. We need better measures • Current acoustic measures ignore both the D/R and the time gap between the direct (the first wavefront) and the reverberation. • RT, C80, and EDT all ignore the strength of the direct sound and the effects of musical style on the audibility of the D/R. • IACC comes close, but measures something different. • LOC is an attempt to supply a simple measure for a basic human perception which depends on direct sound. • But impulse response measurements under occupied conditions are notoriously difficult to obtain. • We need measures that use binaural recordings of actual performances as inputs. • And the ability to listen to these recordings to test the validity of these measures against the true experience. • Methods for recording and reproducing binaurally will be discussed in the next paper • We are working on ways to measure LOC from such recordings.

  40. Why do large halls sound different? • In Boston Symphony Hall (BSH), and the Amsterdam Concertgebouw (CG) the reverberation decay is nearly identical, but the halls sound different. • The difference can be explained using the same model that was used to develop LOC. • Lacking good data with an occupied hall and stage I used a binaural image-source model with HRTFs measured from my own eardrums.

  41. Reverberation build-up and decay – from models (Amsterdam: LOC = +6dB; Boston: LOC = +4.2dB) The seat position in the model has been chosen so that the D/R is -10dB for a continuous note. The upward dashed curve shows the exponential rise of reverberant energy from a continuous source assuming exponential decay with no time gap. The solid line shows the build-up and decay from a short note of 100ms duration. Note the actual D/R for the short note is only about -6dB. The initial time gap is less in Boston than Amsterdam, but after about 50ms the curves are nearly identical. (Without the direct sound they sound identical.) Both halls show a high value of LOC, but the value in Amsterdam is significantly higher – and the sound is clearer.
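
The curves on this slide can be approximated with a simple diffuse-field model in which the reverberant energy grows as the mirror image of its exponential decay, delayed by the initial time gap. This is only an illustration, not the author's binaural image-source model, and the 20 ms gap is an assumed value.

% Illustrative growth/decay model for the slide-41 curves (assumed gap value).
rt    = 2.0;                 % reverberation time, s
gap   = 0.020;               % assumed initial time gap, s
note  = 0.100;               % note duration, s
dr_ss = -10;                 % steady-state D/R for a continuous note, dB

t    = (0:0.001:0.5)';       % time axis in 1 ms steps
k    = 13.8/rt;              % energy decays 60 dB in one RT
grow = @(tt) (tt > gap).*(1 - exp(-k*max(tt - gap, 0)));

cont_dB = 10*log10(grow(t) + eps);                  % dashed curve: continuous source
note_dB = 10*log10(grow(t) - grow(t - note) + eps); % solid curve: 100 ms note
% plot(t, [cont_dB note_dB])

% Effective D/R for the short note, using the reverberant level reached by
% the end of the note instead of the steady-state level:
dr_note = dr_ss - 10*log10(grow(note))
% With rt = 2 s and a 20 ms gap this comes out near -6 dB, in line with the
% short-note value quoted above.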

  42. Comparisons of C80, C50, IACC80, and LOC • Conventional measures for the models of Amsterdam Concertgebouw and Boston Symphony Hall give the following results: • Amsterdam: C80 = .43dB, C50 = -2.8dB, IACC80 = .38, LOC = +6dB • BSH: C80 = .65dB, C50 = -2.1dB, IACC80 = .22, LOC = +4.2dB • Half-Size BSH: C80 = 3.7dB, C50 = 1.7dB, IACC80 = .15, LOC = 0.5dB • Only the IACC80 shows that Amsterdam might have more direct sound than Boston. The standard Clarity measures predict the opposite – and predict that the small hall would have high clarity, and it does not. • But IACC80 is sensitive only to lateral reflections. Strong reflections from the front, overhead, or rear do not affect IACC. • An IACC of 0.22 would usually be considered low. In spite of this BSH has both clarity and good localization in this seat.

  43. Smaller halls • What if we build a hall with the shape of BSH, but half the size? • The new hall will hold about 600 seats. • The RT will be half, or about 1 second. • We would expect the average D/R to be the same. Is it? How does the new hall sound? • If the client specifies a 1.7s RT will this make the new hall better, or worse?

  44. Half-Size Boston The gap between the direct sound and the reverberation, and the RT, have both become half as long. Additionally, in spite of the shorter RT, the D/R has decreased from about -6dB in the large BSH model to about -8.5dB in the half-size model. This is because the reverberation builds up more quickly and more strongly in the smaller hall. LOC = 0.5dB. The direct sound, which was distinct in more than 50% of the seats in the large hall, will be audible in fewer than 30% of the seats in the small hall. If the client insists on increasing the RT by reducing absorption, the D/R will be further reduced, unless the hall shape is changed to increase the cubic volume. The client and the architects expect the new hall to sound like BSH – but they, and the audience, will be disappointed. As Leo Beranek said about the Berlin Philharmonie: “They can always sell the bad seats to tourists.”
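
Rerunning the simplified growth model from the slide-41 sketch with the RT and the initial time gap halved (both values illustrative assumptions) lands close to the figure quoted above.

% Same toy model as the slide-41 sketch, with RT and the gap halved.
rt = 1.0;  gap = 0.010;  note = 0.100;  dr_ss = -10;
k    = 13.8/rt;
grow = @(tt) (tt > gap).*(1 - exp(-k*max(tt - gap, 0)));
dr_note = dr_ss - 10*log10(grow(note))
% About -8.5 dB: the reverberation builds up faster relative to the note
% length, so the short-note D/R worsens even though the continuous-note D/R
% is unchanged.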

  45. An existing small hall – pictures Note the highly reflective stage and side walls, deeply coffered ceiling, and relatively low internal volume per seat. The sound in many seats is muddy. Adding reflections or decreasing absorption only increases the muddiness.

  46. Hall data • The pictures show a recital hall of 65000 cubic feet (1840 cubic meters). Designed for 350 seats, it currently has 300 seats, giving a volume/seat of about 6 cubic meters. There are 1400 square feet of carpet under the seats on the floor. • Reverberation Time (RT) unoccupied is 1.1 seconds from 1000Hz to 63Hz. C80, dominated by the reverberation time, is ~+5dB everywhere. • The parallel side walls of the stage provide little diffusion. • The hall is generally liked by the audience and players, although there are reports of loudness and balance problems on stage. • Musicians desire more resonance and greater clarity in the middle of the hall.

  47. Experiments with absorption and acoustic enhancement • Measurements and experiments involving various combinations of fiberglass panels and electronic reverberation enhancement were conducted in January 2009. • Measurements were made with three loudspeakers, three dummy heads, and a Soundfield microphone. • All musical performances were recorded with the same microphones, and with an array of close microphones on stage. • About 30 musicians participated, including faculty, staff, students from all three divisions, and musicians from the wider community. • The goal was to improve the instrumental balance on stage, reduce excess stage loudness, and to increase resonance and the ability to hear individual instruments throughout the hall. • With both panels and enhancement in place comments from the participants were favorable. Players and singers found balancing with piano was easier, and the middle registers of the piano were more easily heard both by the musicians and in the hall.

  48. The absorptive curtains at the rear of the stage could be rapidly withdrawn. The blankets that simulated audience could be removed in 5 minutes, along with the panels on stage. This allowed prompt A/B comparisons. Some of the 25 LARES enhancement speakers are visible

  49. Results from the experiments • The experiments in January showed that adding fiberglass panels around the stage increased clarity and the ability to localize instruments in the hall, raising the measured value of LOC from an average of minus 1.5dB to +3dB or more. • Localization and clarity in the balcony were additionally improved by adding panels to the upper audience right side wall, which eliminated the strong lateral reflection from that surface. • The lower surface of this wall was already absorptive. • The electronic enhancement successfully compensated for the loss of resonance due to the panels. Without the enhancement the perceived resonance was reduced. • In a subsequent experiment with a violin-piano combination and no enhancement we found that just 12 fiberglass panels each 2’x6’x2” around the bottom of the stage noticeably improved the clarity on the floor of the hall, and also improved the balance for the players on stage. For this music the reduced resonance was not a problem. • These panels absorbed the first-order reflection from the back of the stage, which has the highest level and the shortest time delay. Absorbing this reflection contributed strongly to increasing LOC.

  50. Usefulness of the measure LOC • LOC informs us that the primary contributors to difficulty in localization are the first strong reflections, regardless of the direction they come from. • We initially thought that since the floor of the hall is not a significant source of these reflections, it would be likely that removing the carpet under the seats would raise the RT without decreasing LOC significantly. • However LOC is also sensitive to reverberation which arrives before 100ms, and this would be increased by removing the carpet. • A few later experiments suggest that removing the carpet will increase the reverberant level sufficiently to eliminate the improvement in LOC provided by the absorption on stage. • The existence of LOC as a physical measure can help to answer these questions in advance – or at least suggest that an experiment is needed before drastic alterations are undertaken.
