
SOUND

Laura Hyland

cs.aue.auc.dk/~laura

Sound is a form of energy. When we give energy to a body (by hitting or otherwise exciting it) we set that body in motion: it vibrates. The body in turn sets the air around it in motion, causing the air to vibrate too. The vibrations in the air mirror the vibrations of the body. These vibrations travel through the air as waves until they reach our ears, where we perceive them as sound. For example, if we excite a tuning fork by striking it, the energy given causes the tines to vibrate very quickly. The movement of the tines is too small for the eye to perceive, but not for the ear. The vibrations can also be felt when the tuning fork is brought into contact with the skin.

As the tines move back and forth they exert pressure on the air around them.

(a) The first displacement of the tine compresses the air molecules, causing high pressure.

(b) An equal displacement of the tine in the opposite direction lets the molecules spread widely apart, causing low pressure.

(c) These rapid variations in pressure over time form a pattern which propagates through the air as a wave. Points of high and low pressure are sometimes referred to as 'compression' and 'rarefaction' respectively.

[Figure: (a) compression; (b) rarefaction; (c) wave propagation from a tuning fork, as seen from above]

If we look at the way one tine moves we see that it moves in one plane only, back and forth. In addition, it moves the same distance in each direction, and each back-and-forth trip takes the same amount of time. We could say that it is trying to reach equilibrium: it is trying to get back to its original stable position. This kind of response to excitation is characteristic of many objects. A pendulum is another example of this movement, and it is easier to visualize than the tines of a tuning fork! Think of the movement of a pendulum when it is put into motion; it swings back and forth but will always come to rest at its original position. This is called Simple Harmonic Motion.

Note that even as the pendulum's swings die away towards equilibrium, each swing takes just as long as before; the pendulum simply travels a smaller distance from its point of rest. This is also the case for the tine of the tuning fork. Thus, we can say that any body undergoing simple harmonic motion moves periodically, with a constant period. We can also say that if the tine is moving periodically then the pressure variations it creates will also be periodic.

[Figure: a pendulum shown at its maximum displacement at 0 seconds, after, say, 3 seconds, and after, say, 6 seconds. The time taken to get from position a to position b is the same in all three cases.]

These pressure patterns can be represented using a circle.

Imagine the journey of the pendulum or the tine in four stages:

1) from its point of rest to its first point of maximum displacement...

2) its first point of maximum displacement back through the point of rest...

3) ... to its second point of maximum displacement...

4) ... and back from there through its point of rest again

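We can map that journey onto a circle. This is called the Unit Circle. The sine wave represents this journey around and around the unit circle over time.

[Figure: the four stages, numbered 1 to 4, marked on the unit circle and on the corresponding sine wave plotted against time]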

The sine wave or sinusoid or sinusoidal signal is probably the most commonly used graphic representation of sound waves. The diagram below shows one cycle or ’period’ of a wave, i.e., the build-up from equilibrium to maximum high pressure, to maximum low pressure, to equilibrium again. A sine wave sounds like this...

[Figure: one period of a sine wave. Vertical axis: pressure or density of air molecules ('amplitude' in decibels), from -1 to +1, with high pressure ('compression') above zero and low pressure ('rarefaction') below. Horizontal axis: time in seconds.]

The specific properties of a sine wave are described as follows.

Amplitude = variations in air pressure (measured in decibels)

Phase = The starting point of a wave along the y-axis (measured in degrees)

Period = The time needed for the wave to return to exactly the same point in its cycle (measured in seconds)

Frequency

Frequency refers to the number of cycles of a wave per second. It is measured in Hertz (Hz). So if a sinusoid has a frequency of 100 Hz, one period of that wave lasts 1/100th of a second. The diagram below shows an 800 Hz sine (a) and a 100 Hz sine (b). For every 8 periods of (a) there is one period of (b). Humans can hear frequencies between 20 Hz and 20,000 Hz (20 kHz).

There are three important things to remember about frequency:

1) Frequency is closely related to, but not the same as, pitch!

2) Frequency does not determine the speed a wave travels at. Sound waves travel at approximately 340 metres/second regardless of frequency. Frequency f, speed c and wavelength λ are related by f = c/λ.

3) Frequency is inherent to, and determined by the vibrating body – not the amount of energy used to set that body vibrating. For example, the tuning fork emits the same frequency regardless of how hard we strike it.

[Figure: (a) an 800 Hz sine wave; (b) a 100 Hz sine wave]

Wavelength

Wavelength describes the length of one period of a wave, or twice the distance between one zero crossing point and the next. (A zero crossing point refers to the point at which the wave crosses the x-axis. This represents the point at which there is no pressure variation, i.e., the point where air molecules return to their original position.)

It is important to have a sense of the actual physical size of a wave. The speed of sound in air is approximately 340 metres per second*. Consider a wave of frequency 20 Hz, i.e., a pressure pattern repeating itself 20 times a second. 20 periods back to back have a length of 340 metres, so 1 period = 340/20 = 17 metres. Similarly, a wave at a frequency of 20 kHz will be 340/20,000 = 0.017 metres, or 1.7 cm, in length.

This property is important for formants!

Wavelength = Speed of sound in air / Frequency
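
The relationship above is easy to check numerically. The following is a minimal sketch; the function names are illustrative, and the speed of sound is the approximate 340 metres per second used on this slide.

```python
SPEED_OF_SOUND = 340.0  # metres per second, the approximate figure used above


def period(frequency_hz):
    """Duration of one cycle, in seconds."""
    return 1.0 / frequency_hz


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Length of one cycle, in metres: speed of sound / frequency."""
    return speed / frequency_hz


# The two extremes of human hearing, plus the 100 Hz example.
for f in (20.0, 100.0, 20000.0):
    print(f"{f:>8.0f} Hz: period {period(f):.5f} s, wavelength {wavelength(f):.3f} m")
```

Passing a different `speed` shows how the wavelength of a given frequency changes with the air temperature mentioned in the footnote.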

[Figure: one period of a sine wave, with amplitude from -1 to +1 and the zero crossing points marked]

*This is dependent on air temperature: the higher the temperature, the more freely air molecules move and therefore the faster the wave travels.

Amplitude

Amplitude describes the size of the pressure variations. It is measured along the vertical y-axis. Think of the pendulum or the tuning fork: the wider the displacement of the pendulum or tine, the larger the amplitude. Amplitude is closely related to, but not the same as, loudness! This is why the tuning fork sounds louder when we strike it hard. We will examine the relationship between amplitude and loudness later...

(a) Two signals of equal frequency and varying amplitude

(b) Two signals of varying frequency and equal amplitude

Amplitude Envelope

The amplitude of a wave changes or 'decays' over time as it loses energy. These changes are normally broken down into four stages: Attack, Decay, Sustain and Release. Each stage is measured in milliseconds. For example, the signal below has an attack of 100ms. That means its amplitude goes from 0dB to 0.8dB in 100ms. Similarly, it has a decay of 90ms, so its amplitude goes from 0.8dB to ~0.38dB in 90ms, etc.

Collectively, the four stages are described as the amplitude envelope.
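
The four stages can be sketched as a piecewise-linear function of time. This is only an illustration: the attack and decay times and the 0.8 and 0.38 levels come from the example above, while the sustain and release durations are made-up values, and real envelopes are often curved rather than straight.

```python
def adsr_amplitude(t_ms, attack_ms=100.0, decay_ms=90.0, sustain_ms=300.0,
                   release_ms=200.0, peak=0.8, sustain_level=0.38):
    """Envelope amplitude at time t_ms (milliseconds after the sound starts).

    sustain_ms and release_ms are hypothetical defaults chosen for illustration.
    """
    if t_ms < 0:
        return 0.0
    if t_ms < attack_ms:                 # Attack: rise from 0 to the peak
        return peak * t_ms / attack_ms
    t_ms -= attack_ms
    if t_ms < decay_ms:                  # Decay: fall from the peak to the sustain level
        return peak + (sustain_level - peak) * t_ms / decay_ms
    t_ms -= decay_ms
    if t_ms < sustain_ms:                # Sustain: hold a steady level
        return sustain_level
    t_ms -= sustain_ms
    if t_ms < release_ms:                # Release: fade from the sustain level to 0
        return sustain_level * (1.0 - t_ms / release_ms)
    return 0.0


for t in (0, 50, 100, 190, 400, 590, 700):
    print(f"{t:>4} ms -> {adsr_amplitude(t):.3f}")
```

Multiplying each sample of a signal by the envelope value at that instant shapes the signal's amplitude over time.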

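[Figure: an amplitude envelope with its Attack, Decay, Sustain and Release stages marked, alongside examples of a slow attack and a fast attack. Vertical axis: amplitude, 0 to 0.8. Horizontal axis: time, 0 to 700 ms.]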

Phase

Consider the Unit Circle again. So far we have mapped the journey around the circle starting from the point corresponding to an amplitude of 0 on the y-axis, but we can offset the starting point so that we begin mapping from another point on the circle. The offset from the point 0 on the circle determines the initial phase of the sine wave, i.e., the starting point of the wave along the y-axis. This offset is the phase of the sine wave and it is measured in degrees. Figure (b) below has a phase shift of 170 degrees.

[Figure: the unit circle and the resulting sine wave for (a) a phase of 0 degrees and (b) a phase shift of 170 degrees]

Phase

Why offset the start time? As we will see later, it is often necessary to look at several signals together, each one having a different start time. The easiest way to represent this time difference is using phase. For example, take two 100 Hz signals. Say (a) starts at 0 seconds and (b) starts at 1.375 seconds. In that time (a) completes 1.375 x 100 = 137.5 cycles, so it will be halfway through its 138th cycle when (b) begins. We can represent this time difference by giving (a) a phase shift of 180 degrees, or ½ of a cycle. That is, when (b) is starting out at amplitude 0 and rising, (a) is also passing through amplitude 0 but falling. So, phase can be defined as the representation of the time delay between two signals.

[Figure: two 100 Hz signals, (a) starting at 0 seconds and (b) starting at 1.375 seconds]
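
Converting a time delay into a phase offset is a short calculation: count how many cycles fit into the delay, discard the whole cycles, and express the leftover fraction in degrees. The sketch below does this for the two 100 Hz signals and the 1.375 second delay of the example; the function names are illustrative.

```python
def phase_offset_degrees(delay_s, frequency_hz):
    """Phase, in degrees, reached by a sinusoid delay_s seconds after it starts."""
    cycles = delay_s * frequency_hz      # total cycles completed during the delay
    fraction = cycles % 1.0              # only the unfinished part of a cycle matters
    return fraction * 360.0


def delay_for_phase(phase_degrees, frequency_hz):
    """Shortest delay, in seconds, that produces the given phase offset."""
    return (phase_degrees / 360.0) / frequency_hz


print(phase_offset_degrees(1.375, 100.0))   # phase of (a) at the moment (b) starts
print(delay_for_phase(90.0, 100.0))         # delay giving a quarter-cycle offset
```

Because whole cycles are discarded, many different delays map to the same phase offset.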

Wave Superposition

If we add these two 100 Hz signals together we see that points of high pressure in (a) coincide with points of low pressure in (b). Thus they cancel each other out, and the result is no pressure variation at all, as shown in (c). When two waves interact with each other like this it is called interference.

[Figure: (a) + (b) = (c)]

Wave Superposition

In the previous example the two waves cancelled each other out, resulting in a decrease in pressure variations. This is Destructive Interference. The following example shows a case where two interacting waves produce an increase in pressure variations. This is called Constructive Interference. We have the same two signals, but this time they start 'in phase', i.e., at the same time.

[Figure: (a) + (b) = (c)]
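
Both kinds of interference can be reproduced by adding two sampled sine waves point by point. The sketch below is an illustration only: it adds a 100 Hz sine to a copy of itself shifted by a chosen phase and reports the largest value of the sum over one period. The function names and the number of sample points are arbitrary choices.

```python
import math


def sine(t, frequency_hz, amplitude=1.0, phase_degrees=0.0):
    """Value of a sine wave at time t (seconds)."""
    angle = 2.0 * math.pi * frequency_hz * t + math.radians(phase_degrees)
    return amplitude * math.sin(angle)


def peak_of_sum(phase_degrees, frequency_hz=100.0, samples=1000):
    """Largest |a + b| over one period, where b is a copy of a shifted by phase_degrees."""
    one_period = 1.0 / frequency_hz
    peak = 0.0
    for i in range(samples):
        t = i * one_period / samples
        total = sine(t, frequency_hz) + sine(t, frequency_hz, phase_degrees=phase_degrees)
        peak = max(peak, abs(total))
    return peak


print("in phase (0 degrees):      ", peak_of_sum(0.0))    # constructive
print("out of phase (180 degrees):", peak_of_sum(180.0))  # destructive
print("quarter cycle (90 degrees):", peak_of_sum(90.0))   # somewhere in between
```

Phase differences between 0 and 180 degrees give partial reinforcement or partial cancellation.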

Wave Superposition

Constructive and destructive interference due to the phase difference between two waves of equal amplitude and frequency travelling in the same direction.

Standing wave due to two waves of equal amplitude, frequency and phase travelling in opposite directions.

Wave Superposition

If we take two sine waves which are very close in frequency we experience a phenomenon called beating. Beating can be described as a periodic variation in amplitude. These variations occur at a rate of f1 - f2 (where f1 is the higher frequency and f2 the lower). The frequency of the new signal will be the average of f1 and f2. For example, when 440 Hz and 442 Hz sinusoids are combined we hear 2 beats per second and a tone whose frequency is 441 Hz.

[Figure: a 440 Hz sine and a 442 Hz sine combining into a 441 Hz tone that beats]

Beating sounds like this... (170hz + 174hz)
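
The beat rate and the perceived frequency follow from the trigonometric identity sin A + sin B = 2 cos((A - B)/2) sin((A + B)/2): the sum of two sines equals a tone at the average frequency multiplied by a slowly varying envelope. The sketch below checks this numerically; the function names are illustrative.

```python
import math


def beat_rate(f1, f2):
    """Number of beats heard per second."""
    return abs(f1 - f2)


def perceived_frequency(f1, f2):
    """Frequency of the tone that is heard to beat."""
    return (f1 + f2) / 2.0


def two_sines(t, f1, f2):
    """Direct sum of the two sinusoids at time t (seconds)."""
    return math.sin(2.0 * math.pi * f1 * t) + math.sin(2.0 * math.pi * f2 * t)


def envelope_times_tone(t, f1, f2):
    """The same signal written as a slow envelope times a tone at the average frequency."""
    envelope = 2.0 * math.cos(math.pi * (f1 - f2) * t)
    tone = math.sin(math.pi * (f1 + f2) * t)
    return envelope * tone


print(beat_rate(442.0, 440.0), perceived_frequency(442.0, 440.0))
print(two_sines(0.1234, 442.0, 440.0), envelope_times_tone(0.1234, 442.0, 440.0))
```

The envelope itself oscillates at (f1 - f2)/2, but the ear responds to its magnitude, which peaks twice per envelope cycle, hence f1 - f2 beats per second.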

Fundamentals, Harmonics & Partials

So far we have investigated interference between two sinusoids of equal frequency and between two of very close frequencies. Now we need to consider some other relationships. What happens when one sine wave is exactly half the frequency of the other? In the diagram below we see two sinusoids with frequencies of 220 Hz (a) and 440 Hz (b), both with a phase of 0 degrees, so for every one period of (a) we get two periods of (b). We can hear this relationship as well as see it. Listen to both frequencies: we hear the same note, but one is much 'higher' than the other. If both are played together we hear one tone, not two! We will investigate the reason for this later when we look at pitch. In the following diagram this pattern is extended to five sinusoids. In this case the 2nd sinusoid is twice the frequency of the 1st, the 3rd is three times the 1st, the 4th is four times the 1st, and so on.

Fundamentals, Harmonics & Partials

a) 100 Hz: 'fundamental frequency' or '1st partial'

b) 200 Hz: 1st harmonic or 2nd partial

c) 300 Hz: 2nd harmonic or 3rd partial

d) 400 Hz: 3rd harmonic or 4th partial

e) 500 Hz: 4th harmonic or 5th partial

Fundamentals, Harmonics & Partials

Visually, it is clear that there is a relationship between all these sine waves. Aurally, it is also clear: when all five sines are played together we perceive them as one tone. Numerically there is also a relationship: the frequencies are all integer multiples of the first frequency. (Integers are whole numbers like 3, 5, -8, 120, etc.) In a set of sinusoids like this the first frequency is referred to as the fundamental frequency. Subsequent sinusoids are called harmonics. The whole set together is called the Harmonic Series.

Another term often used in this context is 'partial'. NOTE: a partial is a generic term for any component of a sound. For example, a sinusoid of 318 Hz in this set is a partial but NOT a harmonic, because 318 is not an integer multiple of 100. The fundamental frequency is also a partial. Thus, all harmonics are partials but not all partials are harmonics!
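
The integer relationship can be expressed directly in code. The sketch below builds a harmonic series from a fundamental and tests whether a given partial belongs to it; the function names and the tolerance are illustrative choices.

```python
def harmonic_series(fundamental_hz, count):
    """The first `count` members of the series: 1x, 2x, 3x, ... the fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def in_harmonic_series(partial_hz, fundamental_hz, tolerance=1e-9):
    """True if partial_hz is a whole-number multiple (1x, 2x, ...) of the fundamental."""
    ratio = partial_hz / fundamental_hz
    nearest = round(ratio)
    return nearest >= 1 and abs(ratio - nearest) <= tolerance


print(harmonic_series(100.0, 5))
print(in_harmonic_series(300.0, 100.0))   # a member of the series
print(in_harmonic_series(318.0, 100.0))   # a partial, but outside the series
```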

The Harmonic Series

Notice that there is also a relationship between the amplitudes of the partials comprising the harmonic series, where f1 is the fundamental:

Amp f1 = 1

Amp f2 = amp f1/2

Amp f3 = amp f1/3

Amp f4 = amp f1/4

Amp f5 = amp f1/5

Wave Superposition of Harmonically Related Signals

Notice that the signals being added are the 1st, 3rd and 5th harmonics of a series where f0 = 100 Hz, i.e., the odd harmonics. Now look at the resultant diagram to the right. Notice that the emergent 'shape' approaches a square. Combining odd harmonics whose amplitudes fall off as one over the harmonic number approximates a square wave, and the more harmonics are added the closer the approximation becomes.
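
The way the sum closes in on a square shape can be checked numerically. The sketch below assumes each odd harmonic n is given an amplitude of 1/n; with that weighting the flat top of the resulting square wave sits at a height of π/4. The function name is illustrative.

```python
import math


def odd_harmonic_sum(t, fundamental_hz, highest_harmonic):
    """Sum of sin(2*pi*n*f*t)/n over the odd harmonics n = 1, 3, 5, ... up to highest_harmonic."""
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):
        total += math.sin(2.0 * math.pi * n * fundamental_hz * t) / n
    return total


quarter_period = 1.0 / (4.0 * 100.0)   # the middle of the positive half of a 100 Hz cycle
for highest in (1, 5, 25, 1999):
    value = odd_harmonic_sum(quarter_period, 100.0, highest)
    print(f"harmonics up to {highest:>4}: {value:.4f}   (pi/4 = {math.pi / 4:.4f})")
```

Evaluating the sum at many points across one period, rather than only at the quarter-period point, traces out the flattening top and steepening edges.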

Time Domain vs Frequency Domain

[Figure: a sine wave drawn in three dimensions, with time, amplitude and frequency axes]

Time Domain vs Frequency Domain

Even though we cannot see sound we have to remember that it has physical dimensions and exists in space. Thus, like any other physical body, it is 3-dimensional. The previous figure shows a sine wave in three-dimensional space. Along the horizontal x-axis we have time; on the vertical y-axis we have amplitude; on the diagonal z-axis we have frequency.

Up until now we have viewed sine waves in the TIME DOMAIN only.

So why use both views? What can we see in one view that we can't see in the other?

First let's look at the Time Domain again. Here we can see time, of course, and phase and amplitude. We can also see low frequencies if it's a single sine wave and the diagram is big enough, but only so far as the eye can count the repetitions of the pressure pattern. Also, if there is more than one sinusoid, identifying frequency is practically impossible. Obviously this is not a very sophisticated or accurate way of identifying frequency! This is why we need to be able to switch views between one axis and the other. (Thinking of architectural drawings of a building in plan and elevation might make it easier to conceptualize this switching between views.)

Time Domain vs Frequency Domain

Now, if we look at the sinusoid in the FREQUENCY DOMAIN what can we see? Obviously we can see frequency. We can also see amplitude. However, we cannot see time or phase at all. Look at the following diagrams to see the same signal represented in the time domain and in the frequency domain.

[Figure: the same signal shown in the Time Domain (amplitude against time) and in the Frequency Domain (amplitude against frequency, 20 Hz to 20 kHz). Components: a 440 Hz tone at an amplitude of 0.87 and a phase shift of 260 degrees; a 220 Hz tone at an amplitude of 0.4 and a phase shift of 90 degrees.]

- A physical process can be described either in the time domain, by a function of time t, h(t), or in the frequency domain, as a function of frequency f, H(f).
- h(t) and H(f) are two different representations of the same process.
- One goes back and forth between these two representations by means of the Fourier transform (FT).
- Dirac delta function δ(t)

- Using angular frequency ω = 2πf, the transform can equally be written in terms of ω.

- FT of the Dirac delta function
- FT of cos(ω0 t)

[Figure: δ(t) against t and its transform against ω; cos(ω0 t) against t and its transform against ω, with components at +ω0 and -ω0]

- FT of exp(2πi f0 t) = exp(iω0 t)

[Figure: the real (Re) and imaginary (Im) parts of exp(iω0 t) against t, and the transform F{exp(iω0 t)} against ω, with ω0 marked]

- Correspondence between symmetries in the two domains
- Scaling and shifting

[Figure: a short pulse, a medium-length pulse and a long pulse in the time domain (t), each with its transform in the frequency domain (ω)]

- With two functions h(t) and g(t), and their FTs H(f) and G(f), the convolution, g*h, is defined by (g*h)(t) = ∫ g(τ) h(t - τ) dτ, integrated over all τ.
- Convolution theorem: the FT of the convolution is the product of the individual FTs.
- The correlation, Corr(g,h)
- Correlation theorem (for two real functions, g and h):
- Autocorrelation, Wiener-Khinchin theorem:
- Parseval’s theorem:

- Suppose a function h(t) is sampled at evenly spaced intervals in time, Δ apart, giving samples hn = h(nΔ).
- 1/Δ: sampling rate

- For any sampling interval Δ, there is a special frequency fc, called the Nyquist critical frequency, given by fc = 1/(2Δ).
- e.g., critical sampling of a sine wave at the Nyquist frequency is two sample points per cycle.

- A function f is “bandwidth limited” if its Fourier transform is 0 outside of a finite interval [-L, L].
- Sampling Theorem: if a continuous function h(t), sampled at an interval Δ, is bandwidth limited to frequencies smaller than fc, i.e., H(f) = 0 for all |f| > fc, then the function h(t) is completely determined by its samples hn.

- For bandwidth limited signals, such as music in a concert hall, the sampling theorem tells us that the entire information content of the signal can be recorded by sampling at a rate Δ^-1 equal to twice the maximum frequency passed by the amplifier.

- For a function that is not bandwidth limited to less than the Nyquist critical frequency, any frequency component that lies outside the range -fc < f < fc is spuriously moved into that range (aliasing).
- Demo applet

- http://www.cs.brown.edu/exploratories/freeSoftware/repository/edu/brown/cs/exploratories/applets/nyquist/nyquist_limit_java_plugin.html
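
Aliasing can also be seen without the applet. The sketch below is an illustration under simple assumptions (a single real cosine, ideal sampling): it computes the Nyquist frequency for a sampling interval, predicts the frequency an out-of-range component appears at, and confirms that the two sets of samples are identical. The function names and the 10 kHz sampling rate are arbitrary choices.

```python
import math


def nyquist_frequency(sampling_interval_s):
    """Critical frequency fc = 1 / (2 * interval): two samples per cycle."""
    return 1.0 / (2.0 * sampling_interval_s)


def aliased_frequency(frequency_hz, sampling_rate_hz):
    """Apparent frequency, within 0..fc, of a real sinusoid after sampling."""
    nearest_multiple = round(frequency_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(frequency_hz - nearest_multiple)


def sample_cosine(frequency_hz, sampling_rate_hz, count):
    """The first `count` samples of a cosine at the given sampling rate."""
    return [math.cos(2.0 * math.pi * frequency_hz * n / sampling_rate_hz)
            for n in range(count)]


rate = 10000.0                                   # samples per second
print(nyquist_frequency(1.0 / rate))             # highest frequency represented faithfully
print(aliased_frequency(7000.0, rate))           # where a 7 kHz component ends up
same = all(abs(a - b) < 1e-9 for a, b in
           zip(sample_cosine(7000.0, rate, 50), sample_cosine(3000.0, rate, 50)))
print(same)                                      # the sampled values cannot be told apart
```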

- Suppose we have N consecutive sampled values hk = h(tk), tk = kΔ, k = 0, 1, ..., N-1, where Δ is the sampling interval, and assume N is even.

- With N numbers of input, we can produce no more than N independent numbers of output. Therefore, we seek estimates only at the discrete values fn = n/(NΔ), n = -N/2, ..., N/2.
- The discrete Fourier Transform (DFT) Hn is then computed at these frequencies from the N samples.
- The DFT maps N complex numbers (the hk’s) into N complex numbers (the Hn’s).

- It is periodic in n, with period N; H-n = HN-n, n = 1, 2, …
- With this conversion, one lets the n in Hn vary from 0 to N-1. Then n and k vary over exactly the same range.
- With this convention,
- zero frequency: n = 0
- positive frequencies, 0 < f < fc: 1 ≤ n ≤ N/2-1
- negative frequencies, –fc < f < 0: N/2+1 ≤ n ≤ N-1
- the value n = N/2 corresponds to both f = fc and f = -fc

- The DFT has symmetry properties almost exactly the same as the continuous Fourier transform.

- The discrete inverse Fourier transform is
- Proof:

- Parseval’s theorem:
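
The properties listed above can be checked with a direct implementation. The sketch below is not necessarily the convention these slides use: texts differ on the sign of the exponent and on where the 1/N factor goes, so the sign is left as a parameter and the 1/N is placed in the inverse transform as an assumption. The function names are illustrative.

```python
import cmath
import math


def dft(h, sign=-1):
    """Naive O(N^2) DFT: H[n] = sum over k of h[k] * exp(sign * 2*pi*i * k*n / N)."""
    N = len(h)
    return [sum(h[k] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]


def inverse_dft(H, sign=-1):
    """Inverse of dft(): opposite sign in the exponent, and a factor of 1/N."""
    N = len(H)
    return [sum(H[n] * cmath.exp(-sign * 2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]


N = 8
samples = [math.cos(2.0 * math.pi * 2 * k / N) for k in range(N)]   # 2 cycles in 8 samples
spectrum = dft(samples)

# The cosine shows up at n = 2 (positive frequency) and n = N - 2 (negative frequency).
print([round(abs(value), 6) for value in spectrum])

# Parseval's theorem in this normalisation: sum |h|^2 equals (1/N) * sum |H|^2.
print(sum(abs(x) ** 2 for x in samples), sum(abs(X) ** 2 for X in spectrum) / N)
```

Calling `inverse_dft(spectrum)` recovers the original samples to within rounding error, whichever sign is chosen, as long as both functions are given the same one.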

Loudness

- What is loudness? How can it be measured? There are two ways of doing so:
- Measuring the sound pressure level (SPL) of a wave, that is, how much pressure the vibrations of an object exert on the air around it.
- Or measuring the sound intensity level (SIL), that is, how much energy a wave carries through the air.
- So, loudness is the perceptual quality corresponding to changes in SPL or SIL.
- However, loudness is relative: to describe the loudness of one sound we need to reference some other sound. For example, late at night you may turn your music down. After some minutes it seems perfectly loud, but if you listen to it the following morning at the same 'volume' it probably seems very quiet. So, the loudness of one sound is relative to some other sound. This is summarised in the Weber-Fechner Law, which states that
- "the increase in intensity needed to produce a given increase in perceived loudness is proportional to the pre-existing intensity."
- This is similar to the judgement of weight or of the brightness of light.

Sound Intensity = rate of energy flow per unit area, in watts per metre squared

Imagine a spherical sound source radiating the same amount of energy at the same rate in all directions at the same time. That energy is received or 'picked up' by some surface, say our eardrum or a microphone.

[Figure: intensity falls off with distance squared, illustrated with a surface of 1 metre squared at positions A and B]

- The reception of energy is proportional to the receiving surface area, i.e., the bigger the surface area, the more energy will be picked up.
- Intensity falls off with distance squared, i.e., the greater the distance between source and receptor, the lower the intensity will be.
- The smallest sound intensity detectable by the ear is 10^-12 W/m^2; the largest is 10^0 W/m^2 (i.e. 1 W/m^2).
- These two extremes are called the threshold of hearing and the threshold of pain, respectively.
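
For the idealised spherical source described above, the fall-off with distance squared follows from spreading the source's power over the surface of a sphere of radius r, whose area is 4πr^2. The sketch below assumes exactly that idealisation (no reflections, no absorption); the function name and the 1 watt source are illustrative.

```python
import math


def intensity(power_watts, distance_m):
    """Intensity in W/m^2 at distance_m from a source radiating equally in all directions."""
    return power_watts / (4.0 * math.pi * distance_m ** 2)


for r in (1.0, 2.0, 4.0, 8.0):
    print(f"{r:>4.0f} m: {intensity(1.0, r):.5f} W/m^2")
```

Each doubling of the distance divides the intensity by four.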

When a body vibrates with more energy its displacement from equilibrium is greater. So too, then, is the displacement of air molecules. The result is cycles of densely and sparsely packed molecules. The figure below shows a representation of two sines at the same frequency with different amplitudes. We can see that when the amplitude is higher the molecules are more densely packed together and thus create higher pressure. This is what 'sound pressure level' refers to.

So, higher pressure means more densely packed molecules, which means more 'impact' on the ear. (Think of the difference between being hit by a sponge and being hit by a stone; stone = denser material = more pressure = more impact.)

Loudness - Not a Linear Scale!

[Figure: the range of loudness from the threshold of hearing to the threshold of pain. Sound Pressure Level: 2*10^-5 N/m^2 to 2*10^1 N/m^2. Sound Intensity Level: 10^-12 W/m^2 to 10^0 W/m^2.]

The problem with trying to measure loudness is that it is not linear. That means a doubling in intensity or in pressure does not necessarily correspond to a perceived doubling in loudness! Rather, loudness increases logarithmically.

[Figure: loudness plotted against SPL or SIL]

Relationship between SPL and SIL?

Sound intensity is proportional to the square of the pressure amplitude. In other words, as the amplitude doubles the sound intensity is quadrupled. This relationship requires a bit of a mathematical detour, but here it is sufficient to simply remember that when measuring Sound Pressure Level we use the equation:

SPL = 20 log10(P1/P0) (in decibels)

...and when measuring Sound Intensity Level we use the equation:

SIL = 10 log10(I1/I0) (in decibels)
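
The two equations can be tried out on the threshold values quoted earlier. The sketch below assumes the usual choice of reference values, namely the threshold of hearing: P0 = 2*10^-5 N/m^2 for pressure and I0 = 10^-12 W/m^2 for intensity. The function names are illustrative.

```python
import math

P0 = 2e-5    # reference pressure in N/m^2 (threshold of hearing), assumed as the reference
I0 = 1e-12   # reference intensity in W/m^2 (threshold of hearing), assumed as the reference


def spl_db(pressure):
    """Sound pressure level in decibels relative to P0."""
    return 20.0 * math.log10(pressure / P0)


def sil_db(intensity):
    """Sound intensity level in decibels relative to I0."""
    return 10.0 * math.log10(intensity / I0)


print(spl_db(2e1), sil_db(1.0))          # threshold of pain on both scales
print(spl_db(2 * P0), sil_db(4 * I0))    # doubling pressure quadruples intensity
```

The factor of 20 rather than 10 in the pressure equation is what makes the two scales agree, since intensity goes with the square of pressure.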

The Decibel Scale

So, by converting the linear SPL and SIL scales to logarithmic scales we now have an accurate loudness scale. We will see shortly that the relationship between pitch and frequency is also non-linear, so logarithmic scales will reappear there too!

[Figure: perceived loudness plotted against SPL or SIL is non-linear; perceived loudness plotted against log (SPL or SIL) is linear]

Other ways of measuring Loudness?

So far we've seen that there are two ways of measuring loudness: SIL and SPL. However, SPL is the most commonly used scale for loudness measurement. For example, SPL is used to measure the level of noise on a building site, or in a concert hall in the centre of a busy city.

Musicians use the following dynamic markings to indicate loudness;

ppp - quiet as possible

pp - very quiet

p - quiet

f - loud

ff - very loud

fff - loud as possible

crescendo

diminuendo

Loudness Perception

- Now that we've found a way to quantify loudness, what can we tell about our experience of it? Is our perception of loudness the same under all circumstances for all sounds? No! In the following slides we will see that loudness is dependent on:
- Frequency (S)
- Presence of other sounds (S)
- Duration (T)
- Adaptation (T)

- These factors can be divided into Spectral and Temporal characteristics of Loudness, indicated by ’S’ and ’T’ above.

Frequency: Fletcher Munson Curves

The most comprehensive study of loudness perception at different frequencies is shown in the Fletcher Munson Curves. These curves demonstrate the relationship between sound pressure and frequency and the resulting loudness we perceive.

Fletcher Munson Curves

- Each line is called an equal loudness contour (ELC). The reference point for the Fletcher Munson Curves is a 1000 Hz tone at 40dB. The 'question' posed by the graph is
- "by what amount must the SPL be increased for a tone of frequency x before it sounds as loud as a 1000 Hz tone at 40dB?"
- For example, we can see from the graph that a frequency of 20 Hz at an SPL of ~80dB will sound as loud as a frequency of 100 Hz at ~35dB. Staying on the same contour, 1000 Hz will sound as loud at 20dB, etc.
- Important points to note about ELCs are that:
- The smallest amplitude required to match the 1000 Hz reference tone is at ~3000 - 5000 Hz. This indicates that we are more sensitive to frequencies in this range. (This makes sense considering that the range of the human voice roughly corresponds to this range.)
- Higher ELCs, relating to higher amplitudes, are flatter than lower-level ELCs. So, at higher amplitudes audible frequencies are more similar in loudness than at lower amplitudes.
- At low listening levels there is a fall-off of bass frequencies, i.e., amplitude smooths out at higher frequencies.

- The DFT appears to be an O(N^2) process.
- Danielson and Lanczos: a DFT of length N can be rewritten as the sum of two DFTs of length N/2, one formed from the even-numbered points and one from the odd-numbered points.
- We can do the same reduction of Hk^0 (the transform of the even-numbered points) to the transforms of its N/4 even-numbered input data and N/4 odd-numbered data.
- For N = 2^R, we can continue applying the reduction until we have subdivided the data into transforms of length 1.
- For every pattern of log2(N) 0’s and 1’s, there is a one-point transform that is just one of the input numbers hn.
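
The reduction described above can be written as a short recursive function. This is a minimal sketch rather than an optimised FFT: it assumes the input length is a power of two, takes the sign of the exponent as a parameter because conventions differ between texts, and includes a direct O(N^2) transform only so the two results can be compared. The function names are illustrative.

```python
import cmath


def fft(h, sign=-1):
    """Recursive radix-2 transform: split into even and odd points, then recombine."""
    N = len(h)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return list(h)                   # a one-point transform is just the input number
    even = fft(h[0::2], sign)            # transform of the even-numbered points
    odd = fft(h[1::2], sign)             # transform of the odd-numbered points
    result = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / N) * odd[k]
        result[k] = even[k] + twiddle
        result[k + N // 2] = even[k] - twiddle
    return result


def naive_dft(h, sign=-1):
    """Direct O(N^2) sum, used here only to check the recursive version."""
    N = len(h)
    return [sum(h[k] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]


data = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
largest_difference = max(abs(a - b) for a, b in zip(fft(data), naive_dft(data)))
print(largest_difference)   # rounding error only
```

Each level of recursion does O(N) work and there are log2(N) levels, which is where the O(N log N) cost comes from.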

Signal representation on the time-frequency plane

(synthetic /a/)

Sonogram of a speech signal: evolution of the spectrum

Color spectrogram

Formation of the speech signal