Digital Media Lecture 12: Additional Audio Georgia Gwinnett College School of Science and Technology Dr. Jim Rowan
Audio & Illusions • Can you hear this? • “mosquito ring tone” • http://www.freemosquitoringtones.org/hearing_test/ • Audio illusion: “Creep” • http://www.youtube.com/watch?v=ugriWSmRxcM
The nature of sound First, a video from ted.com http://www.wimp.com/howsound/
Other related video #1 How to use visualizations of human speech and music to explain computation: http://www.youtube.com/watch?v=mGc6clf_Wt4&feature=bf_prev&list=PL278ECDA0705DAF3DS
Other related video #2 David Byrne on how the venue shapes the form of the music performed: http://www.ted.com/talks/lang/en/david_byrne_how_architecture_helped_music_evolve.html
The nature of sound • Three classes of audio that we will discuss • 1) Environmental sound (sounds found in the environment) • 2) Music • 3) Speech
The nature of sound • Environmental sounds • Provide information about the surroundings the listener is currently in • Music and Speech • Functionally and uniquely different from other sounds • Music • Carries a cultural status • Can be represented by non-sound: MIDI • Can be represented by a musical score • Speech • Linguistic content • Lends itself to special compression
And it’s complicated… • Converting energy to vibrations and back • Transported through some medium • Either air or some other compressible medium • Consider speech • Starts as an electrical signal (brain & nerves) • Ends as an electrical signal (brain & nerves) • But…
No… it’s REALLY complicated… http://en.wikipedia.org/wiki/Ear • Starts as an electrical signal (brain & nerves) ==> • Muscle movement (vocal cords) • Vibrates a column of air sending out a series of compression waves in the air • Compression waves cause ear membrane to vibrate ==> • Moves 3 tiny bones ==> • Causes waves in the liquid in the inner ear ==> • Bends tiny hair cells immersed in the liquid ==> • When bent they fire ==> • Sends electrical signals to the cerebral cortex • Processed by the temporal cortex
Audio Illusions • Audio creep… • Play a 200 Hz pure tone • Softly at first • Gradually increase the volume • Most listeners will report that the tone drops in pitch as the volume increases • Play a 2000 Hz pure tone • Softly at first • Gradually increase the volume • Most listeners will report that the tone rises in pitch as the volume increases
Why do you think… • You can’t tell where some sounds come from (like some alarms for instance) • You only need one subwoofer when you need at least two speakers for everything else • You can’t tell where sound is coming from underwater • Two things running at nearly the same speed make a “beating” sound
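The “beating” question can be explored numerically. This is a minimal sketch, assuming two equal-amplitude pure tones; the function names and the 440 Hz / 444 Hz figures are illustrative choices, not values from the lecture. Adding two tones that are close in frequency gives a sound whose loudness pulses at the difference between the two frequencies.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Rate (in Hz) at which the loudness of the combined tone pulses."""
    return abs(f1_hz - f2_hz)

def summed_tones(f1_hz, f2_hz, t_seconds):
    """Instantaneous value of two equal-amplitude sine tones added together."""
    return (math.sin(2 * math.pi * f1_hz * t_seconds)
            + math.sin(2 * math.pi * f2_hz * t_seconds))

def envelope(f1_hz, f2_hz, t_seconds):
    """Slowly varying loudness envelope of the sum: |2 cos(pi (f1 - f2) t)|."""
    return abs(2 * math.cos(math.pi * (f1_hz - f2_hz) * t_seconds))

# Two hypothetical sources, 4 Hz apart: the loudness swells 4 times per second.
print(beat_frequency(440, 444))
```

The envelope follows from the identity sin a + sin b = 2 sin((a+b)/2) cos((a-b)/2): the fast sine is the tone you hear, the slow cosine is the beat.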
Why do you think… (cont) • With your eyes closed you can’t tell whether a sound is in front of you or behind you • You hear sound that isn’t there (tinnitus) • Phantom sounds • Heard… but not there • Masking sounds • Not simply drowning them out • Can mask a sound that occurs before the masking sound actually starts
Why do you think… (cont) • You can hear your name in a noisy room • Cocktail party effect • http://en.wikipedia.org/wiki/Cocktail_party_effect • Still very much a subject of research
Why? It’s complicated! • http://en.wikipedia.org/wiki/Psychoacoustics • Psychoacoustics • The study of human sound perception • The study of the psychological and physiological effects of sound
Why? It’s complicated! • Sound is a physical phenomenon that is interpreted through the human perceptual system • Wavelength affects stereo hearing • The distance between your ears is related to the wavelength • Speed of sound affects stereo hearing • The faster the sound travels, the wider apart your ears need to be • You can tell where a sound comes from if • the wavelength is long enough and • the speed that sound travels is slow enough to allow the waves to arrive at your ears at different times
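The two quantities on this slide, arrival-time difference and wavelength, can be put into rough numbers. This is a minimal sketch under stated assumptions: a simplified straight-line path model, an ear spacing of 0.2 m, and approximate speeds of sound (343 m/s in air, 1480 m/s in water). None of these figures come from the lecture.

```python
import math

SPEED_OF_SOUND_AIR = 343.0     # m/s, approximate, room-temperature air
SPEED_OF_SOUND_WATER = 1480.0  # m/s, approximate, fresh water
EAR_SPACING = 0.2              # m, assumed round figure

def arrival_time_difference(angle_degrees, speed, ear_spacing=EAR_SPACING):
    """Simplified model: extra travel distance d*sin(angle), divided by speed.

    An angle of 0 means straight ahead; 90 means directly to one side.
    """
    return ear_spacing * math.sin(math.radians(angle_degrees)) / speed

def wavelength(frequency_hz, speed):
    """Wavelength in metres of a tone travelling at the given speed."""
    return speed / frequency_hz

air_delay = arrival_time_difference(90, SPEED_OF_SOUND_AIR)
water_delay = arrival_time_difference(90, SPEED_OF_SOUND_WATER)
print(air_delay, water_delay)  # the underwater delay is several times smaller
```

In this model the underwater delay shrinks by the ratio of the two speeds, which is one way to think about why direction is hard to judge underwater.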
Processing audio • How can we characterize sound? • Amplitude • Frequency • Time • Waveform displays • Summed amplitude of all frequencies & time • Amplitude & frequency components at one point in time • Amplitude & frequency & time
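The “amplitude & frequency components at one point in time” view can be computed from a waveform. This is a minimal sketch using a naive discrete Fourier transform in plain Python; the helper names and the 4 Hz / 10 Hz test signal are illustrative assumptions, and real audio tools use a much faster FFT.

```python
import cmath
import math

def sine_samples(frequency_hz, amplitude, sample_rate, count):
    """Return `count` samples of a sine wave."""
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

def amplitude_spectrum(samples):
    """Naive DFT: amplitude of each frequency bin from 0 up to half the sample rate."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        total = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        # Bins other than 0 and n/2 have a mirror-image bin, so double them.
        scale = 1 if (k == 0 or 2 * k == n) else 2
        amplitudes.append(scale * abs(total) / n)
    return amplitudes

SAMPLE_RATE = 64  # samples per second; with 64 samples, bin k corresponds to k Hz
mixed = [a + b for a, b in zip(sine_samples(4, 1.0, SAMPLE_RATE, 64),
                               sine_samples(10, 0.5, SAMPLE_RATE, 64))]
spectrum = amplitude_spectrum(mixed)  # peaks at bins 4 and 10, near zero elsewhere
```

The list `mixed` is the waveform view (summed amplitude over time); `spectrum` is the frequency view of the same sound. Repeating the transform on successive short windows would give the amplitude, frequency and time view.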
Croak! Play Croak!
The sonogram, a snapshot of frequency Croak! Play Croak!
Another way to show audio: frequency density across time • Slim Pickens from Dr. Strangelove
Croak! Play Croak!
More examples… Pure sine wave G, E, C Bassoon playing the same notes
Waveform & time G C E
Sonogram G C E
Digitized audio • As we have seen earlier this semester • Sample rate & quantization level • Reduction in sample rate is less noticeable than reducing the quantization level • Jitter is a problem • Slight changes in timing cause problems • Frequencies above 20 kHz? • Though they can’t be heard, they show up as aliases when the signal is reconstructed
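The aliasing point can be checked with a little arithmetic. This is a minimal sketch, assuming a 44,100 Hz sample rate and a 25,000 Hz tone (both illustrative choices); the function names are hypothetical. A tone above half the sample rate produces exactly the same samples, apart from sign, as a lower, audible tone.

```python
import math

def alias_frequency(frequency_hz, sample_rate):
    """Frequency a sampled tone appears to have once it has been sampled."""
    nearest_multiple = math.floor(frequency_hz / sample_rate + 0.5) * sample_rate
    return abs(frequency_hz - nearest_multiple)

def sampled_sine(frequency_hz, sample_rate, count):
    """Samples of a unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]

SAMPLE_RATE = 44100
inaudible = sampled_sine(25000, SAMPLE_RATE, 50)
audible = sampled_sine(alias_frequency(25000, SAMPLE_RATE), SAMPLE_RATE, 50)
# Each sample of the 25 kHz tone is the negative of the alias tone's sample,
# so after sampling the two cannot be told apart except for phase.
```

This is why recordings are usually low-pass filtered before sampling: once the samples are taken, the alias cannot be separated from a real tone at that frequency.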
Audio Dithering is Weird… add noise… get better sounding result?!? • Add random noise to the original signal • This noise causes rapid transitioning between the few quantized levels • Makes audio with few quantization levels seem more acceptable
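A small experiment shows why adding noise can help. This is a minimal sketch, assuming a constant quiet value of 0.3, a quantization step of 1, and uniform random noise one step wide; all of these are illustrative choices. Without dither the quiet value always rounds to the same level and is lost; with dither the output flips between two levels and its average recovers the original value.

```python
import math
import random

def quantize(value, step):
    """Round a value to the nearest multiple of `step`."""
    return step * math.floor(value / step + 0.5)

def quantize_with_dither(value, step, rng):
    """Add uniform random noise one step wide, then quantize."""
    noise = rng.uniform(-step / 2, step / 2)
    return quantize(value + noise, step)

rng = random.Random(0)   # fixed seed so the run is repeatable
quiet_value = 0.3        # sits between the quantization levels 0 and 1
plain = [quantize(quiet_value, 1.0) for _ in range(10000)]
dithered = [quantize_with_dither(quiet_value, 1.0, rng) for _ in range(10000)]
plain_average = sum(plain) / len(plain)            # stuck at the lower level
dithered_average = sum(dithered) / len(dithered)   # close to quiet_value
```

The ear does something like this averaging, which is why the noisy version can sound closer to the original than the clean but coarsely quantized one.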
Audio processing terms to know • Clipping • …but you don’t know how high the amplitude will be before the performance is recorded • Noise gate • has an amplitude threshold • Notch filter • removes 60-cycle (60 Hz) hum • Low pass filter • High pass filter • Time stretching (or shrinking… Limbaugh) • Pitch alteration • Envelope shaping (modifying attack)
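Two of these terms are simple enough to sketch in a few lines. This is a deliberately simplified illustration that works sample by sample; the function names are hypothetical, and a real noise gate follows the signal's overall level with attack and release times rather than testing each sample on its own.

```python
def clip(samples, limit):
    """Clipping: any sample beyond +/- limit is flattened to the limit."""
    return [max(-limit, min(limit, s)) for s in samples]

def noise_gate(samples, threshold):
    """Noise gate (simplified): samples quieter than the threshold are silenced."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

too_loud = [0.2, 1.5, -2.0]
print(clip(too_loud, 1.0))        # the peaks are cut off at the limit

hissy = [0.01, -0.5, 0.03, 0.8]
print(noise_gate(hissy, 0.05))    # the quiet hiss samples become silence
```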
What these filters look like: High pass filter
What these filters look like: Low pass filter
What these filters look like: Notch filter
One thing about humans… • We can actively “filter out” what we don’t want to hear • remember the cocktail party effect? • Over time we don’t hear the pops and snaps of a vinyl record • Have you ever recorded something that you thought would be good only to play it back and hear the air conditioner or traffic roaring in the background? • A piece of software can’t do this… • …not yet anyway!
Compressing sound: Voice • Remove silence • Similar to RLE • Non-linear quantization • “companding” • Quiet sounds are represented in greater detail than loud ones
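Companding can be illustrated with mu-law, one common scheme used in telephony. This is a minimal sketch: the choice of mu-law and the value mu = 255 are assumptions made for illustration, not something stated on the slide. Samples are assumed to lie between -1 and 1.

```python
import math

MU = 255  # value used in 8-bit mu-law telephony; assumed here for illustration

def compress(x, mu=MU):
    """Mu-law compression: stretches quiet samples, squeezes loud ones."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress(): restores the original sample value."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

quiet = 0.01
print(compress(quiet))           # a much larger value than 0.01
print(expand(compress(quiet)))   # back to approximately 0.01
```

Because quiet samples are spread over a large part of the output range before quantization, they end up with more quantization levels than loud ones, which is the “greater detail” the slide refers to.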
Compressing sound: Voice • Differential Pulse Code Modulation (DPCM) • Related to temporal (inter-frame) video compression • It predicts what the next sample will be • It sends the difference between the prediction and the actual sample rather than the absolute value • Not as effective for sound as it is for images • Adaptive DPCM (ADPCM) • Dynamically varies the sample step size • Large differences are encoded using large steps • Small differences are encoded using small steps
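The predict-and-send-the-difference idea can be sketched with the simplest possible predictor: assume the next sample equals the previous one. This is a lossless toy version with hypothetical function names; practical DPCM also quantizes the differences, and the adaptive step size of ADPCM is not shown.

```python
def dpcm_encode(samples):
    """Predict each sample as the previous one and keep only the difference."""
    differences = []
    prediction = 0
    for sample in samples:
        differences.append(sample - prediction)
        prediction = sample
    return differences

def dpcm_decode(differences):
    """Rebuild the samples by adding each difference to the running prediction."""
    samples = []
    prediction = 0
    for difference in differences:
        prediction += difference
        samples.append(prediction)
    return samples

original = [100, 102, 105, 105, 103]
encoded = dpcm_encode(original)   # [100, 2, 3, 0, -2]
assert dpcm_decode(encoded) == original
```

After the first value, the differences are small numbers, so they can be stored in fewer bits than the original samples.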
Sound compression that is based on perception • The idea is to remove what doesn’t matter • Based on the psychoacoustic model • Threshold of hearing • Remove sounds too low to be heard • High and low frequencies not as important (for voice)