Sound – Digital Audio • Waveform • Digital sampling of electrical signal • Analogue to digital conversion • Digital data stores the amplitude of the note • Pitch is frequency of the sound - not specifically digitised • Sound recreated (playback) through sound card and speakers • Digital to analogue conversion
Digital Audio – Sampling • Sample rate is how often the incoming sound wave is measured • Most common sampling rates are multiples: • 11.025 kHz (voice only - telephone quality) • 22.05 kHz (Most frequently used rate) • 44.1 kHz (CD quality — Potentially) • Sampling rates are very important to the quality of the sound • Sound can be sampled as Monaural or Stereo
Sampling Considerations • Human hearing range not more than 20Hz to 20kHz • often only 40Hz to 15kHz in later life • Highest frequencies cannot be recorded at 11kHz sampling rate • Speech needs 4kHz to 8kHz sampling rate • Music needs 22kHz to 44kHz sampling • Too high a volume may incur clipping • Too low a bit depth may cause quantisation to affect the reproduced sound • Small variations in the sound may be too small to record as different sample values
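The reason the highest frequencies cannot be recorded at an 11kHz sampling rate is the Nyquist limit: a sampled signal can only represent frequencies below half the sampling rate. The following is a minimal Python sketch of that rule; the function names are hypothetical and chosen only for illustration.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Upper bound on the frequency a given sampling rate can represent."""
    return sample_rate_hz / 2.0


def can_capture(frequency_hz: float, sample_rate_hz: float) -> bool:
    """True if a tone of this frequency lies below the Nyquist limit."""
    return frequency_hz < nyquist_limit_hz(sample_rate_hz)


# The three sampling rates listed on the earlier slide.
for rate in (11_025, 22_050, 44_100):
    print(f"{rate} Hz sampling represents frequencies below {nyquist_limit_hz(rate):.1f} Hz")
```

On this rule, 11.025 kHz sampling covers the main speech band but not the 15 kHz upper range of hearing, while 44.1 kHz sampling reaches just past 20 kHz.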
Digital Audio – Sampling Size • The sample size is how much information is recorded at each sampling - also known as Bit Depth • The bit depth also influences sound quality • An 8 bit sample = 256 values • A 16 bit sample can store 65,536 values — A huge difference! • 16 bit sampling gives a cleaner waveform with fewer steps
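The value counts above follow from the number of bits per sample: each extra bit doubles the number of distinct amplitude values. A short Python sketch (the function name is an assumption made for this example):

```python
def quantisation_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values a sample of this size can hold."""
    return 2 ** bit_depth


for bits in (8, 16):
    print(f"{bits}-bit sample: {quantisation_levels(bits):,} values")

# How many 16-bit steps fit inside a single 8-bit step.
print(quantisation_levels(16) // quantisation_levels(8))
```

Each 8-bit step is divided into 256 finer steps at 16 bits, which is why the 16-bit waveform looks smoother.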
Quantisation & Clipping • Quantisation is an integral part of the digitising process • It is only a problem when the steps between the discrete values recordable at a particular bit depth are too large to register the changes in the sound • Clipping occurs when the largest recordable value is smaller than the loudest level in the sound being recorded
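Both effects can be seen in a small Python sketch. It assumes samples are floating-point amplitudes in the range -1.0 to 1.0 and are stored as signed integer codes; the `quantise` function is hypothetical and not taken from any particular audio library.

```python
def quantise(sample: float, bit_depth: int) -> int:
    """Convert an amplitude (nominally -1.0..1.0) to a signed integer code.

    Values outside the representable range are clipped to the nearest limit.
    """
    max_code = 2 ** (bit_depth - 1) - 1
    min_code = -(2 ** (bit_depth - 1))
    code = round(sample * max_code)
    return max(min_code, min(max_code, code))


# Two slightly different amplitudes: indistinguishable at 8 bits,
# but recorded as different values at 16 bits.
quiet_change = (0.501, 0.503)
print([quantise(s, 8) for s in quiet_change])
print([quantise(s, 16) for s in quiet_change])

# An input louder than full scale is clipped to the largest code.
print(quantise(1.5, 8))
```

The first line of output shows the small variation being lost to quantisation at 8 bits; the last shows a too-loud input flattened to the maximum value.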
Digital Audio – The Trade-off • Mono, 8 bit, 11 kHz audio • 1 byte × 11,000 × 1 second = 11 KB per second • 11 KB/s × 60 seconds = 660 KB per minute • How much for Stereo and/or 16 bit and/or 44 kHz audio?
Stereo and Mono • Mono, 16 bit, 22 kHz audio • 2 bytes × 22,000 × 1 second = 44 KB per second • 44 KB/s × 60 seconds = 2.64 MB per minute • Stereo, 16 bit, 44 kHz audio • 2 bytes × 44,100 × 1 second × 2 = 176 KB per second • 176 KB/s × 60 seconds = 10.56 MB per minute
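The storage arithmetic on these two slides generalises to any combination of sampling rate, bit depth and channel count. A minimal Python sketch, with hypothetical function names, assuming uncompressed audio and whole-byte sample sizes:

```python
def bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed data rate: samples per second x bytes per sample x channels."""
    return sample_rate_hz * (bit_depth // 8) * channels


def bytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed storage needed for one minute of audio."""
    return bytes_per_second(sample_rate_hz, bit_depth, channels) * 60


# (label, sampling rate, bit depth, channels) for the three slide examples.
examples = [
    ("Mono, 8 bit, 11 kHz", 11_000, 8, 1),
    ("Mono, 16 bit, 22 kHz", 22_000, 16, 1),
    ("Stereo, 16 bit, 44.1 kHz", 44_100, 16, 2),
]
for label, rate, bits, channels in examples:
    per_second = bytes_per_second(rate, bits, channels)
    per_minute = bytes_per_minute(rate, bits, channels)
    print(f"{label}: {per_second:,} bytes/s, {per_minute:,} bytes/min")
```

The slides take 1 KB as 1,000 bytes and round the stereo figure of 176,400 bytes per second down to 176 KB.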
Digital Audio – Compression • Many flavours • ADPCM – 4:1 • Microsoft, Microsoft IMA, Creative • CCITT – 2:1 • A-law and μ-law • Audio MPEG – 20:1
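To get a feel for what these ratios mean in practice, the nominal figures can be applied to an uncompressed data rate. The Python sketch below treats the ratios on the slide as fixed values, which is a simplification: real codecs vary with settings and content. The dictionary and function names are hypothetical.

```python
# Nominal compression ratios taken from the slide (illustrative only).
NOMINAL_RATIOS = {
    "ADPCM": 4,
    "CCITT (A-law / mu-law)": 2,
    "Audio MPEG": 20,
}


def compressed_rate(uncompressed_bytes_per_second: float, ratio: float) -> float:
    """Approximate data rate after compressing at the given ratio."""
    return uncompressed_bytes_per_second / ratio


# Stereo, 16 bit, 44.1 kHz: 2 bytes x 44,100 x 2 channels.
cd_rate = 2 * 44_100 * 2
for name, ratio in NOMINAL_RATIOS.items():
    print(f"{name}: about {compressed_rate(cd_rate, ratio):,.0f} bytes per second")
```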
Advantages and Disadvantages Of Compression • Advantages • Smaller disk storage requirements • Disadvantages • Must be decompressed before use • Decompression can take up to twice the duration of the sound • Supported by good sound cards and specialist sound editing packages
Digital Audio – File Formats • Apple • Audio Interchange File Format – AIFF • .AIF or .AIFF or .AIFC • 8-bit, mono • 8-bit, stereo • 16-bit, mono • 16-bit, stereo • 32-bit, mono • 32-bit, stereo
Apple's sound formats • .AIF files support a range of sampling rates: 8kHz, 11kHz, 22kHz, 44kHz and 48kHz • compression of between 2:1 and 4:1 is available using suitable codecs but causes a reduction in sound quality • .AIFC is AIFF with IMA compression • Sound • .SND
Other platforms • MS-DOS • Voice – a Sound Blaster format • .VOC • SUN • Sun Audio - (NeXT Audio) • .AU • .AU files support only 8kHz, 11kHz and 44kHz sampling rates
Windows • Wave and PCM • Adaptive Differential Pulse Code Modulation – ADPCM • CCITT • All use the .WAV extension (sometimes .PCM for PCM files)
Windows sound formats continued • .WAV files support a range of sampling rates 8kHz, 11kHz, 22kHz, 44kHz and 48kHz • also a version with Microsoft’s own compression algorithm • Can exceed CD quality • Higher quality – Greater storage penalty
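The sampling rate, sample size and channel count of an uncompressed PCM .WAV file are stored in its header and can be inspected directly. The Python sketch below uses the standard-library `wave` module to write a short test tone into memory and read its parameters back; the helper names `make_tone_wav` and `describe_wav` and the default tone settings are assumptions made for this example.

```python
import io
import math
import struct
import wave


def make_tone_wav(frequency_hz: float = 440.0, duration_s: float = 0.1,
                  sample_rate_hz: int = 22_050) -> bytes:
    """Build a mono, 16-bit PCM WAV file in memory containing a sine tone."""
    n_frames = round(duration_s * sample_rate_hz)
    frames = bytearray()
    for n in range(n_frames):
        value = math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        frames += struct.pack("<h", int(value * 32767))  # little-endian 16-bit
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def describe_wav(data: bytes) -> tuple:
    """Return (channels, bit depth, sampling rate, frame count) from the header."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return (wav.getnchannels(), wav.getsampwidth() * 8,
                wav.getframerate(), wav.getnframes())


print(describe_wav(make_tone_wav()))
```

Note that the `wave` module handles uncompressed PCM only, so this does not cover the compressed WAV variants mentioned above.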
Windows Media Audio • A streaming audio format • Designed for network transfer and play-before-download replay • Available for UNIX, Mac and Windows • Formats: • .asf, .wma, .wmv • wide range of quality options
Real Audio • A streaming audio format • Designed for network transfer and play-before-download replay • Available for UNIX, Mac and Windows • Formats: • .RA • also as part of .RM, .RAM • wide range of quality options
Other Streaming Formats • In addition there are other streaming formats including: • LiveAudio - .LA • LiquidAudio - .LQT • Streamworks - .MPA • Shockwave Audio - .SWA • The players for many of these can also play non-streaming audio • Most streaming formats deliver mono sound at 8kHz or less
MP3 • MPEG Audio Layer-3 • In 1987, the IIS started to work on perceptual audio coding in the framework of the EUREKA project EU147, Digital Audio Broadcasting (DAB). • In a joint co-operation with the University of Erlangen (Prof. Dieter Seitzer), the IIS finally devised a very powerful algorithm that is standardised as ISO-MPEG Audio Layer-3 (IS 11172-3 and IS 13818-3) • MPEG2
MP3 cont 2 • Without data reduction, digital audio signals typically consist of 16 bit samples recorded at a sampling rate more than twice the actual audio bandwidth (e.g. 44.1 kHz for Compact Discs). • So you end up with more than 1.4 Mbit to represent just one second of stereo music in CD quality. • By using MPEG audio coding, you may shrink down the original sound data from a CD by a factor of 12, without losing sound quality.
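The figures on this slide can be checked with a few lines of arithmetic. The Python sketch below computes the uncompressed CD bit rate and the rate left after reduction by a given factor; the function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


def reduced_bit_rate(bit_rate: float, factor: float) -> float:
    """Bit rate after data reduction by the given factor."""
    return bit_rate / factor


cd_bit_rate = pcm_bit_rate(44_100, 16, 2)
print(f"CD audio: {cd_bit_rate:,} bit/s")
print(f"Reduced by 12: {reduced_bit_rate(cd_bit_rate, 12):,.0f} bit/s")
```

CD audio works out at 1,411,200 bits per second, and a factor of 12 brings that down to 117,600 bits per second for the stereo pair.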
MP3 Cont 3 • Factors of 24 and even more still maintain a sound quality that is significantly better than what you get by just reducing the sampling rate and the resolution of your samples. • Basically, this is realised by perceptual coding techniques addressing the perception of sound waves by the human ear.
Typical Data Reduction Using MPEG Audio • still maintaining the original CD sound quality.
MP3 cont 4 • By exploiting stereo effects and by limiting the audio bandwidth, the coding schemes may achieve an acceptable sound quality at even lower bit rates. • MPEG Layer-3 is the most powerful member of the MPEG audio coding family. • For a given sound quality level, it requires the lowest bit rate - or for a given bit rate, it achieves the highest sound quality.
Typical Performance Data Of MPEG Layer-3 • * Fraunhofer uses a non-ISO extension of MPEG Layer-3 for enhanced performance (MPEG 2.5)
MP3 Sound Quality • In all international listening tests, MPEG Layer-3 impressively proved its superior performance, maintaining the original sound quality at a data reduction of 1:12 (around 64 kbit/s per audio channel). • If applications may tolerate a limited bandwidth of around 10 kHz, a reasonable sound quality for stereo signals can be achieved even at a reduction of 1:24.
MP3 sound quality cont 2 • For the use of low bit-rate audio coding schemes in broadcast applications at bit rates of 60 kbit/s per audio channel, the ITU-R recommends MPEG Layer-3. (ITU-R doc. BS.1115) • For more information take a look at the Fraunhofer Layer-3 FAQ at http://www.fhg.de/layer3faq/index.html. • However, Netscape Navigator (NN) and Internet Explorer (IE) do not offer support for MP3 yet
Digital Audio • Creation & Modification • Apple – Passport’s Alchemy • Windows’ Sound Recorder • Sound Card or MM package supplied utilities – Creative’s Wave Studio • Microsoft’s Wave Edit • Playback • Any editor • Windows’ Media Player • Most MM authoring packages • Many browser plug-ins
Digital Audio – Considerations • Can record speech • Can record complex noises • Can exceed CD quality • Higher quality – Greater storage penalty • Easily manipulated • Difficult to change inherent sound
Sound with Animation and Video • Wave recording may be linked to animations • Wave recordings may be incorporated into video clips • Wave recordings may be extracted from video clips
Rules of Thumb: Digital Audio • Record at the highest practical bit depth and sampling frequency • Reducing quality after recording gives better results than recording at lower quality • Use the lowest resolution that gives the required results • “CD quality” stereo is 16 bit, 44.1 kHz • i.e., 2 bytes × 44,100 × 2 channels = 176 KB per second! • Not all sound cards can handle the fidelity properly • Test your content at various sampling rates
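Reducing quality after recording can be pictured with a deliberately crude Python sketch. It assumes the recording is a list of signed 16-bit integer samples; the two functions are hypothetical, and real audio editors use proper filtering and dithering rather than the simple shift and pair-averaging shown here.

```python
def reduce_bit_depth_16_to_8(samples: list) -> list:
    """Keep only the top 8 bits of each signed 16-bit sample."""
    return [s >> 8 for s in samples]


def halve_sample_rate(samples: list) -> list:
    """Average each pair of neighbouring samples (a crude low-pass and decimate)."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]


recording = [0, 1000, 2000, 3000, 32767, -32768]
print(reduce_bit_depth_16_to_8(recording))
print(halve_sample_rate(recording))
```

Because the high-quality original is kept, the same recording can be reduced to several target qualities and compared, which is the testing the last bullet recommends.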
Audio for MM and Web • Director can import: • SWA (via an xtra till v 7), AIFF, AIFC, WAVE (but not with Microsoft’s compression), AU (via an xtra) • Optimising audio for the web • keep it short • mono rather than stereo • sample at 8-bit rather than 16-bit • sample at 8kHz/11kHz for noises or speech and 22kHz for music
Adding Non-Streaming Audio to a Web Page • There are four ways to do this: • Use a normal link: <A HREF="audio/music.wav">Play the music.</A> • the result may be the sound just plays when the link is followed • a plug-in player may open as a Web page • a helper application may open in a separate window
Basic embedding • Use a BGSOUND in IE: <BGSOUND SRC="midi/music.wav" LOOP=infinite> • the result will be the sound plays when the page is opened • Use the OBJECT tag in IE to use an ActiveX control to play the sound, e.g.: <OBJECT ID=pop CLASSID="clsid:0FC6BF2B-E16A-11CF-AB2E-0080AD08A326" HEIGHT=60 WIDTH=145><PARAM NAME="song" VALUE="midi/music.wav"></OBJECT> • the result will be an ActiveX control opens at the specified size in-line in the Web page
Basic embedding 2 • Use the EMBED tag to use a plug-in to play the sound, e.g.: <EMBED SRC="audio/music.wav" CONTROLS="console" HEIGHT=60 WIDTH=145 AUTOSTART="false" LOOP="false"></EMBED> • the result will be a plug-in player (LiveAudio in NN 3 or later) opens at the specified size in-line in the Web page
Adding Streaming Audio to a Web Page • Use a normal link: <A HREF="audio/music.ram">Play the music.</A> • the plug-in player opens • however, the file linked to may be a reference file rather than the actual sound file • the reference file contains details of the actual audio file • Refer to the specific streaming audio documentation for details
Scripts • The important features of the Script depend on the type of sound • For a recorded sound: • The activities that produce the required noise • Background noises (if desired) • The stereo effects required • The volume required • The duration
For music: • The Form • The Melody • The Harmony • The Tempo • N.B. These are all terms used in music criticism – if you are unfamiliar with them you may need a third party who knows them, a music producer for example, to liaise with the musician on your behalf • Major to Minor key changes and their exact timing to synchronise with other on-screen events • The moods to be matched and their timings • The sound card standard to be used for playback