
Digital Audio Compression



Presentation Transcript


  1. Digital Audio Compression

  2. Formats • There are many different formats for storing and communicating digital audio: • CD audio • Wav • Aiff • Au • MP3

  3. The Storage Problem • CD quality recording • 44,100 Hz sampling rate • 16-bit quantization • 2 channels (stereo) • 176.4 Kbytes per second • 1 minute is ~10.6 MBytes • 74 minutes is ~780 MB
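
The figures on this slide follow directly from the three recording parameters. A minimal sketch of the arithmetic, assuming decimal units (1 KB = 1000 bytes, 1 MB = 1,000,000 bytes); the function and variable names are illustrative only:

```python
# CD-quality parameters taken from the slide.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

def bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

rate = bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
per_minute_mb = rate * 60 / 1_000_000        # decimal megabytes (assumption)
per_disc_mb = rate * 60 * 74 / 1_000_000     # a 74-minute disc

print(f"{rate / 1000:.1f} KB/s")               # 176.4 KB/s
print(f"{per_minute_mb:.1f} MB per minute")    # 10.6 MB per minute
print(f"{per_disc_mb:.0f} MB per 74 minutes")  # 783 MB per 74 minutes
```

The results match the rounded values quoted on the slide.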

  4. Psychoacoustics • The study of the psychological and physiological principles of sound perception • CDs try to accurately reproduce the original audio signal • But we do not hear all of this signal • The parts that we don’t hear are redundant • If we remove these parts we can store the signal using less data but without affecting the perceived sound

  5. Threshold of Hearing & Masking • The threshold of hearing curve describes the minimum level at which the ear can detect a tone at a given frequency • [Figure: Fletcher-Munson curves]
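
To get a feel for the shape of the threshold curve, the sketch below evaluates a commonly used closed-form approximation of the threshold in quiet (usually attributed to Terhardt). The formula is not taken from the slides; it is an assumption brought in for illustration, and real coders may use tabulated or differently fitted curves.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate threshold of hearing in dB SPL (Terhardt-style fit, assumed)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# The ear is most sensitive around 3-4 kHz and much less so at the extremes.
for freq in (100, 1000, 3500, 10000, 16000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

A tone below the returned level at its frequency would, under this model, be inaudible even with no other sound present.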

  6. Amplitude Masking • Amplitude masking occurs when a tone shifts the threshold curve upwards in the frequency region that surrounds it

  7. Critical Band • Hair cells on the basilar membrane respond to the strongest stimulation in their local region • This local region is called the critical band • Critical bands are narrower for low-frequency signals than they are for high-frequency signals
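
The widening of critical bands with frequency can be illustrated with a standard analytic approximation (the Zwicker and Terhardt fit). This formula is an assumption added for illustration and does not come from the slides:

```python
def critical_bandwidth_hz(freq_hz):
    """Approximate critical bandwidth in Hz at a given centre frequency.

    Uses the Zwicker/Terhardt fit (assumed here, not from the slides).
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return 25.0 + 75.0 * (1.0 + 1.4 * f ** 2) ** 0.69

# Bands stay close to 100 Hz wide at low frequencies and grow rapidly above 1 kHz.
for freq in (100, 500, 1000, 4000, 10000):
    print(f"{freq:>6} Hz: ~{critical_bandwidth_hz(freq):.0f} Hz wide")
```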

  8. Critical Bands

  9. Amplitude Masking & Thresholds

  10. Temporal Masking • Masking can also occur when tones are sounded at slightly different times • Premasking – signal A is masked by signal B which occurs later • Postmasking – signal A is masked by signal B which ends before signal A has started • Temporal masking becomes stronger as the time difference shrinks

  11. Temporal Masking

  12. Masking • Amplitude and temporal masking form a masking area in the time-frequency domain

  13. Perceptual Coding • Perceptual coders analyse the frequency and amplitude content of the input signal and compare it to a model of human auditory perception • Parts of the input signal which are inaudible are removed

  14. Perceptual Coding • A perceptual coder uses a digital filter bank to split a short duration of audio signal into multiple frequency bands

  15. Perceptual Coding • The coder analyses the energy in each of these subbands to determine which subbands contain audible information • Subbands which are not audible are not coded

  16. Perceptual Coding • Quantization bits are assigned according to signal strength above the audibility curve
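
The last two slides (drop inaudible subbands, then assign bits by level above the audibility curve) can be sketched as a toy bit allocator. It assumes the rule of thumb that each quantization bit buys about 6 dB of signal-to-noise ratio; the subband levels and thresholds are hypothetical, and real coders use more elaborate iterative allocation.

```python
import math

DB_PER_BIT = 6.02  # rule of thumb: each bit adds roughly 6 dB of SNR

def allocate_bits(signal_db, threshold_db, max_bits=16):
    """Return bits per subband so quantization noise stays below the threshold."""
    bits = []
    for level, mask in zip(signal_db, threshold_db):
        smr = level - mask  # signal-to-mask ratio in dB
        if smr <= 0:
            bits.append(0)  # inaudible subband: not coded at all
        else:
            bits.append(min(max_bits, math.ceil(smr / DB_PER_BIT)))
    return bits

# Hypothetical levels for four subbands (dB), purely illustrative.
signal_db = [60.0, 35.0, 20.0, 48.0]
threshold_db = [30.0, 40.0, 14.0, 12.0]
allocation = allocate_bits(signal_db, threshold_db)
print(allocation)  # [5, 0, 1, 6]
```

The second subband lies below its masking threshold and receives no bits, while the others get only as many bits as are needed to push the quantization noise under the threshold.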

  17. Perceptual Coding • The purpose of perceptual coding is to reduce the data rate • Perceptual coders maintain the sampling frequency but selectively decrease word length • A coder's reduction ratio is the ratio of input bit rate to output bit rate • Ratios of up to 6:1 are often transparent
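
As a worked example of the reduction ratio, the sketch below starts from the CD-quality bit rate and shows the output rate at 6:1, plus the ratio implied by a 128 kbit/s target (the example bitrate mentioned later in the deck). The function names are illustrative:

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def reduction_ratio(input_bps, output_bps):
    """Ratio of input bit rate to output bit rate."""
    return input_bps / output_bps

cd_bps = pcm_bit_rate(44_100, 16, 2)   # 1,411,200 bit/s
rate_at_6_to_1 = cd_bps / 6            # output rate for a 6:1 reduction
ratio_at_128k = reduction_ratio(cd_bps, 128_000)

print(f"{rate_at_6_to_1 / 1000:.1f} kbit/s at 6:1")   # 235.2 kbit/s at 6:1
print(f"{ratio_at_128k:.1f}:1 at 128 kbit/s")         # 11.0:1 at 128 kbit/s
```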

  18. Perceptual Coding • Because the inaudible content of the signal is removed, the playback system’s ability to convey audible music should improve • In theory it is possible to get better reproduction after perceptual coding than the original! (In theory…) • Perceptual coders more properly code an audio signal for passage through an audio system

  19. MP3 • MPEG-1 Audio Layer 3 • Developed to support audio coding for playback with video • Uses: • A filterbank producing 32 subbands from 24 ms of audio data • Perceptual coder originally produced by the Fraunhofer-Institut für Integrierte Schaltungen • Lossless Huffman coding
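
The final, lossless stage can be illustrated with a generic Huffman coder. This is a sketch of the principle only: it builds a code table from the data itself, whereas MP3 selects from predefined tables. The sample values are hypothetical quantized coefficients.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical quantized values: mostly zeros, as is typical after quantization.
samples = [0, 0, 0, 0, 1, 0, -1, 0, 0, 2, 0, 1]
code = huffman_code(samples)
bits = encode(samples, code)
print(len(bits), "bits, versus", 2 * len(samples), "with a fixed 2-bit code")
```

Frequent values get short codes, so the 12 samples take 18 bits instead of 24, and decoding recovers them exactly.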

  20. MP3

  21. MP3 • Sound quality is highly dependent on the performance of the encoder • Most encoders use constant-bitrate (CBR) encoding. In this mode you choose a target bitrate (e.g. 128 kbit/s) • Codecs • Fraunhofer • Xing MP3 encoder • Etc…

  22. Joint Stereo Coding • Takes advantage of interchannel redundancy between stereo channels • Some sounds and some components are equal in both channels • Low frequencies: Bass instruments, strings, low components of drums • Centrally placed signals: typically vocals • Removing duplication reduces data without affecting perceived sound
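
One common way to exploit this redundancy is mid/side coding, where the two channels are replaced by their average and half their difference. The slides do not say which joint stereo method is meant, so treat this as an assumed illustration with hypothetical sample values:

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Recover the left/right samples from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical samples: identical in both channels except the third one.
left = [0.5, 0.25, -0.75, 0.125]
right = [0.5, 0.25, -0.5, 0.125]

mid, side = to_mid_side(left, right)
print(side)  # [0.0, 0.0, -0.125, 0.0]
```

Where the channels agree, the side signal is zero and costs almost nothing to code, while `from_mid_side` still restores both channels.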

  23. Fin
