Multimedia Systems Lecture 3 – Digital Audio Representation

What Is Sound?

  • Sound is the brain's interpretation of electrical impulses sent by the inner ear through the nervous system. There are some sounds the human ear cannot perceive: those with a very high or very low frequency.

  • You can use sound in a multimedia project in two ways. In fact, all sounds fall into two broad categories:

    • Content sound: provides information to audiences, for example, dialogs in movies or theater.

    • Ambient sound: consists of an array of background and sound effects.

How Do We Hear?

  • When an object moves back and forth (vibrates), it pushes the air immediately next to it a bit to one side and, when coming back, creates a slight vacuum. This process of oscillation creates a wave.

  • Your voice sounds different in a tape recording than it sounds to you. This is because the sound waves you hear from inside your body travel through the bones, cartilage, and muscles between your voice box and your inner ear. Sounds from tape recorders (and other people) travel through the air to reach your eardrum, and thus sound different.

Characteristics of Sound

  • Amplitude

  • Wavelength (w)

  • Frequency (f)

  • Timbre

  • Hearing: [20 Hz – 20 kHz]

  • Speech: [200 Hz – 8 kHz]

Doppler Effect

  • Why does the horn of an approaching car sound high-pitched as it comes closer, yet suddenly drop in pitch when it moves away?

    • As a car and its horn move toward you, the pushes of sound (the sound waves) get crammed together, which makes them higher pitched. When the car and the horn move away from you, the sound waves are spread further apart, which makes a lower-pitched sound.

    • This is called the Doppler effect.

Discrete vs. Continuous Form

  • All multimedia elements have to be in digital format. In contrast, other media such as TV programs and films are analog in nature.

  • A line drawn on a computer screen is discrete. Since the pixels on the screen are very close to each other, our eyes cannot tell the difference and we perceive a continuous line.

  • The plants and trees that we see around us are continuous, but their digital pictures are forced to be discrete. Nevertheless, if we include enough data in our digital representation, our eyes cannot tell the difference.

Digital Audio

  • The sound heard by the ear (also called audio) is analog in nature and is a continuous waveform.

  • Acoustic instruments produce analog sounds.

  • A computer needs to convert the analog sound wave into its digital representation, consisting of discrete numbers.

Digital Representation of Audio

  • The waveform must be converted to digital form in three steps:

    • Sampling

    • Quantization

    • Compression

1- Sampling

  • Sampling is the reduction of a continuous signal to a discrete signal.

  • A sample refers to a value or set of values at a point in time and/or space.

  • A sampler is a subsystem or operation that extracts samples from a continuous signal.

  • A common example is the conversion of a sound wave (a continuous signal) to a sequence of samples (a discrete-time signal).
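As a rough illustration of turning a continuous sound wave into a sequence of samples, the sketch below evaluates a sine wave at evenly spaced instants. The function name `sample_sine` and the chosen tone, rate, and duration are assumptions made for this example only.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Return samples of a sine wave taken every 1 / sample_rate_hz seconds."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

# A 440 Hz tone sampled 8000 times per second for 10 ms.
samples = sample_sine(freq_hz=440.0, sample_rate_hz=8000.0, duration_s=0.01)
print(len(samples))  # 80 samples cover 10 ms at 8000 samples per second
```

Each list entry is one sample: the height of the waveform at one instant. Everything between two instants is discarded.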

Sampling - Example

Signal sampling representation. The continuous signal is represented with a green color whereas the discrete samples are in blue.

Sampling Rate

  • When sampling a sound, the computer takes snapshots of the waveform. The frequency of these snapshots is called the sampling rate. The rate typically varies from 5,000 to 90,000 samples per second.

  • Example: Your mother is scolding you for breaking her precious vase. Your sister hears only bits of the conversation because she is not interested in the matter. Later you ask your sister if the scolding was justified, and she replies that she did not listen to the whole conversation. This is because she sampled the voices at very wide intervals, that is, at a very low sampling rate.


  • Digitization is the process of assigning a discrete value to each of the sampled values. It is performed by an integrated circuit (IC) called an analog-to-digital (A/D) converter. In 8-bit digitization, this value is between 0 and 255. In 16-bit digitization, it is between 0 and 65,535.

  • The process of digitization introduces noise into a signal. The amount of noise is related to the number of bits per sample: the more bits used to store each sampled value, the more accurate the sample and the lower the noise.
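The mapping from a sampled value to an 8-bit or 16-bit code can be sketched as follows. This is a simplified software model, not a description of any particular A/D converter chip. The input range of [-1.0, 1.0] and the helper names `to_code` and `from_code` are assumptions for illustration.

```python
def to_code(x, bits):
    """Map a sample x in [-1.0, 1.0] to an unsigned integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))             # clip out-of-range input
    code = int((x + 1.0) / 2.0 * levels)   # index of the equal-width bin holding x
    return min(code, levels - 1)           # x == 1.0 belongs to the top bin

def from_code(code, bits):
    """Return the centre of the bin that `code` stands for."""
    levels = 2 ** bits
    return (code + 0.5) / levels * 2.0 - 1.0

for bits in (8, 16):
    code = to_code(0.3, bits)
    error = abs(from_code(code, bits) - 0.3)
    print(bits, code, error)   # the 16-bit error is far smaller than the 8-bit error
```

The difference between the original sample and the value recovered from its code is the noise that digitization introduces.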

Nyquist Sampling Theorem

  • For lossless digitization, the sampling rate must be more than twice the maximum frequency present in the signal, to ensure that no information is lost.

  • In mathematical terms:

    $f_s > 2 f_m$, where $f_s$ is the sampling frequency and $f_m$ is the maximum frequency in the signal.

  • Thus, to represent a sound with a frequency of 440 Hz, it is necessary to sample that sound at a rate above 880 samples per second. The lower bound on the sampling rate is therefore 2 × the highest frequency.
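To see what goes wrong below that rate, the sketch below computes the frequency a pure tone appears to have after sampling. It uses the standard aliasing relation: the sampled tone is indistinguishable from one whose frequency differs by a whole multiple of the sampling rate. The function name `apparent_frequency` is made up for this example.

```python
def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency that a sampled pure tone appears to have after sampling."""
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

print(apparent_frequency(440, 1000))  # 440: sampled fast enough, the tone is preserved
print(apparent_frequency(440, 600))   # 160: sampled too slowly, the tone aliases to 160 Hz
```

At 600 samples per second the 440 Hz tone produces the same sample values (up to sign) as a 160 Hz tone, so the original pitch cannot be recovered.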

Nyquist Limit

  • Maximum data rate = $2H \log_2 V$ bits/second, where $H$ is the bandwidth (in Hz) and $V$ is the number of discrete signal levels.

  • This gives the maximum number of bits that can be sent per second on a noiseless channel with a bandwidth of $H$ when each signal change takes one of $V$ levels.

    • Example: what is the maximum data rate for a 3 kHz channel that transmits data using 2 levels (binary)?

    • $2 \times 3000 \times \log_2 2 = 6000$ bits/second
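The formula is easy to check numerically. The sketch below reproduces the 3 kHz binary example and adds a 4-level case for comparison. The function name `nyquist_max_data_rate` is an illustrative choice.

```python
import math

def nyquist_max_data_rate(bandwidth_hz, levels):
    """Maximum bit rate of a noiseless channel: 2 * H * log2(V)."""
    return 2 * bandwidth_hz * math.log2(levels)

print(nyquist_max_data_rate(3000, 2))  # 6000.0 bits/second, the binary example
print(nyquist_max_data_rate(3000, 4))  # 12000.0 bits/second with four levels
```

Doubling the number of levels adds one bit per signal change, so the rate grows with the logarithm of V rather than with V itself.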

Limited Sampling

  • But what if one cannot sample fast enough?

  • Limit the signal's frequency content to half of the maximum sampling frequency:

    • a low-pass filter removes the higher frequencies

    • e.g., if the maximum sampling frequency is 22 kHz, the signal must be low-pass filtered down to 11 kHz

Sampling Ranges

  • Auditory range (human hearing): 20 Hz to 22.05 kHz

    • must sample at 44.1 kHz to ensure that no information is lost

  • Speech frequency range: 200 Hz to 8 kHz

    • sample at up to 16 kHz

    • but typically 4 kHz to 11 kHz is used

2- Quantization

  • Quantization is the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively small set of discrete symbols or integer values.

  • Quantizing an audio sample means allocating a number of bits to represent the height of the sampled waveform.

  • Typical choices:

    • 8 bits = 256 levels

    • 16 bits = 65,536 levels

  • Example: telephone applications frequently use 8-bit quantization, while Compact Disc uses 16-bit quantization.


  • How should the levels be distributed?

    • Linearly? (PCM)

    • Perceptually? (μ-law)

    • Differentially? (DPCM)

    • Adaptively? (ADPCM)

1- Pulse Code Modulation

  • PCM is a method of digitally representing sampled analog signals.

  • Pulse modulation:

    • Uses discrete-time samples of analog signals

    • The transmission is composed of analog information sent at different times

    • The pulse amplitude or pulse timing is allowed to vary continuously over all values

  • PCM is the standard form for digital audio in computers and in various Compact Disc and DVD formats, as well as in other uses such as digital telephone systems.

Linear Quantization (LPCM)

  • Divide the amplitude range into $N$ units (for $\log_2 N$-bit quantization).

  • LPCM is a particular method of pulse-code modulation that represents an audio waveform as a sequence of amplitude values recorded at a sequence of times.

2- Perceptual Quantization (μ-law)

  • Algorithms of this kind reduce the dynamic range of an audio signal.

  • Intensity values are mapped logarithmically over N quantization units.

[Figure: quantization index plotted against sound intensity]
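The logarithmic mapping can be sketched with the continuous μ-law companding formula, F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ), for samples normalised to [-1, 1]. The value μ = 255 is the one used in telephony. Real codecs such as G.711 use a piecewise-linear approximation with 8-bit codes, which this sketch does not attempt. The names `mu_compress` and `mu_expand` are illustrative.

```python
import math

MU = 255  # companding parameter commonly used for mu-law telephony

def mu_compress(x, mu=MU):
    """Compress a sample x in [-1, 1] logarithmically (continuous mu-law curve)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Invert mu_compress, recovering the original sample from y in [-1, 1]."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_compress(x)
    print(x, round(y, 3), round(mu_expand(y), 3))
```

Quiet samples are spread over a large share of the output range (an input of 0.01 maps to roughly 0.23), so a uniform quantizer applied after compression spends more of its levels on low intensities.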

3- Differential Pulse Code Modulation (DPCM)

  • DPCM is a signal encoder that builds on PCM but adds functionality based on predicting the samples of the signal.

  • It represents the first PCM-coded sample as a whole and each following sample as its difference from the previous PCM-coded sample.

  • What if we look at sample differences, not the samples themselves?

    • $d_t = x_t - x_{t-1}$

Differential Quantization (DPCM)

  • Calculates the difference (change) between two adjacent samples.

  • Send the first value, then the relative changes.

    • the value uses the full number of bits; the changes use fewer bits

    • E.g., 220, 218, 221, 219, 220, 221, 222, 218, ... (all values between 218 and 222)

    • Difference sequence sent: 220, -2, +3, -2, +1, +1, +1, -4, ...

  • Result: encoding the original sequence, whose values lie in 0..255, needs 8 bits per sample;

  • Difference coding: the changes need only 3 bits each

  • Coding table:

    • Example: 3 bits for encoding
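Difference coding with d_t = x_t - x_{t-1} can be sketched in a few lines. This is the simplest lossless form, in which the previous sample is the prediction and the differences are not quantized further. The function names `dpcm_encode` and `dpcm_decode` are chosen for illustration.

```python
def dpcm_encode(samples):
    """Keep the first sample, then store d_t = x_t - x_{t-1} for each later sample."""
    if not samples:
        return []
    return [samples[0]] + [cur - prev for prev, cur in zip(samples, samples[1:])]

def dpcm_decode(encoded):
    """Rebuild the samples by accumulating the differences onto the first value."""
    decoded = []
    for i, value in enumerate(encoded):
        decoded.append(value if i == 0 else decoded[-1] + value)
    return decoded

samples = [220, 218, 221, 219, 220, 221, 222, 218]
encoded = dpcm_encode(samples)
print(encoded)               # [220, -2, 3, -2, 1, 1, 1, -4]
print(dpcm_decode(encoded))  # the original eight samples
```

Every difference lies between -4 and +3, which is exactly the range of a 3-bit two's-complement number, while the samples themselves need 8 bits.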

Adaptive Differential Pulse Code Modulation (ADPCM)

  • ADPCM is a variant of DPCM that varies the size of the quantization step, allowing a further reduction of the required bandwidth for a given signal-to-noise ratio.

  • Encode the difference in 4 bits, but vary the mapping of bits to differences dynamically.

    • If the signal changes rapidly, use large differences

    • If the signal changes slowly, use small differences
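The idea of a step size that follows the signal can be sketched with a toy coder. This is not IMA ADPCM, G.726, or any other standard. The adaptation rule (double the step after a code of magnitude 6 or more, halve it after a code of magnitude 1 or less), the initial step of 4, and all function names are assumptions made for this example.

```python
def _next_step(step, code):
    """Grow the step after a large code, shrink it after a small one."""
    if abs(code) >= 6:
        return min(step * 2, 2048)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step

def adpcm_encode(samples, step=4):
    """Encode integer samples as 4-bit signed codes in [-8, 7]."""
    codes, predicted = [], 0
    for x in samples:
        code = max(-8, min(7, round((x - predicted) / step)))
        codes.append(code)
        predicted += code * step       # track what the decoder will reconstruct
        step = _next_step(step, code)
    return codes

def adpcm_decode(codes, step=4):
    """Rebuild the signal by replaying the same step-size adaptation."""
    out, predicted = [], 0
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _next_step(step, code)
    return out

codes = adpcm_encode([0, 10, 40, 100, 100, 100])
print(codes)                # [0, 5, 7, 7, 6, 0]
print(adpcm_decode(codes))  # [0, 10, 24, 52, 100, 100]
```

During the rapid rise the codes saturate at 7 and the step doubles until the reconstruction catches up with the signal. Once the signal is flat the step shrinks again. Because the decoder derives the step from the codes alone, no step sizes need to be transmitted.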

Quantization Error

  • Difference between the actual value and the quantized value

    • amplitude in $[-A, A]$

    • number of quantization levels = $N$

    • quantization step $\Delta = 2A/N$

  • e.g., if $A = 1$ and $N = 8$, then $\Delta = 1/4$
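A small uniform quantizer makes the size of the error concrete. The sketch assumes the step is the full range 2A divided by the N levels and that each value is replaced by the centre of its bin. The function name `quantize` and the test grid of 1001 points are illustrative choices.

```python
def quantize(x, amplitude, levels):
    """Snap x in [-amplitude, amplitude] to the centre of one of `levels` equal bins."""
    step = 2 * amplitude / levels                          # step = 2A / N
    index = min(int((x + amplitude) / step), levels - 1)   # bin number; x == A goes in the top bin
    return -amplitude + (index + 0.5) * step

A, N = 1.0, 8
step = 2 * A / N                                   # 0.25 for A = 1, N = 8
xs = [-A + 2 * A * i / 1000 for i in range(1001)]  # evenly spaced test values in [-A, A]
worst_error = max(abs(quantize(x, A, N) - x) for x in xs)
print(step, worst_error)                           # the worst error is half a step
```

With bin centres as output values, no sample is ever more than half a step away from its quantized value.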

Signal-to-Noise Ratio

  • The signal-to-noise ratio (SNR) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. It is defined as the ratio of signal power (energy) to noise power.

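Signal-to-Noise Ratio

  • Measures the strength of the signal relative to the noise:

    $\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10} \left( \frac{\text{signal energy}}{\text{noise energy}} \right)$

  • Given a sound waveform with amplitude in $[-A, A]$:

  • Signal energy = $A^2/2$ (for a full-scale sinusoid)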

Compute Signal-to-Noise Ratio

  • Signal energy = $A^2/2$; noise energy = $\Delta^2/12$; $\Delta = 2A/N$

  • Noise energy = $A^2/(3N^2)$

  • Signal to noise = $3N^2/2$

  • Every bit increases the SNR by about 6 decibels
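The 6 dB per bit rule can be checked numerically. The sketch assumes a full-scale sinusoid with power A²/2 and quantization noise power Δ²/12, where Δ = 2A/N and N = 2^bits, taking A = 1. The function name `quantization_snr_db` is an illustrative choice.

```python
import math

def quantization_snr_db(bits):
    """Theoretical SNR in dB for a full-scale sine quantized with `bits` bits."""
    levels = 2 ** bits
    signal_power = 0.5             # A**2 / 2 with A = 1
    step = 2.0 / levels            # step = 2A / N
    noise_power = step ** 2 / 12   # power of the quantization error
    return 10 * math.log10(signal_power / noise_power)

for bits in (4, 8, 16):
    print(bits, round(quantization_snr_db(bits), 1))  # 25.8, 49.9, 98.1
```

Each extra bit halves the step, which divides the noise power by four and adds 10 · log10(4) ≈ 6.02 dB to the SNR.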


  • Consider a full-load sinusoidal modulating signal of amplitude $A$, which uses all the representation levels provided.

  • The average signal power is $P = A^2/2$.

  • The total range of the quantizer is $2A$ because the modulating signal swings between $-A$ and $A$. Therefore, if $N = 16$ (a 4-bit quantizer), $\Delta = 2A/2^4 = A/8$.

  • The quantization noise power is $\Delta^2/12 = A^2/768$.

  • The SNR is $(A^2/2)/(A^2/768) = 384$, or about 25.8 dB.

Data Rates

  • Data rate = sample rate × bits per sample × number of channels

  • Compare rates for mono audio vs. CD audio:

    • 8,000 samples/second × 8 bits/sample × 1 channel = 8 kBytes/second

    • 44,100 samples/second × 16 bits/sample × 2 channels = 176.4 kBytes/second
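The same arithmetic as a small helper, dividing by 8 to convert bits to bytes. The CD case uses 44,100 samples per second, the 44.1 kHz rate quoted under Sampling Ranges. The function name `data_rate_bytes_per_second` is an illustrative choice.

```python
def data_rate_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels / 8

print(data_rate_bytes_per_second(8000, 8, 1))    # 8000.0   (8 kBytes/second, mono)
print(data_rate_bytes_per_second(44100, 16, 2))  # 176400.0 (about 176 kBytes/second, CD stereo)
```

At the CD rate, one minute of uncompressed stereo audio therefore occupies about 10.6 million bytes.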