Mpeg1 coding standard
1 / 34

MPEG1 Coding Standard - PowerPoint PPT Presentation

  • Updated On :

MPEG1 Coding Standard. By: Richard M Tarbell. MPEG: Motion Picture Expert Group . First devised in 1988 by a group of almost 1000 experts Primary motivations: High compression rate for video storage comparable to VHS quality Random access capability

Related searches for MPEG1 Coding Standard

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'MPEG1 Coding Standard' - LionelDale

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Mpeg1 coding standard l.jpg

MPEG1 Coding Standard

By: Richard M Tarbell

Mpeg motion picture expert group l.jpg
MPEG: Motion Picture Expert Group

  • First devised in 1988 by a group of almost 1000 experts

  • Primary motivations:

    • High compression rate for video storage comparable to VHS quality

    • Random access capability

  • Overall MPEG standard combines video and audio signal into one large compression algorithm

Standard specs of mpeg1 l.jpg
Standard Specs of MPEG1

  • Works at 1.5 megabits per second bitrate

    • Bitrate = (length * width * depth * fps) / (compression ratio)

  • Makes use of the 8x8 discrete cosine transform (DCT) for intraframe compression

  • Uses an algorithm to reduce both temporal redundancy and spatial redundancy

  • The code can use several different variable length codebooks to achieve a higher compression ratio

    • the more codebooks used, the higher the potential

      compression ratio

Slide4 l.jpg


  • Was specifically designed for digital storage media

  • Did not initially lend itself to:

    • Real time applications such as videophone and video-over-IP

    • Applications that involve no long-term storage medium

  • Uses a complex compression algorithm to allow for a simple decompression algorithm

    • Simple decompression algorithm allows for real-time decompression

Mpeg1 compression aspects l.jpg
MPEG1 Compression Aspects

  • Lossless and Lossy compression are both used for a high compression rate

  • Interframe compression

    • Correlation/compression between like frames

    • Based on H.261 compression standard

  • Intraframe compression

    • Correlation/compression within a frame

    • Based on “baseline” JPEG compression standard

  • Audio compression

    • Three different layers (MP3)

What defines good video quality l.jpg
What defines good video quality?

  • Size of pictures

  • Bitrate of channel medium (especially in real-time applications)

  • Resolution of the original images and/or frames

  • Frame rate of source AND frame rate of reproduction medium (24 frames per second is standard movie quality)

If any one of these factors is inferior, it can bottleneck the overall system and cause reduction in video quality:

GIGO  garbage in = garbage out !!!

Intraframe coding compression within each individual frame l.jpg

Intraframe Coding: Compression within each individual frame

  • Intraframe Compression:

    • Reduces spatial redundancy to reduce necessary transmission rate

    • Encoding I blocks are practically identical to JPEG standard

    • Makes use of the DCT transform along with zigzag ordering

Video coloring scheme l.jpg
Video Coloring Scheme

  • Translate the RGB system into a YUV system

  • Human perception is less sensitive to chrominance than to brightness

    • Translate brightness into chrominance and then the resolution does not have to be as good  lower necessary bitrate

Coloring Scheme:

Normal JPEG Coloring Blocks




Translation formulas:

Y = Wr*R + Wb*B + Wg*G

Cr =Wr* (R - Y)

Cb =Wb* (B - Y)

Cg =Wg* (G - Y)





Slide9 l.jpg

According to

  • chro·mi·nance P Pronunciation Key  (kr m-n ns)n.

    • The difference between one color and a reference color of the same brightness and chromaticity.

GOB: Group of Blocks, composed of 33 macroblocks in an 11x3 arrangement

  • Macroblock: composed of six blocks (4:2:0 or 4:1:1 format)

    • Four blocks of yellow (luminance)

    • One block of Cb (chrominance)

    • One block of Cr (chrominance)

Slide10 l.jpg

Encoding an Image Into a JPEG Block

  • Divide the image into blocks, the size of each block is 8x8

  • Level shift and use the 8x8 DCT transform

    • If there exists a block or several blocks that are not of size 8x8, then force them to be by replicating the last column or row until proper dimension is achieved

DC and low frequencies

High frequencies

Slide12 l.jpg

DataVector = Cat(1,2,3,4,5,…….63,64)

Why a zig zag pattern l.jpg
Why a zig-zag pattern? single vector using

  • The zig-zag pattern starts with DC and low frequency values first and proceeds to high frequency values last

  • Most individual frames and blocks have much more energy in the low frequency spectrum than in the high frequency spectrum

  • Allows for better prediction to be made once the data vector has been assembled:


Data vector is generally monotonically decreasing in energy, most energy is in first few blocks

Data compression of the new vector l.jpg
Data Compression of the New Vector single vector using

  • Run-length code for grey:

    • long “runs” of grey can be stored like so:

(Grey color) (* 3) (regular color) (regular color) (Grey color) (* 4)

  • Huffman coding:

    • Lossless coding

    • Use tree diagram to decide how to encode all values

    • Assign the shortest codewords to the most frequent values

    • Assign longest codewords to least frequent values

Slide15 l.jpg

  • Random access capability

  • Temporal Redundancy

    • Only perform repeated encoding of the parts of a picture frame that are rapidly changing

    • Do not repeatedly encode background elements and still elements

Slide16 l.jpg

Decoding with non-random access single vector using

  • Decoding and playing sub-frames located in section G,

    • All frames before section G must be decoded as well

    • Synchronization algorithm issues.

  • If section G is far along in the movie, this could take a considerable amount of time.

Slide17 l.jpg

Decoding with random access single vector using

Introduce “I” frames, frames that are NOT predictively encoded by design

Frames that are still encoded using a prediction algorithm are called “P” frames

  • When decoding any frame after an I frame (frame G in this example)

    • we only have to decode past frames until we reach an I frame

    • saves time when skipping from frame to frame

  • I frames are not predictively encoded

    • reduction in compression ratio

  • Depending on the concentration of I frames, there is a tradeoff:

    • More I frames  faster random access time

    • Less I frames  better compression ratio

Slide18 l.jpg

  • Most MPEG1 implementations use a large number of I frames to ensure fast access

  • -Somewhat low compression ratio by itself

  • For predictive coding, P frames depend on only a small number of past frames

  • -Using less past frames reduces the propagation error

  • To further enhance compression in an MPEG1 file, introduce a third frame called the “B” frame  bi-directional frame

    • B frames are encoded using predictive coding of only two other frames: a past frame and a future frame

  • By looking at both the past and the future, this helps reduce prediction error due to rapid changes from frame to frame (i.e. a fight scene or fast-action scene)






Slide19 l.jpg

Predictive coding hierarchy: I, P, and B frames to ensure fast access

  • I frames (black) do not depend on any other frame and are encoded separately

    • Called “Anchor frame”

  • P frames (red) depend on the last P frame or I frame (whichever is closer)

    • Also called “Anchor frame”

  • B frames (blue) depend on two frames: the closest past P or I frame, and the closest future P or I frame

    • B frames are NOT used to predict other B frames, only P frames and I frames are used for predicting other frames

  • Slide20 l.jpg

    MPEG1 Temporal Order of Compression to ensure fast access

    • I frames are generated and compressed first

      • Have no frame dependence

    • P frames are generated and compressed second

      • Only depend upon the past I frame values

    • B frames are generated and compressed last

      • Depend on surrounding frames

      • Forward prediction needed
















    Differences between mpeg1 and h 261 l.jpg
    Differences Between MPEG1 and H.261 to ensure fast access

    • Computational cost

      • - MPEG1 is more advanced  higher computation

    • MPEG1 uses I, P, and B frames, H.261 uses I and P frames

      • Large gaps between I frames and P frames

      • Predicted frame and reference frame are not necessarily adjacent

      • Higher coding rate

      • More prediction error due to larger distance between predicted frame and reference frames

    • Prediction of B frame uses two references

      • One past frame and one future frame forward prediction

    • MPEG1 was intended for motion pictures  not real time

    • -H.261 was intended for video conferencing

    Rate control and mpeg1 l.jpg
    Rate control and MPEG1 to ensure fast access

    • Although MPEG1 is for storage, rate control is possible in certain software packages/modifications

    • Techniques to increase code rate

      • Increase quantizer step size to reduce picture quality

      • B frame quality is reduced first to reduce net error

        • No other frame depends on a B frame

    Slide23 l.jpg

    MPEG Audio Compression to ensure fast access

    • First developed to compress 1.5 Mbits/sec (normal uncompressed audio stream) into 56 kbits/sec (the rate of a basic dialup modem)

    • Can encode mono, stereo, and joint-stereo audio

    • Designed for generic waveforms (not partial to speech)

    • Many different standard sampling rates:

      • 16kHz, 22.05kHz, 24kHz, 32 kHz, 44.1 kHz, 48 kHz

    • Uses sub-band filtering

      • 32 filter bands for layers I and II

      • 32 filter bands and 18 sub-bands for layer III (576 bands total)

      • Each is equidistant in bandwidth (width = Fs / 64)

    Layers of mpeg audio l.jpg
    Layers of MPEG Audio to ensure fast access

    • Layer I

      • 12 samples per sub-band (384 total samples)

      • Compression ratio: approx 4:1

      • Around 384kbps (depends on chosen sampling rate)

    • Layer II

      • 36 samples per sub-band (1152 total samples)

      • Compression ratio: approx 6:1 to 8:1

      • Around 256kbps to 192kbps

    • Layer III

      • 12 samples per sub-band (384 total samples)

      • Compression ratio: approx 10:1 to 12:1

    Perceptual coding and psychoacoustics l.jpg
    Perceptual Coding and Psychoacoustics to ensure fast access

    • Similar concept to MPEG interframe coding: irrelevancy is identified and then removed

      • This is called auditory masking

    • Input signal is quantized according to a level that meets both bitrate and masking requirements

    • DFT of the input is taken, masking takes place on several levels

    Slide26 l.jpg

    Two major types of masking take place for encoding: to ensure fast access

    • Temporal Masking: loud noises that occur close to each other in time (about 3 to 5 milliseconds) can be approximated by a single loud noise

    • Frequency Masking: areas of the spectrum that have high energy and are close together can be filtered to eliminate “spikes” in the spectrum

    Human perception l.jpg
    Human Perception to ensure fast access

    • Smallest signal perception is 0dB (JBN)

    • Largest signal perception is about 135dB (threshold of pain)

       Dynamic range of about 5,000,000 to 1 (ratio of largest signal to smallest signal)

    Slide28 l.jpg

    Normal Spectrum to ensure fast access

    Slide29 l.jpg

    Auditory Masking Spectrum to ensure fast access

    Encoding model for layer i l.jpg
    Encoding model for Layer I to ensure fast access




    Mpeg1 audio encoding process l.jpg
    MPEG1 Audio Encoding Process to ensure fast access

    • Break the signal into frames (a frame is usually several milliseconds in length)

      2. Determine spectral energy distribution by taking the DFT, then break this distribubtion into sub-bands via filtering

      3. Consider the specified bit rate (this will determine the number of bits per frame that can be used)

    Slide32 l.jpg

    4. Compare the DFT signal to the human psychoacoutic model, perform masking

    5. Use a modified Huffman code to achieve compression

    6. After each frame has been compressed, place a header on each frame

    7. Reassemble the new frames with headers back into a continuous bit stream

    Mpeg1 audio is used l.jpg
    MPEG1 Audio is Used: perform masking

    • In CD format (first intentions were for the CD)

    • VCD and DVD formats (even though MPEG1 picture format is obsolete for these kinds of media)

    • As a basis for AAC audio (AAC was modeled to rival MPEG2 audio, AAC is considered to be superior to MP3 audio)

    • MPEG1 layers I and II were based off of Sony MUSICAM technology

    Primary references l.jpg
    Primary References: perform masking


    • Sayood, Khalid. Introduction to Data Compression, 2nd Edition. 2000, Morgan Kaufmann Publishers: San Diego.

    • Berkely Multimedia Research Center