Discrete Cosine Transform (DCT) Eric Rust ESS 522 Spring/2007 University of Washington

References
• International Telecommunication Union, Information Technology - Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines, T.81 ed., New York: n.p., 1992.
• I. Richardson, Video CODEC Design, New Jersey: John Wiley and Sons Inc., 2002.
• K. Sayood, Introduction to Data Compression, Second ed., San Francisco: Morgan Kaufmann, 2000.
• W. Pennebaker and J. Mitchell, JPEG Still Image Data Compression Standard, New York: Van Nostrand Reinhold, 1993.
• D. Taubman and M. Marcellin, JPEG2000 Image Compression Fundamentals, Standards and Practice, Boston: Kluwer Academic Publishers, 2002.
• Multiple Wikipedia searches

Objectives • Learn where Discrete Cosine Transform (DCT) is used. • Learn how DCT works • Experiment in using DCT

Why Did I Choose This? • We learned the Fourier transform. • I wanted to show that there are other domains that may be more useful.

Where DCT is used • Data compression • Image Compression

Image Compression (why?) • Common cameras: 8 megapixels => 24 MB uncompressed • Email attachment limits: Yahoo!, Hotmail, AOL – 10 MB; Gmail – 20 MB • Uploading pictures: Myspace – 5 MB
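The 24 MB figure follows from simple arithmetic. The sketch below assumes three color planes at one byte per sample and 1 MB = 10^6 bytes; neither assumption is stated on the slide, and the function name is illustrative.

```python
def uncompressed_size_bytes(megapixels, planes=3, bytes_per_sample=1):
    """Raw image size in bytes, assuming one sample per pixel per color plane."""
    pixels = megapixels * 1_000_000
    return pixels * planes * bytes_per_sample


size_mb = uncompressed_size_bytes(8) / 1_000_000
print(size_mb)  # 24.0
```

An 8-megapixel image stored this way would exceed every size limit listed above, which is the motivation for compression.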

JPEG • JPEG: the current image compression standard. The group was organized in 1986 and issued a standard in 1992 [1]. Lossy and lossless. • GIF: CompuServe, 1987 [2]. Lossless. • PNG: came about because of patent issues with the LZW algorithm.
[1] http://en.wikipedia.org/wiki/JPEG (last visited 06/02/2007)
[2] http://en.wikipedia.org/wiki/GIF (last visited 06/02/2007)

JPEG Overview [Figure: current image -> JPEG -> compressed file]

Baseline JPEG CODEC [Block diagram: Level Shift -> FDCT -> Quant. -> Zigzag -> DC Pred. -> Entropy Encode -> Output Buffer; Markers]

Image Division • The image is divided into 8x8 matrices (blocks) of pixels. [Figure: matrix image taken from Wikipedia.com]
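The block division can be sketched as follows. This is a minimal illustration that assumes one color plane stored as a list of rows whose width and height are multiples of 8; the edge padding a real encoder needs is left out, and the function name is hypothetical.

```python
def split_into_blocks(plane, n=8):
    """Split a 2-D list of pixel values into n x n blocks, left to right, top to bottom.

    Assumes the height and width are multiples of n.
    """
    height, width = len(plane), len(plane[0])
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            blocks.append([row[left:left + n] for row in plane[top:top + n]])
    return blocks


# A synthetic 16x16 plane yields four 8x8 blocks.
plane = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
blocks = split_into_blocks(plane)
print(len(blocks))  # 4
```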

Level Shift • Each pixel has a numeric value representing its color. • The level shift subtracts 128 from each value, mapping the range (0,255) to (-128,127). • Recall that this is done on each individual 8x8 block for each color plane.
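A minimal sketch of the level shift, assuming 8-bit samples held in a list of rows (the function name is illustrative):

```python
def level_shift(block, bits=8):
    """Subtract 2**(bits-1) from every sample, e.g. 128 for 8-bit data."""
    offset = 1 << (bits - 1)
    return [[value - offset for value in row] for row in block]


print(level_shift([[0, 128, 255]]))  # [[-128, 0, 127]]
```

Centering the values on zero keeps the DC coefficient of the following DCT from carrying a large constant offset.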

DCT II
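The formula shown on this slide did not survive extraction. For reference, the 8x8 forward DCT (a two-dimensional DCT II) as written in ITU-T T.81 is given below; that the slide used this exact normalization is an assumption. Here s_{yx} is the level-shifted sample at row y, column x, and S_{vu} is the coefficient at vertical frequency v and horizontal frequency u.

```latex
S_{vu} = \frac{1}{4}\, C_u C_v \sum_{x=0}^{7} \sum_{y=0}^{7}
         s_{yx} \cos\frac{(2x+1)\,u\,\pi}{16} \cos\frac{(2y+1)\,v\,\pi}{16},
\qquad
C_u, C_v =
\begin{cases}
  1/\sqrt{2} & \text{for } u, v = 0 \\
  1          & \text{otherwise}
\end{cases}
```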

Plot of DCT Coefficients • High-value coefficients: low frequency • Low-value coefficients: high frequency • We can eliminate low-value coefficients without losing much image quality

FDCT Coefficients [Figure: shifted 8x8 block of pixels -> FDCT -> FDCT coefficients] Matrix images taken from Wikipedia.com
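Since the matrix images are not available here, the step from a shifted block to its FDCT coefficients can be illustrated with a direct, unoptimized sketch. It assumes the 1/4 * C(u) * C(v) scaling from ITU-T T.81; production codecs use fast factorizations instead of this quadruple loop, and the function name is illustrative.

```python
import math


def fdct_8x8(block):
    """Direct 2-D DCT II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):          # vertical frequency
        for u in range(8):      # horizontal frequency
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out


# A constant block has all of its energy in the DC coefficient.
flat = [[10] * 8 for _ in range(8)]
coeffs = fdct_8x8(flat)
print(round(coeffs[0][0], 6))  # 80.0
```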

Moving into Quant. [Block diagram: Level Shift (completed) -> FDCT (completed) -> Quant. -> Zigzag -> DC Pred. -> Entropy Encode -> Output Buffer; Markers]

FDCT Quantization • [Figure: common JPEG quantization matrix] • Note that these matrices change depending on the quality you want for the compressed image.

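Quantization • [Figure: FDCT coefficients, common JPEG quantization matrix, resulting quantized matrix] • Each FDCT coefficient is divided by the matching entry of the quantization matrix. • Note that the numbers are rounded, which leads to many zeros.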
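The quantization step can be sketched as an element-wise divide and round. The table below is the example luminance table from Annex K of ITU-T T.81; that it is the same "common" matrix pictured on the slide is an assumption. Python's `round` rounds exact halves to the nearest even integer, which may differ slightly from a particular encoder's rounding rule.

```python
# Example luminance quantization table from ITU-T T.81, Annex K.
LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, table=LUMA_Q):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


# Hypothetical coefficients: a large DC term, one low-frequency AC term,
# and a high-frequency term that is too small to survive.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 80.0, -30.0, 40.0
quantized = quantize(coeffs)
print(quantized[0][:2], quantized[7][7])  # [5, -3] 0
```

The high-frequency value 40 becomes zero because its divisor (99) is large, while the DC value 80 keeps most of its precision with a divisor of 16.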

Zigzag • Zigzag arranges the DCT coefficients of each block in a one-dimensional array such that the larger coefficients are grouped together at the beginning.
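A compact way to generate the zigzag order is to sort positions by anti-diagonal and alternate the direction of travel on each diagonal. This is a sketch with illustrative function names; encoders normally use a fixed 64-entry lookup table instead.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Primary key: anti-diagonal index. Secondary key: walk down-left
        # on odd diagonals (row increasing), up-right on even ones.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Label each position with its row-major index to make the order visible.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
print(zigzag(block)[:6])  # [0, 1, 8, 16, 9, 2]
```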

Update on JPEG [Block diagram: Level Shift -> FDCT -> Quant. -> Zigzag -> DC Pred. -> Entropy Encode -> Output Buffer; Markers]

Entropy Encoding • “After quantization, it is not unusual for more than half of the DCT coefficients to equal zero” [1]. • JPEG uses run-length encoding to reduce the size of the file by taking advantage of repeating characters [2] (the zeros). • Huffman coding can be used as well.
[1] ece.perdue.edu
[2] http://www.eee.bham.ac.uk/WoolleySI/All7/run_1.htm
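The idea of run-length encoding the zeros can be sketched as below. This is a simplified illustration and not the actual JPEG bitstream format: the real standard limits run lengths, treats the DC coefficient separately, and then entropy-codes the resulting symbols. The function name and the "EOB" marker string are illustrative.

```python
def run_length_encode(coeffs):
    """Encode a coefficient list as (zero_run, value) pairs.

    Each pair records how many zeros preceded a nonzero value.
    Trailing zeros collapse into a single "EOB" (end of block) marker.
    """
    pairs = []
    run = 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    if run:
        pairs.append("EOB")
    return pairs


print(run_length_encode([5, -3, 0, 0, 2, 0, 0, 0, 0]))
# [(0, 5), (0, -3), (2, 2), 'EOB']
```

Because the zigzag scan pushes the zeros toward the end of the array, most of a block often collapses into the single end marker.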

DCT • Why DCT in JPEG?

FT versus DCT • DFT: sines and cosines; periodic extension • DCT: cosines; periodic and even extension

DCT • 4 types of DCT • (Martucci 1994) shows four of them • The other four are:

WHY DCT? • Different boundary conditions strongly affect the applications of the transform. • Discontinuities in a function reduce the rate of convergence of the Fourier series, so that more sinusoids are needed to represent the function with a given accuracy

WHY DCT cont. • The implied periodicity of the DFT means that discontinuities usually occur at the boundaries. • A DCT where both boundaries are even always yields a continuous extension at the boundaries. • DCT II is preferred for applications, in part for reasons of computational convenience.

Example • x(n) = 0.9^n * cos(0.1*pi*n), n = 0, 1, 2, ..., 31
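One way to experiment with this example is to count how many of the largest coefficients each transform needs to capture a given share of the signal energy. The sketch below uses unitary scalings for both the DFT and the DCT II so that the energies are directly comparable; the scalings, the 99% threshold, and the function names are choices made here, not something taken from the slides.

```python
import cmath
import math

N = 32
x = [0.9 ** n * math.cos(0.1 * math.pi * n) for n in range(N)]


def dft(signal):
    """Unitary DFT (scaled by 1/sqrt(N))."""
    n_pts = len(signal)
    return [sum(s * cmath.exp(-2j * math.pi * k * n / n_pts)
                for n, s in enumerate(signal)) / math.sqrt(n_pts)
            for k in range(n_pts)]


def dct2(signal):
    """Orthonormal DCT II."""
    n_pts = len(signal)
    out = []
    for k in range(n_pts):
        scale = math.sqrt((1 if k == 0 else 2) / n_pts)
        out.append(scale * sum(s * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
                               for n, s in enumerate(signal)))
    return out


def coeffs_for_energy(coeffs, fraction=0.99):
    """Smallest number of largest-magnitude coefficients holding `fraction` of the energy."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    total = sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= fraction * total:
            return count
    return len(energies)


print("DFT coefficients needed:", coeffs_for_energy(dft(x)))
print("DCT coefficients needed:", coeffs_for_energy(dct2(x)))
```

Changing `fraction` or the signal is an easy way to see how the boundary behavior of each transform affects how quickly the coefficients fall off.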

High-Pass • High-value coefficients: low frequency • Low-value coefficients: high frequency • We can eliminate low-value coefficients without losing much image quality

Background on Images • A color image has three color planes. • A black-and-white image has one color plane. • Each color plane is represented by pixels with numerical values that correspond to a color.

Matrix image taken from Wikipedia.com

Background on Images Cont. • Each pixel has 8 bits (each 1 or 0), which give the pixel a value from 0 to 255: 0000 0001 = 1, 0001 0010 = 18, 1000 0000 = 128. • The goal is to find out how many bits per pixel it takes to represent an image (an uncompressed image would be 8 bits per pixel).

Finding Bits Per Pixel • Download files from website. • Review code to understand what it is doing • Run code (look at images that are displayed) • Complete two Exercises • QualityJPEG=[1 2 5 80 100]
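The course files are not reproduced here, so the following is only a sketch of the bits-per-pixel calculation itself. The file size and image dimensions are hypothetical, and counting the whole file (headers included) is an assumption about how the exercise measures size.

```python
def bits_per_pixel(file_size_bytes, width, height):
    """Average number of bits used per pixel for a file of the given size."""
    return file_size_bytes * 8 / (width * height)


# Hypothetical example: a 512x512 single-plane image compressed to 32,768 bytes.
print(bits_per_pixel(32_768, 512, 512))  # 1.0
```

Comparing this number against the 8 bits per pixel of the uncompressed image gives the compression ratio for each quality setting.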
