Multimedia Compression (2)


### Multimedia Compression (2)

Mei-Chen Yeh

03/23/2009

Review
• Entropy
• Entropy coding (lossless!)
  • Huffman coding
  • Arithmetic coding

Outline
• Revisiting Information theory
• Lossy compression
• Quantization
• Transform coding

source s → Encoder → Decoder → reconstruction r

s = r: Lossless
s ≠ r: Lossy

Performance measures
1. Distortion (between the source and the reconstruction)
2. Rate

• Lossless coding: measured by rate alone
• Lossy coding: measured by both rate and distortion
Lossy compression
• Goals
  • Minimize the rate
  • Keep the distortion small
  • Trade one off against the other
• Rate-distortion theory
  • Provides theoretical bounds for lossy compression
Distortion (1)
• Suppose we send a symbol x_i and receive y_j
  • d(x_i, y_j) ≥ 0
  • d(x_i, y_j) = 0 for x_i = y_j
• Average distortion
  • D = Σ_i Σ_j p(x_i, y_j) d(x_i, y_j)
• How should distortion be measured?
Distortion (2)
• User feedback
  • Subjective and may be biased
• Some popular criteria:
  • Mean squared error (MSE)
  • Average of the absolute difference
  • Signal-to-noise ratio (SNR), measured in decibels
  • Peak signal-to-noise ratio (PSNR)
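These criteria are simple to compute. A minimal sketch with a hypothetical 4-sample signal (the `peak=255` default assumes 8-bit samples):

```python
import math

def mse(x, y):
    """Mean squared error between source x and reconstruction y."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in decibels; peak=255 assumes 8-bit samples."""
    return 10.0 * math.log10(peak ** 2 / mse(x, y))

# Hypothetical 4-pixel source and reconstruction:
original = [52, 55, 61, 66]
reconstructed = [50, 56, 60, 68]
print(mse(original, reconstructed))             # 2.5
print(round(psnr(original, reconstructed), 2))  # 44.15
```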
Information theory revisited (1)
• Information: reduction in uncertainty

#1: Predict the outcome of a coin flip.
#2: Predict the outcome of a die roll.
Which has more uncertainty? #2 (six equally likely outcomes instead of two)

Next:
#1: You observe the outcome of a coin flip.
#2: You observe the outcome of a die roll.
Which observation provides more information? #2

Information theory revisited (2)
• Entropy
• The average self-information
• The average amount of information provided per symbol
• The uncertainty an observer has before seeing the symbol
• The average number of bits needed to communicate each symbol
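The coin-vs-die comparison above can be checked directly from the definition H = −Σ p log₂ p; a small sketch:

```python
import math

def entropy(probs):
    """H = -sum p*log2(p): the average self-information in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit  (fair coin)
print(entropy([1/6] * 6))    # ~2.585 bits (fair die: more uncertainty)
```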
Conditional entropy
• Example: two random variables X and Y
  X: college major (Math, CS, or History)
  Y: likes "XBOX" (Yes or No)
  [Joint-distribution table: half the mass on Math, split evenly between Yes and No; a quarter on CS, all one answer; a quarter on History, all the other]
• H(Y|X=CS) = 0
• H(Y|X=Math) = 1

Conditional entropy
• The conditional entropy H(Y|X) = Σ_x p(x) H(Y|X=x) = −Σ_x Σ_y p(x, y) log₂ p(y|x)
  (p(x, y): joint probability; p(y|x): conditional probability)
• H(Y|X) = 0.5·H(Y|X=Math) + 0.25·H(Y|X=CS) + 0.25·H(Y|X=History)
• H(Y|X) = 0.5·1 + 0.25·0 + 0.25·0 = 0.5

Conditional entropy

H(X) = 1.5

H(Y) = 1

H(Y|X) = 0.5

H(X|Y) = 1

• H(X|Y)
  • The amount of uncertainty remaining about X, given that we know the value of Y
  • H(X|Y) vs. H(X): conditioning never increases entropy, so H(X|Y) ≤ H(X)
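The four entropies in the example can be reproduced from any joint table consistent with the slide's numbers. The table below is one such assignment (Math split 50/50 between Yes and No, CS all Yes, History all No; this split is an assumption, and swapping the CS and History answers works equally well):

```python
import math

# One joint distribution p(x, y) consistent with the slide's numbers (assumed):
joint = {
    ("Math", "Yes"): 0.25, ("Math", "No"): 0.25,
    ("CS", "Yes"): 0.25,
    ("History", "No"): 0.25,
}

def H(probs):
    """Entropy of a probability list, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution over component `axis` (0 for X, 1 for Y)."""
    m = {}
    for xy, p in joint.items():
        m[xy[axis]] = m.get(xy[axis], 0.0) + p
    return m

def cond_entropy(joint, given_axis):
    """H(other | given) = sum_g p(g) * H(other | given = g)."""
    total = 0.0
    for g, pg in marginal(joint, given_axis).items():
        cond = [p / pg for xy, p in joint.items() if xy[given_axis] == g]
        total += pg * H(cond)
    return total

print(H(marginal(joint, 0).values()))  # H(X)   = 1.5
print(H(marginal(joint, 1).values()))  # H(Y)   = 1.0
print(cond_entropy(joint, 0))          # H(Y|X) = 0.5
print(cond_entropy(joint, 1))          # H(X|Y) = 1.0
```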
Average mutual information
• Mutual information I(X; Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
• The amount of information that X and Y convey about each other!
• Properties
• 0 ≤ I(X; Y) = I(Y; X)
• I(X; Y) ≤ H(X)
• I(Y; X) ≤ H(Y)
Lossy compression
• Lower the bit rate R by allowing some acceptable distortion D of the signal
• Types of lossy data compression
  • Given R, minimize D
  • Given D, minimize R

Rate distortion theory
• Calculates the minimum transmission bit rate R for a required reconstruction quality
• The results do not depend on a specific coding method.

Rate distortion function
• Represents the trade-off between distortion and rate
• Definition: R(D*) is the lowest rate at which the source can be encoded while keeping the distortion less than or equal to D*
• Shannon lower bound: derived under the assumption of statistical independence between the distortion and the reconstructed signal

Example: Rate distortion function for the Gaussian source
• A zero-mean Gaussian pdf with variance σ_x²
• Distortion: the MSE measure, D = E[(X − Y)²]
• The rate distortion function is
  R(D) = ½ log₂(σ_x² / D) for 0 < D ≤ σ_x², and R(D) = 0 for D > σ_x²
• No compression system exists that performs outside the gray area (below the R(D) curve)!

Khalid Sayood. Introduction to Data Compression, 3rd edition, Morgan Kaufmann, 2005.
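The Gaussian R(D) function is simple enough to sketch directly; a minimal implementation of the formula above:

```python
import math

def rate_distortion_gaussian(D, var):
    """R(D) for a zero-mean Gaussian source under MSE distortion:
    R(D) = 1/2 * log2(var / D) for 0 < D <= var, and 0 for D > var."""
    if D >= var:
        return 0.0            # the distortion budget exceeds the variance
    return 0.5 * math.log2(var / D)

var = 1.0
print(rate_distortion_gaussian(0.25, var))  # 1.0 bit/sample
print(rate_distortion_gaussian(2.0, var))   # 0.0 (no bits needed)
```

Each extra bit of rate halves the allowed standard deviation of the error, i.e. quarters the MSE, which is the familiar "6 dB per bit" rule.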

Outline
• Revisiting Information theory
• Lossy compression
• Quantization
• Transform coding
Quantization
• A practical lossy compression technique
• The process of representing a large (possibly infinite) set of values with a much smaller set
Example

Use -10, -9, …, 0, 1, …, 10 (21 values) to represent real numbers (infinitely many values)

2.47 => 2

3.1415926 => 3

-10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10

Loss of information!

The reconstruction value 3 could have come from 2.95, 3.16, 3.05, …

Scalar quantization
• Inputs and outputs are scalars
• Design of a scalar quantizer
• Construct intervals (Encoder)
• Assign codewords to intervals (Encoder)
• Select reconstruction values (Decoder)

-10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10
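The rounding quantizer from the example can be written as an encoder/decoder pair; a minimal sketch with a unit step size (clamping to the range [-10, 10] is omitted for brevity):

```python
def quantize(x, step=1):
    """Encoder: map x to the index of its interval (round to the nearest level)."""
    return round(x / step)

def dequantize(index, step=1):
    """Decoder: map the index back to a reconstruction value."""
    return index * step

# The slide's examples:
print(dequantize(quantize(2.47)))       # 2
print(dequantize(quantize(3.1415926)))  # 3
```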

Scalar quantization (cont.)
• Input-output map: a staircase function from the input range (−∞, ∞) to a finite set of output values
  [Figure: staircase input-output plot; decision boundaries on the input axis, reconstruction levels on the output axis]

Example: Quantization for image compression
• 8 bits per pixel, values in [0, 255]
• 1 bit per pixel: decision boundaries {0, 128, 255}, reconstruction values 64 and 196
• 2 bits per pixel: decision boundaries {0, 64, 128, 196, 255}
• 3 bits per pixel: 8 intervals

Quantization problem (1)
• Fixed-length coding

Given an input pdf f_X(x) and the number of levels M in the quantizer, find the decision boundaries {b_i} and the reconstruction levels {y_i} so as to minimize the distortion.

Quantization problem (2)
• Variable-length coding

Given a distortion constraint D ≤ D*, find the decision boundaries {b_i}, the reconstruction levels {y_i}, and binary codes that minimize the rate while satisfying the constraint.

Vector Quantization
• Operates on blocks (vectors) of data
• The VQ procedure: the encoder searches a codebook of L code vectors of dimension K for the nearest match (SLOW); the decoder simply looks the received index up in its copy of the codebook (FAST)
• How to generate the codebook?

Why VQ?
• Samples may be correlated!
• Example: height and weight of individuals
• 2-D vector quantization
  • Quantization rule: map each input vector to the nearest code vector
  • The quantization regions are no longer restricted to be rectangles!

Codebook Design

[Figure: two-dimensional training vectors in the x-y plane, x in [−2, 2], y in [0, 3], with candidate code vectors]

The Linde-Buzo-Gray (LBG) algorithm (also known as k-means)

LBG Algorithm
• From a training set, alternate two steps: partition the training vectors among the current code vectors (nearest-neighbor rule), then replace each code vector by the centroid of its partition
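The alternation above can be sketched in a few lines. This is a minimal 1-D (scalar) version on a small hypothetical training set; the real algorithm works on K-dimensional vectors, but the structure is identical:

```python
import random

def lbg(training, k, iters=20, seed=0):
    """A minimal LBG / k-means sketch for 1-D training data."""
    rng = random.Random(seed)
    codebook = rng.sample(training, k)       # initial code values
    for _ in range(iters):
        # 1. Nearest-neighbor partition of the training set
        cells = {i: [] for i in range(k)}
        for x in training:
            nearest = min(range(k), key=lambda j: (x - codebook[j]) ** 2)
            cells[nearest].append(x)
        # 2. Replace each code value by the centroid of its cell
        for i, cell in cells.items():
            if cell:                          # empty cells are left untouched
                codebook[i] = sum(cell) / len(cell)
    return sorted(codebook)

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]        # two obvious clusters
print(lbg(data, 2))                           # two code values near 1.0 and 5.0
```

Note the `if cell:` guard: a cell with no training vectors assigned to it is simply skipped, which is exactly the empty-cell problem discussed next.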

Problems of LBG
• No guarantee that the procedure will converge to the optimal solution!
• Sensitive to the initial points
• The empty cell problem
  • A region with no training vectors assigned to it is never updated
  • We may end up with an output point that is never used

More…
• LBG has problems when clusters have different
  • Sizes
  • Densities
  • Shapes (non-globular)
[Figure: differing sizes, original points vs. LBG result with 3 clusters]
[Figure: different densities, original points vs. LBG result with 3 clusters]
[Figure: non-globular shapes, original points vs. LBG result with 2 clusters]

Use of LBG for Image Compression
• Use 4x4 blocks of pixels
• Results with codebook sizes 16, 64, 256, and 1024

Outline
• Revisiting Information theory
• Lossy compression
• Quantization
• Transform coding and the baseline JPEG
Transform coding
• Pipeline: Transform → Quantization → Binary coding
• Three steps
  1. Divide a data sequence into blocks of size N and transform each block using a reversible mapping
  2. Quantize the transformed sequence
  3. Encode the quantized values
Transforms of interest
• Data-dependent
  • Discrete Karhunen-Loève transform (KLT)
• Data-independent
  • Discrete cosine transform (DCT)
  • Sub-band coding
  • Wavelet transform
KLT
• Also known as Principal Component Analysis (PCA), or the Hotelling transform
• Transforms correlated variables into uncorrelated variables
• Basis vectors are eigenvectors of the covariance matrix of the input signal
• Achieves optimum energy concentration
• Dependent on signal statistics
• Not separable
DCT (1)
• Part of many standards (JPEG, MPEG, H.261, …)
• The transform matrix C:
  [C]_{i,j} = √(1/N) cos[(2j+1)iπ / 2N] for i = 0, and √(2/N) cos[(2j+1)iπ / 2N] for i > 0
• Visualizing the rows of C: the frequency increases as we go from top to bottom!
DCT (2)
• 2-D basis matrices of DCT (variation increases with frequency!)

DCT (3)
• Performs close to the optimum KLT in terms of energy compaction
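The transform matrix C above can be built and checked directly. A minimal sketch that constructs the N×N DCT-II matrix and verifies that C·Cᵀ = I, i.e. that the transform is orthonormal and hence exactly reversible:

```python
import math

def dct_matrix(N=8):
    """Rows of the NxN DCT-II matrix C; row i oscillates faster as i grows."""
    C = []
    for i in range(N):
        scale = math.sqrt(1.0 / N) if i == 0 else math.sqrt(2.0 / N)
        C.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * N))
                  for j in range(N)])
    return C

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

C = dct_matrix(8)
Ct = [list(r) for r in zip(*C)]       # transpose
CCt = matmul(C, Ct)
# Orthonormality: C * Ct = I, so x can be recovered exactly as Ct * (C * x).
print(all(abs(CCt[i][j] - (1.0 if i == j else 0.0)) < 1e-9
          for i in range(8) for j in range(8)))  # True
```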

DCT (4)
• DCT on an 8x8 image block: the 8x8 pixels map to 8x8 DCT coefficients
• The top-left coefficient is the DC coefficient; the remaining 63 are the AC coefficients
• Frequency increases from the top-left (low) to the bottom-right (high) of the coefficient block

DCT Example (1)
• Keep the DC coefficient; quantize all other coefficients to 0
DCT Example (3)
• Keep the DC and the first row of ACs
DCT Example (4)
• Keep the DC and the first column of ACs
DCT Example (5)
• Keep the DC and the first AC coefficient
DCT Example (6)
• Keep the DC and the first eight ACs
Quantization and coding of transform coefficients (1)
• Pipeline: Transform → Quantization → Binary coding
• The bit-rate allocation problem
  • Divide the bit rate R among the transform coefficients (the DC and the ACs) such that the resulting distortion D is minimized
Quantization and coding (2)

variance of the transform

coefficient θk

The optimal bit allocation

Quantization and coding (3)

Horizontal frequency →

DC component

vertical frequency →

AC components

noisy

• The threshold coding
• Transform coefficients that fall below a threshold are discarded
• Example: a 8x8 image block
Quantization and coding (4)

The zigzag scan

Sample quantization table
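The zigzag scan orders the 8x8 coefficients by increasing frequency (constant diagonals of row+col), alternating direction on each diagonal. A minimal sketch that generates the scan order:

```python
def zigzag_order(n=8):
    """(row, col) pairs of an n x n block in zigzag-scan order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],                 # anti-diagonal
                                  -rc[1] if (rc[0] + rc[1]) % 2  # odd: down-left
                                  else rc[1]))                   # even: up-right

order = zigzag_order(8)
print(order[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because low-frequency coefficients come first, the quantized high-frequency zeros are pushed to the end of the sequence, where a single end-of-block marker can replace them.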

### The Baseline JPEG

Baseline algorithm

Slide credit: Bernd Girod

Transform (1)
• An 8x8 block from the Sena image: large pixel values produce DCT coefficients with a few large values (low frequencies) and many small values (high frequencies)
• Level shift: subtract the mean
  • Subtract 128 from each pixel of an 8-bit image: [0, 255] → [−128, 127]
• 8x8 forward DCT
  • Replicate the last column/row until the size is a multiple of eight
Transform (2)
• Forward DCT:
  F(u, v) = ¼ C(u) C(v) Σ_{x=0..7} Σ_{y=0..7} f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
• Inverse DCT:
  f(x, y) = ¼ Σ_{u=0..7} Σ_{v=0..7} C(u) C(v) F(u, v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
  where C(w) = 1/√2 for w = 0 and C(w) = 1 otherwise

Quantization (1)
• Each DCT coefficient θ is mapped to a label l = round(θ / Q) for step size Q; the reconstruction is l·Q (e.g., with Q = 32, any coefficient in [16, 48) reconstructs to 32)
• The quantization step sizes are organized in a table (the quantization table)

Quantization (2)
• Larger step sizes in the quantization table introduce more quantization error!
• Quantization errors in the DC and lower AC coefficients are more easily detectable than those in the higher AC coefficients, so the table uses smaller steps at low frequencies.
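The table-driven quantization is a per-coefficient round-and-rescale. A minimal sketch on a hypothetical 2x2 corner of a coefficient block (the values are illustrative, not the Sena block):

```python
def quantize_block(coeffs, qtable):
    """JPEG-style quantization: label = round(coefficient / step size)."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]

def dequantize_block(labels, qtable):
    """Reconstruction: label * step size (the source of the quantization error)."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(labels, qtable)]

# Hypothetical 2x2 corner of a DCT block and of a quantization table:
coeffs = [[-415.4, -30.2], [4.5, -21.9]]
qtable = [[16, 11], [10, 16]]
labels = quantize_block(coeffs, qtable)
print(labels)                             # [[-26, -3], [0, -1]]
print(dequantize_block(labels, qtable))   # [[-416, -33], [0, -16]]
```

Note how the small coefficient 4.5 is quantized to 0 and lost, while the large DC value survives with only a small error: exactly the behavior the slides describe.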

Encode
1. Remove the mean and apply the DCT
2. Quantize using the quantization table
3. Zigzag scan:
   -26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 -1 1 -1 2 0 0 0 0 0 -1 -1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4. Replace the trailing run of zeros with an end-of-block marker:
   -26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 -1 1 -1 2 0 0 0 0 0 -1 -1 EOF
5. Entropy coding:
   010001010000101110000011101010011000101111100000……………
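Truncating the trailing zeros in the zigzag sequence is a one-pass scan from the back; a minimal sketch using the slide's sequence (the marker is a plain string standing in for the slide's "EOF" symbol):

```python
def trim_trailing_zeros(zigzag):
    """Replace the trailing run of zeros with an end-of-block marker."""
    end = len(zigzag)
    while end > 0 and zigzag[end - 1] == 0:
        end -= 1
    return zigzag[:end] + ["EOF"]

# The 64-entry zigzag sequence from the slide:
seq = [-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 1, 2, -1, 1, -1,
       2, 0, 0, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1] + [0] * 32
print(trim_trailing_zeros(seq))   # 32 values followed by 'EOF'
```

Interior zeros (like the run before the final -1 -1) are kept; only the tail is cut, because only the tail can be regenerated unambiguously by the decoder.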

Decode
1. Entropy decoding:
   010001010000101110000011101010011000101111100000…………… → -26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 -1 1 -1 2 0 0 0 0 0 -1 -1 EOF
2. Put the sequence back into an 8x8 block (inverse zigzag) and dequantize using the quantization table
3. Inverse DCT, then add the mean back to obtain a reconstruction of the original block