
Multimedia Compression (2)


Presentation Transcript

Outline

- Revisiting Information theory
- Lossy compression
- Quantization
- Transform coding

source → Encoder → Decoder → reconstruction

- s = r : lossless
- s ≠ r : lossy

Performance measures:

1. Distortion
2. Rate

- Lossless coding: performance is measured by rate alone
- Lossy coding: performance is measured by both rate and distortion

Lossy compression

- Goals
- Minimize the rate
- Keep the distortion small

- Tradeoffs between the best of both worlds
- Rate-distortion theory
- Provides theoretical bounds for lossy compression

Distortion (1)

- Suppose we send a symbol x and receive a symbol y
- d(x, y) ≥ 0
- d(x, y) = 0 for x = y

- Average distortion
- D = ∑x ∑y p(x, y) d(x, y)

- Distortion measurement?

Distortion (2)

- User feedback
- Subjective and may be biased

- Some popular objective criteria:
- Mean squared error (MSE)
- Mean absolute difference
- Signal-to-noise ratio (SNR), in decibels (dB)
- Peak signal-to-noise ratio (PSNR), in decibels (dB)
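The objective criteria above are straightforward to compute; a minimal sketch (the sample values are made up for illustration):

```python
import numpy as np

def mse(x, y):
    """Mean squared error between original x and reconstruction y."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.mean((x - y) ** 2))

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in decibels (dB); higher means better."""
    return float(10.0 * np.log10(peak ** 2 / mse(x, y)))

print(mse([100, 120, 130], [101, 119, 131]))   # 1.0
print(psnr([100, 120, 130], [101, 119, 131]))  # ~48.13 dB
```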

Information theory revisited (1)

- Information: reduction in uncertainty

First, predict: #1 the outcome of a coin flip; #2 the outcome of a die roll. Which has more uncertainty? #2: a fair die has six equally likely outcomes, a coin only two.

Next, observe: #1 you observe the outcome of the coin flip; #2 you observe the outcome of the die roll. Which provides more information? Again #2: the observation that resolves more uncertainty provides more information.

Information theory revisited (2)

- Entropy
- The average self-information
- The average amount of information provided per symbol
- The uncertainty an observer has before seeing the symbol
- The average number of bits needed to communicate each symbol
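For instance, the coin flip and die roll from the previous slide can be compared directly from the definition H = -∑ p·log2(p):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
print(entropy([1/6] * 6))    # fair die: ~2.585 bits, so observing it tells us more
```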

Conditional entropy

- Example: two random variables X and Y
- X : college major (Math, CS, History)
- Y : likes “XBOX” (Yes / No)

From the conditional distributions of Y given each major:

H(Y|X=CS) = 0 (given X = CS, the answer is certain)

H(Y|X=Math) = 1 (given X = Math, the answer is a fair 50/50 choice)

Conditional entropy

- The conditional entropy H(Y|X) is the average of H(Y|X=x), weighted by the probabilities p(X=x):

H(Y|X) = 0.5 H(Y|X=Math) + 0.25 H(Y|X=CS) + 0.25 H(Y|X=History)

H(Y|X) = 0.5*1 + 0.25*0 + 0.25*0 = 0.5
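The weighted average above can be checked directly; a sketch using the priors and conditional distributions stated in the example (p(Math)=0.5, p(CS)=0.25, p(History)=0.25):

```python
import math

def H(probs):
    """Entropy in bits of a distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Y is a fair coin given Math, and deterministic given CS or History.
p_x = {"Math": 0.5, "CS": 0.25, "History": 0.25}
H_y_given = {"Math": H([0.5, 0.5]), "CS": H([1.0]), "History": H([1.0])}

H_Y_given_X = sum(p_x[x] * H_y_given[x] for x in p_x)
print(H_Y_given_X)   # 0.5, matching the slide
```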

Conditional entropy

For this example:

H(X) = 1.5, H(Y) = 1, H(Y|X) = 0.5, H(X|Y) = 1

- H(X|Y)
- The amount of uncertainty remaining about X, given that we know the value of Y

- H(X|Y) vs. H(X): in general H(X|Y) ≤ H(X), since knowing Y can never increase the uncertainty about X

Average mutual information

The amount of information that X and Y convey about each other:

I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)

- Properties
- 0 ≤ I(X; Y) = I(Y; X)
- I(X; Y) ≤ H(X)
- I(Y; X) ≤ H(Y)
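As a check, a sketch that recomputes these quantities from a joint distribution consistent with the XBOX example (assigning “Yes” to CS and “No” to History is an assumption; any assignment with H(Y) = 1 gives the same entropies):

```python
import math

# Joint distribution reconstructed from the slide's example.
p = {("Math", "Yes"): 0.25, ("Math", "No"): 0.25,
     ("CS", "Yes"): 0.25, ("History", "No"): 0.25}

def H(dist):
    """Entropy in bits of a {outcome: probability} distribution."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)

def marginal(p, axis):
    """Sum the joint distribution over the other variable."""
    m = {}
    for xy, q in p.items():
        m[xy[axis]] = m.get(xy[axis], 0.0) + q
    return m

H_X, H_Y, H_XY = H(marginal(p, 0)), H(marginal(p, 1)), H(p)
I = H_X + H_Y - H_XY   # I(X;Y) = H(X) + H(Y) - H(X,Y)
print(H_X, H_Y, I)     # 1.5 1.0 0.5, matching the slide's values
```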

Lossy compression

Lower the bit rate R by allowing some acceptable distortion D of the signal

Rate distortion theory

Calculates the minimum transmission bit-rate R for a required reconstruction quality

The results do not depend on a specific coding method.

Rate distortion function

- Represents the trade-off between distortion and rate
- Definition: the lowest rate at which the source can be encoded while keeping the distortion less than or equal to D*

Shannon lower bound

- A lower bound on R(D), derived under the assumption of statistical independence between the distortion (error) and the reconstructed signal

Example: rate distortion function for the Gaussian source

No compression system can operate below the rate distortion curve!

- A zero-mean Gaussian pdf with variance σx2
- Distortion: the MSE measure
- D = E[(X-Y)2]

- The rate distortion function is:

R(D) = (1/2) log2(σx2 / D) for 0 ≤ D ≤ σx2, and R(D) = 0 for D > σx2
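A direct evaluation of the Gaussian rate distortion function, as a sketch:

```python
import math

def rate_distortion_gaussian(D, var):
    """R(D) for a zero-mean Gaussian source with variance var under MSE:
    0.5 * log2(var / D) for 0 < D <= var, and 0 beyond var."""
    if D >= var:
        return 0.0
    return 0.5 * math.log2(var / D)

print(rate_distortion_gaussian(0.25, 1.0))  # 1.0 bit per sample
print(rate_distortion_gaussian(1.0, 1.0))   # 0.0: no bits needed once D >= variance
```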

For more information:

Khalid Sayood. Introduction to Data Compression, 3rd edition, Morgan Kaufmann, 2005.

Outline

- Revisiting Information theory
- Lossy compression
- Quantization
- Transform coding

Quantization

- A practical lossy compression technique
- The process of representing a large–possibly infinite–set of values with a much smaller set

Example

Use -10, -9, …,0, 1, …, 10 (21 values) to represent real numbers (infinite values)

2.47 => 2

3.1415926 => 3


Loss of information!

A reconstruction value of 3 could correspond to an original value of 2.95, 3.16, 3.05, …, so the original cannot be recovered exactly.
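This rounding quantizer is a one-liner; a sketch, with clipping added at the outermost levels:

```python
def quantize(x):
    """Round a real number to the nearest of the 21 levels -10, ..., 10."""
    return max(-10, min(10, round(x)))

print(quantize(2.47))       # 2
print(quantize(3.1415926))  # 3
print(quantize(14.2))       # 10: values beyond the range clip to the last level
```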

Scalar quantization

- Inputs and outputs are scalars
- Design of a scalar quantizer
- Construct intervals (Encoder)
- Assign codewords to intervals (Encoder)
- Select reconstruction values (Decoder)


Scalar quantization (cont.)

- Input-output map
- The quantizer is a staircase function: the input axis (-∞, ∞) is partitioned at the decision boundaries, and each interval maps to one output value, its reconstruction level

Example: Quantization for image compression

- Original: 8 bits per pixel, values in [0, 255]
- 1 bit per pixel: 2 levels, e.g. {0, 255}
- 2 bits per pixel: 4 levels, e.g. {0, 85, 170, 255}
- 3 bits per pixel: 8 intervals

Quantization problem (1)

- Fixed-length coding

Given an input pdf fx(x) and the number of levels M in the quantizer, find the decision boundaries {bi} and the reconstruction levels {yi} so as to minimize the distortion.

Quantization problem (2)

- Variable-length coding

Given a distortion constraint D ≤ D*, find the decision boundaries {bi}, the reconstruction levels {yi}, and binary codes that minimize the rate while satisfying the constraint.

Vector Quantization

- Operates on blocks of data (vectors) rather than individual scalars
- Encoding requires a search through the codebook (slow); decoding is a simple table lookup (fast)
- How to generate the codebook?

The VQ procedure:

2-d Vector Quantization

- Quantization rule: map each input vector to the nearest codeword
- The quantization regions are no longer restricted to be rectangles!
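A minimal 2-D encoder/decoder sketch (the codebook entries are made up for illustration; a real codebook would come from training):

```python
import numpy as np

# Assumed 2-D codebook with four codewords.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [0.0, -1.5]])

def vq_encode(v):
    """Index of the codeword nearest to v (squared Euclidean distance)."""
    return int(np.argmin(np.sum((codebook - v) ** 2, axis=1)))

def vq_decode(index):
    """The decoder is just a table lookup, which is why decoding is fast."""
    return codebook[index]

i = vq_encode(np.array([0.9, 1.2]))
print(i, vq_decode(i))   # 1 [1. 1.]
```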

Codebook Design

(Figure: a set of training vectors plotted in the x-y plane.)

The Linde-Buzo-Gray Algorithm (also known as k-means)

LBG Algorithm

- Given a training set, iterate two steps: assign each training vector to its nearest codeword, then replace each codeword with the centroid of the vectors assigned to it, until the codebook stops changing
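A compact sketch of the LBG loop (the training data and codebook size are made up for illustration; real image VQ would train on blocks of pixels):

```python
import numpy as np

def lbg(training, M, iters=50, seed=0):
    """LBG / k-means: alternate nearest-codeword assignment and
    centroid update for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    codebook = training[rng.choice(len(training), M, replace=False)]
    for _ in range(iters):
        # Assign each training vector to its nearest codeword
        d = ((training[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # Replace each codeword by the centroid of its cell
        for j in range(M):
            cell = training[labels == j]
            if len(cell):              # guard against the empty-cell problem
                codebook[j] = cell.mean(axis=0)
    return codebook

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(c, 0.1, (50, 2)) for c in ([0, 0], [2, 2])])
print(np.round(lbg(data, M=2), 1))   # two codewords, one near each cluster
```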

Problems of LBG

No guarantee that the procedure will converge to the optimal solution!

Sensitive to the choice of initial points

The empty-cell problem: a codeword may end up with no training vectors assigned to it

More…

- LBG has problems when clusters have different
- Sizes
- Densities
- Non-globular shapes

Use of LBG for Image Compression

Use 4x4 blocks of pixels

Codebook size 16

Codebook size 64

Codebook size 256

Codebook size 1024

Outline

- Revisiting Information theory
- Lossy compression
- Quantization
- Transform coding and the baseline JPEG

Goal: to compact most of the information into a few elements!

Slide credit: Bernd Girod

Transform coding

Transform → Quantization → Binary coding

- Three steps
- Divide a data sequence into blocks of size N and transform each block using a reversible mapping
- Quantize the transformed sequence
- Encode the quantized values

Transforms of interest

- Data-dependent
- Discrete Karhunen-Loève transform (KLT)

- Data-independent
- Discrete cosine transform (DCT)

- Sub-band coding
- Wavelet transform

KLT

- Also known as Principal Component Analysis (PCA), or the Hotelling transform
- Transforms correlated variables into uncorrelated variables
- Basis vectors are eigenvectors of the covariance matrix of the input signal
- Achieves optimum energy concentration
- Disadvantages
- Dependent on signal statistics
- Not separable

DCT (1)

The transform matrix C

Visualize the rows of C: the frequency of the basis vectors increases as we go from top to bottom!

Part of many standards (JPEG, MPEG, H.261, …)
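A sketch that builds the 8x8 DCT-II matrix C with NumPy and applies the 2-D transform C · X · C^T to a flat test block (the block values are made up):

```python
import numpy as np

N = 8
# Rows of C are the DCT basis vectors: row 0 is constant, and the
# frequency increases with the row index.
C = np.array([[np.sqrt((1 if i == 0 else 2) / N) *
               np.cos((2 * j + 1) * i * np.pi / (2 * N))
               for j in range(N)] for i in range(N)])

block = np.full((N, N), 100.0)     # a flat 8x8 block
theta = C @ block @ C.T            # 2-D DCT; the inverse is C.T @ theta @ C

ac = theta.copy()
ac[0, 0] = 0.0                     # zero out the DC term, leaving only ACs
print(round(theta[0, 0]))          # DC coefficient: 800 (= 8 * block mean)
print(np.abs(ac).max() < 1e-9)     # True: a flat block has no AC energy
```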

DCT (3)

The DCT performs close to the optimal KLT in terms of energy compaction

DCT (4)

- DCT on an 8x8 image block
- The 8x8 image block transforms into 8x8 DCT coefficients
- The top-left coefficient is the DC term; the rest are ACs, with frequency increasing from left to right and from top to bottom, so the bottom-right coefficients are the highest-frequency ones

DCT Example (1)

- Keep DC, remove others (quantize to 0s)

DCT Example (2)

- Remove DC

DCT Example (3)

- Keep DC and the first row of ACs

DCT Example (4)

- Keep DC and the first column of ACs

DCT Example (5)

- Keep DC and the first ACs

DCT Example (6)

- Keep DC and the first eight ACs

Quantization and coding of transform coefficients (1)

Transform → Quantization → Binary coding (the DC and AC coefficients are handled separately)

- The bit-rate allocation problem
- Divide the bit rate R among the transform coefficients such that the resulting distortion D is minimized

Quantization and coding (3)

(Figure: coefficients arranged by horizontal and vertical frequency; the top-left entry is the DC component, the remaining entries are AC components, and the high-frequency ACs tend to be small and noisy.)

- Threshold coding
- Transform coefficients that fall below a threshold are discarded
- Example: an 8x8 image block

Baseline algorithm

Slide credit: Bernd Girod

Transform (1)

An 8x8 block from the Sena image transforms into DCT coefficients with large values at the low frequencies and small values at the high frequencies.

- Level shift: subtract the mean
- Subtract 128 from each pixel of an 8-bit image, mapping [0, 255] to [-128, 127]

- 8x8 forward DCT
- If the image dimensions are not multiples of eight, replicate the last column/row until they are

Quantization (1)

- Each DCT coefficient is divided by its quantization step size and rounded to an integer label; the decoder reconstructs the coefficient by multiplying the label back by the step size
- The quantization step sizes are organized in a table (the quantization table)
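A one-coefficient sketch of this quantize/reconstruct round trip (the step size 16 is an assumed table entry, not taken from a real JPEG table):

```python
Q = 16                    # assumed quantization-table entry for one coefficient
coeff = 35.7              # a DCT coefficient
label = round(coeff / Q)  # 2: the integer label that gets entropy coded
recon = label * Q         # 32: the decoder's reconstruction
error = coeff - recon     # ~3.7: quantization error, bounded by Q/2
print(label, recon)
```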

Quantization (2)

Larger step sizes introduce more quantization error. Quantization errors in the DC and lower AC coefficients are more easily detectable than those in the higher AC coefficients, so the table assigns smaller step sizes to the low-frequency coefficients.

Encode

Remove mean → DCT → Quantize → Zigzag scan:

-26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 −1 1 −1 2 0 0 0 0 0 −1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
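The zigzag order itself can be generated by sorting the 64 index pairs by anti-diagonal, alternating direction between odd and even diagonals (a sketch, not the table-driven form real codecs use):

```python
def zigzag_order(n=8):
    """Indices of an n x n block in zigzag scan order."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

order = zigzag_order()
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```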

Add EOF

-26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 −1 1 −1 2 0 0 0 0 0 −1 −1 EOF

Entropy coding

010001010000101110000011101010011000101111100000……………

Decode

010001010000101110000011101010011000101111100000……………

Entropy decoding

-26 -3 0 -3 -2 -6 2 -4 1 -4 1 1 5 1 2 -1 1 -1 2 0 0 −1 1 −1 2 0 0 0 0 0 −1 −1 EOF

Put into a block → Q-1 (inverse quantization) → Inverse DCT → Add mean → Reconstructed block
