
Data Compression: Advanced Topics



1. Data Compression: Advanced Topics
• Huffman Coding Algorithm
  • Motivation
  • Procedure
  • Examples
• Unitary Transforms
  • Definition
  • Properties
  • Applications

2. Recall: Variable Length Codes (VLC)
Recall self-information: I(x) = -log2 p(x).
It follows from this formula that a small-probability event contains much information and is therefore worth many bits to represent. Conversely, if some event occurs frequently, it is probably a good idea to use as few bits as possible to represent it. This observation leads to the idea of varying the code lengths based on the events' probabilities:
• Assign a long codeword to an event with small probability
• Assign a short codeword to an event with large probability

3. Two Goals of VLC Design
• Achieve optimal code length (i.e., minimal redundancy). For an event x with probability p(x), the optimal code length is ceil(-log2 p(x)) bits, where ceil(x) denotes the smallest integer not less than x (e.g., ceil(3.4) = 4).
• Code redundancy: r = (average code length) - (source entropy H). Unless the probabilities of the events are all powers of 2, we often have r > 0.
• Satisfy the uniquely decodable (prefix) condition.
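A minimal numeric sketch of the two formulas above (the probabilities are an example of my choosing, not from the slides):

% optimal integer code lengths ceil(-log2 p) and the redundancy r
p = [0.5 0.25 0.125 0.125];   % example probabilities (all powers of 2)
len = ceil(-log2(p));         % optimal code lengths: [1 2 3 3]
H = -sum(p .* log2(p));       % source entropy: 1.75 bps
r = sum(p .* len) - H         % redundancy: 0, since all p are powers of 2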

4. "Big Question"
How can we simultaneously achieve minimum redundancy and the uniquely decodable condition? D. Huffman was the first to think about this problem and come up with a systematic solution.

5. Huffman Coding (Huffman, 1952)
Coding procedure for an N-symbol source (see the sketch below):
• Source reduction
  • List all probabilities in descending order
  • Merge the two symbols with the smallest probabilities into a new compound symbol
  • Repeat the above two steps N-2 times
• Codeword assignment
  • Start from the smallest reduced source and work back to the original source
  • Each merging point corresponds to a node in the binary codeword tree
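The source-reduction step can be sketched in a few lines of MATLAB. This is a minimal illustration that computes only the code lengths; the function name huffman_lengths is mine, not from the slides:

% Huffman source reduction: returns the code length of each symbol
function len = huffman_lengths(p)
len = zeros(1, numel(p));
groups = num2cell(1:numel(p));              % each symbol starts alone
while numel(p) > 1
    [p, idx] = sort(p, 'descend');          % list probabilities in descending order
    groups = groups(idx);
    merged = [groups{end-1}, groups{end}];  % merge the two smallest
    len(merged) = len(merged) + 1;          % each merge adds one bit to its members
    p = [p(1:end-2), p(end-1) + p(end)];
    groups = [groups(1:end-2), {merged}];
end
end

For Example-I below, huffman_lengths([0.5 0.25 0.125 0.125]) returns [1 2 3 3].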

6. Example-I
Step 1: Source reduction (each column lists the reduced source in descending order; compound symbols are in parentheses)

symbol x   p(x)
S          0.5     0.5          0.5
N          0.25    0.25         0.5 (NEW)
E          0.125   0.25 (EW)
W          0.125

7. Example-I (Con't)
Step 2: Codeword assignment. At each merging node, label the two branches "0" and "1" and read the codewords from the root: (EW) gets 11, (NEW) gets 1.

symbol x   p(x)     codeword
S          0.5      0
N          0.25     10
E          0.125    110
W          0.125    111

8. Example-I (Con't)
Two equally valid codeword trees, e.g.:
• S = 0, N = 10, E = 110, W = 111
• S = 1, N = 01, E = 000, W = 001
The codeword assignment is not unique. In fact, at each merging point (node), we can arbitrarily assign "0" and "1" to the two branches; the average code length is the same.
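To see the prefix (uniquely decodable) condition at work, here is a small decoding sketch using the slide-7 codebook; the bit stream is an example of my own:

% prefix decoding: no codeword is a prefix of another, so a symbol
% can be emitted as soon as the buffer matches a codeword
code = {'0', '10', '110', '111'};  symbols = 'SNEW';
bits = '0101101110';               % encodes S, N, E, W, S
out = ''; buf = '';
for b = bits
    buf = [buf b];
    [hit, k] = ismember(buf, code);
    if hit, out = [out symbols(k)]; buf = ''; end
end
out                                % returns 'SNEWS'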

9. Example-II
Step 1: Source reduction (each column lists the reduced source in descending order; compound symbols are in parentheses)

symbol x   p(x)
e          0.4     0.4          0.4          0.6 (aiou)
a          0.2     0.2          0.4 (iou)    0.4
i          0.2     0.2          0.2
o          0.1     0.2 (ou)
u          0.1

10. Example-II (Con't)
Step 2: Codeword assignment (the compound symbols receive the intermediate codewords (aiou) = 0, (iou) = 00, (ou) = 001)

symbol x   p(x)    codeword
e          0.4     1
a          0.2     01
i          0.2     000
o          0.1     0010
u          0.1     0011

11. Example-II (Con't)
Binary codeword tree representation: at the root, 0 → (aiou) and 1 → e; under (aiou), 00 → (iou) and 01 → a; under (iou), 000 → i and 001 → (ou); under (ou), 0010 → o and 0011 → u.

12. Example-II (Con't)

symbol x   p(x)    codeword   length
e          0.4     1          1
a          0.2     01         2
i          0.2     000        3
o          0.1     0010       4
u          0.1     0011       4

The average Huffman code length is 0.4*1 + 0.2*2 + 0.2*3 + 0.1*4 + 0.1*4 = 2.2 bps, against a source entropy of 2.122 bps (redundancy 0.078 bps). If we use fixed-length codes instead, we have to spend three bits per sample, which gives a code redundancy of 3 - 2.122 = 0.878 bps.
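A quick MATLAB check of the numbers above (probabilities and code lengths taken from the table):

p   = [0.4 0.2 0.2 0.1 0.1];   % e a i o u
len = [1 2 3 4 4];             % Huffman code lengths from the table
H = -sum(p .* log2(p));        % source entropy: ~2.122 bps
Lavg = sum(p .* len);          % average Huffman length: 2.2 bps
r_huffman = Lavg - H           % ~0.078 bps
r_fixed = 3 - H                % ~0.878 bps, as stated above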

13. Example-III
Step 1: Source reduction
[The source-reduction table, with its compound symbol, appears only as an image on the original slide and is not recoverable from this transcript.]

14. Example-III (Con't)
Step 2: Codeword assignment
[The codeword-assignment table, with its compound symbol, appears only as an image on the original slide and is not recoverable from this transcript.]

15. Summary of Huffman Coding Algorithm
• Achieves minimal redundancy subject to the constraint that the source symbols are coded one at a time
• Sorting symbols in descending order of probability is the key to the source-reduction step
• The codeword assignment is not unique: exchanging the "0" and "1" labels at any node of the binary codeword tree produces another solution that works equally well
• Only works for a source with a finite number of symbols (otherwise, it does not know where to start)

16. Data Compression: Advanced Topics
• Huffman Coding Algorithm
  • Motivation
  • Procedure
  • Examples
• Unitary Transforms
  • Definition
  • Properties
  • Applications

17. An Example of 1D Transform with Two Variables
[Figure: the point (1,1) in the (x1, x2) plane maps to (1.414, 0) in the (y1, y2) plane.]
One such transform matrix is A = [1 1; 1 -1]/sqrt(2) (the 2-by-2 Haar/Hadamard matrix used later in these slides): y1 = (x1 + x2)/sqrt(2), y2 = (x1 - x2)/sqrt(2), so (1,1) maps to (sqrt(2), 0) ≈ (1.414, 0).

18. Decorrelating Property of Transform
• x1 and x2 are highly correlated: p(x1, x2) ≠ p(x1) p(x2)
• y1 and y2 are less correlated: p(y1, y2) ≈ p(y1) p(y2)
Please use the MATLAB demo program to help you understand why less correlation is desirable for image compression.

19. Transform = Change of Coordinates
• Intuitively speaking, a transform plays the role of facilitating source modeling
• Due to the decorrelating property of transforms, it is easier to model the transform coefficients Y than the pixel values X
• An appropriate choice of transform (transform matrix A) depends on the source statistics P(X)
• We will only consider the class of transforms corresponding to unitary matrices

20. Unitary Matrix
Definition: a matrix A is called unitary if A^{-1} = A^{*T}, where * denotes complex conjugation and T denotes transpose.
Example: A = [1 1; 1 -1]/sqrt(2), for which A^{-1} = A^T = A.
Notes:
• The transpose and the conjugate can be exchanged, i.e., A^{*T} = A^{T*}
• For a real matrix A, it is unitary if A^{-1} = A^T
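A one-line MATLAB check of the definition, using the real example above (A' is MATLAB's conjugate transpose):

A = [1 1; 1 -1]/sqrt(2);   % the real example above
norm(A'*A - eye(2))        % ~0, so A^{-1} = A^{*T} (= A^T here)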

21. Example 1: Discrete Fourier Transform (DFT)
DFT matrix: A = [a_{kn}] with a_{kn} = W^{kn}/sqrt(N), k, n = 0, 1, ..., N-1, where W = e^{-j2*pi/N} is the N-th root of unity (a point on the unit circle in the complex plane).
DFT: y(k) = (1/sqrt(N)) * sum over n of x(n) W^{kn}

22. Discrete Fourier Transform (Con't)
Properties of the DFT matrix:
• Symmetry: A^T = A. Proof: a_{kn} = W^{kn}/sqrt(N) = a_{nk}, since the exponent kn is symmetric in k and n.
• Unitary: A^{-1} = A^*. Proof: if we denote B = A A^{*T}, then b_{kl} = (1/N) * sum over n of W^{(k-l)n}, which is 1 for k = l and 0 otherwise (a geometric series of roots of unity), so B = I (identity matrix).
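Both properties are easy to verify numerically; a sketch, building the unitary DFT matrix defined on the previous slide:

N = 8;
[n, k] = meshgrid(0:N-1);     % all (k, n) index pairs
W = exp(-1j*2*pi/N);          % N-th root of unity
A = W.^(k.*n) / sqrt(N);      % a_{kn} = W^{kn}/sqrt(N)
norm(A - A.', 'fro')          % ~0: symmetry, A^T = A
norm(A*A' - eye(N), 'fro')    % ~0: unitary, A^{-1} = A^{*T}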

23. Example 2: Discrete Cosine Transform (DCT)
C = [c_{kn}] with c_{kn} = alpha(k) cos((2n+1) k pi / (2N)), k, n = 0, 1, ..., N-1, where alpha(0) = sqrt(1/N) and alpha(k) = sqrt(2/N) for k > 0.
Unlike the DFT matrix, the DCT matrix is real. You can check it using the MATLAB demo.

24. DCT Examples
N = 2 (identical to the 2-by-2 Haar transform):
 0.7071  0.7071
 0.7071 -0.7071
N = 4:
 0.5000  0.5000  0.5000  0.5000
 0.6533  0.2706 -0.2706 -0.6533
 0.5000 -0.5000 -0.5000  0.5000
 0.2706 -0.6533  0.6533 -0.2706
Here is a piece of MATLAB code to generate the DCT matrix by yourself (dct requires the Signal Processing Toolbox):

% generate DCT matrix with size of N-by-N
function C = DCT_matrix(N)
C = zeros(N);
for i = 1:N
    x = zeros(N,1); x(i) = 1;   % i-th unit vector
    C(:,i) = dct(x);            % its DCT is the i-th column
end
end

25. Example 3: Hadamard Transform
Here is a piece of MATLAB code to generate the normalized Hadamard matrix (N = 2^n) by yourself:

% generate normalized Hadamard matrix of size N = 2^n
% (note: this shadows MATLAB's built-in hadamard, which is unnormalized)
function H = hadamard(n)
H = [1 1; 1 -1]/sqrt(2);
i = 1;
while i < n
    H = [H H; H -H]/sqrt(2);   % double the size, keeping H unitary
    i = i + 1;
end
end

26. 1D Unitary Transform
When the transform matrix A is unitary, the defined 1D transform is called a unitary transform.
Forward transform: y = A x
Inverse transform: x = A^{-1} y = A^{*T} y
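In MATLAB the forward/inverse pair is a matrix multiply and a conjugate-transposed multiply; a sketch with the 2-by-2 matrix used earlier:

A = [1 1; 1 -1]/sqrt(2);   % any unitary A works the same way
x = [3; 4];
y = A * x;                 % forward transform: y = A x
x_rec = A' * y;            % inverse transform: x = A^{*T} y
norm(x - x_rec)            % ~0: perfect reconstruction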

27. Basis Vectors
• Basis vectors corresponding to the forward transform: the column vectors of the transform matrix A
• Basis vectors corresponding to the inverse transform: the column vectors of A^{*T}

28. From 1D to 2D
Do N 1D transforms in parallel (one 1D transform for each column of the N-by-N block).

29. Definition of 2D Transform
2D forward transform: Y = A X A^T
• A X: 1D column transform (apply A to each column of X)
• (A X) A^T: 1D row transform (apply A to each row of the result)

30. 2D Transform = Two Sequential 1D Transforms
Y = A X A^T can be computed as (A X) A^T (left matrix multiplication first: column transform, then row transform) or as A (X A^T) (right matrix multiplication first: row transform, then column transform).
Conclusions (see the sketch below):
• A 2D separable transform can be decomposed into two sequential 1D transforms
• The ordering of the two 1D transforms does not matter
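A sketch confirming that the two orderings agree (hadamard here is MATLAB's built-in, which returns the unnormalized +/-1 matrix, not the slide-25 function):

N = 4; A = hadamard(N)/sqrt(N);   % normalized (unitary) Hadamard matrix
X = magic(N);                     % an arbitrary N-by-N input block
Y1 = (A * X) * A.';               % column transform first, then row
Y2 = A * (X * A.');               % row transform first, then column
norm(Y1 - Y2, 'fro')              % ~0: the ordering does not matter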

31. Basis Images
Basis image B_ij is the outer product of an N-by-1 basis vector and a 1-by-N (transposed) basis vector. It can be viewed as the response of the linear system (the 2D transform) to a delta-function input delta_ij.

32. Example 1: 8-by-8 Hadamard Transform
[Figure: the 64 basis images B_ij arranged in an 8-by-8 grid indexed by (i, j); the top-left one is the DC basis image.]
In the MATLAB demo, you can generate these 64 basis images and display them (see the sketch below).
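A sketch of how such basis images can be generated, feeding each delta-function input through the 2D inverse transform X = A^{*T} Y A^* (hadamard is MATLAB's built-in):

N = 8; A = hadamard(N)/sqrt(N);    % unitary 8-by-8 Hadamard matrix
B = cell(N, N);
for i = 1:N
    for j = 1:N
        D = zeros(N); D(i,j) = 1;  % delta-function input delta_ij
        B{i,j} = A' * D * conj(A); % basis image B_ij
    end
end
imagesc(cell2mat(B)), colormap gray, axis image   % display the 8x8 grid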

33. Example 2: 8-by-8 DCT
[Figure: the 64 DCT basis images arranged in an 8-by-8 grid indexed by (i, j); the top-left one is the DC basis image.]
In the MATLAB demo, you can generate these 64 basis images and display them.

34. 2D Unitary Transform
Suppose A is a unitary matrix.
Forward transform: Y = A X A^T
Inverse transform: X = A^{*T} Y A^*
Proof: since A is a unitary matrix, we have A^{*T} Y A^* = A^{*T} (A X A^T) A^* = (A^{*T} A) X (A^T A^*) = X.

35. Properties of Unitary Transforms
• Energy compaction: only a small fraction of the transform coefficients have large magnitude. This property is related to the decorrelating capability of unitary transforms.
• Energy conservation: a unitary transform preserves the 2-norm of input vectors. This property essentially comes from the fact that rotating the coordinates does not affect Euclidean distance.

36. Energy Compaction Property
How does a unitary transform compact the energy?
• Assumption: the signal is correlated; no energy compaction can be achieved for white noise, even with a unitary transform
• Advanced mathematical analysis shows that the DCT basis is an approximation of the eigenvectors of an AR(1) process (a good model for correlated signals such as images)
A frequency-domain interpretation:
• Images are a mixture of smooth regions and edges
• Most transform coefficients are small, except those around DC and those corresponding to edges (spatially high-frequency components)

37. Energy Compaction Example in 1D
A coefficient is called significant if its magnitude is above a pre-selected threshold th.
[Figure: a test vector and its Hadamard transform coefficients; with th = 64, only a few coefficients are significant and the rest are insignificant.]
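Since the slide's own vector did not survive the transcript, here is a substitute sketch showing the effect on a smooth (correlated) signal, using MATLAB's built-in hadamard:

N = 8; H = hadamard(N)/sqrt(N);           % unitary Hadamard matrix
x = [100 102 105 108 110 112 115 118]';   % smooth, highly correlated samples
y = H * x;                                % transform coefficients
th = 64;
nnz(abs(y) > th)                          % 1: only the DC coefficient is significant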

38. Energy Compaction Example in 2D
A coefficient is called significant if its magnitude is above a pre-selected threshold th.
[Figure: an example 2D block and its transform coefficients; with th = 64, most coefficients are insignificant.]

39. Image Example
[Figure: the original cameraman image X (left) and its DCT coefficients Y (right); low-frequency coefficients sit in the top-left corner, high-frequency ones toward the bottom-right. Only 2451 coefficients are significant at th = 64.]
Notice the excellent energy compaction property of the DCT.

40. Counter Example
[Figure: an original noise image X and its DCT coefficients Y; the coefficient magnitudes are spread roughly uniformly.]
No energy compaction can be achieved for white noise.

41. Energy Conservation Property in 1D
1D case: if A is unitary and y = A x, then ||y||^2 = ||x||^2.
Proof: ||y||^2 = y^{*T} y = (A x)^{*T} (A x) = x^{*T} A^{*T} A x = x^{*T} x = ||x||^2.

42. Numerical Example
Check that ||y||^2 = ||x||^2 for a concrete x and unitary A. (The slide's own numbers did not survive the transcript; a substitute sketch follows.)
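A = [1 1; 1 -1]/sqrt(2);     % unitary
x = [3; 4];
y = A * x;                   % y = [4.9497; -0.7071]
[norm(x)^2, norm(y)^2]       % both 25: energy is conserved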

43. Implication of Energy Conservation
Transform-domain quantization: x → (T) → y → (Q) → y_hat → (T^{-1}) → x_hat.
By the linearity of the transform and because A is unitary, ||x - x_hat|| = ||A^{-1}(y - y_hat)|| = ||y - y_hat||: the quantization error in the transform domain equals the reconstruction error in the signal domain.

44. Energy Conservation Property in 2D
The 2-norm of a matrix X: ||X||^2 = sum over m, n of |x_{mn}|^2.
Step 1: if A is unitary, then ||A X|| = ||X||.
Proof: A X transforms each column of X by A; applying the energy conservation property in 1D to each column and summing gives ||A X||^2 = ||X||^2.

45. Energy Conservation Property in 2D (Con't)
Step 2: if A is unitary, then ||A X A^T|| = ||X||.
Hint: the 2D transform can be decomposed into two sequential 1D transforms, e.g., a column transform Y1 = A X followed by a row transform Y1 A^T. Use the result obtained in Step 1 and note that ||X^T|| = ||X||: then ||Y1 A^T|| = ||A Y1^T|| = ||Y1^T|| = ||Y1|| = ||X||.

46. Numerical Example
Check that ||Y||^2 = ||X||^2 for Y = A X A^T. (The slide's own numbers did not survive the transcript; a substitute sketch follows.)
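A = [1 1; 1 -1]/sqrt(2);                 % unitary
X = [1 2; 3 4];
Y = A * X * A.';                         % Y = [5 -1; -2 0]
[sum(abs(X(:)).^2), sum(abs(Y(:)).^2)]   % both 30: ||X||^2 = ||Y||^2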

47. Implication of Energy Conservation
Similar to the 1D case, for the chain X → (T) → Y → (Q) → Y_hat → (T^{-1}) → X_hat, the quantization noise in the transform domain has the same energy as that in the spatial domain: ||X - X_hat|| = ||Y - Y_hat||.

48. Why Energy Conservation?
Encoder: image X → forward transform f → coefficients Y → entropy coding (driven by probability estimation) → binary bit stream → "super channel".
Decoder: binary bit stream → entropy decoding → Y_hat → inverse transform f^{-1} → image X_hat.
Energy conservation guarantees that the distortion introduced in the transform domain is exactly the distortion seen in the reconstructed image X_hat.
