1 / 27

LZW Compression

LZW Compression. Heejin Park College of Information and Communications Hanyang University. Overview. Popular text compressors such as zip and Unix’s compress are based on the LZW (Lempel-Ziv-Welch) method.

toni
Download Presentation

LZW Compression

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. LZW Compression Heejin Park College of Information and Communications Hanyang University

  2. Overview • Popular text compressors such as zip and Unix’s compress are based on the LZW (Lempel-Ziv-Welch) method. • Character sequences in the original text are replaced by codes that are dynamically determined.

  3. code 0 1 key a b Compression • Assume the letters in the text are limited to {a, b}. • In practice, the alphabet may be the 256 character ASCII set. • The characters in the alphabet are assigned code numbers beginning at 0. • The initial code table is:

  4. code 0 1 key a b Compression • Original text = abababbabaabbabbaabba • Compress by scanning the original text from left to right. • Find longest prefix p for which there is a code in the code table. • Represent p by its code pCode and assign the next available code number to pc, where c is the next character in the text that is to be compressed.

  5. code 0 1 2 key a b ab Compression • Original text = abababbabaabbabbaabba • Compressed text = • Findp = a,pCode = 0. • Take c = b • Representaby0and enterabinto the code table. • Compressed text = 0 (ahas been compressed)

  6. 3 ba code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 0 • Findp = b,pCode = 1. • Take c = a • Representbby1and enterbainto the code table. • Compressed text = 01

  7. 3 4 ba aba code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 01 • Findp = ab,pCode = 2. • Take c = a • Representabby2and enterabainto the code table. • Compressed text = 012

  8. 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 012 • Findp = ab,pCode = 2. • Take c = b • Representabby2and enterabbinto the code table. • Compressed text = 0122

  9. 6 bab 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 0122 • Findp = ba,pCode = 3. • Take c = b • Representbaby3and enterbabinto the code table. • Compressed text = 01223

  10. 6 7 baa bab 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 01223 • Findp = ba,pCode = 3. • Take c = a • Representbaby3and enterbaainto the code table. • Compressed text = 012233

  11. 7 6 8 abba baa bab 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 012233 • Findp = abb,pCode = 5. • Take c = a • Representabbby5and enterabbainto the code table. • Compressed text = 0122335

  12. 8 7 6 9 abbaa abba baa bab 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 0122335 • Findp = abba,pCode = 8. • Take c = a • Representabbaby8and enterabbaainto the code table. • Compressed text = 01223358

  13. 8 7 6 9 abbaa abba baa bab 3 4 5 ba aba abb code 0 1 2 key a b ab Compression • Original text =abababbabaabbabbaabba • Compressed text = 01223358 • Findp = abba,pCode = 8. • Take c = null • Representabbaby8 • Compressed text = 012233588

  14. 9 7 6 8 abba baa bab abbaa 3 4 5 ba aba abb code 0 1 2 key a b ab Compression Code Table Representation • Dictionary. • Pairs are (key, element)=(key,code). • Operations are :get(key)andput(key, code) • Limit number of codes to 212. • Use a hash table. • Convert variable length keys into fixed length keys. • Each key has the form pc, where the string p is a key that is already in the table. • Replace pc with (pCode)c.

  15. 8 7 6 9 9 8 7 6 8a bab baa abba 3a 5a abbaa 3b 3 4 5 ba aba abb 3 4 5 1a 2a 2b code code 0 0 1 1 2 2 key key a a b b ab 0b Compression Code Table Representation Modified LZW compression dictionary

  16. Decompression code 0 1 • Original text =abababbabaabbabbaabba • Compressed text =012233588 • Convert codes to text from left to right. • 0representsa. • Decompressed text =a • pCode=0andp = a. • p= afollowed by next text character(c) is entered into the code table. key a b

  17. code 0 1 2 key a b ab Decompression • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 1representsb. • Decompressed text =ab • pCode =1 and p = b. • lastP = afollowed by first character of p is entered into the code table.

  18. code 0 1 2 key a b ab Decompression 3 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 2representsab. • Decompressed text =abab • pCode =2 and p = ab. • lastP = bfollowed by first character of p is entered into the code table. ba

  19. code 0 1 2 key a b ab Decompression 3 4 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 2representsab. • Decompressed text =ababab • pCode =2 and p = ab. • lastP = abfollowed by first character of p is entered into the code table. ba aba

  20. code code 0 0 1 1 2 2 key key a a b b ab ab Decompression 3 3 4 5 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 3representsba. • Decompressed text =abababba • pCode =3 and p = ba. • lastP = abfollowed by first character of p is entered into the code table. ba ba aba abb

  21. 6 bab code code code 0 0 0 1 1 1 2 2 2 key key key a a a b b b ab ab ab Decompression 3 3 3 4 4 5 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 3representsba. • Decompressed text =abababbaba • pCode =3 and p = ba. • lastP = bafollowed by first character of p is entered into the code table. ba ba ba aba aba abb

  22. 7 6 baa bab code code code code 0 0 0 0 1 1 1 1 2 2 2 2 key key key key a a a a b b b b ab ab ab ab Decompression 3 3 3 3 4 4 4 5 5 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 5 representsabb. • Decompressed text =abababbabaabb • pCode =5 and p = abb. • lastP = bafollowed by first character of p is entered into the code table. ba ba ba ba aba aba aba abb abb

  23. 6 8 6 7 bab bab abba baa code code code code code 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 key key key key key a a a a a b b b b b ab ab ab ab ab Decompression 3 3 3 3 3 4 4 4 4 5 5 5 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 8 represents???. • When a code is not in the table, its key is lastPfollowed by first character oflastP. • lastP = abb • So 8 representsabba. ba ba ba ba ba aba aba aba aba abb abb abb

  24. 6 9 6 7 7 6 8 bab abba abbaa baa bab baa bab code code code code code code 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 key key key key key key a a a a a a b b b b b b ab ab ab ab ab ab Decompression 3 3 3 3 3 3 4 4 4 4 4 5 5 5 5 • Original text = abababbabaabbabbaabba • Compressed text =012233588 • 8 representsabba. • Decompressed text =abababbabaabbabbaabba • pCode =8 and p = abba. • lastP = abbafollowed by first character of p is entered into the code table. ba ba ba ba ba ba aba aba aba aba aba abb abb abb abb

  25. 8 7 6 9 abbaa abba baa bab 3 4 5 ba aba abb code 0 1 2 key a b ab Decompression Code Table Representation • Dictionary. • Pairs are (key, element) = (code, what the code represents) = (code, codeKey). • Operations are :get(key)andput(key, code) • Keys are integers 0, 1, 2, … • Use a 1D array codeTable[code]= codeKey • Each code key has the form pc, where the string p is a code key that is already in the table. • Replace pc with (pCode)c.

  26. Performance Evaluation • Time Complexity • Compression. • O(n) expected time, where n is the length of the text. • Decompression. • O(n) time, where n is the length of the decompressed text.

  27. conclusion • Character sequences in the original text are replaced by codes that are dynamically determined. • The code table is not encoded into the compressed text, because it may be reconstructed from the compresses text during decompression.

More Related