Algorithm Programming 1 89-210 Some Topics in Compression




Bar-Ilan University

2007-2008 תשס"ח

by Moshe Fresko

Huffman Coding
  • Variable-length encoding
  • Works on probabilities of symbols (characters, words, etc.)
  • Build a tree
    • Take the two least frequent symbols/nodes
    • Join them under a new parent node
    • The parent node’s frequency is the sum of its children’s frequencies
    • Repeat until all symbols are joined into a single tree
    • The path from the root to a leaf gives that symbol’s code
  • Frequent symbols end up near the root, which gives them short codes
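
The tree-building steps above can be sketched in Python as follows. This is only an illustration, not the course's reference implementation: the function name `huffman_codes`, the use of a heap, the tie-breaking counter, and the choice of "0" for the left branch and "1" for the right branch are all assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol of `text` to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}

    # Heap items are (frequency, tie_breaker, tree).
    # A tree is either a symbol (leaf) or a (left, right) tuple (parent).
    heap = [(f, i, sym) for i, (sym, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)  # unique tie-breaker so trees are never compared

    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # least frequent node
        f2, _, t2 = heapq.heappop(heap)  # second least frequent node
        # The parent's frequency is the sum of its children's frequencies.
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1

    codes = {}

    def walk(tree, prefix):
        # The path from the root to a leaf spells out the symbol's code.
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


codes = huffman_codes("aaaabbc")
print(codes)  # 'a' gets a 1-bit code; 'b' and 'c' get 2-bit codes
```

The exact bit patterns depend on how ties are broken, but the code lengths (and therefore the compressed size) are what the algorithm guarantees.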
LZ77
  • Introduced in 1977 by Abraham Lempel and Jacob Ziv
  • Dictionary based
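  • Works within a sliding window of size n over the most recently seen text
  • Decoding is simple and fast; encoding is slower because it must search the window for matches
  • Produces a list of tuples (Pos,Len,C)
    • Pos : Distance backwards from the current position to the start of the match
    • Len : Number of symbols to copy
    • C : The next character after the match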
LZ77
  • Based on strings that repeat themselves

An outcry in Spain is an outcry in vain
  → An outcry in Spa(6,3)is a(22,12)v(21,3)

aaaaaaaaaa
  → a(1,9)
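
To see why decoding is simple, here is a minimal Python sketch that expands the two encoded strings above. The token representation is an assumption made for the example: literal text is a `str` and a back-reference is a `(pos, length)` tuple; the name `lz77_expand` is hypothetical.

```python
def lz77_expand(tokens):
    """Expand a list of literals (str) and (pos, length) back-references."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            pos, length = tok
            start = len(out) - pos
            # Copy one symbol at a time so that a reference may overlap
            # the text it is producing, as in a(1,9).
            for i in range(length):
                out.append(out[start + i])
        else:
            out.extend(tok)
    return "".join(out)


print(lz77_expand(["An outcry in Spa", (6, 3), "is a", (22, 12), "v", (21, 3)]))
# An outcry in Spain is an outcry in vain
print(lz77_expand(["a", (1, 9)]))
# aaaaaaaaaa
```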

LZ77 - Example
  • Window size : 5
  • ABBABCABBBBC

NextSeq   Code
A         (0,0,A)
B         (0,0,B)
BA        (1,1,A)
BC        (3,1,C)
ABB       (3,2,B)
BBC       (2,2,C)
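
A brute-force encoder that reproduces the table above might look like the following Python sketch. Several details are assumptions rather than part of the slides: the name `lz77_encode`, the rule that ties between equally long matches go to the larger Pos (chosen because it matches the table), and the rule that the last symbol is always kept as the literal C.

```python
def lz77_encode(text, window=5):
    """Encode `text` as a list of (pos, length, next_char) triples."""
    out = []
    i = 0
    while i < len(text):
        best_pos, best_len = 0, 0
        # Try every distance that fits inside the window.
        for pos in range(1, min(window, i) + 1):
            length = 0
            # Stop one symbol early so a literal always remains for C.
            while (i + length < len(text) - 1
                   and text[i - pos + length] == text[i + length]):
                length += 1
            # ">=" prefers the larger distance among equally long matches.
            if length > 0 and length >= best_len:
                best_pos, best_len = pos, length
        out.append((best_pos, best_len, text[i + best_len]))
        i += best_len + 1
    return out


print(lz77_encode("ABBABCABBBBC", window=5))
# [(0, 0, 'A'), (0, 0, 'B'), (1, 1, 'A'), (3, 1, 'C'), (3, 2, 'B'), (2, 2, 'C')]
```

The nested search is what makes encoding slower than decoding; practical encoders usually replace it with a hash table or similar index over the window.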

LZ77 - Some Variations
  • LZSS - Uses a flag bit to distinguish pointers from literal characters.
  • LZR - No limit on the pointer size.
  • LZH - Compresses the pointers with Huffman coding.
LZ78
  • Instead of a window onto previously seen text, a dictionary of phrases is built
  • Both encoding and decoding are simple
    • From the current position in the text, find the longest phrase that is in the dictionary
    • Output the pair (Index,NextChar)
      • Index : The dictionary index of that phrase (0 if no phrase matches)
      • NextChar : The next character after that phrase
    • Add the new phrase (the matched phrase plus the next character) to the dictionary
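
The encoding steps above can be sketched in Python like this. The name `lz78_encode` is hypothetical, and representing the end of input as `None` in the last pair is an assumption made for the example.

```python
def lz78_encode(text):
    """Encode `text` as a list of (index, next_char) pairs."""
    dictionary = {}  # phrase -> index; index 0 means "empty phrase"
    out = []
    w = ""           # longest dictionary phrase matched so far
    for ch in text:
        if w + ch in dictionary:
            w += ch  # keep extending the match
        else:
            out.append((dictionary.get(w, 0), ch))
            dictionary[w + ch] = len(dictionary) + 1
            w = ""
    if w:
        # Input ended in the middle of a known phrase.
        out.append((dictionary[w], None))
    return out


print(lz78_encode("ABBABCABBBBC"))
# [(0, 'A'), (0, 'B'), (2, 'A'), (2, 'C'), (1, 'B'), (2, 'B'), (4, None)]
```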
LZ78 - Example
  • ABBABCABBBBC

Input   Output     Add to dictionary
A       (0,A)      1 = “A”
B       (0,B)      2 = “B”
BA      (2,A)      3 = “BA”
BC      (2,C)      4 = “BC”
AB      (1,B)      5 = “AB”
BB      (2,B)      6 = “BB”
BC      (4,EOLN)

  • Dictionary size
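
Decoding rebuilds the same dictionary from the pairs alone. A minimal Python sketch, assuming the output column of the example is given as a list of `(index, next_char)` pairs with `None` standing in for EOLN (the name `lz78_decode` is hypothetical):

```python
def lz78_decode(pairs):
    """Rebuild the text from a list of (index, next_char) pairs."""
    phrases = [""]  # index 0 is the empty phrase
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + (ch or "")
        out.append(phrase)
        if ch is not None:
            # Mirror the encoder: every pair with a character adds a phrase.
            phrases.append(phrase)
    return "".join(out)


pairs = [(0, "A"), (0, "B"), (2, "A"), (2, "C"), (1, "B"), (2, "B"), (4, None)]
print(lz78_decode(pairs))  # ABBABCABBBBC
```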
LZW
  • Produces only a list of dictionary entry indexes
  • Encoding
    • Start with an initial dictionary
      • For example, all single-byte characters (0..255)
    • From the input, find the longest string that exists in the dictionary
    • Output this string’s index in the dictionary
    • Append the next input character to that string and add the result to the dictionary
    • Continue from that next character, repeating from the “find the longest string” step
LZW - Example
  • ABBABCABBBBC
    • Initial dictionary 0=“A”, 1=“B”, 2=“C”

Input   NextChar   Output   Add to dictionary
A       B          0        3 = “AB”
B       B          1        4 = “BB”
B       A          1        5 = “BA”
AB      C          3        6 = “ABC”
C       A          2        7 = “CA”
AB      B          3        8 = “ABB”
BB      B          4        9 = “BBB”
B       C          1        10 = “BC”
C       -          2        -

  • Dictionary size : ?
LZW – Encoding Example
  • T=ababcbababaaaaaaa
  • Initial dictionary entries : 1=a 2=b 3=c

Input   Output   NextSymbol   Add to dictionary
a       1        b            4 = ab
b       2        a            5 = ba
ab      4        c            6 = abc
c       3        b            7 = cb
ba      5        b            8 = bab
bab     8        a            9 = baba
a       1        a            10 = aa
aa      10       a            11 = aaa
aaa     11       a            12 = aaaa
a       1        -            -

LZW – Encoding Algorithm

w = Empty
while ( read next symbol k ) {
    if wk exists in the dictionary {
        w = wk
    } else {
        add wk to the dictionary
        output the code for w
        w = k
    }
}
output the code for w
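
As a runnable illustration, the encoding loop can be written in Python roughly as follows. The function name `lzw_encode` and its `alphabet` and `first_code` parameters are assumptions introduced so that both worked examples (codes starting at 0 and at 1) can be reproduced; the sketch also assumes every input symbol appears in the initial alphabet, and it emits the code for the final `w` once the input is exhausted.

```python
def lzw_encode(text, alphabet, first_code=0):
    """Encode `text` as a list of dictionary codes."""
    # Initial dictionary: one entry per symbol of the alphabet.
    dictionary = {sym: first_code + i for i, sym in enumerate(alphabet)}
    next_code = first_code + len(alphabet)
    w = ""
    out = []
    for k in text:
        if w + k in dictionary:
            w += k                        # keep extending the match
        else:
            out.append(dictionary[w])     # output the code for w
            dictionary[w + k] = next_code # add wk to the dictionary
            next_code += 1
            w = k
    if w:
        out.append(dictionary[w])         # flush the last match
    return out


print(lzw_encode("ABBABCABBBBC", "ABC"))
# [0, 1, 1, 3, 2, 3, 4, 1, 2]
print(lzw_encode("ababcbababaaaaaaa", "abc", first_code=1))
# [1, 2, 4, 3, 5, 8, 1, 10, 11, 1]
```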

LZW – Decoding Algorithm

read a code k
w = dictionary entry for k
output w
while ( read a code k ) {
    entry = dictionary entry for k
    output entry
    add w + entry[0] to the dictionary
    w = entry
}

LZW – Decoding
  • The previous algorithm has a special-case problem
    • It is likely to occur when decoding any large file
    • It happens when the code just read is not in the decoder’s dictionary yet
    • Example : ABABABA
    • Initially : A=1,B=2
    • Encoder output : 1 2 3 5
    • When decoding, the algorithm above reads 5 before the entry 5=ABA has been added, so the lookup fails
    • A small additional check solves the problem
      • Be sure to handle this case in Exercise 3
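
One common form of that check is sketched below in Python; it is not necessarily the solution expected in the exercise. The reasoning is that a code can only be missing when it is exactly the next code the decoder is about to add, and in that case the entry must be the previous string followed by its own first character. The name `lzw_decode` and the `alphabet` and `first_code` parameters are assumptions made for the example.

```python
def lzw_decode(codes, alphabet, first_code=0):
    """Decode a list of LZW codes back into text."""
    dictionary = {first_code + i: sym for i, sym in enumerate(alphabet)}
    next_code = first_code + len(alphabet)
    codes = iter(codes)
    try:
        w = dictionary[next(codes)]
    except StopIteration:
        return ""
    out = [w]
    for k in codes:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # Special case: the encoder used the entry it had just created.
            # That entry is w followed by the first character of w.
            entry = w + w[0]
        else:
            raise ValueError(f"invalid code {k}")
        out.append(entry)
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(out)


print(lzw_decode([1, 2, 3, 5], "AB", first_code=1))  # ABABABA
```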
LZW – Dictionary Length
  • Dictionary length
    • Typically : 14 bits = 16384 entries (first 256 of them are single bytes)
    • What to do when the dictionary is full? Possible policies:
      • Stop adding new entries to the dictionary
      • Delete the whole dictionary and start over (this will be used in the exercise)
      • LRU : Discard the entries that have not been used recently
      • Monitor performance, and flush the dictionary when compression becomes poor
      • Double the dictionary size
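
As an illustration of the "delete the whole dictionary" policy, here is a hedged Python sketch of a byte-oriented encoder with a size limit. The name `lzw_encode_bounded`, the `max_entries` parameter, and the exact moment of the reset (when a new entry would exceed the limit) are assumptions made for the example; the exercise may specify the reset differently, and a matching decoder has to reset at the same point.

```python
def lzw_encode_bounded(data, max_entries=16384):
    """LZW-encode `data` (bytes), resetting the dictionary when it is full."""

    def fresh_dictionary():
        # The first 256 entries are the single bytes.
        return {bytes([i]): i for i in range(256)}

    dictionary = fresh_dictionary()
    w = b""
    out = []
    for byte in data:
        k = bytes([byte])
        if w + k in dictionary:
            w += k
        else:
            out.append(dictionary[w])
            if len(dictionary) >= max_entries:
                # Dictionary is full: throw it away and start over.
                dictionary = fresh_dictionary()
            else:
                dictionary[w + k] = len(dictionary)
            w = k
    if w:
        out.append(dictionary[w])
    return out


print(lzw_encode_bounded(b"ABABABA"))                   # [65, 66, 256, 258]
print(lzw_encode_bounded(b"ABABABA", max_entries=257))  # [65, 66, 65, 66, 65, 66, 65]
```

With the default limit every code fits in 14 bits; the tiny limit in the second call is only there to make the reset visible.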