Huffman Coding Theory: Examples & Algorithm for Optimization
Learn about Huffman coding, an efficient compression method. Explore prefix codes, binary trees, and the Huffman algorithm for optimal code generation. Dive into code examples and correctness proofs.
Presentation Transcript
CSE 326: Huffman coding • Richard Anderson
Coding theory • Code examples: (1) 000, 001, 010, 011, 100, 101; (2) 1, 01, 001, 0001, 00001, 000001; (3) 00, 010, 011, 100, 11, 101 • Conversion, Encryption, Compression • Binary coding • Variable length coding
Decode the following • 11010010010101011 (prefix code) • 100100101010 (ambiguous)
Prefix code • No codeword is a proper prefix of another codeword • Uniquely decodable
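The prefix property can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `is_prefix_code` and the counterexample `bad` are assumptions made for illustration, while the other three lists are the example codes from the coding theory slide.

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # In sorted order, all words that start with a given word sit directly
    # after it, so comparing neighbours is enough. A duplicated codeword
    # also counts as a violation.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

fixed = ["000", "001", "010", "011", "100", "101"]
unary = ["1", "01", "001", "0001", "00001", "000001"]
mixed = ["00", "010", "011", "100", "11", "101"]
bad = ["0", "01", "10"]  # hypothetical code: "010" parses as 0|10 or 01|0

for name, code in [("fixed", fixed), ("unary", unary), ("mixed", mixed), ("bad", bad)]:
    print(name, is_prefix_code(code))
```

All three codes from the slide pass the check; the hypothetical `bad` code fails because "0" is a prefix of "01".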
Prefix codes and binary trees • Tree representation of prefix codes
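One way to see the tree representation is to decode with it: each bit chooses a child, and reaching a leaf emits a symbol. The sketch below is illustrative only. The symbol assignment in `CODE` (letters a to f mapped onto the third example code) and the helper names `build_tree` and `decode` are assumptions, not taken from the slides.

```python
def build_tree(code):
    """Build a binary tree (nested dicts) from a symbol -> codeword mapping.

    Assumes the codewords form a prefix code and symbols are strings.
    """
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol  # the leaf stores the symbol itself
    return root

def decode(bits, root):
    """Follow one edge per bit; emit a symbol and restart at every leaf."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(out)

# Hypothetical symbol assignment for the code 00, 010, 011, 100, 11, 101.
CODE = {"a": "00", "b": "010", "c": "011", "d": "100", "e": "11", "f": "101"}
print(decode("0001011", build_tree(CODE)))  # prints "abe"
```

Because no codeword is a prefix of another, every codeword ends at a leaf, so the decoder never has to look ahead or backtrack.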
Minimum length code • Average cost • Average leaf depth • Huffman tree – tree with minimum weighted path length • C(T) – weighted path length
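The weighted path length can be written out explicitly. The notation here is an assumption, not taken from the slides: $w_i$ is the weight (frequency) of leaf $i$, $d_i$ is its depth in $T$, and $n$ is the number of leaves.

```latex
C(T) = \sum_{i=1}^{n} w_i \, d_i ,
\qquad
\text{average cost per symbol} = \frac{C(T)}{\sum_{i=1}^{n} w_i}
```

Since the depth of a leaf equals the length of its codeword, minimizing $C(T)$ minimizes the total number of bits needed to encode the data.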
Huffman code algorithm • Derivation • Two rarest items will have the longest codewords • Codewords for rarest items differ only in the last bit • Idea: suppose the weights are w1, w2, . . . , wn with w1 and w2 the smallest weights • Start with an optimal code for w3, . . . , wn and w1 + w2 • Extend the codeword for w1 + w2 to get codewords for w1 and w2
Huffman code
    H = new Heap()
    for each wi
        T = new Tree(wi)
        H.Insert(T)
    while H.Size() > 1
        T1 = H.DeleteMin()
        T2 = H.DeleteMin()
        T3 = Merge(T1, T2)
        H.Insert(T3)
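The pseudocode above maps fairly directly onto Python's `heapq` module. This is a sketch under stated assumptions rather than the course's implementation: the tree representation (a leaf is its weight, an internal node is a `(left, right)` tuple), the tie-breaking counter, and the names `huffman_tree` and `codewords` are all choices made here for illustration. It assumes at least one weight is given.

```python
import heapq
from itertools import count

def huffman_tree(weights):
    """Return (total_weight, tree); a tree is a leaf weight or a (left, right) pair."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), w) for w in weights]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # DeleteMin
        w2, _, t2 = heapq.heappop(heap)  # DeleteMin
        # Merge the two lightest trees and insert the result.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))
    total, _, tree = heap[0]
    return total, tree

def codewords(tree, prefix=""):
    """Yield (weight, codeword) for each leaf: 0 for left, 1 for right."""
    if isinstance(tree, tuple):
        yield from codewords(tree[0], prefix + "0")
        yield from codewords(tree[1], prefix + "1")
    else:
        yield tree, prefix or "0"  # a lone leaf still needs one bit

total, tree = huffman_tree([4, 5, 6, 7, 11, 14, 21])
for weight, word in codewords(tree):
    print(weight, word)
```

The counter in each heap entry is only there so that Python never has to compare two trees when their weights tie; it does not affect optimality.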
Example: Weights 4, 5, 6, 7, 11, 14, 21 [Figure: leaf nodes with weights 21, 14, 11, 6, 7, 4, 5]
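To follow the example by hand, it helps to list the merges in the order the algorithm performs them. The sketch below tracks only the weights, not the tree shape; the function name `merge_trace` is an assumption made for illustration.

```python
import heapq

def merge_trace(weights):
    """Return the list of (a, b, a + b) merges in the order Huffman performs them."""
    heap = list(weights)
    heapq.heapify(heap)
    merges = []
    while len(heap) > 1:
        a = heapq.heappop(heap)  # smallest remaining weight
        b = heapq.heappop(heap)  # second smallest
        merges.append((a, b, a + b))
        heapq.heappush(heap, a + b)
    return merges

merges = merge_trace([4, 5, 6, 7, 11, 14, 21])
for a, b, total in merges:
    print(f"{a} + {b} = {total}")

# Each leaf contributes its weight once to every ancestor, so the weighted
# path length equals the sum of the internal node weights.
wpl = sum(total for _, _, total in merges)
print("C(T) =", wpl)
```

For these weights the merges are 4 + 5 = 9, 6 + 7 = 13, 9 + 11 = 20, 13 + 14 = 27, 20 + 21 = 41 and 27 + 41 = 68, giving C(T) = 178.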
Draw a Huffman tree for the following data values and show internal weights: 3, 5, 9, 14, 16, 35
Correctness proof • The most amazing induction proof • Induction on the number of code words • The Huffman algorithm finds an optimal code for n = 1 • Suppose that the Huffman algorithm finds an optimal code for codes of size n; now consider a code of size n + 1 . . .
Key lemma • Given a tree T, we can find a tree T’, with the two minimum cost leaves as siblings, and C(T’) <= C(T)
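The lemma follows from an exchange argument, sketched here with assumed notation: $w_a$ and $d_a$ are the weight and depth of leaf $a$. Let $x$ be a minimum weight leaf and let $a$ be a leaf at maximum depth, so $w_x \le w_a$ and $d_x \le d_a$. Swapping $x$ and $a$ gives a tree $T'$ with

```latex
C(T') - C(T)
  = (w_x d_a + w_a d_x) - (w_x d_x + w_a d_a)
  = (w_x - w_a)(d_a - d_x)
  \le 0 .
```

Repeating the swap for the second smallest leaf and the sibling of $a$ places the two minimum cost leaves next to each other without increasing the cost. This assumes every internal node has two children, which can always be arranged without increasing $C(T)$.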
Modify the following tree to reduce the WPL [Figure: tree with node weights 29, 10, 19, 6, 4, 13, 6, 10, 3, 5, 5]
Finish the induction proof • T – Tree constructed by Huffman • X – Any code tree • Show C(T) <= C(X) • T’ and X’ – Trees from the lemma • C(T’) = C(T) • C(X’) <= C(X) • T’’ and X’’ – Trees with minimum cost leaves x and y removed
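X: any tree • X’: X modified as in the lemma • X’’: X’ with the two smallest leaves removed • C(X’’) = C(X’) – x – y • C(T’’) = C(T’) – x – y • C(T’’) <= C(X’’) by the induction hypothesis, since T’’ is the tree Huffman builds for the n remaining weights • C(T) = C(T’) = C(T’’) + x + y <= C(X’’) + x + y = C(X’) <= C(X)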
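The identity used for both X’’ and T’’ can be checked directly. As in the slides, $x$ and $y$ stand for the weights of the two removed leaves; the depth $d$ is notation assumed here. In $X'$ the leaves $x$ and $y$ are siblings at some depth $d$. Removing them turns their parent into a leaf of weight $x + y$ at depth $d - 1$, so

```latex
C(X'') = C(X') - (x + y)\,d + (x + y)(d - 1) = C(X') - x - y .
```

The same calculation applies to $T'$ and $T''$, which is why adding $x + y$ back on both sides preserves the inequality in the final chain.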