Data Compression

Hae-sun Jung

CS146 Dr. Sin-Min Lee

Spring 2004

Introduction
  • Compression is used to reduce the volume of information to be stored or to reduce the communication bandwidth required to transmit it over a network
Compression Principles
  • Entropy Encoding
    • Run-length encoding
      • Lossless & independent of the type of source information
      • Used when the source information comprises long substrings of the same character or binary digit

The source is encoded as (string or bit pattern, # of occurrences) pairs, as in FAX transmission

e.g.) 000000011111111110000011…

→ (0,7) (1,10) (0,5) (1,2) … → 7, 10, 5, 2, … (since 0s and 1s alternate, only the run lengths need to be sent)

Compression Principles
  • Entropy Encoding
  • Statistical encoding
    • Based on the probability of occurrence of a pattern
    • The more probable the pattern, the shorter its codeword
    • “Prefix property”: a shorter codeword must not form the start of a longer codeword
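A quick way to test whether a code table satisfies this property is a pairwise prefix check, as in the following sketch (the helper name is illustrative):

```python
def has_prefix_property(codes: list[str]) -> bool:
    """True if no codeword is the start of another codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codes for b in codes)

print(has_prefix_property(["10", "11", "010", "011"]))  # True
print(has_prefix_property(["01", "010"]))               # False: "01" starts "010"
```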
Compression Principles
  • Huffman Encoding
    • Entropy, H: theoretical min. avg. # of bits that are required to transmit a particular stream

H = -Σ_{i=1}^{n} P_i log2(P_i)

where n = # of symbols and P_i = probability of symbol i

    • Efficiency, E = H/H’

where H’ = avg. # of bits per codeword = Σ_{i=1}^{n} N_i P_i

and N_i = # of bits of codeword i

E.g.) symbols M(10), F(11), Y(010), N(011), 0(000), 1(001) with probabilities 0.25, 0.25, 0.125, 0.125, 0.125, 0.125
    • H’ = Σ_{i=1}^{6} N_i P_i = 2(2 × 0.25) + 4(3 × 0.125) = 2.5 bits/codeword
    • H = -Σ_{i=1}^{6} P_i log2(P_i) = -(2(0.25 log2 0.25) + 4(0.125 log2 0.125)) = 2.5
    • E = H/H’ = 100%
    • Fixed-length codewords for six symbols would require 3 bits/codeword
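The arithmetic above is easy to verify in a few lines; this sketch assumes the codeword lengths and probabilities exactly as listed:

```python
from math import log2

# symbol: (codeword length, probability), from the slide
symbols = {"M": (2, 0.25), "F": (2, 0.25), "Y": (3, 0.125),
           "N": (3, 0.125), "0": (3, 0.125), "1": (3, 0.125)}

H  = -sum(p * log2(p) for _, p in symbols.values())  # entropy
Hp =  sum(n * p for n, p in symbols.values())        # avg. bits per codeword
print(H, Hp, H / Hp)                                 # 2.5 2.5 1.0 (100% efficient)
```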
Huffman Algorithm

Method for constructing an encoding tree

  • Full Binary Tree Representation
  • Each edge of the tree has a value (0 on the edge to the left child, 1 on the edge to the right child)

  • Data is at the leaves, not internal nodes
  • Result: encoding tree
  • “Variable-Length Encoding”
Huffman Algorithm
  • 1. Maintain a forest of trees
  • 2. Weight of a tree = sum of the frequencies of its leaves
  • 3. Repeat N-1 times:
    • Select the two trees of smallest weight
    • Merge them into a new tree

(a runnable sketch follows the step-by-step construction below)
Huffman coding
      • variable-length code in which more frequent characters receive shorter codewords
      • must satisfy the prefix property to be uniquely decodable
      • two-pass algorithm
        • the first pass accumulates the character frequencies and generates the codebook
        • the second pass performs the compression using the codebook

Huffman coding

  • create codes by constructing a binary tree

1. consider all characters as free nodes

2. assign the two free nodes with the lowest frequencies to a new parent node whose weight equals the sum of their frequencies

3. remove the two free nodes and add the newly created parent node to the list of free nodes

4. repeat steps 2 and 3 until only one free node is left; it becomes the root of the tree
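A compact way to realize these steps (and the two-pass scheme from the earlier slide) is to keep the free nodes in a min-heap; the sketch below uses names of my own choosing and breaks frequency ties by insertion order:

```python
import heapq
from collections import Counter

def huffman_codes(text: str) -> dict[str, str]:
    """Pass 1 counts character frequencies; the tree is then built by
    repeatedly merging the two lowest-weight free nodes (steps 1-4)."""
    freq = Counter(text)
    # A node is either a symbol (leaf) or a (left, right) pair.
    # The counter i breaks ties so trees themselves are never compared.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:                   # merge until one free node remains
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, i, (left, right)))
        i += 1
    codes: dict[str, str] = {}
    def walk(node, prefix):                # read codewords off the tree
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")    # left edge = 0
            walk(node[1], prefix + "1")    # right edge = 1
        else:
            codes[node] = prefix or "0"    # a lone symbol still needs one bit
    walk(heap[0][2], "")
    return codes

print(huffman_codes("AAAABBCD"))  # codeword lengths 1, 2, 3, 3 (exact bits vary with ties)
```

Pass 2 is then just a dictionary lookup per character: "".join(codes[c] for c in text).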

  • Right branch of the binary tree: 1
  • Left branch of the binary tree: 0
  • Prefix violation (example)
    • e: “01”, b: “010”
    • “01” is a prefix of “010”, so the stream “010” could be read either as “b” or as “e” followed by a 0 (see the sketch below)
  • Symbols with the same frequency: need a consistent rule for choosing left or right
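To make the ambiguity concrete, the sketch below enumerates every way a bit stream can be split into codewords; the third symbol "x" with code "0" is hypothetical, added only so the example is self-contained:

```python
def parses(bits: str, decode: dict[str, str]) -> list[list[str]]:
    """Return every way to split `bits` into a sequence of codewords."""
    if not bits:
        return [[]]                      # one way to parse the empty string
    results = []
    for code, sym in decode.items():
        if bits.startswith(code):
            for rest in parses(bits[len(code):], decode):
                results.append([sym] + rest)
    return results

decode = {"01": "e", "010": "b", "0": "x"}  # "x" is a hypothetical extra symbol
print(parses("010", decode))                # [['e', 'x'], ['b']] -> two readings
```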
Example (64 data values)
  • R K K K K K K K
  • K K K R R K K K
  • K K R R R R G G
  • K K B C C C R R
  • G G G M C B R R
  • B B B M Y B B R
  • G G G G G G G R
  • G R R R R G R R
Color   Frequency   Huffman code
================================
R          19       00
K          17       01
G          14       10
B           7       110
C           4       1110
M           2       11110
Y           1       11111
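Given the frequencies and code lengths in this table, the size of the compressed 64-symbol image can be checked directly; a small sketch (the 3-bit fixed-length baseline assumes the seven distinct symbols above):

```python
# color: (frequency, codeword length), from the table above
table = {"R": (19, 2), "K": (17, 2), "G": (14, 2),
         "B": (7, 3), "C": (4, 4), "M": (2, 5), "Y": (1, 5)}

compressed = sum(f * n for f, n in table.values())  # 152 bits
fixed = 64 * 3                                      # 192 bits at 3 bits/symbol
print(compressed, fixed)                            # 152 vs 192
```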
Static Huffman Coding
  • Huffman (Code) Tree
    • Given: a set of symbols (or characters) and their relative probabilities in advance
    • The codes must hold the “prefix property”

[Figure: Huffman code tree. The root node (weight 8) has leaf A on its 1-edge; its 0-child is a branch node (weight 4) with leaf B on its 1-edge, whose 0-child is a branch node (weight 2) with leaf D on its 0-edge and leaf C on its 1-edge. Left edges are labeled 0, right edges 1.]

Symbol   Occurrence
A        4/8
B        2/8
C        1/8
D        1/8

Symbol   Code
A        1
B        01
C        001
D        000

4×1 + 2×2 + 1×3 + 1×3 = 14 bits are required to transmit “AAAABBCD”

Prefix Property !
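To see the prefix property at work, the fixed codebook above can be decoded greedily, emitting a symbol as soon as the buffered bits match a codeword; a minimal sketch:

```python
codes = {"A": "1", "B": "01", "C": "001", "D": "000"}
decode = {v: k for k, v in codes.items()}

encoded = "".join(codes[c] for c in "AAAABBCD")
assert len(encoded) == 14               # 4*1 + 2*2 + 1*3 + 1*3

out, buf = [], ""
for bit in encoded:
    buf += bit                          # the prefix property guarantees a
    if buf in decode:                   # matched buffer is never the start
        out.append(decode[buf])         # of another codeword, so we can
        buf = ""                        # emit immediately
print("".join(out))                     # AAAABBCD
```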
