An overview of different compression algorithms

An Overview of Different Compression Algorithms. Their application on compressing inverted files. Alternative Compression Algorithms. Arithmetic coding Huffman coding Character-based Word-based Dictionary-based coding – Ziv-Lempel family of coding. Pros and Cons of Different Algorithms.


PowerPoint Slideshow about 'an overview of different compression algorithms' - albert


Presentation Transcript

An Overview of Different Compression Algorithms

Their application on compressing inverted files


Alternative Compression Algorithms

  • Arithmetic coding

  • Huffman coding

    • Character-based

    • Word-based

  • Dictionary-based coding – Ziv-Lempel family of coding



Choosing a Compression Algorithm for Inverted Files

  • Factors to consider

    • Compression ratio

    • Speed

    • Random access

  • In modern IR systems, word-based Huffman coding is commonly used

  • There is a lot of research on Ziv-Lempel family coding to see whether it can be applied to index compression
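
To make the word-based Huffman idea concrete, here is a minimal sketch in Python. It treats each whitespace-separated word as one symbol and gives frequent words shorter codes. The function names (`build_word_huffman`, `encode`, `decode`) and the sample sentence are illustrative assumptions, not taken from the slides, and a real index compressor would pack the bits rather than keep them as a string.

```python
import heapq
from collections import Counter
from itertools import count


def build_word_huffman(words):
    """Return a dict mapping each distinct word to a prefix-free bit string."""
    freq = Counter(words)
    if not freq:
        return {}
    if len(freq) == 1:
        # Degenerate case: a single symbol still needs one bit.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {word: ""}) for word, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {word: "0" + code for word, code in left.items()}
        merged.update({word: "1" + code for word, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def encode(words, codes):
    """Concatenate the code of every word into one bit string."""
    return "".join(codes[w] for w in words)


def decode(bits, codes):
    """Read bits until they match a code; prefix-freeness makes this unambiguous."""
    inverse = {code: word for word, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical sample text: "to" and "be" occur most often.
text = "to be or not to be that is the question to be".split()
codes = build_word_huffman(text)
bits = encode(text, codes)
```

Because the symbols are whole words, the code table doubles as a vocabulary, which is one reason this scheme fits text retrieval well.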


An Improved Sliding-window Ziv-Lempel Algorithm

  • Conventional LZ family compression algorithms use a sliding window approach.

    • Based on longest matching length (m-length)

  • An improved sliding-window LZ algorithm was proposed by Bender and Wolf.

    • Instead of m-length, the improved algorithm is based on the offset of the length (o-length) and the differential of the length (-length)
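
For reference, the conventional longest-match step can be sketched as follows. This is a brute-force illustration of the m-length search only, not the Bender and Wolf improvement; the function name `longest_match` and the `window` and `max_len` defaults are assumptions made for this example.

```python
def longest_match(data, pos, window=4096, max_len=18):
    """Find the longest earlier match for data[pos:] inside the sliding window.

    Returns (offset, length). The offset is the distance back from pos;
    a length of 0 means no match was found.
    """
    best_off, best_len = 0, 0
    start = max(0, pos - window)
    for cand in range(start, pos):
        length = 0
        # A match may run past pos and overlap the text being encoded.
        while (length < max_len and pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = pos - cand, length
    return best_off, best_len


# At position 3 of "abcabcabcd", the best match starts 3 bytes back
# and runs for 6 bytes.
example = longest_match(b"abcabcabcd", 3)
```

This sketch scans every window position, so it is slow; practical implementations usually index the window with a hash table or tree instead.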


Benefits of the Improved Algorithm

  • Better compression ratio in the experiment

  • Compression and searching remain linear: O(n).

  • It does not provide an LZ algorithm that supports random access.


Another Modified LZ Algorithm

  • Proposed by Williams

    • Uses literal and copy items.

    • At each step, transmits the original bytes if the item is a literal, or a pointer if it is a copy item.

  • Aimed at faster compression speed and a smaller memory footprint.

  • Better suited to embedded systems where real-time compression is required.

  • Inappropriate for index compression.
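
The literal/copy idea can be sketched as below. This is a simplified illustration, not Williams's actual algorithm: the single-candidate lookup table, the item tuples `("lit", byte)` and `("copy", offset, length)`, and the `min_match`, `max_match`, and `window` defaults are all assumptions made for the example.

```python
def compress(data, min_match=3, max_match=18, window=4095):
    """Turn bytes into a list of ("lit", byte) and ("copy", offset, length) items."""
    table = {}  # most recent position of each min_match-byte prefix
    items = []
    pos = 0
    while pos < len(data):
        key = data[pos:pos + min_match]
        full_key = len(key) == min_match
        cand = table.get(key) if full_key else None
        if full_key:
            table[key] = pos
        if cand is not None and pos - cand <= window:
            # The first min_match bytes already agree; extend the match.
            length = min_match
            while (length < max_match and pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            items.append(("copy", pos - cand, length))
            pos += length
        else:
            items.append(("lit", data[pos]))
            pos += 1
    return items


def decompress(items):
    """Rebuild the original bytes from literal and copy items."""
    out = bytearray()
    for item in items:
        if item[0] == "lit":
            out.append(item[1])
        else:
            _, offset, length = item
            for _ in range(length):
                out.append(out[-offset])  # byte by byte, so overlapping copies work
    return bytes(out)


sample = b"abcabcabcd"
items = compress(sample)
restored = decompress(items)
```

Checking only one candidate position per prefix trades compression ratio for speed and a small, fixed amount of state, which matches the stated goals of fast compression and a small memory footprint.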


Conclusion

  • To date, the best practical compression algorithm for indexes is still word-based Huffman coding.

  • There are theoretical studies of Ziv-Lempel family coding. None of them is practically applicable to our problem, but they can be used in other areas.


References

  • Paul Edward Bender and Jack Keil Wolf, An Improved Data Compression Algorithm Based on Ziv-Lempel Data Compression Algorithm.

  • Ross N. Williams, An Extremely Fast Ziv-Lempel Data Compression Algorithm.

  • Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval.

