Production and Compression of Raw Data for Time Projection Chamber

Ajit Kumar Mohanty

Dario Favretto

Dario Favretto 9 September 2002 1


Summary

  • ALTRO data format

  • Data compression based on standard Huffman technique (ref. A. Nicolaucig, M. Mattavelli, S. Carrato)

    • Using one table

    • Using 5 tables

  • Preliminary results

  • Future developments

Altro Data Format

  • ALTRO (ALICE TPC Read Out)

    • Only the samples over a given threshold are considered (the others are discarded)

    • A bunch is a group of adjacent over-threshold samples coming from one pad (the signal can be represented bunch by bunch)

      • Information relative to one pad is stored in one packet

        A packet is a sequence of 10-bit words (range 0-1023) followed by a trailer. For each bunch, the packet stores:

        • Bunch length (number of samples in the bunch)

        • Time information (temporal position of the last sample in the bunch)

        • Sequence of amplitude values

        The trailer stores:

        • Number of words in the packet (10 bits)

        • Hardware and channel address (8 and 4 bits, respectively)
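
The bunch construction described on this slide can be sketched as follows. This is a minimal illustration, not the ALTRO firmware or the AliRoot code: the function names `find_bunches` and `bunches_to_words` are hypothetical, "over threshold" is assumed to mean strictly greater than the threshold, and the word order inside a bunch (length, time, amplitudes) simply follows the order of the bullets above, which may differ from the real format. The trailer is left out.

```python
def find_bunches(samples, threshold):
    """Group adjacent over-threshold samples of one pad into bunches.

    Returns a list of (last_time_bin, amplitudes) tuples.
    Assumption: a sample is kept only if it is strictly above the threshold.
    """
    bunches = []
    current = []
    for time_bin, amplitude in enumerate(samples):
        if amplitude > threshold:
            current.append(amplitude)
        elif current:
            # The bunch ended at the previous time bin.
            bunches.append((time_bin - 1, current))
            current = []
    if current:
        # A bunch that runs up to the last time bin of the pad.
        bunches.append((len(samples) - 1, current))
    return bunches


def bunches_to_words(bunches):
    """Flatten bunches into 10-bit words (hypothetical field order)."""
    words = []
    for last_time_bin, amplitudes in bunches:
        words.append(len(amplitudes))   # bunch length
        words.append(last_time_bin)     # time of the last sample
        words.extend(amplitudes)        # amplitude values
    for word in words:
        if not 0 <= word <= 1023:
            raise ValueError(f"{word} does not fit in a 10-bit word")
    return words


pad_samples = [0, 1, 5, 9, 4, 0, 0, 7, 0]
bunches = find_bunches(pad_samples, threshold=2)
print(bunches)                    # [(4, [5, 9, 4]), (7, [7])]
print(bunches_to_words(bunches))  # [3, 4, 5, 9, 4, 1, 7, 7]
```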

Compression

  • Lossless compression technique

    • Static Huffman coding

      • Variable-length coding technique based on the frequency of the symbols (symbols that appear more frequently in the source file are coded with a shorter sequence of bits than symbols that appear less frequently)

      • Static means that the algorithm is based on one or more tables that are built before the compression phase according to the frequency of the symbols
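
A minimal sketch of static Huffman coding is given below, assuming integer symbols and a code table built once from a frequency count before any data is encoded. It uses only the Python standard library and represents codes as strings of '0' and '1' for readability; the names `build_huffman_table`, `encode` and `decode` are illustrative and are not the AliRoot classes.

```python
import heapq
from collections import Counter


def build_huffman_table(frequencies):
    """Return {symbol: bit string} for a {symbol: count} mapping."""
    # Heap entries are (weight, unique id, {symbol: code so far}).
    # The unique id breaks ties so the dictionaries are never compared.
    heap = [(count, i, {symbol: ""})
            for i, (symbol, count) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs a one-bit code.
        return {symbol: "0" for symbol in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees.
        w0, _, codes0 = heapq.heappop(heap)
        w1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    inverse = {code: symbol for symbol, code in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free codes make this unambiguous
            decoded.append(inverse[current])
            current = ""
    return decoded


samples = [3, 3, 3, 3, 4, 4, 5, 9]
table = build_huffman_table(Counter(samples))  # built before compression
bits = encode(samples, table)
print(len(bits))                       # 14 bits instead of 8 * 10 = 80
print(decode(bits, table) == samples)  # True
```

In a static scheme the same table must be available to the decoder, either stored with the data or fixed in advance.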



Compression using one table

  • Frequency distribution using one table (entropy: 4.97)
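
The entropy quoted here is presumably the Shannon entropy of the symbol frequency distribution, in bits per symbol, which is a lower bound on the average code length a Huffman table can reach. A short sketch of how such a value can be computed from a list of symbols (the function name `shannon_entropy` is illustrative):

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy of the empirical symbol distribution, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Four equally likely symbols need 2 bits each on average.
print(shannon_entropy([0, 1, 2, 3]))  # 2.0
```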



Results

  • Compression applied to a source file generated by simulating one event of 1000 primaries (percentages give the compressed size relative to the source file size)

    • Threshold value: 2 (source file size: 6.5 MB)

      • Huffman: compressed file size ~3.5 MB (54%)

      • Gzip: compressed file size ~4.5 MB (69%)

    • Threshold value: 5 (source file size: 1.4 MB)

      • Huffman: compressed file size ~0.9 MB (68%)

      • Gzip: compressed file size ~1.2 MB (83%)

    • Threshold value: 10 (source file size: 1 MB)

      • Huffman: compressed file size ~0.7 MB (72%)

      • Gzip: compressed file size ~0.9 MB (85%)
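
The percentages appear to be the compressed size divided by the source size. A small check of that reading against the threshold 2 figures (the helper name `compressed_fraction` is illustrative; the file sizes on the slide are rounded, so the other rows only reproduce approximately):

```python
def compressed_fraction(source_mb, compressed_mb):
    """Compressed size as a percentage of the source size."""
    return 100.0 * compressed_mb / source_mb


print(round(compressed_fraction(6.5, 3.5)))  # 54 (Huffman, threshold 2)
print(round(compressed_fraction(6.5, 4.5)))  # 69 (Gzip, threshold 2)
```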



Compression using 5 tables

Compression can be improved by taking the nature of the data into account. Most bunches have a pseudo-Gaussian shape, in which the first and last samples have smaller values than those in central positions.

  • Samples are classified in three categories (each category corresponds to a table)

    • Isolated samples

    • Border samples

    • Central samples

  • Two more tables are used to store the frequencies of the time-bin values and the bunch-length values.
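
The sample classification can be sketched as below. The rule used here is an assumption inferred from the slide: an isolated sample is the only sample of a one-sample bunch, border samples are the first and last samples of a longer bunch, and everything in between is central. The names `classify_samples` and `category_frequencies` are hypothetical.

```python
from collections import Counter


def classify_samples(bunch):
    """Return one category label per sample of a bunch (assumed rule)."""
    if len(bunch) == 1:
        return ["isolated"]
    return ["border"] + ["central"] * (len(bunch) - 2) + ["border"]


def category_frequencies(bunches):
    """Count amplitude frequencies separately for each sample category."""
    tables = {"isolated": Counter(), "border": Counter(), "central": Counter()}
    for bunch in bunches:
        for amplitude, category in zip(bunch, classify_samples(bunch)):
            tables[category][amplitude] += 1
    return tables


tables = category_frequencies([[7], [3, 9, 12, 8, 3], [4, 5]])
print(tables["border"][3])              # 2
print(sorted(tables["central"].keys()))  # [8, 9, 12]
```

Each of the three counters would then feed its own Huffman table, so that small border values and large central values no longer compete for the same short codes.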



Frequency distribution

Entropy

  • Bunch length: 1.00

  • Bunch of 1 sample: 0.36

  • Border samples: 4.43

  • Central Samples: 6.95



Results

  • Compression applied to a source file generated by simulating one event of 1000 primaries (percentages give the compressed size relative to the source file size)

    • Threshold value: 2 (source file size: 6.5 MB)

      • Huffman, one table: compressed file size ~3.5 MB (54%)

      • Huffman, 5 tables: compressed file size ~2.8 MB (42%)

    • Threshold value: 5 (source file size: 1.4 MB)

      • Huffman, one table: compressed file size ~0.9 MB (68%)

      • Huffman, 5 tables: compressed file size ~0.8 MB (55%)

    • Threshold value: 10 (source file size: 1 MB)

      • Huffman, one table: compressed file size ~0.7 MB (72%)

      • Huffman, 5 tables: compressed file size ~0.6 MB (57%)

Results

  • Compression applied to a source file generated by simulating one event of 10000 primaries (percentages give the compressed size relative to the source file size)

    • Threshold value: 2 (source file size: 21.8 MB)

      • Gzip: compressed file size ~17.5 MB (80%)

      • Huffman, 5 tables: compressed file size ~10.7 MB (49%)



Main Macros and Classes

  • StoreDigits.C is a macro that creates a binary file (DigitsData.dat) containing the sequence of digits (Amplitude, Time-bin, Sector, Row and Pad number)

  • AliTPCBuildAltroFormat.C is a macro used to generate the Altro format file (AltroFormat.dat) from DigitsData.dat.

  • AliTPCBuffer160 is a class used to read/write values according to the Altro data format (10-bit words)

  • AliTPCHNode and AliTPCHTable are classes used to create and manage the tables used by Huffman coding.

  • AliTPCHCompression is the class that implements compression and decompression based on one table

  • AliTPCCompression is the class that implements compression and decompression based on 5 tables
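
Reading and writing 10-bit words means packing values that do not align with byte boundaries. The sketch below shows one generic way to do this, most significant bit first. It is only an illustration of the idea and makes no claim about how AliTPCBuffer160 lays out its bits; `pack_10bit` and `unpack_10bit` are hypothetical names.

```python
def pack_10bit(words):
    """Pack 10-bit words into bytes, most significant bit first."""
    acc, nbits, out = 0, 0, bytearray()
    for word in words:
        if not 0 <= word <= 1023:
            raise ValueError(f"{word} does not fit in 10 bits")
        acc = (acc << 10) | word
        nbits += 10
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        # Pad the final partial byte with zero bits.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack_10bit(data, count):
    """Read back `count` 10-bit words from bytes written by pack_10bit."""
    acc, nbits, words = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 10 and len(words) < count:
            nbits -= 10
            words.append((acc >> nbits) & 0x3FF)
        acc &= (1 << nbits) - 1
    return words


packed = pack_10bit([1023, 0, 1, 512])
print(packed.hex())             # ffc0000600
print(unpack_10bit(packed, 4))  # [1023, 0, 1, 512]
```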

Future developments

  • Test phase using a bigger source file (80000 primaries)

  • Complete the implementation of the Altro data format

  • Optimize frequency tables independently of a particular source file

  • Improve the compression factor

  • Abstract the classes to make them available for other detectors (ITS)
