
Production and Compression of Raw Data for Time Projection Chamber

Ajit Kumar Mohanty

Dario Favretto

9 September 2002


Summary

  • ALTRO data format

  • Data compression based on standard Huffman technique (ref. A. Nicolaucig, M. Mattavelli, S. Carrato)

    • Using one table

    • Using 5 tables

  • Preliminary results

  • Future developments


ALTRO Data Format

  • ALTRO (ALICE TPC Read Out)

    • Only the samples over a given threshold are considered (the others are discarded)

    • A bunch is a group of adjacent over-threshold samples coming from one pad (the signal can be represented bunch by bunch)

    • Information relative to one pad is stored in one packet

    • A packet is a sequence of 10-bit words (range 0-1023) followed by a trailer

      • Words (for each bunch)

        • Bunch length (number of samples in the bunch)

        • Time information (temporal position of the last sample in the bunch)

        • Sequence of amplitude values

      • Trailer

        • Number of words in the packet (10 bits)

        • Hardware and channel address (8 and 4 bits respectively)
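
The packet layout described above can be illustrated with a minimal sketch. It assumes that "over threshold" means strictly greater than the threshold and that the words of each bunch are written in the order listed on the slide (length, time, amplitudes). The function names and the dictionary used for the trailer are hypothetical and are not taken from the ALTRO specification or from AliRoot.

```python
from typing import Dict, List, Tuple


def find_bunches(samples: List[int], threshold: int) -> List[Tuple[int, List[int]]]:
    """Group adjacent over-threshold samples of one pad into bunches.

    Returns (time bin of the last sample, amplitudes) for each bunch.
    Assumption: a sample is kept when it is strictly above the threshold.
    """
    bunches = []
    current: List[int] = []
    for time_bin, amplitude in enumerate(samples):
        if amplitude > threshold:
            current.append(amplitude)
        elif current:
            bunches.append((time_bin - 1, current))
            current = []
    if current:  # bunch that reaches the end of the time window
        bunches.append((len(samples) - 1, current))
    return bunches


def build_packet(samples: List[int], threshold: int,
                 hw_address: int, channel: int) -> Tuple[List[int], Dict[str, int]]:
    """Build the 10-bit words and the trailer fields for one pad."""
    words: List[int] = []
    for last_time, amplitudes in find_bunches(samples, threshold):
        words.append(len(amplitudes))  # bunch length
        words.append(last_time)        # time of the last sample
        words.extend(amplitudes)       # amplitude values
    if any(not 0 <= w <= 1023 for w in words) or len(words) > 1023:
        raise ValueError("value does not fit in a 10-bit word")
    if not 0 <= hw_address < 2 ** 8 or not 0 <= channel < 2 ** 4:
        raise ValueError("address does not fit in 8 + 4 bits")
    trailer = {"n_words": len(words), "hw_address": hw_address, "channel": channel}
    return words, trailer
```

For the samples `[0, 3, 7, 4, 0, 0, 5, 0]` and threshold 2, this sketch finds one bunch of three samples ending at time bin 3 and one isolated sample at time bin 6.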


Compression

  • Lossless compression technique

    • Static Huffman coding

      • Variable-length coding technique based on the frequency of the symbols (symbols that appear more frequently are coded with a shorter sequence of bits than symbols that appear less frequently in the source file)

      • Static means that the algorithm is based on one or more tables that are built before the compression phase according to the frequency of the symbols
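
As an illustration of static Huffman coding, the sketch below builds one code table from symbol frequencies before any data is compressed, then uses that fixed table to encode and decode. It is a generic textbook construction, not the code of the AliRoot classes; the function names and the use of bit strings instead of a packed bit stream are simplifications chosen for readability.

```python
import heapq
from typing import Dict, Iterable, List


def build_huffman_table(frequencies: Dict[int, int]) -> Dict[int, str]:
    """Build a symbol -> bit-string table from symbol frequencies."""
    if not frequencies:
        raise ValueError("no symbols")
    if len(frequencies) == 1:
        return {next(iter(frequencies)): "0"}
    # Heap entries: (weight, tie breaker, {symbol: partial code}).
    # The unique tie breaker keeps the dictionaries from being compared.
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(sorted(frequencies.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]


def encode(symbols: Iterable[int], table: Dict[int, str]) -> str:
    """Concatenate the code of every symbol using the fixed table."""
    return "".join(table[s] for s in symbols)


def decode(bits: str, table: Dict[int, str]) -> List[int]:
    """Invert encode(); works because Huffman codes are prefix-free."""
    inverse = {code: symbol for symbol, code in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return decoded
```

Because the table is fixed in advance, the decoder needs the same table but no per-file statistics; the price is that compression degrades if the data no longer follow the frequencies used to build it.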


Compression using one table

  • Frequency distribution using one table (entropy: 4.97)
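
The slide does not state the unit of the entropy. Assuming it is the Shannon entropy in bits per symbol (base-2 logarithm), it gives a lower bound on the average code length of any symbol-by-symbol code, to be compared with the 10 bits of an uncompressed word. A minimal sketch of the calculation, with a hypothetical function name:

```python
import math
from typing import Dict


def shannon_entropy(frequencies: Dict[int, int]) -> float:
    """Shannon entropy in bits per symbol of a frequency table."""
    total = sum(frequencies.values())
    if total <= 0:
        raise ValueError("empty frequency table")
    return -sum((n / total) * math.log2(n / total)
                for n in frequencies.values() if n > 0)
```

A uniform distribution over all 1024 possible 10-bit values has an entropy of 10 bits, so no gain is possible; the more peaked the distribution, the lower the entropy and the better the achievable compression.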


Results

  • Compression applied to a source file generated by simulating one event of 1000 primaries (percentages give the compressed size relative to the source size)

    • Threshold value: 2 (source file size: 6.5 MB)

      • Huffman: compressed file size ~3.5 MB (54%)

      • Gzip: compressed file size ~4.5 MB (69%)

    • Threshold value: 5 (source file size: 1.4 MB)

      • Huffman: compressed file size ~0.9 MB (68%)

      • Gzip: compressed file size ~1.2 MB (83%)

    • Threshold value: 10 (source file size: 1 MB)

      • Huffman: compressed file size ~0.7 MB (72%)

      • Gzip: compressed file size ~0.9 MB (85%)
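
The percentages are consistent with the compressed size expressed as a fraction of the source size, so a lower value means better compression. A short check of that reading for the threshold 2 case, using a hypothetical helper:

```python
def compressed_fraction(source_mb: float, compressed_mb: float) -> float:
    """Compressed size as a percentage of the source size."""
    if source_mb <= 0:
        raise ValueError("source size must be positive")
    return 100.0 * compressed_mb / source_mb


huffman_pct = compressed_fraction(6.5, 3.5)  # about 53.8
gzip_pct = compressed_fraction(6.5, 4.5)     # about 69.2
```

The other rows agree only approximately when recomputed this way, which is expected because the quoted file sizes are rounded.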


Compression using 5 tables

Compression can be improved by taking the nature of the data into account. Most bunches have a pseudo-Gaussian shape, in which the first and last samples have smaller values than those in the central positions.

  • Samples are classified in three categories (each category corresponds to a table)

    • Isolated samples

    • Border samples

    • Central samples

  • Two more tables are used to store the frequency for the Time-Bin values and bunch length values.
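
The classification can be sketched as follows. The sketch assumes that an isolated sample is a bunch of exactly one sample, that the border samples are the first and last sample of a longer bunch, and that every other sample is central; the slides suggest this reading but do not spell out the rule. All names are hypothetical.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List


class SampleKind(Enum):
    ISOLATED = "isolated"
    BORDER = "border"
    CENTRAL = "central"


def classify_bunch(amplitudes: List[int]) -> List[SampleKind]:
    """Assign a category to every sample of one bunch."""
    n = len(amplitudes)
    if n == 0:
        return []
    if n == 1:
        return [SampleKind.ISOLATED]
    return [SampleKind.BORDER] + [SampleKind.CENTRAL] * (n - 2) + [SampleKind.BORDER]


def per_table_frequencies(bunches: List[List[int]]) -> Dict[SampleKind, Counter]:
    """Count amplitude frequencies separately for each category."""
    tables = {kind: Counter() for kind in SampleKind}
    for amplitudes in bunches:
        for kind, amplitude in zip(classify_bunch(amplitudes), amplitudes):
            tables[kind][amplitude] += 1
    return tables
```

Each of the three counters would then feed its own Huffman table, alongside the two tables for time-bin and bunch length values, so that small border amplitudes and larger central amplitudes each get codes matched to their own distribution.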


Frequency distribution

Entropy

  • Bunch length: 1.00

  • Bunch of 1 sample: 0.36

  • Border samples: 4.43

  • Central samples: 6.95


Results

  • Compression applied to a source file generated by simulating one event of 1000 primaries (percentages give the compressed size relative to the source size)

    • Threshold value: 2 (source file size: 6.5 MB)

      • Huffman, one table: compressed file size ~3.5 MB (54%)

      • Huffman, 5 tables: compressed file size ~2.8 MB (42%)

    • Threshold value: 5 (source file size: 1.4 MB)

      • Huffman, one table: compressed file size ~0.9 MB (68%)

      • Huffman, 5 tables: compressed file size ~0.8 MB (55%)

    • Threshold value: 10 (source file size: 1 MB)

      • Huffman, one table: compressed file size ~0.7 MB (72%)

      • Huffman, 5 tables: compressed file size ~0.6 MB (57%)


Results

  • Compression applied to a source file generated by simulating one event of 10000 primaries (percentages give the compressed size relative to the source size)

    • Threshold value: 2 (source file size: 21.8 MB)

      • Gzip: compressed file size ~17.5 MB (80%)

      • Huffman, 5 tables: compressed file size ~10.7 MB (49%)


Main Macros and Classes

  • StoreDigits.C is a macro that creates a binary file (DigitsData.dat) containing the sequence of digits (amplitude, time bin, sector, row and pad number)

  • AliTPCBuildAltroFormat.C is a macro that generates the ALTRO format file (AltroFormat.dat) from DigitsData.dat

  • AliTPCBuffer160 is a class used to read and write values according to the ALTRO data format (10-bit words)

  • AliTPCHNode and AliTPCHTable are classes used to create and manage the tables used by Huffman coding

  • AliTPCHCompression is a class that implements compression and decompression based on one table

  • AliTPCCompression is a class that implements compression and decompression based on 5 tables
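
To illustrate what reading and writing 10-bit words involves, the sketch below packs a list of 10-bit values into bytes and unpacks them again. It is not the implementation of AliTPCBuffer160: the function names, the most-significant-bit-first ordering and the zero padding of the last byte are all assumptions made for the example.

```python
from typing import List


def pack_10bit(words: List[int]) -> bytes:
    """Pack 10-bit values into bytes, most significant bit first."""
    acc, nbits, out = 0, 0, bytearray()
    for word in words:
        if not 0 <= word <= 1023:
            raise ValueError("value does not fit in 10 bits")
        acc = (acc << 10) | word
        nbits += 10
        while nbits >= 8:               # flush every complete byte
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1         # keep only the unflushed bits
    if nbits:                           # pad the last byte with zeros
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack_10bit(data: bytes, n_words: int) -> List[int]:
    """Read back n_words 10-bit values written by pack_10bit()."""
    acc, nbits, words = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 10 and len(words) < n_words:
            nbits -= 10
            words.append((acc >> nbits) & 0x3FF)
        acc &= (1 << nbits) - 1
    if len(words) != n_words:
        raise ValueError("not enough data for the requested word count")
    return words
```

The reader needs the number of words because the padding bits of the last byte are otherwise indistinguishable from data; in the packet format described earlier, the trailer carries that count.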


Future developments

  • Test phase using bigger source file (80000 primaries)

  • Complete the implementation of the Altro data format

  • Optimize frequency tables independently of a particular source file

  • Improve the compression factor

  • Abstract the classes to make them available for other detectors (ITS)
