Implementation of the "Road Grader" Algorithm for Efficient Data Compression in IceCube Project

"Road Grader" • Joshua Sopher & David Nygren, LBNL • March 20, 2005 • IceCube Collaboration Meeting

Historical perspective • Original notion in PDD: • Most pulses are simple SPE-like waveforms • “Recognize” SPE pulses & process waveforms • Report derived Q & time for these pulses • Don’t process all other complex waveforms • No zero-suppression, report raw waveform • Algorithmic implementation: unpleasant • Two processing methods: bad idea

“Road Grader” Algorithm • Perspective: “Simple is Good” • Road grader scrapes up all good data: • zero-suppression + data compression • Samples near baseline & below threshold are unimportant for timing & charge • All fADC & ATWD waveforms treated identically • Very few parameters to meddle with • (and lose track of!) → Stability of data guaranteed

Project Goals • Suppress and compress data to meet the data rate requirement for DOM-to-surface data transmission: < 20 kbytes/s/DOM • Realize compressor in firmware to minimize processing time. • Efficient operation within DAQ FPGA design • CPU to be used for state control, message management, etc, not for data processing

Technical description • Waveforms are similar to fax scan lines: • Run-length encoding, followed by Huffman encoding • Suppression replaces baseline data with zeroes • Run-length encoding counts the repetitions of same valued data • Huffman “lite” encoding replaces “zero” bytes with a “zero” bit

Suppression • ATWD and fADC data words are 10 bits wide. • Data below a threshold is replaced by zeros, and data above a threshold is left unchanged • This produces a large run length of zero valued data, for a typical single-pulse waveform
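The suppression step can be sketched in a few lines of Python. This is an illustrative model, not the FPGA firmware: the function name `zero_suppress`, the absolute `threshold` argument, the sample values, and the choice to keep a sample that sits exactly at the threshold are all assumptions made for the example.

```python
def zero_suppress(samples, threshold):
    """Replace every sample below `threshold` with 0; keep the rest unchanged.

    Hypothetical helper: samples are assumed to be 10-bit ADC words (0..1023).
    A sample exactly equal to the threshold is kept (an assumption).
    """
    for s in samples:
        if not 0 <= s <= 1023:
            raise ValueError(f"sample {s} does not fit in 10 bits")
    return [s if s >= threshold else 0 for s in samples]


# Made-up waveform: baseline near 100 counts with a three-sample pulse.
waveform = [101, 99, 100, 102, 143, 189, 122, 101, 100]
suppressed = zero_suppress(waveform, threshold=104)
print(suppressed)  # [0, 0, 0, 0, 143, 189, 122, 0, 0]
```

The pulse samples pass through untouched, while the baseline collapses into runs of zeros.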

ATWD pre-pulse behavior • Baseline noise is small, ±2 counts peak-to-peak. • Occasional pre-pulse baseline “shift”: -3 counts • Threshold is set 4 counts above the baseline. • Maximum threshold: 8-9 counts • Typical SPE: 200 counts at peak. • Pulse samples with amplitudes above 8-9 counts (~4% of an SPE) are never suppressed.

Threshold impact • Threshold causes not more than one sample of uncompressed data to be lost. • There will be virtually no loss of useable waveforms due to compression. • Pulse (non-zero) data will be identical to uncompressed data. • Reconstructed pulse has negligible errors.

Run-length encoding • Zero-suppressed data is run-length encoded. • Run-length is zero for non-repeated data. • Run-length encoding produces number pairs: data followed by the number of repetitions. • Pre-pulse: 0 0 0 0 0 → 0,4 • Pulse: 43 89 22 → 43,0 89,0 22,0
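The pairing convention on this slide (value, then number of repetitions, with 0 meaning "not repeated") can be reproduced with a short Python sketch. The function names `rle_encode` and `rle_decode` are hypothetical, and the sketch ignores any limit on the repeat count that a fixed-width hardware field would impose.

```python
def rle_encode(samples):
    """Encode samples as (value, repeats) pairs; repeats counts the extra copies."""
    pairs = []
    for s in samples:
        if pairs and pairs[-1][0] == s:
            pairs[-1][1] += 1      # same value again: bump the repeat count
        else:
            pairs.append([s, 0])   # new value: zero repetitions so far
    return [tuple(p) for p in pairs]


def rle_decode(pairs):
    """Invert rle_encode: each pair expands to repeats + 1 copies of value."""
    samples = []
    for value, repeats in pairs:
        samples.extend([value] * (repeats + 1))
    return samples


print(rle_encode([0, 0, 0, 0, 0]))  # [(0, 4)]
print(rle_encode([43, 89, 22]))     # [(43, 0), (89, 0), (22, 0)]
```

Decoding the pairs returns the zero-suppressed waveform exactly, so the run-length step itself loses no information.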

Huffman encoding • A zero-valued 10-bit word is replaced by a 1-bit wide “zero flag”. • A non-zero flag bit is added to non-zero data, forming an 11-bit word. • Decompression of data requires an additional flag bit → 12-bit words.

Compressed data • The compression ratio depends on the sampling rate, the pulse width, and waveform complexity. • Compressed data is 12 bits wide for both repeated zeros & non-repeated non-zero data values. • For a pulse 8 samples wide, with leading and following zeroes, compressed data = 12 + (8 x 12) + 12 = 120 bits = 15 bytes • For a pulse 4 samples wide, with leading and following zeroes, compressed data = 12 + (4 x 12) + 12 = 72 bits = 9 bytes
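The size arithmetic above follows a simple pattern: one 12-bit word for the leading zero run, one per pulse sample, and one for the trailing zero run. A minimal Python sketch, assuming a single pulse with no repeated values inside it (the name `compressed_size_bits` is hypothetical):

```python
BITS_PER_WORD = 12  # width of every compressed word, per the slide


def compressed_size_bits(pulse_samples):
    """Bits needed for: leading zero run + pulse samples + trailing zero run."""
    words = 1 + pulse_samples + 1
    return words * BITS_PER_WORD


for n in (8, 4):
    bits = compressed_size_bits(n)
    print(f"{n}-sample pulse: {bits} bits = {bits // 8} bytes")
# 8-sample pulse: 120 bits = 15 bytes
# 4-sample pulse: 72 bits = 9 bytes
```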

Data compression ratio • For an 8-sample-wide ATWD pulse: compression ratio = 128 x 10/120 ≈ 10.7 • For a 4-sample-wide fADC pulse: compression ratio = 256 x 10/72 ≈ 35.6 • Every hit also has an 8-byte header that includes the coarse time-stamp (32 bits) + various hit descriptor bits (32 bits)
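The two ratios can be checked directly. The sketch below evaluates them without rounding, taking the raw waveform lengths (128 ATWD samples, 256 fADC samples), the 10-bit sample width, and the compressed sizes from the slides; the function name `compression_ratio` is hypothetical, and the 8-byte hit header is left out, as it is in the slide's formula.

```python
RAW_BITS_PER_SAMPLE = 10  # ATWD and fADC words are 10 bits wide


def compression_ratio(raw_samples, compressed_bits):
    """Raw waveform size divided by compressed size, both in bits."""
    return raw_samples * RAW_BITS_PER_SAMPLE / compressed_bits


atwd_ratio = compression_ratio(128, 120)  # 8-sample-wide ATWD pulse
fadc_ratio = compression_ratio(256, 72)   # 4-sample-wide fADC pulse
print(f"ATWD: {atwd_ratio:.2f}  fADC: {fadc_ratio:.2f}")
# ATWD: 10.67  fADC: 35.56
```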

Basic rates • String 21 measured PMT rate = <750 Hz> • LC tag rate (nearest neighbor only) = ~15 Hz • Non-tag rate (mainly SPE) = ~735 Hz • Data rate requirement < 20,000 bytes/s/DOM • This keeps data flow below danger zone: • Network occupancy >50% not allowed

Data flow rate - “Hard” LC • Mode: Hard Local Coincidence • LC tag present: Header + ATWD + fADC data • LC tag absent: no data at all! Hit discarded! • Data rate = (header + ATWD + fADC) x tag rate = (8 + 15 + 9) bytes x 15 Hz = 480 bytes/s • Compression is not really needed… but, • All isolated hit data is lost

Data flow rate - “Soft” LC • Operating mode: Soft Local Coincidence • LC tag present: Header + ATWD + fADC data • LC tag absent: Header only, no ATWD, no fADC data • Tagged data rate = (header + ATWD + fADC) x tag rate = (8 + 15 + 9) bytes x 15 Hz = 480 bytes/s • Non-tagged rate = 8 bytes x 735 Hz = 5880 bytes/s • Sum = 6360 bytes/s → Zero-suppression & run-length encoding needed

Data rates - “Flabby” LC • Mode: Flabby Local Coincidence • LC tag present: Header + ATWD + fADC data • LC tag absent: Header + fADC data only, no ATWD • Tagged data rate = (header + ATWD + fADC) x tag rate = (8 + 15 + 9) bytes x 15 Hz = 480 bytes/s • Non-tagged data rate = (8 + (1 + 0.2) x 9) bytes x 735 Hz = 13,818 bytes/s • Sum = 14,298 bytes/s (reasonable margin)
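The per-DOM rates for the three local-coincidence modes can be recomputed side by side against the 20,000 bytes/s requirement. This is a sketch using only the numbers quoted on the slides; the constant and function names are hypothetical, and the 1.2 multiplier on the fADC payload is taken as given from the "(1 + .2)" term without interpreting it.

```python
HEADER_BYTES = 8          # per-hit header
ATWD_BYTES = 15           # compressed 8-sample ATWD pulse
FADC_BYTES = 9            # compressed 4-sample fADC pulse
TAG_RATE_HZ = 15          # hits with an LC tag
NON_TAG_RATE_HZ = 735     # hits without an LC tag
LIMIT_BYTES_PER_S = 20_000
FLABBY_FADC_FACTOR = 1.2  # the slide's (1 + .2) term, taken as given


def data_rates():
    """Return bytes/s per DOM for the Hard, Soft and Flabby LC modes."""
    tagged = (HEADER_BYTES + ATWD_BYTES + FADC_BYTES) * TAG_RATE_HZ
    soft_untagged = HEADER_BYTES * NON_TAG_RATE_HZ
    flabby_untagged = (HEADER_BYTES + FLABBY_FADC_FACTOR * FADC_BYTES) * NON_TAG_RATE_HZ
    return {
        "hard": tagged,                      # untagged hits are discarded
        "soft": tagged + soft_untagged,      # untagged hits send a header only
        "flabby": tagged + flabby_untagged,  # untagged hits send header + fADC
    }


rates = data_rates()
for mode, rate in rates.items():
    print(f"{mode:>6}: {round(rate):>6} bytes/s, "
          f"{100 * rate / LIMIT_BYTES_PER_S:.0f}% of limit")
```

All three modes stay under the limit, with Flabby LC using roughly 71% of it.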

Possible issues • ATWD baseline may need monitoring • Baseline drift, if any, needs to be tracked • Easy to imagine auto-tracking capability • ATWD pulses over-sampled @ 300 MHz • Typical: ~11 samples/pulse • Why is this? Pulses are wider than expected • Delay line + amps + ATWD driver affect r • PMT gain is probably higher than we need • Pulse tail adds many samples, little information

Summary • “Road grader” is conceptually simple. • Reconstructed pulse fidelity is excellent. • Compression ratio meets project goals. • Implementation is pretty well-tested. • Incorporated in the new FPGA for DAQ. • ATWD issues may need some attention. • No obvious flaws preventing utilization.
