
FPGA Co-processor for the ALICE High Level Trigger

FPGA Co-processor for the ALICE High Level Trigger. Gaute Grastveit, University of Bergen, Norway. H. Helstrup¹, J. Lien¹, V. Lindenstruth², C. Loizides⁵, D. Roehrich³, B. Skaali⁴, T. Steinbeck², K. Ullaland³, A. Vestbo³, T. Vik⁴, A. Wiebalck², for the ALICE Collaboration


Presentation Transcript


  1. FPGA Co-processor for the ALICE High Level Trigger. Gaute Grastveit, University of Bergen, Norway. H. Helstrup¹, J. Lien¹, V. Lindenstruth², C. Loizides⁵, D. Roehrich³, B. Skaali⁴, T. Steinbeck², K. Ullaland³, A. Vestbo³, T. Vik⁴, A. Wiebalck², for the ALICE Collaboration. ¹Bergen College, Norway; ²Kirchhoff Institute for Physics, University of Heidelberg, Germany; ³Department of Physics, University of Bergen, Norway; ⁴Department of Physics, University of Oslo, Norway; ⁵Institute of Nuclear Physics, University of Frankfurt, Germany

  2. ALICE – A Large Ion Collider Experiment. TPC – Time Projection Chamber

  3. Very High Data Rate • Pb-Pb central collisions • Event rate: 200 Hz • Event size: ~75 Mbyte => 15 Gbyte/s • Max data rate to tape is 1.25 Gbyte/s • Compression/selection is needed • Conventional, lossless methods: factor 2
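The rates quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the slide's numbers; the function names are hypothetical, and 1 Gbyte is taken as 1000 Mbyte (an assumption, since the slide does not say which convention it uses).

```python
# Back-of-the-envelope check of the data rates quoted on the slide.
EVENT_RATE_HZ = 200        # central Pb-Pb event rate (from the slide)
EVENT_SIZE_MBYTE = 75      # approximate event size (from the slide)
TAPE_RATE_GBYTE_S = 1.25   # maximum data rate to tape (from the slide)
LOSSLESS_FACTOR = 2        # conventional lossless compression (from the slide)


def input_rate_gbyte_s(rate_hz, size_mbyte):
    """Detector data rate in Gbyte/s, assuming 1 Gbyte = 1000 Mbyte."""
    return rate_hz * size_mbyte / 1000


def required_reduction(input_gbyte_s, tape_gbyte_s):
    """Factor by which the data volume must shrink to fit the tape rate."""
    return input_gbyte_s / tape_gbyte_s


rate = input_rate_gbyte_s(EVENT_RATE_HZ, EVENT_SIZE_MBYTE)
factor = required_reduction(rate, TAPE_RATE_GBYTE_S)
remaining = factor / LOSSLESS_FACTOR

print(f"input rate: {rate} Gbyte/s")
print(f"required reduction factor: {factor}")
print(f"factor still missing after lossless compression: {remaining}")
```

With these inputs the required reduction is a factor of 12, so a lossless factor of 2 leaves a further factor of 6 to be gained through selection or model-based compression.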

  4. HLT functionality • Compress • Reduce the amount of data required to encode the event as far as possible without losing physics information • Trigger • Accept/reject events on the basis of physics application • Select • Select regions of interest within an event • Remove pile-up in p-p • ... Task: reconstruct the tracks of 20,000 charged particles (each producing 150 clusters) in the TPC. Time budget: 5 ms

  5. The HLT setup • Data are received in parallel • RCU – Readout Controller Unit • DDL – Detector Data Link • RORC – ReadOut Receiver Card • HLT farm • PCI kernel in the FPGA • FPGA will also be utilised for pattern recognition • Reduces number of CPUs needed

  6. The HLT FPGA co-processor • FPGA: APEX 20K400 • Next prototype: Altera Stratix FPGA • Large internal memory • DSP cores

  7. Two Schemes for Finding Tracks • Low occupancy (p-p, Pb-Pb outer padrows): • Conventional approach with (2d) cluster finder and track follower • High occupancy (overlapping clusters): • Hough transform on raw data • Cluster analysis for deconvolution • (Kalman filter) [Figure: high multiplicity picture]

  8. Cluster Finder

  9. [Figure: grid of charge values, axes: time and pad] The numbers represent charge (ADC values). A vertical uninterrupted stack of numbers is called a sequence. The square shows the geometric centre of the sequence. Neighbouring sequences belong to the same cluster. Final mean value: weighted mean.
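The weighted mean named on this slide can be sketched as a charge-weighted centre of gravity. The formula on the original slide is not preserved in the transcript, so the code below is an assumption about its form: each sequence gets a charge-weighted mean time, and the cluster position is the charge-weighted mean over its sequences. The function names and the `(pad, first_time, charges)` layout are hypothetical.

```python
def sequence_mean(first_time, charges):
    """Charge-weighted mean time and total charge of one sequence.

    A sequence is a run of consecutive time bins on a single pad;
    charges[i] is the ADC value in time bin first_time + i.
    """
    total = sum(charges)
    weighted = sum(q * (first_time + i) for i, q in enumerate(charges))
    return weighted / total, total


def cluster_mean(sequences):
    """Charge-weighted (pad, time) centre of a cluster.

    sequences is a list of (pad, first_time, charges) tuples that
    have already been assigned to the same cluster.
    """
    total = pad_sum = time_sum = 0
    for pad, first_time, charges in sequences:
        mean_time, charge = sequence_mean(first_time, charges)
        total += charge
        pad_sum += charge * pad
        time_sum += charge * mean_time
    return pad_sum / total, time_sum / total, total


# Three neighbouring sequences on pads 10, 11 and 12 (made-up ADC values).
example = [(10, 5, [2, 6, 2]), (11, 5, [5, 20, 5]), (12, 6, [4, 12, 4])]
pad, time, charge = cluster_mean(example)
print(f"pad = {pad:.3f}, time = {time:.3f}, total charge = {charge}")
```

The ADC values in the example are invented for illustration and are not taken from the slide's figure.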

  10. FPGA implementation of a cluster finder - the algorithm • Calculate the mean for every sequence • Adjacent pads with similar means are merged • Two lists of sequences are used: one for clusters on the previous pad, one for clusters on the current pad • Clusters are removed from the searchrange when a match is found or we know it is finished • Clusters are inserted in the inputrange after merging or when we start a new cluster [Diagram: memory of clusters, with "begin" and "end" delimiting the searchrange (previous pad), followed by the inputrange (current pad) ending at "insert"]
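The merging steps listed on this slide can be sketched in software as follows. This is a minimal illustration, not the VHDL design or the authors' C++ model: the matching threshold `MATCH_DISTANCE`, the `Cluster` class and the `(pad, mean_time, charge)` input layout are all assumptions. Sequences are expected sorted by pad. Unlike the hardware, which can retire a cluster as soon as it is known to be finished, this sketch retires unmatched clusters only when the pad changes.

```python
from dataclasses import dataclass

MATCH_DISTANCE = 2.0  # assumed: max difference between "similar" means


@dataclass(eq=False)
class Cluster:
    last_mean: float          # mean time of the most recently added sequence
    charge: float = 0
    pad_sum: float = 0        # sum of charge * pad
    time_sum: float = 0       # sum of charge * mean time

    def add(self, pad, mean, charge):
        self.last_mean = mean
        self.charge += charge
        self.pad_sum += charge * pad
        self.time_sum += charge * mean


def find_clusters(sequences, match_distance=MATCH_DISTANCE):
    """Merge sequences on adjacent pads whose mean times are similar.

    Returns a list of (pad_mean, time_mean, total_charge) tuples.
    """
    finished = []
    previous = []   # search range: clusters seen on the previous pad
    current = []    # input range: clusters touched on the current pad
    current_pad = None
    for pad, mean, charge in sequences:
        if pad != current_pad:
            # Clusters left unmatched in the search range are finished.
            finished.extend(previous)
            if current_pad is not None and pad != current_pad + 1:
                # A gap in pads: nothing on the old pad can continue.
                finished.extend(current)
                current = []
            previous, current = current, []
            current_pad = pad
        match = next((c for c in previous
                      if abs(c.last_mean - mean) <= match_distance), None)
        if match is not None:
            previous.remove(match)      # remove from the search range
        else:
            match = Cluster(mean)       # start a new cluster
        match.add(pad, mean, charge)
        current.append(match)           # insert into the input range
    finished.extend(previous)
    finished.extend(current)
    return [(c.pad_sum / c.charge, c.time_sum / c.charge, c.charge)
            for c in finished]


sequences = [(10, 6.0, 10), (10, 20.0, 8), (11, 6.5, 30),
             (12, 7.0, 20), (12, 40.0, 5)]
for cluster in find_clusters(sequences):
    print(cluster)
```

In the example, the sequences near time 6 to 7 on pads 10 to 12 merge into one cluster, while the isolated sequences at times 20 and 40 each form their own.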

  11. Block Diagram, Verification [Diagram: a testbench (T) around the top structure, which contains Decoder, FIFO (lpm), Merger and RAM (lpm); signal labels: seq, seq, cluster. Files: charges (input), VHDL clusters, C++ clusters (from the C++ model)] A C++ program compares the results.

  12. Relative Scales • As before, the mean is calculated by: [formula not preserved in the transcript] • (+) Smaller numbers, only multiplies by <11 • (-) Multiplication can’t be done until merging takes place • Alternative (absolute): Pre_Calc (2 mult, 1 add) [Diagram: Decoder, FIFO (lpm), Merger]
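The formula on this slide is not preserved in the transcript, so the sketch below is only one plausible reading of the trade-off: positions are measured relative to a reference point (here assumed to be the pad and time of the cluster's first sequence), which keeps the multipliers small, and the reference is added back at the end. The function names and the `(pad, time, charge)` layout are hypothetical. Exact fractions are used so that the two variants can be compared for equality.

```python
from fractions import Fraction


def centroid_absolute(sequences):
    """Weighted mean using absolute pad and time values as multipliers."""
    total = sum(q for _, _, q in sequences)
    pad_sum = sum(q * pad for pad, _, q in sequences)
    time_sum = sum(q * time for _, time, q in sequences)
    return Fraction(pad_sum, total), Fraction(time_sum, total)


def centroid_relative(sequences):
    """Same weighted mean, with multipliers relative to the first sequence.

    Assumption: the reference (pad0, time0) is the first sequence of the
    cluster, so it is only known once merging has started.
    """
    pad0, time0, _ = sequences[0]
    total = sum(q for _, _, q in sequences)
    pad_sum = sum(q * (pad - pad0) for pad, _, q in sequences)
    time_sum = sum(q * (time - time0) for _, time, q in sequences)
    return pad0 + Fraction(pad_sum, total), time0 + Fraction(time_sum, total)


# Made-up cluster: (pad, time, charge) for three sequences.
cluster = [(100, 437, 10), (101, 438, 30), (102, 438, 20)]
print(centroid_absolute(cluster))
print(centroid_relative(cluster))
```

Both functions return the same centroid; the relative variant multiplies charges by offsets of at most 2 in this example, whereas the absolute variant multiplies by values in the hundreds.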

  13. Deconvolution • Simplified implementation, almost for free • Splits at minima in both directions (time and pad) [Figure panels: deconvolution off, deconvolution on]
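Splitting at minima can be sketched for the time direction as a small state machine that starts a new piece whenever the charge rises again after having fallen. This is an illustrative sketch, not the firmware logic: the function name is hypothetical, and assigning the valley samples to the left-hand piece is an assumption. The same rule could be applied in the pad direction to the per-pad charges of a cluster, which is not shown here.

```python
def split_at_minima(charges):
    """Split one sequence of ADC values at its local minima.

    Returns a list of sub-sequences. Samples in the valley stay with
    the left-hand piece (an assumption made for this sketch).
    """
    if not charges:
        return []
    pieces = [[charges[0]]]
    falling = False
    for prev, cur in zip(charges, charges[1:]):
        if cur < prev:
            falling = True
        elif cur > prev and falling:
            # Rising again after a fall: the valley has been passed.
            pieces.append([])
            falling = False
        pieces[-1].append(cur)
    return pieces


# Two overlapping peaks (made-up ADC values) and a single peak.
print(split_at_minima([2, 8, 12, 5, 3, 9, 14, 4]))
print(split_at_minima([1, 5, 9, 4, 1]))
```

Because only a one-bit "falling" flag and a comparison with the previous sample are needed, the check adds very little logic, which fits the slide's remark that the feature comes almost for free.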

  14. Merger Goals • Spend few clock cycles per sequence • Use few logic elements • High clock speed

  15. Cluster Finder Performance • Synthesized on Altera APEX • Uses 1800 Logic Elements (11%) • Memory usage: 16*80 + 64*112 = 8448 bits (4%) • Circuit runs at 33 MHz

  16. Outlook Implementation of Hough transformation

  17. Conclusion • We have demonstrated the feasibility of a real-time cluster finder implemented in an FPGA • Firmware implementation of a Hough transform looks promising

  18. Transparency replacements from now on

  19. ALICE – A Large Ion Collider Experiment

  20. TPC – Time Projection Chamber • 18 sectors on each side; each sector is read out in 6 subsectors • Total is ca. 570,000 pads
