CPM architecture and challenges


  1. CPM architecture and challenges
     • CP system requirements
     • Architecture
       • Modularity
       • Data Formats
       • Data Flow
     • Challenges
       • High-speed data paths
       • Latency
     CPM FDR, Architecture and Challenges

  2. CP system requirements
     • Cluster algorithms
       • 4 x 4 x 2 cell environment
       • Sliding window
     • Process the –2.5 < η < 2.5 region
       • 50 x 64 trigger towers per layer
       • Two layers
       • 8 bit data (0-255 GeV)
     • Relatively complex algorithm
     • Output data to CTP
       • 16 x 3 bit hit counts
       • Each hit condition is a combination of four thresholds
     • Output data to RODs
       • Intermediate results
       • RoI data for RoIB

  3. System design considerations
     • Several major challenges to overcome
       • Large processing capacity
       • Data i/o, largely at input
       • Latency requirements
     • Processing must be split over several modules working in parallel
       • But the overlapping nature of the algorithms implies that fan-out is needed
     • Modularity is a compromise between competing requirements
       • High-connectivity back-plane required for data sharing
     • Data must be ‘compressed’ as much as possible
       • Use data reduction whenever possible
       • Data serialisation at various speeds used to reduce i/o pin counts

  4. System modularity
     • Full system
       • 50 x 64 x 2 trigger towers
     • Four crates, each processing one quadrant in phi
       • 50 x 16 x 2 core towers
     • Eta range split over 14 CPMs
       • 4 x 16 x 2 core towers
     • Module contains 8 CP FPGAs
       • 4 x 2 x 2 core towers
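The modularity figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below only restates the quoted numbers; the variable names are illustrative, and the reading of each triple as eta x phi x layers is an assumption taken from the "quadrant in phi" and "eta range" wording.

```python
from math import prod

# Figures quoted on the slide, read as (eta, phi, layers). Names are illustrative.
FULL_SYSTEM = (50, 64, 2)
CRATES = 4            # one quadrant in phi per crate
CPMS_PER_CRATE = 14   # eta range split over 14 CPMs
FPGAS_PER_CPM = 8     # CP FPGAs per module

phi_per_crate = FULL_SYSTEM[1] // CRATES        # phi columns per crate
cpm_core = (4, phi_per_crate, 2)                # core towers per CPM
phi_per_fpga = cpm_core[1] // FPGAS_PER_CPM     # phi columns per CP FPGA
fpga_core = (cpm_core[0], phi_per_fpga, 2)      # core towers per CP FPGA

# Eta columns that 14 CPMs of 4 columns each can cover.
eta_capacity = CPMS_PER_CRATE * cpm_core[0]

total_towers = prod(FULL_SYSTEM)
total_fpgas = CRATES * CPMS_PER_CRATE * FPGAS_PER_CPM

print(phi_per_crate, cpm_core, fpga_core, eta_capacity, total_towers, total_fpgas)
```

Note that 14 CPMs of 4 eta columns give a capacity of 56 columns against the 50 quoted for the full system; the slide does not say how the surplus is used.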

  5. Board-level fan-out, input signals and back-plane
     • CPM has 64 core algorithm cells
       • 16 x 4 reference towers
       • Obtained from direct PPM connections (2 PPMs per CPM)
     • Algorithm requires extra surrounding cells for ‘environment’
       • One extra below, two above
       • 19 x 4 x 2 towers in all
     • Fan-out in phi achieved via multiple copies of PPM output data
     • Fan-out in eta achieved via back-plane

  6. Internal fan-out and the Cluster Processing FPGA environment
     • CP FPGA processes 2 x 4 reference cells
     • Algorithm requires 4 x 4 x 2 cells around reference
     • Convolving these gives 5 x 7 x 2 FPGA environment
     • Data received from 18 different serialiser FPGAs
       • 6 on-board
       • 12 through back-plane
     [Figure labels: from left, on-board, from right, from above, ‘core’ cells, from below]
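The environment sizes follow from sliding a window over a block of reference cells: along each axis the footprint is the reference extent plus the window extent minus one. A minimal sketch of that rule, with a hypothetical helper name and the axis order taken as written on the slides:

```python
def environment_shape(reference, window):
    """Footprint covered when `window` is slid over every cell of `reference`.

    Along each axis the footprint is reference + window - 1 cells.
    """
    if len(reference) != len(window):
        raise ValueError("reference and window must have the same rank")
    return tuple(r + w - 1 for r, w in zip(reference, window))

# CP FPGA: 2 x 4 reference cells with a 4 x 4 window, in two layers.
fpga_env = environment_shape((2, 4), (4, 4))
fpga_env_towers = fpga_env[0] * fpga_env[1] * 2

# CPM in phi: 16 reference towers and a 4-wide window need 3 extra rows,
# matching "one extra below, two above".
cpm_phi = environment_shape((16,), (4,))

print(fpga_env, fpga_env_towers, cpm_phi)
```

The same rule reproduces both the 5 x 7 FPGA environment and the 19 rows per CPM quoted on the previous slide.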

  7. CPM data formats – tower data
     • 8 bit tower data
     • PPM peak-finding algorithm guarantees any non-zero data is surrounded by zeroes
       • Allows data encoding/compression
       • Two 8 bit towers converted to one 9 bit ‘BC-muxed’ data word
     • Add odd-parity bit for error detection
     • 160 input towers encoded in 80 x 10 bit data streams
     • Same format used for:
       • input to CPM
       • between serialiser FPGA and CP FPGA
     [Figure labels: two towers x 8 bits; ‘bcmuxed’ 10 bit data: 8 bit data, bcmux bit, parity bit]
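As an illustration of the 10 bit word, the sketch below packs 8 data bits, a BC-mux flag and an odd-parity bit. The bit positions (data in bits 0-7, flag in bit 8, parity in bit 9) and the function names are assumptions made for this example; the slide does not give the layout, and the rule for choosing which tower is sent in which bunch crossing is not modelled here.

```python
def odd_parity_bit(value):
    """Bit that makes the total number of 1s (value plus this bit) odd."""
    return 1 - (bin(value).count("1") % 2)

def pack_word(data8, bcmux_flag):
    """Pack 8 data bits + BC-mux flag + odd parity into 10 bits.

    Assumed layout (not from the slide): data = bits 0-7,
    flag = bit 8, parity = bit 9.
    """
    if not 0 <= data8 <= 0xFF:
        raise ValueError("tower data must fit in 8 bits")
    if bcmux_flag not in (0, 1):
        raise ValueError("BC-mux flag must be 0 or 1")
    word9 = (bcmux_flag << 8) | data8
    return (odd_parity_bit(word9) << 9) | word9

def parity_ok(word10):
    """True if the 10 bit word contains an odd number of 1s."""
    return bin(word10 & 0x3FF).count("1") % 2 == 1

# Two towers share one stream, so 160 towers need 80 streams.
streams = 160 // 2

word = pack_word(0xFF, 1)
print(streams, word, parity_ok(word), parity_ok(word ^ 1))
```

With odd parity, an all-zero data word still carries a 1 in the parity position, and any single-bit flip makes `parity_ok` return False.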

  8. CPM data formats – hits and readout
     • CPM hit results:
       • 16 x 3 bit saturating sums
       • 8 sent to left CMM, 8 sent to right
       • 8 x 3 = 24 result bits plus 1 odd-parity bit
     • DAQ readout
       • Per L1A, 84 x 20 bits data
       • Bulk of data is BC-demuxed input data
         • 10 bits per tower: 8 bits data, 1 bit parity error, 1 bit link error
         • 160 direct inputs x 10 bit data = 80 x 20 bits
       • 48 bits hit data, 12 bits Bcnum, 20 odd-parity check bits
     • RoI readout
       • Per L1A, 22 x 16 bits data
       • Bulk of data is individual CP FPGA hit and region location
         • 16 bits + 2 bits location + 1 bit saturation + 1 bit parity error
         • 8 FPGAs each have 2 RoI locations = 8 x 2 x 20 bits
       • Rest is 12 bits Bcnum, and odd-parity check bit
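The readout sizes on this slide can be checked by adding up the quoted fields, and a 3 bit saturating sum is simply an addition clamped at 7. The sketch below does both; the names are illustrative and nothing beyond the slide's numbers is assumed.

```python
def saturating_add(count, hits, bits=3):
    """Add hits to a counter that sticks at its maximum (7 for 3 bits)."""
    return min(count + hits, (1 << bits) - 1)

# Hit word sent to each CMM: 8 sums x 3 bits plus one odd-parity bit.
hit_word_bits = 8 * 3 + 1

# DAQ readout per L1A: tower data plus hit data, Bcnum and parity.
daq_tower_bits = 160 * 10
daq_total_bits = daq_tower_bits + 48 + 12 + 20
daq_words_20bit = daq_total_bits // 20

# RoI readout per L1A: 8 FPGAs x 2 locations x 20 bit entries.
roi_entry_bits = 16 + 2 + 1 + 1
roi_fpga_bits = 8 * 2 * roi_entry_bits
roi_total_bits = 22 * 16
roi_rest_bits = roi_total_bits - roi_fpga_bits  # Bcnum plus parity share this

print(saturating_add(6, 3), hit_word_bits, daq_words_20bit, roi_rest_bits)
```

The DAQ fields add up to exactly 84 x 20 bits. For the RoI readout, 32 bits remain after the FPGA entries; the slide states that these hold the 12 bit Bcnum and the parity check but does not break the remainder down further.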

  9. CPM data flow: signal speeds
     • Multiple protocols and data speeds used throughout board
     • Care needed to synchronise data at each stage
       • This has proved to be the biggest challenge on the CPM
     [Data-flow diagram labels: 400 Mbit/s serial data (480 Mbit/s with protocol); LVDS deserialiser; 40 MHz parallel data; Serialiser FPGA; 160 MHz serial data; CP FPGA; Readout Controllers; 40 MHz parallel data; Hit Merger; 640 Mbit/s serial data (800 Mbit/s with protocol)]
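One way to relate the quoted speeds is to express each as bits transferred per 40 MHz clock period. The sketch below does only that division; taking 40 MHz as the common reference clock is an assumption based on the "40 MHz parallel data" label, and the helper name is illustrative.

```python
BASE_CLOCK_MHZ = 40  # assumed reference: the "40 MHz parallel data" clock

def bits_per_tick(rate_mbit_s):
    """Bits carried on one line per 40 MHz clock period."""
    bits, remainder = divmod(rate_mbit_s, BASE_CLOCK_MHZ)
    if remainder:
        raise ValueError("rate is not a whole multiple of the base clock")
    return bits

input_payload = bits_per_tick(400)    # LVDS input, data only
input_framed = bits_per_tick(480)     # LVDS input, with protocol
on_board = bits_per_tick(160)         # serialiser FPGA to CP FPGA
output_payload = bits_per_tick(640)   # readout output, data only
output_framed = bits_per_tick(800)    # readout output, with protocol

print(input_payload, input_framed, on_board, output_payload, output_framed)
```

On this reading the input links carry 10 payload bits per tick, which matches the 10 bit BC-muxed word, with 2 further bits of protocol; the 160 MHz paths carry 4 bits per tick per line; and the output links carry 16 payload bits plus 4 of protocol.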

  10. CPM challenges: high-speed data paths
     • 400 (480) Mbit/s input data
       • Needed to reduce input connectivity
         • 80 differential inputs plus grounds = 200 pins/CPM
       • Previous studies of the LVDS chipset established viability
       • Works very reliably with test modules (DSS/LSM)
       • Still some questions over pre-compensation and PPM inputs
     • 160 MHz CP FPGA input data
       • Needed to reduce back-plane connectivity
         • 160 fan-in and 160 fan-out pins per CPM
       • Needed to reduce CP FPGA input pin count
         • 108 input streams needed per chip
       • This has been the subject of the most study in prototype testing
     • 640 (800) Mbit/s Glink output data
       • Glink chipset successfully used in demonstrators
       • Needed some work to understand interaction with RODs

  11. CPM challenges: latency
     • CP system latency budget: ~14 ticks
       • This is a very difficult target
     • Note: the CPM is only the first stage of the CP system
       • CMM needs about 5 ticks
     • CPM latency - irreducible
       • Input cables: > 2 ticks
       • LVDS deserialisers: ~ 2 ticks
       • Mux/Demux to 160 MHz: ~ 1 tick
       • BC-demuxing algorithm: 1 tick
     • Remaining budget: 14 - 5 - 2 - 2 - 1 - 1 = 3 ticks!
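The budget subtraction on this slide can be written out as a small table. The sketch below uses the nominal tick counts as quoted; the dictionary keys are illustrative labels.

```python
LATENCY_BUDGET_TICKS = 14  # approximate CP system budget from the slide

# Nominal costs quoted on the slide, in ticks.
fixed_costs = {
    "CMM": 5,
    "input cables": 2,          # quoted as "> 2", so this is a lower bound
    "LVDS deserialisers": 2,
    "mux/demux to 160 MHz": 1,
    "BC-demuxing algorithm": 1,
}

remaining_ticks = LATENCY_BUDGET_TICKS - sum(fixed_costs.values())
print(remaining_ticks)
```

Because the input cables are quoted as more than 2 ticks, the 3 ticks left over are an upper bound on what remains for the rest of the CPM processing.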

  12. Conclusions
     • The CPM is a very complex module
     • Difficulties include:
       • High connectivity
       • Multiple time-domains
       • Tight constraints on latency
       • Large overall system size
     • Extensive testing has shown that the current prototype CPM meets these demands
