JEM FDR: Design and Implementation. JEP system requirements Architecture Modularity Data Formats Data Flow Challenges : Latency Connectivity, high-speed data paths JEM revisions JEM 1.1 - implementation details Daughter modules Energy sum algorithms FPGA resource use Performance
JEM FDR: Design and Implementation
• JEP system requirements
• Architecture
• Modularity
• Data Formats
• Data Flow
• Challenges:
  • Latency
  • Connectivity, high-speed data paths
• JEM revisions
• JEM 1.1 - implementation details
• Daughter modules
• Energy sum algorithms
• FPGA resource use
• Performance
• Production tests
JEM FDR
JEP system requirements
• Process the –4.9 < η < 4.9 region
  • ~32×32×2 = 2k trigger towers of Δη×Δφ = 0.2×0.2
  • 9-bit input data (0-511 GeV)
  • 32×32 10-bit "jet elements" after em/had pre-sum
• 2 multiplications per jet element: ET → (EX, EY)
• 3 adder trees spanning the JEP (JEMs, CMMs)
• Sliding window jet algorithm, variable window size within 3×3 environment
• Output data to CTP
  • Thresholded total ET and missing ET
  • Jet hit count
• Output data to RODs
  • Intermediate results, mainly captured from module boundaries
  • RoI data for RoIB
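The "two multiplications per jet element" step can be sketched as a projection of each jet element's ET onto the x and y axes, followed by three sums. This is an illustrative sketch only: the bin-centre angle, the 32 equal-width φ bins, the floating-point arithmetic and the names `project_et`, `energy_sums` and `missing_et` are assumptions, not the firmware implementation, which would work with fixed-point coefficients.

```python
import math

N_PHI_BINS = 32  # 32 jet elements around phi, as on the slide


def project_et(et_gev, phi_bin):
    """Project ET onto x and y; using the bin-centre angle is an assumption."""
    phi = (phi_bin + 0.5) * 2.0 * math.pi / N_PHI_BINS
    return et_gev * math.cos(phi), et_gev * math.sin(phi)


def energy_sums(elements):
    """Sum ET, EX and EY over (et_gev, phi_bin) pairs (one sum per adder tree)."""
    sum_et = sum_ex = sum_ey = 0.0
    for et_gev, phi_bin in elements:
        ex, ey = project_et(et_gev, phi_bin)
        sum_et += et_gev
        sum_ex += ex
        sum_ey += ey
    return sum_et, sum_ex, sum_ey


def missing_et(sum_ex, sum_ey):
    """Magnitude of the (EX, EY) vector sum."""
    return math.hypot(sum_ex, sum_ey)
```

Two equal deposits in opposite φ bins (for example bins 0 and 16) add up in ET but cancel in EX and EY, which is the behaviour the vector sums are meant to capture.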
JEP system design considerations
• Moderate data processing power
• Tough latency requirements
• Large number of signals to be processed → partition into parallel operating modules
• Algorithm requires an environment around each jet element → high-bandwidth inter-module lanes
• Data concentrator functionality, many → few
• Severely pin-bound design, dominated by input connectivity
  • Modules
  • Processors (FPGAs)
• Benefit from similarities to cluster processor
  • Common infrastructure (backplane)
  • Common serial link technology
System modularity
• Two crates, each processing two quadrants in φ → 32 × 8 bins (jet elements) per quadrant
• η range split over 8 JEMs → 4 × 8 jet elements per JEM
• Four input processors per JEM
• Single jet processor per JEM
• Single sum processor per JEM
Replication of environment elements (system and crate level)
• JEM has 32 core algorithm cells
  • 4 × 8 jet elements
  • Directly mapped: 4 PPMs (e, h) → 1 JEM
• JEM operates on a total of 77 jet elements including 'environment': 7 × 11
  • Replication in φ via multiple copies of PPM output data
  • Replication in η via backplane fan-out
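The modularity and environment numbers on the last two slides can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted on the slides. The reading of the 88 input channels as 4 η bins × 11 φ bins × (em, had) is an assumption that happens to match the quoted count, and `jep_geometry` is a hypothetical helper name.

```python
CRATES = 2
QUADRANTS_PER_CRATE = 2
JEMS_PER_QUADRANT = 8
CORE_ETA, CORE_PHI = 4, 8   # core jet elements per JEM
ENV_ETA, ENV_PHI = 7, 11    # core plus environment


def jep_geometry():
    """Return the derived counts for the JEP system."""
    jems = CRATES * QUADRANTS_PER_CRATE * JEMS_PER_QUADRANT
    core = CORE_ETA * CORE_PHI
    window = ENV_ETA * ENV_PHI
    return {
        "jems": jems,
        "core_per_jem": core,
        "core_total": jems * core,
        "window_per_jem": window,
        "environment_per_jem": window - core,
        # Assumed reading of the 88 inputs: 4 eta x 11 phi x (em, had)
        "input_channels": CORE_ETA * ENV_PHI * 2,
    }
```

With these inputs the system comes out at 32 JEMs covering 32 × 32 = 1024 core jet elements, and each JEM sees 45 environment elements on top of its 32 core cells.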
JEM data formats – real-time data
• JEM inputs from PPM:
  • Physical layer: LVDS, 10 bits, 12-bit encoded with start/stop bit
  • D0 odd parity bit
  • D(9:1) 9-bit data, D1 = LSB = 1 GeV
• Jet elements to jet processor:
  • No parity bit
  • D(9:0) 10-bit data, D0 = LSB = 1 GeV
  • 10 data bits muxed to 5 lines, least significant first
• Energy sums to sum processor:
  • No parity bit
  • ET(11:0) 12-bit data, D0 = LSB = 1 GeV
  • EX(13:0) 14-bit data, D0 = LSB = 0.25 GeV
  • EY(13:0) 14-bit data, D0 = LSB = 0.25 GeV
• JEM output to CMM:
  • J(23:0) 8 × 3-bit saturating jet hits sent on bottom port
  • J24 odd parity bit
  • S(23:0) 3 × 8-bit quad-linear encoded energy sums on top port
    • 6-bit energy
    • 2-bit range
    • Resolution 1 GeV, 4 GeV, 16 GeV, 64 GeV
  • S24 odd parity bit
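Two of the formats above lend themselves to a short sketch: the odd-parity input word and the 8-bit quad-linear energy code. The slide gives the field widths and resolutions but not the bit layout or rounding of the quad-linear code, so placing the range in the two upper bits, truncating instead of rounding, and saturating at the largest code are all assumptions. The function names are hypothetical.

```python
def encode_input_word(energy_gev):
    """Build a 10-bit word: D(9:1) = 9-bit data, D0 = odd parity bit."""
    data = energy_gev & 0x1FF
    parity = 1 - bin(data).count("1") % 2  # make the total number of ones odd
    return (data << 1) | parity


def decode_input_word(word):
    """Return (energy_gev, parity_ok) for a 10-bit input word."""
    word &= 0x3FF
    parity_ok = bin(word).count("1") % 2 == 1
    return word >> 1, parity_ok


def quad_linear_encode(value_gev):
    """Encode as 2-bit range + 6-bit energy (range in upper bits: assumed)."""
    for rng in range(4):
        mantissa = value_gev >> (2 * rng)  # resolution 1, 4, 16, 64 GeV
        if mantissa < 64:
            return (rng << 6) | mantissa
    return (3 << 6) | 63  # saturate (assumed behaviour)


def quad_linear_decode(code):
    """Expand an 8-bit quad-linear code back to GeV."""
    rng, mantissa = code >> 6, code & 0x3F
    return mantissa << (2 * rng)
```

Under these assumptions values up to 63 GeV are kept exactly, 101 GeV decodes to 100 GeV (4 GeV resolution), and anything above 63 × 64 = 4032 GeV saturates.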
JEM data formats - readout
• Physical layer: 16 bits, 20-bit encoded (CIMT, alternating flag bit, fill-frames 1A/1B, HDMP1022 format)
• Event separator: minimum of 1 fill-frame sent after each event's worth of data
• All data streams odd parity protected (serial parity)
• DAQ readout: 67-long stream per L1A / slice being read out
  • Input data on D(14:0): 11 bits per channel (9 bits data, 1 bit parity error, 1 bit link error)
  • 12-bit Bcnum & 25-bit sum & 25-bit jet hits on D15
• RoI readout: 45-long stream per L1A
  • D(1:0): total of 8 RoIs
    • 2 bits location & saturation flag & 8 bits threshold passed
  • D2: 12 bits Bcnum
  • D(4:3): used on FCAL JEMs only (forward jets)
  • D(15:5): always zero
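A quick capacity check shows that the quoted stream lengths are large enough for the quoted payloads. The slide does not spell out the exact framing, so this sketch only counts bits: it assumes 88 input channels per JEM (the figure given on the connectivity slides), an even spread of bits over the available data lines, and a one-bit saturation flag. `frames_needed` is a hypothetical helper.

```python
def frames_needed(total_bits, lines):
    """Number of frames needed to ship total_bits over `lines` parallel bits."""
    return -(-total_bits // lines)  # ceiling division


DAQ_STREAM_LENGTH = 67
ROI_STREAM_LENGTH = 45

# DAQ: 88 channels x 11 bits spread over D(14:0)
daq_input_frames = frames_needed(88 * 11, 15)
# DAQ: Bcnum + sum + jet hits, serial on D15
daq_d15_bits = 12 + 25 + 25
# RoI: 8 RoIs x (2 location + 1 saturation + 8 threshold bits) over D(1:0)
roi_frames = frames_needed(8 * (2 + 1 + 8), 2)
```

On this counting the DAQ input data need 65 of the 67 frames, the D15 line carries 62 bits, and the RoI data need 44 of the 45 frames.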
JEM data flow
Multiple protocols, data speeds and signalling levels are used throughout the board
• Multiplexing up and down takes a considerable fraction of the latency budget
• Re-synchronisation of data generally required on each chip and board boundary
  • FIFO buffers
  • Phase adjustment with firmware-based detection
  • Delay scans
[Block diagram: 400 Mbit/s serial input data (480 Mbit/s with protocol) → LVDS deserialiser → 40 MHz parallel → input processor → 80 Mb/s to jet processor + readout controller and 40 Mb/s to sum processor + readout controller → 40 Mb/s parallel to CMM; readout via link PHYs as 640 Mbit/s serial data (800 Mbit/s with protocol), not synchronous to bunch clock]
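The serial rates quoted for the data flow follow from the word sizes given on the data format slides multiplied by the 40 MHz word rate. A minimal sketch, assuming a nominal 40 MHz word rate on both link types (the readout links are stated not to be synchronous to the bunch clock, so for them this is only the nominal figure); `line_rate_mbit` is a hypothetical helper.

```python
WORD_RATE_MHZ = 40  # nominal word rate; 25 ns per word


def line_rate_mbit(bits_per_word, word_rate_mhz=WORD_RATE_MHZ):
    """Serial line rate in Mbit/s for a given number of bits per word."""
    return bits_per_word * word_rate_mhz


input_payload = line_rate_mbit(10)    # 10 data bits per word
input_on_wire = line_rate_mbit(12)    # 12 bits including start/stop bits
readout_payload = line_rate_mbit(16)  # 16 data bits per word
readout_on_wire = line_rate_mbit(20)  # 20 bits after CIMT encoding
jet_element_line = line_rate_mbit(2)  # 10 bits muxed onto 5 lines
```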
Challenges: latency & connectivity
• Latency budget for energy sum processor: 18.5 ticks (TDR)
  • Input cables: ~2 ticks
  • CMM: ~5 ticks
  • Transmission to CTP: <2 ticks
  • ~9.5 ticks available on JEM from cable connector to backplane outputs to CMM
• Module dimensions imposed by use of common backplane
  • Large module: 9U × 40 cm
  • Full height of backplane used for data transmission due to high signal count
    → long high-speed tracks unavoidable
    → need to use terminated lines throughout
    → need to properly adjust timing
• High input count: 88 differential cables
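The on-JEM latency allowance is simply what is left of the TDR budget after the other contributions are subtracted. A small sketch using the approximate figures from the slide, with the "<2 ticks" CTP transmission taken at its upper bound and 25 ns per tick assumed (40 MHz bunch clock):

```python
TICK_NS = 25.0  # one bunch-clock tick, assuming a 40 MHz clock

BUDGET_TICKS = 18.5      # energy sum path, from the TDR
OTHER_CONTRIBUTIONS = {
    "input_cables": 2.0,     # approximate
    "cmm": 5.0,              # approximate
    "transmission_to_ctp": 2.0,  # upper bound of "<2 ticks"
}

jem_ticks = BUDGET_TICKS - sum(OTHER_CONTRIBUTIONS.values())
jem_ns = jem_ticks * TICK_NS
```

This reproduces the ~9.5 ticks quoted for the JEM, or roughly 240 ns from cable connector to backplane output.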
Connectivity: high-density input cabling
• 24 4-pair cable assemblies arranged in 6 blocks of 4 (2 φ bins × em, had)
• Same coordinate system now on cables and crate: φ upwards, η left to right (as seen from front)
  • V cable rotated
• Different cabling for FCAL JEMs → re-map FCAL channels in jet FPGA firmware
Connectivity: details of differential data paths
• Differential 100 Ω termination at sink
• 400 (480) Mbit/s input data
  • Use de-serialisers compatible with DS92LV1021 (LVDS signal level, not DC-balanced)
  • 88 signals per JEM arriving on shielded parallel pairs
  • Run via long cables (<15 m) and short tracks (few cm)
  • Require pre-compensation on transmitting end
• 640 (800) Mbit/s readout data
  • PECL level electro-optical translator
  • HDMP1022 protocol, 16-bit mode
  • Use compatible low-power PHY
Connectivity: details of single-ended data paths
• CMOS signals
  • Point-to-point
  • 60 Ω DCI source termination throughout on all FPGAs
• 40 Mb/s (25 ns)
  • At 1.5 V, no phase control
    • Energy sum path into sum processor: 40 lines per input processor
    • General control paths
  • At 2.5 V: CMM merger signals via backplane (phase adjustment on receiving end)
• 80 Mb/s (12.5 ns) at 1.5 V: jet elements
  • 7 × 11 × 5 bit = 385 lines into jet processor
  • 2 × 3 × 11 × 5 bit = 330 lines on backplane from/to adjacent modules
  • Global phase adjustment via TTCrx
  • All signals latched into jet processor on same clock edge
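The line counts on this slide can be reproduced from the formats given earlier. In the sketch below the factor 5 is the number of lines per jet element (10 bits muxed onto 5 lines), and the 40 lines per input processor are read as ET(12) + EX(14) + EY(14) bits. That last reading is an inference from the data format slide, not something the slide states, and the constant names are illustrative.

```python
ENV_ETA, ENV_PHI = 7, 11       # jet elements seen by the jet processor
LINES_PER_JET_ELEMENT = 5      # 10 bits muxed onto 5 lines at 80 Mb/s
NEIGHBOUR_COLUMNS = 3          # eta columns exchanged with adjacent modules
DIRECTIONS = 2                 # from and to adjacent modules
INPUT_PROCESSORS = 4

jet_processor_lines = ENV_ETA * ENV_PHI * LINES_PER_JET_ELEMENT
backplane_lines = (DIRECTIONS * NEIGHBOUR_COLUMNS * ENV_PHI
                   * LINES_PER_JET_ELEMENT)

# Inferred split of the energy sum path: ET(11:0), EX(13:0), EY(13:0)
sum_lines_per_input_processor = 12 + 14 + 14
sum_processor_lines = INPUT_PROCESSORS * sum_lines_per_input_processor
```

With four input processors this gives 160 energy sum lines into the sum processor, next to the 385 jet element lines into the jet processor, which illustrates how strongly the design is pin bound.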
JEM history
• JEM0.0 built from Dec. 2000
  • LVDS de-serialiser DS92LV1224
  • 11 input processors covering one phi bin each, Spartan2
  • Main processor performing jet and energy algorithms, Virtex-E
  • Control FPGA, ROC, HDMP1022 PHY, coaxial output
  • Complete failure due to assembly company
• JEM 0.x built from Dec. 2003
  • Minor design correction with respect to JEM0.0
  • New manufacturer (PCB / assembly)
  • Fully functional prototype except CAN slow control and FPGA flash configuration
  • TTC interface not to specs due to lack of final TTCrx chip
  • Successfully tested all available functionality
JEM 0
[Board figure with labels: 11 input processors, VME interface, 2 × HDMP1022, backplane connector, main processor, ROC, TTCrx, CAN, 88 × DS92LV1224]
JEM history (2)
• JEM1.0 built in 2003
  • All processors Virtex-2
  • Input processors on daughter modules (R, S, T, U)
    • LVDS de-serialiser SCAN921260 (6-channel)
    • 4 input processors covering three phi bins each
  • 1 jet processor on main board
  • 1 sum processor on main board
  • 1 board control CPLD (CC)
  • Readout links (PHY & opto) on daughter module (RM)
  • Flash configurator: SystemACE
  • Slow control / CAN: Fujitsu microcontroller
• Successfully tested algorithms and all interfaces
  • Some tuning required on SystemACE clock
  • CAN not to new specs (L1Calo common design)
History: JEM 1.0
[Board figure with labels: VME, CC, RM, input daughter modules R, S, T, U, Sum, Jet, TTC, CAN, ACE, Flash, power]
JEM1.0 successfully tested
• Algorithms
• All interfaces
  • LVDS in
  • FIO inter-module links
  • Merger out
  • Optical readout
  • VME
  • CAN slow control
• Mainz, RAL slice test, CERN test beam
JEM 1.1
• JEM1.1 in production now
• Identical to JEM 1.0
• Additional daughter module: Control Module (CM)
  • CAN
  • VME control
  • Fan-out of configuration lines
• Expected back from assembly soon
JEM details – main board
• 9U × 40 cm × 2 mm, bracing bars, ESD strips, shielded backplane connector
• 4 signal layers incl. top and bottom, 2 × Vcc, 4 × GND → 10 layers in total
• Micro vias on top, bottom; buried vias
• All tracks controlled impedance: controlled / measured by manufacturer
  • Single ended 60 Ω
  • Differential 100 Ω
• Point-to-point links only
• All hand-routed
• 60 Ω DCI source termination on processors (CMOS levels)
• Power distribution
  • All circuitry supplied by local step-down regulators, fused 10 A (estimated maximum consumption < 5 A on any supply, 50 W total)
  • 10 A capacity, separate 1.5 V regulator for daughter modules
  • Defined ramp-up time (Virtex-2 requirement)
  • Staged bypass capacitors, low ESR
  • VME buffers scannable 3.3 V (DTACK: open drain 3 × 24 mA), short stubs on signal lines, 20-75 mm
  • Vccaux for FPGAs: dedicated quiet 3.3 V
  • Merger signals (directly driven by processors) on 2.5 V banks
  • FPGA core, inter-processor and inter-module links 1.5 V
JEM details – main board (2)
• Timing
  • TTC signals terminated and buffered (LVPECL, DC) near backplane
  • TTCdec module with PLL and crystal clock automatic backup
  • DESKEW1 bunch clock used as a general purpose clock
  • Low skew buffers (within TTCdec PLL loop) with series terminators
  • DESKEW2 clock used for phase-controlled sampling of 80 Mb/s jet element data (local & FIO) on jet processor only
• VME
  • Synchronised to bunch clock
  • Sum processor acts as VME controller
  • Basic pre-configure VME access through CM
• Readout located on RM (ROCs on sum and jet processor)
• DCS/CAN located on CM (except PHY, near backplane)
• Configuration via SystemACE / CF
  • P2P links to keep ringing at bay
  • Multiple configurations, slot dependent choice
JEM details – main board (3)
• JTAG available on most active components. Separate chains:
  • FPGAs (through SystemACE)
  • Non-programmable devices on input daughters
  • TTCdec and Readout Module
  • Buffers
  • Control Module
• JTAG used for
  • Connectivity tests at manufacturer & MZ
  • CPLD configuration
  • FPGA configuration (ACE)
Input modules
• 24 LVDS data channels per module
• 12-layer PCB with micro vias
• Impedance controlled tracks
  • 60 Ω single ended
  • 100 Ω differential
• LVDS signals entering via 100 Ω differential connector on short tracks (<1 cm)
• Differential termination close to de-serialiser
• 4 × SCAN921260 6-channel de-serialiser
  • PLL and analogue supply voltage only (3.3 V) supplied from backplane
  • Digital supply from step-down regulator on main board
  • Reference clock supplied via FPGA
• XC2V1500 input processor
  • 1.5 V CMOS 60 Ω DCI signals to sum and jet processor
• SMBus device for Vcc and temperature monitoring (new)
Readout Module RM
2 channels, 640 Mb/s
16-bit data, 20-bit CIMT coded, fill-frame FF1, alternating flag bit, as defined in HDMP1022 specs
• 2 × PHY, 2 × SFP opto transceiver, so far 2-layer boards
• High-speed tracks <1 cm
• PHYs tested:
  • HDMP1022 serialiser, 2.4 W/chip (reference, tested in 16-bit and 20-bit mode)
  • HDMP1032A serialiser, 660 mW/chip, €27.86 @ 80 pc (16-bit)
  • TLK1201A serdes, 250 mW/chip, < €5.00 @ 80 pc, uncoded, requires data formatter firmware in ROC (16-bit, 20-bit)
• Successfully run off bunch clock
• Converted to Xtal clock due to unknown jitter situation on ATLAS TTC clock
• Problems with Xtal clock distribution to ROI PHY (RAL, MZ)
• RM seems to work with clock linked from DAQ PHY to ROI PHY
• Want a local crystal oscillator on RM
• Need new iteration of RM (HDMP1032A, TLK1201A)
Control Module CM
Combines CAN/DCS, VME pre-configure access and JTAG fan-out
• CAN
  • Controller to L1Calo specs now (common design for all processors, see CMM/CPM)
  • Link to main board via SMBus only (Vcc, temperatures)
• VME CPLD (pinout error corrected)
  • Generating DTACK for all accesses within module sub-address range to avoid bus timeout
  • Providing basic access for
    • FPGA configuration via VME
    • Configuration reset
  • ACE configuration selection / slot dependent
  • ACE configuration selection via VME
• Buffers for SystemACE-generated JTAG signals to FPGAs
• TTCdec parallel initialisation (ID from geographical address)