
DCC Out of Sync Problems Stan Durkin, Ohio State Presented by: Ben Bylsma

This analysis focuses on the Out-of-Sync condition in DCCs during high rate cosmic runs. It analyzes specific data rates and events to identify potential causes and suggests remedies for the problem.


Presentation Transcript


  1. DCC Out of Sync Problems Stan Durkin, Ohio State Presented by: Ben Bylsma

  2. In Recent High Rate Cosmic Runs (July 18-23, 2010) DCCs Have Gone into an Out-of-Sync Condition 7 Times

     FMM   W      B    S   E
     750   82     28   1   0
     752   0      0    0   0
     754   1023   14   6   0
     756   107    33   2   0

     Analysis: study Run 141291 (specifically 490 s to 540 s)
     • 4,230,000 events through each RUI
     • 5102 events in CMSSW data (~0.1% of events saved)
     • Rate (from slopes): 79.5 kHz

     [Plot: L1As vs. time (seconds)]

  3. DCC FIFO Overflows at High Data Rates

     [Block diagram of the CSC DCC. Labels recoverable from the figure:]
     • Ten DDU inputs (DDU 0 to DDU 9), each labeled 2x 3.125 Gbps, 156.25 MHz, x32+4, with input FIFOs FIFO 0 to FIFO 9
     • Five IN_FPGA blocks and one M_FPGA; internal paths labeled 156.25 MHz, x32+4 bits
     • S_FIFO 0 / SLINK64 0 and S_FIFO 1 / SLINK64 1, labeled 62.5 MHz, 78.125 MHz, x64+8
     • Two Gigabit Ethernet links, 1.25 Gbps each
     • VME64X, VME Control, Discrete Logic for PROM
     • L1A, CMSCLK, & Control; TTCrx; TTC; FMM; Backplane
     • Buffer sizes: Input_FIFO 512 KB, SLINK FIFO 1 MB
     • CSC DCC & DDU headers have FMM information
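The clock and bus-width labels in the diagram translate directly into bandwidths. The sketch below is illustrative arithmetic only: the function names are hypothetical, the "+4" and "+8" extra bits are assumed not to carry payload, and the serial-link figure assumes 8b/10b line coding, which the slides do not state.

```python
def bus_bandwidth_mb_s(clock_mhz: float, data_bits: int) -> float:
    """Payload bandwidth of a parallel bus moving one word per clock.

    MHz times bytes per word gives MB/s (10^6 bytes per second).
    Assumption: only the 32 or 64 data bits count, not the +4/+8 extra bits.
    """
    return clock_mhz * data_bits / 8


def serial_payload_mb_s(lanes: int, line_rate_mbit_s: float) -> float:
    """Payload bandwidth of serial lanes, ASSUMING 8b/10b coding (8 data bits per 10 line bits)."""
    return lanes * line_rate_mbit_s * 8 / 10 / 8


if __name__ == "__main__":
    print("x32 bus at 156.25 MHz:", bus_bandwidth_mb_s(156.25, 32), "MB/s")
    print("x64 bus at 62.5 MHz:  ", bus_bandwidth_mb_s(62.5, 64), "MB/s")
    print("x64 bus at 78.125 MHz:", bus_bandwidth_mb_s(78.125, 64), "MB/s")
    print("2 lanes at 3.125 Gbps:", serial_payload_mb_s(2, 3125), "MB/s")
```

Under these assumptions the x32 bus at 156.25 MHz carries 625 MB/s and the x64 bus at 62.5 MHz carries 500 MB/s, which matches the 625 MB/s and 500 MB/s figures quoted later in the talk.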

  4. CSC DCC sTTS state machine:
     • SLINK_FIFO:
        • Half_Full: set WARNING; reset WARNING when it drops to Almost_Empty
        • Almost_Full or Full: set Out_of_Sync and hold until Resync
     • IN_FIFO:
        • Half_Full: set WARNING
        • Half_Full for more than 3.2 ms: set BUSY
        • Half_Full while in WARNING: set BUSY
        • Almost_Full or Full: set Out_of_Sync and hold until Resync
     • Event Buffer Count:
        • >1536: set WARNING; reset WARNING when it drops to 1280
        • >1920: set BUSY; reset BUSY when it drops to 1536
        • >2016: set Out_of_Sync and hold until Resync
     - WARNING and BUSY stop L1A triggers (latency ~1 sec)
     - Out_of_Sync stops the run for a resync
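The Event Buffer Count rules above amount to thresholds with hysteresis plus a latched Out_of_Sync. The following is a minimal sketch of that one branch, not the DCC firmware: the class and method names are hypothetical, "drops to N" is assumed to mean a count of N or less, and the priority order Out_of_Sync > BUSY > WARNING > READY is an assumption.

```python
from enum import Enum


class TTS(Enum):
    READY = "Ready"
    WARNING = "Warning"
    BUSY = "Busy"
    OUT_OF_SYNC = "Out_of_Sync"


class EventBufferMonitor:
    """Hypothetical model of the Event Buffer Count thresholds from the slide."""

    def __init__(self) -> None:
        self.warning = False
        self.busy = False
        self.out_of_sync = False

    def update(self, count: int) -> TTS:
        # Out_of_Sync latches: only resync() clears it.
        if count > 2016:
            self.out_of_sync = True
        # BUSY: set above 1920, released at 1536 or below (assumed meaning of "drops to").
        if count > 1920:
            self.busy = True
        elif count <= 1536:
            self.busy = False
        # WARNING: set above 1536, released at 1280 or below.
        if count > 1536:
            self.warning = True
        elif count <= 1280:
            self.warning = False
        return self.state()

    def state(self) -> TTS:
        # Assumed priority: the most severe asserted condition wins.
        if self.out_of_sync:
            return TTS.OUT_OF_SYNC
        if self.busy:
            return TTS.BUSY
        if self.warning:
            return TTS.WARNING
        return TTS.READY

    def resync(self) -> None:
        self.warning = self.busy = self.out_of_sync = False


if __name__ == "__main__":
    monitor = EventBufferMonitor()
    for count in (1000, 1537, 1400, 1921, 1600, 1536, 1280, 2017, 100):
        print(count, monitor.update(count).value)
```

The run at the bottom shows the hysteresis: a count of 1400 still reports Warning after 1537 was seen, and once 2017 is reached the state stays Out_of_Sync until `resync()` is called.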

  5. FMM Throttling Seems to be Working

     [Plot: FMM 1 asserted vs. time; pulse width annotated 1.8 msec]

     FMM Log 141491
     t(s)            dt(s)         FMM
     139.384429875   0.436721600   1
     139.386232725   0.001802850   8
     140.119162225   0.732929500   1
     140.120998750   0.001836525   8
     144.130565900   4.009567150   1
     144.132397975   0.001832075   8
     146.057188825   1.924790850   1
     146.058872650   0.001683825   8
     148.779290350   2.720417700   1
     148.781143125   0.001852775   8
     152.496441950   3.715298825   1
     152.498013425   0.001571475   8
     152.817810300   0.319796875   1
     152.819979975   0.002169675   8
     153.590204650   0.770224675   1
     153.592016100   0.001811450   8
     154.189867650   0.597851550   1
     154.191494650   0.001627000   8
     … repeats 90 times …
     191.300884525   0.001097700   8
     191.301140075   0.000255550   1
     191.303430625   0.002290550   2

     [Plot: time (msec)] Transition FMM 1 → 2: 2.290 ± 0.005 msec
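The length of each throttling pulse can be read off the log by differencing consecutive timestamps. The sketch below uses only the first six rows of the log as shown on the slide; the function names are hypothetical, and the mapping of FMM codes to state names (1 = Warning, 2 = Out_of_Sync, 4 = Busy, 8 = Ready) is an assumption that the slides do not spell out.

```python
# First six rows of the FMM log from the slide: t(s), dt(s), FMM code.
FMM_LOG = """\
139.384429875 0.436721600 1
139.386232725 0.001802850 8
140.119162225 0.732929500 1
140.120998750 0.001836525 8
144.130565900 4.009567150 1
144.132397975 0.001832075 8
"""

# ASSUMED code-to-name mapping; not stated on the slides.
TTS_NAMES = {1: "Warning", 2: "Out_of_Sync", 4: "Busy", 8: "Ready"}


def parse_log(text):
    """Return a list of (time_s, dt_s, fmm_code) tuples, skipping blank lines."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        t, dt, code = line.split()
        rows.append((float(t), float(dt), int(code)))
    return rows


def durations_in_state(rows, code):
    """Time spent in `code` before each following transition, in seconds."""
    return [cur[0] - prev[0] for prev, cur in zip(rows, rows[1:]) if prev[2] == code]


if __name__ == "__main__":
    rows = parse_log(FMM_LOG)
    warning_ms = [round(d * 1e3, 2) for d in durations_in_state(rows, 1)]
    print(TTS_NAMES[1], "pulse lengths (ms):", warning_ms)
```

For these rows the pulses come out at roughly 1.8 ms each, in line with the 1.8 msec annotation on the plot.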

  6. Data Rates Aren’t Large Enough to be Causing Overflows

     Average event sizes:
     RUI   Size (bytes)
     750   884
     751   993
     752   861
     753   1129
     754   843
     755   1163
     756   821
     757   988

     78.5 kHz: ~78.5 MB/s

     [Plot: theoretical probability of >50 events in queue; axis labels Log10(P)*106 and Rate (MB/s)]

     To fill the SLINK FIFO in 2.29 msec requires >200 MB/s, even if output stopped.

     [Diagram: SLINK FIFO, 1 Mbyte, with rates labeled 625 MB/s and 500 MB/s]
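The argument on this slide is simple arithmetic and can be checked in a few lines. The sketch below is illustrative: the variable names are hypothetical, 1 Mbyte is taken as 10^6 bytes, and it is an assumption that the relevant volume is either the whole FIFO or the half between Half_Full (where WARNING is set) and Full.

```python
# Average event sizes in bytes per RUI, from the slide.
EVENT_SIZE_BYTES = {
    750: 884, 751: 993, 752: 861, 753: 1129,
    754: 843, 755: 1163, 756: 821, 757: 988,
}
L1A_RATE_HZ = 78.5e3      # trigger rate quoted on the slide
FIFO_BYTES = 1_000_000    # SLINK FIFO, ASSUMING 1 Mbyte = 10^6 bytes
FILL_TIME_S = 2.29e-3     # measured WARNING to Out_of_Sync interval

# Data rate per RUI in MB/s: event size times trigger rate.
rate_mb_s = {rui: size * L1A_RATE_HZ / 1e6 for rui, size in EVENT_SIZE_BYTES.items()}

# Input rate needed to fill the FIFO in the measured time with the output stopped.
needed_full_mb_s = FIFO_BYTES / FILL_TIME_S / 1e6
needed_half_mb_s = (FIFO_BYTES / 2) / FILL_TIME_S / 1e6

if __name__ == "__main__":
    for rui, rate in sorted(rate_mb_s.items()):
        print(f"RUI {rui}: {rate:.1f} MB/s")
    print(f"needed to fill the whole FIFO: {needed_full_mb_s:.0f} MB/s")
    print(f"needed to fill half the FIFO:  {needed_half_mb_s:.0f} MB/s")
```

Even the largest RUI stays below 100 MB/s, while filling half of the FIFO in 2.29 msec would already take more than 200 MB/s, so under these assumptions the observed data rates cannot explain the overflow.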

  7. 60 Events in Run 141491 CMSSW Data Show Bad Transmission

     Two independent 3.125 Gbit links (diagram labels: 3.125 GB/s each)

     Bad data, 0xBC50 idle code:
     1960 826d bc50 bc50
     0000 8000 bc50 bc50
     0080 0000 bc50 bc50
     8000 8000 bc50 bc50
     0000 0000 bc50 bc50
     0080 2c1e bc50 bc50
     c0de c000 bc50 bc50

     Good data:
     1560 826d 6d0f 5080
     0000 8000 0001 8000
     0080 0000 1014 3f7f
     8000 8000 ffff 8000
     0000 0000 0000 2000
     0080 2210 0006 a000

     Transfer problem on 3.125 Gbit backplane
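Events like the bad one above can be flagged automatically by looking for the 0xBC50 idle code among the data words. This is a minimal sketch, assuming the dump is a whitespace-separated list of 16-bit hex words laid out four per row; the function names are hypothetical and this is not the CMSSW unpacker.

```python
IDLE_CODE = 0xBC50  # idle code named on the slide

BAD_DUMP = """
1960 826d bc50 bc50  0000 8000 bc50 bc50  0080 0000 bc50 bc50
8000 8000 bc50 bc50  0000 0000 bc50 bc50  0080 2c1e bc50 bc50
c0de c000 bc50 bc50
"""

GOOD_DUMP = """
1560 826d 6d0f 5080  0000 8000 0001 8000  0080 0000 1014 3f7f
8000 8000 ffff 8000  0000 0000 0000 2000  0080 2210 0006 a000
"""


def parse_words(dump):
    """Convert a whitespace-separated hex dump into a list of integers."""
    return [int(word, 16) for word in dump.split()]


def idle_positions(words):
    """Indices of words equal to the idle code."""
    return [i for i, word in enumerate(words) if word == IDLE_CODE]


def idle_columns(words, words_per_row=4):
    """Columns in which idle codes appear, ASSUMING four words per row."""
    return sorted({i % words_per_row for i in idle_positions(words)})


if __name__ == "__main__":
    for name, dump in (("bad", BAD_DUMP), ("good", GOOD_DUMP)):
        words = parse_words(dump)
        print(name, "idle words:", len(idle_positions(words)),
              "in columns", idle_columns(words))
```

With the four-words-per-row layout, the idle codes in the bad event all fall in the last two columns, which is what one would expect if one of the two links delivered idles while the other delivered data.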

  8. How Do We Prove These Events Are Causing the Problem?

     Annotation on the dump: last column shift

     f308 7342 76b2 5164
     01f0 5ae0 0e36 d900
     1960 734d 5064 c0de
     0000 8000 8000 76b2
     0080 0000 3f7f 0001
     8000 8000 8000 1014
     0000 0000 2000 ffff
     0080 be16 a000 0000
     c0de c000 c000 0006
     1960 86bd 5064 c0de
     0000 8000 8000 76b3
     0080 0000 3f7f 0001
     8000 8000 8000 1014
     0000 0000 2000 ffff
     0080 2a10 a000 0000
     c0de c000 c000 0006
     1960 916d 5064 c0de
     0000 8000 8000 76b4
     0080 0000 3f7f 0001
     8000 8000 8000 1014
     0000 0000 2000 ffff
     0080 5039 a000 0000
     c0de c000 c000 0006
     1960 960d 5064 c0de

     Viewed several hundred bad transmission events. Only a small number of DDU->DCC links gave problems:

     RUI      DDU      Bad transmissions
     RUI755   DDU 25   most
     RUI757   DDU 33   many
     RUI751   DDU 7    a few
     RUI751   DDU 3    a few
     RUI756   DDU 35   one
     RUI755   DDU 16   one

     We will swap DDU 25 and see if the problems go away.

  9. Possible Remedies to Problem
     • Fix problem boards
     • Reconfigure XILINX RocketIOs
        • Channel Bonding – lock step data transmissions
        • 16 bit -> 32 bit interface – keep data words together
     • Change Clock Frequency in Firmware (divide by 2)
        • We don’t need 625 Mbyte/s from DDU to DCC
     • This is not urgent. We will proceed with caution.
