Federico Alessio, with inputs from Richard, Ken, Guillaume

Study on buffer usage and data packing at the FE. LHCb Electronics Upgrade Meeting 11 April 2013. Federico Alessio, with inputs from Richard, Ken, Guillaume. Scope. Attempt to study: Impact of TFC commands on behaviour of FE buffer in upgraded readout architecture


Presentation Transcript



Study on buffer usage and data packing at the FE

LHCb Electronics Upgrade Meeting

11 April 2013

Federico Alessio,

with inputs from Richard, Ken, Guillaume



Scope

  • Attempt to study:

    • Impact of TFC commands on behaviour of FE buffer in upgraded readout architecture

    • Feasibility of packing algorithm across GBT link as specified in readout architecture specifications

  • This presentation is NOT intended to show you how to pack across the GBT link or how to use the buffer.

  • However, it IS intended to stimulate discussions using a practical example on possible solutions/implications at FE in the global readout architecture.

  • There is a published internal note which contains what I’m presenting here: LHCb-INT-2013-015.

  • It is not final; it is meant only to stimulate discussions and feedback!


TFC simulation testbench

  • First simulation testbench (est. 2009) developed in Visual Elite from Mentor Graphics. Includes:

    • S-ODIN

    • SOL40 (only TFC)

    • LHC clock

    • LHC filling scheme

    • LLT emulation (based on current L0)

    • Custom-made FE emulation block

      • Generic FE emulation

      • OT-like

      • CALO-like

    • No TELL40 emulation, throttle is faked

    • Everything is an HDL entity, portable to other simulation platforms

  • Basically, the aim is to simulate a (very small) slice of the readout system

    • Equivalent to a Mini-DAQ including FE emulation

    • Could add a few FE channels with different occupancies

    • The only problem is simulation time


(Simplified) TFC simulation testbench


FE emulation, why?

  • Needed to develop an FE emulation block to simulate the generation of detector data

    • Used to:

      • study impact of TFC commands on FE buffer behaviour

      • demonstrate feasibility of packing mechanism at FE as written in specs

      • emulate FE data generator to spy on sub-detectors for FE reviews…

  • Proposed to use it as a practical example of a generic FE data generator for the readout architecture simulation framework until sub-detectors’ codes become available

    • Description of the code here

    • Simulation results

    • Considerations on packing mechanism

    • Considerations on buffer usage

    • Synthesis results

  • Practical proof of how important simulating code is…


Generic FE channel as in specs

  • FE channel contains a buffer:

    • No trigger at FE, so buffer is actually a derandomizer.

    • Used to pipe data @ 40 MHz to be packed and sent over GBT link.

    • If no TFC command and occupancy too high, buffer will fill up very quickly

      • We are running at 40 MHz! It’s 40 times faster than now…

      • Mechanism to empty buffer

      • TFC commands come in handy

  • DATA coming out on GBT link:

    • No empty spaces, no unexpected 0s

    • Fully dynamic packing algorithm across GBT frame width

    • Ideally, data should be in order…
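The buffer behaviour described on this slide can be illustrated with a toy model in Python. This is only a sketch, not the HDL emulation block: the name `simulate_derandomizer`, the event size model (header plus one channel word per hit), the depth counted in events, and the absence of any TFC command are all assumptions made for the example.

```python
import random


def simulate_derandomizer(n_channels=50, channel_bits=5, occupancy=0.10,
                          header_bits=24, frame_bits=80, depth_events=75,
                          n_crossings=10_000, seed=1):
    """Toy derandomizer: return (peak buffered events, events lost to overflow)."""
    rng = random.Random(seed)
    queue = []            # bits still to be sent for each buffered event, oldest first
    peak = lost = 0
    for _ in range(n_crossings):
        # Write side: one event per bunch crossing (no trigger at the FE).
        hits = sum(rng.random() < occupancy for _ in range(n_channels))
        if len(queue) < depth_events:
            queue.append(header_bits + hits * channel_bits)
        else:
            lost += 1     # buffer full: the event is dropped in this toy model
        peak = max(peak, len(queue))
        # Read side: the link drains at most one frame's worth of bits per clock.
        budget = frame_bits
        while queue and budget > 0:
            sent = min(queue[0], budget)
            queue[0] -= sent
            budget -= sent
            if queue[0] == 0:
                queue.pop(0)
    return peak, lost


if __name__ == "__main__":
    print(simulate_derandomizer(occupancy=0.10))
    print(simulate_derandomizer(occupancy=0.90, n_crossings=2000))
```

With the defaults the mean event is well below one 80-bit frame and the buffer stays almost empty. Once `occupancy` is raised so that the mean event exceeds the frame, the buffer fills up and starts losing events, which is the case where a mechanism to empty the buffer is needed.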


The code: FE data generator


The code: FE buffer manager


The code: GBT dynamic packing

Very important to analyze simulation output bit-by-bit and clock-by-clock!
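To make the idea of dynamic packing concrete, here is a minimal Python sketch. It is not the code shown on the slide: it assumes events are simply concatenated MSB first and cut into 80-bit frames with no padding, and `pack_frames` with its `(value, width)` argument layout is a hypothetical helper.

```python
def pack_frames(events, frame_bits=80):
    """Pack variable-width events back-to-back into fixed-width frames.

    events: iterable of (value, width_in_bits) pairs, in transmission order.
    Returns (frames, leftover): the complete frames as integers, plus the
    bit string that does not yet fill a frame and waits for the next event.
    """
    stream = ""
    for value, width in events:
        assert width > 0 and 0 <= value < (1 << width), "value does not fit"
        stream += format(value, f"0{width}b")   # MSB first (assumption)
    n_full = len(stream) // frame_bits
    frames = [int(stream[i * frame_bits:(i + 1) * frame_bits], 2)
              for i in range(n_full)]
    return frames, stream[n_full * frame_bits:]


# Three events of 49, 59 and 74 bits: 182 bits in total.
frames, leftover = pack_frames([(1, 49), (3, 59), (7, 74)])
print(len(frames), len(leftover))   # 2 full frames, 22 bits left over
```

Since events straddle frame boundaries, a receiver would have to follow the data size field of each header to find the next event, which is one reason to check the output bit by bit.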


The code: configuration

  • FE generic data generator is fully programmable:

    • Number of channels associated to GBT link

    • Width of each channel

    • Derandomizer depth

    • Mean occupancy of the channels associated to GBT link

    • Size of GBT frame (80 bits or WideBus + GBT header 4 bits)

  • Extremely flexible and easy to configure with parameters

  • Covers almost all possibilities (almost…)

    • Including flexible transmission of NZS and ZS

  • Including TFC commands as defined in specs

    • Study dependency of FE buffer behaviour with TFC commands

    • Study effect of packing algorithm on TELL40

    • Study synchronization mechanism at beginning of run

    • Study re-synchronization mechanism when de-synchronized

    • Etc… etc… etc…

  • And it is fully synthesizable… 


Simulation results

  • Simulated 11 different scenarios:

  • fixed GBT size to 80 bits + 4 bits GBT header

  • fixed width of data header to 24 bits in three fields (12 for BXID, 8 for data size, 4 for info)

  • fixed width of data channel to 5 bits as practical example

  • Numbers scale relatively: lower occupancy allows a larger number of channels
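The 24-bit data header used in these scenarios can be sketched as a simple bit-field in Python. The field widths (12 for BXID, 8 for data size, 4 for info) come from the slide; the field order (BXID in the most significant bits) and the names `pack_header` and `unpack_header` are assumptions for illustration.

```python
BXID_BITS, SIZE_BITS, INFO_BITS = 12, 8, 4   # widths from the slide


def pack_header(bxid, size, info):
    """Build a 24-bit header; BXID-first ordering is an assumption."""
    assert 0 <= bxid < (1 << BXID_BITS)
    assert 0 <= size < (1 << SIZE_BITS)
    assert 0 <= info < (1 << INFO_BITS)
    return (bxid << (SIZE_BITS + INFO_BITS)) | (size << INFO_BITS) | info


def unpack_header(word):
    """Inverse of pack_header: return (bxid, size, info)."""
    info = word & ((1 << INFO_BITS) - 1)
    size = (word >> INFO_BITS) & ((1 << SIZE_BITS) - 1)
    bxid = (word >> (SIZE_BITS + INFO_BITS)) & ((1 << BXID_BITS) - 1)
    return bxid, size, info


print(hex(pack_header(1, 2, 3)))   # 0x1023
```

An 8-bit size field can describe at most 255 units of data, so the unit in which the size is counted (bits, channels, words) matters for the largest event that can be announced; the slide does not state the unit.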


Simulation results

Scenario 1: 10% occupancy, 50x5bits channels, derandomizer depth 75

Scenario 2: 25% occupancy, 50x5bits channels, derandomizer depth 75


Simulation results

Scenario 8: 40% occupancy, 32x5bits channels, derandomizer depth 165

Scenario 9: 40% occupancy, 32x5bits channels, derandomizer depth 165 + NO BX VETO sent from TFC
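A rough feeling for these scenarios can be obtained by comparing the mean event size with the 80-bit GBT frame. The Python sketch below assumes that a zero-suppressed event is the 24-bit header plus one 5-bit word per hit channel; this size model and the function names are assumptions, not taken from the note.

```python
FRAME_BITS = 80    # GBT data bits per 40 MHz clock
HEADER_BITS = 24   # data header
CHANNEL_BITS = 5   # width of one channel


def mean_event_bits(n_channels, occupancy):
    """Mean event size in bits, assuming header + one channel word per hit."""
    return HEADER_BITS + n_channels * occupancy * CHANNEL_BITS


def link_load(n_channels, occupancy):
    """Mean event size as a fraction of one GBT frame."""
    return mean_event_bits(n_channels, occupancy) / FRAME_BITS


scenarios = {1: (50, 0.10), 2: (50, 0.25), 8: (32, 0.40)}
for number, (n_channels, occupancy) in scenarios.items():
    print(number, mean_event_bits(n_channels, occupancy),
          round(link_load(n_channels, occupancy), 2))
```

Under this model scenario 1 averages 49 bits per event, while scenarios 2 and 8 average 86.5 and 88 bits, slightly more than one frame per crossing. In such cases the derandomizer can only recover during crossings that send less data, which is where the dependence on TFC commands such as the BX VETO shows up.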


Simulation results

[Figure: simulation waveforms showing the filling scheme, TFC commands, FE data generated, derandomizer occupancy and GBT output]

For a bit-by-bit zoom in please come to my office


Simulation results

For a bit-by-bit zoom in please come to my office or ask for the code


Synthesis results

  • Using Altera Quartus 12.1 SP1

  • No synthesis optimization done, fitter left free, no pinout defined, no timing constraints

  • No memory cells used

  • Doable, can be further improved though.


FYI, simulation outlook

  • Simulation should be a coordinated effort

  • Personal drive in order to be able to produce a (complex) code for TFC on time

  • FE generic code + TFC code should be merged with TELL40 effort

  • To test both FE packing algorithm and FE buffer management

  • To test decoding at TELL40 and investigate consequences/solutions

  • To analyze effects of TFC commands on global system (including TELL40)

    • Effort already ongoing between me and Guillaume to do so

  • We would very much appreciate having the (emulation) code of each sub-detector

  • A generic FE code is useful to study things on paper, but real code is something different

  • Proposal is to use this simulation effort to validate FE code

  • simulation performed by me and Guillaume to investigate solutions, issues in FE


Conclusions

  • Packing mechanism as specified in our document is feasible.

    • Will be used temporarily to emulate FE generated data in global readout and TFC simulation.

  • However, very big open questions:

    • Is your FE compatible with such scheme? What about such code in an ASIC?

    • Behaviour of FE derandomizer will strongly depend on your compression or suppression mechanism.

      • If dynamic, it could create big latencies

      • If your data does not come out in order, it can become quite complicated…

    • Behaviour of FE derandomizer will strongly depend on TFC commands

      • FE buffer depth should not rely on having a BX VETO! Aim at a bandwidth for full 40 MHz readout: BX VETO is solely to discard events synchronously.

      • What about SYNCH command? When do you think you can apply it? Ideally after derandomizer and after suppression/compression, but…

    • How many clock cycles do you need to recover from an NZS event?

      • Can you handle consecutive NZS events?


Qs & As?


System and functional requirements

  • Bidirectional communication network

  • Clock jitter, and phase and latency control

    • At the FE, but also at TELL40 and between S-TFC boards

  • Partitioning to allow running with any ensemble and parallel partitions

  • LHC interfaces

  • Event rate control

  • Low-Level-Trigger input

  • Support for old TTC-based distribution system

  • Destination control for the event packets

  • Sub-detector calibration triggers

  • S-ODIN data bank

    • Information about transmitted events

  • Test-bench support


The S-TFC system at a glance

  • S-ODIN responsible for controlling upgraded readout system

    • Distributes timing and synchronous commands

    • Manages the dispatching of events to the EFF

    • Regulates the rate of the system

    • Supports old TTC system: hybrid system!

  • SOL40 responsible for interfacing FE+TELL40 slice to S-ODIN

    • Fan-out of TFC information to TELL40

    • Fan-in of THROTTLE information from TELL40

    • Distributes TFC information to FE

    • Distributes ECS configuration data to FE

    • Receives ECS monitoring data from FE


S-TFC concept reminder


The upgraded physical readout slice

  • Common electronics board for upgraded readout system: Marseille’s ATCA board with 4 AMC cards

    • S-ODIN AMC card

    • LLT AMC card

    • TELL40 AMC card

    • LHC Interfaces specific AMC card


Latest S-TFC protocol to TELL40

We will provide the TFC decoding block for the TELL40: VHDL entity with inputs/outputs

  • «Extended» TFC word to TELL40 via SOL40:

    • 64 bits sent at 40 MHz = 2.56 Gb/s (on backplane)

    • packed with 8b/10b protocol (i.e. total of 80 bits)

    • no dedicated GBT buffer, use ALTERA GX simple 8b/10b encoder/decoder

  • MEP accept command when MEP ready:

    • Take MEP address and pack to FARM

    • No need for special address, dynamic

Constant latency after BXID

  • THROTTLE information from each TELL40 to SOL40:

    • no change: 1 bit for each AMC board + BXID for which the throttle was set

      • 16 bits in 8b/10b encoder

      • same GX buffer as before (same decoder!)
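The rates quoted on this slide follow from one word per 40 MHz clock plus the 8b/10b overhead (10 line bits for every 8 payload bits). A small Python check, where `rates_mbps` is a hypothetical helper name:

```python
CLOCK_MHZ = 40  # one word per bunch-crossing clock


def rates_mbps(payload_bits):
    """Return (payload rate, 8b/10b line rate) in Mb/s for one word per clock."""
    assert payload_bits % 8 == 0, "8b/10b encodes whole bytes"
    encoded_bits = payload_bits // 8 * 10
    return payload_bits * CLOCK_MHZ, encoded_bits * CLOCK_MHZ


print(rates_mbps(64))   # (2560, 3200): extended TFC word, 80 bits after encoding
print(rates_mbps(16))   # (640, 800): throttle word
```

So the 64-bit word corresponds to 2.56 Gb/s of payload and 3.2 Gb/s on the line, and the 16-bit throttle word to 0.64 Gb/s of payload and 0.8 Gb/s on the line, assuming it is also sent once per clock.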


S-TFC protocol to FE, no change

  • TFC word on downlink to FE via SOL40 embedded in GBT word:

    • 24 bits in each GBT frame at 40 MHz = 0.96 Gb/s

    • all commands associated to BXID in TFC word

  • Put local configurable delays for each TFC command

    • GBT does not support individual delays for each line

    • Need for «local» pipelining: detector delays + cables + operational logic (e.g. laser pulse?)

      • DATA SHOULD BE TAGGED WITH THE CROSSING TO WHICH IT BELONGS!

  • TFC word will arrive before the actual event takes place

    • To allow use of commands/resets for particular BXID

    • Accounting of delays in S-ODIN: for now, 16 clock cycles earlier + time to receive

    • Aligned to the furthest FE (simulation, then in situ calibration!)

  • TFC protocol to FE has implications on GBT configuration and ECS to/from FE

    • see specs document!


S-ODIN firmware v1r0 – block diagram


Timing distribution

  • From TFC point of view, we ensure constant:

    • LATENCY: Alignment with BXID

    • FINE PHASE: Alignment with best sampling point

  • Some resynchronization mechanisms envisaged:

    • Within TFC boards

    • With GBT

      • No impact on FE itself

  • Loopback mechanism:

    • re-transmit TFC word back

    • allows for latency measurement + monitoring of TFC commands and synchronization


How to decode TFC in FE chips?

FE electronic block

  • Use of TFC+ECS GBTs in FE is 100% common to everybody!!

  • dashed lines indicate the detector-specific interface parts

  • please pay particular care to the clock transmission: the TFC clock must be used by FE to transmit data, i.e. low jitter!

    • Kapton cable, crate, copper between FE ASICs and GBTX


The TFC+ECS GBT

[Figure: GBTX block diagram. Labels: Clock[7:0], external clock reference, phase shifter, CLK reference/xPLL, CLK manager, ePLLRx, ePLLTx, GBTIA, CDR, DEC/DSCR, SCR/ENC, SER, GBLD, phase aligners + Ser/Des for E-Ports, e-links and E-Ports to the FE modules (80, 160 and 320 Mb/s ports; data-down, data-up, clock), control logic, configuration (e-Fuses + reg-bank), I2C slave, I2C (light), one 80 Mb/s port, GBT-SCA (JTAG, I2C master; data, control, clocks), JTAG port, I2C port]

  • These clocks should be the main clocks for the FE

    • 8 programmable phases

    • 4 programmable frequencies (40, 80, 160, 320 MHz)

  • Used to:

    • sample TFC bits

    • drive Data GBTs

    • drive FE processes


The TFC+ECS GBT protocol to FE

  • TFC protocol has direct implications on the way in which GBT should be used everywhere

    • 24 e-links @ 80 Mb/s dedicated to TFC word:

      • use 80 MHz phase shifter clock to sample TFC parallel word

    • TFC bits are packed in GBT frame so that they all come out on the same clock edge

      • We can also repeat the TFC bits on consecutive 80 MHz clock edges if needed

  • Leftover 17 e-links dedicated to GBT-SCAs for ECS configuring and monitoring (see later)


Words come out from GBT at 80 Mb/s

  • In simple words:

    • Odd bits of GBT protocol on rising edge of 40 MHz clock (first, msb),

    • Even bits of GBT protocol on falling edge of 40 MHz clock (second, lsb)


TFC decoding at FE after GBT

  • This is crucial!!

  • we can already specify where each TFC bit will come out on the GBT chip

  • this is the only way in which FE designers still have minimal freedom with GBT chip

    • if TFC info was packed to come out on only 12 e-links (first odd then even), then decoding in FE ASIC would be mandatory!

    • which would mean that the GBT bus would have to go to each FE ASIC for decoding of TFC command

  • there is also the idea to repeat the TFC bits on even and odd bits in TFC protocol

    • would that help?

    • FE could tie logical blocks directly on GBT pins…
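The idea of placing one TFC bit per e-link, optionally repeated on both clock edges, can be sketched in Python as follows. The bit numbering is purely an assumption for illustration: e-link k is taken to carry frame bit 2k+1 on the rising edge and frame bit 2k on the falling edge, and the names `spread_tfc` and `sample_rising` are hypothetical.

```python
N_TFC_BITS = 24   # TFC word width, one e-link per bit


def spread_tfc(tfc_word, repeat=True):
    """Spread a 24-bit TFC word over a 48-bit field, one bit per e-link.

    TFC bit k goes to the odd position 2k+1 (rising edge); with repeat=True
    it is copied to the even position 2k (falling edge) as well.
    """
    field = 0
    for k in range(N_TFC_BITS):
        bit = (tfc_word >> k) & 1
        field |= bit << (2 * k + 1)
        if repeat:
            field |= bit << (2 * k)
    return field


def sample_rising(field):
    """What an FE sees if it samples every e-link on the rising edge only."""
    word = 0
    for k in range(N_TFC_BITS):
        word |= ((field >> (2 * k + 1)) & 1) << k
    return word


word = 0xABCDEF
assert sample_rising(spread_tfc(word)) == word
assert sample_rising(spread_tfc(word, repeat=False)) == word
```

In this layout every TFC bit sits on its own line and is valid on a single clock edge, so no decoding logic is needed; packing the same word onto 12 e-links would require each receiver to capture both edges and reassemble the word.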


Now, what about the ECS part?

  • Each pair of bits from ECS field inside GBT can go to a GBT-SCA

    • One GBT-SCA is needed to configure the Data GBTs (EC one for example?)

    • The rest can go to either FE ASICs or DCS objects (temperature, pressure) via other GBT-SCAs

      • GBT-SCA chip already has everything for us: interfaces, e-link ports…

        • No reason to go for something different!

      • However, «silicon for SCA will come later than silicon for GBTX»…

        • We need something while we wait for it!


SOL40 encoding block to FE!

  • Protocol drivers build GBT-SCA packets with addressing scheme and bus type for associated GBT-SCA user busses to selected FE chip

  • Basically each block will build one of the GBT-SCA supported protocols

Memory Map with internal addressing scheme for GBT-SCA chips + FE chips addressing, e-link addressing and bus type: content of memory loaded from ECS


Usual considerations …

  • TFC+ECS Interface has the ECS load of an entire FE cluster for configuring and monitoring

  • 34 bits @ 40 MHz = 1.36 Gb/s on single GBT link

    • ~180 Gb/s for full TFC+ECS Interface (132 links)

    • Single CCPC might become bottleneck…

    • Clara & us, December 2011

  • How long to configure FE cluster?

    • how many bits / FE?

    • how many FEs / GBT link?

    • how many FEs / TFC+ECS Interface?

  • Numbers to be pinned down soon + GBT-SCA interfaces and protocols.
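The bandwidth figures above, and the open question on configuration time, can be put into a few lines of Python. The functions are hypothetical helpers; `config_time_ms` ignores all protocol overhead and so only gives a lower bound, and any value passed for bits per FE or FEs per link is a placeholder since those numbers are still to be pinned down.

```python
CLOCK_MHZ = 40   # one GBT frame per bunch-crossing clock


def ecs_rate_gbps(bits_per_frame=34, n_links=1):
    """Raw ECS bandwidth in Gb/s for n_links GBT links."""
    return bits_per_frame * CLOCK_MHZ * n_links / 1000


def config_time_ms(bits_per_fe, fes_per_link, bits_per_frame=34):
    """Lower bound (ms) on the time to send the configuration of all FEs on one link."""
    total_bits = bits_per_fe * fes_per_link
    return total_bits / (bits_per_frame * CLOCK_MHZ * 1e6) * 1e3


print(ecs_rate_gbps())              # 1.36
print(ecs_rate_gbps(n_links=132))   # 179.52
```

The second value is the ~180 Gb/s quoted for the full interface, which is the load a single CCPC would have to sustain if all links were driven at full rate at the same time.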


Old TTC system support and running two systems in parallel

  • We already suggested the idea of a hybrid system:

    • reminder: L0 electronics relying on TTC protocol

    • part of the system runs with old TTC system

    • part of the system runs with the new architecture

  • How?

  • 1. Need connection between S-ODIN and ODIN (bidirectional)

    • use dedicated RTM board on S-ODIN ATCA card

  • 2. In an early commissioning phase ODIN is the master, S-ODIN is the slave

    • S-ODIN task would be to distribute new commands to new FE, to new TELL40s, and run processes in parallel to ODIN

    • ODIN tasks are the ones today + S-ODIN controls the upgraded part

      • In this configuration, upgraded slice will run at 40 MHz, but positive triggers will come only at maximum 1.1 MHz…

        • Great testbench for development + tests + apprenticeship…

        • By-product: improve LHCb physics programme in 2015-2018…

  • 3. In the final system, S-ODIN is the master, ODIN is the slave

    • ODIN task is only to interface the L0 electronics path to S-ODIN and to provide clock resets on old TTC protocol


S-ODIN on Marseille’s ATCA board


TFC+ECS Interface on Marseille’s ATCA board
