
FE-I4 Architecture & Performance

FE-I4 Architecture & Performance. Marlon Barbero, Universität Bonn. 2nd ATLAS CMS Electronics for SLHC, CERN, Mar. 04th 2009. FE-I4 for IBL & sLHC. IBL (~2014): inserted layer at 3.7 cm in the current pixel detector.


Presentation Transcript


  1. FE-I4 Architecture & Performance. Marlon Barbero, Universität Bonn. 2nd ATLAS CMS Electronics for SLHC, CERN, Mar. 04th 2009.

  2. FE-I4 for IBL & sLHC • IBL (~2014): inserted layer at 3.7 cm in the current pixel detector. • sLHC tentative layout (>2017): pixel layers at 3.7 cm, 7.5 cm, 16 cm, 20 cm (note: discussion on boundary pixel / short strips, …). [Figure: tentative ID layout for sLHC with IBL at R~37; labels: 2 layers long strips, 3 layers short strips, 4 layers pixels, FE-I4, fixed, removable. Credit: M. Garcia-Sciveres, ACES, Mar. 03rd 09.]

  3. FE-I4: Some Specifications • Pixel size: 50×250 μm². • Pixel array: 80 columns × 336 rows = 26880 pixels/FE. • FE-I4 dimensions: ~20×19 mm². • Analog goals: 1.5 V, 10 μA/pixel. Digital goals: 1.2 V, 10 μA/pixel. • Analog information: ToT coded on 4 bits. • Pseudo-LVDS output: 160 Mb/s. • Rad.-hardness: >200 Mrad ionizing dose (FE-I3: >50 Mrad). • Minimal guidelines: no ELT, nMOS guard rings for analog & sensitive digital circuitry. • Sensor capacitance: 0-0.5 pF. • Low noise at low capacitance (~100 e⁻). • DC leakage current tolerant: I_leak > 100 nA. (A. Mekkaoui, ACES, Mar. 04th 09)
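The specification figures above imply a few derived quantities. The following is a minimal sketch of that arithmetic using only numbers from the slide. The function names are illustrative, and adding the analog and digital goals into a single per-chip power figure is an assumption made here, not something stated in the talk.

```python
# Derived quantities from the FE-I4 specification slide (illustrative names).
COLUMNS, ROWS = 80, 336
PIXEL_W_UM, PIXEL_H_UM = 250, 50  # pixel size in micrometres


def pixel_count():
    """Total number of pixels per front-end chip."""
    return COLUMNS * ROWS


def active_area_mm():
    """Active width and height in mm (columns along 250 um, rows along 50 um)."""
    return COLUMNS * PIXEL_W_UM / 1000, ROWS * PIXEL_H_UM / 1000


def power_goal_w():
    """Per-chip power if every pixel draws exactly the analog + digital goal.

    Assumption: the two goals simply add; periphery power is ignored.
    """
    analog_uw = 1.5 * 10   # 1.5 V * 10 uA = 15 uW per pixel
    digital_uw = 1.2 * 10  # 1.2 V * 10 uA = 12 uW per pixel
    return pixel_count() * (analog_uw + digital_uw) * 1e-6


print(pixel_count(), active_area_mm(), round(power_goal_w(), 3))
```

Under these assumptions the pixel array alone would account for roughly 0.7 W per chip, which gives a sense of why the per-pixel current goals matter.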

  4. Digital Readout Architecture. Both FE readouts are based on a double-column (DC) structure. FE-I3: • All hit pixels are shipped to the EoC buffer. • A hit pixel needs to transfer its data to EoC before accepting a new hit → congestion. • Each pixel is logically independent inside the DC. (Figure label: bottleneck.) FE-I4: • Store data locally in the DC until L1T. • Only 0.25% of pixel hits are shipped to EoC → DC bus traffic "low". Warning: local buffer congestion??? • Each pixel tied to its neighbors, time info (real hits are clustered). TW out. (Figure labels: local storage, low traffic on DC bus.)
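The 0.25% figure above is consistent with the ratio of the trigger rate to the bunch-crossing rate. A small sketch of that check follows; the 100 kHz L1T rate is quoted on later slides of this talk, while the 40 MHz crossing rate (25 ns spacing) and the assumption that exactly one crossing is read per trigger are introduced here for illustration.

```python
BX_RATE_HZ = 40e6    # assumed: 25 ns bunch spacing
L1T_RATE_HZ = 100e3  # Level-1 trigger rate quoted elsewhere in the talk


def triggered_fraction(l1t_rate_hz=L1T_RATE_HZ, bx_rate_hz=BX_RATE_HZ):
    """Fraction of bunch crossings (and hence stored hits) selected by L1T."""
    return l1t_rate_hz / bx_rate_hz


print(f"{triggered_fraction():.2%}")
```

Only this fraction of locally stored hits has to travel over the double-column bus, which is the motivation for local storage.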

  5. Simulation (David Arutinov, Bonn). Two sources of inefficiency are identified in the FE-I4 architecture: • Pile-up inefficiency: ∝ (hit rate; mean(ToT); area) → untie neighbor pixels if needed & aggressive return to baseline. • Local buffer overflow → increase Logic Unit / Local Buffer Region (averaging-out effect) & increase # of cells per Local Buffer. Pile-up inefficiency: n/m = 1 + nτ, with n: true interaction rate, m: recorded count rate, τ: mean ToT. [Plot: pile-up inefficiency, analytical vs. simulation, mean ToT = 4: LHC 0.13%, 3xLHC 0.56%, sLHC 1.9%.]
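The pile-up relation on this slide can be turned into a lost-hit fraction, 1 - m/n = nτ / (1 + nτ). The sketch below assumes that reading of the formula, and takes an input rate of about 12 pixel hits per bunch crossing per cm² (roughly the 3.7 cm value on the 3×LHC rate slide later in the talk). The function names are illustrative, and tying of neighbor pixels is ignored.

```python
def hits_per_pixel_per_bx(rate_per_cm2_bx, pixel_area_um2=50 * 250):
    """Convert a hit density (hits / bx / cm^2) into hits / bx / pixel."""
    return rate_per_cm2_bx * pixel_area_um2 * 1e-8  # 1 um^2 = 1e-8 cm^2


def pileup_inefficiency(n_per_bx, mean_tot_bx):
    """Lost fraction 1 - m/n for the dead-time relation n/m = 1 + n*tau."""
    load = n_per_bx * mean_tot_bx
    return load / (1.0 + load)


n = hits_per_pixel_per_bx(12.0)  # assumed input rate, see lead-in
print(f"{pileup_inefficiency(n, mean_tot_bx=4):.2%}")
```

With these inputs the estimate comes out near 0.6%, the same order as the 0.56% quoted for 3xLHC; the exact inputs behind the slide's number are not given in the transcript.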

  6. Local Buffer Overflow (2×2), 3xLHC • Local buffer overflow inefficiency for the 3.7 cm layer at a latency of 120 BX (simulation and analytical): 0.5% with 5 cells, 0.1% with 6 cells, 0.01% with ~7-8 cells. [Plot: inefficiency vs. latency (BX) for 3xLHC and sLHC; annotations "x6" and "Latency 120 BX".]

  7. Inefficiency FE-I4 (2×2) • At 3 times LHC luminosity, r ~ 3.7 cm, FE-I4 inefficiencies should be in an acceptable range (mean ToT = 4). [Plot annotations: "x6", "0.6%".]

  8. Towards a reference design • Now all pixels in a buffer area are 'semi'-tied together. • Due to the smaller radius (3.7 cm vs. 5.05 cm), charge sharing in z becomes comparable with r/phi. • Buffer occupancy: p_n = Erlang-B function, with n: buffers occupied; k: total # of buffers; ρ = λμ, with λ: hit probability and μ: busy time. [Figure: 4-pixel region in a DC, shown for η=0 and η=2.5.]
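The formula itself did not survive in the transcript, so the sketch below uses the textbook Erlang-B blocking probability, B(ρ, k) = (ρ^k / k!) / Σ_{n=0..k} ρ^n / n!, evaluated with its standard recurrence. The hit probability of 0.01 per region per bunch crossing is a made-up illustrative value, not a number from the talk; the 120 BX busy time is the latency quoted on the overflow slide.

```python
def erlang_b(offered_load, n_buffers):
    """Erlang-B blocking probability via the standard recurrence.

    B(rho, 0) = 1;  B(rho, k) = rho * B(rho, k-1) / (k + rho * B(rho, k-1))
    """
    blocking = 1.0
    for k in range(1, n_buffers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Hypothetical inputs: lambda = 0.01 hits / region / BX, mu = 120 BX latency.
rho = 0.01 * 120
for cells in range(4, 9):
    print(cells, f"{erlang_b(rho, cells):.4%}")
```

Each added buffer cell lowers the overflow probability by a growing factor, which is the qualitative behavior shown on the local buffer overflow slide.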

  9. Region Schematic • A 4-pixel unit with these functionalities: • Time-stamping (up to 5 stored at a time). • ToT coded on 4 bits: no hit, small hit, long hit, analog values. • Neighbor bit. • Small hit → available to neighbor region. [Schematic labels: discriminators top left / top right / bottom left / bottom right; hit processing (TS/small/big/ToT); 5 ToT memories per pixel; 5 latency counters per region; Token; Read & Trigger; L1T; Read; Neighbor.]

  10. Digital Column Architecture • 168 regions + CLK + buffering scheme → 1 Double-Column. • Simple buffer. • H-tree. • Delay compensated for skew balancing.

  11. Work in progress (Tomasz Hemperek, Bonn). [Figures: region symbol; region layout (dimension labels as extracted: "188mm", "94mm", "50m"); DC schematic (x8); delay matching of the clk; drop on vdd; addresses.]

  12. FE-I4 Performance • Inefficiency at 3.7 cm @ 3xLHC: 0.56% (double-hit) + 0.05% (5-deep buffer overflow) (~0.35% + ~0.0065% for 16 cm at sLHC, 50 ns bx). • Area: → cells from provider, 100 × 102 μm². • Power: → 1 hit/bx/DC, 100 kHz L1T, 2.6 μW/pixel. Warning: this is before adding any buffering, clock distribution… → 5-6 μW digital total?

  13. Needs in Periphery • Focus needs to shift to periphery. • Command decoder → L1T / configuration. • Ctrl block → handles token pass, read request to DC, readout from DC. • Data formatting → data output protocol, compression, 8b10b. • Data transmission → pseudo-LVDS output from fast CLK. • Power blocks → regulator. • Pad frame.

  14. Status of FE-I4 Periphery. [Block diagram; legend: blocks in "advanced stage" vs. blocks where "effort needed" (the marking itself is not recoverable from the transcript). Pixel array: 80×336 pixels, 40 DCs each ending in an EoC, with L1T, token, read, … signals; data path 28 b × 40 DC. Periphery labels: ctrl block; trigger FIFO; asynchronous FIFO; data compression; data formatting (protocol) with error detection (parity/CRC?); "bypass-able" (twice); pixel config; global config; config. monitoring; DACs; interface; L1T; PLL (40 MHz in, 160 MHz out); clk select (40 MHz, 160 MHz, aux); 'LVDS'-out at 160 Mb/s; powering.]

  15. Summary FE-I4 Architecture • A lot of work was performed during the last 4-6 months on the digital region + digital Double-Column → it will remain high on the priority list in the coming months (performance studies, improvements, optimization). • Focus has started shifting to the FE periphery → much effort needed there (interface, data output protocol, control block…). • Validation, testability. • Milestones 2009: reviews foreseen for March and early summer; full-scale design completed in fall 2009. Needless to say, this is an aggressive schedule.

  16. FE-I4_proto1 collaboration • Participating institutes: Bonn, CPPM, Genova, LBNL, Nikhef. Bonn: D. Arutinov, M. Barbero, T. Hemperek, M. Karagounis. CPPM: D. Fougeron, M. Menouni. Genova: R. Beccherle, G. Darbo. LBNL: R. Ely, M. Garcia-Sciveres, D. Gnani, A. Mekkaoui. Nikhef: R. Kluit, J.D. Schipper. [Chip photos with dimension labels 3 mm and 4 mm; block labels: FE-I4-P1, 61×14 array, Control Block, LDO Regulator, Charge Pump, Capacitance Measurement, SEU test IC, DACs, Current Reference, ShuLDO + trist LVDS/LDO/10b-DAC, 4-LVDS Rx/Tx.]

  17. Backup slides

  18. FE-I4 • Originally developed as an IC for the b-layer upgrade. • Similar bandwidth for IBL and outer layers at sLHC (~2017) + construction of sLHC outer layers scheduled sooner than insertable inner layers → FE-I4 a good fit for both projects. • FE-I4 for IBL requires: hit rate ×4-5 w.r.t. FE-I3 at 5 cm; small pixel & big chip (active fraction); compatible w. present RO & ctrl; compatible w. different sensor types. • FE-I4 for outer layers @ sLHC requires: big chip for cost reduction; compatible w. sLHC RO & ctrl; lower current & compatible w. new powering schemes.

  19. Motivation for re-design of FE: FE-I3 → FE-I4 • Need for new FE: smaller b-layer radius + potential luminosity increase → higher hit rate; FE-I3 column-drain architecture saturated. → FE-I4 has a new digital architecture. → FE-I4 has a smaller pixel (reduced cross-section). • Enhancements brought to FE-I4: • Improved active area ratio (<¾ → 0.9) → bigger IC; reduced periphery; cost. • Power: analog design for reduced currents; decrease of digital activity (digital logic sharing for neighbor pixels); new powering concepts. • Adapt to sensor technologies with different capacitance / leakage. • New technology (0.25 μm → 130 nm) → availability, rad-hard, higher integration density for digital circuits. [Plot: inefficiency vs. hit prob. / DC for FE-I3 (5 cm), with LHC, 3xLHC and sLHC marked.]

  20. 4-pixel / 8-pixel • Local Buffer Overflow Inefficiency Quadri-pixel vs. Octo-pixel. Averaging out effect.

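  21. FE-I4 geometry • Pixel: 250 μm × 50 μm. • Array: 80 columns × 336 rows. • No bricking. • Active fraction: FE-I3 74%, FE-I4 ~89%. • Vendor's max chip size: 21 mm × 19.5 mm (review when above 20 mm). [Figure: FE-I4 outline (20.2 mm wide, ~19 mm high, 16.8 mm active, ~2 mm periphery, ~200 μm edge) next to FE-I3 (7.6 mm wide, 8 mm active, 2.8 mm periphery), inside the IBM reticle and the Chartered reticle (24 × 32).]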
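The quoted active fractions can be cross-checked against the dimension labels in the figure. The sketch below assumes the fraction is taken along the column direction only (active length over active length plus periphery), and that the 8 mm / 2.8 mm and 16.8 mm / ~2 mm labels are the active and periphery lengths of FE-I3 and FE-I4 respectively; both are readings of the figure residue rather than statements from the talk.

```python
def active_fraction(active_mm, periphery_mm):
    """Active length divided by total length along the column direction."""
    return active_mm / (active_mm + periphery_mm)


fe_i3 = active_fraction(8.0, 2.8)   # labels read from the FE-I3 outline
fe_i4 = active_fraction(16.8, 2.0)  # 336 rows * 50 um = 16.8 mm, ~2 mm periphery
print(f"FE-I3 {fe_i3:.0%}, FE-I4 {fe_i4:.0%}")
```

Under that assumption both ratios land on the values given on the slide, which suggests the improvement comes mostly from keeping the periphery short while making the array much longer.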

  22. Some target specs for FE-I4 • Rad.-hardness: >200 Mrad ionizing dose (FE-I3: >50 Mrad). • Minimal guidelines: no ELT, nMOS guard rings for analog & sensitive digital circuitry. • Sensor capacitance: 0-0.5 pF. • Low noise at low capacitance (~100 e⁻). • DC leakage current tolerant to > 100 nA.

  23. Clock Multiplier (I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force) • For IBL, need to transmit data out at a BW of 160 Mb/s. • 2 options: (1) Send an 80 MHz CLK to the FE and use both edges to transmit: needs modification of BOC / ROD to produce higher-speed TTC; needs a synchronization protocol on the FE between the 80 MHz clock & beam crossing; a new DORIC needs to decode the CLK at twice the frequency. (2) Send a 40 MHz CLK to the FE and multiply the clock on the FE: needs a clock multiplier on chip; note: synergy with what the strip MCC needs. • In FE-I4, we will provide both options: clock multiplier from the 40 MHz input clock; AUX: possibility to send the 80 MHz to the FE.

  24. 8b10b encoder (I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force) • For IBL, need to transmit data out at a BW of 160 Mb/s. • At BOC/ROD: data rate is 4 times the clock rate; phase adjustment; use a Clock Data Recovery (CDR) mechanism. • CDR requires an output data stream with good engineering properties. • 8b10b: adequate for this purpose, enough transitions for reliable CDR; widely used → easy to implement; provides some level of error detection; provides a comma for frame identification & synchronization.
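Since 8b10b maps every 8 data bits onto 10 line bits, the encoding costs a fixed 20% of the serial rate. A short sketch of the resulting link budget follows; the function names are illustrative and the calculation ignores any framing overhead on top of the coding.

```python
def payload_rate_mbps(line_rate_mbps, data_bits=8, code_bits=10):
    """Usable data rate after block coding (8b10b by default)."""
    return line_rate_mbps * data_bits / code_bits


def serial_to_clock_ratio(line_rate_mbps, clock_mhz):
    """How many serial bits are sent per period of the reference clock."""
    return line_rate_mbps / clock_mhz


print(payload_rate_mbps(160), serial_to_clock_ratio(160, 40))
```

At a 160 Mb/s line rate this leaves 128 Mb/s for hit data, and the factor of 4 relative to the 40 MHz clock is the "data rate 4 times the clock rate" the BOC/ROD has to handle.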

  25. PLL Overview. [Block diagram labels: Phase Frequency Detector, Charge Pump, Loop Filter, Voltage Controlled Oscillator, Frequency Divider, Conversion and Buffering; frequencies marked: 40 MHz and 640 MHz.]

  26. Analog Readout Chain • In FE-I4_proto1 (FE-I4 prototype submitted spring 2008): • 2-stage architecture optimized for low power, low noise, fast rise time. → Additional gain, Cc/Cf2 ~ 6. → More flexibility on choice of Cf1. → Qcoll less dependent on Cdetect. → 2nd stage decoupled from leakage-related DC potential shift. • 12 b configuration: → FDAC: tuning of feedback current. → TDAC: tuning of discriminator threshold. → Local charge injection circuitry. [Layout, 50 μm × 145 μm: Preamp, Amp2, discriminator, FDAC, TDAC, Config Logic.]

  27. Irradiation in 2008 • Sept., Los Alamos, 800 MeV p⁺ → FE-I4-Proto1 FE, #1 (50 Mrad) & #2 (100 Mrad). • Oct., CERN, 20 GeV p⁺ → SEU test chip + LVDS test chip (used for interface; received a low parasitic dose). • Dec., Los Alamos, 800 MeV p⁺: FE-I4-Proto1 chips #2 (an additional 100 Mrad) and #3 (200 Mrad); LVDS chips #1, #2 and #3, #4. [Photo labels: laser along beam line, LVDS RxTx, FE-I4-proto1, beam stop.]

  28. SEU-hardened latch • CPPM has studied the influence of various layouts of a DICE latch on the SEU cross-section. • Physical separation of sensitive node pairs. Latch5.1 and latch5.2; area: 12 μm × 4 μm = 48 μm²; nMOS separation: 7 μm; pMOS separation: 3 μm. • Triple Redundant Logic with Interleaved Layout. (Calin et al., IEEE Trans. Nucl. Sci., vol. 43, n. 6, 1996.) • Cross-section [cm²·bit⁻¹]: standard latch: ~5×10⁻¹⁴; DICE w. improved layout: ~3×10⁻¹⁶. [Figure: interleaved layout 1.a 2.a 3.a / 1.b 2.b 3.b; cross-section: < 1×10⁻¹⁷.]

  29. LVDS transceiver, IBM 130 nm • For IBL and outer layers at sLHC, need for a 320 Mb/s BW LVDS I/O. • LVDS transceiver IC irradiated up to ~180 Mrad. No degradation observed. • Tests with differential probe and 100 Ω on-board termination @ 1.2 V supply. [Scope captures: TX output and chained RxTx output at 320 MHz clock; grid of clock rates (320 MHz, 160 MHz, 40 MHz) and common-mode voltages (1050 mV, 600 mV, 150 mV). Chip photo with dimension labels 1.8 mm and 0.8 mm.]

  30. Output Stage: PLL & 8b10b (I/O choices for ATLAS IBL, ATLAS Pixel System Design Task Force) • Compatibility w. current BOC / ROD. • Clock multiplier from the 40 MHz input clock: classic PLL design (Phase Freq. Detector, Loop Filter, Voltage Controlled Oscill., Freq. Divider); Phase Frequency Detector w. Upset Detection Unit; settling in 1.2 μs; fast recovery from SEU in divider & Vctrl. • 8b10b: higher-frequency clk & data → recover clk from data; balanced coding for Clock Data Recovery in BOC / ROD; some nice features (error detection, frame alignment). • Both blocks by-passable for maximum flexibility.

  31. Out-Stage: tri-state pseudo-LVDS • MUXing FE output for outer layers. • M3-M6 steered by tri-state logic block → all switches can be left open → hZ. • Tri-state LVDS submitted. Testing is starting. [Schematic label: Tri-State logic.]

  32. Others • Note: • Low-power comparator. • Failsafe mechanism of LVDS receiver. • Pad-frame. • LDO with new 0-cell. • ShuLDO: Vin = 1.6 V; Vout = 1.2 V to 1.5 V. • A zero is introduced in the open-loop transfer function by a frequency-dependent voltage-controlled current source. • Less peaking of Vout in comparison with compensation by R_ESR of the C_out output capacitor. (Talk M. Karagounis, ID Powering, Tue. 24th 2009)

  33. MC events • Events (Pythia generator): WH (120 GeV); H → bb̄; overlaid with 24 / 75 / 240 / 400 events pileup: "LHC" / "3×LHC" / "sLHC" (25 ns / 50 ns bx). • Sensor: un-irradiated planar sensor, 260 μm width. Note: 3D simulation in progress (V. Kostyukhin, 3D Si, Mon. 23rd 2009). • Geometry (Geant3 simulation package): pixel size: FE-I3: 400×50 μm²; FE-I4: 250×50 μm². First: 4 barrels, 3.7 cm (FE-I4) & 5.05/8.85/12.25 cm radius (FE-I3). New: 6 barrels, 3.7/5.05/8.85/12.25/16/21 cm radius (FE-I4). • Threshold: first → 3750 e⁻; new → down to 1000 e⁻.

  34. Foreword: Minimum Bias events • FE-I4 for: b-layer upgrade: luminosity? radius? → 75 events pile-up & 3.7 cm; sLHC: luminosity? radius? → 240/400 events pile-up & outer layer. • Extrapolation to LHC energy → extrapolation @ 14 TeV: uncertainty ~30%? (first years of operation crucial to feed back into simulation). [Plots: <pt> of charged particles at η=0; <# charged particles> / interaction.]

  35. 3×LHC / b-layer replacement. FE-I3: 50 μm × 400 μm; FE-I4 simulation: 50 μm × 250 μm. Rates given in pixel hits·bx⁻¹·cm⁻² (seven values per layer, in the order listed on the slide):
      r = 122.5 mm: 1.41, 1.24, 1.26, 1.26, 1.37, 1.34, 1.33
      r = 88.5 mm: 2.55, 2.56, 2.54, 2.55, 2.64, 2.65, 2.64
      r = 50.5 mm: 6.30, 6.46, 6.03, 5.85, 5.91, 6.46, 6.11
      r = 37 mm: 12.10, 11.53, 12.01, 11.85, 11.72, 12.11, 8.02
      [Figure: r [mm] (0 to 200) vs. z [mm] (0 to 600) with lines of constant η from 0.1 to 3.5.]

  36. 10×LHC (25 ns bx) / sLHC. FE-I4, 50 μm × 250 μm. Rates given in pixel hits·bx⁻¹·cm⁻².
      r = 37 mm, values in the order listed on the slide: 36.89, 35.76, 35.97, 36.46, 35.94, 33.26, 23.23 (mean: 35)
      Means at the other radii, paired by label order on the slide: 50.5 mm: 19.5; 88.5 mm: 7.8; 122.5 mm: 4.7; 210 mm: 2.3
      [Figure: r [mm] vs. z [mm] (0 to 600, marks at 324 and 524) with lines of constant η from 0 to 3.5; additional radius labels 70, 131, 150, 160, 201 and "37/37"; layout legends "FE-I4 simul.", "FE-I4 Nigel", "FE-I4 sdtf 220908".]

  37. 10×LHC (50 ns bx) / sLHC. FE-I4, 50 μm × 250 μm. Rates given in pixel hits·bx⁻¹·cm⁻².
      r = 37 mm, values in the order listed on the slide: 61.18, 58.74, 60.02, 60.12, 59.15, 55.10, 38.67 (mean: 60)
      Means at the other radii, paired by label order on the slide: 50.5 mm: 34; 88.5 mm: 13.4; 122.5 mm: 8.4; 210 mm: 3.9
      [Figure: r [mm] vs. z [mm] (0 to 600, marks at 324 and 524) with lines of constant η from 0 to 3.5; additional radius labels 70, 131, 150, 160, 201 and "37/37"; layout legends "FE-I4 simul.", "FE-I4 Nigel", "FE-I4 sdtf 220908".]

  38. Extrapolations to other radii (hits/mm² vs. r [cm]) • sLHC, 50 ns bx / 400 events pileup: reasonable fit with exp(1.34 - 0.57*R) + 0.15 - 0.0053*R. • sLHC, 25 ns bx / 240 events pileup: reasonable fit with exp(0.86 - 0.58*R) + 0.088 - 0.0031*R.
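The two fit functions can be evaluated directly to extrapolate the hit density to any layer radius. In the sketch below, the assignment of each fit to a bunch-spacing scenario follows the order in which they appear on the slide, which is an assumption; the function names and the list of radii (the six-barrel layout from the MC events slide) are chosen here for illustration.

```python
import math


def hits_per_mm2_50ns(r_cm):
    """Fit quoted for sLHC, 50 ns bx / 400 pileup events (assumed pairing)."""
    return math.exp(1.34 - 0.57 * r_cm) + 0.15 - 0.0053 * r_cm


def hits_per_mm2_25ns(r_cm):
    """Fit quoted for sLHC, 25 ns bx / 240 pileup events (assumed pairing)."""
    return math.exp(0.86 - 0.58 * r_cm) + 0.088 - 0.0031 * r_cm


for r in (3.7, 5.05, 8.85, 12.25, 16.0, 21.0):
    # Multiply by 100 to convert hits/mm^2 into hits/cm^2.
    print(f"r = {r:5.2f} cm: "
          f"{100 * hits_per_mm2_50ns(r):5.1f} (50 ns), "
          f"{100 * hits_per_mm2_25ns(r):5.1f} (25 ns) hits/bx/cm^2")
```

With this pairing the fits give roughly 59 and 35 hits per bunch crossing per cm² at 3.7 cm, close to the means of 60 and 35 quoted on the two sLHC rate slides, which supports the assumed assignment.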

  39. Pixel occupancy → Data bandwidth • Pixel hit rate → FE output bandwidth: • # bits / pixel transmitted? Address 7+9 bits, analog info 4+2 bits → 22 b? • Data output protocol? • Reduce data output by taking into account the clustered nature of real physics hits. [Plots: number of pixels for an FE-I4 central module; panels labelled 3.7 cm layer (3xLHC), 3.7 cm layer (3xLHC and 10xLHC), 21 cm layer.]

  40. Pixel occupancy → Data bandwidth. Preliminary assumption: 100 kHz L1T, 336×80 pixels FE-I4. • Example 3: clustered data out with fixed format. • Compression factor (all at 3×LHC), 3.7 cm (vs. 21 cm), η=0. Rates in 10⁶ counts·FE⁻¹·s⁻¹ × bits per word = relative bandwidth in A.U.:
      • indiv. pixels: 4.09 (0.25) × (7+9+4+2) = 1.00 (1.00) A.U.
      • static 1×2: 3.45 (0.18) × (7+8+2×4+2) = 0.96 (0.83) A.U.
      • dynamic 1×2: 3.02 (0.15) × (7+9+2×4+2) = 0.87 (0.74) A.U.
      • static 1×4: 2.86 (0.17) × (6+8+4×4+4) = 1.08 (1.08) A.U.
      • dyn. in-DC 1×4: 2.43 (0.15) × (6+9+4×4+4) = 0.95 (0.95) A.U.
      • dynamic 1×4: 2.13 (0.14) × (7+9+4×4+4) = 0.85 (0.94) A.U.
      Dyn. 1×4 better at small R? (larger η!) Dyn. 1×2 at large R? Disclaimer: no header, trailer, DC-balancing, error correction… For reference in backup slides: same at higher η. [Word-format figure labels: DC (×40), column, row (×336), NL, ToT.]
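The relative bandwidths in the list above follow from multiplying each scheme's word rate by its word length and normalizing to the individual-pixel format. The sketch below reproduces the 3.7 cm column from the slide's own numbers; the dictionary layout and function name are illustrative, and reading the rate unit as 10⁶ words per FE per second is an interpretation of a garbled axis label.

```python
# (word rate in 1e6 / FE / s at 3.7 cm, bit fields per word) from the slide.
SCHEMES = {
    "indiv pixels":   (4.09, (7, 9, 4, 2)),
    "static 1x2":     (3.45, (7, 8, 2 * 4, 2)),
    "dynamic 1x2":    (3.02, (7, 9, 2 * 4, 2)),
    "static 1x4":     (2.86, (6, 8, 4 * 4, 4)),
    "dyn in-DC 1x4":  (2.43, (6, 9, 4 * 4, 4)),
    "dynamic 1x4":    (2.13, (7, 9, 4 * 4, 4)),
}


def relative_bandwidth(schemes, reference="indiv pixels"):
    """Bandwidth of each scheme relative to the reference format."""
    ref_rate, ref_bits = schemes[reference]
    ref_bw = ref_rate * sum(ref_bits)
    return {name: rate * sum(bits) / ref_bw
            for name, (rate, bits) in schemes.items()}


rel = relative_bandwidth(SCHEMES)
for name, value in rel.items():
    print(f"{name:15s} {value:.2f}")
```

Clustered formats send fewer words but longer ones, so the gain depends on how often neighboring pixels fire together; under the rate-unit interpretation above, the individual-pixel format corresponds to about 90 Mb/s before any header, trailer or coding overhead.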

  41. Pixel occupancy → Data bandwidth. Preliminary assumption: 100 kHz L1T, 336×80 pixels FE-I4. • Example 3: clustered data out with fixed format. • Compression factor (all at 3×LHC), 3.7 cm mod. 4 (vs. 21 cm mod. 6). Rates in 10⁶ counts·FE⁻¹·s⁻¹ × bits per word = relative bandwidth in A.U.:
      • indiv. pixels: 3.96 (0.26) × (7+9+4+2) = 1.00 (1.00) A.U.
      • static 1×2: 3.38 (0.20) × (7+8+2×4+2) = 0.97 (0.87) A.U.
      • dynamic 1×2: 3.05 (0.18) × (7+9+2×4+2) = 0.91 (0.79) A.U.
      • static 1×4: 2.28 (0.17) × (6+8+4×4+4) = 0.89 (1.01) A.U.
      • dyn. in-DC 1×4: 2.00 (0.15) × (6+9+4×4+4) = 0.80 (0.91) A.U.
      • dynamic 1×4: 1.88 (0.14) × (7+9+4×4+4) = 0.78 (0.85) A.U.
      Dyn. 1×4 better at small R? (larger η!) Dyn. 1×2 at large R? Disclaimer: no header, trailer, DC-balancing, error correction… [Word-format figure labels: DC (×40), column, row (×336), NL, ToT.]

  42. Data BW for IBL @ 3.7cm

  43. Data BW for sLHC
