
Course Information Instructor CK Cheng, ckcheng+291@ucsd, 858 534-6184 Schedule

Course Information Instructor CK Cheng, ckcheng+291@ucsd.edu, 858 534-6184 Schedule Lectures: 5:00-6:20PM, TTH, CSE 2217 Textbooks (H) High Speed Signal Propagation: Advanced Black Magic Howard Johnson and Martin Graham (D) Digital Systems Engineering William J. Dally, John W. Poulton

dkrueger

Presentation Transcript


  1. Course Information
     • Instructor
       • CK Cheng, ckcheng+291@ucsd.edu, 858 534-6184
     • Schedule
       • Lectures: 5:00-6:20PM, TTH, CSE 2217
     • Textbooks
       • (H) High Speed Signal Propagation: Advanced Black Magic, Howard Johnson and Martin Graham
       • (D) Digital Systems Engineering, William J. Dally, John W. Poulton
     • Content
       • 1. Structure of Interconnect and Packaging
       • 2. Electrical and Physical Scaling
       • 3. Interconnect Modeling: Wire and Transmission Line Models
       • 4. Interconnect Signaling
       • 5. Transmitters and Receivers
       • 6. Power Distribution Network
       • 7. Clock Distribution
       • 8. Extraction and Simulation
       • 9. Thermal Issues

  2. System Example: Blue Gene/L 2005
     • Overall View
       • 8x8 racks: 65,536 compute nodes
       • 25 kW/rack
       • 2 midplanes/rack
       • 16 node cards/midplane
       • 16 compute cards/node card
       • 2 PUs/compute card
       • 64x32x32 torus
       • 1.4 Gb/s differential link, 700 MHz clock
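The hierarchy figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check: the variable names are illustrative, and it assumes each PU counts as one compute node, which is what makes the product match the quoted 65,536.

```python
# Figures copied from the slide; names are illustrative, not from the source.
racks = 8 * 8
midplanes_per_rack = 2
node_cards_per_midplane = 16
compute_cards_per_node_card = 16
pus_per_compute_card = 2  # assumption: one PU = one compute node

compute_nodes = (racks * midplanes_per_rack * node_cards_per_midplane
                 * compute_cards_per_node_card * pus_per_compute_card)

# The 64x32x32 torus should have one position per compute node.
torus_nodes = 64 * 32 * 32

# Rack power only, at 25 kW per rack.
total_rack_power_kw = racks * 25

print(compute_nodes, torus_nodes, total_rack_power_kw)
```

Both products come to 65,536, so the packaging hierarchy and the torus dimensions describe the same node count.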

  3. System Example: Blue Gene/L 2005
     • Compute card
       • 14.3 W/ASIC node: power density 10.4 W/cm2
       • 206 mm x 55 mm/compute card
       • 14 layers: 6 signal, 8 power

  4. System Example: Blue Gene/L 2005
     • Air cooling
       • 25 kW / (0.91 x 0.91 m2) ≈ 3 W/cm2
       • Air displacement 1.4 m3/s, average velocity 6.7 m/s
       • Fan speed is optimized individually
       • Plenum: θ, β, EMI screen
       • Elliptical vane
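The rack heat flux quoted on this slide follows from the rack power and footprint. The sketch below reproduces that figure; the names are illustrative. The last step, dividing the air displacement by the average velocity, is a derived quantity that does not appear on the slide and is included only as a rough consistency check.

```python
# Rack power and footprint from the slide.
rack_power_w = 25_000.0
rack_side_m = 0.91

# Footprint in cm^2 (1 m = 100 cm).
footprint_cm2 = (rack_side_m * 100.0) ** 2
heat_flux_w_per_cm2 = rack_power_w / footprint_cm2

# Derived, not stated on the slide: the flow cross-section implied by
# the quoted air displacement and average velocity.
air_flow_m3_per_s = 1.4
avg_velocity_m_per_s = 6.7
implied_flow_area_m2 = air_flow_m3_per_s / avg_velocity_m_per_s

print(f"heat flux: {heat_flux_w_per_cm2:.2f} W/cm2")
print(f"implied flow area: {implied_flow_area_m2:.3f} m2")
```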

  5. System Example: Blue Gene/L 2005
     • Clock: length matching, differential pairs with termination
     • Interconnect: pre-emphasis, on-chip termination
     • Vdd/Vss noise: 185-100 ps delay
     • Midplane: reduce longest path between boards
     • 18 layers
     • 190-215 um width trace at 1.0 ounce copper for 100 ohm differential pairs
     • 100 um width trace at 0.5 ounce copper for short wires

  6. System Example: z196
     • z196 (2012): 45 nm tech, released 9/2010
     • 96 cores, 5.2 GHz, 770 GB memory/node
     • 3 kW/PU book, 4 PU books/backplane
     • MCM
       • 1 MCM/PU book, 2 kA/MCM
       • 6 PCs, 2 cache/MCM
       • 96x96 mm2, 103 layers, 7,356 pins/MCM

  7. System Example: z196
     • Water cooling option
       • Humidity and atmospheric pressure -> dew point + 6°C
       • 3.25 gallons/minute for each processor module
       • Lower temperatures -> lower processor power consumption
       • No refrigeration compressors
       • Air conditioning of the room: energy reduced by a factor of 3
       • Save 4 kW/4 PU books
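The slide does not say how the dew point is computed. As a minimal sketch, the code below uses the Magnus approximation, a standard formula swapped in here for illustration; it depends only on air temperature and relative humidity and ignores the atmospheric pressure term mentioned on the slide. Reading "dew point + 6°C" as a lower limit on water temperature is also an assumption, and the function names are hypothetical.

```python
import math

# Magnus coefficients for water vapor over liquid water (a common choice).
MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c: float, rel_humidity_pct: float) -> float:
    """Approximate dew point in degrees Celsius via the Magnus formula."""
    gamma = (math.log(rel_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def min_water_temp_c(air_temp_c: float, rel_humidity_pct: float,
                     margin_c: float = 6.0) -> float:
    """Assumed reading of the slide: keep water at dew point + margin."""
    return dew_point_c(air_temp_c, rel_humidity_pct) + margin_c


# Example room conditions (hypothetical values, not from the slide).
print(f"{min_water_temp_c(25.0, 50.0):.1f} C")
```

Keeping the water above the dew point by a margin avoids condensation on the cold plates, which would explain why no refrigeration compressors are needed.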

  8. System Example: z196
     • Power distribution at ±5% tolerance
     • Locate power conversion close to the chip
     • DCA -> 40-48 V
     • Gearboxes -> 1.1 V, 17 power domains
     • Feedback control
     • Redundancy N+2 (N=2), V, I, T sensing for failure detection
     • Previous version: 600 W copper losses, 5 ounce metal plane
     • Now 400 W (1/3 on copper, 2/3 on power conversion)
     • Deep trench capacitor: 25 times density, 15 uF -> 5.2 GHz on chip
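The loss split and tolerance band on this slide can be worked out directly. This is a rough sketch with illustrative names. It assumes the ±5% tolerance applies to the 1.1 V rail, which the slide suggests but does not state outright.

```python
# Loss budget from the slide: 400 W total, 1/3 copper, 2/3 conversion.
total_loss_w = 400.0
copper_loss_w = total_loss_w / 3.0
conversion_loss_w = total_loss_w * 2.0 / 3.0

# Previous design: 600 W lost in copper alone.
previous_copper_loss_w = 600.0
copper_reduction_factor = previous_copper_loss_w / copper_loss_w

# Assumption: the +/-5% tolerance is taken around the 1.1 V rail.
nominal_v = 1.1
tolerance = 0.05
v_min = nominal_v * (1.0 - tolerance)
v_max = nominal_v * (1.0 + tolerance)

print(f"copper: {copper_loss_w:.1f} W, conversion: {conversion_loss_w:.1f} W")
print(f"copper loss reduced by a factor of {copper_reduction_factor:.1f}")
print(f"allowed rail window: {v_min:.3f} V to {v_max:.3f} V")
```

Under that assumption the allowed window is only about 110 mV wide, which is consistent with placing power conversion close to the chip.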

  9. System Example: z196
     • Power network impedance evaluation (10 mΩ)
       • Set on-off sequence for clock tree to create stimulus pattern
       • Measure voltage with probes
       • Average 7,864 times, 2M samples for 2 ms interval at 1 GHz sample rate
       • Z(f) = V(f)/I(f)
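The measurement flow on this slide (stimulate, capture, average, divide spectra) can be sketched on synthetic data. The code below is not the procedure used on the z196. It generates a hypothetical sinusoidal current stimulus, builds noisy voltage captures from a known 10 mΩ impedance, averages them, and evaluates Z(f) = V(f)/I(f) at the stimulus frequency with a direct single-bin DFT. The capture length, number of averages, noise level, and all names are illustrative choices, kept far smaller than the slide's figures so the example runs quickly.

```python
import cmath
import math
import random

# Sample count implied by the slide: 2 ms at 1 GHz sample rate.
samples_per_capture = round(2e-3 * 1e9)


def dft_bin(samples, k):
    """k-th DFT coefficient of a real sequence, computed by direct summation."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))


def average_captures(captures):
    """Point-by-point average of equally long captures to suppress noise."""
    count = len(captures)
    return [sum(col) / count for col in zip(*captures)]


# Synthetic stand-in for the measurement (all values hypothetical).
random.seed(0)
N = 256            # samples per synthetic capture (the slide uses 2M)
K = 8              # DFT bin of the stimulus frequency
Z_TRUE = 0.010     # 10 mOhm, the target figure on the slide
N_CAPTURES = 200   # the slide averages 7,864 captures

current = [10.0 * math.sin(2 * math.pi * K * i / N) for i in range(N)]
captures = [[Z_TRUE * i_t + random.gauss(0.0, 0.05) for i_t in current]
            for _ in range(N_CAPTURES)]

v_avg = average_captures(captures)
z = dft_bin(v_avg, K) / dft_bin(current, K)   # Z(f) = V(f) / I(f)

print(f"samples per capture on the slide: {samples_per_capture}")
print(f"estimated |Z| at stimulus bin: {abs(z) * 1e3:.2f} mOhm")
```

Averaging matters because the voltage ripple across a 10 mΩ network is small compared with probe noise; the noise falls with the square root of the number of captures while the stimulus response stays fixed.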
