Giga-Scale System-On-A-Chip International Center on System-on-a-Chip (ICSOC)


Presentation Transcript


  1. Giga-Scale System-On-A-Chip International Center on System-on-a-Chip (ICSOC) • Jason Cong, University of California, Los Angeles • Tel: 310-206-2775, Email: cong@cs.ucla.edu • (Other participants are listed inside)

  2. Background: “Double Exponential” Growth of Design Complexity • C1: complexity due to exponential increase of chip capacity • More devices • More power • Heterogeneous integration, … • C2: complexity due to exponential decrease of feature size • Interconnect delay • Coupling noise • EMI, … • Design Complexity ∝ C1 x C2

  3. Motivation: Productivity Gap • Complexity growth rate: 58%/Yr. • Productivity growth rate: 21%/Yr. • [Figure: Chip Capacity and Designer Productivity; axes: Logic Transistors/Chip (K) and Transistor/Staff-Month, plotted by year] • Source: NTRS’97
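
The two growth rates on this slide compound, so the gap between design complexity and designer productivity widens every year. A minimal sketch of that arithmetic, assuming both rates stay constant and both curves start from the same normalized value (neither assumption is stated on the slide; `gap_ratio` is a hypothetical helper name):

```python
def gap_ratio(years, complexity_rate=0.58, productivity_rate=0.21):
    """Factor by which complexity outgrows productivity after `years` years.

    Assumes constant annual growth (58%/yr and 21%/yr, as quoted on the
    slide) and that both quantities start at the same normalized value.
    """
    return ((1 + complexity_rate) / (1 + productivity_rate)) ** years


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years:2d} years the gap factor is {gap_ratio(years):.2f}")
```

Under these assumptions the gap grows by roughly 1.3x per year.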

  4. Project Summary • Develop a new design methodology to enable efficient giga-scale integration for system-on-a-chip (SOC) designs • The project includes three major components • SOC synthesis tools and methodologies • SOC verification, test, and diagnosis • SOC design driver: network processor

  5. Research Team by Institutions • US • UCLA: Jason Cong • UC Santa Barbara: Tim Cheng • Taiwan • NTHU: Shi-Yu Huang, Tingting Hwang, J. K. Lee, Youn-Long Lin, C. L. Liu, Cheng-Wen Wu, Allen Wu • NCTU: Jing-Yang Jou • China • Tsinghua Univ.: Jinian Bian, Xianlong Hong, Zeyi Wang, Hongxi Xue • Peking Univ.: Xu Cheng • Zhejiang Univ.: Xiaolang Yan

  6. Current Research Team • US • UCLA: Jason Cong • UC Santa Barbara: Tim Cheng • Taiwan • NTHU: Shi-Yu Huang, Tingting Hwang, J. K. Lee, Youn-Long Lin, C. L. Liu, Cheng-Wen Wu, Allen Wu • NCTU: Jing-Yang Jou • China • Tsinghua Univ.: Jinian Bian, Xianlong Hong, Zeyi Wang, Hongxi Xue • Peking Univ.: Xu Cheng • Zhejiang Univ.: Xiaolang Yan • Several new faculty members in the 7 institutions • Guest members from National University of Singapore, Purdue Univ., and UCLA (EE Dept)

  7. Thrust 1 -- SOC Synthesis Environment/Methodology (Led by Jason Cong) • [Figure: synthesis flow diagram; block labels in order of appearance] • Design Spec (VHDL/C) • VHDL/C Co-Simulation • Design Partitioning • ASIC Synthesis • Interconnect-Driven High-level Synthesis • Code Generation for Retargetable Compiler and Assembler Generator • DSP Synthesis and Optimization • FPGA Synthesis and Technology Mapping • Synthesis for IP Reuse • Physical Synthesis for Full-Chip Assembly • Target components: DSPs, Embedded FPGAs, Customized Logic, Embedded Processors

  8. Interconnect Bottleneck in Nanometer Designs • Challenge: Single-cycle full chip communication is no longer possible • Not supported by the current CAD toolset • ITRS’01 0.07 μm Tech • 5.63 GHz across-chip clock • 800 mm² (28.3 mm x 28.3 mm) • IPEM BIWS estimations • Buffer size: 100x • Driver/receiver size: 100x • On semi-global layer (tier 3): • Can travel up to 11.4 mm in one cycle • Need 5 clock cycles from corner to corner • [Figure: die map with distance marks at 0, 11.4, 22.8 and 28.3 mm and regions reachable in 1 to 5 cycles]
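
The 5-cycle figure on this slide follows from the die size and the per-cycle reach. A short sketch of the arithmetic, assuming the corner-to-corner wire is routed horizontally and vertically (Manhattan routing), which is an assumption here rather than something the slide states; the function names are illustrative:

```python
import math


def cycles_needed(distance_mm, reach_mm_per_cycle=11.4):
    """Clock cycles a signal needs to cover `distance_mm` of wire."""
    return math.ceil(distance_mm / reach_mm_per_cycle)


def corner_to_corner_cycles(die_side_mm=28.3, reach_mm_per_cycle=11.4):
    """Cycles to cross a square die from one corner to the opposite one.

    Assumption: Manhattan routing, so the wire length is twice the die side
    (2 x 28.3 mm = 56.6 mm for the slide's 800 mm^2 die).
    """
    return cycles_needed(2 * die_side_mm, reach_mm_per_cycle)


if __name__ == "__main__":
    print(corner_to_corner_cycles())  # 56.6 mm / 11.4 mm per cycle, rounded up: 5
```

A straight diagonal wire (about 40 mm) would need only 4 cycles at the same reach, so the Manhattan assumption is the one that matches the slide's number.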

  9. Regular Distributed Register Architecture • Use register banks: • Registers in each island are partitioned into k banks for 1-cycle, 2-cycle, …, k-cycle interconnect communication in each island • Highly regular • [Figure: grid of islands of size Wi x Hi; each island holds a Local Computational Cluster (LCC) (MUL, ADD, MUX, FSM; cluster with area constraint) and register files for 1-cycle, 2-cycle, …, k-cycle communication over the global interconnect]
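
One way to picture the k register banks is as a lookup from transfer latency to bank index. The sketch below is illustrative only: the latency rule (1 cycle inside an island, plus 1 cycle per island boundary crossed) and the names `Island`, `transfer_cycles` and `register_bank` are assumptions made for this example, not the rule used by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Island:
    col: int
    row: int


def transfer_cycles(src: Island, dst: Island) -> int:
    """Cycles for a value to travel from `src` to `dst`.

    Assumption (not from the slides): a transfer inside one island takes
    1 cycle and each island boundary crossed (Manhattan distance) adds 1.
    """
    hops = abs(src.col - dst.col) + abs(src.row - dst.row)
    return 1 + hops


def register_bank(src: Island, dst: Island, k: int) -> int:
    """Index (1..k) of the bank that holds a value for this transfer."""
    cycles = transfer_cycles(src, dst)
    if cycles > k:
        raise ValueError(f"transfer needs {cycles} cycles but only {k} banks exist")
    return cycles


if __name__ == "__main__":
    print(register_bank(Island(0, 0), Island(1, 1), k=3))
```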

  10. MCAS: Architectural Synthesis for Multi-Cycle Communication Using RDR Architecture • [Figure: MCAS (Multi-Cycle Architectural Synthesis) flow] • C program → CDFG generation → CDFG → Resource allocation & Functional unit binding → ICG → Scheduling-driven placement → Locations → Placement-driven rescheduling & rebinding → Register and port binding → Datapath & FSM generation • Outputs: RTL VHDL, Floorplan constraints, Multi-cycle path constraints

  11. MCAS flow vs. Synopsys Behavioral Compiler (on Virtex-II) • Synopsys Behavioral Compiler setting: default (optimizing latency) • Average latency ratio of MCAS vs. BC: 69% • [Charts: Latency; Resource]

  12. Optimality Study of Large-Scale Circuit Placement • Construction of Placement Example with Known Optimal (PEKO) [C. Chang et al, 2003] • Construct instances with a known optimal solution using the characteristics of the original problem • First quantitative evaluation of the optimality of circuit placement algorithms • Existing placement algorithms can be 70% to 150% away from the optimal
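
The idea of building a placement instance whose optimal wirelength is known by construction can be sketched as follows. This is a simplified illustration, not the PEKO generator itself: it assumes unit-size cells on a square grid, half-perimeter wirelength (HPWL) as the cost, and random net degrees, whereas the real benchmarks are derived from the characteristics of existing circuits. All function names are hypothetical.

```python
import math
import random


def min_hpwl(degree):
    """HPWL of `degree` unit cells packed into a near-square rectangle.

    For cells on distinct grid sites this is the smallest HPWL a net of
    that degree can have.
    """
    width = math.ceil(math.sqrt(degree))
    height = math.ceil(degree / width)
    return (width - 1) + (height - 1)


def build_peko_like(grid, num_nets, max_degree=4, seed=0):
    """Return (placement, nets, optimal_wirelength) for a grid x grid instance."""
    rng = random.Random(seed)
    # Cell id -> (x, y); this placement is the known optimum.
    placement = {r * grid + c: (c, r) for r in range(grid) for c in range(grid)}
    nets = []
    for _ in range(num_nets):
        degree = rng.randint(2, max_degree)
        width = math.ceil(math.sqrt(degree))
        height = math.ceil(degree / width)
        x0 = rng.randrange(grid - width + 1)
        y0 = rng.randrange(grid - height + 1)
        # Take the first `degree` cells of the rectangle in row-major order,
        # so every net sits at its individual minimum wirelength.
        nets.append([(y0 + i // width) * grid + (x0 + i % width)
                     for i in range(degree)])
    optimal = sum(min_hpwl(len(net)) for net in nets)
    return placement, nets, optimal


def total_hpwl(placement, nets):
    """Sum of half-perimeter wirelengths over all nets."""
    total = 0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


if __name__ == "__main__":
    placement, nets, optimal = build_peko_like(grid=8, num_nets=50)
    sites = list(placement.values())
    random.Random(1).shuffle(sites)
    shuffled = dict(zip(placement, sites))  # stand-in for a placer's output
    print("optimal:", optimal, "random placement:", total_hpwl(shuffled, nets))
```

Because every net is at its own minimum in the constructed placement, no other placement can have a smaller total, so the ratio of a placer's wirelength to `optimal` measures how far that placer is from the optimum.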

  13. High Interest in the Community • Coverage in three EE Times articles • Placement tools criticized for hampering IC designs [Feb’03] • IC placement benchmarks needed, researchers say [April’03] • FPGA placement performance [Nov’03] • More than 150 downloads from our website • Cadence, IBM, Intel, Magma, Mentor Graphics, Synopsys, etc. • CMU, SUNY, UCB, UCSB, UCSD, UIC, UMichigan, UWaterloo, etc. • Used in every placement since its publication • http://ballade.cs.ucla.edu/~pubbench

  14. Floorplanning & Interconnect Planning • Based on the proposed Corner Block List (CBL) representation, propose several extended Corner Block Lists (ECBL, CCBL and SUB-CBL) to speed up floorplanning and handle more complicated L/T-shaped and rectilinear blocks. • Propose floorplanning algorithms with geometric constraints, such as boundary, abutment, and L/T-shaped blocks. • Propose integrated floorplanning and buffer planning algorithms that take congestion into account. • Using research results from UCLA on interconnect planning • About 30 papers published in DAC, ICCAD, ISPD, ASPDAC, ISCAS and Transactions.

  15. P/G Network Analysis & Optimization • Propose area minimization of power distribution networks using efficient nonlinear programming techniques (ICCAD2001, accepted by IEEE Trans. on CAD) • Propose a decoupling capacitance optimization algorithm for robust on-chip power delivery (ASPDAC2004, ASICON2003)

  16. Parasitic R/L/C Extraction • 3-D R/C Extraction using Boundary Element Method (BEM) • Quasi-Multiple Medium (QMM) BEM algorithms • Hierarchical Block BEM (HBBEM) technique • Fast 3-D Inductance Extraction (FIE) • Papers were published in ASPDAC, ASICON and IEEE Transactions on MTT

  17. Thrust 2 -- SOC Verification, Test, and Diagnosis (Led by Tim Cheng) • [Figure: Verification and Testing topic diagram; labels in order of appearance] • Enabling techniques for semi-formal functional verification • Testing and diagnosis for heterogeneous SOC • Self-testing using on-chip programmable components • Self-testing for on-chip analog/mixed-signal components • Automatic/semi-automatic functional vector generation from HDL code • Scalable constraint-solving techniques • Integrated framework for simulation, vector generation and model checking • New test techniques for deep-submicron embedded memories

  18. Key Results - Verification • Developed and released ATPG-based SAT solvers for circuits (Univ. of California, Santa Barbara) • Integrating structural ATPG and SAT techniques with new conflict learning • CSAT: Fast combinational solver (released in March 2003) • Demonstrated 10-100X speedup over state-of-the-art SAT solvers on industrial test cases (reported by Intel and Calypto) • Has been integrated into Intel’s FV verification system and a startup’s verification engine • Publications: DATE2003 and DAC2003 • Satori2: Fast sequential solver (released in Dec. 2003) • Demonstrated 10X-200X speedup over a commercial, sequential ATPG engine on public benchmark circuits • Publications: ICCAD2003, HLDVT2003 and ASPDAC2004

  19. Key Results - Testing • A new Statistical Delay Testing and Diagnosis framework consisting of five major components (UCSB): • Statistical timing analysis • Statistical critical path selection [DAC’02, ICCAD’02] • Selecting statistical long & true paths whose tests maximize detection of parametric failures • Path coverage metric [ASPDAC’03] • Estimating the quality of a path set • Selection/Generation of high quality tests for target paths [ITC’01][DATE 2004] • Identifying tests that activate longer delay along the target path • Delay fault diagnosis based on statistical timing model [DATE’03, VTS’03, DAC’03] • Ref: Krstic, Wang, Cheng, & Abadir, DATE’03 (Best Paper Award in Test) • [Figure: framework diagram with blocks: ATPG/Pattern Selection; Diagnosis; Critical Path Selection; Defect Injection & Simulation; Path Filtering; Dynamic Timing Simulator; Static Timing Analysis; Statistical Timing Analysis Framework (Cell-based characterization)]

  20. Key Results - Testing • On-Chip Jitter Extraction for Bit-Error-Rate (BER) Testing of Multi-GHz Signal (UCSB) • Using on-chip, single-shot measurement unit to sample signal periods for spectral analysis • Demonstrated, through simulation, accurate extraction of multiple sinusoids and random jitter components for a 3GHz signal • Publications: ASPDAC2004 and DATE2004

  21. Thrust 3 – Design Driver: Network Security Processor (Led by Prof. C. W. Wu & Xu Cheng) • Applications: IPSec, SSL, VPN, etc. • Functionalities: • Public key: RSA, ECC • Secret key: AES • Hashing (Message authentication): HMAC (SHA-1/MD5) • True random number generator (FIPS 140-1, 140-2 compliant) • Target technology: 0.18 μm or below • Clock rate: 200 MHz or higher (internal) • 32-bit data and instruction word • 10 Gbps (OC192) • Power: 1 to 10 mW/MHz at 3 V (LP to HP) • Die size: 50 mm² • On-chip bus: AMBA (Advanced Microcontroller Bus Architecture)

  22. Encryption Modules (PKEM) • Public key encryption module • Operations: • 32-bit word-based modular multiplication • Multiplication over GF(p) and GF(2^m) • An RSA cryptography engine with small area overhead and high speed • Scalable word-width • TSMC 0.35 μm • 34K gates (1.7 × 1.8 mm²) • 100 MHz clock • Scalable key length • Throughput • 512-bit key: 1.79 Kbps/MHz • 1024-bit key: 470 bps/MHz
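
The per-MHz throughput figures on this slide can be converted into absolute rates at the quoted 100 MHz clock. A small sketch of that conversion, assuming throughput scales linearly with clock frequency and that one RSA operation processes as many bits as the key length (both are assumptions for illustration; the helper names are hypothetical):

```python
def throughput_bps(bps_per_mhz, clock_mhz):
    """Absolute throughput in bits per second at the given clock."""
    return bps_per_mhz * clock_mhz


def operations_per_second(bps_per_mhz, clock_mhz, key_bits):
    """RSA operations per second, assuming one operation covers `key_bits` bits."""
    return throughput_bps(bps_per_mhz, clock_mhz) / key_bits


if __name__ == "__main__":
    # Figures from the slide: 1.79 Kbps/MHz (512-bit key), 470 bps/MHz (1024-bit key).
    for key_bits, rate in ((512, 1790), (1024, 470)):
        bps = throughput_bps(rate, 100)
        ops = operations_per_second(rate, 100, key_bits)
        print(f"{key_bits}-bit key: {bps / 1000:.0f} Kbps, about {ops:.0f} operations/s")
```

At 100 MHz this gives 179 Kbps for 512-bit keys and 47 Kbps for 1024-bit keys.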

  23. Encryption Modules (SKEM) • Secret key encryption module • Operations: • Matrix operations, manipulation • AES cryptography • 32-bit external interface • 58K gates • Over 200 MHz clock • Throughput: 2 Gbps • Supports key lengths of 128/192/256 bits

  24. International Collaborations • Joint NSF/NSC workshop in Aug. 1999 on SOC (Hsinchu, Taiwan) • First team preparation meeting for the proposed center in Jan. 2000 (Yokohama, Japan) • 2nd planning meeting held in April 2000 (Hawaii, US) • 3rd planning meeting in Aug. 2000 (Chengde, China) • Proposal submitted to NSF in Aug. 2000 and funded in Dec. 2000 • Workshops • March 30-31, 2001 in Taipei, Taiwan • June 23-24, 2001 in Los Angeles, USA • August 31-September 1, 2001 in Hangzhou, China • March 28-29, 2002, National Tsing Hua University, Hsinchu, Taiwan • August 20-21, 2002, Peking University, Beijing, China • November 15-16, 2002, University of California, Santa Barbara • March 27-29, 2003, National Taiwan University, Taipei, Taiwan • December 19-21, 2003, Yunnan University, Kunming, China

  25. Publications • 56 research publications up to this point • 17 in top conferences/journals (DAC, ICCAD, ASPDAC, ITC, etc.) in the field

  26. People & Education • Many interactions among participants from different institutions • Two new IEEE fellows: • Prof. Xianlong Hong, Tsinghua Univ. • Prof. Cheng-Wen Wu, National Tsing Hua Univ. • Involved many young faculty members and researchers • Trained an army of graduate students
