Challenges and opportunities for FPGA platforms

Challenges and opportunities for FPGA platforms. Ivo Bolsens, Xilinx Research Labs. Thanks to Bill Carter, David Eden, Erich Goetting, Alireza Kaviani, Bernie New, Cameron Patterson, Steve Trimberger and Tim Tuan. Overview: FPGAs ride the tide; opportunities; challenges.

Presentation Transcript


  1. Challenges and opportunities for FPGA platforms • Ivo Bolsens, Xilinx Research Labs

  2. Thanks to • Bill Carter • David Eden • Erich Goetting • Alireza Kaviani • Bernie New • Cameron Patterson • Steve Trimberger • Tim Tuan Xilinx Confidential

  3. Overview • FPGAs ride the tide • Opportunities • Challenges

  4. ASICs buck the tide, FPGAs ride the tide • Process Technology • Performance • Architecture • Cost • Flexibility • Market trends

  5. Moore’s Law • A tale of two numbers: what process people don’t tell you • [Figure: transistor cross-section (gate, source, drain, substrate) marked with Tox, gate leakage and channel leakage; CD values 320nm, 240nm, 160nm, 80nm; Tox values in nm as listed in the transcript: 2.7, 4.5, 6.5, 1.3]

  6. Trend: Line Widths Smaller Than the Wavelength of Light • [Figure: process geometry (micron, 0 to 0.700) and optical processing wavelength versus year, 1988 to 2002]

  7. Painting a one cm line with a three cm brush… • Courtesy: IBM

  8. Gate Oxide • [Figure: polysilicon gate, gate oxide, silicon crystal] • About 10 molecular layers of SiO2 for this 150nm example • 90nm technology is about half the thickness

  9. Xilinx FPGAs are ahead of the curve • Virtex-II FPGA to market 1 year earlier • Cu/Low-K • Xilinx is developing 90nm in 2002 • [Figure: process technology feature size (nm: 350, 250, 180, 150, 130, 100, 70) versus year (1997 to 2005), compared with the SIA Roadmap]

  10. Where are we today • [Figure: XC2V8000 and XC2VP125 compared on logic cells, Block RAM, multipliers, PowerPC CPUs, 3.125Gb/s MGTs and 840Mb/s LVDS; values as listed in the transcript: 4, 24, 556, 442, 10Mb, 125K, 105K, 340, 168, 3Mb] • XC2V8000 = 350M transistors

  11. FPGAs are leading Intel’s Roadmap • Source: Intel

  12. Gate count requirement for ASICs • FPGAs can address a very large part of the ASIC market today • Source: IMS

  13. Performance requirement for ASICs • FPGAs can address a very large part of the ASIC market today • Source: IMS 2000

  14. A Decade of Progress • [Figure: capacity, speed and price on a log scale (1x to 1000x) versus year, 1/91 to 1/01; device families marked: XC4000, Spartan, Virtex & Virtex-E (excl. Block RAM), Virtex-II (excl. Block RAM)]

  15. The Cost/Volume Crossover • [Figure: relative cost (0.1 to 1000, log scale) versus unit volume (10 to 1,000K) for ASIC cost and FPGA cost]
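
The crossover idea on this slide can be sketched with a simple linear cost model. This is a minimal illustration, not the model behind the chart: it assumes total cost is a fixed non-recurring engineering (NRE) charge plus a constant per-unit price, and every number and name below (`total_cost`, `crossover_volume`, the dollar figures) is hypothetical.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of shipping `volume` units: fixed NRE plus per-unit cost."""
    return nre + unit_cost * volume


def crossover_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume at which ASIC and FPGA total costs are equal.

    Returns None when the FPGA unit cost is not higher than the ASIC
    unit cost, because the two cost lines then never cross in the usual
    way (the FPGA stays cheaper if its NRE is also lower).
    """
    if fpga_unit <= asic_unit:
        return None
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical figures, chosen only to illustrate the shape of the curve.
volume = crossover_volume(asic_nre=1_000_000, asic_unit=10,
                          fpga_nre=0, fpga_unit=60)
print(volume)
```

Below the returned volume the FPGA is cheaper in this model; above it the ASIC's lower unit cost outweighs its NRE.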

  16. Are Transistors Free? • 1 pin • 10,000 transistors • 10,000X

  17. Performance Scaling • Source: ITRS

  18. Localisation of storage and computing • [Figure: arrays of compute (+) and store elements at pitch l and l/2, annotated with Tconnect (nsec) and heat/area; the scaling expressions are not recoverable from the transcript] • Courtesy: IMEC

  19. Post-PC Era • A mass market for one person • [Figure: number of devices per human (0.01, 1, >100) versus decade, 1960s to 2010s, for Mainframe, PC and Smart Things; market requirements: compute power, +DSP, +communications, +ambient intelligence]

  20. Electronics Industry Dynamics • New Products • Take less time to reach high volumes • Shorter Product Life Cycles • Many standards / More Interoperability • Dramatic increase in new standards • [Figure: market size ($) versus time for successive products: smart cards (DES), cable decoders, digital VCR with custom features (Pay-Per-View), satellite/cable + digital VCR, residential gateway (broadband access); standards listed: NTSC, DES, ATAPI, DBS, DOCSIS, HomePNA, HomeRF, HomePLUG, Bluetooth, Hiperlan2, DSL...]

  21. Complex ASIC Design: The Shrinking Window of Innovation • [Figure: pie chart of design effort by task; labels: design authoring, interconnect analysis, power analysis, transistor simulation, extraction, place and route, floorplanning, static timing analysis, gate simulation, simulation, synthesis; shares as listed in the transcript: 3%, 20%, 3%, 5%, 5%, 17%, 5%, 7%, 5%, 14%, 16%] • Average iterations between design and layout = 20 (Source: Electronic Systems, Jan 99)

  22. Simpler/Faster Design Flows • 2:1 proven Time-to-Market Advantage • No silicon design or verification steps • More design flexibility through later design freeze • ASIC Flow: Spec, Design and Verification, Silicon Prototype, System Integration, Silicon Production • FPGA Flow: Spec, Design and Verification, System Integration • [Figure: both flows on a common timeline with the design freeze marked on each]

  23. Today’s Product Lifecycle • 37% of new digital products were late to market • Entering the market first can result in up to a 40% greater total profit contribution over the product’s life vs. the #2 entrant • [Figure: profit versus time; profit for first to market, reduced profit for latecomers]

  24. Today’s Product Lifecycle • IRL extends product life in market • 37% of new digital products were late to market • Entering the market first can result in up to a 40% greater total profit contribution over the product’s life vs. the #2 entrant • [Figure: profit versus time]

  25. Virtex-II Pro PowerPC Technology • IBM PowerPC™ 405 RISC CPU • 32-bit RISC CPU, Harvard Architecture • 130nm CMOS with 1.5V Operation • 456 Dhrystone MIPS at 300MHz • 32 x 32-bit General Purpose Registers • Hardware Multiply / Divide • 5-Stage Execution Pipeline • 16KB D-Cache, 16KB I-Cache • Memory Management Unit (MMU) • High-Bandwidth Interface to Logic • Built-In Hardware Timers • Built-In JTAG Debug and Trace support • 3.8 sq mm = 1% of 2VP100 • [Figure: block diagram with I-Cache 16KB, Fetch & Decode, Timers and Debug Logic, D-Cache 16KB, MMU, Execution Unit (32x32b GPR, ALU, MAC)]

  26. High Performance • [Figure: Dhrystone MIPS (axis 100 to 1600) for Altera Excalibur ARM9 and Virtex-II Pro PowerPC 405 with 1 CPU, 2 CPUs and 4 CPUs; values shown: 220, 456, 912, 1824]
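
The chart values 456, 912 and 1824 are consistent with simply multiplying the single-core figure by the CPU count. The sketch below reproduces that arithmetic; it assumes ideal linear scaling with no shared-resource contention, which is how the slide presents the numbers rather than a measured result, and the function names are hypothetical.

```python
DMIPS_PER_CPU = 456      # Dhrystone MIPS per PowerPC 405, from slide 25
CLOCK_MHZ = 300          # clock at which that figure is quoted, from slide 25


def aggregate_dmips(cpu_count):
    """Aggregate Dhrystone MIPS, assuming ideal linear scaling across CPUs."""
    return cpu_count * DMIPS_PER_CPU


def dmips_per_mhz():
    """Per-clock efficiency of one core implied by the quoted figures."""
    return DMIPS_PER_CPU / CLOCK_MHZ


for n in (1, 2, 4):
    print(n, "CPU(s):", aggregate_dmips(n), "DMIPS")
print("DMIPS/MHz:", round(dmips_per_mhz(), 2))
```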

  27. “Low PowerPC”: 0.59mW/MIPS • Full-Custom IBM CPU Design • 1.5V 130nm CMOS Technology • Low-K Dielectric • IP-Immersion • 100mW = 1 LED Indicator …or 169 MIPS! • [Figure: power (mW, 0 to 400) versus performance (Dhrystone MIPS, 0 to 400)]
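
The "100mW = 169 MIPS" comparison follows directly from the 0.59mW/MIPS figure. A minimal sketch of that arithmetic is below; it assumes power scales linearly with delivered MIPS, as the slide's single ratio implies, and the helper names are hypothetical.

```python
MW_PER_MIPS = 0.59  # power efficiency quoted on the slide


def mips_for_power(budget_mw):
    """Dhrystone MIPS available within a power budget, assuming a constant mW/MIPS ratio."""
    return budget_mw / MW_PER_MIPS


def power_for_mips(mips):
    """Power in mW needed for a given Dhrystone MIPS rating under the same assumption."""
    return mips * MW_PER_MIPS


print(int(mips_for_power(100)))        # MIPS within a 100 mW budget, truncated
print(round(power_for_mips(456), 1))   # mW at the 456 DMIPS quoted on slide 25
```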

  28. IP-Immersion • Embed multiple IP blocks of arbitrary shape with high-bandwidth connectivity to FPGA core logic, memory & I/O • Technologies Enabling IP-Immersion: Active Interconnect™ Segmented Routing, Metal ‘Headroom’ • [Figure: two PPC blocks in the fabric; layer stack from silicon substrate and poly through Metal 1 to Metal 9, with an advanced hard-IP block (e.g. PowerPC CPU)]

  29. System Architecture Options • “Logic-Centric Architecture”: PowerPC executes entirely out of cache; no FPGA logic, memory, or I/O used; 10-20 pages of C code or more; use as a complex algorithmic engine (web server, encryption/decryption, packet processor) • “CPU-Centric Architecture”: PowerPC forms the heart of the embedded system; on- and off-chip peripherals; external interfaces (e.g. PCI, 3GIO, Gb Ethernet, ZBT SRAM); CoreConnect™ on-chip bus ties the system together; peripherals implemented in FPGA logic; typically runs an embedded OS • [Figure: two PPC configurations, each with external devices and external interfaces]

  30. The Virtex-II Pro Advantage • PowerPC with Application-Specific Hardware Acceleration • [Figure: processing time for a concatenated FEC engine (Viterbi, interleaver, Reed-Solomon, control tasks); traditional processing runs the code stack (C++) from RAM on the PowerPC processor, while XTREME Processing™ uses HW acceleration in Virtex-II Pro for Viterbi, interleave and Reed-Solomon alongside the control tasks]

  31. HW/SW Interfacing • Provides Specialized Connectivity Between PowerPC & FPGA Logic • Dual-Port BlockRAM Memory • CPU & Logic Each Own 1 Port • High-Bandwidth 6.4Gb/sec • Low-Latency • Non-Caching • Designed for Communications Data Processing • Enables PowerPC & FPGA Logic to Work together on Complex Problems • [Figure: PowerPC block (I-Cache 16KB, Fetch & Decode, Timers and Debug Logic, D-Cache 16KB, MMU, Execution Unit with 32x32b GPR, ALU, MAC) linked to acceleration logic through BlockRAMs over 6.4Gb/sec connections]

  32. APU Controller • Micro-controller style interface to fabric for control plane applications • Benefits: up to 10x faster than memory mapped interface; saves PLB bandwidth for code execution; minimizes pipeline stalls • [Figure: processor block with 405 core, APU controller and PLB, connected to a hardware coprocessor]

  33. Creating Complete Communications Solutions • Upper Layers on PowerPC (ftp, telnet, rlogin, mail, etc.) • TCP/IP Stack on PowerPC • Link Layer in FPGA Logic (GbE MAC) • RocketIO is PHY (1000Base-SX/LX) • Gb Ethernet (1000BaseLX/SX/CX)

  34. Infiniband Example • CPU Makes Communications Practical, Easier, & Cheaper • InfiniBand TCA built with CPU + fabric …or built with fabric only • CPU Based Solution: 8 Times Less Area • Sources: Intel, Xilinx

  35. Configurable Platform • [Figure: four numbered steps (1 to 4); labels as listed in the transcript: Specify System Architecture, Create System Architecture, Define Addresses, Configure Peripherals]

  36. The MicroBlaze™ High Performance Soft CPU • CoreConnect™ Technology • [Figure: PPC 405 blocks (32-bit RISC, 130nm process, 300+ MHz core, 420 D MIPS) and several MicroBlaze cores with UART, interrupt controller and arbiter on a local OPB bus]

  37. Incremental Design lessens the impact of design changes • “Next Generation” technology • Easy set-up through floorplanning along HDL hierarchy boundaries • Changes only affect the module that was changed • The remainder of the design stays locked and intact • Timing repeatability: preserves routing • Faster turnaround for localized design changes

  38. Partial Reconfigurability: FPGA Flexibility for the Field • Re-program part of an FPGA while it’s still running • Virtex-II and Virtex-E • [Figure: device divided into fixed logic and PR logic regions with user definable boundaries]

  39. System Exploration • [Figure: generic design block diagram; line Tx path: payload assembly, payload qualify, data format, line coding; line Rx path: payload buffer, payload quality, data alignment, line decoding; plus payload processing, system interfaces and bus]

  40. Traditional Architecture • [Figure: generic design (Tx: payload assembly, payload qualify, data format, line coding; Rx: payload buffer, payload quality, data alignment, line decoding) built around a Motorola PowerQUICC MPC860 system: µP bus, U-Bus, CPM, RAM, FLASH, EEPROM, memory interface, AAL5 payload processor, G704 framer, G703 LIU, other peripherals, PCI bridge, PCI bus, system interfaces] • CPM = Communications Processor Module

  41. Traditional Architecture • [Figure: as slide 40, with the data direction marked] • CPM = Communications Processor Module

  42. Optimized Architecture • [Figure: the same generic design (Tx: payload assembly, payload qualify, data format, line coding; Rx: payload buffer, payload quality, data alignment, line decoding) with an FPGA boundary drawn around PowerPC, MicroB, dual-port Block RAM, memory interface, payload processing, G704 framer, G703 LIU, FIFO and fast I/F; also shown: µP bus, RAM, FLASH, EEPROM, other peripherals, PCI bridge, PCI bus, system interfaces]

  43. Optimized Architecture • [Figure: as slide 42]

  44. Interconnect and power • Source: Bill Daly

  45. Interconnect and performance • Source: Bill Daly

  46. Power Analysis • Typical design • 5.9uW/CLB/MHz [FPGA00] • Fabric power is ~69% of total power • 2V6000 = 5.9uW/CLB/MHz × 8448 CLBs × 100MHz ÷ 69% = 7.5W
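
The estimate on this slide can be reproduced as a short calculation. The operators in the transcript's formula were lost in extraction, so reading it as "multiply the three factors, then divide by the fabric share" is an assumption; the function names are hypothetical. Under that reading the arithmetic gives roughly 7.2 W, slightly below the 7.5 W printed on the slide, and the source of that difference is not recoverable from the transcript.

```python
UW_PER_CLB_PER_MHZ = 5.9   # typical fabric power, from the slide
FABRIC_SHARE = 0.69        # fabric power as a fraction of total power, from the slide


def fabric_power_w(clbs, freq_mhz):
    """Fabric dynamic power in watts for a design using `clbs` CLBs at `freq_mhz`."""
    return UW_PER_CLB_PER_MHZ * 1e-6 * clbs * freq_mhz


def total_power_w(clbs, freq_mhz):
    """Total power in watts, assuming the fabric accounts for FABRIC_SHARE of it."""
    return fabric_power_w(clbs, freq_mhz) / FABRIC_SHARE


# 2V6000 example from the slide: 8448 CLBs at 100 MHz.
print(round(fabric_power_w(8448, 100), 2), "W fabric")
print(round(total_power_w(8448, 100), 2), "W total")
```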

  47. Dynamic Power • Normalized to 2001 • Best fit is a quadratic trend line • Predicts 5X by 2007 • Data points: 1996: 4000EX; 1997: 4000XL; 1998: 4000XV; 1999: Virtex; 2000: Virtex-E; 2001: Virtex-II

  48. Static Power • Normalized to 2001 • Best fit is a power trend • Predicts 100X by 2007 • Future data points projected using linear trend for 1/VTH • [Figure: static power versus year, 1996 to 2007]

  49. Static versus Dynamic

  50. The Age of Accumulation • [Figure: capabilities accumulated by device generation: XC2000: gates, routing; XC4000: special arithmetic functions, memory; Virtex: high performance I/O, system clock management; Virtex-II Pro: uProc.; next: mixed signal FPGA]
