Agenda

Thomas

Presentation Transcript


    1. Design Techniques for Million Gate, High Speed FPGAs

    2. Agenda: The Problem; State-of-the-Art Technology; Design Issues; Performance Oriented Design

    3. The Problem

    4. State-of-the-Art : 2000 Technology Gate Count Frequency Clock Domains Computer Hardware Design Software RTL Language Design

    5. “Those who cannot remember the past are condemned to repeat it.”

    6. State-of-the-Art : Technology

    7. State-of-the-Art : Gate Count

    8. State-of-the-Art : Frequency

    9. State-of-the-Art : Clock Domains

    10. State-of-the-Art : Computer Design Hardware

    11. State-of-the-Art : RTL Language. Abstract data types; design reusability; compiled concepts; design management; structure replication

    12. State-of-the-Art : Design. In Packaged Power’s design creation stage, we use the Renoir editors to create your design. Since a picture is worth a thousand words, being able to express your design with graphics makes it easier to understand and to describe to other members of your team. With Packaged Power you can express your design in any of the five editors: block diagrams, truth tables, flow charts, state machines, and text.

    13. State-of-the-Art : Failures

    14. State-of-the-Art : FPGA. APEX and Virtex at 3+ million gates; maximum operating frequency is ~200 MHz (pushing 300 MHz); large blocks of memory; embedded processors (PowerPC, ARM, MIPS); copper interconnect. On the FPGA front it isn’t much better. This is why Exemplar Logic, using our ASIC expertise, is working with Xilinx, Altera, and other FPGA vendors to promote a new design methodology. This is essential if we are going to take advantage of the newfound silicon. One fact is that as devices grow, so will your design environment.

    15. The Development Gap. Today we find that there has been a very rapid increase in the available silicon that designers can use to implement their systems. In fact, the growth rate of available silicon is exceeding the ability of designers to effectively implement and, more importantly, VERIFY these large designs. The design gap is mainly due to deep sub-micron effects on timing constraints. The verification gap, in contrast, is directly related to the size of current designs and to the fact that the verification flow has not really changed for years. As a leading supplier of verification technology, Mentor Graphics has developed new verification tools, among them a true next-generation equivalence checker that will help close this verification gap.

    16. System / SOC Design Methodology

    17. Adjusting to a New Methodology Team Design IP Logic More software content Heavy with memory Less synthesis / more chip level assembly. Why? Because designs are getting larger, not in terms of block size but in terms of the number of blocks. IP, bottom-up design, incremental design: if you are not using this methodology today, you will be in your next design. You need a tool that is designed for this challenge.

    18. Effects of the Design Flow

    19. ASIC versus FPGA design

    20. A Designer’s Life

    21. How to make a better designer. Provide proper training: designers went to college to learn digital logic design, but most have less than 10 hours of RTL training. Provide a proven design methodology. Enforce Design for Quality techniques: quality circuits are always easier to manufacture and are the most profitable. Functionality is only a minor part of the design process. Using Performance Oriented Design techniques is the key to successful product development.

    22. Performance Oriented Design Techniques: RTL coding styles; design architecture trade-offs; design structure; timing optimization; physical optimization

    23. Coding style impact. Coding style does impact performance: it affects FPGAs more than ASICs. Different levels of RTL: different descriptions give different results. Tools are also part of the equation: different tools give different results. Learn to know your tool!

    24. The Keys to Language Synthesis: Data Types; Packages; Ports; Hierarchy; Combinational Logic; Relational Operators; Arithmetic Operators; Sequential Logic; Memory; IOs

    25. Structuring A Design. A design should read like a book. Table of contents: an explanation of the design structure. Logical flow from beginning to end. Chapters: logical breaks in a design. Commentary: comments on complex structure in the design.

    26. Source Code Control

    27. Hierarchy

    28. Understand what the RTL does!!

    29. Serial / Priority Structure

    30. Parallel Structure

    31. Tri-State

    32. Bi-directional Buffer

    33. Relational Operators

    34. Addition Operators

    35. Resource Sharing (when it really hurts)

    36. Multiplication Operator

    37. Pipelined Multipliers. Improve timing by introducing parallelism. Registers introduced by pipelining may have a modest area impact. Requirements: certain constructs in the input RTL source code description; the output of the multiplier must be registered. Optimal pipeline stages = log2(input data bus width): a 16-bit data bus => optimal pipeline value of 4; a 32-bit bus => optimal pipeline value of 5.
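
The log2 rule of thumb on this slide can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and rounding up for widths that are not powers of two is an assumption, since the slide gives only the 16-bit and 32-bit cases.

```python
def optimal_pipeline_stages(bus_width: int) -> int:
    """Slide rule of thumb: stages = log2(input data bus width).

    Rounding up for non-power-of-two widths is an assumption;
    the slide only lists 16-bit and 32-bit buses.
    """
    if bus_width < 1:
        raise ValueError("bus width must be a positive integer")
    # (n - 1).bit_length() equals ceil(log2(n)) for every integer n >= 1,
    # and avoids floating-point rounding.
    return (bus_width - 1).bit_length()


if __name__ == "__main__":
    for width in (16, 32):
        print(f"{width}-bit bus: {optimal_pipeline_stages(width)} pipeline stages")
```

For the two widths quoted on the slide this returns 4 and 5 respectively.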

    38. A little algebra goes a long way. Minimize all arithmetic equations to eliminate operators. Frequency increased dramatically.
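
As an illustration of eliminating operators, the sketch below factors a*b + a*c into a*(b + c), which trades two multipliers and one adder for one multiplier and one adder. The expression is a hypothetical example, not one taken from the slides, and the model uses unbounded Python integers, so it does not capture fixed-width truncation or overflow in real hardware.

```python
from itertools import product


def unfactored(a: int, b: int, c: int) -> int:
    # Two multipliers and one adder if synthesized as written.
    return a * b + a * c


def factored(a: int, b: int, c: int) -> int:
    # Same function with one multiplier and one adder.
    return a * (b + c)


def equivalent_over(values) -> bool:
    """Exhaustively compare both forms over a small set of operand values."""
    return all(
        unfactored(a, b, c) == factored(a, b, c)
        for a, b, c in product(values, repeat=3)
    )


if __name__ == "__main__":
    print("forms agree on 0..15:", equivalent_over(range(16)))
```

Whether a synthesis tool performs this kind of factoring on its own varies, so writing the reduced form in the RTL is the safer habit.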

    39. D Flip-flop

    40. Complex Clock Enables

    41. Latches

    42. Counter

    43. State Machine. Tools have made progress with FSM compilers: reachability analysis, highly optimal results, extended encoding techniques. Without an FSM compiler, ‘one hot’ is often the best choice: it deflates the next-state decoding logic ‘cloud’. FSM compiler without ‘Safe’ state: implements the functionality, however the state machine may not be totally bulletproof. The ‘Safe’ option: the ‘default’ branch in the case statement may be ignored; recovery logic is implemented to go back to the reset state. The ‘Exact’ implementation is for when you want a better match with simulation, performance is not an obstacle, or your design works in a harsh environment. Note: check with Tom Hill.
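
The one-hot encoding and the ‘safe’ recovery behaviour described on this slide can be modelled in a few lines of Python. This is a behavioural sketch only: the state names, the `start` input, and the transition table are hypothetical, and it does not represent what any particular FSM compiler generates.

```python
# Hypothetical four-state machine; names are for illustration only.
STATES = ("IDLE", "LOAD", "RUN", "DONE")

# One-hot encoding: state i is represented by a word with only bit i set.
ENCODING = {name: 1 << i for i, name in enumerate(STATES)}
DECODING = {code: name for name, code in ENCODING.items()}


def next_state(code: int, start: bool) -> int:
    """Return the next one-hot state code.

    Any code that is not a legal one-hot value (zero bits or several
    bits set) is sent back to the reset state, modelling 'safe' recovery.
    """
    name = DECODING.get(code)
    if name is None:
        return ENCODING["IDLE"]  # recovery to the reset state
    if name == "IDLE":
        return ENCODING["LOAD"] if start else ENCODING["IDLE"]
    if name == "LOAD":
        return ENCODING["RUN"]
    if name == "RUN":
        return ENCODING["DONE"]
    return ENCODING["IDLE"]  # DONE returns to IDLE


if __name__ == "__main__":
    for name, code in ENCODING.items():
        print(f"{name}: {code:04b}")
    print("illegal 0011 recovers to:", DECODING[next_state(0b0011, False)])
```

Because each state owns one flip-flop, the next-state logic for a given state depends only on the few states that can transition into it, which is why the decoding ‘cloud’ shrinks compared with a binary encoding.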

    44. State Machine. Try the ‘safe’ for Synplicity.

    45. Read Only Memory (ROM)

    46. Single Port RAMs. Look at Synplicity limitations. Check with Tom Hill.

    47. Dual Port RAMs. Look at Synplicity limitations. Check with Tom Hill.

    48. Content Addressable Memory (CAM). Use a CAM when address translation is needed. Use CAMs for sparsely used addresses. CAMs replace large priority encoders. 60% area reduction; 80% timing reduction.
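
A CAM is searched by content and returns the address where that content is stored. The Python model below sketches the address-translation use mentioned on this slide. The class name, method names, and the choice to return the lowest matching address are assumptions for illustration, and the sequential loop stands in for comparators that all operate in parallel in hardware.

```python
from typing import List, Optional


class Cam:
    """Behavioural model of a small content addressable memory."""

    def __init__(self, depth: int) -> None:
        # None marks an empty (invalid) entry.
        self.entries: List[Optional[int]] = [None] * depth

    def write(self, address: int, tag: int) -> None:
        """Store a tag at a given address, as in an ordinary RAM write."""
        self.entries[address] = tag

    def lookup(self, tag: int) -> Optional[int]:
        """Return the lowest address holding `tag`, or None on a miss.

        Hardware compares every entry at once; the loop only models
        the result of that parallel comparison.
        """
        for address, stored in enumerate(self.entries):
            if stored is not None and stored == tag:
                return address
        return None


if __name__ == "__main__":
    cam = Cam(depth=4)
    cam.write(0, 0x7F3)  # sparse, widely spaced tags map to dense slots
    cam.write(1, 0x012)
    print("0x012 translates to slot", cam.lookup(0x012))
    print("0xABC translates to slot", cam.lookup(0xABC))
```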

    49. Checklist for performance: pipeline for high performance; make hardware work in parallel; optimize late-arriving signals; control arithmetic circuits; use IP and hard macros

    50. Parallel Gates

    51. Attributes. Attributes enable: mapping control; DLL setup; IOB flop control; RAM initialization; soft macros for speed. Synthesis attributes are helpful for: improved usability; name preservation; replication; resource sharing; speed / area control; FSM encoding.

    52. Physical Optimization. Floorplan your FPGA: this produces a faster circuit, the circuit is more predictable and repeatable, and timing convergence occurs quickly. Back-annotate real timing data: this allows a 2nd pass of synthesis to work on the real critical paths.

    53. FPGA High-Level Floorplanner. Tight links to Exemplar’s synthesis tool. Positions blocks into regions of the device. Generates area constraints. Required for the new incremental design flow. Useful for design planning.

    56. Constraint Based Clustering

    57. Logic Replication. Another cause of excessive delay can be traced to high-fanout nets. Many times a designer can inadvertently infer many loads, for example on a bus, without realizing it. This is where LeonardoSpectrum’s TrueTiming logic replication algorithm kicks in. By accurately identifying the correct path, we can then apply logic replication to remove excessive loading on a path, thus reducing the delays caused by long routes across the design. Our TrueTiming algorithms are designed to help you meet timing. Combined with LeonardoSpectrum's incremental flow, where only the nets that have been changed are required to be rerouted, TimeCloser technology gets you to market FAST.
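
The basic idea of logic replication, duplicating a driver so that each copy sees fewer loads, can be sketched as below. This is not LeonardoSpectrum’s TrueTiming algorithm, whose details the slides do not give. The function name, the load names, and the fanout limit are hypothetical, and a real tool would group loads by timing criticality and placement rather than by list order.

```python
from typing import List, Sequence


def replicate_driver(loads: Sequence[str], max_fanout: int) -> List[List[str]]:
    """Split a net's loads into groups, one group per replicated driver.

    Each group has at most `max_fanout` loads. Grouping by list order
    is a simplification for illustration only.
    """
    if max_fanout < 1:
        raise ValueError("max_fanout must be at least 1")
    return [
        list(loads[i:i + max_fanout])
        for i in range(0, len(loads), max_fanout)
    ]


if __name__ == "__main__":
    bus_loads = [f"reg_{i}" for i in range(10)]  # hypothetical load names
    for copy, group in enumerate(replicate_driver(bus_loads, max_fanout=4)):
        print(f"driver copy {copy} drives {group}")
```

With 10 loads and a limit of 4, the single driver becomes three copies driving 4, 4, and 2 loads.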

    58. Critical Path Restructuring

    59. User Applied Physical Constraints: preserve signals; assign nets to secondary routing resources; specify fanout on a net-by-net basis

    60. Design Techniques for Million Gate, High Speed FPGAs
