Traditional SOC Design Flow

  • Key Problem: Timing assumption during prelayout synthesis widely differs from the post layout reality.

  • This happens because the interconnect delay dominates the overall propagation delay in DSM (Deep Sub-Micron) technologies.

  • As a result, achieving timing closure becomes a challenge.

Source: Advanced ASIC Chip Synthesis. 2nd Ed. Himanshu Bhatnagar. Kluwer Academic Publishers


Traditional SOC Design Flow

Basic synthesis flow, with the commands used at each step:

  • Develop HDL files

  • Specify Libraries (library objects): link_library, target_library, symbol_library, synthetic_library

  • Read Design: analyze, elaborate, read_file

  • Define Design Environment: set_operating_conditions, set_wire_load_model, set_drive, set_driving_cell, set_load, set_fanout_load, set_min_library

  • Set Design Constraints

    • Design Rule Constraints: set_max_transition, set_max_fanout, set_max_capacitance

    • Design Optimisation Constraints: create_clock, set_clock_latency, set_propagated_clock, set_clock_uncertainty, set_clock_transition, set_input_delay, set_output_delay, set_max_area

  • Select Compile Strategy: Top Down, Bottom Up

  • Optimize the Design: compile

  • Analyze and Resolve Design Problems: check_design, report_area, report_constraint, report_timing

  • Save the Design database: write


Design Compiler Setup Files

  • .synopsys_dc.setup

    • Library paths

    • Company wide, project wide design environment related variables and commands

    • UNIX variables

  • Three files at three locations. All three are read, in the following order:

    • Synopsys root - $SYNOPSYS/admin/setup

      • Affects all users. Only the system administrator can modify this file. In small startups with a single ASIC project, this is the place to enforce project-wide discipline.

    • Home Directory

      • Affects all DC activities of the user. Project-wide enforcement could happen at this level if the designer is involved in a single project (less likely).

    • Working Directory

      • Affects the current invocation of DC. If a person is working on more than one Synopsys project (more likely), then project-wide enforcement should happen at this level, with one working directory for each project.

  • When a command is repeated, the file read later overrides the earlier one
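
The override rule can be pictured as a layered dictionary merge. The following is a minimal Python sketch of that idea only; the variable names and values are made up for illustration and are not taken from any real setup file.

```python
# Hypothetical settings from the three .synopsys_dc.setup files,
# listed in the order in which they are read.
root_setup = {"search_path": ". /tools/syn/libraries", "company": "acme"}
home_setup = {"search_path": ". /home/user/libs"}
work_setup = {"target_library": "my_lib.db"}


def effective_settings(*layers):
    """Merge the layers in read order; a later layer overrides an earlier one."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


settings = effective_settings(root_setup, home_setup, work_setup)
print(settings)
```

Here the home directory file replaces the search path set at the Synopsys root, while settings that are not repeated survive from whichever file defined them.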


Libraries & Search Path

  • Technology Library

    Created by the ASIC vendor in Synopsys format, which is now an open standard.

    Cells are defined by their names, function, timing, net delay, parasitic information, and units for time, resistance, capacitance, etc.

  • Target Library

    A technology library that Design Compiler maps to during optimization.

  • Link Library

    The technology library that contains the definitions of the cells used in the mapped design. In principle it should be the same as target_library unless a technology translation is being performed.

  • Symbol Library

    Definition of graphics symbols. Cells in Symbol Library must match

  • DesignWare Library

    A DesignWare component library is a collection of reusable circuit-design building blocks that are tightly integrated into the Synopsys synthesis environment.

  • GTECH Library

    The GTECH library is the Synopsys generic technology library. It is technology-independent and included with the Design Compiler software.

    GTECH parts are Synopsys unmapped representations of Boolean functions (library cell placeholders). GTECH instantiation allows for a technology-independent HDL description and the accuracy of instantiation.

  • search_path

    If the library variables only specify file names, search_path is used to locate the libraries. By default it points to the current working directory and $SYNOPSYS/libraries/syn


Synopsys Design Objects

  • Design

    A circuit that performs one or more logical functions

  • Cell

    An instance of a design or library primitive within a design

  • Reference

    The name of the original design that a cell instance points to

  • Port

    The input or output of a design

  • Pin

    The input or output of a cell

  • Net

    A wire that connects ports to ports or ports to pins

  • Clock

    A timing reference object to describe a waveform for timing analysis


Synopsys Design Objects - Schematic


Synopsys Design Objects - VHDL


Reading Assignment

Read about these commands in the Synopsys documentation:

  • find and filter

  • read / analyze / elaborate

  • compile

  • report_timing

Also read about what attributes and variables are.


Outline of this course module

Synopsys Design Environment Essentials

CMOS essentials for logic synthesis

Constraint Classification

Load and Drive Constraints

Clocking constraints

Operating Conditions Constraints

Static Timing Analysis

Chip Level Timing and Multiple Clock Domains


MOSFET Transistor

Source: MIT. Course 6.375. Lecture L06. 2006


Key qualitative Characteristics of MOSFET transistors

Source: MIT. Course 6.375. Lecture L06. 2006



RC Model of an inverter

Source: MIT. Course 6.375. Lecture L06. 2006



Wires

Source: MIT. Course 6.375. Lecture L06. 2006


Distributed RC wire model

The delay of this model is commonly estimated using the Elmore delay model

Source: MIT. Course 6.375. Lecture L06. 2006
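
As a rough illustration of the Elmore estimate, the sketch below splits a wire of total resistance R and total capacitance C into N equal RC segments and sums, for each node, its capacitance times the resistance between the driver and that node. The function name and the numeric values are illustrative assumptions, not values from the lecture.

```python
def elmore_delay(r_total, c_total, segments):
    """Elmore delay at the far end of a wire modelled as `segments` equal RC sections."""
    r = r_total / segments  # resistance of one section
    c = c_total / segments  # capacitance of one section
    # Node i (1-based) sees i sections of resistance between itself and the driver.
    return sum((i * r) * c for i in range(1, segments + 1))


# Hypothetical wire: 100 ohm and 1 pF in total.
lumped = elmore_delay(100.0, 1e-12, 1)          # a single lumped RC section
distributed = elmore_delay(100.0, 1e-12, 1000)  # approaches R*C/2
print(lumped, distributed)
```

The sum has the closed form R·C·(N+1)/(2N), so a single lumped section gives R·C while a finely divided wire approaches R·C/2.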


Manual insertion of Repeaters

Source: MIT. Course 6.375. Lecture L06. 2006


Lumped RC wire model

Source: MIT. Course 6.375. Lecture L06. 2006


Estimate the rise time

Source: MIT. Course 6.375. Lecture L06. 2006


The width of a transistor is found by multiplying the scaling factor (16/8/2/1) by the minimum transistor width, which is 0.5 µm.

Multiply Cg,N/Cg,P/Cd,N/Cd,P by the width of the transistor to get the gate/drain capacitances for the N and P transistors.

Wider transistor → more capacitance

Divide Reff,N/Reff,P by the width of the transistor to get the resistance of the N and P transistors.

Wider transistor → less resistance

The factor 2.2 comes from the 10% to 90% Vdd swing:

ln(0.9 Vdd / 0.1 Vdd) = ln 9 ≈ 2.2

The sheet resistance (0.07) is for a unit square.

Since the wire width is 0.25 µm, the resistance of a 1 µm × 0.25 µm piece of wire is 0.07/0.25. This factor is multiplied by the length, 250 µm.

The wire capacitance is made up of two parts. The bottom (area) capacitance is found using 250 × 0.25 (area) × CA,M2.

The side capacitance is found by multiplying the length, 250, by CL,M32

Source: MIT. Course 6.375. Lecture L06. 2006
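
The wire arithmetic in these notes can be checked with a short script. This is a sketch under stated assumptions: the sheet resistance is taken to be in ohms per square, and the per-unit capacitances are left as function parameters because their values are not given in this transcript.

```python
import math

SHEET_RESISTANCE = 0.07   # per square, from the notes (assumed to be ohms)
WIRE_WIDTH_UM = 0.25      # wire width in micrometres
WIRE_LENGTH_UM = 250.0    # wire length in micrometres


def wire_resistance(sheet_res, length_um, width_um):
    """Resistance = sheet resistance times the number of squares (length / width)."""
    return sheet_res * (length_um / width_um)


def wire_capacitance(length_um, width_um, c_area, c_side):
    """Bottom (area) capacitance plus side capacitance, as described in the notes.

    c_area is capacitance per unit area and c_side is capacitance per unit
    length; both must be supplied by the caller.
    """
    return length_um * width_um * c_area + length_um * c_side


# 10% to 90% rise time of an RC stage is ln(0.9 / 0.1) * R * C.
rise_factor = math.log(0.9 / 0.1)
r_wire = wire_resistance(SHEET_RESISTANCE, WIRE_LENGTH_UM, WIRE_WIDTH_UM)
print(round(rise_factor, 2), r_wire)
```

The wire is 1000 squares long, so its resistance comes out at about 70 in the units of the sheet resistance.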


Constraints

  • Optimisation Constraints

    • Performance – clock

    • Area

    • Power

  • Technology, Operating and Manufacturing Constraints

    • Max rise time, max capacitance

    • Operating Conditions –

      • Vdd, Temperature

      • Drive current, Load

    • Process Variations

      • Fast corner, Slow corner

    • Physical Design

      • Antenna rules


Generic Synthesis Flow

[Figure: flowchart. Design: create a solution, driven by the optimisation constraints and by the technology, operating and manufacturing constraints. Analysis: evaluate the solution and check whether the constraints are met.]


Static Timing Analysis (STA)

  • Exhaustively verifies that

    • the timing constraints (clock) are met for a design

    • for given technology (Standard Cell Library) and

    • a set of specified operating conditions

  • Limitations of the alternative – Simulation

    • Not Exhaustive

    • Accuracy

      • RTL

      • Gate Level

        • SDF back annotation

        • Dependent on STA

      • Circuit Level SPICE simulation is impractical

    • Time (STA also takes time, but is bounded)

PROCESS (clk)
BEGIN
  IF rising_edge(clk) THEN
    s <= a * b;  -- registered multiply: s is updated on each rising clock edge
  END IF;
END PROCESS;


Timing Models - Accuracy

  • Untimed

  • Transaction Level - SystemC

    • Multiple Cycles

    • Bus Transactions, Transmit/Receive, Encode/Decode

  • Cycle Accurate – RTL

    • What happens in each clock cycle is accurately known

  • Gate Level – Event Driven

    • Physical details of computation, storage and interconnect operations known

    • Delay in wire is not known

    • Clock is ideal

  • Layout Level

    • Delay in wire known

    • Clock is real

    • Relative position of standard cell is known


Delay Parameters – Intrinsic Delay & Slew

[Figure: gate with inputs A = 1 and B and output Z; waveforms of B and Z between 0 and Vdd with the 0.3 Vdd, 0.5 Vdd and 0.7 Vdd levels and the times t1 and t2 marked.]


Path Delay Calculation

  • The intrinsic delays and the slews are characterised using SPICE simulation by sweeping the many parameters that affect the intrinsic delay and slew

  • All the paths are exhaustively covered

[Figure: delay calculation flow with points labelled A, B, C and D. Inputs: library and design, environment conditions for analysis. Steps: delay computation through wire, delay computation through gate, delay and slew at gate output, delay and slew at next gate input.]


Paths & Path Groups

  • Paths

    • Start point: Input ports or clock pins of sequential devices and

    • End point: Output ports or Data input pins of sequential devices.

  • Path groups

    • Paths are organised in groups identified by clocks controlling their endpoints.


    Timing Arcs

    • positive unate timing arc:

      • Combines rise delays with rise delays, and fall delays with fall delays. An example is an AND gate cell delay or an interconnect (net) delay.

    • negative unate timing arc:

      • Combines incoming rise delays with local fall delays, and incoming fall delays with local rise delays. An example is a NAND gate.

    • nonunate timing arc:

      • Combines local delay with the worst-case incoming delay value. Nonunate timing arcs are present in logic functions whose output value change cannot be predicted by the direction of the change on the input value. An example is an XOR gate.

    • Accuracy of estimates is critical

      • Intrinsic Delays are accurate after logic synthesis

      • Slew and Net Delays are estimated and known accurately only after physical synthesis
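
The way the three kinds of timing arc combine rise and fall delays can be sketched in a few lines of Python. The function name, the dictionary layout and the numbers are illustrative assumptions; a real timing engine tracks much more state per pin.

```python
def propagate(arrival, delay, sense):
    """Combine incoming rise/fall arrival times with an arc's local rise/fall delays.

    `arrival` and `delay` are dicts with "rise" and "fall" entries.
    """
    if sense == "positive_unate":
        # Rise combines with rise, fall with fall (e.g. AND gate, net delay).
        return {"rise": arrival["rise"] + delay["rise"],
                "fall": arrival["fall"] + delay["fall"]}
    if sense == "negative_unate":
        # Incoming rise causes a local fall and vice versa (e.g. NAND gate).
        return {"rise": arrival["fall"] + delay["rise"],
                "fall": arrival["rise"] + delay["fall"]}
    if sense == "non_unate":
        # Output direction is unpredictable: use the worst incoming arrival (e.g. XOR).
        worst = max(arrival["rise"], arrival["fall"])
        return {"rise": worst + delay["rise"], "fall": worst + delay["fall"]}
    raise ValueError("unknown timing sense: " + sense)


arrival = {"rise": 1.0, "fall": 1.5}   # hypothetical arrival times at the input pin
delay = {"rise": 0.25, "fall": 0.5}    # hypothetical local arc delays
pos = propagate(arrival, delay, "positive_unate")
neg = propagate(arrival, delay, "negative_unate")
non = propagate(arrival, delay, "non_unate")
```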


Factors Affecting Delay and Slew

Discrete Factors:

  • Geometry & Dimension

  • Specific Path

  • Transition Direction

  • Related Pin

[Figure: transistor-level gate with inputs A and B, output Z and transistors P1, P2, N1 and N2, captioned "4 Input NAND gate".]


Factors Affecting Delay and Slew

  • Load on the Gate

    • Load of all the inputs that this output has to drive

    • Load of the interconnect wires

    • Tri-stated wires

  • Input Slew

    • Transition time at the previous gate

    • The interconnect

    • Primary input – drive strength, driver cell


Constraints

  • Technology Constraints

    • Max Transition

    • Max Fanout

    • Max Capacitance

    • Min Capacitance

  • Design Constraints

    • Set Load

    • Set Drive (inverse of resistance)


[Figure: example annotated with set_driving_cell or set_drive on the driving side and set_load on the outputs; the labels distinguish a technology constraint, which cannot be relaxed, from a design constraint.]

    • If drive or driving cell is not specified, the synthesis tool assumes infinite drive strength

    • If load is not specified, the synthesis tool assumes zero load


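Interpolation and Extrapolation

Piece Wise Linear Model

[Figure: delay plotted against load and slew. The table points D11, D12, D21 and D22 lie at the grid values S1, S2 and L1, L2; the intermediate values D1 and D2 give the delay D at the operating point (S, L).]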
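
One common way to realise such a piece-wise linear model is bilinear interpolation between four characterised table points. The sketch below assumes that the first index of each table delay refers to the slew point and the second to the load point; that reading, the function name and the numbers are assumptions for illustration.

```python
def interpolate_delay(slew, load, s1, s2, l1, l2, d11, d12, d21, d22):
    """Bilinear interpolation of delay; d_ij is the delay at slew s_i and load l_j.

    Weights outside the range 0..1 extrapolate beyond the characterised points.
    """
    w_load = (load - l1) / (l2 - l1)
    w_slew = (slew - s1) / (s2 - s1)
    d1 = d11 + w_load * (d12 - d11)   # delay at slew s1, interpolated along load
    d2 = d21 + w_load * (d22 - d21)   # delay at slew s2, interpolated along load
    return d1 + w_slew * (d2 - d1)    # interpolate between d1 and d2 along slew


# Hypothetical 2x2 table: slew points 0 and 1, load points 0 and 2.
table = dict(s1=0.0, s2=1.0, l1=0.0, l2=2.0, d11=1.0, d12=3.0, d21=2.0, d22=4.0)
inside = interpolate_delay(0.5, 1.0, **table)   # interpolation inside the table
corner = interpolate_delay(0.0, 0.0, **table)   # reproduces the table point d11
beyond = interpolate_delay(0.5, 4.0, **table)   # extrapolation past load l2
```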


Process, Voltage, Temperature (PVT) Variation & Operating Conditions

[Figure: three plots of delay against process, temperature and voltage, each marked with best, nominal and worst points.]

Operating Conditions

Name    Library   Process   Temp   Volt   Interconnect Model
WCCOM   my_lib    1.50        70   1.1    worst_case_tree
WCIND   my_lib    1.50        80   1.1    worst_case_tree
WCMIL   my_lib    1.50       125   1.0    worst_case_tree
BCCOM   my_lib    1.50         0   1.2    best_case_tree
BCIND   my_lib    1.50       -40   1.2    best_case_tree
BCMIL   my_lib    1.50       -55   1.3    best_case_tree


PVT Variation: An Example

Consider a minimum size NMOS device in a 1.2 µm CMOS process. VGS = VDS = 5 V

The nominal saturation current for the device size W = 1.8 µm, Leff = 0.9 µm

Now consider the variation in the following parameters:

  • 25% variation in threshold voltage, Vt

  • 10% variation in transconductance k'n, mainly due to variation in oxide thickness

  • ±0.15 µm (about 10%) variation in W and L. Variations in W and L are uncorrelated as they are

  • ±0.5 V (10%) variation in power supply voltage

The speed of the device is proportional to the drain current, so these variations result in variation of the speed of the circuit.


    Derating

    Libraries are characterized for various operating conditions

    Further characterisation is done to see how the delay model responds to change in process, voltage and temperature. This is done by holding two parameters constant and sweeping the third.

    This yields derating factors for Process, Voltage and Temperature


    Sequential Arcs

    Timing relationship between

    two input pins

    two consecutive events on the same input pin

    Pulse Width

    Setup

    Hold

    Recovery

    Removal


Pulse Width

Width of high and low phases of clocks

Width of active level of asynchronous inputs like reset

If the requirement is not met, the reset may have no effect

[Figure: rst_n waveform showing the pulse width requirement.]


Setup

Data should be stable for the setup time before the arrival of the clock edge.

What happens if the setup time is violated? New data may not get latched.

[Figure: clk and data waveforms showing the setup requirement.]


Hold

Data should be stable for the hold time after the arrival of the clock edge.

What happens if the hold time is violated? Old data may not get latched.

[Figure: clk and data waveforms showing the hold requirement.]


Recovery and Removal

  • Recovery: minimum time between de-assertion of an asynchronous control signal and the next active clock edge

    • Not met: clk may not have effect

    • Can be formulated as a setup check

  • Removal: minimum time after an active clock edge for which an asynchronous control signal should remain asserted

    • Not met: clk may override rst_n

    • Can be formulated as a hold check

[Figure: clk and rst_n waveforms showing the recovery requirement and the removal requirement.]


What is the reason for setup and hold

[Figure: two cross-coupled inverters (Vin2 = Vout1, Vin1 = Vout2) and their transfer curves plotted on the same axes, with the operating points labelled a, b and c.]


Transistor Level Schematic of a D-Flop

http://www.edn.com/design/analog/4371393/Understanding-the-basics-of-setup-and-hold-time


Working of the D-Flop at Transistor Level

http://www.edn.com/design/analog/4371393/Understanding-the-basics-of-setup-and-hold-time


    Setup and Hold Time at Circuit Level

    The time it takes data D to reach node Z is called the setup time.

    The time it takes data D to reach node W is called the hold time.

    http://www.edn.com/design/analog/4371393/Understanding-the-basics-of-setup-and-hold-time


    Negative Hold Time

    http://www.edn.com/design/analog/4371393/Understanding-the-basics-of-setup-and-hold-time


Generalizing Setup & Hold Constraints

[Figure: latching element F1 inside the boundary of the flop; data reaches F1 through delay D1 and clk reaches F1 through delay C1.]

Setup Constraint

  • Assume C1 is zero

  • clk reaches F1 before data has arrived at F1 and registers wrong data

  • To avoid this, data should stabilize D1 time before the arrival of clk.

  • In reality, C1 is never zero, so data should stabilize D1 - C1 time before the arrival of clk.

  • As there are multiple D1 paths and multiple C1 paths, the complete and safe setup constraint is max (data path delays) – min (clock path delays)

Hold Constraint

  • Assume D1 is zero

  • Data reaches F1 before clk has arrived at F1. When the clk arrives, new data has overwritten the previous data.

  • To avoid this, data should remain stable C1 time after the arrival of clk.

  • In reality, D1 is never zero, so data should remain stable C1 - D1 time after the arrival of clk.

  • The complete and safe hold constraint is max (clock path delays) – min (data path delays)
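
The two formulas translate directly into code. In this sketch the path delay lists are made-up numbers chosen so that the data paths are slower than the clock paths; the function names are illustrative.

```python
def setup_constraint(data_path_delays, clock_path_delays):
    """Safe setup constraint: max(data path delays) - min(clock path delays)."""
    return max(data_path_delays) - min(clock_path_delays)


def hold_constraint(data_path_delays, clock_path_delays):
    """Safe hold constraint: max(clock path delays) - min(data path delays)."""
    return max(clock_path_delays) - min(data_path_delays)


# Hypothetical internal delays of a flop, in nanoseconds.
data_paths = [0.75, 1.0]
clock_paths = [0.25, 0.5]

setup = setup_constraint(data_paths, clock_paths)
hold = hold_constraint(data_paths, clock_paths)
print(setup, hold)
```

With these numbers the setup constraint is 0.75 ns and the hold constraint is -0.25 ns; their sum equals max(clock) + max(data) - min(clock) - min(data), which is 1.0 - 0.75 + 0.5 - 0.25 = 0.5 ns.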


Negative Hold

Typically clock paths are well buffered and faster

There can be substantial data path delay, especially in scan flops

max (data path delays) – min (clock path delays) is always positive. This implies that the setup constraint is never negative

max (clock path delays) – min (data path delays) can be negative. This implies that the hold constraint can be negative

Setup + Hold (cannot be negative) = Max(clock path) + Max(data path) – Min(clock path) – Min(data path)

[Figure: latching element F1 inside the boundary of the flop, with data delay D1 and clock delay C1; clk and data waveforms, with stable and new data regions, drawn at the device interface and at the latching element. The negative hold is seen at the device interface.]


Specifying Input Delay

Good design practice mandates that inBlock does not have combinatorial logic ("m") driving its output

These days "m" is more likely to be the result of global interconnect delay.

Early floorplanning is a good way to estimate the delay due to "m"

If floorplanning is not done, a good bet is 50-60% of the clock cycle

The characterize command automatically calculates the input delay from the parent design

set_input_delay -clock Clock 8 "data_in_2"


    Specifying Output Delay

    set_output_delay -clock Clk -max -fall 10 {"Z<0>" "Z<1>"}


General Timing Constraints

[Figure: circuit with inputs I1 and I2, outputs O1 and O2, flops F1, F2 and F3 clocked by clk, and combinatorial blocks C0 to C4.]

Four kinds of path groups exist:

  • Input to Output, e.g., I2 to O2

  • Input to Register, e.g., I1 to F1

  • Register to Register, e.g., F1 to F2

  • Register to Output, e.g., F3 to O1

Notation:

  • TI1, TI2 are input delays

  • DQ1, DQ2 and DQ3 are clk-to-Q delays

  • S1, S2 and S3 are setup constraints

  • H1, H2 and H3 are hold constraints

  • C0 to C4 are combinatorial delays

  • P is the clock period

Input to Output (I2 to O2):

  O2 = TI2 + C4

Input to Register (I1 to F1):

  TI1 + C0 ≤ P – S1

  TI1 + C0 ≥ H1

  Setup Slack: P – S1 – TI1 – C0

  Hold Slack: TI1 + C0 – H1

Register to Register (F1 to F2):

  DQ1 + C1 ≤ P – S2

  DQ1 + C1 ≥ H2

  Setup Slack: P – S2 – DQ1 – C1

  Hold Slack: DQ1 + C1 – H2

Setup and Hold Slacks should be positive
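
The slack expressions have the same shape for every path that ends at a register: a launch delay (an input delay or a clk-to-Q delay) plus a combinatorial delay, compared against the capturing flop's setup and hold constraints. A minimal Python sketch follows, with made-up numbers in nanoseconds and illustrative function names.

```python
def setup_slack(period, setup, launch_delay, logic_delay):
    """Setup slack = P - S - (launch delay + combinatorial delay)."""
    return period - setup - launch_delay - logic_delay


def hold_slack(hold, launch_delay, logic_delay):
    """Hold slack = (launch delay + combinatorial delay) - H."""
    return launch_delay + logic_delay - hold


P = 10.0  # hypothetical clock period

# Register-to-register path: clk-to-Q delay 1.0, logic delay 6.0.
reg_setup = setup_slack(P, setup=0.5, launch_delay=1.0, logic_delay=6.0)
reg_hold = hold_slack(hold=0.25, launch_delay=1.0, logic_delay=6.0)

# Input-to-register path: input delay 4.0, logic delay 3.0.
in_setup = setup_slack(P, setup=0.5, launch_delay=4.0, logic_delay=3.0)

print(reg_setup, reg_hold, in_setup)
```

All three slacks are positive here, so both example paths meet timing.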


Gate Level Simulation

[Figure: block diagram with Gate Level Design, Simulation Library, Timing Library, Timing Analysis Tool, SDF File and Simulator.]


    Clock Distribution

    Source: MIT. Course 6.375. Lecture L06. 2006


Clock Skew

Clock Skew in Alpha Processor

The basic assumption in a synchronous system is that all the sequential elements in the design sample their inputs at the same time, marked by a clock signal. In reality, the clock signal does not arrive at all the sequential elements at the same time. The difference in time between the reference clock signal and the local clock signal at a sequential element is called the clock skew.

In fact, clock skew would not be a problem if the clock signal were delayed uniformly at all the sequential elements. It is the non-uniform delay of the clock signal that creates the problem. The delay depends on the distance of the sequential element from the clock source and on the local load.

The primary reason for the delay is the large load seen by the clock signal. The load consists of all the sequential elements in the design and the clock net itself, which behaves as a distributed RC line (or a higher order model) and can be several centimetres long in a large chip.

The total capacitance of a single clock line easily measures hundreds of pF and can reach into the nF range. The total clock capacitance of the Alpha processor equals 3.25 nF, which is 40% of the total switching capacitance of the entire chip.


    Clock Skew

    Source: MIT. Course 6.375. Lecture L06. 2006


    Clock Jitter

    Source: MIT. Course 6.375. Lecture L06. 2006



Clock Skew and Sequential Circuit Performance

Each synchronous module is composed of combinational logic CL and a flop, and is characterised by six timing parameters: the min. and max. propagation (pg.) delays of the register, tr,min and tr,max; those of the combinational logic, tl,min and tl,max; the propagation delay of the interconnect, ti; and the local clock skew, tφ.

The max pg. delay corresponds to the time taken by the slowest output to respond to any transition at the input. This delay constrains the max. allowable clock speed.

The min pg. delay corresponds to the time taken by at least one output to start responding to a transition at the input. This delay is typically much smaller than the max delay and determines the amount of skew a circuit can tolerate before a race condition occurs. If δ is greater than tr,min + ti + tl,min, then the inputs at R2 can change before the previous inputs are latched.

With δ = tφ″ – tφ′:

tφ″ ≤ tφ′ + tr,min + ti + tl,min   OR   δ ≤ tr,min + ti + tl,min

tφ″ + T ≥ tφ′ + tr,max + ti + tl,max   OR   T ≥ tr,max + ti + tl,max – δ


Positive and Negative Clock Skew

• Positive Skew: δ > 0:

  • In this case the clock is routed in the same direction as the data and the first equation needs to be satisfied. Violating it will result in malfunctioning of the circuit. Observe that lengthening the clock period does not help. The positive skew actually helps improve the clock speed, as it is a negative term in the constraint on the clock period T.

• Negative Skew: δ < 0:

  • The negative skew occurs when the data is routed in the direction opposite to the clock signal. The first equation is unconditionally satisfied and the circuit works correctly independent of the skew. Unfortunately, negative skew will limit the clock speed and thus lower the performance, as predicted by the second equation: the skew reduces the time available for computation by |δ|.
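
The effect of the sign of the skew on the two constraints can be tried out numerically. This is a small sketch with made-up delays in nanoseconds; the function names are illustrative.

```python
def race_free(skew, tr_min, ti, tl_min):
    """First constraint: the skew must not exceed the minimum path delay."""
    return skew <= tr_min + ti + tl_min


def min_clock_period(skew, tr_max, ti, tl_max):
    """Second constraint: smallest clock period T allowed for a given skew."""
    return tr_max + ti + tl_max - skew


# Hypothetical delays: register 0.25..1.0, interconnect 0.25, logic 0.5..4.0.
ok_positive = race_free(0.5, tr_min=0.25, ti=0.25, tl_min=0.5)    # within the bound
bad_positive = race_free(1.5, tr_min=0.25, ti=0.25, tl_min=0.5)   # race condition
t_positive = min_clock_period(0.5, tr_max=1.0, ti=0.25, tl_max=4.0)
t_negative = min_clock_period(-0.5, tr_max=1.0, ti=0.25, tl_max=4.0)
print(ok_positive, bad_positive, t_positive, t_negative)
```

A positive skew of 0.5 ns shortens the minimum period to 4.75 ns, while a negative skew of the same size lengthens it to 5.75 ns; too large a positive skew fails the race check whatever the clock period.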


[Figure: launch clock and capture clock timing diagrams, with signals a, b, c and d, for three cases: setup time met and hold time met; setup time violated and hold time violated; setup time violated and hold time met.]


[Figure: startpoint FF 1 and endpoint FF 2 with logic between them, marking the setup relationship and the hold relationship.]

    Setup Violations result from worst case timing

    Hold Violations result from best case timing


Chip Level Timing Issues

[Figure: chip floorplan with blocks numbered 1 to 8 and a CGU.]

Blocks 4 & 8 communicate and need their clocks to be skew aligned

The data signals between Blocks 4 & 8 could take more than one clock cycle and can get routed through blocks 5 and 6

This makes chip level timing closure difficult and sensitive to geometry.

Solution: a hierarchical design style, in which each chiplet is timing closed independently and the chip is composed from such chiplets, i.e. latency insensitive design.


Categories of Synchronization

[Figure: chart of synchronization schemes, split into data based and clock based synchronization, with the labels latency ambiguity, constraints and complexity. Schemes shown: GS, Double Latch, GALS, Handshake (2 Phase, 4 Phase), GRLS (KTH Technology), Asynchronous 2 Clock FIFO.]


Send and Forget – Double Latching

ACL: Asynchronous Communication Link

[Figure: source S, clocked by CLKs, sends payload Ps across the ACL to destination D, where flops clocked by CLKD produce payload PD.]


Send and Forget – Double Latching

  • Advantages

    • Good choice for single-bit control data

    • Gray-coded multi-bit data payloads are also suitable

  • Disadvantages

    • No flow control → send and forget

    • A metastable signal fanned out to multiple targets could resolve to different values
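
Gray coding suits this scheme because consecutive values differ in exactly one bit, so a sample taken during a transition yields either the old or the new value rather than an unrelated one. The following Python sketch shows the standard binary/Gray conversion; the function names are illustrative.

```python
def binary_to_gray(n):
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Invert binary_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


# Number of bits that change between consecutive 4-bit counter values.
changes = [bin(binary_to_gray(i) ^ binary_to_gray(i + 1)).count("1")
           for i in range(15)]
print(changes)
```

Every entry of the list is 1, which is the property that makes a Gray-coded counter or pointer safe to pass through a double latch.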


Handshake

ACL: Asynchronous Communication Link

Ps: Source Payload

Pd: Destination Payload

[Figure: source S (clocked by CLKs) and destination D (clocked by CLKD) connected through the ACL, with payloads Ps and PD, the signals RS, RD, AS and AD, an FSM on each side and synchronizing D flops.]


The data payload period must be at least the worst-case round trip delay of the flow control, so the data payload frequency is bounded by its inverse. Ts and Td are the source and destination clock periods.

[Figure: timing diagram marking TPs against the round trip delays 3Ts + 3Td and 6Ts + 6Td.]

  • TPs is the period for which the data remains valid/asserted:

    • 2-phase: 3Ts + 3Td ≥ TPs

    • 4-phase: 6Ts + 6Td ≥ TPs

  • Note that TPs does not decide the data payload frequency. TPs is less than the round trip delay so that the next payload can be transferred immediately after the round trip delay is over.

  • The period TPL corresponding to the data payload frequency has to be at least the worst-case round trip delay:

    • 2-phase: 3Ts + 3Td ≤ TPL

    • 4-phase: 6Ts + 6Td ≤ TPL

Example:

  • Source: 27 MHz, Destination: 200 MHz

  • Maximum isochronous data rate using the 2-phase protocol:

    3 × (37 ns) + 3 × (5 ns) = 126 ns, i.e. about 7.9 MHz
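
The example can be reproduced with a few lines of Python. The function name is illustrative; the cycle counts (3 per side for 2-phase, 6 per side for 4-phase) are the ones given above.

```python
def max_payload_rate_hz(f_source_hz, f_dest_hz, phases):
    """Upper bound on payload rate: inverse of the handshake round trip delay."""
    cycles_per_side = {2: 3, 4: 6}[phases]   # 3Ts + 3Td or 6Ts + 6Td
    round_trip_s = cycles_per_side / f_source_hz + cycles_per_side / f_dest_hz
    return 1.0 / round_trip_s


rate_2_phase = max_payload_rate_hz(27e6, 200e6, phases=2)
rate_4_phase = max_payload_rate_hz(27e6, 200e6, phases=4)
print(round(rate_2_phase / 1e6, 2), round(rate_4_phase / 1e6, 2))
```

For the 27 MHz source and 200 MHz destination this gives roughly 7.9 MHz for the 2-phase protocol and half of that for the 4-phase protocol.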


    2 Clock Asynchronous FIFO

    • Fail Safe, Self Correcting:

      • Write logic could think the FIFO is full when it is not

      • Read logic could think that the FIFO is empty when it is not

    • Not suitable for Island hopping:

      • Storage in Write Island is a problem

      • Typically the read side needs to be read every cycle


    GALS Globally Asynchronous Locally Synchronous

    Source: ETH, Zurich



Clocking and Communication Schemes

  • Synchronous Design – phase and skew aligned

  • Mesochronous Design – same clk freq and phase aligned

  • Ratiochronous Design

    • Different clock freqs but with a rational relationship – phase aligned

    • KTH research

  • Plesiochronous

    • No rational clock relationship – phase relationship drifts

  • Asynchronous


Ideal vs Real Clock

  • During the initial phase of synthesis the clock is ideal

  • The set_auto_disable_drc_nets command should be used to prevent DC from wasting time on fixing DRC violations on high fanout nets like resets and clocks

  • Model skew and jitter effects using the set_clock_uncertainty command

  • Model clock network latency using the set_clock_latency command

  • Once the clock tree has been inserted, use the set_propagated_clock command to use the actual clock. Back annotation using the read_sdf command is required


    Modelling Clock Skew

