EEGN-CSCI 660 Introduction to VLSI Design Lecture 5




### EEGN-CSCI 660 Introduction to VLSI Design Lecture 5

Khurram Kazi

CSCI 660

Fundamental Steps to a Good design
• If you have a good start, the project will go smoothly
• Partitioning the Design is a good start
• Partition by:
• Functionality
• Don’t mix two different clock domains in a single block
• Don’t make the blocks too large
• Optimize for Synthesis


Block diagram of the framer, receive direction: Is it partitioned well? Does it follow the suggestions of the previous slide?


Partitioning


Recommended rules for Synthesis
• Share resources whenever possible
• Do not split combinatorial paths across hierarchical boundaries
• Register all outputs
• Do not implement glue logic between blocks; partition them well
• Separate designs on functional boundaries
• Keep blocks to a reasonable size
• Separate core logic, pads, clock and JTAG


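Resource Sharing

HDL Description

if (select) then

sum <= A + B;

else

sum <= C + D;

end if;

[Figure: one possible implementation: two adders compute A + B and C + D, and a mux controlled by select drives sum]

[Figure: another implementation, with a shared resource -> area-efficient: select controls two muxes, one choosing between A and C and one choosing between B and D, and a single adder produces sum]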


Sharable HDL Operators
• The following HDL (VHDL and Verilog) synthetic operators can result in a shared implementation

* + -

> >= < <=

= /= ==

• Operators can be shared only within the same block (i.e., when they are in the same process)


DesignWare Implementation Selection
• The DesignWare implementation chosen depends on the area and timing goals
• The smallest implementation that meets the timing goals is selected

[Figure: synthetic module for the + operator and its candidate implementations, ranging from smallest (ripple carry) to fastest]


Sharing Common Sub-Expressions
• Design Compiler tries to share common sub-expressions to reduce the number of resources needed to implement the design -> area savings, provided the timing goals are met

SUM1 <= A + B + C;

SUM2 <= A + B + D;

SUM3 <= A + B + E;

[Figure: A + B is computed once by a shared adder; three further adders add C, D and E to that result to produce SUM1, SUM2 and SUM3]


Limitations of Sharing Common Sub-Expressions
• Sharable terms must be in the same order within each expression

sum1 <= A + B + C;

sum2 <= B + A + D; -> not sharable

sum3 <= A + B + E; -> sharable

• Sharable terms must occur in the same position (or use parentheses to maintain ordering)

sum1 <= A + B + C;

sum2 <= D + A + B; -> not sharable

sum3 <= E + (A + B); -> sharable


How to Infer a Specific Implementation (Adder with Carry-In)
• The following expression infers an adder with carry-in
• sum <= A + B + Cin;
• where A and B are vectors, and Cin is a single bit

[Figure: a single adder with inputs A, B and Cin producing sum]


Operator Reordering
• Design Compiler can reorder arithmetic operators to produce the fastest design
• For example
• Z <= A + B + C + D; (Z is time constrained)
• Initially the ordering is from left to right

[Figure: chain of three adders computing ((A + B) + C) + D = Z]


Reordering of the Operator for a Fast Design
• If the arrival times of all the signals A, B, C and D are the same, Design Compiler will reorder the operators into a balanced tree architecture

[Figure: balanced tree of adders computing (A + B) + (C + D) = Z]


Reordering of the Operator for a Fast Design
• If signal A arrives last, Design Compiler will reorder the operators to accommodate the late-arriving signal

[Figure: chain of adders computing ((C + B) + D) + A = Z, with A entering the last adder]
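
As a rough sketch of how the late arrival of A might be communicated to the tool, input delays can be set per port in dc_shell Tcl mode. The clock port name clk, the 10 ns period, the delay values and the bus port names are assumptions for illustration, not values from the lecture.

```tcl
# Hypothetical 10 ns clock on an assumed port named clk
create_clock -period 10 [get_ports clk]

# B, C and D are assumed to arrive 2 ns after the clock edge
set_input_delay 2 -clock clk [get_ports {B[*] C[*] D[*]}]

# A is assumed to arrive late, 6 ns after the clock edge
set_input_delay 6 -clock clk [get_ports {A[*]}]
```

With arrival times described this way, a timing-driven compile has the information it needs to place A in the adder stage closest to Z.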


Avoid hierarchical combinatorial blocks

The path between reg1 and reg2 is divided among three different blocks

Because of the hierarchical boundaries, the combinatorial logic cannot be optimized as a whole

Synthesis tools (Synopsys) maintain the integrity of the I/O ports, so combinatorial optimization cannot be carried out across blocks (unless “grouping” is used).
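
A minimal sketch of the grouping mentioned above, in dc_shell Tcl mode. The instance names u_blk1, u_blk2 and u_blk3, the design name top and the new design and cell names are hypothetical.

```tcl
# Merge three assumed sub-block instances into one new level of hierarchy
group {u_blk1 u_blk2 u_blk3} -design_name comb_path -cell_name u_comb_path

# Dissolve the boundaries inside the new block so that the
# combinatorial logic between reg1 and reg2 can be optimized together
current_design comb_path
ungroup -all

# Return to the assumed top-level design before compiling
current_design top
```

Grouping after the fact works, but partitioning the RTL so that the path never crosses a boundary avoids the extra step.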


Recommended way to handle combinatorial paths

All the combinatorial circuitry is grouped in the same block, whose output is connected to the destination flip-flop

This allows optimal minimization of the combinatorial logic during synthesis

It also simplifies the description of the timing interface


Register all outputs

Simplifies the synthesis design environment: inputs to each block arrive with the same relative delay (caused by wire delays)

Output requirements do not really need to be specified, since paths start at flip-flop outputs.

Watch the fanout: as a rule of thumb, keep it to 16 (this depends on the technology and on the components driven by the output)


NO GLUE LOGIC between blocks

Under time pressure, a bug is found that could be fixed simply by adding some glue logic. RESIST THE TEMPTATION!!!

At this level in the hierarchy, this implementation will not allow the glue logic to be absorbed within any lower level block.


Separate designs with different goals

reg1 may be driven by a time-critical function and will therefore have different optimization constraints

reg3 may be driven by slow logic, so there is no need to constrain it for speed


Optimization based on design requirements
• Use different entities to partition design blocks
• Allows different constraints during synthesis to optimize for area or speed or both.
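
One way this could look in dc_shell Tcl mode is sketched below. The design names fast_blk and slow_blk, the clock port name clk and the periods are made up for illustration.

```tcl
# Timing-critical block (hypothetical design name): constrain for speed
current_design fast_blk
create_clock -period 5 [get_ports clk]
compile -map_effort high

# Slow block (hypothetical design name): relaxed clock, push for small area
current_design slow_blk
create_clock -period 40 [get_ports clk]
set_max_area 0
compile
```

Because each block is its own entity, each compile run sees only the constraints that matter for that block.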


Separate FSM from random logic
• Separating the FSM from the random logic allows you to use FSM-optimized synthesis


Maintain a reasonable block size
• Partition your design so that each block is between 1,000 and 10,000 gates (this depends strictly on the tools and technology)
• The larger the blocks, the longer the run time -> quick iterations cannot be done.


Partitioning of Full ASIC
• Top-level block includes I/O pads and the Mid block instantiation
• Mid includes Clock generator, JTAG, CORE logic
• CORE LOGIC includes all the functionality and internal scan circuitry


Synthesis Constraints
• Specifying an Area goal
• Area constraints are expressed in vendor/library-dependent units (e.g., 2-input NAND gates, square mils, grids, etc.)
• Design Compiler has the max area constraint as one of its constraint attributes.
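
A short sketch of setting an area goal in dc_shell Tcl mode. The numeric budget is invented; its meaning depends on the area units of the library in use.

```tcl
# Explicit budget in library area units (5000 is a made-up value)
set_max_area 5000

# Alternatively, a target of 0 asks the tool to make the design as small as it can
# set_max_area 0

# Inspect the achieved area after compile
report_area
```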


Timing constraints for synchronous designs
• Define timing paths within the design, i.e., paths leading into the design, internal paths and paths leading out of the design
• Define the clock
• Define the I/O timing relative to the clock


Define a clock for synthesis
• Clock source
• Period
• Duty cycle
• Defining the clock constrains the internal timing paths
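
The three items above map onto a single command. The sketch below assumes a clock input port named clk and a 10 ns period; both are illustrative.

```tcl
# Clock source: an assumed input port named clk
# Period: 10 ns
# Duty cycle: waveform {0 5} gives a rising edge at 0 ns and a falling edge at 5 ns (50%)
create_clock -name clk -period 10 -waveform {0 5} [get_ports clk]

# List the clocks the tool now knows about
report_clock
```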


Timing goals for synchronous design
• Define timing constraints for all paths within a design
• Define the clocks
• Define the I/O timing relative to the clock


Constraining input path
• Input delay is specified relative to the clock
• External logic uses part of the clock period, i.e.
• TclkToQ (clock-to-Q delay) + Tw (net delay) -> {at input to B}
• Example command for this in Synopsys Design Compiler:
• dc_shell> set_input_delay -clock clk 5 (where 5 represents the input delay)


Constraining output path
• Output delay is specified relative to the clock
• How much of the clock period does the external logic (shown by cloud b) use up?
• Tb + Tsetup: this is the amount to be specified as the output delay


Generic statement for input and output delays
• Normally the input and output delay values are set using a rule-of-thumb value that depends on the fanout, the external logic and the technology being used
• Design Compiler (the synthesis tool) has to work with the remaining time (Tclk - Tin - Tout) during synthesis.
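
A worked sketch of this budget, assuming a 10 ns clock, 3 ns used by the external logic ahead of the inputs (TclkToQ + Tw) and 4 ns needed by the external logic after the outputs (Tb + Tsetup). All numbers and the clock port name clk are illustrative.

```tcl
create_clock -period 10 [get_ports clk]

# Tin = TclkToQ + Tw = 3 ns consumed outside the block before data reaches the inputs
# (the clock port itself is excluded from the list of constrained inputs)
set_input_delay 3 -clock clk [remove_from_collection [all_inputs] [get_ports clk]]

# Tout = Tb + Tsetup = 4 ns needed outside the block after the outputs
set_output_delay 4 -clock clk [all_outputs]

# Time left for logic on a path from input to output:
# Tclk - Tin - Tout = 10 - 3 - 4 = 3 ns
```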


False and Multicycle paths
• False path
• Very slow signals such as reset and test-mode enable, which are not used under normal conditions, are classified as false paths
• Multicycle path
• Paths that take more than one clock cycle are known as multicycle paths.
• Multicycle paths have to be defined in the analyzer, which then takes those constraints into account when synthesizing
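
The two path exceptions could be declared along these lines in dc_shell Tcl mode. The port names rst and test_en and the register names src_reg and dst_reg are hypothetical.

```tcl
# Reset and test-mode enable are assumed to be ports named rst and test_en
set_false_path -from [get_ports {rst test_en}]

# A path assumed to be allowed two clock cycles, between hypothetical registers
set_multicycle_path 2 -setup -from [get_cells src_reg] -to [get_cells dst_reg]

# Move the hold check back by one cycle so it is still checked at the launch edge
set_multicycle_path 1 -hold -from [get_cells src_reg] -to [get_cells dst_reg]
```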


Timing paths


Combinatorial logic may have multiple paths
• Static Timing Analysis uses the longest path to calculate a maximum delay or the shortest path to calculate a minimum delay.


Selecting a Semiconductor vendor
• One of the first things that needs to be done when designing a chip is to select the semiconductor vendor and technology one wants to use. The following issues need to be considered during the selection process
• Maximum frequency of operation
• Power restrictions
• Packaging restrictions
• Clock tree implementation
• Floor planning
• Back-annotation support
• Design support for libraries, megacells, and RAMs
• Available cores
• Available test methods and scans


Understanding the library
• Design Compiler (DC) uses these libraries
• Technology libraries
• Symbol libraries
• DesignWare libraries
• We will use Design Vision from Synopsys for synthesis
• Type design_vision to invoke the tool


Technology libraries
• Contain information about the characteristics and functions of each cell provided in a semiconductor vendor’s library. The manufacturers maintain and distribute the technology libraries
• Cell characteristics include information such as cell name, pin names, area, delay arcs and pin loading.
• The technology library also defines the conditions that must be met for a functional design (e.g., the maximum transition time for nets). These conditions are called design rule constraints.
• Technology libraries also specify the operating conditions and wire load models specific to that technology
• DC requires the technology libraries to be in “.db” format. These libraries are typically provided by the semiconductor manufacturer


Symbol libraries
• Symbol libraries contain definitions of the graphic symbols that represent library cells in the design schematics. Semiconductor vendors maintain and distribute the symbol libraries.
• Design Compiler uses symbol libraries to generate the design schematic. You must use Design Vision to view the design schematic.
• When you generate the design schematic, Design Compiler performs a one-to-one mapping of cells in the netlist to cells in the symbol library.


DesignWare Library
• A DesignWare library is a collection of reusable circuit-design building blocks (components) that are tightly integrated into the Synopsys synthesis environment.
• DesignWare components that implement many of the built-in HDL operators are provided by Synopsys. These operators include +, -, *, <, >, <=, >=, and the operations defined by if and case statements.
• You can develop additional DesignWare libraries at your site by using DesignWare Developer, or you can license DesignWare libraries from Synopsys or from third parties.


Specifying Libraries
• Use dc_shell variables to specify the libraries used by the Design Compiler as shown in the table below


Target Library
• Design Compiler uses the target library to build a circuit. During mapping, Design Compiler selects functionally correct gates from the target library. It also calculates the timing of the circuit, using the vendor-supplied timing data for these gates.
• Use the target_library variable to specify the target library.
• The syntax is
• set target_library my_tech.db


Link Library
• Design Compiler uses the link library to resolve references. For a design to be complete, it must connect to all the library components and designs it references. This process is called linking the design or resolving references. During the linking process, Design Compiler uses the link_library system variable, the local_link_library attribute, and the search_path system variable to resolve references.
• The syntax is


Specifying DesignWare Library
• You do not need to specify the standard synthetic library, standard.sldb, that implements the built-in HDL operators. The software automatically uses this library.
• If you are using additional DesignWare libraries, you must specify these libraries by using the synthetic_library variable (for optimization purposes) and the link_library variable (for cell resolution purposes).
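
Putting the library variables together, a setup script might look like the sketch below. The directory ./libs and the file names my_tech.db and my_tech.sdb are placeholders; dw_foundation.sldb is used here as an example of an additional DesignWare library.

```tcl
# Directories searched for libraries and designs (paths are placeholders)
set search_path [list . ./libs]

# Library used for mapping to gates
set target_library [list my_tech.db]

# "*" tells the linker to search the designs already in memory first
set link_library [list * my_tech.db]

# Symbol library for schematics (hypothetical file name)
set symbol_library [list my_tech.sdb]

# Additional DesignWare library: listed for optimization and for cell resolution
set synthetic_library [list dw_foundation.sldb]
set link_library [concat $link_library $synthetic_library]
```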


Describing environmental attributes

• set_max_capacitance, set_max_transition and set_max_fanout: applied to input and output ports or to the current design
• set_operating_conditions: applied to the whole design


Environmental attributes
• Design environment consists of defining the process parameters, I/O port attributes, and statistical wire load models.
• set_min_library <max library filename> \

-min_version <min library filename>

dc_shell> set_min_library "ex25_worst.db" \

-min_version "ex25_best.db"

This command allows the user to specify the best-case and worst-case libraries simultaneously. It can be used to fix setup and hold violations. The user should set both the min and the max values for the operating conditions


Setting operating conditions
• set_operating_conditions
• Specifies the process, voltage and temperature conditions of the design.
• Synopsys libraries include WORST, TYPICAL and BEST cases. Each vendor has its own naming convention for the libraries!
• By changing the value given to the set_operating_conditions command, the full range of process variations can be covered.


Setting operating conditions
• set_operating_conditions
• WORST is generally used during the pre-layout synthesis phase to optimize for setup time (maximum delays).
• BEST is normally used to fix any hold violations.
• TYPICAL is generally not used since it is covered when both WORST and BEST cases are used.


Setting operating conditions
• set_operating_conditions
• It is possible to optimize the design with both WORST and BEST cases simultaneously

dc_shell> set_operating_conditions WORST

dc_shell> set_operating_conditions -min BEST \

-max WORST


Wire load models
• DC uses wire load models to estimate the capacitance, resistance and area of nets prior to floor planning or layout.
• The wire load model is based on the statistical average length of a net for a given fanout and a given area

[Figure: wire load models named by block size, “10 x 10” and “20 x 20”]


• Synopsys provides wire load models in the technology library, each representing a particular size.
• Designers can create their own wire load models for better accuracy


• set_wire_load_mode has three modes: top, enclosed and segmented
• top
• Specifies that all nets in the hierarchy inherit the same wire load model as the top-level block. Use it when the plan is to flatten the design later for layout.
• enclosed
• Specifies all the nets (of the sub-blocks) inherit the wire load model of the block that completely encloses the sub-blocks. For example, if blocks X and Y are enclosed within block Z, then the blocks X and Y will inherit the wire load models defined for block Z.


• segmented
• Used when wires cross hierarchical boundaries. In the previous example, the sub-blocks X and Y will use the wire load models specific to them, while nets between sub-blocks X and Y (which are contained within Z) will inherit the wire load model specified for block Z
• Not used often, as the wire load models are specific to the net segments

• Using accurate wire load models is highly recommended, as this directly affects the synthesis runs. The wrong model can generate undesired results. Use slightly pessimistic wire load models. This provides extra timing margin that may be absorbed later during test circuit insertion or layout


[Figure: nested blocks with 50x50, 40x40, 30x30 and 20x20 wire load models, showing which models the nets use in each mode]

mode = top: (ignores lower level wire loads)

mode = enclosed: (uses best fitting wire loads)

mode = segmented: (uses several wire loads)
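
A sketch of how the models and the mode might be assigned in dc_shell Tcl mode. The design names top and sub_blk are hypothetical, and the model names are borrowed from the figure labels; the real names must match those in the technology library.

```tcl
# Assign a smaller model to an assumed sub-block
current_design sub_blk
set_wire_load_model -name "20x20"

# Assign a larger model to the assumed top-level design
current_design top
set_wire_load_model -name "50x50"

# Each net uses the model of the smallest block that fully encloses it
set_wire_load_mode enclosed
```

Switching the last line to top would make every net use the 50x50 model, which is the more pessimistic choice for nets inside sub_blk.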


set_drive
• set_drive is used at the input ports of the block to specify the drive strength at the input port. It is typically used to model the external drive resistance to the ports of the block or chip. 0 signifies the highest strength and is normally used for clock or reset ports.

set_drive <value> <object list>

dc_shell> set_drive 0 {clk rst}


set_driving_cell
• set_driving_cell is used to model the drive resistance of the cell driving the input ports.

set_driving_cell -cell <cell name> -pin <pin name> <object list>

dc_shell> set_driving_cell -cell BUFF1 -pin Z [all_inputs]


set_load
• set_load sets the capacitive load, in the units defined in the technology library (e.g., pF), on the specified ports or nets of the design. It is typically used to set the capacitive loading on output ports of blocks during pre-layout synthesis, and on nets when back-annotating extracted post-layout capacitance information
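
The port environment commands can be combined as in the sketch below. The port names clk and rst, the buffer BUFF1 with output pin Z and the load value are assumptions; the sketch uses the -lib_cell option of set_driving_cell.

```tcl
# Ideal (zero-resistance) drive on the assumed clock and reset ports
set_drive 0 [get_ports {clk rst}]

# All other inputs are assumed to be driven by pin Z of a library buffer named BUFF1
set_driving_cell -lib_cell BUFF1 -pin Z \
    [remove_from_collection [all_inputs] [get_ports {clk rst}]]

# Assumed load of 0.5 library capacitance units on every output port
set_load 0.5 [all_outputs]
```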


Design rule constraints
• Design rule constraints consist of set_max_transition, set_max_fanout and set_max_capacitance. These rules are technology dependent and are generally set in the technology library. The DRC commands are applied to input ports, output ports or the current_design. If the technology library is not adequate or is too optimistic, these commands can be used to control the buffering in the design

set_max_transition <value> <object list>

set_max_capacitance <value> <object list>

set_max_fanout <value> <object list>

dc_shell-t> set_max_transition 0.3 current_design

dc_shell-t> set_max_capacitance 1.5 [get_ports out1]

dc_shell-t> set_max_fanout 3.0 [all_outputs]

(dc_shell-t> corresponds to DC operating in Tcl mode)


Some more design constraints

dc_shell-t> create_clock -period 40 \

-waveform [list 0 20] CLK

set_dont_touch_network is a very useful command, usually used for clock and reset. It sets a dont_touch property on a port or a net, which prevents DC from buffering the net in order to meet DRCs.

dc_shell-t> set_dont_touch_network {clk rst}


Some more design constraints
• If a block generates a secondary clock from the primary one, e.g., a byte clock from the serial clock, apply set_dont_touch_network to the generated clock output port of the block. This helps prevent DC from buffering it up. Clock trees can be inserted later to balance the clock skew.


Some more design constraints
• set_dont_touch is used to set a dont_touch property on the current design, cells, references or nets. It is frequently used during hierarchical compilation of blocks.

dc_shell-t> set_dont_touch current_design

• Useful for telling DC not to touch the current design once it has been optimized to the designer’s satisfaction. For example, if a block of spare gates is instantiated, DC will not touch or optimize it.


Summarizing: High level synthesis is constraint driven
• Resource sharing, sharing common sub-expressions and implementation selection are all dependent on design constraints and coding style
• Based on the timing constraints, Design Compiler decides what to share, how to implement it and how to order the operators.
• If no constraints are given, area-based optimization is performed (which may be a good start for getting an idea of the synthesized circuit)
• It is imperative that realistic constraints be set prior to compilation
• High-level synthesis takes place only when optimizing an HDL description
