ECE 551 Digital System Design & Synthesis

Lecture 08

The Synthesis Process

Constraints and Design Rules

High-Level Synthesis Options

Pre-Synthesis Steps
  • Syntax Check
    • Makes sure your HDL code follows the syntax rules of the Standard.
    • Finds errors like typos, missing semicolons, “begin” without “end”, assigning to a net in a behavioral block, etc.
    • Only a surface-level check
    • Checks each module in isolation; doesn’t look at how they fit together
Pre-Synthesis Steps
  • Elaboration
    • “Elaborates” HDL statements
    • Unrolls FOR loops
    • Computes values of constant functions
    • Replaces parameters with their values
    • Substitutes macro text
    • Evaluates generate conditionals and loops
    • Checks to make sure instantiated modules are defined
    • Checks inter-module connections for mismatched input/output connections (i.e. module port width not the same as connected net/variable width)
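
To make the loop-unrolling and parameter-substitution steps concrete, here is a minimal Python sketch (not part of any synthesis tool). It expands a constant-bound shift loop into the individual assignments that remain after elaboration. The function name unroll_shift and the signal name taps are illustrative assumptions.

```python
def unroll_shift(name, ntaps):
    """Expand 'for (i = ntaps-1; i > 0; i = i - 1) name[i] <= name[i-1];'
    into one assignment string per loop iteration."""
    return [f"{name}[{i}] <= {name}[{i - 1}];" for i in range(ntaps - 1, 0, -1)]

# the parameter ntaps has been replaced by its value (4) before unrolling
statements = unroll_shift("taps", 4)
for s in statements:
    print(s)
```

Because the loop bound is a constant, the loop disappears entirely and only three plain register-to-register assignments are left to synthesize.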
Pre-Synthesis Steps
  • Design Check
    • Checks design for issues that may make it unsynthesizable, but are otherwise legal HDL
    • Detects multiple drivers to non-tristates
    • Detects combinational loops
    • Gives errors or warnings about unsynthesizable constructs like delays, unsupported operators, etc.
    • Warns about unconnected or constant-value ports
    • May give warnings about inferred latches
    • Many of these produce warnings rather than errors; make sure you read the warnings when synthesizing!
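
As an illustration of one of these checks, the sketch below looks for a combinational loop by searching a signal-dependency graph for a cycle. This is only a plausible model of the idea, not how any particular tool implements it; the function name and the example signal names are assumptions.

```python
def find_combinational_loop(fanin):
    """Return a list of signals forming a cycle, or None if there is none.

    fanin maps each combinationally driven signal to the signals it reads.
    Register outputs are left out of the mapping, because a flip-flop
    breaks the combinational path.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(sig):
        color[sig] = GRAY
        path.append(sig)
        for src in fanin.get(sig, []):
            state = color.get(src, WHITE)
            if state == GRAY:                      # back edge: cycle found
                return path[path.index(src):] + [src]
            if state == WHITE:
                loop = visit(src)
                if loop:
                    return loop
        path.pop()
        color[sig] = BLACK
        return None

    for sig in list(fanin):
        if color.get(sig, WHITE) == WHITE:
            loop = visit(sig)
            if loop:
                return loop
    return None

# assign a = b & en;  assign b = a | c;   (a and b feed each other)
looped = {"a": ["b", "en"], "b": ["a", "c"]}
# add reads mult and result, but result is a register output, so no loop
clean = {"mult": ["constreg", "in"], "add": ["mult", "result"]}
print(find_combinational_loop(looped))
print(find_combinational_loop(clean))
```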
Synthesis Process
  • Inputs
    • Functional hardware description in HDL
    • List of design constraints and design rules
      • Desired clock frequency / maximum delay
      • Limits on area, power, capacitance
    • Technology library (logic cells, wire models, etc.)
    • User-specified synthesis options/strategies
  • Output
    • Ideally: A netlist that uses the specified technology library, produces the same behavior as the functional description, and meets the design constraints
    • Reports that summarize the area and timing of the implementation
Logic Synthesis Steps
  • Translation
    • The synthesis tool identifies the behavior of high-level constructs and replaces them with a structural representation from a generic technology library.
    • Examples: “adder”, “multiplier”, “flip-flop”, “latch”
  • High-Level Optimizations
    • The tool performs optimizations at the Boolean equation level
    • The types of optimizations depend on your strategies
    • Examples: Reducing the number of logic levels, minimizing the number of Boolean operations, eliminating redundant computations
Logic Synthesis Steps
  • Mapping
    • The synthesis tool replaces the generic representations of gates and logic structures with equivalent hardware representations from the provided technology library
    • The netlist now consists of a structural representation of logic cells (Standard Cell) or LUTs/CLBs (FPGA)
  • Low-Level Optimizations
    • The tool performs optimizations at the logic cell level, either to reduce delay or reduce area
    • Examples: Duplicating logic, re-ordering operations to minimize delay, re-timing registers
A Brief Aside on Mapping
  • People commonly say that when using Structural Verilog, you know exactly what gates you are getting.
  • Is this true?
    • It actually depends on what’s in your Tech Library
    • If your library contains an XOR gate, then an XOR primitive will be mapped to that gate
    • But what if your Tech Library only contains NAND gates? Or only Look-up Tables?
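
For the NAND-only case, one well-known answer is that an XOR can be built from four 2-input NAND gates. The short Python sketch below checks that construction against the XOR truth table; the helper names are illustrative, and an actual mapper may choose a different structure.

```python
def nand(a, b):
    return 1 - (a & b)

def xor_from_nands(a, b):
    n1 = nand(a, b)        # shared first stage
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    return nand(n2, n3)    # four NAND cells in total

table = [(a, b, xor_from_nands(a, b)) for a in (0, 1) for b in (0, 1)]
for row in table:
    print(row)
```

So a single XOR primitive in structural Verilog can end up as four cells, which is why "knowing exactly what gates you get" depends on the library.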
Why require Constraints & Strategies?
  • Synthesis is hard (NP-hard!)
    • For a circuit of any useful size, the number of possible implementations is enormous
    • It is too computationally intensive to try them all
    • Need to know when a solution is good enough to stop
    • We usually give the tool hints on how to proceed
  • Often there is no universally “best” solution
    • Area vs. delay
    • Throughput vs. latency
    • Power vs. frequency
    • Constraints & strategies allow us to manage tradeoffs to find the solution that meets our needs
Constraint Examples
  • Minimize area

module mac(input clk, rst, input [31:0] in, output [63:0] out);
  reg [31:0] constreg;
  reg [63:0] mult, add, result;
  reg [2:0] count;

  assign out = result;

  always @(*) mult = constreg * in;
  always @(*) add = mult + result;

  always @(posedge clk) begin
    if (rst) begin
      constreg <= in;
      result <= 0;
      count <= 0;
    end else if (count > 0) begin
      result <= add;
      count <= count - 1;
    end else begin
      result <= 0;
      count <= 4;
    end
  end
endmodule

Setting Design Constraints
  • set_max_area 20000
    • Sets maximum area to 20,000 cell units
  • set_max_delay 4 -to [all_outputs]
    • Sets a maximum delay of 4 to any output
  • set_max_dynamic_power 10 mW
    • Sets maximum dynamic power to 10 mW
  • create_clock clk -period 10
    • Specifies that port clk is a clock with a period of 10 ns
  • create_clock -name "my_clk" -period 12
    • Creates a virtual clock called my_clk with a period of 12 ns; use with combinational logic
Constraint Examples
  • Constraints
    • CLK_PERIOD = 4 (250 MHz)
    • MAX_AREA = 80000
  • Results
    • Arrival: 3.73
    • Slack: 0.01
    • Area: 68122
  • Slack = CLK_PERIOD - (Arrival + Library Setup Time)
  • Library Setup Time is approximately 0.25-0.26 ns for these examples
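
The slack formula can be checked with a few lines of Python. This is a minimal sketch: the setup time used for each run (0.26 or 0.25 ns) is an assumption within the 0.25-0.26 ns range quoted on the slide, and the second call uses a hypothetical path that is not from the lecture.

```python
def slack(clk_period, arrival, setup):
    """Slack = CLK_PERIOD - (Arrival + Library Setup Time), all in ns."""
    return round(clk_period - (arrival + setup), 2)

def meets_timing(clk_period, arrival, setup):
    # zero or positive slack means the path fits within one clock period
    return slack(clk_period, arrival, setup) >= 0

# period 4, arrival 3.73, assumed setup 0.26
first = slack(4.0, 3.73, 0.26)
# hypothetical path that is too slow for a 3.6 ns clock
too_slow = meets_timing(3.6, 3.50, 0.25)
print(first, too_slow)
```

A negative slack means the constraint is violated: the data arrives too late to satisfy the setup time of the capturing register.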
Constraint Examples
  • Constraints
    • CLK_PERIOD = 4
    • MAX_AREA = 65000
  • Results
    • Arrival: 3.75
    • Slack: 0.00
    • Area: 64758
Constraint Examples
  • Constraints
    • CLK_PERIOD = 4
    • MAX_AREA = 60000
  • Results
    • Arrival: 3.75
    • Slack: 0.00
    • Area: 63377
Constraint Examples
  • Maximize speed

(Same mac module as in the “Minimize area” example.)

Constraint Examples
  • Constraints
    • CLK_PERIOD = 4 (250 MHz)
    • MAX_AREA = 80000
  • Results
    • Arrival: 3.73 (+ 0.26 = 3.99)
    • Slack: 0.01
    • Area: 68122
Constraint Examples
  • Constraints
    • CLK_PERIOD = 3.6 (278 MHz)
    • MAX_AREA = 80000
  • Results
    • Arrival: 3.46 (+ 0.26 = 3.68)
    • Slack: -0.08
    • Area: 73131
Constraint Examples
  • Constraints
    • CLK_PERIOD = 3.7 (270 MHz)
    • MAX_AREA = 90000
  • Results
    • Arrival: 3.45 (+ 0.25 = 3.7)
    • Slack: 0.00
    • Area: 75673
Optimization Priorities
  • Design rules have priority over timing goals
  • Timing goals have priority over area goals
    • Design rules have highest priority
  • To prioritize area constraints:
    • use the -ignore_tns (total negative slack) option when you specify the area constraint:

set_max_area -ignore_tns 10000

  • To change priorities use set_cost_priority
    • Example: set_cost_priority -delay
  • To remove all optimization constraints use remove_constraint
Compiling the Design
  • Once the optimization specifications are set, the design is compiled
  • The compile command
    • Logic-level and gate-level synthesis
    • Optimizations of the design
  • The compile_ultra command
    • Two-pass high effort compile of the design
    • May want to compile normally first to get ballpark figure (higher effort == longer compilation)

What is the purpose of doing multiple passes?

Synthesis Strategies
  • Even after supplying HDL code, Tech Library, and Constraints, the designer is still responsible for the Synthesis Strategy.
  • Why do we use Strategies?
    • The amount of CPU time and memory we devote to synthesis are still limited resources
    • The designers may already have a good idea about what sort of hardware they want
Compiling the Design
  • Useful compile options include:
    • -map_effort low | medium | high (default is medium)
    • -area_effort low | medium | high (default same as map_effort)
    • -incremental_mapping (may improve an already-mapped design)
    • -verify (compares initial and synthesized designs)
    • -ungroup_all (collapses all levels of design hierarchy)

Top-Down Compilation
  • Use the top-down compile strategy when compile time and synthesizer memory are not limiting factors
  • Compiles the top-level design and all of its subdesigns together, using only top-level constraints
  • Basic steps are:
    • Read in the entire design using analyze/elaborate or:

acs_read_hdl -recurse $TOP_DESIGN

    • Resolve multiple instances of any design references with uniquify
    • Apply attributes and constraints to the top level
    • Compile the design using compile or compile_ultra
Example Top-Down Script

# read in the entire design
analyze -library WORK -format verilog {E.v D.v C.v B.v A.v TOP.v}
# elaborate takes the name of the top-level design, not a file list
elaborate TOP -library WORK
current_design TOP
link  ;# links TOP to the libraries and modules it references

# set design constraints
set_max_area 2000

# resolve multiple references
uniquify

# compile the design
compile

Bottom-Up Compile Strategy
  • The bottom-up compile strategy
    • Compile the subdesigns separately and then incorporate them
    • Top-level constraints are applied and the design is checked for violations.
  • Advantages:
    • Compiles large designs more quickly (divide-and-conquer)
    • Requires less memory than top-down compile
  • Disadvantages
    • Need to develop local constraints as well as global constraints
    • May need to repeat process several times to meet design goals
  • Might use if memory or CPU time are limited
Compile-Once-Don’t-Touch Method
  • The compile-once-don’t-touch method uses the set_dont_touch command to preserve the compiled subdesign

current_design top
characterize U2/U3
current_design C
compile
current_design top
set_dont_touch {U2/U3 U2/U4}
compile

  • What are advantages and disadvantages?
Resolving Multiple References
  • In a hierarchical design, subdesigns are often referenced by more than one cell instance
Uniquify Method
  • The uniquify command creates a uniquely named copy of the design for each instance.

current_design top
uniquify
compile

  • Each design optimized separately
  • What are advantages and disadvantages?
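
The effect of uniquify on the design hierarchy can be pictured with a small Python model. This is only a sketch of the idea: the function is hypothetical, and the design_index naming pattern is an assumption rather than a statement of what the tool prints.

```python
def uniquify(instances):
    """instances maps instance name -> referenced design name.

    Returns a new mapping in which every design referenced more than once
    gets its own copy per instance (design_0, design_1, ...), so each copy
    can be optimized for the environment of that one instance.
    """
    counts = {}
    for design in instances.values():
        counts[design] = counts.get(design, 0) + 1

    next_index = {}
    result = {}
    for inst, design in instances.items():
        if counts[design] > 1:
            idx = next_index.get(design, 0)
            next_index[design] = idx + 1
            result[inst] = f"{design}_{idx}"
        else:
            result[inst] = design      # single reference: nothing to copy
    return result

before = {"U1": "adder", "U2": "adder", "U3": "mux"}
after = uniquify(before)
print(after)
```

The model hints at the tradeoff: each instance can be tailored to its own timing context, but every copy must be compiled and stored separately.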
Ungroup Method (“Flattening”)
  • The ungroup command makes unique copies of the design and removes levels of the hierarchy

current_design B
ungroup {U3 U4}
current_design top
compile

  • What are advantages and disadvantages?
Benefits of Ungrouping Hierarchy

module logic1(input a, c, e, output reg x);
  always @(a, c, e)
    x = ((~a|~c) & e) | (a&c);
endmodule

module logic2(input a, b, c, d, output reg y);
  always @(a, b, c, d)
    y = ((((~a|~c)&b) | ((a|~b)&c))&d) | ((a|~b)&~d);
endmodule

module logic(input a, b, c, d, e, f, output reg z);
  wire x, y;

  // module instances require instance names (u1 and u2 are arbitrary)
  logic1 u1(a, c, e, x);
  logic2 u2(a, b, c, d, y);

  always @(x, y, f)
    z = (~f&x) | (f&y);
endmodule

  • With Hierarchy
    • Area: 36.15
    • Delay: 0.25
  • Without Hierarchy
    • Area: 34.15
    • Delay: 0.25
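
One plausible source of the saving is logic shared across the old module boundary: the term (~a|~c) appears in both logic1 and logic2, and once the hierarchy is removed the tool is free to build it once. The Python sketch below is an illustration of that idea, not a claim about what the tool actually did. It checks exhaustively that a flattened version with the shared term computes the same z as the hierarchical version; the helper names are assumptions.

```python
from itertools import product

def logic1(a, c, e):
    return ((not a or not c) and e) or (a and c)

def logic2(a, b, c, d):
    return ((((not a or not c) and b) or ((a or not b) and c)) and d) \
        or ((a or not b) and not d)

def hierarchical(a, b, c, d, e, f):
    x = logic1(a, c, e)
    y = logic2(a, b, c, d)
    return (not f and x) or (f and y)

def flattened(a, b, c, d, e, f):
    nand_ac = not (a and c)    # ~a|~c, now shared by both former modules
    a_or_nb = a or not b       # used twice inside the former logic2
    x = (nand_ac and e) or (a and c)
    y = (((nand_ac and b) or (a_or_nb and c)) and d) or (a_or_nb and not d)
    return (not f and x) or (f and y)

# compare the two versions on all 64 input combinations
mismatches = [v for v in product([False, True], repeat=6)
              if bool(hierarchical(*v)) != bool(flattened(*v))]
print(len(mismatches))
```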

Ungrouping versus Boolean Flattening
  • Ungrouping is commonly referred to as “Flattening the Hierarchy”, even by tool vendors
  • Because of this, many people incorrectly think the “set_flatten true” option in Synopsys is the same as “ungroup”
  • set_flatten true tells Design Vision to flatten the Boolean equations describing your logic down to a two-level expression. That is, to create a Sum of Products expression.
  • Flattening Boolean equations is a way of reducing delay at the cost of increased area – we’ll talk about it more in a later lecture.
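
To show what a two-level sum-of-products form looks like, the sketch below expands the multi-level expression x = ((~a|~c) & e) | (a&c) into a sum of minterms by walking its truth table. Note the assumption: this produces the canonical, unminimized form, whereas a synthesis tool would also minimize the result. The function name is illustrative.

```python
from itertools import product

def to_sum_of_products(func, names):
    """Return func as a canonical sum of minterms (two logic levels)."""
    terms = []
    for values in product([0, 1], repeat=len(names)):
        if func(*values):
            literals = [n if v else "~" + n for n, v in zip(names, values)]
            terms.append("&".join(literals))
    return " | ".join(terms)

def x(a, c, e):
    # multi-level form taken from the logic1 example
    return ((not a or not c) and e) or (a and c)

sop = to_sum_of_products(x, ["a", "c", "e"])
print(sop)
```

Every product term feeds a single OR, so there are only two levels of logic between inputs and output, at the cost of more (and wider) gates.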
Dealing with Structured Logic
  • Sometimes we do not want the synthesis tool to try to optimize our Boolean equations.
  • Structured Logic refers to Boolean logic operations that are structured in a certain way to achieve a goal, such as reduced delay or fault tolerance.
  • Examples: Carry-Lookahead Adder, Wallace Multiplier, duplicated logic
  • set_structure true (default) – tells the tool it can re-order, factor, or decompose the logic equations
  • set_structure false – tells the tool to leave the logic alone
Checking your Design
  • Use the check_design command to verify design consistency.
    • Usually run both before and after compiling a design
    • Gives a list of warning and error messages
    • Errors will cause compiles to fail
    • Warnings indicate a problem with the current design
      • Try to fix all of these, since later they can lead to problems
      • Use check_design -summary or check_design -no_warnings to limit the number of warnings given
    • Use check_timing to locate potential timing problems
Analyzing your Design [1]
  • There are several commands to analyze your design
    • report_design
      • display characteristics of the current design
      • operating conditions, wire load model, output delays, etc.
      • parameters used by the design
    • report_area
      • displays area information for the current design
      • number of nets, ports, cells, references
      • area of combinational logic, non-combinational, interconnect, total
Analyzing Your Design [2]
    • report_hierarchy
      • displays the reference hierarchy of the current design
      • tells modules/cells used and the libraries they come from
    • report_timing
      • reports timing information about the design
      • default shows one worst case delay path
    • report_resources
      • Lists the resources and datapath blocks used by the current design
    • Can send reports to files
      • report_resources > cmult_resources.rpt
  • Lots of other report commands available
Synthesis Scripts
  • Synthesis scripts provide a convenient method for performing synthesis multiple times
  • To run the script, enter the directory which contains the Verilog code and type:
    • dc_shell -tcl_mode -f script.tcl
    • dc_shell -tcl_mode -f script.tcl > log.txt &
      • This will start the script and store its output to log.txt

Example Synthesis Script

analyze -library WORK -format verilog {./register_file_behave.v}
elaborate reg_file_behave -architecture verilog -library WORK
create_clock -name "clk" -period 2 -waveform {0 1} {clk}
set_dont_touch_network [ find clock clk ]
set_max_area 30000
check_design
uniquify
compile -map_effort medium
report_area > area_report.txt
report_timing > timing_report.txt
report_constraint -all_violators > violator_report.txt

Design Optimization: FIR Filter
  • Used in signal processing
  • Passes through some data but not all (filter!)
  • Example: Remove noise from image/sound
  • Uses multipliers and adders
  • Multiply constant “tap” value against time-delayed input value
  • In the Verilog, y is out, bk is taps, and x is data
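
As a reference point for the Verilog that follows, here is a minimal Python model of the standard FIR equation y[n] = sum over k of b_k * x[n-k]. It is a sketch under two assumptions: samples before the start of the input are treated as zero, and Python integers are unbounded, so unlike the 8-bit hardware nothing wraps around.

```python
def fir(samples, taps):
    """y[n] = sum_k taps[k] * x[n-k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * samples[n - k]   # tap k meets the input delayed by k
        out.append(acc)
    return out

# four taps of 1 give a running sum over the last four samples
moving_sum = fir([1, 2, 3, 4, 5], [1, 1, 1, 1])
# a single 1 at the input reproduces the tap values at the output
impulse = fir([1, 0, 0, 0, 0], [4, 3, 2, 1])
print(moving_sum, impulse)
```

Each output needs one multiply per tap plus the additions, which is why the three implementations below differ mainly in how many multipliers they build and how they schedule them.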
Design Optimization: FIR Filter
  • We’ll look at three different approaches to implementing this filter
    • “Initial”
    • “Small”
    • “Fast”
  • We’ll revisit the idea of re-architecting algorithms for better area, latency, and throughput later.
  • As an exercise, you should take some time on your own to try to understand exactly what is happening in each of the following code segments.
  • Learning to read and understand someone else’s (confusing) code is an extremely valuable skill
Initial Design: Code [1]

module fir_init(clk, rst, in, out);
  parameter bitwidth = 8;
  parameter ntaps = 4;
  parameter logntaps = 2;

  input clk, rst;
  input [bitwidth-1:0] in;
  output reg [bitwidth-1:0] out;

  reg [bitwidth-1:0] taps [0:ntaps-1];
  reg [bitwidth-1:0] data [0:ntaps-1];
  reg [logntaps:0] count;
  integer i;

Initial Design: Code [2]

  always @(posedge clk) begin
    if (rst) begin
      // indicate we need to load all the tap values
      count <= 0;
      // reset the data and taps
      for (i = 0; i < ntaps; i = i + 1) begin: resetloop
        data[i] <= 0; taps[i] <= 0;
      end
    end
    else if (count < ntaps) begin
      // we need to load the tap values before filtering
      for (i = ntaps-1; i > 0; i = i - 1) begin: loadtaps
        taps[i] <= taps[i-1];
      end
      // load the new value at tap[0]
      taps[0] <= in;
      count <= count+1;
    end

Initial Design: Code [3]

    else begin
      // ready to do the filtering
      // first shift in the new input data value
      for (i = ntaps-1; i > 0; i = i - 1) begin: shiftdata
        data[i] <= data[i-1];
      end
      // load the new value at data[0]
      data[0] <= in;
    end // else: !if(count < ntaps)
  end // always @ (posedge clk)

  // compute the filtered result
  always @(*) begin
    out = 0;
    for (i = 0; i < ntaps; i = i + 1) begin: filterloop
      out = out + (data[i] * taps[ntaps-1 - i]);
    end
  end
endmodule

Initial Design: Synthesis
  • Constraints
    • CLK_PERIOD 4
    • INPUT_DELAY 0.2
    • OUTPUT_DELAY 0.2
    • MAX_AREA 8000
  • Results
    • Arrival Time 3.13
    • Slack .67 (MET)
    • Area 7335

Should we make our constraints more aggressive?

Small Design: Code [1]

module fir_area(clk, rst, in, out);
  parameter bitwidth = 8;
  parameter ntaps = 4;
  parameter logntaps = 2;

  input clk, rst;
  input [bitwidth-1:0] in;
  output reg [bitwidth-1:0] out;

  reg [bitwidth-1:0] taps [0:ntaps-1];
  reg [bitwidth-1:0] data [0:ntaps-1];
  reg [bitwidth-1:0] partial;
  reg [logntaps:0] count;
  reg [logntaps-1:0] step;
  reg ready; // indicates ready to filter
  integer i;

Small Design: Code [2]

  always @(posedge clk) begin
    if (rst) begin
      // indicate we need to load all the tap values
      count <= 0; ready <= 0;
      // reset the data and taps
      for (i = 0; i < ntaps; i = i + 1) begin: resetloop
        data[i] <= 0; taps[i] <= 0;
      end
    end
    else if (count < ntaps && ~ready) begin
      // we need to load the tap values before filtering
      for (i = ntaps-1; i > 0; i = i - 1) begin: loadtaps
        taps[i] <= taps[i-1];
      end
      // load the new value at tap[0]
      taps[0] <= in;
      count <= count+1;
      // the final tap arrives when count == ntaps-1
      // (count >= ntaps cannot occur inside this branch, so ready would never be set)
      if (count >= ntaps-1) begin ready <= 1; count <= 0; end
    end

Small Design: Code [3]

    else begin
      // ready to do the filtering
      // first shift in the new input data value
      for (i = ntaps-1; i > 0; i = i - 1) begin: shiftdata
        data[i] <= data[i-1];
      end
      // load the new value at data[0]
      data[0] <= in;
    end // else: !if(count < ntaps)
  end // always @ (posedge clk)

Small Design: Code [4]

  // compute the filtered result
  always @(posedge clk) begin
    if (rst || ~ready) begin step <= 0; partial <= 0; end
    else begin
      if (step == 0) begin
        out <= partial;
        partial <= (data[0] * taps[ntaps-1]);
      end
      else begin
        out <= out;
        partial <= partial + (data[step] * taps[ntaps - 1 - step]);
      end
      if (step < ntaps-1) step <= step + 1;
      else step <= 0;
    end
  end
endmodule

Small Design: Synthesis
  • Constraints
    • CLK_PERIOD 4
    • INPUT_DELAY 0.2
    • OUTPUT_DELAY 0.2
    • MAX_AREA 8000
  • Results
    • Arrival Time 2.76 (vs. 3.13)
    • Slack .92 (MET) (4 clock cycles)
    • Area 5754 (vs. 7335)
  • What are the tradeoffs?
Fast Design: Code [1]

module fir_fast(clk, rst, in, out);
  parameter bitwidth = 8;
  parameter ntaps = 4;
  parameter logntaps = 2;

  input clk, rst;
  input [bitwidth-1:0] in;
  output [bitwidth-1:0] out;

  reg [bitwidth-1:0] taps [0:ntaps-1];
  reg [bitwidth-1:0] mult [0:ntaps-1];
  reg [bitwidth-1:0] partial [0:ntaps-1];
  reg [logntaps:0] count;
  reg ready; // indicates ready to filter
  integer i;

  assign out = partial[ntaps-1];

Fast Design: Code [2]

  always @(posedge clk) begin
    if (rst) begin
      // indicate we need to load all the tap values
      count <= 0;
      // ready is never set in this design; clear it so the test below
      // does not depend on an uninitialized register
      ready <= 0;
      // reset the taps
      for (i = 0; i < ntaps; i = i + 1) begin: resetloop
        taps[i] <= 0;
      end
    end
    else if (count < ntaps && ~ready) begin
      // we need to load the tap values before filtering
      for (i = ntaps-1; i > 0; i = i - 1) begin: loadtaps
        taps[i] <= taps[i-1];
      end
      // load the new value at tap[0]
      taps[0] <= in;
      count <= count+1;
    end

Fast Design: Code [3]

    else begin
      // taps stay the same
    end // else: !if(count < ntaps)
  end // always @ (posedge clk)

  // compute the filtered result (pipelined)
  always @(posedge clk) begin
    // get the product of the input with each of the tap values
    for (i = 0; i < ntaps; i = i + 1)
      mult[i] <= in * taps[i];
    // special case at front
    partial[0] <= mult[0];
    // get the partial sums for the rest
    for (i = 1; i < ntaps; i = i + 1)
      partial[i] <= partial[i-1] + mult[i];
  end
endmodule
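
One way to work out what the pipelined code does is to model it cycle by cycle and compare it with the initial design. The Python sketch below does that under stated assumptions: all registers start at 0 (the Verilog leaves mult and partial unreset), ready stays 0 so tap loading is governed by count alone, and every value wraps at 8 bits. The function names and the stimulus are illustrative.

```python
NTAPS = 4
MASK = 0xFF  # bitwidth = 8, so every register wraps modulo 256

def run_init(inputs):
    """Cycle model of fir_init: value of out after each clock edge."""
    taps, data, count, outs = [0] * NTAPS, [0] * NTAPS, 0, []
    for x in inputs:
        if count < NTAPS:              # tap-loading phase
            taps = [x] + taps[:-1]
            count += 1
        else:                          # filtering phase
            data = [x] + data[:-1]
        acc = 0
        for i in range(NTAPS):         # combinational sum of products
            acc = (acc + data[i] * taps[NTAPS - 1 - i]) & MASK
        outs.append(acc)
    return outs

def run_fast(inputs):
    """Cycle model of fir_fast: value of out after each clock edge."""
    taps, mult, partial = [0] * NTAPS, [0] * NTAPS, [0] * NTAPS
    count, outs = 0, []
    for x in inputs:
        # nonblocking assignments: next values come from the old state
        new_mult = [(x * t) & MASK for t in taps]
        new_partial = [mult[0]] + [(partial[i - 1] + mult[i]) & MASK
                                   for i in range(1, NTAPS)]
        if count < NTAPS:
            taps = [x] + taps[:-1]
            count += 1
        mult, partial = new_mult, new_partial
        outs.append(partial[NTAPS - 1])
    return outs

# four tap values (1..4) followed by eight data samples (5..12)
stimulus = list(range(1, 13))
init_out = run_init(stimulus)
fast_out = run_fast(stimulus)
print(init_out)
print(fast_out)
```

In this model, once the pipeline has flushed the values left over from tap loading (from cycle 8 onward with this stimulus), fir_fast produces the same sequence as fir_init delayed by one clock: the shorter critical path is paid for with extra latency and pipeline registers.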

Fast Design: Synthesis
  • Constraints
    • CLK_PERIOD 4
    • INPUT_DELAY 0.2
    • OUTPUT_DELAY 0.2
    • MAX_AREA 8000
  • Results
    • Arrival Time 1.92 (vs. 3.13)
    • Slack 1.82 (MET) (1 clock cycle!*)
    • Area 7311 (vs. 7335)

What are the tradeoffs?
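
A rough way to compare the three designs is to estimate the best throughput each could reach. The sketch below assumes that the minimum clock period is about CLK_PERIOD minus the reported slack, that times are in ns, and that the initial and fast designs produce one output per clock while the small design needs four clocks per output. Re-synthesizing with a tighter clock would change the numbers, so treat the results as estimates only.

```python
designs = {
    # name: (clock period, reported slack, clock cycles per output)
    "initial": (4.0, 0.67, 1),
    "small":   (4.0, 0.92, 4),
    "fast":    (4.0, 1.82, 1),
}

def samples_per_us(period, slack, cycles_per_output):
    min_period = period - slack                 # estimated fastest usable clock, ns
    return 1000.0 / (min_period * cycles_per_output)

rates = {name: samples_per_us(*params) for name, params in designs.items()}
for name, rate in rates.items():
    print(f"{name}: about {rate:.0f} outputs per microsecond")
```

Under these assumptions the small design saves area but delivers far fewer outputs per unit time, while the fast design gains throughput at nearly the same area as the initial design.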

Optimization Strategies
  • Area vs. Delay - Often only really optimize for one
    • “Fastest given an area constraint”
    • “Smallest given a speed constraint”
  • Design Compiler Reference Manual has several pointers on synthesis settings for these goals
  • In some ways, synthesis is as much an art as it is a science
  • Experiment with different options to see how they interact with each other
Design Examples
  • All using same constraints
  • No special synthesis options
  • Can get even more dramatic results by combining:
    • Coding style
    • Tight constraints
    • Synthesis optimization options
Script

analyze -library WORK -format verilog {fir_area.v}
elaborate fir_area -architecture verilog -library WORK
create_clock -name "clk" -period 4 {clk}
set_dont_touch_network [ find clock clk ]
set_max_area 5000
set NORM_INPUTS [remove_from_collection [all_inputs] "clk rst"]
#set NORM_INPUTS [remove_from_collection [all_inputs] "clk"]
set_input_delay 0.2 -max -clock clk $NORM_INPUTS
set_output_delay 0.2 -max -clock clk [all_outputs]
check_design > check_design.txt
uniquify
#compile -map_effort medium -area_effort medium
compile -map_effort high -area_effort high
compile_ultra
report_area > area_report.txt
report_timing > timing_report.txt
report_constraint -all_violators > violator_report.txt
exit

Want more information about any of the Design Vision commands listed in these lectures?
Log in to a CAE computer and type:

dc_shell
man command_name