Deep Submicron Logic / Layout Synthesis


# Deep Submicron Logic / Layout Synthesis - PowerPoint PPT Presentation

Deep Submicron Logic / Layout Synthesis. 1999. 11. Jun Dong Cho, Sungkyunkwan Univ., Dept. ECE. Mail : Jdcho@skku.ac.kr. Homepage : vada.skku.ac.kr. Agenda: Design Methodology; Recent Approaches in Logic / Layout Synthesis; EDA Vendor & Their Tools; Conclusion.


### Deep Submicron Logic / Layout Synthesis

1999. 11

Jun Dong Cho

Sungkyunkwan Univ. Dept. ECE

Mail : Jdcho@skku.ac.kr

Agenda
• Design Methodology
• Recent Approaches in Logic / Layout Synthesis
• EDA Vendor & Their Tools
• Conclusion
Design Methodology
• Introduction : DSM Design Dilemma
• Current Design Methodology
• Recent Approaches in Design Methodology
• Floorplan Approach
• Super Glue Approach
• Fixed Timing Approach
• Simultaneous Optimization Approach
Introduction : DSM Design Dilemma
• As physical feature sizes decrease, the time delay of electrical signals traveling in the interconnect between active devices and gates is approaching the delay through the devices and gates. Therefore, the parasitic information (resistance and capacitance) of the interconnect is absolutely critical to predicting circuit performance.
• The key to solving this problem is knowing more about the physical design, i.e. placement and estimated interconnect, early in the design cycle.
• Iterations between synthesis and layout increase dramatically due to timing and routability problems.
• Most current VLSI tools cannot properly handle the new problems raised by deep submicron technology, such as accurate RC extraction, transmission line effects and coupling effects. Even the models used for VLSI ASIC design, such as timing delay, routability, size and power dissipation, will need to be modified or improved.
Introduction : DSM Design Dilemma
• Although there is no official definition of deep submicron, the term generally refers to CMOS devices whose minimum logic gate length is 0.5 um or smaller. Deep submicron technology gives chip manufacturers the ability to put more gates on a chip and increase its density, making chips more powerful and smaller.
• In non-submicron integrated circuits that do not require high clock speed, minimum-width lines can be used for clock distribution. Since the difference in logic gate delays along the signal paths dominates the clock skew, wire length does not affect the skew much, and interconnect wire delay is not a big issue. Under these conditions, the rule of thumb is to use the same number of identical buffers on each signal path, so that every component experiences the same logic gate delay.
Floorplan Approach
• Existing EDA Vendor
• Particularly emphasize the floorplan
• Iterations between different tools
• No flexibility to fix timing problems caused by long wires
• Overly constrained timing budgets
• May fail at timing closure
• Adds many buffers and oversizes gates on critical nets.
Super Glue Approach
• Attempts to glue Front End and Back End
• Performs floorplan, placement, routing, timing verification in advance
• Optimize few variables
• Merely moving the closure issue
• Little correlation with final Back End
• Difficult feedback to Front End
Fixed Timing Approach
• Attempts to break the problem
• Early aggressive timing optimization
• Based on simple conservative models
• Timing is set, back-end left for later
• Sub-Optimal area and power results
• Trades one problem for another
• Difficult to extend to other optimizations
Simultaneous Layout Optimization Approach
• Simultaneous Placement and Global Optimization
• Placement
• Routing
• Timing
• Logic Optimization
• Clocks
• Power
• Crosstalk
Recent Approaches in Logic / Layout Synthesis
• Layout Driven Logic Synthesis
• Post Routing Optimization
• Post-Routing Optimization with Routing Characterization[ISPD`99]
• Congestion Minimization
• On The Behavior of Congestion Minimization During Placement[ISPD`99]
• Control Logic Layout Synthesis
• C5M- A Control-Logic Layout Synthesis System

• PTL Logic / Layout Synthesis
• Wave Steering in YADDs : A Novel Non-Iterative Synthesis and Layout Technique[ISPD`99]

Layout Driven Logic Synthesis (1) Main Feature
• Adopts the opposite approach to conventional logic synthesis : logic is synthesized to optimize only for interconnect delay, ignoring the effect of gate delays.
• Based on the simple observation that if an output “o” depends on an input “I”, then the best way to connect “I” to “o” is through a path which is monotonic from “I” to “o” : no diversions in the path from “I” to “o”.
• Conventional logic synthesis can produce a circuit for which it is impossible to find a placement with no diversions in the input-output paths.
• “Illegal node” : a node is illegal if it can not be placed somewhere on the die without causing a diversion in the circuit.
• The proposed approach has the advantage that it still maintains a distinction between the logic synthesis and place & route stages. It does not need to couple synthesis and placement tightly by frequently alternating between the two, which can be inefficient and may not converge at all.
Layout Driven Logic Synthesis (2) Problem Description
• Assumption
• The die is represented by a rectangle of given width and height.
• The given logic circuit is pin-assigned.
• The delay of a path is a linear function of its length (in general, interconnect delay depends quadratically on the length of the interconnect, but it can be made linear by buffer insertion and wire sizing).
• I/O Pin-to-Pin delay model (IP-based synthesis vs slack-based synthesis)
• Particularly suited for intellectual property (IP) blocks.
• Arrival times at the pins are not known in advance. Thus, we aggressively minimize the delays for all I/O paths.
• Objective Function : Minimize the longest path of the circuit
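The linear-delay assumption above can be illustrated with a small sketch. It assumes the textbook distributed-RC (Elmore) estimate of 0.5·r·c·L² for an unbuffered wire and folds all buffer effects into a single fixed delay `t_buf`; the per-unit values and the fixed buffer spacing are hypothetical numbers chosen only for illustration, not taken from the slides.

```python
def unbuffered_delay(r, c, length):
    """Elmore estimate for a distributed RC wire: 0.5 * r * c * length^2."""
    return 0.5 * r * c * length ** 2


def buffered_delay(r, c, length, segments, t_buf):
    """Delay of a wire split into equal segments, each driven by one buffer.

    Simplification (assumption): buffer input capacitance and output
    resistance are folded into the constant t_buf.
    """
    seg = length / segments
    return segments * (unbuffered_delay(r, c, seg) + t_buf)


# Hypothetical technology numbers, for illustration only.
R_PER_MM = 100.0      # ohm per mm
C_PER_MM = 0.2e-12    # farad per mm
T_BUF = 50e-12        # seconds per buffer
SEG_LEN_MM = 1.0      # fixed distance between buffers


def delay_with_fixed_spacing(length_mm):
    """Insert one buffer per SEG_LEN_MM of wire, so delay grows linearly."""
    segments = max(1, round(length_mm / SEG_LEN_MM))
    return buffered_delay(R_PER_MM, C_PER_MM, length_mm, segments, T_BUF)


for length in (2.0, 4.0, 8.0):
    print(length, unbuffered_delay(R_PER_MM, C_PER_MM, length),
          delay_with_fixed_spacing(length))
```

Doubling the length quadruples the unbuffered estimate but only doubles the buffered one, which is the behavior the linear path-delay model relies on.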
Layout Driven Logic Synthesis (3)
• Synthesis Network
• Not good for placement (a long path exists).
• Placement Tool places node z to minimize the longest path from b to y1 & y2.
• Decomposed Network
• y2 is independent of b, therefore b can be removed from the support set of y2
• Path from c to y1 is greater than its Manhattan distance
• Optimal Placement Synthesis
• Each node has a short Manhattan distance
• Aim is to guide logic synthesis such that it produces a circuit which is good for placement
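The "no diversion" idea can be checked numerically: a path is monotonic when its length equals the Manhattan distance between its endpoints. The sketch below assumes each hop is routed as a shortest rectilinear connection; the node names and coordinates are hypothetical.

```python
def manhattan(p, q):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def path_length(path, placement):
    """Sum of hop lengths along a path of node names."""
    return sum(manhattan(placement[a], placement[b])
               for a, b in zip(path, path[1:]))


def is_monotonic(path, placement):
    """True when the path has no diversion between its two endpoints."""
    direct = manhattan(placement[path[0]], placement[path[-1]])
    return path_length(path, placement) == direct


# Hypothetical placements of an input c, an internal node z and an output y1.
diverted = {"c": (0, 0), "z": (3, 1), "y1": (2, 4)}
monotone = {"c": (0, 0), "z": (1, 2), "y1": (2, 4)}

print(is_monotonic(["c", "z", "y1"], diverted))   # path length 8, direct distance 6
print(is_monotonic(["c", "z", "y1"], monotone))   # path length 6, direct distance 6
```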
Layout Driven Logic Synthesis (4) Constraint Generation
• Region Placement Constraint
• Partition the die into rectangles along the pin positions
• Label each region with the functions that can be placed in it
• Each region is labeled with a set of placement constraints (support set & transitive primary outputs).
• r3: {c,d} is the support of y2, and {a,b} is the support of y1.

• Node Placement Constraint
• Label each node with a placement constraint
• The node placement constraint of node n denotes the support set of n & its transitive primary outputs
• Can be easily computed by traversing the Boolean network in BFS manner.
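A minimal sketch of the node placement constraint, assuming the network is given as a fanin map. For simplicity it runs one breadth-first search per node in each direction rather than a single shared traversal; the network and its node names are hypothetical.

```python
from collections import deque


def reachable(start, edges):
    """Breadth-first search: every node reachable from start (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def node_constraints(fanins, primary_inputs, primary_outputs):
    """Map each node to (support set, transitive primary outputs)."""
    fanouts = {}
    for node, sources in fanins.items():
        for src in sources:
            fanouts.setdefault(src, []).append(node)
    constraints = {}
    for node in fanins:
        support = reachable(node, fanins) & primary_inputs
        outputs = reachable(node, fanouts) & primary_outputs
        constraints[node] = (support, outputs)
    return constraints


# Hypothetical network: n1 = f(x1, x2), o1 = g(n1, x3), o2 = h(x3, x4).
fanins = {"n1": ["x1", "x2"], "o1": ["n1", "x3"], "o2": ["x3", "x4"]}
constraints = node_constraints(fanins, {"x1", "x2", "x3", "x4"}, {"o1", "o2"})
for node, (support, outputs) in constraints.items():
    print(node, sorted(support), sorted(outputs))
```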
Layout Driven Logic Synthesis (5) Legalize & Synthesis
• Make Legal
• Legal : the node n is legal if there is a region r where n can be placed.
• Minimize the number of new Boolean nodes created.
• Traverse the Boolean network in reverse topological order (a node is visited after its fanouts have been visited)
• Sees an illegal node.
• Collapse the node into its fanouts until the node becomes legal.
• Constraint-Driven Synthesis
• Optimize the network such that we get a minimum literal legal Boolean network.
• Fast Extract : in every iteration, finds the two-cube divisor or two-literal cube that removes the most literals.
• Resubstitution : a node n is resubstituted into another node x if n divides x (and the legality of n is preserved)
• Produces a legal Boolean network, which can be placed s.t. every path is monotonic.
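The "Make Legal" traversal can be sketched structurally. The code below tracks connectivity only (the logic functions themselves are ignored), collapses an illegal node completely into its fanouts in one step, and takes the legality test as a caller-supplied predicate. The `is_legal` callback, the example network and the choice of which node is illegal are all assumptions made for illustration, not the method from the slides.

```python
def reverse_topological(fanins, primary_outputs):
    """Order internal nodes so that each node comes after all of its fanouts."""
    order, seen = [], set()

    def visit(node):
        if node in seen or node not in fanins:
            return
        seen.add(node)
        for src in fanins[node]:
            visit(src)
        order.append(node)          # post-order: fanins first

    for po in sorted(primary_outputs):
        visit(po)
    return order[::-1]              # reversed: fanouts first


def collapse_illegal(fanins, primary_outputs, is_legal):
    """Collapse every illegal internal node into its fanouts (structure only)."""
    net = {node: list(srcs) for node, srcs in fanins.items()}
    for node in reverse_topological(net, primary_outputs):
        if node in primary_outputs or is_legal(node, net):
            continue
        for other, srcs in net.items():
            if node in srcs:
                # The fanout now reads the collapsed node's fanins directly.
                kept = [s for s in srcs if s != node]
                added = [s for s in net[node] if s not in srcs]
                net[other] = kept + added
        del net[node]
    return net


# Hypothetical network in which node n1 is declared illegal.
fanins = {"n1": ["a", "b"], "o1": ["n1", "c"]}
legal_net = collapse_illegal(fanins, {"o1"}, lambda node, net: node != "n1")
print(legal_net)
```

A real implementation would recompute each node's placement constraint after every collapse and stop as soon as the node fits some region.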
Post Routing Optimization (1) Main Feature [Avante’99]
• The delay due to wire-routing parasitics becomes non-negligible below 0.25um
• Traditional back-annotation approach can’t solve the timing problem because of inaccurate delay estimation
• Many iterations occur between synthesis and layout
• Timing convergence is not guaranteed
• Timing Fidelity Before and After Routing
• Minimal rectilinear spanning tree or Steiner tree is usually used to estimate the wire load and delay for the interconnects of a placement
• VDSM Design
• Routing congestion is more severe, so wires have to detour
• Coupling effect is usually large
• Timing discrepancies between pre- and post-routing
• Main cause of Timing discrepancy
• Coupling effect
• Routing pattern
Post Routing Optimization (2) Coupling Effect & Routing Pattern Effect
• Coupling Effect
• Increase signal delay
• Introduces noise over neighboring wires
• Assumes coupling capacitance exists only between neighboring parallel wires
• Routing Pattern Effect
• MST,MRST : lower bound of total wire length
• The more routing congestion, the more nets detour, and the larger the timing discrepancy
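As an illustration of the pre-route estimate, the sketch below computes the length of a minimum spanning tree over a net's pins under the Manhattan metric (Prim's algorithm), then compares it with a routed length to expose detouring. The pin coordinates and the routed length are hypothetical.

```python
def manhattan(p, q):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def mst_length(pins):
    """Prim's algorithm on the complete graph with Manhattan edge weights."""
    if len(pins) < 2:
        return 0
    in_tree = {0}
    best = [manhattan(pins[0], p) for p in pins]   # cheapest link into the tree
    total = 0
    while len(in_tree) < len(pins):
        nxt = min((i for i in range(len(pins)) if i not in in_tree),
                  key=lambda i: best[i])
        total += best[nxt]
        in_tree.add(nxt)
        for i, p in enumerate(pins):
            if i not in in_tree:
                best[i] = min(best[i], manhattan(pins[nxt], p))
    return total


def detour_ratio(routed_length, pins):
    """Routed wire length relative to the spanning-tree estimate."""
    return routed_length / mst_length(pins)


pins = [(0, 0), (4, 0), (4, 3), (0, 3)]   # hypothetical 4-pin net
print(mst_length(pins))                   # spanning-tree estimate
print(detour_ratio(13, pins))             # hypothetical routed length of 13
```

A ratio well above 1 on many nets in a region is one sign that congestion is forcing detours that the pre-route estimate did not see.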
Post Routing Optimization (3) Routing Characterization
• Coupling characterization
• Divide layout floorplan into 3D routing plan of small regions
• Routing congestion
Post Routing Optimization (4) Routing Pattern Prediction
• A routing pattern prediction is required to predict which regions the final routing will go through.
• This prediction can be passed down to a detailed router to guide the final routing.
• Routing Pattern Problem(RPP)
• Given a routing graph
• Find set S of connected regions which cover all terminals of this net with objective to
• Subject to capacity constraints
Post Routing Optimization (5) Main Feature
• Cluster Selection
• Seed selection : choose gates in the range of critical slack.
• Selection criteria : 1) criticality of the gate, 2) difference among the arrival times of the inputs to the gate, 3) number of fan-ins and fan-outs, 4) congestion of the neighboring area.
• Grouping : cluster the instances adjacent to the selected seed to form a partition
• User-specified window size is given to control the logic change within a localized area.
• Incremental placement & logic optimization is performed.
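A deliberately reduced sketch of cluster selection: seeds are picked by slack alone (only the first of the four criteria), and grouping collects every instance inside a square window around the seed. The slack threshold, window size, gate names and coordinates are hypothetical.

```python
def select_seeds(slack, threshold):
    """Gates whose slack is at or below the threshold, most critical first."""
    critical = [gate for gate, s in slack.items() if s <= threshold]
    return sorted(critical, key=lambda gate: slack[gate])


def grow_cluster(seed, position, window):
    """All instances within a square window centered on the seed."""
    sx, sy = position[seed]
    return {gate for gate, (x, y) in position.items()
            if abs(x - sx) <= window and abs(y - sy) <= window}


# Hypothetical timing and placement data.
slack = {"g1": -0.4, "g2": 0.1, "g3": -0.1, "g4": 0.9}
position = {"g1": (10, 10), "g2": (11, 12), "g3": (30, 5), "g4": (13, 10)}

seeds = select_seeds(slack, threshold=0.0)
clusters = {seed: grow_cluster(seed, position, window=2) for seed in seeds}
print(seeds)
print(clusters)
```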
Post Routing Optimization (6) Routing Characterization
• Routing Characterization / Prediction

• When a cluster is transformed and placed, the routing of changed nets will be predicted based on the characterization
• Timing analyzer may use the coupling capacitance to estimate the timing after routing
• If the change improves the timing, it is committed and the routing tree is updated.
Congestion Minimization (1) Main Feature
• Automated cell placement for VLSI circuits has always been a key factor for achieving designs with optimized area usage, wiring congestion and timing behavior. As technology advances, the congestion problem becomes more important.
• Congestion in a layout means too many nets are routed in local regions.
• With the advent of over-the-cell routing the goal of every place and route methodology has been to utilize area to prevent spilling of routes into channels.
• Multiple routing layers have enough routing resources to route most wires as long as there are not too many wires congested in the same region.
• Excessive congestion will result in a local shortage of the routing resources.
Congestion Minimization (2) Definition of Congestion Cost
• Global bin (bin) : one of the rectilinear regions into which the chip is partitioned.
• Routing demand $d(e)$ : number of nets crossing global edge $e$
• Routing supply of a global edge $s(e)$ : function of the length of $e$ (fixed value)
• Overflow $\max(d(e) - s(e),\, 0)$ : amount by which the routing demand exceeds the supply
• Measure of congestion
• Total overflow of a placement
• Number of congested edges
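The two congestion measures can be computed directly from per-edge demand and supply. In this sketch overflow is taken as the demand in excess of supply, clamped at zero; the edge names and numbers are hypothetical.

```python
def overflow(demand, supply):
    """Routing demand in excess of supply on one global edge (never negative)."""
    return max(demand - supply, 0)


def congestion_metrics(demand, supply):
    """Return (total overflow, number of congested edges) for a placement."""
    per_edge = {edge: overflow(demand[edge], supply[edge]) for edge in demand}
    total = sum(per_edge.values())
    congested = sum(1 for value in per_edge.values() if value > 0)
    return total, congested


# Hypothetical global edges.
demand = {"e1": 5, "e2": 3, "e3": 9}
supply = {"e1": 4, "e2": 4, "e3": 6}
print(congestion_metrics(demand, supply))
```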
Congestion Minimization (3) Consistent Model
• Congestion cost is router dependent.
• Congestion depends on the wiring cost.
• “Consistent”
• Two routing models are defined to be “consistent” if the total weighted length of the routes is the same.
• A model for net $n$ consists of a set of segments $S_n$. Each segment $s \in S_n$ has a length $l_s$ and a weight $w_s$. The total weighted length for net $n$ is $L_n = \sum_{s \in S_n} w_s\, l_s$
• Total weighted length for all nets is $L = \sum_n L_n$
Congestion Minimization (4) Congestion Distribution
• Correlation between Wirelength and Congestion
• Total wirelength of a layout is equal to the total routing demand on all global edges
• Maximum routing demand is greater than or equal to the average routing demand, i.e. the total wirelength divided by the number of global edges.

• Theoretical analysis
• Expected number of wires crossing global edge e
Congestion Minimization (6) Objective Function
• An effective congestion objective should be sensitive to placement moves and directly related to the congestion cost
• Objective function
• Suppose the routing demand of e is before a move and after the move.
• Direct overflow cost of this move
• Cost = 0 when
• Cost is close to when
• Don’t care(no congestion) when
Control Logic Layout Synthesis (1) Main Feature
• High-performance control logic is sometimes implemented via custom (manual) layout. Custom layout methods result in good electrical and area characteristics.
• Productivity is very poor.
• Using custom design for control logic is often a high-risk strategy because the reaction time to changes is long.
• Standard-cell and other fixed-library ASIC-like methods are often employed for control logic
• Design turn around time using these methods is very fast and top-down constraints are accommodated well.
• Overhead required to create the fixed cell library is substantial.
• A poor timing/area/power tradeoff can occur.
• C5M : a new layout system for high-performance control logic which has been successfully used in the design of a 400 MHz IBM processor.
Control Logic Layout Synthesis (2) C5M Approach
• C5M generates hierarchical row-based macros for static CMOS logic.
• Schematic independence and device-sizing tuning are accomplished via on-the-fly leaf-cell synthesis
• Flow
• The macro HDL description is compiled into a gate-level schematic via logic synthesis. The synthesis target library consists of parameterized gate schematics and delay rules(no layout data)
• Performance is optimized through manual or automatic device-size tuning
• The tuned schematic is restructured for cell generation through gate combining and splitting
• The leaf cells are synthesized to a macro-specific cell image
• The macro is assembled according to macro image
Control Logic Layout Synthesis (3) Leaf Cell Generation
• The leaf-cell schematic is converted into a symbolic layout using CCC(IBM cell compiler)
• CCC operates by first splitting the devices in the schematic according to the maximum finger size, using selectable split strategies.
• Placement engine accommodates multiple objectives like minimum diffusion breaks, maximum gate alignment, minimum wire length, minimum number of contacts etc.
• The symbolic layout is converted into a physical layout using CC(IBM layout compactor)
• Uses the constraint-based, 1D model.
• Constraint-graph generation
• Critical-path analysis
• Wire-minimization
• Cell-image : the result of C5. It ensures that the cells can be readily assembled: cell boundaries are regularized to enable cell abutment, and cell wiring is controlled to facilitate macro wiring.
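Constraint-based 1D compaction is commonly solved as a longest-path problem on the constraint graph, and the sketch below shows that generic formulation. It is not a description of the IBM compactor; the element names and spacing values are hypothetical.

```python
from collections import defaultdict, deque


def compact_1d(elements, constraints):
    """Place elements as far left as the spacing constraints allow.

    Each constraint (a, b, d) means x[b] - x[a] >= d. The minimum legal
    coordinate of every element is its longest-path distance from the
    sources of the constraint graph, found here in topological order.
    """
    successors = defaultdict(list)
    indegree = {e: 0 for e in elements}
    for a, b, d in constraints:
        successors[a].append((b, d))
        indegree[b] += 1

    x = {e: 0 for e in elements}
    queue = deque(e for e in elements if indegree[e] == 0)
    visited = 0
    while queue:
        a = queue.popleft()
        visited += 1
        for b, d in successors[a]:
            x[b] = max(x[b], x[a] + d)     # tightest constraint wins
            indegree[b] -= 1
            if indegree[b] == 0:
                queue.append(b)
    if visited != len(elements):
        raise ValueError("constraint graph contains a cycle")
    return x


# Hypothetical layout elements and minimum separations.
elements = ["left_edge", "gate", "contact", "right_edge"]
constraints = [
    ("left_edge", "gate", 2),
    ("gate", "contact", 3),
    ("left_edge", "contact", 4),
    ("contact", "right_edge", 2),
]
positions = compact_1d(elements, constraints)
print(positions)
```

The chain of constraints that determines `right_edge` is the critical path; elements off that path have slack that a wire-minimization pass can use.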
Control Logic Layout Synthesis (4) Macro Assembly
• C5M uses a row-based macro assembly style
• Placement is performed by an IBM ChipPlace:Qplace(Quadratic programming model)
• Not restricted to row-based models.
• Timing driven placement support
• A number of functions for controlling cell placement through constraints or objectives
• Signal wiring is created using IBM LGWire (maze router)
• Macro Image
• Controls the top level physical design
• Specifies pin assignments, bussing structure, macro shape, macro wiring porosity, row structure and configuration of special sub-macros
• Size and pin data are automatically imported from the floorplan, and the macro image is constructed by an automatic utility that is parameterized with respect to the mask levels
• Bussing structure is a grid
• Power/Ground(M1), Vertical Wire(M2), Horizontal Wire(M3)
PTL Logic / Layout Synthesis (1) Introduction
• Linearized, pseudo-symmetric binary decision diagram based synthesis of a function
• Can be directly mapped to pass transistor logic with very highly predictable delay and area
• Based on low granularity 2-phase pipelining
• Routing by abutment
• Avoiding interconnect related parasitics
• Delay : Cell delay
• Equalize the delays of the different paths to very small margins of spread
• Be able to “Wave Steer” the circuits
• The obvious limitation
• The size of layout can be more than the standard cell implementation’s
• In some cases the latency of our implementation can be more than that of the standard cell one though the clocking frequency can still be high because of the coexistence of multiple data waves
• Will not be good for feedback systems
• Will be good for data path circuits
PTL Logic / Layout Synthesis (2) Topology of Synthesized YADD Structure (1)
• LBDD
• Defined as an Ordered BDD which grows linearly in the number of nodes per level
• C2 and C3 in the 3rd level can be merged
• Not every function can be represented by LBDD
PTL Logic / Layout Synthesis (3) Topology of Synthesized YADD Structure (2)
• PSBDD(Pseudo-Symmetric BDD)
• Allows for multiple levels labeled with the same variable
• Created by repeated application of Shannon’s expansion
• Merging adjacent non-conflict nodes and/or join operation
• Has a regular structure and can be directly mapped to layout
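Shannon's expansion splits a function on one variable into two cofactors, and recombining them behaves like a 2:1 multiplexer selected by that variable. The sketch below checks this exhaustively on a small example; the function `f` and its variable names are hypothetical.

```python
from itertools import product


def cofactor(f, var, value):
    """Restrict f by fixing var to value (0 or 1)."""
    return lambda env: f({**env, var: value})


def shannon(f, var):
    """Rebuild f as a 2:1 MUX of its two cofactors, selected by var."""
    f0 = cofactor(f, var, 0)
    f1 = cofactor(f, var, 1)
    return lambda env: f1(env) if env[var] else f0(env)


def f(env):
    """Hypothetical example function: (a AND b) OR c."""
    return (env["a"] & env["b"]) | env["c"]


g = shannon(f, "a")
assignments = [dict(zip("abc", bits)) for bits in product((0, 1), repeat=3)]
print(all(f(env) == g(env) for env in assignments))
```

Applying the expansion repeatedly, one variable per level, yields the decision-diagram structure; a PSBDD additionally allows a variable to label more than one level.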
PTL Logic / Layout Synthesis (4) Topology of Synthesized YADD Structure (3)
• Generalization of the PSBDD and LBDD
• Unrestricted ordering of child nodes of a parent
• Two adjacent nodes in a level that can be merged
• Any leaf node must be present only at the lowest level of the structure
• Exterior don’t cares : Process of joining cofactors and repeating variables creates don’t cares which can be useful in the subsequent level
PTL Logic / Layout Synthesis (5) Topology of Synthesized YADD Structure (4)
• Exterior don’t care
• Two adjacent nodes in a level are in conflict and any reordering of the parent nodes cannot merge them : not solvable
• Assign some care values to exterior don’t care : merging is possible
• Interior don’t care
• When both parents are the same
• More powerful than exterior don’t care
PTL Logic / Layout Synthesis (6) Topology of Synthesized YADD Structure (5)
• Goal : generate YADD from a logic specification
• Cost function : minimize the number of levels of the YADD
• Input : blif / Output : YADD
• Variable selection : maximize the number of don’t care minterm pairs after the merging
• During any join operation, the algorithm tries to create more interior don’t cares
• Implementation
• Regular two-dimensional structure of the YADD : entire structure can be mapped directly to silicon by the simple expedient of replacing every node by a pass transistor logic MUX and an inverter
• Why inverter? : An n-FET pass transistor degrades a logic-high input signal
• Have faster rise and fall times
• Carry out voltage restoration and improve noise margin
• Size them selectively to equalize the different path delays
• Requirement for PTL circuits
• No output should remain floating for any combination of inputs
• There should be no sneak paths in the circuits
• For ‘safe buffer insertion’, no internal node should be left floating
PTL Logic / Layout Synthesis (8) Physical Layout Details (1)
• Why use 2 phase clock scheme?
• If all inputs are clocked simultaneously at a higher frequency so that many waves coexist in the structure, data will be corrupted
PTL Logic / Layout Synthesis (9) Physical Layout Details (2)
• Why use D-FF?
• Delay logic values by integer number of clock periods
• For a YADD of depth L, we have (L-1)/2 FFs at the root level and 0 at the lowest level
• The number of FFs increases by 1 every 2 levels
• FF Cell
• Skewing of input data to provide time alignment
• Compact, low power, dynamic shift register cell used
• Driver
• Convert from dynamic to static logic
• Subsequent inverter and static CMOS inverter pair
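The D-FF counts given under "Why use D-FF?" can be tabulated per level. The slides state only the two endpoints (0 at the lowest level, (L-1)/2 at the root) and the rate of one extra FF every two levels, so the exact levels at which the count steps up are an assumption of this sketch, as is the level numbering.

```python
def ffs_per_level(depth):
    """Number of input-skewing flip-flops at each level of the structure.

    Level 0 is the lowest level and level depth - 1 is the root.
    Assumption: the count steps up at every even-numbered level.
    """
    return [level // 2 for level in range(depth)]


for depth in (5, 7):
    counts = ffs_per_level(depth)
    print(depth, counts, "root:", counts[-1])
```

For an odd depth L the root entry equals (L-1)/2, matching the figure quoted above.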