
Algorithmic Techniques in VLSI CAD



Presentation Transcript


  1. Algorithmic Techniques in VLSI CAD Shantanu Dutt University of Illinois at Chicago

  2. Algorithms in VLSI CAD • Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement] • Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner] • Dynamic programming [e.g., matrix multiplication, optimal buffer insertion] • Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]

  3. Algorithms in VLSI CAD (contd) • Search Methods: • Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost generally determined at the global routing phase] • Breadth-first search (BFS): mainly used to find a soln at min. distance from the root of the search tree [e.g., maze routing when cost = dist. from root] • Best-first search (BeFS): used to find optimal solutions w/ any cost function; can be used when a provable lower bound of the cost can be determined for each branching choice from the “current partial soln node” [e.g., TSP, global routing] • Iterative Improvement: deterministic, stochastic

  4. Divide & Conquer • Determine if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner: • D&C approach: see if the problem can be “broken up” into 2 or more smaller subproblems that can be “stitched up” to give a soln to the parent prob • Do this recursively for each large subprob until subprobs are small enough for an “easy” solution technique (could be exhaustive!) • If the subprobs are of a similar kind to the root prob, then the breakup and stitching will also be similar [Figure: root problem A broken into subprobs A1 and A2, and recursively into A1,1, A1,2, A2,1, A2,2; do this recursively until the subprob size is s.t. TT-based design is doable; the solns to A1 and A2 are stitched up to form the complete soln to A]
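
To make the D&C pattern concrete, here is a minimal Python sketch using merge-sort (one of the examples named on slide 2): the break-up halves the list, and the stitch-up is the linear-time merge.

def merge_sort(a):
    """D&C: break the problem into two half-size subproblems,
    solve each recursively, then stitch the sub-solutions up."""
    if len(a) <= 1:          # subproblem small enough for an "easy" solution
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # stitch-up: merge the two sorted halves
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]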

  5. Reduce-&-Conquer • Flow: reduce problem size (coarsening) → solve → uncoarsen and refine solution • Examples: multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing
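
A toy end-to-end Python sketch of this multilevel flow for min-cut graph bisection; the random-matching coarsener, trivial coarsest-level solver, and single greedy refinement pass are stand-ins for the far stronger heuristics a real partitioner such as hMetis uses, and the ring graph at the bottom is assumed for illustration.

import random
random.seed(0)                                  # reproducible toy run

def coarsen(adj):
    """Contract a random matching: each coarse node covers 1-2 fine nodes."""
    order = list(adj); random.shuffle(order)
    mapping, used, cid = {}, set(), 0
    for u in order:
        if u in used:
            continue
        v = next((w for w in adj[u] if w not in used), None)
        for x in (u,) if v is None else (u, v):
            used.add(x); mapping[x] = cid
        cid += 1
    cadj = {c: {} for c in range(cid)}          # rebuild coarse adjacency
    for u in adj:
        for v, w in adj[u].items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:
                cadj[cu][cv] = cadj[cu].get(cv, 0) + w
    return cadj, mapping

def cutsize(adj, part):
    return sum(w for u in adj for v, w in adj[u].items() if part[u] < part[v])

def refine(adj, part):
    """One greedy FM-flavoured pass: move a node to the other side if that
    reduces the cut and keeps the two sides within one node of balance."""
    n, size1 = len(adj), sum(part.values())
    for u in adj:
        gain = sum(w if part[v] != part[u] else -w for v, w in adj[u].items())
        s1_after = size1 + (1 - 2 * part[u])
        if gain > 0 and abs(n - 2 * s1_after) <= 1:
            part[u], size1 = 1 - part[u], s1_after
    return part

def multilevel_bisect(adj, small=4):
    if len(adj) <= small:                       # coarsest level: solve directly
        part = {u: i % 2 for i, u in enumerate(adj)}
        return refine(adj, part)
    cadj, mapping = coarsen(adj)                # reduce (coarsen)
    cpart = multilevel_bisect(cadj, small)      # conquer recursively
    part = {u: cpart[mapping[u]] for u in adj}  # uncoarsen (project soln)
    return refine(adj, part)                    # refine at this level

adj = {i: {(i - 1) % 8: 1, (i + 1) % 8: 1} for i in range(8)}   # ring of 8
part = multilevel_bisect(adj)
print(part, "cut =", cutsize(adj, part))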

  6. Dynamic Programming (DP) • Stitch-up function f: optimal soln of root = f(optimal solns of subproblems) = f(opt(A1), opt(A2), opt(A3), opt(A4)) • The above property means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem [Figure: root problem A stitched up from subproblems A1-A4 via f]
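
This store-and-reuse property is exactly what memoization implements. A tiny sketch on an assumed toy problem (counting monotone grid paths): each subproblem is solved once, recorded, and reused by every higher-level problem it appears in.

from functools import lru_cache

@lru_cache(maxsize=None)          # records each subproblem soln on first solve
def num_paths(r, c):
    """Monotone lattice paths from (0,0) to (r,c): each (r,c) is solved
    exactly once, then reused by every larger problem stitched from it."""
    if r == 0 or c == 0:
        return 1
    return num_paths(r - 1, c) + num_paths(r, c - 1)

print(num_paths(10, 10))          # 184756, with only 121 subproblem solves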

  7. Dynamic Programming (contd) • Matrix multiplication example: find the most computationally efficient way to perform the series of matrix mults M = M1 x M2 x … x Mn, where Mi is of size ri x ci w/ ri = c(i-1) for i > 1 • DP formulation: opt_seq(M) = (by defn) opt_seq(M(1,n)) = min over i = 1 to n-1 of { opt_seq(M(1,i)) + opt_seq(M(i+1,n)) + r1·ci·cn } • Correctness rests on the property that the optimal ways of multiplying M1 x … x Mi and M(i+1) x … x Mn will be used in the “min” stitch-up function to determine the optimal soln for M • Thus if the optimal soln involves a “cut” at Mr, then opt_seq(M(1,r)) & opt_seq(M(r+1,n)) will be part of opt_seq(M) • Perform the computation bottom-up (smallest sequences first) • Complexity: note that each subseq M(j,k) appears in the above computation and is solved exactly once (irrespective of how many times it appears) • Time to solve M(j,k), not counting the time to solve its subproblems (which is accounted for in their own complexities), is l-1, where l = k-j+1 is the length of the seq (a min over l-1 split choices is computed) • # of different M(j,k)’s of length l is n-l+1, 2 <= l <= n • Total complexity = Sum over i = 1 to n-1 of i(n-i) = O(n^3) (as opposed to, say, O(2^n) using exhaustive search)
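
A bottom-up Python rendering of this formulation; cost[j][k] plays the role of opt_seq(M(j,k)), and, as on the slide, each subsequence is solved exactly once, smallest lengths first. The three-matrix instance at the bottom is assumed for illustration.

def matrix_chain(r, c):
    """r[i], c[i]: rows/cols of matrix Mi (0-indexed), with r[i] == c[i-1].
    Returns the min # of scalar mults for M0 x M1 x ... x M(n-1)."""
    n = len(r)
    cost = [[0] * n for _ in range(n)]           # cost[j][k] = opt_seq(M(j,k))
    for l in range(2, n + 1):                    # bottom-up, by length l
        for j in range(0, n - l + 1):
            k = j + l - 1
            cost[j][k] = min(cost[j][i] + cost[i + 1][k] + r[j] * c[i] * c[k]
                             for i in range(j, k))   # min over l-1 split points
    return cost[0][n - 1]

# M0: 10x30, M1: 30x5, M2: 5x60 -> best is (M0 x M1) x M2 = 1500 + 3000 = 4500
print(matrix_chain([10, 30, 5], [30, 5, 60]))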

  8. A DP Example: Simple Buffer Insertion Problem Given: source and sink locations, sink capacitances and RATs, a buffer type, source delay rules, unit wire resistance and capacitance [Figure: a routing tree from source s0 to four sinks with required arrival times RAT1-RAT4; a candidate buffer is shown on one branch] Courtesy: Chuck Alpert, IBM

  9. Simple Buffer Insertion Problem (contd) Find: buffer locations and a routing tree such that the slack/RAT at the source is maximized [Figure: the same net with buffers inserted along the routing tree] Courtesy: Chuck Alpert, IBM

  10. Slack/RAT Example The slack/RAT at the source is the minimum over the sinks of (RAT - delay) [Figure: two scenarios for a source driving two sinks. Left: sinks w/ (RAT = 500, delay = 400) and (RAT = 400, delay = 600) give source slack/RAT = min(100, -200) = -200. Right: sinks w/ (RAT = 500, delay = 350) and (RAT = 400, delay = 300) give slack/RAT = min(150, 100) = +100] Courtesy: Chuck Alpert, IBM
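
The figure's numbers follow from taking the worst (minimum) RAT-minus-delay over the sinks; a two-line check, assuming (as above) that source slack/RAT is exactly this minimum:

def source_slack(sinks):                       # sinks: list of (RAT, delay)
    return min(rat - delay for rat, delay in sinks)

print(source_slack([(500, 400), (400, 600)]))  # -200, as in the left figure
print(source_slack([(500, 350), (400, 300)]))  # +100, as in the right figure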

  11. Elmore Delay [Figure: RC ladder A -R1- B -R2- C w/ capacitance C1 at node B and C2 at node C; the Elmore delay from A to C is R1(C1+C2) + R2·C2, i.e., each resistance times the total capacitance downstream of it] Courtesy: Chuck Alpert, IBM
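
A small sketch computing Elmore delay along such an RC ladder; the resistance/capacitance values are assumed for illustration.

def elmore_ladder(stages):
    """stages: list of (R, C) pairs from source to sink along one path.
    Elmore delay to the end = sum over resistors of R * downstream cap."""
    delay = 0.0
    for i, (R, _) in enumerate(stages):
        downstream_cap = sum(C for _, C in stages[i:])   # cap at/after this R
        delay += R * downstream_cap
    return delay

# A -R1- B(C1) -R2- C(C2): delay(C) = R1*(C1+C2) + R2*C2 (values assumed)
print(elmore_ladder([(100, 2e-12), (150, 1e-12)]))       # 4.5e-10 s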

  12. DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS’90] • Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance (Ct) and RAT (Tt) • DP-based alg propagates potential solutions bottom-up [Van Ginneken, 90]: • Add a wire (w/ cap Cw, resistance Rw): updates each soln vector’s (C, T) • Add a buffer: updates (C, T) • Merge two solutions: for each pair of soln vectors Zn=(Cn,Tn), Zm=(Cm,Tm) in the 2 subtrees, create a soln vector Zt=(Ct,Tt) w/ Ct = Cn + Cm and Tt = min(Tn, Tm) [Figure: the wire, buffer, and merge operations on (C, T) soln vectors] Courtesy: UCLA

  13. DP Example (contd) • Add a wire to each merged solution Zt (same cap. & delay change formulation as before) • Add a buffer to each Zt • Delete all dominated solutions Zd: Zd=(Cd, Td) is dominated if there exists a Zr=(Cr, Tr) s.t. Cd >= Cr and Td <= Tr (i.e., both metrics are no better) • The remaining soln vectors are all “optimal” solns for this subtree, and one of them will be part of the optimal solution at the root/driver of the net---this is the DP feature of this algorithm; a sketch of these operations follows [Figure: the routing tree from source s0 to sinks RAT1-RAT4]
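
A compact sketch of one bottom-up step in this style, using the simplified (C, T) updates that match the numbers on the next two slides: a wire of cap Cw and delay dw maps (C, T) to (C+Cw, T-dw), a buffer of input cap Cb and delay db maps it to (Cb, T-db), and pruning drops dominated vectors.

def add_wire(cands, Cw, dw):
    return [(C + Cw, T - dw) for C, T in cands]

def add_buffer(cands, Cb, db):
    # buffered versions are added alongside the unbuffered candidates
    return cands + [(Cb, T - db) for _, T in cands]

def merge(cands1, cands2):
    # Ct = Cn + Cm (both subtrees load the merge point), Tt = min(Tn, Tm)
    return [(Cn + Cm, min(Tn, Tm)) for Cn, Tn in cands1 for Cm, Tm in cands2]

def prune(cands):
    """Drop (Cd, Td) if some other (Cr, Tr) has Cr <= Cd and Tr >= Td."""
    keep = []
    for C, T in cands:
        if not any(Cr <= C and Tr >= T and (Cr, Tr) != (C, T)
                   for Cr, Tr in cands):
            keep.append((C, T))
    return keep

# sink (C=20, RAT=400), then a wire (Cw=10, dw=150), then an optional buffer
cands = [(20, 400)]
cands = prune(add_buffer(add_wire(cands, 10, 150), 5, 30))
print(cands)   # [(30, 250), (5, 220)], as on the next slide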

  14. Van Ginneken Example [Figure: candidate propagation along one branch. Starting from the sink soln (20, 400): adding a wire (C=10, d=150) gives (30, 250); adding a buffer (C=5, d=30) to that gives (5, 220). Propagating further w/ wires (C=15, d=200 / d=120) and buffers (C=5, d=50 / d=30) yields the candidate set {(45, 50), (20, 100), (5, 0), (5, 70)}] Courtesy: Chuck Alpert, IBM

  15. Van Ginneken Example Cont’d • (5, 0) is inferior to (5, 70); (45, 50) is inferior to (20, 100): both are deleted as dominated • Adding the last wire (C=10) to the surviving candidates yields, e.g., (30, 10) and (15, -10) at the source • Pick the solution with the largest slack and follow the arrows (recorded choices) to recover the buffer locations [Figure: the candidate sets before and after the final wire] Courtesy: Chuck Alpert, IBM

  16. Mathematical Programming • Linear programming (LP): e.g., Obj: Min 2x1 - x2 + x3 w/ constraints x1 + x2 <= a, x1 - x3 <= b -- solvable in polynomial time • Quadratic programming (QP): e.g., Min x1^2 - x2·x3 w/ linear constraints -- solvable in polynomial (cubic) time w/ equality constraints • Some vars are integers: mixed integer linear prog (ILP) -- NP-hard; mixed integer quad. prog (IQP) -- NP-hard • Some vars are in {0,1}: mixed 0/1 integer linear prog (0/1 ILP) -- NP-hard; mixed 0/1 integer quad. prog (0/1 IQP) -- NP-hard
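
As a concrete taste of the LP case, a minimal SciPy sketch of the slide's example objective; the right-hand sides a=4, b=2 and the variable bounds are assumed values, not from the slides.

from scipy.optimize import linprog

c = [2, -1, 1]                      # objective: min 2*x1 - x2 + x3
A_ub = [[1, 1, 0],                  # x1 + x2 <= a
        [1, 0, -1]]                 # x1 - x3 <= b
b_ub = [4, 2]                       # assumed a=4, b=2
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 10)] * 3)
print(res.x, res.fun)               # [0, 4, 0], optimal cost -4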

  17. 0/1 ILP/IQP Examples • Generally useful for “assignment” problems, where objects {O1, ..., On} are assigned to bins {B1, ..., Bm} • 0/1 variable xi,j = 1 if object Oi is assigned to bin Bj • Min-cut bi-partitioning for graphs G(V,E) can be modeled as a 0/1 IQP • xi,1 = 1 => ui in V1, else ui in V2 • Edge (ui, uj) is in the cutset iff xi,1(1 - xj,1) + (1 - xi,1)(xj,1) = 1 • Objective function: Min Sum over (ui, uj) in E of c(i,j)·(xi,1(1 - xj,1) + (1 - xi,1)(xj,1)) • Constraint: Sum over ui of w(ui)·xi,1 <= max-size [Figure: an edge (ui, uj) crossing the cut between V1 and V2]
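
A brute-force check of this 0/1 IQP formulation on a tiny assumed graph: x[i] stands for xi,1, the objective sums c(i,j)·(xi,1(1-xj,1) + (1-xi,1)xj,1) over edges, and the size constraint is enforced on both sides to rule out the trivial empty cut.

from itertools import product

# assumed tiny instance: 4 unit-weight nodes, weighted edges c(i,j)
edges = {(0, 1): 3, (1, 2): 1, (2, 3): 3, (0, 3): 1, (0, 2): 1}
w = [1, 1, 1, 1]
max_size = 2                                      # size bound on each side

best = None
for x in product((0, 1), repeat=4):               # all 0/1 assignments
    s = sum(wi * xi for wi, xi in zip(w, x))      # total weight in V1
    if s > max_size or sum(w) - s > max_size:     # size constraint, both sides
        continue
    cut = sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j])
              for (i, j), c in edges.items())     # the quadratic objective
    if best is None or cut < best[0]:
        best = (cut, x)
print(best)                                       # (3, (0, 0, 1, 1))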

  18. Search Techniques
[Figure: an example graph on nodes A-G, shown with its DFS and BFS visit orders]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if v.mark = 0 then
      if G has partial soln nodes then dfs(v);
      else soln_dfs(v);

dfs(v) /* for basic graph visit or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u);

soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
  v.mark = 1;
  if path to v is a soln, then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u);
      if (soln_found = 1) then return(soln_found);
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
  return(0);
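
A runnable Python rendering of soln_dfs on a small assumed graph, where (for illustration) a "soln" is any path ending at a given target node; note the unmarking on backtrack, which lets v appear on other paths.

def soln_dfs(adj, v, path, marked, is_soln):
    """Backtracking DFS over basic elements: marks v, recurses, and
    unmarks v on failure so it can appear on a different path."""
    marked.add(v); path.append(v)
    if is_soln(path):
        return True
    for u in adj[v]:
        if u not in marked and soln_dfs(adj, u, path, marked, is_soln):
            return True
    marked.discard(v); path.pop()     # backtrack: v reusable elsewhere
    return False

adj = {'A': ['B', 'E'], 'B': ['C'], 'C': ['G'], 'E': ['D'], 'D': ['F'],
       'F': [], 'G': []}
path = []
found = soln_dfs(adj, 'A', path, set(), lambda p: p[-1] == 'G')
print(found, path)                    # True ['A', 'B', 'C', 'G']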

  19. Search Techniques---Exhaustive DFS
[Figure: the example graph with a DFS visit order on nodes A-G]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity;
  optimal_soln_dfs(root);

optimal_soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
begin
  v.mark = 1;
  if path to v is a soln, then begin
    if cost < best_cost then begin
      best_soln = soln; best_cost = cost;
    endif
    v.mark = 0;
    return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then optimal_soln_dfs(u);
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
end
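
The exhaustive variant in Python: rather than stopping at the first soln, it backtracks out of every soln and records the cheapest. The weighted graph and target are assumed.

import math

def optimal_soln_dfs(adj, v, path, cost, marked, state):
    """Exhaustive backtracking DFS: explores *all* solns, keeping the best."""
    marked.add(v); path.append(v)
    if path[-1] == state['target']:              # a soln: record if cheaper
        if cost < state['best_cost']:
            state['best_cost'], state['best'] = cost, list(path)
    else:
        for u, w in adj[v]:
            if u not in marked:
                optimal_soln_dfs(adj, u, path, cost + w, marked, state)
    marked.discard(v); path.pop()                # always backtrack

adj = {'A': [('B', 1), ('E', 2)], 'B': [('C', 3)], 'E': [('C', 1)],
       'C': [('G', 5)], 'G': []}
state = {'best_cost': math.inf, 'best': None, 'target': 'G'}
optimal_soln_dfs(adj, 'A', [], 0, set(), state)
print(state['best'], state['best_cost'])   # ['A', 'E', 'C', 'G'] 8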

  20. Best-First Search
[Figure: a search tree with node costs 10; 12, 15, 19; 16, 18, 18, 17; the labels (1), (2), (3) show the BeFS expansion order]

BeFS(root)
begin
  open = {root}; /* open is list of gen. but not expanded nodes---partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then return(curr); /* curr is an optimal soln */
    else begin
      children = Expand_&_est_cost(curr); /* generate all children of curr & estimate their costs---cost(u) should be a lower bound of the cost of the best soln reachable from u */
      for each child in children do begin
        if child is a soln then
          delete all nodes w in open s.t. cost(w) >= cost(child);
        endif
        store child in open in increasing order of cost;
      endfor
    end
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem “reachable” from Y & that can be part of the current partial soln Y do begin
    if x not in Y and is feasible then begin
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x); /* cost(Y,x) is cost of reaching x from Y */
      est(child) = lower bound on cost of best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
    endif
  endfor
end /* Expand_&_est_cost */
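
A generic Python skeleton of BeFS with "open" kept as a min-heap ordered on cost = path_cost + lower-bound estimate; the tiny graph and the admissible estimates in lb are assumed for illustration.

import heapq

def befs(start, expand, is_soln, lb_est):
    """Best-first search: pop the min-cost partial soln; with lb_est a true
    lower bound, the first soln popped is optimal (see next slide)."""
    open_heap = [(lb_est(start), 0, start)]      # (cost, path_cost, node)
    while open_heap:
        cost, path_cost, curr = heapq.heappop(open_heap)
        if is_soln(curr):
            return curr, path_cost
        for nxt, step in expand(curr):           # children of curr
            g = path_cost + step
            heapq.heappush(open_heap, (g + lb_est(nxt), g, nxt))
    return None, None

# toy instance: partial solns are identified by their last node; lb holds
# assumed lower bounds on the remaining cost to the goal D
graph = {'A': [('B', 2), ('C', 5)], 'B': [('D', 5)], 'C': [('D', 1)],
         'D': []}
lb = {'A': 5, 'B': 4, 'C': 1, 'D': 0}
print(befs('A', lambda v: graph[v], lambda v: v == 'D', lambda v: lb[v]))
# ('D', 6): the cheaper route via C wins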

  21. Best-First Search • Proof of optimality when cost is a LB: • The current set of nodes in “open” represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of “open” • Assuming the basic cost (cost of adding an elt to a partial soln to construct another partial soln that is closer to the soln) is non-negative, the cost is monotonic, i.e., cost of child >= cost of parent • If the first node curr in “open” is a soln, then cost(curr) <= cost(w) for each w in “open” • The cost of any node in the search space not in “open” and not yet generated is >= the cost of its ancestor in “open”, and thus >= cost(curr). Thus curr is the optimal (min-cost) soln

  22. Search techs for a TSP example [Figure: a TSP graph on cities A-F with edge costs, and the exhaustive-DFS search tree rooted at A; the complete tours reached (solution nodes) have costs 27, 31, and 33] Exhaustive search using DFS (w/ backtrack) for finding an optimal solution

  23. Search techs for a TSP example (contd) • Lower-bound cost estimate: cost(MST({unvisited cities} U {current city} U {start city})) • This is a LB since the LB structure (spanning tree) set is a superset of the reqd soln structure (the remaining part of the tour): min_cost(set S) <= min_cost(set S') if S is a superset of S' • E.g., path cost for (A, E, F) = 8, and its estimate is MST{F, A, B, C, D} w/ cost = 16, giving node cost 8+16 [Figure: the BeFS tree for the same instance, with node costs such as 5+15, 8+16, 11+14, 21+6, 22+9, 14+9, 23+8; BeFS finds the optimal tour of cost 27 while expanding far fewer nodes than exhaustive DFS] BeFS for finding an optimal solution; a runnable sketch follows
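
A runnable sketch of this BeFS/branch-&-bound for TSP using the slide's MST lower bound (Prim's algorithm over {current city} U {unvisited} U {start}); the 4-city distance matrix is an assumed instance, not the graph in the figure.

import heapq

def mst_cost(nodes, d):
    """Prim's MST over the given node set: a valid LB on any tour remnant,
    since every Hamiltonian path on these nodes is a spanning tree."""
    nodes = list(nodes)
    if len(nodes) <= 1:
        return 0
    key = {v: d[nodes[0]][v] for v in nodes[1:]}
    total = 0
    while key:
        v = min(key, key=key.get)
        total += key.pop(v)
        for u in key:
            key[u] = min(key[u], d[v][u])
    return total

def tsp_befs(d):
    n = len(d)
    heap = [(mst_cost(range(n), d), 0, (0,))]    # (LB cost, path cost, path)
    while heap:
        cost, g, path = heapq.heappop(heap)
        if len(path) == n:                       # complete tour popped first
            return cost, path                    # cost already includes d[last][0]
        last = path[-1]
        for v in range(n):
            if v in path:
                continue
            g2 = g + d[last][v]
            # LB = path cost + MST({current=v} U {unvisited} U {start=0})
            rest = [u for u in range(n) if u == 0 or (u not in path and u != v)]
            lb = g2 + mst_cost([v] + rest, d)
            heapq.heappush(heap, (lb, g2, path + (v,)))

d = [[0, 3, 1, 5],
     [3, 0, 6, 4],
     [1, 6, 0, 2],
     [5, 4, 2, 0]]
print(tsp_befs(d))    # (10, (0, 2, 3, 1)): tour 0-2-3-1-0 of cost 10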

  24. BeFS (branch & bound) for 0/1 ILP Solution X = {x1, …, xm} are 0/1 vars; each tree node solves the LP relaxation w/ the vars fixed so far, whose cost is a LB for that subtree [Figure: the B&B tree. The root branches on x2: solve LP w/ x2=0 (Cost = cost(LP) = C1) and LP w/ x2=1 (Cost = C2). The x2=1 node branches on x4: LP w/ x2=1, x4=0 (Cost = C3) and LP w/ x2=1, x4=1 (Cost = C4). The x2=1, x4=1 node branches on x5: LP w/ x5=0 (Cost = C5) and LP w/ x5=1 (Cost = C6). Cost relations: C2 < C1, C4 < C3, C5 < C3 < C1 < C6; the x5=0 node (cost C5) is the optimal soln]
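
A minimal best-first branch-&-bound sketch for a 0/1 ILP in this spirit, using scipy.optimize.linprog on the LP relaxation (with branched vars pinned through their bounds) as the lower bound; the small instance (c, A_ub, b_ub) is assumed.

import heapq
from scipy.optimize import linprog

c = [-5, -4, -3]                 # minimize c.x (i.e., maximize 5x1+4x2+3x3)
A_ub = [[2, 3, 1], [4, 1, 2]]    # constraints A_ub . x <= b_ub
b_ub = [5, 11]
n = len(c)

def lp_bound(fixed):             # fixed: dict var index -> 0/1
    bounds = [(fixed.get(i, 0), fixed.get(i, 1)) for i in range(n)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.fun if res.success else float("inf")

best_cost, best = float("inf"), None
open_heap = [(lp_bound({}), 0, {})]       # (LB cost, tiebreak id, fixed vars)
uid = 1
while open_heap:
    cost, _, fixed = heapq.heappop(open_heap)
    if cost >= best_cost:                 # LB can't beat incumbent: prune;
        break                             # heap order => neither can the rest
    if len(fixed) == n:                   # all vars fixed: LP cost is exact
        best_cost, best = cost, fixed
        continue
    i = len(fixed)                        # branch on the next unfixed var
    for val in (0, 1):
        child = dict(fixed); child[i] = val
        cb = lp_bound(child)
        if cb < best_cost:
            heapq.heappush(open_heap, (cb, uid, child)); uid += 1
print(best, best_cost)                    # {0: 1, 1: 1, 2: 0} -9.0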

  25. Iterative Improvement Techniques • Deterministic (greedy): • Locally/immediately greedy: make the move that is immediately (locally) best; until (no further impr.) [e.g., FM] • Non-locally greedy: make the move that is best according to some non-immediate (non-local) metric [e.g., probability-based lookahead as in PROP]; until (no further impr.) • Stochastic (non-greedy): make a combination of deterministic greedy moves and probabilistic moves that cause a deterioration (can help to jump out of local minima); until (stopping criteria satisfied) • Stopping criteria could be an upper bound on the total # of moves or iterations
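
A minimal sketch of the stochastic variant (simulated-annealing flavoured) on an assumed 1-D objective: greedy moves are always taken, deteriorating moves are accepted with a decaying probability, and the stopping criterion is a fixed move budget.

import math, random
random.seed(0)

def cost(x):                       # assumed toy objective with local minima
    return x * x + 10 * math.sin(3 * x)

x, temp = 4.0, 5.0                 # current soln and "temperature"
best_x = x
for move in range(2000):           # stopping criterion: bound on # of moves
    cand = x + random.uniform(-0.5, 0.5)
    delta = cost(cand) - cost(x)
    # greedy move if better; probabilistic uphill move to escape local minima
    if delta < 0 or random.random() < math.exp(-delta / temp):
        x = cand
        if cost(x) < cost(best_x):
            best_x = x
    temp *= 0.995                  # cool down: uphill moves become rarer
print(best_x, cost(best_x))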
