
Algorithmic Techniques in VLSI CAD



Presentation Transcript


  1. Algorithmic Techniques in VLSI CAD Shantanu Dutt University of Illinois at Chicago

  2. Algorithms in VLSI CAD • Divide & Conquer (D&C) [e.g., merge-sort, partition-driven placement] • Reduce & Conquer (R&C) [e.g., multilevel techniques such as the hMetis partitioner] • Dynamic programming [e.g., matrix multiplication, optimal buffer insertion] • Mathematical programming: linear, quadratic, 0/1 integer programming [e.g., floorplanning, global placement]

  3. Algorithms in VLSI CAD (contd) • Search Methods: • Depth-first search (DFS): mainly used to find any solution when cost is not an issue [e.g., FPGA detailed routing---cost generally determined at the global routing phase] • Breadth-first search (BFS): mainly used to find a soln at min. distance from the root of the search tree [e.g., maze routing when cost = dist. from root] • Best-first search (BeFS): used to find optimal solutions w/ any cost function; can be used when a provable lower bound of the cost can be determined for each branching choice from the “current partial soln node” [e.g., TSP, global routing] • Iterative Improvement: deterministic, stochastic

  4. Divide & Conquer • Determine if the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner: • D&C approach: see if the problem can be “broken up” into 2 or more smaller subproblems that can be “stitched up” to give a soln to the parent prob • Do this recursively for each large subprob until subprobs are small enough for an “easy” solution technique (could be exhaustive!) • If the subprobs are of a similar kind to the root prob, then the breakup and stitching will also be similar [Figure: root problem A broken into subprobs A1 and A2, and recursively into A1,1, A1,2, A2,1, A2,2; do this recursively until the subprob size is s.t. TT-based design is doable; the solns to A1 and A2 are stitched up to form the complete soln to A]
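
To make the D&C pattern concrete, here is a minimal Python sketch using merge-sort (one of the examples named on slide 2): the break-up halves the list, and the stitch-up is the linear-time merge.

def merge_sort(a):
    """D&C: break the problem into two half-size subproblems,
    solve each recursively, then stitch the sub-solutions up."""
    if len(a) <= 1:          # subproblem small enough for an "easy" solution
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # stitch-up: merge the two sorted halves
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]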

  5. Reduce-&-Conquer • Flow: reduce problem size (coarsening) → solve → uncoarsen and refine solution • Examples: multilevel graph/hypergraph partitioning (e.g., hMetis), multilevel routing
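
A toy end-to-end Python sketch of this multilevel flow for min-cut graph bisection; the random-matching coarsener, trivial coarsest-level solver, and single greedy refinement pass are stand-ins for the far stronger heuristics a real partitioner such as hMetis uses, and the ring graph at the bottom is assumed for illustration.

import random
random.seed(0)                                  # reproducible toy run

def coarsen(adj):
    """Contract a random matching: each coarse node covers 1-2 fine nodes."""
    order = list(adj); random.shuffle(order)
    mapping, used, cid = {}, set(), 0
    for u in order:
        if u in used:
            continue
        v = next((w for w in adj[u] if w not in used), None)
        for x in (u,) if v is None else (u, v):
            used.add(x); mapping[x] = cid
        cid += 1
    cadj = {c: {} for c in range(cid)}          # rebuild coarse adjacency
    for u in adj:
        for v, w in adj[u].items():
            cu, cv = mapping[u], mapping[v]
            if cu != cv:
                cadj[cu][cv] = cadj[cu].get(cv, 0) + w
    return cadj, mapping

def cutsize(adj, part):
    return sum(w for u in adj for v, w in adj[u].items() if part[u] < part[v])

def refine(adj, part):
    """One greedy FM-flavoured pass: move a node to the other side if that
    reduces the cut and keeps the two sides within one node of balance."""
    n, size1 = len(adj), sum(part.values())
    for u in adj:
        gain = sum(w if part[v] != part[u] else -w for v, w in adj[u].items())
        s1_after = size1 + (1 - 2 * part[u])
        if gain > 0 and abs(n - 2 * s1_after) <= 1:
            part[u], size1 = 1 - part[u], s1_after
    return part

def multilevel_bisect(adj, small=4):
    if len(adj) <= small:                       # coarsest level: solve directly
        part = {u: i % 2 for i, u in enumerate(adj)}
        return refine(adj, part)
    cadj, mapping = coarsen(adj)                # reduce (coarsen)
    cpart = multilevel_bisect(cadj, small)      # conquer recursively
    part = {u: cpart[mapping[u]] for u in adj}  # uncoarsen (project soln)
    return refine(adj, part)                    # refine at this level

adj = {i: {(i - 1) % 8: 1, (i + 1) % 8: 1} for i in range(8)}   # ring of 8
part = multilevel_bisect(adj)
print(part, "cut =", cutsize(adj, part))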

  6. Dynamic Programming (DP) • Stitch-up function f: optimal soln of root = f(optimal solns of subproblems) = f(opt(A1), opt(A2), opt(A3), opt(A4)) • The above property means that every time we optimally solve a subproblem, we can store/record the soln and reuse it every time it is part of the formulation of a higher-level problem [Figure: root problem A stitched up from subproblems A1-A4 via f]
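
This store-and-reuse property is exactly what memoization implements. A tiny sketch on an assumed toy problem (counting monotone grid paths): each subproblem is solved once, recorded, and reused by every higher-level problem it appears in.

from functools import lru_cache

@lru_cache(maxsize=None)          # records each subproblem soln on first solve
def num_paths(r, c):
    """Monotone lattice paths from (0,0) to (r,c): each (r,c) is solved
    exactly once, then reused by every larger problem stitched from it."""
    if r == 0 or c == 0:
        return 1
    return num_paths(r - 1, c) + num_paths(r, c - 1)

print(num_paths(10, 10))          # 184756, with only 121 subproblem solves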

  7. Dynamic Programming (contd) • Matrix multiplication example: find the most computationally efficient way to perform the series of matrix mults M = M1 x M2 x … x Mn, where Mi is of size ri x ci w/ ri = c(i-1) for i > 1 • DP formulation: opt_seq(M) = (by defn) opt_seq(M(1,n)) = min over i = 1 to n-1 of { opt_seq(M(1,i)) + opt_seq(M(i+1,n)) + r1·ci·cn } • Correctness rests on the property that the optimal ways of multiplying M1 x … x Mi and M(i+1) x … x Mn will be used in the “min” stitch-up function to determine the optimal soln for M • Thus if the optimal soln involves a “cut” at Mr, then opt_seq(M(1,r)) & opt_seq(M(r+1,n)) will be part of opt_seq(M) • Perform the computation bottom-up (smallest sequences first) • Complexity: note that each subseq M(j,k) appears in the above computation and is solved exactly once (irrespective of how many times it appears) • Time to solve M(j,k), not counting the time to solve its subproblems (which is accounted for in their own complexities), is l-1, where l = k-j+1 is the length of the seq (a min over l-1 split choices is computed) • # of different M(j,k)’s of length l is n-l+1, 2 <= l <= n • Total complexity = Sum over i = 1 to n-1 of i(n-i) = O(n^3) (as opposed to, say, O(2^n) using exhaustive search)
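
A bottom-up Python rendering of this formulation; cost[j][k] plays the role of opt_seq(M(j,k)), and, as on the slide, each subsequence is solved exactly once, smallest lengths first. The three-matrix instance at the bottom is assumed for illustration.

def matrix_chain(r, c):
    """r[i], c[i]: rows/cols of matrix Mi (0-indexed), with r[i] == c[i-1].
    Returns the min # of scalar mults for M0 x M1 x ... x M(n-1)."""
    n = len(r)
    cost = [[0] * n for _ in range(n)]           # cost[j][k] = opt_seq(M(j,k))
    for l in range(2, n + 1):                    # bottom-up, by length l
        for j in range(0, n - l + 1):
            k = j + l - 1
            cost[j][k] = min(cost[j][i] + cost[i + 1][k] + r[j] * c[i] * c[k]
                             for i in range(j, k))   # min over l-1 split points
    return cost[0][n - 1]

# M0: 10x30, M1: 30x5, M2: 5x60 -> best is (M0 x M1) x M2 = 1500 + 3000 = 4500
print(matrix_chain([10, 30, 5], [30, 5, 60]))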

  8. A DP Example: Simple Buffer Insertion Problem Given: source and sink locations, sink capacitances and RATs, a buffer type, source delay rules, unit wire resistance and capacitance [Figure: a routing tree from source s0 to four sinks with required arrival times RAT1-RAT4; a candidate buffer is shown on one branch] Courtesy: Chuck Alpert, IBM

  9. Simple Buffer Insertion Problem (contd) Find: buffer locations and a routing tree such that the slack/RAT at the source is maximized [Figure: the same net with buffers inserted along the routing tree] Courtesy: Chuck Alpert, IBM

  10. Slack/RAT Example The slack/RAT at the source is the minimum over the sinks of (RAT - delay) [Figure: two scenarios for a source driving two sinks. Left: sinks w/ (RAT = 500, delay = 400) and (RAT = 400, delay = 600) give source slack/RAT = min(100, -200) = -200. Right: sinks w/ (RAT = 500, delay = 350) and (RAT = 400, delay = 300) give slack/RAT = min(150, 100) = +100] Courtesy: Chuck Alpert, IBM
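
The figure's numbers follow from taking the worst (minimum) RAT-minus-delay over the sinks; a two-line check, assuming (as above) that source slack/RAT is exactly this minimum:

def source_slack(sinks):                       # sinks: list of (RAT, delay)
    return min(rat - delay for rat, delay in sinks)

print(source_slack([(500, 400), (400, 600)]))  # -200, as in the left figure
print(source_slack([(500, 350), (400, 300)]))  # +100, as in the right figure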

  11. Elmore Delay [Figure: RC ladder A -R1- B -R2- C w/ capacitance C1 at node B and C2 at node C; the Elmore delay from A to C is R1(C1+C2) + R2·C2, i.e., each resistance times the total capacitance downstream of it] Courtesy: Chuck Alpert, IBM
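
A small sketch computing Elmore delay along such an RC ladder; the resistance/capacitance values are assumed for illustration.

def elmore_ladder(stages):
    """stages: list of (R, C) pairs from source to sink along one path.
    Elmore delay to the end = sum over resistors of R * downstream cap."""
    delay = 0.0
    for i, (R, _) in enumerate(stages):
        downstream_cap = sum(C for _, C in stages[i:])   # cap at/after this R
        delay += R * downstream_cap
    return delay

# A -R1- B(C1) -R2- C(C2): delay(C) = R1*(C1+C2) + R2*C2 (values assumed)
print(elmore_ladder([(100, 2e-12), (150, 1e-12)]))       # 4.5e-10 s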

  12. DP Example: Van Ginneken Buffer Insertion Algorithm [ISCAS’90] • Associate each leaf node/sink with two metrics (Ct, Tt): downstream loading capacitance (Ct) and RAT (Tt) • DP-based alg propagates potential solutions bottom-up [Van Ginneken, 90]: • Add a wire (w/ cap Cw, resistance Rw): updates each soln vector’s (C, T) • Add a buffer: updates (C, T) • Merge two solutions: for each pair of soln vectors Zn=(Cn,Tn), Zm=(Cm,Tm) in the 2 subtrees, create a soln vector Zt=(Ct,Tt) w/ Ct = Cn + Cm and Tt = min(Tn, Tm) [Figure: the wire, buffer, and merge operations on (C, T) soln vectors] Courtesy: UCLA

  13. DP Example (contd) • Add a wire to each merged solution Zt (same cap. & delay change formulation as before) • Add a buffer to each Zt • Delete all dominated solutions Zd: Zd=(Cd, Td) is dominated if there exists a Zr=(Cr, Tr) s.t. Cd >= Cr and Td <= Tr (i.e., both metrics are no better) • The remaining soln vectors are all “optimal” solns for this subtree, and one of them will be part of the optimal solution at the root/driver of the net---this is the DP feature of this algorithm; a sketch of these operations follows [Figure: the routing tree from source s0 to sinks RAT1-RAT4]
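
A compact sketch of one bottom-up step in this style, using the simplified (C, T) updates that match the numbers on the next two slides: a wire of cap Cw and delay dw maps (C, T) to (C+Cw, T-dw), a buffer of input cap Cb and delay db maps it to (Cb, T-db), and pruning drops dominated vectors.

def add_wire(cands, Cw, dw):
    return [(C + Cw, T - dw) for C, T in cands]

def add_buffer(cands, Cb, db):
    # buffered versions are added alongside the unbuffered candidates
    return cands + [(Cb, T - db) for _, T in cands]

def merge(cands1, cands2):
    # Ct = Cn + Cm (both subtrees load the merge point), Tt = min(Tn, Tm)
    return [(Cn + Cm, min(Tn, Tm)) for Cn, Tn in cands1 for Cm, Tm in cands2]

def prune(cands):
    """Drop (Cd, Td) if some other (Cr, Tr) has Cr <= Cd and Tr >= Td."""
    keep = []
    for C, T in cands:
        if not any(Cr <= C and Tr >= T and (Cr, Tr) != (C, T)
                   for Cr, Tr in cands):
            keep.append((C, T))
    return keep

# sink (C=20, RAT=400), then a wire (Cw=10, dw=150), then an optional buffer
cands = [(20, 400)]
cands = prune(add_buffer(add_wire(cands, 10, 150), 5, 30))
print(cands)   # [(30, 250), (5, 220)], as on the next slide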

  14. Van Ginneken Example [Figure: candidate propagation along one branch. Starting from the sink soln (20, 400): adding a wire (C=10, d=150) gives (30, 250); adding a buffer (C=5, d=30) to that gives (5, 220). Propagating further w/ wires (C=15, d=200 / d=120) and buffers (C=5, d=50 / d=30) yields the candidate set {(45, 50), (20, 100), (5, 0), (5, 70)}] Courtesy: Chuck Alpert, IBM

  15. Van Ginneken Example Cont’d • (5, 0) is inferior to (5, 70); (45, 50) is inferior to (20, 100): both are deleted as dominated • Adding the last wire (C=10) to the surviving candidates yields, e.g., (30, 10) and (15, -10) at the source • Pick the solution with the largest slack and follow the arrows (recorded choices) to recover the buffer locations [Figure: the candidate sets before and after the final wire] Courtesy: Chuck Alpert, IBM

  16. Mathematical Programming • Linear programming (LP): e.g., Obj: Min 2x1 - x2 + x3 w/ constraints x1 + x2 <= a, x1 - x3 <= b -- solvable in polynomial time • Quadratic programming (QP): e.g., Min x1^2 - x2·x3 w/ linear constraints -- solvable in polynomial (cubic) time w/ equality constraints • Some vars are integers: mixed integer linear prog (ILP) -- NP-hard; mixed integer quad. prog (IQP) -- NP-hard • Some vars are in {0,1}: mixed 0/1 integer linear prog (0/1 ILP) -- NP-hard; mixed 0/1 integer quad. prog (0/1 IQP) -- NP-hard
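
As a concrete taste of the LP case, a minimal SciPy sketch of the slide's example objective; the right-hand sides a=4, b=2 and the variable bounds are assumed values, not from the slides.

from scipy.optimize import linprog

c = [2, -1, 1]                      # objective: min 2*x1 - x2 + x3
A_ub = [[1, 1, 0],                  # x1 + x2 <= a
        [1, 0, -1]]                 # x1 - x3 <= b
b_ub = [4, 2]                       # assumed a=4, b=2
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 10)] * 3)
print(res.x, res.fun)               # [0, 4, 0], optimal cost -4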

  17. 0/1 ILP/IQP Examples • Generally useful for “assignment” problems, where objects {O1, ..., On} are assigned to bins {B1, ..., Bm} • 0/1 variable xi,j = 1 if object Oi is assigned to bin Bj • Min-cut bi-partitioning for graphs G(V,E) can be modeled as a 0/1 IQP • xi,1 = 1 => ui in V1, else ui in V2 • Edge (ui, uj) is in the cutset iff xi,1(1 - xj,1) + (1 - xi,1)(xj,1) = 1 • Objective function: Min Sum over (ui, uj) in E of c(i,j)·(xi,1(1 - xj,1) + (1 - xi,1)(xj,1)) • Constraint: Sum over ui of w(ui)·xi,1 <= max-size [Figure: an edge (ui, uj) crossing the cut between V1 and V2]
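
A brute-force check of this 0/1 IQP formulation on a tiny assumed graph: x[i] stands for xi,1, the objective sums c(i,j)·(xi,1(1-xj,1) + (1-xi,1)xj,1) over edges, and the size constraint is enforced on both sides to rule out the trivial empty cut.

from itertools import product

# assumed tiny instance: 4 unit-weight nodes, weighted edges c(i,j)
edges = {(0, 1): 3, (1, 2): 1, (2, 3): 3, (0, 3): 1, (0, 2): 1}
w = [1, 1, 1, 1]
max_size = 2                                      # size bound on each side

best = None
for x in product((0, 1), repeat=4):               # all 0/1 assignments
    s = sum(wi * xi for wi, xi in zip(w, x))      # total weight in V1
    if s > max_size or sum(w) - s > max_size:     # size constraint, both sides
        continue
    cut = sum(c * (x[i] * (1 - x[j]) + (1 - x[i]) * x[j])
              for (i, j), c in edges.items())     # the quadratic objective
    if best is None or cut < best[0]:
        best = (cut, x)
print(best)                                       # (3, (0, 0, 1, 1))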

  18. Search Techniques
[Figure: an example graph on nodes A-G, shown with its DFS and BFS visit orders]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  for each v in V
    if v.mark = 0 then
      if G has partial soln nodes then dfs(v);
      else soln_dfs(v);

dfs(v) /* for basic graph visit or for soln finding when nodes are partial solns */
  v.mark = 1;
  for each (v,u) in E
    if (u.mark != 1) then dfs(u);

soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
  v.mark = 1;
  if path to v is a soln, then return(1);
  for each (v,u) in E
    if (u.mark != 1) then
      soln_found = soln_dfs(u);
      if (soln_found = 1) then return(soln_found);
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
  return(0);
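
A runnable Python rendering of soln_dfs on a small assumed graph, where (for illustration) a "soln" is any path ending at a given target node; note the unmarking on backtrack, which lets v appear on other paths.

def soln_dfs(adj, v, path, marked, is_soln):
    """Backtracking DFS over basic elements: marks v, recurses, and
    unmarks v on failure so it can appear on a different path."""
    marked.add(v); path.append(v)
    if is_soln(path):
        return True
    for u in adj[v]:
        if u not in marked and soln_dfs(adj, u, path, marked, is_soln):
            return True
    marked.discard(v); path.pop()     # backtrack: v reusable elsewhere
    return False

adj = {'A': ['B', 'E'], 'B': ['C'], 'C': ['G'], 'E': ['D'], 'D': ['F'],
       'F': [], 'G': []}
path = []
found = soln_dfs(adj, 'A', path, set(), lambda p: p[-1] == 'G')
print(found, path)                    # True ['A', 'B', 'C', 'G']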

  19. Search Techniques---Exhaustive DFS
[Figure: the example graph with a DFS visit order on nodes A-G]

Algorithm Depth_First_Search
  for each v in V
    v.mark = 0;
  best_cost = infinity;
  optimal_soln_dfs(root);

optimal_soln_dfs(v) /* used when nodes are basic elts of the problem and not partial soln nodes */
begin
  v.mark = 1;
  if path to v is a soln, then begin
    if cost < best_cost then begin
      best_soln = soln; best_cost = cost;
    endif
    v.mark = 0;
    return;
  endif
  for each (v,u) in E
    if (u.mark != 1) then optimal_soln_dfs(u);
  end for;
  v.mark = 0; /* can visit v again to form another soln on a different path */
end
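
The exhaustive variant in Python: rather than stopping at the first soln, it backtracks out of every soln and records the cheapest. The weighted graph and target are assumed.

import math

def optimal_soln_dfs(adj, v, path, cost, marked, state):
    """Exhaustive backtracking DFS: explores *all* solns, keeping the best."""
    marked.add(v); path.append(v)
    if path[-1] == state['target']:              # a soln: record if cheaper
        if cost < state['best_cost']:
            state['best_cost'], state['best'] = cost, list(path)
    else:
        for u, w in adj[v]:
            if u not in marked:
                optimal_soln_dfs(adj, u, path, cost + w, marked, state)
    marked.discard(v); path.pop()                # always backtrack

adj = {'A': [('B', 1), ('E', 2)], 'B': [('C', 3)], 'E': [('C', 1)],
       'C': [('G', 5)], 'G': []}
state = {'best_cost': math.inf, 'best': None, 'target': 'G'}
optimal_soln_dfs(adj, 'A', [], 0, set(), state)
print(state['best'], state['best_cost'])   # ['A', 'E', 'C', 'G'] 8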

  20. Best-First Search
[Figure: a search tree with node costs 10; 12, 15, 19; 16, 18, 18, 17; the labels (1), (2), (3) show the BeFS expansion order]

BeFS(root)
begin
  open = {root}; /* open is list of gen. but not expanded nodes---partial solns */
  best_soln_cost = infinity;
  while open != nullset do begin
    curr = first(open);
    if curr is a soln then return(curr); /* curr is an optimal soln */
    else begin
      children = Expand_&_est_cost(curr); /* generate all children of curr & estimate their costs---cost(u) should be a lower bound of the cost of the best soln reachable from u */
      for each child in children do begin
        if child is a soln then
          delete all nodes w in open s.t. cost(w) >= cost(child);
        endif
        store child in open in increasing order of cost;
      endfor
    end
  endwhile
end /* BeFS */

Expand_&_est_cost(Y)
begin
  children = nullset;
  for each basic elt x of the problem “reachable” from Y & that can be part of the current partial soln Y do begin
    if x not in Y and is feasible then begin
      child = Y U {x};
      path_cost(child) = path_cost(Y) + cost(Y, x); /* cost(Y,x) is cost of reaching x from Y */
      est(child) = lower bound on cost of best soln reachable from child;
      cost(child) = path_cost(child) + est(child);
      children = children U {child};
    endif
  endfor
end /* Expand_&_est_cost */
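
A generic Python skeleton of BeFS with "open" kept as a min-heap ordered on cost = path_cost + lower-bound estimate; the tiny graph and the admissible estimates in lb are assumed for illustration.

import heapq

def befs(start, expand, is_soln, lb_est):
    """Best-first search: pop the min-cost partial soln; with lb_est a true
    lower bound, the first soln popped is optimal (see next slide)."""
    open_heap = [(lb_est(start), 0, start)]      # (cost, path_cost, node)
    while open_heap:
        cost, path_cost, curr = heapq.heappop(open_heap)
        if is_soln(curr):
            return curr, path_cost
        for nxt, step in expand(curr):           # children of curr
            g = path_cost + step
            heapq.heappush(open_heap, (g + lb_est(nxt), g, nxt))
    return None, None

# toy instance: partial solns are identified by their last node; lb holds
# assumed lower bounds on the remaining cost to the goal D
graph = {'A': [('B', 2), ('C', 5)], 'B': [('D', 5)], 'C': [('D', 1)],
         'D': []}
lb = {'A': 5, 'B': 4, 'C': 1, 'D': 0}
print(befs('A', lambda v: graph[v], lambda v: v == 'D', lambda v: lb[v]))
# ('D', 6): the cheaper route via C wins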

  21. Best-First Search • Proof of optimality when cost is a LB: • The current set of nodes in “open” represents a complete front of generated nodes, i.e., the rest of the nodes in the search space are descendants of “open” • Assuming the basic cost (cost of adding an elt to a partial soln to construct another partial soln that is closer to the soln) is non-negative, the cost is monotonic, i.e., cost of child >= cost of parent • If the first node curr in “open” is a soln, then cost(curr) <= cost(w) for each w in “open” • The cost of any node in the search space not in “open” and not yet generated is >= the cost of its ancestor in “open”, and thus >= cost(curr). Thus curr is the optimal (min-cost) soln

  22. Search techs for a TSP example [Figure: a TSP graph on cities A-F with edge costs, and the exhaustive-DFS search tree rooted at A; the complete tours reached (solution nodes) have costs 27, 31, and 33] Exhaustive search using DFS (w/ backtrack) for finding an optimal solution

  23. Search techs for a TSP example (contd) • Lower-bound cost estimate: cost(MST({unvisited cities} U {current city} U {start city})) • This is a LB since the LB structure (spanning tree) set is a superset of the reqd soln structure (the remaining part of the tour): min_cost(set S) <= min_cost(set S') if S is a superset of S' • E.g., path cost for (A, E, F) = 8, and its estimate is MST{F, A, B, C, D} w/ cost = 16, giving node cost 8+16 [Figure: the BeFS tree for the same instance, with node costs such as 5+15, 8+16, 11+14, 21+6, 22+9, 14+9, 23+8; BeFS finds the optimal tour of cost 27 while expanding far fewer nodes than exhaustive DFS] BeFS for finding an optimal solution; a runnable sketch follows
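
A runnable sketch of this BeFS/branch-&-bound for TSP using the slide's MST lower bound (Prim's algorithm over {current city} U {unvisited} U {start}); the 4-city distance matrix is an assumed instance, not the graph in the figure.

import heapq

def mst_cost(nodes, d):
    """Prim's MST over the given node set: a valid LB on any tour remnant,
    since every Hamiltonian path on these nodes is a spanning tree."""
    nodes = list(nodes)
    if len(nodes) <= 1:
        return 0
    key = {v: d[nodes[0]][v] for v in nodes[1:]}
    total = 0
    while key:
        v = min(key, key=key.get)
        total += key.pop(v)
        for u in key:
            key[u] = min(key[u], d[v][u])
    return total

def tsp_befs(d):
    n = len(d)
    heap = [(mst_cost(range(n), d), 0, (0,))]    # (LB cost, path cost, path)
    while heap:
        cost, g, path = heapq.heappop(heap)
        if len(path) == n:                       # complete tour popped first
            return cost, path                    # cost already includes d[last][0]
        last = path[-1]
        for v in range(n):
            if v in path:
                continue
            g2 = g + d[last][v]
            # LB = path cost + MST({current=v} U {unvisited} U {start=0})
            rest = [u for u in range(n) if u == 0 or (u not in path and u != v)]
            lb = g2 + mst_cost([v] + rest, d)
            heapq.heappush(heap, (lb, g2, path + (v,)))

d = [[0, 3, 1, 5],
     [3, 0, 6, 4],
     [1, 6, 0, 2],
     [5, 4, 2, 0]]
print(tsp_befs(d))    # (10, (0, 2, 3, 1)): tour 0-2-3-1-0 of cost 10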

  24. BeFS (branch & bound) for 0/1 ILP Solution X = {x1, …, xm} are 0/1 vars; each tree node solves the LP relaxation w/ the vars fixed so far, whose cost is a LB for that subtree [Figure: the B&B tree. The root branches on x2: solve LP w/ x2=0 (Cost = cost(LP) = C1) and LP w/ x2=1 (Cost = C2). The x2=1 node branches on x4: LP w/ x2=1, x4=0 (Cost = C3) and LP w/ x2=1, x4=1 (Cost = C4). The x2=1, x4=1 node branches on x5: LP w/ x5=0 (Cost = C5) and LP w/ x5=1 (Cost = C6). Cost relations: C2 < C1, C4 < C3, C5 < C3 < C1 < C6; the x5=0 node (cost C5) is the optimal soln]
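
A minimal best-first branch-&-bound sketch for a 0/1 ILP in this spirit, using scipy.optimize.linprog on the LP relaxation (with branched vars pinned through their bounds) as the lower bound; the small instance (c, A_ub, b_ub) is assumed.

import heapq
from scipy.optimize import linprog

c = [-5, -4, -3]                 # minimize c.x (i.e., maximize 5x1+4x2+3x3)
A_ub = [[2, 3, 1], [4, 1, 2]]    # constraints A_ub . x <= b_ub
b_ub = [5, 11]
n = len(c)

def lp_bound(fixed):             # fixed: dict var index -> 0/1
    bounds = [(fixed.get(i, 0), fixed.get(i, 1)) for i in range(n)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.fun if res.success else float("inf")

best_cost, best = float("inf"), None
open_heap = [(lp_bound({}), 0, {})]       # (LB cost, tiebreak id, fixed vars)
uid = 1
while open_heap:
    cost, _, fixed = heapq.heappop(open_heap)
    if cost >= best_cost:                 # LB can't beat incumbent: prune;
        break                             # heap order => neither can the rest
    if len(fixed) == n:                   # all vars fixed: LP cost is exact
        best_cost, best = cost, fixed
        continue
    i = len(fixed)                        # branch on the next unfixed var
    for val in (0, 1):
        child = dict(fixed); child[i] = val
        cb = lp_bound(child)
        if cb < best_cost:
            heapq.heappush(open_heap, (cb, uid, child)); uid += 1
print(best, best_cost)                    # {0: 1, 1: 1, 2: 0} -9.0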

  25. Iterative Improvement Techniques • Deterministic (greedy): • Locally/immediately greedy: make the move that is immediately (locally) best; until (no further impr.) [e.g., FM] • Non-locally greedy: make the move that is best according to some non-immediate (non-local) metric [e.g., probability-based lookahead as in PROP]; until (no further impr.) • Stochastic (non-greedy): make a combination of deterministic greedy moves and probabilistic moves that cause a deterioration (can help to jump out of local minima); until (stopping criteria satisfied) • Stopping criteria could be an upper bound on the total # of moves or iterations
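
A minimal sketch of the stochastic variant (simulated-annealing flavoured) on an assumed 1-D objective: greedy moves are always taken, deteriorating moves are accepted with a decaying probability, and the stopping criterion is a fixed move budget.

import math, random
random.seed(0)

def cost(x):                       # assumed toy objective with local minima
    return x * x + 10 * math.sin(3 * x)

x, temp = 4.0, 5.0                 # current soln and "temperature"
best_x = x
for move in range(2000):           # stopping criterion: bound on # of moves
    cand = x + random.uniform(-0.5, 0.5)
    delta = cost(cand) - cost(x)
    # greedy move if better; probabilistic uphill move to escape local minima
    if delta < 0 or random.random() < math.exp(-delta / temp):
        x = cand
        if cost(x) < cost(best_x):
            best_x = x
    temp *= 0.995                  # cool down: uphill moves become rarer
print(best_x, cost(best_x))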
