This paper presents practical approximation algorithms for the problem of separable packing linear programs (LPs) in the context of VLSI design, focusing on global routing via buffer blocks. We outline the motivation behind this work, offer integer linear programming (ILP) formulations, and introduce a polynomial-time approximation scheme (PTAS) for solving these separable packing LPs. The analysis includes various experimental results to validate the effectiveness of our approach, presenting insights into the complexity and runtime performance of the proposed algorithms.
Practical Approximation Algorithms for Separable Packing LPs F.F. Dragan (Kent State) A.B. Kahng (UCSD) I. Mandoiu (UCLA/UCSD) S. Muddu (Sanera Systems) A. Zelikovsky (Georgia State)
Outline • VLSI design motivation • Global routing via buffer-blocks • Separable packing ILP formulations • PTAS for separable packing LPs • Analysis • Experimental results
VLSI Global Routing [Figure: buffered routing through buffer blocks]
Problem Formulation Global Routing via Buffer-Blocks (GRBB) Problem Given: • BB locations and capacities • List of multi-pin nets • Upper bound on the number of buffers for each source-sink path • Lower/upper (L/U) bounds on the wirelength between consecutive buffers/pins Find: • Buffered routing of a maximum number of nets subject to the given constraints
Enforcing Parity Constraints • Inverting buffers change the polarity of the signal • Each sink has a given polarity requirement • Parity constraints for the #buffers on each routed source-sink path • A path may use two buffers in the same buffer block Integer program changes • Split each BB vertex r of G into two copies, r' and r'' • Impose capacity constraint on the sets of vertices {r', r''}
Combining with compaction Set capacity constraints: cap(BB1) + cap(BB2) ≤ const.
GRBB with Buffer Library • Discrete buffer library: different buffer sizes/driving strengths • Need to allocate BB capacity between different buffer types Integer program changes • Replace each BB vertex r of G by a set X(r) of vertices (one for each buffer type) • Modify edge set of G to take into account non-uniform driving strengths • Impose capacity constraint on the sets of vertices X(r)
“Relax+Round” Approach to GRBB • Solve the fractional relaxation • Exact linear programming algorithms are impractical for large instances • KEY IDEA: use an approximation algorithm • allows fine-tuning the tradeoff between runtime and solution quality • Round to integer solution • Provably good rounding [RT87] • Practical runtime (random-walk based)
Previous Work • MCF and packing/covering LP approximation: [FGK73,SM90, PST91,G92,GK94,KPST94,LMPSTT95,R95,Y95,GK98,F00,…] • Exponential length function to model flow congestion [SM90] • Shortest-path augmentation + final scaling [Y95] • Modified routing increment [GK98] • Fewer shortest-path augmentations [F00] • We extend speed-up idea of [F00] to separable packing LPs
Separable Packing LP Algorithm
w(X) ← δ for every X, f ≡ 0, α = δ
For i = 1 to N do
    For k = 1, …, #nets do
        Find min weight feasible Steiner tree T for net k
        While weight(T) < min{ 1, (1+ε)α } do
            f(T) = f(T) + 1
            For every X do w(X) ← ( 1 + ε π(T,X)/cap(X) ) * w(X)
            Find min weight feasible Steiner tree T for net k
        End While
    End For
    α = (1+ε)α
End For
Output f/N
(π(T,X) denotes the number of vertices of T in the set X)
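The loop structure of the algorithm can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: ε, δ and α are written `eps`, `delta` and `alpha`; each net comes with an explicit list of candidate trees (a stand-in for the min-weight feasible Steiner tree oracle); each tree is a dict mapping a capacity set X to the tree's usage of X; and the weight of a tree is assumed to be the usage-weighted sum of w(X). The function name and data layout are hypothetical.

```python
def separable_packing_lp(nets, cap, eps, delta, n_iters):
    """Sketch of the multiplicative-weights loop on the slide.

    nets:    list (one entry per net) of candidate trees; each tree is a
             dict {X: usage of set X by the tree} (assumed representation).
    cap:     dict {X: capacity of set X}, all capacities positive.
    Returns  dict {(net index, tree index): fractional flow}.
    """
    w = {X: delta for X in cap}   # w(X) starts at delta
    f = {}                        # f(T), stored sparsely
    alpha = delta                 # lower bound on the weight of any tree

    def weight(tree):
        # Assumed tree weight: sum over sets of usage * w(X).
        return sum(use * w[X] for X, use in tree.items())

    def min_tree(k):
        # Stand-in oracle: scan the explicit candidate list of net k.
        t = min(range(len(nets[k])), key=lambda i: weight(nets[k][i]))
        return t, nets[k][t]

    for _ in range(n_iters):
        for k in range(len(nets)):
            t, tree = min_tree(k)
            while weight(tree) < min(1.0, (1 + eps) * alpha):
                f[(k, t)] = f.get((k, t), 0.0) + 1.0
                for X, use in tree.items():
                    w[X] *= 1 + eps * use / cap[X]
                t, tree = min_tree(k)
        alpha *= 1 + eps
    # Final scaling by the number of iterations, as on the slide.
    return {key: val / n_iters for key, val in f.items()}
```

Only the set-capacity constraints are modeled here. A limit of one routing per net could be expressed, as an assumption, by giving each net its own set of capacity 1 that every candidate tree of that net uses once.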
Runtime • Choose #iterations N such that all feasible trees have weight ≥ 1 after N iterations (i.e., α ≥ 1) • Tree weight lower bound α is δ initially, and is multiplied by (1+ε) in each iteration Dual LP:
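The choice of N can be made concrete with a small helper. Writing δ for the initial lower bound on tree weight and ε for the per-iteration growth parameter, N is the smallest integer with δ(1+ε)^N ≥ 1. The sketch below (the function name is hypothetical) finds it by repeated multiplication rather than a logarithm, to avoid floating-point rounding at exact powers; it assumes ε > 0 and δ > 0.

```python
def num_iterations(eps, delta):
    """Smallest N with delta * (1 + eps)**N >= 1 (assumes eps > 0, delta > 0)."""
    alpha, n = delta, 0
    while alpha < 1.0:
        alpha *= 1.0 + eps   # the lower bound grows by (1 + eps) per iteration
        n += 1
    return n
```

For example, `num_iterations(0.1, 0.01)` counts how many outer iterations are needed before the lower bound on tree weight reaches 1 when it starts at 0.01 and grows by 10% per iteration.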
Approximation Guarantee Theorem: For every ε < 0.15, the algorithm finds a factor 1/(1+4ε) approximation by choosing δ = …, where L is the maximum number of vertices in a feasible Steiner tree. For this value of δ, the running time is …
Provably Good Rounding • Store fractional flows f(T) for every feasible Steiner tree T • Scale down each f(T) by 1-ε for small ε • Each net k routed with prob. f(k) = Σ{ f(T) | T feasible for k } • Number of routed nets ≥ (1-ε)OPT • To route net k, choose tree T with probability f(T) / f(k) • With high probability, no BB capacity is exceeded Problem: Impractical to store all non-zero flow trees
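The rounding step can be sketched as follows. This is an illustration only, assuming the fractional flows are available as a nested dict (net to tree id to f(T)) and that f(k) ≤ 1 for every net; the function name and the data layout are hypothetical.

```python
import random

def round_nets(flows, eps, rng):
    """Randomized rounding of fractional tree flows.

    flows: {net: {tree_id: f(T)}} (assumed layout), with f(k) <= 1 per net.
    Returns {net: tree_id} for the nets that end up routed.
    """
    routed = {}
    for k, trees in flows.items():
        # Scale down by (1 - eps) and drop trees that carry no flow.
        scaled = {t: (1 - eps) * f for t, f in trees.items() if f > 0}
        fk = sum(scaled.values())
        # Route net k with probability f(k).
        if fk > 0 and rng.random() < fk:
            ids = list(scaled)
            # Pick tree T with probability f(T) / f(k).
            routed[k] = rng.choices(ids, weights=[scaled[t] for t in ids])[0]
    return routed
```

Passing an explicit `random.Random` instance as `rng` keeps runs reproducible, which helps when comparing completion rates across rounding attempts.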
Random-Walk 2-TMCF Rounding • Store fractional flows f(T) for every valid routing tree T • Scale down each f(T) by 1-ε for small ε • Each net k routed with prob. f(k) = Σ{ f(T) | T routing for k } • Number of routed nets ≥ (1-ε)OPT • To route net k, choose tree T with probability f(T) / f(k) • With high probability, no BB capacity is exceeded Practical: use random walk from source to sink, which requires storing only flows on edges
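A random walk over edge flows can stand in for sampling from an explicit list of paths. The sketch below assumes that the per-net flow on each directed edge is stored in a dict, that this flow is acyclic, and that it is conserved at intermediate vertices, so the walk always reaches the sink; the function name and representation are hypothetical.

```python
import random

def random_walk_path(edge_flow, source, sink, rng):
    """Sample a source-to-sink path from per-edge flows.

    edge_flow: {(u, v): flow of this net on directed edge u -> v}
    (assumed acyclic and conserved at intermediate vertices).
    """
    # Group positive-flow out-edges by tail vertex.
    out = {}
    for (u, v), fl in edge_flow.items():
        if fl > 0:
            out.setdefault(u, []).append((v, fl))
    path = [source]
    node = source
    while node != sink:
        heads, flows = zip(*out[node])
        # Step along an out-edge with probability proportional to its flow.
        node = rng.choices(heads, weights=flows)[0]
        path.append(node)
    return path
```

Under these assumptions each path is drawn with probability proportional to the flow it carries in a path decomposition of the edge flow, which is why only edge flows need to be stored.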
Random-Walk MTMCF Rounding [Figure: source S and sinks T1, T2, T3]
The MTMCF Rounding Heuristic • Round each net k with probability f(k), using backward random walks • No scaling-down, approximate MTMCF < OPT • Resolve capacity violations by greedily deleting routed paths • Few violations • Greedily route remaining nets using unused BB capacity • Further routing still possible
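The violation-repair step might look like the sketch below. The slide does not say which routed paths are deleted first, so the greedy rule used here (drop the net with the most buffers in overloaded buffer blocks) is an assumption, as are the function name and the data layout.

```python
def resolve_violations(routes, cap):
    """Delete routed nets until no buffer block (BB) exceeds its capacity.

    routes: {net: list of BBs used, one entry per buffer} (assumed layout).
    cap:    {BB: capacity}.
    Returns the routes that are kept.
    """
    load = {}
    for bbs in routes.values():
        for b in bbs:
            load[b] = load.get(b, 0) + 1
    kept = dict(routes)

    def overloaded():
        return {b for b, used in load.items() if used > cap[b]}

    bad = overloaded()
    while bad:
        # Assumed greedy rule: remove the net with the most buffers in
        # overloaded BBs (ties broken by insertion order).
        victim = max(kept, key=lambda k: sum(b in bad for b in kept[k]))
        for b in kept.pop(victim):
            load[b] -= 1
        bad = overloaded()
    return kept
```

Nets removed here would then go back into the pool that the final greedy pass tries to route using unused BB capacity.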
Implemented Heuristics • Greedy buffered routing: • For each net, route sinks sequentially along shortest paths to source or node already connected to source • After routing a net, remove fully used BBs • Generalized MCF approximation + randomized rounding • G2TMCF • G3TMCF (3-pin decomposition) • G4TMCF (4-pin decomposition) • GMTMCF (no decomposition, approximate DRST)
Experimental Setup • Test instances extracted from next-generation SGI microprocessor • Up to 5,000 nets, ~6,000 sinks • U=4,000 µm, L=500-2,000 µm • 50 buffer blocks • 200-400 buffers / BB
Conclusions and Ongoing Work • Provably good algorithms and practical heuristics based on separable packing LP approximation • Higher completion rates than previous algorithms • Extensions: • Combine global buffering with BB planning • Buffer “site” methodology tile graph • Routing congestion (channel capacity constraints) • Simultaneous pin assignment
Resource Usage #nets = 4,764 #sinks = 6,038 400 buffers/BB
Resource Usage for 100% Completion #nets = 4,764 #sinks = 6,038 MTMCF wastes routing resources!