Timing-Driven Placement for Heterogeneous FPGA

Timing-Driven Placement for Heterogeneous FPGA • Bo Hu, Velogix Inc. • ICCAD 06

Outline • Introduction • Problem • Multi-layer Density System • Timing-Driven Placement • Experiments • Conclusions

Introduction • A traditional homogeneous FPGA is built mainly from programmable look-up tables (LUTs). Its logic density and performance are usually inferior to those of an ASIC. • Heterogeneous FPGAs now integrate more ASIC-like dedicated functional blocks. • This mitigates the overall density and performance disadvantages in modern FPGAs.

Introduction • A simplified example of a heterogeneous FPGA: it consists of a two-dimensional array of Basic Process Units (BPUs). Each BPU contains a two-dimensional array of LUTs, a computing unit (CU) and a memory block.

Introduction • Given a netlist of design components, the task of a timing-driven placer is to assign each component (a single LUT or a complex functional block) to a proper location on the FPGA chip. • The inputs to an analytical placer are a graph representing the design netlist and a region specifying where the netlist should be placed. • Each node in the graph is assigned a geometric shape. • The non-overlapping requirement is handled through a density function D(x,y). Since D(x,y) is a two-dimensional function, we call it a single-layer density system.

Introduction • Suppose that the design to be placed consists of only CUs and LUTs. A large CU geometric shape forms a blockage for memory blocks and LUTs.

Introduction • A small CU geometric shape causes congestion: a group of CU components might be placed close together in a local region where not enough CU resources are available.

Problem • In general, computing units and memory blocks are distributed much more sparsely than LUTs. • A single density layer cannot satisfy the distribution requirements of different architectural resources simultaneously. • Solution: create one density layer for each architectural resource, giving a multi-layer density system.

Multi-layer Density System • A computational block (CB) is a pre-designed functional block implemented using the resources available on the chip. The example CB shown below consists of 3 CUs, 3 memory blocks and 12 LUTs, placed relative to one another within a 2x2 BPU region.

Multi-layer Density System • Before a node is mapped to a density layer, we need to first determine its geometric shape. • Complex shape for CU density layer.

Multi-layer Density System • Complex shape for memory block density layer.

Multi-layer Density System • Complex shape for LUT density layer.

Multi-layer Density System • With the new multi-layer density system, a heterogeneous placement task is translated into a set of homogeneous ones, each handled on its own density layer.
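The per-layer idea can be sketched in a few lines of Python. This is a minimal illustration, not the placer's actual data model: the function `layer_densities`, the footprint encoding, the LUT capacity of 4 per BPU, and the exact BPUs occupied inside the 2x2 block are all assumptions made for the example. Only the resource totals (3 CUs, 3 memory blocks, 12 LUTs) come from the CB example.

```python
from collections import defaultdict

# Assumed per-BPU capacities. One CU and one memory block per BPU follows the
# simplified architecture; the LUT count of 4 is a made-up value.
CAPACITY = {"CU": 1, "MEM": 1, "LUT": 4}


def layer_densities(nodes, capacity=CAPACITY):
    """Return {resource: {(col, row): demand / capacity}} for placed nodes.

    Each node is (origin, footprint), where footprint maps a resource name to
    {(dx, dy): units used in the BPU at that offset from the origin}.
    Every resource gets its own independent density layer.
    """
    demand = {res: defaultdict(int) for res in capacity}
    for (ox, oy), footprint in nodes:
        for res, cells in footprint.items():
            for (dx, dy), units in cells.items():
                demand[res][(ox + dx, oy + dy)] += units
    return {
        res: {bpu: used / capacity[res] for bpu, used in cells.items()}
        for res, cells in demand.items()
    }


# A CB with 3 CUs, 3 memory blocks and 12 LUTs in a 2x2 BPU region.
# The per-BPU split below is hypothetical.
CB = {
    "CU": {(0, 0): 1, (1, 0): 1, (0, 1): 1},
    "MEM": {(0, 0): 1, (1, 0): 1, (1, 1): 1},
    "LUT": {(0, 0): 4, (1, 0): 4, (0, 1): 2, (1, 1): 2},
}
SINGLE_CU = {"CU": {(0, 0): 1}}

densities = layer_densities(
    [((0, 0), CB), ((1, 1), SINGLE_CU), ((0, 1), SINGLE_CU)]
)
print(densities["CU"])   # the BPU at (0, 1) is over capacity on the CU layer
print(densities["LUT"])  # the LUT layer is unaffected by the extra CUs
```

In this toy placement only the CU layer reports an overfull BPU, so only CU nodes would need to be spread; the memory and LUT layers can be judged separately.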

Timing-Driven Placement • Expansion basics • Based on the fixed-point addition technique. • In an analytical placement formulation, nodes tend to cluster together because of the intrinsic attracting forces induced by connections. • A connection with a larger weight and a longer length induces a stronger intrinsic force. • Fixed points apply additional attracting forces to nodes; these work against the intrinsic forces and pull the nodes away from high-density areas. • An expansion-based placer runs a sequence of expansion iterations and stops when the density distribution satisfies preset criteria.
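To make the effect of fixed points concrete, here is a minimal one-dimensional sketch. It assumes a quadratic wirelength objective solved by Gauss-Seidel sweeps, which is a common analytical formulation but is not confirmed as the one used here. The function name `solve_quadratic_1d`, the spring weights, and the fixed-point positions are hypothetical; how a real placer chooses fixed-point locations from the density map is not covered.

```python
def solve_quadratic_1d(n, edges, pads, fixed_points=(), sweeps=200):
    """Gauss-Seidel minimiser of sum w * (x_i - x_j)^2 in one dimension.

    edges:        (i, j, w) springs between movable nodes i and j
    pads:         (i, pos, w) springs from node i to a fixed pad position
    fixed_points: (i, pos, w) extra springs added by an expansion step
    """
    anchors = list(pads) + list(fixed_points)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            num = 0.0
            den = 0.0
            for a, b, w in edges:
                if a == i:
                    num += w * x[b]
                    den += w
                elif b == i:
                    num += w * x[a]
                    den += w
            for a, pos, w in anchors:
                if a == i:
                    num += w * pos
                    den += w
            if den > 0:
                # Force-balance position: weighted mean of everything attached.
                x[i] = num / den
    return x


# Three mutually connected nodes, each also tied to a pad at x = 5.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
pads = [(i, 5.0, 1.0) for i in range(3)]

# Intrinsic forces only: every node collapses onto x = 5.
clustered = solve_quadratic_1d(3, edges, pads)

# One fixed point per node, placed to the left, centre and right.
fixed = [(0, 2.0, 1.0), (1, 5.0, 1.0), (2, 8.0, 1.0)]
expanded = solve_quadratic_1d(3, edges, pads, fixed)

print(clustered)
print(expanded)
```

Without fixed points the three nodes sit on top of each other. With them, the outer nodes move apart while the connections still hold the group together, which is the balance an expansion iteration relies on.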

Timing-Driven Placement • Density • d(b): the density at bin b. • A(b,n): the intersection area between bin b and node n. • A(b): the area of bin b.
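Read together, these definitions suggest that the density of a bin is the total node area overlapping it divided by the bin area, d(b) = Σ_n A(b,n) / A(b). That formula is an inference from the symbol list, not a quotation. A minimal sketch under that assumption, with axis-aligned rectangles and hypothetical function names:

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)


def bin_density(bin_rect, node_rects):
    """Assumed form d(b) = sum over nodes n of A(b, n), divided by A(b)."""
    bin_area = (bin_rect[2] - bin_rect[0]) * (bin_rect[3] - bin_rect[1])
    covered = sum(overlap_area(bin_rect, n) for n in node_rects)
    return covered / bin_area


bin_b = (0.0, 0.0, 2.0, 2.0)
nodes = [
    (0.0, 0.0, 1.0, 1.0),  # fully inside the bin
    (1.0, 1.0, 3.0, 3.0),  # overlaps one corner of the bin
    (5.0, 5.0, 6.0, 6.0),  # outside the bin
]
density = bin_density(bin_b, nodes)
print(density)
```

In a multi-layer system the same computation would be run once per resource layer, each with its own node shapes.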

Timing-Driven Placement

Timing optimization • Wp[j] and Wp[j-1] are the weights of connection p at the j-th and (j-1)-th expansions. • f[j] is the adjustment factor at the j-th expansion.

Timing optimization • f0[j]: the preset maximum adjustment factor at the j-th iteration; f0[0] = 1, and it gradually approaches zero. • Sp: the timing slack on connection p. • Sworst: the worst slack. • ε: a preset value used to decide whether a connection is critical. • lp: the current length of connection p. • lpmin and lpmax: the minimum and maximum length of p.
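The update equations that combine these symbols are not given in the text, so the sketch below shows only one hypothetical shape that such a scheme could take. Everything in it is an assumption made for illustration: the geometric decay of f0[j], the criticality test "slack within ε of the worst slack", the linear length term, and the multiplicative weight update. It should not be read as the author's formula.

```python
def max_factor(j, decay=0.8):
    """Assumed schedule for f0[j]: equals 1 at j = 0 and decays toward zero."""
    return decay ** j


def adjustment_factor(f0_j, slack, worst_slack, eps, length, l_min, l_max):
    """Hypothetical f[j] for one connection, bounded above by f0[j]."""
    # Assumed criticality test: only connections whose slack lies within
    # eps of the worst slack receive any adjustment.
    if slack > worst_slack + eps:
        return 0.0
    # Assumed length term: where the current length sits in [l_min, l_max],
    # so longer critical connections are pulled harder.
    if l_max <= l_min:
        stretch = 1.0
    else:
        stretch = min(max((length - l_min) / (l_max - l_min), 0.0), 1.0)
    return f0_j * stretch


def update_weight(prev_weight, f_j):
    """Assumed multiplicative update: W_p[j] = W_p[j-1] * (1 + f[j])."""
    return prev_weight * (1.0 + f_j)


# A critical connection (slack equal to the worst slack) at half stretch.
f_critical = adjustment_factor(0.5, -2.0, -2.0, 0.5, 6.0, 2.0, 10.0)
w_critical = update_weight(1.0, f_critical)

# A connection with comfortable slack keeps its weight.
f_relaxed = adjustment_factor(0.5, 1.0, -2.0, 0.5, 6.0, 2.0, 10.0)
w_relaxed = update_weight(1.0, f_relaxed)

print(f_critical, w_critical, f_relaxed, w_relaxed)
```

Whatever the exact formula, a bound f0[j] that shrinks over the iterations would let early expansions react strongly to timing and later ones settle, which matches the description of f0[j] starting at 1 and approaching zero.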

Experiments

Conclusion • Multi-Layer density system for heterogeneous FPGA placement. • The timing-driven placement algorithm can handle complex placement requirements inherent in heterogeneous FPGAs.
