
Automating Resource Optimisation in Reconfigurable Design

Xinyu Niu, Thomas C.P. Chau, Qiwei Jin, Wayne Luk and Qiang Liu
{nx210, cpc10, qj04, wl}@doc.ic.ac.uk, qiangliu@tju.edu.cn


Presentation Transcript


  1. Automating Resource Optimisation in Reconfigurable Design

Xinyu Niu, Thomas C.P. Chau, Qiwei Jin, Wayne Luk and Qiang Liu
{nx210, cpc10, qj04, wl}@doc.ic.ac.uk, qiangliu@tju.edu.cn

ABSTRACT

New design approach: automatically identify and exploit run-time reconfiguration to optimise resource utilisation.
Configuration Data Flow Graph: a hierarchical graph structure for synthesising reconfigurable designs in three steps.
Evaluation: barrier option pricing (finance), particle filter (robotics), reverse time migration (oil and gas).
Improvement: 1.61 to 2.19 times faster than optimised static FPGA designs, up to 28.8 times faster than optimised CPU reference designs, and 1.55 times faster than optimised GPU designs; up to 29 times more energy efficient than CPU/GPU.

Motivating Example

Static design: accommodates all functions needed to accomplish the application.
Dynamic design: reconfigures the design dynamically and implements only the active functions.
[Figure: motivating example. Surviving labels: 4n, 3n, 2n, n and 7n/3, 8n/3, 2n, n.]

Design Flow

Design issues are addressed at multiple levels:
  Function level: extract function information, assign data dependency levels.
  Configuration level: generate configurations based on design rules, optimise configurations.
  Partition level: generate partitions recursively, select the optimal run-time solutions.
[Figure: design flow. Surviving labels: Algorithm level, Function level.]

Function Level

Function properties extracted from algorithm details: resource consumption, bandwidth requirement, memory architecture, data dependency.
Function nodes are assigned:
  As Late As Possible (ALAP) levels, based on interactions between function nodes.
  As Timely As Possible (ATAP) levels, based on data dependency inside a function.

Configuration Level

Generate configurations based on the assigned ALAP and ATAP levels.
Optimise configurations to fully utilise available resources.
Adopt analysis of resource consumption and bandwidth in the design model.

Partition Level

Group configurations into partitions by mapping configurations into a Configuration Graph.
Assign design rules to build the search space.
Generate valid partitions within the pruned search space using a recursive search algorithm.
Scalable design method:
  Design rules eliminate redundant search space.
  The design algorithm explores only the valid design space.
[Figure panels: Exploring original search space; Exploring pruned search space.]

Results

Benchmarks: Barrier Option Pricing (BOP), Particle Filter (PF), Reverse-Time Migration (RTM).
FPGA devices: 4 Xilinx Virtex-6 SX475T FPGAs, hosted in a Maxeler MPC-C500 computing node, running at 100 MHz.

Improvement over static designs:
  1.94, 2.19 and 1.61 times faster for barrier option pricing, particle filter and reverse time migration respectively.
  Static design: 34% to 59% of theoretical performance.
  Reconfiguration removing idle functions: 98% of theoretical performance.
  Inefficiency of the dynamic design: due to reconfiguration overhead.

Improvement over optimised CPU and GPU designs:
  CPU: 24 Intel Xeon X5660 cores running at 2.67 GHz.
  GPU: an NVIDIA Tesla C2070 card running at 1.15 GHz, linearly scaled by 4 for comparison with multi-FPGA designs.
  Up to 27 times faster than CPU, 1.55 times faster than GPU.
  Up to 29 times more energy efficient than CPU and GPU designs.

The proposed approach is applicable to multi-chip environments.
Scalability: limited by reconfiguration overhead, which can be overcome by parallel reconfiguration of multiple FPGAs.

Acknowledgement: This work was supported in part by UK EPSRC, by the European Union Seventh Framework Programme under Grant agreement numbers 257906, 287804 and 318521, by the HiPEAC NoE, by the Maxeler University Programme, and by Xilinx.
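
The poster does not spell out how ALAP levels are computed or how they turn into configurations. As a minimal sketch, assuming the standard as-late-as-possible levelling used in high-level synthesis scheduling, the function-level step might look like the following. The names `alap_levels` and `group_by_level` and the example graph are hypothetical, chosen only for illustration. ATAP levels, resource consumption and bandwidth analysis are left out, so this is not the authors' implementation.

```python
from collections import defaultdict


def alap_levels(succs):
    """Assign an ALAP level to every function node of a dependency DAG.

    succs maps each node to the nodes that consume its output.
    Sinks get the last level; every other node sits one level
    before its earliest consumer (as late as possible).
    """
    nodes = set(succs) | {s for ss in succs.values() for s in ss}
    if not nodes:
        return {}

    depth = {}  # longest path, in edges, from a node to any sink

    def longest_to_sink(n):
        if n not in depth:
            depth[n] = 1 + max(
                (longest_to_sink(s) for s in succs.get(n, ())), default=-1
            )
        return depth[n]

    for n in nodes:
        longest_to_sink(n)

    last = max(depth.values())
    return {n: last - depth[n] for n in nodes}


def group_by_level(levels):
    """Group nodes that share a level, as candidate initial configurations.

    This grouping rule is an assumption of the sketch.
    """
    groups = defaultdict(list)
    for node, level in levels.items():
        groups[level].append(node)
    return [sorted(groups[level]) for level in sorted(groups)]


# Hypothetical function-level graph: A and B feed C; C and E feed D.
graph = {"A": ["C"], "B": ["C"], "C": ["D"], "E": ["D"], "D": []}
print(group_by_level(alap_levels(graph)))  # [['A', 'B'], ['C', 'E'], ['D']]
```

Node E has no predecessors, so an as-soon-as-possible schedule would place it in the first group. ALAP delays it to the level just before its consumer D, which keeps it out of a configuration until its output is needed.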
