
CS244-Introduction to Embedded Systems and Ubiquitous Computing

Instructor: Eli Bozorgzadeh, Computer Science Department, UC Irvine, Winter 2010. CS244 – Lecture 5: Hardware/Software Co-design.


Presentation Transcript


  1. CS244: Introduction to Embedded Systems and Ubiquitous Computing
     Instructor: Eli Bozorgzadeh
     Computer Science Department, UC Irvine
     Winter 2010

  2. CS244 – Lecture 5: Hardware/Software Co-design
     Winter 2010 - CS 244

  3. Review: Design Objectives
     • Cost: improving cost is desired
     • Performance: improving performance beyond the threshold is a waste
     • Quality: improving quality beyond the threshold is desired
     [Figure: cost, performance, and quality axes, each marked with a "better" direction and thresholds]

  4. Co-design Flow
     [Flow diagram. Labels: Informal Specification; System Model; System Simulation; Algorithmic Design; Hardware/Software Partitioning; Partitioned Model; Schedule; Partitioned Model & Sch.; HW/SW Co-simulation; Refine (feedback)]

  5. Co-design Flow
     [Flow diagram, continued. Labels: Partitioned Model + Sch.; Communication Synthesis; Software Model; Hardware Model; HW/SW Co-simulation; Compilation; Synthesis; Binary Exec. Model; Gate-level Model; HW/SW Co-simulation; Refine (feedback)]

  6. Co-design Flow
     [Flow diagram, continued. Labels: Binary Exec. Model; Gate-level Model; Emulate or Prototype; Fabrication; Refine (feedback)]

  7. Informal Specification & System Level Model
     • The informal specification loosely defines the high-level behavior, constraints, and optimization objectives of the system
       • Algorithmic and implementation details are absent
       • Performance estimates are not present
     • The system-level model formally captures behavior, constraints, and optimization objectives
       • Can be simulated to obtain early performance estimates
       • Feedback is used to refine the system specification
       • Can serve as a golden model for validating intermediate or final stages
     • Algorithmic design

  8. Hardware Software Partitioning
     [Figure: F → {F1, F2, F3 … FN} → P1, P2, P3 … PM]
     • Decompose (i.e., partition) the function F of the system into N sub-functions F1, F2, F3 … FN
     • Decompose the constraints and design objectives of the system into sub-constraints and design sub-objectives
     • Cluster F1, F2, F3 … FN into M partitions to run on M processors

  9. Scheduling
     • Scheduling obtains an execution sequence such that dependencies are obeyed
     • Static
       • The schedule is fixed at design time (the common case)
     • Dynamic
       • The schedule is determined at execution time (reconfigurable computing)
     [Figure: dependency graph over F1 … F8, with the schedule below]
     P1: F1 → F2 → F8
     P2: F4 → F5
     P3: F3 → F6
     P4: F7

  10. Scheduling
     • A deadline D for the entire schedule
     • An execution time Ti for each Fi
     • ASAP (as soon as possible)
     • ALAP (as late as possible)
     [Figure: dependency graph over F1 … F8 annotated with execution times 3, 3, 6, 4, 2, 3, 1, 3 (the node-to-time mapping is lost in this transcript)]
     P1: F1 → F2 → F8
     P2: F4 → F5
     P3: F3 → F6
     P4: F7
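The ASAP and ALAP start times named on this slide can be sketched as follows. This is a minimal illustration, not the lecture's own algorithm: it assumes unlimited processors (only dependencies constrain the schedule), and the four-task graph, its execution times, and the deadline of 14 are hypothetical values, since the slide's figure is not recoverable here.

```python
from collections import defaultdict

def asap_alap(exec_time, edges, deadline):
    """Return (asap, alap) start times for every task of a dependency DAG.

    exec_time: task -> execution time Ti
    edges:     (u, v) pairs meaning "v depends on u"
    deadline:  deadline D for the whole schedule
    Assumes unlimited processors; only dependencies are enforced.
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm).
    indeg = {t: len(preds[t]) for t in exec_time}
    ready = [t for t in exec_time if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    if len(order) != len(exec_time):
        raise ValueError("dependency graph has a cycle")

    # ASAP: start as soon as every predecessor has finished.
    asap = {}
    for t in order:
        asap[t] = max((asap[p] + exec_time[p] for p in preds[t]), default=0)

    # ALAP: finish as late as the earliest-starting successor (or D) allows.
    alap = {}
    for t in reversed(order):
        latest_finish = min((alap[s] for s in succs[t]), default=deadline)
        alap[t] = latest_finish - exec_time[t]
    return asap, alap

# Hypothetical example graph (not the one in the slide's figure).
exec_time = {"F1": 3, "F2": 3, "F3": 1, "F4": 6}
edges = [("F1", "F2"), ("F1", "F3"), ("F2", "F4"), ("F3", "F4")]
asap, alap = asap_alap(exec_time, edges, deadline=14)
slack = {t: alap[t] - asap[t] for t in exec_time}
```

The difference between the ALAP and ASAP start times is a task's slack; tasks with the smallest slack are the ones a scheduler has the least freedom to move.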

  11. Partitioning (Clustering)
     • Given:
       • F = { F1, F2, F3 … FN }
       • P = { P1, P2, P3 … PM }
     • Find a lowest-cost partition (cluster), as computed by an objective function
     • Exhaustive approach: O(M^N), since each of the N functions can be assigned to any of the M processors
     • Heuristics
       • Constructive partitioning (based on a closeness function)
         • Random (good for seeding iterative approaches)
         • Cluster growth
         • Hierarchical clustering
       • Iterative partitioning
         • Start with a partition and improve
         • Gradient search
         • Controlled random search
       • Modified Kernighan/Lin and FM algorithm
         • Partitions a set of nodes (functions) into two bins (processors)
         • Minimizes edges between bins (communication cost, wires, etc.)
         • Cost function for moving a node from one partition to another
       • ILP
       • Genetic evolution
       • Simulated annealing
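To make the cost of the exhaustive approach concrete, the sketch below enumerates every assignment of N functions to M processors and keeps the cheapest one. The objective function (communication weight cut between processors plus the load of the busiest processor) and all numbers are assumptions chosen for illustration; the slides do not define a specific objective.

```python
from itertools import product

def exhaustive_partition(num_functions, num_procs, cost):
    """Try all num_procs ** num_functions assignments; return the cheapest.

    An assignment is a tuple where entry i is the processor of function i.
    """
    best, best_cost, evaluated = None, float("inf"), 0
    for assignment in product(range(num_procs), repeat=num_functions):
        evaluated += 1
        c = cost(assignment)
        if c < best_cost:
            best, best_cost = assignment, c
    return best, best_cost, evaluated

# Hypothetical problem data: communication weights and per-function load.
comm = {(0, 1): 5, (1, 2): 1, (2, 3): 4}
load = [2, 2, 2, 2]
NUM_PROCS = 2

def cost(assignment):
    """Assumed objective: cut communication + load of the busiest processor."""
    cut = sum(w for (i, j), w in comm.items() if assignment[i] != assignment[j])
    per_proc = [0] * NUM_PROCS
    for f, p in enumerate(assignment):
        per_proc[p] += load[f]
    return cut + max(per_proc)

best, best_cost, evaluated = exhaustive_partition(len(load), NUM_PROCS, cost)
```

With N = 4 and M = 2 this evaluates only 16 assignments, but the count grows as M^N, which is why the heuristics listed on the slide are used for realistic sizes.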

  13. Iterative Partitioning Algorithms
     • The computation time in an iterative algorithm is spent evaluating large numbers of partitions
     • Iterative algorithms differ from one another primarily in how they modify the partition and in how they accept or reject bad modifications

  14. Kernighan-Lin (Min-Cut) Algorithms
     • Two-way partitioning example
     • Start with 2 equal subgraphs
     • Exchange k pairs in each iteration
     • Continue until no further improvement
     • Gain function
       • f(internal – external) cost
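One pass of a Kernighan-Lin style exchange can be sketched as below. This is a simplified illustration under stated assumptions: it uses the common textbook formulation in which a node's D value is its external cost minus its internal cost (the sign convention may differ from the slide's gain function), it recomputes D values from scratch instead of updating them incrementally, and the six-node graph is hypothetical.

```python
def cut_size(w, part_a, part_b):
    """Total weight of edges crossing between the two parts."""
    return sum(w.get(frozenset((a, b)), 0) for a in part_a for b in part_b)

def kl_pass(w, part_a, part_b):
    """One Kernighan-Lin pass. Returns (new_a, new_b, total_gain)."""
    def weight(u, v):
        return w.get(frozenset((u, v)), 0)

    def d(v, own, other):
        # External cost minus internal cost of node v.
        ext = sum(weight(v, x) for x in other)
        internal = sum(weight(v, x) for x in own if x != v)
        return ext - internal

    work_a, work_b = set(part_a), set(part_b)
    locked, swaps, gains = set(), [], []
    for _ in range(min(len(part_a), len(part_b))):
        best = None
        for a in sorted(work_a - locked):
            for b in sorted(work_b - locked):
                g = d(a, work_a, work_b) + d(b, work_b, work_a) - 2 * weight(a, b)
                if best is None or g > best[0]:
                    best = (g, a, b)
        g, a, b = best
        # Tentatively swap the best pair and lock both nodes.
        work_a.remove(a); work_b.add(a)
        work_b.remove(b); work_a.add(b)
        locked |= {a, b}
        swaps.append((a, b))
        gains.append(g)

    # Keep only the first k swaps, where k maximizes the cumulative gain.
    best_k, best_total, total = 0, 0, 0
    for k, g in enumerate(gains, start=1):
        total += g
        if total > best_total:
            best_k, best_total = k, total

    new_a, new_b = set(part_a), set(part_b)
    for a, b in swaps[:best_k]:
        new_a.remove(a); new_a.add(b)
        new_b.remove(b); new_b.add(a)
    return new_a, new_b, best_total

# Hypothetical graph: two triangles {1,2,3} and {4,5,6} joined by edge 3-4.
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
w = {frozenset(e): 1 for e in edges}
new_a, new_b, gain = kl_pass(w, {1, 2, 4}, {3, 5, 6})
```

In a full implementation the pass would be repeated, starting each time from the improved partition, until a pass yields no positive gain, which matches the "continue until no further improvement" step.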

  15. Hierarchical Clustering – Example
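Since the slide's figure did not survive, here is a small sketch of what hierarchical (agglomerative) clustering might look like for function partitioning: start with one cluster per function and repeatedly merge the two closest clusters until M remain. The closeness values and the choice of average closeness between cluster members are assumptions for illustration, not taken from the lecture.

```python
def hierarchical_cluster(nodes, closeness, num_clusters):
    """Merge the two closest clusters until num_clusters remain.

    closeness: frozenset({a, b}) -> closeness value (missing pairs count as 0).
    Cluster-to-cluster closeness is the average over member pairs (assumed).
    """
    clusters = [frozenset([n]) for n in nodes]

    def link(c1, c2):
        total = sum(closeness.get(frozenset((a, b)), 0) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the highest closeness.
        _, i, j = max(
            ((link(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical closeness values (e.g., amount of data exchanged).
closeness = {
    frozenset(("F1", "F2")): 10,
    frozenset(("F3", "F4")): 8,
    frozenset(("F2", "F3")): 2,
}
result = hierarchical_cluster(["F1", "F2", "F3", "F4"], closeness, num_clusters=2)
```

Recording the order of merges gives the cluster tree (dendrogram); cutting it at different levels yields partitions with different numbers of clusters.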

  16. Clustering w/ several criteria

  17. Alternate Partitioning Techniques
     • Software-oriented partitioning: start with all functionality in software and move into hardware those portions that are time-critical and cannot be allocated to software
     • Hardware-oriented partitioning: start with all functionality in hardware and move portions into a software implementation
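Software-oriented partitioning can be sketched as a greedy loop: begin with everything in software and move functions to hardware until a timing constraint is met. The selection rule (largest time saved per unit of hardware area), the sequential-execution timing model, and all numbers below are assumptions for illustration; the slides do not prescribe a particular rule.

```python
def software_oriented_partition(funcs, deadline):
    """Greedy software-oriented partitioning sketch.

    funcs: name -> (sw_time, hw_time, hw_area), with hw_area > 0
    Assumes functions run sequentially, so total time is a simple sum.
    Returns (set of functions moved to hardware, resulting total time).
    """
    in_hw = set()
    total = sum(sw for sw, _, _ in funcs.values())

    def benefit(name):
        sw, hw, area = funcs[name]
        return (sw - hw) / area  # time saved per unit of hardware area

    while total > deadline:
        candidates = [n for n in funcs
                      if n not in in_hw and funcs[n][0] > funcs[n][1]]
        if not candidates:
            raise ValueError("deadline cannot be met even with all speedups")
        pick = max(candidates, key=benefit)
        sw, hw, _ = funcs[pick]
        total -= sw - hw
        in_hw.add(pick)
    return in_hw, total

# Hypothetical per-function estimates: (software time, hardware time, area).
funcs = {
    "F1": (10, 2, 4),
    "F2": (6, 1, 5),
    "F3": (4, 3, 1),
    "F4": (8, 2, 2),
}
in_hw, total = software_oriented_partition(funcs, deadline=16)
```

Hardware-oriented partitioning would mirror this loop: start with every function in hardware and move functions to software while the deadline still holds, to reduce hardware cost.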

  18. More Partitioning Issues
     • Partitioning into hardware and software affects overall system cost and performance
     • Hardware implementation
       • Provides higher performance via hardware speeds and parallel execution of operations
       • Incurs additional design expense
     • Software implementation
       • Lower performance
       • Incurs the high cost of developing and maintaining (complex) software

  19. Functional Co-simulation
     • Some of the M processors are single-purpose (e.g., those with a single function mapped onto them); others are general-purpose
     • Functions mapped onto the general-purpose processors are implemented in software and simulated on virtual machines with performance models
     • Functions mapped onto the single-purpose processors are simulated at the behavioral level with performance models
     • Communication is done via abstract channels
     • Feedback is used to refine the partitioning and scheduling tasks

  20. Communication Synthesis & Bus-accurate Co-simulation
     • Abstract channels A1, A2 … An are mapped onto a set of communication channels C1, C2 … Cm
       • Similar to functional partitioning
       • Similar to hardware/software scheduling
     • Channels correspond to physical artifacts of the architecture
     • Hardware and software models are annotated with detailed communication constructs
     • A hardware model and a software model are obtained and co-simulated
     • Communication synthesis (or possibly higher levels of design) is refined

  21. Compilation & Synthesis & Cycle-accurate Co-simulation
     • Compiler used to generate binary executables for general-purpose processors
     • Synthesis used to generate gate-level models of single-purpose processors
     • Synthesis used to generate gate-level models of general-purpose processors
     • Cycle-accurate co-simulation of the entire system
     • Note: mixed-level co-simulation is common

  22. Emulate/Prototype and Fabrication
     • Use hardware (e.g., FPGAs) to emulate a system as fast as possible (relative to real time)
     • Fabrication
       • Place & route
       • Mask design
       • Chip testing
         • Manufacturing fault models
         • Test vector generation
       • Packaging

  23. Conclusion
     • Satisfying the performance, cost, and quality metrics of a system entails hardware and software codesign
     • Partitioning is at the heart of codesign
       • Functional
       • Communication
       • Scheduling
     • Partitioning techniques
       • Constructive
       • Iterative
     • Heuristics are often used to bound the running time