Software-Hardware co-design for Real Time Systems


  1. Software-Hardware co-design for Real Time Systems Marko Bertogna ReTiS Lab. Scuola S.Anna, Pisa Marko Bertogna - Sw/Hw Co-design

  2. Introduction: Overview
     • What is Co-design?
     • Co-design typical instruments
     • VHDL
     • SystemC
     • Reconfigurable Devices
     • CSoC
     • Co-design for RT Systems

  3. Introduction: Co-design types
     • Mechanical vs electrical design
     • Analog/digital
     • Control vs computing
     • Sw/Hw
     • time vs space programming
     • centralized vs distributed computing
     • sequential vs parallel behaviour

  4. Introduction: What is a task in hardware?

     Software programming:

         c=a+b;
         result=c/2;

     Assembler expansion (5 operations):

         ldr r0,a
         ldr r1,b
         add r0,r0,r1
         mov r0,r0,LSR #1
         str r0,result

     Hardware implementation: a and b feed an adder (+) that produces c, and c feeds a shifter that produces result. All in one clock cycle!
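As a rough software-side illustration of the slide's example, the C++ sketch below computes the same result with an add followed by a one-bit right shift, mirroring the adder and shifter stages. The function name `average_shift` is hypothetical, and unsigned 32-bit operands are an assumption made so that `c / 2` and `c >> 1` agree; the sum is widened to 64 bits so the carry out of the adder is kept.

```cpp
#include <cassert>
#include <cstdint>

// Software view of the slide's task: c = a + b; result = c / 2.
// Hypothetical helper; unsigned 32-bit operands are an assumption.
std::uint32_t average_shift(std::uint32_t a, std::uint32_t b) {
    // Adder stage: widen to 64 bits so the carry is not lost.
    std::uint64_t c = static_cast<std::uint64_t>(a) + b;
    // Shifter stage: a right shift by one halves an unsigned value.
    return static_cast<std::uint32_t>(c >> 1);
}
```

In software these two stages run as separate instructions; in the hardware version they are two combinational blocks wired in series.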

  5. VHDL – Verilog
     • Very High Speed Integrated Circuit Hardware Description Language
     • formal model for the behaviour of a system
     • simulation
     • synthesis: automatic transformation; refinement from a less detailed description… down to existing components
     • design reuse

  6. VHDL features
     • Abstraction, modularity, hierarchy
     Abstraction levels:
     • Behavioural: o<=i1+i2*i3 after 100 ns
     • RTL
     • Logic: … U6: ND2 port map(A=>n3, B=>n9, Z=>I7); U7: IVP port map(A=>n13, B=>n19); U8: ND2 port map(A=>u8, B=>u1, Z=>n4); …
     • Layout

  7. VHDL synthesis steps
     • Specification (“paper and pencil”)
     • System level: behaviour
     • Logic design: all synthesis aspects
     • Gate level: mapping to ASIC library or FPGA logic blocks. Automatic synthesis → Netlist
     • Layout
     VHDL design: validation at each step!

  8. VHDL synthesis
     Behavioural VHDL → RTL VHDL → Netlist VHDL → Layout
     • Behavioural → RTL: behavioural synthesis
     • RTL → Netlist: logic synthesis (use gate libraries)
     • Netlist → Layout: placement and route
     Timing information at each level:
     • Behavioural: functional timing (“after 10s signal A switches to 1”)
     • RTL: clock, functions, events
     • Netlist: gate delays
     • Layout: path delays
     Back annotation. Timing constraint: Σ (gate delays on the longest path) < Tck
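The timing constraint on this slide can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions: each register-to-register path is given as a plain list of gate delays in nanoseconds, and the names `Path`, `critical_path_delay` and `meets_timing` are hypothetical, not part of any synthesis tool.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// A path is the list of gate delays (in ns) traversed between two registers.
using Path = std::vector<double>;

// Delay of the slowest (critical) path: the largest sum of gate delays.
double critical_path_delay(const std::vector<Path>& paths) {
    double worst = 0.0;
    for (const Path& p : paths) {
        worst = std::max(worst, std::accumulate(p.begin(), p.end(), 0.0));
    }
    return worst;
}

// The slide's constraint: sum of gate delays on the longest path < Tck.
bool meets_timing(const std::vector<Path>& paths, double tck_ns) {
    return critical_path_delay(paths) < tck_ns;
}
```

A real flow would take these delays from the back-annotated netlist rather than from hand-written lists.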

  9. VHDL structural elements
     • Entity/Architecture
     • Components
     • Configuration
     • Process
     • Library
     • Subprogram (functions and procedures)
     • Package/Package Body
     • Signals, Testbench

         entity HALFADDER is
           port(
             A, B:       in  bit;
             SUM, CARRY: out bit);
         end HALFADDER;

         architecture RTL of HALFADDER is
         begin
           SUM   <= A xor B;
           CARRY <= A and B;
         end RTL;
         -- VHDL'93: end architecture RTL ;

  10. VHDL synthesis example

          library IEEE;
          use IEEE.Std_Logic_1164.all;

          entity IF_EXAMPLE is
            port (A, B, C, X : in  std_ulogic_vector..;
                  Z          : out std_ulogic_vector..);
          end IF_EXAMPLE;

          architecture A of IF_EXAMPLE is
          begin
            process (A, B, C, X)
            begin
              if (X = "1110") then
                Z <= A;
              elsif (X = "0101") then
                Z <= B;
              else
                Z <= C;
              end if;
            end process;
          end A;

  11. VHDL optimization examples
      Refinement of two equivalent descriptions:
      • OUT1<=IN1+IN2+IN3+IN4+IN5+IN6
      • OUT2<=(IN1+IN2)+(IN3+IN4)+(IN5+IN6)
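To make the difference between the two descriptions concrete, the sketch below counts the adder levels on the critical path. It assumes that each `+` maps to one two-input adder and that the tool does not rebalance the expression on its own; the helper names `chain_depth` and `tree_depth` are hypothetical.

```cpp
#include <cassert>

// Adder levels on the critical path when n operands are summed left to
// right, as in OUT1: (((((IN1+IN2)+IN3)+IN4)+IN5)+IN6) -> n-1 in series.
unsigned chain_depth(unsigned n) {
    assert(n >= 1);
    return n - 1;
}

// Adder levels when operands are summed pairwise in a balanced tree,
// as in OUT2: each level halves the number of partial sums (rounding up).
unsigned tree_depth(unsigned n) {
    assert(n >= 1);
    unsigned levels = 0;
    while (n > 1) {
        n = (n + 1) / 2;
        ++levels;
    }
    return levels;
}
```

For the six inputs on the slide, both forms use five adders, but the chained form has five adders in series while the parenthesised form has three levels.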

  12. SystemC
      • Integration with C++
      • Provides:
        • hardware timing (clock and delay)
        • concurrency support (modules)
        • reactive behaviour (events)
        • signal-based communication support
        • new data types (logic values, bit vectors, etc.)
      • No need to translate to HDLs

  13. SystemC Design Methodology (figures: SystemC design methodology; current system design methodology)

  14. SystemC features
      • Implemented as a C++ class library (libsystemc.a)
      • Inherits all hierarchy features
      • Built-in simulation environment
      • Easy refinement and reworking
      • Lightweight

  15. SystemC core language
      • Modules
      • Processes
      • Clocks, custom wait() calls
      • Support for events, sensitivity list, watching() construct
      • Signals

  16. SystemC: Modules
      • Basic building block
      • Map functionality of Hw/Sw blocks
      • Derived from class sc_module
      • Possibility to use hierarchy constructs and sub-modules
      • Interface with each other via ports/interfaces/channels

  17. SystemC: Modules

          //my_module.h
          SC_MODULE(my_module) {
              //port declarations
              //process declarations
              SC_CTOR(my_module) {
                  //process configuration
                  //initialization code
              }
          };

  18. SystemC: Ports/Channels/Interfaces
      • Ports provide communication functions to modules
      • Interfaces connect ports to channels
      • Typical channel: signal

  19. SystemC: Processes
      • Provide module functionality
      • Implemented as C++ member functions
      • Run concurrently with each other
      • Execute statements sequentially
      • Three kinds:
        • SC_METHOD
        • SC_THREAD
        • SC_CTHREAD

  20. SystemC: SC_METHOD

          //my_module.h
          #include <systemc.h>

          SC_MODULE(my_module) {
              sc_in<bool> id;
              sc_in<sc_uint<3> > in_a;
              sc_in<sc_uint<3> > in_b;
              sc_out<sc_uint<3> > out_c;

              void my_method();

              SC_CTOR(my_module) {
                  SC_METHOD(my_method);
                  // re-run whenever the select or a data input changes
                  sensitive << id << in_a << in_b;
              }
          };

          //my_module.cpp
          #include "my_module.h"

          // 2-to-1 multiplexer: id selects in_a (true) or in_b (false)
          void my_module::my_method() {
              if (id.read())
                  out_c.write(in_a.read());
              else
                  out_c.write(in_b.read());
          }

  21. SystemC: SC_THREAD

          //my_module.h
          #include <systemc.h>

          SC_MODULE(my_module) {
              sc_in<bool> id;
              sc_in<bool> clock;
              sc_in<sc_uint<3> > in_a;
              sc_in<sc_uint<3> > in_b;
              sc_out<sc_uint<3> > out_c;

              void my_thread();

              SC_CTOR(my_module) {
                  SC_THREAD(my_thread);
                  sensitive << clock.pos();
              }
          };

          //my_module.cpp
          #include "my_module.h"

          void my_module::my_thread() {
              while (true) {
                  if (id.read())
                      out_c.write(in_a.read());
                  else
                      out_c.write(in_b.read());
                  wait();  // suspend until the next rising clock edge
              }
          }

  22. SystemC: Channels
      • The most common type is signal
      • Signal can be traced: waveform dumping produces .VCD output file
      • Other channels:
        • sc_fifo
        • sc_mutex
        • sc_semaphore

  23. SystemC scheduler
      • Similar to HDL scheduler
      • Two different time steps:
        • Discrete simulation cycle
        • “Delta cycle”
      • “Evaluate then update” semantics
      • Order of process resumption unknown
      • Event objects extend sensitivity
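The “evaluate then update” rule is what makes the unknown resumption order harmless. The plain C++ sketch below imitates one delta cycle with a hypothetical two-phase `Signal` struct; it is not the real `sc_signal` class or the real SystemC kernel, only a minimal model in which writes are buffered until every process has been evaluated.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical two-phase signal: reads see the current value, writes are
// buffered until the update phase (this is not the real sc_signal class).
struct Signal {
    int current = 0;
    int next = 0;
    int read() const { return current; }
    void write(int v) { next = v; }
    void update() { current = next; }
};

// One delta cycle: evaluate every process, then update every signal.
void delta_cycle(const std::vector<std::function<void()>>& processes,
                 const std::vector<Signal*>& signals) {
    for (const auto& p : processes) p();    // evaluate phase
    for (Signal* s : signals) s->update();  // update phase
}

// Two independent processes exchange the values of signals a and b.
// Returns true if the exchange worked for the given evaluation order.
bool swap_works(bool reversed_order) {
    Signal a, b;
    a.current = a.next = 1;
    b.current = b.next = 2;
    std::function<void()> p1 = [&] { a.write(b.read()); };
    std::function<void()> p2 = [&] { b.write(a.read()); };
    std::vector<std::function<void()>> procs =
        reversed_order ? std::vector<std::function<void()>>{p2, p1}
                       : std::vector<std::function<void()>>{p1, p2};
    delta_cycle(procs, {&a, &b});
    return a.read() == 2 && b.read() == 1;
}
```

Because both processes read the old values before any signal is updated, `swap_works` returns true for either order, which is the behaviour two cross-coupled registers would show in hardware.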

  24. Reconfigurable Devices: Co-design for embedded systems
      • “Programming in Space” versus “Programming in Time”
      • Key design choices:
        • Computational units and their granularity
        • Interconnect Network
        • (Re)configuration time and frequency
      • Formal verification
      • Automatic synthesis

  25. Reconfigurable Devices: Flexibility vs efficiency

  26. Reconfigurable devices advantages
      • Efficiency AND Flexibility
      • Time to market
      • Easier upgrade
      • Lower cost (in volume production)
      • Reusable IP
      • Customizable interface

  27. Reconfigurable devices parameters
      • Block granularity
      • Density
      • Reconfiguration time
      • Compile-Time Reconfiguration (CTR) vs Run-Time Reconfiguration (RTR)
      • Partial or Total reprogramming

  28. Reconfigurable Devices: FPGA
      • SRAM-based Field Programmable Gate Array
      • Basic block is the Logic Element (LE)
      • Capacity from 1k to 100k LEs
      • Configurable Interconnect
      • Need for optimized CAD or pre-bound design libraries

  29. Reconfigurable Devices: FPGA (figures: CSL organization; basic Logic Element)

  30. CSoC: Configurable Systems on Chip
      • RISC processor
      • FPGA block
      • On-chip memories
      • External memories
      • Peripherals
      • DIP switches and connectors
      • Debug support

  31. CSoC: Research on CSoC
      • PRISM (Brown)
      • PRISC (Harvard)
      • DPGA-coupled uP, Raw processor (MIT)
      • V-IRAM, GARP, Pleiades, etc. (UCB)
      • OneChip (Toronto)
      • REMARC (Stanford)
      • NAPA (NSC)
      • E5, A7 etc. (Triscend)
      • Chameleon
      • Quicksilver
      • Excalibur (Altera)
      • Virtex+PowerPC (Xilinx)
      • PIM Processor (Sun)

  32. CSoC companies
      • Xilinx → Triscend (50% market in PLDs and FPGA)
      • Altera
      • many others
      Triscend and Altera boards available in our lab

  33. CSoC: The Triscend A7S Board
      • TA7S20-60Q CSoC
      • SDRAM 32Mb
      • Flash 2Mb
      • Memory sockets
      • 2 serial connectors
      • 7 segment LED
      • Oscillator for CK
      • Debug facilities

  34. CSoC: The Triscend A7S chip

  35. CSoC: Triscend Fastchip 2.4
      • FPGA optimized module library
      • IO Editor
      • Generate → file.h
      • Bind (placement and route) → file.csl
      • Config → file.cfg
      • Download

  36. CSoC: Triscend Fastchip modules

  37. Co-design and real-time
      • RTOS Booster (Lindh et al.):
        • hardware fixed-priority scheduler
        • no need for clock tick administration
        • interprocess communication, mutex and semaphores
        • Beware of bus bottlenecks!
      • SoC Lock Cache (Lee)
      • Configurable Hardware scheduler (GIT)
      • Online scheduling of Hardware RT tasks to Partially Reconfigurable Devices (Thiele et al.)
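A hardware fixed-priority scheduler can be pictured as a ready register feeding a priority encoder. The C++ sketch below is only a software model of that idea, not the design of Lindh et al. or of any of the other works listed; the bit layout (bit i set when task i is ready, lower index meaning higher priority) and the function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: bit i of `ready` is set when task i is ready,
// and a lower index means a higher priority (both are assumptions).
// Returns the index of the task to run, or -1 if no task is ready.
int highest_priority_ready(std::uint32_t ready) {
    for (int i = 0; i < 32; ++i) {
        if (ready & (std::uint32_t{1} << i)) return i;
    }
    return -1;
}

// Mark a task as ready (e.g. on release or when a semaphore is granted).
std::uint32_t set_ready(std::uint32_t ready, int task) {
    assert(task >= 0 && task < 32);
    return ready | (std::uint32_t{1} << task);
}

// Mark a task as not ready (e.g. on completion or when it blocks).
std::uint32_t clear_ready(std::uint32_t ready, int task) {
    assert(task >= 0 && task < 32);
    return ready & ~(std::uint32_t{1} << task);
}
```

In software the selection is a loop over the bits; a hardware priority encoder evaluates all bits in parallel, which is one reason to move the scheduler out of the CPU.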

  38. Hardware RTOS: the RTU
      Lindh et al., RTU (Real Time Unit):
      - Accelerator Interface
      - Scheduler Unit
      - Message, Semaphore and Delay Handler
      - Intelligent Interrupt Handler
      - Real-Time Control
      - General and Technology Dependent Bus Interface

  39. Drawbacks of centralized computing
      • Moore’s law is going the wrong way for power consumption
      • A memory access consumes far more than a CPU local operation
      • Chip area = logic + MEMORY
      • Under 100nm many problems:
        • Increasing leakage current
        • Difficult interconnect
        • Litho and process variability

  40. Power delivery and dissipation

  41. Power efficiency

  42. Road to distributed computing
      • Concurrent programming
      • Compilers that can exploit parallelism
      • High-level debuggers
      • Algorithm for intermediate levels of granularity (between C++ and HDLs)
      • New benchmarking methods and metrics (MOPS/$ or MOPS/kg W)

  43. Cell processor (IBM, Sony, Toshiba) (from IMEC – Hugo de Man)

  44. Thank you for your attention! The end!

  45. SystemC layers
      • No notion of time (processes and data transfers): functional verification, algorithm validation
      • Refinement: + formal, + time
      • Notion of time (processes and data transfers): coarse benchmarking, architectural analysis
      • Refinement: + pin acc., + cycle acc.
      • Cycle accuracy, signal accuracy: detailed benchmarking, microarchitectural analysis
      • Refinement: + HW mapping

  46. SystemC: UnTimed Functional (UTF) model

          // adder.h
          #include <systemc.h>

          SC_MODULE(adder) {
              sc_fifo_in<float> input1, input2;
              sc_fifo_out<float> output;

              SC_CTOR(adder) { SC_THREAD(adding); }

              // blocks on the FIFOs; no notion of time
              void adding() {
                  while (true) {
                      output.write(input1.read() + input2.read());
                  }
              }
          };

          // constgen.h
          #include <systemc.h>

          SC_MODULE(constgen) {
              sc_fifo_out<float> output;

              SC_CTOR(constgen) { SC_THREAD(generating); }

              void generating() {
                  while (true) {
                      output.write(0.7);
                  }
              }
          };

  47. SystemC: Timed Functional (TF) model

      Untimed version:

          // constgen.h
          SC_MODULE(constgen) {
              sc_fifo_out<float> output;

              SC_CTOR(constgen) { SC_THREAD(generating); }

              void generating() {
                  while (true) {
                      output.write(0.7);
                  }
              }
          };

      Refining → timed version:

          // constgen.h
          SC_MODULE(constgen) {
              sc_fifo_out<float> output;

              SC_CTOR(constgen) { SC_THREAD(generating); }

              void generating() {
                  while (true) {
                      wait(200, SC_NS);  // one value every 200 ns
                      output.write(0.7);
                  }
              }
          };

  48. SystemC: Bus Cycle Accurate (BCA) model

          // euclid.h
          SC_MODULE(euclid) {
              sc_in_clk clock;
              sc_in<bool> reset;
              sc_in<unsigned int> a, b;
              sc_out<unsigned int> c;
              sc_out<bool> ready;

              void compute();

              SC_CTOR(euclid) {
                  SC_CTHREAD(compute, clock.pos());
                  watching(reset.delayed() == true);
              }
          };

          // euclid.cpp
          void euclid::compute() {
              // reset section
              unsigned int tmp_a = 0, tmp_b;
              while (true) {
                  c.write(tmp_a);        // signaling output
                  ready.write(true);
                  wait();                // moving to next cycle
                  tmp_a = a.read();      // sampling input
                  tmp_b = b.read();
                  ready.write(false);
                  wait();                // moving to next cycle
                  while (tmp_b != 0) {   // computing
                      unsigned int r = tmp_a;
                      tmp_a = tmp_b;
                      r = r % tmp_b;
                      tmp_b = r;
                  }
              }
          }
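The computing loop inside `euclid::compute()` is Euclid's algorithm for the greatest common divisor. To check it without ports, clock or the SystemC library, the same loop can be lifted into a plain C++ function; the name `euclid_gcd` is hypothetical.

```cpp
#include <cassert>

// Same computing loop as in euclid::compute(), without ports or clock:
// each iteration replaces (a, b) with (b, a % b) until b reaches 0.
unsigned int euclid_gcd(unsigned int a, unsigned int b) {
    while (b != 0) {
        unsigned int r = a % b;
        a = b;
        b = r;
    }
    return a;
}
```

In the bus cycle accurate model this whole loop runs between two `wait()` calls, so the handshake on `ready` is timed per clock cycle while the computation itself is not.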

  49. SystemC: Register Transfer Level (RTL) model
      - RTL level: signal accurate, cycle accurate, resource accurate
      - Cannot use abstractions (functional units, communication infrastructures, …)

  50. SystemC: RTL counter

          // counter.h
          #include <systemc.h>

          SC_MODULE(counter) {
              sc_in<bool> clk;
              sc_in<bool> load;
              sc_in<bool> clear;
              sc_in<sc_uint<8> > din;
              sc_out<sc_uint<8> > dout;

              unsigned int countval;

              void counting();

              SC_CTOR(counter) {
                  SC_METHOD(counting);
                  sensitive << clk.pos();
              }
          };

          // counter.cpp
          #include "counter.h"

          // runs on every rising clock edge: clear has priority over load
          void counter::counting() {
              if (clear.read())
                  countval = 0;
              else if (load.read())
                  countval = (unsigned int)din.read();
              else
                  countval++;
              dout.write((sc_uint<8>)countval);
          }
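The next-state logic of this counter can be exercised on its own as a pure function. The sketch below models only the 8-bit value seen on `dout` after one rising clock edge; the function name `counter_next` is hypothetical and the SystemC ports are replaced by plain arguments.

```cpp
#include <cassert>
#include <cstdint>

// Value driven on the 8-bit output after one rising clock edge.
// Priority matches the slide: clear first, then load, otherwise increment.
std::uint8_t counter_next(std::uint8_t count, bool clear, bool load,
                          std::uint8_t din) {
    if (clear) return 0;
    if (load) return din;
    return static_cast<std::uint8_t>(count + 1);  // wraps from 255 to 0
}
```

Separating the next-state function from the clocked process is a common way to test RTL-style logic before wiring it into a module.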
