Architecture Design of a Scalable Single-Chip Multi-Processor

B.D. Theelen

Presentation Transcript


  1. Architecture Design of a Scalable Single-Chip Multi-Processor
     B.D. Theelen
     www.ics.ele.tue.nl/~btheelen

  2. Overview
     • Introduction
     • MµP Features
     • System Architecture
     • Hardware RTOS
     • Example Configuration
     • Experimental Results
     • Conclusions

  3. Introduction
     Architecture Platforms for Real-Time Embedded Systems
     • Scalable, Customisable, Reusable
     • Parallel Execution of Various Tasks
     • Configurable Set of Application-Dedicated Processor Cores: Customisability + Parallel + Scalable + Reusable
     • (Scalable Number of Identical) General-Purpose Processor Core(s): Flexibility + (Parallel + Scalable) + Reusable
     • SoC technology enables embedding both on Single-Chip
       • Involves flexible and scalable Interconnects and Memory Architecture
       • Examples: TriMedia, SpaceCake

  4. Real-Time Environment
     Architecture Platforms for Real-Time Embedded Systems
     • Deadlines, Task Priorities, Impact of Overhead
     • Involves fast Interconnects and Memory Architecture capable of dealing with task priorities
     • Multi-Micro Processor (MµP)
       • Combines Scalable Number of Identical General-Purpose Master Processors with Configurable Set of Shared Application-Dedicated Co-processors and a Hardware RTOS Kernel to reduce task switching overhead

  5. MµP Features
     • True parallel execution of tasks
       • Master Processors execute tasks independently
     • Instruction Set is extendable
       • Only 1/16th of instruction space is executed by Master Processors
       • Remainder is split over up to 15 different Co-processor types
       • Co-processor type determines actual use of instruction space
       • Number of Co-processors of certain type is scalable
     • On-chip RTOS Kernel
       • Transparent priority-based multi-tasking over Master Processors
       • Hardware support for fast task switches
       • Communication and synchronisation between (local and remote) tasks
         • (Counting) semaphores, mailboxes, pipes
       • Extended event handling mechanism instead of interrupts
         • Uses counting semaphores
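The instruction-space split on slide 5 can be illustrated with a small decoder. This is only a sketch: the slides do not say which bits select the co-processor type, so the 8-bit opcode width, the use of the upper four bits as a type field, and type 0 standing for the Master Processor's own 1/16th are all assumptions, as are the names `decode`, `MASTER_TYPE`, `TYPE_BITS` and `OPCODE_BITS`.

```python
MASTER_TYPE = 0   # assumption: type 0 is the Master Processor's own 1/16th
TYPE_BITS = 4     # 16 equal slices: 1 for masters, up to 15 co-processor types
OPCODE_BITS = 8   # assumption: 8-bit opcodes


def decode(opcode: int) -> tuple[str, int, int]:
    """Split an opcode into (target, type number, function bits).

    The type number is what a switch could route on; the function bits
    are interpreted by whichever unit receives the instruction.
    """
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode out of range")
    shift = OPCODE_BITS - TYPE_BITS
    type_no = opcode >> shift                 # upper bits select the slice
    func = opcode & ((1 << shift) - 1)        # lower bits stay with the unit
    target = "master" if type_no == MASTER_TYPE else "coprocessor"
    return target, type_no, func
```

Under these assumptions, 16 of the 256 opcodes stay in the Master Processor and the remaining 240 are divided evenly over the 15 possible co-processor types.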

  6. System Architecture
     [Block diagram. Labelled blocks: Master Processors 1 to n, each with an L1 I$; L2 I$; Register D$; MultiPort D$; Arbiter; Memory; Function Switch; Result Switch; Task Assignment; Shared Co-Processors (TCU 1; LSU 2.1 to 2.x; FPU m.1 to m.y); Event Inputs; MP Network; Chip Boundary. The Task Control Unit (TCU) is the Hardware RTOS Kernel.]

  7. Design Issues
     • On-Chip Interconnects
       • Cyclic path of instructions and results
       • Interconnects are non-blocking
         • Master processors accept results at all times and implement scoreboarding
       • Function Switch routes on co-processor type number
         • Fair arbitration with high/low priority based on task priority and request age
       • Result Switch routes on task number
         • FCFS arbitration without priorities
       • Perform routing functionality in one clock
     • Memory Architecture
       • Separated instruction and data path
       • Two-level instruction cache architecture with round-robin arbitration
       • Shared multi-port data cache = data cache with statistically multiplexed banks
         • Round-robin arbitration between accesses for different paths
         • No real cache coherency problems

  8. Hardware RTOS
     [Block diagram of the Task Control Unit. Labelled blocks: Control Space; TCU Core; TCU Network Management; Function Rx; Result Tx; Task Admin; Task Scheduler; Sorted Task List; Executive; Resource Admin; Resource Data; Timers; Arbiter; Event Detect; Switch; Network; Links. External connections: Function Switch, Result Switch, Event Inputs, MultiPort D$, Master Processors.]

  9. Design Issues
     • Task Management
       • Commands for creating, terminating, delaying, suspending and restarting tasks and for changing priority
       • Tasks of equal priority time-share the master processors available to them
       • Task switching accelerated by specialised cache storing volatile contexts
     • Transparent Communication
       • Commands for activating, deactivating, reading and writing resources
       • Counting semaphores, mailboxes and pipes in hardware
       • Network Manager shields tasks from MµP network
         • Tasks can access any resource in the MµP network
     • Extended Event Handling
       • Commands for activating and deactivating event inputs
       • Event inputs are coupled to counting semaphores
         • Involved semaphore might not be in same MµP where the task resides
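The coupling of event inputs to counting semaphores on slide 9 can be sketched in software as follows. This is an illustrative model, not the TCU's actual behaviour: the class and method names are hypothetical, the FIFO order of waiting tasks is an assumption, and the network aspect (a semaphore living in another MµP) is left out.

```python
from collections import deque


class CountingSemaphore:
    """Counting semaphore that remembers events nobody is waiting for yet."""

    def __init__(self, count: int = 0):
        self.count = count
        self.waiting = deque()   # blocked task numbers, FIFO (assumption)

    def wait(self, task: int) -> bool:
        """Return True if the task may continue, False if it blocks."""
        if self.count > 0:
            self.count -= 1
            return True
        self.waiting.append(task)
        return False

    def signal(self):
        """Release the longest-waiting task, or bank the signal if none waits."""
        if self.waiting:
            return self.waiting.popleft()
        self.count += 1
        return None


class EventInput:
    """Event input that signals its semaphore only while activated."""

    def __init__(self, semaphore: CountingSemaphore):
        self.semaphore = semaphore
        self.active = False

    def activate(self) -> None:
        self.active = True

    def deactivate(self) -> None:
        self.active = False

    def pulse(self):
        """Model one event occurrence; return the task released, if any."""
        if not self.active:
            return None
        return self.semaphore.signal()
```

Because the semaphore counts, two events arriving before the handling task waits are both remembered, which a single pending-interrupt flag would not do.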

  10. Example Configuration (Mini MµP)
      By V.R. Suárez
      • Two 8048 ISA compatible Master Processors
      • 8048 compatible I/O and Timers in Co-Processors
      • 1 clock Function Switch and Result Switch
      • On-chip 2 kB Instruction ROM and 1 kB Data RAM
      • Register D$ enabling Task Switches in 1 clock
      • TCU Co-Processor
        • 15 user-definable tasks
        • 32 binary semaphores
        • Timers and Interrupts supported as events for predefined tasks
        • All commands executed in 1 clock

  11. Experimental Results (Mini MµP)
      • Mini MµP designed using IDaSS
        • Interactive Design and Simulation System
        • Automatic generation of synthesisable VHDL or Verilog
      • Mini MµP implemented in Xilinx Spartan-II 200 FPGA
        • Uses 42% of memory area and 83% of gate area
        • Total gate count of 141k
        • Runs at 25 MHz (expect over 30 MHz for optimised version)
          • Critical path is 14 gates (in Master Processor core)
          • Next critical path in TCU Co-Processor
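The clock figures on slide 11 translate into a timing budget with simple arithmetic. The sketch below assumes that the 14-gate critical path fills the whole clock period and that "gates" means logic levels in series; neither is stated on the slide, and the function names are hypothetical.

```python
def clock_period_ns(freq_hz: float) -> float:
    """Clock period in nanoseconds for a given clock frequency in hertz."""
    return 1e9 / freq_hz


def delay_per_gate_ns(freq_hz: float, gates_on_critical_path: int) -> float:
    """Average delay budget per gate, assuming the path fills one period."""
    return clock_period_ns(freq_hz) / gates_on_critical_path
```

At 25 MHz the period is 40 ns, so each of the 14 gates (including routing) has a budget of roughly 2.9 ns; reaching 30 MHz would shrink the period to about 33.3 ns, or roughly 2.4 ns per gate on the same path.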

  12. Conclusions
      • Multi Micro Processor (MµP) Architecture
        • Scalable Single-Chip Multi-Processor
        • Intended for Real-Time Embedded Systems
        • On-chip RTOS Kernel with hardware support for fast Task Switches
      • Design issues
        • On-chip Interconnects
        • Memory Architecture
        • Hardware RTOS
          • Task Management
          • Transparent Communication
          • Extended Event Handling
      • Results
        • Mini version of MµP with two 8048 ISA compatible Master Processors
