OS support for Teraflux: A Prototype

Avi Mendelson, Doron Shamia

System and Execution Models: Data Flow Based
• The system is made up of clusters.
• Each cluster contains 16 cores (may change).
• Each cluster is controlled by a single "OS kernel", e.g., Linux or L4.
• Execution is made up of tasks; each task:
  • Has no side effects
  • Is scheduled with its data (may use pointers)
  • May return results
  • If it fails to complete, can be rescheduled on the same core or another core
• Tasks can be executed on any (service) cluster and have a unified view of system memory.
• All resource allocation/management is done at two levels, a local one and a global one.
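The rescheduling rule above can be illustrated with a minimal sketch. Because tasks have no side effects, a failed task can simply be run again elsewhere. The names `run_with_reschedule`, `healthy_core`, and `faulty_core` are hypothetical and not part of Teraflux; cores are modeled as plain Python callables and a fault as a `RuntimeError`.

```python
def run_with_reschedule(task, data, cores, max_attempts=3):
    """Run a side-effect-free task, retrying on the next core if it fails."""
    last_error = None
    for attempt in range(max_attempts):
        core = cores[attempt % len(cores)]  # round-robin over candidate cores
        try:
            return core(task, data)  # the task is shipped together with its data
        except RuntimeError as err:  # assumption: a core fault surfaces as RuntimeError
            last_error = err
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def healthy_core(task, data):
    """Stand-in for a working core: just runs the task."""
    return task(data)


def faulty_core(task, data):
    """Stand-in for a failing core."""
    raise RuntimeError("simulated core fault")
```

Retrying is only safe here because a task that did not complete leaves no partial state behind; a task with side effects would need rollback before being restarted.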

System Overview
[Diagram, labeled "Target" and "Prototyped System". Cores view: groups of CPUs, each group running Linux or L4, where each CPU in the diagram stands for a cluster. Memory view: a configuration page and message buffers.]

Target System: OS Requirements
[Diagram: a single chip with multiple cores; groups of CPUs run either Linux (full OS) or L4 (uKernel).]
Linux (full OS, FOS):
• Manages jobs on uKernel (uK) cores
• Proxies uK I/O requests
• Remote-debugs the uKs and itself
• Runs high-level (system) FT, managing uK faults and its own
L4 (uKernel, uK):
• Each uK runs a job
• Jobs are sent by the full OS (FOS)
• Jobs have no side effects
• Failed jobs are simply restarted
• Runs low-level FT, reporting to the FOS

Communications (1)
[Diagram: L4 and Linux exchange buffers through a memory region holding a configuration page and message buffers.]
Each buffer contains:
• Ownership (L4/Linux)
• Ready flag
• Type
• Length (bytes)
• Data
• Fixups (optional)

Communications (2)
Buffer fields:
• Ownership: who currently uses the buffer (L4 or Linux)
• Ready: signals that the buffer is ready to be transferred to the other side (the inverse of the current owner)
• Type: the message type
• Length: the data length in bytes
• Data: the raw data (interpreted according to type)
• Fixups (optional): a list of fixups, in case pointers are passed
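As a rough illustration of the buffer fields and the ownership/ready handoff, the sketch below packs a message into a byte buffer and lets the other side claim it. The slides do not give field widths or encodings, so the header layout (`<BBHI`), the owner constants, and the function names are all assumptions made for this example. Fixups are omitted.

```python
import struct

OWNER_LINUX = 0  # hypothetical encoding of the ownership field
OWNER_L4 = 1

# Assumed header layout: owner (u8), ready (u8), type (u16), length (u32), little-endian.
HEADER = struct.Struct("<BBHI")


def write_message(buf, owner, msg_type, data):
    """Fill a buffer we own, then raise the ready flag as the last step."""
    if HEADER.size + len(data) > len(buf):
        raise ValueError("message does not fit in buffer")
    HEADER.pack_into(buf, 0, owner, 0, msg_type, len(data))
    buf[HEADER.size:HEADER.size + len(data)] = data
    buf[1] = 1  # ready: the buffer may now be transferred to the other side


def read_message(buf, me):
    """Claim a ready buffer written by the other side; return (type, data) or None."""
    owner, ready, msg_type, length = HEADER.unpack_from(buf, 0)
    if not ready or owner == me:
        return None
    data = bytes(buf[HEADER.size:HEADER.size + length])
    buf[0] = me  # take ownership
    buf[1] = 0   # clear ready so the message is consumed only once
    return msg_type, data
```

For example, after `write_message(buf, OWNER_LINUX, 7, b"hello")` on a `bytearray(64)`, a call to `read_message(buf, OWNER_L4)` returns `(7, b"hello")` and leaves the buffer owned by L4. On real shared memory between separate cores, the write to the ready flag would also need an appropriate memory barrier, which this single-process sketch does not model.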

Current Prototype
• Goal: quick development of OS support and applications (to be moved later to the full COTSon prototype)
• Quick prototyping via VMs
• Linux on both ends (Fedora 13)
  • Main node = Linux (host)
  • Service nodes = Linux (VMs)
• Shared memory is used
  • Between the host and the VMs
  • Between VMs
• Shared memory uses a kernel driver (ivshmem)

Prototype Architecture
[Diagram: an application on the Linux F13 host and several Linux F13 guests running under QEMU, spanning user space and kernel space and connected through IVSHMEM.]

IV Shared Memory Architecture
• Mapped (mmap) to user level
• Exposed as a PCI BAR
• QEMU maps the shared memory into RAM
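The "mmap to user level" step can be sketched as follows. This is only an illustration of shared mappings: an ordinary temporary file stands in for the ivshmem region, since the actual device path inside a guest depends on the setup and is not given in the slides. The function names `open_shared_region` and `demo` are hypothetical.

```python
import mmap
import os
import tempfile


def open_shared_region(path, size):
    """Map `size` bytes of `path` read/write as a shared mapping."""
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


def demo():
    """Write through one mapping and read the bytes back through another."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(4096)  # stand-in for the shared-memory region
        path = f.name
    a = open_shared_region(path, 4096)
    b = open_shared_region(path, 4096)
    try:
        a[0:5] = b"hello"
        return bytes(b[0:5])
    finally:
        a.close()
        b.close()
        os.unlink(path)
```

In the prototype the two mappings would live in different machines (host and VM, or two VMs) rather than in one process, but the access pattern from user space is the same: plain loads and stores into the mapped region.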

Communications
[Diagram: the host application logic on Linux F13 (host) and the data-flow applications on the Linux F13 QEMU guests each sit on a message queue API in user space; messages (Msg) are exchanged through shared RAM.]
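The slides do not show the message queue API itself, so the following is only one possible shape for it: a single-producer, single-consumer ring of fixed-size slots laid over a shared byte region. The class name, slot size, and slot header layout are assumptions for illustration.

```python
import struct

SLOT_SIZE = 64                   # assumed fixed slot size in bytes
SLOT_HDR = struct.Struct("<BH")  # assumed slot header: full flag (u8), payload length (u16)


class MessageQueue:
    """One endpoint of a single-producer, single-consumer queue over shared bytes."""

    def __init__(self, shared, slots):
        if len(shared) < slots * SLOT_SIZE:
            raise ValueError("shared region too small")
        self.shared = shared
        self.slots = slots
        self.head = 0  # next slot to read (used by the consumer endpoint)
        self.tail = 0  # next slot to write (used by the producer endpoint)

    def send(self, payload):
        """Copy a payload into the next slot; return False if the queue is full."""
        if len(payload) > SLOT_SIZE - SLOT_HDR.size:
            raise ValueError("payload too large for one slot")
        base = self.tail * SLOT_SIZE
        if self.shared[base]:  # slot not yet consumed
            return False
        start = base + SLOT_HDR.size
        self.shared[start:start + len(payload)] = payload
        SLOT_HDR.pack_into(self.shared, base, 1, len(payload))  # mark full after the data
        self.tail = (self.tail + 1) % self.slots
        return True

    def recv(self):
        """Return the next payload, or None if the queue is empty."""
        base = self.head * SLOT_SIZE
        full, length = SLOT_HDR.unpack_from(self.shared, base)
        if not full:
            return None
        start = base + SLOT_HDR.size
        payload = bytes(self.shared[start:start + length])
        self.shared[base] = 0  # release the slot to the producer
        self.head = (self.head + 1) % self.slots
        return payload
```

Each side creates its own `MessageQueue` over the same region, for example a `bytearray` in a test or a memory-mapped shared region in the prototype. The producer only calls `send` and the consumer only calls `recv`, so the two never write the same index.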

Demo (toy) Apps
• Distributed sum app
  • Single work dispatcher (host)
  • Multiple sum engines (VMs)
• Distributed Mandelbrot
  • Single work dispatcher, by lines (host)
  • Multiple compute engines, computing the pixels of each line (VMs)
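The work decomposition of the two demo apps might look roughly like the sketch below. It runs everything in one process: the list comprehensions mark where the dispatcher would hand work to the VMs. All function names, the viewport (real part from -2 to 1, imaginary part from -1 to 1), and the iteration limit are assumptions, not details from the demos.

```python
def split_work(values, engines):
    """Split values into at most `engines` contiguous chunks."""
    size = max(1, -(-len(values) // engines))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def distributed_sum(values, engines):
    """Dispatcher: hand one chunk to each sum engine, then add the partial sums."""
    partials = [sum(chunk) for chunk in split_work(values, engines)]  # one chunk per engine
    return sum(partials)


def mandelbrot_line(y, width, height, max_iter=50):
    """Compute engine: escape-time iteration counts for the pixels of one line."""
    row = []
    for x in range(width):
        c = complex(-2.0 + 3.0 * x / width, -1.0 + 2.0 * y / height)
        z = 0j
        n = 0
        while abs(z) <= 2.0 and n < max_iter:
            z = z * z + c
            n += 1
        row.append(n)
    return row


def mandelbrot(width, height):
    """Dispatcher: one work item per image line."""
    return [mandelbrot_line(y, width, height) for y in range(height)]
```

Both apps fit the execution model described earlier: each work item carries its own data, has no side effects, and returns a result, so a line or chunk can be recomputed on another engine if one fails.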

Futures
• Single boot
  • A TeraFlux chip boots a FOS
  • The FOS boots the uKs on the other cores
  • Looks like a single boot process
• Distributed fault tolerance
  • Allows the uKs and the FOS to test each other's health
  • One step beyond FOS-centric FT
• Core repurposing
  • If FOS cores fail, uK cores reboot as FOS
  • The new FOS takes over using the last valid data snapshot

References
• Inter-VM Shared Memory
