Inter-Processor Communication for Heterogeneous Dual Core Systems

2006/09/27 Chun-Ming Huang, Ph.D. National Chip Implementation Center (CIC) cmhuang@cic.org.tw

Agenda • IPC Overview • IPC Schemes • Nokia DSP Gateway • TI DSP/BIOS Link • IPC Hardware Architecture • Conclusions

IPC Overview

What is IPC? • Inter-Process Communication • Inter-Processor Communication How to provide inter-process communication services for multi-core systems?

Independent & Cooperating Processes • Processes executing concurrently in a multitasking environment may be either independent processes or cooperating processes • A process is independent if it cannot affect or be affected by the other processes executing in the system; any process that does not share data with any other process is independent • A process is cooperating if it can affect or be affected by the other processes executing in the system; any process that shares data with other processes is a cooperating process Silberschatz, et al., Operating System Principles, Seventh Edition

Why Allow Process Cooperation? • Information sharing • Computation speedup • Modularity • Convenience • Cooperating processes require an inter-process communication (IPC) mechanism that will allow them to exchange data and information Silberschatz, et al., Operating System Principles, Seventh Edition

IPC Example • Unix pipe • ls -l / | grep 2005 | wc • Output: 2 19 98 (lines, words, characters) • The grep utility searches text files for a pattern and prints all lines that contain that pattern. • The wc utility displays a count of lines, words and characters in a text file. • Data exchange • Synchronization
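The shell pipeline above can be reproduced by wiring processes together explicitly. The following is a minimal sketch, not part of the original slides: the helper name count_matching_lines is hypothetical, and it assumes the grep and wc utilities are available on the PATH. It feeds fixed text to grep instead of the output of ls so that the result does not depend on the machine.

```python
import subprocess


def count_matching_lines(text: str, pattern: str) -> int:
    """Equivalent of `printf text | grep pattern | wc -l` (hypothetical helper)."""
    # First stage: grep reads from a pipe we write to and writes to a second pipe.
    grep = subprocess.Popen(
        ["grep", pattern], stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # Second stage: wc reads directly from grep's output pipe.
    wc = subprocess.Popen(["wc", "-l"], stdin=grep.stdout, stdout=subprocess.PIPE)
    # Close our copy of the middle pipe so wc sees end-of-file when grep exits.
    grep.stdout.close()

    grep.stdin.write(text.encode())
    grep.stdin.close()  # end-of-file for grep

    out, _ = wc.communicate()
    grep.wait()
    return int(out.decode().strip())


if __name__ == "__main__":
    sample = "report 2005\nreport 2006\nnotes 2005\n"
    print(count_matching_lines(sample, "2005"))
```

Both aspects listed on the slide are visible here: the pipe carries the data, and the blocking reads provide the synchronization, since wc cannot finish until grep closes its end.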

Operating System Kernel Components • Process scheduler • determines when and for how long a process executes on a processor • Memory manager • determines when and how memory is allocated to processes and what to do when memory becomes full • I/O manager • services input and output requests from and to hardware devices • Inter-process communication (IPC) manager • allows processes to communicate with one another • File system manager • organizes named collections of data on storage devices and provides an interface for accessing data on those devices Deitel, et al., Operating Systems, Third Edition

Linux Kernel 2.6.17.11 • Top-level directories: arch, block, crypto, drivers, fs, include, init, ipc, kernel, lib, mm, net, scripts, security, sound, usr • Files in ipc/: Makefile, compat.c, compat_mq.c, mqueue.c, msg.c, msgutil.c, sem.c, shm.c, util.c, util.h http://www.kernel.org

Machine-Independent SW in the FreeBSD Kernel McKusick & Neville-Neil, The Design and Implementation of the FreeBSD Operating System

Homogeneous vs. Heterogeneous Sun TI OMAP 5910

Multiprocessor OS Organizations • Can classify systems based on how processors share operating system responsibilities • Three types • Master/slave • Separate kernels • Symmetrical organization Deitel, et al., Operating Systems, Third Edition

Master/Slave • Master/Slave organization • Master processor executes the operating system • Slaves execute only user processes • Hardware asymmetry • Low fault tolerance • Good for computationally intensive jobs • Example: nCUBE system Deitel, et al., Operating Systems, Third Edition

Separate Kernels • Separate kernels organization • Each processor executes its own operating system • Some globally shared operating system data • Loosely coupled • Catastrophic failure unlikely, but failure of one processor results in termination of processes on that processor • Little contention over resources • Example: Tandem system Deitel, et al., Operating Systems, Third Edition

Symmetrical Organization • Symmetrical organization • Operating system manages a pool of identical processors • High amount of resource sharing • Need for mutual exclusion • Highest degree of fault tolerance of any organization • Some contention for resources • Example: BBN Butterfly Deitel, et al., Operating Systems, Third Edition

Memory Access Architectures • Memory access • Can classify multiprocessors based on how processors share memory • Goal: Fast memory access from all processors to all memory • Contention in large systems makes this impractical Deitel, et al., Operating Systems, Third Edition

Uniform Memory Access • Uniform memory access (UMA) multiprocessor • All processors share all memory • Access time to any memory page is nearly the same for all processors and all memory modules (disregarding cache hits) • Typically uses shared bus or crossbar-switch matrix • Also called symmetric multiprocessing (SMP) • Small multiprocessors (typically two to eight processors) Deitel, et al., Operating Systems, Third Edition

Uniform Memory Access Deitel, et al., Operating Systems, Third Edition

Non-Uniform Memory Access • Non-uniform memory access (NUMA) multiprocessor • Each node contains a few processors and a portion of system memory, which is local to that node • Access to local memory is faster than access to global memory (rest of memory) • More scalable than UMA (fewer bus collisions) Deitel, et al., Operating Systems, Third Edition

Non-Uniform Memory Access Deitel, et al., Operating Systems, Third Edition

Cache-Only Memory Architecture • Cache-only memory architecture (COMA) multiprocessor • Physically interconnected as a NUMA is • Local memory vs. global memory • Main memory is viewed as a cache and called an attraction memory (AM) • Allows system to migrate data to node that most often accesses it at granularity of a memory line (more efficient than a memory page) • Reduces the number of cache misses serviced remotely • Overhead • Duplicated data items • Complex protocol to ensure all updates are received at all processors Deitel, et al., Operating Systems, Third Edition

Cache-Only Memory Architecture Deitel, et al., Operating Systems, Third Edition

No Remote Memory Access • No-remote-memory-access (NORMA) multiprocessor • Does not share physical memory • Some implement the illusion of shared physical memory—shared virtual memory (SVM) • Loosely coupled • Communication through explicit messages • Distributed systems • Not networked system Deitel, et al., Operating Systems, Third Edition

No Remote Memory Access Deitel, et al., Operating Systems, Third Edition

Four Possible Cases

IPC Schemes

Communication via Files • Communication via files is in fact the oldest way of exchanging data between programs. Program A writes data to a file and Program B reads it. In a system in which only one program can be run at any given time, this does not present any problem. • In a multitasking system, however, both programs could be run as processes at least quasi-parallel to each other. Race conditions then usually produce inconsistencies in the file data, which result from one program reading a data area before the other has finished modifying it, or from both processes modifying the same area at the same time.

Communication via Files • Locking entire files • lock file • fcntl( ) (POSIX), flock( ) (BSD 4.3) • Locking file areas (record locking) • Deadlock
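Record locking can be illustrated with a short sketch, assuming a POSIX system. It uses Python's fcntl.lockf, which wraps fcntl( ) byte-range locks; the function names child_can_lock and demo_record_locking are hypothetical and chosen only for this example. A parent process locks the first ten bytes of a file, then forks children that try non-blocking locks on overlapping and disjoint regions.

```python
import fcntl
import os
import tempfile


def child_can_lock(path, start, length):
    """Fork a child that tries a non-blocking exclusive lock on a byte range.

    Returns True if the child obtained the lock (hypothetical helper).
    """
    pid = os.fork()
    if pid == 0:
        status = 2  # unexpected failure
        try:
            with open(path, "r+b") as f:
                try:
                    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start)
                    status = 0  # lock granted
                except OSError:
                    status = 1  # region is locked by another process
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo_record_locking():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 100)
        tmp.flush()
        with open(tmp.name, "r+b") as f:
            fcntl.lockf(f, fcntl.LOCK_EX, 10, 0)  # lock bytes 0..9
            overlapping = child_can_lock(tmp.name, 5, 10)  # overlaps the lock
            disjoint = child_can_lock(tmp.name, 50, 10)  # untouched region
            fcntl.lockf(f, fcntl.LOCK_UN, 10, 0)
            after_unlock = child_can_lock(tmp.name, 5, 10)
    return overlapping, disjoint, after_unlock


if __name__ == "__main__":
    print(demo_record_locking())
```

The non-blocking flag is used deliberately: with blocking locks, two processes that each hold one region and wait for the other's region would deadlock, which is the risk named in the last bullet.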

Process Communication Models • Message passing • Shared memory Silberschatz, et al., Operating System Principles, Seventh Edition

IPC for Linux • Linux IPC • Many IPC mechanisms derived from traditional UNIX IPC • Allow processes to exchange information • Some are better suited for particular applications • For example, those that communicate over a network or exchange short messages with other local applications Deitel, et al., Operating Systems, Third Edition

IPC for Linux • Signal • Pipe • Message queue • Shared memory • System V Semaphores • Sockets

Signals • Signals • One of the first interprocess communication mechanisms available in UNIX systems • Kernel uses them to notify processes when certain events occur • Do not allow processes to specify more than a word of data to exchange with other processes • Created by the kernel in response to interrupts and exceptions, and sent to a process or thread • as a result of executing an instruction (such as a segmentation fault) • from another process (such as when one process terminates another) • from an asynchronous event Deitel, et al., Operating Systems, Third Edition

POSIX Signals Deitel, et al., Operating Systems, Third Edition

Signals • A process/thread can respond to a signal in one of three ways • Ignore the signal: processes can ignore all but the SIGSTOP and SIGKILL signals. • Catch the signal: when a process catches a signal, it invokes its signal handler to respond to the signal. • Execute the default action that the kernel defines for that signal • Default actions • Abort: terminate immediately • Memory dump: copy the execution context before exiting • Ignore • Stop (i.e., suspend) • Continue (i.e., resume) Deitel, et al., Operating Systems, Third Edition

Signals • Signal blocking • A process or thread can block a signal • Signal is not delivered until process/thread stops blocking it • While a signal handler is running, signals of that type are blocked by default • Still possible to receive signals of a different type • Common signals are not queued • Real-time signals provide signal queuing Deitel, et al., Operating Systems, Third Edition
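Catching and blocking can be tried out in a few lines. This is a minimal sketch, assuming a POSIX system and that the code runs in the main thread of a single-threaded Python process (Python only allows signal handlers to be installed there); the function name demo_signal_blocking is hypothetical. It blocks SIGUSR1, sends the signal to itself twice, and then unblocks it.

```python
import os
import signal


def demo_signal_blocking():
    """Record the order of events when a blocked signal is later unblocked."""
    events = []

    def handler(signum, frame):
        events.append("handled")

    old_handler = signal.signal(signal.SIGUSR1, handler)  # catch the signal
    if old_handler is None:  # previous handler was not installed from Python
        old_handler = signal.SIG_DFL
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        # Two sends while blocked: an ordinary signal is not queued,
        # so the second send merges with the pending first one.
        os.kill(os.getpid(), signal.SIGUSR1)
        os.kill(os.getpid(), signal.SIGUSR1)
        if signal.SIGUSR1 in signal.sigpending():
            events.append("pending")
        # Delivery happens only once the signal is unblocked.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
        events.append("unblocked")
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signal.SIGUSR1, old_handler)
    return events


if __name__ == "__main__":
    print(demo_signal_blocking())
```

The handler runs once even though the signal was sent twice, which matches the bullet stating that common signals are not queued.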

Pipes • Pipes • Producer process writes data to the pipe, after which the consumer process reads data from the pipe in first-in-first-out order • When a pipe is created, an inode that points to the pipe buffer (a page of data) is created • Access to pipes is controlled by file descriptors • Can be passed between related processes (e.g., parent and child) • Named pipes (FIFOs) • Can be accessed via the directory tree • Limitation: Fixed-size buffer Deitel, et al., Operating Systems, Third Edition
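The producer/consumer relationship between a parent and a child can be sketched as follows, assuming a POSIX system with fork. The function name pipe_roundtrip is hypothetical. The child inherits the pipe's file descriptors, writes a few lines as the producer, and the parent reads them back in first-in-first-out order.

```python
import os


def pipe_roundtrip(messages):
    """Send messages from a child (producer) to the parent (consumer)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: producer. It only needs the write end.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as writer:
                for message in messages:
                    writer.write(message.encode() + b"\n")
            status = 0
        finally:
            os._exit(status)

    # Parent: consumer. Closing the unused write end lets the read loop
    # see end-of-file once the child has closed its copy.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        received = [line.decode().rstrip("\n") for line in reader]
    os.waitpid(pid, 0)
    return received


if __name__ == "__main__":
    print(pipe_roundtrip(["one", "two", "three"]))
```

Because the pipe buffer has a fixed size, a producer that writes faster than the consumer reads will block until space is available.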

Message Queues • Message queues • Allow processes to transmit information that is composed of a message type and a variable-length data area • Messages are stored in message queues and remain there until a process is ready to receive them • Related processes can search for a message queue identifier in a global array of message queue descriptors • Message queue descriptor contains • Queue of pending messages • Queue of processes waiting for messages • Queue of processes waiting to send messages • Data describing the size and contents of the message queue Deitel, et al., Operating Systems, Third Edition

Shared Memory • Shared memory [protection schemes] • Advantages • Improves performance for processes that frequently access shared data • Processes can share as much data as they can address • Standard interfaces • System V shared memory • POSIX shared memory • Does not allow processes to change privileges for a segment of shared memory Deitel, et al., Operating Systems, Third Edition
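The basic idea of two processes addressing the same memory can be shown with a small sketch. Note the assumption: it uses an anonymous shared mapping created with Python's mmap module and inherited across fork, not the System V or POSIX shared memory interfaces named on the slide, and the function name shared_counter_demo is hypothetical. The child increments an integer in the shared page and the parent reads the result without any copy through the kernel.

```python
import mmap
import os
import struct


def shared_counter_demo(increments):
    """Let a child process update an integer that the parent can read."""
    # With fileno -1 the mapping is anonymous; on Unix it is shared by
    # default, so it stays common to parent and child after fork.
    shm = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("i", shm, 0, 0)  # counter starts at zero

    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            for _ in range(increments):
                (value,) = struct.unpack_from("i", shm, 0)
                struct.pack_into("i", shm, 0, value + 1)
            status = 0
        finally:
            os._exit(status)

    # Waiting for the child is the only synchronization used here.
    os.waitpid(pid, 0)
    (result,) = struct.unpack_from("i", shm, 0)
    shm.close()
    return result


if __name__ == "__main__":
    print(shared_counter_demo(5))
```

Only one process writes in this sketch. If both processes updated the counter concurrently, the read-modify-write sequence would need to be protected, for example by a semaphore.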

System V Shared Memory System Calls Deitel, et al., Operating Systems, Third Edition

Shared Memory • Shared memory implementation • Treats region of shared memory as a file • Shared memory page frames are freed when file is deleted • Tmpfs (temporary file system) stores such files • Tmpfs pages are swappable • Permissions can be set • File system does not require formatting Deitel, et al., Operating Systems, Third Edition

System V Semaphores • System V semaphores • Designed for user processes to access via the system call interface • Semaphore arrays • Protect a group of related resources • Before a process can access resources protected by a semaphore array, the kernel requires that there be sufficient available resources to satisfy the process’s request • Otherwise, kernel blocks requesting process until resources become available • Preventing deadlock • When a process exits, the kernel reverses all the semaphore operations it performed to allocate its resources Deitel, et al., Operating Systems, Third Edition

Sockets • Sockets • Allows pairs of processes to exchange data by establishing direct bidirectional communication channels • Primarily used for bidirectional communication between multiple processes on different systems, but can be used for processes on the same system • Stored internally as files • File name used as socket’s address, accessed via the VFS Deitel, et al., Operating Systems, Third Edition

Sockets • Stream sockets • Implement the traditional client/server model • Data is transferred as a stream of bytes • Use TCP to communicate, so they are more appropriate for reliable communication • Datagram sockets • Faster, but less reliable communication • Data is transferred using datagram packets • Socketpairs • Pair of connected, unnamed sockets • Limited to use by processes that share file descriptors Deitel, et al., Operating Systems, Third Edition
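A socketpair is the easiest of these variants to try on a single machine. The following minimal sketch assumes a POSIX system with fork; the function name socketpair_echo is hypothetical. The parent and child share the two connected, unnamed sockets through inherited file descriptors, and the child sends back an upper-cased copy of what it receives, which shows that the channel is bidirectional.

```python
import os
import socket


def socketpair_echo(message: bytes) -> bytes:
    """Send a short message to a child over a socketpair and return its reply."""
    # On Unix this creates two connected AF_UNIX stream sockets.
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            parent_sock.close()
            # A single recv is enough for the short messages used here;
            # a stream socket does not preserve message boundaries.
            data = child_sock.recv(1024)
            child_sock.sendall(data.upper())
            child_sock.close()
            status = 0
        finally:
            os._exit(status)

    child_sock.close()
    parent_sock.sendall(message)
    reply = parent_sock.recv(1024)
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(socketpair_echo(b"hello"))
```

Unlike a pipe, one socketpair carries traffic in both directions, so no second channel is needed for the reply.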

sf01a:cmhuang[/] ipcs
IPC status from <running system> as of Thu Sep 21 14:35:30 CST 2006
T         ID      KEY        MODE        OWNER    GROUP
Message Queues:
Shared Memory:
m          1   0x50000d1d --rw-r--r--     root     root
m          2   0xabbaca01 --rw-rw-rw-     pc62       TR
m       3103   0          --rw-rw-rw-  cmhuang      DSD
m       1404   0          --rw-rw-rw-     root     root
Semaphores:
s          0   0x1        --ra-ra-ra-     root     root
s    2031617   0          --ra-ra-ra-  cmhuang      DSD
s     917506   0          --ra-ra-ra-  cmhuang      DSD

IPC for WinXP • Data oriented • Pipes • Mailslots (message queues) • Shared memory • Procedure oriented / object oriented • Remote procedure calls • Microsoft COM objects • Clipboard • GUI drag-and-drop capability Deitel, et al., Operating Systems, Third Edition

Pipes • Manipulated with file system calls • Read • Write • Open • Pipe server • Process that creates pipe • Pipe clients • Processes that connect to pipe • Modes • Read: pipe server receives data from pipe clients • Write: pipe server sends data to pipe clients • Duplex: pipe server sends and receives data Deitel, et al., Operating Systems, Third Edition

Pipes • Anonymous Pipes • Unidirectional • Between local processes • Synchronous • Pipe handles, usually passed through inheritance • Named Pipes • Unidirectional or bidirectional • Between local or remote processes • Synchronous or asynchronous • Opened by name • Byte stream vs. message stream • Default mode vs. write-through mode Deitel, et al., Operating Systems, Third Edition

Mailslots • Mailslot server: creates mailslot • Mailslot clients: send messages to mailslot • Communication • Unidirectional • No acknowledgement of receipt • Local or remote communication • Implemented as files • Two modes • Datagram: for small messages • Server Message Block (SMB): for large messages Deitel, et al., Operating Systems, Third Edition

Shared Memory • File mapping • Processes map their virtual memory to same page frames in physical memory • Multiple processes access same file • No synchronization guaranteed • File mapping object • Maps file to main memory • File view • Maps a process’s virtual memory to main memory mapped by file mapping object Deitel, et al., Operating Systems, Third Edition

Nokia DSP Gateway
