Inter-Processor Communication for Heterogeneous Dual Core Systems

2006/09/27

Chun-Ming Huang, Ph.D.

National Chip Implementation Center (CIC)

cmhuang@cic.org.tw

Agenda
  • IPC Overview
  • IPC Schemes
  • Nokia DSP Gateway
  • TI DSP/BIOS Link
  • IPC Hardware Architecture
  • Conclusions
What is IPC?
  • Inter-Process Communication
  • Inter-Processor Communication

How to provide inter-process communication services for multi-core systems?

Independent & Cooperating Process
  • Processes executing concurrently in the multitasking environment may be either independent processes or cooperating processes
  • A process is independent if it cannot affect or be affected by the other processes executing in the system; any process that does not share data with any other process is independent
  • A process is cooperating if it can affect or be affected by the other processes executing in the system; any process that shares data with other processes is a cooperating process

Silberschatz, et al., Operating System Principles, Seventh Edition

Why Allow Process Cooperation?
  • Information sharing
  • Computation speedup
  • Modularity
  • Convenience
  • Cooperating processes require an inter-process communication (IPC) mechanism that allows them to exchange data and information

Silberschatz, et al., Operating System Principles, Seventh Edition

IPC Example
  • Unix pipe
    • ls -l / | grep 2005 | wc
    • Output: 2 19 98
    • The grep utility searches text files for a pattern and prints all lines that contain that pattern.
    • The wc utility displays a count of lines, words and characters in a text file.
  • Data exchange
  • Synchronization
Operating System Kernel Components
  • Process scheduler
    • determines when and for how long a process executes on a processor
  • Memory manager
    • determines when and how memory is allocated to processes and what to do when memory becomes full
  • I/O manager
    • services input and output requests from and to hardware devices
  • Inter-process communication (IPC) manager
    • allows processes to communicate with one another
  • File system manager
    • organizes named collections of data on storage devices and provides an interface for accessing data on those devices

Deitel, et al., Operating Systems, Third Edition

Linux Kernel 2.6.17.11

Top-level source directories:

drwxr-xr-x arch
drwxr-xr-x block
drwxr-xr-x crypto
drwxr-xr-x drivers
drwxr-xr-x fs
drwxr-xr-x include
drwxr-xr-x init
drwxr-xr-x ipc
drwxr-xr-x kernel
drwxr-xr-x lib
drwxr-xr-x mm
drwxr-xr-x net
drwxr-xr-x scripts
drwxr-xr-x security
drwxr-xr-x sound
drwxr-xr-x usr

Contents of the ipc directory:

-rw-r--r-- Makefile
-rw-r--r-- compat.c
-rw-r--r-- compat_mq.c
-rw-r--r-- mqueue.c
-rw-r--r-- msg.c
-rw-r--r-- msgutil.c
-rw-r--r-- sem.c
-rw-r--r-- shm.c
-rw-r--r-- util.c
-rw-r--r-- util.h

http://www.kernel.org

Machine-Independent SW in the FreeBSD Kernel

McKusick & Neville-Neil, The Design and Implementation of the FreeBSD Operating System

Multiprocessor OS Organizations
  • Can classify systems based on how processors share operating system responsibilities
  • Three types
    • Master/slave
    • Separate kernels
    • Symmetrical organization

Deitel, et al., Operating Systems, Third Edition

Master/Slave
  • Master/Slave organization
    • Master processor executes the operating system
    • Slaves execute only user processes
    • Hardware asymmetry
    • Low fault tolerance
    • Good for computationally intensive jobs
    • Example: nCUBE system

Deitel, et al., Operating Systems, Third Edition

Separate Kernels
  • Separate kernels organization
    • Each processor executes its own operating system
    • Some globally shared operating system data
    • Loosely coupled
    • Catastrophic failure unlikely, but failure of one processor results in termination of processes on that processor
    • Little contention over resources
    • Example: Tandem system

Deitel, et al., Operating Systems, Third Edition

Symmetrical Organization
  • Symmetrical organization
    • Operating system manages a pool of identical processors
    • High amount of resource sharing
    • Need for mutual exclusion
    • Highest degree of fault tolerance of any organization
    • Some contention for resources
    • Example: BBN Butterfly

Deitel, et al., Operating Systems, Third Edition

Memory Access Architectures
  • Memory access
    • Can classify multiprocessors based on how processors share memory
    • Goal: Fast memory access from all processors to all memory
      • Contention in large systems makes this impractical

Deitel, et al., Operating Systems, Third Edition

Uniform Memory Access
  • Uniform memory access (UMA) multiprocessor
    • All processors share all memory
    • Access time to any memory page is nearly the same for all processors and all memory modules (disregarding cache hits)
    • Typically uses shared bus or crossbar-switch matrix
    • Also called symmetric multiprocessing (SMP)
    • Small multiprocessors (typically two to eight processors)

Deitel, et al., Operating Systems, Third Edition

Uniform Memory Access

Deitel, et al., Operating Systems, Third Edition

Non-Uniform Memory Access
  • Non-uniform memory access (NUMA) multiprocessor
    • Each node contains a few processors and a portion of system memory, which is local to that node
    • Access to local memory faster than access to global memory (rest of memory)
    • More scalable than UMA (fewer bus collisions)

Deitel, et al., Operating Systems, Third Edition

Non-Uniform Memory Access

Deitel, et al., Operating Systems, Third Edition

Cache-Only Memory Architecture
  • Cache-only memory architecture (COMA) multiprocessor
    • Physically interconnected in the same way as a NUMA multiprocessor
      • Local memory vs. global memory
    • Main memory is viewed as a cache and called an attraction memory (AM)
      • Allows system to migrate data to node that most often accesses it at granularity of a memory line (more efficient than a memory page)
      • Reduces the number of cache misses serviced remotely
      • Overhead
        • Duplicated data items
        • Complex protocol to ensure all updates are received at all processors

Deitel, et al., Operating Systems, Third Edition

Cache-Only Memory Architecture

Deitel, et al., Operating Systems, Third Edition

No Remote Memory Access
  • No-remote-memory-access (NORMA) multiprocessor
    • Does not share physical memory
    • Some implement the illusion of shared physical memory—shared virtual memory (SVM)
    • Loosely coupled
    • Communication through explicit messages
    • Form a distributed system rather than a networked system

Deitel, et al., Operating Systems, Third Edition

No Remote Memory Access

Deitel, et al., Operating Systems, Third Edition

Communication via Files
  • Communication via files is the oldest way of exchanging data between programs. Program A writes data to a file and Program B reads it. In a system in which only one program can run at any given time, this presents no problem.
  • In a multitasking system, however, both programs can run as processes at least quasi-parallel to each other. Race conditions then typically produce inconsistencies in the file data, which result from one program reading a data area before the other has finished modifying it, or from both processes modifying the same area at the same time.
Communication via Files
  • Locking entire files
    • lock file
    • fcntl() (POSIX), flock() (BSD 4.3)
  • Locking file areas (record locking)
    • Deadlock
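
Whole-file locking can be sketched as follows. This is a minimal illustration rather than anything taken from the slides: it uses Python's fcntl.flock() wrapper around the BSD flock() call, the helper names try_exclusive_lock, release_lock and demo are hypothetical, and two separate open() calls in one process stand in for "Program A" and "Program B".

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Open path and try to take a whole-file exclusive lock without blocking.

    Returns the open file object on success, or None if another open file
    description already holds the lock.
    """
    f = open(path, "a+")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


def release_lock(f):
    """Drop the lock and close the file."""
    fcntl.flock(f, fcntl.LOCK_UN)
    f.close()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        first = try_exclusive_lock(path)    # "Program A" takes the lock
        second = try_exclusive_lock(path)   # "Program B" is refused
        release_lock(first)
        third = try_exclusive_lock(path)    # the lock is free again
        result = (first is not None, second is None, third is not None)
        release_lock(third)
        return result
    finally:
        os.remove(path)
```

The non-blocking flag is used here so that a refused lock is visible as a return value; a blocking lock taken by two processes in opposite order on two files is one way the deadlock mentioned above can arise.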
Process Communication Models
  • Message passing
  • Shared memory

Silberschatz, et al., Operating System Principles, Seventh Edition

IPC for Linux
  • Linux IPC
    • Many IPC mechanisms derived from traditional UNIX IPC
      • Allow processes to exchange information
    • Some are better suited for particular applications
      • For example, those that communicate over a network or exchange short messages with other local applications

Deitel, et al., Operating Systems, Third Edition

IPC for Linux
  • Signal
  • Pipe
  • Message queue
  • Shared memory
  • System V Semaphores
  • Sockets
Signals
  • Signals
    • One of the first interprocess communication mechanisms available in UNIX systems
    • Kernel uses them to notify processes when certain events occur
    • Do not allow processes to specify more than a word of data to exchange with other processes
    • Created by the kernel in response to interrupts and exceptions, and sent to a process or thread
      • as a result of executing an instruction (such as a segmentation fault)
      • from another process (such as when one process terminates another)
      • from an asynchronous event

Deitel, et al., Operating Systems, Third Edition


POSIX Signals

Deitel, et al., Operating Systems, Third Edition

Signals
  • A process/thread can handle a signal by
    • Ignoring the signal: processes can ignore all but the SIGSTOP and SIGKILL signals.
    • Catching the signal: when a process catches a signal, it invokes its signal handler to respond to the signal.
    • Executing the default action that the kernel defines for that signal
  • Default actions
    • Abort: terminate immediately
    • Memory dump: Copies execution context before exiting
    • Ignore
    • Stop (i.e., suspend)
    • Continue (i.e., resume)

Deitel, et al., Operating Systems, Third Edition
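
The catch and ignore options can be tried out with a short sketch. It assumes a POSIX system and must run in the main thread; SIGUSR1 is chosen only as a harmless example, and the names on_usr1, catch_then_ignore and can_install_handler are hypothetical.

```python
import os
import signal
import time

received = []


def on_usr1(signum, frame):
    # Handlers should do very little; this one only records the signal number.
    received.append(signum)


def catch_then_ignore():
    """Send SIGUSR1 to ourselves twice: first caught, then ignored."""
    received.clear()
    previous = signal.signal(signal.SIGUSR1, on_usr1)      # catch the signal
    try:
        os.kill(os.getpid(), signal.SIGUSR1)
        deadline = time.monotonic() + 1.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
        caught = list(received)

        signal.signal(signal.SIGUSR1, signal.SIG_IGN)      # ignore the signal
        os.kill(os.getpid(), signal.SIGUSR1)
        time.sleep(0.05)
        return caught, list(received)
    finally:
        if previous is not None:
            signal.signal(signal.SIGUSR1, previous)


def can_install_handler(signum):
    """Return False if the kernel refuses a handler for this signal."""
    try:
        old = signal.signal(signum, on_usr1)
    except OSError:
        return False
    if old is not None:
        signal.signal(signum, old)
    return True
```

Calling can_install_handler(signal.SIGKILL) is expected to return False, which matches the rule that SIGKILL and SIGSTOP can be neither caught nor ignored.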

Signals
  • Signal blocking
    • A process or thread can block a signal
      • Signal is not delivered until process/thread stops blocking it
    • While a signal handler is running, signals of that type are blocked by default
      • Still possible to receive signals of a different type
    • Common signals are not queued
      • Real-time signals provide signal queuing

Deitel, et al., Operating Systems, Third Edition

Pipes
  • Pipes 
    • Producer process writes data to the pipe, after which the consumer process reads data from the pipe in first-in-first-out order
    • When pipe is created, an inode that points to pipe buffer (page of data) is created
    • Access to pipes is controlled by file descriptors
      • Can be passed between related processes (e.g., parent and child)
    • Named pipes (FIFOs)
      • Can be accessed via the directory tree
    • Limitation: Fixed-size buffer

Deitel, et al., Operating Systems, Third Edition
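
A producer and consumer connected by an anonymous pipe might look like the sketch below. It assumes a POSIX system with fork(); the function name pipe_roundtrip is hypothetical, and the child plays the producer while the parent plays the consumer.

```python
import os


def pipe_roundtrip(chunks):
    """Child (producer) writes chunks into a pipe; parent (consumer) reads them."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: producer. Close the unused read end, write, and exit.
        os.close(read_fd)
        try:
            for chunk in chunks:
                os.write(write_fd, chunk)
        finally:
            os.close(write_fd)
            os._exit(0)

    # Parent: consumer. Close the unused write end so that read() can see
    # end-of-file once the producer has closed its copy.
    os.close(write_fd)
    received = bytearray()
    with os.fdopen(read_fd, "rb") as reader:
        while True:
            data = reader.read(4096)
            if not data:
                break
            received += data
    os.waitpid(pid, 0)
    return bytes(received)
```

The file descriptors reach the child through inheritance across fork(), which is the "related processes" case above; the data arrives in first-in-first-out order regardless of how the producer split its writes.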

Message Queues
  • Message queues
    • Allow processes to transmit information that is composed of a message type and a variable-length data area
      • Stored in message queues, remain until a process is ready to receive them
      • Related processes can search for a message queue identifier in a global array of message queue descriptors
        • Message queue descriptor contains
          • Queue of pending messages
          • Queue of processes waiting for messages
          • Queue of processes waiting to send messages
          • Data describing the size and contents of the message queue

Deitel, et al., Operating Systems, Third Edition

Shared Memory
  • Shared memory [protection schemes]
    • Advantages
      • Improves performance for processes that frequently access shared data
      • Processes can share as much data as they can address
    • Standard interfaces
      • System V shared memory
      • POSIX shared memory
        • Does not allow processes to change privileges for a segment of shared memory

Deitel, et al., Operating Systems, Third Edition


System V Shared Memory System Calls

Deitel, et al., Operating Systems, Third Edition

Shared Memory
  • Shared memory implementation
    • Treats region of shared memory as a file
    • Shared memory page frames are freed when file is deleted
    • Tmpfs (temporary file system) stores such files
      • Tmpfs pages are swappable
      • Permissions can be set
      • File system does not require formatting

Deitel, et al., Operating Systems, Third Edition
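
The file-like nature of a shared memory segment can be seen in a small sketch using Python's multiprocessing.shared_memory module (Python 3.8 or later), which wraps POSIX shared memory on Linux. The function name share_bytes is hypothetical, and a second handle in the same process stands in for a second process attaching by name.

```python
from multiprocessing import shared_memory


def share_bytes(payload):
    """Write payload into a named segment and read it back through a second handle."""
    # Create a named segment; payload must be non-empty.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload
        # A second handle attaches by name, as another process would.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            seen = bytes(peer.buf[:len(payload)])
            peer.buf[0] = ord("J")           # the peer's write is visible to the owner
            after = bytes(owner.buf[:len(payload)])
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()   # removing the name lets the backing pages be freed
    return seen, after
```

Note that nothing here synchronizes the two handles; real cooperating processes would pair the segment with a semaphore or similar mechanism.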

System V Semaphores
  • System V semaphores
    • Designed for user processes to access via the system call interface
  • Semaphore arrays
    • Protect a group of related resources
    • Before a process can access resources protected by a semaphore array, the kernel requires that there be sufficient available resources to satisfy the process’s request
    • Otherwise, kernel blocks requesting process until resources become available
  • Preventing deadlock
    • When a process exits, the kernel reverses all the semaphore operations it performed to allocate its resources

Deitel, et al., Operating Systems, Third Edition

Sockets
  • Sockets
    • Allow pairs of processes to exchange data by establishing direct bidirectional communication channels
    • Primarily used for bidirectional communication between multiple processes on different systems, but can be used for processes on the same system
    • Stored internally as files
    • File name used as socket’s address, accessed via the VFS

Deitel, et al., Operating Systems, Third Edition

Sockets
  • Stream sockets
    • Implement the traditional client/server model
    • Data is transferred as a stream of bytes
    • Use TCP to communicate, so they are more appropriate for reliable communication
  • Datagram sockets
    • Faster, but less reliable communication
    • Data is transferred using datagram packets
  • Socketpairs
    • Pair of connected, unnamed sockets
    • Limited to use by processes that share file descriptors

Deitel, et al., Operating Systems, Third Edition
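
A socketpair is the easiest of these to try locally. The sketch below is illustrative only: socketpair_echo is a hypothetical name, both ends live in one process for brevity, and the single recv() per message is safe only because the messages are tiny.

```python
import socket


def socketpair_echo(message):
    """Send a message one way and an upper-cased reply back over a socketpair."""
    # A pair of connected, unnamed stream sockets.
    left, right = socket.socketpair()
    try:
        left.sendall(message)                # left -> right
        request = right.recv(len(message))
        right.sendall(request.upper())       # right -> left: the channel is bidirectional
        reply = left.recv(len(message))
        return request, reply
    finally:
        left.close()
        right.close()
```

In practice the two descriptors would be shared with a child process through fork(), which is why socketpairs are limited to processes that share file descriptors.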


sf01a:cmhuang[/] ipcs
IPC status from <running system> as of Thu Sep 21 14:35:30 CST 2006
T  ID       KEY         MODE         OWNER    GROUP
Message Queues:
Shared Memory:
m  1        0x50000d1d  --rw-r--r--  root     root
m  2        0xabbaca01  --rw-rw-rw-  pc62     TR
m  3103     0           --rw-rw-rw-  cmhuang  DSD
m  1404     0           --rw-rw-rw-  root     root
Semaphores:
s  0        0x1         --ra-ra-ra-  root     root
s  2031617  0           --ra-ra-ra-  cmhuang  DSD
s  917506   0           --ra-ra-ra-  cmhuang  DSD

IPC for WinXP
  • Data oriented
    • Pipes
    • Mailslots (message queues)
    • Shared memory
  • Procedure oriented / object oriented
    • Remote procedure calls
    • Microsoft COM objects
    • Clipboard
    • GUI drag-and-drop capability

Deitel, et al., Operating Systems, Third Edition

Pipes
  • Manipulated with file system calls
    • Read
    • Write
    • Open
  • Pipe server
    • Process that creates pipe
  • Pipe clients
    • Processes that connect to pipe
  • Modes
    • Read: pipe server receives data from pipe clients
    • Write: pipe server sends data to pipe clients
    • Duplex: pipe server sends and receives data

Deitel, et al., Operating Systems, Third Edition

Pipes
  • Anonymous Pipes
    • Unidirectional
    • Between local processes
    • Synchronous
    • Pipe handles, usually passed through inheritance
  • Named Pipes
    • Unidirectional or bidirectional
    • Between local or remote processes
    • Synchronous or asynchronous
    • Opened by name
    • Byte stream vs. message stream
    • Default mode vs. write-through mode

Deitel, et al., Operating Systems, Third Edition

Mailslots
  • Mailslot server: creates mailslot
  • Mailslot clients: send messages to mailslot
  • Communication
    • Unidirectional
    • No acknowledgement of receipt
    • Local or remote communication
    • Implemented as files
    • Two modes
      • Datagram: for small messages
      • Server Message Block (SMB): for large messages

Deitel, et al., Operating Systems, Third Edition

Shared Memory
  • File mapping
    • Processes map their virtual memory to same page frames in physical memory
    • Multiple processes access same file
    • No synchronization guaranteed
  • File mapping object
    • Maps file to main memory
  • File view
    • Maps a process’s virtual memory to main memory mapped by file mapping object

Deitel, et al., Operating Systems, Third Edition

Nokia DSP Gateway Overview
  • Supports TI OMAP1510, 1610, 5910, 5912, 2410, and 2412.
  • GPP side
    • Linux kernel 2.6.6
    • Linux device driver
    • Access DSP through normal system calls such as read() and write()
  • DSP side
    • TI DSP/BIOS
    • DSP kernel library (tokliBIOS) and API

http://dspgateway.sourceforge.net/pub/index.php

Nokia DSP Gateway Overview
  • Current version: 3.3.1 (2006-09-13)
  • Open source software
  • Current license state:

Summary of changes from v2.6.5 to v2.6.6

============================================

<tony@com.rmk.(none)> [ARM PATCH] 1777/1: Add TI OMAP support to ARM core files

Patch from Tony Lindgren

This patch updates the ARM Linux core files to add support for Texas Instruments OMAP-1510, 1610, and 730 processors.

OMAP is an embedded ARM processor with integrated DSP.

OMAP-1610 has hardware support for USB OTG, which might be of interest to Linux developers. OMAP-1610 could be easily be used as development platform to add USB OTG support to Linux.

This patch is an updated version of an earlier patch 1767/1 with the dummy Kconfig added for OMAP as suggested by Russell King here:

http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=1767/1

This patch is brought to you by various linux-omap developers.

http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.6

TI DSP/BIOS
  • Scalable real-time kernel
  • Real-time scheduling and synchronization
  • Host-to-target communication
  • Real-time instrumentation
  • Preemptive multi-threading
  • Hardware abstraction
  • Real-time analysis and configuration tools
  • Application programs use DSP/BIOS by making calls to the API
  • All DSP/BIOS modules provide C-callable interfaces
Mailbox in OMAP1
  • Each set of mailbox registers consists of two 16-bit registers and a 1-bit flag register.
  • The interrupting processor can use one 16-bit register to pass a data word to the interrupted processor and the other 16-bit register to pass a command word.
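
One mailbox register set can be modelled in a few lines. This is a toy model, not OMAP driver code: the class and method names are hypothetical, and the flag behaviour (raised when a message is posted, cleared when it is read) is an assumption made for the sketch.

```python
class Mailbox:
    """Toy model of one mailbox register set: two 16-bit registers plus a flag."""

    def __init__(self):
        self.command = 0   # 16-bit command word
        self.data = 0      # 16-bit data word
        self.flag = 0      # 1-bit flag: 1 while a message is pending (assumed)

    def post(self, command, data):
        """Interrupting processor: write both words and raise the flag."""
        if not (0 <= command <= 0xFFFF and 0 <= data <= 0xFFFF):
            raise ValueError("command and data must each fit in 16 bits")
        if self.flag:
            return False                 # previous message not yet consumed
        self.data = data
        self.command = command
        self.flag = 1                    # stands in for the interrupt
        return True

    def take(self):
        """Interrupted processor: read both words and clear the flag."""
        if not self.flag:
            return None
        message = (self.command, self.data)
        self.flag = 0
        return message
```

With only 32 bits per message, the mailbox is suited to commands and small parameters, which is why larger transfers need a separate buffer mechanism.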
Mailbox in OMAP2
  • Six sets of mailbox registers; each message register can carry a 32-bit data word
  • Two mailbox queues are reserved: MAILBOX_0 for the ARM-to-DSP direction and MAILBOX_1 for the DSP-to-ARM direction
Mailbox Command and Data Register
  • Command register bit definitions
  • Data register bit definitions
Mailbox Command Sequence
  • Configuration sequence
    • System configuration
    • Task configuration
    • Task add/delete
  • Data transfer sequence
    • ARM to DSP transfer
    • DSP to ARM transfer
    • Task control
    • Read/write DSP register
    • Read/write DSP system parameters
IPC Buffer
  • It is unrealistic to transfer a large amount of data between two processors using only the mailbox registers. Therefore, the IPBUF (Inter-Processor Buffer) is introduced for large block data transfers.
  • There are three types of IPBUFs:
    • Global IPBUF
    • Private IPBUF
    • System IPBUF
Global IPBUF
  • The Global IPBUFs are defined for the block data transfer between ARM and DSP.
  • The Global IPBUF lines are identified by a BID (Buffer ID) and are shared by all tasks.
  • The maximum line size is 64k words (128k bytes).
TI DSP/BIOS Link
  • For TI OMAP5910/5912, DaVinci, and DM642 devices.
  • DSP/BIOS Link is a no-charge, royalty-free product and is provided in C source code form.
  • Current version: 1.30.06 (Nov. 22, 2005)
  • Portable across different operating systems.
  • OS (GPP) + DSP/BIOS (DSP)

http://focus.ti.com/dsp/docs/dspsupportatn.tsp?sectionId=3&tabId=477&familyId=44&toolTypeId=5

DSP/BIOS Link Supported Platforms
  • DaVinci running MontaVista Linux Pro 4.0 or PrKernel v4.1 on ARM
  • OMAP5912 running MontaVista Linux Pro 3.1 on ARM
  • DA300 running PrKernel v4.1 on ARM
  • DM642 connected to a PC running Red Hat Linux 9.0 or Red Hat Enterprise Linux 4.0
On the GPP Side
  • The OS ADAPTATION LAYER encapsulates the generic OS services that are required by the other components of DSP/BIOS LINK. This component exports a generic API that insulates the other components from the specifics of an OS. All other components use this API instead of direct OS calls. This makes DSP/BIOS LINK portable across different operating systems.
  • The LINK DRIVER encapsulates the low-level control operations on the physical link between the GPP and DSP. This module is responsible for controlling the execution of the DSP and data transfer using a defined protocol across the GPP-DSP boundary.
On the GPP Side
  • The PROCESSOR MANAGER maintains book-keeping information for all components. It also allows different boot-loaders to be plugged into the system. It exposes the control operations provided by the LINK DRIVER to the user through the API layer.
  • The DSP/BIOS LINK API is the interface for all clients on the GPP side. This is a very thin component and usually does no more processing than parameter validation. The API layer can be considered the ‘skin’ on the ‘muscle’ mass contained in the PROCESSOR MANAGER and LINK DRIVER.
On the DSP Side
  • The LINK DRIVER is one of the drivers in DSP/BIOS. This driver specializes in communicating with the GPP over the physical link.
  • There is no specific DSP/BIOS LINK API on the DSP. The communication (data/message transfer) is done using the DSP/BIOS modules - SIO/GIO/MSGQ.
DSP/BIOS Link Key Components
  • PROC
    • This component represents the DSP processor in the application space.
    • This component provides services to:
      • Initialize the DSP & make it available for access from the GPP.
      • Load code on the DSP.
      • Start execution from the run address specified in the executable.
      • Read from or write to DSP memory.
      • Stop execution.
      • Additional platform-specific control actions.
    • In the current version, only one processor is supported. However, the APIs are designed to support multiple DSPs and hence they accept a processorID argument to support this future enhancement.
DSP/BIOS Link Key Components
  • CHNL
    • This component represents a logical data transfer channel in the application space.
    • CHNL is responsible for the data transfer across the GPP and DSP.
    • CHNL is an acronym for ‘channel’.
    • A channel (when referred to in the context of DSP/BIOS LINK) is:
      • A means of transferring data across GPP and DSP.
      • A logical entity mapped over a physical connectivity between the GPP and DSP.
      • Uniquely identified by a number within the range of channels for a specific physical link towards a DSP.
      • Unidirectional. The direction of a channel is decided at run time based on the attributes passed to the corresponding API.
DSP/BIOS Link Key Components
  • MSGQ
    • This component represents queue based messaging
    • This component is responsible for exchanging short messages of variable length between the GPP and DSP clients. It is based on the MSGQ module in DSP/BIOS.
    • The messages are sent and received through message queues.
    • A reader gets the message from the queue and a writer puts the message on a queue. A message queue can have only one reader and many writers. A task may read from and write to multiple message queues.
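
The one-reader, many-writers rule can be illustrated with a generic model. This sketch does not use the DSP/BIOS Link MSGQ API: it relies on Python's queue.Queue and threads, and the name run_message_queue is hypothetical.

```python
import queue
import threading


def run_message_queue(writer_count, messages_per_writer):
    """Many writers put variable-length messages; a single reader gets them."""
    msgq = queue.Queue()
    received = []

    def writer(writer_id):
        for n in range(messages_per_writer):
            payload = b"x" * (n + 1)         # short, variable-length message
            msgq.put((writer_id, payload))

    def reader():
        # The only consumer of this queue.
        for _ in range(writer_count * messages_per_writer):
            received.append(msgq.get())

    writers = [threading.Thread(target=writer, args=(i,))
               for i in range(writer_count)]
    reader_thread = threading.Thread(target=reader)
    reader_thread.start()
    for t in writers:
        t.start()
    for t in writers:
        t.join()
    reader_thread.join()
    return received
```

Messages from different writers may interleave in any order, but each writer's own messages arrive in the order they were put.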
DSP/BIOS Link Key Components
  • POOL
    • This component provides APIs to open and close memory pools, which are used by the CHNL and MSGQ components for allocating the buffers used in data transfer and messaging respectively.
    • This component is responsible for providing a uniform view of different memory pool implementations, which may be specific to the hardware architecture or OS on which DSP/BIOS LINK is ported. This component is based on the POOL interface in DSP/BIOS.
Initialization Phase API
  • PROC
    • PROC_Setup()
    • PROC_Attach()
    • PROC_Load()
  • CHNL
    • CHNL_Create()
    • CHNL_AllocateBuffer()
  • MSGQ
    • MSGQ_TransportOpen()
    • MSGQ_Open()
    • MSGQ_SetErrorHandler()
    • MSGQ_Locate()
  • POOL
    • POOL_Open()
Execution Phase API
  • PROC
    • PROC_Start()
    • PROC_Read()
    • PROC_Write()
    • PROC_Stop()
  • CHNL
    • CHNL_Issue()
    • CHNL_Reclaim()
  • MSGQ
    • MSGQ_Alloc()
    • MSGQ_Put()
    • MSGQ_Get()
    • MSGQ_GetSrcQueue()
    • MSGQ_Free()
Finalization Phase API
  • PROC
    • PROC_Detach()
    • PROC_Destroy()
  • CHNL
    • CHNL_FreeBuffer()
    • CHNL_Delete()
  • MSGQ
    • MSGQ_Release()
    • MSGQ_TransportClose()
    • MSGQ_Close()
  • POOL
    • POOL_Close()
Tightly Coupled vs. Loosely Coupled Systems
  • Tightly coupled systems
    • Processors share most resources including memory
    • Communicate over shared buses using shared physical memory
  • Loosely coupled systems
    • Processors do not share most resources
    • Most communication through explicit messages or shared virtual memory (although not shared physical memory)
  • Comparison
    • Loosely coupled systems: more flexible, fault tolerant, scalable
    • Tightly coupled systems: more efficient, less burden to operating system programmers

Deitel, et al., Operating Systems, Third Edition

Tightly Coupled Systems

Deitel, et al., Operating Systems, Third Edition

Loosely Coupled Systems

Deitel, et al., Operating Systems, Third Edition

Processor Interconnection Schemes
  • Interconnection scheme
    • Describes how the system’s components, such as processors and memory modules, are connected
    • Consists of nodes (components or switches) and links (connections)
    • Parameters used to evaluate interconnection schemes
      • Node degree
      • Bisection width
      • Network diameter
      • Cost of the interconnection scheme

Deitel, et al., Operating Systems, Third Edition
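
The evaluation parameters can be computed directly for a small network. The sketch below uses a hypercube as the example topology; the function names are hypothetical, and links_across_cut counts the links crossing one particular split, which for a hypercube split on its highest address bit is known to equal the bisection width.

```python
from collections import deque


def hypercube(dim):
    """Adjacency lists of a dim-dimensional hypercube; nodes are 0 .. 2**dim - 1."""
    return {node: [node ^ (1 << bit) for bit in range(dim)]
            for node in range(1 << dim)}


def node_degree(graph):
    """Largest number of links attached to any node."""
    return max(len(neighbors) for neighbors in graph.values())


def network_diameter(graph):
    """Longest shortest path between any two nodes, via BFS from every node."""
    longest = 0
    for start in graph:
        dist = {start: 0}
        todo = deque([start])
        while todo:
            node = todo.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    todo.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest


def links_across_cut(graph, half):
    """Number of links with exactly one endpoint in the node set `half`."""
    return sum(1 for node in half for nxt in graph[node] if nxt not in half)
```

For a 3-dimensional hypercube this gives a node degree of 3, a diameter of 3 and 4 links across the cut; each extra dimension adds one to the degree and diameter and doubles the cut.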

Processor Interconnection Schemes

Shared bus multiprocessor organization.

Deitel, et al., Operating Systems, Third Edition

Processor Interconnection Schemes

Crossbar-switch matrix multiprocessor organization.

Deitel, et al., Operating Systems, Third Edition

Processor Interconnection Schemes

4-connected 2-D mesh network.

Deitel, et al., Operating Systems, Third Edition

Processor Interconnection Schemes

3- and 4-dimensional hypercubes.

Deitel, et al., Operating Systems, Third Edition

Processor Interconnection Schemes

Multistage baseline network.

Deitel, et al., Operating Systems, Third Edition

A Simple IPC Architecture
  • ARM writes command in shared memory
  • ARM interrupts DSP
  • DSP responds to interrupt and reads command in shared memory
  • DSP executes a task based on the command
  • DSP interrupts ARM upon completion of the task

TMS320DM644x DMSoC ARM Subsystem Reference Guide (SPRUE14)
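
The five-step handshake of this simple architecture can be simulated on a single machine. In this sketch a dictionary stands in for the shared memory, threading.Event objects stand in for the two interrupt lines, and a thread plays the DSP; the function name run_command_on_dsp and the two example tasks are hypothetical.

```python
import threading


def run_command_on_dsp(command, operand):
    """Simulate the ARM/DSP handshake and return the DSP's result."""
    tasks = {"square": lambda x: x * x, "negate": lambda x: -x}
    if command not in tasks:
        raise ValueError("unknown command: %r" % (command,))

    shared = {"command": None, "operand": None, "result": None}  # shared memory
    dsp_irq = threading.Event()    # ARM -> DSP interrupt line
    arm_irq = threading.Event()    # DSP -> ARM interrupt line

    def dsp():
        dsp_irq.wait()                                 # DSP responds to the interrupt
        task = tasks[shared["command"]]                # and reads the command
        shared["result"] = task(shared["operand"])     # DSP executes the task
        arm_irq.set()                                  # DSP interrupts ARM on completion

    worker = threading.Thread(target=dsp)
    worker.start()
    shared["command"] = command                        # ARM writes command in shared memory
    shared["operand"] = operand
    dsp_irq.set()                                      # ARM interrupts DSP
    arm_irq.wait()                                     # ARM waits for completion
    worker.join()
    return shared["result"]
```

The command is written before the interrupt is raised, so the DSP side never reads a half-written request; the same ordering matters on real hardware.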

OMAP5910 IPC Architecture
  • Mailbox registers
    • 32 bits x 2 in each direction
    • Interrupt occurrence
  • MPU interface (MPUI)
    • MPU accesses DSP memory space directly
  • Shared memory
    • Arrangement with the Traffic Controller
    • 3 types of memories
    • Best suited to sharing large amounts of data
Traffic Controller (TC)
  • The IMIF allows access to the 192K bytes of on-chip SRAM.
  • The EMIFS interface provides 16-bit-wide access to asynchronous or synchronous memories.
  • The EMIFF interface provides 16-bit-wide access to standard SDRAM memories.
  • The TC provides the functions of
    • arbitrating contending accesses to the same memory interface from different initiators (MPU, DSP, System DMA, Local Bus),
    • synchronization of accesses due to the initiators and the memory interfaces running at different clock rates,
    • and the buffering of data allowing burst access for more efficient multiplexing of transfers from multiple initiators to the memory interfaces.
  • The TC’s architecture allows simultaneous transfers between initiators and different memory interfaces without penalty. For instance, if the MPU is accessing the EMIFF at the same time as the DSP is accessing the IMIF, the transfers may occur simultaneously since there is no contention for resources.
ARM IPCM Module
  • The IPCM provides up to 32 mailboxes with control logic and interrupt generation to support inter-processor communication.
  • An AHB interface enables access from source and destination cores.
  • The IPCM:
    • sends interrupts to other cores
    • passes small amounts of data to other cores.
  • A source core can have multiple mailboxes and send messages in parallel (multitasking).

PrimeCell Inter-Processor Communications Module Technical Reference Manual

IPCM Components
  • 1-32 programmable mailboxes, each comprising:
    • a single 1-32-bit Mailbox Source Register
    • a single 1-32-bit Mailbox Destination Register
    • a single 2-bit Mailbox Mode Register
    • a single 1-32-bit Mailbox Mask Register
    • a single 2-bit Mailbox Send Register
    • 0-7 32-bit data registers to store the message.
  • 1-32 sets of read-only interrupt status registers, one for each interrupt, each comprising:
    • 1-32-bit Raw Interrupt Status Register (each bit corresponds to each mailbox)
    • 1-32-bit Masked Interrupt Status Register (each bit corresponds to each mailbox).
  • A 32-bit Configuration Status Register
IPCM Functional Block

PrimeCell Inter-Processor Communications Module Technical Reference Manual

IPCM Example
  • Core0 has a message to send to Core1. Core0 claims the mailbox by setting bit 0 in the Mailbox Source Register. Core0 then sets bit 1 in the Mailbox Destination Register, enables the interrupts and programs the message into the Mailbox Data Registers. Finally, Core0 sends the message by writing 01 to the Mailbox Send Register. This asserts the interrupt to Core1.
  • When Core1 is interrupted, it reads the Masked Interrupt Status Register for IPCMINT[1] to determine which mailbox contains the message. Core1 reads the message in that mailbox, then clears the interrupt and asserts the acknowledge interrupt by writing 10 to the Mailbox Send Register.
  • Core0 is interrupted with the acknowledge message, completing the operation. Core0 then decides whether to retain the mailbox to send another message or release the mailbox, freeing it up for other cores in the system to use it.
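
The message and acknowledge sequence above can be traced with a simplified register model. This is a sketch, not the PrimeCell programming interface: the class and function names are hypothetical, the registers are plain integers, and the rule used to derive the masked interrupt status is an assumption that follows the description (01 interrupts the destination, 10 interrupts the source).

```python
class IpcmMailbox:
    """Simplified model of one IPCM mailbox."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.source = 0        # one bit per core; set bit marks the owning core
        self.destination = 0   # one bit per destination core
        self.mask = 0          # one bit per core whose interrupt is enabled
        self.send = 0b00       # 01 = message sent, 10 = acknowledge sent
        self.data = [0] * 7    # up to seven 32-bit data registers


def masked_status(mailboxes, core):
    """Model of the masked interrupt status for one core's interrupt line.

    Bit i is set when mailbox i is currently interrupting `core`.
    """
    bit = 1 << core
    status = 0
    for index, mb in enumerate(mailboxes):
        message_for_core = mb.send == 0b01 and mb.destination & bit
        ack_for_core = mb.send == 0b10 and mb.source & bit
        if (message_for_core or ack_for_core) and mb.mask & bit:
            status |= 1 << index
    return status


def send_with_ack(words):
    """Core0 sends `words` to Core1 and receives the acknowledge."""
    if len(words) > 7:
        raise ValueError("a mailbox holds at most seven data words")
    mailboxes = [IpcmMailbox(), IpcmMailbox()]
    trace = []

    # Core0: claim mailbox 0, address Core1, enable interrupts, write, send.
    mb = mailboxes[0]
    mb.source = 1 << 0
    mb.destination |= 1 << 1
    mb.mask |= 0b11
    mb.data[:len(words)] = words
    mb.send = 0b01
    trace.append(masked_status(mailboxes, 1))     # Core1 is interrupted

    # Core1: find the mailbox, read the message, acknowledge.
    index = trace[-1].bit_length() - 1
    received = list(mailboxes[index].data[:len(words)])
    mailboxes[index].send = 0b10
    trace.append(masked_status(mailboxes, 1))     # Core1 interrupt is cleared
    trace.append(masked_status(mailboxes, 0))     # Core0 sees the acknowledge

    # Core0: release the mailbox so other cores can use it.
    mb.reset()
    trace.append(masked_status(mailboxes, 0))
    return received, trace
```

The returned trace records the status seen by Core1 after the send, by Core1 and Core0 after the acknowledge, and by Core0 after the release, so the hand-over of the interrupt from destination to source is visible step by step.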
Conclusions
  • IPC schemes for supporting many cores
  • Performance and power consumption analysis for different IPC schemes
  • IPC API schemes