Multiprocessors and Threads

Fred Kuhns

(fredk@arl.wustl.edu, http://www.arl.wustl.edu/~fredk)

Department of Computer Science and Engineering

Washington University in St. Louis

Motivation for Multiprocessors
  • Enhanced Performance -
    • Concurrent execution of tasks for increased throughput (between processes)
    • Exploit Concurrency in Tasks (Parallelism within process)
  • Fault Tolerance -
    • graceful degradation in face of failures

Cs422 – Operating Systems Organization

Basic MP Architectures
  • Single Instruction Single Data (SISD)
    • conventional uniprocessor designs.
  • Single Instruction Multiple Data (SIMD)
    • Vector and Array Processors
  • Multiple Instruction Single Data (MISD)
    • Rarely, if ever, implemented in practice.
  • Multiple Instruction Multiple Data (MIMD)
    • conventional MP designs

MIMD Classifications
  • Tightly Coupled System - all processors share the same global memory and the same address space (typical SMP system).
    • Main memory for IPC and Synchronization.
  • Loosely Coupled System - memory is partitioned and attached to each processor. Hypercube, Clusters (Multi-Computer).
    • Message passing for IPC and synchronization.

MP Block Diagram

[Figure: four CPUs, each with its own cache and MMU, connected through an Interconnection Network to four main memory (MM) modules.]

Memory Access Schemes
  • Uniform Memory Access (UMA)
    • memory is centrally located
    • all processors are equidistant from memory (same access times)
  • Non-Uniform Memory Access (NUMA)
    • memory is physically partitioned but accessible by all processors
    • processors have the same address space
  • No Remote Memory Access (NORMA)
    • memory is physically partitioned and not accessible by all processors
    • each processor has its own address space

Other Details of MP
  • Interconnection technology
    • Bus
    • Cross-Bar switch
    • Multistage Interconnect Network
  • Caching - Cache Coherence Problem!
    • Write-update
    • Write-invalidate
    • bus snooping

MP OS Structure - 1
  • Separate Supervisor -
    • each processor has its own copy of the kernel
    • some data is shared for interaction
    • dedicated I/O devices and file systems
    • good fault tolerance but bad for concurrency
  • Master/Slave Configuration
    • Master: monitors status and assigns work
    • Slaves: schedulable pool of resources
    • master can be bottleneck
    • poor fault tolerance

MP OS Structure - 2
  • Symmetric Configuration - Most Flexible.
    • all processors are autonomous and treated equally
    • one copy of the kernel executed concurrently across all processors
    • Synchronized access to shared data structures:
      • Lock entire OS - Floating Master
      • Mitigated by dividing OS into segments that normally have little interaction
      • multithread kernel and control access to resources (continuum)
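
The two ends of this locking continuum can be sketched in user space with Pthreads mutexes. This is only an analogy, not kernel code: `giant_lock`, `proc_lock`, `file_lock` and the two counters are hypothetical stand-ins for an OS-wide lock and for per-subsystem locks.

```c
#include <assert.h>
#include <pthread.h>

enum { ITERATIONS = 100000 };

/* Coarse end of the continuum: one lock guards the whole "OS". */
static pthread_mutex_t giant_lock = PTHREAD_MUTEX_INITIALIZER;

/* Finer end: one lock per subsystem that rarely interacts with the other. */
static pthread_mutex_t proc_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;

struct job {
    pthread_mutex_t *lock;  /* lock that protects *counter */
    long *counter;          /* shared structure being updated */
};

static void *worker(void *arg)
{
    struct job *j = arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(j->lock);
        (*j->counter)++;
        pthread_mutex_unlock(j->lock);
    }
    return NULL;
}

/* Runs one "processor" per subsystem and returns the total update count. */
static long run(pthread_mutex_t *lock_a, pthread_mutex_t *lock_b)
{
    long proc_table = 0, file_table = 0;
    struct job jobs[2] = { { lock_a, &proc_table }, { lock_b, &file_table } };
    pthread_t tid[2];

    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &jobs[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    return proc_table + file_table;
}

/* Both workers serialize on the same lock. */
long run_giant_lock(void)  { return run(&giant_lock, &giant_lock); }

/* Each worker takes only its own subsystem's lock and can proceed in parallel. */
long run_split_locks(void) { return run(&proc_lock, &file_lock); }
```

Both variants produce the same result; the difference is that with split locks the two workers never wait for each other. Build with `-pthread` and call either function from `main`.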

MP Overview
  • MultiProcessor
    • SIMD
    • MIMD
      • Shared Memory (tightly coupled)
        • Symmetric (SMP)
        • Master/Slave
      • Distributed Memory (loosely coupled)
        • Clusters

SMP OS Design Issues
  • Threads - effectiveness of parallelism depends on performance of primitives used to express and control concurrency.
  • Process Synchronization - disabling interrupts is not sufficient.
  • Process Scheduling - efficient, policy controlled, task scheduling. Issues:
    • Global versus Local (per CPU)
    • Task affinity for a particular CPU
    • resource accounting
    • inter-thread dependencies
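
Disabling interrupts only stops the local CPU from being preempted; another processor can still enter the same critical section. One common remedy is a spinlock built on an atomic test-and-set. The following is a minimal user-space sketch using C11 atomics, not a kernel implementation; `spin_lock`, `spin_unlock` and `spinlock_demo` are names chosen for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { NTHREADS = 4, ITERATIONS = 50000 };

static atomic_flag lock = ATOMIC_FLAG_INIT;
static long shared_counter;      /* protected by lock */

static void spin_lock(atomic_flag *l)
{
    /* Atomically set the flag; keep trying while it was already set. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* busy-wait */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        spin_lock(&lock);
        shared_counter++;        /* critical section */
        spin_unlock(&lock);
    }
    return NULL;
}

/* Runs NTHREADS workers and returns the final counter value. */
long spinlock_demo(void)
{
    pthread_t tid[NTHREADS];

    shared_counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);
    return shared_counter;
}
```

Without the lock the increments from different processors could interleave and updates would be lost; with it, the total is always `NTHREADS * ITERATIONS`.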

SMP OS design issues - cont.
  • Memory Management - complication of shared main memory.
    • cache coherence
    • memory access synchronization
    • balancing overhead with increased concurrency
  • Reliability and Fault Tolerance - degrade gracefully in the event of failures

Typical SMP System

[Figure: four CPUs, each with its own cache and MMU, share a System/Memory Bus with Main Memory. A Bridge connects the bus to the I/O subsystem (INT, ether, scsi, video) and to System Functions (timer, BIOS, reset). The diagram is annotated with 500MHz and 50ns.]
  • Issues:
    • Memory contention
    • Limited bus BW
    • I/O contention
    • Cache coherence
  • Typical I/O Bus:
    • 33MHz/32bit (132MB/s)
    • 66MHz/64bit (528MB/s)
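
The I/O bus figures of 132MB/s and 528MB/s quoted on this slide follow from multiplying the clock rate by the bus width in bytes. The small helper below reproduces that arithmetic; it assumes one transfer per clock cycle and 1MB = 10^6 bytes, and `bus_peak_mb_per_s` is a name made up for this sketch.

```c
#include <assert.h>

/* Peak bandwidth in MB/s for a bus that moves one word per clock cycle. */
unsigned bus_peak_mb_per_s(unsigned clock_mhz, unsigned width_bits)
{
    assert(width_bits % 8 == 0);          /* whole bytes only */
    return clock_mhz * (width_bits / 8);  /* MHz * bytes per transfer */
}
```

So 33MHz x 4 bytes gives 132MB/s and 66MHz x 8 bytes gives 528MB/s. These are peak rates; sustained throughput is lower once arbitration and contention are taken into account.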

Some Useful Definitions
  • Parallelism: degree to which a multiprocessor application achieves parallel execution
  • Concurrency: Maximum parallelism an application can achieve with unlimited processors
  • System Concurrency: kernel recognizes multiple threads of control in a program
  • User Concurrency: User space threads (coroutines) provide a natural programming model for concurrent applications.

Introduction to Threads

[Figure: Single-Threaded Process Model (Process Control Block, User Address Space, User Stack, Kernel Stack) next to the Multithreaded Process Model (one Process Control Block and User Address Space shared by three threads, each thread with its own Thread Control Block, User Stack and Kernel Stack).]

Process Concept Embodies
  • Unit of Resource Ownership - process is allocated a virtual address space to hold the process image
  • Unit of Dispatching - process is an execution path through one or more programs
    • execution may be interleaved with other processes
  • These two characteristics are treated independently by the operating system

Threads
  • Effectiveness of parallel computing depends on the performance of the primitives used to express and control parallelism
  • Separate the notion of execution from the process abstraction
  • Useful for expressing the intrinsic concurrency of a program regardless of resulting performance
  • We will discuss two examples of threading:
    • User threads,
    • Kernel threads

Threads cont.
  • Thread: dynamic object representing an execution path and computational state.
    • One or more threads per process, each having:
      • Execution state (running, ready, etc.)
      • Saved thread context when not running
      • Execution stack
      • Per-thread static storage for local variables
      • Shared access to process resources
        • all threads of a process share a common address space.
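
A short Pthreads sketch can make the split between shared and per-thread state concrete: a global variable is visible to every thread, while each thread's local variables live on its own stack. The names `shared_value`, `struct report` and `shared_space_demo` are invented for this example, and it assumes a platform that provides POSIX barriers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static int shared_value;             /* one copy, visible to every thread */
static pthread_barrier_t both_alive; /* keeps both stacks live at once */

struct report {
    int seen;                        /* value read from the shared variable */
    uintptr_t local_addr;            /* address of this thread's local */
};

static void *worker(void *arg)
{
    struct report *r = arg;
    int local = shared_value;        /* local lives on this thread's stack */

    r->seen = local;
    r->local_addr = (uintptr_t)&local;
    pthread_barrier_wait(&both_alive);
    return NULL;
}

/* Returns 1 if both threads saw the shared value and used distinct stacks. */
int shared_space_demo(void)
{
    struct report rep[2];
    pthread_t tid[2];

    shared_value = 42;
    pthread_barrier_init(&both_alive, NULL, 2);
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &rep[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);
    pthread_barrier_destroy(&both_alive);

    return rep[0].seen == 42 && rep[1].seen == 42 &&
           rep[0].local_addr != rep[1].local_addr;
}
```

The barrier is there only so that both locals exist at the same time; otherwise the second thread could legitimately reuse the first thread's stack memory.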

Thread States
  • Primary states:
    • Running, Ready and Blocked.
  • Operations to change state:
    • Spawn: new thread provided register context and stack pointer.
    • Block: event wait, save user registers, PC and stack pointer
    • Unblock: moved to ready state
    • Finish: deallocate register context and stacks.
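
The states and operations above can be written down as a small state machine. This is an illustrative sketch only: the slide names Spawn, Block, Unblock and Finish, while the `OP_DISPATCH` and `OP_PREEMPT` transitions between Ready and Running, the `T_FINISHED` state and all identifiers are assumptions added to make the model complete.

```c
#include <assert.h>
#include <stddef.h>

enum thread_state { T_READY, T_RUNNING, T_BLOCKED, T_FINISHED };
enum thread_op { OP_DISPATCH, OP_PREEMPT, OP_BLOCK, OP_UNBLOCK, OP_FINISH };

struct thread {
    enum thread_state state;
};

/* Spawn: a new thread starts out ready to run. */
struct thread thread_spawn(void)
{
    struct thread t = { T_READY };
    return t;
}

/* Applies op to t. Returns 0 on success, -1 if op is illegal in this state. */
int thread_apply(struct thread *t, enum thread_op op)
{
    assert(t != NULL);
    switch (op) {
    case OP_DISPATCH:                      /* Ready -> Running */
        if (t->state != T_READY) return -1;
        t->state = T_RUNNING;
        return 0;
    case OP_PREEMPT:                       /* Running -> Ready */
        if (t->state != T_RUNNING) return -1;
        t->state = T_READY;
        return 0;
    case OP_BLOCK:                         /* Running -> Blocked (event wait) */
        if (t->state != T_RUNNING) return -1;
        t->state = T_BLOCKED;
        return 0;
    case OP_UNBLOCK:                       /* Blocked -> Ready */
        if (t->state != T_BLOCKED) return -1;
        t->state = T_READY;
        return 0;
    case OP_FINISH:                        /* Running -> Finished */
        if (t->state != T_RUNNING) return -1;
        t->state = T_FINISHED;
        return 0;
    }
    return -1;
}
```

Note that Unblock moves a thread to Ready rather than straight to Running: the scheduler still has to dispatch it.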

Threading Models

[Figure: three panels labelled Many-to-One, One-to-One and Many-to-Many, each showing threads mapped onto entities labelled P.]
  • User threads := Many-to-One
  • kernel threads := One-to-One
  • Mixed user and kernel := Many-to-Many

User Level Threads
  • User level threads - supported by user level threads libraries
    • Examples
      • POSIX Pthreads, Mach C-threads, Solaris UI threads
    • Benefits:
      • no modifications required to kernel
      • flexible and low cost
    • Drawbacks:
      • a thread cannot block without blocking the entire process
      • no parallelism (not recognized by kernel)
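
To show what a user-level threads library does underneath, here is a minimal cooperative scheduler built on the `ucontext` routines (`getcontext`, `makecontext`, `swapcontext`). It is a sketch under assumptions, not how any particular library is implemented: it needs a C library that still ships `<ucontext.h>` (glibc does), and `uthread_yield`, `thread_body` and `run_user_threads` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

enum { STACK_SIZE = 64 * 1024, NTHREADS = 2, STEPS = 3 };

static ucontext_t scheduler_ctx;
static ucontext_t thread_ctx[NTHREADS];
static char stacks[NTHREADS][STACK_SIZE];   /* one private stack per thread */
static int finished[NTHREADS];
static int current;                         /* index of the running thread */
static char trace[NTHREADS * STEPS + 1];
static int trace_len;

/* Voluntarily hand the CPU back to the user-level scheduler. */
static void uthread_yield(void)
{
    swapcontext(&thread_ctx[current], &scheduler_ctx);
}

static void thread_body(void)
{
    int id = current;
    for (int step = 0; step < STEPS; step++) {
        trace[trace_len++] = (char)('A' + id);
        uthread_yield();
    }
    finished[id] = 1;
    /* Returning resumes uc_link, which is the scheduler context. */
}

/* Round-robin over the threads until all finish; returns the run order. */
const char *run_user_threads(void)
{
    trace_len = 0;
    for (int i = 0; i < NTHREADS; i++) {
        finished[i] = 0;
        getcontext(&thread_ctx[i]);
        thread_ctx[i].uc_stack.ss_sp = stacks[i];
        thread_ctx[i].uc_stack.ss_size = STACK_SIZE;
        thread_ctx[i].uc_link = &scheduler_ctx;
        makecontext(&thread_ctx[i], thread_body, 0);
    }
    for (int remaining = NTHREADS; remaining > 0; ) {
        remaining = 0;
        for (current = 0; current < NTHREADS; current++) {
            if (finished[current])
                continue;
            swapcontext(&scheduler_ctx, &thread_ctx[current]);
            if (!finished[current])
                remaining++;
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Everything here runs on a single kernel-visible thread, which is why the kernel cannot run the two user threads in parallel and why one blocking system call would stall both.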

Kernel Level Threads
  • Kernel level threads - directly supported by kernel, thread is the basic scheduling entity
    • Examples:
      • Windows 95/98/NT/2000, Solaris, Tru64 UNIX, BeOS, Linux
    • Benefits:
      • coordination between scheduling and synchronization
      • less overhead than a process
      • suitable for parallel applications
    • Drawbacks:
      • more expensive than user-level threads
      • generality leads to greater overhead

Threading Issues
  • fork and exec
    • should fork duplicate one, some or all threads?
  • Cancellation – cancel the target thread; issues with freeing resources and inconsistent state
    • asynchronous cancellation – target is immediately canceled
    • deferred cancellation – target periodically checks whether it has been canceled, at cancellation points
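
Deferred cancellation can be sketched with Pthreads, where it is the default cancellation type. The worker below registers a cleanup handler so that its buffer is released when the cancellation request is honoured at a cancellation point. `release_buffer`, `worker` and `cancel_demo` are names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static int cleanup_ran;

/* Cleanup handler: frees the resource the canceled thread was holding. */
static void release_buffer(void *buf)
{
    free(buf);
    cleanup_ran = 1;
}

static void *worker(void *arg)
{
    char *buf = malloc(128);

    (void)arg;
    pthread_cleanup_push(release_buffer, buf);
    for (;;)
        pthread_testcancel();   /* explicit cancellation point */
    pthread_cleanup_pop(1);     /* not reached; pairs with the push above */
    return NULL;
}

/* Returns 1 if the thread was canceled and its cleanup handler ran. */
int cancel_demo(void)
{
    pthread_t tid;
    void *status = NULL;
    int rc;

    cleanup_ran = 0;
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_cancel(tid);        /* request is deferred until a cancellation point */
    pthread_join(tid, &status);
    return status == PTHREAD_CANCELED && cleanup_ran;
}
```

With asynchronous cancellation the thread could be stopped at any instruction, so there would be no safe place to guarantee the buffer is in a consistent state.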

Threading Issues
  • Signals: generation, posting and delivery
    • Every signal handled by a default or user-defined handler
  • Signal delivery:
    • to thread for which it may apply
    • to every thread in process
    • to certain threads
    • specifically designated thread (signal thread)
  • synchronous signals should go to thread causing the signal
  • what about asynchronous signals?
    • Solaris: deliver to a special thread, which forwards the signal to the first user-created thread that has not blocked it.
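
The "specifically designated thread" option can be sketched with standard Pthreads calls: every thread blocks the signal, and one thread collects it synchronously with `sigwait`. This is a generic POSIX illustration, not the Solaris mechanism described above; `signal_thread` and `signal_thread_demo` are hypothetical names, and the sketch assumes no other thread in the process has `SIGUSR1` unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <unistd.h>

/* Dedicated signal thread: waits for one signal from the given set. */
static void *signal_thread(void *arg)
{
    const sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) != 0)
        signo = -1;
    return (void *)(intptr_t)signo;
}

/* Returns the number of the signal the dedicated thread received. */
int signal_thread_demo(void)
{
    sigset_t set, old;
    pthread_t tid;
    void *result = NULL;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block SIGUSR1 here; threads created afterwards inherit this mask. */
    pthread_sigmask(SIG_BLOCK, &set, &old);

    rc = pthread_create(&tid, NULL, signal_thread, &set);
    assert(rc == 0);
    (void)rc;

    kill(getpid(), SIGUSR1);    /* asynchronous, process-directed signal */
    pthread_join(tid, &result);

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return (int)(intptr_t)result;
}
```

Because the signal is consumed by `sigwait` rather than by an asynchronous handler, the receiving thread can use ordinary locks and library calls while processing it.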

Threading Issues
  • Bounding the number of threads created in a dynamic environment
    • use thread pools
  • All threads share the same address space:
    • use of thread specific data
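
Thread-specific data gives each thread a private value behind one shared key. A minimal Pthreads sketch follows; `id_key`, `my_id` and `tsd_demo` are names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

enum { NTHREADS = 3 };

static pthread_key_t id_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&id_key, NULL);   /* no destructor needed here */
}

/* Any function can fetch the calling thread's private value via the key. */
static intptr_t my_id(void)
{
    return (intptr_t)pthread_getspecific(id_key);
}

static void *worker(void *arg)
{
    pthread_once(&key_once, make_key);   /* create the key exactly once */
    pthread_setspecific(id_key, arg);    /* store this thread's own value */
    return (void *)(my_id() * 10);
}

/* Returns 1 if every thread read back its own value, not a neighbour's. */
int tsd_demo(void)
{
    pthread_t tid[NTHREADS];
    int ok = 1;

    for (intptr_t i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, (void *)(i + 1));
        assert(rc == 0);
        (void)rc;
    }
    for (intptr_t i = 0; i < NTHREADS; i++) {
        void *res = NULL;
        pthread_join(tid[i], &res);
        ok = ok && ((intptr_t)res == (i + 1) * 10);
    }
    return ok;
}
```

The key itself is global, but the value stored under it is per thread, so `my_id` needs no argument and no locking.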

Pthreads
  • a POSIX standard (IEEE 1003.1c) API for thread creation and synchronization.
  • API specifies behavior of the thread library, not the implementation.
  • Common in UNIX operating systems.
  • Programs must include <pthread.h>
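
As a minimal illustration of the API, the sketch below creates two threads with `pthread_create`, has each sum half of an array, and collects the results after `pthread_join`. `struct slice`, `sum_slice` and `parallel_sum` are names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct slice {
    const int *data;   /* first element of this thread's part */
    size_t len;        /* number of elements in the part */
    long sum;          /* result, written by the thread */
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;

    s->sum = 0;
    for (size_t i = 0; i < s->len; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sums data[0..len) using two threads, one per half. */
long parallel_sum(const int *data, size_t len)
{
    struct slice halves[2] = {
        { data, len / 2, 0 },
        { data + len / 2, len - len / 2, 0 },
    };
    pthread_t tid[2];

    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&tid[i], NULL, sum_slice, &halves[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(tid[i], NULL);   /* results are safe to read after join */
    return halves[0].sum + halves[1].sum;
}
```

On most UNIX systems this is built with the compiler's `-pthread` option.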

UNIX Support for Threading
  • BSD:
    • pthreads and similar user space implementations
    • process model only. 4.4 BSD enhancements.
    • BSD based OSes are adding support for threads
  • Solaris
    • user threads, kernel threads, LWPs and in 2.6 Scheduler Activations
  • Mach
    • kernel threads and tasks. Thread libraries provide semantics of user threads, LWPs and kernel threads.
  • Digital UNIX - extends Mach to provide the usual UNIX semantics; provides a Pthreads library.

Solaris Threads
  • Supports:
    • user threads (uthreads) via libthread and libpthread
    • LWPs, abstraction that acts as a virtual CPU for user threads.
      • LWP is bound to a kthread.
    • kernel threads (kthread), every LWP is associated with one kthread, however a kthread may not have an LWP
  • interrupts as threads

Solaris kthreads
  • Fundamental scheduling/dispatching object
  • all kthreads share the same virtual address space (the kernel's) - cheap context switch
  • System threads - example STREAMS, callout
  • kthread_t, /usr/include/sys/thread.h
    • scheduling info, pointers for scheduler or sleep queues, pointer to klwp_t and proc_t

Solaris LWP
  • Kernel provided mechanism to allow for both user and kernel thread implementation on one platform.
  • Bound to a kthread
  • LWP data (see /usr/include/sys/klwp.h)
    • user-level registers, system call params, resource usage, pointer to kthread_t and proc_t
  • All LWPs in a process share:
    • signal handlers
  • Each may have its own
    • signal mask
    • alternate stack for signal handling
  • No global name space for LWPs

Solaris User Threads
  • Implemented in user libraries
  • library provides synchronization and scheduling facilities
  • threads may be bound to LWPs
  • unbound threads compete for available LWPs
  • Manage thread specific info
    • thread id, saved register state, user stack, signal mask, priority*, thread local storage
  • Solaris provides two libraries: libthread and libpthread.
  • Try man thread or man pthreads

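Solaris Thread Data Structures

[Figure: proc_t refers to its kthread_t list through p_tlist; each kthread_t refers back to its process through t_procp, to its klwp_t through t_lwp, and to the next kthread_t through t_forw; klwp_t refers back to its thread and process through lwp_thread and lwp_procp.]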

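Solaris Threading Model (Combined)

[Figure: three layers labelled user, kernel and hardware. Process 1 and Process 2 sit at user level, with their user threads mapped onto LWPs (L); kernel threads, including an interrupt kthread (Int kthr), sit at kernel level; processors (P) sit at hardware level.]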

Solaris User Level Threads

[Figure: state diagram with states Runnable, Active, Sleeping and Stopped; transitions labelled Dispatch, Preempt, Sleep, Wakeup, Stop and Continue.]

Solaris Lightweight Processes

[Figure: state diagram with states Runnable, Running, Blocked and Stopped; transitions labelled Dispatch, Timeslice or Preempt, Blocking System Call, Wakeup, Stop and Continue.]

Solaris Interrupts
  • One system wide clock kthread
  • pool of 9 partially initialized kthreads per CPU for interrupts
  • interrupt thread can block
  • interrupted thread is pinned to the CPU

Solaris Signals and Fork
  • Signals are divided into traps (synchronous) and interrupts (asynchronous)
  • each thread has its own signal mask; the set of signal handlers is global to the process
  • each LWP can specify an alternate stack
  • fork replicates all LWPs
  • fork1 replicates only the invoking LWP/thread

Windows 2000 Threads
  • Implements the one-to-one mapping.
  • There is also support for user-level threads, called fibers
  • Each thread contains:
    • a thread id
    • register set
    • separate user and kernel stacks
    • private data storage area

Linux Threads
  • Linux refers to both processes and threads as tasks
  • Thread creation is done through the clone() system call.
  • clone() allows a child task to share the address space of the parent task (process)
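
A hedged, Linux-only sketch of address-space sharing through the glibc `clone()` wrapper: with `CLONE_VM` the child task writes to a variable in the parent's memory, and the parent sees the new value after waiting for the child. `child_fn`, `clone_demo` and the stack size are choices made for this example; real thread libraries pass many more flags.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

enum { CHILD_STACK_SIZE = 64 * 1024 };

/* Runs in the child task; arg points into the parent's memory. */
static int child_fn(void *arg)
{
    *(volatile int *)arg = 42;   /* visible to the parent because of CLONE_VM */
    return 0;
}

/* Returns the value the child stored, or -1 on failure. */
int clone_demo(void)
{
    volatile int shared = 0;
    char *stack = malloc(CHILD_STACK_SIZE);
    pid_t pid;

    if (stack == NULL)
        return -1;

    /* The stack grows downward on most architectures, so pass its top. */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                CLONE_VM | SIGCHLD, (void *)&shared);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return shared;
}
```

Dropping `CLONE_VM` would give the child its own copy of the address space, as with `fork`, and the parent would still read 0.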
