Advanced Programming

Rabie A. Ramadan

Lecture 7

Multithreading: An Overview

Some of the slides are excerpted from Jonathan Amsterdam's presentation.

Processing Elements

Simple classification by Flynn (by the number of instruction and data streams):

SISD - conventional

SIMD - data parallel, vector computing

MISD - systolic arrays

MIMD - very general, multiple approaches

Current focus is on the MIMD model, using general-purpose processors (no shared memory).
SISD: A Conventional Computer

Speed is limited by the rate at which the computer can transfer information internally.

[Diagram: data input stream -> processor (driven by an instruction stream) -> data output stream]

Ex: PC, Macintosh, workstations

The MISD Architecture

More of an intellectual exercise than a practical configuration: few were built, and none are commercially available.
SIMD Architecture

Ex: CRAY machines, vector processing

[Diagram: a single instruction stream drives processors A, B, and C; each processor has its own data input stream and data output stream]

Ci <= Ai * Bi

MIMD Architecture

Unlike SISD and MISD machines, an MIMD computer works asynchronously.

Shared memory (tightly coupled) MIMD

Distributed memory (loosely coupled) MIMD

[Diagram: processors A, B, and C, each driven by its own instruction stream, each with its own data input stream and data output stream]

Shared Memory MIMD Machine

Communication: the source PE writes data to global memory and the destination PE retrieves it.

Easy to build; conventional OSes for SISD machines can easily be ported.

Limitation: reliability and expandability. A memory component or any processor failure affects the whole system.

Increasing the number of processors leads to memory contention.

Ex.: Silicon Graphics supercomputers

[Diagram: processors A, B, and C, each connected over its own bus to a global memory system]

Distributed Memory MIMD

Communication: based on a high-speed network.

The network can be configured as a tree, mesh, cube, etc.

Unlike shared-memory MIMD, it is:

easily/readily expandable

highly reliable (any CPU failure does not affect the whole system)

[Diagram: processors A, B, and C, each with its own local memory system and bus, connected by a high-speed network]

Serial vs. Parallel

[Cartoon: one queue of customers served by a single counter (serial) vs. the same queue served by two counters (parallel)]

Single and Multithreaded Processes

[Diagram: a single-threaded process (single instruction stream) vs. a multithreaded process (multiple instruction streams, i.e., threads of execution, sharing a common address space)]

OS: Multi-Processing, Multi-Threaded

[Diagram: multiple applications running across multiple CPUs]

Threaded libraries, multi-threaded I/O

Better response times in multiple-application environments

Higher throughput for parallelizable applications

Multi-threading, continued...

A multi-threaded OS enables parallel, scalable I/O.

Multiple, independent I/O requests can be satisfied simultaneously because all the major disk, tape, and network drivers have been multi-threaded, allowing any given driver to run on multiple CPUs simultaneously.

[Diagram: applications issuing I/O requests through the OS kernel to multiple CPUs]

Applications Could Have One or More Processes

A process is a program in execution. It consists of three components:

An executable program

Associated data needed by the program

The execution context of the program (all the information the operating system needs to manage the process)
What Are Threads?

A thread is a piece of code that can execute concurrently with other threads.

It is a schedulable entity on a processor.

A thread object carries:
  • Local state
  • Global/shared state
  • PC
  • Hardware context (registers, status word, program counter)

What Is a Thread?

A single sequential flow of control.

A unit of concurrent execution.

Multiple threads can exist within the same process and share memory resources (processes, by contrast, each have their own address space).

All programs have at least one thread, called the "main thread".
Thread Resources

Each thread has its own:

Program counter (point of execution)

Control stack (procedure call/return)

Data stack (local variables)

All threads share:

Heap (objects) - dynamically allocated memory of the process

Program code

Class and instance variables
Threaded Process Model

[Diagram: threads within a process, each with its own thread stack and thread data, sharing the thread text and shared memory]

  • Independent executables
  • All threads are parts of one process, hence communication is easier and simpler.
The Multi-Threading Concept

[Diagram: task A's threads T0, T1, and T2 time-sliced on a uniprocessor]

A threading library creates threads and assigns processor time to each thread.

Multi-Threading in Multi-Processors

[Diagram: task A's threads T0, T1, and T2 running in parallel on processors 1-4]

Why Multiple Threads?

Speeding up computations

Two threads each solve half of the problem, then combine their results.

Improving responsiveness

One thread computes while another handles the user interface.

One thread loads an image from the net while another computes.
Why Multiple Threads? (cont.)

Performing housekeeping tasks

One thread does garbage collection while another computes.

One thread rebalances the search tree while another uses the tree.

Performing multiuser tasks

Several threads run animations simultaneously (as an example).
Simple Example

main:
    run thread2
    forever:
        print 1

thread2:
    forever:
        print 2
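
A minimal pthread version of this example, as a sketch (assuming a POSIX system; the infinite loops are trimmed to a fixed count so the program terminates):

#include <pthread.h>
#include <stdio.h>

/* Second thread: repeatedly print 2. */
static void *thread2(void *arg)
{
    for (int i = 0; i < 5; i++)
        printf("2\n");
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, thread2, NULL);   /* run thread2 */

    for (int i = 0; i < 5; i++)                /* main thread: print 1 */
        printf("1\n");

    pthread_join(t, NULL);   /* wait for thread2 to finish */
    return 0;
}

How the 1s and 2s interleave is up to the scheduler, which leads to the next topic.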
Scheduling

The scheduler is the part of the operating system that determines which thread to run next.

Two types of schedulers:

Pre-emptive - can interrupt the running thread

Cooperative - a thread must voluntarily yield

Most modern OSes are pre-emptive.
Thread Life Cycle

New state: at this point, the thread is not yet considered alive.

Runnable (ready-to-run) state: entered via the start() method, though the thread is not actually running yet. The scheduler is aware of the thread, which may be scheduled some time later.

Running state: the thread is currently executing.

Dead state: once a thread enters this state, it can never run again.

Blocked state: a thread enters this state while waiting for resources that are held by another thread.
Software Models for Multithreaded Programming

Boss/worker model

Work crew model

Pipelining model

Combinations of models
Boss/Worker Model

One thread functions as the boss: it assigns tasks to worker threads for them to perform.

Each worker performs a different task until it has finished, at which point it notifies the boss that it is ready to receive another task.

Alternatively, the boss polls workers periodically to see whether each worker is ready to receive another task.

A variation of the boss/worker model is the work queue model: the boss places tasks in a queue, and workers check the queue and take tasks to perform (see the sketch below).
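
A minimal sketch of the work queue variation (hypothetical names; the boss pre-fills a queue protected by a mutex, a primitive introduced later in this lecture, and the workers drain it):

#include <pthread.h>
#include <stdio.h>

#define NTASKS 8
#define NWORKERS 3

static int queue[NTASKS];
static int head = 0, tail = 0;   /* workers take from head; the boss appends at tail */
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (;;) {
        int task = -1;
        pthread_mutex_lock(&qlock);
        if (head < tail)
            task = queue[head++];   /* take the next task */
        pthread_mutex_unlock(&qlock);
        if (task < 0)
            return NULL;            /* queue drained: this worker is done */
        printf("worker %ld performs task %d\n", (long)arg, task);
    }
}

int main(void)
{
    pthread_t w[NWORKERS];
    for (int i = 0; i < NTASKS; i++)   /* the boss places tasks in the queue */
        queue[tail++] = i;
    for (long i = 0; i < NWORKERS; i++)
        pthread_create(&w[i], NULL, worker, (void *)i);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(w[i], NULL);
    return 0;
}

A production version would let the boss keep adding tasks while workers run, using a condition variable (covered later) instead of the drained-queue exit.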
Work Crew Model

Multiple threads work together on a single task.

The task is divided horizontally into pieces that are performed in parallel.

Each thread performs one piece.

  • Example:
    • A group of people cleaning a building. Each person cleans certain rooms or performs certain types of work (washing floors, polishing furniture, and so forth), and each works independently.
Pipelining Model

A task is divided vertically into steps.

The steps must be performed in sequence to produce a single instance of the desired result.

The work done in each step (except for the first and last) is based on the previous step and is a prerequisite for the work in the next step.
Bad News

Multithreaded programs are hard to write and hard to understand.

They are incredibly hard to debug.

Anyone who thinks that concurrent programming is easy should have his or her threads examined.
Threads Assumptions

Threads may be executed in any order; they do not necessarily alternate line by line.

Bugs may show up rarely and may be hard to reproduce.

More than one thread may try to change memory at the same time, so assumptions about the order of execution do not hold.

(E.g., what is the value of i after i = 1?)
Memory Conflicts

When two threads access the same memory location, they can conflict with each other, and the resulting state may be unexpectedly wrong.

E.g., two threads may try to increment a counter at the same time, as in the sketch below.
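
A minimal sketch of the counter conflict (hypothetical; on most machines counter++ compiles to a separate load, increment, and store, so concurrent updates can be lost):

#include <pthread.h>
#include <stdio.h>

static long counter = 0;   /* shared, unprotected */

static void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++)
        counter++;   /* load, increment, store: not atomic */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    /* Expected 2000000, but lost updates usually leave it smaller. */
    printf("counter = %ld\n", counter);
    return 0;
}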
Terminology

Critical section: a section of code that reads or writes shared data.

Race condition: the potential for interleaved execution of a critical section by multiple threads; the results are non-deterministic.

Mutual exclusion: a synchronization mechanism that avoids race conditions by ensuring exclusive execution of critical sections.

Deadlock: permanent blocking of threads.

Starvation: one or more threads are denied resources; without those resources, the program can never finish its task.

Four Requirements for Deadlock

Mutual exclusion

Only one thread at a time can use a resource.

Hold and wait

A thread holding at least one resource is waiting to acquire additional resources held by other threads.

No preemption

Resources are released only voluntarily by the thread holding them, after the thread is finished with them.

Circular wait

There exists a set {T1, ..., Tn} of waiting threads such that T1 is waiting for a resource held by T2, T2 is waiting for a resource held by T3, ..., and Tn is waiting for a resource held by T1.
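
A minimal sketch showing how all four requirements combine into a deadlock (hypothetical names; each thread holds one mutex and waits for the other's):

#include <pthread.h>

static pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock2 = PTHREAD_MUTEX_INITIALIZER;

static void *t1(void *arg)
{
    pthread_mutex_lock(&lock1);   /* hold lock1 ... */
    pthread_mutex_lock(&lock2);   /* ... and wait for lock2 */
    pthread_mutex_unlock(&lock2);
    pthread_mutex_unlock(&lock1);
    return NULL;
}

static void *t2(void *arg)
{
    pthread_mutex_lock(&lock2);   /* hold lock2 ... */
    pthread_mutex_lock(&lock1);   /* ... and wait for lock1: circular wait */
    pthread_mutex_unlock(&lock1);
    pthread_mutex_unlock(&lock2);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, t1, NULL);
    pthread_create(&b, NULL, t2, NULL);
    pthread_join(a, NULL);   /* may never return */
    pthread_join(b, NULL);
    return 0;
}

Making every thread acquire the locks in the same global order breaks the circular wait and removes the deadlock.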
Thread Synchronization Methods

Mutex locks

Condition variables

Semaphores
Mutex Locks

If a data item is shared by a number of threads, race conditions can occur if the shared item is not protected properly.

The easiest protection mechanism is a lock.

Before a thread accesses the set of data items, it acquires the lock. Once the lock is successfully acquired, the thread becomes the owner of that lock and the lock is locked.

The owner can then access the protected items. After this, the owner must release the lock, and the lock becomes unlocked.

Another thread can then acquire the lock.
Mutex Locks (cont.)

The use of a lock simply establishes a critical section.

Before entering a critical section, a thread acquires a lock. If it is successful, the thread enters the critical section and the lock is locked.

As a result, all subsequent acquisition requests are queued until the lock is unlocked (see the sketch below).
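
A minimal sketch (reusing the counter race from the Memory Conflicts slide) of protecting shared data with a pthread mutex:

#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&counter_lock);     /* acquire: enter the critical section */
        counter++;                             /* exclusive access to the shared item */
        pthread_mutex_unlock(&counter_lock);   /* release: the lock becomes unlocked */
    }
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);   /* now reliably 2000000 */
    return 0;
}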
Mutex Locks: Restrictions

Only the owner can release the lock.

Imagine the following situation: thread A is the current owner of lock L, and thread B is a second thread that wants to lock L. If a non-owner could unlock a lock, thread B could unlock the lock that thread A owns; hence, either both threads would be executing in the same critical section, or thread B would preempt thread A and execute the instructions of the critical section.

Recursive lock acquisition is not allowed.

The current owner of the lock is not allowed to acquire the same lock again.
Mutex Example: The Dining Philosophers Problem

Imagine five philosophers who spend their lives just thinking and eating.

In the middle of the dining room is a circular table with five chairs. The table has a big plate of spaghetti; however, there are only five chopsticks available.

Each philosopher thinks. When he gets hungry, he sits down and picks up the two chopsticks that are closest to him.

If a philosopher can pick up both chopsticks, he eats for a while.

After a philosopher finishes eating, he puts down the chopsticks and starts to think.
The Dining Philosophers Problem: Analysis

[Diagram: the philosopher cycle and flow - think, get hungry, pick up both chopsticks, eat, put down the chopsticks]

A pthread sketch follows.
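
A minimal pthread sketch of the problem (hypothetical; each chopstick is a mutex). If every philosopher grabbed his left chopstick first, all five could hold one chopstick and wait forever, so this version acquires the two chopsticks in a fixed global order to break the circular wait:

#include <pthread.h>
#include <stdio.h>

#define N 5
static pthread_mutex_t chopstick[N];

static void *philosopher(void *arg)
{
    int id = *(int *)arg;
    int first = id, second = (id + 1) % N;
    /* Always lock the lower-numbered chopstick first. */
    if (first > second) { int tmp = first; first = second; second = tmp; }

    for (int meal = 0; meal < 3; meal++) {
        /* think ... then get hungry */
        pthread_mutex_lock(&chopstick[first]);
        pthread_mutex_lock(&chopstick[second]);
        printf("philosopher %d eats\n", id);   /* eat for a while */
        pthread_mutex_unlock(&chopstick[second]);
        pthread_mutex_unlock(&chopstick[first]);
    }
    return NULL;
}

int main(void)
{
    pthread_t p[N];
    int id[N];
    for (int i = 0; i < N; i++)
        pthread_mutex_init(&chopstick[i], NULL);
    for (int i = 0; i < N; i++) {
        id[i] = i;
        pthread_create(&p[i], NULL, philosopher, &id[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(p[i], NULL);
    return 0;
}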

C++ Language Support for Synchronization

Languages that support exceptions, like C++, are problematic: it is easy to make a non-local exit without releasing the lock.

Consider:

void Rtn() {
    lock.acquire();
    ...
    DoFoo();
    ...
    lock.release();
}

void DoFoo() {
    ...
    if (exception) throw errException;
    ...
}

  • Notice that an exception in DoFoo() will exit Rtn() without releasing the lock.
C++ Language Support for Synchronization (con't)

Must catch all exceptions in critical sections.

Catch exceptions, release the lock, and re-throw the exception:

void Rtn() {
    lock.acquire();
    try {
        ...
        DoFoo();
        ...
    } catch (...) {        // catch any exception
        lock.release();    // release lock
        throw;             // re-throw the exception
    }
    lock.release();
}

void DoFoo() {
    ...
    if (exception) throw errException;
    ...
}

  • Even better: the auto_ptr<T> facility (see the C++ spec) can deallocate/free the lock regardless of the exit method.
Java Language Support for Synchronization

Java has explicit support for threads and thread synchronization.

Bank account example:

class Account {
    private int balance;

    // object constructor
    public Account(int initialBalance) {
        balance = initialBalance;
    }

    public synchronized int getBalance() {
        return balance;
    }

    public synchronized void deposit(int amount) {
        balance += amount;
    }
}

  • Every object has an associated lock, which is automatically acquired and released on entry to and exit from a synchronized method.
Condition Variables (CV)

A condition variable allows a thread to block its own execution until some shared data reaches a particular state.

A condition variable is a synchronization object used in conjunction with a mutex: the mutex controls access to the shared data, while the condition variable allows threads to wait for that data to enter a defined state.

The mutex is combined with the CV to avoid race conditions.
Condition Variable Routines

Waiting and signaling on condition variables:

pthread_cond_wait(condition, mutex)

Blocks the thread until the specific condition is signaled.

Should be called with the mutex locked; it automatically releases the mutex while it waits.

When it returns (the condition has been signaled), the mutex is locked again.

pthread_cond_signal(condition)

Wakes up one thread waiting on the condition variable.

Called after the mutex is locked; the caller must unlock the mutex afterwards.

pthread_cond_broadcast(condition)

Used when multiple threads are blocked on the condition.

Condition Variable - for Signaling

Think of the producer-consumer problem:

Producers and consumers run in separate threads.

The producer produces data and the consumer consumes data.

The producer has to inform the consumer when data is available.

The consumer has to inform the producer when buffer space is available.
First attempt - mutex only (producer):

/* Globals */
int data_avail = 0;
pthread_mutex_t data_mutex = PTHREAD_MUTEX_INITIALIZER;

void *producer(void *arg)
{
    pthread_mutex_lock(&data_mutex);
    /* produce data and insert it into the queue */
    data_avail = 1;
    pthread_mutex_unlock(&data_mutex);
    return NULL;
}

First attempt - mutex only (consumer):

void *consumer(void *arg)
{
    while (!data_avail)
        ;   /* do nothing - keep looping (busy-wait)!! */

    pthread_mutex_lock(&data_mutex);
    /* extract data from the queue */
    if (queue is empty)   /* pseudocode */
        data_avail = 0;
    pthread_mutex_unlock(&data_mutex);

    consume_data();
    return NULL;
}

(Note the flaw: the consumer busy-waits, burning CPU, and reads data_avail without holding the mutex. Condition variables fix this.)

With a condition variable (producer):

int data_avail = 0;
pthread_mutex_t data_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t data_cond = PTHREAD_COND_INITIALIZER;

void *producer(void *arg)
{
    pthread_mutex_lock(&data_mutex);
    /* produce data and insert it into the queue */
    data_avail = 1;
    pthread_cond_signal(&data_cond);
    pthread_mutex_unlock(&data_mutex);
    return NULL;
}

With a condition variable (consumer):

void *consumer(void *arg)
{
    pthread_mutex_lock(&data_mutex);
    while (!data_avail) {
        /* sleep on the condition variable */
        pthread_cond_wait(&data_cond, &data_mutex);
    }
    /* woken up: extract data from the queue */
    if (queue is empty)   /* pseudocode */
        data_avail = 0;
    pthread_mutex_unlock(&data_mutex);

    consume_data();
    return NULL;
}

So Far...

  • Lock: provides mutual exclusion for shared data:
    • Always acquire it before accessing the shared data structure; always release it after finishing with the shared data.

Condition variable: a queue of threads waiting for something inside a critical section.

Key idea: allow sleeping inside the critical section by atomically releasing the lock at the time we go to sleep.

Semaphore

An extension of mutex locks.

A semaphore is an object with two methods, wait and signal, a private integer counter, and a private queue (of threads).
Semaphore Example

Assume that in our corporate print room, we have 5 printers online.

Our print spool manager allocates a semaphore set with 5 semaphores in it, one for each printer on the system.

Since each printer is only physically capable of printing one job at a time, each of our five semaphores is initialized to a value of 1 (one), meaning that they are all online and accepting requests.

John sends a print request to the spooler. The print manager looks at the semaphore set and finds the first semaphore whose value is one. Before sending John's request to the physical device, the print manager decrements that printer's semaphore by one. Its value is now zero.
Semaphore Example (cont.)

A value of zero represents 100% utilization of that semaphore's resource. In our example, no other request can be sent to that printer until its semaphore is no longer equal to zero.

When John's print job has completed, the print manager increments the value of the semaphore corresponding to the printer. Its value is now back up to one (1), which means the printer is available again.
Semaphore (cont.)

Synchronized counting variables.

Formally, a semaphore comprises:

An integer value

Two operations: P() and V()

P() (e.g., the consumer side) - also known as wait():
While value == 0, sleep; then decrement value.

V() (e.g., the producer side) - also known as signal():
Increment value; if any threads are sleeping, waiting for the value to become non-zero, wake up at least one of them.

A sketch using POSIX semaphores follows.
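
A minimal sketch of P()/V() using POSIX unnamed semaphores (sem_wait is P, sem_post is V; the name slot and the counts are hypothetical):

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

static sem_t slot;   /* counts available printers/resources */

static void *job(void *arg)
{
    sem_wait(&slot);   /* P(): while value == 0 sleep, then decrement */
    printf("thread %ld is using a printer\n", (long)arg);
    sem_post(&slot);   /* V(): increment and wake a sleeping thread */
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    sem_init(&slot, 0, 2);   /* two resources available initially */
    for (long i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, job, (void *)i);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&slot);
    return 0;
}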