Threads and Synchronization





Threads and Synchronization

Jeff Chase

Duke University


Portrait of a thread

[Figure: a thread's memory layout from low to high addresses, labeled: name/status etc., 0xdeadbeef, stack, machine state.]

Thread operations (a rough sketch):

t = create();
t.start(proc, arg);
t.alert(); (optional)
result = t.join();

Self operations (a rough sketch):

exit(result);
t = self();
setdata(ptr);
ptr = selfdata();
alertwait(); (optional)

Details vary.
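As one concrete instance of the rough sketch, here is how create/start/join might look with java.lang.Thread. This is a minimal sketch under assumptions: the class name ThreadOpsDemo and the squaring proc are hypothetical, and since Java's join() returns no value, the result is handed back through a shared AtomicInteger.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadOpsDemo {
    // Hypothetical proc: squares its argument.
    static int proc(int arg) {
        return arg * arg;
    }

    // Runs proc(arg) in a new thread and returns its result after the join.
    static int runAndJoin(int arg) {
        AtomicInteger result = new AtomicInteger();           // slot standing in for exit(result)
        Thread t = new Thread(() -> result.set(proc(arg)));   // roughly t = create(), bound to proc and arg
        t.start();                                            // roughly t.start(proc, arg)
        try {
            t.join();                                         // wait for t to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result.get();                                  // roughly result = t.join()
    }

    public static void main(String[] args) {
        System.out.println(runAndJoin(7)); // prints 49
    }
}
```

In Java the closest analogue of alert() is probably Thread.interrupt(), and Thread.currentThread() plays the role of self().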


A thread: closer look

[Figure: layers of a thread implementation. Thread API (e.g., pthreads or Java threads); thread library (threads, mutexes, condition variables…) with the user TCB and user stack (0xdeadbeef); kernel interface for thread libs (not for users); kernel thread support (raw “vessels”, e.g., Linux CLONE_THREAD + “futex”) with the kernel TCB, kernel stack, and saved context.]

1-to-1 mapping of user threads to dedicated kernel-supported “vessels”.

(Some older user-level “green” thread libraries may multiplex multiple user threads over each “vessel”.)

Threads can enter the kernel (fault or trap) and block, so they need a k-stack.


Kernel-based vs. user-level threads: Take 2

  • A thread system schedules threads over a pool of “logical cores” or “vessels” for threads to run in.

  • The kernel provides the vessels: they are either classic processes or “lightweight” processes, e.g., via CLONE_THREAD.

  • Kernel scheduler schedules/multiplexes vessels on core slots:

    • Select at most one vessel to occupy each core slot at any given time.

    • Each vessel occupies at most one core slot at any given time.

    • Vessels have k-stacks and can block independently in the kernel.

  • A “kernel-based thread system” maintains a stable 1-1 mapping of threads to dedicated vessels.

    • There is no user-level thread scheduler, since the mapping is stable.

    • For simplicity we just call each (thread, vessel) pair a “thread”.

  • A thread library can always choose to multiplex N threads over M vessels. It used to be necessary but it’s not anymore. It causes problems with I/O because the threads cannot block independently in the kernel and the kernel does not know about the threads. There might still be performance-related reasons to do it in some scenarios but we IGNORE THAT CASE from now on.


Thread models illustrated

1-to-1 mapping of user threads to dedicated kernel-supported “vessels”:

Kernel scheduler (not library) decides which thread/vessel to run next.

Thread/vessels block via kernel syscalls. They block in the kernel, not in user space. Each has a kernel stack, so they can block independently.

Syscall interface for “vessels” as a foundation for thread API libraries. Might call the vessels “threads” or “lightweight processes”.

Optional add-on: a library that multiplexes N user-level threads over M kernel thread vessels, N > M. Such a library keeps a user-level thread readyList and runs its own scheduler loop:

while(1) {
    t = get next ready thread;
    scheduler->Run(t);
}



Andrew Birrell



Synchronization

  • The scheduler (and the machine) select the execution order of threads.

  • Each thread executes a sequence of instructions, but their sequences may be arbitrarily interleaved.

    • E.g., from the point of view of loads/stores on memory.

  • Each possible execution order is a schedule.

  • It is the program’s responsibility to exclude schedules that lead to incorrect behavior.

  • The programmer has some tools to do this, and we must use those tools correctly.

  • It is called synchronization or concurrency control.



Resource Trajectory Graphs

Resource trajectory graphs (RTG) depict the “random walk” through the space of possible program states.

[Figure: an RTG with example states Sm, Sn, and So marked.]

  • RTG for N threads is N-dimensional.

    • Thread i advances along axis i.

  • Each point represents one state in the set of all possible system states.

    • Cross-product of the possible states of all threads in the system

    • (But not all states in the cross-product are legally reachable.)


Resource Trajectory Graphs

This RTG depicts a schedule within the space of possible schedules for a simple program of two threads sharing one core.

[Figure: two-thread RTG. Purple advances along the x-axis and blue advances along the y-axis, each up to its EXIT. One corner is labeled “Every schedule starts here” and the opposite corner “Every schedule ends here”. A turn in the path is labeled “context switch”. The diagonal is an idealized parallel execution (two cores).]

The scheduler chooses the path (schedule, event order, or interleaving).

From the point of view of the program, the chosen path is nondeterministic.


Interleaving matters

load  x, R2       ; load global variable x
add   R2, 1, R2   ; increment: x = x + 1
store R2, x       ; store global variable x

Two threads execute this code section. x is a shared variable.

[Figure: a schedule in which the two threads' load/add/store sequences interleave, marked X.]

In this schedule, x is incremented only once: last writer wins.

The program breaks under this schedule. This bug is a race.
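To make the lost update concrete, the following sketch replays a schedule deterministically instead of relying on real threads. It is an illustration under assumptions: the class InterleavingDemo and its schedule-string encoding (one character per instruction, naming the thread that executes it) are hypothetical, not part of the slides.

```java
class InterleavingDemo {
    // Simulates two threads, A and B, that each run load/add/store on shared x.
    // Each character of schedule lets that thread execute its next instruction.
    static int run(String schedule) {
        int x = 0;                 // shared global variable
        int[] reg = new int[2];    // each thread's private register R2
        int[] pc = new int[2];     // each thread's next instruction: 0=load, 1=add, 2=store
        for (char c : schedule.toCharArray()) {
            int t = c - 'A';
            switch (pc[t]++) {
                case 0: reg[t] = x; break;            // load  x, R2
                case 1: reg[t] = reg[t] + 1; break;   // add   R2, 1, R2
                case 2: x = reg[t]; break;            // store R2, x
                default: throw new IllegalArgumentException("thread " + c + " already finished");
            }
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(run("AAABBB")); // serialized: prints 2
        System.out.println(run("ABABAB")); // interleaved: prints 1 (last writer wins)
    }
}
```

Both threads load x before either stores, so each computes 1 and the second store overwrites the first.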



This is not a game

  • But we can think of it as a game.

  • You write your program.

  • The game begins when you submit your program to your adversary: the scheduler.

  • The scheduler chooses all the moves while you watch.

  • Your program may constrain the set of legal moves.

  • The scheduler searches for a legal schedule that breaks your program.

  • If it succeeds, then you lose (your program has a race).

  • You win by not losing.

[Figure: RTG with one x = x + 1 section on each axis; the region where they overlap is marked X (“U LOOZ”).]


A picture of a race

[Figure: two threads, each executing x = x + 1 as a load/add/store sequence.]

Events in different threads may be interleaved.

Each schedule may be different.

These code sections are concurrent in this execution: no ordering is defined among them.

They are conflicting: they access a shared variable (global or heap), and at least one access is a write.

An execution with concurrent conflicting accesses has a race: the result depends on the schedule.


Possible interleavings?

[Figure: four numbered candidate interleavings (1 to 4) of two threads' load/add/store sequences for x = x + 1, drawn against a time axis; some are marked X.]

Critical sections

[Figure: schedules of two threads each executing x = x + 1: concurrent, interleaved (race bug), and serialized (one after the other).]

This code sequence is a critical section: the program fails if more than one thread executes in the critical section concurrently. That constitutes a race, a bug.


The need for mutual exclusion

The program may fail if the schedule enters the grey box (i.e., if two threads execute the critical section concurrently).

The two threads must not both operate on the shared global x “at the same time”.

[Figure: RTG with one x = x + 1 section on each axis; their overlap is the grey box, where x = ???.]



A Lock or Mutex

Locks are the basic tools to enforce mutual exclusion in conflicting critical sections.

  • A lock is an object, a data item in memory.

  • API methods: Acquire and Release.

  • Also called Lock() and Unlock().

  • Threads pair calls to Acquire and Release.

  • Acquire upon entering a critical section.

  • Release upon leaving a critical section.

  • Between Acquire/Release, the thread holds the lock.

  • Acquire does not pass until any previous holder releases.

  • A thread waiting for a lock can spin (a spinlock) or block (a mutex).

[Figure: two threads, each marked with an Acquire (A) followed by a Release (R).]



Definition of a lock (mutex)

  • Acquire + release ops on L are strictly paired.

    • After acquire completes, the caller holds (owns) the lock L until the matching release.

  • Acquire + release pairs on each L are ordered.

    • Total order: each lock L has at most one holder at any given time.

    • That property is mutual exclusion; L is a mutex.
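A lock with these properties can be sketched in a few lines of Java. This is only an illustration, assuming a spinning (not blocking) lock built on AtomicBoolean; the class name SpinLock and the helper countWithLock are hypothetical. Production code would normally use a library lock such as java.util.concurrent.locks.ReentrantLock or synchronized.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void acquire() {
        // Atomically flip false -> true; only one caller can succeed while the lock is free.
        while (!held.compareAndSet(false, true)) {
            Thread.yield(); // spin, giving other threads a chance to run
        }
    }

    void release() {
        held.set(false); // the next acquire may now pass
    }

    // Two threads each add perThread to a shared counter while holding the lock.
    static int countWithLock(int perThread) {
        SpinLock lock = new SpinLock();
        int[] x = {0}; // shared state protected by lock
        Runnable body = () -> {
            for (int i = 0; i < perThread; i++) {
                lock.acquire();
                try {
                    x[0] = x[0] + 1; // critical section
                } finally {
                    lock.release();  // strictly paired with the acquire above
                }
            }
        };
        Thread a = new Thread(body);
        Thread b = new Thread(body);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return x[0];
    }

    public static void main(String[] args) {
        System.out.println(countWithLock(100_000)); // prints 200000
    }
}
```

Because at most one thread holds the lock at a time, no increment is lost, whatever schedule is chosen.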


Locking a critical section

[Figure: schedules 3 and 4 of the two threads' load/add/store sequences; with the lock, each sequence is serialized and atomic.]

Each thread executes:

mx->Acquire();
x = x + 1;
mx->Release();

Holding a shared mutex prevents competing threads from entering a critical section. If the critical section code acquires the mutex, then its execution is serialized: only one thread runs it at a time.


Portrait of a Lock in Motion

The program may fail if it enters the grey box.

A lock (mutex) prevents the schedule from ever entering the grey box, ever: both threads would have to hold the same lock at the same time, and locks don’t allow that.

[Figure: RTG with Acquire (A), x = x + 1, and Release (R) marked along each axis; the grey box, where x = ???, lies where both threads would hold the lock.]


Handing off a lock

Handoff: the nth release, followed by the (n+1)th acquire.

The critical sections are serialized (one after the other): first I go, then you go.


A peek at some deep tech

An execution schedule defines a partial order of program events. The ordering relation (<) is called happens-before.

Two events are concurrent if neither happens-before the other. They might execute in some order, but only by luck. The next schedule may reorder them.

[Figure: two threads each executing mx->Acquire(); x = x + 1; mx->Release(); with a happens-before (<) arrow from one thread's release to the other thread's acquire.]

Just three rules govern happens-before order:

  • Events within a thread are ordered.

  • Mutex handoff orders events across threads: release #N happens-before acquire #N+1.

  • Happens-before is transitive: if (A < B) and (B < C) then A < C.

Machines may reorder concurrent events, but they always respect happens-before ordering.
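The three rules can be worked through mechanically. The sketch below is an illustration only: the class HappensBeforeDemo and the event numbering are assumptions made for the example. It builds the relation from program-order edges (rule 1), optionally one handoff edge (rule 2), and a transitive closure (rule 3), then asks whether the two increments are concurrent.

```java
class HappensBeforeDemo {
    // Returns hb, where hb[a][b] == true means event a happens-before event b.
    static boolean[][] closure(int n, int[][] edges) {
        boolean[][] hb = new boolean[n][n];
        for (int[] e : edges) {
            hb[e[0]][e[1]] = true;
        }
        // Rule 3: transitivity, computed with Warshall's algorithm.
        for (int k = 0; k < n; k++)
            for (int a = 0; a < n; a++)
                for (int b = 0; b < n; b++)
                    if (hb[a][k] && hb[k][b]) hb[a][b] = true;
        return hb;
    }

    // Two events are concurrent if neither happens-before the other.
    static boolean concurrent(boolean[][] hb, int a, int b) {
        return !hb[a][b] && !hb[b][a];
    }

    // Events: thread 1 = {0: acquire, 1: x = x + 1, 2: release},
    //         thread 2 = {3: acquire, 4: x = x + 1, 5: release}.
    static boolean incrementsConcurrent(boolean withHandoff) {
        int[][] programOrder = {{0, 1}, {1, 2}, {3, 4}, {4, 5}};           // rule 1 only
        int[][] withLock = {{0, 1}, {1, 2}, {3, 4}, {4, 5}, {2, 3}};       // plus rule 2: release 2 < acquire 3
        boolean[][] hb = closure(6, withHandoff ? withLock : programOrder);
        return concurrent(hb, 1, 4);
    }

    public static void main(String[] args) {
        System.out.println(incrementsConcurrent(false)); // prints true: a race
        System.out.println(incrementsConcurrent(true));  // prints false: ordered by the handoff
    }
}
```

Without the handoff edge nothing orders event 1 against event 4; with it, 1 < 2 < 3 < 4 follows by transitivity.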


How about this?

A:

x = x + 1;

B:

mx->Acquire();
x = x + 1;
mx->Release();


How about this?

A:

x = x + 1;

B:

mx->Acquire();
x = x + 1;
mx->Release();

The locking discipline is not followed: purple fails to acquire the lock mx.

Or rather: purple accesses the variable x through another program section A that is mutually critical with B, but does not acquire the mutex.

A locking scheme is a convention that the entire program must follow.


How about this?

A:

lock->Acquire();
x = x + 1;
lock->Release();

B:

mx->Acquire();
x = x + 1;
mx->Release();


How about this?

A:

lock->Acquire();
x = x + 1;
lock->Release();

B:

mx->Acquire();
x = x + 1;
mx->Release();

This guy is not acquiring the right lock.

Or whatever. They’re not using the same lock, and that’s what matters.

A locking scheme is a convention that the entire program must follow.



Mutual exclusion in Java

  • Mutexes are built in to every Java object.

    • no separate classes

  • Every Java object is/has a monitor.

    • At most one thread may “own” a monitor at any given time.

  • A thread becomes owner of an object’s monitor by

    • executing an object method declared as synchronized

    • executing a block that is synchronized on the object

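// Alternative 1: a block that is synchronized on the object
public void increment() {
    synchronized(this) {
        x = x + 1;
    }
}

// Alternative 2: a method declared as synchronized
public synchronized void increment() {
    x = x + 1;
}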


Roots: monitors

A monitor is a module in which execution is serialized.

A module is a set of procedures with some private state.

[Brinch Hansen 1973]

[C.A.R. Hoare 1974]

At most one thread runs in the monitor at a time.

Other threads wait until the monitor is free.

[Figure: a monitor containing procedures P1() to P4() and private state; threads ready to enter queue at the entrance, and threads that called wait() are blocked until a signal().]

Java synchronized just allows finer control over the entry/exit points.

Also, each Java object is its own “module”: objects of a Java class share methods of the class but have private state and a private monitor.



Monitors and mutexes are “equivalent”

  • Entry to a monitor (e.g., a Java synchronized block) is equivalent to Acquire of an associated mutex.

    • Lock on entry

  • Exit of a monitor is equivalent to Release.

    • Unlock on exit (or at least “return the key”…)

  • Note: exit/release is implicit and automatic if the thread exits monitored code by a Java exception.

    • Much less error-prone than explicit release



Monitors and mutexes are “equivalent”

  • Well: mutexes are more flexible because we can choose which mutex controls a given piece of state.

    • E.g., in Java we can use one object’s monitor to control access to state in some other object.

    • Perfectly legal! So “monitors” in Java are more properly thought of as mutexes.

  • Caution: this flexibility is also more dangerous!

    • It violates modularity: can code “know” what locks are held by the thread that is executing it?

    • Nested locks may cause deadlock (later).

  • Keep your locking scheme simple and local!

    • Java ensures that each Acquire/Release pair (synchronized block) is contained within a method, which is good practice.


Using monitors/mutexes

Each monitor/mutex protects specific data structures (state) in the program. Threads hold the mutex when operating on that state.

The state is consistent iff certain well-defined invariant conditions are true. A condition is a logical predicate over the state.

Example invariant condition: suppose the state has a doubly linked list. Then for any element e either e.next is null or e.next.prev == e.

[Figure: a monitor containing procedures P1() to P4() and state; threads ready to enter queue at the entrance, and threads that called wait() are blocked until a signal().]

Threads hold the mutex when transitioning the structures from one consistent state to another, and restore the invariants before releasing the mutex.
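The doubly linked list invariant can be checked in code. The following is a minimal sketch under assumptions: the class GuardedList, its methods, and the two-thread demo are hypothetical names chosen for illustration, and the object's own monitor serves as the mutex.

```java
class GuardedList {
    private static final class Node {
        final int value;
        Node prev, next;
        Node(int value) { this.value = value; }
    }

    private Node head;  // state protected by this object's monitor
    private int size;

    // Inserts at the front. The update takes several steps; holding the
    // monitor keeps other threads from seeing a half-linked node.
    synchronized void pushFront(int value) {
        Node n = new Node(value);
        n.next = head;
        if (head != null) {
            head.prev = n;   // makes n.next.prev == n
        }
        head = n;
        size++;              // invariants hold again before the monitor is released
    }

    // Checks the invariant: for every element e, e.next is null or e.next.prev == e.
    synchronized boolean invariantHolds() {
        int seen = 0;
        for (Node e = head; e != null; e = e.next) {
            if (e.next != null && e.next.prev != e) return false;
            seen++;
        }
        return seen == size;
    }

    // Two threads insert concurrently; afterwards the invariant should still hold.
    static boolean demo(int perThread) {
        GuardedList list = new GuardedList();
        Runnable body = () -> {
            for (int i = 0; i < perThread; i++) list.pushFront(i);
        };
        Thread a = new Thread(body);
        Thread b = new Thread(body);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.invariantHolds() && list.size == 2 * perThread;
    }

    public static void main(String[] args) {
        System.out.println(demo(1000)); // prints true
    }
}
```

If pushFront were not synchronized, two threads could both link their node to the same old head, and either the invariant or the size check could fail.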


New Problem: Ping-Pong

void
PingPong() {
    while(not done) {
        if (blue)
            switch to purple;
        if (purple)
            switch to blue;
    }
}


Ping-Pong with Mutexes?

void
PingPong() {
    while(not done) {
        Mx->Acquire();
        Mx->Release();
    }
}

???

Mutexes don’t work for ping-pong


Monitor wait/signal

We need a way for a thread to wait for some condition to become true, e.g., until another thread runs and/or changes the state somehow.

At most one thread runs in the monitor at a time.

A thread may wait (sleep) in the monitor, allowing another thread to enter.

A thread may signal in the monitor.

Signal means: wake one waiting thread, if there is one, else do nothing.

The awakened thread returns from its wait.

[Figure: a monitor containing procedures P1() to P4() and state; threads ready to enter queue at the entrance; threads that called wait() are waiting (blocked) until another thread calls signal().]


Condition variables are equivalent

  • A condition variable (CV) is an object with an API.

  • A CV implements the behavior of monitor conditions.

    • interface to a CV: wait and signal (also called notify)

  • Every CV is bound to exactly one mutex, which is necessary for safe use of the CV.

    • “holding the mutex” is equivalent to being “in the monitor”

  • A mutex may have any number of CVs bound to it.

    • (But not with Java’s built-in monitors: each object’s monitor has exactly one CV.)

  • CVs also define a broadcast (notifyAll) primitive.

    • Signal all waiters.
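Binding several CVs to one mutex can be sketched with the explicit locks in java.util.concurrent.locks, where ReentrantLock.newCondition() may be called more than once on the same lock. The bounded buffer below is an illustration under assumptions: the class BoundedBuffer, its capacity of 2 in the demo, and the transfer helper are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class BoundedBuffer {
    private final ReentrantLock mx = new ReentrantLock();
    private final Condition notFull = mx.newCondition();   // first CV bound to mx
    private final Condition notEmpty = mx.newCondition();  // second CV bound to the same mx
    private final ArrayDeque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    void put(int v) throws InterruptedException {
        mx.lock();
        try {
            while (items.size() == capacity) notFull.await(); // wait releases mx while asleep
            items.addLast(v);
            notEmpty.signal();                                // wake one waiting consumer, if any
        } finally {
            mx.unlock();
        }
    }

    int take() throws InterruptedException {
        mx.lock();
        try {
            while (items.isEmpty()) notEmpty.await();
            int v = items.removeFirst();
            notFull.signal();                                 // wake one waiting producer, if any
            return v;
        } finally {
            mx.unlock();
        }
    }

    // One producer puts 1..n; the caller takes n items and returns their sum.
    static int transfer(int n) {
        BoundedBuffer buf = new BoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) buf.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < n; i++) sum += buf.take();
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(transfer(100)); // prints 5050
    }
}
```

Using two CVs means a producer's signal wakes only a consumer and vice versa, which a single shared CV cannot guarantee.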


Ping-Pong using a condition variable

void
PingPong() {
    mx->Acquire();
    while(not done) {
        cv->Signal();
        cv->Wait();
    }
    mx->Release();
}


Ping-Pong using a condition variable

[Figure: timeline of the two threads alternating: each signals the other, then waits.]


Example: Wait/Notify in Java

Every Java object may be treated as a condition variable for threads using its monitor. There is no condition class.

public class PingPong {   // implicitly extends Object
    public synchronized void PingPong() throws InterruptedException {
        while(true) {
            notify();
            wait();
        }
    }
}

// Excerpt of the relevant methods of java.lang.Object (declarations only)
public class Object {
    public final void notify();      /* signal */
    public final void notifyAll();   /* broadcast */
    public final void wait() throws InterruptedException;
    public final void wait(long timeout) throws InterruptedException;
}

A thread must own an object’s monitor to call wait/notify, else the method raises an IllegalMonitorStateException.

wait(timeout) waits until the timeout elapses or another thread notifies.


Using condition variables

  • In typical use a condition variable is associated with some logical condition or predicate on the state protected by its mutex.

    • E.g., queue is empty, buffer is full, message in the mailbox.

    • Note: CVs are not variables. You can associate them with whatever data you want, i.e., the state protected by the mutex.

  • A caller of CV wait must hold its mutex (be “in the monitor”).

    • This is crucial because it means that a waiter can wait on a logical condition and know that it won’t change until the waiter is safely asleep.

    • Otherwise, another thread might change the condition and signal before the waiter is asleep! Signals do not stack! The waiter would sleep forever: the missed wakeup or wake-up waiter problem.

  • The wait releases the mutex to sleep, and reacquires it before returning.

    • But another thread could have beaten the waiter to the mutex and messed with the condition: loop before you leap!
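The points above suggest a ping-pong in which each player waits on an explicit condition and rechecks it in a loop. This is a sketch under assumptions: the class TurnPingPong, the blueTurn flag, and the log of moves are hypothetical additions for illustration, using Java's built-in wait/notify.

```java
class TurnPingPong {
    private boolean blueTurn = true;                        // the condition, protected by this monitor
    private final StringBuilder log = new StringBuilder();

    synchronized void play(boolean blue, int rounds) {
        try {
            for (int i = 0; i < rounds; i++) {
                while (blueTurn != blue) {                  // loop before you leap
                    wait();                                 // releases the monitor while asleep
                }
                log.append(blue ? "ping " : "pong ");
                blueTurn = !blue;                           // change the condition first...
                notify();                                   // ...then signal the other player
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts purple first on purpose: the turn flag, not start order, decides who moves.
    static String run(int rounds) {
        TurnPingPong game = new TurnPingPong();
        Thread purple = new Thread(() -> game.play(false, rounds));
        Thread blue = new Thread(() -> game.play(true, rounds));
        purple.start();
        blue.start();
        try {
            purple.join();
            blue.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return game.log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run(2)); // prints "ping pong ping pong"
    }
}
```

Because the waiter holds the monitor when it tests blueTurn and wait() releases the monitor atomically, the other player cannot flip the flag and signal in between, so no wakeup is missed.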


Example: event/request queue

We can implement an event queue with a mutex/CV pair.

Protect the event queue data structure itself with the mutex.

Workers wait on the CV for next event if the event queue is empty. Signal the CV when a new event arrives.

Worker loop: handle one event, blocking as necessary. When handler is complete, return to worker pool.

[Figure: incoming events enter a queue and are dispatched to worker threads; idle threads wait on the CV.]
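A possible shape for such a queue, using an object's monitor as the mutex/CV pair, is sketched below. The names RequestQueue, post, close, and handleAll are assumptions for illustration; the close flag is an addition so that the demo's workers can terminate.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

class RequestQueue {
    private final Queue<Runnable> events = new ArrayDeque<>(); // protected by this monitor
    private boolean closed = false;

    synchronized void post(Runnable event) {
        events.add(event);
        notify();                 // signal one waiting worker: a new event arrived
    }

    synchronized void close() {
        closed = true;
        notifyAll();              // broadcast so every idle worker can exit
    }

    // Blocks while the queue is empty; returns null once closed and drained.
    private synchronized Runnable next() throws InterruptedException {
        while (events.isEmpty() && !closed) {
            wait();
        }
        return events.poll();
    }

    // Worker loop: take one event, handle it outside the monitor, repeat.
    void workerLoop() {
        try {
            Runnable e;
            while ((e = next()) != null) {
                e.run();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    // Posts nEvents counting events to nWorkers workers; returns how many were handled.
    static int handleAll(int nEvents, int nWorkers) {
        RequestQueue q = new RequestQueue();
        AtomicInteger handled = new AtomicInteger();
        Thread[] workers = new Thread[nWorkers];
        for (int i = 0; i < nWorkers; i++) {
            workers[i] = new Thread(q::workerLoop);
            workers[i].start();
        }
        for (int i = 0; i < nEvents; i++) {
            q.post(() -> handled.incrementAndGet());
        }
        q.close();
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println(handleAll(100, 4)); // prints 100
    }
}
```

Handlers run after next() has returned, so the mutex protects only the queue itself and a slow handler does not block other workers.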


Monitor wait/signal

At most one thread runs in the monitor at a time.

Design question: when a waiting thread is awakened by signal, must it start running immediately? Back in the monitor, where it called wait?

Two choices: yes or no.

If yes, what happens to the thread that called signal within the monitor? Does it just hang there? They can’t both be in the monitor.

If no, can’t other threads get into the monitor first and change the state, causing the condition to become false again?

[Figure: the monitor diagram with a signaling thread and an awakened waiter; what happens next is marked ???.]


Mesa semantics: Just say no

Design question: when a waiting thread is awakened by signal, must it start running immediately? Back in the monitor, where it called wait?

Mesa semantics: no.

An awakened waiter gets back in line. The signal caller keeps the monitor.

So, can’t other threads get into the monitor first and change the state, causing the condition to become false again?

Yes. So the waiter must recheck the condition:

“Loop before you leap”.

[Figure: the monitor diagram; an awakened waiter joins the threads that are ready to (re)enter, alongside those ready to enter for the first time.]


Alternative: Hoare semantics

  • As originally defined in the early 1970s, monitors chose “yes”: Hoare semantics. The signaler suspends; the awakened waiter gets the monitor.

  • Monitors with Hoare semantics might be easier to program, somebody might think. Maybe. I suppose.

  • But monitors with Hoare semantics are difficult to implement efficiently on multiprocessors.

  • Birrell et al. determined this when they built monitors for the Mesa programming language in the 1970s.

  • So they changed the rules: Mesa semantics.

  • Java uses Mesa semantics. Everybody uses Mesa semantics.

  • Hoare semantics are of historical interest only.

  • Loop before you leap!

