Efficient dependency tracking for relevant events in shared memory systems

Efficient Dependency Tracking for Relevant Events in Shared Memory Systems. Anurag Agarwal ([email protected]) Vijay K. Garg ([email protected]) PDS Lab University of Texas at Austin. Outline. Motivation Background Chain Clock Instances of Chain Clock Experimental Results

Efficient Dependency Tracking for Relevant Events in Shared Memory Systems

Anurag Agarwal ([email protected])

Vijay K. Garg ([email protected])

PDS Lab

University of Texas at Austin


Outline

  • Motivation

  • Background

  • Chain Clock

  • Instances of Chain Clock

  • Experimental Results

  • Conclusion


Motivation

  • Dependency between events required for global state information

  • Applications like monitoring and debugging

  • Vector clock [Fidge 88, Mattern 89]

    • O(N) operations for a system with N processes

    • Dynamic creation of processes


Relevant Events

  • Events “useful” for application

  • Predicate Detection

    • “There are no messages in the channel”

[Figure: space-time diagram of processes p1 to p4]


Vector Clocks [Fidge 88, Mattern 89]

  • Assigns N-tuple (V) to every relevant event

    • e → f iff e.V < f.V (clock condition)

  • Process Pi :

    • V = (0, … , 0)

    • On an event e

      • If e is receive of message m:

        V = max (V, m.V)

      • If e is a relevant event:

        V[i] = V[i] + 1

      • If e is a send of message m:

        m.V = V
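The three rules above can be sketched in Python as follows. This is a minimal, single-threaded illustration rather than the authors' code: the class name `VectorClockProcess` and the helper `happened_before` are hypothetical, N is fixed up front, and a message is modeled as the tuple it carries.

```python
class VectorClockProcess:
    """Process Pi in a system of n processes (names are illustrative)."""

    def __init__(self, i, n):
        self.i = i
        self.v = [0] * n              # V = (0, ..., 0)

    def receive(self, m_v):
        # Receive of message m: V = max(V, m.V), componentwise.
        self.v = [max(a, b) for a, b in zip(self.v, m_v)]

    def relevant_event(self):
        # Relevant event: V[i] = V[i] + 1; the event is stamped with V.
        self.v[self.i] += 1
        return tuple(self.v)

    def send(self):
        # Send of message m: m.V = V.
        return tuple(self.v)


def happened_before(e_v, f_v):
    """Clock condition: e -> f iff e.V < f.V (componentwise <=, not equal)."""
    return e_v != f_v and all(a <= b for a, b in zip(e_v, f_v))


p0, p1 = VectorClockProcess(0, 2), VectorClockProcess(1, 2)
e = p0.relevant_event()       # (1, 0)
m = p0.send()
f = p1.relevant_event()       # (0, 1), concurrent with e
p1.receive(m)
g = p1.relevant_event()       # (1, 2), causally after e and f
```

Every operation touches all N components, which is the O(N) cost named on the Motivation slide.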


Key Idea

  • Any chain in the computation poset can function as a process

[Figure: computation on processes p1 to p4 with events a to h]


Chain Clocks

  • A component in timestamp corresponds to a chain

  • Change “Rule II” in the vector clock algorithm

    • If e is a relevant event

      V[e.c] = V[e.c] + 1

  • Theorem: Chain clocks guarantee the “clock condition”

  • Goal: Online decomposition of poset into as few chains as possible


Outline

  • Motivation

  • Background

  • Chain Clock

  • Instances of Chain Clock

    • DCC

    • ACC

    • VCC

  • Experimental Results

  • Conclusion


Dynamic Chain Clocks (DCC)

  • Shared vector Z maintains up-to-date values of all components

  • Each process starts with empty vector

  • Rule II

    • e.c = j such that Z[j] = e.V[j]

      • Give preference to component last updated by Pi

    • V[e.c] = V[e.c] + 1


DCC: Example

  • If e is receive of message m:

    V = max (V, m.V)

  • If e is a relevant event:

    e.c = i s.t. Z[i] = V[i]

    V[e.c] = V[e.c] + 1

    Z[e.c] = Z[e.c] + 1

  • If e is a send of message m: m.V = V
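A rough Python sketch of these DCC rules follows. Several details are assumptions, not statements from the slides: a new component is appended when no j satisfies Z[j] = V[j] (which the example timestamps (1) and (0,1) suggest), missing components count as 0, ties are broken by taking the first matching component, and the names `SharedClock` and `DccProcess` are hypothetical.

```python
import threading


class SharedClock:
    """Shared vector Z holding the latest value of every component."""

    def __init__(self):
        self.z = []
        self.lock = threading.Lock()


class DccProcess:
    def __init__(self, shared):
        self.shared = shared
        self.v = []           # each process starts with an empty vector
        self.last = None      # component this process updated most recently

    def _pad(self, n):
        # Assumption: components not seen yet are treated as 0.
        self.v.extend([0] * (n - len(self.v)))

    def receive(self, m_v):
        self._pad(len(m_v))
        padded = list(m_v) + [0] * (len(self.v) - len(m_v))
        self.v = [max(a, b) for a, b in zip(self.v, padded)]

    def send(self):
        return tuple(self.v)

    def relevant_event(self):
        with self.shared.lock:
            z = self.shared.z
            self._pad(len(z))
            matches = [j for j in range(len(z)) if z[j] == self.v[j]]
            if self.last in matches:
                c = self.last            # prefer the component we updated last
            elif matches:
                c = matches[0]           # assumed tie-break: first match
            else:
                z.append(0)              # assumed: start a new chain
                self.v.append(0)
                c = len(z) - 1
            self.v[c] += 1
            z[c] += 1
            self.last = c
            return tuple(self.v)


shared = SharedClock()
p1, p2 = DccProcess(shared), DccProcess(shared)
a = p1.relevant_event()       # (1,)
b = p2.relevant_event()       # (0, 1)
p1.receive(p2.send())         # p1 now holds (1, 1)
c = p1.relevant_event()       # (2, 1)
d = p1.relevant_event()       # (3, 1)
```

The run reproduces the p1 and p2 timestamps listed in the example. The p3 timestamps depend on message timing that the transcript does not preserve, so they are left out.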

[Figure: DCC example on three processes. Timestamps on p1: (1), (1,1) = max{(1),(0,1)}, (2,1), (3,1); on p2: (0,1); on p3: (3,1), (3,2). A side table tracks V1, V2, V3 and the shared vector Z.]


Problem

  • Number of processes can be much larger than minimal number of chains

[Figure: timestamps on four processes. p1: (1); p2: (1,2), (0,1); p3: (0,1,1), (1,2,2); p4: (0,1,1,1), (1,2,2,2)]


Optimal Chain Decomposition

  • Antichain: Set of pairwise concurrent elements

  • Width: Maximum size of an antichain

  • Dilworth’s Theorem [1950] : A poset of width k can be partitioned into k chains and no fewer.

  • Requires knowledge of complete poset
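To make the definitions of antichain and width concrete, here is a brute-force sketch that computes the width of a small, fully known poset. The event tuples and the componentwise order `less` are made up for illustration, and the search is exponential, so it is only usable on toy inputs.

```python
from itertools import combinations


def less(a, b):
    """Strict componentwise order on equal-length tuples (illustrative poset)."""
    return a != b and all(x <= y for x, y in zip(a, b))


def width(elements, less):
    """Size of the largest antichain, i.e. set of pairwise concurrent elements."""
    for size in range(len(elements), 0, -1):
        for subset in combinations(elements, size):
            if all(not less(a, b) and not less(b, a)
                   for a, b in combinations(subset, 2)):
                return size
    return 0


# Hypothetical events: two interleaved chains plus a common successor (2, 2).
events = [(1, 0), (0, 1), (2, 0), (0, 2), (3, 0), (2, 2)]
poset_width = width(events, less)   # 2
```

By Dilworth's theorem these six events can be covered by two chains, for example (1,0), (2,0), (3,0) and (0,1), (0,2), (2,2), but finding that cover requires seeing the whole poset first.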


Online Chain Decomposition

  • Elements of poset presented in a total order consistent with the poset

  • Assign elements to chains as they arrive

  • Can be modeled as a game between

    • Bob : Presents elements

    • Alice : Assigns them to chains

  • Felsner [1997] : For a poset of width k, Bob can force Alice to use k(k+1)/2 chains


Chain Partitioning Algorithm (ACC)

  • Felsner gave an algorithm which meets the k(k+1)/2 bound

  • Our algorithm is simpler and more efficient

  • Sets of queues B1 … Bk, with |Bi| = i

  • For an element z:

    • Insert z into the first queue q in Bi with head < z

    • Swap the queues in Bi and Bi-1, leaving q in its place
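The insertion step can be sketched in Python as below. The slides are terse, so this is one reading of the two bullets rather than the authors' implementation: an empty queue is assumed to accept any element, the "head" of a queue is taken to be its most recently inserted element, and the class name `OnlineChainPartitioner` and the sample events are hypothetical.

```python
class OnlineChainPartitioner:
    """Online chain partition for posets of width at most k (sketch)."""

    def __init__(self, k, less):
        self.less = less
        self.chains = [[] for _ in range(k * (k + 1) // 2)]
        # sets[i] holds the ids of the i + 1 queues currently in B_{i+1}.
        self.sets, next_id = [], 0
        for size in range(1, k + 1):
            self.sets.append(list(range(next_id, next_id + size)))
            next_id += size

    def insert(self, z):
        for i, group in enumerate(self.sets):
            for q in group:
                chain = self.chains[q]
                if not chain or self.less(chain[-1], z):
                    chain.append(z)
                    if i > 0:
                        # Swap B_i \ {q} with B_{i-1}; q stays in B_i.
                        rest = [other for other in group if other != q]
                        self.sets[i] = [q] + self.sets[i - 1]
                        self.sets[i - 1] = rest
                    return q
        raise ValueError("no queue accepts z; the poset is wider than k")


def less(a, b):
    """Strict componentwise order on tuples (illustrative poset)."""
    return a != b and all(x <= y for x, y in zip(a, b))


# Hypothetical width-2 poset, presented in an order consistent with `less`.
events = [(1, 0), (0, 1), (2, 0), (0, 2), (3, 0), (2, 2)]
part = OnlineChainPartitioner(2, less)
ids = [part.insert(z) for z in events]   # [0, 1, 2, 1, 0, 1]
```

On this input the sketch uses 3 chains, one more than the offline optimum of 2, which matches the k(k+1)/2 bound for k = 2.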

[Figure: element z being inserted into the queue sets B1, B2, B3]


Drawback of DCC and ACC

  • Require a shared data structure

    • Monitoring applications generally need a central server

  • Hybrid clocks

    • Multiple servers, each responsible for a subset of processes

    • Finds chains within a process group


Shared Memory System

  • Accesses to shared variables induce dependencies

  • Observation: Access events for a shared variable form a chain

  • Variable-based Chain Clocks (VCC)

    • Associate a component with every variable
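One way to picture VCC is the sketch below, where each shared variable remembers the timestamp of its latest access. The slides give only the idea, so the concrete rule used here (merge the variable's clock, then increment the variable's component) is an assumption. So are the names `SharedMemory` and `VccThread`, the choice to treat only writes as relevant, and the per-variable lock that serializes accesses into a chain.

```python
import threading


class SharedMemory:
    """Shared variables, each with one clock component (assumed layout)."""

    def __init__(self, names):
        self.index = {name: j for j, name in enumerate(names)}
        self.value = {name: 0 for name in names}
        self.clock = {name: [0] * len(names) for name in names}
        self.lock = {name: threading.Lock() for name in names}


class VccThread:
    def __init__(self, mem):
        self.mem = mem
        self.v = [0] * len(mem.index)

    def write(self, name, value):
        # The lock makes the accesses to `name` totally ordered: a chain.
        with self.mem.lock[name]:
            # Dependency on the previous access to the same variable.
            self.v = [max(a, b) for a, b in zip(self.v, self.mem.clock[name])]
            # Relevant event on the chain of `name`.
            self.v[self.mem.index[name]] += 1
            self.mem.clock[name] = list(self.v)
            self.mem.value[name] = value
            return tuple(self.v)


def happened_before(e_v, f_v):
    return e_v != f_v and all(a <= b for a, b in zip(e_v, f_v))


mem = SharedMemory(["x", "y"])
t1, t2 = VccThread(mem), VccThread(mem)
e1 = t1.write("x", 1)    # (1, 0)
e2 = t2.write("y", 1)    # (0, 1), concurrent with e1
e3 = t2.write("x", 2)    # (2, 1), after both e1 and e2
```

The timestamp length equals the number of tracked variables, independent of how many threads run.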


VCC Application: Predicate Detection

[Figure: trace of assignments y = 2, x = 0, x = 2, x = 1, y = 1, x = 1]

  • Predicate : (x = 1) and (y = 1)

  • Only events changing x and y are relevant

  • Associate one component of the VCC with x and another with y

Initially: x = 0, y = 0


Experiments

  • Setup

    • A multithreaded application

    • Each thread generates a sequence of events

    • Parameters:

      • Number of Processes

      • Number of Events

      • Probability of relevant event: a

    • Metrics

      • Number of components used

      • Execution time


Components Used

[Plot: Events = 100, a = 1%]


Execution Time

[Plot: Events = 100, a = 1%]


Effect of Relevancy

[Plot: Threads = 100, Events = 100]


Conclusion

  • Generalized vector clocks to a class of algorithms called Chain Clocks

  • Dynamic Chain Clock (DCC) can provide substantial speedup and reduce the memory requirements of applications

  • Antichain-based Chain Clock (ACC) meets the k(k+1)/2 lower bound for online chain decomposition


Questions?


Example: Poset of width 2

  • For a poset of width 2, Bob can force Alice to use 3 chains

[Figure: poset of width 2 with elements labeled by chain number (1, 2, 3)]


Happened Before Relation (→) [Lamport 78]

  • Distributed computation with N processes

  • Every process executes a series of events

    • Internal, send or receive event

[Figure: space-time diagram of processes p1 and p2]

  • e → f if there is a path from e to f

  • e║f if there is no path between e and f


Future Work

  • Lower bound for online chain decomposition when a decomposition into N chains is already known

  • Other chain decomposition strategies


Distributed System: Time vs Threads

[Plot: Events = 100, a = 1%]


Distributed System: Events vs Time

[Plot: Threads = 100, a = 1%]


Effect of Number of Events

[Plot: Threads = 100, a = 1%]


  • Example for DCC: is it appropriate?

  • Is the content a bit too much for this amount

    • Where can I reduce it?

      • Remove VCC or ACC?

  • Chain clock

    • Generalizes vector clocks

    • Reduces the time and memory overhead

    • Elegantly handles dynamic process creation

