# Efficient Dependency Tracking for Relevant Events in Shared Memory Systems

Presentation Transcript

### Efficient Dependency Tracking for Relevant Events in Shared Memory Systems


Anurag Agarwal ([email protected])

Vijay K. Garg ([email protected])

PDS Lab

University of Texas at Austin

### Outline

- Motivation
- Background
- Chain Clock
- Instances of Chain Clock
- Experimental Results
- Conclusion

### Motivation

- Dependency between events required for global state information
- Applications like monitoring and debugging
- Vector clock [Fidge 88, Mattern 89]
- O(N) operations for a system with N processes
- Dynamic creation of processes


### Relevant Events

- Events "useful" for the application
- Example: predicate detection
  - "There are no messages in the channel"

[Figure: a four-process computation (p1–p4) with relevant events marked]

### Vector Clocks [Fidge 88, Mattern 89]

- Assigns an N-tuple V to every relevant event
- e → f iff e.V < f.V (the clock condition)

Process Pi:
- Initially, V = (0, …, 0)
- On an event e:
  - If e is a receive of message m: V = max(V, m.V)
  - If e is a relevant event: V[i] = V[i] + 1
  - If e is a send of message m: m.V = V
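The three rules above can be sketched as runnable code (our illustration, not the authors' implementation; the class and function names are ours):

```python
# Minimal sketch of the vector clock rules: each process keeps an
# N-entry vector and applies the receive / relevant-event / send rules.
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid          # this process's index i
        self.v = [0] * n        # V = (0, ..., 0)

    def on_receive(self, m_v):
        # If e is a receive of message m: V = max(V, m.V), componentwise
        self.v = [max(a, b) for a, b in zip(self.v, m_v)]

    def on_relevant_event(self):
        # If e is a relevant event: V[i] = V[i] + 1
        self.v[self.pid] += 1

    def on_send(self):
        # If e is a send of message m: m.V = V (a copy travels with m)
        return list(self.v)

def happened_before(u, v):
    # Clock condition: e -> f iff e.V < f.V (<= everywhere, < somewhere)
    return all(a <= b for a, b in zip(u, v)) and u != v
```

For example, a relevant event on P1 followed by a message to P2 and a relevant event there yields timestamps (1, 0) and (1, 1), with (1, 0) < (1, 1).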


[Figure: a computation on processes p1–p4 with events a–h]

### Key Idea

- Any chain in the computation poset can function as a "process"

[Figure: the same events a–h regrouped into chains of the poset]

### Chain Clocks

- A component in the timestamp corresponds to a chain
- Change Rule II of the vector clock algorithm:
  - If e is a relevant event: V[e.c] = V[e.c] + 1
- Theorem: chain clocks guarantee the clock condition
- Goal: online decomposition of the poset into as few chains as possible

### Outline

- Motivation
- Background
- Chain Clock
- Instances of Chain Clock
- DCC
- ACC
- VCC

- Experimental Results
- Conclusion

### Dynamic Chain Clocks (DCC)

- A shared vector Z maintains up-to-date values of all components
- Each process starts with an empty vector
- Rule II:
  - e.c = j such that Z[j] = e.V[j], preferring the component last updated by Pi
  - V[e.c] = V[e.c] + 1
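One way the modified Rule II could look in code (a hedged sketch of our reading of the slide; the "prefer the component Pi last updated" optimization is omitted, and all names are illustrative):

```python
import threading

# Sketch of DCC's Rule II: Z holds the current value of every component.
# A process may tick component j only when its local view is current
# (Z[j] == V[j]); otherwise it opens a new component, i.e. a new chain.
class DCCShared:
    def __init__(self):
        self.z = []                    # shared up-to-date component values
        self.lock = threading.Lock()

    def tick(self, v):
        """Apply Rule II to the local vector v (mutated in place);
        returns the chosen component e.c."""
        with self.lock:
            v.extend([0] * (len(self.z) - len(v)))   # pad to shared length
            for j in range(len(v)):
                if self.z[j] == v[j]:                # local view is current
                    c = j
                    break
            else:
                self.z.append(0)                     # start a new chain
                v.append(0)
                c = len(v) - 1
            v[c] += 1
            self.z[c] += 1
            return c
```

Two processes ticking concurrently land on separate components, while a process whose view is current keeps extending its own chain, so the vector only grows with the number of chains, not processes.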

### DCC: Example

- If e is a receive of message m: V = max(V, m.V)
- If e is a relevant event: e.c = i such that Z[i] = V[i]; then V[e.c] = V[e.c] + 1 and Z[e.c] = Z[e.c] + 1
- If e is a send of message m: m.V = V

[Figure: p1's relevant events are stamped (1), then (1,1) = max{(1), (0,1)} after a receive, then (2,1) and (3,1); p2's event is stamped (0,1); p3's events are stamped (3,1) and (3,2); a table tracks the local vectors V1, V2, V3 and the shared vector Z at each step]
### Problem

- The number of processes can be much larger than the minimal number of chains

[Figure: with one component per process, timestamps on p4 grow to (0,1,1,1) and (1,2,2,2) even though the computation decomposes into fewer chains]
### Optimal Chain Decomposition

- Antichain: a set of pairwise concurrent elements
- Width: the maximum size of an antichain
- Dilworth's Theorem [1950]: a poset of width k can be partitioned into k chains, and no fewer
- Requires knowledge of the complete poset

### Online Chain Decomposition

- Elements of the poset are presented in a total order consistent with the poset
- Elements are assigned to chains as they arrive
- Can be modeled as a game between:
  - Bob, who presents elements
  - Alice, who assigns them to chains
- Felsner [1997]: for a poset of width k, Bob can force Alice to use k(k+1)/2 chains

### Chain Partitioning Algorithm (ACC)

- Felsner gave an algorithm that meets the k(k+1)/2 bound
- Our algorithm is simpler and more efficient
- Maintain sets of queues B1 … Bk with |Bi| = i
- For an element z:
  - Insert z into the first queue q in Bi whose head is < z
  - Swap the queues in Bi and Bi−1, leaving q in its place

[Figure: an element z being routed into the queue structure B1, B2, B3]
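The queue-swapping step can be sketched as follows (our interpretation of the slide, using componentwise-ordered tuples as poset elements; the class and method names are ours):

```python
# Sketch of the ACC chain-partitioning step: B_i is a set of i queues.
# An element z extends the first queue in the lowest B_i whose most
# recent element lies below z; then B_i and B_{i-1} swap their queues,
# with z's queue q staying in B_i. If no head is below z, a new set
# B_{k+1} of k+1 queues is opened with z in one of them.
def precedes(u, v):
    """u < v in the poset (componentwise <=, not equal)."""
    return all(a <= b for a, b in zip(u, v)) and u != v

class ACC:
    def __init__(self):
        self.b = []        # self.b[i] holds the i+1 queues of B_{i+1}

    def insert(self, z):
        for i, bi in enumerate(self.b):
            for q in bi:
                if q and precedes(q[-1], z):      # head of q is < z
                    q.append(z)
                    if i > 0:                     # swap B_i and B_{i-1}
                        rest = [r for r in bi if r is not q]
                        self.b[i] = self.b[i - 1] + [q]
                        self.b[i - 1] = rest
                    return
        # z is concurrent with every head: open B_{k+1}
        self.b.append([[z]] + [[] for _ in range(len(self.b))])
```

In the worst case this uses 1 + 2 + … + k = k(k+1)/2 chains for a poset of width k, matching Felsner's bound.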

### Drawback of DCC and ACC

- They require a shared data structure
- Monitoring applications generally need a central server anyway
- Hybrid clocks:
  - Multiple servers, each responsible for a subset of processes
  - Find chains within each process group

### Shared Memory System

- Accesses to shared variables induce dependencies
- Observation: the access events for a shared variable form a chain
- Variable-based Chain Clocks (VCC):
  - Associate a component with every variable

[Figure: interleaved accesses x = 0, x = 1, x = 2 and y = 1, y = 2 — the accesses to each variable form a chain]

### VCC Application: Predicate Detection

- Predicate: (x = 1) and (y = 1)
- Only events changing x and y are relevant
- Associate one component of the VCC with x and the other with y
- Initially: x = 0, y = 0
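One way to realize this (an illustrative sketch with assumed names, not the paper's code): give each shared variable its own clock component and a stamp of its last access, so successive accesses to the same variable, serialized by its lock, form a chain:

```python
import threading
import itertools

# Illustrative VCC sketch: each shared variable owns one clock component.
# An access event first merges the clock of the previous access to that
# variable (V = max(V, x.V)), then ticks the variable's component.
_components = itertools.count()

class SharedVar:
    def __init__(self, value):
        self.value = value
        self.c = next(_components)     # this variable's VCC component
        self.stamp = {}                # clock of the most recent access
        self.lock = threading.Lock()

    def access(self, clock):
        """Record an access by a thread carrying `clock` (a sparse
        vector as a dict); returns the event's timestamp."""
        with self.lock:
            for k, t in self.stamp.items():           # V = max(V, x.V)
                clock[k] = max(clock.get(k, 0), t)
            clock[self.c] = clock.get(self.c, 0) + 1  # V[c_x] += 1
            self.stamp = dict(clock)                  # x.V = V
            return dict(clock)
```

With the predicate (x = 1) and (y = 1), only accesses to x and y are stamped, so timestamps stay two components wide no matter how many threads run.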


### Experiments

- Setup
- A multithreaded application
- Each thread generates a sequence of events
- Parameters:
- Number of Processes
- Number of Events
- Probability of a relevant event: α

- Metrics
- Number of components used
- Execution time

### Conclusion

- Generalized vector clocks to a class of algorithms called Chain Clocks
- Dynamic Chain Clock (DCC) can provide tremendous speedup and reduce memory requirement for applications
- Antichain-based Chain Clock (ACC) meets the lower bound for chain decomposition

### Example: Poset of Width 2

- For a poset of width 2, Bob can force Alice to use 3 chains

[Figure: four elements, assigned to chains 1, 1, 2, and 3 as they arrive]




### Happened Before Relation (→) [Lamport 78]

- Distributed computation with N processes
- Every process executes a series of events
  - Internal, send, or receive events

[Figure: a two-process computation (p1, p2) with paths between events]

- e → f if there is a path from e to f
- e ║ f if there is no path between e and f

### Future Work

- Lower bound for online chain decomposition when a decomposition into N chains is already known
- Other chain decomposition strategies

### Chain Clock: Summary

- Generalizes vector clocks
- Reduces the time and memory overhead
- Elegantly handles dynamic process creation
