Coding for atomic shared memory emulation


Coding for atomic shared memory emulation

Coding for Atomic Shared Memory Emulation

Viveck R. Cadambe (MIT)

Joint with Prof. Nancy Lynch (MIT), Prof. Muriel Médard (MIT) and Dr. Peter Musial (EMC)




Coding for atomic shared memory emulation

Erasure Coding for Distributed Storage

  • Locality, Repair Bandwidth, Caching and Content Distribution

    • [Gopalan et al. 2011, Dimakis-Godfrey-Wu-Wainwright 10, Wu-Dimakis 09, Niesen-Ali 12]

  • Queuing theory

    • [Ferner-Medard-Soljanin 12, Joshi-Liu-Soljanin 12, Shah-Lee-Ramchandran 12]

This talk: Theory of distributed computing

Considerations for storing data that changes


Coding for atomic shared memory emulation

Goals: failure tolerance, low storage costs, fast reads and writes

Consistency: the value changes over time; reads should return the “latest” version


Coding for atomic shared memory emulation

Shared Memory Emulation - History

Atomic (consistent) shared memory

  • [Lamport 1986]

  • Cornerstone of distributed computing and multi-processor programming

  • “ABD” algorithm [Attiya-Bar-Noy-Dolev 95], 2011 Dijkstra Prize

  • Amazon Dynamo key-value store [DeCandia et al. 2008]

  • Replication-based

Emulation over distributed storage systems

  • Costs of emulation

  • Low-cost coding-based algorithm

  • Communication and storage costs

  • [C-Lynch-Medard-Musial 2014], preprint available

(This talk)




Coding for atomic shared memory emulation

Atomicity

[Lamport 86], aka linearizability [Herlihy-Wing 90]

[Figure: write and read operations shown as intervals on a time line; one execution is atomic, the other is not]
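To make the definition concrete, here is a minimal brute-force linearizability check for a single read/write register. It is only an illustrative sketch: the operation names, timings, and values are hypothetical, and the check simply searches for a total order that respects real-time precedence and makes every read return the latest preceding write.

```python
from itertools import permutations

class Op:
    """One operation on the shared register: [start, end] is its real-time
    interval; kind is 'write' or 'read'; value is written or returned."""
    def __init__(self, name, start, end, kind, value):
        self.name, self.start, self.end = name, start, end
        self.kind, self.value = kind, value

def is_atomic(ops, initial=None):
    """True if some total order of the operations (i) respects real-time
    precedence and (ii) makes every read return the latest preceding write."""
    for order in permutations(ops):
        pos = {op.name: i for i, op in enumerate(order)}
        # (i) if a finished before b started, a must come first in the order
        if any(a.end < b.start and pos[a.name] > pos[b.name]
               for a in ops for b in ops):
            continue
        # (ii) walk the order; every read must see the latest write so far
        current, ok = initial, True
        for op in order:
            if op.kind == "write":
                current = op.value
            elif op.value != current:
                ok = False
                break
        if ok:
            return True
    return False

# Hypothetical timings: a read overlapping a write may return the new value...
print(is_atomic([Op("W1", 0, 4, "write", "v1"),
                 Op("R1", 2, 6, "read", "v1")]))        # True
# ...but a strictly later read cannot then go back to the older value.
print(is_atomic([Op("W1", 0, 4, "write", "v1"),
                 Op("R1", 1, 2, "read", "v1"),
                 Op("R2", 3, 5, "read", None)]))        # False
```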




Coding for atomic shared memory emulation

Distributed Storage Model

[Figure: write clients, read clients, and servers connected by point-to-point links]

  • Client-server architecture; nodes can fail (the number of server failures is limited)

  • Point-to-point reliable links (arbitrary delay).

  • Nodes do not know if other nodes fail

  • An operation should not have to wait for others to complete



Coding for atomic shared memory emulation

Requirements and cost measure

[Figure: write clients, read clients, and servers]

  • Design write, read, and server protocols such that:

  • Atomicity

  • Concurrent operations, no waiting.

  • Communication overheads: Number of bits sent over links

  • Storage overheads: (Worst-case) server storage costs


Coding for atomic shared memory emulation

The ABD algorithm (sketch)

[Figure: write clients, read clients, and servers]

Quorum set: every majority of the server nodes.

Any two quorum sets intersect in at least one node.

Algorithm works if at least one quorum set is available.
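As a quick sanity check of this quorum property, the following sketch enumerates all majorities of a small server set (five servers is an arbitrary illustrative choice) and verifies that every pair of majorities overlaps:

```python
from itertools import combinations

servers = set(range(5))                     # five servers, an illustrative choice
majority = len(servers) // 2 + 1            # a quorum = any majority (3 of 5)
quorums = [set(q) for q in combinations(servers, majority)]

# Any two majorities overlap in at least one server; that common server is
# what lets a reader see the value stored by the most recent writer.
assert all(q1 & q2 for q1 in quorums for q2 in quorums)
print("every pair of majorities intersects")
```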



Coding for atomic shared memory emulation

The ABD algorithm (sketch)

[Figure: write and read clients exchanging messages and ACKs with the servers]

Write:

Send time-stamped value to every server; return after receiving acks from a quorum.

Read:

Send read query; wait for responses from a quorum; send the latest value to the servers;

return the latest value after receiving acks from a quorum.

Servers:

Store the latest value received; send ack

Respond to read request with value
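The following is a minimal, single-threaded sketch of this read/write logic in Python. It is illustrative only: real ABD runs over asynchronous channels and proceeds as soon as any quorum responds, whereas here every “send” is a direct call and the first quorum-many responses are taken; the class and method names are hypothetical, not the paper’s pseudocode.

```python
class Server:
    """Stores the latest (time-stamp, value) pair it has seen."""
    def __init__(self):
        self.ts, self.value = (0, 0), None          # ts = (counter, client id)

    def put(self, ts, value):                       # write / read write-back
        if ts > self.ts:
            self.ts, self.value = ts, value
        return "ack"

    def get(self):                                  # read query
        return self.ts, self.value


class Client:
    def __init__(self, cid, servers):
        self.cid, self.servers = cid, servers
        self.counter = 0
        self.quorum = len(servers) // 2 + 1         # any majority

    def write(self, value):
        # Send a time-stamped value to every server; return after a quorum of
        # acks.  (A multi-writer version would first query a quorum for the
        # latest time-stamp instead of using a local counter.)
        self.counter += 1
        ts = (self.counter, self.cid)
        acks = [s.put(ts, value) for s in self.servers][:self.quorum]
        assert len(acks) == self.quorum             # stands in for "wait for quorum"

    def read(self):
        # Phase 1: query; take the latest (time-stamp, value) from a quorum.
        responses = [s.get() for s in self.servers][:self.quorum]
        ts, value = max(responses, key=lambda r: r[0])
        # Phase 2: write the latest value back to a quorum before returning,
        # so any read that starts later sees a value at least this recent.
        acks = [s.put(ts, value) for s in self.servers][:self.quorum]
        assert len(acks) == self.quorum
        return value


servers = [Server() for _ in range(5)]
Client(1, servers).write("v1")
print(Client(2, servers).read())                    # -> v1
```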


Coding for atomic shared memory emulation

The ABD algorithm (summary)

  • The ABD algorithm ensures atomic operations.

  • Termination of operations is ensured as long as a majority of server nodes do not fail.

  • Implication: A networked distributed storage system can be used as shared memory.

  • Replication to ensure failure tolerance.


Coding for atomic shared memory emulation

Performance Analysis

[Table: storage cost, communication cost (write), and communication cost (read) of the ABD algorithm]

  • f represents number of failures

  • A lower-communication-cost algorithm appears in [Fan-Lynch 03]




Coding for atomic shared memory emulation

Shared Memory Emulation – Erasure Coding

  • [Hendricks-Ganger-Reiter 07, Dutta-Guerraoui-Levy 08, Dobre et al. 13, Androulaki et al. 14]

  • New algorithm with a formal analysis of costs

  • Outperforms previous algorithms in certain aspects

    • Previous algorithms incur infinite worst-case storage costs

    • Previous algorithms incur large communication costs




Coding for atomic shared memory emulation

Erasure Coded Shared Memory

Smaller packets, smaller overheads

Example: (6,4) MDS code

  • Value recoverable from any 4 coded packets

  • Size of coded packet is ¼ size of value

  • New constraint: need 4 packets with the same time-stamp
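A toy Reed-Solomon-style construction shows how such an (n, k) MDS code works: the value is split into k chunks, the degree-(k − 1) polynomial through those chunks is evaluated at n points, and any k coded packets recover the value. The prime field, parameters, and function names below are illustrative choices of this sketch, not the talk’s construction.

```python
P = 2**31 - 1          # arithmetic over the prime field GF(P)

def lagrange_eval(points, x):
    """Evaluate at x the unique degree-(len(points)-1) polynomial through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(chunks, n):
    """chunks: the value split into k field elements.  Coded packet i is the
    interpolating polynomial evaluated at x = i (systematic for i < k)."""
    points = list(enumerate(chunks))
    return [lagrange_eval(points, x) for x in range(n)]

def decode(packets, k):
    """packets: any k pairs (server index, coded packet).  Recovers the value."""
    return [lagrange_eval(packets, x) for x in range(k)]

# (6, 4) example: 6 coded packets, each 1/4 the size of the value,
# and any 4 of them suffice to recover it.
value = [11, 22, 33, 44]
packets = encode(value, 6)
recovered = decode([(1, packets[1]), (3, packets[3]),
                    (4, packets[4]), (5, packets[5])], 4)
assert recovered == value
```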


Coding for atomic shared memory emulation

Coded Shared Memory – Quorum set up

[Figure: write clients, read clients, and six servers]

Quorum set: every subset of 5 server nodes.

Any two quorum sets intersect in at least 4 nodes (5 + 5 − 6 = 4), enough to decode the (6,4) MDS code.

Algorithm works if at least one quorum set is available.
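A one-line check of this intersection property (with 6 servers, matching the (6,4) example):

```python
from itertools import combinations

servers = range(6)
quorums = [set(q) for q in combinations(servers, 5)]
# Any two size-5 quorums of 6 servers share at least 5 + 5 - 6 = 4 servers,
# i.e. enough packets with a common time-stamp to decode the (6,4) MDS code.
assert all(len(a & b) >= 4 for a in quorums for b in quorums)
```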




Coding for atomic shared memory emulation

Coded Shared Memory – Why is it challenging?

[Figure: a read client querying the servers while writes are in progress]

Servers store multiple versions.

Challenges: reveal a version to readers only when enough coded elements have propagated; discard old versions safely.

Solutions: write in multiple phases; store all the write-versions concurrent with a read.




Coding for atomic shared memory emulation

Coded Shared Memory – Protocol overview

Write:

Send time-stamped value to every server; send finalize message after getting acks from quorum; return after receiving acks from quorum.

Read:

Send read query; wait for time-stamps from a quorum;

Send request with latest time-stamp to servers;

decode and return value after receiving acks/symbols from quorum.

Servers:

Store the coded symbol; keep the latest δ codeword symbols and delete older ones; send ack.

Set finalize flag for time-stamp on receiving finalize message; send ack.

Respond to read query with the latest finalized time-stamp.

Finalize the requested time-stamp; respond to read request with codeword symbol if it exists, else send ack.
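A minimal sketch of the server-side state implied by this description (class and method names are mine, not the paper’s pseudocode): each server keeps a bounded map from time-stamps to coded symbols plus a set of finalized time-stamps.

```python
class CodedServer:
    """Per-server state: the latest `delta` (time-stamp -> coded symbol)
    pairs and a set of finalized time-stamps."""
    def __init__(self, delta):
        self.delta = delta
        self.symbols = {}                 # time-stamp -> codeword symbol
        self.finalized = set()

    def on_write(self, ts, symbol):
        # Store the coded symbol; keep the latest delta symbols, delete older.
        self.symbols[ts] = symbol
        for old in sorted(self.symbols)[:-self.delta]:
            del self.symbols[old]
        return "ack"

    def on_finalize(self, ts):
        # Set the finalize flag for this time-stamp; send ack.
        self.finalized.add(ts)
        return "ack"

    def on_read_query(self):
        # Respond with the latest finalized time-stamp.
        return max(self.finalized, default=None)

    def on_read_request(self, ts):
        # Finalize the requested time-stamp; respond with the codeword
        # symbol if the server still holds it, else just an ack.
        self.finalized.add(ts)
        return self.symbols.get(ts, "ack")


s = CodedServer(delta=2)
s.on_write((1, "w1"), "sym1")
s.on_write((2, "w1"), "sym2")
s.on_write((3, "w1"), "sym3")
print(sorted(s.symbols))                  # only the 2 latest time-stamps remain
```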


Coding for atomic shared memory emulation

Coded Shared Memory – Protocol overview

  • Use (N,k) MDS code, where N is the number of servers

  • Ensures atomic operations

  • Termination of operations is ensured as long as

    • Number of failed nodes smaller than (N-k)/2

    • Number of writes concurrent with a read smaller than δ


Performance Comparisons

[Table: storage cost, communication cost (write), and communication cost (read) of the coded algorithm compared with ABD]

  • N represents number of nodes, f represents number of failures

  • δ represents maximum number of writes concurrent with a read




Proof Steps

  • After every operation terminates,

    - there is a quorum of servers with the codeword symbol

    - there is a quorum of servers with the finalize label

    - because every pair of quorums intersects in at least k servers, readers can decode the value

  • When a codeword symbol is deleted at a server

    • Every operation that wants that time-stamp has terminated

    • (Or the concurrency bound is violated)


Coding for atomic shared memory emulation

Main Insights

  • Significant savings on network traffic overheads

    • Reflects the classical gain of erasure coding over replication

  • (New Insight) Storage overheads depend on client activity

    • Storage overhead proportional to the no. of writes concurrent with a read

    • Better than classical techniques for moderate client activity




Storage Costs

[Figure: storage overhead vs. number of writes concurrent with a read, comparing our algorithm with ABD]

What is the fundamental cost curve?




Coding for atomic shared memory emulation

Future Work – Many open questions

  • Refinements of our algorithm

    • (Ongoing) More robustness to client node failures

  • Information theoretic bounds on costs

    • New coding schemes

  • Finer network models, finer source models

    • Erasure channels, different topologies, wireless channels

    • Correlations across versions

  • Dynamic networks

    • Interesting replication-based algorithm in [Gilbert-Lynch-Shvartsman 03]

    • Study of costs in terms of network dynamics

