Consistency and Replication

Chapter 6

Part II: Distribution Models & Consistency Protocols

Distribution Models
  • How do we propagate updates between replicas?
  • Replica Placement:
    • Where is a replica placed?
    • When is a replica created?
    • Who creates the replica?
Types of Replicas
  • The logical organization of different kinds of copies of a data store into three concentric rings.
Permanent Replicas
  • Initial set of replicas
    • Other replicas can be created from them
    • Small and static set
  • Example: Web site horizontal distribution
    • Replicate Web site on a limited number of machines on a LAN
      • Distribute requests among them round-robin (see the sketch below)
    • Replicate Web site on a limited number of machines on a WAN (mirroring)
      • Clients choose which sites to talk to
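A minimal Python sketch of the round-robin case, assuming a front end that cycles incoming requests over a fixed list of permanent replicas (the host names are invented):

    from itertools import cycle

    class RoundRobinFrontEnd:
        """Hands each incoming request to the next replica in a fixed set."""
        def __init__(self, replicas):
            self.order = cycle(replicas)   # permanent replicas: small, static set

        def pick(self):
            return next(self.order)        # each replica serves every n-th request

    fe = RoundRobinFrontEnd(["lan-host-1", "lan-host-2", "lan-host-3"])
    for _ in range(4):
        print(fe.pick())   # lan-host-1, lan-host-2, lan-host-3, lan-host-1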
Server-Initiated Replicas (1)
  • Dynamically created at the request of the owner of the DS
  • Example: push-caches
    • Web hosting servers can dynamically create replicas close to the demanding client
  • Need a dynamic policy to create and delete replicas
  • One such policy is to keep track of Web page hits
    • Keep a counter and access-origin list for each page
Server-Initiated Replicas (2)
  • Counting access requests from different clients.
Server-Initiated Replicas (3)
  • Each server can determine which is the closest server to a client
  • Hits from clients who have the same closest server will go to the same counter
  • cntQ(P,F) = count of hits to replica F hosted at server Q, initiated by P’s clients
  • If cntQ(P,F) drops below a threshold D, delete replica F from Q
Server-Initiated Replicas (4)
  • If cntQ(P,F) exceeds a threshold R, replicate or migrate F to P
  • Choose D and R with care (D < R); a sketch of this policy follows below
  • Do not allow deletion of permanent replicas
  • If for some server P, cntQ(P,F) exceeds more than half the total requests for F, migrate F from Q to P
    • What if it cannot be migrated to P?
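The policy above can be sketched as one decision function. Everything here is illustrative: the counter layout (cnt_Q maps each closest-server P to cntQ(P,F)), the threshold values, and the use of the total hit count for the deletion test are all assumptions.

    def decide(cnt_Q, F, D, R, is_permanent):
        """cnt_Q[P] = cntQ(P,F): hits on replica F at this server Q coming
        from clients whose closest server is P. Assumes D < R."""
        total = sum(cnt_Q.values())
        for P, hits in cnt_Q.items():
            if hits > total / 2:
                return ("migrate", F, P)    # most of F's requests arrive via P
            if hits > R:
                return ("replicate", F, P)  # enough demand near P for a copy
        if total < D and not is_permanent:
            return ("delete", F)            # permanent replicas are never deleted
        return ("keep", F)

    print(decide({"P1": 80, "P2": 10}, "F", D=5, R=40, is_permanent=False))
    # ('migrate', 'F', 'P1'): P1's clients account for more than half the hits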
Client-Initiated Replicas
  • These are caches
    • Temporary storage (expire fast)
  • Managed by clients
  • Cache hit: return data from cache
  • Cache miss: load copy from original server
  • Kept on the client machine or on the same LAN (see the cache sketch below)
  • Multi-level caches
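A minimal sketch of such a cache, assuming a fetch callback that loads a copy from the original server and an invented TTL for fast expiry:

    import time

    class ClientCache:
        """Client-initiated replica: temporary copies that expire quickly."""
        def __init__(self, fetch, ttl=30.0):
            self.fetch = fetch    # loads a copy from the original server on a miss
            self.ttl = ttl        # short lifetime: these are caches, not replicas
            self.store = {}       # key -> (value, expiry)

        def read(self, key):
            entry = self.store.get(key)
            if entry and entry[1] > time.monotonic():
                return entry[0]                   # cache hit: serve locally
            value = self.fetch(key)               # cache miss: go to the server
            self.store[key] = (value, time.monotonic() + self.ttl)
            return value

    cache = ClientCache(fetch=lambda k: f"<contents of {k}>")
    cache.read("/index.html")   # miss: fetched from the origin
    cache.read("/index.html")   # hit: served from the local copy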
Design Issues for Update Propagation
  • A client sends an update to a replica. The update is then propagated to all other replicas
  • Propagate state or operation
  • Pull or Push protocols
  • Unicast or multicast propagation
State versus Operation Propagation (1)
  • Propagate a notification of update
    • Invalidation protocols
      • When data item x is changed at a replica, it is invalidated at the other replicas
      • An attempt to read the item causes an “item fault,” triggering an update of the local copy before the read can complete
      • The update is done according to the consistency model supported
    • Uses little network bandwidth
    • Good when read-to-write ratio is low
State versus Operation Propagation (2)
  • Transfer modified data
    • Uses high network bandwidth
    • Good when read-to-write ratio is high
  • Propagate the update operation
    • Each replica must have a process capable of performing the update
    • Uses very low network bandwidth
    • Good when read-to-write ratio is high
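The three choices above can be contrasted by what the message carries. A sketch with invented message shapes; the operation table OPS is hypothetical and stands for the process at each replica that can perform the update:

    # Each replica must be able to re-execute an operation locally
    OPS = {"incr": lambda store, x, d: store.update({x: store.get(x, 0) + d})}

    def apply_msg(msg, store):
        if msg[0] == "invalidate":      # notification only: smallest message
            store.pop(msg[1], None)     # next read "item-faults" and re-fetches
        elif msg[0] == "state":         # modified data: largest message
            _, x, value = msg
            store[x] = value
        elif msg[0] == "op":            # the operation itself: small message,
            _, name, args = msg         # but every replica must re-execute it
            OPS[name](store, *args)

    store = {"x": 1}
    apply_msg(("op", "incr", ("x", 2)), store)   # x becomes 3
    apply_msg(("state", "x", 10), store)         # x becomes 10
    apply_msg(("invalidate", "x"), store)        # x dropped until the next read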
Pull versus Push
  • Push: updates are propagated to other replicas without solicitation
    • Typically, from permanent to server-initiated replicas
  • Used to achieve a high degree of consistency
  • Efficient when read-to-write ratio is high
  • Pull: A replica asks another for an update
    • Typically used by client-initiated replicas
  • Efficient when read-to-write ratio is low
  • Cache misses increase response time
Pull versus Push Protocols
  • A comparison between push-based and pull-based protocols in the case of multiple client, single server systems.
Unicast versus Multicast
  • Unicast: a replica sends n−1 separate updates, one to each other replica
  • Multicast: a replica sends one message to multiple receivers
    • Network takes care of multicasting
    • Can be cheaper
    • Suits push-based protocols
Consistency Protocols
  • A consistency protocol is an implementation of a specific consistency model
  • Consider an implementation where each server (site) has a complete copy of the data store
    • The data store is fully replicated
  • Will present simple protocols that implement P-RAM (pipelined RAM), CC (causal consistency), and SC (sequential consistency)
  • Exercise: suggest protocols for WC and Coherence
Consistency Protocols – Design Issues
  • Primary replica protocols
    • Each data item has a primary replica, an owner, denoted x.prime
    • x is not replicated, it lives at site x.prime
  • Primary replica, remote write protocol
    • All writes to x are sent to x.prime
  • Primary replica, local write protocol
    • Each write is performed locally
      • x migrates (x.prime changes)
      • A process gets exclusive permission to write from x.prime (both disciplines are sketched below)
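A sketch of the two disciplines, assuming an Item record that carries x.prime plus hypothetical forward and acquire_ownership callbacks standing in for the transport:

    class Item:
        def __init__(self, value, prime):
            self.value = value
            self.prime = prime              # x.prime: the site currently owning x

    def remote_write(x, v, here, forward):
        """Primary replica, remote write: every write is shipped to x.prime."""
        if here == x.prime:
            x.value = v
        else:
            forward(x.prime, ("write", v))  # assumed transport to the primary

    def local_write(x, v, here, acquire_ownership):
        """Primary replica, local write: migrate ownership, then write locally."""
        if here != x.prime:
            acquire_ownership(x, here)      # exclusive permission from old x.prime
            x.prime = here                  # x migrates
        x.value = v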
Design Issues – Distribution

[Diagram: four sites p, q, s, t holding items X,Y,Z / A,B,C / E,F,G / U,V,W respectively. Only p has X: all operations on X must be forwarded to p, and X can migrate.]

Design Issues – Primary Fixed Replica

[Diagram: sites p, q, s, t each hold replicas of A, B, C, D. Only p can write to A; all writes to A must be forwarded to p; the others can read A locally.]

Design Issues – Primary Variable Replica

[Diagram: sites p, q, s, t each hold replicas of A, B, C, D. p is the current owner of A; all writes to A must be forwarded to p; ownership can change.]

Design Issues – Free Replication

[Diagram: sites p, q, s, t each hold replicas of A, B, C, D. Any process can write to A; writes are propagated to the other replicas.]

System Model (1)

[Diagram: processes p, q, s, t, each holding a complete replica of the data store: DSp, DSq, DSs, DSt.]

System Model (2)
  • Each process (n processes) maintains:
    • DSp, a complete replica of DS
      • A data item x in replica DSp is denoted by DSp.x
    • OutQp, FIFO queue for outgoing messages
    • InQp, queue for incoming messages (need not be FIFO)
  • Sending/receiving (removing from OutQ/adding to InQ) messages is handled by independent daemons
Sending and Receiving Messages
  • sendp(): /* executed infinitely often */
  • If not OutQp.empty() then
  • remove message m from OutQp
  • send m to each q ≠ p
  • receivep():
  • upon receiving a message m from q
  • add m to InQp
  • Assume each m contains exactly one write
  • Better: send a bunch of writes in one message (a daemon sketch follows)
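A sketch of these daemons, assuming threads and a dict of peer sites standing in for reliable channels:

    import queue
    import threading

    class Site:
        """One process p with its message queues; the transport is simulated."""
        def __init__(self, pid):
            self.pid = pid
            self.peers = {}              # pid -> Site, filled in after creation
            self.out_q = queue.Queue()   # OutQp: FIFO queue of outgoing messages
            self.in_q = queue.Queue()    # InQp: incoming messages

        def send_daemon(self):
            """sendp(): runs forever in an independent daemon thread."""
            while True:
                m = self.out_q.get()             # blocks while OutQp is empty
                for q, site in self.peers.items():
                    if q != self.pid:
                        site.in_q.put(m)         # send m to each q ≠ p

    # Usage: create the sites, wire them up, start one daemon per site
    sites = {i: Site(i) for i in range(3)}
    for s in sites.values():
        s.peers = sites
        threading.Thread(target=s.send_daemon, daemon=True).start()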
P-RAM Protocol - (P-RAM) Messages
  • Messages have the format:
    • <write,p,x,v>, where p is the process id, x a data item, and v a value:
    • Representing writep(x)v
  • InQp is a FIFO queue, for each process p
  • Let Tp be a sequence of operations for process p
P-RAM Protocol - (P-RAM) Read and Write Implementation
  • readp(x)?:
  • return DSp.x
  • /* add rp(x)* to Tp */
  • writep(x)v:
  • DSp.x = v
  • add <write,p,x,v> to OutQp
  • /* add wp(x)v to Tp */
P-RAM Protocol - (P-RAM) Apply Implementation
  • applyp(): /* to apply remote writes locally */
  • /* executed infinitely often */
  • If not InQp.empty() then
  • let <write,q,x,v> be at the head of InQp
  • remove <write,q,x,v> from InQp
  • DSp.x = v
  • /* Add wq(x)v to Tp */
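The (P-RAM) routines above, sketched for a single process; plain deques stand in for OutQp and for the network's FIFO delivery:

    from collections import deque

    class PRamSite:
        """(P-RAM) for one process p: local reads, locally applied writes
        queued for propagation, remote writes applied in FIFO order."""
        def __init__(self, pid):
            self.pid = pid
            self.ds = {}            # DSp: complete local replica
            self.out_q = deque()    # OutQp
            self.in_q = deque()     # InQp: FIFO, so each q's writes stay ordered

        def read(self, x):
            return self.ds.get(x)   # readp(x): purely local, never blocks

        def write(self, x, v):
            self.ds[x] = v          # perform locally first ...
            self.out_q.append(("write", self.pid, x, v))   # ... then propagate

        def apply_one(self):
            """applyp(): apply one remote write from the head of InQp."""
            if self.in_q:
                _, q, x, v = self.in_q.popleft()
                self.ds[x] = v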
(P-RAM) – Correctness
  • Claim: any computation of (P-RAM) satisfies P-RAM
  • Proof:
  • Let (O|p ∪ O|w, <p) be Tp for each process p
  • Clearly, program order is maintained, Tp is valid, and contains all of O|p ∪ O|w
CC Protocol – (CC)
  • Need an additional variable for each process p:
    • vtp[n], a vector timestamp for write operations
    • If vtp[q] = k, then p thinks q has performed k writes
  • Messages now have the structure:
    • <write,q,x,v,t> representing writeq(x)v with vector timestamp t
  • InQp is a priority queue based on vt values
  • Add a new message,
    • in front of all the messages with higher vt
    • behind the messages that have incomparable vt
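A sketch of the priority insert, assuming vector timestamps are equal-length lists and InQp is a plain Python list. Where the two placement rules conflict, this version stops at the first queued message whose timestamp strictly dominates the new one:

    def vt_leq(a, b):
        """Componentwise a ≤ b; two vts are incomparable if neither holds."""
        return all(x <= y for x, y in zip(a, b))

    def enqueue(in_q, msg):
        """msg = ("write", q, x, v, t): put it in front of messages with a
        strictly higher vt, behind lower or incomparable ones."""
        t = msg[4]
        for i, m in enumerate(in_q):
            if vt_leq(t, m[4]) and t != m[4]:   # m's vt strictly dominates t
                in_q.insert(i, msg)
                return
        in_q.append(msg)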
(CC) – Read and Write Implementation
  • readp(x)?:
  • return DSp.x
  • (read operation has timestamp of vtp)
  • /* Add rp(x)* to Tp */
  • writep(x)v:
  • vtp[p]++ /* One more write by p*/
  • DSp.x = v
  • add <write,p,x,v,vtp> to OutQp
  • (write operation has timestamp of vtp)
  • /* Add wp(x)v to Tp */
(CC) – Apply Implementation
  • applyp(): /* to apply remote writes locally */
  • If not InQp.empty() then
  • let <write,q,x,v,t> be at the head of InQp
  • if t[k] ≤ vtp[k] for each k ≠ q
  • and t[q] == vtp[q] + 1 then
  • remove <write,q,x,v,t> from InQp
  • vtp[q] = t[q]
  • DSp.x = v
  • (write operation has timestamp of t)
  • /* Add wq(x)v to Tp */
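The apply guard above, sketched with list-based queues. Note that the first condition quantifies over k ≠ q, the writer; for k = q the separate check t[q] == vtp[q] + 1 applies:

    def can_apply(t, vt_p, q):
        """True once p has applied every write this one causally depends on
        and this is the next write by q."""
        return (t[q] == vt_p[q] + 1
                and all(t[k] <= vt_p[k] for k in range(len(t)) if k != q))

    def apply_cc(in_q, ds_p, vt_p):
        """applyp(): try to apply the write at the head of InQp."""
        if in_q:
            _, q, x, v, t = in_q[0]
            if can_apply(t, vt_p, q):
                in_q.pop(0)
                vt_p[q] = t[q]    # p now counts one more write by q
                ds_p[x] = v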
if t[k] ≤ vtp[k] for each k ≠ q and t[q] == vtp[q] + 1 then
  • Not (t[k] ≤ vtp[k] for each k ≠ q)
    • There is a j ≠ q such that t[j] > vtp[j]
    • There is at least one write by j that q knows of, but p does not
    • If so, p must wait
  • t[q] == vtp[q] + 1
    • To apply this write of q with t[q] = m, it must be the case that the last write by q that p applied had timestamp m – 1
(CC) – Timestamps Reflect Causality
  • Observation 1: if o1 <co o2 then o1.vt ≤ o2.vt
    • If o1 <po o2: processes never decrement vt
    • If o1 <wbr o2: after p applies o1, vtp ≥ o1.vt, and any subsequent operation by p has a timestamp of at least vtp
    • Transitivity: there is some operation o’ such that o1 <co o’ <co o2; by induction o1.vt ≤ o’.vt ≤ o2.vt, and since ≤ is transitive, o1.vt ≤ o2.vt

  • Observation 2: if o1 <co o2 and o2 is a write then o1.vt < o2.vt
    • Proof similar to Observation 1
(CC) – Liveness (1)
  • Observation 3: If w is a write by p then each q eventually applies w
    • If q = p, trivial
    • If q ≠ p, then w is either:
      • In OutQp: will be eventually sent (send is executed infinitely often)
      • In transit: will be delivered to q (channels are reliable)
      • In InQq: need to argue that it will eventually be applied
      • Applied by q: we’re done
(CC) – Liveness (2)
  • Any w will eventually be applied by q
  • When q adds w to InQq, InQq has a finite number of writes
  • After w is added to InQq, more writes with lower vt can be added
    • There is only a finite number of such writes
  • w eventually reaches the head of InQq
(CC) – Liveness (3)
  • A write w by p at the head of InQq is eventually applied by q
  • Let vt be vtq when w is at the head of InQq
  • We must show that vt[k] ≥ w.vt[k] for all k ≠ p and vt[p] + 1 == w.vt[p]
(CC) – Liveness (4): vt[k] ≥ w.vt[k] for all k ≠ p
  • Let k be a process other than p
  • Let w’ be the w.vt[k]th write by k
  • p applies w’ before it performs w
  • This implies that w’.vt < w.vt
  • Thus, w’ is ahead of w in InQq
  • When w is at the head of InQq, w’ is already applied
  • Once q applies w’, vt[k]  w.vt[k]
(CC) – Liveness (5): vt[p] + 1 = w.vt[p]
  • Let w’’ be the (w.vt[p] – 1)st write by p
  • By Observation 1, w’’.vt ≤ w.vt
  • Thus, w’’ is ahead of w in InQq
  • w’’ must have been already applied
  • Therefore, vt[p] = w.vt[p] – 1
(CC) – Correctness (1)
  • Claim: any computation of (CC) satisfies CC
  • Proof:
  • Let (O|p ∪ O|w, <p) be Tp for each process p
  • By Observation 3, Tp contains all of O|p ∪ O|w
  • By memory property, Tp is valid
  • Tp is acyclic (exercise)
  • Need to show that (O|p ∪ O|w, <co) ⊆ Tp
(CC) – Correctness (2): (O|p ∪ O|w, <co) ⊆ Tp
  • Let o1 and o2 be two operations such that o1 <co o2; we show that o1 precedes o2 in Tp [let q ≠ p]
  • By Observation 1, o1.vt ≤ o2.vt
  • o1,o2 ∈ O|p: trivial from (CC)
  • o1 ∈ O|q|w & o2 ∈ O|p: p sets its vt[q] = o1.vt[q] after applying o1. Since o2.vt[q] ≥ o1.vt[q], o2 can only occur after o1
  • o1 ∈ O|p|w & o2 ∈ O|q|w: o1.vt ≤ o2.vt ⇒ o1.vt[p] ≤ o2.vt[p] ⇒ q applies o1 before performing o2. p applies o2 after q performs it, and applies o1 before q does ⇒ p applies o1 before o2
  • o1 ∈ O|p|r & o2 ∈ O|q|w: it must be that o1 <co o’ <co o2, where o’ is a write; Cases 1 and 3 apply
  • o1,o2 ∈ O|q|w: exercise (Hint: Observation 2 ⇒ o1.vt < o2.vt)
SC Protocol – (SC)
  • Arrange processes on a logical ring
  • Circulate a special message (token) <token>
  • Introduce another message type <ack,p,x,v>
    • An ack to p that wp(x)v has been applied
  • Protocol uses the token-ring to apply writes in a mutual exclusive manner
  • Another way is to enforce a total order on all read and write operations using timestamps (exercise)
(SC) – Read and Write Implementation
  • readp(x)?:
  • return DSp.x
  • writep(x)v:
  • wait for <token>
  • DSp.x = v
  • add <write,p,x,v> to OutQp
  • wait for <ack,p,x,v> from each q
  • add <token> to OutQp
(SC) – Apply Implementation
  • applyp(): /* to apply remote writes locally */
  • If not InQp.empty() then
  • let <write,q,x,v> be at the head of InQp
  • remove <write,q,x,v> from InQp
  • DSp.x = v
  • add <ack,q,x,v> to OutQp
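A sketch of the (SC) write path. A shared threading.Lock stands in for holding the circulating <token>, and a broadcast callback stands in for OutQp; routing acks back to the writer is left to that assumed transport:

    import queue
    import threading

    class ScSite:
        def __init__(self, pid, n, token, broadcast):
            self.pid, self.n = pid, n
            self.ds = {}                 # DSp
            self.token = token           # shared Lock: "holding the token"
            self.acks = queue.Queue()    # acks addressed to this site
            self.broadcast = broadcast   # assumed: deliver a message to all others

        def write(self, x, v):
            with self.token:                                # wait for <token>
                self.ds[x] = v
                self.broadcast(("write", self.pid, x, v))
                for _ in range(self.n - 1):                 # an ack from each q
                    self.acks.get()
            # leaving the with-block releases (passes on) the token

        def apply(self, msg):
            """Apply a remote write, then acknowledge it to the writer."""
            _, q, x, v = msg
            self.ds[x] = v
            self.broadcast(("ack", q, x, v))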
(SC) – Correctness
  • Claim: any computation of (SC) satisfies SC
  • Note that any computation of (SC) satisfies P-RAM
(SC) Provides SC

[Diagram: the per-process views (O|p ∪ O|w, <p) and (O|q ∪ O|w, <q), the agreed write order (O|w, <w), and the total order (O, <) all order the writes identically: w3, w5, w1.]

  • All processes agree on one order of writes (O|w, <w)
    • (O|w,<p) = (O|w,<q) = (O|w,<w), for each pair p,q
  • To get (O,<) s.t. (O,<po) ⊆ (O,<):