Consistency & Replication I

Consistency & Replication I CSE5306 Lecture Quiz due 9 July 2014 at 5 PM

Consistency & Replication • What does “replicated data” mean? • How is it useful? • How can we be sure all replicas stay the same through multiple, simultaneous accesses?

R U O K ? 1. Why is it important that data be consistent? • Inconsistent law enforcement leads good people to disrespect laws. • Different prices on the clothing rack and at the cash register can make customers angry or make stores lose money. • Skewed timing standards among GPS satellites can lead to military aircraft targeting errors. • All of the above. • None of the above.

Reasons for Replication • Reliability—replicating three copies of a file keeps two accessible when one goes down, and it enables two identical copies to vote on repairs of a corrupted one. • Performance—replicating one overloaded server multiplies system performance, and placing replicated servers closer to widely-spaced users reduces access times. • Consistency is costly—updating a stale remotely-located web page too frequently wastes network bandwidth, and it over-burdens the source server.
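The reliability point above (two identical copies voting on the repair of a corrupted one) can be sketched as a simple majority vote. This is a minimal illustration only; the function name `repair_by_vote` and the sample byte strings are assumptions, not part of the lecture.

```python
from collections import Counter

def repair_by_vote(copies):
    """Overwrite any minority copy with the value held by a strict majority."""
    majority_value, votes = Counter(copies).most_common(1)[0]
    if votes <= len(copies) // 2:
        raise ValueError("no strict majority; cannot repair")
    return [majority_value] * len(copies)

# Hypothetical data: the third copy has been corrupted.
replicas = [b"balance=100", b"balance=100", b"balance=1?0"]
repaired = repair_by_vote(replicas)
```

With three copies, one corrupted copy is always outvoted; with only two copies a disagreement could not be resolved, which is why the slide speaks of three.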

R U O K ? 2. Why replicate? • Keeping three copies of a file ensures that two remain available, when one goes down. • Two identical file copies can vote on the repairs of a corrupted file copy. • Replicating an overloaded server multiplies a system’s performance. • Placing replicated servers closer to widely-spaced clients reduces access times. • All of the above.

Replication as Scaling Technique • If stock prices change more frequently than they are remotely accessed, then perhaps the remote server should not be updated as often, or its users should access the source server instead. • Synchronous replication ensures that reading all replicates produces the same results; i.e., “tight consistency,” (almost) atomic transactions. • Lamport (pp.244-52) showed that properly ordering events is a cheap but effective substitute for global synchronization of updates.
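The event ordering credited to Lamport above is usually realized with logical clocks. The sketch below assumes the standard two rules (increment on a local event or send; on receipt, jump past the sender's timestamp). The class and method names are illustrative assumptions.

```python
class LamportClock:
    """A logical clock: counts events instead of measuring real time."""
    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the clock by one."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Message receipt: move past both the local and the sender's time."""
        self.time = max(self.time, sent_time) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
t_send = p1.tick()            # P1 sends an update stamped 1
p2.tick()
p2.tick()                     # P2 performs two local events (clock = 2)
t_recv = p2.receive(t_send)   # max(2, 1) + 1 = 3
```

Replicas that apply updates in timestamp order (breaking ties by process id) agree on one order without synchronizing their physical clocks.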

R U O K ? 3. What about implementing synchronous replication? • If data change more frequently than they are remotely accessed, then perhaps the remote server should not be updated as often, or its users should access the source server instead. • Synchronous replication ensures that reading all replicates produces the same results. • Properly ordering events is a cheap but effective substitute for global synchronization of updates. • All of the above. • None of the above.

Data-Centric Consistency Models • Data store—memory devices physically distributed across many machines, each of which has local copies of its users’ files. • “Write” operations propagate local changes to remote copies, and “read” operations make remote changes available to local users. • Consistency model—a contract between processes and the store, saying the “last” remote write is evident to every local read. • Easier-to-use models generally perform more poorly.

R U O K ? Match the following terms with their definitions below. 4. Data store __ 5. Write operation __ 6. Read operation __ 7. Consistency model __ 8. Easier-to-use models __ • Generally poorer performance. • Make remote changes available to local users. • Memory devices physically distributed across many machines, each of which has local copies of its users’ files. • A contract between processes and a shared store that says the “last” remote write is evident to every local read. • Propagate local changes to remote copies.

Continuous Consistency • Applications dictate how to loosen consistency for more efficient replication. • Continuous consistency measurement ranges: • Replica numerical value deviations; e.g., stock market price absolute or relative (percentage) differences, value (weight) of delayed changes made in a web page. • Replica staleness deviations; e.g., days for weather reports, nanoseconds for algorithm stock traders. • Replica update ordering deviations; e.g., write replica upon arrival, possibly reorder after winning global agreement.

R U O K ? 9. What measurements typically are used to evaluate consistency among replicas? • numerical value deviations. • Replica staleness deviations. • update ordering deviations. • All of the above. • None of the above.

The Notion of a Conit • A “conit” is a consistency unit of measure; e.g., the Dow Jones Industrials average, today’s weather report. • In the figure above, conits = final local current values of x & y. • Operation = <time, source>:equation; e.g., <5, B> : x <- x + 2 (shading indicates permanent, committed, cannot be rolled back). A has not seen <10, B> : y <- y + 5. • Result = ( x, y ) locally initialized to (0, 0), then apply operations. • A vector clock keeps events in causal order (Fig. 6-13, p.251): vector clock A = (last A time + 1, last seen B time) = (15, 5). vector clock B = (unknown A time, last B time + 1) = (0, 11). • Order deviation (number of unshaded pending update operations): A’s orddev = 3, B’s orddev = 2. • Numerical deviation, (unseen operations, abs(max(committed local values – final remote values))): A’s numdev = (1, abs(max((2,0) – (2,5)))) = (1, 5) and B’s = (3, abs(max((0,0) – (6,3)))) = (3, 6).
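The deviation figures on this slide can be recomputed with a short sketch. The two operations from B are the ones quoted on the slide. The three pending operations at A are assumptions, chosen so that A's final values come out as (6, 3) as the slide states. The slide's abs(max(...)) is read here as the largest absolute difference between corresponding values.

```python
# Each operation: (time, source, update of the (x, y) state, committed locally?)
ops_at_A = [
    (5,  "B", lambda x, y: (x + 2, y), True),   # from the slide; committed at A
    (8,  "A", lambda x, y: (x, y + 2), False),  # assumed pending operation
    (12, "A", lambda x, y: (x, y + 1), False),  # assumed pending operation
    (14, "A", lambda x, y: (y * 2, y), False),  # assumed pending operation
]
ops_at_B = [
    (5,  "B", lambda x, y: (x + 2, y), False),  # from the slide; pending at B
    (10, "B", lambda x, y: (x, y + 5), False),  # from the slide; not yet seen by A
]

def run(ops, committed_only=False):
    """Start from (0, 0) and apply the (optionally only committed) operations."""
    state = (0, 0)
    for _, _, update, committed in ops:
        if committed or not committed_only:
            state = update(*state)
    return state

def order_deviation(ops):
    """Number of pending (uncommitted) operations at a replica."""
    return sum(1 for op in ops if not op[3])

def numerical_deviation(local_ops, remote_ops):
    """(operations not yet seen locally, largest |committed local - final remote|)."""
    seen = {(t, src) for t, src, _, _ in local_ops}
    unseen = sum(1 for t, src, _, _ in remote_ops if (t, src) not in seen)
    committed = run(local_ops, committed_only=True)
    remote_final = run(remote_ops)
    largest = max(abs(c - r) for c, r in zip(committed, remote_final))
    return (unseen, largest)

orddev_A, orddev_B = order_deviation(ops_at_A), order_deviation(ops_at_B)
numdev_A = numerical_deviation(ops_at_A, ops_at_B)
numdev_B = numerical_deviation(ops_at_B, ops_at_A)
```

Under these assumptions the sketch reproduces the slide's values: order deviations 3 and 2, numerical deviations (1, 5) and (3, 6).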

R U O K ? Match the following conit-related terms with their definitions below. 10. Conit __ 11. Operation __ 12. Result __ 13. Vector clock __ 14. Order deviation __ 15. Numerical deviation __ • <time, source>:equation. • A = (last A time + 1, last seen B time). • (number of unshaded pending update operations). • Locally initialize ( x, y ) to (0, 0), then apply operations. • Consistency unit of measure, final local current values of x & y. • (unseen operations, abs(max(committed local values – final remote values))).

The Notion of a Conit (continued) • Conits should not be too coarse or too fine grained: • Few coarse conits; e.g., an entire database: replicates must update when every tiny detail changes (above left). • Replicates may even “falsely share” a coarse conit that holds unrelated data. • Very many fine conits; e.g., only occasional address changes prompt updates (above right): • Managing a great many conits reduces server performance. • Continuous consistency toolkit helps manage conits: • Create protocols to enforce consistency requirements: • DependsOnConit(ConitQ, 4, 0, 60); // limit numdev, orddev, staleness to 4, 0 & 60sec • Read message m from head of queue Q; • App developers must specify consistency requirements: • AffectsConit(ConitQ, 1, 1); // define conit • Append message m to queue Q;

R U O K ? 16. How can we most effectively adjust a conit’s precision? • Not too coarse, so as to avoid updating replicates with every tiny detail change. • Not too fine, because managing a great many conits reduces server performance. • Rely upon a continuous consistency toolkit to create protocols for consistency requirement enforcement and to discover those requirements. • All of the above. • None of the above.

Consistent Ordering of Operations • Many physically separate software developers may work together on one white board design; i.e., concurrent programming. • Their shared replicas must agree on a global ordering of their updates; i.e., a sequentially consistent ordering of operations.

Sequential Consistency • Notation: Wi(x)a means process Pi writes the value a into a data item x; Ri(x)b means process Pi reads the value b from a data item x (initially nil). • In the figure above (time runs left to right), P1 writes a and P2 first reads nil locally. When all replicates update, P2’s next read returns a. • A data store is “sequentially consistent” when the result of any execution is the same as if the (read and write) operations by all processes on the data store … • were executed in some sequential order and … • the operations of each individual process appear … • in this sequence • in the order specified by its program.

R U O K ? 17. What proves that a data store is “sequentially consistent”? • The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order. • The operations of each individual process appear in the order specified by its program. • Both of the above. • None of the above.

Sequential Consistency (continued) • Time is unimportant: • See the sequentially consistent store (above left). • See a sequentially inconsistent store (above right), in which not all reads see the same sequence.

R U O K ? 18. How important is real time in maintaining sequential consistency? • It is completely irrelevant. • It is a convenient tool. • It is absolutely essential. • None of the above.

Sequential Consistency (continued) • Consider the three concurrently executing processes P1, P2 and P3 above. • Their writes and reads can interleave in 90 different ways that respect each process’s program order, only four of which appear above. • The output in the order it was actually printed appears as “Prints,” and the outputs of P1, P2 and P3 concatenated in that order appear as “Signature.” • The contract between the processes and the shared data store says the processes must accept all of these as sequentially consistent.
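The count of 90 interleavings can be checked by brute force. The sketch assumes each of the three processes runs exactly two statements (a write followed by a read/print); the statement labels are placeholders, since the figure's programs are not reproduced here.

```python
from itertools import permutations

# Assumption: two statements per process, to be executed in program order.
programs = {
    "P1": ["P1.write", "P1.print"],
    "P2": ["P2.write", "P2.print"],
    "P3": ["P3.write", "P3.print"],
}
statements = [s for body in programs.values() for s in body]

def respects_program_order(sequence):
    """True if every process's statements appear in their program order."""
    position = {stmt: i for i, stmt in enumerate(sequence)}
    return all(
        position[body[0]] < position[body[1]] for body in programs.values()
    )

all_orders = list(permutations(statements))                 # 6! = 720 orderings
valid_orders = [s for s in all_orders if respects_program_order(s)]
```

Of the 720 orderings of six statements, each process's program order is kept in one half of them, so 720 / (2 × 2 × 2) = 90 remain.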

R U O K ? 19. What guarantees that concurrent (interleaved) read-after-write processes are sequentially consistent? • All of them execute in strictly chronological order. • None of them changed to write-after-read. • Each completed before the next began. • All of the above. • None of the above.

Causal Consistency • A causally consistent store obeys the following: • Writes that are potentially causally related … • must be seen by all processes • in the same order. • Concurrent (i.e., causally unrelated) writes … • may be seen in a different order • on different machines. • For example, W1(x)a and R2(x)a are causally related (above), but W2(x)b and W1(x)c are merely concurrent, which allows P3 and P4 to see them in different orders. • A counterexample (below left) shows a violation of causal consistency, because b (written by W2(x)b) may be computed from a (read by R2(x)a), and P3 and P4’s reading orders do not reflect that. • The last example (below right) does not violate causal consistency, because W1(x)a and W2(x)b are merely concurrent.
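One common way to tell causally related writes from concurrent ones is to compare vector timestamps. The sketch below is illustrative only: the timestamps assigned to W1(x)a, W2(x)b and W1(x)c are assumptions (two processes, one counter each), not values taken from the figure.

```python
def happened_before(a, b):
    """True if vector timestamp a causally precedes vector timestamp b."""
    return a != b and all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither timestamp causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

# Assumed timestamps as (P1 counter, P2 counter):
w1_a = (1, 0)   # P1 writes a
w2_b = (1, 1)   # P2 has read a, then writes b, so it carries P1's first event
w1_c = (2, 0)   # P1 writes c without having seen b

a_before_b = happened_before(w1_a, w2_b)   # causally related: same order everywhere
b_and_c = concurrent(w2_b, w1_c)           # concurrent: order may differ per machine
```

A causally consistent store must show a before b to every process, but may show b and c in either order.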

R U O K ? 20. What is the causal consistency contract? • Writes that are potentially causally related must be seen by all processes in the same order. • Concurrent (i.e., causally unrelated) writes may be seen in a different order on different machines. • All of the above. • None of the above.

Grouping Operations • Multiprocessors that share a store enforce mutual exclusion through coded critical sections (CS), instead of synchronizing individual write and read operations. • Upon entering a CS, the processor is guaranteed that its local data store is up to date. • All reads and writes of its owned objects inside the CS are atomic. • No other process can read or write those objects until that processor leaves the CS. • Necessary criteria for correct multiprocessor synchronization: • An acquire access of a synchronization variable is not allowed to perform with respect to a process until all updates to the guarded shared data have been performed with respect to that process. • Before an exclusive mode access to a synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode. • After an exclusive mode access to a synchronization variable has been performed, any other process’s next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable’s owner. • Releases precede acquires in the valid “entry consistency” event sequence shown above.

R U O K ? 21. What conditions assure correct multiprocessor synchronization? • A process may not acquire access to a synchronization variable, until all of that process’ guarded shared data have been updated. • A process is not granted exclusive mode access to a synchronization variable, until all other processes have completely released it. (Others may not even hold it in nonexclusive mode.) • Even after a process has released its exclusive mode access to a synchronization variable, another process still must ask that variable’s former owner for nonexclusive mode access to it. • All of the above. • None of the above.

Consistency vs. Coherence • A consistency model tells what to expect (i.e., how the set of data are consistent), when multiple processes share a data store. • A coherency model tells expectations of a single data item that is replicated on many machines. • If the coherency model is sequentially consistent, for example, all processes will see that data item getting the same sequence of updates.

R U O K ? 22. How are consistency and coherency models different? • A consistency model tells what to expect, when multiple processes share a data store. • A coherency model tells expectations of a single data item that is replicated on many machines. • A coherency model is sequentially consistent, if all processes will see that data item getting the same sequence of updates. • All of the above. • None of the above.

Eventual Consistency • Eventual consistency is a very weak consistency model that hides many of a client’s database reading inconsistencies relatively cheaply. • DNS uses lazy updates, which a client may see long after they happen. • Web caches quickly display out-of-date pages to clients, who prefer speed to accuracy. Updates seldom occur, so multiple clients are satisfied with eventual consistency. • But a mobile user may update one replica and notice her changes have not yet propagated to another replica (figure above). Client-centric consistency can guarantee one client that all replicas of her Bayou database are consistent; i.e., are the same version.

R U O K ? 23. Which of the following accurately characterize Eventual consistency? • It is a very weak consistency model that hides many of a client’s database reading inconsistencies relatively cheaply. • For example, DNS does lazy updates, which clients see long after they happen. • In another example, Web caches quickly display out-of-date pages to clients, who prefer speed to accuracy. • All of the above. • None of the above.

Monotonic Reads • Monotonic-read consistency model assures that a client sees the same value (or a more recent value) every time she reads a data item. • Emails are delivered in a lazy, on-demand fashion. For example, the same emails a client read in the morning in San Francisco can be reread in the evening in New York City, plus a few more (see above). [Figure labels: SFO, NYC; “Old Plus New” and “Only New.”]
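A minimal sketch of how a client could enforce monotonic reads: it remembers the newest version it has read and refuses a replica that is older. The classes `MailReplica` and `MailClient` are hypothetical, and the version counter assumes every replica receives the same updates in the same order.

```python
class MailReplica:
    """One copy of a mailbox; version counts the updates applied so far."""
    def __init__(self, name):
        self.name = name
        self.version = 0
        self.messages = []

    def apply_update(self, message):
        self.messages.append(message)
        self.version += 1

class MailClient:
    """Remembers the newest version it has read, across replicas."""
    def __init__(self):
        self.last_seen = 0

    def read(self, replica):
        if replica.version < self.last_seen:
            raise RuntimeError(f"{replica.name} is behind this client's last read")
        self.last_seen = replica.version
        return list(replica.messages)

sfo, nyc = MailReplica("SFO"), MailReplica("NYC")
for msg in ("m1", "m2"):
    sfo.apply_update(msg)
nyc.apply_update("m1")          # NYC has not yet received m2

client = MailClient()
morning = client.read(sfo)      # reads m1 and m2 in San Francisco
try:
    client.read(nyc)            # stale replica: would violate monotonic reads
    stale_detected = False
except RuntimeError:
    stale_detected = True

nyc.apply_update("m2")          # lazy propagation catches up
nyc.apply_update("m3")          # plus one new message
evening = client.read(nyc)      # old messages plus a new one
```

A real system would bring the stale replica up to date (or redirect the read) rather than raise an error; the exception here only marks where the guarantee would otherwise be broken.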

R U O K ? 24. Describe the monotonic-read consistency model. • It assures that a client sees the same value (or a more recent value) every time she reads a data item. • For example, the same emails a client read in the morning in San Francisco can be reread in the evening in New York City, plus a few more. • All prior writes to the store are completed before each new write. • Both a and b above. • None of the above.

Monotonic Writes • Monotonic-write consistency model assures that each write is finished before any successive write by the same process begins; i.e., a replicate must be up to date before it is edited (FIFO consistency). • Above left correctly shows a WS(x1) store update before the W(x2) edit. Above right incorrectly omits the update, violating monotonic-write consistency.
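Monotonic writes can be sketched as a replica that applies one client's writes strictly in sequence-number order, holding back any write whose predecessor has not arrived. The class `WriteReplica` and the per-client sequence numbers are assumptions for illustration; only a single writing client is modeled.

```python
class WriteReplica:
    """Applies one client's writes in order; buffers writes that arrive early."""
    def __init__(self):
        self.value = None
        self.next_seq = 1     # sequence number of the next write to apply
        self.pending = {}     # early writes waiting for their predecessors
        self.applied = []     # sequence numbers in the order they were applied

    def receive(self, seq, value):
        self.pending[seq] = value
        # Apply as many consecutive writes as are now available.
        while self.next_seq in self.pending:
            self.value = self.pending.pop(self.next_seq)
            self.applied.append(self.next_seq)
            self.next_seq += 1

replica = WriteReplica()
replica.receive(2, "x2")               # W(x2) arrives first: held back
value_after_early_write = replica.value
replica.receive(1, "x1")               # W(x1) arrives: x1, then x2, are applied
```

The second write never lands on a replica that lacks the first, which is the property the left-hand figure illustrates.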

R U O K ? 25. Describe the monotonic-write consistency model. • It assures that a client sees the same value (or a more recent value) every time she reads a data item. • For example, the same emails a client read in the morning in San Francisco can be reread in the evening in New York City, plus a few more. • All prior writes to the store are completed before each new write. • All of the above. • None of the above.
