Reconciling Differences: towards a theory of cloud complexity

George Varghese, UCSD, visiting at Yahoo! Labs

Part 1: Reconciling Sets across a link • Joint with D. Eppstein, M. Goodrich, F. Uyeda • Appeared in SIGCOMM 2011

Motivation 1: OSPF Routing (1990) • After a partition forms and heals, R1 needs the updates that arrived at R2 during the partition. [Figure: routers R1 and R2; the partition between them heals] Must solve the Set-Difference Problem!

Motivation 2: Amazon S3 storage (2007) • Synchronizing replicas. • Periodic anti-entropy protocol between replicas. [Figure: replicas S1 and S2] Set-Difference across cloud again!

What is the Set-Difference problem? • What objects are unique to host 1? • What objects are unique to host 2? [Figure: Host 1 holds {A, B, E, F}; Host 2 holds {A, C, D, F}]
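The two questions on this slide can be written down directly with Python sets. This is only a sketch of the problem statement, assuming the slide's figure shows Host 1 holding A, B, E, F and Host 2 holding A, C, D, F; it requires both sets in one place, which is exactly what the rest of the talk tries to avoid.

```python
# Assumed contents of the two hosts, read off the slide's example figure.
host1 = {"A", "B", "E", "F"}
host2 = {"A", "C", "D", "F"}

unique_to_host1 = host1 - host2   # objects only host 1 has
unique_to_host2 = host2 - host1   # objects only host 2 has
set_difference = host1 ^ host2    # symmetric difference: everything not shared

print(sorted(unique_to_host1))
print(sorted(unique_to_host2))
print(len(set_difference))        # this size is the "d" used in later slides
```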

Use case 1: Data Synchronization • Identify missing data blocks • Transfer blocks to synchronize sets [Figure: Host 1 {A, B, E, F} receives C, D; Host 2 {A, C, D, F} receives B, E]

Use case 2: Data De-duplication • Identify all unique blocks. • Replace duplicate data with pointers. [Figure: Host 1 holds {A, B, E, F}; Host 2 holds {A, C, D, F}]

Prior work versus ours
• Trade a sorted list of keys.
  • Let n be size of sets, U be size of key space
  • O(n log U) communication, O(n log n) computation
  • Bloom filters can improve to O(n) communication.
• Polynomial Encodings (Minsky, Trachtenberg)
  • Let “d” be the size of the difference
  • O(d log U) communication, O(dn + d^3) computation
• Invertible Bloom Filter (our result)
  • O(d log U) communication, O(n + d) computation

Difference Digests • Efficiently solves the set-difference problem. • Consists of two data structures: • Invertible Bloom Filter (IBF) • Efficiently computes the set difference. • Needs the size of the difference • Strata Estimator • Approximates the size of the set difference. • Uses IBF’s as a building block.

IBFs: main idea • Sum over random subsets: Summarize a set by “checksums” over O(d) random subsets. • Subtract: Exchange and subtract checksums. • Eliminate: Hashing for subset choice → common elements disappear after subtraction. • Invert fast: O(d) equations in d unknowns; randomness allows expected O(d) inversion.

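“Checksum” details • Array of IBF cells that form “checksum” words • For set difference of size d, use αd cells (α > 1) • Each element ID is assigned to many IBF cells • Each cell contains: idSum (XOR of all IDs assigned to the cell), hashSum (XOR of the hashes H(ID) of those IDs), and count (number of IDs assigned)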

IBF Encode • Assign each ID (A, B, C, ...) to many cells using Hash1, Hash2, Hash3. • All hosts use the same hash functions. • To “add” ID A to a cell: idSum ← idSum ⊕ A; hashSum ← hashSum ⊕ H(A); count++. • The IBF has αd cells: not O(n), like Bloom Filters!
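The encode step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the SHA-256-based hashes, the helper names (`Cell`, `check_hash`, `cell_indexes`, `encode`), the choice of K = 3, and the use of integer IDs are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass

K = 3  # number of hash functions (assumption; later slides suggest 3 or 4)


@dataclass
class Cell:
    id_sum: int = 0    # XOR of all IDs added to this cell
    hash_sum: int = 0  # XOR of H(ID) for all IDs added to this cell
    count: int = 0     # number of IDs added to this cell


def _h(data: str) -> int:
    """64-bit hash of a string, built on SHA-256 (an illustrative choice)."""
    return int.from_bytes(hashlib.sha256(data.encode()).digest()[:8], "big")


def check_hash(key: int) -> int:
    """H(ID): the checksum hash folded into hashSum."""
    return _h(f"check:{key}")


def cell_indexes(key: int, num_cells: int) -> list[int]:
    """Map an ID to K distinct cells (needs num_cells >= K).

    Every host must use this same mapping.
    """
    indexes, salt = [], 0
    while len(indexes) < K:
        idx = _h(f"cell:{salt}:{key}") % num_cells
        if idx not in indexes:
            indexes.append(idx)
        salt += 1
    return indexes


def encode(keys, num_cells: int) -> list[Cell]:
    """Build an IBF of num_cells cells (about alpha * d) from integer IDs."""
    cells = [Cell() for _ in range(num_cells)]
    for key in keys:
        for idx in cell_indexes(key, num_cells):
            cell = cells[idx]
            cell.id_sum ^= key
            cell.hash_sum ^= check_hash(key)
            cell.count += 1
    return cells


ibf = encode({101, 202, 303}, num_cells=8)
print(sum(cell.count for cell in ibf))  # each ID lands in K cells
```

The table size depends only on the expected difference d, not on the number of IDs inserted, which is the contrast with ordinary Bloom filters drawn on the slide.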

Invertible Bloom Filters (IBF) • Trade IBF’s with remote host [Figure: Host 1 {A, B, E, F} builds IBF 1; Host 2 {A, C, D, F} builds IBF 2]

Invertible Bloom Filters (IBF) • “Subtract” IBF structures • Produces a new IBF containing only unique objects [Figure: IBF 2 minus IBF 1 gives IBF (2 - 1)]

IBF Subtract

Disappearing act • After subtraction, elements common to both sets disappear because: • Any common element (e.g., W) is assigned to the same cells on both hosts (same hash functions on both sides). • On subtraction, W XOR W = 0. Thus, W vanishes. • While elements in the set difference remain, they may be randomly mixed → need a decode procedure.
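A tiny worked example of the cancellation, looking only at the idSum field of one cell. The integer IDs are hypothetical values chosen for illustration.

```python
# Hypothetical integer IDs: W is common to both hosts, A and C are not.
A, C, W = 0x0A, 0x0C, 0x57

cell_host1 = A ^ W   # idSum of a cell that received A and W on host 1
cell_host2 = C ^ W   # idSum of the same cell on host 2, which received C and W

difference = cell_host1 ^ cell_host2  # "subtract" the two cells

# W has vanished; A and C remain, but mixed together in one word.
assert difference == A ^ C
```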

IBF Decode • Test for purity: a cell is pure if H(idSum) = hashSum. • A cell holding only V passes: H(V) = H(V). • A cell mixing V, X, Z fails (with high probability): H(V ⊕ X ⊕ Z) ≠ H(V) ⊕ H(X) ⊕ H(Z).

IBF Decode
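Subtraction and peeling decode can be sketched end to end as below. This is an illustrative sketch under assumptions, not the authors' code: cells are plain `[idSum, hashSum, count]` lists, hashes are derived from SHA-256, IDs are integers, K = 3, and the purity test additionally checks that count is +1 or -1 so the decoder can tell which host owns the recovered ID.

```python
import hashlib

K = 3  # hash functions per ID (assumption)


def _h(data: str) -> int:
    return int.from_bytes(hashlib.sha256(data.encode()).digest()[:8], "big")


def check_hash(key: int) -> int:
    return _h(f"check:{key}")


def cell_indexes(key: int, m: int) -> list[int]:
    out, salt = [], 0
    while len(out) < K:                      # requires m >= K
        idx = _h(f"cell:{salt}:{key}") % m
        if idx not in out:
            out.append(idx)
        salt += 1
    return out


def encode(keys, m: int) -> list[list[int]]:
    """Each cell is [idSum, hashSum, count]."""
    cells = [[0, 0, 0] for _ in range(m)]
    for key in keys:
        for i in cell_indexes(key, m):
            cells[i][0] ^= key
            cells[i][1] ^= check_hash(key)
            cells[i][2] += 1
    return cells


def subtract(ibf1, ibf2):
    """Cell-wise ibf1 - ibf2: XOR the sums, subtract the counts."""
    return [[a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]] for a, b in zip(ibf1, ibf2)]


def is_pure(cell) -> bool:
    """Purity test: count is +1 or -1 and H(idSum) == hashSum."""
    return cell[2] in (1, -1) and check_hash(cell[0]) == cell[1]


def decode(diff):
    """Peel pure cells. Returns (only_in_1, only_in_2, success)."""
    cells = [c[:] for c in diff]
    m = len(cells)
    only1, only2 = set(), set()
    queue = [i for i in range(m) if is_pure(cells[i])]
    while queue:
        i = queue.pop()
        if not is_pure(cells[i]):            # may have changed since queued
            continue
        key, sign = cells[i][0], cells[i][2]
        (only1 if sign == 1 else only2).add(key)
        for j in cell_indexes(key, m):       # remove key from all its cells
            cells[j][0] ^= key
            cells[j][1] ^= check_hash(key)
            cells[j][2] -= sign
            if is_pure(cells[j]):
                queue.append(j)
    success = all(c == [0, 0, 0] for c in cells)
    return only1, only2, success


host1 = set(range(1, 1001)) | {5001, 5002}
host2 = set(range(1, 1001)) | {6001, 6002, 6003}
num_cells = 40  # sized for the difference (d = 5), not for the 1000 shared IDs
only1, only2, ok = decode(subtract(encode(host1, num_cells), encode(host2, num_cells)))
print(ok, sorted(only1), sorted(only2))
```

Decoding can fail when no pure cell remains; a caller would then typically retry with a larger table.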

How many IBF cells? [Plot: space overhead α needed to decode at >99% versus set difference, for hash counts 3 and 4] • Small differences: 1.4x to 2.3x • Large differences: 1.25x to 1.4x

How many hash functions? • 1 hash function produces many pure cells initially, but there is nothing to undo when an element is removed (no other cell becomes pure). • Many (say 10) hash functions: too many collisions. • We find by experiment that 3 or 4 hash functions work well. Is there some theoretical reason?

Theory • Let d = difference size, k = # hash functions. • Theorem 1: With (k + 1)d cells, failure probability falls exponentially with k. • For k = 3, this implies a 4x tax on storage, a bit weak. • [Goodrich, Mitzenmacher]: Failure is equivalent to finding a 2-core (loop) in a random hypergraph. • Theorem 2: With c_k d cells, failure probability falls exponentially with k. • c_4 = 1.3x tax, agrees with experiments.

Recall experiments [Plot: space overhead needed to decode at >99% versus set difference, for hash counts 3 and 4] • Large differences: 1.25x to 1.4x

Connection to Coding • Mystery: IBF decode similar to peeling procedure used to decode Tornado codes. Why? • Explanation: Set Difference is equivalent to coding with insert-delete channels • Intuition: Given a code for set A, send checkwords only to B. Think of B as a corrupted form of A. • Reduction: If code can correct D insertions/deletions, then B can recover A and the set difference. • Reed Solomon <---> Polynomial Methods • LDPC (Tornado) <---> Difference Digest

Random Subsets → Fast Elimination [Figure: a system of αd sparse equations, e.g. X + Y + Z = . . , with pure rows such as Y = . . and X = . .] Roughly upper triangular and sparse

Difference Digests • Consists of two data structures: • Invertible Bloom Filter (IBF) • Efficiently computes the set difference. • Needs the size of the difference • Strata Estimator • Approximates the size of the set difference. • Uses IBF’s as a building block.

Strata Estimator • Divide keys into sampled subsets (consistent partitioning), with subset k containing ~1/2^k of the keys. • Encode each subset into an IBF of small fixed size. • log(n) IBF’s of ~20 cells each. [Figure: keys A, B, C feed the estimator; IBF 1 holds ~1/2 of the keys, IBF 2 ~1/4, IBF 3 ~1/8, IBF 4 ~1/16]

Strata Estimator • Attempt to subtract & decode IBF’s at each level. • If level k decodes, then return 2^k × (the number of ID’s recovered). [Figure: Estimator 1 (Host 1) and Estimator 2 (Host 2), each with IBF 1 through IBF 4, are subtracted and decoded level by level]
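One possible reading of the strata estimator is sketched below. It is an assumption-laden illustration, not the authors' implementation: strata are numbered from 0 (so stratum i holds about 1/2^(i+1) of the keys), a key's stratum is the number of trailing zero bits of a SHA-256-derived hash, each stratum IBF has 40 cells (the slide says ~20), and the estimator scans from the sparsest stratum down, scaling the recovered count by 2^(i+1) at the first stratum i that fails to decode.

```python
import hashlib

K = 3        # hash functions per IBF (assumption)
CELLS = 40   # cells per stratum IBF (illustrative; the slide says ~20)
STRATA = 16  # number of strata (illustrative)


def _h(data: str) -> int:
    return int.from_bytes(hashlib.sha256(data.encode()).digest()[:8], "big")


def _cells_for(key: int) -> list[int]:
    out, salt = [], 0
    while len(out) < K:
        idx = _h(f"cell:{salt}:{key}") % CELLS
        if idx not in out:
            out.append(idx)
        salt += 1
    return out


def _toggle(cells, key: int, sign: int) -> None:
    """Add (sign=+1) or remove (sign=-1) a key from an IBF in place."""
    for i in _cells_for(key):
        cells[i][0] ^= key
        cells[i][1] ^= _h(f"check:{key}")
        cells[i][2] += sign


def stratum_of(key: int) -> int:
    """Trailing zero bits of a hash: stratum i gets ~1/2^(i+1) of the keys."""
    x = _h(f"strata:{key}")
    tz = (x & -x).bit_length() - 1 if x else STRATA
    return min(tz, STRATA - 1)


def build_estimator(keys):
    strata = [[[0, 0, 0] for _ in range(CELLS)] for _ in range(STRATA)]
    for key in keys:
        _toggle(strata[stratum_of(key)], key, +1)
    return strata


def _decode_count(ibf1, ibf2):
    """Subtract and peel; return the number of IDs recovered, or None on failure."""
    cells = [[a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]] for a, b in zip(ibf1, ibf2)]
    recovered, progress = 0, True
    while progress:
        progress = False
        for c in cells:
            if c[2] in (1, -1) and _h(f"check:{c[0]}") == c[1]:  # pure cell
                _toggle(cells, c[0], -c[2])
                recovered += 1
                progress = True
    return recovered if all(c == [0, 0, 0] for c in cells) else None


def estimate_difference(est1, est2) -> int:
    count = 0
    for i in range(STRATA - 1, -1, -1):   # sparsest stratum first
        got = _decode_count(est1[i], est2[i])
        if got is None:                   # stratum i is too dense to decode
            return (2 ** (i + 1)) * count
        count += got
    return count                          # every stratum decoded: exact count


common = set(range(1, 5001))
host1 = common | set(range(100000, 100300))
host2 = common | set(range(200000, 200200))
# The true difference is 500; the estimate is approximate.
print(estimate_difference(build_estimator(host1), build_estimator(host2)))
```

The estimate would then be used to size the IBF that actually recovers the difference.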

KeyDiff Service • Promising applications: File Synchronization, P2P file sharing, Failure Recovery • Interface: Add( key ), Remove( key ), Diff( host1, host2 ) [Figure: each Application talks to a local Key Service; Key Services talk to each other]

Difference Digest Summary
• Strata Estimator
  • Estimates the size of the set difference.
  • For sets of 100K elements, a 15KB estimator has <15% error.
  • O(log n) communication, O(n) computation.
• Invertible Bloom Filter
  • Identifies all ID’s in the set difference.
  • 16 to 28 bytes per ID in the set difference.
  • O(d) communication, O(n+d) computation.
• Worth it if set difference is < 20% of set sizes

Connection to Sparse Recovery? • If we forget about subtraction, in the end we are recovering a d-sparse vector. • Note that the hash check is key for figuring out which cells are pure after differencing. • Is there a connection to compressed sensing? Could sensors do the random summing? The hash summing? • Connection the other way: could we use compressed sensing for differences?

Comparison with Information Theory and Coding • Worst-case complexity versus average case. • Information theory emphasizes communication complexity, not computation complexity; we focus on both. • Existence versus constructive: some similar settings (Slepian-Wolf) are existential. • Estimators: we want bounds based on the difference, and so start by efficiently estimating the difference.

Aside: IBFs in Digital Hardware • A stream of set elements (a, b, x, y) feeds logic (read, hash, write). • Hash 1, Hash 2, Hash 3 and a strata hash select cells in Bank 1, Bank 2, Bank 3. • Hash to separate banks for parallelism, at a slight cost in space needed. • Decode in software.

Part 2: Towards a theory of Cloud Complexity [Figure: objects O1, O2, O3 connected across a cloud] Complexity of reconciling “similar” objects?

Example: Synching Files [Figure: versions X.ppt.v1, X.ppt.v2, X.ppt.v3 on different hosts] Measures: communication bits, computation

So far: Two sets, one link, set difference {a,b,c} {d,a,c}

Mild Sensitivity Analysis: One set much larger than other [Figure: Set A and Set B with small difference d] • Ω(|A|) bits needed, not O(d): Patrascu 2008 • Simpler proof: DKS 2011

Asymmetric set difference in LBFS File System (Mazieres) [Figure: File A and File B as chunk sequences C1 . . . C99 with a 1-chunk difference; Chunk Set B at server] LBFS sends all chunk hashes in File A: O(|A|)

More Sensitivity Analysis: small intersection: database joins [Figure: Set A and Set B with small intersection d] • Ω(|A|) bits needed, not O(d): follows from results on hardness of set disjointness

Sequences under Edit Distance (Files for example) [Figure: File A = A B C D E F and File B = A C D E F G, edit distance 2] Insert/delete can renumber all file blocks . . .

Sequence reconciliation (with J. Ullman) [Figure: File A and File B at edit distance 1, split into pieces with hashes H1, H2, H3] • Send 2d+1 piece hashes. • Clump unmatched pieces and recurse. • O( d log (N) ) 2

21 years of Sequence Reconciliation! • Schwartz, Bowdidge, Burkhard (1990): recurse on unmatched pieces, not aggregate. • Rsync: widely used tool that breaks file into roughly piece hashes, N is file length. [Figure labels: UCSD, Lunch; Princeton, kids]

Sets on graphs? {b,c,d} {a,b,c} {d,c,e} {a,f,g}

Generalizes rumor spreading, which has disjoint singleton sets: {b} {a} {d} {g} • CLP10, G11: O(E n log n / conductance)

Generalized Push-Pull (with N. Goyal and R. Kannan) • Pick a random edge. • Do 2-party set reconciliation. • Complexity: C + D, with C as before and D = Σ_i (U − S_i). [Figure: nodes holding {b,c,d}, {a,b,c}, {d,c,e}]
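The random-edge process can be simulated with a few lines of Python. This is a toy sketch under assumptions: the graph (a 4-node cycle) and node names are hypothetical, the node sets are borrowed from the earlier "Sets on graphs?" slide, two-party reconciliation is modeled as both endpoints ending up with the union, and only transferred elements are counted (the per-round digest overhead, the C term, is ignored).

```python
import random


def push_pull_round(sets, edges, rng):
    """Pick one random edge and reconcile its two endpoints."""
    u, v = rng.choice(edges)
    cost = len(sets[u] ^ sets[v])   # elements that must cross the edge
    merged = sets[u] | sets[v]
    sets[u], sets[v] = set(merged), set(merged)
    return cost


def run(sets, edges, seed=0):
    """Repeat rounds until every node holds the union U of all sets."""
    rng = random.Random(seed)
    universe = set().union(*sets.values())
    rounds = total = 0
    while any(s != universe for s in sets.values()):
        total += push_pull_round(sets, edges, rng)
        rounds += 1
    return rounds, total


# Hypothetical 4-node cycle; the sets come from the "Sets on graphs?" slide.
sets = {
    0: {"b", "c", "d"},
    1: {"a", "b", "c"},
    2: {"d", "c", "e"},
    3: {"a", "f", "g"},
}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rounds, total = run(sets, edges)
print(rounds, total)
```

In this model each node receives every element it is missing exactly once, so the transferred-element total equals the sum over nodes of |U − S_i| whichever edges are picked; only the number of rounds depends on the random choices.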

Sets on Steiner graphs? [Figure: terminals holding {a} ∪ S and {b} ∪ S, connected through relay R1] Only terminals need sets. Push-pull wasteful!

Butterfly example for Sets [Figure: butterfly network with sources S1 and S2 and receivers X and Y; the shared link carries D = Diff(S1, S2)] Set difference instead of XOR within network
