### Reconciling Differences: towards a theory of cloud complexity

George Varghese

UCSD, visiting at Yahoo! Labs

Part 1: Reconciling Sets across a link

- Joint with D. Eppstein, M. Goodrich, F. Uyeda
- Appeared in SIGCOMM 2011

Motivation 1: OSPF Routing (1990)

- After a partition forms and heals, R1 needs the updates at R2 that arrived during the partition.

[Figure: routers R1 and R2, reconnected when the partition heals.]

Must solve the Set-Difference Problem!

Motivation 2: Amazon S3 storage (2007)

- Synchronizing replicas.

[Figure: replicas S1 and S2.]

Periodic Anti-entropy Protocol between replicas

Set-Difference across cloud again!

What is the Set-Difference problem?

- What objects are unique to host 1?
- What objects are unique to host 2?

[Figure: Host 1 holds objects A, B, E, F; Host 2 holds objects A, C, D, F.]
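As a point of reference, when both sets sit in one place the two questions reduce to ordinary set subtraction. The sketch below assumes the example contents from the slide (A, B, E, F on host 1 and A, C, D, F on host 2); the variable names are illustrative. The rest of the talk is about getting the same answer without shipping a whole set across the link.

```python
# Example contents taken from the slide's figure; names are illustrative.
host1 = {"A", "B", "E", "F"}
host2 = {"A", "C", "D", "F"}

# Objects unique to each host (the two halves of the set difference).
only_on_host1 = host1 - host2
only_on_host2 = host2 - host1

# Size d of the difference, the quantity the later bounds depend on.
d = len(only_on_host1) + len(only_on_host2)
```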

Use case 1: Data Synchronization

- Identify missing data blocks.
- Transfer blocks to synchronize sets.

[Figure: Host 1 (A, B, E, F) receives C and D; Host 2 (A, C, D, F) receives B and E.]

Use case 2: Data De-duplication

- Identify all unique blocks.
- Replace duplicate data with pointers.

[Figure: Host 1 holds A, B, E, F; Host 2 holds A, C, D, F; blocks A and F are duplicated.]

Prior work versus ours

- Trade a sorted list of keys.
  - Let n be the size of the sets and U the size of the key space.
  - O(n log U) communication, O(n log n) computation
  - Bloom filters can improve this to O(n) communication.
- Polynomial encodings (Minsky, Trachtenberg)
  - Let d be the size of the difference.
  - O(d log U) communication, O(dn + d^3) computation
- Invertible Bloom Filter (our result)
  - O(d log U) communication, O(n + d) computation

Difference Digests

- Efficiently solves the set-difference problem.
- Consists of two data structures:
  - Invertible Bloom Filter (IBF)
    - Efficiently computes the set difference.
    - Needs the size of the difference.
  - Strata Estimator
    - Approximates the size of the set difference.
    - Uses IBFs as a building block.

IBFs: main idea

- Sum over random subsets: Summarize a set by “checksums” over O(d) random subsets.
- Subtract: Exchange and subtract checksums.
- Eliminate: Because subsets are chosen by hashing, common elements disappear after subtraction.
- Invert fast: O(d) equations in d unknowns; randomness allows expected O(d) inversion.

“Checksum” details

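- Array of IBF cells that form “checksum” words
- For a set difference of size d, use αd cells (α > 1)
- Each element ID is assigned to many IBF cells
- Each cell contains:
  - idSum: XOR of all IDs assigned to the cell
  - hashSum: XOR of H(ID) for all IDs assigned to the cell
  - count: number of IDs assigned to the cell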

IBF Encode

- Assign each ID (A, B, C, ...) to many cells; all hosts use the same hash functions (Hash1, Hash2, Hash3).
- “Add” the ID to each chosen cell: idSum ⊕= A, hashSum ⊕= H(A), count++.
- The IBF has αd cells: not O(n), like Bloom filters!
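The encode step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes 64-bit integer IDs, uses truncated SHA-256 with a one-byte salt as a stand-in for the hash functions, and places each hash function in its own bank of cells so that an ID always lands in k distinct cells. The names `hash64`, `CHECK_SALT` and `IBF` are hypothetical.

```python
import hashlib

CHECK_SALT = 255  # assumed salt reserved for H(), the hash folded into hashSum


def hash64(x: int, salt: int) -> int:
    """64-bit hash of a 64-bit ID; `salt` selects one of several hash functions."""
    data = bytes([salt]) + x.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class IBF:
    def __init__(self, num_cells: int, k: int = 3):
        if num_cells % k != 0:
            raise ValueError("num_cells must be a multiple of k")
        self.k = k
        self.bank = num_cells // k          # cells per hash function
        self.id_sum = [0] * num_cells
        self.hash_sum = [0] * num_cells
        self.count = [0] * num_cells

    def cells_for(self, x: int) -> list[int]:
        # One cell per hash function, each in its own bank, so the k cells differ.
        return [i * self.bank + hash64(x, i) % self.bank for i in range(self.k)]

    def add(self, x: int) -> None:
        # "Add" the ID to each of its cells: XOR into both sums, bump the count.
        for c in self.cells_for(x):
            self.id_sum[c] ^= x
            self.hash_sum[c] ^= hash64(x, CHECK_SALT)
            self.count[c] += 1
```

The table size depends only on the expected difference (αd cells), not on the number of IDs added, which is why every host can afford to build one over its whole set.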

Invertible Bloom Filters (IBF)

Host 1

Host 2

- “Subtract” IBF structures
- Produces a new IBF containing only unique objects

A

B

E

F

A

C

D

F

IBF 2

IBF 1

IBF (2 - 1)

Disappearing act

- After subtraction, elements common to both sets disappear because:
  - Any common element (e.g., W) is assigned to the same cells on both hosts (same hash functions on both sides).
  - On subtraction, W XOR W = 0. Thus, W vanishes.
- Elements in the set difference remain, but they may be randomly mixed, so we need a decode procedure.
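A small Python sketch of the disappearing act, under assumed details: integer IDs, truncated SHA-256 as the shared hash functions, and cells stored as `[idSum, hashSum, count]` lists. The helper names `encode` and `subtract` are hypothetical. It checks that the subtracted IBF is identical whether or not the common element W was ever inserted.

```python
import hashlib

K = 3        # hash functions per element (assumed)
CHECK = 255  # assumed salt for the checksum hash H()


def hash64(x, salt):
    data = bytes([salt]) + x.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def encode(ids, num_cells):
    """Build an IBF as a list of [idSum, hashSum, count] cells."""
    bank = num_cells // K
    cells = [[0, 0, 0] for _ in range(num_cells)]
    for x in ids:
        for i in range(K):
            cell = cells[i * bank + hash64(x, i) % bank]
            cell[0] ^= x
            cell[1] ^= hash64(x, CHECK)
            cell[2] += 1
    return cells


def subtract(ibf_a, ibf_b):
    """Cell-wise subtraction: XOR the sums, subtract the counts."""
    return [[a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]] for a, b in zip(ibf_a, ibf_b)]


W = 1000                 # element common to both hosts
host1 = {W, 11, 12}
host2 = {W, 21}

diff = subtract(encode(host1, 12), encode(host2, 12))
without_w = subtract(encode(host1 - {W}, 12), encode(host2 - {W}, 12))
w_vanished = diff == without_w  # W left no trace in the subtracted IBF
```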

How many IBF cells?

[Figure: space overhead α needed to decode with >99% probability versus set difference size, for hash counts 3 and 4.]

- Small differences: 1.4x – 2.3x
- Large differences: 1.25x – 1.4x

How many hash functions?

- 1 hash function produces many pure cells initially, but there is nothing to undo when an element is removed.
- Many (say 10) hash functions: too many collisions.
- We find by experiment that 3 or 4 hash functions work well. Is there some theoretical reason?

Theory

- Let d = difference size, k = # hash functions.
- Theorem 1: With (k + 1) d cells, failure probability falls exponentially with k.
  - For k = 3, this implies a 4x tax on storage, a bit weak.
- [Goodrich, Mitzenmacher]: Failure is equivalent to finding a 2-core (loop) in a random hypergraph.
- Theorem 2: With c_k d cells, failure probability falls exponentially with k.
  - c_4 = 1.3x tax, agrees with experiments

Recall experiments

[Figure: space overhead needed to decode with >99% probability versus set difference size, for hash counts 3 and 4.]

- Large differences: 1.25x – 1.4x

Connection to Coding

- Mystery: IBF decode similar to peeling procedure used to decode Tornado codes. Why?
- Explanation: Set Difference is equivalent to coding with insert-delete channels
- Intuition: Given a code for set A, send checkwords only to B. Think of B as a corrupted form of A.
- Reduction: If code can correct D insertions/deletions, then B can recover A and the set difference.

- Reed-Solomon <---> Polynomial Methods
- LDPC (Tornado) <---> Difference Digest

Random Subsets, Fast Elimination

[Figure: the αd cells viewed as a sparse system of equations, e.g. X + Y + Z = ..., Y = ..., X = ...; cells with a single unknown are pure.]

Roughly upper triangular and sparse
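The elimination can be sketched as a peeling loop: find a pure cell, read off its ID, remove that ID from all of its cells, and repeat. The Python below is a minimal illustration under assumed details (integer IDs, truncated SHA-256 hashes, one bank of cells per hash function, cells as `[idSum, hashSum, count]`). All function names are hypothetical, and decoding can fail when the table is too small for the difference.

```python
import hashlib

K, CHECK = 3, 255  # assumed: hash functions per element, salt for checksum H()


def hash64(x, salt):
    data = bytes([salt]) + x.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def cell_indexes(x, num_cells):
    bank = num_cells // K
    return [i * bank + hash64(x, i) % bank for i in range(K)]


def encode(ids, num_cells):
    cells = [[0, 0, 0] for _ in range(num_cells)]
    for x in ids:
        for j in cell_indexes(x, num_cells):
            cells[j][0] ^= x
            cells[j][1] ^= hash64(x, CHECK)
            cells[j][2] += 1
    return cells


def subtract(ibf_a, ibf_b):
    return [[a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]] for a, b in zip(ibf_a, ibf_b)]


def is_pure(cell):
    # A pure cell holds exactly one ID; the hash check guards against mixtures.
    return cell[2] in (1, -1) and hash64(cell[0], CHECK) == cell[1]


def decode(diff):
    """Peel a subtracted IBF. Returns (ok, only_in_a, only_in_b)."""
    cells = [list(c) for c in diff]          # work on a copy
    only_a, only_b = set(), set()
    pending = [i for i, c in enumerate(cells) if is_pure(c)]
    while pending:
        i = pending.pop()
        if not is_pure(cells[i]):            # may have been emptied already
            continue
        x, _, sign = cells[i]
        (only_a if sign == 1 else only_b).add(x)
        for j in cell_indexes(x, len(cells)):
            cells[j][0] ^= x
            cells[j][1] ^= hash64(x, CHECK)
            cells[j][2] -= sign
            if is_pure(cells[j]):            # removal may expose new pure cells
                pending.append(j)
    ok = all(c == [0, 0, 0] for c in cells)  # leftovers mean decoding failed
    return ok, only_a, only_b
```

A caller would run `decode(subtract(encode(set_a, m), encode(set_b, m)))` and fall back to a larger table when `ok` is false.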

Difference Digests

- Consists of two data structures:
  - Invertible Bloom Filter (IBF)
    - Efficiently computes the set difference.
    - Needs the size of the difference.
  - Strata Estimator
    - Approximates the size of the set difference.
    - Uses IBFs as a building block.

Strata Estimator

- Divide keys into sampled subsets (strata), where subset k contains ~1/2^k of the keys (consistent partitioning).
- Encode each subset into an IBF of small fixed size.
- log(n) IBFs of ~20 cells each

[Figure: keys A, B, C partitioned into IBF 1 (~1/2), IBF 2 (~1/4), IBF 3 (~1/8), IBF 4 (~1/16).]

Strata Estimator

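- Attempt to subtract and decode the IBFs at each level.
- If level k decodes, then return 2^k × (the number of IDs recovered).

[Figure: Estimator 1 on Host 1 and Estimator 2 on Host 2, each a stack of IBF 1, IBF 2, IBF 3, IBF 4, ...; matching levels are subtracted and decoded, and the recovered count is scaled up (4x in the example).]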
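The two Strata Estimator slides can be sketched in Python as below. This is one reading of the procedure, not the authors' code. Keys are assigned to a stratum by the number of trailing zero bits of a hash, so stratum i holds about 1/2^(i+1) of the keys. Decoding proceeds from the sparsest stratum toward the densest, accumulating recovered IDs, and the count is scaled up at the first stratum that fails to decode. The hash choice, the 21-cell IBFs, the 16 strata, the exact scaling rule and all names are assumptions.

```python
import hashlib

K, CHECK, STRATA = 3, 255, 254  # assumed hash count and salts


def hash64(x, salt):
    data = bytes([salt]) + x.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def cell_indexes(x, num_cells):
    bank = num_cells // K
    return [i * bank + hash64(x, i) % bank for i in range(K)]


def encode(ids, num_cells):
    cells = [[0, 0, 0] for _ in range(num_cells)]
    for x in ids:
        for j in cell_indexes(x, num_cells):
            cells[j][0] ^= x
            cells[j][1] ^= hash64(x, CHECK)
            cells[j][2] += 1
    return cells


def subtract(ibf_a, ibf_b):
    return [[a[0] ^ b[0], a[1] ^ b[1], a[2] - b[2]] for a, b in zip(ibf_a, ibf_b)]


def is_pure(cell):
    return cell[2] in (1, -1) and hash64(cell[0], CHECK) == cell[1]


def decode(diff):
    """Peel a subtracted IBF; returns (ok, number of IDs recovered)."""
    cells = [list(c) for c in diff]
    found = set()
    pending = [i for i, c in enumerate(cells) if is_pure(c)]
    while pending:
        i = pending.pop()
        if not is_pure(cells[i]):
            continue
        x, _, sign = cells[i]
        found.add(x)
        for j in cell_indexes(x, len(cells)):
            cells[j][0] ^= x
            cells[j][1] ^= hash64(x, CHECK)
            cells[j][2] -= sign
            if is_pure(cells[j]):
                pending.append(j)
    return all(c == [0, 0, 0] for c in cells), len(found)


def stratum_of(x, num_strata):
    # Trailing zero bits of a hash: stratum 0 gets ~1/2 of keys, stratum 1 ~1/4, ...
    h, level = hash64(x, STRATA), 0
    while level < num_strata - 1 and (h >> level) & 1 == 0:
        level += 1
    return level


def build_estimator(ids, num_strata=16, cells=21):
    strata = [[] for _ in range(num_strata)]
    for x in ids:
        strata[stratum_of(x, num_strata)].append(x)
    return [encode(s, cells) for s in strata]


def estimate_difference(est_a, est_b):
    count = 0
    for level in range(len(est_a) - 1, -1, -1):   # sparsest stratum first
        ok, recovered = decode(subtract(est_a[level], est_b[level]))
        if not ok:
            # Strata above `level` sample ~1/2^(level+1) of the keys.
            return (2 ** (level + 1)) * count
        count += recovered
    return count                                   # every stratum decoded: exact
```

Because both hosts use the same hash for partitioning, a key lands in the same stratum on each side, so common keys still cancel level by level.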

KeyDiff Service

- Promising applications:
  - File synchronization
  - P2P file sharing
  - Failure recovery
- Operations offered to applications: Add( key ), Remove( key ), Diff( host1, host2 )

[Figure: three hosts, each running an Application on top of a Key Service.]

Difference Digest Summary

- Strata Estimator
  - Estimates the set difference.
  - For sets of 100K elements, a 15KB estimator has <15% error.
  - O(log n) communication, O(n) computation.
- Invertible Bloom Filter
  - Identifies all IDs in the set difference.
  - 16 to 28 bytes per ID in the set difference.
  - O(d) communication, O(n + d) computation
  - Worth it if the set difference is < 20% of the set sizes

Connection to Sparse Recovery?

- If we forget about subtraction, in the end we are recovering a d-sparse vector.
- Note that the hash check is key for figuring out which cells are pure after differencing.
- Is there a connection to compressed sensing? Could sensors do the random summing? The hash summing?
- Connection the other way: could we use compressed sensing for differences?

Comparison with Information Theory and Coding

- Worst-case complexity versus average
- Information theory emphasizes communication complexity, not computation complexity; we focus on both.
- Existence versus constructive: some similar settings (Slepian-Wolf) are existential.
- Estimators: we want bounds based on the difference, and so start by efficiently estimating the difference.

Aside: IBFs in Digital Hardware

[Figure: a stream of set elements (a, b, x, y) enters logic (read, hash, write) that applies Hash 1, Hash 2, Hash 3 and the Strata Hash, then updates Bank 1, Bank 2, Bank 3.]

Hash to separate banks for parallelism, at a slight cost in space. Decode in software.

Mild Sensitivity Analysis: one set much larger than the other

[Figure: Set A is much larger than Set B; the difference d is small.]

Ω(|A|) bits needed, not O(d): Patrascu 2008

Simpler proof: DKS 2011

LBFS File System (Mazieres)

[Figure: File A as a sequence of chunks C1, C2, C3, ..., C97, C98, C99, compared with Chunk Set B at the server; File B differs by 1 chunk.]

LBFS sends all chunk hashes in File A: O(|A|)

More Sensitivity Analysis: small intersection (database joins)

[Figure: Set A and Set B with a small intersection d.]

Ω(|A|) bits needed, not O(d): follows from results on the hardness of set disjointness

(Files for example)

[Figure: File A = A, B, C, D, E, F and File B = A, C, D, E, F, G, at edit distance 2.]

Insert/delete can renumber all file blocks . . .

(with J. Ullman)

Edit distance 1

[Figure: File A and File B split into pieces and compared through piece hashes H1, H2, H3.]

Send 2d+1 piece hashes. Clump unmatched pieces and recurse. O(d log^2(N))

21 years of Sequence Reconciliation!

- Schwartz, Bowdidge, Burkhard (1990): recurse on unmatched pieces, not aggregate.
- Rsync: widely used tool that breaks the file into roughly √N pieces and sends their hashes, where N is the file length.

UCSD, Lunch

Princeton, kids

Generalizes rumor spreading, which has disjoint singleton sets

[Figure: graph nodes holding singleton sets {a}, {b}, {d}, {g}.]

CLP10, G11: O(E n log n / conductance)

(with N. Goyal and R. Kannan)

Pick a random edge

Do 2-party set reconciliation

[Figure: graph nodes holding overlapping sets {a,b,c}, {b,c,d}, {d,c,e}.]

Complexity: C + D, with C as before and D = Sum_i (U – S_i)

Butterfly example for Sets

[Figure: butterfly network carrying sets S1 and S2 toward receivers X and Y; the shared link carries D = Diff(S1, S2).]

Set difference instead of XOR within network

How does reconciliation on Steiner graphs relate to network coding?

- Objects in general, not just bits.
- Routers do not need objects but can transform/code objects.
- What transformations within network allow efficient communication close to lower bound?

VM code pages (with Ramjee et al)

[Figure: VM A holds pages A, B, C, D, E and VM B holds pages A, X, C, D, Y: 2 “errors”.]

Reconcile Set A = {(A,1), (B,2), (C,3), (D,4), (E,5)} and Set B = {(A,1), (X,2), (C,3), (D,4), (Y,5)}
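The position-tagging trick can be illustrated with plain Python sets: pairing each page with its index turns a sequence into a set, so in-place changes show up as set differences. The helper name `as_position_set` is hypothetical, and built-in set operations stand in for the IBF exchange that would be used across a real link.

```python
def as_position_set(pages):
    """Turn a sequence into a set of (content, position) pairs, 1-indexed."""
    return {(page, i) for i, page in enumerate(pages, start=1)}


vm_a = as_position_set(["A", "B", "C", "D", "E"])
vm_b = as_position_set(["A", "X", "C", "D", "Y"])

# Each "error" contributes one pair on each side of the difference.
only_in_a = vm_a - vm_b
only_in_b = vm_b - vm_a
error_positions = sorted({pos for _, pos in only_in_a | only_in_b})
```

Two changed pages produce a set difference of size 4, so an IBF sized for the number of changes, rather than the number of pages, would be enough to locate them.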

Twist: IBFs for error correction? (with M. Mitzenmacher)

- Write message M[1..n] of n words as set S = {(M[1],1), (M[2], 2), . . (M[n], n)}.
- Calculate IBF(S) and transmit M, IBF(S)
- Receiver uses received message M’ to find IBF(S’); subtracts from IBF’(S) to locate errors.
- Protect IBF using Reed-Solomon or redundancy
- Why: Potentially O(e) decoding for e errors -- Raptor codes achieve this for erasure channels.

The Cloud Complexity Milieu

Other dimensions: approximate, secure, . . .

Conclusions: Got Diffs?

- Resiliency and fast decoding of random sums give set reconciliation; and error correction?
- Sets on graphs
  - All terminals: generalizes rumor spreading
  - Routers, terminals: resemblance to network coding
- Cloud complexity: some points covered, many remain
- Practical, may be useful to sync devices across the cloud.

Comparison to Logs/Incremental Updates

- IBFs work with no prior context.
- Logs work with prior context, BUT
  - Redundant information when syncing with multiple parties.
  - Logging must be built into the system for each write.
  - Logging adds overhead at runtime.
  - Logging requires non-volatile storage.
    - Often not present in network devices.
- IBFs may outperform logs when:
  - Synchronizing multiple parties
  - Synchronizations happen infrequently
