Reconciling Differences: towards a theory of cloud complexity

George Varghese

UCSD, visiting at Yahoo! Labs

Part 1: Reconciling Sets across a link

  • Joint with D. Eppstein, M. Goodrich, F. Uyeda
  • Appeared in SIGCOMM 2011
Motivation 1: OSPF Routing (1990)
  • After a partition forms and heals, R1 needs the updates that arrived at R2 during the partition.

[Diagram: R1 and R2 reconnect after the partition heals]

Must solve the Set-Difference Problem!

Motivation 2: Amazon S3 storage (2007)
  • Synchronizing replicas.

[Diagram: replicas S1 and S2 run a periodic anti-entropy protocol]

Set-Difference across cloud again!

What is the Set-Difference problem?

Host 1: {A, B, E, F}        Host 2: {A, C, D, F}

  • What objects are unique to host 1?
  • What objects are unique to host 2?
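As a baseline, the problem on this slide can be stated directly with plain Python sets; exchanging the full sets like this costs communication proportional to the whole sets, which is exactly what the rest of the talk improves on:

```python
# The slide's example: which objects are unique to each host?
host1 = {"A", "B", "E", "F"}
host2 = {"A", "C", "D", "F"}

only_host1 = host1 - host2        # objects unique to host 1: {"B", "E"}
only_host2 = host2 - host1        # objects unique to host 2: {"C", "D"}
set_difference = host1 ^ host2    # the full set difference: {"B", "C", "D", "E"}
```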

Use case 1: Data Synchronization

Host 1: {A, B, E, F}        Host 2: {A, C, D, F}

  • Identify missing data blocks
  • Transfer blocks to synchronize sets (C, D to Host 1; B, E to Host 2)

Use case 2: Data De-duplication

Host 1: {A, B, E, F}        Host 2: {A, C, D, F}

  • Identify all unique blocks.
  • Replace duplicate data with pointers.

Prior work versus ours
  • Sorted lists: trade a sorted list of keys.
    • Let n be the size of the sets, U the size of the key space
    • O(n log U) communication, O(n log n) computation
    • Bloom filters can improve communication to O(n).
  • Polynomial Encodings (Minsky, Trachtenberg)
    • Let d be the size of the difference
    • O(d log U) communication, O(dn + d^3) computation
  • Invertible Bloom Filter (our result)
    • O(d log U) communication, O(n + d) computation
Difference Digests
  • Efficiently solve the set-difference problem.
  • Consist of two data structures:
    • Invertible Bloom Filter (IBF)
      • Efficiently computes the set difference.
      • Needs the size of the difference.
    • Strata Estimator
      • Approximates the size of the set difference.
      • Uses IBFs as a building block.
IBFs: main idea
  • Sum over random subsets: summarize a set by "checksums" over O(d) random subsets.
  • Subtract: exchange and subtract checksums.
  • Eliminate: hashing for subset choice → common elements disappear after subtraction.
  • Invert fast: O(d) equations in d unknowns; randomness allows expected O(d) inversion.
"Checksum" details
  • Array of IBF cells that form "checksum" words
    • For a set difference of size d, use αd cells (α > 1)
  • Each element ID is assigned to many IBF cells
  • Each cell contains: idSum (XOR of the IDs in the cell), hashSum (XOR of their hashes), and a count
IBF Encode

[Diagram: element A is assigned by Hash1, Hash2, Hash3 to three of the αd IBF cells; all hosts use the same hash functions]

"Add" the ID to each cell it maps to:
  idSum ⊕= A
  hashSum ⊕= H(A)
  count++

The table has αd cells: not O(n), as in Bloom Filters!
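A minimal sketch of the encode step (the per-cell updates mirror the slide; the function names and the use of SHA-256 as the hash family are my own illustrative choices, not from the talk):

```python
import hashlib

def H(x):
    """Deterministic 64-bit hash of any printable value."""
    return int.from_bytes(hashlib.sha256(repr(x).encode()).digest()[:8], "big")

def ibf_encode(ids, num_cells, k=3):
    """Encode integer IDs into num_cells (= alpha * d) IBF cells.

    For simplicity this sketch does not special-case two hash functions
    landing on the same cell for the same element.
    """
    cells = [{"idSum": 0, "hashSum": 0, "count": 0} for _ in range(num_cells)]
    for x in ids:
        for i in range(k):                     # k hash functions, same on all hosts
            c = cells[H((i, x)) % num_cells]   # Hash_i picks a cell for x
            c["idSum"] ^= x                    # idSum  ⊕= x
            c["hashSum"] ^= H(x)               # hashSum ⊕= H(x)
            c["count"] += 1                    # count++
    return cells
```

Note that the table size depends only on αd, not on n, unlike a Bloom filter sized for the whole set.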

Invertible Bloom Filters (IBF)

Host 1: {A, B, E, F}        Host 2: {A, C, D, F}

  • Trade IBFs with the remote host (Host 1 builds IBF 1, Host 2 builds IBF 2)

Invertible Bloom Filters (IBF)

Host 1: {A, B, E, F}        Host 2: {A, C, D, F}

  • "Subtract" the IBF structures: IBF (2 - 1) = IBF 2 - IBF 1
    • Produces a new IBF containing only the unique objects

Disappearing act
  • After subtraction, elements common to both sets disappear because:
    • Any common element (e.g., W) is assigned to the same cells on both hosts (same hash functions on both sides)
    • On subtraction, W XOR W = 0, so W vanishes.
  • Elements in the set difference remain, but they are randomly mixed → we need a decode procedure.
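The cancellation step is just the fact that XOR is its own inverse; a two-line check (the cell values here are illustrative, not from the talk):

```python
# A common element W is added to the same cells on both hosts,
# so it cancels when corresponding cells are XORed.
W, X, Y = 0xDEADBEEF, 0x1234, 0x5678
cell_host1 = W ^ X          # a cell on host 1 holding {W, X}
cell_host2 = W ^ Y          # the same cell on host 2, holding {W, Y}
assert W ^ W == 0                         # W XOR W = 0, so W vanishes...
assert cell_host1 ^ cell_host2 == X ^ Y   # ...leaving only the difference
```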
IBF Decode

Test for Purity: H(idSum) = hashSum

  • A pure cell holds a single element V: H(idSum) = H(V) = hashSum, so the test passes.
  • A mixed cell holds, say, {V, X, Z}: H(idSum) = H(V ⊕ X ⊕ Z), but hashSum = H(V) ⊕ H(X) ⊕ H(Z), so the test fails (with high probability).
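Putting encode, subtract, and purity-test peeling together, an end-to-end sketch (the class layout, SHA-256 hash family, and cell counts are my own illustrative choices; a real deployment sizes the table as αd using the strata estimate):

```python
import hashlib

K = 3  # number of hash functions, as the talk suggests

def _h(x, salt):
    """Deterministic 63-bit hash of x under a salt."""
    d = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(d[:8], "big") >> 1

class IBF:
    def __init__(self, m):
        self.m = m
        self.id_sum = [0] * m
        self.hash_sum = [0] * m
        self.count = [0] * m

    def _cells(self, x):
        # Distinct cells for x; all hosts must use the same hash functions.
        return {_h(x, i) % self.m for i in range(K)}

    def add(self, x):
        for c in self._cells(x):
            self.id_sum[c] ^= x
            self.hash_sum[c] ^= _h(x, "chk")
            self.count[c] += 1

    def subtract(self, other):
        out = IBF(self.m)
        for c in range(self.m):
            out.id_sum[c] = self.id_sum[c] ^ other.id_sum[c]
            out.hash_sum[c] = self.hash_sum[c] ^ other.hash_sum[c]
            out.count[c] = self.count[c] - other.count[c]
        return out

    def decode(self):
        """Peel pure cells; return (A-only, B-only) sets, or None on failure."""
        a_only, b_only = set(), set()
        progress = True
        while progress:
            progress = False
            for c in range(self.m):
                # Purity test from the slides: count is +/-1 and H(idSum) == hashSum.
                if self.count[c] in (1, -1) and \
                        _h(self.id_sum[c], "chk") == self.hash_sum[c]:
                    x, side = self.id_sum[c], self.count[c]
                    (a_only if side == 1 else b_only).add(x)
                    for cc in self._cells(x):   # remove x from all its cells
                        self.id_sum[cc] ^= x
                        self.hash_sum[cc] ^= _h(x, "chk")
                        self.count[cc] -= side
                    progress = True
        if any(self.count) or any(self.id_sum):
            return None  # a 2-core remained; retry with a larger table
        return a_only, b_only

# Running example with integer IDs: {1,2,5,6} vs {1,3,4,6}.
ibf1, ibf2 = IBF(40), IBF(40)
for x in (1, 2, 5, 6): ibf1.add(x)
for x in (1, 3, 4, 6): ibf2.add(x)
result = ibf1.subtract(ibf2).decode()   # with high probability ({2, 5}, {3, 4})
```

On a decode failure the caller simply retries with a larger table, mirroring the talk's point that failure probability falls rapidly with the space overhead.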

How many IBF cells?

[Plot: space overhead α needed to decode at >99%, as a function of set difference, for hash counts 3 and 4]

  • Small differences: 1.4x – 2.3x space overhead
  • Large differences: 1.25x – 1.4x space overhead

How many hash functions?
  • 1 hash function produces many pure cells initially but nothing to undo when an element is removed.
  • Many (say 10) hash functions: too many collisions.
  • We find by experiment that 3 or 4 hash functions work well. Is there some theoretical reason?

[Diagram: with 3 hash functions, elements A, B, C each land in a few cells, with only limited collisions]

Theory
  • Let d = difference size, k = # hash functions.
  • Theorem 1: With (k + 1) d cells, failure probability falls exponentially with k.
    • For k = 3, implies a 4x tax on storage, a bit weak.
  • [Goodrich, Mitzenmacher]: Failure is equivalent to finding a 2-core (loop) in a random hypergraph
  • Theorem 2: With c_k d cells, failure probability falls exponentially with k.
    • c_4 = 1.3x tax, agrees with experiments
Recall experiments

[Plot, repeated: space overhead to decode at >99% vs. set difference, for hash counts 3 and 4; large differences need only 1.25x – 1.4x]

Connection to Coding
  • Mystery: IBF decode is similar to the peeling procedure used to decode Tornado codes. Why?
  • Explanation: set difference is equivalent to coding over insert-delete channels
  • Intuition: given a code for set A, send checkwords only to B. Think of B as a corrupted form of A.
  • Reduction: if the code can correct D insertions/deletions, then B can recover A and the set difference.
  • Reed-Solomon <---> Polynomial Methods
  • LDPC (Tornado) <---> Difference Digest
Random Subsets → Fast Elimination

[Diagram: the αd equations are sparse and roughly upper triangular; a pure cell gives X = . . , substitution yields Y = . . , and so on up through X + Y + Z = . .]

Difference Digests
  • Consist of two data structures:
    • Invertible Bloom Filter (IBF)
      • Efficiently computes the set difference.
      • Needs the size of the difference.
    • Strata Estimator
      • Approximates the size of the set difference.
      • Uses IBFs as a building block.
Strata Estimator

  • Divide keys into sampled subsets containing ~1/2^k of the keys (consistent partitioning)
  • Encode each subset into an IBF of small fixed size
    • log(n) IBFs of ~20 cells each

[Diagram: strata IBF 1 (~1/2 of keys), IBF 2 (~1/4), IBF 3 (~1/8), IBF 4 (~1/16)]

Strata Estimator

  • Attempt to subtract & decode the IBFs at each level.
  • If level k decodes, then return: 2^k × (the number of IDs recovered)

[Diagram: Host 1 and Host 2 trade estimators; the IBFs at level 2 decode, giving an estimate of 4x the recovered count]

KeyDiff Service
  • Promising Applications:
    • File Synchronization
    • P2P file sharing
    • Failure Recovery

[Diagram: applications at each host call Add(key), Remove(key), and Diff(host1, host2) on a local Key Service]

Difference Digest Summary
  • Strata Estimator
    • Estimates the set difference.
    • For 100K-element sets, a 15KB estimator has <15% error
    • O(log n) communication, O(n) computation.
  • Invertible Bloom Filter
    • Identifies all IDs in the set difference.
    • 16 to 28 bytes per ID in the set difference.
    • O(d) communication, O(n + d) computation
    • Worth it if the set difference is < 20% of the set sizes
Connection to Sparse Recovery?
  • If we forget about subtraction, in the end we are recovering a d-sparse vector.
  • Note that the hash check is key for figuring out which cells are pure after differencing.
  • Is there a connection to compressed sensing? Could sensors do the random summing? The hash summing?
  • Connection the other way: could we use compressed sensing for differences?
Comparison with Information Theory and Coding
  • Worst-case complexity versus average case
  • Information theory emphasizes communication complexity, not computation complexity: we focus on both.
  • Existence versus constructive: some similar settings (Slepian-Wolf) are existential
  • Estimators: we want bounds based on the difference, and so start by efficiently estimating the difference.
Aside: IBFs in Digital Hardware

[Diagram: a stream of set elements a, b, x, y feeds logic (read, hash, write) that applies Hash 1, Hash 2, Hash 3 and a strata hash, updating separate memory banks 1–3]

Hash to separate banks for parallelism, at a slight cost in space. Decode in software.

Part 2: Towards a theory of Cloud Complexity

[Diagram: objects O1, O2, O3 scattered across the cloud]

Complexity of reconciling "similar" objects?

Example: Synching Files

[Diagram: versions X.ppt.v1, X.ppt.v2, X.ppt.v3 at different sites]

Measures: communication bits, computation

Mild Sensitivity Analysis: One set much larger than the other

[Diagram: Set A and Set B with a small difference d]

Ω(|A|) bits needed, not O(d): Patrascu 2008
Simpler proof: DKS 2011

Asymmetric set difference in the LBFS File System (Mazieres)

[Diagram: File A = chunks C1, C2, C3, . . ., C97, C98, C99; the chunk set B at the server differs by 1 chunk]

LBFS sends all chunk hashes in File A: O(|A|)

More Sensitivity Analysis: small intersection (database joins)

[Diagram: Set A and Set B with a small intersection d]

Ω(|A|) bits needed, not O(d): follows from results on the hardness of set disjointness

Sequences under Edit Distance (files, for example)

[Diagram: File A and File B, blocks A–G, at edit distance 2]

Insert/delete can renumber all file blocks . . .

Sequence reconciliation (with J. Ullman)

[Diagram: File A and File B at edit distance 1; matching pieces share hashes H1, H2, H3]

Send 2d+1 piece hashes. Clump unmatched pieces and recurse. O(d log N)
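A toy version of the recursion, for the substitution-only case of aligned, equal-length block lists (the real scheme additionally clumps unmatched pieces so it survives inserts and deletes, which this sketch does not handle; names are my own):

```python
import hashlib

def _piece_hash(piece):
    return hashlib.sha256(repr(piece).encode()).hexdigest()

def differing_regions(a, b, d=1):
    """Index ranges where equal-length block lists a and b differ, found by
    comparing 2d+1 piece hashes per level and recursing into mismatches."""
    if a == b:
        return []
    if len(a) == 1:
        return [(0, 1)]
    step = max(1, len(a) // (2 * d + 1))     # split into ~2d+1 pieces
    out = []
    for start in range(0, len(a), step):
        pa, pb = a[start:start + step], b[start:start + step]
        if _piece_hash(pa) != _piece_hash(pb):   # recurse only where hashes differ
            out.extend((start + s, start + e)
                       for s, e in differing_regions(pa, pb, d))
    return out
```

Each level exchanges O(d) hashes and the recursion is O(log N) deep, matching the O(d log N) bound on the slide for constant d.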

21 years of Sequence Reconciliation!
  • Schwartz, Bowdidge, Burkhard (1990): recurse on unmatched pieces, not the aggregate.
  • Rsync: widely used tool that breaks the file into roughly √N piece hashes, where N is the file length.


Sets on graphs?

[Diagram: a graph whose nodes hold the sets {a,b,c}, {b,c,d}, {d,c,e}, {a,f,g}]

Generalizes rumor spreading, which has disjoint singleton sets

[Diagram: nodes hold singleton sets {a}, {b}, {d}, {g}]

CLP10, G11: O(E n log n / conductance)

Generalized Push-Pull (with N. Goyal and R. Kannan)

[Diagram: pick a random edge and do 2-party set reconciliation across it; nodes hold {a,b,c}, {b,c,d}, {d,c,e}]

Complexity: C + D, with C as before and D = Σ_i (U_i – S_i)

Sets on Steiner graphs?

[Diagram: terminals holding {a} ∪ S and {b} ∪ S connected through relay R1]

Only terminals need sets. Push-pull is wasteful!

Butterfly example for Sets

[Diagram: butterfly network with sources S1, S2; an internal node forwards D = Diff(S1, S2) to both terminals]

Set difference instead of XOR within the network

How does reconciliation on Steiner graphs relate to network coding?
  • Objects in general, not just bits.
  • Routers do not need objects but can transform/code objects.
  • What transformations within network allow efficient communication close to lower bound?
Sequences with d mutations: VM code pages (with Ramjee et al)

[Diagram: VM A pages = A, B, C, D, E; VM B pages = A, X, C, D, Y (2 "errors")]

Reconcile Set A = {(A,1), (B,2), (C,3), (D,4), (E,5)} and Set B = {(A,1), (X,2), (C,3), (D,4), (Y,5)}
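The reduction on this slide is mechanical enough to run directly; with plain sets standing in for the IBF exchange, the mutated positions fall out of the symmetric difference:

```python
# Pair each page with its index, then diff the (page, index) sets.
vm_a = ["A", "B", "C", "D", "E"]
vm_b = ["A", "X", "C", "D", "Y"]

set_a = {(page, i) for i, page in enumerate(vm_a, start=1)}
set_b = {(page, i) for i, page in enumerate(vm_b, start=1)}

diff = set_a ^ set_b                      # {(B,2), (X,2), (E,5), (Y,5)}
mutated = sorted({i for _, i in diff})    # positions of the d mutations
assert mutated == [2, 5]
```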

Twist: IBFs for error correction? (with M. Mitzenmacher)
  • Write message M[1..n] of n words as the set S = {(M[1],1), (M[2],2), . . ., (M[n],n)}.
  • Calculate IBF(S) and transmit M, IBF(S)
  • Receiver uses the received message M' to build IBF(S'); subtracts it from IBF(S) to locate errors.
  • Protect the IBF using Reed-Solomon or redundancy
  • Why: potentially O(e) decoding for e errors -- Raptor codes achieve this for erasure channels.
The Cloud Complexity Milieu

Other dimensions: approximate, secure, . . .

Conclusions: Got Diffs?
  • Resiliency and fast recoding of random sums → set reconciliation; and error correction?
  • Sets on graphs
    • All terminals: generalizes rumor spreading
    • Routers, terminals: resemblance to network coding.
  • Cloud complexity: some points covered, many remain
  • Practical: may be useful to sync devices across the cloud.
Comparison to Logs/Incremental Updates
  • IBFs work with no prior context.
  • Logs work with prior context, BUT:
    • Redundant information when sync'ing with multiple parties.
    • Logging must be built into the system for each write.
    • Logging adds overhead at runtime.
    • Logging requires non-volatile storage.
      • Often not present in network devices.
  • IBFs may out-perform logs when:
    • Synchronizing multiple parties
    • Synchronizations happen infrequently