
Epidemics


Presentation Transcript


  1. Epidemics Spring 2008: CS525 Farhana Ashraf and Fariba Khan

  2. Problem: Multicast • Simultaneously deliver information to a group of destinations. • Examples: • News groups • Updates for replicated database systems • Real-time media (e.g., sports) • Conference bridges • Software updates in ad-hoc and sensor networks

  3. Challenges • Reliability: • Strong • Best-effort • Probabilistic (bimodal) • Delay • Bandwidth • Fault-tolerance

  4. Epidemic Algorithms for Replicated Database Maintenance Alan Demers, Dan Greene, Carl Hauser, Wes Irish, John Larson, Scott Shenker, Howard Sturgis, Dan Swinehart, Doug Terry PODC 1987

  5. Motivation: Replicated Databases • Xerox Clearinghouse: data is replicated at 300 sites worldwide, and an update entered at any site has to be forwarded to the other 299. • Financial distributed systems: data and code are replicated worldwide; there is a 7-hour window between the stock market closing in Tokyo and opening in NYC.

  6. Approach • Naive • Direct Mail • Epidemic • Anti-entropy • Rumor-mongering

  7. Direct Mail • Timely: the update is immediately mailed from the entry site to all other sites • Not entirely reliable: the entry site may have incomplete information about the other sites, and mail may be lost [figure: PostMail messages from the infectious node to susceptible nodes]

  8. Anti-Entropy • Extremely reliable: every site periodically resolves differences with a randomly chosen site • Slow and expensive: the exchange examines the contents of the entire databases, and database content is sent over the network [figure: ResolveDiff exchanges between the infectious node and susceptible nodes]

  9. Anti-Entropy: Push and Pull [figure: PUSH, PULL, and PUSH-PULL exchanges between infectious and susceptible nodes; in the push-only and pull-only cases some nodes remain not updated after an exchange]
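
A minimal sketch of these exchanges, assuming each site keeps a key -> (value, timestamp) store with last-writer-wins resolution (the Site class and function names are illustrative, not the paper's interface):

    import random

    class Site:
        def __init__(self, name):
            self.name = name
            self.db = {}  # key -> (value, timestamp)

        def update(self, key, value, ts):
            # keep the newer version of each item (last-writer-wins)
            if key not in self.db or self.db[key][1] < ts:
                self.db[key] = (value, ts)

    def resolve_difference(p, q, mode="push-pull"):
        # one anti-entropy exchange between sites p and q
        if mode in ("push", "push-pull"):      # p pushes its entries to q
            for key, (value, ts) in p.db.items():
                q.update(key, value, ts)
        if mode in ("pull", "push-pull"):      # p pulls q's entries
            for key, (value, ts) in q.db.items():
                p.update(key, value, ts)

    def anti_entropy_round(sites, mode="push-pull"):
        # every site resolves differences with one randomly chosen partner
        for p in sites:
            q = random.choice([s for s in sites if s is not p])
            resolve_difference(p, q, mode)

In this model, a susceptible site that initiates a push-only exchange with an infectious partner learns nothing, and an infectious initiator of a pull-only exchange does not help its susceptible partner; push-pull covers both directions.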

  10. Pull > Push • p_i: probability that a node is still susceptible after the i-th round • If anti-entropy is a back-up for, e.g., direct mail, pull converges faster than push and therefore gives better delay [plot: p_i versus round number for pull and push]
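
The claim can be made precise with the recurrences from the paper's analysis, where n is the number of sites and a site stays susceptible after a round only if every exchange in that round fails to inform it:

    pull:  p_{i+1} = p_i^2
    push:  p_{i+1} = p_i * (1 - 1/n)^(n * (1 - p_i))  ≈  p_i / e   (for small p_i)

In the back-up scenario p_i is already small, so pull shrinks the residue quadratically per round while push only shrinks it by roughly a constant factor per round, which is the sense in which pull converges faster.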

  11. Anti-Entropy: Optimizations • Checksums • Exchange checksums first and compare the databases only if the checksums disagree [saves network traffic] • As the network size increases, the time to distribute an update to all sites increases [more chance of a checksum mismatch] • Recent update list • Exchange the list of updates made within the last time T, apply it, and recompute the checksums • Compare the databases only if the new checksums still disagree • The choice of T is critical • Inverted index of the database by timestamp • Exchange updates in reverse timestamp order, recomputing checksums until they match • Costs: an additional inverted index at each site, and time synchronization
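
A sketch of the checksum-first exchange with a recent-update list, reusing the Site store from the sketch above (WINDOW_T and the helper names are illustrative assumptions; picking T well is exactly the difficulty the slide notes):

    import hashlib
    import time

    WINDOW_T = 3600.0   # the "recent" window T, in seconds (illustrative value)

    def db_checksum(site):
        # order-independent digest of the whole database
        digest = hashlib.sha1()
        for key in sorted(site.db):
            value, ts = site.db[key]
            digest.update(f"{key}|{value}|{ts}".encode())
        return digest.hexdigest()

    def recent_updates(site, now):
        return {k: v for k, v in site.db.items() if v[1] >= now - WINDOW_T}

    def optimized_resolve(p, q, now=None):
        now = time.time() if now is None else now
        if db_checksum(p) == db_checksum(q):
            return                                # databases agree: only checksums were sent
        # exchange the recent update lists first, then re-check the checksums
        for key, (value, ts) in recent_updates(q, now).items():
            p.update(key, value, ts)
        for key, (value, ts) in recent_updates(p, now).items():
            q.update(key, value, ts)
        if db_checksum(p) != db_checksum(q):
            resolve_difference(p, q)              # fall back to the full comparison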

  12-13. Complex Epidemics ("Not Anti-Entropy"): Rumor Spreading • Less expensive: requires fewer resources, so it can be run more frequently than anti-entropy • Less reliable: some chance that an update will not reach all sites [figure: susceptible, infectious, and removed nodes]

  14. Designing a Good Epidemic • Residue: the number of sites that have not received the update when the epidemic ends • Traffic: the average number of messages sent from a typical site • Delay • t_avg: the average difference between the initial injection of the update and its arrival at a site • t_last: the delay until reception by the last site that receives the update during the epidemic
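
These metrics can be computed directly from the outcome of one run; a small helper, assuming delivery_times maps each site to the arrival time of the update (None if it never arrived) and messages_sent counts the messages each site sent:

    def epidemic_metrics(delivery_times, messages_sent, inject_time=0.0):
        n = len(delivery_times)
        received = [t for t in delivery_times.values() if t is not None]
        residue = (n - len(received)) / n              # fraction of sites the epidemic missed
        traffic = sum(messages_sent.values()) / n      # average messages per site
        t_avg = sum(t - inject_time for t in received) / len(received) if received else None
        t_last = max(received) - inject_time if received else None
        return residue, traffic, t_avg, t_last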

  15. Variants of Rumor Spreading • Blind vs. Feedback • Blind: the sender loses interest with probability 1/k regardless of the recipient • Feedback: the sender loses interest with probability 1/k only if the recipient already knows the rumor • Counter vs. Coin • Counter: lose interest only after k unnecessary contacts • Push vs. Pull (a simulation sketch of these variants follows below)
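
A minimal push rumor-mongering simulation covering these knobs (the function and its defaults are illustrative; k, feedback, and counter correspond to the choices above):

    import random

    def rumor_mongering(n, k=2, feedback=True, counter=True, seed=None):
        # returns (residue, traffic, rounds) for push rumor spreading over n sites
        rng = random.Random(seed)
        state = ["s"] * n          # 's' susceptible, 'i' infectious, 'r' removed
        counts = [0] * n           # contacts counted toward losing interest
        state[0] = "i"             # the update is injected at site 0
        messages = rounds = 0

        while "i" in state:
            rounds += 1
            for site in [x for x in range(n) if state[x] == "i"]:
                target = rng.choice([x for x in range(n) if x != site])
                messages += 1
                wasted = state[target] != "s"    # recipient already knew the rumor
                if state[target] == "s":
                    state[target] = "i"          # recipient becomes infectious
                # decide whether the sender loses interest (becomes removed)
                if feedback and not wasted:
                    continue                     # feedback: react only to wasted contacts
                if counter:
                    counts[site] += 1
                    if counts[site] >= k:        # counter: remove after k such contacts
                        state[site] = "r"
                elif rng.random() < 1.0 / k:     # coin: lose interest with probability 1/k
                    state[site] = "r"

        residue = state.count("s") / n
        traffic = messages / n
        return residue, traffic, rounds

    # example: 1000 sites, feedback + counter with k = 2
    print(rumor_mongering(1000, k=2, seed=1))

Sweeping k, feedback, and counter over many seeds gives residue and traffic numbers comparable to the trade-offs discussed in the paper.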

  16. Problem with Deletion • Problem • The absence of an item does not spread • Propagation of old copies of a deleted item would re-insert the item at sites that have already deleted it • Solution • Replace the deleted item with a Death Certificate (DC); a sketch follows below
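
A sketch of the death-certificate idea on top of the Site store above, where a deletion is itself an update that propagates and wins over older copies (the tombstone marker and DC_LIFETIME are illustrative assumptions; the handling of dormant certificates is not shown):

    import time

    TOMBSTONE = "__DEATH_CERTIFICATE__"    # illustrative marker value
    DC_LIFETIME = 30 * 24 * 3600.0         # how long a certificate is retained (assumed)

    def delete_item(site, key, ts=None):
        # replace the item with a death certificate so the deletion spreads
        # through anti-entropy and rumor mongering like any other update
        ts = time.time() if ts is None else ts
        site.db[key] = (TOMBSTONE, ts)

    def expire_certificates(site, now=None):
        # eventually drop certificates that are old enough that stale copies
        # of the deleted item should no longer be circulating
        now = time.time() if now is None else now
        for key, (value, ts) in list(site.db.items()):
            if value == TOMBSTONE and now - ts > DC_LIFETIME:
                del site.db[key]

Because Site.update keeps the entry with the newer timestamp, an old copy of the deleted item loses to the certificate instead of re-inserting the data.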

  17. Discussion • Direct Mail or Epidemics? • Economics and Industry? • Anti-entropy or Rumor?

  18. Bimodal Multicast Ken Birman, Mark Hayden, Oznur Ozkasap, Zhen Xiao, Mihai Budiu, Yaron Minsky ACM TOCS 1999

  19. Dilemma • Application is extremely critical: stock market, air traffic control, medical system • Hence need a strong model, guarantees • But these applications often have a soft-realtime subsystem • Steady data generation • May need to deliver over a large scale

  20. Probabilistic Broadcast: pbcast • Atomicity: bimodal delivery guarantee • almost all or almost none • Throughput stability: variation can be characterized • Ordering: FIFO per sender • Multicast stability: safely garbage collected (no dormant death certificate) • Detection of lost messages • Scalability: cost is a function of network size • Soft failure recovery: bounded number of recoveries from buffer overflow, transient network.

  21. Pbcast 2-stage Protocol • Stage 1: Best effort dissemination • Hierarchical broadcast • Unreliable best-effort approach • Stage 2: Anti-entropy • Exchange digest and correct loss • Probabilistic end-to-end

  22. Pbcast: Best-Effort Dissemination • IP multicast or "virtual" multicast spanning trees • The sender randomly generates a spanning tree • Neighbors forward based on the tree identifier • The number of random trees can be tuned (a construction sketch follows after the figure below)

  23-26. Pbcast: Random Spanning Tree [figure: animation of a random spanning tree being built over the nodes P, Q, R, S]
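
One way such a tree could work in code: the message carries a tree identifier, every node knows the overlay graph, and each node deterministically rebuilds the same randomized tree from that identifier and forwards only to its children. This is an illustration of the idea under those assumptions, not the paper's exact construction:

    import random

    def random_spanning_tree(overlay, root, tree_id):
        # overlay: node -> list of neighbor nodes; returns node -> list of children
        rng = random.Random(f"{root}:{tree_id}")   # shared, deterministic seed
        parent = {root: None}
        frontier = [root]
        while frontier:
            node = frontier.pop()
            neighbors = sorted(overlay[node])
            rng.shuffle(neighbors)                 # randomize the tree's shape
            for nbr in neighbors:
                if nbr not in parent:
                    parent[nbr] = node
                    frontier.append(nbr)
        children = {node: [] for node in overlay}
        for node, par in parent.items():
            if par is not None:
                children[par].append(node)
        return children

    # example overlay over the nodes P, Q, R, S from the figure
    overlay = {"P": ["Q", "R"], "Q": ["P", "S"], "R": ["P", "S"], "S": ["Q", "R"]}
    print(random_spanning_tree(overlay, root="P", tree_id=7))

Because the construction is deterministic given the identifier, every node rebuilds the same tree, which is what lets neighbors forward based on the tree identifier alone.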

  27. Pbcast: Hierarchical Multicast [figure: message m1 reaches every node; P = {m1}, Q = {m1}, R = {m1}, S = {m1}]

  28. Pbcast: Hierarchical Multicast [figure: after m1, m2, m3 are sent; P = {m1, m2}, Q = {m1, m2}, R = {m1, m2, m3}, S = {m1, m2, m3}]

  29. Pbcast: Hierarchical Multicast [figure: after m4 is sent; P = {m1, m2}, Q = {m1, m2}, R = {m1, m2, m3, m4}, S = {m1, m2, m3}, so some nodes have missed messages]

  30. Pbcast: Two-Phase Anti-Entropy • Progresses in rounds • In each round: • gossip a summary (digest) to randomly chosen nodes • recipients solicit any messages they find they are lacking • the missing messages are resent (a sketch follows below)
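
A sketch of one such round per process, assuming each process buffers recently received messages keyed by (sender, sequence number) and gossips its digest to a few random peers (the Process class, FANOUT, and the transfer step are illustrative):

    import random

    FANOUT = 3    # peers contacted per round (illustrative value)

    class Process:
        def __init__(self, name):
            self.name = name
            self.peers = []        # other Process objects in the group
            self.buffer = {}       # (sender, seqno) -> message payload

        def digest(self):
            # summary of everything received so far (stage 1 plus earlier repairs)
            return set(self.buffer)

        def gossip_round(self):
            for peer in random.sample(self.peers, min(FANOUT, len(self.peers))):
                # 1. gossip the digest to a randomly chosen peer
                summary = self.digest()
                # 2. the peer works out which messages it is lacking ...
                missing = summary - peer.digest()
                # 3. ... solicits them, and they are resent
                for msg_id in missing:
                    peer.buffer[msg_id] = self.buffer[msg_id]

In the real protocol the buffer is garbage-collected after a bounded number of rounds, which relates to the multicast-stability property listed on slide 20.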

  31. Pbcast: Anti-Entropy [figure: with P = {m1, m2}, Q = {m1, m2}, R = {m1, m2, m3}, S = {m1, m2, m3}, the gossip exchange causes m3 to be retransmitted to P and Q]

  32. Optimizations (2) • Independent numbering of rounds: round numbers are used for local decisions only (solicitation, garbage collection) • Random graphs for scalability: spanning trees built with network (LAN, WAN) knowledge • Multicast for some retransmissions

  33. Optimizations (1) • Soft-failure detection: retransmissions are serviced in the same round • Round retransmission limit: a maximum amount of retransmitted data per node per round • Cyclic retransmissions: avoid resending a message that might still be in transit • Most-recent-first transmissions: no starvation

  34. Analytic Results • Assume the initial unreliable multicast failed • Run the gossip rounds for some time • Probability of message loss: 5% • Probability of crash failure: 0.1%

  35. Bimodal Delivery Distribution [figure: delivery distribution as a function of the number of receiving processes k]

  36. Experimental Setup • SP2 is a large network of parallel computers • Nodes are UNIX workstations • Interconnect is an ATM network • Software is standard Internet stack (TCP, UDP) • 128 nodes on Cornell SP2 in Theory Center • pbcast was run on this

  37. Source-to-destination latency distributions [figure: latency distributions for groups of 8 processes, with one process forced to sleep]

  38. Throughput variation as a function of scale (25% of nodes perturbed) [figures: mean and standard deviation of pbcast throughput (msgs/sec) versus perturb rate for a 128-member group, and standard deviation of pbcast throughput versus process group size; the growth in variation is very small and slow]

  39. Discussion • Good enough for VoIP (low variance in delay)? • Random spanning trees • WAN, LAN, subgroup sizes, trans-oceanic delays • Asymmetric network conditions (cell phone vs. server)

  40. Exploring the Energy-Latency Trade-off for Broadcasts in Energy-Saving Sensor Networks Matthew J. Miller, Cigdem Sengul, Indranil Gupta ICDCS 2005

  41. WSN Applications • Code update: energy is the primary constraint • Attribute-based search: latency is the primary constraint [figure: messages propagating through the sensor network]

  42. Background: IEEE 802.11 PSM [figure: nodes A, B, and C exchange ATIM and DATA messages; messages M1, M2, M3 each advance one hop per beacon interval (BI), adding a BI of latency at every hop] Problem: high latency

  43. Reducing Latency: Immediate Broadcast [figure: A sends M1 and B forwards M2 immediately within the same beacon interval, but C is asleep] Problem: C does not get the update, so reliability decreases

  44. Probability-Based Broadcast Forwarding (PBBF) • Solution 1: immediate broadcast with probability p (with probability (1-p) the broadcast waits for the next beacon interval, as in PSM) • Solution 2: a node such as C remains awake with probability q [figure: nodes A, B, C within one BI]
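
A sketch of the two PBBF decisions, with p and q used exactly as on the slide (send_now and advertise_then_send stand in for the radio/MAC operations and are illustrative placeholders):

    import random

    def pbbf_forward(packet, p, send_now, advertise_then_send):
        # with probability p, rebroadcast immediately; awake neighbors hear it now
        if random.random() < p:
            send_now(packet)
        else:
            # otherwise fall back to the PSM path: advertise in the next ATIM
            # window and transmit in the following beacon interval
            advertise_then_send(packet)

    def pbbf_stay_awake(q):
        # after the ATIM window, a node with nothing advertised to it stays
        # awake with probability q so it can still catch immediate broadcasts
        return random.random() < q

    # p = 0, q = 0 degenerates to IEEE 802.11 PSM; p = 1, q = 1 is "always on"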

  45. Effect of p and q on Reliability, Energy, Latency • Reliability = pq + (1 - p): the pq term corresponds to immediate broadcasts and the (1 - p) term to broadcasts that wait for the next BI • p = 0, q = 0: IEEE 802.11 PSM • p = 1, q = 1: "always on" (the ATIM window overhead remains)
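
The expression follows by conditioning on the sender's choice: an immediate broadcast (probability p) is heard only if the receiving neighbor stayed awake (probability q), while a deferred broadcast (probability 1 - p) goes through the advertised PSM path and is received:

    R = p * q + (1 - p) * 1 = pq + (1 - p)

    example: p = 0.5, q = 0.8  ->  R = 0.5 * 0.8 + 0.5 = 0.9 per hop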

  46. Experimental Setup • Application: a base station periodically sends patches for sensors to apply • Simulation in ns-2, where • 50 nodes • Average One-Hop Neighborhood Size = 10 • Uniformly random node placement in square area • Topology connected • Full MAC layer

  47. Energy, E [plot: energy in joules/broadcast versus q, for PBBF, plain PSM, and no PSM] • E does not depend on p • For fixed p, E increases with q

  48. Latency, L [plot: average 5-hop latency versus q, for increasing p] • For fixed p, L decreases with q • For fixed q, increasing p gives lower L

  49. Reliability, R • R is measured as the average fraction of broadcasts received per node • For high p, R is small when q is small [plot: average fraction of broadcasts received versus q, for p = 0.5]

  50. Energy-Latency Trade-off [plot: joules/broadcast versus average per-hop broadcast latency (s), showing the achievable region for reliability ≥ 99%]
