1 / 44

Epidemic Techniques

Epidemic Techniques. Chiu Wah So (Kelvin). Database Replication. Why do we replicate database? Low latency High availability To achieve strong (sequential) consistency on replicated database. Not scalable. One primary database Quorum system (contact over half of replicas).

Download Presentation

Epidemic Techniques

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Epidemic Techniques Chiu Wah So (Kelvin)

  2. Database Replication • Why do we replicate database? • Low latency • High availability • To achieve strong (sequential) consistency on replicated database. Not scalable. • One primary database • Quorum system (contact over half of replicas)

  3. Database Replication • To scale and have high availability, we need a weaker consistency model • Eventual consistency: If all updating stops then eventually all replicas will converge to the identical values. • The two papers talk about how to use epidemic techniques to achieve eventual consistency to scale.

  4. Epidemic Techniques for replicated database • Epidemic Algorithms for Replicated Database Maintenance • Look at different epidemic algorithms to reduce bandwidth consumption to maintain replicated database • Astrolabe • Scalable and Robust information management system.

  5. Motivation on the first paper • Clearinghouse service maintains translations from names to machine addresses. (like DNS) • Problem: Using direct mail and anti-entropy, too much traffic to maintain consistency between highly replicated servers. Some key links are overloaded. • Look at techniques to reduce bandwidth: rumor spreading and spatial distributions.

  6. Direct mail • Direct mail: each server sends update to all other servers. • Advantage • Easy to implement • Good enough for small and static servers • Disadvantage: • Not scale (O(n) message for each update) • Updates may get lost.

  7. Anti-entropy • Servers pick random server and resolve differences. • 3 ways to resolve differences: push, pull, and push-pull.

  8. Anti-entropy Example

  9. Push

  10. Pull

  11. Push-Pull

  12. Anti-entropy Average time • Average converge in O(log(n)) steps

  13. Anti-entropy (2) • Very expensive to send the whole database across network to compare • Some techniques for optimizing comparing bandwidth • Compute Checksum • Exchange list of of recent updates. Then apply the update and compute checksum • Exchange updates in reverse chronological order until checksums agree • Still too much bandwidth…..

  14. Rumor spreading • Main idea: send out updates randomly. Instead of comparing whole database. • Three states: susceptible, infective, and removed. • Initially all servers are susceptible • Once server has a rumor (infective), and then pick a random server to send the rumor. • With probability 1/k, the server loses interest (removed) to spread rumor

  15. Rumor spreading (2) • But maybe not every server got the rumor. • With probability of remaining susceptible after the epidemic finishes: • Run anti-entropy infrequently to make sure every server gets the update.

  16. Three goals in rumor spreading • Low Residue: the probability of remaining susceptible when the epidemic finishes • Low Traffic: total traffic sent per site • Low Delay: Average time and the last time between the injection of an update and the arrival of update.

  17. Variations in rumor spreading • Many variations in rumor spreading. • Blind with Coin vs Feedback with Counter • Push vs Pull • Increase the smaller counter of the two • Connection limit • Hunting

  18. Feedback Counter vs Blind Prob. Feedback and Counter Blind and Probability

  19. Deletion and Death Certificates • Simple solution: death certificates and store for a fixed threshold of time • 2nd solution: dormant death certificates. Use two threshold time, and some servers keep it longer. 2 different timestamp: original timestamp and reactivation timestamp.

  20. Motivation of Spatial Distributions • Network is not uniform. • Certain key links in the network are overload. • Transatlantic links about 80 conversations, but on average conversations per link is 6. • Therefore, we should favor nearby neighbors.

  21. Spatial Distributions • Each servers, sort the list of sites by distance from s. • Select anti-entropy exchange partners from the sorted list according to a function f(i), i = index on the sorted list. • We can use f(i) = i^(-a), where a is the parameter for tuning spatial distribution.

  22. Spatial Distribution

  23. Next Paper: Astrolabe • The first paper talks about how to use rumor spreading and spatial distribution to reduce bandwidth. • But the storage grows O(n) and total bandwidth taken up by gossip grows O(n^2) • We need a more scalable solution.

  24. Astrolabe • Scalable and Robust information management system. • Monitors the dynamically changing state of a collection of distributed resources. • Reports summaries of this information to users.

  25. Four design goals • Scalability: scale through its zone hierarchy. Information is summarized before exchanges. • Flexibility: easy to install new aggregated function in a form of SQL aggregation query • Robustness: randomized peer-to-peer approach to exchange information. • Security: use signed certificates.

  26. Structure of Astrolabe • Structure of Astrolabe’s zones can be viewed as a trees. Leaves of this tree are hosts. • Each hosts run an astrolabe agent.

  27. Astrolabe Detail • Each agent is a virtual database. • Each agent has a path name. (For example: /USA/Cornell/pc3) • Each agent contains information, called MIB, for all the ancestor zone (For example, it contains /, /USA, /USA/Cornell) • Each ancestor MIB is generated using aggregation for scalability, instead of having O(n) entries.

  28. Astrolabe Detail (2) • Each zone can be viewed as relational table of the attributes of its child zone. • How do we gather or generate the information in the zone relational table? • Two ways: If the agent is in the zone, use aggregation to construct the MIB for the zone. Otherwise, gossip for the information. • Therefore, MIB for internal zones has to be small in order to scale.

  29. Aggregation • Aggregation Function Certificates contain information on how to collect and aggregate attributes of child zone MIBs into entries for internal zone MIBs. • Programmed in SQL-like language • Propagates by two ways: copying to parent (propagates like other normal attributes), and look for new AFC from its ancestor zone

  30. Aggregation (2) • Here are the SQL aggregation functions that are provided by Astrolabe.

  31. Gossip • Each zone has a small set of addresses for representative agents. • Representative agents are computed using an aggregation function, such as using load and longevity. • An agent gossips on behalf of those zones for which it is a representative.

  32. Gossip (2) • Periodically, the agent picks one of the child zones, and talks to one of the contact agents. (anti-entropy) • Then, it sends all the child zones at that level, and does the same thing for the higher levels in the tree up until the root level. • Then the two agents can compare which entries are newer and keep them.

  33. Example of gossip (taken from ken slides) swift.cs.cornell.edu cardinal.cs.cornell.edu

  34. Example of gossip (2) swift.cs.cornell.edu cardinal.cs.cornell.edu

  35. Example of gossip (3) swift.cs.cornell.edu cardinal.cs.cornell.edu

  36. Dynamically changing query output is visible system-wide SQL query “summarizes” data New Jersey San Francisco

  37. Membership • When an agent has not seen an update for a zone from a particular representative for some time Tfail. Remove its MIB. • Connect different pieces of the trees and add in new machines • IP multicast • Broadcast • Relatives • Administrators responsible for configuring the system by assigning zone names.

  38. Communication • Through Http and UDP(need to fragment the messages into more than one UDP packets) • If there is firewall, • Use ALG in core internet or an astrolabe agent in core internet.

  39. Security • Each zone is a management unit. • Children have a way to override policy enforced by parents. • Each zone: 2 pairs of key, CA and zone keys • Zone certificate • MIB certificate • Aggregation function certificate • Client certificate

  40. Related work • Directory Services (Clearinghouse, Bayou, Globe) • Network Monitoring • Event Notification • Sensor Networks • Peer-to-peer routing

  41. Measurement on expected # rounds

  42. Measurement on expected # rounds (2)

  43. Measurement on Latency Real Simulation

  44. Conclusion ??

More Related