cs 525 advanced distributed systems spring 09
Download
Skip this Video
Download Presentation
CS 525 Advanced Distributed Systems Spring 09

Loading in 2 Seconds...

play fullscreen
1 / 40

CS 525 Advanced Distributed Systems Spring 09 - PowerPoint PPT Presentation


  • 254 Views
  • Uploaded on

CS 525 Advanced Distributed Systems Spring 09. (gatorlog.com). (epath.org). Indranil Gupta Lecture 7 More on Epidemics (or “Tipping Point Protocols”) February 12, 2009. Question…. What fraction of main roads need to be randomly knocked out before

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'CS 525 Advanced Distributed Systems Spring 09' - Ava


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
cs 525 advanced distributed systems spring 09

CS 525 Advanced Distributed SystemsSpring 09

(gatorlog.com)

(epath.org)

Indranil Gupta

Lecture 7

More on Epidemics (or “Tipping Point Protocols”)

February 12, 2009

question
Question…

What fraction of main roads need to be randomly knocked out before

source and destination are completely cut off?

Source

Destination

critical value answer 0 5

Tipping Point!

Critical Value? Answer = 0.5

Source

Destination

(Comes from Percolation Theory)

tipping point
“Tipping Point”

[Malcolm Gladwell, The Tipping Point, Little Brown and Company, ISBN: 0316346624]

Tipping is that (magic) moment when an idea, trend or social behavior crosses a threshold, and spreads like wildfire.

epidemic protocols
Epidemic Protocols
  • A specific class of tipping point protocols
  • Local behavior at each node – probabilistic
  • Determines global, emergent behavior at the scale of the distributed system
  • As one tunes up the local probabilities, the global behavior may undergo a threshold behavior (or, a phase change)
  • Three papers:
    • Epidemic algorithms
    • Bimodal multicast
    • PBBF (sensor networks)
epidemic algorithms for replicated database maintenance

Epidemic Algorithms for Replicated Database Maintenance

Alan Demers et. al.

Xerox Palo Alto Research Center

PODC 1987

[Some slides borrowed from presentation by: R. Ganti and P. Jayachandran]

introduction
Introduction
  • Maintain mutual consistency of updates in a distributed and replicated database
  • Used in Clearinghouse database – developed in Xerox PARC and used for many years
  • First cut approaches
    • Direct mail: send updates to all nodes
      • Timely and efficient, but unreliable
    • Anti-entropy: exchange database content with random site
      • Reliable, but slower than direct mail and uses more resources
    • Rumor mongering: exchange only ‘hot rumor’ updates
      • Less reliable than anti-entropy, but uses fewer resources
epidemic multicast

(from Lecture 1)

Epidemic Multicast

Infected

Protocol rounds (local clock)

b random targets per round

Gossip Message (UDP)

Uninfected

epidemic multicast push
Epidemic Multicast (Push)

Infected

Protocol rounds (local clock)

b random targets per round

Gossip Message (UDP)

Uninfected

epidemic multicast pull
Epidemic Multicast (Pull)

Infected

Gossip Message (UDP)

Uninfected

Protocol rounds (local clock)

b random targets per round

pull push
Pull > Push

pi – Probability that a node is susceptible after the ith round

  • Pull converges faster than push, thus providing better delay
  • Push-pull hybrid variant possible (see Karp and Shenker’s “Randomized Rumor Spreading”)

Pull

Push

anti entropy optimizations
Anti-entropy: Optimizations
  • Maintain checksum, compare databases if checksums unequal
  • Maintain recent update lists for time T, exchange lists first
  • Maintain inverted index of database by timestamp; exchange information in reverse timestamp order, incrementally re-compute checksums
epidemic flavors
Epidemic Flavors
  • Blind vs. Feedback
    • Blind: lose interest to gossip with probability 1/k every time you gossip
    • Feedback: Loss of interest with probability 1/k only when recipient already knows the rumor
  • Counter vs. Coin
    • Coin: above variants
    • Counter: Lose interest completely after k unnecessary contacts. Can be combined with blind.
  • Push vs. Pull
deletion and death certificates
Deletion and Death Certificates
  • Absence of item does not spread; On the contrary, it can get resurrected!
  • Use of death certificates (DCs) – when a node receives a DC, old copy of data is deleted
  • How long to maintain a DC?
    • Typically twice (or some multiple of) the time to spread the information
    • Alternately, use Chandy and Lamport snapshot algorithm to ensure all nodes have received
    • Certain sites maintain dormantDCs for a longer duration; re-awakened if item seen again
performance metrics
Performance Metrics
  • Residue: Fraction of susceptibles left when epidemic finishes
  • Traffic: (Total update traffic) / (No. of sites)
  • Delay: Average time for receiving update and maximum time for receiving update
  • Some results:
    • Counters and feedback improve delay
    • Pull provides lower delay than push
performance evaluation
Performance Evaluation

Tipping Point Behavior

discussion
Discussion

Pick your favorite:

  • Push vs. pull vs. push-pull
    • Name one disadvantage of each
  • Direct mail vs. anti-entropy vs. rumor mongering
    • Name one disadvantage of each
  • Random neigbhor picking
    • Disadvantage in wired networks?
    • In Sensor network?
bimodal multicast

Bimodal Multicast

Kenneth P. Birman et. al.

ACM TOCS 1999

[Some slides borrowed from presentation by: W. Fagen and L. Cook]

vs pbcast
Atomicity: All or none delivery

Multicast stability: Reliable immediately delivery of messages

Scalability: Bad. Costs >= quadratic with group size.

Ordering

Atomicity: Bimodal delivery guarantee, almost all or almost none (immediately)

Multicast stability: Reliable eventual delivery of messages

Scalability: Costs logarithmic w.r.t. network size. Throughput stability.

Ordering

Vs. Pbcast

Traditional Multicast

Pbcast

pbcast probabilistic broadcast protocol
Pbcast: Probabilistic Broadcast Protocol
  • Pbcast has two stages:
    • Unreliable, hierarchical, best-effort broadcast. Eg. IP Multicast
    • Two-phase anti-entropy protocol: runs simultaneously with the broadcast messages
      • First phase detects message loss
      • Second phase corrects such losses
the second stage
The second stage
  • Anti-entropy round:
    • Gossip Messages:
      • Each process chooses another random process and sends a summary of its recent messages
    • Solicitation Messages:
      • Messages sent back to the sender of the gossip message requesting a resend of a given set of messages (not necessarily the original source)
    • Message Resend:
      • Upon reception of a solicitation message, the sender resends that message
  • Protocol parameters at each node
    • # of rounds and # of processes contacted in each round
    • Product of above two parameters called fanout
optimizations
Optimizations
  • Soft-Failure Detection: Retransmission requests served only if received recently; protects against congestion caused due to redundant retransmissions
  • Round Retransmission Limit: Limit the no. of retransmissions in a round; spread overhead in space and time
  • Most-Recent-First Retransmission: prefer recent messages
  • Independent Numbering of Rounds: Allows delivery and garbage collection to be entirely a local decision
  • Multicast for Some Retransmissions
bimodality of pbcast
Bimodality of Pbcast

Logarithmic

Y-axis

Almost none

Almost all

latency for delivery
Latency for Delivery

Logarithmic growth

discussion27
Discussion
  • Disadvantages of Bimodal Multicast?
    • When would wasteful messages be sent?
  • What happens when
    • Rate of injection of multicasts is very very low?
    • IP multicast is very very reliable?
    • IP multicast is very very unreliable?
pbbf probability based broadcast forwarding

PBBF: Probability-Based Broadcast Forwarding

Cigdem Sengul and Matt Miller

ICDCS 2005 and ACM TOSN 2008

(Originated from a 525 Project)

broadcast in an ad hoc network
Broadcast in an Ad-Hoc Network
  • Ad-hoc sensor network (Grid example below)
  • One node has a piece of information that it needs to broadcast: e.g., (1) code update, (2) query
  • Simple approach: each node floods received message to all its neighbors
    • Disadvantages?
ieee 802 11 psm
IEEE 802.11 PSM

A real, stable MAC protocol (similar results for S-MAC, T-MAC, etc.)

  • Nodes are assumed to be synchronized
  • Every beacon interval (BI), all nodes wake up for an ATIM window (AW)
  • During the AW, nodes advertise any traffic that they have queued
  • After the AW, nodes remain active if they expect to send or receive data based on advertisements; otherwise nodes return to sleep until the next BI
protocol extreme 1

N2

N1

N3

A

D

A

D

A

D

A

Protocol Extreme #1

N1

N2

N3

= ATIM Pkt

= Data Pkt

protocol extreme 2

N2

N1

N3

D

A

D

D

D

A

Protocol Extreme #2

N1

N2

N3

= ATIM Pkt

= Data Pkt

probability based broadcast forwarding pbbf
Probability-Based Broadcast Forwarding (PBBF)
  • Introduce two parameters to sleep scheduling protocols: p and q
  • When a node is scheduled to sleep, it will remain active with probability q
  • When a node receives a broadcast, it rebroadcasts immediately with probability p
    • With probability (1-p), the node will wait and advertise the packet during the next AW before rebroadcasting the packet
analysis reliability
Analysis: Reliability

Tipping Point!

  • Phase transition when:

pq + (1-p) ≈ 0.8-0.85

  • Larger than traditional bond percolation threshold
    • Boundary effects
    • Different metric
  • Still shows phase transition

p=0.25

p=0.37

Fraction of Broadcasts

Received by 99% of Nodes

p=0.5

p=0.75

q

application energy and latency
Application: Energy and Latency

Energy

Joules/Broadcast

Latency

Average 5-Hop Latency

PBBF

Increasing p

q

q

≈ 1 + q * [(BI - AW)/AW]

  • Ns2 simulation: 50 nodes, uniform placement, 10 avg. neighbors
adaptive pbbf
Adaptive PBBF

Achievable

Region

Energy

Latency

adaptive pbbf tosn paper
Adaptive PBBF (TOSN paper)

1.0

  • Dynamically adjusting p and q to converge to user-specified QoS metrics
    • Code updates prefer reliability overl latency
    • Queries prefer latency over reliability
  • Can specify any 2 of energy, latency, and reliability
  • Subject to those constraints, p and q are adjusted to achieve the highest reliability possible

q

0.5

p

0.0

Time

discussion38
Discussion
  • PBBF: bond percolation (remove roads from city)
  • Haas et al paper (Infocom): site percolation
    • Remove intersections/junctions (not roads) from city
  • Site percolation and bond percolation have different thresholds and behaviors
  • Hybrid possible? (like push-pull?)
  • What about over-hearing optimizations? (like feedback)
question39
Question…

Are there other tipping point protocols…?

Source

Destination

next week onwards
Next Week Onwards
  • Student Presentations start (see instructions)
  • Reviews needed (see instructions)
  • Project Meetings start (see newsgroup)
    • Think about which testbed you need access to: PlanetLab, Emulab, Cirrus
  • Tomorrow: Yahoo! Training seminar
ad