CS 598IG Adv. Topics in Dist. Sys. Spring 2006. Indranil Gupta Lecture 3 January 24, 2006. DHT=Distributed Hash Table. A hash table allows you to insert, lookup and delete objects with keys A distributed hash table allows you to do the same in a distributed setting (objects=files)
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
CS 598IGAdv. Topics in Dist. Sys.Spring 2006
Indranil Gupta
Lecture 3
January 24, 2006
Say m=7
0
N16
N112
6 nodes
N96
N32
N45
N80
Say m=7
0
N16
N112
N96
N32
N45
N80
(similarly predecessors)
Say m=7
Finger Table at N80
0
N16
N112
i ft[i]
0 96
1 96
2 96
3 96
4 96
5 112
6 16
80 + 25
80 + 26
N96
N32
80 + 24
80 + 23
80 + 22
80 + 21
80 + 20
N45
N80
ith entry at peer with id n is first peer with id >=
Say m=7
0
N16
N112
N96
N32
N45
N80
File with key K42
stored here
Say m=7
0
N16
N112
N96
N32
Who has cnn.com/index.html?
(hashes to K42)
N45
N80
File cnn.com/index.html with
key K42 stored here
At node n, send query for key k to largest successor/finger entry < k
if none exist, send query to successor(n)
Say m=7
0
N16
N112
N96
N32
Who has cnn.com/index.html?
(hashes to K42)
N45
N80
File cnn.com/index.html with
key K42 stored here
At node n, send query for key k to largest successor/finger entry < k
if none exist, send query to successor(n)
Say m=7
0
N16
N112
All “arrows” are RPCs
N96
N32
Who has cnn.com/index.html?
(hashes to K42)
N45
N80
File cnn.com/index.html with
key K42 stored here
Search takes O(log(N)) time w.h.p.
Takes at most m steps: ok if is at most a constant multiplicative factor above N
Number of node identifiers in a range of
is O(log(N)) with high probability (why? SHA-1!)
So using successors in that range will be ok
Lookup fails
(N16 does not know N45)
Say m=7
0
N16
N112
X
N96
X
N32
Who has cnn.com/index.html?
(hashes to K42)
X
N45
N80
File cnn.com/index.html with
key K42 stored here
One solution: maintain r multiple successor entries
in case of failure, use successor entries
Say m=7
0
N16
N112
N96
X
N32
Who has cnn.com/index.html?
(hashes to K42)
N45
N80
File cnn.com/index.html with
key K42 stored here
Lookup fails
(N45 is dead)
Say m=7
0
N16
N112
N96
N32
Who has cnn.com/index.html?
(hashes to K42)
X
X
N45
N80
File cnn.com/index.html with
key K42 stored here
One solution: replicate file/key at r successors and predecessors
Say m=7
0
N16
N112
N96
N32
Who has cnn.com/index.html?
(hashes to K42)
K42 replicated
X
N45
N80
File cnn.com/index.html with
key K42 stored here
K42 replicated
All the time
Need to update successors and fingers, and copy keys
Introducer directs N40 to N32
N32 updates successor to N40
N40 initializes successor to N45, and inits fingers from it
N40 periodically talks to neighbors to update finger table
Say m=7
0
Stabilization
protocol
N16
N112
N96
N32
N40
N45
N80
N40 may need to copy some files/keys from N45
(files with fileid between 32 and 40)
Say m=7
0
N16
N112
N96
N32
N40
N45
N80
K34,K38
Large
variance
Solution: Each real node joins as r multiple virtual nodes
smaller load variation, lookup cost is O(log(N*r)= O(log(N)+log(r))
Average Messages per Lookup
log, as expected
Number of Nodes
500 nodes (avg. path len=5)
Stabilization runs every 30 s
1 joins&fails every 10 s
(3 fails/stabilization round)
=> 6% failed lookups
Numbers look fine here, but
does stabilization scale with N?
http://www.pdos.lcs.mit.edu/chord/
Student-led paper presentations
Projects
What new design techniques about p2p systems have you learnt in this class?
As you head out that door, think about at least 3 new design techniques (about p2p systems) that you have learnt in this lecture.