
Peer-to-Peer Systems

Presented By: Nazanin Dehghani

Supervisor: Dr. Naser Yazdani

Peer-to-Peer Architecture

A peer-to-peer architecture gives a distributed system a more dynamic structure.

In the client-server model, by contrast, every client is bound statically to a specific server.

Peer-to-Peer Definition
  • “a computer network in which each computer in the network can act as a client or server for the other computers in the network, allowing shared access to files and peripherals without the need for a central server.”
Peer-to-Peer Systems
  • Properties
    • Nodes have to share their resources, such as memory, bandwidth, and processing power, directly
    • P2P networks should be robust to node churn
Primitives
  • Common Primitives
    • Join: how do I begin participating?
    • Publish: how do I advertise my file?
    • Search: how do I find a file?
    • Fetch: how do I retrieve a file? (a minimal interface sketch follows below)
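
To make these primitives concrete, here is a minimal, hypothetical Python interface; the class and method names are illustrative and are not taken from the presentation or from any particular system.

    from abc import ABC, abstractmethod

    class P2PNode(ABC):
        """Illustrative interface for the four common P2P primitives."""

        @abstractmethod
        def join(self, bootstrap_address: str) -> None:
            """Begin participating by contacting a known bootstrap peer."""

        @abstractmethod
        def publish(self, filename: str, data: bytes) -> None:
            """Advertise a file so that other peers can discover it."""

        @abstractmethod
        def search(self, filename: str) -> list[str]:
            """Return addresses of peers believed to hold the file."""

        @abstractmethod
        def fetch(self, filename: str, peer_address: str) -> bytes:
            """Retrieve the file's contents from a specific peer."""

Each system in the rest of the presentation (Napster, Gnutella, DHTs, BitTorrent) fills in these operations differently.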
Architecture of P2P Systems
  • Overlay Network
  • Graph Structure
    • Structured
      • The overlay topology is tightly controlled; nodes are aware of and exploit it
    • Unstructured
      • Overlay links are formed arbitrarily, with no globally imposed structure
How Did it Start?
  • A killer application: Napster
    • Free music over the Internet
  • Key idea: share the content, storage and bandwidth of individual (home) users


Model
  • Each user stores a subset of files
  • Each user has access to (can download) files from all users in the system
Main Challenge
  • Find where a particular file is stored

(Diagram: nodes A–F; one node issues the query "E?" to locate the node storing file E.)

Other Challenges
  • Scale: up to hundreds of thousands or millions of machines
  • Dynamicity: machines can come and go any time
Napster
  • Assume a centralized index system that maps files (songs) to machines that are alive
  • How to find a file (song):
    • Query the index system → it returns a machine that stores the required file
      • Ideally this is the closest/least-loaded machine
    • ftp the file from that machine
  • Advantages:
    • Simplicity, easy to implement sophisticated search engines on top of the index system
  • Disadvantages:
    • Robustness, scalability (?)
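
As a rough illustration of the centralized-index idea (a toy sketch only, not Napster's actual protocol; every name here is invented for the example):

    class CentralIndex:
        """Toy centralized index: maps a song name to the machines that store it."""

        def __init__(self):
            self.index = {}                      # song -> set of machine addresses

        def register(self, machine, songs):
            """A machine announces the songs it stores when it comes online."""
            for song in songs:
                self.index.setdefault(song, set()).add(machine)

        def lookup(self, song):
            """Return some machine storing the song (ideally the closest or least loaded)."""
            machines = self.index.get(song)
            return next(iter(machines)) if machines else None

    index = CentralIndex()
    index.register("m5", ["E"])
    index.register("m6", ["F"])
    print(index.lookup("E"))                     # -> "m5"; the client then fetches E from m5 directly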
Napster: Example

(Diagram: a central index maps files to machines: m1→A, m2→B, m3→C, m4→D, m5→E, m6→F. A peer asks the index "E?", the index answers "m5", and the peer downloads E directly from m5.)

Gnutella
  • Distribute file location
  • Idea: flood the request
  • How to find a file:
    • Send request to all neighbors
    • Neighbors recursively multicast the request
    • Eventually a machine that has the file receives the request, and it sends back the answer
  • Advantages:
    • Totally decentralized, highly robust
  • Disadvantages:
    • Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL); a small flooding sketch follows below
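
A minimal sketch of TTL-bounded flooding, written as a synchronous recursion over an assumed in-memory neighbor graph; real Gnutella uses asynchronous messages, and the node attributes used here (files, neighbors, address) are hypothetical.

    def flood_query(node, filename, ttl, seen=None):
        """Toy Gnutella-style flooding: ask neighbors recursively until the TTL expires."""
        seen = seen if seen is not None else set()
        if ttl < 0 or node.address in seen:
            return None
        seen.add(node.address)
        if filename in node.files:               # this machine has the file: answer travels back
            return node.address
        for neighbor in node.neighbors:          # otherwise forward the request to all neighbors
            hit = flood_query(neighbor, filename, ttl - 1, seen)
            if hit is not None:
                return hit
        return None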
Gnutella
  • Queries are flooded for a bounded number of hops
  • No guarantees on recall

(Diagram: the query "xyz" is flooded through the overlay until it reaches a node holding "xyz".)

Distributed Hash Tables (DHTs)
  • Abstraction: a distributed hash-table data structure
    • insert(id, item);
    • item = query(id); (or lookup(id);)
    • Note: item can be anything: a data object, document, file, pointer to a file…
  • Proposals
    • CAN, Chord, Kademlia, Pastry, Tapestry, etc
DHT Design Goals
  • Make sure that an item (file) in the system can always be found
  • Scales to hundreds of thousands of nodes
  • Handles rapid arrival and failure of nodes
Structured Networks
  • Distributed Hash Tables (DHTs)
  • Hash table interface: put(key,item), get(key)
  • O(log n) hops
  • Guarantees on recall

(Diagram: (key, item) pairs are spread across the nodes of the overlay; put(K1, I1) stores item I1 at the node responsible for key K1, and get(K1) retrieves it from that node.)
Chord

In short: a peer-to-peer lookup service

Solves problem of locating a data item in a collection of distributed nodes, considering frequent node arrivals and departures

Core operation in most p2p systems is efficient location of data items

Supports just one operation: given a key, it maps the key onto a node

Chord Characteristics

Simplicity, provable correctness, and provable performance

Each Chord node needs routing information about only a few other nodes

Resolves lookups via messages to other nodes (iteratively or recursively)

Maintains routing information as nodes join and leave the system

Napster, Gnutella etc. vs. Chord

Compared to Napster and its centralized servers, Chord's decentralized design avoids single points of control or failure

Compared to Gnutella and its widespread use of broadcasts, Chord achieves scalability by keeping only a small amount of routing information at each node

Addressed Difficult Problems (1)

Load balance: distributed hash function, spreading keys evenly over nodes

Decentralization: Chord is fully distributed; no node is more important than any other, which improves robustness

Scalability: logarithmic growth of lookup costs with number of nodes in network, even very large systems are feasible

Addressed Difficult Problems (2)

Availability: Chord automatically adjusts its internal tables to ensure that the node responsible for a key can always be found

Consistent Hashing

Hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1

ID(node) = hash(IP, Port)

ID(key) = hash(key)

Properties of consistent hashing:

Function balances load: all nodes receive roughly the same number of keys

When an Nth node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location
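
A minimal sketch of these identifier assignments, assuming an m = 8 bit identifier space and SHA-1 truncated to m bits; the helper names are illustrative, not Chord's actual API.

    import hashlib

    M = 8                                         # identifier bits; real deployments use e.g. m = 160

    def chord_id(value: str) -> int:
        """Map a node address or a key to an m-bit identifier on the circle."""
        digest = hashlib.sha1(value.encode()).digest()
        return int.from_bytes(digest, "big") % (2 ** M)

    def successor(ident: int, node_ids: list) -> int:
        """First node whose identifier equals or follows ident on the circle."""
        ring = sorted(node_ids)
        for n in ring:
            if n >= ident:
                return n
        return ring[0]                            # wrap around the circle

    node_id = chord_id("10.0.0.5:4000")           # ID(node) = hash(IP, Port)
    key_id = chord_id("popeye.mp4")               # ID(key)  = hash(key)
    print(successor(key_id, [node_id, chord_id("10.0.0.7:4000")]))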

Successor Nodes

(Diagram: an identifier circle with positions 0–7, nodes at identifiers 0, 1, and 3, and keys 1, 2, and 6. successor(1) = 1, successor(2) = 3, successor(6) = 0.)

Node Joins and Departures

(Diagram: on the same identifier circle, node 7 joins and node 1 leaves; afterwards successor(6) = 7 and successor(1) = 3, so keys 6 and 1 move to those nodes.)

Scalable Key Location

A very small amount of routing information suffices to implement consistent hashing in a distributed environment

Each node need only be aware of its successor node on the circle

Queries for a given identifier can be passed around the circle via these successor pointers

Resolution scheme correct, BUT inefficient: it may require traversing all N nodes!

Acceleration of Lookups

Lookups are accelerated by maintaining additional routing information

Each node maintains a routing table with (at most) m entries, called the finger table (where the identifier space has 2^m identifiers)

The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle (clarification on next slide)

s = successor(n + 2^(i-1)) (all arithmetic is modulo 2^m)

s is called the i-th finger of node n, denoted by n.finger(i).node
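
A small sketch of how the finger starts could be computed, reusing the M and successor() helpers from the consistent-hashing sketch above; this is illustrative, not Chord's reference implementation.

    def finger_table(n: int, node_ids: list) -> list:
        """finger[i] = successor(n + 2^(i-1)) for i = 1..M, arithmetic mod 2^M."""
        return [successor((n + 2 ** (i - 1)) % (2 ** M), node_ids) for i in range(1, M + 1)]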

Finger Tables (1)

(Diagram: identifier circle 0–7 with nodes 0, 1, and 3.)

Finger table of node 0 (stores key 6):
  • start 1, interval [1,2), succ. 1
  • start 2, interval [2,4), succ. 3
  • start 4, interval [4,0), succ. 0

Finger table of node 1 (stores key 1):
  • start 2, interval [2,3), succ. 3
  • start 3, interval [3,5), succ. 3
  • start 5, interval [5,1), succ. 0

Finger table of node 3 (stores key 2):
  • start 4, interval [4,5), succ. 0
  • start 5, interval [5,7), succ. 0
  • start 7, interval [7,3), succ. 0

Finger Tables (2) - characteristics

Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away

A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k

Repeated queries to nodes that immediately precede the given key will eventually lead to the key's successor

Node Joins – with Finger Tables

(Diagram: node 6 joins the circle of nodes {0, 1, 3}. Node 6 builds its finger table (start 7, 0, 2; intervals [7,0), [0,2), [2,6); successors 0, 0, 3) and takes over key 6 from node 0, while the existing nodes update the finger entries that should now point to node 6.)

Node Departures – with Finger Tables

(Diagram: node 1 leaves the circle of nodes {0, 1, 3, 6}. Key 1 moves to node 3, its new successor, and finger entries that pointed to node 1 are updated to point to node 3.)

Chord "Finger Table"

(Diagram: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.)

  • Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i
  • In other words, the i-th finger points 1/2^(n-i) of the way around the ring
Chord Routing
  • Upon receiving a query for item id, a node:
  • Checks whether it stores the item locally
  • If not, forwards the query to the largest node in its successor table that does not exceed id

(Diagram: a ring 0–7 with nodes 0, 1, 2, and 6, each holding a successor table of (i, id+2^i, succ.) entries: node 0 → (1→1, 2→2, 4→0); node 1 → (2→2, 3→6, 5→6); node 2 → (3→6, 4→6, 6→6); node 6 → (7→0, 0→0, 2→2). Node 0 stores item 7 and node 1 stores item 1; a query(7) issued at node 1 is forwarded to node 6 and then to node 0, which holds the item.)
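
A rough sketch of this forwarding loop over an assumed in-memory ring; it ignores real message passing and failure handling, and the data layout (a dict mapping each node id to its successor and finger list) is hypothetical.

    def in_interval(x, a, b):
        """True if x lies in the circular interval (a, b] of the identifier ring."""
        return (a < x <= b) if a < b else (x > a or x <= b)

    def closest_preceding_finger(n, key, fingers):
        """Finger of n that most closely precedes key on the circle, if any."""
        best = None
        for f in fingers:
            if in_interval(f, n, key) and (best is None or in_interval(best, n, f)):
                best = f
        return best

    def lookup(start, key, nodes):
        """Follow fingers until reaching the node whose successor is responsible for key."""
        n = start
        while not in_interval(key, n, nodes[n]["successor"]):
            nxt = closest_preceding_finger(n, key, nodes[n]["fingers"])
            n = nxt if nxt is not None else nodes[n]["successor"]
        return nodes[n]["successor"]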

Node Join
  • Compute ID
  • Use an existing node to route to that ID in the ring.
    • Finds s = successor(id)
  • ask s for its predecessor, p
  • Splice self into ring just like a linked list
    • p->successor = me
    • me->successor = s
    • me->predecessor = p
    • s->predecessor = me
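
The same splice written as a toy in-memory sketch; real Chord also repairs finger tables and runs periodic stabilization, which this omits, and the ring representation is hypothetical.

    def splice_into_ring(nodes, me, s):
        """Insert node me given s = successor(me); nodes maps id -> {'successor', 'predecessor'}."""
        p = nodes[s]["predecessor"]              # ask s for its predecessor, p
        nodes[me] = {"successor": s, "predecessor": p}
        nodes[p]["successor"] = me               # p->successor = me
        nodes[s]["predecessor"] = me             # s->predecessor = me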
Chord Summary
  • Routing table size?
    • Log N fingers
  • Routing time?
    • Each hop is expected to halve the distance to the desired id => expect O(log N) hops.
Fairness
  • What if somebody only downloads and never uploads?
  • What is the policy?
    • Incentive mechanism


Fetching Data
  • Once we know which node(s) have the data we want...
  • Option 1: Fetch from a single peer
    • Problem: Have to fetch from a peer who has the whole file.
      • Peers are not useful sources until they have downloaded the whole file
      • At which point they probably log off. :)
    • How can we fix this?
Chunk Fetching
  • More than one node may have the file.
  • How to tell?
    • Must be able to distinguish identical files
    • Not necessarily same filename
    • Same filename not necessarily same file...
  • Use hash of file
    • Common: MD5, SHA-1, etc.
  • How to fetch?
    • Get bytes [0..8000] from A, [8001...16000] from B
    • Alternative: Erasure Codes
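
A toy illustration of range-based chunk fetching plus whole-file hash verification; the peer objects and their get_range() method are hypothetical.

    import hashlib

    CHUNK = 8000                                  # bytes per request, as in the [0..8000] example

    def fetch_file(total_size, peers, expected_sha1):
        """Fetch consecutive byte ranges from different peers, then verify the result."""
        data = bytearray()
        for i, start in enumerate(range(0, total_size, CHUNK)):
            end = min(start + CHUNK, total_size)
            peer = peers[i % len(peers)]          # round-robin over the available peers
            data += peer.get_range(start, end)
        if hashlib.sha1(data).hexdigest() != expected_sha1:
            raise ValueError("hash mismatch: corrupt download or a different file")
        return bytes(data)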


BitTorrent
  • Written by Bram Cohen (in Python) in 2001
  • “Pull-based” “swarming” approach
    • Each file split into smaller pieces
    • Nodes request desired pieces from neighbors
      • As opposed to parents pushing data that they receive
    • Pieces not downloaded in sequential order
  • Encourages contribution by all nodes
BitTorrent
  • Piece Selection
    • Rarest first
    • Random first selection
  • Peer Selection
    • Tit-for-tat
    • Optimistic unchoking
BitTorrent Swarm
  • Swarm
    • Set of peers all downloading the same file
    • Organized as a random mesh
  • Each node knows list of pieces downloaded by neighbors
  • Node requests pieces it does not own from neighbors
How a node enters a swarm for file “popeye.mp4”
  • File popeye.mp4.torrent hosted at a (well-known) webserver
  • The .torrent has address of tracker for file
  • The tracker, which runs on a webserver as well, keeps track of all peers downloading file
(Diagram, step 1: the peer fetches popeye.mp4.torrent from the well-known webserver, e.g. www.bittorrent.com.)

(Diagram, step 2: the peer contacts the tracker named in the .torrent and receives the addresses of the other peers downloading the file.)

(Diagram, step 3: the peer connects to those peers and joins the swarm.)

Contents of .torrent file
  • URL of tracker
  • Piece length – Usually 256 KB
  • SHA-1 hashes of each piece in file
    • For reliability
  • “files” – allows download of multiple files
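
A rough sketch of how the per-piece SHA-1 hashes could be produced; real .torrent files are bencoded and carry more metadata, so this shows only the hashing idea.

    import hashlib

    PIECE_LENGTH = 256 * 1024                     # 256 KB, the usual piece length

    def piece_hashes(path):
        """Return the SHA-1 digest of each fixed-size piece of the file."""
        hashes = []
        with open(path, "rb") as f:
            while True:
                piece = f.read(PIECE_LENGTH)
                if not piece:
                    break
                hashes.append(hashlib.sha1(piece).hexdigest())
        return hashes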
Terminology
  • Seed: peer with the entire file
    • Original Seed: The first seed
  • Leech: peer that’s downloading the file
    • Fairer term might have been “downloader”
Peer-peer transactions: Choosing pieces to request
  • Rarest-first: Look at all pieces at all peers, and request piece that’s owned by fewest peers
    • Increases diversity in the pieces downloaded
      • avoids case where a node and each of its peers have exactly the same pieces; increases throughput
    • Increases likelihood all pieces still available even if original seed leaves before any one node has downloaded entire file
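
A minimal sketch of rarest-first selection, assuming we already know which pieces each neighbor advertises; this is illustrative, not a real client's scheduler.

    from collections import Counter

    def rarest_first(needed_pieces, neighbor_pieces):
        """Pick the needed piece owned by the fewest neighbors.
        neighbor_pieces maps neighbor -> set of piece indices it owns."""
        counts = Counter()
        for pieces in neighbor_pieces.values():
            counts.update(pieces)
        available = [p for p in needed_pieces if counts[p] > 0]
        return min(available, key=lambda p: counts[p]) if available else None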
Choosing pieces to request
  • Random First Piece:
    • When peer starts to download, request random piece.
      • So as to assemble first complete piece quickly
      • Then participate in uploads
    • When first complete piece assembled, switch to rarest-first
Tit-for-tat as incentive to upload
  • Want to encourage all peers to contribute
  • Peer A is said to choke peer B if it (A) decides not to upload to B
  • Each peer (say A) unchokes at most 4 interested peers at any time
    • The three with the largest upload rates to A
      • Where the tit-for-tat comes in
    • Another randomly chosen (Optimistic Unchoke)
      • To periodically look for better choices
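
A toy version of this unchoke choice; real clients re-evaluate it every few seconds and measure rates per connection, and the names here are illustrative.

    import random

    def choose_unchoked(interested_peers, upload_rate_to_me):
        """Unchoke the 3 interested peers uploading fastest to us, plus 1 random
        optimistic unchoke so that new partners get a chance to prove themselves."""
        by_rate = sorted(interested_peers, key=lambda p: upload_rate_to_me.get(p, 0), reverse=True)
        unchoked = by_rate[:3]
        others = [p for p in interested_peers if p not in unchoked]
        if others:
            unchoked.append(random.choice(others))   # the optimistic unchoke
        return unchoked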
Why BitTorrent took off
  • Better performance through “pull-based” transfer
    • Slow nodes don’t bog down other nodes
  • Allows uploading from hosts that have downloaded parts of a file
    • In common with other end-host based multicast schemes
Pros and cons of BitTorrent
  • Pros
    • Proficient in utilizing partially downloaded files
    • Discourages “freeloading”
      • By rewarding fastest uploaders
    • Encourages diversity through “rarest-first”
      • Extends lifetime of swarm
Pros and cons of BitTorrent
  • Cons
    • Assumes all interested peers active at same time; performance deteriorates if swarm “cools off”
    • Even worse: no trackers for obscure content
Pros and cons of BitTorrent
  • Dependence on centralized tracker: pro/con?
    • Con: single point of failure: new nodes can't enter the swarm if the tracker goes down
    • Lack of a search feature
      • Pro: prevents pollution attacks
      • Con: users need to resort to out-of-band search: well-known torrent-hosting sites / plain old web search
“Trackerless” BitTorrent
  • To be more precise, “BitTorrent without a centralized-tracker”
  • E.g.: Azureus
  • Uses a Distributed Hash Table (Kademlia DHT)
  • Tracker run by a normal end-host (not a web-server anymore)
    • The original seeder could itself be the tracker
    • Or have a node in the DHT randomly picked to act as the tracker
P2P Live Video Streaming
  • Autonomous and selfish peers
  • Churn
  • Time-sensitive and deadline-prone data
Success of P2P-Based File Distribution
  • Distribute content quickly
  • Utilizing the capacity of all peers
  • Incentive mechanism
    • Preventing peers from free-riding
  • Incentive mechanism == formation of clusters of similar bandwidth peers
newCoolstreaming
  • Provides peer-to-peer live streaming
  • Data-driven design
    • Does not use a tree, mesh, or any other fixed structure
    • Data flows are guided by the availability of data
Core operations of DONet / CoolStreaming
  • DONet: Data-driven Overlay Network
  • CoolStreaming: Cooperative Overlay Streaming
    • A practical DONet implementation
  • Every node periodically exchanges data availability information with a set of partners
  • Retrieve unavailable data from one or more partners, or supply available data to partners
  • The more people watching the streaming data, the better the watching quality will be
    • The idea is similar to BitTorrent (BT)
A generic system diagram for a DONet node
  • Partnership manager
    • Randomly selects partners
  • Transmission scheduler
    • Schedules transmission of video data
  • Buffer map
    • Records data availability
Coolstreaming
  • Two types of connections between peers:
    • Partnership relationship
    • Parent-child relationship
  • Multiple sub-streams
  • Buffer partitioning
  • Push-Pull content delivering
  • Parent re-selection
An Example of Stream Decomposition

A single stream of blocks with sequence numbers {1, 2, 3, …, 13} is combined from / decomposed into four sub-streams {S1, S2, S3, S4}:

  • S1: blocks 1, 5, 9, 13
  • S2: blocks 2, 6, 10
  • S3: blocks 3, 7, 11
  • S4: blocks 4, 8, 12
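
The decomposition above is a simple round-robin assignment of blocks to sub-streams; a short sketch:

    def decompose(blocks, k=4):
        """Split a block sequence into k sub-streams: block i goes to sub-stream ((i - 1) % k) + 1."""
        return {s: [b for b in blocks if (b - 1) % k == s - 1] for s in range(1, k + 1)}

    print(decompose(range(1, 14)))                # {1: [1, 5, 9, 13], 2: [2, 6, 10], 3: [3, 7, 11], 4: [4, 8, 12]}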

Structure of Buffer in a Node

(Diagram: with K sub-streams, each sub-stream S1 … Sk has a synchronization buffer holding the blocks received so far around position d (d, d+1, …, d+k, d+2k, d+3k, …); received blocks are merged in sequence-number order into a cache buffer, and blocks not yet received are marked unavailable.)

P2P: Summary
  • Many different styles; remember pros and cons of each
    • centralized, flooding, swarming, unstructured and structured routing
  • Lessons learned:
    • Single points of failure are bad
    • Flooding messages to everyone is bad
    • Underlying network topology is important
    • Need incentives to discourage freeloading
    • Privacy and security are important
    • Structure can provide theoretical bounds and guarantees