Performance Comparison of Scheduling Algorithms for Peer-to-Peer Collaborative File Distribution

Presented by: Chan Siu Kei, Jonathan

Supervisors: Prof. VOK Li, Dr. KS Lui


Overview

  • Introduction

  • Communication Model

  • Analysis

  • Scheduling Algorithms

    - Rarest Piece First

    - Most Demanding Node First

    - Maximum-Flow Algorithms

  • Simulation Results

  • Future Work

  • Conclusion


Introduction

  • P2P file sharing applications are highly popular on the Internet, e.g. BitTorrent, Gnutella, Kazaa, Napster, etc.

  • More scalable (faster) than the traditional client/server approach (e.g. FTP)

  • Previous research focuses on topics such as overlay topology formation, peer discovery, content search, fairness and incentive issues, but seldom looks into the data distribution scheduling problem

  • We present the first effort and propose a novel Maximum-Flow algorithm to better solve the problem


Communication Model

  • Synchronous Scheduling

    - same transmission time for every pair of nodes

  • Asymmetric Bandwidth

    - in each cycle, node i can send out at most pi pieces and receive at most qi pieces


Notations and Definitions

  • N = no. of peers, M = no. of file pieces

  • F = {F1, F2, …, FM}

  • P = N×M possession matrix,

    Pij = 1 iff node i possesses file piece Fj, otherwise Pij = 0

  • Pt = possession matrix at time t

  • p = {p1, p2, …, pN} (upload limit vector),

    q = {q1, q2, …, qN} (download limit vector)

  • e.g. p = {1,1,2,2,2}, q = {2,3,2,3,3}


Schedule (1)

  • Specifies which file pieces each peer has to send out and to whom

  • A possible schedule for P0 with p={1,1,2,2,2}, q={2,3,2,3,3}

    - Node 1: send piece 3 to node 2

    - Node 2: send piece 4 to node 1

    - Node 3: send piece 5 to node 1, send piece 5 to node 2

    - Node 4: send piece 6 to node 2, send piece 6 to node 3

    - Node 5: send piece 2 to node 4, send piece 7 to node 4

  • Formally, we use an N×M matrix $S^k$ to represent the schedule at cycle k. From $S^k$, we can derive the N×M transmission matrix $T^k$

    e.g. Node 1 receives piece 4 from Node 2 and piece 5 from Node 3 => $S^k_{14} = 2$, $S^k_{15} = 3$ and $T^k_{14} = T^k_{15} = 1$
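
As a quick check, the example schedule on this slide can be encoded as (sender, piece, receiver) triples and tested against the upload and download limits. This is a minimal sketch: the triple encoding and the helper name `respects_limits` are assumptions made for illustration, not notation from the slides.

```python
from collections import Counter

# Upload and download limit vectors from the example (nodes are 1-based).
p = [1, 1, 2, 2, 2]
q = [2, 3, 2, 3, 3]

# Example schedule as (sender, piece, receiver) triples (assumed encoding).
transmissions = [
    (1, 3, 2), (2, 4, 1),
    (3, 5, 1), (3, 5, 2),
    (4, 6, 2), (4, 6, 3),
    (5, 2, 4), (5, 7, 4),
]


def respects_limits(transmissions, p, q):
    """Return True if no node sends more than p_i or receives more than q_i pieces."""
    sent = Counter(sender for sender, _, _ in transmissions)
    received = Counter(receiver for _, _, receiver in transmissions)
    nodes = range(1, len(p) + 1)
    return all(sent[i] <= p[i - 1] and received[i] <= q[i - 1] for i in nodes)


print(respects_limits(transmissions, p, q))
```

Adding one more transmission from node 1, whose upload limit is 1, would make the check fail.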


Schedule (2)

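  • Given $P^{k-1}$ and the schedule $S^{k-1}$ with its transmission matrix $T^{k-1}$, the possession matrix at the next cycle k is $P^k = P^{k-1} + T^{k-1}$ (k > 0)

  • The distribution terminates after a certain number of cycles, say $k_0$, when $P^{k_0} = \mathbf{1}$, i.e. every node possesses every piece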

  • Our goal is to minimize k0, which is the time needed for complete distribution
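
The update rule can be sketched in a few lines. The encoding is an assumption for illustration: `S[i][j]` holds the 1-based index of the node that sends piece j to node i in this cycle, and 0 if nothing is sent. The function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def transmission_matrix(schedule: Matrix) -> Matrix:
    """T[i][j] = 1 when node i receives piece j in this cycle (S[i][j] != 0)."""
    return [[1 if s != 0 else 0 for s in row] for row in schedule]


def next_possession(possession: Matrix, schedule: Matrix) -> Matrix:
    """Apply P_next = P + T; a piece may only be scheduled to a node lacking it."""
    t = transmission_matrix(schedule)
    nxt = [[a + b for a, b in zip(prow, trow)] for prow, trow in zip(possession, t)]
    if any(v not in (0, 1) for row in nxt for v in row):
        raise ValueError("a piece was sent to a node that already has it")
    return nxt


def is_complete(possession: Matrix) -> bool:
    """Distribution is finished when the possession matrix is all ones."""
    return all(v == 1 for row in possession for v in row)


# Two nodes, two pieces: each node holds the piece the other one lacks.
P0 = [[1, 0], [0, 1]]
S0 = [[0, 2], [1, 0]]  # node 1 gets piece 2 from node 2, node 2 gets piece 1 from node 1
P1 = next_possession(P0, S0)
print(P1, is_complete(P1))
```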


Analysis on Lower Bound (1)

  • Let p = {p1, p2, …, pN}, q = {q1, q2, …, qN} be the upload and download limit vectors

  • Let $r_i$ be the total no. of 0s across row i, i.e. $r_i = \sum_{j=1}^{M} (1 - P_{ij})$; the min. value of $k_0$ is given by (1)

  • Let $c_j$ be the total no. of 1s along column j, i.e. $c_j = \sum_{i=1}^{N} P_{ij}$; we can find the minimum no. of 1s along all columns, $c_{\min} = \min_j c_j$; the min. value of $k_0$ is given by (2)

  • Let z be the total no. of 0s in P, i.e. $z = \sum_{i=1}^{N} \sum_{j=1}^{M} (1 - P_{ij})$; the min. value of $k_0$ is given by (3)


Analysis on Lower Bound (2)

  • Combining (1), (2) and (3), the lower bound on $k_0$ is the largest of the three individual bounds (4)
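
The quantities used in this analysis (zeros per row, ones per column, total zeros) are easy to compute. The sketch below also derives two bounds that follow directly from the per-cycle limits: a node missing r_i pieces needs at least ceil(r_i / q_i) cycles, and at most min(sum of p, sum of q) pieces can move per cycle. These are assumptions about the form of the bounds and may differ from the exact expressions (1) to (4) on the slides; the column-based bound (2) is not reproduced, only c_min is reported. All limits are assumed to be positive.

```python
from math import ceil


def lower_bound_sketch(P, p, q):
    """Compute r_i, c_j, z and a simple lower bound on the number of cycles."""
    n, m = len(P), len(P[0])
    r = [row.count(0) for row in P]                          # zeros in each row
    c = [sum(P[i][j] for i in range(n)) for j in range(m)]   # ones in each column
    z = sum(r)                                               # zeros in the whole matrix
    # Receiver bound: node i can take in at most q[i] pieces per cycle.
    receiver_bound = max(ceil(r[i] / q[i]) for i in range(n))
    # Volume bound: at most min(sum(p), sum(q)) transmissions fit in one cycle.
    volume_bound = ceil(z / min(sum(p), sum(q)))
    return {"r": r, "c": c, "c_min": min(c), "z": z,
            "bound": max(receiver_bound, volume_bound)}


# Hypothetical 3-node, 4-piece instance (not taken from the slides).
P = [[1, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
print(lower_bound_sketch(P, p=[1, 1, 1], q=[2, 2, 2]))
```

In this instance the volume bound dominates: 8 missing entries at 3 transmissions per cycle need at least 3 cycles.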


Rarest Piece First (RPF)

  • Borrowed from the Rarest Element First algorithm employed in BitTorrent

  • Rarity $c_j$ of piece j is the no. of peers who have piece j, i.e. $c_j = \sum_{i=1}^{N} P_{ij}$

RPF – Node-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})

RPF – Piece-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3})
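
One possible greedy reading of rarest-piece-first scheduling for a single cycle is sketched below. The slides do not spell out how the node-oriented and piece-oriented variants differ, so this is only an illustration: it walks the pieces from rarest to most common and assigns senders to receivers while both still have capacity. The function name and the schedule encoding (`S[r][j]` = 1-based sender of piece j to node r) are assumptions.

```python
def rpf_cycle(P, p, q):
    """Greedy rarest-piece-first schedule for one cycle (illustrative sketch)."""
    n, m = len(P), len(P[0])
    rarity = [sum(P[i][j] for i in range(n)) for j in range(m)]
    up, down = list(p), list(q)           # remaining upload / download capacity
    S = [[0] * m for _ in range(n)]
    for j in sorted(range(m), key=lambda piece: rarity[piece]):  # rarest first
        for s in range(n):
            if P[s][j] == 0:
                continue                  # node s cannot send a piece it lacks
            for r in range(n):
                if up[s] == 0:
                    break                 # sender s is out of upload slots
                if P[r][j] == 0 and S[r][j] == 0 and down[r] > 0:
                    S[r][j] = s + 1
                    up[s] -= 1
                    down[r] -= 1
    return S


print(rpf_cycle([[1, 0], [0, 1]], p=[1, 1], q=[1, 1]))
```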


Most Demanding Node First (MDNF)

  • Demand $d_i$ of node i is the no. of un-received pieces for node i, i.e. $d_i = \sum_{j=1}^{M} (1 - P_{ij})$

  • When choosing recipients, prefer sending to the node with largest di

MDNF – Node-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3}), figure labels: 6, 6, 4, 4, 5

MDNF – Piece-Oriented: (p={1,1,2,2,2}, q={2,3,2,3,3}), figure labels: 6, 6, 4, 4, 5
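
The recipient preference can be sketched as a greedy single-cycle scheduler. This is an illustrative reading, not the exact procedure from the slides: demands are computed once at the start of the cycle, senders are visited in index order, and each sender serves recipients in order of decreasing demand. The function name and the schedule encoding (`S[r][j]` = 1-based sender) are assumptions.

```python
def mdnf_cycle(P, p, q):
    """Greedy most-demanding-node-first schedule for one cycle (illustrative sketch)."""
    n, m = len(P), len(P[0])
    demand = [row.count(0) for row in P]            # d_i = pieces node i still lacks
    up, down = list(p), list(q)                     # remaining capacities
    S = [[0] * m for _ in range(n)]
    by_demand = sorted(range(n), key=lambda i: -demand[i])  # largest demand first
    for s in range(n):
        for r in by_demand:
            for j in range(m):
                if up[s] == 0 or down[r] == 0:
                    break                           # sender or receiver is saturated
                if P[s][j] == 1 and P[r][j] == 0 and S[r][j] == 0:
                    S[r][j] = s + 1
                    up[s] -= 1
                    down[r] -= 1
    return S


# Node 1 holds everything and has one upload slot; node 3 lacks the most pieces.
print(mdnf_cycle([[1, 1, 1], [1, 0, 0], [0, 0, 0]], p=[1, 0, 0], q=[1, 1, 1]))
```

With a single upload slot available, the slot goes to node 3, the node with the largest demand.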


Problem with RPF and MDNF

  • Neither algorithm guarantees that the max. no. of transmissions is achieved in each cycle

Using MDNF – Piece-Oriented: (p={2,2,2,1}, q={2,1,2,2})

only 6 transmissions can be scheduled (but the max. is 7)


Maximum-Flow (MaxFlow)

Let G = (V,E) be the flow network graph

L = {L1, L2, …, LN}

R = {R1, R2, …, RN}


Maximum-Flow (MaxFlow)

  • Edmonds-Karp Algorithm:

    - Finds augmenting paths using BFS

    - Guaranteed to find the maximum no. of transmissions in each cycle

    - Complexity = $O(|V| \cdot |E|^2)$
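
A minimal sketch of how one scheduling cycle could be cast as a max-flow problem and solved with Edmonds-Karp. The network layout is an assumption for illustration: source to sender node L_s with capacity p_s, L_s to a node B_rj (one per missing entry P_rj = 0) with capacity 1 when s holds piece j, B_rj to receiver node R_r with capacity 1, and R_r to sink with capacity q_r. The slides' exact graph may differ.

```python
from collections import defaultdict, deque


def max_transmissions(P, p, q):
    """Maximum no. of transmissions in one cycle via Edmonds-Karp (BFS augmenting paths)."""
    n, m = len(P), len(P[0])
    cap = defaultdict(lambda: defaultdict(int))     # residual capacities
    for i in range(n):
        cap["src"][("L", i)] = p[i]
        cap[("R", i)]["sink"] = q[i]
    for r in range(n):
        for j in range(m):
            if P[r][j] == 0:                        # node r still needs piece j
                cap[("B", r, j)][("R", r)] = 1      # each piece is received at most once
                for s in range(n):
                    if P[s][j] == 1:
                        cap[("L", s)][("B", r, j)] = 1
    flow = 0
    while True:
        parent = {"src": None}
        queue = deque(["src"])
        while queue and "sink" not in parent:       # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if "sink" not in parent:
            return flow                             # no augmenting path is left
        bottleneck, v = float("inf"), "sink"
        while parent[v] is not None:                # find the path's bottleneck
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = "sink"
        while parent[v] is not None:                # push flow, update residual edges
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


print(max_transmissions([[1, 0], [0, 1]], p=[1, 1], q=[1, 1]))
```

The flow value only counts transmissions; an actual schedule could be read off the saturated L to B edges.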


MaxFlow – Counter Example

  • Pure MaxFlow performance is unsatisfactory, as it does not consider whether we can match more in subsequent cycles

Using MaxFlow, a total of 3 cycles is needed: (p={2,2,2,2,2}, q={3,3,3,3,3})

Using RPF – Node-Oriented, only 2 cycles are needed: (p={2,2,2,2,2}, q={3,3,3,3,3})


MaxFlow - Weighted

  • Put weights on both sides to give priorities to some nodes during searching

  • Weights on Li = (sum of the no. of 0s in other peers for those pieces that peer i has)

  • Weights on Bij = δij (sum of the no. of 0s across row i and column j)

  • E.g.

    δ42 = 7
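
The two weight definitions can be sketched as follows. One detail is an assumption: δij is computed here as (zeros in row i) + (zeros in column j), so the entry (i, j) itself is counted in both terms; the slides do not say whether it should be counted once or twice. Indices are 0-based in the code and the function name is hypothetical.

```python
def maxflow_weights(P):
    """Return (left_weights, delta) for a possession matrix P (illustrative sketch)."""
    n, m = len(P), len(P[0])
    row_zeros = [row.count(0) for row in P]
    col_zeros = [sum(1 for i in range(n) if P[i][j] == 0) for j in range(m)]
    # Weight on L_i: zeros held by other peers, summed over the pieces peer i has.
    left = [sum(col_zeros[j] for j in range(m) if P[i][j] == 1) for i in range(n)]
    # Weight on B_ij: defined only for missing entries (P[i][j] == 0).
    delta = {(i, j): row_zeros[i] + col_zeros[j]
             for i in range(n) for j in range(m) if P[i][j] == 0}
    return left, delta


# Hypothetical 3x3 instance (not taken from the slides).
left, delta = maxflow_weights([[1, 1, 0],
                               [0, 1, 0],
                               [0, 0, 1]])
print(left, delta)
```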


MaxFlow – Weighted Counter Example

For p={2,2,2,2,2}, q={3,3,3,3,3}

Using MaxFlow – Weighted, a total of 3 cycles is needed:

… $P^3 = \mathbf{1}$ (all-ones matrix)

Using MDNF – Piece-Oriented, only 2 cycles are needed:

$P^2 = \mathbf{1}$ (all-ones matrix)


MaxFlow – Dynamically-Weighted

  • Allows the weights to be dynamically varied within each scheduling cycle

γ = {15,14,25,13,15,10,16,16} and δ43 = 9 which is the greatest value among all δij


Simulation Results (1)

Fig. 1 Performance comparison of various scheduling algorithms (All) with varying peer sizes (file size = 100, pi = 2, qi = 3, equal probability for 1s and 0s)


Simulation Results (2)

Fig. 2 Performance comparison of various scheduling algorithms (Representative) with varying peer sizes (file size = 100, pi = 2, qi = 3, equal probability for 1s and 0s)


Simulation Results (3)

Fig. 3 Performance comparison of various scheduling algorithms (Representative) with varying file sizes (peer size = 10, pi = 2, qi = 3, equal probability for 1s and 0s)
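
A random starting instance with the parameters quoted in the figure captions could be generated as sketched below. The generator is an assumption, not the authors' simulator: each entry is 0 or 1 with equal probability, and any piece that no peer ends up holding is given to one random peer so that the distribution can terminate. The function name and the `seed` parameter are hypothetical.

```python
import random


def random_instance(n_peers, n_pieces, p_limit=2, q_limit=3, seed=None):
    """Return (P, p, q): a random possession matrix and uniform limit vectors."""
    rng = random.Random(seed)
    P = [[rng.randint(0, 1) for _ in range(n_pieces)] for _ in range(n_peers)]
    # Assumption: every piece must exist somewhere, or distribution never completes.
    for j in range(n_pieces):
        if not any(P[i][j] for i in range(n_peers)):
            P[rng.randrange(n_peers)][j] = 1
    return P, [p_limit] * n_peers, [q_limit] * n_peers


# Setting matching Fig. 3: peer size = 10, file size = 100, pi = 2, qi = 3.
P, p, q = random_instance(10, 100, seed=1)
print(len(P), len(P[0]), p[0], q[0])
```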


Future Work

  • Study the case of asynchronous scheduling, where the transmission time is different for different pairs of nodes

  • Study the case when the network is dynamic in nature, where peers can come and go at any instant and they may shift to communicate with different sets of peers during the distribution process


Conclusion

  • The data distribution problem in P2P networks has not been well studied in previous research

  • We formally define the collaborative file distribution problem with the possession and transmission matrix formulations

  • We also deduce a theoretical bound for the minimum distribution time required

  • We develop several types of algorithms (RPF, MDNF, MaxFlow) for solving the problem

  • Our novel dynamically-weighted max-flow algorithm outperforms all the other algorithms in simulations


Thank You!

Q&A

