peer to peer n.
Skip this Video
Loading SlideShow in 5 Seconds..
Peer-to-Peer PowerPoint Presentation
Download Presentation


91 Views Download Presentation
Download Presentation


- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Peer-to-Peer Credits: Slides adapted from J. Pang, B. Richardson, I. Stoica, M. Cuenca, B. Cohen, S. Ramesh, D. Qui, R. Shrikant, Xiaoyan Li and R. Martin.

  2. Reminders • HW1 due next class (9/14) • PA1 to be assigned next class.

  3. Intro • Quickly grown in popularity • Hundreds of file sharing applications • 35 million American adults use P2P networks -- 29% of all Internet users in US! • Audio/Video transfer now dominates traffic on the Internet • But what is P2P? • Searching or location? -- DNS, Google! • Computers “Peering”? -- Server Clusters, IRC Networks, Internet Routing! • Clients with no servers? -- Doom, Quake!

  4. Intro (2) • Fundamental difference: Take advantage of resources at the edges of the network • What’s changed: • End-host resources have increased dramatically • Broadband connectivity now common • What hasn’t: • Deploying infrastructure still expensive

  5. Overview • Centralized Database • Napster • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent

  6. The Lookup Problem N2 N1 N3 Key=“title” Value=MP3 data… Internet ? Client Publisher Lookup(“title”) N4 N6 N5

  7. The Lookup Problem (2) • Common Primitives: • Join: how to I begin participating? • Publish: how do I advertise my file? • Search: how to I find a file? • Fetch: how to I retrieve a file?

  8. Next Topic... • Centralized Database • Napster • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent

  9. Napster: History • In 1999, S. Fanning launches Napster • Peaked at 1.5 million simultaneous users • Jul 2001, Napster shuts down

  10. Napster: Overiew • Centralized Database: • Join: on startup, client contacts central server • Publish: reports list of files to central server • Search: query the server => return someone that stores the requested file • Fetch: get the file directly from peer

  11. Publish Napster: Publish insert(X, ... I have X, Y, and Z!

  12. Fetch Query Reply Napster: Search search(A) --> Where is file A?

  13. Napster: Discussion • Pros: • Simple • Search scope is O(1) • Controllable (pro or con?) • Cons: • Server maintains O(N) State • Server does all processing • Single point of failure

  14. Next Topic... • Centralized Database • Napster • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent

  15. Gnutella: History • In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella • Soon many other clients: Bearshare, Morpheus, LimeWire, etc. • In 2001, many protocol enhancements including “ultrapeers”

  16. Gnutella: Overview • Query Flooding: • Join: on startup, client contacts a few other nodes; these become its “neighbors” • Publish: no need • Search: ask neighbors, who as their neighbors, and so on... when/if found, reply to sender. • Fetch: get the file directly from peer

  17. I have file A. I have file A. Reply Query Gnutella: Search Where is file A?

  18. Gnutella: Discussion • Pros: • Fully de-centralized • Search cost distributed • Cons: • Search scope is O(N) • Search time is O(???) • Nodes leave often, network unstable

  19. Aside: Search Time?

  20. 1.5Mbps DSL 1.5Mbps DSL 56kbps Modem 1.5Mbps DSL 10Mbps LAN 1.5Mbps DSL 56kbps Modem 56kbps Modem Aside: All Peers Equal?

  21. Next Topic... • Centralized Database • Napster • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent • Unstructured Overlay Routing • Freenet • Structured Overlay Routing • Distributed Hash Tables

  22. KaZaA: History • In 2001, KaZaA created by Dutch company Kazaa BV • Single network called FastTrack used by other clients as well: Morpheus, giFT, etc. • Eventually protocol changed so other clients could no longer talk to it

  23. KaZaA: Overview • “Smart” Query Flooding: • Join: on startup, client contacts a “supernode” ... may at some point become one itself • Publish: send list of files to supernode • Search: send query to supernode, supernodes flood query amongst themselves. • Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers

  24. “Super Nodes” KaZaA: Network Design

  25. Publish KaZaA: File Insert insert(X, ... I have X!

  26. search(A) --> search(A) --> Query Replies KaZaA: File Search Where is file A?

  27. KaZaA: Fetching • More than one node may have requested file... • How to tell? • Must be able to distinguish identical files • Not necessarily same filename • Same filename not necessarily same file... • Use Hash of file • KaZaA uses UUHash: (MD5+CRC32 checksum of staggered blocks)---fast, but not secure • Alternatives: full MD5, SHA-1 • How to fetch? • Get bytes [0..1000] from A, [1001...2000] from B • Alternative: Erasure Codes for greater redundancy.

  28. KaZaA: Discussion • Pros: • Tries to take into account node heterogeneity: • Bandwidth • Host Computational Resources • Host Availability (?) • Rumored to take into account network locality (latency) • Cons: • Mechanisms easy to circumvent • Still no real guarantees on search scope or search time • Easy to intentionally corrupt content.

  29. Next Topic... • Centralized Database • Napster • Query Flooding • Gnutella • Intelligent Query Flooding • KaZaA • Swarming • BitTorrent

  30. BitTorrent: History • In 2002, B. Cohen debuted BitTorrent • Key Motivation: • Popularity exhibits temporal locality (Flash Crowds) • E.g., Slashdot effect, CNN on 9/11, new movie/game release • Focused on Efficient Fetching, not Searching: • Distribute the same file to all peers • Single publisher, multiple downloaders • Has some “real” publishers: • Blizzard Entertainment using it to distribute the beta of their new game - May 2006 deal with Warner Brothers for content distribution.

  31. BitTorrent Working

  32. BitTorrent: Publish/Join Tracker/Seed

  33. BitTorrent: Fetch

  34. BitTorrent: Overview • Swarming: • Join: contact centralized “tracker” server, get a list of peers. • Publish: Run a tracker server. • Search: Out-of-band. E.g., use Google to find a tracker for the file you want. • Fetch: Download chunks of the file from your peers. Upload chunks you have to them.

  35. Algorithms/Features in BT • Pipelining of requests – to avoid delay in pieces being sent • Piece Selection • Strict priority [within sub-pieces] • Rarest first • Random First piece • Endgame mode [for request of final subpieces] • Choking, Optimistic unchoking, Anti-snubbing, Upload only

  36. Free-Riding with Optimistic Unchoking • If a peer wants to free-ride in BT, the maximum download rate it will get is 1/(nu + 1) of the maximum rate • This is also a very powerful result showing why BT performs so much better when compared to other P2P networks like Gnutella and Kazaa.

  37. Notes on BitTorrent • The BT network graph is random. Other ‘clever’ peer assignment algorithms have resulted only in network partitions • Single point of failure (tracker) • Lots of magic numbers • how many peers should a peer upload to? • how long should a peer wait in optimistic unchoke before moving on to other peers? • how many requests should be pipelined? • “Starting multiple torrents at once is a bad idea” - Bram

  38. BitTorrent: Sharing Strategy • Employ “Tit-for-tat” sharing strategy • “I’ll share with you if you share with me” • Be optimistic: occasionally let freeloaders download • Otherwise no one would ever start! • Also allows you to discover better peers to download from when they reciprocate • Meant to counter Prisoner’s Dilemma problem. • Approximates Pareto Efficiency • Game Theory: “No change can make anyone better off without making others worse off”

  39. BitTorrent: Summary • Pros: • Works reasonably well in practice • Gives peers incentive to share resources; avoids freeloaders • Cons: • Pareto Efficiency relative weak condition • Central tracker server needed to bootstrap swarm (is this really necessary?)

  40. P2P: Summary • Many different styles; remember pros and cons of each • centralized, flooding, swarming Lessons learned: • Single points of failure are very bad • Flooding messages to everyone is bad • Underlying network topology is important • Not all nodes are equal • Need incentives to discourage freeloading • Privacy and security are important

  41. Demo of using Bittorent client against a .torrent file on Video of Interview with Bittorrent’s Creator Bram Cohen

  42. Extra Slides

  43. Only for Pirates and sharing home video? • Legal downloads tripled in the first half of 2005, reaching 186 million. • In May 2006, Warner Brother announced it would use bittorrent to distribute pay-for content. $1 per tv show $4 to $5 per full length movie. • Claims “[Users] will be prevented from copying and distributing files they purchase through two mechanisms: one that requires them to enter a password before watching a file, and another that allows the file to be viewed only on the computer to which it was downloaded."

  44. Allowing p2p a liability for ISPs? From TMCnet News: Unauthorized p2p file sharing banned in Spain. “It's a criminal offense for ISPs to facilitate unauthorized downloading.” “Spain's telco giant Telefonica reports 90% of usage on its broadband lines is Internet traffic, up from 15% five years ago. Of that 90%, a massive 71% is P2P traffic.”

  45. What can ISPs do? • tcp/udp port filtering? • Deep Packet Inspection (DPI): “From Network operators worldwide spent US$96.8m on DPI in 2005, but the sector is poised to grow by more than 75 per cent this year, to about US$170m, and top US$586m in 2010.”

  46. KaZaA: Usage Patterns • KaZaA is more than one workload! • Many files < 10MB (e.g., Audio Files) • Many files > 100MB (e.g., Movies) from Gummadi et al., SOSP 2003

  47. from Gummadi et al., SOSP 2003 KaZaA: Usage Patterns (2) • What we saw: • A few big files consume most of the bandwidth • Many files are fetched once per client but still very popular • Solution? • Caching!