A Hybrid Architecture for Cost-Effective On-Demand Media Streaming

  1. A Hybrid Architecture for Cost-Effective On-Demand Media Streaming Mohamed Hefeeda & Bharat Bhargava CS Dept, Purdue University Support: NSF, CERIAS FTDCS 2003

  2. Motivations • Imagine a media server streaming movies to clients • Target environments • Distance learning, corporate streaming, … • Current approaches • Unicast • Centralized • Proxy caches • CDN (third-party) • Multicast • Network layer • Application layer • Proposed approach (Peer-to-Peer)

  3. Unicast Streaming: Centralized • Pros • Easy to deploy and administer • Cons • Limited scalability • Reliability concerns • Load on the backbone • High deployment cost $$$ • Note: • A server with a T3 link (~45 Mb/s) supports only up to 45 concurrent users at 1 Mb/s CBR! (see the sketch below)
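
A back-of-the-envelope check of that note (a minimal sketch; the link and per-client rates are the ones quoted on the slide):

```python
# A T3 uplink divided by the per-client CBR rate bounds the number
# of concurrent unicast streams a centralized server can sustain.
T3_LINK_MBPS = 45.0   # approximate T3 capacity, from the slide
STREAM_MBPS = 1.0     # per-client CBR streaming rate, from the slide

max_concurrent_clients = int(T3_LINK_MBPS // STREAM_MBPS)
print(max_concurrent_clients)  # -> 45
```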

  4. Unicast Streaming: Proxy • Pros • Better performance • Less load on backbone and server • Prefix caching, selective caching, staging, … • Cons • Proxy placement and management • High deployment cost $$$

  5. Unicast Streaming: CDN • Pros • Good performance • Suitable for web pages with moderate-size objects • Cons • Co$t: the CDN charges for every megabyte served! • Not suitable for a VoD service; movies are quite large (~GBytes) • Note [Raczkowski’02] • Cost ranges from 0.25 to 2 cents/MByte • For a one-hour stream delivered to 1,000 clients, the content provider pays $264 to the CDN (at 0.5 cents/MByte)! (see the cost sketch below)
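
The quoted figure follows from simple arithmetic. A minimal sketch; the ~117 kb/s bitrate is an assumption back-solved from the quoted $264, since the slide does not state the streaming rate:

```python
# Reproduces the slide's CDN cost estimate: total MB served times the
# per-MB fee. The bitrate below is an assumption, not from the slide.
BITRATE_KBPS = 117.3   # assumed stream bitrate (back-solved from $264)
DURATION_S = 3600      # one-hour stream
CLIENTS = 1000
CENTS_PER_MB = 0.5     # mid-range CDN fee quoted on the slide

mb_per_client = BITRATE_KBPS * 1000 * DURATION_S / 8 / 1e6  # bits -> MB
total_usd = mb_per_client * CLIENTS * CENTS_PER_MB / 100
print(f"${total_usd:,.0f}")  # -> ~$264 paid to the CDN
```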

  6. Multicast Streaming: Network Layer • Pros • Efficient! • Supports asynchronous clients • Patching, skyscraper, … • [e.g., Mahanti et al., ToN’03] • Cons • Scalability of the routers • Asynchronous clients → must tune to multiple channels • Requires high inbound bandwidth (2R) • Not widely deployed • [Figure: patch stream]

  7. Multicast Streaming: Application Layer • Pros • Deployable • Cons • Assumes end systems can supply outbound bandwidth several times the streaming rate

  8. P2P Approach: Key Ideas • Make use of peers’ underutilized resources • Make use of heterogeneity • Multiple peers serve a requesting peer • Network-aware organization of peers

  9. P2P Approach (cont’d) • Pros • Cost-effective • Deployable (no new hardware or network support) • No extra inbound bandwidth (R) • Small load on supplying peers (< R) • Less stress on the backbone • Challenges • Achieving good quality • Unreliable, limited-capacity peers • Highly dynamic environment

  10. P2P Approach: Entities • Peer • Level of cooperation (rate, bandwidth, connections) • Super peer • Helps in searching and dispersion • Seed peer • Initiates the streaming • Stream • Time-ordered sequence of packets • Media file • Divided into equal-size segments • Super peers play special roles → a hybrid system (entities are sketched below)
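
A minimal sketch of these entities as data classes; the field names and types are illustrative assumptions, not definitions from the paper:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Segment:
    index: int        # position in the media file
    size_bytes: int   # segments are equal-size

@dataclass
class Peer:
    peer_id: str
    rate: float              # streaming rate this peer offers
    bandwidth: float         # outbound bandwidth it will commit
    max_connections: int     # level of cooperation
    cached: List[int] = field(default_factory=list)  # cached segment indices

@dataclass
class SuperPeer(Peer):
    # Indexes peers in its cluster to help with searching and dispersion.
    cluster_index: Dict[str, Peer] = field(default_factory=dict)
```

Under this reading, a seed peer is simply a Peer that initially caches all segments of a new file.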

  11. Hybrid System: Issues • Peers Organization • Two-level peers clustering • Join, leave, failure, overhead, super peer selection • Operation • Client protocol: manage multiple suppliers • Switch suppliers • Effect of switching on quality • Cluster-Based Searching • Find nearby suppliers • Cluster-Based Dispersion • Disseminate new media files Details are in the extended version of the paper

  12. Peers Organization: Two-Level Clustering • Based on BGP tables • [Oregon RouteViews] • Network cluster [KW 00] • Peers sharing the same network prefix • 128.10.0.0/16 (purdue), 128.2.0.0/16 (cmu) • 128.10.3.60, 128.10.7.22, 128.2.10.1, 128.2.11.43 • AS cluster • All network clusters within an Autonomous System • [Figure: snapshot of a BGP routing table] • (a clustering sketch follows below)
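
A minimal sketch of the network-level clustering, assuming prefixes come from a BGP table snapshot (e.g., Oregon RouteViews) and that a peer is assigned to its longest matching prefix; the helper below is illustrative:

```python
import ipaddress

# Prefixes as they would appear in a BGP table snapshot (slide's examples).
PREFIXES = [ipaddress.ip_network(p) for p in ("128.10.0.0/16", "128.2.0.0/16")]

def network_cluster(ip: str):
    """Return the longest matching prefix for ip, i.e., its network cluster."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in PREFIXES if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)

# The four peers from the slide map to the purdue and cmu clusters.
for ip in ("128.10.3.60", "128.10.7.22", "128.2.10.1", "128.2.11.43"):
    print(ip, "->", network_cluster(ip))
```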

  13. Peers Organization: Join • [Figure: peers joining through the bootstrap data structure]

  14. Client Protocol • Building blocks of the protocol to be run by a requesting peer • Three phases • Availability check • Search for all segments with the full rate • Streaming • Stream (and playback) segment by segment • Caching • Store enough copies of the media file in each cluster to serve all expected requests from that cluster • Which segments to cache?

  15. Client Protocol (cont'd) • Phase II: Streaming • t_j = t_{j-1} + δ /* δ: time to stream one segment */ • For j = 1 to N do: at time t_j, get segment s_j as follows: • Connect to every peer P_x in P_j (in parallel) and • download P_x's piece of the segment: b_x = |s_j| · R_x / R bytes, starting where P_{x-1}'s piece ends • Example: P1, P2, and P3 serving different pieces of the same segment to P4 at different rates (see the sketch below)
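
A minimal sketch of the piece assignment within one segment, under one consistent reading of the slide's compressed notation: each supplier P_x contributes |s_j|·R_x/R bytes, and pieces are laid out contiguously (cumulative byte boundaries b_x):

```python
def piece_ranges(segment_size: int, rates: list) -> list:
    """Split one segment across suppliers in proportion to their rates.

    Supplier x contributes |s_j| * R_x / R bytes; pieces are contiguous,
    so P_x's range starts where P_{x-1}'s range ends.
    """
    total_rate = sum(rates)
    ranges, start = [], 0
    cumulative = 0.0
    for rate in rates:
        cumulative += rate
        end = round(segment_size * cumulative / total_rate)  # boundary b_x
        ranges.append((start, end - 1))
        start = end
    return ranges

# Slide 15's example: P1, P2, P3 serve one 1 MB segment to P4.
# The rates are assumptions: R1 = 0.5R, R2 = R3 = 0.25R.
print(piece_ranges(1_000_000, [0.5, 0.25, 0.25]))
# -> [(0, 499999), (500000, 749999), (750000, 999999)]
```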

  16. Dispersion (cont'd) • Dispersion Algorithm (basic idea): • /* Upon getting a request from P_y to cache N_y segments */ • C ← getCluster(P_y) • Compute available (A) and required (D) capacities in cluster C • If A < D: P_y caches N_y segments in a cluster-wide round-robin fashion (CWRR) • All values are smoothed averages • Average available capacity in C: [equation in original slide] • CWRR example (10-segment file): • P1 caches 4 segments: 1, 2, 3, 4 • P2 then caches 7 segments: 5, 6, 7, 8, 9, 10, 1 • (a CWRR sketch follows below)
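
A minimal sketch of CWRR, assuming the cluster keeps a single wrap-around pointer for the next segment to cache; this reproduces the 10-segment example above:

```python
def cwrr_assign(num_segments, caching_requests):
    """Cluster-wide round robin (CWRR): each caching peer continues
    where the previous one stopped, wrapping around the file, so
    copies spread evenly across the cluster."""
    assignments, next_seg = {}, 1   # cluster-wide pointer, 1-based
    for peer, n in caching_requests:
        segs = []
        for _ in range(n):
            segs.append(next_seg)
            next_seg = next_seg % num_segments + 1  # wrap: 10 -> 1
        assignments[peer] = segs
    return assignments

# Slide 16's example: a 10-segment file.
print(cwrr_assign(10, [("P1", 4), ("P2", 7)]))
# -> {'P1': [1, 2, 3, 4], 'P2': [5, 6, 7, 8, 9, 10, 1]}
```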

  17. Evaluation Through Simulation • Client parameters • Effect of switching on quality • System parameters • Overall system capacity • Average waiting time • Load on, and role of, the seeding peer • Scenarios: different client arrival patterns (constant rate, Poisson, flash crowd) and different levels of peer cooperation • Performance of the dispersion algorithm • Compared against a random dispersion algorithm

  18. Simulation: Topology • Large, Hierarchical (more than 13,000 nodes) • Ex. 20 transit domains, 200 stub domains, 2,100 routers, and a total of 11,052 end hosts • Used GT-ITM and ns-2

  19. Client: Effect of Switching • Initial buffering is needed to hide supplier switching • Tradeoff: small segment size → small buffering, but more overhead (encoding, decoding, processing); see the numeric sketch below
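
A numeric illustration of that tradeoff. All values here are assumptions, not the paper's numbers: a 450 MB file (one hour at 1 Mb/s) and a startup buffer taken as one segment's worth of data to mask a supplier switch:

```python
# Smaller segments shrink the startup buffer but multiply the number
# of segment boundaries, i.e., potential switch/processing events.
FILE_MB = 450.0   # assumed movie size
RATE_MBPS = 1.0   # assumed playback rate R

for seg_mb in (0.5, 1.0, 4.0, 16.0):
    switches = FILE_MB / seg_mb          # segment boundaries = switch points
    buffer_s = seg_mb * 8 / RATE_MBPS    # seconds to buffer one segment
    print(f"segment {seg_mb:4.1f} MB: {switches:4.0f} switches, "
          f"~{buffer_s:5.1f} s startup buffer")
```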

  20. System: Capacity and Load on Seed Peer • [Figures: system capacity; load on the seeding peer] • The role of the seeding peer is diminishing • With 50% caching: after 5 hours we have 100 concurrent clients (6.7 times the original capacity), and none of them is served by the seeding peer

  21. System: Under Flash Crowd Arrivals • Flash crowd → sudden increase in client arrivals

  22. System: Under Flash Crowd (cont'd) • [Figures: system capacity; load on the seeding peer] • The role of the seeding peer is still just seeding • During the peak we have 400 concurrent clients (26.7 times the original capacity), and none of them is served by the seeding server (50% caching)

  23. Dispersion Algorithm (5% caching) • Avg. number of hops: 8.05 (random) vs. 6.82 (ours) → 15.3% savings • For a domain with a 6-hop diameter: • Random: 23% of the traffic was kept inside the domain • Cluster-based: 44% of the traffic was kept inside the domain

  24. Conclusions • A hybrid model for on-demand media streaming • P2P streaming; powerful peers do more work • Peers with special roles (super peers) • Cost-effective and deployable • Supports a large number of clients, including flash crowds • Two-level peer clustering • Network-conscious clustering of peers • Helps keep the traffic local • Cluster-based dispersion pushes content closer to clients (within the same domain) • Reduces the number of hops traversed by the stream and the load on the backbone

  25. Thank You!

  26. P2P Systems: Basic Definitions Background • Peers cooperate to achieve desired functions • Cooperate: share resources (CPU, storage, bandwidth), participate in the protocols (routing, replication, …) • Functions: file-sharing, distributed computing, communications, … • Examples • Gnutella, Napster, Freenet, OceanStore, CFS, CoopNet, SpreadIt, SETI@HOME, … • Well, aren’t they just distributed systems? • P2P == distributed systems?

  27. P2P vs. Distributed Systems Background • P2P = distributed systems++; • Ad hoc nature • Peers are not servers [Saroiu et al., MMCN’02] • Limited capacity and reliability • Much more dynamism • Scalability is a more serious issue (millions of nodes) • Peers are self-interested (selfish!) entities • 70% of Gnutella users share nothing [Adar and Huberman ’00] • All kinds of security concerns • Privacy, anonymity, malicious peers, … you name it!

  28. P2P Systems: Rough Classification [Lv et al., ICS’02], [Yang et al., ICDCS’02] Background • Structured (or tightly controlled, DHT) • Files are rigidly assigned to specific nodes • Efficient search & guaranteed finding • No support for partial-name and keyword queries • Ex.: Chord [Stoica et al., SIGCOMM’01], CAN [Ratnasamy et al., SIGCOMM’01], Pastry [Rowstron and Druschel, Middleware’01] • Unstructured (or loosely controlled) • Files can be anywhere • Supports partial-name and keyword queries • Inefficient search (some heuristics exist) & no guarantee of finding • Ex.: Gnutella • Hybrid (P2P + centralized; super-peer notion) • Napster, KaZaA

  29. File-Sharing vs. Streaming Background • File-sharing • Download the entire file first, then use it • Small files (a few MBytes) → short download time • A file is stored by one peer → one connection • No timing constraints • Streaming • Consume (play back) as you download • Large files (a few GBytes) → long download time • A file is stored by multiple peers → several connections • Timing is crucial

  30. Related Work: Streaming Approaches Background • Distributed caches [e.g., Chen and Tobagi, ToN’01] • Deploy caches all over the place • Yes, this increases scalability • But it shifts the bottleneck from the server to the caches! • It also multiplies the cost • What to cache? And where to put the caches? • Multicast • Mainly for live media broadcast • Application level [Narada, NICE, Scattercast, Zigzag, …] • IP level [e.g., Dutta and Schulzrinne, ICC’01] • Widely deployed?

  31. Related Work: Streaming Approaches Background • P2P approaches • SpreadIt [Deshpande et al., Stanford TR’01] • Live media • Builds an application-level multicast distribution tree over peers • CoopNet [Padmanabhan et al., NOSSDAV’02 and IPTPS’02] • Live media • Builds an application-level multicast distribution tree over peers • On-demand • The server redirects clients to other peers • Assumes a peer can (or is willing to) support the full rate • CoopNet does not address the issue of quickly disseminating the media file
