VMTorrent : Scalable P2P Virtual Machine Streaming - PowerPoint PPT Presentation

Joshua Reich, Oren Laadan, Eli Brosh, Alex Sherman, Vishal Misra, Jason Nieh, and Dan Rubenstein
Presentation Transcript

  1. VMTorrent: Scalable P2P Virtual Machine Streaming Joshua Reich, Oren Laadan, Eli Brosh, Alex Sherman, Vishal Misra, Jason Nieh, and Dan Rubenstein

  2. VM Basics
  • VM: software implementation of a computer
  • Implementation stored in a VM image
  • VM runs on a VMM
    • Virtualizes HW
    • Accesses the image
  [Diagram: VM, VM image, VMM]

  3. Where is the Image Stored?
  [Diagram: VM, VM image, VMM]

  4. Traditionally: Local Storage
  [Diagram: VM and VMM with the image on local storage]

  5. IaaS Cloud: on Network Storage
  [Diagram: VM and VMM accessing a VM image on network storage]

  6. Can Be Primary
  • VMM accesses the VM image over NFS/iSCSI
  • e.g., OpenStack Glance
  • Amazon EC2/S3
  • vSphere network storage
  [Diagram: VM image on network storage]

  7. Or Secondary
  • e.g., Amazon EC2/EBS
  • vSphere local storage
  [Diagram: VM image on local storage]

  8. Either Way, No Problem Here
  [Diagram: a single VM/VMM fetching its image from network storage]

  9. Here?
  [Diagram: many hosts fetching VM images from network storage: bottleneck!]

  10. Lots of Unique VM Images
  • On EC2 alone: 54,784 unique images*
  * http://thecloudmarket.com/stats#/totals , 06 Dec 2012

  11. Unpredictable Demand
  • Lots of customers
  • Spot-pricing
  • Cloud-bursting

  12. Don’t Just Take My Word
  • “The challenge for IT teams will be finding way to deal with the bandwidth strain during peak demand - for instance when hundreds or thousands of users log on to a virtual desktop at the start of the day - while staying within an acceptable budget” 1
  • “scale limits are due to simultaneous loading rather than total number of nodes” 2
  • Developer proposals to replace or supplement VM launch architecture for greater scalability 3
  1. http://www.zdnet.com/why-so-many-businesses-arent-ready-for-virtual-desktops-7000008229/?s_cid=e539
  2. http://www.openstack.org/blog/2011/12/openstack-deployments-abound-at-austin-meetup-129
  3. https://blueprints.launchpad.net/nova/+spec/xenserver-bittorrent-images

  13. Challenge: VM Launch in IaaS
  • Minimize delay in VM execution
    • Starting from the time the launch request arrives
  • For lots of instances (scale!)

  14. Naive Scaling Approaches
  • Multicast
    • Setup, configuration, maintenance, etc. 1
    • ACK implosion
    • “multicast traffic saturated the CPU on [Etsy] core switches causing all of Etsy to be unreachable” 2
  1. [El-Sayed et al., 2003; Hosseini et al., 2007]
  2. http://codeascraft.etsy.com/2012/01/23/solr-bittorrent-index-replication

  15. Naive Scaling Approaches
  • P2P bulk data download (e.g., BitTorrent)
    • Files are big (wastes bandwidth)
    • Must wait until the whole file is available (wastes time)
    • Network primary? Must store a multi-GB image in RAM!

  16. Both Miss a Big Opportunity
  • VM image access is
    • Sparse
    • Gradual
  • Most of the image doesn’t need to be transferred
  • Can start w/ just a couple of blocks
  → VM image streaming

  17. VMTorrent Contributions
  • Architecture
    • Make (scalable) streaming possible: decouple data delivery from presentation
    • Make scalable streaming effective: profile-based image streaming techniques
  • Understanding / Validation
    • Modeling for VM image streaming
    • Prototype & evaluation (not highly optimized)

  18. Talk
  • Make (scalable) streaming possible: decouple data delivery from presentation
  • Make scalable streaming effective: profile-based image streaming techniques
  • VMTorrent prototype & evaluation (modeling along the way)

  19. Decoupling Data Delivery from Presentation (Making Streaming Possible)

  20. Generic Virtualization Architecture
  • Virtual Machine Monitor virtualizes hardware
  • Conducts I/O to the image through the file system
  [Diagram: VM, VMM, FS, and VM image on host hardware]

  21. Cloud Virtualization Architecture
  • Network backend used
    • Either to download the image
    • Or to access it via a remote FS
  [Diagram: network backend between the FS and the VM image]

  22. VMTorrent Virtualization Architecture
  • Introduce a custom file system
    • Divides the image into pieces
    • But provides the appearance of a complete image to the VMM
  [Diagram: custom FS and network backend beneath the VMM]

  23. Decoupling Delivery from Presentation
  • VMM attempts to read piece 1
  • Piece 1 is present, read completes
  [Diagram: custom FS presenting pieces 0-8; piece 1 local]

  24. Decoupling Delivery from Presentation
  • VMM attempts to read piece 0
  • Piece 0 isn’t local, read stalls
  • VMM waits for I/O to complete
  • VM stalls

  25. Decoupling Delivery from Presentation
  • FS requests the piece from the backend
  • Backend requests it from the network

  26. Decoupling Delivery from Presentation
  • Later, the network delivers piece 0
  • Custom FS receives it, updates the piece
  • Read completes
  • VMM resumes the VM’s execution
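The stall-and-resume behavior of slides 23-26 can be sketched as follows. This is a minimal illustration, not the actual VMTorrent FUSE code; all names (`CustomFS`, `deliver`, `read_piece`) are hypothetical. Reads of locally present pieces return immediately, while a read of a missing piece blocks until the network backend delivers it.

```python
# Sketch: decoupling data delivery (deliver) from presentation (read_piece).
import threading

class CustomFS:
    def __init__(self, num_pieces):
        self.pieces = [None] * num_pieces                      # piece data; None = not yet local
        self.present = [threading.Event() for _ in range(num_pieces)]

    def deliver(self, idx, data):
        """Called by the network backend when piece `idx` arrives."""
        self.pieces[idx] = data
        self.present[idx].set()                                # wake any stalled reads

    def read_piece(self, idx):
        """Called on behalf of the VMM; stalls until the piece is local."""
        self.present[idx].wait()                               # VM stalls here if piece missing
        return self.pieces[idx]

fs = CustomFS(num_pieces=9)
fs.deliver(1, b"piece-1")
print(fs.read_piece(1))                # piece 1 present: read completes immediately

# Simulate the backend delivering piece 0 a little later.
threading.Timer(0.1, fs.deliver, args=(0, b"piece-0")).start()
print(fs.read_piece(0))                # stalls until delivery, then completes
```

The key point mirrors the slides: the VMM only ever sees a complete-looking image; whether a read completes instantly or stalls depends solely on whether the piece has been delivered yet.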

  27. Decoupling Improves Performance (Primary Storage)
  • No waiting for the image download to complete

  28. Decoupling Improves Performance (Secondary Storage)
  • No more writes or re-reads over the network w/ a remote FS

  29. But Doesn’t Scale
  • Assuming a single server, the time to download a single piece is
    t = W + S / (r_net / n)
  • W: wait time for first bit
  • S: piece size
  • r_net: network speed
  • n: # of clients
  • Transfer time: each client gets r_net / n of the server BW

  30. Read Time Grows Linearly w/ n
  • Assuming a single server, the time to download a single piece is
    t = W + n * S / r_net
  • Transfer time is linear in n

  31. This Scenario (csd)
  [Diagram: custom FS with a client-server network backend]

  32. Decoupling Enables a P2P Backend
  • Alleviates the network storage bottleneck
  • Exchange pieces w/ the swarm
  • P2P copy must remain pristine
  [Diagram: P2P manager between the custom FS and the swarm]

  33. Space Efficient
  • FS uses pointers into the P2P image
  • FS does copy-on-write
  [Diagram: custom FS pieces pointing into the P2P manager’s pristine copy]
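The space-efficient design of slide 33 can be sketched as below. This is a simplified illustration with hypothetical names (`CowImage`, `read`, `write`): the FS view references the pristine P2P pieces directly, and a write copies the piece into private storage first, so the shared copy stays unmodified and remains exchangeable with the swarm.

```python
# Sketch of copy-on-write over a pristine, swarm-shared piece array.
class CowImage:
    def __init__(self, p2p_pieces):
        self.p2p = p2p_pieces        # pristine pieces, shared with the swarm
        self.private = {}            # idx -> locally modified copy

    def read(self, idx):
        # Prefer the private copy if the VM has written this piece.
        return self.private.get(idx, self.p2p[idx])

    def write(self, idx, data):
        # Copy-on-write: never touch the pristine P2P piece.
        self.private[idx] = data

p2p = [b"a", b"b", b"c"]
img = CowImage(p2p)
img.write(1, b"B")
print(img.read(1), p2p[1])   # b'B' b'b' -- the P2P copy is unchanged
```

Unmodified pieces cost no extra space beyond the pointer, and the swarm can keep serving the pristine pieces even after the VM writes to its view.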

  34. Minimizing Stall Time
  • Non-local piece accesses trigger high-priority requests
  [Diagram: demand request for piece 4 forwarded to the swarm]

  35. P2P Helps
  • Now, the time to download a single piece is
    t = W(d) + S / r_net
  • W(d): wait time for first bit as a function of
    • d: piece diversity
  • S: piece size
  • r_net: network speed
  • n: # of peers
  • Transfer time independent of n
  • Wait is a function of diversity

  36. High Diversity → Swarm Efficiency

  37. Low Diversity → Little Benefit
  • Nothing to share

  38. P2P Helps, But Not Enough
  • All peers request the same pieces at the same time: t = W(d) + S / r_net
  • Low piece diversity
  • Long wait (gets worse as n grows)
  • Long download times

  39. This Scenario (p2pd)
  [Diagram: custom FS with a P2P backend and swarm]

  40. Profile-based Image Streaming Techniques (Making Streaming Effective)

  41. How to Increase Diversity?
  • Need to fetch pieces that are
    • Rare: not yet demanded by many peers
    • Useful: likely to be used by some peer

  42. Profiling
  • Need useful pieces
  • But only a small % of the VM image is accessed
  • We need to know which pieces are accessed
  • Also, when (needed later for piece selection)

  43. Build Profile
  • One profile for each VM/workload
    • Run one or more times (even online)
  • Use the FS to track
    • Which pieces are accessed
    • When pieces are accessed
  • Entries w/ average appearance time, piece index, and frequency
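Profile construction as described on slide 43 can be sketched like this. The trace and entry formats are assumptions for illustration: each recorded run is a list of (first-access time, piece index) pairs, and the profile aggregates them into per-piece entries with average appearance time and access frequency, ordered by time.

```python
# Sketch: aggregate FS access traces into a profile of
# (piece index, average appearance time, frequency) entries.
from collections import defaultdict

def build_profile(runs):
    times = defaultdict(list)
    for run in runs:
        for t, idx in run:
            times[idx].append(t)
    profile = [
        {"piece": idx,
         "avg_time": sum(ts) / len(ts),          # average appearance time
         "freq": len(ts) / len(runs)}            # fraction of runs that touched it
        for idx, ts in times.items()
    ]
    return sorted(profile, key=lambda e: e["avg_time"])

run1 = [(0.0, 0), (0.5, 3), (2.0, 7)]   # hypothetical traces
run2 = [(0.1, 0), (0.7, 3)]
for entry in build_profile([run1, run2]):
    print(entry)
```

Sorting by average appearance time is what lets the piece-selection step treat the profile as a predicted playback order.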

  44. Piece Selection
  • Want pieces not yet demanded by many
  • Don’t know the piece distribution in the swarm
    • Guess others are like self
  • Profile gives an estimate of when pieces are likely needed

  45. Piece Selection Heuristic
  • Randomly (rarest first) pick one of the first k pieces in the predicted playback window
  • Fetch w/ medium priority (demand wins)
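The heuristic of slide 45 can be sketched as follows; the function and parameter names are assumptions, and the returned index is what would be requested at medium priority (a demand fetch for a stalled read still wins). From the profile entries whose predicted access time falls at or after the current playback position, take the first k pieces not yet held and pick one uniformly at random.

```python
# Sketch: window-randomized prefetch selection over a time-sorted profile.
import random

def select_piece(profile, have, now, k=4):
    """Pick a prefetch candidate from the predicted playback window.

    profile: entries sorted by avg_time, as built from access traces
    have:    set of piece indices already local
    now:     current predicted playback position (seconds)
    k:       window width to randomize over
    """
    window = [e["piece"] for e in profile
              if e["avg_time"] >= now and e["piece"] not in have]
    candidates = window[:k]
    return random.choice(candidates) if candidates else None

profile = [{"piece": i, "avg_time": float(i)} for i in range(9)]
have = {2, 3}
print(select_piece(profile, have, now=1.0))   # one of pieces 1, 4, 5, or 6
```

Randomizing within the window is what keeps peers that launched at the same time from all requesting the same next piece, which is the diversity the swarm needs.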

  46. Profile-based Prefetching
  • Increases diversity
  • Helps even w/ no peers (when the ideal access rate exceeds the network rate)

  47. Obtain Full P2P Benefit
  • Profile-based window-randomized prefetch: t = W(d) + S / r_net
  • High piece diversity
  • Short wait (shouldn’t grow much w/ n)
  • Quick piece download

  48. Full VMTorrent Architecture (p2pp)
  [Diagram: custom FS, profile-driven P2P manager, and swarm]

  49. Prototype

  50. VMTorrent Prototype
  • P2P manager: custom C++ & libtorrent
  • Custom FS: custom C using FUSE
  [Diagram: BT swarm, custom FS, P2P manager, profile]