
VDN: Virtual Machine Image Distribution Network for Cloud Data Centers

Chunyi Peng 1, Minkyong Kim 2, Zhe Zhang 2, Hui Lei 2. 1 University of California, Los Angeles; 2 IBM T.J. Watson Research Center. IEEE INFOCOM 2012, Orlando, Florida, USA.


Presentation Transcript


  1. Chunyi Peng1, Minkyong Kim2, Zhe Zhang2, Hui Lei2 1University of California, Los Angeles 2IBM T.J. Watson Research Center VDN: Virtual Machine Image Distribution Network for Cloud Data Centers IEEE INFOCOM 2012 Orlando, Florida USA

  2. Cloud Computing: the delivery of computing as a service. C Peng (UCLA)

  3. Service Access in Virtual Machine Instances
  • Cloud clients (web browser, mobile app, thin client, terminal emulator, …) issue service requests (e.g., HTTP)
  • Software as a Service (SaaS), the application layer: CRM, email, virtual desktop, communications, games, …
  • Platform as a Service (PaaS), the platform layer: execution runtime, database, web server, development tools, …
  • Infrastructure as a Service (IaaS), the infrastructure layer: virtual machines, servers, storage, load balancers, networks, …
  • Problem: on-demand VM provisioning (picture source: http://www.wikimedia.org)

  4. Time for VM Image Provisioning
  • Pipeline: user request → request processing → VM image transfer → VM boot-up
  • Our focus: the VM image transfer time
  • Response takes several to tens of minutes in reality!

  5. Why Slow?
  • VM image files are large (several to tens of GB)
  • Centralized image storage (a single image server behind the core/aggregation/access/ToR-switch hierarchy) becomes a bottleneck

  6. Roadmap
  • Basic VDN idea: enable collaborative sharing
  • VDN solution for efficient sharing
    • Basic sharing units
    • Metadata management
  • Performance evaluation
  • Conclusion

  7. VDN: Speed up VM Image Distribution
  • Enable collaborative sharing
  • Utilize the “free” copies of VM images already present on running hosts (e.g., RH5.5, RH5.6, and RH6.0 instances spread across the data center)
  • Exploit source diversity and make full use of network bandwidth

  8. How to Enable Collaborative Sharing?
  • What is the basic data unit for sharing?
    • File-based sharing: allows sharing only among identical files
    • Chunk-based sharing: allows sharing of common chunks across different files
  • How to manage content location information?
    • Centralized solution: directory service, etc.
    • Distributed solution: P2P overlay, etc.

  9. What is the Appropriate Sharing Unit?
  • Two factors
    • The number of identical, live VM image instances
    • The similarity of different VM images
  • Conducted real-trace analysis and cross-image similarity measurement
    • VM traces from six operational data centers over 4 months
    • VM images including different Linux/Windows versions, IBM services (DB2, Rational, WebSphere), etc.

  10. VM Instance Popularity
  • The distribution of image popularity is highly skewed
    • A few popular images account for a large portion of VM instances
    • Many unpopular images have only a small number of VM instances (< 5)
  • Few peers can be involved in file-based sharing of the unpopular VM images

  11. VM Instance Lifetime
  • The lifetime of VM instances varies
    • 40% of instances (mostly of the more popular VM images) live less than 13 minutes
    • The unpopular VM images have longer lifetimes
  • A VM image distribution network should cope with instances of varied lifetimes

  12. VM Image Structure
  • Tree-based VM image structure (share of instances in parentheses):
    • Linux (60%)
      • Red Hat (53%): Enterprise Linux v5.5 32-bit (26.6%), v5.5 64-bit (18.7%), v5.4 32-bit (4%), …, v5.6 32-bit (0.2%)
      • SUSE (7%), …
    • Windows (25%)
    • Services (11%): database, IDE, web app. server, …; e.g., V7.0 B (0.7%), V7.0.0.11 S P (0.7%), V7.0.0.11 R B (0.3%), V7.0.0.11 S B (0.3%), V7.0.0.11 S D (0.2%), V7.0 P (0.1%)
    • Misc (4%)

  13. VM Image Similarity
  • High similarity across VM images
  • Chunking schemes: fixed-size and Rabin fingerprinting
  • Similarity: Sim(A, B) = |A’s chunks that appear in B| / |A|
  • Chunk-based sharing can exploit cross-image similarity
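The similarity metric on the slide above can be sketched in a few lines, assuming fixed-size chunking and SHA-256 chunk hashes (the chunk size and function names here are illustrative, not taken from the paper):

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size; the paper also evaluates Rabin fingerprinting


def chunk_hashes(image: bytes, chunk_size: int = CHUNK_SIZE) -> set:
    """Split an image into fixed-size chunks and return the set of chunk hashes."""
    return {
        hashlib.sha256(image[i:i + chunk_size]).hexdigest()
        for i in range(0, len(image), chunk_size)
    }


def similarity(a: bytes, b: bytes, chunk_size: int = CHUNK_SIZE) -> float:
    """Sim(A, B) = |A's chunks that appear in B| / |A|, over distinct chunks."""
    ca, cb = chunk_hashes(a, chunk_size), chunk_hashes(b, chunk_size)
    return len(ca & cb) / len(ca) if ca else 0.0
```

With Rabin fingerprinting the chunk boundaries would instead be content-defined, making the metric robust to byte insertions; the set-based sketch above also deduplicates repeated chunks within an image.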

  14. Enable Chunk-based Sharing
  • Decompose VM images into chunks
  • Exploit similarity across VM images
  • Provide higher source diversity and more sharing opportunities (e.g., an RH5.6 request served by chunks held on RH5.5, RH5.6, and RH6.0 hosts)
  • Questions:
    • How to maintain chunk location information (metadata)?
    • How to be scalable while enabling fast data transmission?
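A content-addressed chunk index is the natural data structure behind the sharing scheme above: a map from chunk hash to the set of hosts holding that chunk. A minimal sketch, with class and method names that are hypothetical rather than VDN's actual API:

```python
import hashlib
from collections import defaultdict


class ChunkIndex:
    """Chunk-location metadata: chunk hash -> hosts that hold a copy of the chunk."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.locations = defaultdict(set)

    def publish(self, host: str, image: bytes):
        """Register every chunk of an image cached at `host`."""
        for i in range(0, len(image), self.chunk_size):
            h = hashlib.sha256(image[i:i + self.chunk_size]).hexdigest()
            self.locations[h].add(host)

    def lookup(self, chunk_hash: str) -> set:
        """Return all hosts holding this chunk, regardless of which image it came from."""
        return self.locations.get(chunk_hash, set())
```

Because the index is keyed by chunk hash rather than image name, a host booting RH5.6 can fetch chunks published by RH5.5 or RH6.0 hosts whenever the content matches.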

  15. How to Manage Location Information?
  • Solution I: centralized metadata server
    • Pros: simple
    • Cons: bottleneck at the metadata server
  • Solution II: P2P overlay network, e.g., DHT
    • Pros: distributed operations
    • Cons: unaware of the data center topology and may introduce high network overhead

  16. Issues in Conventional P2P Practice
  • One logical operation (lookup/publish) takes multiple physical hops
  • Hop costs (e.g., time) can be high!
  • Solution: reduce the number of hops and the cost of physical hops; keep traffic local or with close buddies

  17. Topology-aware Metadata Management
  • Divide all hosts into hierarchies of different levels (e.g., L1/L2/L3 index nodes above the hosts) and manage chunks at each level
  • Utilize the static/quasi-static (controlled) data center topology
  • Exploit high-bandwidth local links in the hierarchical structure

  18. VDN: Encourage Local Communication
  • Local chunk metadata storage
    • Index nodes maintain only metadata within their own hierarchy
    • No need to maintain a global view at all index nodes
  • Local chunk metadata operations (e.g., lookup/publish)
    • Ask close index nodes first
    • Lower operation overhead
  • Local chunk data delivery
    • Enable high-bandwidth transmission between close hosts (e.g., within the same rack)

  19. VDN Operation Flows
  • Recursive operation from lower hierarchy to higher hierarchy
  • Three flows: (A) metadata update, (B) metadata lookup, (C) data transmission
  • A lookup that misses locally proceeds up the index hierarchy (L1 → L2 → L3) and falls back to the image server; fetched chunks are kept in a local cache
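The recursive flows above can be sketched as a chain of index nodes, each holding metadata only for its own subtree; a miss at one level is retried one level up, falling back to the image server at the top. Node names and the in-memory representation here are illustrative:

```python
class IndexNode:
    """Index node that keeps chunk metadata only for its own subtree (hierarchy level)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # next higher-level index node; None at the top (L3)
        self.chunk_hosts = {}     # chunk hash -> hosts in this subtree holding it

    def publish(self, chunk_hash, host):
        """Flow A: metadata update propagates from the lowest level upward."""
        node = self
        while node is not None:
            node.chunk_hosts.setdefault(chunk_hash, set()).add(host)
            node = node.parent

    def lookup(self, chunk_hash):
        """Flow B: ask the closest index first, retry one level up on a miss."""
        node = self
        while node is not None:
            hosts = node.chunk_hosts.get(chunk_hash)
            if hosts:
                return hosts, node.name   # closest subtree that has the chunk
            node = node.parent
        return None, "image-server"       # top-level miss: fetch from the image server
```

A hit at an L1 index means the chunk can be fetched within the rack; misses climb toward L3 and ultimately the central image server, matching the recursive flows on the slide.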

  20. Performance Evaluation
  • Setting
    • One-month real-trace-driven simulation
    • VM image sizes: 128 MB to 8 GB
    • Tree topology: 4 x 4 x 8 (128 nodes); link bandwidths in the topology figure range from 200 Mbps at the image server through 500 Mbps up to 1-2 Gbps locally, with 1 Gbps disk I/O
    • Network bandwidth: static throughput for one physical link; queue-based simulation for multiple transmissions on one link
  • Schemes
    • Baseline: centralized operation
    • Local: fetch VM chunks from the local host if possible
    • VDN: enable collaborative sharing

  21. Great Speedup on Image Distribution
  • [Figure: provisioning time in the S1 and S6 data centers; at S6, VM image size = 4 GB]

  22. Scalable to Heavy Traffic Loads
  • Adjust request arrival times using a scaling factor of 1-60
  • [Figure: S6, median and 90th-percentile provisioning times]

  23. Low Metadata Management Overhead
  • Compare three metadata management schemes
    • Naïve: on-demand topology-aware broadcast
    • Flat: manage metadata in a ring (e.g., DHT, P2P)
    • Topo: topology-aware design (VDN)
  • Assume per-level communication costs of 1 : 4 : 10 (inverse to bandwidth)
  • [Figures: (a) number of messages, (b) communication cost]

  24. Conclusion
  • VDN is a network-aware P2P paradigm for VM image distribution
    • Reduces image provisioning time
    • Achieves reasonable overhead
  • Chunk-based sharing exploits inherent cross-image similarity
  • Network-aware operations optimize performance in the data center context

  25. Thanks!
