A Comparison of Software and Hardware Techniques for x86 Virtualization

Presentation Transcript

    1. A Comparison of Software and Hardware Techniques for x86 Virtualization
       - Paper by Keith Adams & Ole Agesen (VMware)
       - Presentation by Jason Agron

    2. Presentation Overview
       - What is virtualization?
       - Traditional virtualization techniques
       - Overview of the software VMM
       - Overview of the hardware VMM
       - Evaluation of VMMs
       - Conclusions
       - Questions

    3. Virtualization
       - Defined by Popek & Goldberg in 1974, who established three essential characteristics of a VMM:
         - Fidelity: software running on the VMM behaves the same as software running directly on hardware.
         - Performance: performance on the VMM is close to performance on hardware.
         - Safety: the VMM manages all hardware resources (correctly?).

    4. Is This Definition Correct?
       - Yes, but its scope should be taken into account.
       - It assumes the traditional trap-and-emulate style of full virtualization, which was dominant circa 1974 and is completely transparent.
       - It does not account for paravirtualization, which is not transparent: guest software is modified.

    5. Full Virtualization
       - Full == transparent: the VMM must be able to detect when it needs to intervene.
       - Definitions:
         - Sensitive instruction: accesses and/or modifies privileged state.
         - Privileged instruction: traps when run in an unprivileged mode.

    6. Traditional Techniques: De-privileging
       - Run guest code at a reduced privilege level so that privileged instructions trap.
       - The VMM intercepts the trap and emulates the instruction's effect.
       - Very similar to the way programs transfer control to the OS kernel during a system call.
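
The trap-and-emulate flow described in this slide can be illustrated with a small Python simulation. The instruction names and state fields below are invented for illustration, not real x86 semantics:

```python
# Toy trap-and-emulate simulation; instruction names and state fields are
# invented for illustration and are not real x86 semantics.
class Trap(Exception):
    def __init__(self, insn):
        self.insn = insn

PRIVILEGED = {"cli", "sti", "mov_cr3"}

def deprivileged_execute(insn, state):
    """Guest runs at reduced privilege: privileged instructions trap."""
    if insn[0] in PRIVILEGED:
        raise Trap(insn)
    state["acc"] = insn[1]          # an ordinary instruction runs directly

def vmm_emulate(insn, vstate):
    """The VMM intercepts the trap and emulates against virtual state."""
    if insn[0] == "cli":
        vstate["if"] = 0            # clear the *virtual* interrupt flag
    elif insn[0] == "sti":
        vstate["if"] = 1
    elif insn[0] == "mov_cr3":
        vstate["cr3"] = insn[1]     # repoint the *virtual* page-table base

def run_guest(program):
    state, vstate = {"acc": 0}, {"if": 1, "cr3": 0}
    for insn in program:
        try:
            deprivileged_execute(insn, state)
        except Trap as t:
            vmm_emulate(t.insn, vstate)
    return state, vstate
```

Note that ordinary instructions never enter the VMM at all; only the privileged ones pay the trap cost.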

    7. Traditional Techniques: Primary & Shadow Structures
       - Each virtual system's privileged state differs from that of the underlying hardware.
       - The VMM must therefore provide the environment the guest expects.
       - Guest-level primary structures reflect the state the guest sees.
       - VMM-level shadow structures are copies of the primary structures, kept coherent via memory traces.
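
A minimal sketch of keeping a shadow structure coherent through a write trace, assuming a toy page table keyed by virtual page number; all names are hypothetical, and a real VMM would use hardware write protection rather than an explicit method call:

```python
# Toy primary/shadow page table kept coherent by a write trace; names are
# hypothetical, and real VMMs rely on hardware write protection.
class TracedPageTable:
    def __init__(self):
        self.primary = {}       # guest's view: virtual page -> frame
        self.shadow = {}        # VMM's copy, the one hardware actually walks
        self.trace_hits = 0     # how many guest writes the trace intercepted

    def guest_write(self, vpn, pfn):
        # The primary is write-protected, so this write faults into the VMM,
        # which completes it and refreshes the shadow in the same handler.
        self.trace_hits += 1
        self.primary[vpn] = pfn
        self.shadow[vpn] = pfn
```

The guest only ever sees the primary; the shadow always mirrors it because every write is forced through the trace handler.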

    8. Traditional Techniques: Memory Traces
       - Traps occur when on-chip privileged state is accessed or modified.
       - What about off-chip privileged state, such as page tables?
         - It can be accessed by ordinary LOADs/STOREs, from the CPU or from DMA-capable devices.
       - Hardware page-protection schemes are employed to detect these accesses.

    9. Refinements to Classical Virtualization
       - Traps are expensive!
       - Improve the guest/VMM interface (a.k.a. paravirtualization):
         - Allows higher-level information to be passed to the VMM.
         - Can provide features beyond the baseline of classical virtualization.
       - Improve the VMM/hardware interface:
         - IBM's System/370 interpretive execution mode: guests are allowed safe, direct access to certain pieces of privileged state without trapping.

    10. Software VMM
        - x86 is not classically virtualizable:
          - Privileged state is visible: a guest can observe its privilege level via the unprotected %cs register.
          - Not all sensitive instructions trap: e.g. privileged execution of popf (pop flags) modifies on-chip privileged state, so its unprivileged execution should trap for the VMM to emulate it. Unfortunately, no trap occurs; the privileged update is silently dropped (a no-op).

    11. Software VMM
        - How can x86's shortcomings be overcome? What if guests execute on an interpreter?
        - The interpreter can:
          - Prevent leakage of privileged state.
          - Ensure that all sensitive instructions are correctly detected.
        - It can therefore provide fidelity and safety. Performance??
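
A toy fetch-decode-execute interpreter makes the performance concern concrete: every guest instruction costs at least one host-side dispatch, on top of the work of the instruction itself. The ISA here is invented for illustration:

```python
# Toy fetch-decode-execute interpreter over an invented two-field ISA.
# `steps` counts host-side dispatches: at least one per guest instruction.
def interpret(program):
    regs, pc, steps = {"a": 0}, 0, 0
    while pc < len(program):
        op, arg = program[pc]       # fetch
        steps += 1                  # every guest instruction pays dispatch cost
        if op == "set":             # decode + execute
            regs["a"] = arg
        elif op == "add":
            regs["a"] += arg
        elif op == "jmp":
            pc = arg                # control flow handled by the interpreter
            continue
        pc += 1
    return regs, steps
```

Even this best case pays one dispatch per instruction; a realistic interpreter decoding real x86 encodings pays many host instructions per guest instruction, which motivates binary translation in the following slides.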

    12. Interpreter-Based Software VMM
        - Authors' statement: an interpreter-based VMM will not provide adequate performance, since a single native x86 instruction takes N instructions to interpret.
        - Question: is this necessarily true?
        - Authors' solution: binary translation (BT).

    13. Properties of This BT
        - Dynamic and on-demand: run-time translation is interleaved with code execution, and code is translated only when it is about to execute. This laziness avoids the problem of distinguishing code from data.
        - System-level: all translation rules are set by the x86 ISA.
        - Subsetting: input is the full x86 ISA; output is a safe subset of it, mostly user-mode instructions.
        - Adaptive: the generated code can be optimized over time.

    14. BT Process
        - Input a TU (translation unit), stopping at either:
          - 12 instructions, or
          - a terminating instruction (usually control flow).
        - Translate the TU into a CCF (compiled code fragment).
        - Place the generated CCF into the TC (translation cache).
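
The TU-carving rule above (stop at 12 instructions or just after a terminating instruction) can be sketched as follows; the mnemonic set is a simplified stand-in:

```python
# Carve the next translation unit (TU) from a guest instruction stream:
# stop after 12 instructions or just after a terminating instruction.
CONTROL_FLOW = {"jmp", "call", "ret", "jcc"}   # simplified stand-ins

def next_tu(insns, start):
    """Return the slice of `insns` forming one TU beginning at `start`."""
    end = start
    while end < len(insns) and end - start < 12:
        end += 1
        if insns[end - 1] in CONTROL_FLOW:     # terminating instruction
            break
    return insns[start:end]
```

The terminating instruction is included in the TU, since its translation is what produces the continuation linking to the next fragment.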

    15. BT Process
        - CCFs must be chained together to form a complete program.
        - Each CCF ends in a continuation that acts as a link.
        - Continuations are evaluated at run time:
          - They can be translated into jumps.
          - They can be removed entirely (execution merely falls through to the next CCF).
        - If a continuation is never hit, it is never transformed; the BT thus acts like a just-in-time compiler.
        - The software VMM can switch between BT mode and direct execution as a performance optimization.
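
A rough simulation of on-demand translation with lazily patched continuations, assuming a toy program representation where each fragment carries a body label and the address of its successor (all names hypothetical):

```python
# Toy translation cache with lazy continuation patching. `program` maps a
# guest address to (body, successor address); both are hypothetical stand-ins.
class TranslationCache:
    def __init__(self, program):
        self.program = program
        self.cache = {}             # guest address -> compiled fragment
        self.translations = 0       # how many fragments we actually built

    def fragment(self, addr):
        if addr not in self.cache:  # translate only when about to execute
            self.translations += 1
            body, nxt = self.program[addr]
            self.cache[addr] = {"body": body, "next": nxt, "link": None}
        return self.cache[addr]

    def run(self, addr, trace):
        while addr is not None:
            frag = self.fragment(addr)
            trace.append(frag["body"])          # "execute" the fragment
            if frag["next"] is not None and frag["link"] is None:
                # First time through: patch the continuation into a direct
                # link so later executions skip the lookup.
                frag["link"] = self.fragment(frag["next"])
            addr = frag["next"]
```

Running the same code twice translates each fragment only once; untaken continuations are simply never patched, which is the just-in-time behavior the slide describes.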

    16. Adaptive BT
        - Traps are expensive; BT can avoid some of them.
          - e.g. for the rdtsc instruction: in-TC emulation << call-out-and-emulate << trap-and-emulate.
        - Sensitive non-privileged instructions are harder to avoid, e.g. LOADs/STOREs to privileged data.
        - Use adaptive BT to rework such code.

    17. Adaptive BT
        - Detect instructions that trap frequently.
        - Adapt the translation of those instructions:
          - Re-translate to avoid trapping.
          - Jump directly to the translation, or call out to the interpreter.
        - Adaptive BT tries to eliminate more and more traps over time.
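
The adaptation idea can be sketched as a per-site counter that switches an instruction from the expensive trap path to a direct call-out once it has trapped often enough; the threshold here is an arbitrary illustrative value:

```python
# Per-site adaptation: an instruction site starts on the expensive trap path
# and is "retranslated" to a direct call-out once it traps often enough.
THRESHOLD = 3   # arbitrary illustrative value

def make_site():
    return {"mode": "trap", "count": 0}

def execute(site, emulate):
    if site["mode"] == "trap":
        site["count"] += 1              # expensive trap-and-emulate path
        if site["count"] >= THRESHOLD:
            site["mode"] = "callout"    # adapt: avoid the trap next time
    # Either way the instruction's effect comes from the emulation routine;
    # only the dispatch cost differs between the two modes.
    return emulate()
```

The result is identical in both modes; what adaptation changes is how control reaches the emulation routine, which is where the trap cost lives.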

    18. Hardware VMM
        - An experimental VMM based on the new x86 virtualization extensions: AMD's SVM and Intel's VT.
        - New hardware features:
          - Virtual machine control blocks (VMCBs).
          - A guest-mode privilege level.
          - The ability to transfer control to and from guest mode:
            - vmrun: host to guest.
            - exit: guest to host.

    19. Hardware VMM
        - The VMM executes vmrun to start a guest:
          - Guest state is loaded into hardware from the in-memory VMCB.
          - Guest mode is entered and the guest continues execution.
        - A guest executes until it triggers an event that the VMCB's control bits designate as exiting, at which point an exit occurs:
          - Guest state is saved to the VMCB.
          - VMM state is loaded into hardware, switching to host mode.
          - The VMM begins executing.
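
The vmrun/exit cycle can be simulated with a dictionary standing in for the VMCB. Field and exit-reason names here are invented, not the real SVM/VT encodings, and `guest_step` plays the role of the guest by returning the reason for the next exit:

```python
# Toy vmrun/exit cycle. The dict stands in for a VMCB; field and exit-reason
# names are invented, not real SVM/VT encodings. `guest_step` plays the
# guest, returning the reason for the next exit.
def vmrun(vmcb, guest_step):
    exits = 0
    while not vmcb["halted"]:
        reason = guest_step(vmcb)   # guest runs until an exiting event
        exits += 1                  # exit: hardware saves guest state to VMCB
        if reason == "hlt":
            vmcb["halted"] = True   # VMM emulates the exiting operation...
        elif reason == "io":
            vmcb["io_count"] += 1
        vmcb["rip"] += 1            # ...advances past it, and re-enters guest
    return exits
```

Each iteration of the loop is one round trip: vmrun into the guest, an exit back to the VMM, emulation, and re-entry. The cost of that round trip is exactly what the later benchmark slides measure.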

    20. x86 Architecture Extensions

    21. Qualitative Comparison: Where Software Wins
        - Trap elimination via adaptive BT; the hardware VMM merely replaces traps with exits.
        - Emulation speed: translations and call-outs essentially jump to pre-decoded emulation routines, while the hardware VMM must fetch the VMCB and decode the trapping instruction before emulating.

    22. Qualitative Comparison: Where Hardware Wins
        - Code density: no translation means no replicated code segments.
        - Precise exceptions: the BT approach must perform extra work to recover guest state for faults and interrupts; the hardware approach can simply examine the VMCB.
        - System calls: they [can] run without VMM intervention.

    23. Qualitative Comparison (Summary)
        - Hardware VMMs:
          - Native performance for anything that avoids exits.
          - However, exits are still costly (currently).
          - Strongly targeted toward the trap-and-emulate style.
        - Software VMMs:
          - Carefully engineered to be efficient.
          - Flexible (because they aren't hardware).

    24. Experiments
        - 3.8 GHz Intel Pentium 4, with hyper-threading disabled (because most virtualization products can't handle it).
        - The contenders:
          - A mature commercial software VMM.
          - A recently developed hardware VMM.
        - A fair battle?

    25. SPECint & SPECjbb
        - Primarily user-level computation, largely unaffected by VMMs, so performance should be near native.
        - Experimental results confirm this:
          - 4% average slowdown for the software VMM.
          - 5% average slowdown for the hardware VMM.
        - The cause is host background activity: the Windows jiffy rate is much lower than the Linux jiffy rate, so the Windows test runs closer to native than the Linux test.

    26. Apache ab Benchmark
        - Tests I/O efficiency: the software VMM (and the hardware VMM?) uses the host as its I/O controller, so roughly 2x the overhead of native I/O is expected.
        - Experimental results confirm this: roughly a 2x slowdown; both VMMs perform poorly here.
        - The Windows and Linux tests differ widely:
          - Windows: a single process (less paging); the hardware VMM is better.
          - Linux: multiple processes (more paging); the software VMM is better.
        - Why? (Hint: VMCB.)

    27. PassMark Benchmarks
        - A synthetic suite of microbenchmarks used to pinpoint various aspects of workstation performance.
        - Large RAM test (exhausts memory, intended to test paging): the software VMM wins.
        - 2D Graphics test (heavy on system calls): the hardware VMM wins.

    28. Compile Jobs Test
        - A less synthetic test: compilation time of the Linux kernel, Apache, etc.
        - The software VMM beats the hardware VMM again: a big compilation job with lots of files means lots of page faults, which the software VMM handles better.
        - Compared to native speed:
          - Software VMM: ~60% as fast.
          - Hardware VMM: ~55% as fast.

    29. ForkWait Test
        - Stresses process creation and destruction: system calls, context switching, page-table modifications, page faults, etc.
        - Native: 6.0 seconds.
        - Software VMM: 36.9 seconds.
        - Hardware VMM: 106.4 seconds.

    30. Nanobenchmarks
        - Tests that exercise single virtualization-sensitive operations.
        - All tests are conducted using a specially developed guest OS, FrobOS.

    31. Nanobenchmarks
        - syscall (native == HW << SW): the hardware VMM doesn't intervene; the software VMM traps.
        - in (SW << native << HW): native goes off-chip; the software VMM interacts with its virtual CPU model; the hardware VMM takes an exit.
        - ptemod (native << SW << HW): both take a hit (both use shadowing); the software VMM can adapt but is still far from ideal; the hardware VMM can't adapt, so it must always pay an exit/vmrun round trip.

    32. Analysis of Results
        - Software and hardware VMMs are roughly even, except where BT adaptation helps, e.g. page-table faults vs. exit/vmrun round trips.
        - The authors claim that "we have found few workloads that benefit from current hardware extensions."
        - But hardware extensions are getting faster all the time.
          - Even so, the stateless hardware-VMM approach still has a memory bottleneck in VMCB access.
        - The real trouble with the hardware VMM is MMU virtualization: a hardware-assisted MMU could relieve the VMM of a lot of work, and is being proposed by both AMD and Intel.

    33. Future/Related Work
        - CISC vs. RISC: should the hardware be more complex to support virtualization, or should a complex software VMM be used?
        - Open source: open-source OS code allows paravirtualization.
        - What should the OS/VMM interface be? It should be investigated, standardized, documented, and, most importantly, supported!
        - What should the OS/hardware interface be? This should be examined as well.

    34. Conclusions
        - Hardware extensions now allow x86 to execute guests directly (trap-and-emulate style).
        - Comparison of software and hardware VMMs:
          - Both execute computation-bound workloads at near-native speed.
          - When I/O and process management are involved, software prevails.
          - When there are many system calls, hardware prevails.

    35. Conclusions
        - Software VMM techniques are very mature, and also very flexible.
        - The new x86 extensions are relatively immature and present a fixed (inflexible) interface.
        - Future work on hardware extensions promises to improve performance.
        - Hybrid software/hardware VMMs promise the benefits of both worlds.
        - There is no clear winner at this time.

    36. Questions????
        - Reference: K. Adams and O. Agesen. A comparison of software and hardware techniques for x86 virtualization. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII). ACM Press, New York, NY, 2006, pp. 2-13.