
Introduction to Xen - A Hypervisor (on x86)

Introduction to Xen - A Hypervisor (on x86). Advisor: Chih-Wen Hsueh. Student: Tang-Hsun Tu. National Taiwan University, Graduate Institute of Networking and Multimedia, Wireless Networking and Embedded Systems Laboratory, Real-Time System Software Group. September 20, 2014.


Presentation Transcript


  1. Introduction to Xen - A Hypervisor (on x86) • Advisor: Chih-Wen Hsueh • Student: Tang-Hsun Tu • National Taiwan University, Graduate Institute of Networking and Multimedia • Wireless Networking and Embedded Systems Laboratory, Real-Time System Software Group • September 20, 2014

  2. Outline • Introduction (What Is Virtualization?, Why Is Virtualization Difficult?, How to Virtualize?) • Xen Architecture (Hypervisor, CPU Virtualization, Memory Virtualization, I/O Device Virtualization) • Hardware-Assisted Virtualization • Conclusion

  3. Outline • Introduction (What Is Virtualization?, Why Is Virtualization Difficult?, How to Virtualize?) • Xen Architecture (Hypervisor, CPU Virtualization, Memory Virtualization, I/O Device Virtualization) • Hardware-Assisted Virtualization • Conclusion

  4. What Is Virtualization? • Motivations: fully utilizing hardware, sharing hardware resources, running applications across platforms, security, etc. • The result: the virtual machine!

  5. Why Is Virtualization Difficult? (1/2) • The guest OS is moved out of ring 0 to ring 1 or ring 3 • 0/1/3 ring model, e.g. x86_32 • 0/3/3 ring model, e.g. x86_64, ARM • On x86, some sensitive instructions are not privileged instructions, so they cannot be trapped when the OS runs outside ring 0

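  6. Why Is Virtualization Difficult? (2/2) - Examples • SGDT, SIDT and SLDT: there is only one gdtr, idtr and ldtr on a CPU, and these instructions read them without trapping, so a guest sees the real registers • POP ss: the load must satisfy RPL = CPL = DPL, but the guest OS's CPL changes from 0 to 1 or 3!

      SGDT m        // save gdtr to memory
      SIDT m        // save idtr to memory
      SLDT r/m16    // save ldtr to memory
      POP ss        // need to satisfy RPL=CPL=DPL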
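
The POP ss case can be made concrete with a small model of the privilege check. This is only a sketch of the simplified rule stated on the slide (RPL = CPL = DPL); the function names are hypothetical, and real hardware performs additional checks that are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Requested privilege level: the low two bits of a segment selector. */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 0x3u;
}

/*
 * Simplified model of the rule cited on the slide for loading SS:
 * the load succeeds only when RPL == CPL == DPL.
 * Hypothetical helper for illustration; not a full model of the CPU.
 */
bool ss_load_allowed(uint16_t selector, unsigned cpl, unsigned dpl)
{
    return selector_rpl(selector) == cpl && cpl == dpl;
}
```

With a kernel stack selector whose RPL and DPL are 0, `ss_load_allowed(0x10, 0, 0)` holds on bare metal, while the same selector fails once the guest OS runs with CPL 1, which is why a deprivileged, unmodified kernel breaks on this instruction.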

  7. How to Virtualize? (1/2) • Full virtualization: binary translation • Paravirtualization: hypercalls • Hardware-assisted virtualization: Intel VT-x & AMD SVM

  8. How to Virtualize? (2/2) • Hypervisor (VMM) types • Type I + microkernel: Xen (open source, Citrix), Microsoft Hyper-V • Type I + integrated kernel: VMware ESX, KVM (Kernel-based Virtual Machine) • Type II (host OS + guest OS): VMware GSX, VMware Workstation, Microsoft Virtual PC, Microsoft Virtual Server, Sun VirtualBox

  9. Outline • Introduction (What Is Virtualization?, Why Is Virtualization Difficult?, How to Virtualize?) • Xen Architecture (Hypervisor, CPU Virtualization, Memory Virtualization, I/O Device Virtualization) • Hardware-Assisted Virtualization • Conclusion

  10. Xen Architecture (1/2) • Components shown in the figure: Domain 0, Domain U, hypervisor

  11. Xen Architecture (2/2) • Comparison with a conventional Linux system

  12. Xen Architecture • Boot • Hypervisor: hypercalls & system calls, event channel, grant table • CPU Virtualization: virtual CPU architecture, scheduling, interrupts • Memory Virtualization: shared info page, memory architecture, translation • I/O Device Virtualization: split device driver, device I/O ring • Build System: build Xen, build XCI

  13. Boot • Paravirtualized guest OSes: start in "protected mode" and use the start info page, whose address is put in the "esi" register • HVM guest OSes: start in "real mode" (emulated BIOS), with QEMU

  14. Hypervisor - Hypercalls & System Calls (1/2) • Both pass the call number in eax • System call: int 0x80 • Hypercall: int 0x82 • Hypercall flow: the guest OS calls HYPERVISOR_sched_op, which traps with int 82h; the hypervisor's hypercall entry looks up hypercall_table and runs do_sched_op; iret then resumes the guest OS

      // linux/include/asm/unistd.h
      #define __NR_restart_syscall 0
      #define __NR_exit            1
      #define __NR_fork            2
      #define __NR_read            3
      …

      // xen/include/public/xen.h
      #define __HYPERVISOR_set_trap_table 0
      #define __HYPERVISOR_mmu_update     1
      #define __HYPERVISOR_set_gdt        2
      #define __HYPERVISOR_stack_switch   3
      …
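
The table lookup in the hypercall flow can be sketched as a function-pointer table indexed by the number the guest placed in eax. The four hypercall numbers are the ones listed on the slide; the handler bodies, `NR_HYPERCALLS`, `ENOSYS_ERR` and `dispatch_hypercall` are assumptions made up for this illustration and are not Xen's actual code.

```c
#include <stddef.h>

/* Hypercall numbers as listed on the slide (xen/include/public/xen.h). */
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update     1
#define __HYPERVISOR_set_gdt        2
#define __HYPERVISOR_stack_switch   3

#define NR_HYPERCALLS 4      /* only the four shown above; illustrative */
#define ENOSYS_ERR    (-38)  /* stand-in for "no such hypercall" */

typedef long (*hypercall_fn)(long arg);

/* Hypothetical stub handlers; each returns a distinct value so the
 * dispatch is visible to the caller. */
long do_set_trap_table(long arg) { return 100 + arg; }
long do_mmu_update(long arg)     { return 200 + arg; }
long do_set_gdt(long arg)        { return 300 + arg; }
long do_stack_switch(long arg)   { return 400 + arg; }

static const hypercall_fn hypercall_table[NR_HYPERCALLS] = {
    [__HYPERVISOR_set_trap_table] = do_set_trap_table,
    [__HYPERVISOR_mmu_update]     = do_mmu_update,
    [__HYPERVISOR_set_gdt]        = do_set_gdt,
    [__HYPERVISOR_stack_switch]   = do_stack_switch,
};

/* Models the trap handler: `eax` is the hypercall number supplied by the
 * guest; an unknown number is rejected instead of indexing past the table. */
long dispatch_hypercall(unsigned long eax, long arg)
{
    if (eax >= NR_HYPERCALLS || hypercall_table[eax] == NULL)
        return ENOSYS_ERR;
    return hypercall_table[eax](arg);
}
```

The bounds check matters because the number comes from an untrusted guest; the system call path in a conventional kernel uses the same table-dispatch idea with its own numbering.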

  15. Hypervisor - Hypercalls & System Calls (2/2) • How do system calls work with hypercalls? Natively, user space (ring 3) makes system calls to OS services (ring 0); under Xen, an application's system call reaches the guest OS (ring 1), and the guest OS makes hypercalls to hypervisor services (ring 0) • HVM guests can use SYSENTER/SYSCALL • How do applications issue hypercalls? User-space tools such as xm and xend call ioctl() on the privcmd service (exposed through procfs) in the guest OS, which issues the hypercall to the hypervisor

  16. Hypervisor - Grant Table • A grant reference (GR) refers to a grant entry • A request with an index • Used in communication • Page mapping & page transferring • Page mapping between Domain A and Domain B: create GR, send GR, map page, access page, unmap page, inform, release GR • Page transferring between Domain A and Domain B: create GR, send GR, transfer page, inform, receive page, release GR
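
The page-mapping sequence can be sketched with a toy grant table in which a grant reference is simply an index into an array of entries. Everything here (type names, field layout, `NR_GRANTS`, the four functions) is a hypothetical simplification for illustration and does not reproduce Xen's real grant-table structures or hypercalls.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_GRANTS     8      /* illustrative table size */
#define GRANT_INVALID (-1)

typedef struct {
    bool     in_use;   /* entry currently describes a granted page */
    uint16_t domid;    /* domain that is allowed to map the page */
    uint32_t frame;    /* frame number of the shared page */
    unsigned mapped;   /* number of active mappings */
} grant_entry;

typedef struct {
    grant_entry e[NR_GRANTS];
} grant_table;

/* Granting side: create a grant; the returned GR is the entry's index. */
int grant_create(grant_table *t, uint16_t domid, uint32_t frame)
{
    for (int gr = 0; gr < NR_GRANTS; gr++) {
        if (!t->e[gr].in_use) {
            t->e[gr].in_use = true;
            t->e[gr].domid  = domid;
            t->e[gr].frame  = frame;
            t->e[gr].mapped = 0;
            return gr;
        }
    }
    return GRANT_INVALID;
}

/* Mapping side: map the page named by a GR received from the granter. */
bool grant_map(grant_table *t, int gr, uint16_t domid, uint32_t *frame)
{
    if (gr < 0 || gr >= NR_GRANTS)
        return false;
    grant_entry *g = &t->e[gr];
    if (!g->in_use || g->domid != domid)
        return false;          /* not granted to this domain */
    g->mapped++;
    *frame = g->frame;
    return true;
}

/* Mapping side: drop one mapping before informing the granter. */
bool grant_unmap(grant_table *t, int gr)
{
    if (gr < 0 || gr >= NR_GRANTS || t->e[gr].mapped == 0)
        return false;
    t->e[gr].mapped--;
    return true;
}

/* Granting side: release the GR; refused while the page is still mapped. */
bool grant_release(grant_table *t, int gr)
{
    if (gr < 0 || gr >= NR_GRANTS || !t->e[gr].in_use || t->e[gr].mapped > 0)
        return false;
    t->e[gr].in_use = false;
    return true;
}
```

The refusal in `grant_release` mirrors the ordering on the slide: the mapping domain unmaps and informs first, and only then does the granting domain release the GR.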

  17. Hypervisor - Event Channel • A lightweight signal mechanism • Uses "ports" as identifiers (pending + mask bits) • Four major purposes: IDC (inter-domain communication), IPI (inter-processor interrupt), vIRQ (virtual IRQ), pIRQ (physical IRQ) • [Figure: an event channel bitmap (bits 15..0; port 0, port 1, …) delivering events to the VCPUs of guest OSes; the hypervisor (virtual memory, virtual CPU, scheduling) sits above the hardware (physical CPU, physical memory, Eth0, Eth1)]
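
The pending and mask bits can be sketched as two bitmaps indexed by port. The 16-port size follows the bit range drawn in the figure; the structure and function names are hypothetical, and this is a simplified model rather than Xen's actual event-channel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_PORTS 16   /* the figure shows bits 15..0; illustrative size */

typedef struct {
    uint16_t pending;  /* bit n set: an event is waiting on port n */
    uint16_t mask;     /* bit n set: delivery on port n is suppressed */
} evtchn_state;

/* Sender side: mark the port pending. Returns true when the receiver
 * should be notified, i.e. the port is valid and not masked. */
bool evtchn_send(evtchn_state *s, unsigned port)
{
    if (port >= NR_PORTS)
        return false;
    s->pending |= (uint16_t)(1u << port);
    return (s->mask & (1u << port)) == 0;
}

/* Receiver side: take the lowest-numbered pending, unmasked port and
 * clear its pending bit. Returns -1 when nothing is deliverable. */
int evtchn_next(evtchn_state *s)
{
    uint16_t ready = (uint16_t)(s->pending & ~s->mask);
    for (unsigned port = 0; port < NR_PORTS; port++) {
        if (ready & (1u << port)) {
            s->pending &= (uint16_t)~(1u << port);
            return (int)port;
        }
    }
    return -1;
}
```

A masked port keeps its pending bit, so the event is not lost; it becomes deliverable as soon as the receiver clears the mask.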

  18. CPU Virtualization • Architecture: applications run on guest OSes, guest OSes run on VCPUs, and the hypervisor schedules VCPUs onto PCPUs • 2 scheduling algorithms (non-work-conserving / work-conserving): Simple Earliest Deadline First (SEDF) and Credit

  19. CPU Virtualization - Earliest Deadline First • Assign process priorities according to the deadlines of their current requests • An example with two processes: T1 = (slice, deadline) = (1, 2), T2 = (2, 8) • Timeline for t = 0 to 10: five T1 slices and three T2 slices • Deadlines at successive scheduling points (X = no pending request): (d1, d2) = (2, 8), (X, 8), (4, 8), (X, 8), (6, X), (8, X), (10, 16), (X, 16)
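
The two-process example can be replayed with a small unit-time simulation. The sketch assumes each process is periodic with its period equal to its deadline (the slide gives only slice and deadline), and the names `edf_task` and `edf_simulate` are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int slice;      /* time units needed per request */
    int period;     /* assumed equal to the relative deadline */
    int remaining;  /* work left in the current request */
    int deadline;   /* absolute deadline of the current request */
} edf_task;

/*
 * Simulate `horizon` time units of EDF. out[t] receives '1' + task index
 * for the task that runs in [t, t+1), or '.' when the CPU is idle.
 * `out` must have room for horizon + 1 characters.
 */
void edf_simulate(edf_task *tasks, size_t n, int horizon, char *out)
{
    for (int t = 0; t < horizon; t++) {
        int pick = -1;
        for (size_t i = 0; i < n; i++) {
            /* Release a new request at the start of every period. */
            if (t % tasks[i].period == 0) {
                tasks[i].remaining = tasks[i].slice;
                tasks[i].deadline  = t + tasks[i].period;
            }
            /* Earliest deadline among tasks that still have work wins. */
            if (tasks[i].remaining > 0 &&
                (pick < 0 || tasks[i].deadline < tasks[pick].deadline))
                pick = (int)i;
        }
        if (pick >= 0) {
            tasks[pick].remaining--;
            out[t] = (char)('1' + pick);
        } else {
            out[t] = '.';
        }
    }
    out[horizon] = '\0';
}
```

Running it with T1 = (1, 2) and T2 = (2, 8) for 10 time units yields the string `12121.1.12`: T1 runs five times and T2 three times, matching the slice counts and the deadline sequence listed on the slide.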

  20. CPU Virtualization - SEDF • Each VCPU is described by (slice, period, deadline) • Two queues: a run queue ordered by deadline (d1 < d2 < d3 < d4 …) and a wait queue ordered by s (s1 < s2 < s3 …) • Cannot do load balancing on SMP, e.g. 3 domains (A: 80%, B: 80%, C: 30%) on 2 PCPUs
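
One way to read the three-domain example is as a partitioning problem: the total demand (190%) fits within two PCPUs (200%), yet no fixed assignment of whole domains to PCPUs keeps both at or below 100%. The brute-force check below is a sketch of that arithmetic under this interpretation; `fits_without_migration` is a hypothetical helper, not part of SEDF.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the domains, given as CPU demands in percent, can be
 * split statically across two PCPUs so that neither exceeds 100%.
 * Tries every assignment, so n must be small (fewer than 32 domains).
 */
bool fits_without_migration(const int *load_pct, size_t n)
{
    for (unsigned long assign = 0; assign < (1ul << n); assign++) {
        int cpu[2] = { 0, 0 };
        /* Bit i of `assign` selects the PCPU that hosts domain i. */
        for (size_t i = 0; i < n; i++)
            cpu[(assign >> i) & 1ul] += load_pct[i];
        if (cpu[0] <= 100 && cpu[1] <= 100)
            return true;
    }
    return false;
}
```

For {80, 80, 30} every split places two domains together (160% or 110%), so the check fails; for a set such as {80, 70, 30} it succeeds because 70 + 30 fits on one PCPU.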

  21. CPU Virtualization - Credit • Each PCPU has a VCPU list, kept as a priority queue (e.g. VCPU1 under, VCPU2 under, VCPU3 under, VCPU4 over) • Two priority states: over and under • Over: consume > allocate • Under: consume < allocate • If a PCPU has no "under" VCPU, the hypervisor selects an "under" VCPU from another PCPU • (weight, cap) -> credit -> under or over
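
The selection rule can be sketched as follows. The sketch models credit as allocation minus consumption and treats a positive balance as "under" (the slide does not say how a balance of exactly zero is classified, so that boundary is an assumption); all type and function names are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int id;
    int credit;   /* allocated minus consumed; > 0 is treated as "under" */
} vcpu;

typedef struct {
    const vcpu *queue;   /* this PCPU's VCPU list */
    size_t      len;
} pcpu;

/* First "under" VCPU on one PCPU's list, or NULL if all are "over". */
const vcpu *first_under(const pcpu *p)
{
    for (size_t i = 0; i < p->len; i++)
        if (p->queue[i].credit > 0)
            return &p->queue[i];
    return NULL;
}

/*
 * Pick the next VCPU for PCPU `self`: prefer a local "under" VCPU, then
 * take an "under" VCPU from another PCPU, and only then fall back to a
 * local "over" VCPU. Returns NULL when there is nothing to run.
 */
const vcpu *credit_pick(const pcpu *pcpus, size_t n, size_t self)
{
    const vcpu *v = first_under(&pcpus[self]);
    if (v != NULL)
        return v;
    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;
        v = first_under(&pcpus[i]);
        if (v != NULL)
            return v;
    }
    return pcpus[self].len > 0 ? &pcpus[self].queue[0] : NULL;
}
```

Looking at other PCPUs before running a local "over" VCPU is what lets this scheduler balance load across processors.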

  22. CPU Virtualization - Interrupt (1/2) • Interrupt controllers: 8259A; IOAPIC + LAPIC • Interrupt sources shown: PIT, keyboard, RTC

  23. CPU Virtualization - Interrupt (2/2) • Physical interrupts: for the hypervisor or for guest OSes • Virtual interrupts: ask guest OSes to do the handling; 8 for now (max is 24) • [Figure: natively, a device raises IRQn through the PIC and the OS runs the ISR; under Xen, the hypervisor receives IRQn from the PIC and passes an event to the guest OSes]

  24. Memory Virtualization - Memory Architecture (1/2) • Native: two-level memory (application: virtual memory; OS: physical memory) • With a hypervisor: three-level memory (virtual, pseudo-physical, machine) • Application: virtual memory • Guest OS: pseudo-physical memory • Hypervisor: machine memory • P2M and M2P translate between pseudo-physical and machine memory
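
The P2M and M2P mappings can be pictured as two arrays that invert each other: one indexed by the guest's pseudo-physical frame number (PFN) and one indexed by the machine frame number (MFN). The sizes, names and the `INVALID_FRAME` marker below are assumptions chosen for a small, self-contained sketch.

```c
#include <stdint.h>

#define NR_PFNS       4            /* guest pseudo-physical frames */
#define NR_MFNS       16           /* machine frames */
#define INVALID_FRAME UINT32_MAX   /* marks an unmapped entry */

typedef struct {
    uint32_t p2m[NR_PFNS];   /* pseudo-physical frame -> machine frame */
    uint32_t m2p[NR_MFNS];   /* machine frame -> pseudo-physical frame */
} frame_maps;

/* Start with every entry unmapped. */
void maps_init(frame_maps *m)
{
    for (uint32_t i = 0; i < NR_PFNS; i++)
        m->p2m[i] = INVALID_FRAME;
    for (uint32_t i = 0; i < NR_MFNS; i++)
        m->m2p[i] = INVALID_FRAME;
}

/* Back guest frame `pfn` with machine frame `mfn`, updating both tables
 * so they stay inverse to each other. Returns 0 on success, -1 if either
 * frame number is out of range. */
int maps_assign(frame_maps *m, uint32_t pfn, uint32_t mfn)
{
    if (pfn >= NR_PFNS || mfn >= NR_MFNS)
        return -1;
    m->p2m[pfn] = mfn;
    m->m2p[mfn] = pfn;
    return 0;
}

uint32_t pfn_to_mfn(const frame_maps *m, uint32_t pfn)
{
    return pfn < NR_PFNS ? m->p2m[pfn] : INVALID_FRAME;
}

uint32_t mfn_to_pfn(const frame_maps *m, uint32_t mfn)
{
    return mfn < NR_MFNS ? m->m2p[mfn] : INVALID_FRAME;
}
```

The guest sees contiguous PFNs (0, 1, 2, …) even though the machine frames behind them, such as 9 and 3, can be scattered anywhere in machine memory.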

  25. Memory Virtualization - Memory Architecture (2/2) • 168 MB of memory for the hypervisor • Labels shown in the figure: 0xFC000000, 0xFC400000, Heap, 0xFFFFFFFF

  26. Memory Virtualization - Translation (1/2) • 4 mechanisms to manipulate page tables • Paravirtualized page tables • Writable page tables (only level 1 is writable) • Shadow page tables • Hardware-assisted paging, HAP (Intel: Extended Page Tables; AMD: Nested Page Tables) • [Figure: the guest page table maps virtual memory to pseudo-physical frames (VM->PFN); the shadow page table used by the MMU maps VM->MFN (or VM->P2M), updated on page faults; with HAP, second-level paging maps pseudo-physical memory to machine memory through the P2M]
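
The relationship between the guest page table, the P2M and the shadow page table can be sketched as a composition of two lookups. Real page tables are multi-level and carry permission bits; the flat arrays, the `NOT_PRESENT` marker and `shadow_sync` below are simplifying assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define NOT_PRESENT UINT32_MAX   /* marks an unmapped page */

/*
 * Rebuild a flat shadow table from a flat guest page table and the P2M:
 *   guest_pt[vpn] gives the pseudo-physical frame (VM->PFN),
 *   p2m[pfn]      gives the machine frame,
 *   shadow[vpn]   receives the machine frame the MMU should use (VM->MFN).
 */
void shadow_sync(const uint32_t *guest_pt, size_t n_vpn,
                 const uint32_t *p2m, size_t n_pfn,
                 uint32_t *shadow)
{
    for (size_t vpn = 0; vpn < n_vpn; vpn++) {
        uint32_t pfn = guest_pt[vpn];
        if (pfn == NOT_PRESENT || pfn >= n_pfn)
            shadow[vpn] = NOT_PRESENT;   /* access will fault */
        else
            shadow[vpn] = p2m[pfn];
    }
}
```

With hardware-assisted paging the processor performs the second lookup itself, so no shadow copy has to be kept in sync by software.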

  27. Memory Virtualization - Translation (2/2) • Comparison • N is the number of page tables across all guests; M is the number of guests

  28. Memory Virtualization - Shared Info Page • Structure: per-VCPU information (MAX is 32 VCPUs), event channel, TSC, memory, wall clock • Compare with start_info_page

  29. I/O Device Virtualization - Device Model • The hypervisor provides three mechanisms for using devices: emulated devices, paravirtualized drivers, pass-through

  30. I/O Device Virtualization - Emulated Devices • Implemented by QEMU (QEMU-DM) • e.g. sound cards: ac97, sb16, etc.

  31. I/O Device Virtualization - Paravirtualized Driver • Split device driver model: front-end driver, back-end driver, native driver • An example of sending packets

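  32. I/O Device Virtualization - I/O Ring • The ring carries only requests and replies, not the data itself • An example with GR: Dom U places grant references (GR) from its grant table on the I/O channel; Dom 0 takes them, the hypervisor's active grant table tracks them, and Dom 0 accesses the device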
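
A request/response ring of this kind can be sketched with free-running producer and consumer indices. Each request carries only an identifier and a grant reference that names the data page. The layout below (separate request and response arrays, `RING_SIZE` of 4, all names) is a hypothetical simplification and not the layout of Xen's ring macros.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4u   /* illustrative; a power of two */

typedef struct { uint32_t id; uint32_t gref;  } ring_req;  /* GR, no data */
typedef struct { uint32_t id; int32_t status; } ring_rsp;

typedef struct {
    ring_req req[RING_SIZE];
    ring_rsp rsp[RING_SIZE];
    uint32_t req_prod, req_cons;   /* written by front end / back end */
    uint32_t rsp_prod, rsp_cons;   /* written by back end / front end */
} io_ring;

void ring_init(io_ring *r)
{
    r->req_prod = r->req_cons = 0;
    r->rsp_prod = r->rsp_cons = 0;
}

/* Front end (Dom U): queue a request; fails when the ring is full. */
bool ring_put_request(io_ring *r, ring_req rq)
{
    if (r->req_prod - r->req_cons == RING_SIZE)
        return false;
    r->req[r->req_prod % RING_SIZE] = rq;
    r->req_prod++;
    return true;
}

/* Back end (Dom 0): take the oldest request; fails when none is queued. */
bool ring_get_request(io_ring *r, ring_req *out)
{
    if (r->req_cons == r->req_prod)
        return false;
    *out = r->req[r->req_cons % RING_SIZE];
    r->req_cons++;
    return true;
}

/* Back end (Dom 0): queue the reply for a completed request. */
bool ring_put_response(io_ring *r, ring_rsp rs)
{
    if (r->rsp_prod - r->rsp_cons == RING_SIZE)
        return false;
    r->rsp[r->rsp_prod % RING_SIZE] = rs;
    r->rsp_prod++;
    return true;
}

/* Front end (Dom U): collect the oldest reply. */
bool ring_get_response(io_ring *r, ring_rsp *out)
{
    if (r->rsp_cons == r->rsp_prod)
        return false;
    *out = r->rsp[r->rsp_cons % RING_SIZE];
    r->rsp_cons++;
    return true;
}
```

Because each index has a single writer, the two sides can progress independently; a real implementation would also need memory barriers and an event-channel notification, which are omitted here.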

  33. I/O Device Virtualization - Pass-Through • Pass the device through and use it directly • [Figure: Dom 0 and Dom U each run a native driver; the hypervisor (virtual memory, virtual CPU, scheduling) sits above the hardware (physical CPU, physical memory, Eth0, Eth1)]

  34. Hardware Virtual Machine (1/3) • Intel Virtualization Technology

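  35. Hardware Virtual Machine (2/3) • Intel VT-x: supported if CPUID.1:ECX.VMX[bit 5] = 1 • Architecture without VT-x: guest apps in ring 3, guest OS in ring 1, hypervisor in ring 0 • Architecture with VT-x: the hypervisor runs in root mode; the guest OS (ring 0) and guest apps (ring 3) run in non-root mode • VMLAUNCH and VMRESUME enter the guest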
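
The CPUID test on this slide comes down to checking one bit. The helper below takes the ECX value returned by CPUID leaf 1 as a parameter so that the sketch stays portable; the function and macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_VMX_BIT 5   /* CPUID.1:ECX.VMX[bit 5] */

/* True if the ECX value returned by CPUID leaf 1 advertises VMX. */
bool ecx_reports_vmx(uint32_t cpuid1_ecx)
{
    return ((cpuid1_ecx >> CPUID1_ECX_VMX_BIT) & 1u) != 0;
}
```

On an x86 machine with GCC or Clang, the ECX value could be obtained with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>` and passed to this helper.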

  36. Hardware Virtual Machine (3/3) • Uses BIOS code from Bochs • Replaces several functions, e.g. SYSENTER • HVM devices: QEMU-DM

  37. Build Xen - Xen Source Tree • http://rswiki.csie.org/lxr/http/source/?v=xen-3.4.1 • Annotations on the tree: a mini paravirtualized OS; QEMU-DM, bootloader, xm, xend, …; hypervisor

  38. Build Xen - Screenshot

  39. Build Xen - A Simplest Xen Kernel • Headers to tell the Xen loader about the guest • Bootstrap code that loads the stack and passes esi (the start info page address) to start_kernel • A minimal OS that prints "Hello World" through a hypercall • Memory layout: 0x0 _start, stack_start; 0x1000 shared_info; 0x2000 hypercall_page; 0x3000 …

      /* Headers for the Xen loader */
      #include <arch-x86_32.h>
      .section __xen_guest
          .ascii "GUEST_OS=Hacking_Xen_Example"
          .ascii ",XEN_VER=xen-3.0"
          .ascii ",VIRT_BASE=0x0"
          .ascii ",ELF_PADDR_OFFSET=0x0"
          .ascii ",HYPERCALL_PAGE=0x2"    /* page number of the hypercall page */
          .ascii ",PAE=yes"
          .ascii ",LOADER=generic"
          .byte 0

      /* Bootstrap */
      _start:
          cld
          lss stack_start,%esp    /* load the initial stack */
          push %esi               /* argument: start info page address */
          call start_kernel

      /* Kernel entry point */
      void start_kernel(start_info_t *start_info)
      {
          HYPERVISOR_console_io(CONSOLEIO_write, 12, "Hello World\n");
          while (1);
      }

  40. Build XCI - Xen Client Initiative (1/2) • Goals • Creating a minimal Xen environment, i.e. Xen hypervisor + Linux domain 0, suitable for clients • Supporting more devices through ioemu • XCI consists of three subprojects • Hypervisor (original code + patches + new management tools) • ioemu (separated from the original Xen source tree) • Domain-0 Linux

  41. Build XCI - Xen Client Initiative (2/2) • Only x86, ia64 and arm in “arch” directory

  42. Experimental Environment • CPU: Intel Core2 U9400 1.4GHz (use one core) • Memory: 512MB • Network Interface Card: Atheros AR8131 (at 100MBps) • Hypervisor: Xen 3.4.2 • Dom-0: Linux 2.6.18.8 • Guest OS: Windows XP • CPU Benchmark Tools: Chrome V8 Benchmark Suite, SuperPI 1.1e • Hard Disk Drive Benchmark Tools: HD Tune Pro v3.50 • Network Benchmark Tools: Iperf (Server: 2.0.4, Client: 1.7.0)

  43. CPU Benchmark (1/2) 8.3%

  44. CPU Benchmark (2/2) 5%

  45. Network Benchmark (1/2) • Testing Time: 180 seconds • Benchmark Deviation: 0.12%~0.26% • Figure label: 59%

  46. Network Benchmark (2/2) • Average: 9.82% • Sample Period: 2 seconds

  47. Conclusion • We introduced the techniques used to virtualize, i.e. full virtualization, paravirtualization and hardware-assisted virtualization • We presented the architecture of Xen • Several parts of Xen were also introduced

  48. Q & A
