Presentation Transcript


  1. Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines. Joshua LeVasseur, Volkmar Uhlig, Jan Stoess, Stefan Götz (OSDI 2004). Presented by Raju Kumar, CS598C: Virtual Machines

  2. Introduction • Device drivers: ~70% of the Linux 2.4.1 code base for IA32 • A new OS must either • Rewrite its drivers, or • Reuse drivers from another OS • Obstacles to reuse • Unavailable source code • Undocumented features • Extent of programming errors

  3. Contribution • Unmodified reuse of existing device drivers • Strong isolation among device drivers • Fault containment • Configurable extent of driver collocation

  4. Related Work - Reuse • Binary driver reuse: cohosting, as in VMware Workstation • Both the driver OS and the VMM run fully privileged!! • Transplanting • Requires glue code • Raises resource conflicts • Leads to compromises in the new OS • The transplanted driver and the new OS still run fully privileged

  5. Related Work – Semantic Resource Conflicts • Accidental denial of service • Sharing conflicts • The transplanted driver and the host OS are prone to each other's faults • Since the driver and the OS both run fully privileged, they must cooperate • Such cooperation is not possible with transplanting • e.g. a device driver may disable interrupts

  6. Related Work – Engineering Effort • Are reused drivers functioning correctly? • Even with transplanting, 12% of OSKit code is glue • Glue provides • Handling of semantic differences • Interface translation • Writing glue requires donor-OS knowledge • What if there are multiple donor OSes? Writing glue becomes even harder. • What if the driver code in the donor OS gets updated?

  7. Related Work - Dependability • User-level device drivers • Used here, with some differences • Nooks • Isolates drivers within protection domains • But no privilege isolation • Complete fault isolation is not possible • Detection of malicious drivers is not possible • Adds 22,000 lines of privileged code to Linux • Uses interposition services to maintain the integrity of resources shared between drivers • This work shares no resources between drivers: it uses request messages instead

  8. Approach • Drivers are tightly coupled to the kernel; applications are not • Orthogonal drivers should follow these principles • Resource delegation • Receive only bulk resources • Separation of name spaces • Each driver has its own address space • Separation of privilege • Execute drivers in unprivileged mode • Secure isolation • Among drivers, and between drivers and applications • Common API

  9. Analysis of principles • These principles are mostly violated for device drivers • but none are violated for a whole OS • Insight: transplant an entire OS, rather than just a driver

  10. Architecture • DD/OS: an OS running a device driver • Each DD/OS is hosted in a VM • The driver controls its device directly, via a pass-through enhancement to the VM hosting the DD/OS • A driver cannot access other DD/OSes • Translation module: added to the DD/OS to interface with clients • One translation module can serve multiple DD/OSes • Hard disks, floppy disks, optical media, etc. • Drivers execute in separate VMs • Drivers are isolated from each other • Drivers from incompatible OSes can be used simultaneously

  11. DD/OS using a DD/OS

  12. Virtual Machine Environment • Hypervisor • VMM • DD/OSes • Clients in VMs • Translation modules

  13. Inter-VM Communication • Low-overhead messaging • Message notification • The source VM raises a communication interrupt in the destination VM • Request completion • The destination VM raises a completion interrupt in the source VM • Low-overhead memory sharing • Memory regions of one VM are registered into another VM's physical memory space • A minimal sketch of this request/completion flow follows.
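The notification-plus-shared-memory scheme on this slide can be modeled as a shared ring with an interrupt-like signal in each direction. This is a minimal single-producer sketch; the structures and function names are illustrative, not the paper's actual L4 interface, and real code would need memory barriers around the head/tail updates.

```c
/* Sketch of inter-VM communication over a shared ring.
 * All names are illustrative, not the paper's API. */
#include <stdint.h>
#include <stdio.h>

#define RING_SLOTS 16

struct request {
    uint32_t op;        /* e.g. read/write */
    uint64_t buf_gpa;   /* guest-physical address of a shared buffer */
    uint32_t len;
};

/* One page registered into both VMs' physical address spaces. */
struct shared_ring {
    volatile uint32_t head;          /* written by the source VM */
    volatile uint32_t tail;          /* written by the destination VM */
    struct request slot[RING_SLOTS];
};

/* Stand-in for raising a virtual interrupt in the other VM. */
static void raise_virq(const char *which) {
    printf("virq: %s\n", which);
}

/* Source VM: enqueue a request, then notify the destination. */
static int send_request(struct shared_ring *r, const struct request *req) {
    uint32_t head = r->head;
    if (head - r->tail == RING_SLOTS)
        return -1;                        /* ring full */
    r->slot[head % RING_SLOTS] = *req;
    r->head = head + 1;                   /* publish (barrier needed here) */
    raise_virq("communication interrupt -> destination VM");
    return 0;
}

/* Destination VM: dequeue, handle, then signal completion. */
static int handle_next(struct shared_ring *r) {
    if (r->tail == r->head)
        return -1;                        /* ring empty */
    struct request req = r->slot[r->tail % RING_SLOTS];
    r->tail++;
    (void)req;                            /* ...perform the device operation... */
    raise_virq("completion interrupt -> source VM");
    return 0;
}

int main(void) {
    struct shared_ring ring = {0};
    struct request req = { .op = 1, .buf_gpa = 0x1000, .len = 512 };
    send_request(&ring, &req);
    handle_next(&ring);
    return 0;
}
```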

  14. Requests and Responses • Client signals the DD/OS: the VMM delivers a virtual interrupt to the translation module • DD/OS signals the client: the translation module traps into the VMM

  15. Enhancing Dependability • Driver isolation • Improves reliability • by preventing fault propagation • Improves availability • via virtual machine reboot • Continuum of configurations • Individual drivers vs. groups of drivers

  16. Driver Restart • Asynchronous: reset the driver • on fault detection • or on detecting a malicious driver • Synchronous: negotiation and quiescing • enables live upgrades and proactive restart • Indirection captures accesses to a restarting driver (sketched below) • Restart can be transparent to clients • or the fault can be signaled to them
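One way to read the indirection bullet is as a proxy that queues client requests while the DD/OS reboots and replays them afterwards. This is a hypothetical sketch of that idea, not the paper's implementation; all names are invented.

```c
/* Sketch of an indirection layer during driver restart: while the
 * DD/OS reboots, client requests are captured instead of lost, then
 * replayed once the driver is back. */
#include <stdio.h>

#define QLEN 8

enum dd_state { DD_RUNNING, DD_RESTARTING };

struct dd_proxy {
    enum dd_state state;
    int pending[QLEN];   /* queued request ids */
    int n_pending;
};

/* All client access goes through the proxy, never to the DD/OS directly. */
static void submit(struct dd_proxy *p, int req_id) {
    if (p->state == DD_RESTARTING) {
        if (p->n_pending < QLEN)
            p->pending[p->n_pending++] = req_id;   /* capture the access */
        else
            printf("req %d: signal fault to client\n", req_id);
        return;
    }
    printf("req %d: forwarded to DD/OS\n", req_id);
}

/* Called when the rebooted DD/OS comes back: replay captured requests. */
static void restart_complete(struct dd_proxy *p) {
    p->state = DD_RUNNING;
    for (int i = 0; i < p->n_pending; i++)
        submit(p, p->pending[i]);
    p->n_pending = 0;
}

int main(void) {
    struct dd_proxy p = { .state = DD_RESTARTING };
    submit(&p, 1);               /* queued transparently */
    restart_complete(&p);        /* replayed after reboot */
    submit(&p, 2);               /* forwarded immediately */
    return 0;
}
```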

  17. Virtualization Issues • A DD/OS consumes more resources than a bare driver • DMA operations need translation • The timing assumptions of physical hardware can be violated • The host OS has to collaborate with the DD/OS to control the driver

  18. DMA address translation • DMA addresses in the DD/OS reference guest physical addresses • which are not the same as host physical addresses • Translation • The VMM intercepts DMA accesses and translates them (see the sketch below)
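A sketch of that translation step, assuming a simple per-VM page-frame table; the table layout and names are illustrative, not the paper's data structures.

```c
/* Guest-physical to host-physical translation the VMM performs when
 * it intercepts a DMA address (hypothetical table layout). */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define GUEST_PAGES 1024

/* Per-VM map from guest page frame number to host page frame number. */
static uint64_t gpfn_to_hpfn[GUEST_PAGES];

/* Translate a guest-physical DMA address before it reaches the device. */
static int translate_dma(uint64_t gpa, uint64_t *hpa) {
    uint64_t gpfn = gpa >> PAGE_SHIFT;
    if (gpfn >= GUEST_PAGES || gpfn_to_hpfn[gpfn] == 0)
        return -1;                 /* no mapping: reject the DMA */
    *hpa = (gpfn_to_hpfn[gpfn] << PAGE_SHIFT) | (gpa & (PAGE_SIZE - 1));
    return 0;
}

int main(void) {
    gpfn_to_hpfn[1] = 0x80;        /* guest page 1 backed by host page 0x80 */
    uint64_t hpa;
    if (translate_dma(0x1234, &hpa) == 0)
        printf("gpa 0x1234 -> hpa 0x%llx\n", (unsigned long long)hpa);
    return 0;
}
```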

  19. DMA and Security • A DD/OS can perform DMA to physical memory that the memory protection system forbids!! • e.g. use DMA to overwrite hypervisor code or data • In the absence of hardware support to restrict DMA access, device drivers are part of the TCB

  20. DMA and Trust • Who is untrusted by the hypervisor? Three cases: • the client • the client and the DD/OS • the client and the DD/OS, which also distrust each other • When the DD/OS is untrusted • The hypervisor enables DMA permissions to client memory • and restricts the DD/OS's actions in client memory • When the DD/OS and the client distrust each other • The client pins its own memory • The DD/OS verifies the pinning of the client's memory via the hypervisor (sketched below)
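The pin-check handshake for the mutual-distrust case can be sketched as two hypercalls: the client pins its pages, and the DD/OS asks the hypervisor to confirm the pin before programming DMA. All structures and names below are illustrative.

```c
/* Sketch of pin verification between a distrustful client and DD/OS. */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define N_PAGES 64

struct page_state {
    bool pinned;       /* page may not be reclaimed or remapped */
    int  owner_vm;     /* VM that owns the page */
};

static struct page_state pages[N_PAGES];

/* Hypercall issued by the client VM before handing out a buffer. */
static void hv_pin(int vm, uint64_t pfn) {
    pages[pfn].pinned = true;
    pages[pfn].owner_vm = vm;
}

/* Hypercall issued by the DD/OS before starting DMA: proceed only if
 * the client really pinned the page, so it cannot be yanked out from
 * under an in-flight transfer. */
static bool hv_verify_pin(int client_vm, uint64_t pfn) {
    return pages[pfn].pinned && pages[pfn].owner_vm == client_vm;
}

int main(void) {
    hv_pin(/*client*/ 2, /*pfn*/ 7);
    printf("DMA to pinned page:   %s\n", hv_verify_pin(2, 7) ? "allowed" : "denied");
    printf("DMA to unpinned page: %s\n", hv_verify_pin(2, 8) ? "allowed" : "denied");
    return 0;
}
```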

  21. DMA and Trust contd… • What if a VM faults and restarts while the device is performing DMA?! • All targeted memory cannot be reclaimed until all such DMA operations complete or abort • But what is "targeted memory"? DD/OS memory? The client's pinned memory? • No solution is provided for this problem!! • A client whose memory is pinned on behalf of a DD/OS that faulted and is rebooting should not use the pinned memory until the restart completes • And then what? Will the DD/OS signal completion? What if the DMA completes before the VM restarts? What if the VM fails to restart at all?

  22. IO-MMU and IO Contexts • IO-MMU • Originally designed to overcome the 32-bit address limitation for DMA in 64-bit systems • Can also enforce access permissions and address translation for DMA operations (see the sketch below) • Hence DD/OSes are hardware-isolated • Hence device drivers can be excluded from the TCB • More questions: Does this work assume device drivers are in the TCB or not? If in the TCB, we cannot do anything. If not in the TCB, the driver cannot do anything malicious thanks to hardware isolation, so we do not need to do anything. So?
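The enforcement idea can be sketched as a translation table with permission bits that every device-initiated access passes through, so even a compromised DD/OS cannot reach frames it was not granted. The entry layout below is illustrative, not any real IO-MMU's format.

```c
/* Sketch of IO-MMU permission enforcement on device DMA. */
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define IOMMU_ENTRIES 256
#define PERM_R 0x1
#define PERM_W 0x2

struct iommu_entry {
    uint64_t hpfn;      /* host page frame this IO page maps to */
    uint8_t  perms;     /* PERM_R / PERM_W, 0 = not present */
};

static struct iommu_entry io_table[IOMMU_ENTRIES];

/* Device-side DMA check: translate and enforce permissions. */
static bool iommu_check(uint64_t io_pfn, bool write) {
    if (io_pfn >= IOMMU_ENTRIES)
        return false;
    uint8_t need = write ? PERM_W : PERM_R;
    return (io_table[io_pfn].perms & need) != 0;
}

int main(void) {
    io_table[3] = (struct iommu_entry){ .hpfn = 0x42, .perms = PERM_R };
    printf("read  page 3: %s\n", iommu_check(3, false) ? "ok" : "blocked");
    printf("write page 3: %s\n", iommu_check(3, true)  ? "ok" : "blocked");
    return 0;
}
```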

  23. IO-MMU contd… • The IO-MMU does not support multiple address contexts • so it must be time-multiplexed between PCI devices • Timeouts may then occur in several device drivers • Question: How many PCI devices does a system generally have? The various device drivers will eventually be the deciding granularity. So would it be better to group all device drivers in one DD/OS and avoid the contention? If yes, we have a tradeoff between performance and fault isolation. • The impact on a gigabit Ethernet NIC is proportional to its bus access • Decrease the impact of multiplexing with dynamic bus allocation based on • Device utilization: prefer active and asynchronous devices (sketched below) • The IO-MMU has to be used to ensure device driver isolation; no alternatives yet.
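The "dynamic bus allocation" bullet could look like the toy policy below: give the single IO-MMU context to the device with the highest recent utilization, decaying counters so idle devices still get turns. This is a hypothetical rendering of the slide's idea, not the paper's scheduler.

```c
/* Sketch of time-multiplexing one IO-MMU context across PCI devices,
 * preferring the most active device. */
#include <stdio.h>

#define N_DEV 3

struct pci_dev {
    const char *name;
    unsigned utilization;   /* e.g. recent DMA operations */
};

static struct pci_dev devs[N_DEV] = {
    { "gigabit-nic", 90 }, { "disk", 40 }, { "sound", 2 },
};

/* Pick the next owner of the IO-MMU context. */
static int pick_next(void) {
    int best = 0;
    for (int i = 1; i < N_DEV; i++)
        if (devs[i].utilization > devs[best].utilization)
            best = i;
    return best;
}

int main(void) {
    for (int slot = 0; slot < 3; slot++) {
        int d = pick_next();
        printf("slot %d: IO-MMU context -> %s\n", slot, devs[d].name);
        devs[d].utilization /= 2;   /* decay so other devices eventually run */
    }
    return 0;
}
```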

  24. Resource Consumption • A DD/OS costs the memory of a whole OS, not just its driver modules • Periodic tasks in the DD/OS create cache and TLB footprints • Question: the paper claims periodic tasks in a DD/OS impose overhead on clients even when no device driver is in use. How? • Page sharing follows the content-based scheme of VMware ESX Server (sketched below) • The steady-state cache footprint of multiple DD/OSes is low due to high sharing • VM pages can be swapped out to disk • but not pages of the VM hosting the DD/OS for the swap device • nor pages of any VM hosting a DD/OS used by the swap device • More questions • When treating the DD/OS as a black box, we cannot swap unused parts of the swap DD/OS via working-set analysis; all parts of the OS must always be in main memory to guarantee full functionality even in rare corner cases. • Black box: we do not know which pages are used, and all parts of the OS must always be in main memory. Then what can be paged out, and how do we find it?
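The ESX-style sharing the slide refers to hashes each candidate page and, on a hash hit, compares contents and backs identical pages with one copy-on-write frame. A minimal sketch, with FNV-1a standing in for whatever hash the real system uses:

```c
/* Sketch of content-based page sharing in the style of VMware ESX. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 4096
#define TABLE 1024

static uint64_t fnv1a(const uint8_t *p, size_t n) {
    uint64_t h = 1469598103934665603ULL;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

/* Hash table: hash slot -> one representative shared page (or NULL). */
static uint8_t *shared[TABLE];

/* Returns the frame to map: the shared copy-on-write frame on a match. */
static uint8_t *try_share(uint8_t *page) {
    uint64_t slot = fnv1a(page, PAGE_SIZE) % TABLE;
    uint8_t *cand = shared[slot];
    if (cand && memcmp(cand, page, PAGE_SIZE) == 0)
        return cand;              /* identical content: share, mark CoW */
    shared[slot] = page;          /* else register as new representative */
    return page;
}

int main(void) {
    static uint8_t a[PAGE_SIZE], b[PAGE_SIZE];    /* both zero-filled */
    printf("a shared with b: %s\n",
           try_share(a) == try_share(b) ? "yes" : "no");
    return 0;
}
```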

  25. Reducing Memory Footprint • In addition to memory sharing and swapping • Memory ballooning inside the DD/OS (sketched below) • Does it acquire pages and zero them out? Details are not provided. • Zero pages are handled specially • Non-working-set pages that cannot be swapped are compressed, and decompressed on access • Periodic tasks increase the DD/OS footprint • Strict requirements are not met
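A balloon driver inside the DD/OS would allocate guest pages (so the guest kernel stops using them) and report their frames to the hypervisor for reclamation. As the slide notes, the paper does not say whether pages are zeroed; that step is marked as an assumption below, and all names are illustrative.

```c
/* Sketch of a balloon driver in the DD/OS. */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 4096
#define BALLOON_MAX 256

static void hv_reclaim(void *page) {          /* stand-in hypercall */
    printf("hypervisor may reclaim frame at %p\n", page);
}

static void *balloon[BALLOON_MAX];
static int   balloon_n;

/* Inflate: grab pages from the guest allocator, hand frames back. */
static void balloon_inflate(int pages) {
    for (int i = 0; i < pages && balloon_n < BALLOON_MAX; i++) {
        void *p = malloc(PAGE_SIZE);          /* guest page allocator */
        if (!p) break;                        /* guest under pressure: stop */
        memset(p, 0, PAGE_SIZE);              /* assumption: zero before release */
        balloon[balloon_n++] = p;
        hv_reclaim(p);
    }
}

/* Deflate: return pages to the guest when the DD/OS needs them again. */
static void balloon_deflate(int pages) {
    for (int i = 0; i < pages && balloon_n > 0; i++)
        free(balloon[--balloon_n]);
}

int main(void) {
    balloon_inflate(4);
    balloon_deflate(4);
    return 0;
}
```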

  26. Timing • Virtual time vs. real time • Devices malfunction when timing assumptions are violated • Soft preemption • If the DD/OS has interrupts disabled, the VMM does not preempt the VM until interrupts are re-enabled (sketched below) • Hard preemption • Preempt even while interrupts are disabled
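The soft/hard distinction reduces to a small decision rule: under soft preemption the VMM defers while the guest's virtual interrupts are disabled (drivers assume such sections are atomic), while hard preemption fires regardless. A minimal sketch, with invented state and function names:

```c
/* Sketch of the soft vs. hard preemption rule. */
#include <stdbool.h>
#include <stdio.h>

struct vcpu {
    bool irqs_disabled;     /* guest has (virtual) interrupts disabled */
    bool preempt_pending;   /* a deferred soft preemption is waiting */
};

static bool maybe_preempt(struct vcpu *v, bool hard) {
    if (hard)
        return true;                       /* preempt regardless of state */
    if (v->irqs_disabled) {
        v->preempt_pending = true;         /* defer until irqs re-enabled */
        return false;
    }
    return true;
}

/* Guest re-enables interrupts: deliver any deferred preemption. */
static bool irqs_enabled_hook(struct vcpu *v) {
    v->irqs_disabled = false;
    bool fire = v->preempt_pending;
    v->preempt_pending = false;
    return fire;
}

int main(void) {
    struct vcpu v = { .irqs_disabled = true };
    printf("hard while disabled: %d\n", maybe_preempt(&v, true));   /* 1 */
    printf("soft while disabled: %d\n", maybe_preempt(&v, false));  /* 0 */
    printf("fires on enable:     %d\n", irqs_enabled_hook(&v));     /* 1 */
    return 0;
}
```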

  27. Shared Hardware and Recursion • Time-sharing of devices is needed • Time-sharing the PCI bus is difficult • So let one DD/OS control PCI • This DD/OS interposes on PCI accesses and applies a policy

  28. Results • Implemented a driver-reuse system • Evaluated network, disk, and PCI drivers • The hypervisor and VMM form a paravirtualized environment

  29. Virtualization Environment • Hypervisor • L4 • VMM • User level L4 task • DD/OS • Linux kernel 2.4.22 • Client OS • Linux kernel 2.6.8.1

  30. Translation Modules • Disk interface • Added to the DD/OS as a kernel module • Communicates with the block layer • Network interface • Added to the DD/OS as a device driver • Presents itself to the DD/OS as a network device attached to a virtual interconnect • Asynchronous inbound packet delivery • Outbound: frames are transmitted directly from client memory via DMA • Inbound: L4 copies packets from the DD/OS to the client (both paths are sketched below) • PCI interface • More questions: "When the PCI driver is isolated, it helps the other DD/OS instances discover their appropriate devices on the bus, and restricts device access to only the appropriate DD/OS instances." - ? • Executed at a lower priority than all other components • More questions: Priority is not privilege. Would not PCI performance affect system performance drastically? The paper says the PCI interface is not performance-critical. Why?
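The network translation module's asymmetry can be sketched as two functions: outbound frames go straight out of client memory via DMA (zero copy), while inbound frames are copied from DD/OS buffers into the client. Types and function names below are illustrative, not the Linux driver API the paper actually uses.

```c
/* Sketch of the network translation module's two data paths. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

struct frame { const uint8_t *data; uint32_t len; };

/* Outbound: hand the NIC the client's (pinned, translated) buffer. */
static void xmit_outbound(const struct frame *f) {
    /* program the NIC's DMA descriptor with the buffer's host-physical
     * address; no copy into DD/OS memory */
    printf("tx: DMA %u bytes directly from client memory\n", f->len);
}

/* Inbound: the hypervisor (L4) copies the received frame to the client. */
static void deliver_inbound(uint8_t *client_buf, uint32_t cap,
                            const uint8_t *ddos_buf, uint32_t len) {
    if (len > cap) len = cap;              /* truncate oversized frames */
    memcpy(client_buf, ddos_buf, len);     /* stands in for the L4 copy */
    printf("rx: copied %u bytes from DD/OS to client\n", len);
}

int main(void) {
    uint8_t pkt[64] = {0}, client[64];
    struct frame f = { pkt, sizeof pkt };
    xmit_outbound(&f);
    deliver_inbound(client, sizeof client, pkt, sizeof pkt);
    return 0;
}
```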

  31. Resource Consumption – Working Set

  32. Resource Consumption – Memory Compression

  33. Resource Consumption – CPU utilization

  34. Performance - TTCP

  35. Performance - Disk

  36. Performance – Application Level

  37. IO-MMU

  38. Engineering Effort

  39. Conclusion • Provides reuse of unmodified device drivers • Network throughput is within 3-8% of native Linux • Each DD/OS consumes 0.6-1.8% of the CPU (approximately 0.12%)

  40. Thanks !!
