QoS Support in Operating Systems


QoS Support in Operating Systems. Banu Özden Bell Laboratories ozden@research.bell-labs.com. Vision. Service providers will offer storage and computing services through their distributed data centers connected with high bandwidth networks to globally distributed clients.

    Presentation Transcript
    1. QoS Support in Operating Systems Banu Özden Bell Laboratories ozden@research.bell-labs.com

    2. Vision • Service providers will offer storage and computing services • through their distributed data centers • connected with high bandwidth networks • to globally distributed clients. • Clients will access these services via diverse devices and networks, e.g.: • mobile devices and wireless networks, • high-end computer systems and high bandwidth networks. • These services will become utilities (e.g., storage utility, computing utility). • Eventually resources will be exchanged and traded between geographically dispersed data centers to address fluctuating demand.

    3. Eclipse/BSD: an Operating System with Quality of Service Support Banu Özden ozden@research.bell-labs.com

    4. Motivation • QoS support for (server) applications: • web servers • video servers • Isolation and differentiation of different • entities serviced on the same platform • applications running on the same platform • QoS requirements: • client-based • service-based • content-based

    5. Design Goals • QoS support in a general purpose operating system • Remain compatible with the underlying operating system • QoS parameters: • Isolation • Differentiation • Fairness • (Cumulative) throughput • Flexible resource management • capable of implementing a large set of provisioning needs • supports a large set of server applications without imposing significant changes to their design

    6. Talk Outline • Schedulers • Reservation File System (reservfs) • Tagging • Web Server Experiments • Access Control and Profiles • Eclipse/BSD Status • Related Work • Future Work

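    7. Proportional sharing • Generalized processor sharing (GPS) • $\phi_i$: weight of flow $i$ • $W_i(t_1, t_2)$: service received by flow $i$ in $[t_1, t_2]$ • $F$: set of flows • For any flow $i$ continuously backlogged in $[t_1, t_2]$: $\frac{W_i(t_1, t_2)}{W_j(t_1, t_2)} \ge \frac{\phi_i}{\phi_j}$ for all $j \in F$ • Thus, the rate of flow $i$ in $[t_1, t_2]$ is at least: $\frac{\phi_i}{\sum_{j \in F} \phi_j} \, r$, where $r$ is the rate of the shared resource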
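The proportional-sharing rule can be illustrated with a minimal sketch. It assumes an idealized fluid model in which capacity is divided among the currently backlogged flows in proportion to their weights; the function name `gps_rates` and its parameters are illustrative and do not come from the slides.

```python
def gps_rates(weights, backlogged, link_rate):
    """Return the instantaneous GPS service rate of every flow.

    weights:    dict mapping flow name -> positive weight
    backlogged: iterable of flows that currently have queued work
    link_rate:  total capacity of the shared resource
    """
    active = set(backlogged)
    total = sum(weights[f] for f in active)
    rates = {}
    for flow in weights:
        if flow in active:
            # A backlogged flow gets its weighted fraction of the capacity.
            rates[flow] = link_rate * weights[flow] / total
        else:
            # An idle flow consumes nothing; its share goes to the others.
            rates[flow] = 0.0
    return rates


if __name__ == "__main__":
    print(gps_rates({"a": 2, "b": 1, "c": 1}, ["a", "b"], 90.0))
```

Because idle flows are left out of the denominator, a backlogged flow never receives less than its share computed over all flows, which is the lower bound stated on the slide.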

    8. QoS Guarantees • Fairness • Throughput • Packet delay

    9. Schedulers in Eclipse • Resource characteristics differ • Different hierarchical proportional-share schedulers for resources • Link scheduler: WF2Q • Disk scheduler: YFQ • CPU scheduler: MTR-LS • Network input: SRP

    10. Hierarchical GPS Example [Figure: two share trees for one server. Proportional sharing (flat): company A page 1 = 0.4, company A page 2 = 0.4, company B = 0.2. Hierarchical proportional sharing: company A = 0.8 and company B = 0.2, with company A split into page 1 = 0.5 and page 2 = 0.5.]

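    11. Schedulers • Hierarchical proportional-sharing (GPS) • $Q_n$: descendant queue nodes of node $n$ • $W_n(t_1, t_2)$: service received by scheduler node $n$ in $[t_1, t_2]$ • $S_n$: set of immediate descendant nodes of the parent of node $n$ • For any node $n$ continuously backlogged in $[t_1, t_2]$: $\frac{W_n(t_1, t_2)}{W_m(t_1, t_2)} \ge \frac{\phi_n}{\phi_m}$ for all $m \in S_n$, where $\phi_n$ is the weight of node $n$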
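Hierarchical proportional sharing can be sketched by applying the flat rule recursively at every scheduler node. The tree encoding (nested dicts of `(weight, children)` pairs, with `None` marking a queue node), the path-style queue names, and both function names are assumptions made for illustration; a scheduler node is treated as backlogged when any of its descendant queues is.

```python
def is_backlogged(path, children, backlogged):
    """A queue node is backlogged if listed; a scheduler node if any descendant is."""
    if children is None:                      # queue node (leaf)
        return path in backlogged
    return any(is_backlogged(f"{path}/{name}", sub, backlogged)
               for name, (_, sub) in children.items())


def effective_shares(children, backlogged, share=1.0, prefix=""):
    """Split `share` among the backlogged children, recursing into scheduler nodes."""
    active = {}
    for name, (weight, sub) in children.items():
        path = f"{prefix}/{name}" if prefix else name
        if is_backlogged(path, sub, backlogged):
            active[path] = (weight, sub)
    total = sum(weight for weight, _ in active.values())
    result = {}
    for path, (weight, sub) in active.items():
        node_share = share * weight / total
        if sub is None:
            result[path] = node_share         # queue node: final share
        else:
            result.update(effective_shares(sub, backlogged, node_share, path))
    return result


if __name__ == "__main__":
    tree = {
        "A": (0.8, {"page1": (0.5, None), "page2": (0.5, None)}),
        "B": (0.2, None),
    }
    print(effective_shares(tree, {"A/page1", "A/page2", "B"}))
    print(effective_shares(tree, {"A/page1", "B"}))
```

In the second call, page 2 of company A is idle, so its share stays inside company A's subtree and goes to page 1 rather than leaking to company B. That is the isolation property that distinguishes the hierarchical scheme from flat sharing.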

    12. Link Aggregation • Need to incrementally scale bandwidth • Resource aggregation is emerging as a solution: • Grouping multiple resources into a single logical unit • QoS over such aggregated links? [Figure: link schedulers]

    13. Multi-Server Model • Multi Server Fair Queuing (MSFQ) • A packetized algorithm for a system with N links, each with a bandwidth of r, that approximates a GPS system with a single link with Nr bandwidth [Figure: reference model (GPS, one link of bandwidth Nr) next to the packetized scheduler (MSFQ, N links of bandwidth r each)]

    14. Multi-Server Model (Contd.) • Goals: • Guarantee bandwidth and packet delay bounds that are independent of the number of flows • Allow flows to arrive and depart dynamically • Be work-conserving • Algorithm: • When a server is idle, schedule the packet that would complete transmission earliest under a single-server GPS system with a bandwidth of Nr • Reference: SIGCOMM 2001
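The dispatch rule can be sketched as a small event-driven simulation. This is a simplified illustration, not the published algorithm: it assumes the GPS finishing time of each packet under the reference system has already been computed and is passed in, and the function name `msfq_dispatch` and the tuple layout are hypothetical.

```python
import heapq


def msfq_dispatch(packets, n_servers, rate):
    """Assign packets to servers, earliest GPS finishing time first.

    packets:   list of (arrival_time, length, gps_finish_time)
    n_servers: number of links, each of bandwidth `rate`
    Returns {packet index: (server, start_time, finish_time)}.
    """
    by_arrival = sorted(range(len(packets)), key=lambda i: packets[i][0])
    free_at = [(0.0, s) for s in range(n_servers)]   # (time server goes idle, id)
    heapq.heapify(free_at)
    ready = []            # heap of (gps_finish_time, index) for arrived packets
    schedule = {}
    now, k = 0.0, 0
    while k < len(by_arrival) or ready:
        free_time, server = heapq.heappop(free_at)   # next server to go idle
        now = max(now, free_time)
        if not ready:
            # Nothing queued: the server stays idle until the next arrival.
            now = max(now, packets[by_arrival[k]][0])
        while k < len(by_arrival) and packets[by_arrival[k]][0] <= now:
            i = by_arrival[k]
            heapq.heappush(ready, (packets[i][2], i))
            k += 1
        _, i = heapq.heappop(ready)                  # earliest GPS finish
        finish = now + packets[i][1] / rate
        schedule[i] = (server, now, finish)
        heapq.heappush(free_at, (finish, server))
    return schedule


if __name__ == "__main__":
    # Three unit-length packets arrive together; two servers of rate 1.
    print(msfq_dispatch([(0.0, 1.0, 3.0), (0.0, 1.0, 1.0), (0.0, 1.0, 2.0)], 2, 1.0))
```

The loop never leaves a server idle while a packet is waiting, which matches the work-conserving goal listed on the slide.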

    15. MSFQ Preliminary Properties • Multi-server specific properties: • Ordering: a pair of packets scheduled in the order of their GPS finishing times may complete in reverse order • GPS busy implies MSFQ busy, but the converse is not true • Non-coinciding busy periods • Work backlog? [Figures: timelines comparing GPS with WFQ on one server and with MSFQ on two and three servers, for arrivals a1 to a7]

    16. MSFQ Properties • Maximum service discrepancy (buffer requirement) • Maximum packet delay • Maximum per-flow service discrepancy [Figures: service versus time for GPS and MSFQ, marking the packet delay; per-flow service versus time for GPS_i and MSFQ_i, marking the service discrepancy]

    17. Schedulers (contd.) • Disk scheduling with QoS • tradeoffs between QoS and total disk performance • driver queue management • queue depth • queue ordering • fragmentation • Hierarchical YFQ • CPU scheduling with QoS • lengths of CPU phases are not known a priori • cumulative throughput • Hierarchical MTR-LS

    18. Eclipse’s Key Elements • Hierarchical, proportional share resource schedulers • Reservation, reservation file system (reservfs) • Tagging mechanism • Access and admission control, reservation domain

    19. Reservations and Schedulers • (Resource) reservations • unit for QoS assignment • similar to the concept of a flow in packet scheduling • Hierarchical schedulers • a tree with two kinds of nodes: • scheduler nodes • queue nodes • each node corresponds to a reservation • Schedulers are dynamically reconfigurable

    20. Web Server Example • Hosting two companies’ web sites, each with two web pages [Figure: share trees for disk bandwidth, CPU cycles, and network bandwidth. Each resource is split 0.8 / 0.2 between company A and company B; under disk bandwidth and CPU cycles, each company's share is split 0.5 / 0.5 between page 1 and page 2.]

    21. Reservfs • We built the reservation file system • to create and manipulate reservations • to access and configure resource schedulers [Figure: layered architecture. The web server and video server use the application interface of the reservation file system, which drives the CPU scheduler, link scheduler, and two disk schedulers through the scheduler interface; the schedulers manage the CPUs, Net 1, Net 2, and Disk1 to Disk3.]

    22. Reservfs • Hierarchical • Each reservation directory corresponds to a node at a scheduler • Each resource is represented by a reservation directory under /reserv [Figure: /reserv with subdirectories cpu, fxp0, fxp1, and da0]

    23. Reservfs • Two types of reservation directories: • scheduler directories • queue directories • Scheduler directories are hierarchically expandable • Queue directories are not expandable

    24. Reservfs • Scheduler directory: • share • newqueue • newreserv • special queue: q0 • Queue directory: • share • backlog [Figure: example /reserv tree with resource directories cpu, fxp0, fxp1, and da0, their q0, q1, and r1 subdirectories, and the share, newqueue, newreserv, and backlog files]
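The two directory types can be modeled in memory as follows. This is only an illustration of the layout: the class names, the method names, and the numbering of new entries (queues from q1, scheduler reservations from r0) are assumptions, not the kernel implementation.

```python
class QueueDir:
    """Queue directory: holds only the share and backlog files."""
    files = ("share", "backlog")

    def __init__(self, name, share=""):
        self.name = name
        self.share = share


class SchedulerDir:
    """Scheduler directory: expandable, always contains the special queue q0."""
    files = ("share", "newqueue", "newreserv")

    def __init__(self, name):
        self.name = name
        self.children = {"q0": QueueDir("q0")}
        self._next_queue = 1      # assumed numbering: q1, q2, ...
        self._next_reserv = 0     # assumed numbering: r0, r1, ...

    def new_queue(self, share=""):
        """Model of using newqueue: adds a non-expandable queue directory."""
        queue = QueueDir(f"q{self._next_queue}", share)
        self._next_queue += 1
        self.children[queue.name] = queue
        return queue

    def new_reserv(self):
        """Model of using newreserv: adds a nested scheduler directory."""
        reserv = SchedulerDir(f"r{self._next_reserv}")
        self._next_reserv += 1
        self.children[reserv.name] = reserv
        return reserv

    def listing(self):
        """Names a directory listing would show, sorted."""
        return sorted(list(self.files) + list(self.children))


if __name__ == "__main__":
    da0 = SchedulerDir("da0")
    da0.new_queue("50 1000000")
    da0.new_reserv()
    print(da0.listing())
```

Only `SchedulerDir` has creation methods, mirroring the rule that scheduler directories are hierarchically expandable while queue directories are not.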

    25. Reservfs [Figure: the web server and video server sit on top of the reservation file system (application interface), which sits above the CPU, link, and disk schedulers (scheduler interface) managing CPU 1, Net 1, Net 2, Disk1, and Disk2]

    26. Reservfs API • Creation of a new queue/scheduler reservation • fd = open(newqueue/newreserv, O_CREAT) • returns the fd of the newly created share file

    27. Creating Queue Reservation • fd = open("newqueue", O_CREAT) [Figure: /reserv tree (cpu, fxp0, fxp1, da0) in which opening newqueue under da0 adds a new queue directory q1 containing share and backlog]

    28. Creating Scheduler Reservation • fd = open("newreserv", O_CREAT) [Figure: /reserv tree (cpu, fxp0, fxp1, da0) in which opening newreserv under da0 adds a new scheduler directory r0 containing q0, share, newqueue, and newreserv]

    29. Reservfs API • Changing QoS parameters • writing a weight and min value to the share file • Getting QoS parameters • reading the share file • Getting/setting queue parameters • reading/writing the backlog file
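Reading and writing QoS parameters through a share file might look like the sketch below. It assumes the text format seen in the deck's command-line output, a weight followed by a min value such as `50 1000000`; the helper names and the open flags are illustrative. On an ordinary file system these calls simply create and read a regular file, so the sketch shows the access pattern only, not Eclipse/BSD behaviour.

```python
import os


def write_share(path, weight, minimum):
    """Write "<weight> <min>" to a share file (flags are an assumption)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, f"{weight} {minimum}\n".encode("ascii"))
    finally:
        os.close(fd)


def read_share(path):
    """Read a share file back as a (weight, min) pair of integers."""
    with open(path, "r", encoding="ascii") as f:
        weight, minimum = f.read().split()
    return int(weight), int(minimum)
```

On a system with reservfs mounted, a call such as `write_share("/reserv/fxp0/q1/share", 50, 1000000)` would correspond to changing the QoS parameters of queue q1; that path is an example only.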

    30. Reservfs API • Command line output:
        killerbee$ cd /reserv
        killerbee$ ls -al
        total 5
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 .
        drwxr-xr-x  20 root  wheel  512 Sep 12 21:54 ..
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 cpu
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 fxp0
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:37 fxp1
        killerbee$ cd fxp0
        killerbee$ ls -alR
        total 6
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
        -rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
        -rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
        -r--------   1 root  wheel    1 Sep 15 11:39 share
        ./q0:
        total 4
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
        -rw-------   1 root  wheel    1 Sep 15 11:39 backlog
        -rw-------   1 root  wheel    1 Sep 15 11:39 share

    31. Reservfs API
        killerbee$ cd r0
        killerbee$ ls -al
        total 6
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
        -rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
        -rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
        -r--------   1 root  wheel    1 Sep 15 11:39 share
        killerbee$ echo "50 1000000" > newqueue
        killerbee$ ls -al
        total 6
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
        -rw-------   1 root  wheel    1 Sep 15 11:39 newqueue
        -rw-------   1 root  wheel    1 Sep 15 11:39 newreserv
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q0
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 q1
        -r--------   1 root  wheel    1 Sep 15 11:39 share
        killerbee$ cd q1
        killerbee$ ls -al
        total 4
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 .
        dr-xr-xr-x   0 root  wheel  512 Sep 15 11:39 ..
        -rw-------   1 root  wheel    1 Sep 15 11:39 share
        -rw-------   1 root  wheel    1 Sep 15 11:39 backlog
        killerbee$ cat share
        50 1000000
        killerbee$

    32. Reservfs [Figure: the web server and video server sit on top of the reservation file system (application interface), which sits above the CPU, link, and disk schedulers (scheduler interface) managing CPU 1, Net 1, Net 2, Disk1, and Disk2]

    33. Reservfs Scheduler Interface • Schedulers register by providing the following interface routines via reservfs_register(): • init(priv) • create(priv, parent, type) • start(priv, parent, type) • delete(priv, node) • get/set(priv, node, values, type)
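The registration pattern can be sketched as a table of callbacks kept per resource. The slide names the routines but gives neither the arguments of `reservfs_register()` nor the exact semantics of each routine, so the Python signature below, the `SchedulerOps` record, the `lookup` helper, and the choice to call `init` at registration time are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class SchedulerOps:
    """Routines a scheduler hands to reservfs (names taken from the slide)."""
    init: Callable[[Any], None]
    create: Callable[[Any, Any, str], Any]
    start: Callable[[Any, Any, str], None]
    delete: Callable[[Any, Any], None]
    get: Callable[[Any, Any, str], Any]
    set: Callable[[Any, Any, Any, str], None]


_registry: Dict[str, SchedulerOps] = {}


def reservfs_register(resource, ops, priv=None):
    """Record the scheduler's routines under a resource name and initialize it."""
    _registry[resource] = ops
    ops.init(priv)


def lookup(resource):
    """Return the routines registered for a resource such as "da0"."""
    return _registry[resource]
```

Keeping only this table on the file-system side lets the same reservfs code drive the CPU, link, and disk schedulers.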

    34. Reservfs Implementation • Built via vnode/vfs interface • A reserv{} structure represents each reservfs file • reserv{} representing a directory contains a pointer to the corresponding node at scheduler • Scheduler independent • Implements garbage collection mechanism

    35. Talk Outline • Introduction • Schedulers • Reservation File System (reservfs) • Tagging • Web Server Experiments • Access Control and Profiles • Eclipse/BSD Status • Related Work • Future Work

    36. Tagging • A request arriving at a scheduler must be associated with the appropriate reservation • Each request is tagged with a pointer to a queue node • mbuf{}, buf{} and proc{} are augmented • How is a request tagged?

    37. Tagging (contd.) • For a file, its file descriptor is tagged with a disk reservation • For a connected socket, its file descriptor is tagged with a network reservation • For unconnected sockets, we provide a late tagging mechanism • Each process is tagged with a cpu reservation • We associate reservations with references to objects

    38. Default List of a Process • Default reservations of a process, one for each resource • A list of tags (pointers to queue directories) • Used when a tag is otherwise not specified • Two new files are added for each process pid in /proc/pid • /proc/pid/default to represent the default list • /proc/pid/cdefault to represent the child default list

    39. Default List of a Process (contd.) • Reading these files returns the names of the default queue directories, e.g., /reserv/cpu/q1 /reserv/fxp0/r2/q1 /reserv/da0/r1/q3 • A process, with the appropriate access rights, can change the entries of default files
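A default list read from such a file could be turned into a per-resource lookup table as sketched here. The sketch assumes the file contents are whitespace-separated queue directory paths, as in the example on the slide, and that the resource is the path component directly under /reserv; the function name is illustrative.

```python
def parse_default_list(text):
    """Map each resource name to its default queue directory."""
    defaults = {}
    for path in text.split():
        parts = path.strip("/").split("/")
        # Expect at least reserv/<resource>/<queue directory>.
        if len(parts) < 3 or parts[0] != "reserv":
            raise ValueError(f"not a queue directory under /reserv: {path}")
        defaults[parts[1]] = path
    return defaults


if __name__ == "__main__":
    print(parse_default_list("/reserv/cpu/q1 /reserv/fxp0/r2/q1 /reserv/da0/r1/q3"))
```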

    40. Implicit Tagging • The file descriptor returned by open(), accept() or connect() is automatically tagged with default • The tag of the file descriptor of an unconnected socket is set to default at sendto() and sendmsg() • When a process forks, the child process is tagged with the default cpu reservation

    41. Explicit Tagging • The tag of a file descriptor can be set/read with new commands to fcntl(): • F_SET_RES • F_GET_RES • A new system call chcpures() to change the cpu reservation of a process
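The interplay of implicit and explicit tagging can be modeled with a small in-memory table. The class and method names are hypothetical; `set_res` and `get_res` stand in for the `F_SET_RES` and `F_GET_RES` commands, whose argument format the slides do not give.

```python
class TaggedProcess:
    """In-memory model of per-descriptor reservation tags for one process."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)   # resource name -> default queue directory
        self.fd_tags = {}                # file descriptor -> queue directory

    def open_fd(self, fd, resource):
        """Implicit tagging: a new descriptor gets the default for its resource."""
        self.fd_tags[fd] = self.defaults[resource]

    def set_res(self, fd, queue_dir):
        """Explicit tagging: models fcntl() with F_SET_RES."""
        if fd not in self.fd_tags:
            raise KeyError(f"unknown file descriptor: {fd}")
        self.fd_tags[fd] = queue_dir

    def get_res(self, fd):
        """Models fcntl() with F_GET_RES."""
        return self.fd_tags[fd]


if __name__ == "__main__":
    proc = TaggedProcess({"da0": "/reserv/da0/r1/q3", "fxp0": "/reserv/fxp0/r2/q1"})
    proc.open_fd(3, "da0")                    # tagged with the default
    proc.set_res(3, "/reserv/da0/r1/q4")      # overridden explicitly
    print(proc.get_res(3))
```

A single server process that handles several virtual hosts would use the explicit path, retagging each descriptor according to the host being served, while separate server processes can rely on their defaults.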

    42. Reservation Domains • Permissions of a process to use, create and manipulate reservations • The reservation domain of a process is independent of its protection domain

    43. Reservations and Reservation Domains [Figure: share trees for disk bandwidth, network bandwidth, and CPU cycles, each split 0.8 / 0.2 between reserv A and reserv B, with 0.5 / 0.5 sub-reservations (reserv 1, reserv 2) beneath them; the reservations are grouped into reservation domain 1 and reservation domain 2]

    44. Reservfs Garbage Collection • Based on reference counts • every application that is using a specific node adds a reference to it (to the vnode) • Triggered by the vnode layer • when the last application finishes using the node, the node is garbage collected • an fcntl() command is available to keep the node even if no references to it exist
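Reference-counted collection with a keep-alive override can be sketched as follows. The class is a user-space model only; the `pin` method stands in for the fcntl() command mentioned on the slide, whose name and arguments are not given there.

```python
class ReservNode:
    """Model of a reservation node reclaimed when its last reference is dropped."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.pinned = False      # set through the keep-alive override
        self.collected = False

    def acquire(self):
        """An application starts using the node."""
        if self.collected:
            raise RuntimeError(f"{self.name} has already been collected")
        self.refs += 1

    def release(self):
        """An application stops using the node; may trigger collection."""
        if self.refs == 0:
            raise RuntimeError("release without matching acquire")
        self.refs -= 1
        self._maybe_collect()

    def pin(self, keep=True):
        """Keep the node alive with zero references, or drop that protection."""
        self.pinned = keep
        self._maybe_collect()

    def _maybe_collect(self):
        if self.refs == 0 and not self.pinned:
            self.collected = True


if __name__ == "__main__":
    node = ReservNode("/reserv/da0/q1")
    node.acquire()
    node.pin()
    node.release()
    print(node.collected)   # the pinned node survives its last release
```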

    45. SRP Input Processing • Demultiplexes incoming packets • before network and higher-level protocol processing • Unprocessed input queue per socket • Processes input protocols in context of receiving process • Drops packets when per-socket queue is full • Avoids receive livelock
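The input path described here, early demultiplexing into bounded per-socket queues with protocol work deferred to the receiver, can be modeled as below. Keying sockets by port number, the class names, and the `handler` callback are assumptions made to keep the sketch short.

```python
from collections import deque


class SocketInputQueue:
    """Bounded queue of unprocessed packets for one socket."""

    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.limit:
            self.dropped += 1          # queue full: drop before any processing
            return False
        self.packets.append(packet)
        return True


class Demultiplexer:
    """Sorts arriving packets into per-socket queues without processing them."""

    def __init__(self):
        self.sockets = {}

    def bind(self, port, limit):
        self.sockets[port] = SocketInputQueue(limit)

    def receive(self, port, payload):
        """Arrival path: demultiplex only, no protocol work."""
        queue = self.sockets.get(port)
        if queue is None:
            return False
        return queue.enqueue(payload)

    def process(self, port, handler):
        """Receiver path: protocol work runs when the receiving process asks."""
        queue = self.sockets[port]
        results = []
        while queue.packets:
            results.append(handler(queue.packets.popleft()))
        return results


if __name__ == "__main__":
    demux = Demultiplexer()
    demux.bind(80, limit=2)
    for payload in ("a", "b", "c"):
        demux.receive(80, payload)
    print(demux.sockets[80].dropped, demux.process(80, str.upper))
```

Because an overloaded socket loses packets at the cheapest point, before any protocol processing has been spent on them, a flood toward one socket cannot consume the processing capacity owed to other sockets.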

    46. Talk Outline • Introduction • Schedulers • Reservation File System (reservfs) • Tagging • Web Server Experiments • Access Control and Profiles • Eclipse/BSD Status • Related Work • Future Work

    47. QoS Support for Web Server • Virtual hosting with Apache server: • separate Apache server for each virtual host • single Apache server for all virtual hosts • Eclipse/BSD isolates and differentiates performance of virtual hosts • multiple Apache servers: implicit tagging • single Apache server: explicit tagging • We implemented an Apache module for explicit tagging

    48. Experimental Setup • Apache Web Server: • A multi-process server • (Pre)spawns helper processes • A process handles one request at a time • Each process calls accept() to service the next connection request • HTTP clients run on five different machines • Servers are running FreeBSD 2.2.8 or Eclipse/BSD 2.2.8 on a PC (266 MHz Pentium Pro, 64 MB RAM, 9 GB Seagate ST39173W fast wide SCSI disk) • Machines are connected with a 10/100 Mbps Ethernet switch

    49. Experiments • Hosting two sites with two servers [Figure: /reserv tree with cpu, fxp0, and da0, each holding queues q0, q1, and q2; the reservation domains of server 1 and server 2 are marked]

    50. CPU Intensive Workload