Advanced Operating Systems

Lecture 3: OS design

University of Tehran

Dept. of EE and Computer Engineering


Dr. Nasser Yazdani

Distributed Operating Systems

How to design an OS

  • Some general guides and experiences.

  • References

    • “The Computer for the 21st Century”, Mark Weiser

    • “Exokernel: An Operating System Architecture for Application-Level Resource Management”, Dawson R. Engler, M. Frans Kaashoek, et al.

    • “On Micro-Kernel Constructions“,


  • New applications/requirements

  • Organizing operating systems

  • Some microkernel examples

  • Object-oriented organizations

    • Spring

  • Organization for multiprocessors

New vision

  • Two important problems: location and scale.

  • Ubiquitous computing: tiny kernels of functionality

  • Virtual Reality

  • Mobility

  • Intelligent devices

  • “Distributed computing”: make networks appear like disks, memory, or other non-networked devices.

Ubiquitous computing

  • Transparent computing is the ultimate goal

  • Computers should disappear into the background

  • Computation becomes part of the environment

  • Computing everywhere

    • Desktop, Laptop, Palmtop

    • Cars, Cell phones

    • Shoes, Clothing, Walls (paper / paint)

  • Connectivity everywhere

    • Broadband

    • Wireless

  • Mobile everywhere

    • Users move around

    • Disposable devices

Ubiquitous Computing

  • Structure

    • Resource and service discovery critical

    • User location an issue

    • Interface discovery

    • Disconnected operation

    • Ad-hoc organization

  • Security

    • Small devices with limited power

    • Intermittent connectivity

  • Agents

  • Sensor Networks

Grid Computing

  • Federated system

    • No single controlling authority

  • Scheduling

    • Processors, bandwidth and other resources

  • Policy is an important issue

    • Reliability, security, who may use a resource, and which resources one is willing to use.

  • Systems

    • Globus toolkit

    • Condor

    • Related but not grid – CORBA, DCOM, DCE

  • Applications

    • Distributed supercomputing

Peer-to-Peer Computing

  • Locating Cooperative elements

  • Scalability

  • OS support

  • Security

  • Policies

P2P File Sharing Issues

  • Naming

  • Data discovery

  • Availability

  • Security

    • Encryption

    • Fault tolerance

  • Conflict resolution

  • Replication

Other Peer to Peer Technologies

  • Ad-hoc networking

    • Untrusted nodes used to relay messages

    • Multiple routes (distributed and replicated)

    • Extends range, reduces power, increases aggregate bandwidth.

    • Increases latency, management more difficult.

  • Sensor networks

    • An application of ad-hoc networking

    • Add processing/reduction in the network

What is the big deal?

  • Performance

  • Border crossings are expensive

    • Change in locality

    • Copying between user and kernel buffers

    • Application requirements differ in terms of resource management

Operating System Organization

  • What is the best way to design an operating system?

  • Put another way, what are the important software characteristics of an OS?

  • What belongs in the OS kernel and what belongs in applications, i.e. how should functionality be partitioned?

    • Is there a minimal set for kernel?

Important OS Software Characteristics

  • Correctness and simplicity

  • Power and completeness

  • Performance

  • Extensibility and portability

    • Flexibility

    • Scalability

  • Suitability for distributed and parallel systems

  • Compatibility with existing systems

  • Security and fault tolerance

Common OS Organizations

  • Monolithic

  • Virtual machine

  • Layered designs

  • Kernel designs

  • Microkernels

  • Object-Oriented

  • Note that individual OS components can be organized these ways

  • Trade off between generality and specialization

What are we shooting for?

  • OS should be thin (like a microkernel) providing only mechanisms not embodying policies (i.e. management)

  • Fine grain access to system resources while avoiding border crossings as much as possible (like DOS)

  • Allow flexible extensions for management of resources (like a microkernel) without sacrificing safety (like a monolithic kernel)

Monolithic OS Design

  • Build OS as single combined module

    • Hopefully using data abstraction, compartmentalized function, etc.

  • OS lives in its own, single address space

  • Examples

    • DOS

    • early Unix systems

    • most VFS file systems

Pros/Cons of Monolithic OS Organization

  • Highly adaptable (at first . . .)

  • Little planning required

  • Potentially good performance

  • Hard to extend and change

  • Eventually becomes extremely complex

  • Eventually performance becomes poor

  • Highly prone to bugs

Virtual Machine Organizations

  • A base operating system provides services in a very generic way

  • One or more other operating systems live on top of the base system

    • Using the services it provides

    • To offer different views of system to users

  • Examples - IBM’s VM/370, the Java interpreter

Pros/Cons of Virtual Machine Organizations

  • Allows multiple OS personalities on a single machine

  • Good OS development environment

  • Can provide good portability of applications

  • Significant performance problems

  • Especially if more than 2 layers

  • Lacking in flexibility

Old idea

  • VM 370

    • Virtualization for binary support for legacy apps

  • Why resurgence today?

    • Companies want a share of everybody’s pie

      • IBM zSeries “mainframes” support virtualization for server consolidation

        • Enables billing and performance isolation while hosting several customers

      • Microsoft has announced virtualization plans to allow easy upgrades and hosting Linux!

  • You can see the dots connecting up

    • From extensibility (a la SPIN) to virtualization

Possible virtualization approaches

  • Standard OS (such as Linux, Windows)

    • Meta services (such as grid) for users to install files and run processes

    • Administration, accountability, and performance isolation become hard

  • Retrofit performance isolation into OSs

    • Linux/RK, QLinux, SILK

    • Accounting resource usage correctly can be an issue unless done at the lowest level (e.g. Exokernel)

  • Xen approach

    • Multiplex physical resource at OS granularity

Full virtualization

  • Virtual hardware identical to real one

    • Relies on hosted OS trapping to the VMM for privileged instructions

    • Pros: run unmodified OS binary on top

    • Cons:

      • supervisor instructions can fail silently in some hardware platforms (e.g. x86)

        • Solution in VMware: Dynamically rewrite portions of the hosted OS to insert traps

      • need for hosted OS to see real resources: real time, page coloring tricks for optimizing performance, etc…

Xen principles

  • Support for unmodified application binaries

  • Support for multi-application OS

    • Complex server configuration within a single OS instance

  • Paravirtualization for strong resource isolation on uncooperative hardware (x86)

  • Paravirtualization to enable optimizing guest OS performance and correctness

Xen: VM management

  • What would make VM virtualization easy

    • Software TLB

    • Tagged TLB => no TLB flush on context switch

    • x86 does not have either

  • Xen approach

    • Guest OS responsible for allocating and managing hardware PT

    • Xen lives in the top 64MB of every address space. Why?
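
The point about tagged TLBs can be illustrated with a small simulation. This is only a sketch: `ToyTLB`, `survives_round_trip`, and the guest names are hypothetical, and real TLBs are fixed-size hardware caches rather than dictionaries. It shows why an untagged TLB loses its entries across a context switch while a tagged one keeps them.

```python
class ToyTLB:
    """Toy TLB; tagged=True keys entries by address-space ID (ASID)."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}
        self.current_asid = None

    def context_switch(self, asid):
        # An untagged TLB cannot tell address spaces apart, so it must flush.
        if not self.tagged and asid != self.current_asid:
            self.entries.clear()
        self.current_asid = asid

    def _key(self, vpage):
        return (self.current_asid, vpage) if self.tagged else vpage

    def insert(self, vpage, ppage):
        self.entries[self._key(vpage)] = ppage

    def lookup(self, vpage):
        return self.entries.get(self._key(vpage))  # None models a TLB miss


def survives_round_trip(tagged):
    """Map a page in guest A, switch to guest B and back, then look it up."""
    tlb = ToyTLB(tagged)
    tlb.context_switch("guest_a")
    tlb.insert(vpage=1, ppage=42)
    tlb.context_switch("guest_b")
    tlb.context_switch("guest_a")
    return tlb.lookup(1) == 42


print(survives_round_trip(tagged=True), survives_round_trip(tagged=False))
```

On hardware without tags, every switch between address spaces pays for the flush, which is one reason to avoid a full address-space switch when entering the hypervisor.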

Layered OS Design

  • Design tiny innermost layer of software

  • Next layer out provides more functionality

    • Using services provided by inner layer

  • Continue adding layers until all functionality required has been provided

  • Examples

    • Multics

    • Fluke

    • layered file systems and comm. protocols

Pros/Cons of Layered Organization

  • More structured and extensible

  • Easy to model and develop

  • Performance: Layer crossing can be expensive

  • In some cases, unnecessary layers, duplicated functionality.

Kernel OS Designs

  • Similar to layers, but only two OS layers

    • Kernel OS services

    • Non-kernel OS services

  • Move certain functionality outside kernel

    • file systems, libraries

  • Unlike virtual machines, kernel doesn’t stand alone

  • Examples - Most modern Unix systems

Pros/Cons of Kernel OS Organization

  • Many advantages of layering, without disadvantage of too many layers

  • Easier to demonstrate correctness

  • Not as general as layering

  • Offers no organizing principle for other parts of OS, user services

  • Kernels tend to grow to monoliths

Object-Oriented OS Design

  • Design internals of OS as set of privileged objects, using OO methods

  • Sometimes extended into application space

  • Tends to lead to client/server style of computing

  • Examples

    • Mach (internally)

    • Spring (totally)

Object-Oriented Organizations

  • Object-oriented organization is increasingly popular

  • Well suited to OS development, in some ways

    • OSes manage important data structures

    • OSes are modularizable

    • Strong interfaces are good in OSes

Object-Orientation and Extensibility

  • One of the main advantages of object-oriented programming is extensibility

  • Operating systems increasingly need extensibility

  • So, again, object-oriented techniques are a good match for operating system design

How object-oriented should an OS be?

  • Many OSes have been built with object-oriented techniques

    • E.g., Mach and Windows NT

  • But most of them leave object orientation at the microkernel boundary

    • No attempt to force object orientation on out-of-kernel modules

Pros/Cons of Object Oriented OS Organization

  • Offers organizational model for entire system

  • Easily divides system into pieces

  • Good hooks for security

  • Can be a limiting model

  • Must watch for performance problems

  • Not widely used yet

Microkernel OS Design

  • Like kernels, but exports fewer abstractions (threads, address spaces, communication channels)

  • Try to include only small set of required services in the microkernel

  • Moves even more out of innermost OS part

    • Like parts of VM, IPC, paging, etc.

  • System services (e.g. VM manager) implemented as servers on top

  • High communication overhead between user-level services and the microkernel limits extensibility in practice

  • Examples - Mach, Amoeba, Plan 9, Windows NT, Chorus, Spring, etc.

Pros/Cons of Microkernel Organization

  • Those of kernels, plus:

  • Minimizes code for most important OS services

  • Offers model for entire system

  • Microkernels tend to grow into kernels

  • Requires very careful initial design choices

  • Serious danger of bad performance

Organizing the Total System

  • In microkernel organizations, much of the OS is outside the microkernel

  • But that doesn’t answer the question of how the system as a whole gets organized

  • How do you fit together the components to build an integrated system? While maintaining all the advantages of the microkernel

Some Important Microkernel Designs

Micro-ness is in the eye of the beholder

  • Mach

  • Spring

  • Amoeba

  • Plan 9

  • Windows NT


  • Mach didn’t start life as a microkernel

    • Became one in Mach 3.0

  • Object-oriented internally

    • Doesn’t force OO at higher levels

  • Microkernel focus is on communications facilities

  • Much concern with parallel/distributed systems

Mach Model

[Figure: Mach model diagram]

What’s In the Mach Microkernel?

  • Tasks & Threads

  • Ports and Port Sets

  • Messages

  • Memory Objects

  • Device Support

  • Multiprocessor/Distributed Support

Mach Tasks

  • An execution environment providing basic unit of resource allocation

  • Contains

    • Virtual address space

    • Port set

    • One or more threads

Mach Task Model

[Figure: Mach task model; recovered label: User space]

Mach Threads

  • Basic unit of Mach execution

  • Runs in context of one task

  • All threads in one task share its resources

  • Unix process similar to Mach task with single thread

Task and Thread Scheduling

  • Very flexible

  • Controllable by kernel or user-level programs

  • Threads of single task can execute in parallel

    • On single processor

    • Multiple processors

  • User-level scheduling can extend to multiprocessor scheduling

Mach Ports

  • Basic Mach object reference mechanism

    • Kernel-protected communication channel

  • Tasks communicate by sending messages to ports

  • Threads in receiving tasks pull messages off a queue

  • Ports are location independent

  • Port queues protected by kernel; bounded

Port Rights

  • Mechanism by which tasks control who may talk to their ports

  • Kernel prevents messages being sent to a port unless the sender has its port rights

  • Port rights also control which single task receives on a port
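
The port and port-right rules above can be sketched as a toy model. The names `Port`, `ToyKernel`, `grant_send_right`, and the queue limit are hypothetical and are not Mach's actual interface; the sketch only shows the kernel checking a send right before queuing, enforcing a bounded queue, and allowing a single receiver.

```python
class Port:
    """Toy Mach-style port: one receiver, many senders, bounded queue."""

    def __init__(self, receiver, limit=2):
        self.receiver = receiver  # the single task holding the receive right
        self.senders = set()      # tasks holding a send right
        self.queue = []           # kernel-protected message queue
        self.limit = limit        # queue bound


class ToyKernel:
    def grant_send_right(self, port, task):
        port.senders.add(task)

    def send(self, port, task, message):
        if task not in port.senders:
            raise PermissionError(f"{task} holds no send right for this port")
        if len(port.queue) >= port.limit:
            return False  # assumption: a real kernel would block the sender
        port.queue.append(message)
        return True

    def receive(self, port, task):
        if task != port.receiver:
            raise PermissionError(f"{task} does not hold the receive right")
        return port.queue.pop(0) if port.queue else None


kernel = ToyKernel()
port = Port(receiver="file_server")
kernel.grant_send_right(port, "client")
kernel.send(port, "client", "open /etc/motd")
print(kernel.receive(port, "file_server"))
```

Because every send and receive goes through the kernel, a task that was never granted a right has no way to reach the queue.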

Port Sets

  • A group of ports sharing a common message queue

  • A thread can receive messages from a port set

    • Thus servicing multiple ports

  • Messages are tagged with the actual port

  • A port can be a member of at most one port set
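
A minimal sketch of the port-set idea, assuming hypothetical `PortSet` and `SetPort` classes (not Mach's real API): member ports feed one shared queue, each message is tagged with the port it was sent to, and a port may join only one set.

```python
from collections import deque


class PortSet:
    """Toy port set: all member ports share one message queue."""

    def __init__(self):
        self.queue = deque()

    def receive(self):
        # Returns (port name, data); the tag says which port was addressed.
        return self.queue.popleft() if self.queue else None


class SetPort:
    def __init__(self, name):
        self.name = name
        self.port_set = None

    def join(self, port_set):
        if self.port_set is not None:
            raise ValueError(f"{self.name} already belongs to a port set")
        self.port_set = port_set

    def send(self, data):
        if self.port_set is None:
            raise ValueError(f"{self.name} is not in a port set in this toy model")
        self.port_set.queue.append((self.name, data))


requests = PortSet()
disk, console = SetPort("disk"), SetPort("console")
disk.join(requests)
console.join(requests)
console.send("write hello")
disk.send("read block 7")
print(requests.receive())
```

A single server thread looping on `requests.receive()` can thus service both ports and still tell the requests apart by their tag.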

Mach Messages

  • Typed collection of data objects

    • Unlimited size

  • Sent to particular port

  • May contain actual data or pointer to data

  • Port rights may be passed in a message

  • Kernel inspects messages for particular data types (like port rights)

Mach Memory Objects

  • A source of memory accessible by tasks

  • May be managed by user-mode external memory manager

    • E.g., a file managed by a file server

  • Accessed by messages through a port

  • Kernel manages physical memory as cache of contents of memory objects

Mach Device Support

  • Devices represented by ports

  • Messages control the device and its data transfer

  • Actual device driver outside the kernel in an external object

Mach Multiprocessor and DS Support

  • Messages and ports can extend across processor/machine boundaries

    • Location transparent entities

  • Kernel manages distributed hardware

  • Per-processor data structures, but also structures shared across the processors

  • Intermachine messages handled by a server that knows about network details

Mach’s NetMsgServer

  • User-level capability-based networking daemon

  • Handles naming and transport for messages

  • Provides world-wide name service for ports

  • Messages sent to off-node ports go through this server

NetMsgServer in Action

[Figure: NetMsgServer in action. Recovered labels: User space, User process, and Kernel space, each appearing twice]

Mach and User Interfaces

  • Mach was built for the UNIX community

  • UNIX programs don’t know about ports, messages, threads, and tasks

  • How do UNIX programs run under Mach?

  • Mach typically runs a user-level server that offers UNIX emulation

  • Either provides UNIX system call semantics internally or translates them to Mach primitives

Windows NT

  • More layered than some microkernel designs

  • NT Microkernel provides base services

  • Executive builds on base services via modules to provide user-level services

  • User-level services used by

    • privileged subsystems (parts of OS)

    • true user programs

Windows NT Diagram

[Figure: Windows NT diagram]

NT Microkernel

  • Thread scheduling

  • Process switching

  • Exception and interrupt handling

  • Multiprocessor synchronization

  • Only NT part not preemptible or pageable

    • All other NT components run in threads

NT Executive

  • Higher level services than microkernel

  • Runs in kernel mode

    • but separate from the microkernel itself

    • ease of change and expansion

  • Built of independent modules

    • all preemptible and pageable

NT Executive Modules

  • Object manager

  • Security reference monitor

  • Process manager

  • Local procedure call facility (a la RPC)

  • Virtual memory manager

  • I/O manager

Typical Activity in NT

[Figure: typical activity in NT]

Windows NT Threads

  • Executable entity running in an address space

  • Scheduled by kernel

  • Handled by kernel’s dispatcher

  • Kernel works with stripped-down view of thread - kernel thread object

  • Multiple process threads can execute on distinct processors--even Executive ones

Microkernel Process Objects

  • A microkernel proxy for the real process

  • Microkernel’s interface to the real process

  • Contains pointers to the various resources owned by the process

    • e.g., threads and address spaces

  • Alterable only by microkernel calls

Microkernel Thread Objects

  • As microkernel process objects are proxies for the real object, microkernel thread objects are proxies for the real thread

    • One per thread

  • Contains minimal information about thread

    • Priorities, dispatching state

  • Used by the microkernel for dispatching

More On Microkernels

  • Microkernels were the research architecture of the 80s

  • But few commercial systems of the 90s really use microkernels

  • To some extent, “microkernel” is now a dirty word in OS design

  • Why?

Microkernel Construction

  • Most Microkernels do not perform well

    • Is it inherent in the approach or

    • Implementation?

  • IPC, the microkernel bottleneck, can be implemented an order of magnitude faster.

    • Microkernel does not supervise memory

    • Minimal address space management: grant, map, flush.

    • Fast kernel-user switch: usually 20-30 µs, but 3 µs in the L3 implementation








  • Traditional operating systems fix the interface and implementation of OS abstractions.

  • Abstractions must be overly general to work with diverse application needs.

[Figure: SQL Server running on a traditional OS]

The Issues

  • Performance

    • Denies applications the advantages of domain-specific optimizations

  • Flexibility

    • Restricts the flexibility of application builders

  • Functionality

    • Discourages changes to the implementations of existing abstractions


  • Example: A DB can have predictable data access patterns that don't fit the OS's LRU page replacement, causing bad performance.

  • Cao et al. found that application-controlled file caching can reduce running time by as much as 45%.

  • There is no single way to abstract physical resources or to implement an abstraction that is best for all applications.

  • OS is forced to make trade-offs

  • Performance improvements of application-specific policies could be substantial
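
The database example above can be made concrete with a small simulation. This is a sketch under stated assumptions: the function `count_hits`, the 5-page scan, and the choice of MRU as the application-specific policy are illustrative and do not come from the slides. It compares LRU against MRU on a repeated sequential scan that is slightly larger than memory.

```python
from collections import OrderedDict


def count_hits(accesses, capacity, evict_most_recent=False):
    """Count cache hits for a page reference string.

    evict_most_recent=False models LRU (a typical fixed OS policy);
    evict_most_recent=True models MRU, a hypothetical application-chosen policy.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
            continue
        if len(cache) >= capacity:
            # last=True pops the most recently used entry, last=False the least.
            cache.popitem(last=evict_most_recent)
        cache[page] = True
    return hits


# A database repeatedly scanning 5 pages with room for only 4 in memory.
scan = list(range(5)) * 10
lru_hits = count_hits(scan, capacity=4)
mru_hits = count_hits(scan, capacity=4, evict_most_recent=True)
print(f"LRU hits: {lru_hits}, MRU hits: {mru_hits}")
```

Under LRU the page evicted is always the one the scan needs next, so every access misses; a policy chosen with knowledge of the access pattern avoids that.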


  • Fixed high-level abstractions hide information from applications.

  • Makes it difficult or impossible for applications to implement their own resource management abstractions.


  • Only one available interface between applications and hardware resources.

  • Because all applications must share one set of abstractions, changes to these abstractions occur rarely, if ever

The Solution

  • Separate protection from management

    • Allow user level to manage resources

      • Application libraries implement OS abstractions

    • Exokernel exports resources

      • Low level interface

      • Protects, does not manage

      • Expose hardware

Exokernel philosophy

  • Applications know better than Operating Systems what the goal of their resource management decisions should be

  • Applications should be given as much control as possible over those decisions

  • Implementation view


Frame Buffer | TLB | Network | Memory | Disk


[Figure: Exokernel, application-level resource management. Labels: SQL Server; Library OS (chosen from available); Library OS (customized for SQL Server)]

Implementation Overview

  • Library O.S., which uses the low-level exokernel interface to implement higher-level abstractions.

Library O.S.


Frame Buffer | TLB | Network | Memory | Disk


Implementation Overview

  • Applications link to a library OS, leveraging its higher-level abstractions.

Library O.S.

Library O.S.




Frame Buffer | TLB | Network | Memory | Disk


End-to-End Argument

  • “if something has to be done by the user program itself, it is wasteful to do it in a lower level as well.”

  • Why should the OS do anything that the user program can do itself?

  • In other words - all an OS should do is securely allocate resources.

Exokernel design

Exokernel tasks

  • Track ownership

  • Guard all resources through bind points

  • Revoke access to resources

Design principle

  • Expose hardware (securely)

  • Expose allocation

  • Expose names

  • Expose revocation

Secure binding

  • Decouples authorization from use

  • Allows kernel to protect resources without understanding their semantics

  • Example: TLB entry

    • Virtual to physical mapping performed in the library (above exokernel)

    • Binding loaded into the kernel; used multiple times

  • Example: packet filter

    • Predicates loaded into the kernel

    • Checked on each packet arrival
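
The packet-filter example can be sketched as follows. `FilterExokernel`, `bind_filter`, `packet_arrival`, and the library OS names are hypothetical; a real exokernel would also verify the downloaded predicate for safety, which this toy omits. The sketch separates bind time (installing a predicate) from access time (running predicates on each arriving packet).

```python
class FilterExokernel:
    """Toy kernel that demultiplexes packets using installed predicates."""

    def __init__(self):
        self.filters = []  # (predicate, owner) pairs installed at bind time
        self.inbox = {}    # owner -> packets delivered so far

    def bind_filter(self, owner, predicate):
        # Bind time: authorization checks would happen here (elided in this toy).
        self.filters.append((predicate, owner))
        self.inbox.setdefault(owner, [])

    def packet_arrival(self, packet):
        # Access time: the kernel only evaluates predicates; it does not
        # need to understand the protocol the library OS implements.
        for predicate, owner in self.filters:
            if predicate(packet):
                self.inbox[owner].append(packet)
                return owner
        return None  # no binding claims the packet, so it is dropped


kernel = FilterExokernel()
kernel.bind_filter("web_libos", lambda pkt: pkt["dst_port"] == 80)
kernel.bind_filter("dns_libos", lambda pkt: pkt["dst_port"] == 53)
print(kernel.packet_arrival({"dst_port": 53, "payload": b"query"}))
```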

Implementing secure bindings

  • Hardware mechanisms

    • Capability for physical pages of a file

    • Frame buffer regions (SGI)

  • Software caching

    • Exokernel large software TLB overlaying the hardware TLB

  • Downloading code into kernel

    • Avoid expensive boundary crossings

    • Similar to the SPIN idea

Examples of secure binding

  • Physical memory allocation (hardware supported binding)

    • Library allocates physical page

    • Exokernel records the allocator and the permissions and returns a “capability” – an encrypted cypher

    • Every access to this page by the library requires this capability

  • Page fault:

    • Kernel fields it

    • Kicks it up to the library

    • Library allocates a page and gets an encrypted capability

    • Library calls the kernel to enter a particular translation into the TLB by presenting the capability
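
The allocation and page-fault steps above can be sketched in a few lines. Everything named here is hypothetical (`CapabilityExokernel`, `alloc_page`, `tlb_insert`), and using an HMAC as the capability is an assumption standing in for the "encrypted cypher" the slide mentions. The kernel records nothing about what the page is used for; it only checks that the presented capability matches the owner, page, and permissions.

```python
import hashlib
import hmac
import os


class CapabilityExokernel:
    """Toy kernel that hands out unforgeable capabilities for physical pages."""

    def __init__(self, num_pages):
        self._secret = os.urandom(16)  # known only to the kernel
        self.free_pages = list(range(num_pages))
        self.tlb = {}                  # (env, virtual page) -> physical page

    def _capability(self, env, page, perms):
        msg = f"{env}:{page}:{perms}".encode()
        return hmac.new(self._secret, msg, hashlib.sha256).hexdigest()

    def alloc_page(self, env, perms="rw"):
        # The library OS takes part in the allocation and gets a capability back.
        page = self.free_pages.pop(0)
        return page, self._capability(env, page, perms)

    def tlb_insert(self, env, vpage, ppage, perms, cap):
        # The kernel checks the capability; the mapping itself was chosen
        # by the library OS.
        expected = self._capability(env, ppage, perms)
        if not hmac.compare_digest(expected, cap):
            raise PermissionError("capability does not authorize this mapping")
        self.tlb[(env, vpage)] = ppage


kernel = CapabilityExokernel(num_pages=4)
page, cap = kernel.alloc_page("libos_a")
kernel.tlb_insert("libos_a", vpage=7, ppage=page, perms="rw", cap=cap)
print(kernel.tlb)
```

A different library OS presenting the same capability is rejected, because the owner is part of what the capability authenticates.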

  • Download code into kernel to establish secure binding

    • Packet filter for demultiplexing network packets

    • Exactly similar to SPIN

    • How to ensure authenticity?

    • Only trusted servers (library OS) can download code into the kernel

  • Other use of downloaded code

    • Execute code on behalf of an app that is not currently scheduled

    • E.g. application handler for garbage collection could be installed in the kernel

Visible resource revocation

  • Most resources are visibly revoked

    • E.g. processor; physical page

    • Library can then perform necessary action before relinquishing the resource

      • E.g. needed state saving for a processor

      • E.g. update of page table

Abort protocol

  • Repossession exception passed to the library OS

  • Repossession vector

    • Gives info to the library OS as to what was repossessed so that corrective action can be taken

    • Library OS can seed the vector to enable exokernel to autosave (e.g. disk blocks to which a physical page being repossessed should be written)
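
A minimal sketch of how visible revocation and the abort protocol fit together, assuming hypothetical names (`RevokingExokernel`, `revoke`, `ask_libos`) and modeling the repossession vector as a plain per-library list. The kernel first asks the library OS to release a page; only if the library refuses does it break the binding by force and record what was taken.

```python
class RevokingExokernel:
    """Toy kernel combining visible revocation with an abort protocol."""

    def __init__(self):
        self.owner = {}        # physical page -> owning library OS
        self.repossessed = {}  # library OS -> pages taken by force

    def allocate(self, page, libos):
        self.owner[page] = libos

    def revoke(self, page, ask_libos):
        """Ask the owner to give up `page`; fall back to the abort protocol."""
        libos = self.owner.pop(page)
        if ask_libos(page):
            return "released"  # visible revocation: the libOS cooperated
        # Abort protocol: the binding is broken by force and the loss is
        # recorded so the libOS can take corrective action later.
        self.repossessed.setdefault(libos, []).append(page)
        return "repossessed"


kernel = RevokingExokernel()
kernel.allocate(3, "libos_a")
kernel.allocate(4, "libos_a")
print(kernel.revoke(3, ask_libos=lambda page: True))
print(kernel.revoke(4, ask_libos=lambda page: False))
```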

Aegis – an exokernel

Aegis – processor time slice

  • Linear vector of time slots

  • Round robin

  • An application can mark its “position” in the vector for scheduling

  • Timer interrupt

    • Beginning and end of time slices

    • Control transferred to library specified handler for actual saving/restoring

    • Time to save/restore is bounded

      • Penalty? loss of a time slice next time!
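
The time-slice scheme can be sketched as a small round-robin vector. The class `SliceVector` and its methods are hypothetical, and the penalty is simplified to forfeiting exactly one upcoming slot. An environment claims positions in the vector, and one that overruns its save/restore bound loses its next slice.

```python
class SliceVector:
    """Toy linear vector of time slots, walked round robin."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # slot index -> owning environment
        self.penalized = set()           # environments that overran their bound
        self.cursor = 0

    def claim(self, index, env):
        if self.slots[index] is not None:
            raise ValueError(f"slot {index} is already taken")
        self.slots[index] = env

    def report_overrun(self, env):
        # Called when env took too long to save or restore its state.
        self.penalized.add(env)

    def next_to_run(self):
        """Advance one slot; return its owner, or None if idle or forfeited."""
        env = self.slots[self.cursor]
        self.cursor = (self.cursor + 1) % len(self.slots)
        if env in self.penalized:
            self.penalized.discard(env)  # penalty: this slice is lost
            return None
        return env


vector = SliceVector(3)
vector.claim(0, "libos_a")
vector.claim(1, "libos_b")
vector.claim(2, "libos_a")
first_round = [vector.next_to_run() for _ in range(3)]
vector.report_overrun("libos_b")
second_round = [vector.next_to_run() for _ in range(3)]
print(first_round, second_round)
```

Claiming two slots, as `libos_a` does here, is how an application marks its position to get a larger share of the processor.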

Aegis – processor environments

  • Exception context

    • Program generated

  • Interrupt context

    • External: e.g. timer

  • Protected entry context

    • Cross domain calls

  • Addressing context

    • Guaranteed mappings implemented by software TLB mimicking the library OS page table

Aegis performance

Aegis - Address translation

  • On TLB miss

    • Kernel installs the hardware TLB entry from the software TLB for guaranteed mappings

    • Otherwise application handler called

    • Application establishes mapping

    • TLB entry with associated capability presented to the kernel

    • Kernel installs and resumes execution of the application
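
The TLB-miss path above can be sketched as follows. The names `ToyAegis`, `on_tlb_miss`, and `owned_pages` are hypothetical, an ownership set stands in for the capability check, and caching application-supplied mappings in the software TLB is an assumption of this sketch. It shows the two paths: a hit in the software TLB handled entirely by the kernel, and a miss handed to the application's handler.

```python
class ToyAegis:
    """Toy model of TLB-miss handling with a kernel software TLB."""

    def __init__(self):
        self.hardware_tlb = {}  # (env, vpage) -> ppage
        self.software_tlb = {}  # larger kernel-managed cache of mappings
        self.owned_pages = {}   # env -> set of physical pages it may map

    def on_tlb_miss(self, env, vpage, app_handler):
        key = (env, vpage)
        if key in self.software_tlb:
            # Fast path: the kernel refills the hardware TLB by itself.
            self.hardware_tlb[key] = self.software_tlb[key]
            return "software-tlb"
        # Slow path: the application (library OS) establishes the mapping.
        ppage = app_handler(vpage)
        if ppage not in self.owned_pages.get(env, set()):
            raise PermissionError("environment may not map this physical page")
        self.software_tlb[key] = ppage
        self.hardware_tlb[key] = ppage
        return "app-handler"


kernel = ToyAegis()
kernel.owned_pages["libos_a"] = {9}
page_table = {5: 9}  # the library OS's own page table: vpage -> ppage
first = kernel.on_tlb_miss("libos_a", 5, page_table.get)
kernel.hardware_tlb.clear()  # pretend the hardware entry was evicted
second = kernel.on_tlb_miss("libos_a", 5, page_table.get)
print(first, second)
```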

ExOS – library OS

  • IPC abstraction

  • VM

  • Remote communication using ASH (application specific safe handlers)


  • Significant performance improvement possible compared to a monolithic implementation

The Exokernel

  • A thin veneer that multiplexes and exports physical resources securely.

    • Simplicity allows efficiency

    • The lower the level of a primitive, the more efficiently it can be implemented, and the more latitude it grants to implementers of higher level abstractions.

The Exokernel

  • Resource management is restricted to

    • allocation,

    • revocation,

    • sharing

    • ownership tracking

Library operating systems

  • Use the low level exokernel interface

  • Higher level abstractions

  • Special purpose implementations

    An application can choose the library which best suits its needs, or even build its own.

Another Example

Design Challenge

How can an Exokernel allow libOSes to freely manage physical resources while protecting them from each other?

  • Track ownership of resources

    • Secure bindings – libOS can securely bind to machine resources

  • Guard all resource usage

  • Revoke access to resources

Secure Bindings

  • Exokernel allows libOSes to bind resources using secure bindings

    • Multiplex resources securely

    • Protection for mutually distrusted apps

    • Efficient

Secure Bindings

  • Secure Binding – a protection mechanism that decouples authorization from actual use of a resource

    • Allows the kernel to protect resources without having to understand them

Guard all resource usage

  • Invisible resource revocation

    • Efficient – application layer not involved

    • Traditional OS

  • Visible resource revocation

    • Allows libOS to guide deallocation and track availability of resources.


Revoke access to resources

Abort protocol – Allows exokernel to break secure bindings of an uncooperative libOS by force


  • An exokernel securely multiplexes the available raw hardware among applications

  • Application level library operating systems implement higher-level traditional OS abstractions

  • LibOSes can specialize an implementation to suit a particular application


  • The lower the level of a primitive…

    …the more efficiently it can be implemented

    … the more latitude it gives to higher level abstractions

  • So, separate management from protection and…

    …implement protection at a low level (exokernel)

    … implement management at a higher level (libOS)

Some Features

  • It is possible to have different libOSes, for example, one could export a Unix API and another a Windows API

Exokernel vs. Microkernel

  • A microkernel provides abstractions of the hardware such as files, sockets, graphics, etc.

  • An exokernel provides almost raw access to the hardware.


Implementation Overview

  • Allows the extension, specialization, and even replacement of abstractions.

    • Example: Page Table implementations can vary from libOS to libOS, and applications can choose whichever is most suitable for their needs.



Implementation Principles

  • Give libOSes maximum freedom while protecting them from each other. This is achieved by separating protection from resource management.

    • Resources should be managed only to the extent required for protection. LibOSes decide how best to use resources, with the exokernel arbitrating between competing libraries.

    • LibOSes should be able to request specific physical resources (such as specific physical pages).

    • Resources should not be implicitly allocated; the libOS should participate in every allocation.

Exokernel Design

  • Secure Bindings

  • Downloading Code

  • Visible Revocation

  • Abort Protocol



Secure Bindings

  • Protection mechanism that decouples authorization (bind time) from actual use of the resource (access time).

    • Authorization performed at bind time.

    • Expressed in simple operations that the exokernel can implement quickly and efficiently.

  • Can protect resources without understanding them.

  • Example:

    • When a TLB miss occurs, the libOS computes the virtual-to-physical mapping, the exokernel checks it and loads it into the TLB (bind time), and the mapping is then used many times without further checks (access time).
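
The bind-time versus access-time split can be sketched as a toy model. This is only an illustration under stated assumptions: the `Exokernel` class, its `allocate`, `bind`, and `access` methods, and the libOS names are hypothetical and are not Aegis interfaces.

```python
class Exokernel:
    """Toy model: authorization happens once at bind time; access is a lookup."""

    def __init__(self):
        self.owner = {}      # physical page -> owning libOS (hypothetical bookkeeping)
        self.bindings = {}   # (libOS, virtual page) -> physical page

    def allocate(self, libos, ppage):
        if ppage in self.owner:
            raise PermissionError(f"page {ppage} is already owned")
        self.owner[ppage] = libos

    def bind(self, libos, vpage, ppage):
        # Bind time: the only place where ownership is checked.
        if self.owner.get(ppage) != libos:
            raise PermissionError(f"{libos} does not own page {ppage}")
        self.bindings[(libos, vpage)] = ppage

    def access(self, libos, vpage):
        # Access time: no policy decision, only a table lookup.
        return self.bindings[(libos, vpage)]


kernel = Exokernel()
kernel.allocate("libos_a", 5)
kernel.bind("libos_a", vpage=0x10, ppage=5)
hits = [kernel.access("libos_a", 0x10) for _ in range(3)]
```

The kernel never interprets what the page is used for; it only remembers who may use it, which is the sense in which it protects resources without understanding them.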



Downloading Code

  • Code can be downloaded into the exokernel, for execution at defined events (like packet arrival).

    • Reduces kernel crossings.

    • Can execute even when the application isn't scheduled.

    • Can initiate events (e.g., send a response message to a packet)

  • Example:

    • A packet filter is downloaded into the exokernel (bind time), and then run on every incoming packet to determine the intended target application (access time), and can even initiate a response.
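
A minimal sketch of packet-filter demultiplexing, assuming a made-up packet layout (2-byte source port, 2-byte destination port, payload). In the real system the filter is downloaded code that the kernel checks and runs; here ordinary Python closures stand in for it, and `Demux`, `install`, and `deliver` are hypothetical names.

```python
import struct


def make_port_filter(port):
    """Return a predicate that accepts packets addressed to `port`."""
    def accepts(packet: bytes) -> bool:
        # Assumed toy layout: 2-byte source port, 2-byte destination port.
        if len(packet) < 4:
            return False
        _, dst = struct.unpack("!HH", packet[:4])
        return dst == port
    return accepts


class Demux:
    def __init__(self):
        self.filters = []  # (application name, filter) pairs

    def install(self, app, packet_filter):
        # Bind time: the application hands its filter to the kernel.
        self.filters.append((app, packet_filter))

    def deliver(self, packet):
        # Access time: run the filters on each incoming packet.
        for app, accepts in self.filters:
            if accepts(packet):
                return app
        return None


demux = Demux()
demux.install("web", make_port_filter(80))
demux.install("dns", make_port_filter(53))
packet = struct.pack("!HH", 40000, 53) + b"query"
target = demux.deliver(packet)
```

Because the filters run inside the demultiplexer, the target application does not need to be scheduled for the packet to be classified.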



Visible Resource Revocation

  • Traditionally, OSes revoke (deallocate) resources invisibly, without application involvement (e.g., physical memory).

    • Advantage: lower latency

    • Disadvantage: applications cannot guide deallocation

  • The exokernel uses visible revocation for most resources. The library OS is notified of the intention to deallocate and can guide the process.

    • Example: when the libOS is told that the exokernel will deallocate physical page “5”, it can use this information to update its page table, or even suggest a less important page for deallocation.
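
The page “5” example can be sketched as a callback from the kernel into the libOS. The `LibOS` class, its `on_revoke` method, and the importance scores are assumptions made for illustration only.

```python
class LibOS:
    def __init__(self, importance):
        # Hypothetical policy state: physical page -> importance (higher = keep).
        self.importance = dict(importance)

    def on_revoke(self, requested_page):
        """Called by the kernel; returns the page the libOS prefers to give up."""
        victim = min(self.importance, key=self.importance.get)
        del self.importance[victim]  # stands in for updating the page table
        return victim


def revoke_page(libos, requested):
    # Visible revocation: the kernel asks, and the libOS may offer an alternative.
    return libos.on_revoke(requested)


libos = LibOS({5: 10, 7: 1, 9: 4})
surrendered = revoke_page(libos, requested=5)
```

Here the kernel asks for page 5, but the libOS gives up page 7 because it rates that page as least important.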



Abort Protocol

  • Mechanism to take away resources when libOSes fail to respond satisfactorily to visible revocation requests.

  • A repossession vector is used to keep track of forcibly deallocated resources.

    • Library OSes can pre-load the vector with information that can be used to write out state or data about the resource when it is deallocated (e.g., the disk blocks to use for memory paging).

  • Library OSes normally require certain allocations to be permanent, so the exokernel can guarantee a small number of resources that cannot be forcibly deallocated.

    • Examples: page tables, exception areas
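
A rough sketch of the abort protocol, assuming a kernel that first asks politely and then repossesses. The `Kernel` class, the `revoke` method, and the callback signature are hypothetical; only the ideas of a repossession vector and guaranteed resources come from the slide.

```python
class Kernel:
    def __init__(self, guaranteed):
        self.guaranteed = set(guaranteed)  # pages that are never repossessed
        self.repossession_vector = []      # record of forcibly taken pages

    def revoke(self, libos_pages, page, libos_responds):
        """libos_responds: callable returning True if the libOS releases the page."""
        if page in self.guaranteed:
            return "guaranteed"
        if libos_responds(page):
            libos_pages.discard(page)
            return "released"
        # Abort protocol: break the binding by force and record what was taken.
        libos_pages.discard(page)
        self.repossession_vector.append(page)
        return "repossessed"


kernel = Kernel(guaranteed={0})
pages = {0, 5, 7}
results = [
    kernel.revoke(pages, 5, lambda p: True),   # cooperative libOS
    kernel.revoke(pages, 7, lambda p: False),  # uncooperative libOS
    kernel.revoke(pages, 0, lambda p: False),  # guaranteed page
]
```

The libOS can later read the repossession vector to learn which resources it lost.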




  • Aegis: Exokernel

    • Exports: processor, physical memory, TLB, exceptions, interrupts, and network interface.

  • ExOS: Library OS

    • Implements: processes, virtual memory, user-level exceptions, interprocess abstractions, and network protocols (ARP, IP, UDP, NFS)

  • Compared to Ultrix




  • Processor Time Slices

    • Time Slices partitioned and allocated at the clock granularity. Scheduled using round robin.

    • Advanced Scheduling can be implemented by libOS through requesting specific positions in the time slices.

      • Long running apps can allocate contiguous time slices, while interactive apps can allocate several equidistant slices
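
One way to picture position-based slice allocation is a fixed vector of slots visited round robin. This is a sketch under assumptions: the `SliceVector` class, the vector length of 8, and the all-or-nothing request policy are illustrative choices, not details taken from Aegis.

```python
class SliceVector:
    """Toy model of a fixed-length vector of time slices visited round robin."""

    def __init__(self, n):
        self.slots = [None] * n

    def request(self, app, positions):
        # All-or-nothing: refuse if any requested slot is already taken.
        if any(self.slots[p] is not None for p in positions):
            return False
        for p in positions:
            self.slots[p] = app
        return True

    def schedule(self, ticks):
        # Round robin over the vector; unallocated slots idle (None).
        return [self.slots[t % len(self.slots)] for t in range(ticks)]


vec = SliceVector(8)
ok_batch = vec.request("batch", [2, 3, 4])   # contiguous slices, long-running app
ok_editor = vec.request("editor", [1, 5])    # equidistant slices, interactive app
ok_other = vec.request("other", [4, 6])      # slot 4 is taken, so this is refused
timeline = vec.schedule(8)
```

The interactive application waits at most four slices between runs, while the batch application gets three slices back to back.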




  • Exceptions

  • Interrupts

  • Address Translations

    • Guarantees address mappings for a small number of pages, to simplify bootstrapping.

  • Protected Control Transfers

    • For IPC abstractions

    • Changes the program counter to an agreed-upon location, installs the callee's context, and donates the current time slice.

  • Dynamic Packet Filter




  • IPC Abstractions

    • pipe: ExOS uses a shared memory buffer and is an order of magnitude faster than Ultrix, which uses standard Unix pipes.

  • Application Level Virtual Memory

    • 150x150 integer matrix multiplication: uses no special ExOS or Aegis abilities and shows that application-level VM incurs no noticeable overhead (0.1 second difference)

    • All other tests perform comparably with Ultrix (reading pages, flipping protection bits, etc.)

  • Downloaded code for networking handler

    • Round-trip latency for RPC is lower than with FRPC



ExOS Extensibility

  • Extensible Page-Table structures

    • Implemented inverted page tables

  • Extensible Schedulers

    • Stride Scheduling (proportional share scheduling)

      • The processes are successfully scheduled at a ratio of 3:2:1
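
The 3:2:1 result can be reproduced with a short stride scheduler. This is a generic textbook-style sketch, not the ExOS code; the function name, the `stride1` constant, and the tie-breaking rule are assumptions.

```python
from collections import Counter


def stride_schedule(tickets, quanta, stride1=6000):
    """Make `quanta` scheduling decisions; return how many each process got."""
    # A process with more tickets gets a smaller stride and so runs more often.
    stride = {p: stride1 // t for p, t in tickets.items()}
    passes = dict(stride)  # each process starts with pass equal to its stride
    runs = Counter()
    for _ in range(quanta):
        # Run the process with the lowest pass; break ties by name.
        current = min(passes, key=lambda p: (passes[p], p))
        runs[current] += 1
        passes[current] += stride[current]
    return runs


runs = stride_schedule({"A": 3, "B": 2, "C": 1}, quanta=600)
```

With tickets 3:2:1, the 600 quanta are split 300, 200, and 100, because every window of six decisions gives A three turns, B two, and C one.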




  • Experiments with Aegis and ExOS show

    • Simple exokernel primitives can be implemented efficiently

    • Because the primitives are fast, low-level hardware multiplexing can be implemented efficiently

    • Traditional OS abstractions can be implemented at user level

    • Applications can create special-purpose implementations by modifying libraries



Other Exokernel Work

  • “Porting Multithreading Libraries to an Exokernel System”, Ernest Artiaga, Albert Serra, Marisa Gil. Dept. of Computer Architecture, Universitat Politecnica de Catalunya. ACM SIGOPS European Workshop, ACM 2000, pp. 121-126

    • Ported Cthreads to Exokernel

    • Slightly faster execution than without threading



Other Exokernel Work

  • “Fast and Flexible Application-Level Networking on Exokernel Systems”, Gregory Ganger, Dawson Engler, et al. CMU, Stanford, MIT and Vividon, Inc. ACM Transactions on Computer Systems, vol. 20, no. 1, pp. 49-83, 2002

    • Implemented TCP, HTTP server, and web benchmarking tool

    • TCP: 50-300% higher throughput

    • HTTP: 3-8 times higher throughput

    • Benchmarking: Can produce loads 2-8 times heavier

Key points of the paper

  • Microkernel should provide minimal abstractions

    • Address space, threads, IPC

  • Abstractions are machine independent, but implementations are hardware dependent for performance

  • Myths about microkernel inefficiency stem from inefficient implementations, NOT from the microkernel approach

What abstractions?

  • Determining criterion:

    • Functionality not performance

  • The hardware and the microkernel are trusted; applications are not

    • Hardware provides page-based virtual memory

    • Kernel builds on this to provide protection for services above and outside the microkernel

  • Principles of independence and integrity

    • Subsystems independent of one another

    • Integrity of channels between subsystems protected from other subsystems

Microkernel Concepts

  • Hardware provides address space

    • mapping from virtual page to a physical page

    • implemented by page tables and TLB

  • Microkernel concept of address spaces

    • Hides the hardware address spaces and provides an abstraction that supports

      • Grant?

      • Map?

      • Flush?

    • These primitives allow building a hierarchy of protected address spaces
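
The three primitives can be sketched with a toy model that records, for each page, which space it came from. The semantics assumed here are the usual reading of Liedtke's design: grant moves a page to another space, map shares it, and flush removes it from every space that received it from the flusher. The `Kernel` class and the `give` helper are hypothetical.

```python
class Kernel:
    def __init__(self):
        # (space, page) -> space it was mapped from (None for the initial owner)
        self.parent = {}

    def give(self, space, page):
        """Hypothetical helper: seed the initial owner of a page."""
        self.parent[(space, page)] = None

    def has(self, space, page):
        return (space, page) in self.parent

    def map(self, src, dst, page):
        assert self.has(src, page), "can only map a page you hold"
        self.parent[(dst, page)] = src

    def grant(self, src, dst, page):
        assert self.has(src, page), "can only grant a page you hold"
        # dst takes over src's position in the hierarchy; src loses the page.
        self.parent[(dst, page)] = self.parent.pop((src, page))
        for key, par in self.parent.items():
            if key[1] == page and par == src:
                self.parent[key] = dst

    def flush(self, space, page):
        """Remove the page from every space that got it, directly or not, from `space`."""
        doomed = [s for (s, p) in self.parent
                  if p == page and self._descends(s, page, space)]
        for s in doomed:
            del self.parent[(s, page)]

    def _descends(self, space, page, ancestor):
        cur = self.parent.get((space, page))
        while cur is not None:
            if cur == ancestor:
                return True
            cur = self.parent.get((cur, page))
        return False


k = Kernel()
k.give("A0", 1)          # A0 initially holds page 1
k.map("A0", "A1", 1)     # A0 shares the page with A1
k.grant("A1", "A2", 1)   # A1 hands the page on to A2 and loses it
k.map("A2", "A3", 1)     # A2 shares it with A3
after_grant = (k.has("A1", 1), k.has("A2", 1), k.has("A3", 1))
k.flush("A0", 1)         # A0 revokes everything derived from its mapping
after_flush = (k.has("A0", 1), k.has("A2", 1), k.has("A3", 1))
```

All three operations go through the kernel, which is what lets memory managers outside the kernel be stacked without trusting one another.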

Address spaces



[Figure: address spaces A1, A2, and A3 of processes P1, P2, and P3, with page entries (V1, R), (V2, R), (V3, R) and source labels (P1, v1), (P2, v2), (P3, v3)]


  • Power and flexibility of address spaces

    • Initial memory manager for address space A0 appears by magic (similar to SPIN core service BUT outside the kernel) and encompasses the physical memory

    • Allow creation of stackable memory managers (all outside the kernel)

    • Pagers can be part of a memory manager or outside the memory manager

    • All address space changes (map, grant, flush) orchestrated via kernel for protection

    • Device driver can be implemented as a special memory manager outside the kernel as well



[Figure: stacked memory managers M0 (A0, P0), M1 (A1, P1), and M2 (A2, P2)]



Threads and IPC

  • A thread executes in an address space

    • Thread state: PC, SP, processor registers, and state info (such as the address space)

  • IPC is cross address space communication

    • Supported by the microkernel

      • Classic method is message passing between threads via the kernel

    • Sender sends info; receiver decides if it wants to receive it, and if so where

    • Address space operations such as map, grant, flush need IPC

    • Higher level communication (e.g. RPC) built on top of basic IPC


  • Interrupts?

    • Each hardware device is a thread from kernel’s perspective

    • Interrupt is a null message from a hardware thread to the software thread

    • Kernel transforms hardware interrupt into a message

      • Does not know or care about the semantics of the interrupt

      • Device specific interrupt handling outside the kernel

      • Clearing hardware state (if privileged) then carried out by the kernel upon driver thread’s next IPC

    • TLB handler?

      • In theory software TLB handler can be outside the microkernel

      • In practice first level TLB handler inside the microkernel or in hardware

Unique IDs

  • Kernel provides identifiers that are unique over space and time (uids) for

    • Threads

    • IPC channels

Breaking some performance myths

  • Kernel user switches

  • Address space switches

  • Thread switches and IPC

  • Memory effects

    Base system:

    486 (50 MHz) – 20 ns cycle time

Kernel-user switches

  • Machine instruction for entering and exiting

    • 107 cycles

    • Mach measures 900 cycles for kernel-user switch

      • Why?

    • Empirical proof

      • L3 kernel ~ 123 cycles (accounting for some TLB, cache misses)

    • Where did the remaining ~800 cycles go in Mach?

      • Kernel overhead (construction of the kernel, and inherent in the approach)
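
The cycle counts above can be turned into time using the 20 ns cycle of the 50 MHz base system. The helper name `cycles_to_us` is an assumption; the inputs are the figures quoted on the slides.

```python
CYCLE_NS = 20  # 486 at 50 MHz, as stated for the base system


def cycles_to_us(cycles, cycle_ns=CYCLE_NS):
    """Convert a cycle count to microseconds."""
    return cycles * cycle_ns / 1000


hardware_us = cycles_to_us(107)  # bare enter/exit instructions
l3_us = cycles_to_us(123)        # L3, including some TLB and cache misses
mach_us = cycles_to_us(900)      # Mach kernel-user switch
overhead_cycles = 900 - 107      # cycles not explained by the hardware
```

This gives about 2.14 µs for the hardware, 2.46 µs for L3, and 18 µs for Mach, leaving 793 cycles (the "remaining 800") attributable to kernel overhead.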

Address space switches

  • Primer on TLBs

    • AS tagged TLB (MIPS R4000) vs untagged TLB (486)

      • Untagged TLB requires flush on AS switch

  • Instruction and data caches

    • Usually physically tagged in modern processors, so an address space switch (and its TLB flush) does not affect them

  • Address space switch

    • Complete reload of Pentium TLB ~ 864 cycles
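
The cost difference between tagged and untagged TLBs can be illustrated by counting misses in a toy simulation. The `TLB` class, the unlimited capacity, and the two tiny page tables are assumptions made for the sketch; it models no real processor.

```python
class TLB:
    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}   # lookup key -> physical page
        self.current = None
        self.misses = 0

    def switch(self, asid):
        if not self.tagged:
            # Untagged entries are ambiguous across address spaces: flush them.
            self.entries.clear()
        self.current = asid

    def translate(self, vpage, page_tables):
        key = (self.current, vpage) if self.tagged else vpage
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = page_tables[self.current][vpage]
        return self.entries[key]


def run(tagged):
    page_tables = {"A": {0: 10, 1: 11}, "B": {0: 20, 1: 21}}
    tlb = TLB(tagged)
    for _ in range(3):                 # alternate between A and B three times
        for asid in ("A", "B"):
            tlb.switch(asid)
            for vpage in (0, 1):
                tlb.translate(vpage, page_tables)
    return tlb.misses


tagged_misses = run(tagged=True)
untagged_misses = run(tagged=False)
```

The tagged TLB misses only on the first touch of each page (4 misses), while the untagged one reloads its entries after every switch (12 misses).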


  • Do we need a TLB flush always?

    • Implementation issue of “protection domains”

    • SPIN implements protection domains as Modula names within a single hardware address space

    • Liedtke suggests similar approach in the microkernel in an architecture-specific manner

      • PowerPC: use segment registers => no flush

      • Pentium or 486: share the linear hardware address space among several user address spaces => no flush

        • There are some caveats in terms of size of user space and how many can be “packed” in a 2**32 global space


  • Upshot?

    • Address space switching among medium or small protection domains can ALWAYS be made efficient by careful construction of the microkernel

    • Switches between large address spaces are ALWAYS going to be expensive because of cache and TLB effects, so the switching cost itself is not the most critical issue

Thread switches and IPC


Segment switch (instead of AS switch) makes cross domain calls cheap

Memory Effects – System

Capacity induced MCPI

Portability vs. Performance

  • A microkernel built on top of abstract hardware is portable, but it

    • Cannot exploit hardware features

    • Cannot take precautions to avoid performance problems specific to an arch

    • Incurs performance penalty due to abstract layer

Examples of non-portability

  • Same processor family

    • Address space switch implementation

      • TLB flush method preferable for 486

      • Segment register switch preferable for Pentium

        => 50% change of microkernel!

    • IPC implementation

      • Details of the cache layout (associativity) require different handling of IPC buffers on the 486 and the Pentium

  • Incompatible processors

    • Exokernel on R4000 (tagged TLB) Vs. 486 (untagged TLB)

      => Microkernels are inherently non-portable

Summary

  • Minimal set of abstractions in microkernel

  • Microkernels are processor specific (at least in implementation) and non-portable

  • The right abstractions and a processor-specific implementation lead to efficient processor-independent abstractions at higher layers

Performance

Key points

  • Goal: extensibility akin to SPIN and Exokernel goals

  • Main difference: support running several commodity operating systems on the same hardware simultaneously without sacrificing performance or functionality

  • Why?

    • Application mobility

    • Server consolidation

    • Co-located hosting facilities

    • Distributed web services

    • ….

Multiprocessor OS

  • Synchronization

  • Communication

  • Scheduling

    We have seen these issues already in the other readings in this section of the course

Key Issues

  • Modern parallel machines

    • Large system sizes stressing bottlenecks in system software (e.g. global data structures)

    • Higher memory latencies

    • NUMA effects (i.e., the symmetric assumption does not hold)

    • Cache hierarchy

      • Write sharing is expensive due to coherence traffic

      • False sharing due to large cache lines

Thesis of Tornado paper

  • In designing multiprocessor OS

    • Pay attention to locality

    • Reduce shared system data structures

    • Reduce distance between accessing processor and target memory module

Effect of global data structure – shared counter

Tornado design approach

  • Object-oriented design for scalability

    • Clustered objects

    • Protected procedure call with a view to preserving locality while ensuring concurrency

    • Semi automatic garbage collection for localizing locking

  • OS objects have multiple implementations

    • Low overhead version when scalability is not required

    • Resort to scalable implementation when performance critical

  • Optimize common case

    • Object invocation should be fast; object creation/destruction can be slower

    • Page fault handling should be fast; memory region creation/deletion can be slower
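
The clustered-object idea can be sketched with a counter that keeps one representative per processor. This shows the structure only: Python threads do not model cache coherence traffic, and the `ClusteredCounter` class and its methods are hypothetical, not Tornado code.

```python
import threading


class ClusteredCounter:
    """One representative per processor; callers touch only their local one."""

    def __init__(self, n_processors):
        self.reps = [0] * n_processors
        self.locks = [threading.Lock() for _ in range(n_processors)]

    def increment(self, cpu):
        # Common case: a lock and a counter that are local to this processor.
        with self.locks[cpu]:
            self.reps[cpu] += 1

    def value(self):
        # Rare, slower path: visit every representative.
        total = 0
        for cpu, lock in enumerate(self.locks):
            with lock:
                total += self.reps[cpu]
        return total


def worker(counter, cpu, n):
    for _ in range(n):
        counter.increment(cpu)


counter = ClusteredCounter(4)
threads = [threading.Thread(target=worker, args=(counter, cpu, 1000))
           for cpu in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = counter.value()
```

Increments, the common case, never contend across processors; only the less frequent read pays for gathering all representatives, which matches the "optimize the common case" guideline.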

Next Lecture

  • Process and Thread

    • “Cooperative Task Management Without Manual Stack Management”, Atul Adya, et al.

    • “Capriccio: Scalable Threads for Internet Services”, Rob von Behren, et al.

    • “The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors”, Thomas E. Anderson, et al.
