
Evolution of the Windows Kernel Architecture - PowerPoint PPT Presentation



Dave Probert, Ph.D. - Windows Kernel Architect Microsoft Windows Division Evolution of the Windows Kernel Architecture 08.10.2009 Buenos Aires About Me Ph.D. in Computer Engineering (Operating Systems w/o Kernels) Kernel Architect at Microsoft for over 13 years


Dave Probert, Ph.D. - Windows Kernel Architect

Microsoft Windows Division

Evolution of the Windows Kernel Architecture

08.10.2009

Buenos Aires

Copyright Microsoft Corporation


About Me

  • Ph.D. in Computer Engineering (Operating Systems w/o Kernels)

  • Kernel Architect at Microsoft for over 13 years

    • Managed platform-independent kernel development in Win2K/XP

    • Working on multi-core & heterogeneous parallel computing support

      • Architect for UMS in Windows 7 / Windows Server 2008 R2

  • Co-instigator of the Windows Academic Program

    • Providing kernel source and curriculum materials to universities

    • http://microsoft.com/WindowsAcademic or [email protected]

    • Wrote the Windows material for leading OS textbooks

      • Tanenbaum, Silberschatz, Stallings

      • Consulted on others, including a successful OS textbook in China


UNIX vs NT Design Environments

Effect on OS Design

Today’s Environment [2009]


Windows Architecture

[Diagram: layered block diagram of Windows. User mode: System Processes, Services, Applications and Environment Subsystems (Windows, OS/2, POSIX) sit above the Subsystem DLLs (Windows DLLs) and NTDLL.DLL. Processes shown: Service Control Mgr., LSASS, WinLogon, Session Manager, Services.Exe, SvcHost.Exe, WinMgt.Exe, SpoolSv.Exe, Task Manager, Explorer, User Application. Kernel mode: System Threads; System Service Dispatcher (kernel mode callable interfaces); I/O Mgr; File System Cache; Object Mgr.; Plug and Play Mgr.; Power Mgr.; Security Reference Monitor; Virtual Memory; Processes & Threads; Configuration Mgr (registry); Local Procedure Call; Windows USER, GDI; Device & File Sys. Drivers; Graphics Drivers; Kernel; Hardware Abstraction Layer (HAL). Below: hardware interfaces (buses, I/O devices, interrupts, interval timers, DMA, memory cache control, etc.)]

Kernel-mode Architecture of Windows

[Diagram: layered view of kernel mode. User mode: NT API stubs (wrap sysenter) in the system library (ntdll.dll). Kernel mode: NTOS kernel layer (Trap/Exception/Interrupt Dispatch; CPU mgmt: scheduling, synchr, ISRs/DPCs/APCs); NTOS executive layer (Procs/Threads, IPC, Object Mgr, Virtual Memory, glue, Security, Caching Mgr, I/O, Registry); Drivers (Devices, Filters, Volumes, Networking, Graphics); Hardware Abstraction Layer (HAL): BIOS/chipset details. Firmware/hardware: CPU, MMU, APIC, BIOS/ACPI, memory, devices.]

Kernel/Executive layers

  • Kernel layer (ntos/ke, ~5% of NTOS source)

    • Abstracts the CPU

      • Threads, Asynchronous Procedure Calls (APCs)

      • Interrupt Service Routines (ISRs)

      • Deferred Procedure Calls (DPCs – aka Software Interrupts)

    • Provides low-level synchronization

  • Executive layer

    • OS Services running in a multithreaded environment

    • Full virtual memory, heap, handles

    • Extensions to NTOS: drivers, file systems, network, …

NT (Native) API examples

NtCreateProcess(&ProcHandle, Access, SectionHandle, DebugPort, ExceptionPort, …)

NtCreateThread(&ThreadHandle, ProcHandle, Access, ThreadContext, bCreateSuspended, …)

NtAllocateVirtualMemory(ProcHandle, Addr, Size, Type, Protection, …)

NtMapViewOfSection(SectionHandle, ProcHandle, Addr, Size, Protection, …)

NtReadVirtualMemory(ProcHandle, Addr, Size, …)

NtDuplicateObject(srcProcHandle, srcObjHandle, dstProcHandle, dstHandle, Access, Attributes, Options)

Windows Vista Kernel Changes

  • Kernel changes mostly minor improvements

    • Algorithms, scalability, code maintainability

    • CPU timing: Uses Time Stamp Counter (TSC)

      • Interrupts not charged to threads

      • Timing and quanta are more accurate

    • Communication

      • ALPC: Advanced Lightweight Procedure Calls

      • Kernel-mode RPC

      • New TCP/IP stack (integrated IPv4 and IPv6)

    • I/O

      • Remove a context switch from I/O Completion Ports

      • I/O cancellation improvements

    • Memory management

      • Address space randomization (DLLs, stacks)

      • Kernel address space dynamically configured

    • Security: BitLocker, DRM, UAC, Integrity Levels

Windows 7 Kernel Changes

  • Miscellaneous kernel changes

    • MinWin

      • Change how Windows is built

      • Lots of DLL refactoring

      • API Sets (virtual DLLs)

    • Working-set management

      • Runaway processes quickly start reusing own pages

      • Break up kernel working-set into multiple working-sets

        • System cache, paged pool, pageable system code

    • Security

      • Better UAC, new account types, fewer BitLocker blockers

    • Energy efficiency

      • Trigger-started background services

      • Core Parking

      • Timer-coalescing, tick skipping

  • Major scalability improvements for large server apps

    • Broke apart last two major kernel locks, >64p

  • Kernel support for ConcRT

    • User-Mode Scheduling (UMS)

MinWin

  • MinWin is first step at creating architectural partitions

    • Can be built, booted and tested separately from the rest of the system

    • Higher layers can evolve independently

    • An engineering process improvement, not a microkernel NT!

  • MinWin was defined as set of components required to boot and access network

    • Kernel, file system driver, TCP/IP stack, device drivers, services

    • No servicing, WMI, graphics, audio or shell, etc, etc, etc

  • MinWin footprint:

    • 150 binaries, 25MB on disk, 40MB in-memory


MinWin Layering

[Diagram: two layers. Upper layer: Shell, Graphics, Multimedia, Layered Services, Applets, etc. Lower layer (MinWin): Kernel, HAL, TCP/IP, File Systems, Drivers, Core System Services.]


Timer Coalescing

  • Secret of energy efficiency: Go idle and Stay idle

  • Staying idle requires minimizing timer interrupts

  • Before, periodic timers had independent cycles even when period was the same

  • New timer APIs permit timer coalescing

    • Application or driver specifies tolerable delay

    • Timer system shifts timer firing
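
The shifting described above can be sketched as a small simulation. This is only an illustrative model, not the Windows implementation: the `Timer` class, the `coalesce` function and the greedy grouping rule are assumptions made for the example. In Windows 7 the tolerable delay is supplied through APIs such as SetWaitableTimerEx (for applications) and KeSetCoalescableTimer (for drivers).

```python
from dataclasses import dataclass


@dataclass
class Timer:
    name: str
    due_ms: float        # requested expiry time
    tolerance_ms: float  # tolerable delay declared by the caller


def coalesce(timers):
    """Greedily group timers so that each group fires at one instant.

    A group fires at the latest due time among its members, so no timer
    fires early, and a timer only joins a group if that instant stays
    within every member's deadline (due time + tolerance).
    """
    groups = []                    # list of (fire_time_ms, [timer names])
    current, deadline = [], None
    for t in sorted(timers, key=lambda t: t.due_ms):
        if current and t.due_ms <= deadline:
            current.append(t)
            deadline = min(deadline, t.due_ms + t.tolerance_ms)
        else:
            if current:
                groups.append((current[-1].due_ms, [x.name for x in current]))
            current, deadline = [t], t.due_ms + t.tolerance_ms
    if current:
        groups.append((current[-1].due_ms, [x.name for x in current]))
    return groups


timers = [
    Timer("A", 100, 20),
    Timer("B", 110, 5),
    Timer("C", 118, 0),
    Timer("D", 200, 50),
    Timer("E", 240, 0),
]
wakeups = coalesce(timers)  # five timers collapse into three wakeups
```

Timers with zero tolerance (C and E here) still fire exactly on time; only timers that declared slack are moved, which is why the caller, not the timer system, chooses the tolerance.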

[Figure: periodic timer events plotted against the 15.6 ms timer tick, comparing Vista with Windows 7. Credit: MarkRuss]


Broke apart the Dispatcher Lock

  • Scheduler Dispatcher lock hottest on server workloads

    • Lock protects all thread state changes (wait, unwait)

    • Very hot lock at >64x

  • Dispatcher lock broken up in Windows 7 / Server 2008 R2

    • Each object protected by its own lock

    • Many operations are lock-free
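
A minimal sketch of the "each object protected by its own lock" idea follows. The `WaitableObject` class and its fields are hypothetical and only model the locking granularity; they do not reflect the real dispatcher data structures, and the lock-free paths mentioned on the slide are not modeled.

```python
import threading


class WaitableObject:
    """A waitable object guarded by its own lock instead of a global one."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # per-object lock (hypothetical layout)
        self.signaled = False
        self.waiters = []

    def add_waiter(self, thread_id):
        """Register a waiter; returns False if the object is already signaled."""
        with self.lock:
            if self.signaled:
                return False
            self.waiters.append(thread_id)
            return True

    def signal(self):
        """Signal the object and return the list of waiters to wake."""
        with self.lock:
            self.signaled = True
            woken, self.waiters = self.waiters, []
        return woken


a, b = WaitableObject("a"), WaitableObject("b")
a.add_waiter(7)
with a.lock:
    # While a's lock is held, state changes on b are not serialized behind it.
    b.add_waiter(9)
woken = a.signal()
```

With a single global lock, the `b.add_waiter(9)` call above would have to wait for the holder of the lock; with per-object locks, unrelated waits and signals proceed independently.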

Removed PFN Lock

  • Windows tracks the state of pages in physical memory

    • In use: in working sets

    • Not assigned: on paging lists: free, modified, standby, …

  • Before, all page state changes protected by global PFN (Page Frame Number) lock

  • As of Windows 7 the PFN lock is gone

    • Pages are now locked individually

    • Improves scalability for large memory applications
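
The page states named above can be modeled as a small state machine in which each frame carries its own lock. This is a simplified sketch: the `Frame` class, the `PageState` names and the transition table are assumptions for illustration and omit states that the real memory manager tracks.

```python
import threading
from enum import Enum


class PageState(Enum):
    ACTIVE = "in a working set"
    MODIFIED = "modified list"
    STANDBY = "standby list"
    FREE = "free list"


# Transitions allowed in this simplified model (an assumption, not the
# complete Windows page-state machine).
ALLOWED = {
    (PageState.FREE, PageState.ACTIVE),       # frame allocated to a process
    (PageState.ACTIVE, PageState.MODIFIED),   # dirty page trimmed
    (PageState.ACTIVE, PageState.STANDBY),    # clean page trimmed
    (PageState.MODIFIED, PageState.STANDBY),  # contents written to disk
    (PageState.MODIFIED, PageState.ACTIVE),   # soft fault brings page back
    (PageState.STANDBY, PageState.ACTIVE),    # soft fault brings page back
    (PageState.STANDBY, PageState.FREE),      # frame repurposed
}


class Frame:
    """One physical page frame with an individual lock."""

    def __init__(self, pfn):
        self.pfn = pfn
        self.state = PageState.FREE
        self.lock = threading.Lock()  # replaces the single global lock

    def move(self, new_state):
        """Change state under this frame's lock only."""
        with self.lock:
            if (self.state, new_state) not in ALLOWED:
                raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
            self.state = new_state


frame = Frame(pfn=0)
for state in (PageState.ACTIVE, PageState.MODIFIED, PageState.STANDBY, PageState.ACTIVE):
    frame.move(state)
```

Because `move` takes only the lock of the frame it touches, state changes on different frames never contend, which is the scalability point the slide makes for large-memory workloads.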

The Silicon Power Wall

The situation:

  • Power² ∝ Clock frequency

  • Voltage ∝ Power²

    • Clock frequency and Voltage offset each other

  • Clock frequency inversely proportional to logic path length

    Bad News:

  • Power is about as low as it can go

  • Logic paths between clocked elements are pretty short

    Good News:

  • Moore’s Law continues (# transistors doubles ~22 months)

  • All that parallel computational theory is going into practice

    Transistors going into more cores, not faster cores!

    Software subject to Amdahl’s Law, not Moore’s Law

    (or Gustafson’s Law

    – if my wife can find large enough datasets she cares about)
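
To make the contrast between the two laws concrete, here is a short worked calculation. The function names and the example figures (90% parallel work, 64 cores) are chosen for illustration and do not come from the talk.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Fixed problem size: the serial part bounds the speedup."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def gustafson_speedup(parallel_fraction, cores):
    """Problem size grows with the core count: scaled speedup."""
    serial = 1.0 - parallel_fraction
    return serial + parallel_fraction * cores


p, n = 0.9, 64
fixed_size = amdahl_speedup(p, n)      # about 8.8x, and never above 10x
scaled_size = gustafson_speedup(p, n)  # about 57.7x
```

Under Amdahl's Law a program that is 90% parallel cannot exceed a 10x speedup no matter how many cores are added; Gustafson's Law gives the more optimistic figure only when the dataset grows along with the machine, which is the point of the remark about finding large enough datasets.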

Approaches to HW parallelism

Homogeneous

  • More big superscalar cores

    • Extend with private (or shared) SIMD engines (SSE on steroids)

    • (Maybe) not very energy efficient

  • A few more big cores and lots of smaller, slower, cooler cores

    • Use SIMD for performance

    • Shut off idle small cores for energy efficiency (but leakage?)

  • Lots of little fully programmable cores, all the same

    • Nobody has ever gotten this to work – more on this later

Heterogeneous

  • Programmable Accelerators (e.g. GPUs)

    • Attach loosely-coupled, specialized (non-x86), energy-efficient cores

  • Fixed-function Accelerators

    • Very energy-efficient, device-like computational units for very-specific tasks

User Mode Scheduling (UMS)

  • Improve support for efficient cooperative multithreaded scheduling of small tasks (over-decomposition)

    • Want to schedule tasks in user-mode

    • Use NT threads to simulate CPUs, multiplex tasks onto these threads

  • When a task calls into the kernel and blocks, the CPU may get scheduled to a different app

    • If a single NT thread per CPU, when it blocks it blocks.

    • Could have extra threads, but then kernel and user-mode are competing to schedule the CPU

  • Tasks run arbitrary Win32 code (but only x64/IA64)

    • Assumes running on an NT thread (TEB, kernel thread)

  • Used by ConcRT (Visual Studio 2010’s Concurrency Run-Time)

    Windows 7 User-Mode Scheduling

    • UMS breaks NT thread into two parts:

      • UT: user-mode portion (TEB, ustack, registers)

      • KT: kernel-mode portion (ETHREAD, kstack, registers)

    • Three key properties:

      • User-mode scheduler switches UTs w/o ring crossing

      • KT switch is lazy: at kernel entry (e.g. syscall, pagefault)

      • CPU returned to user-mode scheduler when KT blocks

    • KT “returns” to user-mode by queuing completion

      • User-mode scheduler schedules corresponding UT

      • (similar to scheduler activations, etc)
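
The control flow described above can be imitated with generators standing in for UTs. This is a toy model under stated assumptions: `user_thread` and `usched` are hypothetical names, a "blocking syscall" is just a yielded marker, and the kernel is simulated by completing the oldest blocked UT once nothing else is runnable. It does not use the real Win32 UMS functions (such as CreateUmsCompletionList or ExecuteUmsThread).

```python
from collections import deque


def user_thread(name, steps, log):
    """A UT: runs user-mode steps and occasionally blocks in the 'kernel'."""
    for step in steps:
        if step == "block":
            log.append(f"{name}:blocked in kernel")
            yield "block"   # KT blocks; the CPU goes back to the scheduler
        else:
            log.append(f"{name}:{step}")
            yield "ran"     # cooperative switch, no ring crossing


def usched(threads):
    """Hypothetical user-mode scheduler driven by a single primary thread."""
    runnable = deque(threads)   # UTs ready to run
    blocked = deque()           # UTs whose KT is blocked in the kernel
    completion_list = deque()   # kernel-to-user unblock notifications
    while runnable or blocked or completion_list:
        # Consume completions: unblocked UTs become runnable again.
        while completion_list:
            runnable.append(completion_list.popleft())
        if not runnable:
            # Simulated kernel: the oldest blocked KT finishes and queues
            # its completion.
            completion_list.append(blocked.popleft())
            continue
        ut = runnable.popleft()
        status = next(ut, "done")
        if status == "ran":
            runnable.append(ut)
        elif status == "block":
            blocked.append(ut)  # scheduler keeps the CPU and picks another UT
        # "done": the UT has finished and is dropped


log = []
usched([
    user_thread("A", ["a1", "block", "a2"], log),
    user_thread("B", ["b1", "b2"], log),
])
```

In the resulting log, B continues to run after A blocks, and A resumes only after its completion is consumed; that is the property that distinguishes UMS from running one plain NT thread per CPU.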

    Normal NT Threading

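    [Diagram: an x86 core running the kernel-mode scheduler; kernel threads KT0, KT1 and KT2 in the NTOS executive, connected through the trap code across the kernel/user boundary to user threads UT0, UT1 and UT2.]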

    • NT Thread is Kernel Thread (KT) and User Thread (UT)

    • UT/KT form a single logical thread representing NT thread in user or kernel

      • KT: ETHREAD, KSTACK, link to EPROCESS

      • UT: TEB, USTACK

    User-Mode Scheduling (UMS)

    [Diagram: kernel side: NTOS executive with KT0, KT1 and KT2, the primary thread, trap code and thread parking, with KT0 shown blocking. User side: the UT completion list, the user-mode scheduler, and user threads UT0 and UT1.]

    • Only primary thread runs in user-mode

    • Trap code switches to parked KT

    • KT blocks ⇒ primary returns to user-mode

    • KT unblocks & parks ⇒ queue UT completion

    UMS

    • Based on NT threads

      • Each NT thread has user & kernel parts (UT & KT)

      • When a thread becomes UMS, KT never returns to UT

        • (Well, sort of)

      • Instead, the primary thread calls the USched

    • USched

      • Switches between UTs, all in user-mode

      • When a UT enters kernel and blocks, the primary thread will hand CPU back to the USched declaring UT blocked

      • When UT unblocks, kernel queues notification

      • USched consumes notifications, marks UT runnable

    • Primary Thread

      • Self-identified by entering kernel with wrong TEB

      • So UTs can migrate between threads

      • Affinities of primaries and KTs are orthogonal issues

    UMS Thread Roles

    • Primary threads: represent CPUs

      • Normal app threads enter the USched world and become primaries

      • Primaries can also be created by UScheds to allow parallel execution

      • Primaries represent concurrent execution

    • UMS threads (UT/KTs): allow blocking in the kernel without losing the CPU

      • UMS threads represent concurrent blocking in kernel

    Thread Scheduling vs UMS

    [Figure: two panels, each showing Core 1 and Core 2. "Thread Scheduling" panel: Threads 1 to 6, with some on the cores and the rest shown as non-running threads. "Cooperative Scheduling" panel: User Threads 1 to 6 and Kernel Threads 1 to 6. Credit: MarkRuss]


    Win32 compat considerations

    Why not Win32 fibers?

    • TEB issues

      • Contains TLS and Win32-specific fields (incl. LastError)

      • Fibers run on multiple threads, so TEB state doesn’t track

  • Kernel thread issues

    • Visibility to TEB

    • I/O is queued to thread

    • Mutexes record thread owner

    • Impersonation

    • Cross-thread operations expect to find threads and IDs

    • Win32 code has thread and affinity awareness
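
The TEB problem with fibers can be demonstrated by analogy. In this sketch `threading.local` stands in for per-thread TEB state such as TLS slots and LastError, and a generator stands in for a fiber; the names `task`, `run_step` and `resume_on_new_thread` are made up for the example.

```python
import threading

tls = threading.local()  # stands in for per-thread TEB state


def task():
    """A fiber-like task: writes per-thread state, is suspended, reads it back."""
    tls.last_error = 5
    yield                                   # task is switched out here
    yield getattr(tls, "last_error", None)  # may resume on another thread


def run_step(gen, out):
    out.append(next(gen))


def resume_on_new_thread(gen):
    """Run the next step of the task on a freshly created thread."""
    out = []
    worker = threading.Thread(target=run_step, args=(gen, out))
    worker.start()
    worker.join()
    return out[0]


# Task that stays on one thread: its per-thread state is still there.
same = task()
next(same)
stayed = next(same)

# Task that migrates between threads: the state it stored is gone.
moved = task()
resume_on_new_thread(moved)
migrated = resume_on_new_thread(moved)
```

The task that stays on one thread reads back the value it stored, while the migrated task finds nothing, because the state lives with the thread rather than with the task. UMS avoids this by giving every UT its own TEB.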

    Futures: Master/Slave UMS?

    [Diagram: an x86 core running the kernel-mode scheduler and NTOS executive with KT0, KT1 and KT2, trap code and thread parking; a Syscall Request Queue and a Syscall Completion Queue link it to a remote x86 running a remote kernel and remote scheduler with UT0, UT1 and UT2.]

    • UTs (can) run on accelerators or x86s

    • KTs run on x86s, syscalls remoted/batched

    • Pagefaults are just like syscalls

    • Accelerator never “loses the CPU” (implicit primary)

    Operating Systems Futures

    • Many-core challenge

      • New driving force in software innovation:

        Amdahl’s Law overtakes Moore’s Law as high-order bit

      • Heterogeneous cores?

    • OS Scalability

      • Loosely-coupled OS: mem + cpu + services?

      • Energy efficiency

    • Shrink-wrap and Freeze-dry applications?

    • Hypervisor/Kernel/Runtime relationships

      • Move kernel scheduling (cpu/memory) into run-times?

      • Move kernel resource management into Hypervisor?

    Windows Academic Program

    • Windows Kernel Internals

      • Windows kernel in source (Windows Research Kernel – WRK)

      • Windows kernel in PowerPoint (Curriculum Resource Kit – CRK)

    • Based on Windows Server 2003 Service Pack 1

      • Latest kernel at time of release

      • First kernel release with AMD64 support

    • Joint program between Windows Product Group and MS Academic Groups

      • Program directed by Arkady Retik (Need a DVD? Have questions?)

        Information available at

    • Microsoft Academic Contacts in Buenos Aires

      Miguel Saez ([email protected]) or

      Ezequiel Glinsky ([email protected])

    Thank you very much
