
CS 2510

  1. CS 2510 OS Basics, cont’d

  2. Dynamic Memory Allocation
  • How does the system manage the memory of a single process?
  • View: each process has a contiguous logical address space

  3. Dynamic Storage Management
  • Static (compile-time) allocation is not possible for all data
    • Recursive procedures
    • Even regular procedures are hard to predict (data dependencies)
    • Complex data structures
  • Storage is used inefficiently when reserved statically
    • Must reserve enough to handle the worst case
  • Interface: ptr = allocate(x bytes); free(ptr)
  • Dynamic allocation can be handled in 2 ways
    • Stack allocation: restricted, but simple and efficient
    • Heap allocation: more general, but less efficient and harder to implement

  4. Stack Organization
  • Definition: memory is freed in the opposite order from allocation
    • Alloc(A), Alloc(B), Alloc(C); Free(C), Free(B), Free(A)
  • When is it useful?
    • Memory allocation and freeing are partially predictable
    • Allocation is hierarchical
  • Examples
    • Procedure call frames
    • Tree traversal, expression evaluation, parsing

  5. Stack Implementation
  • Advance a pointer dividing allocated and free space
    • Allocate: increment pointer
    • Free: decrement pointer
  • x86: special ‘stack pointer’ register
    • ‘SP’ (16-bit), ‘ESP’ (32-bit), ‘RSP’ (64-bit)
    • Where does this register point to?
    • How does the x86 allocate and free? The stack grows down
  • Advantages
    • Keeps all the free space contiguous
    • Simple and efficient to implement
  • Disadvantage: not appropriate for all data structures

  6. Heap Organization
  • Definition: allocate from arbitrary locations
    • Memory consists of allocated areas and free areas (or holes)
  • When is it useful?
    • Allocation and release are unpredictable
    • Arbitrary list structures, complex data organizations
    • Examples: new in C++, malloc() in C
  • Advantage: works with arbitrary allocation and free patterns
  • Disadvantage: you end up with small chunks of free space
  [Figure: a heap containing a 16-byte free hole, a 32-byte allocated block, a 12-byte free hole, and a 16-byte allocated block — how can 24 bytes be allocated?]

  7. Fragmentation
  • Definition: free memory that is too small to be usefully allocated
    • External: visible to the system
    • Internal: visible to the process (e.g. if allocation is done at some granularity)
  • Goal
    • Keep the number of holes small
    • Keep the size of holes large
  • Stack allocation: all free space is contiguous in one large region
  • How do we implement heap allocation?

  8. Heap Implementation
  • Data structure: linked list of free blocks
    • The free list tracks storage not in use
  • Allocation
    • Choose a block large enough for the request (according to policy criteria!)
    • Update pointers and the size variable
  • Free
    • Add the block to the free list
    • Merge adjacent free blocks:
      if (addr of new block == prev_addr + size) { combine blocks }
  • Project 2!!!

  9. x86 and Linux
  • Where is the heap managed? User space or the kernel?
  • The brk() system call
    • Expands or contracts the heap
    • A lot like a stack, but the heap grows up
    • Dedicated virtual address area
  • Allocated space is then managed by the heap allocator
    • Backed by page tables

  10. Best vs. First vs. Worst
  • Best fit
    • Search the whole list on each allocation
    • Choose the block that most closely matches the size of the request
    • Can stop searching on an exact match
  • First fit
    • Allocate the first block that is large enough
    • Rotating first fit: start with the next free block each time
  • Worst fit
    • Allocate the largest block to the request (most leftover space)
  • Which is best?

  11. Examples
  • The best algorithm depends on the sequence of requests
  • Example: memory contains 2 free blocks of size 20 and 15 bytes
    • Allocation requests: 10 then 20
    • Allocation requests: 8, 12, then 12

  12. Buddy Allocation
  • Fast, simple allocation for blocks of 2^n bytes (Knuth 1968)
  • Allocation restrictions
    • Block sizes are 2^n
    • Represent memory units (2^min_order) with a bitmap
  • Allocation strategy for k bytes
    • Raise the allocation request to the nearest 2^n
    • Search the free list for the appropriate size
    • Recursively divide larger blocks until reaching a block of the correct size
    • “Buddy” blocks remain free
  • Free strategy
    • Recursively coalesce a block with its buddy if the buddy is free
    • May coalesce lazily to avoid overhead

  13. Example
  • 1MB of memory
  • Allocate: 70KB, 35KB, 80KB
  • Free: 70KB, 35KB

  14. Comparison of Allocation Strategies
  • Best fit
    • Tends to leave some very large holes and some very small ones
    • Disadvantage: very small holes can’t be used easily
  • First fit
    • Tends to leave “average”-size holes
    • Advantage: faster than best fit
  • Buddy
    • Organizes memory to minimize external fragmentation
    • Leaves large chunks of free space
    • Faster to find a hole of the appropriate size
    • Disadvantage: internal fragmentation when the request is not a power of 2

  15. Memory Allocation in Practice
  • malloc() in C
    • Calls sbrk() to request more contiguous memory
    • Adds a small header to each block of memory
      • Pointer to the next free block, or size of the block
      • Where must the header be placed?
  • Combination of two data structures
    • A separate free list for each popular size
      • Allocation is fast, no fragmentation
      • Inefficient if some lists are empty while others have lots of free blocks
    • First fit on a list of irregular free blocks
    • Combine blocks and shuffle blocks between lists

  16. Reclaiming Free Memory
  • When can dynamically allocated memory be freed?
    • Easy when a chunk is only used in one place: explicitly call free()
    • Hard when information is shared: it can’t be recycled until all sharers are finished
    • Sharing is indicated by the presence of pointers to the data (without a pointer, the data can’t be accessed or found)
  • Two possible problems
    • Dangling pointers: recycling storage while it is still being used
    • Memory leaks: forgetting to free storage even when it can’t be used again
      • Not a problem for short-lived user processes
      • An issue for the OS and long-running applications

  17. Reference Counts
  • Idea
    • Keep track of the number of references to each chunk of memory
    • When the reference count reaches zero, free the memory
  • Examples
    • Files and hard links in Unix
    • Smalltalk
    • Objects in distributed systems
    • The Linux kernel
  • Disadvantage
    • Circular data structures -> memory leaks

  18. Garbage Collection
  • Idea
    • Storage isn’t freed explicitly (i.e. there is no free() operation)
    • Storage is freed implicitly when it is no longer referenced
  • Approach
    • When the system needs storage, examine memory and collect what is free
  • Advantages
    • Works with circular data structures
    • Makes life easier for the application programmer

  19. Mark and Sweep Garbage Collection
  • Requirements
    • Must be able to find all objects
    • Must be able to find all pointers to objects
    • The compiler must cooperate by marking the type of data in memory. Why?
  • Two passes
    • Pass 1: Mark
      • Start with all statically allocated variables (where?) and procedure-local variables (where?)
      • Mark each object, recursively marking all objects reachable via pointers
    • Pass 2: Sweep
      • Go through all objects; free those not marked

  20. Garbage Collection in Practice
  • Disadvantages
    • Garbage collection is often expensive: 20% or more of CPU time
    • Difficult to implement
  • Languages with garbage collection
    • LISP (emacs)
    • Java/C#
    • Scripting languages
  • Conservative garbage collection
    • Idea: treat all memory as potential pointers (what does this mean?)
    • Can be used for C and C++

  21. I/O Devices
  • Two primary aspects of a computer system
    • Processing (CPU + memory)
    • Input/Output
  • Role of the operating system
    • Provide a consistent interface: simplify access to hardware devices
    • Implement mechanisms for interacting with devices
    • Allocate and manage resources: protection, fairness
    • Obtain efficient performance
      • Understand the performance characteristics of the device
      • Develop policies

  22. I/O Subsystem
  [Diagram: a user process calls into the kernel I/O subsystem; beneath it sit the device drivers (software), then the device controllers and the devices themselves (hardware) — e.g. hard disk, mouse, GPU, keyboard, attached over buses such as SCSI and PCI.]

  23. User View of I/O
  • User processes cannot have direct access to devices
    • Manage resources fairly
    • Protect data from access-control violations
    • Protect the system from crashing
  • The OS exports higher-level functions
    • User processes perform system calls (e.g. read() and write())
  • Blocking vs. nonblocking I/O
    • Blocking: suspends execution of the process until the I/O completes
      • Simple and easy to understand, but inefficient
    • Nonblocking: returns from the system call immediately
      • The process is notified when the I/O completes
      • Complex, but better performance

  24. User View: Types of Devices
  • Character-stream
    • Transfers one byte (character) at a time
    • Interface: get() or put(), implemented as restricted forms of read()/write()
    • Examples: keyboard, mouse, modem, console
  • Block
    • Transfers blocks of bytes as a unit (defined by the hardware)
    • Interface: read() and write()
    • Random access: seek() specifies which bytes to transfer next
    • Examples: disks and tapes

  25. Kernel I/O Subsystem
  • I/O is scheduled from a pool of requests
    • Requests are rearranged to optimize efficiency
    • Example: disk requests are reordered to reduce head seeks
  • Buffering
    • Deals with different transfer rates
    • Adjustable transfer sizes: fragmentation and reassembly
    • Copy semantics: can the calling process reuse the buffer immediately?
  • Caching: avoid device accesses as much as possible
    • I/O is SLOW
    • Block devices can read ahead

  26. Device Drivers
  • Encapsulate the details of the device
    • Wide variety of I/O devices (different manufacturers and features)
    • Kernel I/O subsystem is not aware of hardware details
  • Loaded at boot time or on demand
  • IOCTLs: special UNIX system call (I/O control)
    • Alternative to adding a new system call
    • Interface between user processes and device drivers
    • Device-specific operations
    • Looks like a system call, but also takes a file descriptor argument. Why?

  27. Device Driver: Device Configuration
  • Interacts directly with the device controller
  • Special instructions
    • Valid only in kernel mode
    • x86: in/out instructions (no longer popular)
  • Memory-mapped I/O
    • Read and write operations on special memory regions
    • How are memory operations delivered to the controller?
    • The OS protects the interfaces by not mapping the memory into user processes
    • Some devices can map subsets of their I/O space to processes
      • Buffer queues (e.g. network cards)

  28. Interacting with Device Controllers
  • How do we know when I/O is complete?
  • Polling
    • Disadvantage: busy waiting
      • CPU cycles are wasted when I/O is slow
      • Often need to be careful with timing
  • Interrupts
    • Goal: enable asynchronous events
    • The device signals the CPU by asserting the interrupt request line
    • The CPU automatically jumps to the Interrupt Service Routine (ISR)
      • Interrupt vector: a table of ISR addresses, indexed by interrupt number
    • Lower-priority interrupts are postponed until higher-priority ones finish
    • Interrupts can nest
    • Disadvantage: interrupts “interrupt” processing (interrupt storms)

  29. Device Driver: Data Transfer
  • Programmed I/O (PIO)
    • CPU initiates the operation and reads in every byte/word of data
  • Direct Memory Access (DMA)
    • Offloads data transfer work to a special-purpose processor
    • CPU configures the DMA transfer
      • Writes a DMA command block into main memory (target addresses and transfer sizes)
      • Gives the command block address to the DMA engine
    • The DMA engine transfers data from the device to the memory specified in the command block
    • The DMA engine raises an interrupt when the entire transfer is complete
    • Virtual or physical addresses?
