VM Design Issues

VM Design Issues. Vivek Pai / Kai Li Princeton University. Mini-Gedankenexperimenten. What’s the refresh rate of your monitor? What is the access time of a hard drive? What response time determines sluggishness or speediness? What’s the relation?


VM Design Issues

Vivek Pai / Kai Li

Princeton University

Mini-Gedankenexperimenten
  • What’s the refresh rate of your monitor?
  • What is the access time of a hard drive?
  • What response time determines sluggishness or speediness? What’s the relation?
  • What determines the running speed of a program that’s paging heavily?
  • If you have a program that pages heavily, what are your options to improve the situation?
Mechanics
  • Let’s finish off last lecture
  • Memory mapping, Unified VM next time
    • No assigned reading yet, may not exist
  • Mid-term on track
    • Covers everything before it
  • Open Q&A session?
    • Is there interest?
    • If so, when?
Where We Left Off Last Time
  • Various approaches to evicting pages
  • Some discussion about why doing even “well” is hard to implement
  • Belady’s algorithm for off-line analysis
  • We just finished variations on FIFO
    • In particular, enhanced FIFO with 2nd chance
Lessons From Enhanced FIFO
  • Observation: it’s easier to evict a clean page than a dirty page
  • 2nd observation: sometimes the disk and CPU are idle
  • Optimization: when system’s free, write dirty pages back to disk, but don’t evict
  • Called flushing – often falls to pager daemon
Least Recently Used (LRU)
  • Algorithm
    • Replace page that hasn’t been used for the longest time
  • Question
    • What hardware mechanisms are required to implement LRU?
Implementing LRU
  • Perfect
    • Use a timestamp on each reference
    • Keep a list of pages ordered by time of reference

[Figure: pages 5, 3, 4, 7, 9, 11, 2, 1, 15 in a list whose ends are labeled least recently used and most recently used]
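
The "perfect" scheme of keeping pages ordered by time of reference can be sketched in a few lines of Python. This is only an illustration: the class name `LRUPages` and the frame count are made up, and a real system cannot afford to update a list on every memory reference, which is why the later slides approximate it.

```python
from collections import OrderedDict

class LRUPages:
    """Exact LRU over a fixed number of frames (illustrative only)."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = OrderedDict()  # oldest reference first, newest last

    def reference(self, page):
        """Touch a page; return the evicted page, or None if nothing was evicted."""
        if page in self.frames:
            self.frames.move_to_end(page)  # page becomes the most recently used
            return None
        victim = None
        if len(self.frames) >= self.num_frames:
            victim, _ = self.frames.popitem(last=False)  # least recently used
        self.frames[page] = True
        return victim

lru = LRUPages(num_frames=3)
evicted = [lru.reference(p) for p in [1, 2, 3, 1, 4, 2]]
print(evicted)           # eviction result for each reference
print(list(lru.frames))  # resident pages, least recently used first
```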

Approximate LRU

[Figure: approximations of LRU, with pages ordered from most recently used to least recently used]
  • LRU: N categories, pages in order of last reference
  • Crude LRU: 2 categories, pages referenced since the last page fault and pages not referenced since the last page fault
  • 8-bit count: 256 categories, numbered 0, 1, 2, 3, ..., 254, 255

Aging: Not Frequently Used (NFU)

[Figure: 8-bit aging counters, one row per page, over five successive ticks]
  00000000  00000000  10000000  01000000  10100000
  00000000  10000000  01000000  10100000  01010000
  10000000  11000000  11100000  01110000  00111000
  00000000  00000000  00000000  10000000  01000000

  • Algorithm
    • Shift reference bits into counters
    • Pick the page with the smallest counter
  • Main difference between NFU and LRU?
    • NFU has a short history (counter length)
  • How many bits are enough?
    • In practice 8 bits are quite good
  • Pros: Requires only one reference bit
  • Cons: Requires looking at all counters
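
The shift-and-pick step can be tried out with a small Python sketch. The page names and the reference pattern below are invented for the example; only the shifting rule and the "smallest counter" rule come from the slide.

```python
def age(counters, referenced, bits=8):
    """One tick: shift each counter right and insert the reference bit on the left."""
    top = 1 << (bits - 1)
    return {
        page: (count >> 1) | (top if page in referenced else 0)
        for page, count in counters.items()
    }

def pick_victim(counters):
    """Pick the page with the smallest counter (this looks at every counter)."""
    return min(counters, key=counters.get)

counters = {"a": 0, "b": 0, "c": 0}
# Hypothetical reference pattern: the set of pages referenced during each tick.
for referenced in [{"a", "c"}, {"c"}, {"b", "c"}]:
    counters = age(counters, referenced)

history = {page: format(count, "08b") for page, count in counters.items()}
victim = pick_victim(counters)
print(history, victim)
```

Page "a" was referenced only in the oldest tick, so its single 1 bit has been shifted furthest right and it ends up with the smallest counter.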

Where Do We Get Storage?
  • 32 bit VA to 32 bit PA – no space, right?
    • Offset within page is the same
  • No need to store offset
    • 4KB page = 12 bits of offset
    • Those 12 bits are “free” in PTE
  • Page # + other info <= 32 bits
    • Makes storing info easy
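
A minimal sketch of the bit arithmetic, assuming 32-bit addresses and 4KB pages as on the slide. The function names and the example flag value are made up for illustration and do not correspond to any particular hardware layout.

```python
PAGE_SHIFT = 12                      # 4KB page = 12 bits of offset
OFFSET_MASK = (1 << PAGE_SHIFT) - 1  # 0xFFF

def split(va):
    """Split a 32-bit virtual address into (virtual page number, offset)."""
    return va >> PAGE_SHIFT, va & OFFSET_MASK

def make_pte(frame, flags):
    """Pack a 20-bit frame number with up to 12 bits of other info."""
    assert 0 <= frame < (1 << 20) and 0 <= flags <= OFFSET_MASK
    return (frame << PAGE_SHIFT) | flags

def translate(pte, va):
    """Physical address = frame number from the PTE plus the unchanged offset."""
    return (pte & ~OFFSET_MASK) | (va & OFFSET_MASK)

vpn, offset = split(0x12345678)
pte = make_pte(frame=0xABCDE, flags=0x003)  # flag bits here are arbitrary
pa = translate(pte, 0x12345678)
print(hex(vpn), hex(offset), hex(pte), hex(pa))
```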
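x86 Page Table Entry
  • Bits 31 to 12: page frame number
  • Low-order bits, in the order labeled on the slide: U, P, Cw, Gl, L, D, A, Cd, Wt, O, W, V
    • V: Valid
    • W: Writable
    • O: Owner (user/kernel)
    • Wt: Write-through
    • Cd: Cache disabled
    • A: Accessed (referenced)
    • D: Dirty
    • L: PDE maps 4MB
    • Gl: Global
    • U, P, Cw: not described in the legend; the slide also carries a "Reserved" label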

What Happens on Diagonal Lines
  • My screen is 1024*768 pixels
    • 256 colors = 1 byte per pixel = .75MB
    • 64K colors = 2 bytes/pixel = 1.5MB
    • Page size is 4KB
    • Screen is 192 or 384 pages
  • 1 page = several horizontal lines
  • Diagonal/vertical lines = TLB badness
  • “Superpages” to the rescue
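
The page counts above can be checked with a short calculation. The sketch assumes a row-major, page-aligned frame buffer, which the slide does not state explicitly; the function name is made up.

```python
WIDTH, HEIGHT, PAGE_SIZE = 1024, 768, 4096

def screen_stats(bytes_per_pixel):
    """Return (total pages, scan lines per page, pages touched by a vertical
    line, pages touched by a horizontal line)."""
    line_bytes = WIDTH * bytes_per_pixel
    total_pages = line_bytes * HEIGHT // PAGE_SIZE
    lines_per_page = PAGE_SIZE // line_bytes
    # A vertical line touches one pixel on every scan line.
    vertical = -(-HEIGHT // lines_per_page)   # ceiling division
    # A horizontal line stays within a single scan line.
    horizontal = -(-line_bytes // PAGE_SIZE)  # ceiling division
    return total_pages, lines_per_page, vertical, horizontal

stats_256_colors = screen_stats(1)
stats_64k_colors = screen_stats(2)
print(stats_256_colors, stats_64k_colors)
```

Under these assumptions a full-height vertical line touches every page of the screen, while a horizontal line touches one, which is why vertical and diagonal drawing puts so much more pressure on the TLB.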
The Big Picture
  • We’ve talked about single evictions
  • Most computers are multiprogrammed
    • Single eviction decision still needed
    • New concern – allocating resources
    • How to be “fair enough” and achieve good overall throughput
  • This is a competitive world – local and global resource allocation decisions
Program Behaviors
  • 80/20 rule
    • > 80% memory references are made by < 20% of code
  • Locality
    • Spatial and temporal
  • Working set
    • Keeping a set of pages in memory would avoid a lot of page faults

[Figure: # page faults plotted against # pages in memory, with the working set labeled]

Observations re Working Set
  • Working set isn’t static
  • There often isn’t a single “working set”
    • Multiple plateaus in previous curve
    • Program coding style affects working set
  • Working set is hard to gauge
    • What’s the working set of an interactive program?
Working Set
  • Main idea
    • Keep the working set in memory
  • An algorithm
    • On a page fault, scan through all pages of the process
    • If the reference bit is 1, record the current time for the page
    • If the reference bit is 0, check the “last use time”
      • If the page has not been used within d, replace the page
      • Otherwise, go to the next
    • Add the faulting page to the working set
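
The scan can be sketched as follows. Two details are assumptions rather than part of the slide: the reference bit is cleared once its time has been recorded, and if no page is old enough the working set simply grows. The names `Page` and `on_page_fault` are made up.

```python
from dataclasses import dataclass

@dataclass
class Page:
    referenced: bool = False
    last_use: int = 0

def on_page_fault(resident, faulting, now, d):
    """Scan resident pages, replace the first one unused for longer than d,
    then add the faulting page. Returns the replaced page name or None."""
    victim = None
    for name, page in resident.items():
        if page.referenced:
            page.last_use = now
            page.referenced = False  # assumption: clear the bit after recording
        elif now - page.last_use > d:
            victim = name            # not used within d: replace this page
            break
    if victim is not None:
        del resident[victim]
    resident[faulting] = Page(referenced=True, last_use=now)
    return victim

resident = {
    "a": Page(referenced=True, last_use=10),
    "b": Page(referenced=False, last_use=95),
    "c": Page(referenced=False, last_use=40),
}
victim = on_page_fault(resident, "d", now=100, d=20)
print(victim, sorted(resident))
```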
WSClock Paging Algorithm
  • Follow the clock hand
  • If the reference bit is 1, set reference bit to 0, set the current time for the page and go to the next
  • If the reference bit is 0, check “last use time”
    • If page has been used within d, go to the next
    • If page hasn’t been used within d and modify bit is 1
      • Schedule the page for page out and go to the next
    • If page hasn’t been used within d and modify bit is 0
      • Replace this page
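
A minimal sketch of one sweep of the clock hand, following the rules above. The `Frame` fields, the return convention, and the choice to give up after one full sweep are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    page: str
    referenced: bool
    modified: bool
    last_use: int

def wsclock(frames, hand, now, d):
    """Advance the hand until a replaceable frame is found.

    Returns (victim index or None, new hand position, pages scheduled for page-out).
    """
    scheduled = []
    for _ in range(len(frames)):  # assumption: stop after one full sweep
        frame = frames[hand]
        if frame.referenced:
            frame.referenced = False
            frame.last_use = now
        elif now - frame.last_use > d:
            if frame.modified:
                scheduled.append(frame.page)  # schedule page-out, keep going
            else:
                return hand, (hand + 1) % len(frames), scheduled
        hand = (hand + 1) % len(frames)
    return None, hand, scheduled

frames = [
    Frame("A", referenced=True, modified=False, last_use=90),
    Frame("B", referenced=False, modified=False, last_use=95),
    Frame("C", referenced=False, modified=True, last_use=10),
    Frame("D", referenced=False, modified=False, last_use=20),
]
victim, hand, scheduled = wsclock(frames, hand=0, now=100, d=30)
print(victim, hand, scheduled)
```

In this run A gets its reference bit cleared, B is skipped because it was used recently, C is old but dirty and is only scheduled for page-out, and D is old and clean and becomes the victim.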
Simulating Modify Bit with Access Bits
  • Set pages read-only if they are read-write
  • Use a reserved bit to remember if the page is really read-only
  • On a write fault (a write to a page marked read-only)
    • If it is not really read-only, then record a modify in the data structure and change it to read-write
    • Restart the instruction
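
The trick can be sketched with a toy page table entry. The field and function names are hypothetical; the fault handled here is the one raised by a write to a page that was deliberately marked read-only.

```python
from dataclasses import dataclass

@dataclass
class PTE:
    writable: bool          # what the hardware checks
    really_read_only: bool  # the reserved bit: the page's true protection
    modified: bool = False  # modify bit maintained by software

def write_protect(pte):
    """Mark a read-write page read-only so that the first write traps."""
    pte.writable = False

def on_write_fault(pte):
    """Return True if the faulting instruction should be restarted."""
    if pte.really_read_only:
        return False        # genuine protection violation
    pte.modified = True     # record the modify in the data structure
    pte.writable = True     # change the page back to read-write
    return True

data = PTE(writable=True, really_read_only=False)
write_protect(data)
restart_data = on_write_fault(data)

code = PTE(writable=False, really_read_only=True)
restart_code = on_write_fault(code)
print(restart_data, data.modified, restart_code, code.modified)
```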
Implementing LRU without Reference Bit
  • Some machines have no reference bit
    • VAX, for example
  • Use the valid bit or access bit to simulate
    • Clear all valid bits (even if the pages are really valid)
    • Use a reserved bit to remember if a page is really valid
    • On a page fault
      • If it is a valid reference, set the valid bit and place the page in the LRU list
      • If it is an invalid reference, do the page replacement
      • Restart the faulting instruction
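
The same idea in a toy form, assuming a dictionary as the page table and an ordered dictionary as the LRU list. All names are invented, and the page replacement itself is left out.

```python
from collections import OrderedDict
from dataclasses import dataclass

@dataclass
class PTE:
    valid: bool         # what the hardware checks
    really_valid: bool  # reserved bit: the page really is in memory

lru_list = OrderedDict()  # page that faulted longest ago comes first

def clear_valid_bits(table):
    """Make every page fault on its next reference."""
    for pte in table.values():
        pte.valid = False

def on_page_fault(table, page):
    """Return 'soft' for a simulated reference, 'replace' for a real miss."""
    pte = table[page]
    if pte.really_valid:
        pte.valid = True           # later references no longer fault
        lru_list[page] = True
        lru_list.move_to_end(page) # most recently referenced goes last
        return "soft"
    return "replace"               # the real page replacement would go here

table = {
    "a": PTE(valid=True, really_valid=True),
    "b": PTE(valid=True, really_valid=True),
    "c": PTE(valid=False, really_valid=False),
}
clear_valid_bits(table)
results = [on_page_fault(table, p) for p in ["b", "a", "c"]]
print(results, list(lru_list))
```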
Demand Paging
  • Pure demand paging relies only on faults to bring in pages
  • Problems?
    • Possibly lots of faults at startup
    • Ignores spatial locality
  • Remedies
    • Loading groups of pages per fault
    • Prefetching/preloading
Speed and Sluggishness
  • Slow is > .1 seconds (100 ms)
  • Speedy is << .1 seconds
  • Monitors tend to be 60+ Hz = <16.7ms between screen paints

  • Disks have seek + rotational delay
    • Seek is somewhere between 7-16 ms
    • At 7200rpm, one rotation = 1/120 sec ≈ 8ms. Half-rotation is ≈ 4ms
  • Conclusion? One disk access OK, six are bad
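
The conclusion follows from simple arithmetic, sketched below. The model is an assumption: each access costs one seek plus half a rotation, transfer time is ignored, and the accesses are independent.

```python
SLOW_MS = 100.0        # "slow" threshold from the slide
SEEK_MS = (7.0, 16.0)  # seek range from the slide
RPM = 7200

rotation_ms = 60_000 / RPM          # one full rotation
half_rotation_ms = rotation_ms / 2  # average rotational delay

def access_ms(n):
    """(best, worst) total time in ms for n independent disk accesses."""
    return tuple(n * (seek + half_rotation_ms) for seek in SEEK_MS)

one = access_ms(1)
six = access_ms(6)
print(one, six, six[1] > SLOW_MS)
```

One access costs roughly 11 to 20 ms, close to a single screen repaint, while six accesses cost roughly 67 to 121 ms, which approaches or exceeds the 100 ms point at which a response feels slow.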
Disk Address
  • Use physical memory as a cache for disk
  • Where to find a page on a page fault?
    • PPage# field is a disk address

[Figure: virtual address space mapped onto physical memory, with some entries marked invalid]

Imagine a Global LRU
  • Global – across all processes
  • Idea – when a page is needed, pick the oldest page in the system
  • Problems? Process mixes?
    • Interactive processes
    • Active large-memory sweep processes
  • Mitigating damage?
Amdahl’s Law
  • Gene Amdahl (IBM, then Amdahl)
  • Noticed the bottlenecks to speedup
  • Assume speedup affects one component
  • New time = (1 - affected) + affected/speedup, with the old time normalized to 1

  • In other words, diminishing returns
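
A small worked example of the diminishing returns, reading "affected" as the fraction of the original run time that benefits from the speedup. The 80% figure and the speedup factors are arbitrary choices for illustration.

```python
def new_time(affected, speedup):
    """Run time relative to the original after speeding up only the affected part."""
    return (1 - affected) + affected / speedup

def overall_speedup(affected, speedup):
    """Whole-program speedup implied by new_time."""
    return 1 / new_time(affected, speedup)

# Speed up 80% of the work by 2x, 10x and 100x.
times = [new_time(0.8, s) for s in (2, 10, 100)]
gains = [overall_speedup(0.8, s) for s in (2, 10, 100)]
print(times, gains)
```

However large the speedup of the affected part, the unaffected 20% remains, so the new time never drops below 0.2 and the overall gain stays below 5x.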
NT x86 Virtual Address Space Layouts

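[Figure: two NT x86 virtual address space layouts]
  • First layout
    • 00000000 to 7FFFFFFF: application code, globals, per-thread stacks, DLL code
    • From 80000000: kernel & exec, HAL, boot drivers
    • From C0000000: process page tables, hyperspace
    • From C0800000 to FFFFFFFF: system cache, paged pool, nonpaged pool
  • Second layout
    • 00000000 to BFFFFFFF: 3-GB user space
    • C0000000 to FFFFFFFF: 1-GB system space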

Virtual Address Space in Win95 and Win98

[Figure: Win95 and Win98 address space regions]
  • 00000000 to 7FFFFFFF: user accessible, unique per process (per application), user mode
  • From 80000000: shared, process-writable (DLLs, shared memory, Win16 applications), systemwide, user mode
  • From C0000000 to FFFFFFFF: operating system (Ring 0 components), systemwide, kernel mode

Details with VM Management
  • Create a process’s virtual address space
    • Allocate page table entries (reserve in NT)
    • Allocate backing store space (commit in NT)
    • Put related info into PCB
  • Destroy a virtual address space
    • Deallocate all disk pages (decommit in NT)
    • Deallocate all page table entries (release in NT)
    • Deallocate all page frames
Page States (NT)
  • Active: Part of a working set and a PTE points to it
  • Transition: I/O in progress (not in any working sets)
  • Standby: Was in a working set, but removed. A PTE points to it, not modified and invalid.
  • Modified: Was in a working set, but removed. A PTE points to it, modified and invalid.
  • Modified no write: Same as modified but no write back
  • Free: Free with non-zero content
  • Zeroed: Free with zero content
  • Bad: hardware errors
Dynamics in NT VM

[Figure: page movement among the process working set and the modified, standby, free, zero, and bad lists. Transitions labeled on the slide: demand zero fault, page in or allocation, “soft” faults, working set replacement, modified writer, zero thread]

Shared Memory
  • How to destroy a virtual address space?
    • Link all PTEs
    • Reference count
  • How to swap out/in?
    • Link all PTEs
    • Operation on all entries
  • How to pin/unpin?
    • Link all PTEs
    • Reference count

[Figure: page tables of Process 1 and Process 2, with entries marked w pointing to the same physical pages]

Copy-On-Write
  • Child’s virtual address space uses the same page mapping as parent’s
  • Make all pages read-only
  • Make child process ready
  • On a read, nothing happens
  • On a write, generates an access fault
    • map to a new page frame
    • copy the page over
    • restart the instruction

[Figure: page tables of the parent process and the child process, with entries marked r pointing to the same physical pages]
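
The copy-on-write steps can be sketched with toy page tables. The classes and the per-frame reference count are assumptions for illustration (the reference count is borrowed from the shared memory slide); a real kernel does this inside its fault handler.

```python
class Frame:
    """A physical page frame with a count of page tables that map it."""
    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1

class AddressSpace:
    def __init__(self):
        self.table = {}  # page number -> [frame, writable]

    def fork(self):
        """Child shares the parent's mapping; all pages become read-only."""
        child = AddressSpace()
        for page, entry in self.table.items():
            entry[1] = False
            entry[0].refs += 1
            child.table[page] = [entry[0], False]
        return child

    def write(self, page, offset, value):
        entry = self.table[page]
        if not entry[1]:                      # access fault on a read-only page
            frame = entry[0]
            if frame.refs > 1:                # still shared: copy it
                frame.refs -= 1
                entry[0] = Frame(frame.data)  # new frame holding a copy
            entry[1] = True
        entry[0].data[offset] = value         # "restart" the write

    def read(self, page):
        return bytes(self.table[page][0].data)  # reads never fault

parent = AddressSpace()
parent.table[0] = [Frame(b"abcd"), True]
child = parent.fork()
shared_after_fork = child.table[0][0] is parent.table[0][0]
child.write(0, 0, ord("X"))
print(shared_after_fork, parent.read(0), child.read(0))
```

Once the child has taken its private copy, the parent is the only user of the original frame, so a later write by the parent only needs to flip the page back to writable.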

Issues of Copy-On-Write
  • How to destroy an address space
    • Same as shared memory case?
  • How to swap in/out?
    • Same as shared memory
  • How to pin/unpin
    • Same as shared memory