L6: Malloc Lab Writing a Dynamic Storage Allocator October 30 , 2006

15-213 “The course that gives CMU its Zip!”

  • Topics
    • Memory Allocator (Heap)
    • L6: Malloc Lab
  • Reminders
    • L6: Malloc Lab Due Nov 10, 2006

Section A (Donnie H Kim) recitation8.ppt (some slides from lecture notes)

L6: Malloc Lab
  • Things that matter in this lab:
    • Performance goal
      • Maximizing throughput
      • Maximizing memory utilization
    • Implementation Issues (Design Space)
      • Free Block Organization
      • Placement Policy
      • Splitting
      • Coalescing
    • And some advice
So what is memory allocation?
  • Process memory layout, from high addresses to low:
    • kernel virtual memory (memory invisible to user code)
    • stack (%esp points to its top)
    • memory mapped region for shared libraries
    • run-time heap (via malloc), whose top is marked by the “brk” ptr
    • uninitialized data (.bss)
    • initialized data (.data)
    • program text (.text)
    • address 0
  • Allocators request additional heap memory from the operating system using the sbrk function.
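To make the role of the “brk” pointer concrete, here is a minimal sketch of a simulated sbrk over a fixed array. The names sim_sbrk, heap, brk_ptr and the MAX_HEAP limit are assumptions made for this illustration; they are not the lab's actual memory-system code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HEAP (1 << 16)      /* hypothetical heap limit for this sketch */

static char heap[MAX_HEAP];     /* stands in for the run-time heap region */
static char *brk_ptr = heap;    /* the "brk" pointer: first byte past the heap */

/* Grow the simulated heap by incr bytes.
 * Returns the old break (start of the new area), or (void *)-1 on failure. */
void *sim_sbrk(size_t incr)
{
    assert(brk_ptr >= heap && brk_ptr <= heap + MAX_HEAP);  /* invariant */
    if (incr > (size_t)(heap + MAX_HEAP - brk_ptr))
        return (void *)-1;      /* not enough room left */
    char *old_brk = brk_ptr;
    brk_ptr += incr;
    return old_brk;
}
```

An allocator built on this would call sim_sbrk only when no existing free block can satisfy a request.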

Malloc Package
  • #include <stdlib.h>
  • void *malloc(size_t size)
    • If successful:
      • Returns a pointer to a memory block of at least size bytes, (typically) aligned to an 8-byte boundary.
      • If size == 0, returns NULL (the C standard also allows a unique non-NULL pointer here)
    • If unsuccessful: returns NULL (0) and sets errno.
  • void free(void *p)
    • Returns the block pointed at by p to the pool of available memory
    • p must come from a previous call to malloc or realloc.
  • void *realloc(void *p, size_t size)
    • Changes size of block p and returns pointer to new block.
    • Contents of new block unchanged up to min of old and new size.
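A short usage sketch of the three calls above. The helper name build_greeting and the string it builds are made up for this example; only malloc, realloc and free come from the slide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "hello, world" on the heap, growing the
 * buffer with realloc. Returns NULL on allocation failure.
 * The caller owns the result and must free() it. */
char *build_greeting(void)
{
    char *buf = malloc(6);               /* room for "hello" plus '\0' */
    if (buf == NULL)
        return NULL;
    strcpy(buf, "hello");

    /* realloc may move the block; the old contents are preserved. */
    char *bigger = realloc(buf, 13);     /* room for "hello, world" plus '\0' */
    if (bigger == NULL) {
        free(buf);                       /* on failure the old block is still valid */
        return NULL;
    }
    strcat(bigger, ", world");
    assert(strlen(bigger) == 12);        /* sanity check on the final length */
    return bigger;
}
```

Assigning the result of realloc to a second pointer avoids losing the original block when realloc fails.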
Allocation Examples

p1 = malloc(4)

p2 = malloc(5)

p3 = malloc(6)

free(p2)

p4 = malloc(2)

Performance goals
  • Maximizing throughput (Temporal)
    • Defined as the number of requests that it completes per unit time
  • Maximizing Memory Utilization (Spatial)
    • Defined as the ratio of the requested memory size to the actual memory size used

There is a tension between maximizing throughput and maximizing utilization! Find an appropriate balance between the two goals!

  • Keep this in mind; we will come back to these issues
Implementation Issues
  • Free Block Organization
    • How do we keep track of the free blocks?
    • How do we know how much memory to free just given a pointer?
  • Placement Policy
    • How do we choose an appropriate free block?
  • Splitting
    • What do we do with the extra space when allocating a structure that is smaller than the free block it is placed in?
  • Coalescing
    • How do we reinsert a freed block?

(Figure: a heap containing block p0, shown for free(p0) followed by p1 = malloc(1))

Implementation Issues 1: Free Block Organization
  • Identifying which block is free or allocated
    • Available design choices of how to manage free blocks
      • Implicit List
      • Explicit List
      • Segregated List
    • Header, Footer organization
      • storing information about the block (size, allocated or free status)
Keeping Track of Free Blocks
  • Method 1: Implicit list using lengths -- links all blocks
  • Method 2: Explicit list among the free blocks using pointers within the free blocks
  • Method 3: Segregated free list
    • Different free lists for different size classes
  • Method 4: Blocks sorted by size
    • Can use a balanced tree (e.g. Red-Black tree) with pointers within each free block, and the length used as a key

(Figure: heap diagrams, each showing blocks of sizes 5, 4, 6, 2)

Free Block Organization
  • Free Block with header
    • Format of allocated and free blocks: a 1-word header holding size and a, then the payload, then optional padding
    • a = 1: allocated block
    • a = 0: free block
    • size: block size
    • payload: application data (allocated blocks only)
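The header format above can be sketched as one word that packs the size and the allocated bit together. This follows the textbook-style convention of keeping block sizes at multiples of 8 so the low 3 bits are free for flags; that convention and the helper name make_header are assumptions of this sketch.

```c
#include <assert.h>

/* Pack a block size and an allocated bit into one header word.
 * Assumes sizes are multiples of 8, so the low 3 bits are unused. */
#define PACK(size, alloc)  ((size) | (alloc))
#define GET_SIZE(word)     ((word) & ~0x7u)   /* mask off the flag bits */
#define GET_ALLOC(word)    ((word) & 0x1u)    /* lowest bit is "a" */

/* Hypothetical helper that validates its inputs before packing. */
unsigned int make_header(unsigned int size, unsigned int alloc)
{
    assert(size % 8 == 0);   /* low bits must be zero */
    assert(alloc <= 1);
    return PACK(size, alloc);
}
```

For example, a 24-byte allocated block gets the header value 25 (24 with the low bit set).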

Free Block Organization
  • Free Block with Header and Footer
    • Format of allocated and free blocks: header (size, a), then payload and padding, then boundary tag (footer) repeating size and a
    • a = 1: allocated block
    • a = 0: free block
    • size: total block size
    • payload: application data (allocated blocks only)
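A sketch of how a footer lets the allocator step backwards as well as forwards. The macro names mirror the textbook-style conventions, 4-byte tags are assumed, and write_tags is a hypothetical helper for this example.

```c
#include <assert.h>

#define WSIZE 4                 /* tag size in bytes (assumption: 4-byte header/footer) */
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7u)
#define GET_ALLOC(p)       (GET(p) & 0x1u)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - 2 * WSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)))
/* The previous block's footer sits just before this block's header. */
#define PREV_BLKP(bp)      ((char *)(bp) - GET_SIZE((char *)(bp) - 2 * WSIZE))

/* Write matching header and footer for the block whose payload starts at bp. */
void write_tags(void *bp, unsigned int size, unsigned int alloc)
{
    assert(size % 8 == 0 && size >= 2 * WSIZE);
    PUT(HDRP(bp), PACK(size, alloc));
    PUT(FTRP(bp), PACK(size, alloc));   /* FTRP reads the header just written */
}
```

The cost is one extra word per block, which is part of the internal fragmentation discussed later.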

Implementation Issues 2: Placement Policy
  • “Placement Policy” choices
    • First Fit
      • Search the free list from the beginning and choose the first free block that fits
    • Next Fit
      • Start the search where the previous search left off
    • Best Fit
      • Examine every free block and choose the one that fits with the least space left over
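The three policies can be compared on a toy model in which the free list is just an array of free block sizes. That model, and the function names below, are assumptions for illustration; a real allocator would walk block headers or list pointers instead.

```c
#include <assert.h>
#include <stddef.h>

/* First fit: index of the first block that is large enough, or -1. */
int first_fit(const size_t *free_sizes, int n, size_t asize)
{
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= asize)
            return i;
    return -1;
}

/* Next fit: like first fit, but start at *rover and wrap around.
 * On success *rover is updated to the block that was found. */
int next_fit(const size_t *free_sizes, int n, size_t asize, int *rover)
{
    if (n == 0)
        return -1;
    assert(*rover >= 0 && *rover < n);
    for (int k = 0; k < n; k++) {
        int i = (*rover + k) % n;
        if (free_sizes[i] >= asize) {
            *rover = i;
            return i;
        }
    }
    return -1;
}

/* Best fit: the block that fits with the least leftover space, or -1. */
int best_fit(const size_t *free_sizes, int n, size_t asize)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (free_sizes[i] >= asize &&
            (best < 0 || free_sizes[i] < free_sizes[best]))
            best = i;
    return best;
}
```

First fit and next fit can stop early, while best fit always scans the whole list; this is the throughput versus utilization trade-off in miniature.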
Implementation Issues 3: Splitting
  • “Splitting” Design choices
    • Using the entire free block
      • Simple, fast
      • Introduces internal fragmentation (good placement policy might reduce this)
    • Splitting
      • Split the free block into two parts when the second part can be used for other requests (reduces internal fragmentation)

p1 = malloc(1)
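A sketch of the splitting decision, assuming 4-byte headers and footers and a 16-byte minimum block size. Both values and the function name place follow common textbook conventions and are assumptions here.

```c
#include <assert.h>

#define WSIZE 4
#define DSIZE 8
#define MIN_BLOCK (2 * DSIZE)   /* header + footer + 8-byte payload (assumption) */
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7u)
#define GET_ALLOC(p)       (GET(p) & 0x1u)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)))

/* Allocate asize bytes from the free block at bp.
 * Split only if the remainder could form a valid block on its own. */
void place(void *bp, unsigned int asize)
{
    unsigned int csize = GET_SIZE(HDRP(bp));
    assert(asize <= csize);

    if (csize - asize >= MIN_BLOCK) {
        /* Front part becomes the allocated block... */
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        /* ...and the rest becomes a new, smaller free block. */
        void *rest = NEXT_BLKP(bp);
        PUT(HDRP(rest), PACK(csize - asize, 0));
        PUT(FTRP(rest), PACK(csize - asize, 0));
    } else {
        /* Remainder too small: use the entire block. */
        PUT(HDRP(bp), PACK(csize, 1));
        PUT(FTRP(bp), PACK(csize, 1));
    }
}
```

Raising the split threshold above MIN_BLOCK is one way to trade internal fragmentation for fewer tiny free blocks.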

Implementation Issues 4: Coalescing
  • False Fragmentation
    • Free memory chopped into small, unusable free blocks

Coalesce adjacent free blocks to get a bigger free block

  • Coalescing - Policy decision of when to perform coalescing
    • Immediate coalescing
      • Merging any adjacent free blocks each time a block is freed
    • Deferred coalescing
      • Merging free blocks some time later
        • e.g., when an allocation request fails
    • “Bidirectional Immediate Coalescing” with boundary tags, proposed by Donald Knuth, is good enough for this lab
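Bidirectional immediate coalescing can be sketched with the four cases below, in the style of the textbook's implicit-list code. It assumes 4-byte boundary tags and that allocated prologue and epilogue blocks bound the heap, so the neighbor lookups never leave it.

```c
#include <assert.h>

#define WSIZE 4
#define DSIZE 8
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7u)
#define GET_ALLOC(p)       (GET(p) & 0x1u)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)))
#define PREV_BLKP(bp)      ((char *)(bp) - GET_SIZE((char *)(bp) - DSIZE))

/* Merge the free block at bp with any free neighbors.
 * Returns the payload pointer of the merged block. */
void *coalesce(void *bp)
{
    assert(!GET_ALLOC(HDRP(bp)));       /* precondition: bp is already free */
    unsigned int prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(bp)));
    unsigned int next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(bp)));
    unsigned int size = GET_SIZE(HDRP(bp));

    if (prev_alloc && next_alloc) {             /* case 1: no free neighbor */
        return bp;
    } else if (prev_alloc && !next_alloc) {     /* case 2: merge with next */
        size += GET_SIZE(HDRP(NEXT_BLKP(bp)));
        PUT(HDRP(bp), PACK(size, 0));
        PUT(FTRP(bp), PACK(size, 0));
    } else if (!prev_alloc && next_alloc) {     /* case 3: merge with previous */
        size += GET_SIZE(HDRP(PREV_BLKP(bp)));
        PUT(FTRP(bp), PACK(size, 0));
        PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
        bp = PREV_BLKP(bp);
    } else {                                    /* case 4: merge with both */
        size += GET_SIZE(HDRP(PREV_BLKP(bp))) + GET_SIZE(FTRP(NEXT_BLKP(bp)));
        PUT(HDRP(PREV_BLKP(bp)), PACK(size, 0));
        PUT(FTRP(NEXT_BLKP(bp)), PACK(size, 0));
        bp = PREV_BLKP(bp);
    }
    return bp;
}
```

Each case touches only a constant number of tags, so freeing stays constant-time.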

Performance goal (1) - Throughput
  • Throughput is mostly determined by the time spent searching for a free block
  • How you keep track of your free blocks affects search time
    • Naïve allocator
      • Never frees a block, just extends the heap whenever it needs a new block: throughput is extremely high, but…?
    • Implicit Free List
      • The allocator traverses the free blocks only indirectly, by walking all of the blocks in the heap (allocated ones included), so search is slow.
    • Explicit Free List
      • The allocator directly traverses the set of free blocks, visiting only the free blocks in the heap
    • Segregated Free List
      • The allocator directly traverses one particular free list to find an appropriate free block
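One way to build an explicit free list is a doubly linked list whose links live inside the payload area of each free block. The struct layout, the LIFO insertion policy and all names below are assumptions of this sketch, not a required design.

```c
#include <assert.h>
#include <stddef.h>

/* Links stored in the payload of a free block
 * (assumption: every free payload can hold two pointers). */
typedef struct free_node {
    struct free_node *prev;
    struct free_node *next;
} free_node;

static free_node *free_head = NULL;   /* first free block, or NULL */

/* LIFO policy: a newly freed block goes to the front of the list. */
void free_list_insert(free_node *n)
{
    assert(n != NULL);
    n->prev = NULL;
    n->next = free_head;
    if (free_head != NULL)
        free_head->prev = n;
    free_head = n;
}

/* Unlink a block, e.g. when it is allocated or coalesced away. */
void free_list_remove(free_node *n)
{
    assert(n != NULL);
    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        free_head = n->next;          /* n was the head */
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Number of blocks currently on the free list. */
size_t free_list_length(void)
{
    size_t count = 0;
    for (const free_node *n = free_head; n != NULL; n = n->next)
        count++;
    return count;
}
```

A segregated free list can reuse the same two functions with an array of heads, one per size class.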
Performance goal (2) – Memory Utilization
  • Poor memory utilization is caused by fragmentation
    • Comes in two forms: internal and external fragmentation
    • Internal Fragmentation
      • Based on previous requests
      • Causes
        • The allocator imposes a minimum block size (depending on the allocator’s choice of block format)
        • Satisfying alignment requirements
    • External Fragmentation
      • Based on future requests
      • Aggregate free memory is enough, but no single free block is large enough to handle the request
Internal Fragmentation
  • Internal fragmentation
    • For some block, internal fragmentation is the difference between the block size and the payload size.
    • Caused by overhead of maintaining heap data structures, padding for alignment purposes, or explicit policy decisions (e.g., not to split the block).
    • Depends only on the pattern of previous requests, and thus is easy to measure.

(Figure: a block in which the payload is surrounded on both sides by internal fragmentation)

External Fragmentation

Occurs when there is enough aggregate heap memory, but no single free block is large enough

p1 = malloc(4)

p2 = malloc(5)

p3 = malloc(6)

free(p2)

p4 = malloc(6)

oops!

External fragmentation depends on the pattern of future requests, and thus is difficult to measure.
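The "oops!" condition can be written as a small predicate over the current free block sizes. The array-of-sizes model and the function name are assumptions for illustration; the sizes used in the usage note are also made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when the request fails only because free memory is scattered:
 * the total free space would be enough, but no single block is. */
bool externally_fragmented(const size_t *free_sizes, size_t n, size_t request)
{
    assert(free_sizes != NULL || n == 0);
    size_t total = 0, largest = 0;
    for (size_t i = 0; i < n; i++) {
        total += free_sizes[i];
        if (free_sizes[i] > largest)
            largest = free_sizes[i];
    }
    return total >= request && largest < request;
}
```

With hypothetical free blocks of 5 and 3 words, a request for 6 words fails even though 8 words are free in total.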

Assumptions
  • Assumptions made in Malloc Lab
    • The standard C library malloc always returns a payload pointer that is aligned to 8 bytes, and so should yours
    • 64-bit Architecture
      • pointers are 8 bytes long!
      • size_t is now 8 bytes (unsigned long)
    • But every requested size will fit in 4 bytes (that is, it will be less than 2^32 bytes)
      • So you may use 4-byte headers and footers and get away with it

(Figure legend: free word, allocated word; example of an allocated block (4 words) and a free block (2 words))

Porting to 64-bit Machine
  • Porting the code in your CS:APP textbook to 64-bit
    • sizeof(long) == 4 // 32-bit
    • sizeof(long) == 8 // 64-bit
    • The only significant difference is in the definitions of the GET and PUT macros.
    • Changes (to keep our 32-bit headers and footers)
      • #define GET(p) (*(size_t *)(p)) // 32 bits
        • #define GET(p) (*(unsigned int *)(p)) // 64 bits
      • #define PUT(p, val) (*(size_t *)(p) = (val)) // 32 bits
        • #define PUT(p, val) (*(unsigned int *)(p) = (val)) // 64 bits
      • if ((int)(bp = mem_sbrk(size)) < 0) // 32 bits
        • if ((long)(bp = mem_sbrk(size)) < 0) // 64 bits: an 8-byte pointer does not fit in an int
Using MACROS – why?

    #include <stdio.h>

    /* Read or write one 8-byte word at address p */
    #define GET8(p) (*(unsigned long *)(p))
    #define PUT8(p, val) (*(unsigned long *)(p) = (unsigned long)(val))

    void test(void *p, void *pval) {
        unsigned long *newpval;

        /* Reading and writing pointers the hard way */
        *(unsigned long *)p = (unsigned long) pval;
        newpval = (unsigned long *)(*(unsigned long *)p);
        printf("pval=%p newpval=%p\n", pval, (void *)newpval);

        /* Reading and writing pointers the easy way */
        PUT8(p, pval);
        newpval = (unsigned long *) GET8(p);
        printf("pval=%p newpval=%p\n", pval, (void *)newpval);
    }

    int main(void) {
        char *pval = (char *)0x99;
        char buf[128];

        /* buf is assumed to be suitably aligned for unsigned long */
        test(&buf[0], pval);
        return 0;
    }

Approach Advice
  • Start with the implicit list implementation in your textbook, and understand every detail of it
  • When you finish your implicit list, start thinking about your heap checker
    • The more time you spend on this, the more time you will save later
  • Go on and start implementing an explicit list with several placement policies
    • Modularize, and save each of your placement policies for comparison
  • When you finish your explicit list, you will want to add more checks to your heap checker; do this right away.
  • Once you feel your explicit list is robust, move on to the segregated free list.
    • We are looking for a good segregated free list implementation. You can go further by trying other schemes such as balanced trees, but a solid segregated free list implementation is good enough for full credit
  • You can also try some tweaks on the given trace files
Heap Checker (10 pts)
  • Basic Checks Guidelines (5/10 pts)
    • Check Heap (while working on implicit list)
      • Check epilogue and prologue blocks
      • Block’s address alignment (8 bytes)
      • Heap boundaries
      • Check your blocks’ header and footer
        • Size (minimum size, alignment)
        • prev/next allocate/free bit consistency (explicit list)
        • header and footer matching each other
      • Check your coalescing
        • All blocks are coalesced correctly (no two consecutive free blocks in the heap)
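A minimal sketch of a checker covering the basic checks listed above for an implicit list. It assumes 4-byte tags, a 16-byte minimum block, and a textbook-style layout with an allocated 8-byte prologue and a zero-size allocated epilogue; check_heap and report are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define WSIZE 4
#define DSIZE 8
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)             (*(unsigned int *)(p))
#define PUT(p, val)        (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)        (GET(p) & ~0x7u)
#define GET_ALLOC(p)       (GET(p) & 0x1u)
#define HDRP(bp)           ((char *)(bp) - WSIZE)
#define FTRP(bp)           ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp)      ((char *)(bp) + GET_SIZE(HDRP(bp)))

/* Print one problem (only in verbose mode) and count it. */
static int report(int verbose, const char *msg, void *bp)
{
    if (verbose)
        fprintf(stderr, "heap check: %s at %p\n", msg, bp);
    return 1;
}

/* Walk the heap starting at the prologue payload heap_listp.
 * Returns the number of problems found (0 means the heap looks sane). */
int check_heap(void *heap_listp, int verbose)
{
    assert(heap_listp != NULL);
    int errors = 0, prev_free = 0;
    char *bp = heap_listp;

    if (GET_SIZE(HDRP(bp)) != DSIZE || !GET_ALLOC(HDRP(bp)))
        errors += report(verbose, "bad prologue", bp);

    for (bp = NEXT_BLKP(bp); GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
        int is_free = !GET_ALLOC(HDRP(bp));
        if ((uintptr_t)bp % DSIZE != 0)
            errors += report(verbose, "payload not 8-byte aligned", bp);
        if (GET_SIZE(HDRP(bp)) < 2 * DSIZE)
            errors += report(verbose, "block below minimum size", bp);
        if (GET(HDRP(bp)) != GET(FTRP(bp)))
            errors += report(verbose, "header and footer differ", bp);
        if (prev_free && is_free)
            errors += report(verbose, "two consecutive free blocks", bp);
        prev_free = is_free;
    }

    if (!GET_ALLOC(HDRP(bp)))           /* loop stopped at the size-0 epilogue */
        errors += report(verbose, "bad epilogue", bp);
    return errors;
}
```

Returning a count instead of aborting makes it easy to call the checker after every request while debugging.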
Heap Checker (10 pts)
  • Free List Checks Guidelines (5/10 pts)
    • Check Free List (while working on explicit free list)
      • All next/prev pointers are consistent (if A’s next pointer points to B, B’s prev pointer should point to A)
      • All free list pointers point between mem_heap_lo() and mem_heap_hi()
      • Count the free blocks both by iterating over every block and by traversing the free list by pointers, and see if the counts match
      • Recommended to add more as you wish
    • Check Segregated Free List (segregated free list)
      • All blocks in each list bucket fall within bucket size range
      • Be creative
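The free list checks above can be sketched as follows. The free_node layout is an assumption, lo and hi stand in for the values returned by mem_heap_lo() and mem_heap_hi(), and expected_free is the count obtained separately by walking every block. Cycle detection is left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout of the links kept inside each free block. */
typedef struct free_node {
    struct free_node *prev;
    struct free_node *next;
} free_node;

/* Returns the number of problems found in the free list starting at head. */
int check_free_list(const free_node *head, const void *lo, const void *hi,
                    size_t expected_free)
{
    assert((const char *)lo <= (const char *)hi);
    int errors = 0;
    size_t count = 0;
    const free_node *prev = NULL;

    for (const free_node *n = head; n != NULL; prev = n, n = n->next) {
        /* Every list pointer must lie inside the heap. */
        if ((const char *)n < (const char *)lo ||
            (const char *)n > (const char *)hi) {
            errors++;
            break;                      /* do not follow a wild pointer */
        }
        /* If A's next is B, then B's prev must be A. */
        if (n->prev != prev)
            errors++;
        count++;
    }

    /* The list must contain exactly the free blocks found in the heap. */
    if (count != expected_free)
        errors++;
    return errors;
}
```

For a segregated list, the same function could be called once per bucket, with an extra check that each block size falls in the bucket's range.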
Style (10 pts)
  • It will be some of the most difficult and sophisticated code you have written so far in your career.
  • Things we are looking for:
    • Explain your high-level design at the front of your code (2 pts)
    • Each function should be preceded by a header comment (2 pts)
    • Comment properly inside each function (2 pts)
    • Decompose into functions and use as few global variables as possible (2 pts)
    • Use macros, inline functions, and the C preprocessor wisely (2 pts)
  • Please try to write clean code that is readable and self-explanatory!
    • For you
    • For your Teaching Staff
    • And for world peace
Debugging Techniques
  • Guidelines for Debugging
    • Testing your code intensively even when it seems to work is good programming practice; try to learn the process from this lab
    • You can print out all the information and monitor it
      • Do this when you have just started
      • And when the trace file is small
    • You can also print out error messages only when something is wrong
      • Printing and monitoring everything becomes painful when trace files are huge
      • Just print errors
Debugging Tips
  • Guidelines for using mdriver’s options
    • Use the ./mdriver -c <file> option to run a particular trace file just once, checking only correctness
      • By default, ./mdriver runs your allocator multiple times to estimate its throughput using the k-best measurement scheme (if you are interested, refer to ch 9 and the mdriver source code)
    • Use the ./mdriver -v <level> option to set the verbosity level
      • It is sometimes useful to have layers of debugging depth
      • You can also use #define, #ifdef, #if
    • Make sure to turn all checking routines off completely when measuring performance, since they do affect performance
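One common way to switch checks off completely is to hide them behind preprocessor macros. The macro names dbg_printf and dbg_assert and the helper round_up8 are assumptions of this sketch; defining DEBUG (for example with the compiler's -DDEBUG option) turns the checks on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#ifdef DEBUG
/* Debug build: print traces and enforce invariants. */
#define dbg_printf(...)   fprintf(stderr, __VA_ARGS__)
#define dbg_assert(expr)  assert(expr)
#else
/* Normal build: both macros expand to nothing, so they cost no time. */
#define dbg_printf(...)   ((void)0)
#define dbg_assert(expr)  ((void)0)
#endif

/* Hypothetical helper used only to exercise the macros:
 * rounds size up to the next multiple of 8. */
size_t round_up8(size_t size)
{
    size_t rounded = (size + 7) & ~(size_t)7;
    dbg_printf("round_up8(%zu) = %zu\n", size, rounded);
    dbg_assert(rounded % 8 == 0 && rounded >= size);
    return rounded;
}
```

Because the arguments disappear entirely when DEBUG is not defined, expressions passed to these macros should not have side effects the program relies on.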
More Hints?
  • Going further (beyond solid segregated list)
    • Before trying this, make sure your allocator is doing what you intended, using heap/free list checkers
    • If you think you have implemented a solid segregated free list, try to focus on the trace files that give you the worst performance results
More Hints?
  • Some possible tackle points
    • In malloc(), you have to adjust the requested size to meet alignment requirements or minimum block size requirements
      • It turns out that how you adjust the size affects the performance on some trace files
      • And sometimes it is better to force your allocator to avoid splitting the free block by using a larger block than the requested size
        • This will obviously increase internal fragmentation, but can also increase throughput by avoiding repeated splitting and coalescing
    • By how much will you extend your heap when you have to extend it?
    • How do you define the size class of each free list?
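Two of these tuning points can be sketched in code: size adjustment and size classes. The sketch assumes 8 bytes of header and footer overhead, 8-byte alignment, a 16-byte minimum block, and power-of-two size classes; NUM_CLASSES and both function names are hypothetical, and other choices may score better on particular traces.

```c
#include <assert.h>
#include <stddef.h>

#define DSIZE 8
#define MIN_BLOCK (2 * DSIZE)   /* header + footer + 8-byte payload (assumption) */
#define NUM_CLASSES 10          /* hypothetical number of free lists */

/* Adjust a requested payload size to a legal block size:
 * add 8 bytes of overhead and round up to a multiple of 8. */
size_t adjust_size(size_t size)
{
    if (size <= DSIZE)
        return MIN_BLOCK;
    return DSIZE * ((size + DSIZE + (DSIZE - 1)) / DSIZE);
}

/* Map a block size to a free list index using power-of-two classes:
 * class 0 holds sizes up to 16, class 1 up to 32, and so on;
 * the last class holds everything larger. */
int size_class(size_t asize)
{
    assert(asize >= MIN_BLOCK);
    int idx = 0;
    size_t limit = MIN_BLOCK;
    while (idx < NUM_CLASSES - 1 && asize > limit) {
        limit <<= 1;
        idx++;
    }
    return idx;
}
```

Under these assumptions a 1-byte request consumes a 16-byte block, so 15 bytes of it are internal fragmentation.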