VxWorks & Memory Management Group A7 CSE8343
Agenda • General Overview • High-level overview of how VxWorks thinks about memory • Virtual Memory • Caching • Specific example of how VxWorks allocates memory on a Synergy Dual Processor
General Overview • In basic VxWorks, all memory can be viewed as a single linear array of words*. • All tasks can (in theory) access all words. • * Or whatever the appropriate addressing unit is: byte, word, long word
Memory Layout (VxWorks on PPC) • Interrupt Vector Table • Exception/Interrupt vectors • Shared Memory Anchor (if necessary) • Boot Line • ASCII string of boot parameters • Exception Message • ASCII string of the (prior) fatal exception message
Memory Layout, II • Initial Stack • Initial stack for usrInit( ), until usrRoot( ) allocates the real stack. • System Image • VxWorks itself (three sections: text, data, bss). The entry point for VxWorks is at the start of this region, which is BSP dependent. The entry point for each BSP is as follows:
Memory Layout, III • Host Memory Pool • Memory allocated by host tools. The size depends on the macro WDB_POOL_SIZE. Modify WDB_POOL_SIZE under INCLUDE_WDB. • Applications downloaded to the processor are allocated space here. • Interrupt Stack • Size is defined by ISR_STACK_SIZE
Memory Layout, IV • System Memory Pool • Size and location depend on the size of the system image. malloc() allocates space from here. • Many of these items can be changed by modifying various macros (ISR_STACK_SIZE and WDB_POOL_SIZE are just two) and then recompiling VxWorks.
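As a rough illustration of the macro changes mentioned above, a BSP configuration header might override the defaults along these lines. The values are placeholders chosen for the example, not recommendations, and which header to edit depends on the BSP and the VxWorks version.

```c
/* BSP configuration header: illustrative overrides only.
 * Both values below are assumptions made for this example. */

#undef  ISR_STACK_SIZE
#define ISR_STACK_SIZE  0x2000      /* interrupt stack: 8 KB (example value) */

#ifdef  INCLUDE_WDB
#undef  WDB_POOL_SIZE
#define WDB_POOL_SIZE   0x100000    /* host memory pool: 1 MB (example value) */
#endif  /* INCLUDE_WDB */
```

The change only takes effect after VxWorks is rebuilt, since the sizes are fixed at compile time.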
Memory Access (Translations) • Untranslated: physical address is used unchanged (physical == virtual) • The processors usually boot into this mode until MMU hardware is initialised. • BAT Registers • Set of four/eight registers which define (large) blocks of memory • Each logical address is converted by BAT registers • Responsibility of user to correctly set BAT registers
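The block translation described above can be sketched as a small host-side model. This is a simplification for illustration: the `bat_entry` structure and `bat_translate` function are hypothetical names, and real BAT registers pack these fields (plus protection and cache attributes) into upper and lower register words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for one BAT entry (not the hardware register layout). */
typedef struct {
    bool     valid;
    uint32_t eff_base;   /* effective (logical) base, aligned to length */
    uint32_t phys_base;  /* physical base, aligned to length */
    uint32_t length;     /* block length in bytes, a power of two */
} bat_entry;

/* Translate ea through the first matching block.
 * Returns true and stores the physical address on a hit. */
static bool bat_translate(const bat_entry *bats, int count,
                          uint32_t ea, uint32_t *pa)
{
    for (int i = 0; i < count; i++) {
        const bat_entry *b = &bats[i];
        if (!b->valid)
            continue;
        /* Block lengths must be non-zero powers of two. */
        assert(b->length != 0 && (b->length & (b->length - 1)) == 0);
        uint32_t offset_mask = b->length - 1;
        if ((ea & ~offset_mask) == b->eff_base) {
            *pa = b->phys_base | (ea & offset_mask);
            return true;
        }
    }
    return false;   /* no block matched: page translation would be tried */
}
```

With a 256 MB block mapping effective 0x80000000 to physical 0xF0000000, an access to 0x80001234 would come out as 0xF0001234; an address outside every block falls through to page mode.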
Memory Access, II • Page mode • As you would expect: logical addresses are decoded through segment registers and page tables (with translations cached in the TLB) to construct the physical address • Be careful on size: mapping 1 GB requires about 16 MB of page table space! • User must correctly set up page table entries for certain (non-standard) memory locations. • Usually a combination of BAT and Page is used, depending on the size of the memories to map (VME / PCI / IO, etc). Page mode is preferred for use with Virtual Memory and Caching.
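One way to arrive at the 16 MB figure is shown below. It assumes 4 KB pages and one 64-byte page table entry group (eight 8-byte entries) per mapped page. That sizing rule is an assumption made here to reproduce the estimate; how a given BSP actually dimensions its page table may differ.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   4096u   /* 4 KB pages */
#define PTEG_SIZE   64u     /* one PTE group: 8 entries of 8 bytes */

/* Page-table bytes needed if every mapped page gets its own PTE group
 * (the assumption used in this sketch). */
static uint64_t page_table_bytes(uint64_t mapped_bytes)
{
    assert(mapped_bytes % PAGE_SIZE == 0);
    uint64_t pages = mapped_bytes / PAGE_SIZE;
    return pages * PTEG_SIZE;
}
```

Under these assumptions, `page_table_bytes(1ull << 30)` gives 2^18 pages times 64 bytes, which is 16 MB. This is why large regions such as VME or PCI windows are often better covered by a BAT block.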
Virtual Memory • VxWorks does not require a Memory Management Unit (MMU) • Not all systems have an MMU • System performance is best when an MMU is used • vmBaseLib allows 1 global Virtual Memory mapping for the system • vmBaseLib allows the user to set cacheable / uncacheable memory blocks
VxWorks Memory, ctd • Extensive Virtual Memory support is available as a separate product (VxVMI, aka vmLib) • Allows private virtual memory contexts • User could set up each task with a separate VM context • But: increased context switch time, and the user must manage contexts correctly • Not normally used in my company’s embedded products.
VxWorks and Caching • VxWorks is designed for the worst-case scenario (greatest number of coherency issues): • Harvard architecture (separate Instruction and Data caches) • Copy-back mode • DMA transfers and devices • Multiple bus masters • No hardware support for coherency
More on Caching • VxWorks supplies the functions to control cache settings (write-through / copy-back) • It is the responsibility of the user to handle the intricacies of cache / DMA / virtual memory. • User must indicate whether memory buffers are cacheable / noncacheable • User must handle cache invalidation / cache flushing to maintain cache coherency • User must correctly set up DMA transfers with the use of the MMU and cache • Most of the time, the default settings are fine. Only when accessing other devices does this become something that must be fully analyzed. • VxWorks / BSP supply the functions, the user supplies how/when to use them.
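The flush and invalidate obligations above are easier to see with a toy model of a copy-back cache sitting in front of a DMA-visible buffer. Everything here (`toy_mem`, `toy_flush`, `toy_invalidate`, and so on) is hypothetical and runs on a host; on a target, the corresponding operations come from cacheLib (for example `cacheFlush()` and `cacheInvalidate()`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16

/* Toy model: one buffer in "RAM" plus a copy-back cached copy of it. */
typedef struct {
    uint8_t ram[BUF_SIZE];      /* what a DMA device sees */
    uint8_t cache[BUF_SIZE];    /* what the CPU sees while the copy is valid */
    bool    valid;              /* cache holds a copy of the buffer */
    bool    dirty;              /* cached copy is newer than RAM */
} toy_mem;

static void load_if_needed(toy_mem *m)
{
    if (!m->valid) {
        memcpy(m->cache, m->ram, BUF_SIZE);
        m->valid = true;
        m->dirty = false;
    }
}

static void cpu_write(toy_mem *m, int i, uint8_t v)
{
    assert(i >= 0 && i < BUF_SIZE);
    load_if_needed(m);
    m->cache[i] = v;            /* copy-back: RAM is not updated yet */
    m->dirty = true;
}

static uint8_t cpu_read(toy_mem *m, int i)
{
    assert(i >= 0 && i < BUF_SIZE);
    load_if_needed(m);
    return m->cache[i];
}

/* Flush: push dirty data out to RAM so a device can read it. */
static void toy_flush(toy_mem *m)
{
    if (m->valid && m->dirty) {
        memcpy(m->ram, m->cache, BUF_SIZE);
        m->dirty = false;
    }
}

/* Invalidate: discard the cached copy so the next read refetches RAM. */
static void toy_invalidate(toy_mem *m)
{
    m->valid = false;
    m->dirty = false;
}

/* DMA bypasses the cache entirely. */
static uint8_t dma_read(const toy_mem *m, int i)    { return m->ram[i]; }
static void    dma_write(toy_mem *m, int i, uint8_t v) { m->ram[i] = v; }
```

In this model, a device reading right after `cpu_write()` sees stale RAM until `toy_flush()` runs, and the CPU keeps seeing its old cached copy after `dma_write()` until `toy_invalidate()` runs. So the rule of thumb is: flush before a device reads, invalidate before the CPU reads what a device wrote.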
How does VxWorks allocate memory? • VxWorks starts at address 0: • includes exception vector, etc. • During kernel initialisation, kernelLib will use memLib to allocate the system memory partition. • malloc() will use the system memory partition for kernel memory needs. • User can create memory partitions by calling memPartCreate() in memPartLib. • This allows the user to maintain different pools of memory, each of a different size, and allocate from those pools. • Generally the memory for these partitions is taken from the system memory partition. • User can add more memory to a partition, and it need not be contiguous. • With an appropriate number of appropriately sized partitions (a priori analysis), the effects of memory fragmentation can be reduced.
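To illustrate why separate pools help with fragmentation, here is a minimal host-side sketch of a pool that hands out fixed-size blocks and can grow by adding non-contiguous regions. It is not memPartLib's implementation or API; the names `pool`, `pool_init`, `pool_add_memory`, `pool_alloc` and `pool_free` are hypothetical, and the fixed-block policy is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-block pool; not memPartLib's data structure. */
typedef struct free_block { struct free_block *next; } free_block;

typedef struct {
    size_t      block_size;   /* every allocation from this pool has this size */
    free_block *free_list;    /* singly linked list of free blocks */
    size_t      free_count;   /* number of blocks currently free */
} pool;

static void pool_init(pool *p, size_t block_size)
{
    assert(block_size >= sizeof(free_block));
    assert(block_size % sizeof(void *) == 0);   /* keep blocks pointer-aligned */
    p->block_size = block_size;
    p->free_list  = NULL;
    p->free_count = 0;
}

/* Add a region to the pool; regions need not be contiguous with each other. */
static void pool_add_memory(pool *p, void *region, size_t bytes)
{
    assert(((uintptr_t)region % sizeof(void *)) == 0);
    uint8_t *cursor = region;
    while (bytes >= p->block_size) {
        free_block *b = (free_block *)cursor;
        b->next = p->free_list;
        p->free_list = b;
        p->free_count++;
        cursor += p->block_size;
        bytes  -= p->block_size;
    }
}

static void *pool_alloc(pool *p)
{
    free_block *b = p->free_list;
    if (b == NULL)
        return NULL;            /* pool exhausted */
    p->free_list = b->next;
    p->free_count--;
    return b;
}

static void pool_free(pool *p, void *ptr)
{
    free_block *b = ptr;
    b->next = p->free_list;
    p->free_list = b;
    p->free_count++;
}
```

Because every block in a pool has the same size, freeing and reallocating can never leave unusable gaps inside that pool. This is the kind of benefit an a priori sizing analysis is aiming for.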
One problem with VxWorks • PPC’s relative branch instructions have a 24-bit offset field (a ±32 MB reach), so all code (VxWorks & application) must be within 32 MB. • There are several methods of dealing with this problem: • Add memory to the global memory pool _AFTER_ all application code is loaded; • Compile code with the -mlongcall option (reduced efficiency in code due to extra instructions to handle long calls) • User manages extra memory (not visible to VxWorks malloc())
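The 32 MB limit follows from the branch encoding: a signed 24-bit word offset becomes a 26-bit byte displacement. The check below is a small host-side sketch (the function name `branch_reachable` is made up for this example, and address wrap-around is ignored) that tests whether a call site can reach a target with a single relative branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed 24-bit word offset gives a byte displacement range of
 * -32 MB .. +32 MB - 4. */
#define BRANCH_MIN  (-(int64_t)0x02000000)
#define BRANCH_MAX  ((int64_t)0x01FFFFFC)

/* True if a relative branch at 'from' can reach 'to' directly. */
static bool branch_reachable(uint32_t from, uint32_t to)
{
    assert((from & 3u) == 0 && (to & 3u) == 0);  /* instructions are word aligned */
    int64_t disp = (int64_t)to - (int64_t)from;
    return disp >= BRANCH_MIN && disp <= BRANCH_MAX;
}
```

A module loaded more than 32 MB away from the routines it calls fails this check. That is the situation that loading code before growing the pool, or compiling with long calls, is meant to avoid.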
Specifics of VxWorks • vmLib.h: handles the physical to virtual memory mapping • Data structure: sysPhysMemDesc • Partly filled in by VxWorks at system configuration (boot time) • Filled in by user prior to VxWorks compilation to add other memory (memory mapped devices)
Example sysPhysMemDesc (Synergy Dual processor board) • This is essentially the first page table in the system (it contains the first set of Page Table Entries) • It is an array of records: • Virtual Address, Physical Address, Length in Bytes; Virtual Memory Mask, Virtual Memory Enable • Note: This is a dual processor board, with all of RAM accessible to both processors. Different board configurations will yield different physical memory layouts.
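The record layout listed above can be modelled on a host as follows. This is a sketch under assumptions: the `mem_desc` type, its field names, and the `ST_*` attribute bits are stand-ins invented for the example, not the definitions from vmLib.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the record described above; field names are illustrative. */
typedef struct {
    uint32_t virt_addr;
    uint32_t phys_addr;
    uint32_t length;        /* bytes; 0 marks an unused entry */
    uint32_t state_mask;    /* which attributes are being set */
    uint32_t state;         /* values for those attributes */
} mem_desc;

/* Hypothetical attribute bits, for this sketch only. */
#define ST_VALID      0x1u
#define ST_WRITABLE   0x2u
#define ST_CACHEABLE  0x4u

/* Return the entry whose virtual range contains va, or NULL. */
static const mem_desc *desc_find(const mem_desc *tbl, size_t n, uint32_t va)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].length == 0)
            continue;
        if (va - tbl[i].virt_addr < tbl[i].length)  /* unsigned range check */
            return &tbl[i];
    }
    return NULL;
}

/* Translate va to a physical address; returns 0 on success, -1 if unmapped. */
static int desc_translate(const mem_desc *tbl, size_t n,
                          uint32_t va, uint32_t *pa)
{
    const mem_desc *d = desc_find(tbl, n, va);
    if (d == NULL || !(d->state & ST_VALID))
        return -1;
    *pa = d->phys_addr + (va - d->virt_addr);
    return 0;
}
```

The mask/state pair is what lets one entry mark a region as valid and writable while leaving it uncached, as is done for the mailbox region on this board.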
sysPhysMemDesc, ctd • The following are generally completed by the BSP vendor: • Element 0: Maps mailboxes, starts at address 0, and should be uncached. • Element 1: Maps gemini registers and Ethernet data arrays. Starts at 0xF000000 • Element 2: Define the space for VxWorks kernel (processor 1). Modified by kernel at run time. • Element 3: Define the space for VxWorks kernel (processor 2). Modified at run time. • Element 4: Shared Memory. Starts at 0x2000000 • Element 5: Processor 1’s RAM. Modified at run time. • Element 6: Processor 2’s RAM. Modified at run time. • Element 7: Page tables space. Modified at run time.
sysPhysMemDesc, ctd • These elements are the responsibility of the user to modify prior to compiling: • PCI SPACE -- 8 entries • initially zero (so if they are not used, no memory is allocated) • my current project has a mapping here to a device on the PCI bus • VME SPACE -- last 8 entries • Describe the VME master ports for the architecture • Initially zero (like PCI maps) • We will add a mapping to a VME device here as well • The base system is untranslated, physical = virtual
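A user-edited block of entries might look roughly like the sketch below: one device window filled in, the remaining slots left zero so nothing is mapped for them. The `mem_desc` type, the `ST_*` bits, the `pci_space` table and the address 0x80000000 are all made up for illustration; a real BSP would use the structure and state macros from its own headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in record type; field names are illustrative. */
typedef struct {
    uint32_t virt_addr;
    uint32_t phys_addr;
    uint32_t length;        /* 0 = unused slot */
    uint32_t state_mask;
    uint32_t state;
} mem_desc;

#define ST_VALID      0x1u   /* hypothetical attribute bits */
#define ST_WRITABLE   0x2u
#define ST_CACHEABLE  0x4u

#define PCI_SLOTS 8

/* Edited before compiling: one PCI device window, identity mapped
 * (virtual == physical) and uncached. The address is a placeholder. */
static const mem_desc pci_space[PCI_SLOTS] = {
    { 0x80000000u, 0x80000000u, 0x00100000u,
      ST_VALID | ST_WRITABLE | ST_CACHEABLE,   /* attributes being set */
      ST_VALID | ST_WRITABLE },                /* cacheable bit left clear */
    /* remaining seven entries stay zero: nothing is mapped for them */
};

/* Count the slots that actually describe a mapping. */
static size_t desc_used(const mem_desc *tbl, size_t n)
{
    assert(tbl != NULL);
    size_t used = 0;
    for (size_t i = 0; i < n; i++)
        if (tbl[i].length != 0)
            used++;
    return used;
}

/* Total bytes mapped by the table. */
static uint64_t desc_mapped_bytes(const mem_desc *tbl, size_t n)
{
    assert(tbl != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += tbl[i].length;
    return total;
}
```

Device windows are normally left uncached so that register reads and writes reach the hardware rather than a cached copy.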
What about the BAT registers? • BAT registers can conflict with the routines in cacheLib and vmBaseLib • i.e., OS and HW may not play well together • BAT registers map memory outside of the processors (Flash, PROM, etc) • Locations where fine-grained control is not necessary
Conclusion • VxWorks offers much power to the user (application / system programmer) to control exactly how memory is addressed, allocated, and cached. • With power comes responsibility: • User must not rely on VxWorks to perfectly implement an application’s memory needs. • User will need to consider caching (flushing, disabling), DMA accesses, etc. as part of the design.
Definitions • Board Support Package (BSP): The extensions to the generic VxWorks that apply to a specific processor and its board. • Copy-back: Write to the cache only; memory is updated only when necessary (when the data is flushed or evicted) • Direct Memory Access (DMA): Transferring data without processor involvement. • Write-through: Write to cache and then into memory