
Week 9 Lecture slides



Presentation Transcript


  1. Cosc 3P92, Week 9 Lecture slides. "The human brain starts working the moment you are born and never stops until you stand up to speak in public." (George Jessel)

  2. Memory Organization • In a typical computer system, the storage system is organized according to the following hierarchy. Moving down the list toward the CPU, access time decreases and cost per bit increases: from slow access (1-5 s) and almost unlimited capacity at the top, to fast access (1-20 ns) and small capacity (1-4K bytes) at the bottom:
• Archival storage (magnetic tape or photographic)
• Moving-head disk (magnetic or optical)
• High-speed drum
• Charge-coupled device
• Main memory
• Cache
• Internal (CPU) storage

  3. Memory speed
• Access time (Ta): the average time taken to read a unit of information, e.g., 100 ns (100 x 10**-9 s)
• Access rate (Ra) = 1/Ta (bits/second), e.g., 1/100 ns = 10 Mb/s
• Cycle time (Tc): the average time lapse between two successive read operations, e.g., 500 ns (500 x 10**-9 s)
• Bandwidth or transfer rate (Rc) = 1/Tc (bits/second), e.g., 1/500 ns = 2 Mb/s
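The rate figures on this slide follow directly from the reciprocal formulas; a quick sketch checking them (values taken from the slide):

```python
# Access rate Ra = 1/Ta and bandwidth Rc = 1/Tc, using the slide's figures.
Ta = 100e-9          # access time: 100 ns
Tc = 500e-9          # cycle time: 500 ns

Ra = 1 / Ta          # access rate (units read per second)
Rc = 1 / Tc          # bandwidth / transfer rate

print(f"Ra = {Ra / 1e6:.0f} M units/s")   # 10, matching 10 Mb/s
print(f"Rc = {Rc / 1e6:.0f} M units/s")   # 2, matching 2 Mb/s
```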

  4. Classes of Memory
• RAM ("normal memory")
• Direct-access storage: HD, CD-ROM, DVD
• Sequential-access storage tapes: DAT
• Associative (content-addressable) memory: searches for data via bit patterns
• CAM (Content Addressable Memory): includes comparison logic with each bit of storage. A data value is broadcast to all words of storage and compared with the values there; words which match are flagged, and subsequent operations can then work on the flagged words. (computing-dictionary.thefreedictionary.com)
• ROM
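The CAM behavior described above (broadcast a value, flag every matching word) can be modeled in a few lines. This is only a software sketch of the hardware's parallel compare; the function name and data are illustrative:

```python
# Model of a CAM lookup: in hardware, every stored word is compared with
# the broadcast search key simultaneously; the result is one match flag
# per word. Here we simply compute those flags sequentially.
def cam_search(words, key):
    return [w == key for w in words]

words = [0b1010, 0b0110, 0b1010, 0b1111]   # example stored words
flags = cam_search(words, 0b1010)
print(flags)   # [True, False, True, False]
```

Subsequent operations would then act only on the words whose flag is set.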

  5. Categories of RAM and ROM (primary memory) [taxonomy diagram]:
• RAM: magnetic core, or semiconductor (bipolar or MOS); MOS RAM is static or dynamic
• ROM: Mask ROM, PROM, EPROM, EAROM

  6. Main Memory Design • A 1K x 4 RAM chip has ten address lines (A9-A0), four bi-directional data lines (D3-D0), a write-enable line (WE), and a chip-select line (CS):

CS  WE  Mode          Status of the bi-directional data lines D3-D0   Power
H   X   not selected  high impedance                                  standby
L   L   write         act as input bus                                active
L   H   read          act as output bus                               active

  7. Main Memory Design Q. How do we build a 4K x 4 RAM using four 1K x 4 RAM chips? The two high-order address bits select the chip:

Chip  A11 A10  A9 A8 A7 . . . A0   Range
0     0   0    x  x  x  . . . x    0000 to 1023
1     0   1    x  x  x  . . . x    1024 to 2047
2     1   0    x  x  x  . . . x    2048 to 3071
3     1   1    x  x  x  . . . x    3072 to 4095
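The decoding in the table above can be sketched as follows (function names are illustrative): the top two bits of the 12-bit address pick the chip, and the remaining ten bits address a word within that chip.

```python
# Decode a 12-bit address for a 4K x 4 RAM built from four 1K x 4 chips.
def decode(addr):
    assert 0 <= addr < 4096    # 12-bit address space (A11-A0)
    chip = addr >> 10          # A11-A10 select one of the four chips
    offset = addr & 0x3FF      # A9-A0 address a word within the chip
    return chip, offset

print(decode(0))      # (0, 0)      first word of chip 0
print(decode(1024))   # (1, 0)      first word of chip 1
print(decode(4095))   # (3, 1023)   last word of chip 3
```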

  8. Main Memory Design • Q. How do we build a 256KB RAM system with a 16-bit address bus and four 64KB RAM chips? • Memory bank-switching: the address bus runs to all memory banks in parallel; the processor supplies log2 n extra bits to a 1-of-n decoder, whose enable lines activate exactly one of the n banks.
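A minimal sketch of the bank-switching idea, with illustrative names: the 16-bit address reaches every 64KB bank, but only the bank whose enable line is driven by the decoder responds.

```python
# Bank-switched memory: four 64 KB banks share one 16-bit address bus;
# a separate bank register (feeding a 1-of-4 decoder in hardware)
# determines which bank is enabled for each access.
NUM_BANKS = 4
BANK_SIZE = 64 * 1024
banks = [bytearray(BANK_SIZE) for _ in range(NUM_BANKS)]
bank_reg = 0                          # loaded by the processor

def access(addr, value=None):
    assert 0 <= addr < BANK_SIZE      # 16-bit address
    if value is None:
        return banks[bank_reg][addr]  # read from the enabled bank
    banks[bank_reg][addr] = value     # write to the enabled bank

bank_reg = 3                          # switch to bank 3
access(0x1234, 0xAB)
print(access(0x1234))                 # 171 (0xAB)
```

Note that the same 16-bit address names a different physical byte depending on the bank register, which is exactly what lets 256KB live behind a 64KB address space.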

  9. Main memory design • Memory address extension: a 4-bit base register supplies the high-order bits; concatenated with the 16-bit offset from the address bus, it forms the 20-bit physical address sent to memory.
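Under the concatenation scheme described above (4 base bits + 16 offset bits = 20-bit physical address), the computation is a shift and an OR; the function name is illustrative:

```python
# Memory address extension: 4-bit base register concatenated with the
# 16-bit offset gives a 20-bit physical address (1 MB address space).
def physical_address(base, offset):
    assert 0 <= base < 16            # 4-bit base register
    assert 0 <= offset < 1 << 16     # 16-bit address bus
    return (base << 16) | offset     # base forms the high-order bits

print(hex(physical_address(0xA, 0x1234)))   # 0xa1234
print(hex(physical_address(0xF, 0xFFFF)))   # 0xfffff (top of the 1 MB space)
```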

  10. Cache Memory • Cache: a fast-access memory buffer between the CPU and main memory
• locality principle: programs usually use limited memory areas, in contrast to totally random access
• spatial locality: nearby locations/addresses
• temporal locality: recently accessed items
• if commonly used memory can be buffered in a high-speed cache, overall performance is enhanced
• the cache takes the form of a small amount of store, with hardware support for maintenance and lookup
• each cache cell saves a cache line: a block of main memory (4-64 words)
• cache hit: requested memory resides in cache

  11. Cache
• cache miss: requested memory is not in cache, and must be fetched from main memory and put into the cache
• unified cache: instructions and data share the same cache
• split cache: separate instruction and data caches; parallel access doubles the bandwidth
• level 2 cache: between the instruction/data caches and main memory
• Cache maintenance algorithms are similar in spirit to virtual-memory ideas at the operating-system level; the main difference is that cache is hardware-supported, whereas virtual memory is implemented in software

  12. Measuring cache performance
• c: cache access time
• m: main memory access time
• hr: hit ratio (0 <= hr <= 1) = # cache hits / total memory requests
• mr: miss ratio = 1 - hr
• mean access time = c + (1 - hr)m
• if hr --> 1 then m.a.t. --> c
• if hr --> 0 then m.a.t. --> c + m
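The formula and its two limiting cases can be checked directly:

```python
# Mean access time: every access pays the cache access time c; the
# fraction (1 - hr) that misses also pays the main-memory time m.
def mean_access_time(c, m, hr):
    return c + (1 - hr) * m

print(mean_access_time(160, 960, 1.0))   # 160.0  (hr -> 1 gives c)
print(mean_access_time(160, 960, 0.0))   # 1120.0 (hr -> 0 gives c + m)
```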

  13. Cache Example • example: let c = 160 ns, m = 960 ns, hr = 0.90 (common). mean = 160 + (1 - 0.90) x 960 = 256 ns. efficiency = c / mean = 160/256 = 62.5%

  14. Direct mapping [diagram: main memory blocks 0 to M-1; block i maps to cache slot i mod N, for a cache of N slots]

  15. Direct Mapped Cache

  16. Direct mapping
• use a hash function to find the cache location: normally modulo some bit field of the address, i.e., just use the low-end field
• cache fields: valid bit; tag (the block # being held); value (the data block)
• scheme, on a memory request:
• compute the cache slot (low n bits of the block number)
• check the block (tag) field
• hit: return the value
• miss: fetch the block from memory, give it to the CPU, and put it into the computed slot (replacing the existing item if one is there)
• can occasionally produce thrashing: e.g., addresses that are a multiple of the cache size (64K) will reside at the same entry
• a split instruction/data cache helps avoid thrashing
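The lookup scheme above can be sketched as a small simulation. The sizes here (8 slots, 4-word blocks, matching the later example slide) and the function names are illustrative:

```python
# Direct-mapped cache lookup: block i of main memory can live only in
# slot i mod N_SLOTS; the tag stores the remaining high bits of i.
N_SLOTS = 8
BLOCK = 4   # words per block

cache = [{"valid": False, "tag": None, "value": None} for _ in range(N_SLOTS)]

def lookup(addr, memory):
    block = addr // BLOCK            # block number in main memory
    slot = block % N_SLOTS           # hash: low bits of the block number
    tag = block // N_SLOTS           # remaining high bits
    line = cache[slot]
    if line["valid"] and line["tag"] == tag:
        return "hit", line["value"][addr % BLOCK]
    # miss: fetch the whole block, replacing whatever occupied the slot
    start = block * BLOCK
    line.update(valid=True, tag=tag, value=memory[start:start + BLOCK])
    return "miss", line["value"][addr % BLOCK]

memory = list(range(64))
print(lookup(5, memory))   # ('miss', 5)
print(lookup(6, memory))   # ('hit', 6)  same block as address 5
```

Addresses exactly N_SLOTS * BLOCK apart hash to the same slot, which is the thrashing case mentioned above.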

  17. Set associative mapping [diagram: main memory blocks 0 to M-1; the cache's N blocks are grouped into sets of S blocks each, and block i maps to set i mod (N/S)]

  18. Set associative mapping

  19. Set associative mapping • [4.39]
• use the same hash function as direct mapping, except that each cache slot (set) holds multiple data blocks
• usually at most 4 blocks ("4-way")
• searching the blocks in a set is done associatively: simultaneous pattern matching
• more flexible than direct mapping: multiple blocks per set
• uses a smaller tag than a fully associative cache, so the associative matching is cheaper to implement
• commonly used in larger systems (VAX 11/780)
• which line should be replaced when the set is full? e.g., LRU (least recently used)
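The set-associative lookup with LRU replacement can be sketched by keeping each set in recency order. The parameters (2-way, 4 sets) and names are illustrative, not from the slides:

```python
# 2-way set-associative cache with LRU replacement. Each set is an
# OrderedDict mapping tag -> block data; the first entry is the LRU line.
from collections import OrderedDict

N_SETS = 4
WAYS = 2
BLOCK = 4
sets = [OrderedDict() for _ in range(N_SETS)]

def lookup(addr, memory):
    block = addr // BLOCK
    s = sets[block % N_SETS]         # same hash as direct mapping
    tag = block // N_SETS
    if tag in s:                     # associative search within the set
        s.move_to_end(tag)           # mark as most recently used
        return "hit", s[tag][addr % BLOCK]
    if len(s) == WAYS:
        s.popitem(last=False)        # set full: evict the LRU line
    start = block * BLOCK
    s[tag] = memory[start:start + BLOCK]
    return "miss", s[tag][addr % BLOCK]

memory = list(range(128))
print(lookup(0, memory))    # ('miss', 0)
print(lookup(64, memory))   # ('miss', 64)  same set, second way
print(lookup(0, memory))    # ('hit', 0)    both blocks stay resident
```

Under direct mapping the third access would have missed, since addresses 0 and 64 compete for one slot; the extra way absorbs the conflict.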

  20. Writing back to the memory • only write to memory if the cache data has been modified
• write back (write-deferred): use a modified bit; when swapping out a cache slot or ending the job, write the slot back if its modified bit is set
• write through: whenever modifying data, always write it back to main memory as well
• write through is needed if memory is being shared in a DMA or multiprocessing system
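A minimal sketch of the write-back policy, with an illustrative line structure (the line records which memory block it holds):

```python
# Write-back: a write updates only the cache line and sets its modified
# ("dirty") bit; main memory is updated only when the line is evicted.
def write_word(line, offset, word):
    line["value"][offset] = word
    line["dirty"] = True                 # defer the memory write

def evict(line, memory, block_size=4):
    if line["dirty"]:                    # write back only if modified
        start = line["block"] * block_size
        memory[start:start + block_size] = line["value"]
    line["dirty"] = False

memory = [0] * 16
line = {"block": 2, "value": [9, 9, 9, 9], "dirty": False}
write_word(line, 1, 42)
evict(line, memory)
print(memory[8:12])   # [9, 42, 9, 9]
```

A clean (unmodified) line is evicted without any memory traffic, which is the saving over write-through.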

  21. Example: direct mapping • 4-byte blocks • 1-byte words • 8 slots in cache

  22. Example (cont)

  23. The end
