
COSC 1306 COMPUTER LITERACY FOR SCIENCE MAJORS

COSC 1306 COMPUTER LITERACY FOR SCIENCE MAJORS. Jehan-François Pâris jfparis@uh.edu. COSC 1306—COMPUTER SCIENCE AND PROGRAMMING COMPUTER ORGANIZATION. Module Overview. We will focus on the main challenges of computer architecture Managing the I/O hierarchy


Presentation Transcript


  1. COSC 1306 COMPUTER LITERACY FOR SCIENCE MAJORS Jehan-François Pâris jfparis@uh.edu COSC 1306—COMPUTER SCIENCE AND PROGRAMMING COMPUTER ORGANIZATION

  2. Module Overview • We will focus on the main challenges of computer architecture • Managing the I/O hierarchy: caching, multiprogramming, virtual memory • Speeding up the CPU: pipelined and multicore architectures • Protecting user computations and data: memory protection, privileged instructions

  3. THE MEMORY HIERARCHY

  4. The memory hierarchy (I) • CPU registers • Main memory (RAM) • Secondary storage (disks) • Mass storage (often offline)

  5. CPU registers • Inside the processor itself • Some can be accessed by our programs • Others cannot • Can be read or written in one processor cycle • If processor speed is 2 GHz • 2,000,000,000 cycles per second • 2 cycles per nanosecond
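The cycle arithmetic on this slide can be checked with a few lines of Python. This is only a sketch; the function names `cycles_per_ns` and `cycle_time_ns` are illustrative and do not come from the course material.

```python
def cycles_per_ns(clock_hz):
    """Number of clock cycles completed in one nanosecond."""
    return clock_hz / 1e9


def cycle_time_ns(clock_hz):
    """Duration of a single clock cycle, in nanoseconds."""
    return 1e9 / clock_hz


# A 2 GHz processor: 2,000,000,000 cycles per second.
print(cycles_per_ns(2e9))   # 2.0 cycles per nanosecond
print(cycle_time_ns(2e9))   # 0.5 nanoseconds per cycle
```

A register access that takes one cycle therefore takes about half a nanosecond at 2 GHz.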

  6. Main memory (I) • Byte addressable • Each group of 8 bits has an address • Dynamic random access memory (DRAM) • Slower but much cheaper than static RAM • Contents must be refreshed every 64 ms • Otherwise its contents are lost: • DRAM is volatile

  7. Main memory (II) • Memory is organized as a sequence of 8-bit bytes • Each byte has an address • A byte can hold one character • Roman alphabet with accents • [Figure: sixteen bytes with addresses 0 to 15]

  8. Main memory (III) • Groups of four bytes starting at addresses that are multiples of 4 form words • Better suited to hold numbers • Also have half-words, double words, quad words • [Figure: four words starting at addresses 0, 4, 8, and 12]
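The alignment rule for words can be expressed as simple arithmetic on addresses. The following is a minimal sketch assuming 4-byte words as on the slide; the helper names are hypothetical.

```python
WORD_SIZE = 4  # bytes per word, as stated on the slide


def is_word_aligned(address):
    """True if a word can start at this byte address."""
    return address % WORD_SIZE == 0


def word_start(address):
    """Address of the first byte of the word that contains this byte."""
    return address - (address % WORD_SIZE)


def word_bytes(address):
    """Addresses of all the bytes in the word that contains this byte."""
    start = word_start(address)
    return list(range(start, start + WORD_SIZE))


print(is_word_aligned(12))  # True: 12 is a multiple of 4
print(word_bytes(9))        # [8, 9, 10, 11]
```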

  9. Accessing main memory contents (I) • When we look for some item, our search criteria can include the location of the item • The book on the table • The student behind you, … • More often our main search criterion is some attribute of the item • The color of a folder • The title or the authors of a book • The name of an individual

  10. Accessing main memory contents (II) • Computers always access their memory by location • The byte at address 4095 • The word at location 512 • States the address of the first byte in the word • Why? • Fastest way for them to access an item • [Figure: the word at location 512 occupies bytes 512, 513, 514, and 515]
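Access by location can be imitated with a Python `bytearray`, where the index plays the role of the address. This is a sketch under stated assumptions: the 4096-byte memory size and the little-endian byte order are choices made for the example, and the function names are hypothetical.

```python
# Assumption: a tiny memory of 4096 bytes, addresses 0 to 4095.
memory = bytearray(4096)


def load_byte(address):
    """Return the byte stored at the given address."""
    return memory[address]


def store_word(address, value):
    """Store a 4-byte word whose first byte is at the given address."""
    # Assumption: little-endian order (least significant byte first).
    memory[address:address + 4] = value.to_bytes(4, "little")


def load_word(address):
    """Return the word occupying the 4 bytes that start at the address."""
    return int.from_bytes(memory[address:address + 4], "little")


store_word(512, 1306)      # fills bytes 512, 513, 514, and 515
print(load_word(512))      # the word at location 512
print(load_byte(4095))     # the byte at address 4095
```

The memory never searches for a value: every access names an address, and the hardware returns whatever is stored there.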

  11. An analogy (I) • Some research libraries have a closed-stack policy • Only library employees can access the stacks • Patrons wanting to get an item fill out a form containing a call number specifying the location of the item • Could be a Library of Congress classification number if the stacks are organized that way

  12. An analogy (II) • The procedure followed by the employee fetching the book is fairly simple • Go to the location specified by the book call number • Check if the book is there • Bring it to the patron

  13. An analogy (III) • The memory operates in an even simpler manner • Always fetch the contents of the addressed bytes • Junk or not

  14. Disk drives (I) • Sole part of computer architecture with moving parts: • Data stored on circular tracks of a disk • Spinning speed between 5,400 and 15,000 rotations per minute • Accessed through a read/write head

  15. Disk drives (II) • [Figure: disk drive showing the servo, platter, arm, and R/W head]

  16. Disk drives (III) • Data can be accessed by blocks of 4 KB, 8 KB, … • Depends on disk partition parameters • User selectable • To access a disk block • Read/write head must be over the right track • Seek time • Data to be accessed must pass under the head • Rotational latency

  17. Estimating the rotational latency • On average, half a disk rotation • If disk spins at 15,000 rpm • 250 rotations per second • One rotation takes 4 ms • Half a rotation corresponds to 2 ms • Most desktops have disks that spin at 7,200 rpm • Most notebooks have disks that spin at 5,400 or 7,200 rpm
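The same estimate can be repeated for the other spinning speeds mentioned on the slide. This is a minimal sketch; the function name is illustrative.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a rotation, in ms."""
    rotations_per_second = rpm / 60
    rotation_time_ms = 1000 / rotations_per_second
    return rotation_time_ms / 2


for rpm in (15000, 7200, 5400):
    latency = average_rotational_latency_ms(rpm)
    print(f"{rpm} rpm: {latency:.2f} ms")
```

At 15,000 rpm the function returns the 2 ms computed on the slide; slower disks wait proportionally longer, roughly 4.2 ms at 7,200 rpm and 5.6 ms at 5,400 rpm.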

  18. Accessing disk contents • Each block on a disk has a unique address • Normally a single number • Logical block addressing (LBA) • Older PCs used a different scheme

  19. The memory hierarchy (II)

  20. The memory hierarchy (III) • To make sense of these numbers, let us consider an analogy

  21. Writing a paper (I)

  22. Writing a paper (II)

  23. Writing a paper (III)

  24. Writing a paper (IV)

  25. The two gaps (I) • Gap between CPU and main memory speeds: • Will add intermediary levels • L1, L2, and L3 caches • Will store contents of most recently accessed memory addresses • Most likely to be needed in the future • Purely hardware solution • Software does not see it

  26. Major issues • Huge gaps between • CPU speeds and SDRAM access times • SDRAM access times and disk access times • The two problems have very different solutions • Gap between CPU speeds and SDRAM access times handled by hardware • Gap between SDRAM access times and disk access times handled by a combination of software and hardware

  27. Why? • Having hardware handle an issue • Complicates hardware design • Offers a very fast solution • Standard approach for very frequent actions • Letting software handle an issue • Cheaper • Has a much higher overhead • Standard approach for less frequent actions

  28. Will the problem go away? • It will become worse • RAM access times are not improving as fast as CPU power • Disk access times are limited by rotational speed of disk drive

  29. What are the solutions? • To bridge the CPU/DRAM gap: • Interposing between the CPU and the DRAM smaller, faster memories that cache the data that the CPU currently needs • Cache memories • Managed by the hardware and invisible to the software (OS included)

  30. What are the solutions? • To bridge the DRAM/disk drive gap: • Storing in main memory the data blocks that are currently accessed (I/O buffer) • Managing memory space and disk space as a single resource (Virtual memory) • I/O buffer and virtual memory are managed by the OS and invisible to the user processes

  31. Why do these solutions work? • Locality principle: • Spatial locality: at any time a process only accesses a small portion of its address space • Temporal locality: this subset does not change too frequently

  32. The true memory hierarchy • CPU registers • L1, L2 and L3 caches • Main memory (RAM) • Secondary storage (disks) • Mass storage (often offline)

  33. Handling the CPU/DRAM speed gap

  34. The technology • Caches use faster static RAM (SRAM) • (D flip-flops) • Can have • Separate caches for instructions and data • Great for pipelining • A unified cache

  35. Basic principles • Assume we want to store in a faster memory 2^n words that are currently accessed by the CPU • Can be instructions or data or even both • When the CPU needs to fetch an instruction or load a word into a register • It will look first into the cache • Can have a hit or a miss

  36. Cache hits • Occur when the requested word is found in the cache • Cache avoided a memory access • CPU can proceed

  37. Cache misses • Occur when the requested word is not found in the cache • Will need to access the main memory • Will bring the new word into the cache • Must make space for it by expelling one of the cache entries • Need to decide which one

  38. Cache design challenges • Cache contains a small subset of memory addresses • Must find a very fast access mechanism • No linear search, no binary search • Would like to have an associative memory • Can search by content all memory entries in parallel • Like human brains do

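  39. An associative memory • [Figure: a search for “ice cream” is compared in parallel against stored memories such as “COSC 1306 program”, “Finding a parking spot”, “My last ice cream”, and “Other ice cream moment”; the ice cream memories are marked as found]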

  40. An analogy (I) • Let us go back to our closed-stack library example • Librarians have noted that some books get requested again and again • Want to put them closer to the circulation desk • Would result in much faster service • The problem is how to locate these books • They will not be at the right location!

  41. An analogy (II) • Librarians come up with a great solution • They put behind the circulation desk shelves with 100 book slots numbered from 00 to 99 • Each slot is a home for the most recently requested book that has a call number whose last two digits match the slot number • 3141593 can only go in slot 93 • 1234567 can only go in slot 67

  42. An analogy (III) • Patron: The call number of the book I need is 3141593 • Librarian: Let me see if it's in bin 93

  43. An analogy (IV) • To let the librarian do her job, each slot must contain either • Nothing or • A book and its reference number • There are many books whose reference number ends in 93 or 67 or any two given digits

  44. An analogy (V) • Patron: Could I get this time the book whose call number is 4444493? • Librarian: Sure

  45. An analogy (VI) • This time the librarian will • Go to bin 93 • Find it contains a book with a different call number • She will • Bring that book back to the stacks • Fetch the new book
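The librarian's procedure can be sketched in Python. The dictionary `slots` stands for the 100 numbered book slots behind the circulation desk, and the function name `request_book` and its return strings are made up for this illustration.

```python
slots = {}  # slot number (0 to 99) -> call number of the book kept there


def request_book(call_number):
    """Serve one request and say where the book came from."""
    slot = call_number % 100  # last two digits of the call number
    if slots.get(slot) == call_number:
        return "found behind the desk"
    # Empty slot, or a different book with the same last two digits:
    # that book (if any) goes back to the stacks and the new one
    # takes its place in the slot.
    slots[slot] = call_number
    return "fetched from the stacks"


print(request_book(3141593))  # slot 93 was empty
print(request_book(3141593))  # slot 93 now holds this book
print(request_book(4444493))  # slot 93 holds a different book
```

Comparing the stored call number with the requested one is what lets the librarian tell the two books ending in 93 apart.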

  46. A very basic cache • Has 2^n entries • Each entry contains • A word (4 bytes) • Its memory address • Sole way to identify the word • A bit indicating whether the cache entry contains something useful

  47. A very basic cache (I) • [Figure: cache with eight entries indexed 000 to 111; each entry has a valid bit (Y/N), a tag (the RAM address), and contents (one word)] • Actual caches are much bigger
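A cache of this kind can be simulated in a few lines. This is a sketch, not the design of any actual processor: it assumes that the entry is selected by the low-order bits of the word number, by analogy with the last two digits of a call number, and it models main memory as a dictionary. The class and attribute names are hypothetical.

```python
class BasicCache:
    """Cache with 2**n entries: a valid bit, an address, and a word each."""

    def __init__(self, n, memory):
        self.size = 2 ** n
        self.memory = memory                # word address -> word
        self.valid = [False] * self.size    # does the entry hold anything?
        self.address = [None] * self.size   # tag: address of the cached word
        self.word = [None] * self.size      # contents
        self.hits = 0
        self.misses = 0

    def index_of(self, address):
        # Assumption: the low-order bits of the word number pick the entry.
        return (address // 4) % self.size

    def load(self, address):
        i = self.index_of(address)
        if self.valid[i] and self.address[i] == address:
            self.hits += 1                  # hit: no memory access needed
            return self.word[i]
        self.misses += 1                    # miss: go to main memory
        self.valid[i] = True                # and expel the previous entry
        self.address[i] = address
        self.word[i] = self.memory[address]
        return self.word[i]


# Hypothetical main memory: 32 words at addresses 0, 4, 8, ..., 124.
memory = {address: address * 10 for address in range(0, 128, 4)}
cache = BasicCache(3, memory)
for address in (8, 8, 40, 8):
    cache.load(address)
print(cache.hits, cache.misses)
```

Addresses 8 and 40 map to the same entry in an eight-entry cache, so alternating between them causes a miss every time.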

  48. Multiword cache • [Figure: cache with eight entries indexed 000 to 111; each entry has a valid bit (Y/N), a tag (Address*), and contents made of several words]

  49. Set-associative caches (I) • Can be seen as 2, 4, or 8 caches attached together • Reduces collisions

  50. Back to our library example • What if two books whose call numbers have the same last two digits are often requested on the same day: • Say, 3141593 and 4444493 • Best solution is • Keep the number of book slots equal to 100 • Store more than one book with the same last two digits in the same slot
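The multi-book slot can be sketched as follows. Two points are assumptions made for the example and are not stated on the slide: each slot holds at most two books, and when a slot is full the least recently requested book goes back to the stacks. The names `BOOKS_PER_SLOT` and `request_book` are hypothetical.

```python
BOOKS_PER_SLOT = 2  # assumption: two books share each slot


def request_book(slots, call_number):
    """Return True if the book was already behind the desk."""
    # Each slot is a list of call numbers, least recently requested first.
    slot = slots.setdefault(call_number % 100, [])
    if call_number in slot:
        slot.remove(call_number)   # move the book to the
        slot.append(call_number)   # most recently requested position
        return True
    if len(slot) == BOOKS_PER_SLOT:
        slot.pop(0)                # assumption: expel the oldest request
    slot.append(call_number)
    return False


slots = {}
for call_number in (3141593, 4444493, 3141593, 4444493):
    print(call_number, request_book(slots, call_number))
```

With one book per slot, every request in this sequence after the first would send the librarian back to the stacks; with two books per slot, the third and fourth requests are served from behind the desk.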
