
CS 230: Computer Organization and Assembly Language


Presentation Transcript


  1. CS 230: Computer Organization and Assembly Language Aviral Shrivastava Department of Computer Science and Engineering School of Computing and Informatics Arizona State University Slides courtesy: Prof. Yann Hang Lee, ASU, Prof. Mary Jane Irwin, PSU, Ande Carle, UCB

  2. Announcements • Alternate Project • Due Today • Real Examples • Finals • Tuesday, Dec 08, 2009 • Please come on time (You’ll need all the time) • Open book, notes, and internet • No communication with any other human

  3. Time, Time, Time • Making a single-cycle implementation is very easy • The difficulty and excitement are in making it fast • Two fundamental methods to make computers fast • Pipelining • Caches [Diagram: single-cycle datapath with PC, instruction memory, register file, ALU, and data memory]

  4. Effect of High Memory Latency • Single-cycle implementation • Cycle time becomes very large • Operations that do not need memory also slow down [Diagram: single-cycle datapath with PC, instruction memory, register file, ALU, and data memory]

  5. Effect of High Memory Latency • Multi-cycle implementation • Cycle time becomes long • But • Can make memory access multi-cycle • Avoid the penalty for instructions that do not use memory [Diagram: multi-cycle datapath with PC, shared memory, IR, register file, A and B registers, ALU, ALUout, and MDR]

  6. Effect of High Memory Latency • Pipelined implementation • Cycle time becomes long • But • Can make memory access multi-cycle • Avoid the penalty for instructions that do not use memory • Can overlap execution of other instructions with a memory operation [Diagram: pipeline stages IM, Reg, ALU, DM, Reg]

  7. Kinds of Memory (faster and smaller at the top, larger and slower at the bottom) • Flip-flops / CPU registers: 100s of bytes, <10s of ns • SRAM: KBytes, 10-20 ns, ~$.00003/bit • DRAM: MBytes, 50-100 ns, ~$.00001/bit • Disk: GBytes, ~ms access time, ~10^-6 cents/bit • Tape: essentially infinite capacity, seconds to minutes

  8. Memories • CPU registers, latches • Flip-flops: very fast, but very small • SRAM (Static RAM) • Very fast, low power, but small • Data persists as long as power is supplied • DRAM (Dynamic RAM) • Very dense • Like vanishing ink: data disappears with time • Need to refresh the contents

  9. Flip Flops • Fastest form of memory • Store data using only logic gates (combinational components wired with feedback) • SR, JK, T, D flip-flops

  10. SRAM Cell [Figure: an SRAM cell with complementary bit lines b and b', drawn both in a computer-scientist view and an electrical-engineering view]

  11. A 4-bit SRAM [Figure: four SRAM cells sharing one word line; write drivers place Din 3-0 on the precharged bit lines when WrEn is asserted]

  12. A 16x4 Static RAM (SRAM) [Figure: an address decoder turns A3-A0 into one of word lines 0-15; each word line selects a row of four SRAM cells; write drivers put Din 3-0 onto the bit lines when WrEn is asserted, and sense amps recover Dout 3-0 on reads]
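To make the organization concrete, here is a minimal behavioral sketch in C of the 16x4 array described above (not from the slides). The type and function names (sram16x4_t, sram16x4_write, sram16x4_read) are invented for illustration, and the decoder, precharge, and sense amps are abstracted into plain array indexing.

#include <stdint.h>
#include <stdio.h>

/* Behavioral model of the 16x4 SRAM on the slide: 16 word lines,
 * four one-bit cells per word. The address decoder is modeled by
 * indexing into the array; WrEn gates the write drivers. */
typedef struct {
    uint8_t cell[16];          /* each entry holds one 4-bit word */
} sram16x4_t;

/* Write Din (4 bits) into the word selected by A3..A0 when WrEn is high. */
void sram16x4_write(sram16x4_t *s, uint8_t addr, uint8_t din, int wr_en) {
    if (wr_en)
        s->cell[addr & 0xF] = din & 0xF;
}

/* Read the word selected by A3..A0 (sense amps produce Dout 3..0). */
uint8_t sram16x4_read(const sram16x4_t *s, uint8_t addr) {
    return s->cell[addr & 0xF] & 0xF;
}

int main(void) {
    sram16x4_t s = {{0}};
    sram16x4_write(&s, 0x9, 0xB, 1);                 /* word 9 <- 1011 */
    printf("Dout = 0x%X\n", sram16x4_read(&s, 0x9)); /* prints 0xB */
    return 0;
}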

  13. Dynamic RAM (DRAM) • Value is stored in a capacitor • Discharges with time • Needs to be refreshed regularly • A dummy read will recharge the capacitor • Very high density • The newest process technology is usually tried on DRAMs first • Intel first became popular because of DRAM • Was originally the biggest vendor of DRAM
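As a rough illustration of why refresh is needed (a toy model, not from the slides), the sketch below tracks when a cell was last recharged and treats its contents as lost after an illustrative retention window. The names dram_cell_t, dram_read, and RETENTION_TICKS are invented; real retention times are on the order of tens of milliseconds.

#include <stdio.h>

/* Toy model of a DRAM cell: the stored value is only valid if the cell
 * was refreshed (or read) within RETENTION_TICKS time units. A real
 * DRAM controller walks all rows within the retention window; the
 * constant here is illustrative only. */
#define RETENTION_TICKS 64

typedef struct {
    int value;            /* logical bit stored on the capacitor   */
    int last_refresh;     /* time the capacitor was last recharged */
} dram_cell_t;

/* Reading (even a dummy read) recharges the capacitor via the sense amp. */
int dram_read(dram_cell_t *c, int now) {
    int valid = (now - c->last_refresh) <= RETENTION_TICKS;
    c->last_refresh = now;                 /* the read restores the charge */
    return valid ? c->value : -1;          /* -1: the data has leaked away */
}

int main(void) {
    dram_cell_t c = { .value = 1, .last_refresh = 0 };
    printf("t=10:  %d\n", dram_read(&c, 10));   /* within the window -> 1  */
    printf("t=200: %d\n", dram_read(&c, 200));  /* 190 ticks since the last
                                                   refresh at t=10 -> -1   */
    return 0;
}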

  14. Why Not Only DRAM? • Not large enough for some things • Backed up by storage (disk) • Virtual memory, paging, etc. • Will get back to this • Not fast enough for processor accesses • Takes hundreds of cycles to return data • OK in very regular applications • Can use SW pipelining, vectors • Not OK in most other applications

  15. Is there a problem with DRAM? • Processor-memory performance gap: grows ~50% per year [Plot: processor vs. DRAM performance on a log scale, 1980-2000; CPU performance ("Moore's Law") improves ~60%/yr (2X every 1.5 years) while DRAM improves ~9%/yr (2X every 10 years)]
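A small sketch that reproduces the trend behind the plot: compounding 60% per year for the processor against 9% per year for DRAM widens the gap by roughly 1.60/1.09, about 47% (the slide's "grows 50% / year"), every year. The starting point of 1.0 in 1980 is arbitrary.

#include <stdio.h>

int main(void) {
    double cpu = 1.0, dram = 1.0;   /* normalized performance in 1980 */
    for (int year = 1980; year <= 2000; year++) {
        if (year > 1980) {
            cpu  *= 1.60;   /* uProc: 60%/yr (2X / 1.5 yr) */
            dram *= 1.09;   /* DRAM:   9%/yr (2X / 10 yr)  */
        }
        printf("%d  cpu=%9.1f  dram=%5.2f  gap=%8.1fx\n",
               year, cpu, dram, cpu / dram);
    }
    return 0;
}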

  16. Memory Hierarchy Analogy: Library (1/2) • You’re writing a term paper (Anthropology) at a table in Hayden • Hayden Library is equivalent to disk • essentially limitless capacity • very slow to retrieve a book • Table is memory • smaller capacity: means you must return book when table fills up • easier and faster to find a book there once you’ve already retrieved it

  17. Memory Hierarchy Analogy: Library (2/2) • Open books on table are cache • smaller capacity: can have very few open books fit on table; again, when table fills up, you must close a book • much, much faster to retrieve data • Illusion created: whole library open on the tabletop • Keep as many recently used books open on table as possible since likely to use again • Also keep as many books on table as possible, since faster than going to library

  18. Memory Hierarchy: Goals • Fact: Large memories are slow, fast memories are small • How do we create a memory that gives the illusion of being large, cheap and fast (most of the time)?

  19. Memory Hierarchy: Insights • Temporal locality (locality in time): keep the most recently accessed data items closer to the processor • Spatial locality (locality in space): move blocks consisting of contiguous words to the upper levels [Figure: upper-level and lower-level memory; block X sits in the upper level next to the processor, block Y is fetched from the lower level]
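Spatial locality is easy to see in code. The sketch below is illustrative only (the array size N is arbitrary): it sums the same C array in row-major and column-major order. C lays rows out contiguously, so the first loop uses every word of each cache block it fetches, while the second may use only one word per block.

#include <stdio.h>

#define N 1024
static double a[N][N];   /* C stores this row-major: a[i][0..N-1] are contiguous */

/* Good spatial locality: consecutive accesses touch consecutive words,
 * so every word of a fetched cache block gets used. */
double sum_row_major(void) {
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Poor spatial locality: consecutive accesses are N*sizeof(double) bytes
 * apart, so each one may pull in a block and use only a single word of it. */
double sum_col_major(void) {
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

int main(void) {
    /* Same result either way, but very different cache behavior. */
    printf("%f %f\n", sum_row_major(), sum_col_major());
    return 0;
}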

  20. Memory Hierarchy: Solution (our current focus: the cache level) • Each level: capacity, access time, cost, who manages the staging, and the transfer unit • CPU registers: 100s of bytes, <10s of ns; managed by the program/compiler; transfer unit: instruction operands (1-8 bytes) • Cache: KBytes, 10-100 ns, 1-0.1 cents/bit; managed by the cache controller; transfer unit: blocks (8-128 bytes) • Main memory: MBytes, 200-500 ns, $.0001-.00001 cents/bit; managed by the OS; transfer unit: pages (4K-16K bytes) • Disk: GBytes, ~10 ms (10,000,000 ns), 10^-5 to 10^-6 cents/bit; managed by the user/operator; transfer unit: files (MBytes) • Tape: essentially infinite capacity, seconds to minutes, ~10^-8 cents/bit • Upper levels are smaller and faster; lower levels are larger and slower

  21. Memory Hierarchy: Terminology [Figure: upper-level and lower-level memory; block X is in the upper level, block Y is in the lower level] • Hit: data appears in some block in the upper level (block X) • Hit rate: fraction of memory accesses found in the upper level • Hit time: time to access the upper level, which consists of RAM access time + time to determine hit/miss • Miss: data needs to be retrieved from a block in the lower level (block Y) • Miss rate = 1 - (hit rate) • Miss penalty: time to replace a block in the upper level + time to deliver the block to the processor • Hit time << Miss penalty
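These terms combine into the usual average memory access time formula, AMAT = hit time + miss rate * miss penalty. The small sketch below is illustrative; the function name amat and the example numbers are not from the slides.

#include <stdio.h>

/* Average memory access time built from the terms defined above. */
double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}

int main(void) {
    /* e.g. a 2-cycle hit time, 10% miss rate, 100-cycle miss penalty */
    printf("AMAT = %.1f cycles\n", amat(2.0, 0.10, 100.0));  /* 12.0 */
    return 0;
}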

  22. Memory Hierarchy: Show me the numbers • Consider an application where • 30% of instructions are loads/stores • Suppose memory latency = 100 cycles • Time to execute 100 instructions = 70*1 + 30*100 = 3070 cycles • Add a cache with a 2-cycle latency • Suppose the hit rate is 90% • Time to execute 100 instructions = 70*1 + 27*2 + 3*100 = 70 + 54 + 300 = 424 cycles
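The same arithmetic as the slide, written out as a small C program (a sketch; the variable names are invented). It reproduces the 3070-cycle and 424-cycle totals.

#include <stdio.h>

int main(void) {
    int insts = 100, mem_ops = 30, non_mem = insts - mem_ops;
    int mem_latency = 100;                  /* cycles */

    /* No cache: every load/store pays the full memory latency. */
    int no_cache = non_mem * 1 + mem_ops * mem_latency;

    /* 2-cycle cache, 90% hit rate: 27 hits, 3 misses go to memory. */
    int hits = mem_ops * 90 / 100, misses = mem_ops - hits;
    int with_cache = non_mem * 1 + hits * 2 + misses * mem_latency;

    printf("no cache:   %d cycles\n", no_cache);    /* 3070                */
    printf("with cache: %d cycles\n", with_cache);  /* 70 + 54 + 300 = 424 */
    return 0;
}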

  23. Yoda says… You will find only what you bring in
