
2014-1-21 John Lazzaro (not a prof - “John” is always OK)

www-inst.eecs.berkeley.edu/~cs152/. CS 152 Computer Architecture and Engineering. Lecture 1 – Single Cycle Design. 2014-1-21 John Lazzaro (not a prof - “John” is always OK). TA: Eric Love. Today’s lecture plan. Class Outline. What we’ll be doing this semester. Short Break.


Presentation Transcript


  1. www-inst.eecs.berkeley.edu/~cs152/. CS 152 Computer Architecture and Engineering. Lecture 1 – Single Cycle Design. 2014-1-21. John Lazzaro (not a prof - “John” is always OK). TA: Eric Love.

  2. Today’s lecture plan ... Class Outline. What we’ll be doing this semester. Short Break. Single-cycle processor design. Preliminaries ... prep for Thursday.

  3. Nvidia Tegra K1 Tech Talk: 5:30 PM this Thursday in the Woz. Tegra K1 remixes the Kepler GPU architecture for low-power SoCs.

  4. Nvidia Tegra K1 This class prepares you to be on a team like the one at Nvidia that designed this chip.

  5. Nvidia Tegra K1 This is true even if your goal is to be in the group that designs circuits ... ... or writes software ...for the chip.

  6. Lecture topics: GPU architecture (Apr 15/17); dynamic scheduling (Apr 1/3, 8/10); memory system (February). Chip diagram labels: array of 192 CUDA cores in the Kepler GPU; hierarchical memory system; ARM A15 CPU cores (4+1).

  7. And other topics What do we get to do?

  8. Timeline: For 9 weeks, lectures and labs only. Midterm I, March 18: complete HW1 and take Midterm 1. Midterm II, May 1: complete HW2 and take Midterm 2. Lab 1: Pipelines. Lab 2: Caches. Lab 3: Dynamically-Scheduled CPU Design.

  9. About the labs. Rocket, a RISC-V (“risk five”) chip project. Professor Krste Asanovic directs ASPIRE (microprocessor design research project). CS 152 uses ASPIRE software tools and CPU designs. ASPIRE graduate students take turns with TA duties. RISC-V: a new instruction set architecture (ISA). Extensive software support: gcc port, disassemblers, etc. Chisel: Professor Bachrach’s hardware description language. Labs will use Chisel simulators of RISC-V CPU designs.

  10. Open-source ... on the web ... Warning: It’s tricky to compile ...

  11. Each lab has two parts. Not ‘team’ labs - you work alone. Directed portion: teaches you how to use the tools; helps you understand the material; not doing well puts you in ‘C’ grade territory. Open-ended portion: define a project and work on it for several weeks; “high bar” for an ‘A’ grade (about 10% of class); “solid, competent work” gets you a ‘B’ grade. Falls out of EECS 2.7-3.1 upper-division GPA guidelines.

  12. About exams: Two mid-terms and no final. Mid-term start time TBD

  13. About homeworks:

  14. Discussion sections Go to the section you can make. Focused on labs. TA: Eric Love (ASPIRE graduate student). Essential for doing well in the labs. John does Q&A for lecture materials, midterms, hw. What constitutes ‘cheating’ on labs?

  15. And more generally ...

  16. Required text ... 5th edition only On reserve in library. See class website for readings for each lecture ...

  17. Recommended text ... On reserve in library. Any edition is fine ... whatever you used for 61C.

  18. Administrivia, Part I. Piazza is our all-to-all communication medium. Send John email if he hasn’t contacted you about it. Class website is our archival medium. Lecture slides, labs, due dates ... add ‘/sp14’ to URL. Tools run on EECS instructional machines. Get the account form from Eric in discussion. Laptop/tablet/smartphone in class: fine for note taking and class-related activities. Every lecture will have a short break in the middle. Please wait till the break for heavy-duty multitasking.

  19. Administrivia: Rain Checks. Expect updates soon on the following items. Course grading: breakdown between mid-terms and labs, more details on how we will grade the labs. Office hours: for Eric and John. Deadline policies: our late policies for labs, and procedures if you can’t make it to one of the mid-terms. Wait list: we hope we can let everyone in, but we don’t know for sure yet. If you are planning to drop, email John.

  20. Break

  21. Instruction Set Architecture. Lecture examples will mostly use the MIPS ISA. The labs will use the RISC-V ISA ...

  22. New successful instruction sets are rare. Diagram: software / instruction set / hardware. Implementors suffer with original sins of ISAs, to support the installed base of software.

  23. Instruction Sets: A Thin Interface. Layer diagram labels: Software: Application (iTunes), Operating System (Mac OS X), Compiler, Assembler. Instruction Set Architecture. Hardware: Processor, Memory, I/O system; Datapath & Control; Digital Design; Circuit Design; Transistors. Syntax: ADD $8 $9 $10. Semantics: $8 = $9 + $10. “R-Format” bitfield: opcode | rs | rt | rd | shamt | funct. Field size: 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits. Binary: 000000 | 01001 | 01010 | 01000 | 00000 | 100000. In hexadecimal: 012A4020.
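The field sizes and binary values on this slide can be checked with a minimal Python sketch. The function name `encode_r_format` is hypothetical (it is not from the lecture); the field widths and the ADD $8 $9 $10 values come from the slide.

```python
def encode_r_format(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"field value {value} does not fit in {width} bits")
        # Shift earlier fields left and append this field in the low bits.
        word = (word << width) | value
    return word


# ADD $8 $9 $10: destination rd = 8, sources rs = 9 and rt = 10.
add_word = encode_r_format(opcode=0, rs=9, rt=10, rd=8, shamt=0, funct=0b100000)
print(f"{add_word:08X}")  # prints 012A4020
```

Note that the assembly syntax lists the destination first, while the bitfield places rd after rs and rt.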

  24. Hardware implements semantics ... Syntax: ADD $8 $9 $10. Semantics: $8 = $9 + $10. Instruction Fetch: fetch next inst from memory: 012A4020. Instruction Decode: decode fields (opcode, rs, rt, rd, shamt, funct) to get: ADD $8 $9 $10. Operand Fetch: “retrieve” register values: $9, $10. Execute: add $9 to $10. Result Store: place this sum in $8. Next Instruction: prepare to fetch instruction that follows the ADD in the program.
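The six steps above can be sketched in Python as a single function. This is an illustration under stated assumptions, not the lecture's datapath: `decode_r_format` and `step_add` are hypothetical names, instruction memory is modeled as a dict keyed by byte address, and the sum simply wraps to 32 bits (the real MIPS ADD traps on signed overflow, which this sketch does not model).

```python
def decode_r_format(word):
    """Split a 32-bit R-format word into its six fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


def step_add(pc, imem, regs):
    """Run one ADD through fetch, decode, operand fetch, execute, store, next PC."""
    word = imem[pc]                           # Instruction Fetch
    f = decode_r_format(word)                 # Instruction Decode
    if f["opcode"] != 0 or f["funct"] != 0b100000:
        raise ValueError("this sketch only handles ADD")
    a, b = regs[f["rs"]], regs[f["rt"]]       # Operand Fetch
    total = (a + b) & 0xFFFFFFFF              # Execute (assumption: wrap, no overflow trap)
    if f["rd"] != 0:                          # Result Store ($0 always reads as zero)
        regs[f["rd"]] = total
    return pc + 4                             # Next Instruction


regs = [0] * 32
regs[9], regs[10] = 7, 5
imem = {0: 0x012A4020}                        # ADD $8 $9 $10
next_pc = step_add(0, imem, regs)
print(regs[8], next_pc)  # prints 12 4
```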

  25. ADD syntax & semantics, as seen in the MIPS ISA document.

  26. Memory Instructions: LW $1,32($2). “I-Format” bitfield: opcode | rs | rt | offset. Instruction Fetch: fetch the load inst from memory. Instruction Decode: decode fields to get: LW $1, 32($2). Operand Fetch: “retrieve” register value: $2. Execute: compute memory address: 32 + $2. Result Store: load memory address contents into: $1. Next Instruction: prepare to fetch instr that follows the LW in the program. Depending on load semantics, new $1 is visible to that instr, or not until the following instr (“delayed loads”).
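A minimal Python sketch of the load steps above. The names `sign_extend_16`, `decode_i_format`, and `step_lw` are hypothetical, data memory is modeled as a dict of word values keyed by byte address, and the sketch assumes the non-delayed case, where the new $1 is visible to the very next instruction. The encoding 0x8C410020 uses the standard MIPS LW opcode 100011 with rs = 2, rt = 1, offset = 32.

```python
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed two's-complement offset."""
    return value - 0x10000 if value & 0x8000 else value


def decode_i_format(word):
    """Split a 32-bit I-format word into opcode, rs, rt, and signed offset."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "offset": sign_extend_16(word & 0xFFFF),
    }


def step_lw(pc, imem, regs, dmem):
    """Run one LW: fetch, decode, read base register, add offset, load, next PC."""
    f = decode_i_format(imem[pc])             # Instruction Fetch and Decode
    if f["opcode"] != 0b100011:
        raise ValueError("this sketch only handles LW")
    base = regs[f["rs"]]                      # Operand Fetch: $2
    address = (base + f["offset"]) & 0xFFFFFFFF  # Execute: 32 + $2
    if f["rt"] != 0:                          # Result Store into $1
        regs[f["rt"]] = dmem[address]
    return pc + 4                             # Next Instruction (no load delay modeled)


regs = [0] * 32
regs[2] = 100
dmem = {132: 0xDEADBEEF}
imem = {0: 0x8C410020}                        # LW $1, 32($2)
next_pc = step_lw(0, imem, regs, dmem)
print(hex(regs[1]))  # prints 0xdeadbeef
```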

  27. LW syntax & semantics, as seen in the MIPS ISA document.

  28. Branch Instructions: BEQ $1,$2,25. “I-Format” bitfield: opcode | rs | rt | offset. PC == “Program Counter”. Instruction Fetch: fetch branch inst from memory. Instruction Decode: decode fields to get: BEQ $1, $2, 25. Operand Fetch: “retrieve” register values: $1, $2. Execute: compute if we take branch: $1 == $2 ? Next Instruction: ALWAYS prepare to fetch instr that follows the BEQ in the program (“delayed branch”). IF we take branch, the instr we fetch AFTER that instruction is at PC + 4 + 100 (the offset of 25 instructions times 4 bytes each).
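The next-PC rule above, as a minimal Python sketch. `beq_next_pcs` is a hypothetical helper; it assumes the 16-bit offset has already been sign-extended and counts instructions (4 bytes each), and it returns both the delay-slot address and the address fetched after the delay slot.

```python
def beq_next_pcs(pc, regs, rs, rt, offset):
    """Return (delay_slot_pc, pc_after_delay_slot) for a BEQ located at pc."""
    delay_slot_pc = pc + 4                    # ALWAYS fetched ("delayed branch")
    if regs[rs] == regs[rt]:                  # branch taken
        after = pc + 4 + (offset << 2)        # PC + 4 + offset * 4
    else:                                     # fall through past the delay slot
        after = pc + 8
    return delay_slot_pc, after


regs = [0] * 32
regs[1] = regs[2] = 42                        # $1 == $2, so BEQ $1,$2,25 is taken
taken = beq_next_pcs(0x1000, regs, 1, 2, 25)
regs[2] = 43                                  # now $1 != $2
not_taken = beq_next_pcs(0x1000, regs, 1, 2, 25)
print([hex(a) for a in taken], [hex(a) for a in not_taken])
# prints ['0x1004', '0x1068'] ['0x1004', '0x1008']
```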

  29. BEQ syntax & semantics, as seen in the MIPS ISA document.

  30. define: The Architect’s Contract. To the program, it appears that instructions execute in the correct order defined by the ISA. As each instruction completes, the machine state (regs, mem) appears to the program to obey the ISA. What the machine actually does is up to the hardware designers, as long as the contract is kept.

  31. Single Cycle CPU Design Preliminaries ...

  32. Single cycle data paths: Assumptions. Processor uses synchronous logic design (a “clock”). All state elements act like positive edge-triggered flip flops (inputs D and clk, output Q). Reset ?

  33. Review: Edge-Triggered D Flip Flops. Value of D is sampled on positive clock edge. Q outputs sampled value for rest of cycle. (Timing diagram: CLK, D, Q.) This abstraction is sufficient for the 2014 CS 152!
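The flip-flop abstraction above can be modeled in a few lines of Python. This is a behavioral sketch only: the class name `DFlipFlop` and the `reset_value` parameter are assumptions (the slides leave reset as an open question), and setup/hold timing is ignored.

```python
class DFlipFlop:
    """Behavioral model: Q takes the value of D only on a 0 -> 1 clock transition."""

    def __init__(self, reset_value=0):
        self.q = reset_value                  # assumption: known value at power-up
        self._prev_clk = 0

    def tick(self, clk, d):
        """Present the current clk and D levels; return the value of Q."""
        if self._prev_clk == 0 and clk == 1:  # positive edge: sample D
            self.q = d
        self._prev_clk = clk                  # otherwise Q holds its value
        return self.q


ff = DFlipFlop()
# (clk, d) pairs: D changes while clk is high or low are ignored until the next edge.
trace = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]
outputs = [ff.tick(clk, d) for clk, d in trace]
print(outputs)  # prints [0, 1, 1, 1, 0]
```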

  34. If you are a circuit designer ... (Not required for 2014 CS 152.) A flip-flop “samples” right before the edge, and then “holds” value. (Circuit diagram: sampling circuit, then a circuit that holds the value; D in, Q out.) 16 Transistors: Makes an SRAM look compact! What do we get for the 10 extra transistors? Clocked logic semantics.

  35. If you are a CS 150 veteran ... (Not required for 2014 CS 152.) Value of D is sampled on positive clock edge. Q outputs sampled value for rest of cycle. In Verilog:

        // Positive edge-triggered D flip-flop.
        module ff(D, Q, CLK);
          input D, CLK;
          output Q;
          reg Q;
          // Sample D on each rising edge of CLK; Q holds until the next edge.
          always @ (posedge CLK)
            Q <= D;
        endmodule

  36. define: Single-cycle datapath. All instructions execute in a single cycle of the clock (positive edge to positive edge). Drawbacks: unrealistic hardware assumptions, slow clock period. Advantage: a great way to learn CPUs.
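One way to see why the clock period is slow: every instruction must finish within one cycle, so the period has to cover the slowest instruction. The Python sketch below illustrates this with made-up numbers. All delays and the per-instruction unit lists are hypothetical placeholders, not figures from the lecture.

```python
# Hypothetical delays in picoseconds for each datapath unit (illustration only).
UNIT_DELAY_PS = {"imem": 200, "regread": 100, "alu": 200, "dmem": 200, "regwrite": 100}

# Units each instruction passes through in a single-cycle design (assumed paths).
INSTRUCTION_PATH = {
    "ADD": ["imem", "regread", "alu", "regwrite"],
    "LW": ["imem", "regread", "alu", "dmem", "regwrite"],
    "BEQ": ["imem", "regread", "alu"],
}


def instruction_delay_ps(name):
    """Sum the unit delays along one instruction's path."""
    return sum(UNIT_DELAY_PS[unit] for unit in INSTRUCTION_PATH[name])


# The clock must be slow enough for the longest path, even when running ADD or BEQ.
clock_period_ps = max(instruction_delay_ps(name) for name in INSTRUCTION_PATH)
print(clock_period_ps)  # prints 800
```

With these placeholder numbers the load sets the period, and the shorter ADD and BEQ paths sit idle for part of every cycle.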

  37. Thursday: Complete single-cycle ... and maybe get to other listed topics.
