
Course number: CS141 Who? Tarun Soni ( tsoni@cs.ucsd )

Introduction to Computer Architecture. Course number: CS141 Who? Tarun Soni ( tsoni@cs.ucsd.edu ) TA: Wenjing Rao (wrao@cs) and Eric Liu (xeliu@cs) Where? CENTR: 119 When? M,W @ 6-8:50pm Textbook: Patterson and Hennessy, Computer Organization & Design


Presentation Transcript


  1. Introduction to Computer Architecture Course number: CS141 Who? Tarun Soni ( tsoni@cs.ucsd.edu ) TA: Wenjing Rao (wrao@cs) and Eric Liu (xeliu@cs) Where? CENTR: 119 When? M,W @ 6-8:50pm Textbook: Patterson and Hennessy, Computer Organization & Design The hardware software interface, 2nd edition. Web-page: http://www-cse.ucsd.edu/users/tsoni/cse141 (slides, homework questions, other pointers and information) Office hours: Tarun: Mon. 4pm-6pm: AP&M 3151 Yang Yu and Wenjing Rao: TBD, look on the webpage Tarun Soni, Summer ‘03

  2. Today's Agenda • Administrivia • Technology trends • Computer organization: concept of abstraction • Instruction Set Architectures: Definition, types, examples • Instruction formats: operands, addressing modes • Operations: load, store, arithmetic, logical • Control instructions: branch, jump, procedures • Stacks • Examples: in-line code, procedure, nested-procedures • Other architectures

  3. Schedule (sort of)

  4. Grading • Grade breakdown • Mid-term (1.5 hours) 30% • Final (3 hours) 40% • Pop-quizzes (3, 45 min each, only the 2 highest scores count) 30% • Class participation: extras?? • Can’t make an exam? Tell us early and we will work something out • Homework does not need to be turned in. However, pop-quizzes will be based on the homework. • What is cheating? • Studying together in groups is encouraged • Work must be your own • Common examples of cheating: copying an exam question from other material or another person... • Better off skipping the question (small fraction of grade) • Written/email requests for changes to grades • Average grade will be a B or B+; set expectations accordingly

  5. Why? • You may become a practitioner someday ? • Keeper of Moore’s law • Architecture concepts are core to other sub-systems • Video-processors • Security engines • Routing/Networking etc. • Even if you become a software geek? • Architecture enables a way of thinking • Understanding leads to breadth and better implementation of software Tarun Soni, Summer ‘03

  6. “Computer” of the day • Jacquard loom • late 1700’s • for weaving silk • “Program” on punch cards • “Microcode”: each hole lifts a set of threads • “Or gate”: thread lifted if any controlling hole punched

  7. Trends: Moore’s law

  8. Trends: $1000 will buy you… Tarun Soni, Summer ‘03

  9. Trends: Densities Tarun Soni, Summer ‘03

  10. Technology Source: Intel Journal, May 2002 Tarun Soni, Summer ‘03

  11. Other technology trends • Processor • logic capacity: about 30% per year • clock rate: about 20% per year • Memory • DRAM capacity: about 60% per year (4x every 3 years) • Memory speed: about 10% per year • Cost per bit: about 25% per year • Disk • capacity: about 60% per year [Figure labels: physics advancement, architecture advancement; speed, capacity]
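
A quick sanity check of the "60% per year (4x every 3 years)" figure above, as a minimal sketch. It assumes simple compound growth; the function names are made up for illustration.

```python
import math

def growth_factor(rate_per_year: float, years: float) -> float:
    """Total improvement factor after compounding rate_per_year for years."""
    return (1.0 + rate_per_year) ** years

def doubling_time(rate_per_year: float) -> float:
    """Years needed to double at the given compound annual rate."""
    return math.log(2.0) / math.log(1.0 + rate_per_year)

# DRAM capacity: 60% per year compounds to roughly 4x in 3 years.
print(f"DRAM after 3 years: {growth_factor(0.60, 3):.2f}x")
# The same rate doubles capacity in a bit under 1.5 years.
print(f"Doubling time at 60%/yr: {doubling_time(0.60):.2f} years")
```

The same two helpers can be applied to the other rates on the slide (30%, 20%, 10%, 25%) to compare how quickly each trend compounds.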

  12. SPEC Performance [Figure: SPEC performance over time, with the introduction of RISC marked] • performance now improves about 50% per year (2x every 1.5 years)

  13. Organization: A Basic Computer • Every computer has 5 basic components Computer Control Input Memory Output Datapath Tarun Soni, Summer ‘03

  14. Organization: A Basic Computer • Not all “memory” is created equal • Cache: fast (expensive) memory is placed closer to the processor • Main memory: less expensive memory, so we can have more [Diagram: Proc, Caches, Busses, adapters, Memory, Controllers; I/O Devices: Disks, Displays, Keyboards, Networks] • Input and output (I/O) devices have the messiest organization • Wide range of speed: graphics vs. keyboard • Wide range of requirements: speed, standard, cost ... • Least amount of research (so far)

  15. What is “Computer Architecture”? Computer Architecture = Instruction Set Architecture (how you talk to the machine) + Machine Organization (what the machine looks like) • Computer Architecture and Engineering • Instruction Set Design: interfaces; compiler/system view • Computer Organization: hardware components; logic designer’s view

  16. Architecture? Application Operating System Compiler Firmware Instruction Set Architecture Instr. Set Proc. I/O system Datapath & Control Digital Design Circuit Design Layout • Coordination of many levels of abstraction • Under a rapidly changing set of forces • Design, Measurement, and Evaluation Tarun Soni, Summer ‘03

  17. Levels of abstraction?
     High Level Language Program: temp = v[k]; v[k] = v[k+1]; v[k+1] = temp;
       (Compiler)
     Assembly Language Program: lw $15, 0($2); lw $16, 4($2); sw $16, 0($2); sw $15, 4($2)
       (Assembler)
     Machine Language Program: 0000 1001 1100 0110 1010 1111 0101 1000 / 1010 1111 0101 1000 0000 1001 1100 0110 / 1100 0110 1010 1111 0101 1000 0000 1001 / 0101 1000 0000 1001 1100 0110 1010 1111
       (Machine Interpretation)
     Control Signal Specification: ALUOP[0:3] <= InstReg[9:11] & MASK
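
To see that the four assembly instructions really implement the three-line swap, here is a minimal sketch that replays them on a toy memory. It assumes register $2 holds the byte address of v[k] (implied by the slide but not stated), and it models memory as a dictionary from byte address to word; both are illustrative choices.

```python
def swap_via_lw_sw(memory: dict, base: int) -> dict:
    """Replay the slide's lw/sw sequence; base plays the role of register $2."""
    regs = {}
    regs[15] = memory[base + 0]   # lw $15, 0($2)   temp  = v[k]
    regs[16] = memory[base + 4]   # lw $16, 4($2)   tmp2  = v[k+1]
    memory[base + 0] = regs[16]   # sw $16, 0($2)   v[k]   = tmp2
    memory[base + 4] = regs[15]   # sw $15, 4($2)   v[k+1] = temp
    return memory

# v[k] lives at byte address 100, v[k+1] four bytes later.
mem = {100: 7, 104: 9}
print(swap_via_lw_sw(mem, 100))
```

Note that the offsets 0 and 4 come from the 4-byte word size: consecutive array elements are 4 bytes apart.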

  18. Instruction Set Architecture ISA is the agreed-upon interface between all the software that runs on the machine and the hardware that executes it. software instruction set hardware Tarun Soni, Summer ‘03

  19. Example ISAs
     • IBM360, VAX etc.
     • Digital Alpha (v1, v3), 1992-97
     • HP PA-RISC (v1.1, v2.0), 1986-96
     • Sun Sparc (v8, v9), 1987-95
     • SGI MIPS (MIPS I, II, III, IV, V), 1986-96
     • Intel (8086, 80286, 80386, 80486, Pentium, MMX, ...), 1978-96
     • ARM (ARM7, 8, StrongARM), 1995-
     Digital Signal Processors also have an ISA: TMS320, Motorola, OAK etc.

  20. ISAs Instruction Set Architecture “How to talk to computers if you aren’t in Star Trek” Tarun Soni, Summer ‘03

  21. ISAs • Language of the Machine • More primitive than higher level languages e.g., no sophisticated control flow • Very restrictive e.g., MIPS Arithmetic Instructions • We’ll be working with the MIPS instruction set architecture • similar to other architectures developed since the 1980's • used by NEC, Nintendo, Silicon Graphics, Sony • Design goals: maximize performance and minimize cost, reduce design time Tarun Soni, Summer ‘03

  22. ISAs • Ideally the only part of the machine visible to the programmer/compiler • Available instructions (Opcodes) • Formats • Registers, number and type • Addressing modes, access mechanisms • Exception conditions etc. Tarun Soni, Summer ‘03

  23. Instruction Set Architecture: What Must be Specified? [Cycle diagram: Instruction Fetch, Instruction Decode, Operand Fetch, Execute, Result Store, Next Instruction] ° Instruction format or encoding: how is it decoded? ° Location of operands and result: where other than memory? how many explicit operands? how are memory operands located? which can or cannot be in memory? ° Data type and size ° Operations: what are supported? ° Successor instruction: jumps, conditions, branches; fetch-decode-execute is implicit!
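
The fetch / decode / execute / next-instruction cycle above can be sketched as a small interpreter loop. The three-instruction toy ISA here ("li", "add", "bne", "halt") and its tuple encoding are invented for illustration and are not MIPS.

```python
def run(program, max_steps=1000):
    """Run a toy program; returns the register file when 'halt' is reached."""
    regs = [0] * 8
    pc = 0
    for _ in range(max_steps):
        instr = program[pc]          # instruction fetch
        op, *args = instr            # instruction decode
        next_pc = pc + 1             # default successor: fall through
        if op == "li":               # load immediate
            rd, imm = args
            regs[rd] = imm
        elif op == "add":            # operand fetch, execute, result store
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt]
        elif op == "bne":            # branch changes the successor
            rs, rt, target = args
            if regs[rs] != regs[rt]:
                next_pc = target
        elif op == "halt":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc = next_pc                 # next instruction
    raise RuntimeError("step limit reached")

# Sum 1..5 into r1: r2 counts up, r3 holds the constant 1, r4 the limit.
program = [
    ("li", 1, 0), ("li", 2, 0), ("li", 3, 1), ("li", 4, 5),
    ("add", 2, 2, 3),      # index 4: r2 = r2 + 1
    ("add", 1, 1, 2),      # r1 = r1 + r2
    ("bne", 2, 4, 4),      # loop back to index 4 while r2 != r4
    ("halt",),
]
print(run(program)[1])
```

Everything the loop needs to know (field layout, operand locations, which operations exist, how the successor is chosen) is exactly the list of things the ISA has to specify.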

  24. Vocabulary • superscalar processor -- can execute more than one instruction per cycle. • cycle -- smallest unit of time in a processor. • parallelism -- the ability to do more than one thing at once. • pipelining -- overlapping parts of a large task to increase throughput without decreasing latency.
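
A worked sketch of the pipelining definition above. It assumes an idealized pipeline in which every stage takes one cycle and nothing ever stalls (assumptions made for the arithmetic, not claims from the slide).

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction runs all stages before the next one starts."""
    return n_instructions * n_stages

def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """The first instruction fills the pipeline; then one finishes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)

n, k = 100, 5
print("unpipelined:", unpipelined_cycles(n, k), "cycles")
print("pipelined:  ", pipelined_cycles(n, k), "cycles")
print("latency of one instruction in both cases:", k, "cycles")
```

Throughput approaches one instruction per cycle, while each individual instruction still takes k cycles from start to finish.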

  25. ISA Decisions [Annotated example: y = x + b, labeling the destination, the operands, and the operation] • operations • how many? • which ones? • operands • how many? • location • types • how to specify? • instruction format • size • how many formats? (e.g., add r1, r2, r5) How does the computer know what 0001 0100 1101 1111 means?

  26. Crafting an ISA • We’ll look at some of the decisions facing an instruction set architect, and • how those decisions were made in the design of the MIPS instruction set. • MIPS, like SPARC, PowerPC, and Alpha AXP, is a RISC (Reduced Instruction Set Computer) ISA. • fixed instruction length • few instruction formats • load/store architecture • RISC architectures worked because they enabled pipelining. They continue to thrive because they enable parallelism. Tarun Soni, Summer ‘03

  27. Basic types of ISAs
     Accumulator (1 register):
       1 address      add A          acc ← acc + mem[A]
       1+x address    addx A         acc ← acc + mem[A + x]
     Stack:
       0 address      add            tos ← tos + next
     General Purpose Register:
       2 address      add A B        EA(A) ← EA(A) + EA(B)
       3 address      add A B C      EA(A) ← EA(B) + EA(C)
     Load/Store:
       3 address      add Ra Rb Rc   Ra ← Rb + Rc
                      load Ra Rb     Ra ← mem[Rb]
                      store Ra Rb    mem[Rb] ← Ra
     Comparison: Bytes per instruction? Number of instructions? Cycles per instruction?

  28. Instruction Count: C = A + B
     Accumulator (1 register): Load A; Add B; Store C
     Stack: Push A; Push B; Add; Pop C
     General Purpose Register (register-memory): Load R1,A; Add R1,B; Store C,R1
     Load/Store: Load R1,A; Load R2,B; Add R3,R1,R2; Store C,R3
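
The accumulator and stack sequences above can be checked with two tiny interpreters. This is a sketch only: the tuple encoding of instructions and the dictionary used as memory are assumptions for illustration.

```python
def run_accumulator(program, mem):
    """One implicit register (acc); every instruction names one memory operand."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = mem[addr]
        elif op == "add":
            acc += mem[addr]
        elif op == "store":
            mem[addr] = acc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return len(program)          # instruction count

def run_stack(program, mem):
    """Operands are implicit: 'add' pops two values and pushes their sum."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(mem[instr[1]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "pop":
            mem[instr[1]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    return len(program)

acc_prog = [("load", "A"), ("add", "B"), ("store", "C")]
stack_prog = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]

mem1 = {"A": 2, "B": 3, "C": 0}
mem2 = {"A": 2, "B": 3, "C": 0}
print(run_accumulator(acc_prog, mem1), mem1["C"])
print(run_stack(stack_prog, mem2), mem2["C"])
```

Both compute the same result; they differ in how many instructions are needed and in how many operands each instruction has to name, which is the trade-off the comparison questions on the previous slide point at.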

  29. Instruction Length Variable: Fixed: Hybrid: … MIPS Instructions • All instructions have 3 operands • Operand order is fixed (destination first) • C code: A = B + C • MIPS code: add $s0, $s1, $s2 (registers associated with variables by the compiler)

  30. Instruction Length • Variable-length instructions (Intel 80x86, VAX) require multi-step fetch and decode, but allow for a much more flexible and compact instruction set. • Fixed-length instructions allow easy fetch and decode, and simplify pipelining and parallelism. • All MIPS instructions are 32 bits long. • This decision impacts every other ISA decision we make because it makes instruction bits scarce. • If code size is most important, use variable-length instructions • If performance is most important, use fixed-length instructions • Recent embedded machines (ARM, MIPS) added an optional mode to execute a subset of 16-bit wide instructions (Thumb, MIPS16); choose performance or density per procedure

  31. MIPS Instruction Format
     R-type: OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | sa (5 bits) | funct (6 bits)
     I-type: OP (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
     J-type: OP (6 bits) | target (26 bits)
     • the opcode tells the machine which format
     • so add r1, r2, r3 has
     • opcode=0, funct=32, rs=2, rt=3, rd=1, sa=0
     • 000000 00010 00011 00001 00000 100000
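
The field packing for add r1, r2, r3 can be reproduced with a few shifts. A minimal sketch, assuming the register-format field order OP, rs, rt, rd, sa, funct with widths 6, 5, 5, 5, 5, 6; the function name is made up.

```python
def encode_r_type(opcode: int, rs: int, rt: int, rd: int, sa: int, funct: int) -> int:
    """Pack the six register-format fields into one 32-bit word."""
    fields = [(opcode, 6), (rs, 5), (rt, 5), (rd, 5), (sa, 5), (funct, 6)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value   # append the field on the right
    return word

# add r1, r2, r3: destination rd=1, sources rs=2 and rt=3.
word = encode_r_type(opcode=0, rs=2, rt=3, rd=1, sa=0, funct=32)
print(f"{word:032b}")
print(f"0x{word:08x}")
```

The range check makes the "instruction bits are scarce" point concrete: a 5-bit register field is the reason there are exactly 32 addressable registers.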

  32. Operands • operands are generally in one of two places: • registers (32 int, 32 fp) • memory (2^32 locations) • registers are • easy to specify • close to the processor (fast access) • the idea that we want to access registers whenever possible led to load-store architectures. • normal arithmetic instructions only access registers • only access memory with explicit loads and stores

  33. Load Store Architectures • Load-store architectures can do: add r1=r2+r3 and load r3, M(address) • can’t do: add r1 = r2 + M(address) • forces heavy dependence on registers, which is exactly what you want in today’s CPUs • minus: more instructions • plus: fast implementation (e.g., easy pipelining) • Expect new instruction set architectures to use general purpose registers • Pipelining => expect them to use the load-store variant of GPR ISA

  34. General Purpose Registers ° Advantages of registers • registers are faster than memory • registers are easier for a compiler to use than a stack - e.g., (A*B) – (C*D) – (E*F) can do the multiplies in any order • registers can hold variables - memory traffic is reduced, so the program is sped up - code density improves (since a register is named with fewer bits than a memory location) MIPS Registers • Programmable storage • 2^32 x bytes of memory • 31 x 32-bit GPRs (R0 = 0) • 32 x 32-bit FP regs (paired DP) • HI, LO, PC

  35. Memory Organization • Viewed as a large, single-dimension array, with an address. • A memory address is an index into the array • "Byte addressing" means that the index points to a byte of memory. 0 8 bits of data 1 8 bits of data 2 8 bits of data 3 8 bits of data 4 8 bits of data 5 8 bits of data 6 8 bits of data Tarun Soni, Summer ‘03

  36. Memory Organization • Bytes are nice, but most data items use larger "words" • For MIPS, a word is 32 bits or 4 bytes. • Registers hold 32 bits of data • 2^32 bytes with byte addresses from 0 to 2^32-1 • 2^30 words with byte addresses 0, 4, 8, ... 2^32-4 • Words are aligned, i.e., what are the least 2 significant bits of a word address? [Figure: word addresses 0, 4, 8, 12, ..., each holding 32 bits of data]
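
The alignment question above can be checked directly. A minimal sketch, assuming 4-byte words and byte addressing as on the slide; the helper names are illustrative.

```python
WORD_BYTES = 4            # a MIPS word is 32 bits = 4 bytes
ADDRESS_BITS = 32

def is_word_aligned(addr: int) -> bool:
    """A word address is a multiple of 4, so its low 2 bits are 00."""
    return addr & 0b11 == 0

def word_index(addr: int) -> int:
    """Which word (0, 1, 2, ...) a word-aligned byte address refers to."""
    if not is_word_aligned(addr):
        raise ValueError(f"address {addr} is not word aligned")
    return addr >> 2

last_word = 2**ADDRESS_BITS - WORD_BYTES
for addr in (0, 4, 8, 12, last_word):
    print(f"{addr:>10}  low bits {addr & 0b11:02b}  word #{word_index(addr)}")
print("number of words:", 2**ADDRESS_BITS // WORD_BYTES == 2**30)
```

Because the low two bits of every word address are zero, hardware can treat them as implied, which matters later when branch and jump targets are encoded.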

  37. Data Types • Bit: 0, 1 • Bit String: sequence of bits of a particular length • 4 bits is a nibble • 8 bits is a byte • 16 bits is a half-word • 32 bits is a word • 64 bits is a double-word • Character: ASCII 7 bit code • Decimal: digits 0-9 encoded as 0000b thru 1001b; two decimal digits packed per 8 bit byte • Integers: 2's Complement • Floating Point: Single Precision, Double Precision, Extended Precision; value = M x R^E (M = mantissa, R = base, E = exponent) • How many +/- #'s? Where is decimal pt? How are +/- exponents represented?

  38. Operand Usage Support data sizes and types: 8-bit, 16-bit, 32-bit integers and 32-bit and 64-bit IEEE 754 floating point numbers Tarun Soni, Summer ‘03

  39. Addressing: Endian-ness and alignment • Big Endian: address of most significant byte = word address (xx00 = Big End of word) • IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA • Little Endian: address of least significant byte = word address (xx00 = Little End of word) • Intel 80x86, DEC Vax, DEC Alpha (Windows NT) • Alignment: require that objects fall on an address that is a multiple of their size. [Figure: byte numbering within a word from msb to lsb, little endian 3 2 1 0 and big endian 0 1 2 3; aligned vs. not aligned objects]
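
Byte order and alignment are easy to see with a concrete word. A minimal sketch using Python's built-in int.to_bytes; the sample value 0x0A0B0C0D and the helper names are arbitrary choices for illustration.

```python
def byte_layout(value: int, byteorder: str) -> list:
    """Bytes of a 32-bit word in increasing address order (offsets 0..3)."""
    return list(value.to_bytes(4, byteorder))

def is_aligned(addr: int, size: int) -> bool:
    """Aligned means the address is a multiple of the object's size."""
    return addr % size == 0

word = 0x0A0B0C0D   # most significant byte is 0x0A, least significant is 0x0D

for order in ("big", "little"):
    layout = [f"{b:02x}" for b in byte_layout(word, order)]
    print(f"{order:>6} endian, offsets 0..3: {layout}")

print(is_aligned(8, 4), is_aligned(6, 4), is_aligned(6, 2))
```

In the big-endian layout the byte at the word address is the most significant one; in the little-endian layout it is the least significant one, matching the two definitions above.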

  40. Addressing Modes: how do we specify the operand we want? • Register direct R3 • Immediate (literal) #25 • Direct (absolute) M[10000] • Register indirect M[R3] • Base+Displacement M[R3 + 10000] • if register is the program counter, this is PC-relative • Base+Index M[R3 + R4] • Scaled Index M[R3 + R4*d + 10000] • Autoincrement M[R3++] • Autodecrement M[R3--] • Memory Indirect M[ M[R3] ]

  41. Addressing Modes
     Addressing mode      Example               Meaning
     Register             Add R4,R3             R4 ← R4+R3
     Immediate            Add R4,#3             R4 ← R4+3
     Displacement         Add R4,100(R1)        R4 ← R4+Mem[100+R1]
     Register indirect    Add R4,(R1)           R4 ← R4+Mem[R1]
     Indexed / Base       Add R3,(R1+R2)        R3 ← R3+Mem[R1+R2]
     Direct or absolute   Add R1,(1001)         R1 ← R1+Mem[1001]
     Memory indirect      Add R1,@(R3)          R1 ← R1+Mem[Mem[R3]]
     Auto-increment       Add R1,(R2)+          R1 ← R1+Mem[R2]; R2 ← R2+d
     Auto-decrement       Add R1,–(R2)          R2 ← R2–d; R1 ← R1+Mem[R2]
     Scaled               Add R1,100(R2)[R3]    R1 ← R1+Mem[100+R2+R3*d]
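
The memory-referencing modes in this table differ only in how the effective address is formed. A minimal sketch of that computation; the function, its keyword parameters, and the sample register and memory contents are all hypothetical, and the auto-increment/decrement modes are left out because they also update a register.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, scale=1):
    """Return the memory address an operand refers to under a given mode."""
    if mode == "direct":                 # Mem[1001]
        return disp
    if mode == "register_indirect":      # Mem[R1]
        return regs[base]
    if mode == "displacement":           # Mem[100 + R1]
        return regs[base] + disp
    if mode == "indexed":                # Mem[R1 + R2]
        return regs[base] + regs[index]
    if mode == "scaled":                 # Mem[100 + R2 + R3*d]
        return disp + regs[base] + regs[index] * scale
    if mode == "memory_indirect":        # Mem[Mem[R3]]: one extra memory read
        return mem[regs[base]]
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 100, "R2": 8, "R3": 2}
mem = {100: 400}

print(effective_address("displacement", regs, mem, base="R1", disp=100))
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100, scale=4))
print(effective_address("memory_indirect", regs, mem, base="R1"))
```

Register indirect is just displacement with disp = 0, and direct is displacement with a base of zero, which is why a small set of modes can cover most of the table.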

  42. Addressing Modes: Usage • 3 programs measured on machine with all address modes (VAX) • Displacement: 42% avg, 32% to 55% • Immediate: 33% avg, 17% to 43% • Register deferred (indirect): 13% avg, 3% to 24% • Scaled: 7% avg, 0% to 16% • Memory indirect: 3% avg, 1% to 6% • Misc: 2% avg, 0% to 3% • 75% displacement & immediate • 88% displacement, immediate & register indirect • Similar measurements: 16 bits is enough for the immediate 75 to 80% of the time; 16 bits is enough for the displacement 99% of the time.

  43. Addressing mode usage: Application Specific Tarun Soni, Summer ‘03

  44. MIPS Addressing Modes
     register direct       add $1, $2, $3      format: OP | rs | rt | rd | sa | funct
     immediate             addi $1, $2, 35     format: OP | rs | rt | immediate
     base + displacement   lw $1, disp($2)     format: OP | rs | rt | immediate (address = rs + immediate)
     • register indirect: disp = 0
     • absolute: (rs) = 0
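
Base + displacement can be made concrete by encoding lw $1, disp($2) in the immediate format. A minimal sketch: the lw opcode value 35 is taken from the MIPS instruction set reference rather than from this slide, and the function names are made up.

```python
LW_OPCODE = 35   # 100011 in binary; from the MIPS reference, not from the slide

def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack OP (6 bits), rs (5), rt (5) and a signed 16-bit immediate."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def decode_immediate(word: int) -> int:
    """Recover the signed displacement by sign-extending the low 16 bits."""
    imm = word & 0xFFFF
    return imm - (1 << 16) if imm & 0x8000 else imm

# lw $1, 8($2): base register rs=2, destination rt=1, displacement 8.
print(f"0x{encode_i_type(LW_OPCODE, rs=2, rt=1, imm=8):08x}")
# Register indirect is the same instruction with a displacement of 0.
print(f"0x{encode_i_type(LW_OPCODE, rs=2, rt=1, imm=0):08x}")
# Negative displacements survive the round trip through the 16-bit field.
print(decode_immediate(encode_i_type(LW_OPCODE, rs=2, rt=1, imm=-4)))
```

Using register 0 (always zero) as the base gives the absolute case: the address is then just the sign-extended immediate.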

  45. MIPS ISA-so far • fixed 32-bit instructions • 3 instruction formats • 3-operand, load-store architecture • 32 general-purpose registers (integer, floating point) • R0 always equals 0. • 2 special-purpose integer registers, HI and LO, because multiply and divide produce more than 32 bits. • registers are 32-bits wide (word) • register, immediate, and base+displacement addressing modes But what about the actual instructions themselves ?? Tarun Soni, Summer ‘03

  46. Typical Operations (little change since 1960)
     Data Movement: Load (from memory); Store (to memory); memory-to-memory move; register-to-register move; input (from I/O device); output (to I/O device); push, pop (to/from stack)
     Arithmetic: integer (binary + decimal) or FP; Add, Subtract, Multiply, Divide
     Shift: shift left/right, rotate left/right
     Logical: not, and, or, set, clear
     Control (Jump/Branch): unconditional, conditional
     Subroutine Linkage: call, return
     Interrupt: trap, return
     Synchronization: test & set (atomic r-m-w)
     String: search, translate
     Graphics (MMX): parallel subword ops (4 16bit add)

  47. 80x86 Instruction usage Tarun Soni, Summer ‘03

  48. Instruction usage • Support the simple instructions, since they will dominate the number of instructions executed: load, store, add, subtract, move register-register, and, shift, compare equal, compare not equal, branch, jump, call, return. Compiler Issues • orthogonality: no special registers, few special cases, all operand modes available with any data type or instruction type • completeness: support for a wide range of operations and target applications • regularity: no overloading for the meanings of instruction fields • streamlined: resource needs easily determined • Register assignment is critical too; easier if lots of registers

  49. MIPS Instructions • arithmetic • add, subtract, multiply, divide • logical • and, or, shift left, shift right • data transfer • load word, store word • conditional Branch • unconditional Jump Tarun Soni, Summer ‘03

  50. MIPS Instructions • arithmetic • add, subtract, multiply, divide
     Instruction          Example           Meaning                            Comments
     add                  add $1,$2,$3      $1 = $2 + $3                       3 operands; exception possible
     subtract             sub $1,$2,$3      $1 = $2 – $3                       3 operands; exception possible
     add immediate        addi $1,$2,100    $1 = $2 + 100                      + constant; exception possible
     add unsigned         addu $1,$2,$3     $1 = $2 + $3                       3 operands; no exceptions
     subtract unsigned    subu $1,$2,$3     $1 = $2 – $3                       3 operands; no exceptions
     add imm. unsign.     addiu $1,$2,100   $1 = $2 + 100                      + constant; no exceptions
     multiply             mult $2,$3        Hi, Lo = $2 x $3                   64-bit signed product
     multiply unsigned    multu $2,$3       Hi, Lo = $2 x $3                   64-bit unsigned product
     divide               div $2,$3         Lo = $2 ÷ $3, Hi = $2 mod $3       Lo = quotient, Hi = remainder
     divide unsigned      divu $2,$3        Lo = $2 ÷ $3, Hi = $2 mod $3       Unsigned quotient & remainder
     move from Hi         mfhi $1           $1 = Hi                            Used to get copy of Hi
     move from Lo         mflo $1           $1 = Lo                            Used to get copy of Lo
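
The "exception possible" versus "no exceptions" comments and the Hi/Lo split can be illustrated with a small model of 32-bit registers. This is a sketch under stated assumptions: registers are modeled as Python integers masked to 32 bits, and an overflow exception is modeled by raising OverflowError; the function names mirror the mnemonics but are otherwise made up.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def add(rs: int, rt: int) -> int:
    """Signed add: signals overflow instead of silently wrapping."""
    result = to_signed32(rs) + to_signed32(rt)
    if not -(1 << 31) <= result < (1 << 31):
        raise OverflowError("signed 32-bit overflow")
    return result & MASK32

def addu(rs: int, rt: int) -> int:
    """Same bit-level addition, but the result simply wraps to 32 bits."""
    return (rs + rt) & MASK32

def mult(rs: int, rt: int):
    """Signed multiply: the 64-bit product is split into (Hi, Lo)."""
    product = (to_signed32(rs) * to_signed32(rt)) & 0xFFFFFFFFFFFFFFFF
    return (product >> 32) & MASK32, product & MASK32

def divu(rs: int, rt: int):
    """Unsigned divide: returns (Hi, Lo) = (remainder, quotient)."""
    return (rs & MASK32) % (rt & MASK32), (rs & MASK32) // (rt & MASK32)

print([hex(v) for v in mult(0x10000, 0x10000)])   # product needs 33 bits
print(divu(7, 2))
print(hex(addu(0x7FFFFFFF, 1)))
```

This is why multiply and divide need the two extra registers: the product of two 32-bit values does not fit in one, and divide produces two results at once.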
