1 / 43

CSE 331 Computer Organization and Design Fall 2007 Week 13

CSE 331 Computer Organization and Design Fall 2007 Week 13. Section 1: Mary Jane Irwin ( www.cse.psu.edu/~mji ) Section 2: Krishna Narayanan Course material on ANGEL: cms.psu.edu [ adapted from D. Patterson slides ]. Head’s Up. Last week’s material

iren
Download Presentation

CSE 331 Computer Organization and Design Fall 2007 Week 13

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CSE 331Computer Organization and DesignFall 2007Week 13 Section 1: Mary Jane Irwin (www.cse.psu.edu/~mji) Section 2: Krishna Narayanan Course material on ANGEL: cms.psu.edu [adapted from D. Patterson slides]

  2. Head’s Up • Last week’s material • Multicycle MIPS datapath and control path implementation, microprogramming • This week’s material • Input/Output – dealing with exceptions and interrupts • Reading assignment – PH: 5.6, 8.1, 8.5, A.7-A.8 • Next week’s material • Intro to pipelined datapath design • Reading assignment – PH: 6.1 • Reminders • HW 7 is due Monday, Dec 3rd (by 11:55pm) • Quiz 7 closes Tues., Dec 4th (by 11:55pm) • Final Exam is Tues., Dec 18th, 10:10 to noon, 112 Walker • Dec. 12thdeadline for filing grade corrections/updates

  3. Well, surely a higher college GPA ought to be correlated with career success … the data didn’t bear this out. … fast-trackers have identifiable qualities and show definite trends. One quality is a whatever-it-takes attitude … a willingness to do whatever it takes to make a project succeed … If that means working weekends … they do it. Another quality is a solid, unflappable understanding of all the technologies they are using or developing. … Finally, these high-output types seemed to innately grasp that they are members of a large team … they are the ones always helping everyone else. The Pentium Chronicles, Colwell, pg. 140

  4. Major Components of a Computer • Important metrics for an I/O system • Performance • Compatibility • Expandability and diversity • Dependability • Cost, size, weight Processor Devices Control Output Memory Datapath Input

  5. 8 orders of magnitude range Input and Output Devices • I/O devices are incredibly diverse with respect to • Behavior – input, output or storage • Partner – human or machine • Data rate – the peak rate at which data can be transferred between the I/O device and the main memory or processor

  6. Input/Output in SPIM via System Calls • SPIM provides a small set of operating-system-like services through the syscall instruction • Load the system call code into register $v0 and the arguments into registers $a0 trhough $a3 • Return values are put in register $v0

  7. Communication of I/O Devices and Processor • How the processor directs the I/O devices • Special I/O instructions • Must specify both the device and the command • Memory-mapped I/O • Portions of the high-order memory address space are assigned to each I/O device. Read (lw) and writes (sw) to those memory addresses are interpreted as commands to the I/O devices • Load/stores to the I/O address space done only by the OS • How the I/O device communicates with the processor • Polling – the processor periodically checks the status of an I/O device to determine its need for service • Processor is totally in control – but does all the work • Can waste a lot of processor time due to speed differences • Interrupt-driven – the I/O device issues an interrupts to the processor to indicate that it needs attention

  8. Processor Devices Control Transmitter Memory Datapath Receiver “Real” I/O in SPIM • SPIM supports one memory-mapped I/O device – a terminal with two independent units • Transmitter writes characters to the display • Receiver reads characters from the keyboard

  9. Review: MIPS (spim) Memory Allocation Memory f f f f f f f c Mem Map I/O f f f f 0000 Kernel Code & Data 8000 0080 $sp 7f f e f f fc Stack 230 words Dynamic data $gp 1000 8000 ( 1004 0000) Static data 1000 0000 User Code PC 0040 0000 Reserved 0000 0000

  10. Terminal Receiver (Input) Control with SPIM • Input is controlled via two memory-mapped device registers (i.e., each is a special memory location) 7 0 Receiver data (0xffff0004) unused received byte from keyboard (read only) 1 0 Receiver control (0xffff0000) unused interrupt enable ready (read only) • The keyboard inputs into the Receiver data register which sets the ready bit in the Receiver control register (i.e., the keyboard input is ready to be read by the program) • Reading the next input character from the Receiver data register resets the ready bit in the Receiver control register

  11. Terminal Output Control with SPIM • Output is controlled via two memory-mapped device registers (i.e., each is a special memory location) 7 0 Transmitter data (0xffff000c) unused transmitted byte to display 1 0 Transmitter control (0xffff0008) unused interrupt enable ready (read only) • The display outputs the Transmitter data register character which sets the ready bit in the Transmitter control register (i.e., the display is ready to accept a new output character) • Writing the next character to output into the Transmitter data register resets the ready bit in the Transmitter control register

  12. MIPS I/O Instructions • MIPS has 2 coprocessors: Coprocessor 0 handles exceptions including input and output interrupts, Coprocessor 1 handles floating point • Coprocessors have their own register sets so have instructions to move values between these registers and the CPU’s registers mfc0 rd, rt #move from coprocessor 0 0x10 0 rt rd 0 0 mtc0 rt, rd #move to coprocessor 0 0x10 4 rt rd 0 0

  13. Polling in SPIM • Be sure that memory-mapped I/O is enabled (through the PCSpim “Settings” dialog box) li $t0, 0xffff0000 #recv ctrl li $t1, 0xffff0004 #recv data li $t2, 0xffff0008 #trans ctrl li $t3, 0xffff000c #trans data mtc0 $zero, $12 #disable interrupts I1: lw $t4, 0($t0) #poll recv ready bit andi $t4, $t4, 1 beq $t4, $zero, Il #loop til recv ready lw $t6, 0($t1) #read input character I2: lw $t4, 0($t2) #poll trans ready bit andi $t4, $t4, 1 beq $t4, zero, I2 #loop til trans ready sw $t6, 0($t3) #echo (print) character

  14. The Downsides of Polling • Input and output devices are very slow compared to the processor • These time lags are simulated in SPIM which measures time in instructions executed, not in real clock time • After the transmitter starts to write a character, the transmitter’s ready bit becomes 0. It doesn’t become ready again until the processor has executed a (large) fixed number of instructions. (You don’t want to single step the simulator!) • Polling will execute the “loop til ready” code thousands of times. While the input or output is occurring, nothing else can be done – a waste of resources. • There is a better way

  15. I/O Interrupts • An I/O interrupt is used to signal an I/O request for service • Can have different urgencies (so may need to be prioritized) • Need to identity the device generating the interrupt • An I/O interrupt is asynchronous wrt instr execution • An I/O interrupt is not associated with any instruction and does not prevent any instruction from completion • You can pick your own convenient point to take an interrupt • Advantage • User program progress is only halted during the actual transfer of I/O data to/from user memory space • Disadvantage – special hardware is needed to • Cause an interrupt (I/O device) • Detect an interrupt and save the proper information to resume after servicing the interrupt (processor)

  16. add sub and or beq 2.1 save PC 2.2 jump to interrupt service routine lbu sb ... jr 2.4 return to user code Interrupt Driven Input 1. input interrupt Processor user program Receiver Memory 2.3 service interrupt Keyboard input interrupt service routine memory

  17. Interrupt Driven Input in SPIM • the Receiver indicates with an interrupt that it has input a new character from the keyboard into the Receiver data register • writing to the Receiver data register sets the Receiver control register ready bit to 1 • the user process responds to the interrupt by transferring control to an interrupt service routine that copies the input character into the user memory space • reading the Receiver data register resets the Receiver control register ready bit to 0 received byte Receiver data (0xffff0004) unused 65 Receiver control (0xffff0000) unused 1 1 0 interrupt enable ready

  18. 2.1 save PC 2.2 jump to interrupt service routine 2.4 return to user code Interrupt Driven Output 1.output interrupt add sub and or beq user program Processor 2.3 service interrupt Trnsmttr Memory output interrupt service routine lbu sb ... jr Display memory

  19. Interrupt Driven Output in SPIM • the transmitter indicates with an interrupt that it has successfully output the character in the Transmitter data register in memory to the display • reading from the Transmitter data register sets the Transmitter control register ready bit to 1 • the user process responds to the interrupt by transferring control to an interrupt service routine that writes the next character to output from the user memory space into the Transmitter data register • writing to the Transmitter data register resets the Transmitter control register ready bit to 0 transmitted byte Transmitter data (0xffff000c) unused 65 Transmitter control (0xffff0008) unused 1 1 0 interrupt enable ready

  20. Additions to MIPS ISA for I/O • Coprocessor 0 records the information the software needs to handle exceptions (including interrupts) • EPC (register 14) – holds the address+4 of the instruction that was executing when the exception occurred • Status (register 12) – exception mask and enable bits • Intr Mask = 1 bit for each of 6 hw and 2 sw exception levels (1 enables exception at that level, 0 disables them) • User mode = 0 if running in kernel mode when exception occurred; 1 if running in user mode (fixed at 1 in SPIM) • Excp level = set to 1 (disable exceptions) when an exception occurs; should be reset by exception handler when done • Intr enable = 1 if exception are enabled; 0 if disabled 4 1 0 15 8 Intr Mask User mode Intr enable Excp level

  21. Additions to MIPS ISA, Con’t • Cause (register 13) – exception pending and type bits • PI: bits set if exception occurs but not yet serviced • so can handle more than one exception occurring at same time, or records exception requests when exception are disabled • Exception code:encodes reasons for exception • 0 (INT)  external interrupt (I/O device request) • 4 (AdEL)  address error trap (load or instr fetch) • 5 (AdES)  address error trap (store) • 6 (IBE)  bus error on instruction fetch trap • 7 (DBE)  bus error on data load or store trap • 8 (Sys)  syscall trap • 9 (Bp)  breakpoint trap • 10 (RI)  reserved (or undefined) instruction trap • 12 (Ov)  arithmetic overflow trap 31 15 8 6 2 Branch delay Pending exception (PI) Exception code PI3 = recv intr PI2 = trans intr PI2 = trans intr

  22. MIPS Exception Return Instruction • Exception return – sets the Excp level bit in coprocessor 0’s Status register to 0 (reenabling exception) and returns to the instruction pointed to by coprocessor 0’s EPC register eret #return from exception 0x10 1 0 0 0 0x18

  23. Example I/O Interrupts in SPIM - Enable li $t0, 0xffff0000 #recv ctrl li $t1, 0xffff0004 #recv data li $t2, 0xffff0008 #trans ctrl li $t3, 0xffff000c #trans data mfc0 $t4, $13 andi $t4, $t4, 0xffff00ff #clear Pending interrupt mtc0 $t4, $13 #(PI) bits in Cause reg li $t4, 0x2 sw $t4, 0($t0) #enable recv interrupts sw $t4, 0($t2) #enable trans interrupts mfc0 $t4, $12 ori $t4, $t4, 0xff01 #enable intr and mask mtc0 $t4, $12 #in Status reg #do something useful while I/O is taking place #when I/O interrupts occur transfer control to #exception handler (at address 0x80000180)

  24. Example I/O Interrupts in SPIM - Handler .ktext 0x80000180 mfc0 $t4, $13 #get ExcpCode from Cause srl $t5, $t4, 2 andi $t5, $t5, 0x1f #ExcpCode in $t5, if 0 bne $t5, $zero, excp #then I/O intr has occurred ck_recv: andi $t5, $t4, 0x800 #check for PI3 (input), beq $t5, $zero, ck_trans #if 0,then trans intr I1: lw $t5, 0($t0) #check recv ready andi $t5, $t5, 1 beq $t5, $zero, no_recv_ready lw $t6, 0($t1) #input character into $t6 andi $t4, $t4, 0xfffff7ff #clear PI3 bit in Cause reg mtc0 $t4, $13

  25. Example I/O Interrupts in SPIM – Handler, con’t ck_trans: beq $t1, $zero, ret_hand #no character to echo yet andi $t5, $t4, 0x400 #check for PI2 (output) beq $t5, $zero, ret_hand #if 0, then no trans intr I2: lw $t5, 0($t2) #check trans ready andi $t5, $t5, 1 beq $t5, $zero, no_trans_ready sw $t6, 0($t3) #echo character to display mfc0 $t4, $13 andi $t4, $t4, 0xfffffbff #clear PI2 bit in Cause reg mtc0 $t4, $13 ret_hand: mfc0 $t4, $12 ori $t4, $t4, 0xff01 #enable intr and mask mtc0 $t4, $12 #in Status reg eret #return from intr

  26. “… designed the P6 frontside bus to be transaction-oriented. Chips that connected to the bus were known as bus agents … This transaction orientation is [now] a standard feature of most modern microprocessor buses, despite its complexity and the implication that all bus agents must continuously monitor the bus and track the overall state … it’s a good trade-off between the expense (wires, motherboard routing, and CPU package pins) and inexpensive (transistors on the CPU) The Pentium Chronicles, Colwell, pg. 76

  27. Exception Exceptions in General System Exception Handler • Exception = unprogrammed control transfer • system takes action to handle the exception • must record the address of the offending or next to execute instruction and save (and restore) user state • returns control to user after handling the exception user program normal control flow: sequential, jumps, branches, calls, returns return from exception

  28. Two Types of Exceptions • Interrupts • caused by external events (i.e., request from I/O device) • asynchronous to program execution • may be handled between instructions • simply suspend and resume user program • Traps • caused by internal events • exceptional conditions (e.g., arithmetic overflow, undefined instr.) • errors (e.g., hardware malfunction, memory parity error) • faults (e.g., non-resident page – page fault) • synchronous to program execution • condition must be remedied by the trap handler • instruction may be retried (or simulated) and program continued or program may be aborted

  29. Additions to MIPS ISA for Interrupts • Control signals to write EPC (EPCWrite), Cause and Status (Cause&StatusWrite) • Hardware to record the type of interrupt in Cause • Modify the finite state machine so that • the address of interrupt handler (8000 0180hex) can be loaded into the PC, so must increase the size of PC mux • and save the address of the next instr in EPC

  30. Status Cause MDR Interrupt Modified Multicycle Datapath Interrupt EPCWrite PCWriteCond Cause&StatusWrite PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB MemWrite ALUSrcA EPC MemtoReg RegWrite IRWrite RegDst 8000 0180 PC[31-28] Instr[31-26] Shift left 2 3 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  31. Interrupt Modified FSM Decode 0 IorD = 0 MemRead;IRWrite ALUSrcA = 0 ALUsrcB = 01 ALUOp = 00 PCSource = 00 PCWrite Instr Fetch 1 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 Start (Op = R-type) (Op = beq) 2 (Op = lw or sw) (Op = j) 6 8 9 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCSource = 01 PCWriteCond ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 PCSource = 10 PCWrite Execute (Op = lw) (Op = sw) 3 5 7 Memory Access RegDst = 1 RegWrite MemtoReg = 0 MemWrite IorD = 1 MemRead IorD = 1 Interrupt pending? Cause&StatusWrite EPCWrite;PCWrite IntrOrExcp = 0 PCSource = 11 11 4 RegDst = 0 RegWrite MemtoReg = 1 Write Back

  32. Additions to MIPS ISA for Traps • Control signals to write EPC (EPCWrite&IntrOrExcp), Cause and Status (Cause&StatusWrite) • Hardware to record the type of trap in Cause • Further modify the finite state machine so that • for traps, record the address of the current (offending) instruction in the EPC, so must undo the PC = PC + 4 done during fetch

  33. Status Cause MDR Trap Modified Multicycle Datapath Trap EPCWrite PCWriteCond IntrOrExcp CauseWrite PCWrite PCSource IorD ALUOp MemRead Control ALUSrcB 0 MemWrite ALUSrcA EPC MemtoReg RegWrite 1 IRWrite RegDst 8000 0180 PC[31-28] Instr[31-26] Shift left 2 3 Instr[25-0] 2 0 1 Address Memory 0 PC 0 Read Addr 1 A Read Data 1 IR Register File 1 1 zero Read Addr 2 Read Data (Instr. or Data) 0 ALUout ALU Write Addr Write Data 1 Read Data 2 B 0 1 Write Data 4 1 0 2 01 Instr[15-0] Sign Extend Shift left 2 3 32 ALU control Instr[5-0]

  34. How Control Detects Two Traps • Undefined instruction (RI) – detected when no next state is defined in state 1 (decode) for the opcode value • Define the next state value for all undefined op values as new state 10 • Arithmetic overflow (Ov) – The overflow signal from the ALU is used in state 6 (if don’t want to complete RegWrite) • Need to modify the FSM in a similar fashion for remaining traps • Challenge is to handle the interactions between instructions and exception-causing events so that the control logic remains small and fast • Complex interactions makes the control unit the most challenging aspect of hardware design, especially in pipelined processors

  35. Trap Modified FSM Decode 0 IorD = 0 MemRead;IRWrite ALUSrcA = 0 ALUsrcB = 01 ALUOp = 00 PCSource = 00 PCWrite Instr Fetch 1 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 Start (Op = R-type) (Op = other) (Op = beq) 2 (Op = lw or sw) (Op = j) 6 8 9 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCSource = 01 PCWriteCond ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10 PCSource = 10 PCWrite Execute (Op = lw) (Op = sw) No Overflow 3 5 7 Memory Access 10 Cause&StatusWrite ALUSrcA =0 ALUSrcB = 01 ALUOp = 01 EPCWrite;PCWrite IntrOrExcp = 1 PCSource = 11 RegDst = 1 RegWrite MemtoReg = 0 MemWrite IorD = 1 MemRead IorD = 1 Overflow Interrupt pending? Cause&StatusWrite EPCWrite;PCWrite IntrOrExcp = 0 PCSource = 11 11 4 RegDst = 0 RegWrite MemtoReg = 1 Write Back

  36. bus I/O System Interconnect Issues • Usually have more than one I/O device in the system connected to the processor via a bus • each I/O device is controlled by an I/O Controller interrupt signals Processor Cache Memory Memory - I/O Bus I/O Controller I/O Controller I/O Controller Main Memory Terminal Disk Disk Network

  37. Buses • A bus is a shared communication link (a single set of wires used to connect multiple subsystems) that needs to support a range of devices with widely varying latencies and data transfer rates • Advantages • Versatile – new devices can be added easily and can be moved between computer systems that use the same bus standard • Low cost – a single set of wires is shared in multiple ways • Disadvantages • Creates a communication bottleneck – bus bandwidth limits the maximum I/O throughput • The maximum bus speed is largely limited by • The length of the bus • The number of devices on the bus

  38. I/O Performance Measures • I/O bandwidth (throughput) – amount of information that can be input (output) and communicated across an interconnect (e.g., a bus) to the processor/memory (I/O device) per unit time • How much data can we move through the system in a certain time? • How many I/O operations can we do per unit time? • I/O response time (latency) – the total elapsed time to accomplish an input or output operation • An especially important performance metric in real-time systems • Many applications require both high throughput and short response times

  39. Types of Buses • Processor-memory bus (proprietary) • Short and high speed • Matched to the memory system to maximize the memory-processor bandwidth • Optimized for cache block transfers • Backplane bus (industry standard, e.g., ATA, PCIexpress) • The backplane is an interconnection structure within the chassis • Used as an intermediary bus connecting I/O busses to the processor-memory bus • I/O bus (industry standard, e.g., SCSI, USB, Firewire) • Usually is lengthy and slower • Needs to accommodate a wide range of I/O devices • Connects to the processor-memory bus or backplane bus

  40. Memory Controller Hub (“Northbridge”) DDR SDRAM Main Memory I/O Controller Hub (“Southbridge”) Example: The Pentium 4’s Buses System Bus (“Front Side Bus”): 64b x 800 MHz (6.4GB/s), 533 MHz, or 400 MHz Graphics output: 2.0 GB/s Gbit ethernet: 0.266 GB/s Hub Bus: 8b x 266 MHz 2 serial ATAs: 150 MB/s PCI: 32b x 33 MHz 2 parallel ATA: 100 MB/s 8 USBs: 60 MB/s

  41. Bus Bandwidth Determinates • The bandwidth of a bus is determined by • Whether its is synchronous or asynchronous and the timing characteristics of the protocol used • The bus width (i.e., number of data lines) • Whether the bus supports block transfers or only word at a time transfers

  42. Buses in Transition • Companies are transitioning from synchronous, parallel, wide buses to asynchronous narrow buses • Reflection on wires and clock skew makes it difficult to use 16 to 64 parallel wires running at a high clock rate (e.g., ~400 MHz) so companies are transitioning to buses with a few one-way wires running at a very high “clock” rate (~2 GHz)

  43. ATA Cable Sizes • Serial ATA cables (red) are much thinner than parallel ATA cables (green)

More Related