
CSE 8383 - Advanced Computer Architecture

CSE 8383 - Advanced Computer Architecture. Week-3 Week of Jan 26, 2004 engr.smu.edu/~rewini/8383. Contents. Linear Pipelines Nonlinear pipelines Instruction Pipelines Arithmetic Operations Design of Multifunction Pipeline. Linear Pipeline. Processing Stages are linearly connected


Presentation Transcript


  1. CSE 8383 - Advanced Computer Architecture Week-3 Week of Jan 26, 2004 engr.smu.edu/~rewini/8383

  2. Contents • Linear Pipelines • Nonlinear pipelines • Instruction Pipelines • Arithmetic Operations • Design of Multifunction Pipeline

  3. Linear Pipeline • Processing Stages are linearly connected • Perform fixed function • Synchronous Pipeline • Clocked latches between Stage i and Stage i+1 • Equal delays in all stages • Asynchronous Pipeline (Handshaking)

  4. Latches (figure: stages S1, S2, S3 separated by latches L1, L2) • Slowest stage determines delay • Equal delays → clock period

  5. Reservation Table Time S1 S2 S3 S4

  6. 5 tasks on 4 stages Time S1 S2 S3 S4
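
The reservation table for this slide did not survive in the transcript. The following is a minimal sketch of how such a space-time table could be generated, assuming one task is initiated per clock cycle and every stage takes exactly one cycle. The function names (`linear_schedule`, `total_cycles`, `render`) are illustrative and not from the lecture.

```python
def linear_schedule(num_tasks, num_stages):
    """Map each stage to {cycle: task} for a linear pipeline (all 1-based).

    Assumption: task t enters stage 1 in cycle t and advances one stage per cycle.
    """
    table = {stage: {} for stage in range(1, num_stages + 1)}
    for task in range(1, num_tasks + 1):
        for stage in range(1, num_stages + 1):
            table[stage][task + stage - 1] = task
    return table


def total_cycles(num_tasks, num_stages):
    """Cycles to finish all tasks: fill the pipeline once, then one task per cycle."""
    return num_stages + num_tasks - 1


def render(table, cycles):
    """Draw the table as text: one row per stage, one column per cycle."""
    rows = []
    for stage in sorted(table):
        cells = [str(table[stage].get(cycle, ".")) for cycle in range(1, cycles + 1)]
        rows.append(f"S{stage} " + " ".join(cells))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render(linear_schedule(5, 4), total_cycles(5, 4)))
```

Under these assumptions, 5 tasks on 4 stages occupy 4 + 5 - 1 = 8 cycles, and each stage holds a different task in every cycle once the pipeline is full.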

  7. Non Linear Pipelines • Variable functions • Feed-Forward • Feedback

  8. 3 stages & 2 functions Y X S1 S2 S3

  9. Reservation Tables for X & Y S1 S2 S3 S1 S2 S3

  10. Linear Instruction Pipelines • Assume the following instruction execution phases: • Fetch (F) • Decode (D) • Operand Fetch (O) • Execute (E) • Write results (W)

  11. Pipeline Instruction Execution F D O E W

  12. Dependencies • Data Dependency (Operand is not ready yet) • Instruction Dependency (Branching) • Will that cause a problem?

  13. Data Dependency I1 -- Add R1, R2, R3 I2 -- Sub R4, R1, R5 1 2 3 4 5 6 F D O E W
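
The timing problem on this slide can be checked with a short sketch. It assumes the five phases F, D, O, E, W each take one cycle, that a register is read in phase O and written in phase W, and that the first operand of each instruction is its destination (so I1 writes R1 and I2 reads R1). These conventions and the helper names are assumptions made for illustration.

```python
PHASES = ["F", "D", "O", "E", "W"]


def parse(instr):
    """Split 'Add R1, R2, R3' into (op, destination, sources); destination-first is assumed."""
    op, rest = instr.split(None, 1)
    regs = [r.strip() for r in rest.split(",")]
    return op, regs[0], regs[1:]


def phase_cycle(issue_cycle, phase):
    """Cycle in which an instruction issued at issue_cycle reaches the given phase."""
    return issue_cycle + PHASES.index(phase)


def stalls_needed(producer, consumer, producer_issue, consumer_issue, same_cycle_ok=False):
    """Stall cycles the consumer needs so its operand fetch sees the producer's result."""
    _, dest, _ = parse(producer)
    _, _, sources = parse(consumer)
    if dest not in sources:
        return 0  # no data dependency between the two instructions
    write = phase_cycle(producer_issue, "W")
    read = phase_cycle(consumer_issue, "O")
    earliest_read = write if same_cycle_ok else write + 1
    return max(0, earliest_read - read)


if __name__ == "__main__":
    i1, i2 = "Add R1, R2, R3", "Sub R4, R1, R5"
    print(stalls_needed(i1, i2, 1, 2))        # plain stalling
    print(stalls_needed(i1, i2, 1, 2, True))  # write and read allowed in one cycle
```

With I1 issued in cycle 1 and I2 in cycle 2, I1 writes R1 in cycle 5 while I2 would fetch it in cycle 4, which is why the sketch reports that I2 has to wait.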

  14. Solutions • STALL • Forwarding • Write and Read in one cycle • ….

  15. Instruction Dependency I1 – Branch o I2 – 1 2 3 4 5 6 F D O E W

  16. Solutions • STALL • Predict Branch taken • Predict Branch not taken • ….

  17. Floating Point Multiplication • Inputs (Mantissa1, Exponent1), (Mantissa2, Exponent2) • Add the two exponents → Exponent-out • Multiply the 2 mantissas • Normalize mantissa and adjust exponent • Round the product mantissa to a single-length mantissa. You may need to adjust the exponent
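
The steps above can be sketched in a few lines. This is an illustration only: it assumes mantissas are fractions in [0.5, 1) as returned by Python's `math.frexp`, ignores signs of zero, overflow, and special values, and uses a hypothetical `mantissa_bits` parameter for the single-length mantissa width.

```python
import math


def fp_multiply(a, b, mantissa_bits=24):
    """Multiply two (mantissa, exponent) pairs following the slide's four steps."""
    (m1, e1), (m2, e2) = a, b
    exponent = e1 + e2              # add the two exponents
    mantissa = m1 * m2              # multiply the two mantissas
    # Normalize: a product of two fractions in [0.5, 1) lies in [0.25, 1),
    # so at most one left shift is needed.
    if mantissa != 0 and abs(mantissa) < 0.5:
        mantissa *= 2
        exponent -= 1
    # Round the product to a single-length mantissa.
    scale = 1 << mantissa_bits
    mantissa = round(mantissa * scale) / scale
    # Rounding can carry up to 1.0, in which case the exponent is adjusted.
    if abs(mantissa) >= 1.0:
        mantissa /= 2
        exponent += 1
    return mantissa, exponent


if __name__ == "__main__":
    m, e = fp_multiply(math.frexp(3.0), math.frexp(5.0))
    print(m, e, math.ldexp(m, e))
```

Each commented step corresponds to one bullet on the slide, which is also how the work is split into pipeline stages on the following slides.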

  18. Linear Pipeline for floating-point multiplication (figure; block labels: Add Exponents, Multiply Mantissa, Partial Products, Accumulator, Normalize, Round, Renormalize)

  19. Linear Pipeline for floating-point Addition (figure; block labels: Subtract Exponents, Partial Shift, Add Mantissa, Find Leading 1, Partial Shift, Round, Renormalize)

  20. Combined Adder and Multiplier (figure; stages labeled A to H; block labels: Exponents Subtract / Add, Partial Products, Partial Shift, Add Mantissa, Find Leading 1, Partial Shift, Round, Renormalize)

  21. Reservation Table for Multiply

  22. Reservation Table for Addition

  23. Nonlinear Pipeline Design • Latency: the number of clock cycles between two initiations of a pipeline • Collision: a resource conflict • Forbidden Latencies: latencies that cause collisions

  24. Nonlinear Pipeline Design cont • Latency Sequence: a sequence of permissible latencies between successive task initiations • Latency Cycle: a latency sequence that repeats the same subsequence • Collision vector: C = (Cm, Cm-1, …, C2, C1), m <= n-1, where n = number of columns in the reservation table and Ci = 1 if latency i causes a collision, 0 otherwise
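
The definitions above translate directly into code: a latency is forbidden when some stage is used in two columns that far apart. The sketch below assumes a reservation table is given as a mapping from stage name to the 1-based columns in which that stage is busy. The two tables are hypothetical (the original tables are figures that did not survive in this transcript); they were chosen to have three stages and to reproduce the forbidden latencies quoted later for X (2, 4, 5, 7) and Y (2, 4).

```python
from itertools import combinations

# Hypothetical reservation tables: stage -> columns (time steps) where it is used.
X_TABLE = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
Y_TABLE = {"S1": [1, 5], "S2": [3], "S3": [2, 4, 6]}


def forbidden_latencies(table):
    """Latencies that cause a collision: distances between two marks in the same row."""
    forbidden = set()
    for columns in table.values():
        for early, late in combinations(sorted(columns), 2):
            forbidden.add(late - early)
    return sorted(forbidden)


def collision_vector(table):
    """Return C = (Cm ... C1) as a bit string, m being the largest forbidden latency."""
    forbidden = forbidden_latencies(table)
    m = max(forbidden, default=0)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


if __name__ == "__main__":
    print(forbidden_latencies(X_TABLE), collision_vector(X_TABLE))
    print(forbidden_latencies(Y_TABLE), collision_vector(Y_TABLE))
```

Note that the leftmost bit is Cm and the rightmost is C1, so the string reads in the same order as the vector in the definition.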

  25. Mul – Mul Collision (launch after 1 cycle)

  26. Mul – Mul Collision (launch after 2 cycles)

  27. Mul – Mul Collision (launch after 3 cycles)

  28. Collision Vector for Multiply after Multiply • Forbidden Latencies: 1, 2 • Collision vector: 0 0 0 0 1 1 → 11 • Maximum forbidden latency = 2 → m = 2

  29. Example Y X S1 S2 S3

  30. Reservation Tables for X & Y S1 S2 S3 S1 S2 S3

  31. Reservation Tables for X & Y S1 S2 S3 S1 S2 S3

  32. Forbidden Latencies • X after X • X after Y • Y after X • Y after Y

  33. X after X (figures: reservation tables on S1, S2, S3 for latencies 2 and 5)

  34. X after X (figures: reservation tables on S1, S2, S3 for latencies 4 and 7)

  35. Collision Vector • Forbidden Latencies: 2, 4, 5, 7 • Collision Vector = 1 0 1 1 0 1 0

  36. Y after Y S1 S2 S3 S1 S2 S3

  37. Collision Vector • Forbidden Latencies: 2, 4 • Collision Vector = 1 0 1 0

  38. Exercise – Find the collision vector

  39. State Diagram for X (figure) • States: 1 0 1 1 0 1 0 (initial), 1 0 1 1 0 1 1, 1 1 1 1 1 1 1 • Edge labels: 1*, 3*, 6, 8+
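
A state diagram like this one can be derived mechanically from the collision vector. The sketch below assumes the usual construction: from a state, latency k is permissible when bit Ck is 0; the next state is the current state shifted right by k positions, ORed with the initial collision vector; and any latency greater than m leads back to the initial state. The function name and the dictionary layout are illustrative.

```python
def build_state_diagram(cv):
    """Return {state: {latency_label: next_state}} for a collision-vector bit string."""
    m = len(cv)
    initial = int(cv, 2)

    def as_bits(value):
        return format(value, f"0{m}b")

    diagram = {}
    todo = [initial]
    while todo:
        state = todo.pop()
        if as_bits(state) in diagram:
            continue
        edges = {}
        for k in range(1, m + 1):
            if not (state >> (k - 1)) & 1:       # bit Ck is 0: latency k is allowed
                nxt = (state >> k) | initial     # shift right by k, OR with initial vector
                edges[str(k)] = as_bits(nxt)
                todo.append(nxt)
        edges[f"{m + 1}+"] = as_bits(initial)    # any latency beyond m resets the state
        diagram[as_bits(state)] = edges
    return diagram


if __name__ == "__main__":
    for state, edges in sorted(build_state_diagram("1011010").items()):
        print(state, edges)
```

For the collision vector 1011010 this produces three states, with latency 1 leading to 1111111 and latencies 3 and 6 leading to 1011011.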

  40. Cycles • Simple cycles: each state appears only once: (3), (6), (8), (1, 8), (3, 8), and (6, 8) • Greedy cycles: simple cycles whose edges are all made with minimum latencies from their respective starting states: (1, 8) and (3) → one of them gives the MAL (minimum average latency)
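
The greedy cycles can be found by always taking the smallest permissible latency from each reachable state until a state repeats. The sketch below makes the same assumptions as the standard state-diagram construction (shift right by the latency, OR with the initial collision vector) and treats the "8+" edge as latency m + 1. All function names are illustrative.

```python
from fractions import Fraction


def permissible(state, m):
    """Latencies 1..m whose collision bit is 0 in this state."""
    return [k for k in range(1, m + 1) if not (state >> (k - 1)) & 1]


def reachable_states(initial, m):
    """All states reachable from the initial collision vector."""
    seen, todo = set(), [initial]
    while todo:
        state = todo.pop()
        if state not in seen:
            seen.add(state)
            todo.extend((state >> k) | initial for k in permissible(state, m))
    return seen


def greedy_cycle(start, initial, m):
    """Follow minimum-latency edges from start; return the latencies of the loop reached."""
    first_visit, latencies, state = {}, [], start
    while state not in first_visit:
        first_visit[state] = len(latencies)
        options = permissible(state, m)
        k = options[0] if options else m + 1     # no bit is free: wait m + 1 cycles
        latencies.append(k)
        state = ((state >> k) | initial) if options else initial
    cycle = tuple(latencies[first_visit[state]:])
    # Rotate to a canonical form so (8, 1) and (1, 8) count as the same cycle.
    return min(cycle[i:] + cycle[:i] for i in range(len(cycle)))


def greedy_cycles(cv):
    m, initial = len(cv), int(cv, 2)
    return sorted({greedy_cycle(s, initial, m) for s in reachable_states(initial, m)})


def average_latency(cycle):
    return Fraction(sum(cycle), len(cycle))


if __name__ == "__main__":
    cycles = greedy_cycles("1011010")
    print(cycles, min(average_latency(c) for c in cycles))
```

For 1011010 the greedy cycles come out as (1, 8) and (3), with average latencies 4.5 and 3, so the smaller of the two, 3, is the minimum average latency in this example.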
