2. 2 COMP 206: Computer Architecture and Implementation Montek Singh
Thu, Jan 22, 2009
Lecture 3: Quantitative Principles
3. 3 Quantitative Principles of Computer Design This is an introduction to design and analysis:
Take Advantage of Parallelism
Principle of Locality
Focus on the Common Case
Amdahl’s Law
The Processor Performance Equation
4. 4 1) Taking Advantage of Parallelism (examples) Increase throughput of a server computer via multiple processors or multiple disks
Detailed HW design
Carry-lookahead adders use parallelism to speed up computing sums, from linear to logarithmic time in the number of bits per operand
Multiple memory banks searched in parallel in set-associative caches
Pipelining (next slides)
5. 5 Pipelining Overlap instruction execution…
… to reduce the total time to complete an instruction sequence.
Not every instruction depends on its immediate predecessor
⇒ executing instructions completely or partially in parallel is possible
Classic 5-stage pipeline: 1) Instruction Fetch (Ifetch), 2) Register Read (Reg), 3) Execute (ALU), 4) Data Memory Access (Dmem), 5) Register Write (Reg)
6. 6 Pipelined Instruction Execution
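The timing diagram on this slide was an image and did not survive extraction. As a sketch of the idea it illustrates: under the idealized assumption of one cycle per stage and no stalls, a k-stage pipeline completes n instructions in k + (n - 1) cycles instead of the n·k an unpipelined machine needs:

```latex
% Idealized pipeline timing (assumes one cycle per stage, no hazards)
T_{\text{unpipelined}} = n k
\qquad
T_{\text{pipelined}} = k + (n - 1)
\qquad
\text{Speedup} = \frac{n k}{k + n - 1} \to k \text{ as } n \to \infty
```

For the classic 5-stage pipeline and n = 1000 instructions, that is 5000/1004 ≈ 4.98, close to the ideal 5X.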
7. 7 Limits to pipelining Hazards prevent the next instruction from executing during its designated clock cycle
Structural hazards: attempt to use the same hardware to do two different things at once
Data hazards: instruction depends on the result of a prior instruction still in the pipeline (e.g., an add that writes r1 followed immediately by a sub that reads r1)
Control hazards: caused by the delay between the fetching of instructions and decisions about changes in control flow (branches and jumps)
8. 8 Increasing Clock Rate Pipelining is also used to increase the clock rate
Clock rate is determined by gate delays
9. 9 2) The Principle of Locality
Programs access a relatively small portion of the address space, and tend to reuse recently accessed data.
Two Different Types of Locality:
Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced again soon (e.g., loops, reuse)
Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon (e.g., straight-line code, array access)
For the last 30 years, hardware has relied on locality for memory performance.
The principle of locality states that programs access a relatively small portion of the address space at any instant of time.
This is kind of like in real life, we all have a lot of friends. But at any given time most of us can only keep in touch with a small group of them.
There are two different types of locality: Temporal and Spatial. Temporal locality is the locality in time which says if an item is referenced, it will tend to be referenced again soon.
This is like saying if you just talk to one of your friends, it is likely that you will talk to him or her again soon.
This makes sense. For example, if you just have lunch with a friend, you may say, let’s go to the ball game this Sunday. So you will talk to him again soon.
Spatial locality is the locality in space. It says if an item is referenced, items whose addresses are close by tend to be referenced soon.
Once again, using our analogy. We can usually divide our friends into groups. Like friends from high school, friends from work, friends from home.
Let’s say you just talk to one of your friends from high school and she may say something like: “So did you hear so and so just won the lottery.”
You probably will say NO, I better give him a call and find out more.
So this is an example of spatial locality. You just talked to a friend from your high school days. As a result, you end up talking to another high school friend. Or at least in this case, you hope he still remembers you as his friend.
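To make the two kinds of locality concrete, here is a minimal Python sketch (a hypothetical illustration, not from the slides): summing a matrix row by row touches consecutive elements (spatial locality) while reusing one accumulator on every iteration (temporal locality); the column-by-column version strides a whole row between accesses and typically fares worse on a real memory hierarchy.

```python
# Hypothetical illustration of locality (not from the slides).
# CPython lists hold pointers, so the effect is weaker than in C,
# but the access patterns are the point.

N = 1024
a = [[1.0] * N for _ in range(N)]  # N x N matrix; each row is contiguous

def sum_row_major(m):
    total = 0.0                    # one accumulator reused every iteration: temporal locality
    for row in m:
        for x in row:              # consecutive elements of a row: spatial locality
            total += x
    return total

def sum_col_major(m):
    total = 0.0
    for j in range(N):
        for i in range(N):         # stride of one whole row per access: poor spatial locality
            total += m[i][j]
    return total

print(sum_row_major(a), sum_col_major(a))  # same sum, very different access patterns
```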
10. 10 Levels of the Memory Hierarchy
11. 11 3) Focus on the Common Case In making a design trade-off, favor the frequent case over the infrequent case
e.g., the instruction fetch and decode unit is used more frequently than the multiplier, so optimize it first
e.g., if a database server has 50 disks per processor, storage dependability dominates system dependability, so optimize it first
Frequent case is often simpler and can be done faster than the infrequent case
e.g., overflow is rare when adding two numbers, so improve performance by optimizing the more common case of no overflow
May slow down overflow, but overall performance improved by optimizing for the normal case
What is the frequent case, and how much is performance improved by making that case faster? ⇒ Amdahl’s Law
12. 12 4) Amdahl’s Law (History, 1967) Historical context
Amdahl was demonstrating “the continued validity of the single processor approach and of the weaknesses of the multiple processor approach”
The paper contains no mathematical formulation, just arguments and simulation
“The nature of this overhead appears to be sequential so that it is unlikely to be amenable to parallel processing techniques.”
“A fairly obvious conclusion which can be drawn at this point is that the effort expended on achieving high parallel performance rates is wasted unless it is accompanied by achievements in sequential processing rates of very nearly the same magnitude.”
Nevertheless, the law is widely applicable in all kinds of situations
13. 13 Speedup The book shows two forms of the speedup equation
We will use the second, because it yields “speedup” factors like 2X
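The equations on this slide were images. Reconstructed from the standard definitions, the two forms are the performance ratio and the execution-time ratio:

```latex
\text{Speedup} = \frac{\text{Performance}_{\text{new}}}{\text{Performance}_{\text{old}}}
= \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}}
```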
14. 14 4) Amdahl’s Law
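The formula on this slide was also an image. The standard statement: if an enhancement applies to a fraction f of the original execution time and speeds that fraction up by a factor s, then

```latex
\text{Speedup}_{\text{overall}} = \frac{1}{(1 - f) + \dfrac{f}{s}}
```

Even as s grows without bound, the overall speedup is capped at 1/(1 - f): the fraction you cannot enhance limits the whole.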
15. 15 Amdahl’s Law example New CPU is 10X faster
The server is I/O bound, so 60% of its time is spent waiting for I/O
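Working this through with the formula above: the faster CPU helps only during the 40% of time not spent waiting on I/O, so f = 0.4 and s = 10:

```latex
\text{Speedup}_{\text{overall}} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} \approx 1.56
```

A 10X faster CPU buys only about 1.56X overall.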
16. 16 Amdahl’s Law for Multiple Tasks
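The equation on this slide was an image; the usual generalization reads: if the original execution time splits into fractions f_1, ..., f_n that sum to 1, and fraction f_i runs with speedup s_i (s_i = 1 for unimproved parts), then

```latex
\text{Speedup}_{\text{overall}} = \frac{1}{\displaystyle\sum_{i=1}^{n} \frac{f_i}{s_i}},
\qquad \sum_{i=1}^{n} f_i = 1
```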
17. 17 Example
18. 18 Another Example Note that there are three categories of operations here: floating-point square root operations, other floating-point operations, and non-FP operations. When we change one of these categories, we group the other two categories as the “other case”.
19. 19 Solution using Amdahl’s Law
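The numbers on slides 18 and 19 were images. Assuming the classic textbook version of this example (FP square root accounts for 20% of execution time and all FP operations for 50%; the choice is to speed up FPSQR by 10X or all FP operations by 1.6X), Amdahl’s Law gives:

```latex
\text{Speedup}_{\text{FPSQR}} = \frac{1}{0.8 + \dfrac{0.2}{10}} = \frac{1}{0.82} \approx 1.22
\qquad
\text{Speedup}_{\text{FP}} = \frac{1}{0.5 + \dfrac{0.5}{1.6}} = \frac{1}{0.8125} \approx 1.23
```

The smaller per-operation improvement wins because it applies to the more common case.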
20. 20 Implications of Amdahl’s Law The improvement provided by a feature is limited by how often the feature is used
As stated, Amdahl’s Law is valid only if the system always works at exactly one of the rates
If CPU and I/O operations overlap, Amdahl’s Law as given here is not applicable
Bottleneck is the most promising target for improvements
“Make the common case fast”
Infrequent events, even if each occurrence takes a long time, account for little of the total time and so make little difference to performance
Typical use: change only one parameter of the system, and compute the effect of this change
The same program, with the same input data, should run on the machine in both the before and after cases
21. 21 5) Processor Performance
22. 22 CPI – Clocks per Instruction
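The definition on this slide was an image; the standard one is:

```latex
\text{CPI} = \frac{\text{CPU clock cycles for a program}}{\text{Instruction count}}
```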
23. 23 Details of CPI
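Likewise, the per-class decomposition (IC_i is the count of instructions of class i, CPI_i the cycles per instruction for that class):

```latex
\text{CPI} = \sum_{i=1}^{n} \left( \frac{\text{IC}_i}{\text{Instruction count}} \times \text{CPI}_i \right)
```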
24. 24 Processor Performance Eqn
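The equation itself, reconstructed from its standard form:

```latex
\text{CPU time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}
= \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}
```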
25. 25 Processor Performance Eqn How can we improve performance?
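The three levers are instruction count (ISA and compiler), CPI (organization and ISA), and clock cycle time (technology and organization). A minimal Python sketch of the equation (hypothetical helper and numbers, not from the slides):

```python
# Minimal sketch of the processor performance equation
# (hypothetical helper and numbers, not from the slides):
# CPU time = instruction count x CPI / clock rate.

def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds."""
    return instruction_count * cpi / clock_rate_hz

base = cpu_time(1e9, 2.0, 1e9)          # 1B instructions, CPI 2, 1 GHz -> 2.0 s
faster_clock = cpu_time(1e9, 2.0, 2e9)  # doubling the clock rate halves time -> 1.0 s
better_cpi = cpu_time(1e9, 1.0, 1e9)    # halving CPI does the same -> 1.0 s
print(base, faster_clock, better_cpi)
```

In practice the three factors are interdependent: a change that lowers CPI may lengthen the clock cycle, so improvements must be evaluated through the whole equation.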
26. 26 Example 1
27. 27 Example 1 (Solution)
28. 28 Example 2
29. 29 Example 2 (Solution)
30. 30 Performance of (Blocking) Caches
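The formula on this slide was an image; for blocking caches the standard form folds memory stall cycles into the CPI term:

```latex
\text{CPU time} = \text{IC} \times \left( \text{CPI}_{\text{execution}}
+ \frac{\text{Memory accesses}}{\text{Instruction}} \times \text{Miss rate} \times \text{Miss penalty} \right)
\times \text{Clock cycle time}
```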
31. 31 Example
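The example’s numbers were an image. With purely hypothetical values for illustration (CPI_execution = 1.0, 1.5 memory accesses per instruction, 2% miss rate, 200-cycle miss penalty):

```latex
\text{CPI} = 1.0 + 1.5 \times 0.02 \times 200 = 7.0
```

Memory stalls would dominate: six of every seven cycles.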
32. 32 Fallacies and Pitfalls Fallacies - commonly held misconceptions
When discussing a fallacy, we try to give a counterexample.
Pitfalls - easily made mistakes
Often generalizations of principles that are true in a limited context
We show Fallacies and Pitfalls to help you avoid these errors
33. 33 Fallacies and Pitfalls (1/3) Fallacy: Benchmarks remain valid indefinitely
Once a benchmark becomes popular, tremendous pressure to improve performance by targeted optimizations or by aggressive interpretation of the rules for running the benchmark: “benchmarksmanship.”
Of the 70 benchmarks in the 5 SPEC releases, 70% were dropped from the next release because they were no longer useful
Pitfall: A single point of failure
Rule of thumb for fault-tolerant systems: make sure that every component is redundant, so that no single component failure can bring down the whole system (e.g., power supply)
34. 34 Fallacies and Pitfalls (2/3) Fallacy: Rated MTTF of disks is 1,200,000 hours, or ≈140 years, so disks practically never fail
Disk lifetime is ~5 years ⇒ replace a disk every 5 years; on average, a disk would go through 28 replacement cycles before failing (140 years is a long time!)
Is that meaningful?
Better unit: % that fail in 5 years
Next slide
35. 35 Fallacies and Pitfalls (3/3) So 3.7% will fail over 5 years
But this is under pristine conditions
little vibration, narrow temperature range, no power failures
Real world: 3% to 6% of SCSI drives fail per year
3400 - 6800 FIT or 150,000 - 300,000 hour MTTF [Gray & van Ingen 05]
3% to 7% of ATA drives fail per year
3400 - 8000 FIT or 125,000 - 300,000 hour MTTF [Gray & van Ingen 05]
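A check of the arithmetic behind these two slides: the fraction failing in 5 years at the rated MTTF, and the MTTF/FIT equivalents of a 3% annual failure rate (FIT = failures per 10^9 device-hours):

```latex
\frac{5 \times 8760\,\text{h}}{1{,}200{,}000\,\text{h}} \approx 3.7\%
\qquad
\text{MTTF} \approx \frac{8760\,\text{h}}{0.03} \approx 292{,}000\,\text{h}
\qquad
\text{FIT} = \frac{10^9}{\text{MTTF}} \approx 3400
```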
36. 36 Next Time Instruction Set Architecture
Appendix B
37. 37 References G. M. Amdahl, “Validity of the single processor approach to achieving large scale computing capabilities,” AFIPS Conference Proceedings, pp. 483–485, April 1967
http://www-inst.eecs.berkeley.edu/~n252/paper/Amdahl.pdf