Chapter 4:

22343 - Computer Organization & Design: Assessing & Understanding Performance.

penda

Presentation Transcript


  1. Chapter 4: 22343 - Computer Organization & Design. Assessing & Understanding Performance

  2. Defining Performance
     [Figure: timeline (t = 0 to 10) of Task A and Task B, each made up of Calculate, Save File, and Read File steps.]
     • Response Time (Execution Time): the time it takes to do a task.
     • Throughput: the total amount of work done in a given time.
     • Difference?

  3. Defining Performance
     [Figure: timeline (t = 0 to 10) of Task A and Task B, with the Save File and Read File steps in a different order.]
     • Response Time (Execution Time): the time it takes to do a task.
     • Throughput: the total amount of work done in a given time.
     • Difference?

  4. Measuring Performance
     [Figure: timeline of Calculate, Save File, and Read File steps; the legend marks which portions are CPU Time.]
     • Performance = 1 / Execution Time
     • Response Time = Wall-Clock Time = Elapsed Time
       = Processor Time
       + Memory Access Time
       + Disk and I/O Access Time
       + Operating System Time, etc.

  5. Measuring Performance
     • CPU Time
       • User CPU Time = execute user code
       • System CPU Time = call OS functions, e.g. malloc
     • System Performance: elapsed time on an unloaded system
     • CPU Performance: CPU time
     • Clock Cycles:
       • Clock Period (T)
       • Clock Rate

  6. CPU Performance
     • Program CPU Execution Time = Number of CPU Clock Cycles × Clock Cycle Time (e.g. clocks × nanoseconds)
     • Program CPU Execution Time = Number of CPU Clock Cycles / Clock Rate (e.g. clocks / GHz)
     Exercise: A program takes 10 seconds to run on a 4 GHz CPU. The same program on another CPU would take 20% extra clock cycles, yet it finishes in 6 seconds. What is the other CPU's clock rate?
     CPU-1: # of CPU clock cycles = 10 sec × 4 × 10^9 cycles/sec = 40 × 10^9 cycles
     CPU-2: # of CPU clock cycles = 1.2 × 40 × 10^9 cycles = 48 × 10^9 cycles
     Clock rate = cycles / seconds = 48 × 10^9 cycles / 6 seconds = 8 GHz
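
The exercise on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name `other_clock_rate_hz` and its parameters are illustrative and do not appear on the slide; only the numbers (10 s, 4 GHz, 20% extra cycles, 6 s) come from the exercise text.

```python
def other_clock_rate_hz(time_a_s, rate_a_hz, cycle_factor, time_b_s):
    """Clock rate the second CPU needs in order to finish in time_b_s seconds."""
    cycles_a = time_a_s * rate_a_hz      # cycles = seconds x cycles/second
    cycles_b = cycle_factor * cycles_a   # 20% extra cycles means a factor of 1.2
    return cycles_b / time_b_s           # clock rate = cycles / seconds


# Values taken from the exercise text.
rate_b = other_clock_rate_hz(time_a_s=10.0, rate_a_hz=4e9,
                             cycle_factor=1.2, time_b_s=6.0)
print(f"{rate_b / 1e9:.1f} GHz")  # prints "8.0 GHz"
```

The second CPU needs twice the clock rate because it runs more cycles in less time.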

  7. CPU Performance
     • Clocks Per Instruction, CPI: the average number of clock cycles each instruction takes to execute.
     Exercise: Which computer is faster?
     # of instructions in a program = ___
     CPU-A: CPU Execution Time = ___ × ___ × ___ ps = ___ ps
     CPU-B: CPU Execution Time = ___ × ___ × ___ ps = ___ ps
     Computer A is ( ___ / ___ ) = ___ times faster than B
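
The numbers for this exercise did not survive in the transcript, so the sketch below uses made-up values purely for illustration: an assumed CPI of 2.0 with a 250 ps cycle for computer A, and an assumed CPI of 1.2 with a 500 ps cycle for computer B. It relies on the standard relation CPU time = instruction count × CPI × clock cycle time; the function name `cpu_time_ps` is hypothetical.

```python
def cpu_time_ps(instruction_count, cpi, cycle_time_ps):
    """CPU execution time in picoseconds: instructions x CPI x cycle time."""
    return instruction_count * cpi * cycle_time_ps


# Hypothetical values (NOT from the slide): same program, same instruction count.
n = 1_000_000
time_a = cpu_time_ps(n, cpi=2.0, cycle_time_ps=250)
time_b = cpu_time_ps(n, cpi=1.2, cycle_time_ps=500)

# The faster machine is the one with the smaller execution time.
speedup_a_over_b = time_b / time_a
print(f"A is {speedup_a_over_b:.2f} times faster than B")
```

Because both machines run the same program, the instruction count cancels out of the ratio; only CPI × cycle time matters.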

  8. CPU Performance
     Exercise: Given 3 groups of instructions, A, B and C, it takes a different number of clock cycles to execute an instruction within each group. Given the shown instruction mix, which code sequence is faster to execute?
     Seq1: CPU Execution Time = ___ × ___ + ___ × ___ + ___ × ___ = ___ cycles
     Seq2: CPU Execution Time = ___ × ___ + ___ × ___ + ___ × ___ = ___ cycles
     Seq2 is ___ / ___ = ___ times faster than Seq1
     Seq1 average CPI = ___ = ___ cycles / instruction
     Seq2 average CPI = ___ = ___ cycles / instruction
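
The instruction mix for this exercise is missing from the transcript. As a sketch of the calculation only, the code below assumes hypothetical per-group CPIs (A = 1, B = 2, C = 3) and hypothetical instruction counts for two sequences; none of these values come from the slide, and the helper names are illustrative.

```python
# Hypothetical cycles per instruction for each group (NOT from the slide).
CPI = {"A": 1, "B": 2, "C": 3}

# Hypothetical instruction counts per group for each code sequence.
SEQ1 = {"A": 2, "B": 1, "C": 2}
SEQ2 = {"A": 4, "B": 1, "C": 1}


def total_cycles(mix, cpi):
    """Sum of (instruction count x CPI) over all instruction groups."""
    return sum(count * cpi[group] for group, count in mix.items())


def average_cpi(mix, cpi):
    """Total cycles divided by total instruction count."""
    return total_cycles(mix, cpi) / sum(mix.values())


cycles1, cycles2 = total_cycles(SEQ1, CPI), total_cycles(SEQ2, CPI)
print(cycles1, cycles2)                              # prints "10 9"
print(average_cpi(SEQ1, CPI), average_cpi(SEQ2, CPI))  # prints "2.0 1.5"
```

With these assumed numbers the second sequence executes more instructions yet needs fewer cycles, which is why instruction count alone does not decide which sequence is faster.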

  9. Evaluating Performance
     • Workload: set of user programs to be executed.
     • Benchmark: program specifically chosen to measure performance.
     • Target: benchmarks form a workload that the user hopes will predict the performance of the actual workload.
     • Today: benchmarks are real applications, from various environments.

  10. Evaluating Performance
      Which computer is faster?
      • Total Execution Time
      • Arithmetic Mean
      • Weighted Arithmetic Mean
      Performance B / Performance A = Execution Time A / Execution Time B = ___ seconds / ___ seconds
      Computer B = ___ times faster than Computer A
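
The three summary measures named on this slide can be sketched as follows. The execution times and weights are hypothetical, chosen only to show the arithmetic; the slide's own table is not in the transcript, and the function names are illustrative.

```python
def arithmetic_mean(times):
    """Plain average of the execution times."""
    return sum(times) / len(times)


def weighted_arithmetic_mean(times, weights):
    """Average where each program's time is scaled by its share of the workload."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * t for w, t in zip(weights, times))


# Hypothetical execution times in seconds for two programs on two computers.
times_a = [1.0, 1000.0]
times_b = [10.0, 100.0]

total_a, total_b = sum(times_a), sum(times_b)
# Performance ratio is the inverse of the execution-time ratio.
print(f"B is {total_a / total_b:.1f} times faster than A")  # prints "B is 9.1 times faster than A"

# Hypothetical weights: program 1 makes up 90% of the workload.
weights = [0.9, 0.1]
print(weighted_arithmetic_mean(times_a, weights),
      weighted_arithmetic_mean(times_b, weights))
```

The weighted mean matters when the programs do not run equally often: changing the weights can change which computer looks faster.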

  11. SPEC Benchmarks
      • Standard Performance Evaluation Corporation (originally the System Performance Evaluation Cooperative)
      • CPU Performance
      • Graphics/Workstations Performance
      • High Performance Computing
      • Java Client/Server
      • Mail Servers
      • Network File System
      • Power
      • SIP
      • Virtualization
      • Web Servers

  12. SPEC CPU Benchmarks
      • SPEC CPU2006 Suite
        • CINT2006: 12 Integer Benchmarks
        • CFP2006: 17 Floating Point Benchmarks
      • The benchmarks exercise:
        • CPU
        • Memory Systems
        • Compilers (Fortran, C, C++)
      One benchmark has ½ million lines of C++.

  13. SPEC CPU Benchmarks

  14. CPU Efficiency
      • Core i7: 6073
      • Core 2 Duo: 2321
      • Pentium IV: 539
      • Pentium III: 152
      Is the increase in performance due to higher clocks?

  15. CPU Efficiency
      • Implementation Efficiency
      • Clock-Normalized Scores
      Example:
        Pentium 3 @ 800 MHz → 152
        Pentium 4 @ 3.4 GHz → 539
      Normalized Scores: ___ / ___ = ___ and ___ / ___ = ___
      • It takes more clocks.
      • New instructions: Streaming SIMD Extensions 2 (SSE2).
      • CPI was sacrificed to enhance clock rate.
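
A clock-normalized score divides the benchmark score by the clock rate, so that chips running at different frequencies can be compared on work done per clock. The sketch below uses the two scores and clock rates from the example; normalizing per GHz is an assumption, since the slide's own normalized values are missing from the transcript.

```python
def clock_normalized_score(score, clock_ghz):
    """Benchmark score per GHz of clock rate (unit choice is an assumption)."""
    return score / clock_ghz


# Scores and clock rates as given in the slide's example.
p3 = clock_normalized_score(152, 0.8)   # Pentium 3 @ 800 MHz
p4 = clock_normalized_score(539, 3.4)   # Pentium 4 @ 3.4 GHz

print(f"Pentium 3: {p3:.1f} per GHz")   # prints "Pentium 3: 190.0 per GHz"
print(f"Pentium 4: {p4:.1f} per GHz")   # prints "Pentium 4: 158.5 per GHz"
```

Under this normalization the Pentium 4 scores lower per GHz than the Pentium 3, which fits the slide's remark that CPI was sacrificed to enhance clock rate.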

  16. Amdahl’s Law
      • When introducing an improvement, Execution Time is divided into 2 parts:
        • Affected by the improvement
        • Not affected
      Execution Time After Improvement = (Execution Time Affected / Amount of Improvement) + Execution Time Unaffected
      Example: How much improvement is required for the multiply hardware to make the program run 5 times faster?
      ___ / ___ = ___ / ___ + ___ → Not Possible
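
The formula above can be turned around to ask how much improvement a target speedup would require. The program timings for the slide's example are not in the transcript, so the sketch assumes a hypothetical program that runs for 100 seconds, 80 of which are spent in multiplies; the function names are illustrative.

```python
def time_after_improvement(affected_s, unaffected_s, improvement):
    """Amdahl's Law: only the affected part shrinks by the improvement factor."""
    return affected_s / improvement + unaffected_s


def required_improvement(affected_s, unaffected_s, target_s):
    """Improvement factor needed to reach target_s, or None if unreachable."""
    remaining = target_s - unaffected_s   # time budget left for the affected part
    if remaining <= 0:
        return None                       # the unaffected part alone uses the budget
    return affected_s / remaining


# Hypothetical program (NOT from the slide): 100 s total, 80 s in multiplies.
affected, unaffected = 80.0, 20.0
total = affected + unaffected

print(required_improvement(affected, unaffected, total / 4))  # prints "16.0"
print(required_improvement(affected, unaffected, total / 5))  # prints "None"
```

With these assumed numbers a 5× speedup is out of reach: the 20 seconds that multiplies do not touch already equal the whole target time, which matches the slide's "Not Possible" conclusion.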

  17. MIPS: Million Instructions Per Second
      • No Regard to Instruction Type
      • Instructions Have Different Capabilities
      • Different Computers Have Different Architectures
      • Different MIPS for Different Programs, Same CPU
      • MIPS Can Vary Inversely With Performance
      Example: Which code is faster?

  18. MIPS: Million Instructions Per Second
      CPU Clock Cycles1 = ( ___ + ___ + ___ ) × ___ = ___
      CPU Clock Cycles2 = ( ___ + ___ + ___ ) × ___ = ___
      Execution Time1 = ___ / ___ = ___ seconds
      Execution Time2 = ___ / ___ = ___ seconds
      MIPS1 = ( ___ + ___ + ___ ) million instr / ___ seconds = ___
      MIPS2 = ( ___ + ___ + ___ ) million instr / ___ seconds = ___
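
The calculation outlined on this slide can be sketched as below. The instruction counts, CPIs, and clock rate are hypothetical stand-ins, since the slide's numbers are missing from the transcript; they are chosen to show how MIPS can vary inversely with performance. The function name `exec_time_and_mips` is illustrative.

```python
def exec_time_and_mips(instruction_counts, cpis, clock_rate_hz):
    """Return (execution time in seconds, MIPS rating) for one code version."""
    cycles = sum(n * cpi for n, cpi in zip(instruction_counts, cpis))
    exec_time_s = cycles / clock_rate_hz
    mips = sum(instruction_counts) / (exec_time_s * 1e6)
    return exec_time_s, mips


# Hypothetical values (NOT from the slide).
CLOCK_HZ = 4e9                     # assumed 4 GHz clock
CPIS = (1, 2, 3)                   # assumed CPI for instruction groups A, B, C
CODE1 = (5_000_000_000, 1_000_000_000, 1_000_000_000)
CODE2 = (10_000_000_000, 1_000_000_000, 1_000_000_000)

time1, mips1 = exec_time_and_mips(CODE1, CPIS, CLOCK_HZ)
time2, mips2 = exec_time_and_mips(CODE2, CPIS, CLOCK_HZ)
print(f"Code 1: {time1:.2f} s, {mips1:.0f} MIPS")  # prints "Code 1: 2.50 s, 2800 MIPS"
print(f"Code 2: {time2:.2f} s, {mips2:.0f} MIPS")  # prints "Code 2: 3.75 s, 3200 MIPS"
```

With these assumed numbers the second code version has the higher MIPS rating but the longer execution time, so ranking by MIPS picks the slower code.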

  19. Chapter 4 The End
