Topics 4: Performance Measurement

This presentation provides an introduction to performance measurement in computer systems engineering, covering topics such as metrics, latency vs. throughput, execution time, CPU time, and more.

cordie
Presentation Transcript


  1. Topics 4: Performance Measurement Introduction to Computer Systems Engineering (CPEG 323) \ELEG323-05F\Topic4.ppt

  2. Reading List • Slides: Topic4 • Henn & Patt: Chapter 4 • Other papers as assigned in class or homework

  3. Performance An attempt to quantify how well a particular computer can perform a user’s applications Problems: • Essentially a software+hardware issue • Different machines have different strengths and weaknesses • There is an enormous amount of hype and outright deception in the market – be wary

  4. Conflicting Goals User: Find the most suitable machine to get the job done at the lowest cost → application-oriented metrics Vendor: Persuade you to buy their machine regardless of your needs → hardware-oriented metrics

  5. Why Study Performance? Know the vocabulary and understand the issues, so that: • As a user/buyer, you can make better purchasing decisions • As an engineer, you can make better hardware/software design decisions

  6. Summary of Metrics • Latency and throughput • CPU time, CPI, clock rate and instruction count • MIPS, relative MIPS • SPEC ratio and rate • Benchmarks

  7. Latency vs. Throughput These are two very different metrics! Latency: How long does it take to get a particular task done? - Also called execution time or running time - Usually measured in time (e.g., microseconds) Throughput: How many tasks can you perform in a unit of time? - Also related to bandwidth (communication channels, storage) - Usually measured in units per time (e.g., megabytes/second) Relationship between them

  8. Performance Expressed as Time • Absolute time measures • Difference between start and finish of an operation • Synonyms: running time, elapsed time, completion time, execution time, response time, latency • Relative (normalized) time measures • Running time normalized to some reference time

  9. Choosing a Time-Based Performance Metric • Guiding principle: choose performance measures that track running time • Performance = 1 / Execution time • Higher performance means it takes less time to run the application, so bigger is better

  10. The Nature of Execution Time Execution time on a computer is typically divided into: User time: Time spent executing instructions in the user code System time: Time spent executing instructions in the kernel on behalf of the user code (e.g., opening files) Other: Time when the system is idle or executing other programs Use “time” and “top” commands in Unix to see these

  11. Illustration of Execution Time “Real” or “wall clock” time is the sum of all three. [Figure: wall-clock timeline divided into user time, system time, and other/idle] Warning: File access delays are sometimes counted as “idle” even though they’re yours.
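
The user/system/real split described on the last two slides can also be observed from inside a program. The following is a minimal sketch, assuming Python's standard `os.times()` and `time.perf_counter()`; the `measure` helper and the `busy_loop` workload are hypothetical names introduced here for illustration, not part of the original slides.

```python
import os
import time


def measure(task):
    """Run task() once and return its real, user, and system time in seconds."""
    start_wall = time.perf_counter()
    start_cpu = os.times()
    task()
    end_cpu = os.times()
    end_wall = time.perf_counter()

    real = end_wall - start_wall
    user = end_cpu.user - start_cpu.user
    system = end_cpu.system - start_cpu.system
    return {
        "real": real,
        "user": user,
        "system": system,
        # Whatever is left over: idle time, waiting on I/O, other programs.
        "other": real - (user + system),
    }


def busy_loop():
    # Hypothetical CPU-bound workload used only for illustration.
    total = 0
    for i in range(200_000):
        total += i * i
    return total


timings = measure(busy_loop)
for name, seconds in timings.items():
    print(f"{name:>6}: {seconds:.4f} s")
```

Because the CPU-time counters are usually coarser than the wall clock, the "other" figure can come out slightly negative for very short tasks; it is a rough remainder, not an exact measurement.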

  12. CPU Time vs. Latency - CPU time: The time the CPU spends computing the given task, not including the time spent waiting for I/O or running other programs. • Also known as CPU execution time - Consists of user CPU time and system CPU time. • User CPU time: Total time the CPU spends in the task itself • System CPU time: Total time the CPU spends in the operating system for the sake of the task.

  13. Application Metrics vs. Hardware Metrics How do you relate the application-oriented performance measurements to what is going on inside the machine? Most processors are synchronous, so we can use the clock as a basis.

  14. Clock Cycles • Clock “ticks” refer to clock edges (rising or falling) • Cycle time (period) = time between ticks = seconds per cycle • Clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec) • A 2 GHz clock has a cycle time of: Clock period = 1 / (2000 x 10^6 cycles/sec) x (10^9 nsec / 1 sec) = 0.5 nsec

  15. Measuring Time • If you’re lucky, you can count clock cycles directly; some CPUs have a built-in counter which increments every clock cycle. • If you’re not, you have to use a slower clock. Most systems have extra hardware which generates a regular tick; many operating systems will count these ticks for you. • Timing accuracy limited by the resolution of the clock – you get less accurate readings off a 1Hz clock than a 1MHz clock!
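
To see the resolution limit in practice, one can ask the operating system what its clocks claim to resolve and then check the smallest step actually observed. This is only a sketch, assuming Python's standard `time.get_clock_info()` and `time.perf_counter_ns()`; these are OS-level clocks, not the per-cycle CPU counter the slide mentions, and `smallest_observed_tick` is a hypothetical helper.

```python
import time

# Compare the advertised resolution of several standard Python clocks.
for name in ("time", "monotonic", "perf_counter", "process_time"):
    info = time.get_clock_info(name)
    print(f"{name:>13}: resolution = {info.resolution:.2e} s")


def smallest_observed_tick(samples=1000):
    """Estimate the smallest nonzero step perf_counter_ns() reports, in ns."""
    smallest = None
    for _ in range(samples):
        first = time.perf_counter_ns()
        second = time.perf_counter_ns()
        while second == first:  # spin until the counter advances
            second = time.perf_counter_ns()
        step = second - first
        if smallest is None or step < smallest:
            smallest = step
    return smallest


print("smallest observed perf_counter_ns step:", smallest_observed_tick(), "ns")
```

Any interval shorter than the observed step cannot be timed reliably with that clock; the usual workaround is to repeat the operation many times and divide.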

  16. Cycles and Instructions • In almost all processors, a single instruction (executing one line of assembly code) requires more than one clock cycle. Either: - One instruction must finish before the next can begin - Consecutive instructions may overlap (“pipelining”) • In most processors, different types of instructions may take different numbers of cycles (e.g., integer vs. floating point)

  17. Relating Cycles and Instructions So we can add the following to our vocabulary: • Cycles per instruction (CPI) – smaller is better • Instructions per cycle (IPC) – bigger is better • If the number of cycles needed to execute one instruction varies with the instruction type, then the average CPI or IPC of a program will depend on how many instructions of each type are executed.

  18. Clock, CPI and Instruction Count What each factor depends on: • Clock rate: hardware technology and organization • CPI: instruction set architecture • Instruction count: instruction set architecture and compiler technology - CPI should be measured rather than looked up in manuals. Why? It is affected by many factors (e.g., cache and memory behavior). - The most important measure is time: a lower instruction count may increase the cycles per instruction or the clock cycle time

  19. Example A program requires executing 100 million instructions on a processor which typically takes 2 CPI with a 2GHz clock. How much time will the program take?

  20. Answer CPU time = (Instruction count x CPI) / Clock rate = 1 x 10^8 instructions x (2 cycles / instruction) x (1 second / (2 x 10^9 cycles)) = 0.1 seconds Or you can work backwards from a known execution time and clock rate to calculate the CPI for a given program.
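
The same calculation, in both directions, can be sketched in a few lines of Python. The function names `cpu_time` and `cpi_from_time` are illustrative choices, not from the slides; the numbers are those of the example on slide 19.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def cpi_from_time(cpu_time_s, instruction_count, clock_rate_hz):
    """Work backwards: CPI = CPU time x clock rate / instruction count."""
    return cpu_time_s * clock_rate_hz / instruction_count


# Slide 19: 100 million instructions, 2 CPI, 2 GHz clock.
seconds = cpu_time(100e6, 2, 2e9)
print(f"CPU time: {seconds:.3f} s")

# Recover the CPI from the measured time and the known clock rate.
print(f"CPI: {cpi_from_time(seconds, 100e6, 2e9):.2f}")
```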

  21. How to Improve the Performance? • Reduce the number of instructions to execute • Increase the number of instructions per cycle • Concurrent execution of instructions • Increase clock rate

  22. Weighted CPI Sometimes it is useful in designing the CPU to calculate the total number of CPU clock cycles as CPU clock cycles = sum over i = 1..n of (CPI_i x I_i)

  23. Weighted CPI Cont’d Here I_i is the number of times an instruction of type i is executed in a program, and CPI_i is the average number of clock cycles for an instruction of type i. This form can be used to express CPU time as CPU time = (sum over i = 1..n of (CPI_i x I_i)) / clock rate
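
A short sketch of the weighted-CPI sum follows. The instruction mix and the 1 GHz clock are hypothetical values chosen only to exercise the formula, and the function names are illustrative.

```python
def total_cycles(mix):
    """Sum CPI_i * I_i over all instruction types; mix is a list of (CPI_i, I_i)."""
    return sum(cpi * count for cpi, count in mix)


def average_cpi(mix):
    """Program-wide CPI: total cycles divided by total instructions."""
    return total_cycles(mix) / sum(count for _, count in mix)


def cpu_time(mix, clock_rate_hz):
    """CPU time in seconds = total cycles / clock rate."""
    return total_cycles(mix) / clock_rate_hz


# Hypothetical mix: (CPI for the type, number of instructions of that type).
mix = [
    (1, 50_000_000),  # e.g. simple integer operations
    (2, 30_000_000),  # e.g. loads and stores
    (4, 20_000_000),  # e.g. floating-point operations
]

print("total cycles:", total_cycles(mix))
print("average CPI :", average_cpi(mix))
print("CPU time    :", cpu_time(mix, 1e9), "s at an assumed 1 GHz clock")
```

The average CPI depends on the mix: shifting instructions from the 4-cycle type to the 1-cycle type lowers it even though the hardware is unchanged.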

  24. CPI Should Be Measured CPI should be measured and not just calculated from a table in the back of a reference manual Always bear in mind that the real measure of computer performance is time.

  25. Hardware-Oriented Metrics Clock rate and IPC are often combined into various figures of merit: • MIPS (Millions of Instructions Per Second) – pronounced “mips” • MOPS (Millions of Operations Per Second) – pronounced “mops” • MFLOPS (Millions of Floating-point Operations Per Second) – pronounced “megaflops” and sometimes written “megaFLOPS” Replace the first letter with K (kilo), G (giga), T (tera), P (peta), or even E (exa) or Z (zetta), as appropriate.

  26. Problems with Hardware-Oriented Metrics • Processors with different ISAs may require a different number of instructions to perform the same task, so MIPS is hard to compare - MOPS and MFLOPS are a somewhat better measure - How do you count floating-point divides? • Vendors usually report “peak” rates

  27. MIPS Calculation One alternative to time as the metric is MIPS, or million instructions per second. For a given program, MIPS is simply MIPS = Instruction count / (Execution time x 10^6) = Clock rate / (CPI x 10^6)
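
The two forms of the MIPS formula agree because execution time can be replaced by the CPU time relation from slide 20. A short derivation, writing IC for instruction count, T for execution time and f for clock rate (notation introduced here, not in the slides):

```latex
\begin{align*}
\text{MIPS} &= \frac{\text{IC}}{T \times 10^{6}},
  \qquad T = \frac{\text{IC} \times \text{CPI}}{f} \\[4pt]
\text{MIPS} &= \frac{\text{IC}}{\dfrac{\text{IC} \times \text{CPI}}{f} \times 10^{6}}
             = \frac{f}{\text{CPI} \times 10^{6}}
\end{align*}
```

For the example on slide 19 (10^8 instructions, CPI of 2, 2 GHz clock, 0.1 s), both forms give 1000 MIPS: 10^8 / (0.1 x 10^6) and 2 x 10^9 / (2 x 10^6).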

  28. Limitations of MIPS - Meaningful only for comparing machines with same ISA, same program, and same input • Instruction capability not considered - May vary inversely with performance! • Instruction count is an absolute number without considering the frequency of each instruction class

  29. MIPS - What May Go Wrong with It ? A number of popular measures have been adopted in the quest for a standard measure of computer performance, with the result that a few innocent terms have been twisted from their well-defined environment and forced into a service for which they were never intended.

  30. Misleading Performance Measurement Clock rate: 500 MHz
      - Execution time = (sum of CPI_i x I_i) / clock rate
        Execution time1 = {(1*5 + 2*1 + 3*1) x 10^9} / (500 x 10^6) = 20 s
        Execution time2 = {(1*10 + 2*1 + 3*1) x 10^9} / (500 x 10^6) = 30 s
      - MIPS = instruction count / (execution time x 10^6)
        MIPS1 = {(5 + 1 + 1) x 10^9} / (20 x 10^6) = 350
        MIPS2 = {(10 + 1 + 1) x 10^9} / (30 x 10^6) = 400
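
The figures on this slide can be rechecked with a short script. It is a sketch that takes the per-class CPIs (1, 2, 3) and the instruction counts (in billions) directly from the formulas on the slide; the labels "sequence 1" and "sequence 2" are assumed names for the two cases being compared, since the slide only numbers them.

```python
CLOCK_RATE_HZ = 500e6        # 500 MHz, from the slide
CPI_BY_CLASS = (1, 2, 3)     # cycles per instruction for the three classes

# Instruction counts per class, read off the slide's formulas.
SEQUENCES = {
    "sequence 1": (5e9, 1e9, 1e9),
    "sequence 2": (10e9, 1e9, 1e9),
}


def execution_time(counts):
    """Seconds = sum(CPI_i * I_i) / clock rate."""
    cycles = sum(cpi * n for cpi, n in zip(CPI_BY_CLASS, counts))
    return cycles / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time * 10^6)."""
    return sum(counts) / (execution_time(counts) * 1e6)


for name, counts in SEQUENCES.items():
    print(f"{name}: {execution_time(counts):.0f} s, {mips(counts):.0f} MIPS")
```

The second case earns the higher MIPS rating (400 against 350) while taking longer to run (30 s against 20 s), which is exactly why the slide calls the measurement misleading.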

  31. Key: Execution Time of Real Programs The authors’ position is that the only consistent and reliable measure of performance is the execution time of real programs, and that all proposed alternatives to time as the metric or to real programs as the items measured have eventually led to misleading claims or even mistakes in computer design.

  32. What is MIPS? “Meaningless Indication of Processor Speed” - Bob Estall Computer, 1987

  33. MIPS Is Not A Multidimensional Measure • A computer system is multidimensional - therefore should be measured by some “vector”; • MIPS is a scalar - measures only one dimension; • MIPS is a very useful measure within its dimension.