1 / 18

Optimum System Balance for Systems of Finite Price

Optimum System Balance for Systems of Finite Price. John D. McCalpin, Ph.D. IBM Corporation Austin, TX. SuperComputing 2004 Pittsburgh, PA November 10, 2004. Overview. The HPC Challenge Benchmark was announced last year at SuperComputing’2003 The HPC Challenge Benchmark consists of

coty
Download Presentation

Optimum System Balance for Systems of Finite Price

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Optimum System Balance for Systems of Finite Price John D. McCalpin, Ph.D. IBM Corporation Austin, TX SuperComputing 2004 Pittsburgh, PA November 10, 2004

  2. Overview • The HPC Challenge Benchmark was announced last year at SuperComputing’2003 • The HPC Challenge Benchmark consists of • LINPACK (HPL) • STREAM • PTRANS (transposing the array used by HPL) • RandomAccess (random read/modify/write) • FFT • and some low-level MPI latency & BW measurements • No single figure of merit is defined

  3. Overview (continued) • Q: What is a “balanced” system? • My answer:“A balanced system is one for which the primary applications are limited in performance by the most expensive component(s) of the system.”

  4. The Two Questions • We need to understand both performance and cost in the context of low-level component metrics • Performance • What performance model? • Use a harmonically weighted, time-based model • Cost • What cost model? • Simple linear additive cost model

  5. Performance Model • Composite Figures of Merit for Performance must be based on “time” rather than “rate” • i.e., weighted harmonic means of rates • Why? • Combining “rates” in any other way fails to have a “Law of Diminishing Returns” • Time = Work/Rate • Repeat for each component: Ti = Wi/Ri

  6. A Simple Composite Model • Assume the time to solution is composed of a compute time proportional to peak GFLOPS plus a memory transfer time proportional to sustained memory bandwidth • Assume “x Bytes/FLOP” to get: • Target SPECfp_rate2000 as the workload

  7. Does Peak GFLOPS predict SPECfp_rate2000?

  8. Does Sustained Memory Bandwidth predict SPECfp_rate2000?

  9. Optimized Model Results • Results rounded to nearby round values: • Bytes/FLOP for large caches === 0.16 • Bytes/FLOP for small caches === 0.80 • Size of asymptotically large cache === ~12 MB • Coefficient of best fit === ~6.4 • The units of the coefficient are SPECfp_rate2000 / Effective GFLOPS

  10. Does this Revised Metric predict SPECfp_rate2000?

  11. Cost Model • Assume simple linear additive model • FLOPS cost some amount • Sustained BW costs a different amount • Define:b = Rmem / Rcpug = Wmem / Wcpud = ($/BW) / ($/FLOPS) System characteristics Application characteristics Technology characteristics

  12. How to Optimize? • For a given application g (Wmem/Wcpu), what is the optimum system balance b? • Many people seem to believe that the system should be “balanced” to the application: • boptimal = g i.e., • Rmem/Rcpu = Wmem/Wcpu • This does not optimize price/performance

  13. The Correct Optimization • This is actually an easy optimization problem • Minimize cost/performance • Same as minimizing cost * time • Optimum cost/performance occurs at • b = sqrt(g/d) • Definitely not intuitive!

  14. Example: High BW, expensive BW g = 3  relatively high BW d = 3  relatively expensive BW Optimum price/performance is at b=1, not b=3 Improvement in price/performance is ~30%

  15. High BW, very expensive memory g = 3  relatively high BW d = 10  very expensive BW Optimum price/performance is at b=0.58, not b=3 Improvement in price/performance is ~50%

  16. Low-BW, expensive BW g = 0.1  low BW application d = 3  moderately expensive BW Optimum price/performance is at b=0.18, not b=0.1 Improvement in price/performance is ~5% More BW helps here even though it is expensive, because the application does not need much.

  17. Medium BW, expensive BW g = 1  modest BW d = 3  moderately expensive BW Optimum price/performance is at b=0.58, not b=1 Improvement in price/performance is ~10%

  18. Summary • Balance is important to cost/performance • You must understand performance • You must understand cost • Optimum cost-performance is not intuitive!

More Related