An Apples-to-Apples GPGPU Benchmark (…or at least an attempt at one)

Presentation Transcript

  1. An Apples-to-Apples GPGPU Benchmark (…or at least an attempt at one) • Peter S. Shenkin

  2. Attachment-Based Core Hopping • What it does • The architecture • The benchmark

  3. Attachment-Based Core Hopping • What it does • Find a replacement for the central portion of a molecule • … keeping the peripheral parts in place • … while making “chemical sense” • Why would you do such a thing? • Increase efficacy • Improve “ADMET” properties • (Absorption, Distribution, Metabolism, Excretion, Toxicity) • Find new IP • Designed as a fast interactive desktop application • The architecture • The benchmark

  4. Define Core in a “Template” Molecule • Two ways shown, to emphasize user choice • 1kv1 core • “1kv1-smaller” core

  5. Result: 1err: olap = 0.95, relgscore = -1.37 • Replaced C with N • Replaced S with C

  6. Result: 1erb: olap = 0.80, relgscore = -0.96 • Spiro core!

  7. Result: 1kv2: olap = 0.29, relgscore = -0.37 • Replaced O with N • Replaced N with C • Added an N • Huge shape difference!

  8. Attachment-Based Core Hopping • What it does • The architecture • Workflow engine independent of application code • (… and APU technology) • Multithreaded using Qthreads; C++ • Application stages are essentially plug-ins • The benchmark

  9. Architecture • Legend: Non-thread-safe thread, Thread-safe thread, CUDA thread • Diagram labels: I, O, Queue, Stage 1, Stage 2, Stage 3, Stage 4, Stage 5, Scheduler
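
The architecture slides describe a workflow engine whose application stages are plug-ins connected by queues and run on threads. The following is only a minimal sketch of that general pattern in Python, not the presenter's C++ engine: the names `run_stage`, `run_pipeline`, and `_DONE` are hypothetical, and it assumes one worker thread per stage with no scheduler and no CUDA thread.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the input stream


def run_stage(func, inbox, outbox):
    """Pull items from inbox, apply the stage function, push results to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the shutdown signal to the next stage
            return
        outbox.put(func(item))


def run_pipeline(stage_funcs, items):
    """Chain stages with FIFO queues, using one worker thread per stage."""
    # One queue in front of each stage, plus one final output queue.
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stage_funcs)
    ]
    for worker in workers:
        worker.start()

    # Feed the first queue, then signal that no more input is coming.
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    # Drain the final queue until the sentinel arrives.
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)

    for worker in workers:
        worker.join()
    return results


# Two toy stages standing in for application plug-ins.
pipeline_output = run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3])
```

Because each stage only sees its inbox and outbox, a stage function can be swapped without touching the engine, which is the property the "plug-in" bullet points at. A real engine would also need the scheduler and the per-thread safety distinctions shown in the legend.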

  10. Attachment-Based Core Hopping • What it does • The architecture • The benchmark • A truism that goes without saying • Results slowly unveiled • The dilemma & its resolution • Did we “do the right thing”?

  11. The Truism • There are lies…

  12. The Truism • There are lies… • … damn lies

  13. The Truism • There are lies… • … damn lies • … statistics

  14. The Truism • There are lies… • … damn lies • … statistics • … benchmarks

  15. The Truism • There are lies… • … damn lies • … statistics • … benchmarks • … salesmen’s claims

  16. The Truism • There are lies… • … damn lies • … statistics • … benchmarks • … salesmen’s claims • … and the last two all too often interact

  17. Results • Test system: • i7/930, 2.7 GHz processor • 4 physical cores, run hyperthreaded • 12 GB RAM • 8-lane PCIe motherboard • SSD drive

  18. Results

  19. Results

  20. Results

  21. Results At constant CPU utilization: • With two GPGPUs: • Speedup = 1.07 / 0.3275 = 3.3 • With one GPGPU: • Speedup = 0.76 / 0.20 = 3.8
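
The speedup figures on slide 21 are plain ratios, and the arithmetic can be checked with a short sketch. The transcript does not say what the four measured values represent or what units they carry, so the function below treats them only as a numerator and a denominator; the names `speedup`, `two_gpu`, and `one_gpu` are illustrative.

```python
def speedup(numerator, denominator):
    """Return the ratio of two measurements, as quoted on the slide."""
    return numerator / denominator


# Values copied from slide 21; their units are not given in the transcript.
two_gpu = speedup(1.07, 0.3275)
one_gpu = speedup(0.76, 0.20)

print(f"Two GPGPUs: {two_gpu:.1f}")
print(f"One GPGPU:  {one_gpu:.1f}")
```

To one decimal place the ratios come out at 3.3 and 3.8, matching the slide.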

  22. Closing Remarks • If we did our comparisons with a different number of threads, speedups would be different • If we worked on a machine with more or fewer processors, speedups would be different • If we used a 4-lane PCIe motherboard, or a different CPU, or a slower hard drive, speedups would be different • If our software architecture were different, speedups would be different • Conclusion from above: The world is a complicated place • Do you agree that our approach is fair?