An Apples-to-Apples GPGPU Benchmark (…or at least an attempt at one) Peter S. Shenkin
Attachment-Based Core Hopping • What it does • The architecture • The benchmark
Attachment-Based Core Hopping • What it does • Find a replacement for the central portion of a molecule • … keeping the peripheral parts in place • … while making “chemical sense” • Why would you do such a thing? • Increase efficacy • Improve “ADMET” properties • (Absorption, Distribution, Metabolism, Excretion, Toxicity) • Find new IP • Designed as a fast interactive desktop application • The architecture • The benchmark
Define Core in a “Template” Molecule • Two ways shown, to emphasize user choice • 1kv1 core • “1kv1-smaller” core
Result: 1err: olap = 0.95, relgscore = -1.37 • Replaced C with N • Replaced S with C
Result: 1erb: olap = 0.80, relgscore = -0.96 • Spiro core!
Result: 1kv2: olap = 0.29, relgscore = -0.37 • Replaced O with N • Replaced N with C • Added an N • Huge shape difference!
Attachment-Based Core Hopping • What it does • The architecture • Workflow engine independent of application code • (… and APU technology) • Multithreaded using Qthreads; C++ • Application stages are essentially plug-ins • The benchmark
Architecture • [Diagram] • Legend: non-thread-safe thread, thread-safe thread, CUDA thread • Elements: I, O, Queue, Stage 1, Stage 2, Stage 3, Stage 4, Stage 5, Scheduler
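The staged, queue-connected design above can be illustrated with a minimal sketch. This is not the presenter's engine: Python's standard `threading` and `queue` modules stand in for the C++/Qthreads implementation, the names `run_stage`, `run_pipeline`, and `SENTINEL` are hypothetical, and the Scheduler and CUDA threads from the diagram are omitted. It only shows how stages can be plug-in functions that the workflow code knows nothing about.

```python
import queue
import threading

# Hypothetical end-of-stream marker; not taken from the slides.
SENTINEL = object()


def run_stage(func, inbox, outbox):
    """Pull items from inbox, apply the plug-in func, push results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # pass the shutdown signal downstream
            break
        outbox.put(func(item))


def run_pipeline(stage_funcs, items):
    """Chain the stages with queues, using one worker thread per stage."""
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stage_funcs)
    ]
    for worker in workers:
        worker.start()

    # Feed the first queue, then signal that no more work is coming.
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    # Drain the last queue until the shutdown signal arrives.
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    # Two toy stages standing in for application plug-ins.
    print(run_pipeline([lambda x: x + 1, lambda x: x * 2], [1, 2, 3]))
```

Because the engine only sees callables and queues, swapping a stage for a different implementation (for example, one that offloads to a GPU) would not require touching the workflow code, which is the property the slide emphasizes.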
Attachment-Based Core Hopping • What it does • The architecture • The benchmark • A truism that goes without saying • Results slowly unveiled • The dilemma & its resolution • Did we “do the right thing”?
The Truism • There are lies… • … damn lies • … statistics • … benchmarks • … salesmen’s claims • … and the last two all too often interact
Results Test system: • i7/930, 2.7 GHz processor • 4 physical cores, run hyperthreaded • 12 GB RAM • 8-lane PCIe motherboard • SSD
Results At constant CPU utilization: • With two GPGPUs: • Speedup = 1.07 / 0.3275 = 3.3 • With one GPGPU: • Speedup = 0.76 / 0.20 = 3.8
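The arithmetic on this slide can be checked with a short sketch. The function name `speedup` and its parameter names are hypothetical, and it is an assumption that the numerator is the CPU-only measurement and the denominator the GPGPU-assisted one, both in the same units; the slide does not state the units.

```python
def speedup(baseline, accelerated):
    """Return the ratio baseline / accelerated (both in the same units)."""
    if accelerated <= 0:
        raise ValueError("accelerated measurement must be positive")
    return baseline / accelerated


if __name__ == "__main__":
    # Values copied from the slide; their units are not given there.
    print(f"two GPGPUs: {speedup(1.07, 0.3275):.1f}")
    print(f"one GPGPU:  {speedup(0.76, 0.20):.1f}")
```

Rounded to one decimal place, the two ratios reproduce the 3.3 and 3.8 quoted on the slide.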
Closing Remarks • If we did our comparisons with a different number of threads, speedups would be different • If we worked on a machine with more or fewer processors, speedups would be different • If we used a 4-lane PCIe motherboard, or a different CPU, or a slower hard drive, speedups would be different • If our software architecture were different, speedups would be different • Conclusion from above: The world is a complicated place • Do you agree that our approach is fair?