
MDSimAid: Automatic optimization of fast electrostatics in molecular simulations

Jesús A. Izaguirre, Michael Crocker, Alice Ko, Thierry Matthey and Yao Wang. Corresponding author: izaguirr@cse.nd.edu. Department of Computer Science and Engineering, University of Notre Dame, USA.


Presentation Transcript


  1. MDSimAid: Automatic optimization of fast electrostatics in molecular simulations Jesús A. Izaguirre, Michael Crocker, Alice Ko, Thierry Matthey and Yao Wang Corresponding author: izaguirr@cse.nd.edu Department of Computer Science and Engineering University of Notre Dame, USA

  2. Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

  3. Motivation • Create a recommender system / self-adaptive software for running molecular dynamics (MD) simulations • Importance of MD • Protein dynamics, e.g., protein folding • Sampling and thermodynamics, e.g., drug design • Complicated to use effectively • Too many algorithms for performance-critical sections, with complex parameter relationships

  4. Classical molecular dynamics • Newton’s equations of motion: • Molecules • CHARMM force field (Chemistry at Harvard Molecular Mechanics) Bonds, angles and torsions
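The equation on this slide did not survive in the transcript. A standard statement of Newton's equations for N atoms is sketched below; the symbols m_i (mass), r_i (position) and U (total potential energy) are assumed notation, not taken from the slide, and the decomposition of U follows the energy terms listed on the next slide.

```latex
m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_{\mathbf{r}_i} U(\mathbf{r}_1, \dots, \mathbf{r}_N),
\qquad i = 1, \dots, N,
\qquad
U = U_{\text{bond}} + U_{\text{angle}} + U_{\text{dihedral}} + U_{\text{nonbond}}
```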

  5. Energy Functions Ubond = oscillations about the equilibrium bond length Uangle = oscillations of 3 atoms about an equilibrium angle Udihedral = torsional rotation of 4 atoms about a central bond Unonbond = non-bonded energy terms (electrostatics and Lennard-Jones)

  6. Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

  7. Performance-critical section: Electrostatics computation (1) This is a conditionally convergent sum. Ewald (1921) showed how to split this sum into two rapidly convergent sums:
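The sum labeled (1) is missing from the transcript. One common way to write the electrostatic energy of N point charges in a periodic cubic box is sketched below. The notation is assumed: charges q_i, box length L, integer lattice vectors n, and units in which the Coulomb constant is 1. The prime means that the i = j term is omitted when n = 0.

```latex
U_{\text{elec}} = \frac{1}{2} \sum_{\mathbf{n} \in \mathbb{Z}^3}{}^{\prime}
\sum_{i=1}^{N} \sum_{j=1}^{N}
\frac{q_i q_j}{\lvert \mathbf{r}_i - \mathbf{r}_j + L\,\mathbf{n} \rvert}
```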

  8. Derivation of fast electrostatics methods I Ewald used the following function: The short range part is usually solved directly. The smooth part may be solved by a Fourier series, giving rise to Ewald methods:
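The splitting function itself is not preserved in the transcript. The sketch below assumes the usual Gaussian (error function) split, 1/r = erfc(αr)/r + erf(αr)/r, where α is the splitting parameter; the function name and the sample values of r and α are illustrative only.

```python
import math

def split_coulomb(r, alpha):
    """Split 1/r into a short-range part and a smooth part (assumed erf/erfc split)."""
    short_range = math.erfc(alpha * r) / r  # decays quickly with r; summed directly
    smooth = math.erf(alpha * r) / r        # bounded as r -> 0; suited to a Fourier series
    return short_range, smooth

for r in (0.5, 2.0, 6.5, 12.0):
    short_range, smooth = split_coulomb(r, alpha=0.35)
    total = short_range + smooth
    print(f"r={r:5.1f}  short={short_range:.3e}  smooth={smooth:.3e}  "
          f"sum={total:.6f}  1/r={1.0 / r:.6f}")
```

A larger α makes the short-range part decay faster (so a smaller direct-space cutoff suffices) but pushes more work into the Fourier part, which is the trade-off the next slide describes.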

  9. Derivation of fast electrostatics methods II Ewald chooses the splitting parameter so that the work is divided evenly between the short-range and smooth parts, giving O(N^(3/2)) overall. Particle Mesh Ewald (PME) chooses the splitting parameter so that the short-range part is O(N), and interpolates the Fourier series to a mesh, thus allowing the use of the O(N log N) FFT.

  10. Fast electrostatics algorithms There are many methods. Which one to use for a given system and accuracy?

  11. Particle Mesh Ewald • Following Ewald, separates the electrostatic interactions into two parts: • Direct-space short range evaluation • Fourier-space evaluation • The Fourier term is approximated by using fast Fourier transforms on a grid • Method parameters are grid size and cutoff of direct-space

  12. Multigrid I

  13. Multigrid II

  14. Multigrid III • Complex relationship among method parameters: • Cutoff and softening distances for potential evaluation at the particle and grid levels • Grid size and interpolation order • Number of levels • Rules extracted from extensive evaluation encapsulated in MDSimAid • Fine tuned at run-time by running selected tests • Makes these methods easier to use

  15. Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

  16. Related Work I: Performance Models • Darden et al., J. Chem. Phys. 1993 • effect of varying parameters of Particle Mesh Ewald • Petersen et al., J. Chem. Phys. 1995 • accuracy and efficiency of Particle Mesh Ewald • Krasny et al., J. Chem. Phys. 2000 • used FMM to compute direct part of Ewald sum • Skeel et al., J. Comp. Chem. 2002 • study of parameters for multigrid (MG) method. Compared MG to Fast Multipole Method (FMM). MG faster than FMM for low accuracy

  17. Related Work II: Limitations • Most published results • fail to suggest how to determine the specific values • provide general trends only • contain unknown constants in equations that model performance • do not account for modern computer architectures, particularly the memory subsystem • For example, using parameters for MG suggested by MDSimAid gave an order of magnitude better accuracy and 2 to 4 times faster execution than the parameters suggested by analytical results (Ko, 2002)

  18. Summary • General contributions of this study • Practical guidelines for choosing parameters for each fast electrostatics algorithm, and for choosing among different algorithms • Implemented important algorithms with reasonable efficiency in ProtoMol • Tested algorithms for various system sizes and accuracies • Tested quality of these methods for MD of solvated proteins • Encapsulated results in a tool called MDSimAid • Hybrid rule-based and run-time optimization for choosing algorithm and parameters for MD

  19. Experimental protocol • These methods were implemented and tested in a common generic, object-oriented framework, ProtoMol: (1) Smooth Particle Mesh Ewald (2) Multigrid summation (3) Ewald summation • Testing protocol: • Methods (1) and (2) above were compared against (3) to determine accuracy and relative speedup • Tested on water boxes and protein systems ranging from 1,000 to 100,000 atoms, at low and high accuracies • For selected protein systems, structural and transport properties were computed (e.g., Melittin, PDB ID 2mlt, in water, 11,845 atoms)

  20. Software adaptation I • (1) Domain specific language to define MD simulations in ProtoMol (Matthey and Izaguirre, 2001-3) • (2) “JIT” generation of prototypes: factories of template instantiations

  21. Software adaptation II
      (3) Timing and comparison facilities:
          force compare time force Coulomb -algorithm PMEwald -interpolation BSpline -cutoff 6.5 -gridsize 10 10 10
          force compare time force Coulomb -algorithm PMEwald -interpolation BSpline -cutoff 6.5 -gridsize 20 20 20
      (4) Selection at runtime – extensible module using Python
          trial_param1[i]=cutoffvalue+1
          test_PME(cutoffvalue+1,gridsize,..., accuracy)

  22. Optimization strategies I (1) Rules generated from experimental data & analytical models (thousands of data points) (2A) At run-time, M tests are tried by exploring the parameter that changes accuracy or time most rapidly (biased search), or (2B) At run-time, M tests are tried by randomly exploring all valid method parameter combinations (3) Choose the fastest algorithm/parameter combination within accuracy constraints
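The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not MDSimAid's implementation: `toy_pme_test` is a hypothetical stand-in for a timed ProtoMol run with invented cost and error formulas, the biased search is simplified to varying only the cutoff, and the candidate cutoff and grid values are made up.

```python
import random

def toy_pme_test(cutoff, gridsize):
    """Hypothetical stand-in for one timed PME run: returns (time, relative error).
    Both formulas are invented for illustration, not measured data."""
    time = 0.02 * cutoff ** 3 + 0.001 * gridsize ** 3
    error = 10.0 ** (-0.5 * cutoff) + 10.0 ** (-0.25 * gridsize)
    return time, error

def biased_candidates(guess, m):
    """Step (2A), simplified: vary only the cutoff, starting from the rule-based guess."""
    cutoff, gridsize = guess
    return [guess] + [(cutoff + step, gridsize) for step in range(1, m + 1)]

def random_candidates(guess, m, rng):
    """Step (2B): sample m parameter combinations at random, keeping the guess."""
    cutoffs = [6.0, 8.0, 10.0, 12.0]   # assumed valid cutoffs
    grids = [10, 16, 20, 24, 32]       # assumed valid grid sizes per dimension
    return [guess] + [(rng.choice(cutoffs), rng.choice(grids)) for _ in range(m)]

def pick_fastest(candidates, max_error):
    """Step (3): fastest candidate whose error satisfies the accuracy constraint."""
    results = [(toy_pme_test(*c), c) for c in candidates]
    valid = [(time, c) for (time, error), c in results if error <= max_error]
    return min(valid)[1] if valid else None

rule_guess = (8.0, 20)  # step (1): initial guess, as if produced by a rule
print("biased:", pick_fastest(biased_candidates(rule_guess, 3), 1e-4))
print("random:", pick_fastest(random_candidates(rule_guess, 5, random.Random(0)), 1e-4))
```

Returning `None` when no candidate meets the accuracy target keeps the failure visible, which matches the later remark that run-time tests may signal invalid rules.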

  23. Example of parameter space exploration for PME

  24. Examples of rules in MDSimAid
      • Rules can be generated automatically using inductive decision trees or regression rules (or any machine learning technique). Example:
      • Regression for PME: cutoff = 0.0001 * N - 4750.498 * rPE + 11.186
      • Regression tree for MG-Ewald:
          rPE <= 1E-5 : LM1 (51/76.5%)
          rPE > 1E-5 :
          |   rPE <= 1E-4 : LM2 (12/45.6%)
          |   rPE > 1E-4 : LM3 (41/54.3%)
      • Models at the leaves:
          LM1: level = 2.96
          LM2: level = 1.83
          LM3: level = 1.27
      • Rules can be evaluated by correlation coefficients and other metrics:
      • Correlation coefficient for PME 0.9146
      • Correlation coefficient for MG-Ewald 0.8319
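The two rules quoted on this slide can be evaluated directly. The sketch below transcribes them into Python; the function names are illustrative, the unit of the cutoff is not stated on the slide, and rounding the predicted level to an integer is an assumption about how a fractional value would be used.

```python
def pme_cutoff_rule(n_atoms, rpe):
    """Regression rule from the slide: cutoff = 0.0001*N - 4750.498*rPE + 11.186."""
    return 0.0001 * n_atoms - 4750.498 * rpe + 11.186

def mg_levels_rule(rpe):
    """Regression tree from the slide; each leaf model (LM1..LM3) is a constant level."""
    if rpe <= 1e-5:
        return 2.96  # LM1
    if rpe <= 1e-4:
        return 1.83  # LM2
    return 1.27      # LM3

# Example input: the 11,845-atom Melittin system at a target rPE of 1e-4.
n_atoms, target_rpe = 11845, 1e-4
print("PME cutoff guess:", pme_cutoff_rule(n_atoms, target_rpe))
print("MG levels guess:", round(mg_levels_rule(target_rpe)))  # rounding is an assumption
```

Such values would serve only as the initial guess; the run-time tests then refine or reject them.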

  25. Fastest algorithms found I Platform: Solaris; Biased search with 3 trials and unbiased with 5 trials

  26. Fastest algorithms found II Platform: Linux; Biased search with 3 trials and unbiased with 5 trials

  27. Fastest algorithms found III Platform: Solaris; Biased search with 3 trials and unbiased with 5 trials

  28. Fastest algorithms found IV Platform: Linux; Biased search with 3 trials and unbiased with 5 trials

  29. Optimization strategies II • Hybrid optimization strategies work well for algorithmic tuning: • Possible to obtain rules for general trends • Runtime optimization provides fine-tuning and may signal invalid rules • Rules provide initial guess; unbiased local search in parameter space provides improvements • In our tests, local search found fast electrostatics configurations that ran three times faster, at a cost of three times more search computation

  30. Hybrid optimization speedup Platform: Solaris

  31. Hybrid optimization speedup Platform: Linux

  32. Results (10^-4 rPE)

  33. Results (10^-5 rPE)

  34. Talk Roadmap 1. Motivation: automatic tuning of molecular simulations 2. Key to performance: fast electrostatics 3. Evaluation of MDSimAid recommender system 4. Discussion and extensions

  35. Related Work III • Algorithm or software selection problem (Rice, 1976) • Optimization at higher level than PHIPAC, ATLAS, etc. • Optimization approach similar to SPIRAL (Moura et al., 2000) • Some rules are generated by inductive methods, similar to PYTHIA II (Houstis et al., 2000) • Some rules come from performance models, similar to SALSA (Dongarra & Eijkhout, 2002) and BeBop (Vuduc, Yelick, Demmel, 2001-3)

  36. Discussion • Extensions • Fit in framework for SANS • Timing facilities (Autopilot at UIUC, etc.) • Algorithmic description metadata in XML • History database, agents for update of rules • Extend to more parts of MD simulation protocol: • Multiple time stepping integrators • Parallelism, clusters, grids • Lower level parameters (matrix-vector block size, etc.)? Or use ATLAS! • For further reference: • http://www.nd.edu/~lcls/mdsimaid • http://www.nd.edu/~lcls/protomol • http://www.nd.edu/~izaguirr

  37. Acknowledgements • NSF Biocomplexity grant IBN-0083653 • NSF ACI CAREER Award ACI-0135195 • NSF REU grant • Prof. Ricardo Vilalta and students, Univ. of Houston, for work on inductive learning
