
OSPREY Tutorial

Ivelin Georgiev, Bruce Donald. Donald Lab, Duke University.


Presentation Transcript


  1. OSPREY Tutorial. Ivelin Georgiev, Bruce Donald. Donald Lab, Duke University.

  2. Distribution of Structures
     • Maximum Likelihood (pick most probable): min( ), the Global Minimum Energy Conformation (GMEC)
     • Bayesian (average over all conformations): (1/Z) ∫
     • Probability ↔ Energy, using the Boltzmann distribution

  3. Distribution of Structures
     • Maximum Likelihood (pick most probable): min( ), the Global Minimum Energy Conformation (GMEC)
     • ‘Bayesian’ (weighted average over all conformations): (1/Z) ∫
     • Probability ↔ Energy, using the Boltzmann distribution
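The two views on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not OSPREY code: the temperature of 298.15 K is an assumption, the function names are hypothetical, and the three energies are borrowed from the doDEE sample output later in this tutorial purely as example numbers.

```python
import math

# Gas constant in kcal/(mol*K); the temperature is an assumed value.
R_KCAL = 0.0019872
TEMPERATURE = 298.15
RT = R_KCAL * TEMPERATURE


def boltzmann_distribution(energies):
    """Return (log Z, probabilities) for conformation energies in kcal/mol."""
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials cannot overflow;
    # the shift cancels out of the probabilities.
    weights = [math.exp(-(e - e_min) / RT) for e in energies]
    total = sum(weights)
    log_z = math.log(total) - e_min / RT
    return log_z, [w / total for w in weights]


def gmec_index(energies):
    """Maximum-likelihood view: index of the lowest-energy conformation."""
    return min(range(len(energies)), key=lambda i: energies[i])


def ensemble_average(values, probabilities):
    """'Bayesian' view: Boltzmann-weighted average over all conformations."""
    return sum(v * p for v, p in zip(values, probabilities))


energies = [-276.50, -276.42, -273.75]  # example values only
log_z, probs = boltzmann_distribution(energies)
best = gmec_index(energies)
mean_energy = ensemble_average(energies, probs)
```

The GMEC keeps only the single most probable conformation, while the weighted average also counts the near-optimal conformation at -276.42, which carries almost as much probability.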

  4. GMEC (maximum likelihood): traditional-DEE, MinDEE, BD

  5. K*: provably-accurate approximation to the binding constant via conformational ensembles
     • GMEC (maximum likelihood): traditional-DEE, MinDEE, BD
     • K* (weighted average): (1/Z) ∫
     • Application: Enzyme-Ligand Binding
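As a rough illustration of the ensemble idea on this slide, a K* score can be written as the partition function of the bound complex divided by the product of the partition functions of the unbound protein and ligand. The sketch below assumes that form; the temperature, the function names, and all energies are made up for the example and do not come from OSPREY.

```python
import math

RT = 0.0019872 * 298.15  # kcal/mol; the temperature is an assumption


def partition_function(energies):
    """Sum of Boltzmann weights over a list of conformation energies."""
    return sum(math.exp(-e / RT) for e in energies)


def kstar_score(bound_energies, protein_energies, ligand_energies):
    """Ratio q_PL / (q_P * q_L) of bound to unbound partition functions."""
    q_pl = partition_function(bound_energies)
    q_p = partition_function(protein_energies)
    q_l = partition_function(ligand_energies)
    return q_pl / (q_p * q_l)


# Hypothetical ensembles (kcal/mol) for two candidate sequences.
weak_binder = kstar_score([-6.0, -5.5], [-5.0, -4.8], [-1.0, -0.5])
tight_binder = kstar_score([-9.0, -8.5], [-5.0, -4.8], [-1.0, -0.5])
```

A sequence whose bound ensemble sits lower in energy relative to its unbound states gets a larger score, which is why the score can be used to rank candidate mutants.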

  6. thousands of sequences!!! [Figure labels: sequences s1, s2, …, si, …, sk; MinDEE; BD; A*; K*; ε approximation; 1 − ε; pruned; partition function; conformations; fraction of evaluated confs] J. Comp. Chem. (2008)
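One way to read the "fraction of evaluated confs" and "1 − ε" labels is as an early-stopping rule for the partition function sum. The sketch below is a simplified version under stated assumptions: conformations arrive in nondecreasing energy order (as an A* enumeration of rigid rotamers would provide), pruned conformations are ignored, and the stopping test is a plain bound on the unevaluated remainder. It is not OSPREY's actual criterion, and the function name and energies are hypothetical.

```python
import math

RT = 0.0019872 * 298.15  # kcal/mol; the temperature is an assumption


def epsilon_partition_function(sorted_energies, epsilon):
    """Return (q_star, evaluated) with q_star >= (1 - epsilon) * full sum.

    sorted_energies must be in nondecreasing order.
    """
    n = len(sorted_energies)
    q_star = 0.0
    for k, energy in enumerate(sorted_energies, start=1):
        weight = math.exp(-energy / RT)
        q_star += weight
        # Every remaining conformation has energy >= this one, so each
        # contributes at most `weight` to the full partition function.
        remaining_bound = (n - k) * weight
        if q_star >= (1.0 - epsilon) * (q_star + remaining_bound):
            return q_star, k
    return q_star, n


energies = [-10.0, -9.5, -9.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0, 1.0]
q_star, evaluated = epsilon_partition_function(energies, epsilon=0.03)
q_full = sum(math.exp(-e / RT) for e in energies)
```

With these example energies the loop stops after four of the ten conformations, because the remaining six cannot add more than a tiny fraction to the sum.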

  7. Example (PNAS, 2009). Cheng-Yu Chen, Ivelin Georgiev, Amy Anderson, Bruce Donald

  8. NonRibosomal Peptide Synthetases (NRPS)
     • NRPS enzymes are found in some fungi and bacteria
     • NRPS enzymes make peptide-like products with pharmaceutical properties (antifungal, antineoplastic, antibacterial), e.g. vancomycin, penicillin, gramicidin, bacitracin, cyclosporin, bleomycin, …
     • NRPS are similar to PKS
     [Figure label: FPVOL]

  9. NRPS: GrsA-PheA Redesign (gramicidin S): Phe → Leu

  10. Protein Redesign (NRPS): three-dimensional structure of the GrsA PheA domain [Conti et al., 1997]

  11. Change specificity from Phe to Leu by allowing any 2 (of 9) mutations
      • Mutations to GAVLIFYWM
      • Approx. 3,000 mutation sequences = 680,000,000 conformations (78,200 after pruning)
      • r = 9, s = 2
      [Figure: Leu, with −CO2 and +H3N groups]
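The "approx. 3000" figure is consistent with a simple count: choose s = 2 of the r = 9 positions, then choose one of the 9 allowed amino acids at each chosen position. The sketch below assumes exactly that counting, including choices that happen to match the wild type; the slide does not say how the number was derived, and the function name is hypothetical.

```python
from math import comb


def count_mutation_sequences(num_positions, num_mutations, num_amino_acids):
    """Ways to pick the mutated positions times the amino acid choices at each."""
    return comb(num_positions, num_mutations) * num_amino_acids ** num_mutations


allowed = "GAVLIFYWM"  # the nine allowed amino acid types from the slide
num_sequences = count_mutation_sequences(9, 2, len(allowed))  # 36 * 81
```

This gives 2,916 sequences, in line with the rounded figure on the slide.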

  12. Crystal Structure: 1amu (1.9 Å), 563 a.a., 65 kD. Labeled in the figure: AMP and residues (K517), I330, C331, D235, A322, A301, A236, W239, T278, I299

  13. Three-Step Enzyme Redesign (Ivelin Georgiev, Cheng-Yu Chen)
      1. K*: active site mutations (provable)
      2. Entropy step: mutatable positions (heuristic)
      3. MinDEE: bolstering mutations (provable)
      Computational Structure-Based Redesign of Enzyme Activity. PNAS (2009)

  14. T278L/A301G with Leu
      • #1
      • 3,000 sequences
      • 6.8 × 10^8 rotameric conformations
      [Figure labels: AMP, K517] PNAS (2009)

  15. Mutations Outside the Active Site
      [Figure: residues V187, S447, V238, I207, F45, I277, L210]
      [Diagram labels: rotamer probabilities, AA probabilities, mutatable positions, SCMF, residue entropy, Boltzmann, MinDEE] PNAS (2009)

  16. All top 10
      • 3,000 sequences
      • 6.8 × 10^8 rotameric conformations
      [Plot: A301G/T278L; axis labels [L-Leu] (mM) and normalized kcat/KM; series Leu, Phe] PNAS (2009)

  17. T278D/A301G with Arg
      • Arg: #1 of 2511 sequences
      • Lys: #4 of 2511 sequences
      • > 9 × 10^8 conformations
      [Figure and plot labels: L-Arg, L-Lys, WT, AMP, D235, K517, W239, 278D, 301G; axis label [L-Arg] (mM)] PNAS (2009)

  18. Tutorial

  19. Installation, Setup, Running OSPREY

  20. Installation
      • Requirements: Java, mpiJava, MPICH2
      • 32-bit: √
      • 64-bit: may require special instructions

  21. Setup: Compute Nodes, Input Structure, Rotamer Library, Energy Function

  22. Compute Nodes
      • Select MPI nodes (mpd.hosts): linux1 linux2 linux3 linux4 linux5
      • mpdboot:
          mpdboot -n 5 -f mpd.hosts
      • Select job-specific nodes (machines): linux1 linux1 linux1 linux2 linux3 linux3
      • mpirun java OSPREY:
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi -c KStar.cfg
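If the node list and launch command are generated by a script, a small helper along these lines may be convenient. It is only a sketch: the helper names are hypothetical, the slot counts are read off the job-specific node list on the slide, and the command is assembled as an argument list without being executed.

```python
def expand_machines(slots_per_node):
    """Repeat each host name once per process slot, as in a machines file."""
    lines = []
    for node, slots in slots_per_node.items():
        lines.extend([node] * slots)
    return lines


def build_mpirun_command(machinefile, num_procs, heap_mb, config):
    """Assemble the mpirun invocation shown on the slide as an argument list."""
    return [
        "mpirun", "-machinefile", machinefile, "-np", str(num_procs),
        "java", f"-Xmx{heap_mb}M", "KStar", "mpi", "-c", config,
    ]


machines = expand_machines({"linux1": 3, "linux2": 1, "linux3": 2})
command = build_mpirun_command("./machines", 5, 1024, "KStar.cfg")
command_line = " ".join(command)
```

Writing `"\n".join(machines)` to `./machines` and passing `command` plus the OSPREY sub-command and its config files to `subprocess.run` would reproduce the launch step, assuming mpirun and Java are on the path.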

  23. Input Structure: missing atoms
          REMARK 470 MISSING ATOM
          REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS (M=MODEL NUMBER;
          REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;
          REMARK 470 I=INSERTION CODE):
          REMARK 470 M RES CSSEQI ATOMS
          REMARK 470 GLU A 34 CG CD OE1 OE2
          REMARK 470 GLU A 63 CD OE1 OE2
      • model (KiNG): possible over-constraint
      • delete: possible under-constraint
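Before deciding whether to model or delete a residue, it helps to list which residues are affected. The sketch below pulls them out of the REMARK 470 block shown on the slide. It assumes the simple whitespace-separated layout seen there (no model-number column, single-character chain ID), and the function name is hypothetical.

```python
def parse_missing_atoms(pdb_lines):
    """Return (residue name, chain, residue number, missing atoms) tuples."""
    records = []
    for line in pdb_lines:
        if not line.startswith("REMARK 470"):
            continue
        fields = line.split()[2:]
        if len(fields) < 4:
            continue  # header text such as "MISSING ATOM"
        res_name, chain, seq = fields[0], fields[1], fields[2]
        # Explanatory header lines fail these checks and are skipped.
        if len(chain) != 1 or not seq.lstrip("-").isdigit():
            continue
        records.append((res_name, chain, int(seq), fields[3:]))
    return records


example = [
    "REMARK 470 MISSING ATOM",
    "REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS (M=MODEL NUMBER;",
    "REMARK 470 RES=RESIDUE NAME; C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER;",
    "REMARK 470 I=INSERTION CODE):",
    "REMARK 470 M RES CSSEQI ATOMS",
    "REMARK 470 GLU A 34 CG CD OE1 OE2",
    "REMARK 470 GLU A 63 CD OE1 OE2",
]
missing = parse_missing_atoms(example)
```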

  24. Input Structure: adding hydrogens
      • proteins (recommended: MolProbity)
      • general compounds (recommended: Accelrys DS Visualizer)
      • Check: protonation states, missing protons

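  25. Input Structure: His residues are named by protonation state: HIP (both ring nitrogens protonated), HIE (Nε protonated), HID (Nδ protonated)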

  26. Input Structure: steric shell
      • close to design site
      • significant speedup

  27. Input Structure: other considerations
      • protein, ligand, cofactor
      • ligand: natural AA, small molecule
      • water molecules
      • no chain IDs
      • unique residue numbers
      • protein-peptide, protein-protein
      • connectivity (good input structures)

  28. Input Structure Check and double-check!!!

  29. Rotamer Library
      • rotamers: Richardsons’ Penultimate
      • proteins, general compounds
      Entry layout (name, # dihed, # rot, then the dihedral atoms, then one line per rotamer):
          TYR 2 4
            N CA CB CG        (dihedral 1)
            CA CB CG CD1      (dihedral 2)
            62 90             (one rotamer)
            -177 80
            -65 -85
            -65 -30
          TYR 2 5
            N CA CB CG
            CA CB CG CD1
            62 90
            -177 80
            -65 -85
            -65 -30
            -65 -45
          FCL 2 4
            N CA CB CG
            CA CB CG CD1
            62 90
            -177 80
            -65 -85
            -65 -30
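The entry layout on this slide (name, number of dihedrals, number of rotamers, four atom names per dihedral, then one angle per dihedral for each rotamer) can be checked with a short parser. This is a sketch based only on the slide; real rotamer files may carry extra header lines, and the function name is hypothetical.

```python
def parse_rotamer_entry(tokens):
    """Parse one whitespace-split rotamer library entry into a dict."""
    name = tokens[0]
    num_dihedrals = int(tokens[1])
    num_rotamers = int(tokens[2])
    pos = 3
    dihedral_atoms = []
    for _ in range(num_dihedrals):
        dihedral_atoms.append(tuple(tokens[pos:pos + 4]))  # four atoms define one dihedral
        pos += 4
    rotamers = []
    for _ in range(num_rotamers):
        angles = tuple(float(t) for t in tokens[pos:pos + num_dihedrals])
        rotamers.append(angles)
        pos += num_dihedrals
    if pos != len(tokens):
        raise ValueError("token count does not match the declared sizes")
    return {"name": name, "dihedral_atoms": dihedral_atoms, "rotamers": rotamers}


entry_text = "TYR 2 5 N CA CB CG CA CB CG CD1 62 90 -177 80 -65 -85 -65 -30 -65 -45"
entry = parse_rotamer_entry(entry_text.split())
```

The length check catches the most likely editing mistake when adding a rotamer by hand: appending the new angles without increasing the rotamer count from 4 to 5.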

  30. Energy Function
      • parm96a.dat: atom types, dihedral parameters, vdW parameters. Add params for new atom types (antechamber).
      • all_amino94X.in: amino acids, partial charges, connectivity. Typically no changes.
      • all_nuc94_and_gr.in: general compounds, partial charges, connectivity. Add params for new compounds (antechamber); can modify partial charges.
      • User control: distance-dependent dielectric, dielectric value, vdW radii scaling, solvation energy scaling, dihedral energies switch

  31. Running OSPREY: GMEC-based, Ensemble-based, Residue entropy

  32. GMEC-based
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi -c KStar.cfg doDEE System.cfg DEE.cfg
      • Inputs: input structure, rotamer library, energy function, mutation search parameters
      • doDEE: energy minimization (MinDEE, BD, BRDEE), DACS
      Output:
          1 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -273.75 minE: -273.75 bestE: -273.75
          2 MET GLY ASP MET FCL 6 0 2 6 3 unMinE: -271.96 minE: -271.96 bestE: -273.75
          3 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -271.78 minE: -271.78 bestE: -273.75
          1 MET GLY SER ARG FCL 6 3 2 18 2 unMinE: -276.50 minE: -276.50 bestE: -276.50
          2 MET GLY SER ARG FCL 6 3 1 18 2 unMinE: -276.42 minE: -276.42 bestE: -276.50
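The doDEE output lines can be post-processed with a few lines of Python, for example to find the lowest-energy conformation across both blocks of results. The sketch assumes each line is an index, then residue names, then an equal number of integers (read here as rotamer indices, which is an interpretation of the slide), followed by the three labeled energies; the function name is hypothetical.

```python
def parse_conformation_line(line):
    """Split one doDEE output line into its index, residues, rotamers, and energies."""
    tokens = line.split()
    split_at = tokens.index("unMinE:")
    body = tokens[1:split_at]
    half = len(body) // 2  # first half: residue names, second half: integers
    return {
        "index": int(tokens[0]),
        "residues": body[:half],
        "rotamers": [int(t) for t in body[half:]],
        "unMinE": float(tokens[split_at + 1]),
        "minE": float(tokens[split_at + 3]),
        "bestE": float(tokens[split_at + 5]),
    }


output = """\
1 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -273.75 minE: -273.75 bestE: -273.75
2 MET GLY ASP MET FCL 6 0 2 6 3 unMinE: -271.96 minE: -271.96 bestE: -273.75
3 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -271.78 minE: -271.78 bestE: -273.75
1 MET GLY SER ARG FCL 6 3 2 18 2 unMinE: -276.50 minE: -276.50 bestE: -276.50
2 MET GLY SER ARG FCL 6 3 1 18 2 unMinE: -276.42 minE: -276.42 bestE: -276.50
"""
conformations = [parse_conformation_line(l) for l in output.splitlines()]
lowest = min(conformations, key=lambda c: c["minE"])
```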

  33. GMEC-based
          java -Xmx1024M KStar -c KStar.cfg genStructDEE System.cfg GenStruct.cfg
      • Inputs: input structure, rotamer library, energy function, struct generation parameters
      • genStructDEE: energy minimization (MinDEE, BD, BRDEE)
      Ranked conformations (first column: rank):
          1 MET GLY SER ARG FCL 6 3 2 18 2 unMinE: -276.50 minE: -276.50 bestE: -276.50
          2 MET GLY SER ARG FCL 6 3 1 18 2 unMinE: -276.42 minE: -276.42 bestE: -276.50
          3 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -273.75 minE: -273.75 bestE: -273.75
      doDEE output:
          1 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -273.75 minE: -273.75 bestE: -273.75
          2 MET GLY ASP MET FCL 6 0 2 6 3 unMinE: -271.96 minE: -271.96 bestE: -273.75
          3 MET GLY ASP ARG FCL 6 0 2 18 3 unMinE: -271.78 minE: -271.78 bestE: -273.75
          1 MET GLY SER ARG FCL 6 3 2 18 2 unMinE: -276.50 minE: -276.50 bestE: -276.50
          2 MET GLY SER ARG FCL 6 3 1 18 2 unMinE: -276.42 minE: -276.42 bestE: -276.50

  34. Ensemble-based: Protein-ligand binding
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi KSMaster System.cfg MutSearch.cfg
      • Inputs: bound structure, rotamer library, energy function, K* mutation search parameters
      • KSMaster: energy minimization (MinDEE, BD, BRDEE)
      • doSinglePartFn
      Output:
          1 4.25E+24 ILE TRP ILE ALA ALA ILE
          2 3.12E+24 TRP ASP ILE GLY ALA ILE
          3 2.18E+24 ILE THR ILE PHE ALA ILE
          4 1.45E+24 VAL THR ILE PHE ALA ILE
          5 1.41E+24 ILE THR ILE TYR ALA ILE

  35. Residue entropy
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi doResEntropy System.cfg ResEntropy.cfg
      • Inputs: input structure, rotamer library, energy function, mutation search parameters
      • doResEntropy
      Output columns: res ID, entropy, AA probabilities, # prox res
          257 2.33 0.2 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.1 0.0 0.0 0.0 0.2 0.0 0.1 18
          481 2.29 0.2 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.2 0.0 0.0 0.0 0.1 0.0 0.1 15
          32 2.29 0.3 0.0 0.1 0.1 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.1 23
          26 2.28 0.2 0.0 0.1 0.0 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.2 0.0 0.0 0.1 0.1 0.0 29
          163 2.26 0.3 0.0 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0.0 0.1 0.0 0.1 0.0 0.1 22
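The entropy column can be understood as a summary of how spread out the amino acid probabilities are at a position. The sketch below uses the standard Shannon form with a natural logarithm; whether OSPREY uses exactly this formula and log base is an assumption, and the probabilities printed on the slide are rounded too coarsely to reproduce its entropy values.

```python
import math


def residue_entropy(aa_probabilities):
    """Shannon entropy -sum(p * ln p); zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in aa_probabilities if p > 0.0)


uniform = residue_entropy([1.0 / 20] * 20)    # every amino acid equally likely
fixed = residue_entropy([1.0])                # a single amino acid type
peaked = residue_entropy([0.9, 0.05, 0.05])   # mostly one amino acid type
```

Positions that tolerate many amino acid types score high and are candidates for mutation, while positions dominated by one type score near zero.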

  36. Some important parameters
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi -c KStar.cfg doDEE System.cfg DEE.cfg
      KStar.cfg:
        energy function:
          hElect true
          hVDW false
          hSteric false
          distDepDielect true
          dielectConst 6.0
          vdwMult 0.95
          doDihedE true
          doSolvationE true
          solvScale 0.8
        steric filter:
          stericThresh 0.4
          softStericThresh 1.5
        rotamer libraries:
          rotFile LovellRotamer.dat
          grotFile GenericRotamers.dat
        volume filter:
          volFile AAVolumes.dat

  37. Some important parameters
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi -c KStar.cfg doDEE System.cfg DEE.cfg
      System.cfg:
        input pdb:
          pdbName 1amuFH.pdb
        design site:
          numInAS 4
          residueMap 239 278 299 301
        ligand:
          pdbLigNum 566
          ligAA false
        cofactor:
          numCofRes 1
          cofMap 567
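Because the design-site count and the residue list must agree, a quick consistency check on System.cfg can save a failed run. The sketch below assumes one parameter per line with whitespace-separated values, which is how the slide appears to lay the file out; the parser is a hypothetical helper, not part of OSPREY.

```python
def parse_cfg(text):
    """Map each parameter name to its list of whitespace-separated values."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        params[fields[0]] = fields[1:]
    return params


system_cfg = """\
pdbName 1amuFH.pdb
numInAS 4
residueMap 239 278 299 301
pdbLigNum 566
ligAA false
numCofRes 1
cofMap 567
"""
params = parse_cfg(system_cfg)
design_site = [int(r) for r in params["residueMap"]]
counts_agree = (
    len(design_site) == int(params["numInAS"][0])
    and len(params["cofMap"]) == int(params["numCofRes"][0])
)
```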

  38. Some important parameters
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi -c KStar.cfg doDEE System.cfg DEE.cfg
      DEE.cfg (partial):
        DACS:
          doDACS true
          distrDACS false
          initDepth 2
          subDepth 1
          diffFact 6
        minimization:
          doMinimize false
          minimizeBB false
          doBackrubs false
          backrubFile none
        reference energies:
          useEref true
        ligand in search:
          ligPresent false
          ligType none
        allowed mutations:
          resAllowed0 gly ala val leu ile tyr phe trp met
          …
          resAllowed3 gly ala val leu ile tyr phe trp met
        resuming:
          resumeSearch false
          resumeFilename runInfo.out.partial

  39. Some important parameters
          mpirun -machinefile ./machines -np 5 java -Xmx1024M KStar mpi KSMaster System.cfg MutSearch.cfg
      MutSearch.cfg (partial):
        volume filter / candidate mutants:
          mutFileName 1amuFCL_2MUT.mut
          numMutations 2
          targetVolume 620.0
          volumeWindow 100000000.0
        minimization:
          doMinimize false
          minimizeBB false
          doBackrubs false
          backrubFile none
        (1 − ε) accuracy:
          epsilon 0.03
        inter-mutation:
          gamma 0.01
        at most 1 repeat:
          repeatSearch true
        unbound struct:
          useUnboundStruct false
          unboundPdbName none
        allowed mutations:
          resAllowed0 gly ala val leu ile tyr phe trp met
        resuming:
          resumeSearch false
          resumeFilename 1amuFCL_MutSearch.partial

  40. Citing OSPREY. Citation categories: general citation; K* and MinDEE; BD; BRDEE; DACS; original K* publication

  41. OSPREY is open source!!!

  42. Acknowledgements: Bruce Donald, Ryan Lilien, Faisal Reza, Kyle Roberts, Daniel Keedy, Pablo Gainza, Donald Lab. Funding: NIH
