
Developments ‘08…


spence
Presentation Transcript


  1. Developments ‘08… • Inclusion of intermolecular degrees of freedom • Changes of the genetic algorithm • Constrained sampling $q_i^{\min} < q_i \le q_i^{\max}$ • Hybrid islands, electrostatic forcing • Dynamic tabus • Buffered migrations • ProCheck structure selection (folding) • Divide & Conquer “Planetary” strategy • BestEffort deployment • Simulation results

  2. Intermolecular degrees of freedom • Loose fragments are detected and treated as ligands • Chromosomes (q1 q2 q3 … qn) now include real values: • Torsional angles • 3 Euler angles per ligand • 3 translations per ligand • The site must contain at least one fixed atom. • Translations (mapped onto [0…359.99], for homogeneity) position the topological center of the ligand within the box occupied by the free atoms of the site, unless a site_def.gen file is provided. • All Euler angles may evolve between 0 and 360°
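The chromosome layout described on this slide can be illustrated with a minimal sketch. The gene ordering (all torsions first, then 3 Euler angles and 3 translation genes per ligand), the linear mapping from the [0, 360) gene range onto each box edge, and the names `LigandPose`, `gene_to_coordinate` and `decode` are assumptions made for illustration; the slide does not specify them.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

GENE_RANGE = 360.0  # every gene lives on [0, 360), as stated on the slide


@dataclass
class LigandPose:
    euler_deg: Tuple[float, ...]  # three Euler angles, in degrees
    center: Tuple[float, ...]     # topological center, in box coordinates


def gene_to_coordinate(gene: float, lo: float, hi: float) -> float:
    """Linearly map a gene on [0, 360) onto the box edge [lo, hi) (assumed mapping)."""
    return lo + (gene % GENE_RANGE) / GENE_RANGE * (hi - lo)


def decode(chromosome: Sequence[float], n_torsions: int,
           box: Sequence[Tuple[float, float]]) -> Tuple[List[float], List[LigandPose]]:
    """Split a chromosome into torsions and one rigid-body pose per ligand.

    Hypothetical layout: torsions first, then (3 Euler + 3 translation) genes
    per ligand. box = ((xmin, xmax), (ymin, ymax), (zmin, zmax)).
    """
    torsions = [g % GENE_RANGE for g in chromosome[:n_torsions]]
    rest = chromosome[n_torsions:]
    if len(rest) % 6 != 0:
        raise ValueError("expected 6 rigid-body genes per ligand")
    poses = []
    for k in range(0, len(rest), 6):
        euler = tuple(g % GENE_RANGE for g in rest[k:k + 3])
        center = tuple(gene_to_coordinate(g, lo, hi)
                       for g, (lo, hi) in zip(rest[k + 3:k + 6], box))
        poses.append(LigandPose(euler, center))
    return torsions, poses
```

Keeping translations on the same numeric range as the angles means one mutation and crossover operator can act on every gene; only the decoding step needs to know which genes are positions.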

  3. One or two islands are allowed to use “heavy” alternative heuristics. • At each generation, there is a total (tunable) probability $p_{hyb}$ to use “directed” rather than classical mutations: • Torsional driving (Explorers), with a frequency of $p_{hyb}^2$, or • Electrostatic forcing, with a frequency of $p_{hyb}(1 - p_{hyb})$, replacing the former time-consuming Monte Carlo simulation: • Randomly increase the weight of electrostatic interactions • Perform gradient relaxation with the perturbed Hamiltonian • Reset the Hamiltonian and reoptimize
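The two frequencies quoted on this slide add up to the total directed-mutation probability, since p² + p(1 − p) = p. A minimal sketch of the corresponding operator choice follows; the function name `pick_mutation` and the string labels are hypothetical, and the uniform draw `u` is exposed as a parameter only to make the behaviour easy to check.

```python
import random
from typing import Optional


def pick_mutation(p_hyb: float, u: Optional[float] = None) -> str:
    """Choose a mutation operator for one generation on a hybrid island.

    torsional driving     : probability p_hyb ** 2
    electrostatic forcing : probability p_hyb * (1 - p_hyb)
    classical mutation    : probability 1 - p_hyb
    """
    if not 0.0 <= p_hyb <= 1.0:
        raise ValueError("p_hyb must lie in [0, 1]")
    if u is None:
        u = random.random()  # uniform draw on [0, 1)
    if u < p_hyb ** 2:
        return "torsional_driving"
    if u < p_hyb:  # interval [p_hyb**2, p_hyb) has width p_hyb * (1 - p_hyb)
        return "electrostatic_forcing"
    return "classical"
```

With `p_hyb = 0.5`, for example, a quarter of the generations use torsional driving, a quarter use electrostatic forcing, and half use classical mutations.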

  4. Dynamic Tabus • A geometry is within a tabu zone if, for all degrees of freedom $i$, the differences $D_i$ to the declared tabu geometry are below the minimal significant torsional differences $s_i$, i.e. $\max_i(D_i/s_i) < 1$ • However, making a binary “tabu or not tabu” decision for the current geometry does not suggest any way to escape the tabu zone. • A smooth and differentiable tabu penalty, decreasing with increasing $\max_i(D_i/s_i)$, might permit escaping the tabu area by following the gradients • A differentiable approximation $\mathrm{DMAX}(D_i/s_i) \approx \max_i(D_i/s_i) - 1$ was defined • If the energy $e$ of the current geometry falls below that of the declared tabu structure, $e_t$, then there is no longer any interest in leaving the tabu zone, which had been set too hastily!
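The tabu test and a differentiable penalty of the kind described on this slide can be sketched as follows. The slide does not give the actual form of DMAX or of the penalty: the log-sum-exp soft maximum, the quadratic penalty, and the `sharpness` and `height` parameters are stand-ins chosen for illustration, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def torsion_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def scaled_diffs(geom: Sequence[float], tabu: Sequence[float],
                 s: Sequence[float]) -> List[float]:
    """D_i / s_i for every degree of freedom."""
    return [torsion_diff(g, t) / si for g, t, si in zip(geom, tabu, s)]


def in_tabu_zone(geom, tabu, s) -> bool:
    """Binary test from the slide: max(D_i / s_i) < 1."""
    return max(scaled_diffs(geom, tabu, s)) < 1.0


def smooth_max(values: Sequence[float], sharpness: float = 20.0) -> float:
    """Log-sum-exp soft maximum (assumed form): differentiable and >= max(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(sharpness * (v - m)) for v in values)) / sharpness


def tabu_penalty(geom, tabu, s, e: float, e_t: float,
                 height: float = 1.0, sharpness: float = 20.0) -> float:
    """Penalty that decreases as the geometry moves away from the tabu structure."""
    if e < e_t:
        return 0.0  # more stable than the tabu structure: no reason to leave
    d = smooth_max(scaled_diffs(geom, tabu, s), sharpness) - 1.0  # negative inside the zone
    return height * min(0.0, d) ** 2  # zero, with zero slope, at and beyond the boundary
```

Because the penalty depends on a soft maximum rather than a hard one, its gradient is spread over all degrees of freedom that are close to the largest scaled difference, which is what lets a gradient-based optimizer walk out of the zone.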

  5. Buffered Migrations • Stalled evolution of an island triggers a population reset (apocalypse) in order to let the sampler move to other zones of phase space. • If a migrant, likely related to the former population, enters the island, it will be the fittest among the primitive post-reset individuals • It will have a lot of children and drive the natives to extinction • Strategy change: incoming migrants enter a buffer zone* and are released into the population only once its evolutionary dynamics seem to slow down • After 20 successive generations without progress, an island “opens” to migrants (in the meantime, natives should have become comparable to migrants; if not, they deserve extinction) *Hortefeux, B., Sarkozy, N., “L’Immigration Choisie”, pp. 1-29 in The Alien Menace, Le Pen, J.M. Ed., Vichy Press (2007)
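The buffering rule on this slide can be sketched as a small island object. The `Island` class, its field names, and the representation of an individual as an (energy, label) pair are assumptions for illustration; only the threshold of 20 generations without progress comes from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Individual = Tuple[float, str]  # (energy, label); lower energy means fitter


@dataclass
class Island:
    population: List[Individual]
    patience: int = 20  # generations without progress before the island opens
    buffer: List[Individual] = field(default_factory=list)
    stalled: int = 0
    best: float = float("inf")

    def receive(self, migrant: Individual) -> None:
        """Incoming migrants wait in the buffer instead of joining directly."""
        self.buffer.append(migrant)

    def end_generation(self) -> bool:
        """Update the stall counter; release the buffer once the island stalls.

        Returns True if buffered migrants were released in this generation.
        """
        current = min(energy for energy, _ in self.population)
        if current < self.best:
            self.best = current
            self.stalled = 0
        else:
            self.stalled += 1
        if self.stalled >= self.patience and self.buffer:
            self.population.extend(self.buffer)
            self.buffer.clear()
            self.stalled = 0
            return True
        return False
```

In a real sampler the population would change between calls to `end_generation`; the sketch only tracks the best energy seen so far and decides when the island opens.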

  6. ProCheck used to discard misfolded protein conformers… • Discard a structure if it: • Has more than one residue in a forbidden Ramachandran area • Has a goodness factor < -1.0 • Has no minimal contiguous sequence of secondary structure elements (AAA or BBB.*BBB) • Torsions of residues outside core regions are discarded from the list of preferential values used in seeding
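The three rejection criteria on this slide can be combined into one filter, sketched below. The sketch assumes the ProCheck results have already been extracted into three values (a count of residues in forbidden regions, the goodness factor, and a per-residue secondary-structure string in which A marks helix and B marks strand); that input format and the name `keep_conformer` are hypothetical, and no ProCheck output parsing is shown.

```python
import re

# Pattern quoted on the slide: one helical stretch, or two strand stretches.
SS_PATTERN = re.compile(r"AAA|BBB.*BBB")


def keep_conformer(n_forbidden: int, g_factor: float, ss_string: str) -> bool:
    """Return True if the conformer passes all three criteria."""
    if n_forbidden > 1:    # more than one residue in a forbidden Ramachandran area
        return False
    if g_factor < -1.0:    # goodness factor below the threshold
        return False
    # Require a minimal contiguous stretch of secondary structure.
    return SS_PATTERN.search(ss_string) is not None
```

Note that a single strand stretch is not enough under this pattern: `BBB.*BBB` requires two stretches, consistent with a sheet needing at least two strands.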

  7. Divide-and-Conquer Planetary Strategy • Allocates a number of nodes to be used for global (NG) and local (NL) sampling. • Global searches return a set of diverse low-energy conformers, representing potentially interesting cells. • Once such cells are found and stored in the Open Cell Repository (OCR), they are eligible for local sampling. • After the fifth local search, a cell will be closed (added to the Closed Cell Repository, CCR) if the current run failed to discover any more stable geometry.
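The closing rule for cells can be sketched as below. The sketch reads the rule as "from the fifth local search onward, a run that finds no more stable geometry closes the cell", which is one plausible interpretation of the slide; the `Cell` class, the dictionary-based repositories and the function name are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict

MIN_SEARCHES = 5  # a cell cannot be closed before its fifth local search


@dataclass
class Cell:
    best_energy: float
    local_searches: int = 0


def integrate_local_result(cell_id: str, run_best: float,
                           ocr: Dict[str, Cell], ccr: Dict[str, Cell]) -> str:
    """Record one finished local search; move the cell from OCR to CCR if it stalls."""
    cell = ocr[cell_id]
    cell.local_searches += 1
    improved = run_best < cell.best_energy
    if improved:
        cell.best_energy = run_best
    if cell.local_searches >= MIN_SEARCHES and not improved:
        ccr[cell_id] = ocr.pop(cell_id)  # closed: no longer eligible for local sampling
        return "closed"
    return "open"
```

A cell that keeps yielding more stable geometries therefore stays open indefinitely, while an unproductive one is retired after its fifth search.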

  8. Dispatcher: detect running jobs, detect result type • Launching Mode • Submit Global Search: set WALLTIME; select SEED and TABU from CCR (if enough entries); select operational parameters • Submit Local Search: set WALLTIME/l; pick a cell from OCR; use ICR as SEED • Result Integration Mode • Global Search: assign found geometries to cells; merge entries into OCR (keep the most stable geometry per cell); update the table of sampling success vs. operational parameters • Open Cell: add to OCR; add geometries to ICR • Closed Cell: add to CCR

  9. BestEffort Deployment • The scheduler now runs on a regularly reserved node, no longer on the frontend machine. • It checks the list of currently deployed besteffort nodes and decides which jobs are to be assigned to each of them. • The panspermia strategy (selecting seeds and tabus) and the selection of the next cell to be submitted to local sampling (based on energy & diversity criteria) may now be performed without risking an overload of the frontend machine • BestEffort nodes run waiting loops, expecting a job assignment file (global or local search, tabus, seeds, cell to explore, etc.). • The frontend runs a meta-scheduler which checks the state of the nodes every 2 minutes and tries to restart terminated tasks.

  10. Conclusions & Perspectives • The divide-and-conquer planetary strategy apparently works better than any previous one • 1L2Y folded in <24 h, whereas several days were needed before • However, there are no resources to conduct any decent benchmarking concerning the choice of Kmax, NG/NL, etc. • It is practically out of the question to use GRID 5000 for docking experiments on various systems!! • BestEffort deployment sucks! • Having jobs killed is not the worst thing that may happen • Having the one regular reservation (for the node running the scheduler) postponed lets all the other nodes do… nothing in BestEffort mode: they run an empty loop waiting for jobs no one submits! • The scheduler cannot be run in besteffort: getting it killed while it accesses the result databases may corrupt everything • We need some 100 dedicated nodes in order to make real progress.

  11. Ab initio folding of Trp cage 1L2Y: the native structure is (reproducibly) found and ranked as most stable. The planetary model used at most 20 nodes for 4…5 days. [Figure label: PDB]

  12. Ab initio folding of the Villin headpiece 1VII: the helical parts are seen to fold in a matter of days (40 nodes), although they are not properly oriented. [Figure label: PDB]

  13. Good news for the β-hairpin of Chignolin: out of the 10 best-ranked conformers, 8 are native-like • Number one is not, but in this case that may not be a problem [Figure labels: #1, #5, PDB]

  14. The Trp Zipper 1LE1 β-sheet is not the absolute energy minimum according to the current setup! • However, proper folding of 1LE1 could be achieved (though not reproducibly!) with previous force field versions: is the current setup too helix-specific? [Figure label: PDB]

  15. Docking simulations in the presence of flexible loops, such as the hinge region of Casein Kinase 2 (3BQC) • The pose of the ligand emodin and the loop geometry are correctly predicted (3BQC is not in the FF training set). [Figure labels: PDB, #1]

  16. Conclusion, Status & Needs… • We have working sampling & docking software, which must now be • Fine-tuned • Helped to reduce the scope of the search, by exploiting experimental (preferred rotamers, etc.) or empirical knowledge (required key interactions, fingerprints, etc.) • Also exploited in other 3D chemoinformatics approaches, with higher throughput than docking • Need our own CLUSTER (~100 nodes or more) • Invest in existing platforms for privileged access • Buy our own (with system manager) • One postdoc
