Molecular Docking CBMB, TIGP Ming-Jing Hwang Institute of Biomedical Sciences Academia Sinica June 8, 2011
What is Docking? Receptor: protein, other macromolecules. Ligand: protein, DNA, RNA, small molecules. Docking identifies the best relative position (pose) that brings the ligand and the receptor together.
Why is docking important? • Most cellular functions are accomplished by proteins interacting with themselves and/or with other molecules. • A computational approach key to rational drug design.
HIV immunity is all in the amino acids (at the ligand binding site). Why do some people (called HIV controllers; ~1/300) who are infected with HIV not go on to develop AIDS? A worldwide genome-wide association study (GWAS) implicates structural changes in a protein binding site (The International HIV Controllers Study, Science, 2010). Tiny changes to the structure of the HLA-B protein confer immunity to HIV.
Two Typical Types of Ligands in Molecular (Protein) Docking • I: the ligand is a small molecule (protein-ligand docking) • II: the ligand is itself a protein (protein-protein docking)
(1) Protein-ligand (small molecule) docking • Predicts... • The pose of the molecule in the binding site • The binding affinity or a score representing the strength of binding • A Structure-Based Drug Design (SBDD) method • “structure” means “using protein structure” • Computational method that mimics the binding of a ligand to a protein • Given... Noel O’Boyle
Docking (pose search) & Scoring. [Figure: ligand + receptor → docking (pose search: GA, MC, etc.) → scoring (MMFF, empirical, etc.) → complex, compared against X-ray structure & ΔG, etc.] Qi-Chen (Eli Lilly)
Two Key Components of Protein-ligand Docking • An efficient search algorithm • A good scoring function (to determine the best “matching”) From: http://www.molsoft.com/images/icm/dockRef.jpg
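The two components above can be illustrated with a deliberately tiny example. This is a minimal sketch, not the method of any particular docking program: the contact/clash score, its cutoffs (4.0 and 2.0), the penalty of 10, and the names `score`, `translate` and `random_search` are all assumptions made for illustration, and the search only samples random translations (no rotation or ligand flexibility).

```python
import math
import random

def score(ligand, receptor, contact=4.0, clash=2.0):
    """Toy scoring function (assumed form): +1 per atom pair within the
    contact range, -10 per pair closer than the clash distance."""
    total = 0.0
    for a in ligand:
        for b in receptor:
            d = math.dist(a, b)
            if d < clash:
                total -= 10.0
            elif d <= contact:
                total += 1.0
    return total

def translate(coords, shift):
    """Rigidly translate a list of (x, y, z) atoms by a shift vector."""
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in coords]

def random_search(ligand, receptor, n_trials=2000, box=10.0, seed=0):
    """Toy search algorithm: try random translations, keep the best-scoring pose."""
    rng = random.Random(seed)
    best_pose, best_score = ligand, score(ligand, receptor)
    for _ in range(n_trials):
        shift = tuple(rng.uniform(-box, box) for _ in range(3))
        pose = translate(ligand, shift)
        s = score(pose, receptor)
        if s > best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score
```

Real programs replace both pieces: the search with genetic algorithms, Monte Carlo or incremental construction, and the score with force-field or empirical functions.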
Ligand binding site prediction. Calculating the solvent-accessible surface can identify clefts or holes, which are potential binding sites. SURFNET, Pocket-Finder, and Q-SiteFinder are three programs that do this; a number of other prediction methods also exist. Understanding Bioinformatics (Chap. 14), Marketa Zvelebil & Jeremy O. Baum
Virtual Screening. Library of drug candidates; receptor-based or ligand-based. Goal: given a receptor binding pocket and a ligand, quickly find the correct binding pose.
Pharmacophore. A pharmacophore is an abstract description of the molecular features that are necessary for molecular recognition of a ligand by a biological macromolecule. Pharmacophores are used to define the essential features of one or more molecules with the same biological activity. A database of diverse chemical compounds can then be searched for more molecules that share the same features located a similar distance apart from each other. Typical pharmacophore features mark where a molecule is hydrophobic, aromatic, a hydrogen bond acceptor, a hydrogen bond donor, a cation, or an anion. http://en.wikipedia.org/wiki/Pharmacophore
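The database search described on this slide (same features, a similar distance apart) can be sketched as follows. This is an illustrative assumption rather than how any specific pharmacophore tool works: features are represented as hypothetical `(feature_type, (x, y, z))` tuples, the 1.0 distance tolerance is arbitrary, and the brute-force permutation search is only practical for a handful of features.

```python
import itertools
import math

def matches_pharmacophore(query, molecule, tol=1.0):
    """Return True if some subset of the molecule's features has the same
    types as the query and all pairwise distances agree within `tol`."""
    n = len(query)
    for combo in itertools.permutations(molecule, n):
        # Feature types must match position by position.
        if any(q[0] != m[0] for q, m in zip(query, combo)):
            continue
        # Compare every pairwise distance; this is invariant to how the
        # molecule is rotated or translated in space.
        if all(
            abs(math.dist(query[i][1], query[j][1])
                - math.dist(combo[i][1], combo[j][1])) <= tol
            for i in range(n) for j in range(i + 1, n)
        ):
            return True
    return False
```

Comparing inter-feature distances rather than raw coordinates avoids having to superimpose each database molecule onto the query first.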
Virtual screening • Virtual screening is the computational or in silico analogue of biological screening • The aim is to score, rank or filter a set of chemical structures using one or more computational procedures • Docking is just one way to do this • It can be used • to help decide which compounds to screen (experimentally) • which libraries to synthesise • which compounds to purchase from an external company • to analyse the results of an experiment, such as a HTS run Noel O’Boyle
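The "score, rank or filter" idea can be written down in a few lines. This is a minimal sketch under stated assumptions: compounds are plain dictionaries, `score_fn` and `keep_fn` are hypothetical callables supplied by the user, and lower scores are assumed to be better (as with an estimated binding energy).

```python
def screen(library, score_fn, keep_fn=lambda compound: True, top_n=10):
    """Filter a compound library, rank the survivors, return the top hits.

    library  : iterable of compound records (any type)
    score_fn : compound -> number; lower is assumed to mean better
    keep_fn  : compound -> bool; cheap filter applied before ranking
    """
    kept = [c for c in library if keep_fn(c)]
    ranked = sorted(kept, key=score_fn)
    return ranked[:top_n]
```

For example, `screen(library, score_fn=lambda c: c["score"], keep_fn=lambda c: c["mw"] <= 500, top_n=2)` would discard heavy compounds and return the two best-scoring remaining ones; the `"score"` and `"mw"` keys are made up for this illustration.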
http://www.columbia.edu/cu/chemistry/CADD/slides/cadd-02-sherman.pdf
Docking Software (Scoring function) • Docking and scoring: DOCK (Kuntz et al. 1982), DOCK 4.0 (Ewing & Kuntz 1997), AutoDOCK (Goodsell & Olson 1990), AutoDOCK 3.0 (Morris et al. 1998), GOLD (Jones et al. 1997), FlexX (Rarey et al. 1996), GLIDE (Friesner et al. 2004), ADAM (Mizutani et al. 1994), CDOCKER (Wu et al. 2003), CombiDOCK (Sun et al. 1998), DIVALI (Clark & Ajay 1995), DockVision (Hart & Read 1992), FLOG (Miller et al. 1994), GEMDOCK (Yang & Chen 2004), Hammerhead (Welch et al. 1996), LIBDOCK (Diller & Merz 2001), MCDOCK (Liu & Wang 1999), PRO_LEADS (Baxter et al. 1998), SDOCKER (Wu et al. 2004), QXP (McMartin & Bohacek 1997), Validate (Head et al. 1996) • De novo design tools: LUDI (Boehm 1992), BUILDER (Roe & Kuntz 1995), SMOG (DeWitte et al. 1997), CONCEPTS (Pearlman & Murcko 1996), DLD/MCSS (Stultz & Karplus 2000), Genstar (Rotstein & Murcko 1993), Group-Build (Rotstein & Murcko 1993), Grow (Moon & Howe 1991), HOOK (Eisen et al. 1994), Legend (Nishibata & Itai 1993), MCDNLG (Gehlhaar et al. 1995), SPROUT (Gillet et al. 1993) Qi-Chen (Eli Lilly)
ADME & Toxicity ADME concerns can be more important than bioactivity. Most of these properties are difficult to predict. • Absorption • Distribution • Metabolism • Excretion
Prediction of Drug Off-Targets • Identify the drug binding site of the primary target • Identify off-targets by binding site similarity (SOIPPA) • Dock the drug to the off-targets (Ren et al., Nucleic Acids Res, 2010) Chang, UCSD
Final thoughts on protein-ligand docking • Protein-ligand docking is an essential tool for computational drug design • Widely used in pharmaceutical companies • Many success stories (see Kolb et al. Curr. Opin. Biotech., 2009, 20, 429) • But it’s not a silver bullet • The perfect scoring function has yet to be found • The performance varies from target to target, and scoring function to scoring function • See for example, Plewczynski et al, “Can we trust docking results? Evaluation of seven commonly used programs on PDBbind database”, J. Comp. Chem., Online 1 Sep 2010. • Care needs to be taken when preparing both the protein and the ligands • The more information you have (and use!), the better your chances • Targeted library, docking constraints, filtering poses, seeding with known actives, comparing with known crystal poses Noel O’Boyle
(2) Protein-protein docking Ligand (moved around) Receptor (fixed) Ref: Mendez, et al., Proteins, 2005, 60: 150
Why is protein-protein docking needed? Data from PDB (http://www.pdb.org/pdb/home/home.do): on 2010/11/22
Protein vs. small molecule as ligand • The computational problems (searching and scoring) are essentially the same, though at different levels of complexity. • Site of docking usually quite different. • Methods developed for small molecule docking are usually more advanced than for protein-protein docking.
Sampling & Scoring • Sampling: Grid + FFT (most efficient right now): Bates, Camacho, ClusPro, Eisenstein, Sternberg, and Weng; Monte Carlo: Abagyan, Baker, and Gray; Others: Bonvin (energy minimization and molecular dynamics), Wolfson (geometric docking) • Scoring: shape complementarity (FFT), van der Waals, Coulomb, desolvation, rotamer probabilities, contact pair potential. The use of scoring functions that combine various terms is common.
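The reason grid + FFT sampling is efficient is that a shape-complementarity score for every translation of the ligand grid can be obtained from one correlation, computed with Fourier transforms instead of one sum per translation. The sketch below assumes NumPy is available and uses hypothetical function names; how the grids are filled (for example positive values on the receptor surface, negative values in its core) and the outer loop over ligand rotations are left out, and translations wrap around periodically.

```python
import numpy as np

def fft_translation_scores(receptor_grid, ligand_grid):
    """Score every cyclic translation t of the ligand grid at once.

    scores[t] = sum over x of receptor_grid[x] * ligand_grid[x - t],
    evaluated via the correlation theorem.
    """
    R = np.fft.fftn(receptor_grid)
    L = np.fft.fftn(ligand_grid)
    return np.real(np.fft.ifftn(R * np.conj(L)))

def best_translation(receptor_grid, ligand_grid):
    """Return the grid translation with the highest score, and that score."""
    scores = fft_translation_scores(receptor_grid, ligand_grid)
    idx = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(i) for i in idx), float(scores[idx])
```

For an N×N×N grid this costs on the order of N³ log N per ligand rotation, versus N⁶ for evaluating each translation separately.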
Ex: Monte Carlo Gray, et al., JMB, 2003, 331: 281 Fernandez-Recio, et al., JMB, 2004, 335: 843
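As a reminder of what a Monte Carlo search does, here is a generic Metropolis loop. It is a minimal sketch, not the protocol of the papers cited on this slide: the pose is just a list of numbers, the `energy` function is supplied by the caller, and the step size, temperature and step count are arbitrary assumed defaults.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill moves, accept uphill
    moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def monte_carlo_minimize(energy, x0, step=0.5, temperature=1.0,
                         n_steps=5000, seed=0):
    """Random-walk search that remembers the lowest-energy state visited."""
    rng = random.Random(seed)
    x, e = list(x0), energy(x0)
    best_x, best_e = list(x), e
    for _ in range(n_steps):
        # Perturb every coordinate by a small random amount.
        trial = [xi + rng.uniform(-step, step) for xi in x]
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, temperature, rng):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = list(x), e
    return best_x, best_e
```

Occasionally accepting uphill moves is what lets the search escape local minima that a purely greedy minimizer would get stuck in.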
Two kinds of protein-protein docking problems • Bound docking The complex structure is known. The receptor and the ligand in the complex are pulled apart and reassembled. • Unbound docking Individually determined protein structures are used. Weng
Dataset and Benchmark • You can find ~10,000 protein-small molecule complexes and ~1000 protein-protein complexes in PDB • Small molecule docking benchmark datasets include Wang et al. (2003) and Brooks’ LPDB (2001) • CAPRI for protein-protein docking
CAPRI – introduction • CAPRI – Critical Assessment of PRedicted Interactions (a community-wide experiment on the comparative evaluation of protein-protein docking for structure prediction) • Round 1-2: T01-T07 (2002) • Round 3-5: T08-T14, T18-T19 (2003-2004) • Round 6-12: T20-T21, T24-T28 (2005-2007) ….. • Round 17: T38-T39 (2009) • Aim - objective tests • Individual groups that develop docking procedures predict the 3D structure of a protein complex from the known structures of the components. • The predicted structure is subsequently assessed by comparing it to the experimental structure (the target), determined most commonly by X-ray diffraction, which is deposited with CAPRI prior to publication. http://www.ebi.ac.uk/msd-srv/capri/
Schematic illustration of the quality measures in CAPRI. [Figure: prediction categories *** (high), ** (medium), * (acceptable), 0 (incorrect).] Mendez, et al., Proteins, 2005, 60: 150
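The star categories on this slide are assigned from three quantities: the fraction of native contacts recovered (fnat), the ligand RMSD (L-rmsd) and the interface RMSD (i-rmsd). The function below is a sketch of that classification; the thresholds are the ones commonly quoted for CAPRI and are stated here as an assumption, so they should be checked against Mendez et al. before being relied on. The function name is hypothetical.

```python
def capri_class(fnat, l_rmsd, i_rmsd):
    """Classify a predicted complex as '***', '**', '*' or '0'.

    fnat   : fraction of native residue-residue contacts reproduced (0..1)
    l_rmsd : ligand backbone RMSD after superimposing the receptors (Angstrom)
    i_rmsd : interface backbone RMSD (Angstrom)
    Thresholds are assumed from the commonly cited CAPRI criteria.
    """
    if fnat >= 0.5 and (l_rmsd <= 1.0 or i_rmsd <= 1.0):
        return "***"  # high quality
    if fnat >= 0.3 and (l_rmsd <= 5.0 or i_rmsd <= 2.0):
        return "**"   # medium quality
    if fnat >= 0.1 and (l_rmsd <= 10.0 or i_rmsd <= 4.0):
        return "*"    # acceptable
    return "0"        # incorrect
```

The checks run from strictest to loosest, so a model that has many native contacts but misses the tight RMSD limits falls through to the next category rather than being rejected.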
Summary of docking predictions in CAPRI 1st Mendez, et al., Proteins, 2003, 52: 51
Summary of docking predictions in CAPRI 2nd Mendez, et al., Proteins, 2005, 60: 150
Summary of docking predictions in CAPRI 3rd Lensink, et al., Proteins, 2007, 69: 704
Target 2: Antibody/VP6 Red: Crystal Structure Blue: Prediction 50/52; 1st Weng
Target 7: T Cell Receptor / Toxin Red: Crystal Structure Blue: Prediction 31/37, 1st Weng
Target 3: Antibody/Hemagglutinin Red: Crystal Structure Blue: Prediction 37/62, 3rd Weng
Target 6: Camelid Antibody/α-amylase Red: Crystal Structure Blue: Prediction 18/65 Weng
Target 1: HPr/HPrK Red: Crystal Structure Blue: Prediction 5/52 Weng
(ZDOCK) Weng
Recent trends in CAPRI • developing better scoring function • crude estimates of geometric and energetic complementarity during the rigid body search • modeling conformational flexibility • incorporation of nonstructural information • e.g. biochemical information, multiple alignments Lensink, et al., Proteins, 2007, 69: 704
Experimental data-assisted protein-protein docking (PPD). The docking solution is obtained directly by distance-constraint energy optimization: the pose that minimizes violation of the imposed distance constraints is selected.
Some sources of distance constraints: Tang C & Clore GM (2006) J Biomol NMR 36, 37-44. Green NS, Reisler E, & Houk KN (2001) Protein Sci 10, 1293-1304. Knight JL, Mekler V, Mukhopadhyay J, Ebright RH, & Levy RM (2005) Biophys J 88, 925-938. Bhatnagar J, Freed JH, & Crane BR (2007) Methods Enzymol 423, 117-133.
Incorporating experimental data can significantly increase success rates. [Figure: ZDOCK best possible success rate vs. ZDOCK top-1 success rate.] Data from ZDOCK website (http://zlab.bu.edu/zdock): ZDOCK 2.1 + benchmark 2.0
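One simple way to express "minimal violation of the distance constraints" is a penalty that is zero when a restrained atom pair is within its allowed distance and grows quadratically beyond it. The sketch below is an assumption about the functional form, not the energy function used in the work summarized here; restraints are hypothetical `(i, j, d_max)` triples meaning atom `i` of the receptor should lie within `d_max` of atom `j` of the ligand.

```python
import math

def restraint_violation(receptor, ligand_pose, restraints):
    """Sum of squared excess distances over all restraints.

    receptor, ligand_pose : lists of (x, y, z) coordinates
    restraints            : list of (i, j, d_max) upper-bound restraints
    """
    total = 0.0
    for i, j, d_max in restraints:
        d = math.dist(receptor[i], ligand_pose[j])
        total += max(0.0, d - d_max) ** 2  # no penalty when satisfied
    return total

def pick_best_pose(receptor, ligand_poses, restraints):
    """Choose the candidate pose with the smallest restraint violation."""
    return min(
        ligand_poses,
        key=lambda pose: restraint_violation(receptor, pose, restraints),
    )
```

Here the restraints only rank precomputed candidate poses; the same penalty could instead be added to a docking score and optimized directly.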
Methods for determining macro-molecular assembly structure Sali et al, 2003
Hybrid approach for solving macromolecular complex structures Sali et al, 2003
Structure of the nuclear pore complex. Alber et al. started with 456 randomly distributed protein components of the complex (coloured beads), corresponding to several copies of each of the 30 different constituent nucleoporins. They moved the beads randomly, while computationally minimizing thousands of biochemical and morphological restraints. (Nature, 2007)