Overview of ChEMBL Database

Overview of ChEMBL Database. Gareth Owen, ChEBI group, EMBL-EBI. Northwestern University, 16th October 2012. What is ChEMBL? An open access database for drug discovery, freely available (searchable and downloadable).

    Presentation Transcript
    1. Overview of ChEMBL Database. Gareth Owen, ChEBI group, EMBL-EBI. Northwestern University, 16th October 2012

    2. What is ChEMBL?
       • Open access database for drug discovery
       • Freely available (searchable and downloadable)
       • Content:
       • 2D structures & calculated properties (logP, MW, Lipinski, etc.)
       • Associated bioactivity data extracted from the primary medicinal chemistry journals such as J. Med. Chem.
       • Deposited data from neglected disease screening (e.g. malaria)
       • Subset of data from PubChem
       • Covers ~30 years of compound synthesis and testing
       • Annotated FDA-approved drugs
       • Secure searching (https://www.ebi.ac.uk/chembldb)

    3. ChEMBL Database: Content (ChEMBL14)
       • Targets: 9,003
       • Compounds: 1,376,469
       • Activities: 10,129,256*
       • Publications: 46,133
       • 60% proteins, 20% organisms, 20% cell lines
       • * Includes: ~5,900,000 (PubChem), ~100,000 (deposited malaria screening sets)
       • Assays are classified as:
       • Binding measurements
       • Functional assays
       • ADME/toxicity data

    4. ChEMBL Assays – Binding, Functional, ADMET. Binding Assays
       • Assays which directly measure the binding of a compound to a particular target
       • E.g., competition binding assays with a radioligand
       • Various endpoints measured, but most commonly reported are:
       • IC50 (half maximal inhibitory concentration)
       • Ki (binding affinity)
       • MIC (minimum inhibitory concentration)
       • % Inhibition (of activity)
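
IC50 and Ki from a competition binding assay are related but not interchangeable. The slides do not say how one is converted to the other; a commonly used relation for competitive binding is the Cheng-Prusoff equation, Ki = IC50 / (1 + [L]/Kd), where [L] is the radioligand concentration and Kd its dissociation constant. The sketch below assumes that relation, and the function names are illustrative only.

```python
import math


def cheng_prusoff_ki(ic50_nm: float, ligand_conc_nm: float, kd_nm: float) -> float:
    """Estimate Ki (nM) from IC50 (nM), assuming simple competitive binding."""
    if ic50_nm <= 0 or kd_nm <= 0 or ligand_conc_nm < 0:
        raise ValueError("concentrations must be positive (ligand may be zero)")
    return ic50_nm / (1.0 + ligand_conc_nm / kd_nm)


def p_value(conc_nm: float) -> float:
    """Return -log10 of a concentration given in nM, e.g. pIC50 or pKi."""
    if conc_nm <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_nm * 1e-9)  # convert nM to molar first


# Radioligand used at its Kd: Ki is half the measured IC50.
ki = cheng_prusoff_ki(ic50_nm=200.0, ligand_conc_nm=10.0, kd_nm=10.0)
```

Reporting values on the negative log scale (pIC50, pKi) makes potencies that span several orders of magnitude easier to compare.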

    5. Functional Assays
       • Whole organism assays (e.g., anti-infectives/parasitics)
       • Disease-derived cell-line (e.g., human ovarian cancer cell line cytotoxicity)
       • Tissue or cell-based disease model (e.g., glucose uptake by adipocytes)
       • Tissue or cell-based assay for target effect (e.g., contraction of guinea-pig ileum)
       • Cell-based assay over-expressing target (e.g., GPCR calcium mobilisation)
       [Slide labels: Target association; Disease association]

    6. ADMET Assays
       • Assays measuring: Absorption, Distribution, Metabolism, Excretion, Toxicity properties of compounds
       • Examples include:
       • Half-life of compound in rats
       • Tissue distribution of compound
       • Levels of metabolites

    7. ChEMBL Targets:
       • Protein (e.g., PDE5)
       • Protein complex (e.g., Nicotinic acetylcholine receptor)
       • Protein family (e.g., Muscarinic receptors)
       • Nucleic Acid (e.g., DNA)
       • Cell Line (e.g., HEK293 cells)
       • Tissue (e.g., Nervous)
       • Sub-cellular Fraction (e.g., Mitochondria)
       • Organism (e.g., Drosophila)

    8. Protein Targets
       • Each protein target linked to a sequence in UniProt
       • Information from UniProt used in ChEMBL to allow searching:
       • Protein name/description
       • Synonyms and gene names
       • Organism (and NCBI Tax ID)
       • Proteins in ChEMBL also classified according to family (e.g., Receptor, Kinase, Protease, Transporter, etc.)
       • Used for searching by target tree (Browse Targets)

    9. ChEMBL Compounds
       • Tautomers of the same compound are treated as the same compound. The form shown is the one drawn in the paper
       • Chemical structures are stored as .mol files
       • If the stereochemistry is known, the structure is drawn as a specific enantiomer
       • Unique compounds are identified using standard InChIs
       • Salts and parent molecules are grouped together for displaying bioactivity data, although activity data is recorded against the specific salt
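
The idea of using a standard InChI as the uniqueness key can be sketched as a simple grouping step. This is an illustration only, not ChEMBL's loading pipeline: the record layout and the `group_by_inchi` helper are hypothetical, and the InChI strings are assumed to have been generated beforehand by a cheminformatics toolkit.

```python
from collections import defaultdict

ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
METHANE = "InChI=1S/CH4/h1H4"

# Hypothetical input: the same structure reported under two names.
records = [
    {"name": "ethanol", "standard_inchi": ETHANOL},
    {"name": "ethyl alcohol", "standard_inchi": ETHANOL},
    {"name": "methane", "standard_inchi": METHANE},
]


def group_by_inchi(rows):
    """Group record names by standard InChI so each structure appears once."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["standard_inchi"]].append(row["name"])
    return dict(groups)


unique_compounds = group_by_inchi(records)
```

Two records with the same standard InChI collapse into one entry, so `unique_compounds` holds two structures for the three input records.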

    10. ChEMBL Home Page https://www.ebi.ac.uk/chembldb

    11. ChEMBL Main Search Page

    12. Drug Information: clickable structure; parent and salt forms; small molecule resources at the EBI

    13. Click to display data

    14. ChEBI Link:

    15. This will take you back to ChEMBL

    16. ChemSpider Links: The links work both ways, TO ChemSpider and FROM ChemSpider. They are matched on Standard_Inchi

    17. Wikipedia Links: We also have links with Wikipedia. These also use the Standard_Inchi as the common identifier. These links will link to the Compound Report Card in ChEMBL. The links are added by a ChemoBot and can be updated with each release, if required.

    18. Use Case 1 - Searching by Target
       • What is known about chemical structures that bind to a specific protein (Adenosine A2a)?
       • What is known about their potency/selectivity/ADMET properties?
       • Is there any protein structure data?

    19. Use Case 1 - Searching by Target in ChEMBL: choose sources to include in search

    20. Retrieving Bioactivity Data - Single Target: 3D structures; bioactivity data for target; display all bioactivity data for target; assay data for target; click pie chart to retrieve particular end-points

    21. Filtering Bioactivities: select targets of interest; select required activity types and define cut-offs, e.g., Ki < 100 nM
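
The filtering step (pick an activity type, then apply a cut-off) can be sketched outside the web interface as follows. The row layout, the unit table and the `filter_activities` helper are hypothetical and do not reflect the ChEMBL schema; the point is that values must be brought to a common unit before a cut-off such as Ki < 100 nM is applied.

```python
# Conversion factors to nanomolar for the units assumed in this sketch.
UNIT_TO_NM = {"pM": 1e-3, "nM": 1.0, "uM": 1e3, "mM": 1e6}


def filter_activities(rows, activity_type="Ki", cutoff_nm=100.0):
    """Return compounds whose activity of the given type is below the cut-off."""
    hits = []
    for row in rows:
        if row["type"] != activity_type:
            continue
        factor = UNIT_TO_NM.get(row["units"])
        if factor is None:
            continue  # skip units this sketch cannot convert
        if row["value"] * factor < cutoff_nm:
            hits.append(row["compound"])
    return hits


# Hypothetical activity rows for a single target.
rows = [
    {"compound": "cmpd-1", "type": "Ki", "value": 12.0, "units": "nM"},
    {"compound": "cmpd-2", "type": "Ki", "value": 0.5, "units": "uM"},    # 500 nM
    {"compound": "cmpd-3", "type": "IC50", "value": 3.0, "units": "nM"},  # wrong type
    {"compound": "cmpd-4", "type": "Ki", "value": 900.0, "units": "pM"},  # 0.9 nM
]

potent = filter_activities(rows)
```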

    22. Bioactivity Results: compound structures; activity values; assay details; target details; references

    23. Selectivity Data. For example: ChEMBL can be searched for all data on compounds that have adenosine A2a Ki values < 100 nM
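
Once all Ki values for a compound have been retrieved, selectivity is often summarised as a fold ratio between an off-target Ki and the primary-target Ki. The slides do not define a selectivity metric, so the ratio below is an assumption, and the target names and numbers are made up for illustration.

```python
def selectivity_ratios(ki_by_target, primary):
    """Return off-target Ki divided by primary-target Ki for each other target."""
    primary_ki = ki_by_target[primary]
    if primary_ki <= 0:
        raise ValueError("primary Ki must be positive")
    return {
        target: ki / primary_ki
        for target, ki in ki_by_target.items()
        if target != primary
    }


# Hypothetical Ki values (nM) for one compound across adenosine receptors.
ki_values = {"A2a": 10.0, "A1": 1000.0, "A3": 50.0}
ratios = selectivity_ratios(ki_values, primary="A2a")
```

A ratio well above 1 means the compound binds the primary target more tightly than the off-target.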

    24. ADMET Data: summary of ChEMBL bioavailability data for compounds with A2a Ki values < 100 nM; example of bioavailability data

    25. Use Case 2 – Searching by Structure
       • What compounds contain a particular substructure?
       • What is known about their bioactivities?
       • Known drugs/clinical trials

    26. Name; Lists of Identifiers
       • Types of synonyms:
       • Research codes
       • Trade names
       • INN, USAN
       Different sketchers

    27. Similarity and Substructure Searching: display/download bioactivity data
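
The slides do not say which similarity measure the search uses. A common choice for structure similarity is the Tanimoto coefficient over fingerprint bits, which is assumed in the sketch below; the fingerprints are toy sets of "on" bit positions rather than output from a real fingerprinting tool, and both helper functions are hypothetical.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not bits_a and not bits_b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def similarity_search(query_bits, library, threshold=0.7):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints: each set lists the bit positions that are switched on.
library = {
    "cmpd-a": {1, 2, 3, 4},
    "cmpd-b": {1, 2, 3, 9},
    "cmpd-c": {7, 8},
}
hits = similarity_search({1, 2, 3, 4}, library, threshold=0.5)
```

Substructure search is a different operation (graph matching rather than a score), so it is not covered by this sketch.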

    28. Filtering Data on Lipinski Properties etc.: display bioactivities of subset
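
A filter on Lipinski properties can be sketched with the usual rule-of-five thresholds (molecular weight at most 500, logP at most 5, at most 5 hydrogen-bond donors, at most 10 acceptors). The property values below are hypothetical, and allowing one violation is a common convention assumed here rather than something stated in the slides.

```python
def lipinski_violations(mw: float, logp: float, hbd: int, hba: int) -> int:
    """Count rule-of-five violations for one compound."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def passes_lipinski(props: dict, max_violations: int = 1) -> bool:
    """True if the compound has no more than the allowed number of violations."""
    count = lipinski_violations(props["mw"], props["logp"], props["hbd"], props["hba"])
    return count <= max_violations


# Hypothetical pre-computed properties for three compounds.
compounds = {
    "cmpd-1": {"mw": 350.0, "logp": 2.1, "hbd": 2, "hba": 5},
    "cmpd-2": {"mw": 650.0, "logp": 6.2, "hbd": 2, "hba": 5},
    "cmpd-3": {"mw": 520.0, "logp": 4.0, "hbd": 3, "hba": 8},
}
subset = [name for name, props in compounds.items() if passes_lipinski(props)]
```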

    29. Names; Bioactivities; Structure

    30. Properties; Cross-references; Clinical Trials; Bioactivities

    31. Links to Other Resources

    32. Links to Other Resources: PDBe - http://www.ebi.ac.uk/pdbe

    33. Marketed Drugs: select set of interest; export to Excel or export SDF

    34. Use Case 3 – Similar Targets
       • Are there any available data on compounds that bind to proteins similar to IRAK2?
       • For these compounds, what bioactivity data is there on compounds with related sub-structures?
       • Is there any crystal structure data on these proteins?

    35. Protein Sequence Search
       • More precise method for identifying targets
       • Input is a protein sequence of interest
       • Uses BLAST* algorithm to perform pair-wise comparisons between input sequence and all proteins in the Target Dictionary, to find most closely related matches
       • Results are scored according to similarity to input sequence (determined by number of amino acids that are identical or have similar properties)
       * Altschul SF et al., J Mol Biol. 215(3), p403-10 (1990)
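
The scoring idea in the last bullet (count positions that are identical or have similar properties) can be illustrated on two sequences that are already aligned. This is not BLAST: there is no database search, no substitution matrix and no alignment step, and the `SIMILAR_GROUPS` table is a coarse, hypothetical grouping of residues chosen for the example.

```python
# Coarse, illustrative residue groups; real tools use substitution matrices.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("STNQ")]


def score_alignment(seq_a: str, seq_b: str):
    """Return (identity, similarity) fractions for two pre-aligned sequences.

    Gaps are written as '-' and count as neither identical nor similar.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    length = len(seq_a)
    return identical / length, (identical + similar) / length


# M=M identical, K~R similar, V=V identical, L vs A neither.
identity, similarity = score_alignment("MKVL", "MRVA")
```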

    36. Use Case 3 – Similar Targets: protein sequence of interest, e.g., from UniProt (http://www.uniprot.org). Data on IRAK1, IRAK3 and IRAK4 but not IRAK2

    37. IRAK1, IRAK3 and IRAK4 data: identify sub-structure of interest. What other data is available on compounds with this sub-structure?

    38. Use Case 4 - Assay keyword search
       • Some ChEMBL data (e.g., functional assays) may not be mapped against molecular targets
       • May want to perform a more general search (e.g., for a disease process, animal model, cell type of interest)
       • Examples:
       • What compounds have been tested in disease models (cholesterol lowering)?
       • What data is available for brain penetration (brain to plasma ratio)?
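
A keyword search over assay descriptions can be sketched as a case-insensitive match in which every word of the query must occur in the description. The assay identifiers and descriptions are invented for illustration, and how the ChEMBL interface actually matches keywords is not described in the slides.

```python
def search_assays(assays: dict, phrase: str) -> list:
    """Return IDs of assays whose description contains every word of the phrase."""
    words = phrase.lower().split()
    return [
        assay_id
        for assay_id, description in assays.items()
        if all(word in description.lower() for word in words)
    ]


# Hypothetical assay descriptions.
assays = {
    "assay-1": "Cholesterol lowering activity in hamsters after oral dosing",
    "assay-2": "Brain to plasma ratio in rat at 1 hr",
    "assay-3": "Inhibition of cholesterol absorption in vitro",
}

cholesterol_hits = search_assays(assays, "cholesterol lowering")
brain_hits = search_assays(assays, "Brain to Plasma")
```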

    39. Assay Search for “Cholesterol Lowering”

    40. Assay Search for “Brain to Plasma”

    41. Accessing ChEMBL Data

    42. Useful Links
       • ChEMBL Blog: http://chembl.blogspot.com
       • If you would like help: chembl-help@ebi.ac.uk
       • For ChEMBL news and data releases subscribe to: http://listserver.ebi.ac.uk/mailman/listinfo/chembl-announce

    43. Acknowledgements
       • ChEMBL Group: John Overington, Anne Hersey, Anna Gaulton, Mark Davies, Jon Chambers, Louisa Bellis, Kazuyoshi Ikeda, Patricia Bento, Shaun McGlinchey, Yvonne Light, Felix Krueger, Ben Stauch, Ruth Akhtar, Francis Atkinson, Rita Santos
       • EMBL-EBI: Samuel Kerrien, Sandra Orchard, Bruno Aranda, Rafael Jimenez, Reactome, UniProt and ChEBI teams
       • Collaborators: Imperial Cancer Research, University of Dundee, University of Cambridge, Sanger Centre, University of Maryland, NCBI, TDR, IUPHAR, Bayer-Schering, Pfizer, GSK, Schering-Plough, MMV, Novartis, St Jude Children’s Research Hospital
       • Former Inpharmatica colleagues

    44. Exercises!