
Chemoinformatics

Chemoinformatics. P. Baldi, J. Chen, and S. J. Swamidass. School of Information and Computer Sciences, Institute for Genomics and Bioinformatics, University of California, Irvine.


Presentation Transcript


  1. Chemoinformatics. P. Baldi, J. Chen, and S. J. Swamidass. School of Information and Computer Sciences, Institute for Genomics and Bioinformatics, University of California, Irvine

  2. Overall Outline • Introduction • Molecular Representations • Chemical Data and Databases • Molecular Similarity • Chemical Reactions • Machine Learning and Other Predictive Methods • Molecular Docking and Drug Discovery

  3. 1. Introduction • What is Chemoinformatics? • Resources • Brief Historical Perspective • Chemical Space: Small Molecules • Overview of Problems and Methods

  4. What is Chemoinformatics? • Chemoinformatics encompasses the design, creation, organisation, management, retrieval, analysis, dissemination, visualization, and use of chemical information.

  5. What is Chemoinformatics? • "the mixing of information resources to transform data into information and information into knowledge, for the intended purpose of making better decisions faster in the arena of drug lead identification and optimization"

  6. What is Chemoinformatics? • “the set of computer algorithms and tools to store and analyse chemical data in the context of drug discovery and design projects” • However: drug design/discovery is to chemoinformatics what DNA/RNA/protein sequencing is to bioinformatics

  7. Resources • Books: J. Gasteiger and T. Engel (Editors) (2003). Chemoinformatics: A Textbook. Wiley. A. R. Leach and V. J. Gillet (2005). An Introduction to Chemoinformatics. Springer. • Journal: Journal of Chemical Information and Modeling • Web: http://cdb.ics.uci.edu and many more…

  8. Brief Historical Perspective • Historical perspective: physics, chemistry and biology • Theorem: computers/biology or computers/physics >> computers/chemistry • Proof: GenBank, Swissprot, PDB, Web (CERN), etc.

  9. Caveat: Long Tradition • Quantum Mechanics • Docking • Beilstein • ACS • Etc. • Gasteiger, J. (2006). "Chemoinformatics: a new field with a long tradition." Anal Bioanal Chem 384: 57-64.

  10. Possible Causes • Alchemy • Industrial age and early commercial applications of chemistry • Concurrent development of modern computers and modern biology • Scientific differences (theory/process) • Psychological perceptions (life/inert) • ACM

  11. Chemical Space: Small Molecules in Organic Chemistry • Understanding chemical space • Small molecules: • chemical synthesis • drug design • chemical genomics • systems biology • nanotechnology • etc.

  12. “A mathematician is a machine that converts coffee into theorems.” P. Erdős

  13. Cholesterol

  14. Aspirin

  15. “A chemoinformatician is a machine …”

  16. Chemical Space

  17. Chemoinformatics • Historical perspective: physics, chemistry and biology • Understanding chemical space • Small molecules (chemical synthesis, drug design, chemical genomics, systems biology, nanotechnology) • Predict physical, chemical, biological properties (classification/regression) • Build filters/tools to efficiently navigate chemical space to discover new drugs, new reactions, new “galaxies”, etc.

  18. Chemo/Bio Informatics: Two Key Ingredients • 1. Data • 2. Similarity Measures • Bioinformatics analogy and differences: • Data (GenBank, Swissprot, PDB) • Similarity (BLAST)
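
To make the "similarity measure" ingredient concrete, here is a minimal Python sketch. It assumes molecules are already encoded as sets of "on" fingerprint bit positions and uses the Tanimoto (Jaccard) coefficient; neither the encoding nor the choice of coefficient comes from this slide, and the molecule names and bit sets are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical query and database fingerprints (illustrative values only).
query = {1, 4, 7, 9}
database = {
    "mol_A": {1, 4, 7, 8},
    "mol_B": {2, 3},
    "mol_C": {1, 4, 7, 9},
}

# Rank database entries by decreasing similarity to the query,
# loosely analogous to ranking sequence hits in a BLAST search.
ranked = sorted(database, key=lambda name: tanimoto(query, database[name]), reverse=True)
print(ranked)
```

The same function can be reused with any fingerprint scheme that yields a set of bit positions.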

  19. Computational/Predictive Methods • Spectrum of methods: • Quantum Mechanics • … • Molecular Mechanics • … • Machine Learning

  20. Quantum Mechanics • Schrödinger's equation (time independent): $H\psi = E\psi$ • $H = -\frac{h^2}{8\pi^2 m}\frac{\partial^2}{\partial x^2} + V$ = Hamiltonian operator • $E$ = energy • $V$ = external potential (time independent) • $\psi = \psi(x,t)$ = (complex) wave function $= \psi(x)\,T(t)$ (time-independent case) • $|\psi|^2 = \psi^{*}\psi$ = probability density function (particle at position $x$)
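
The factorization of the wave function into a spatial part and a time part can be spelled out in a few lines. The sketch below assumes the standard time-dependent Schrödinger equation as a starting point (it is not shown on the slide) and a real energy $E$.

```latex
\begin{align}
  i\,\frac{h}{2\pi}\,\frac{\partial \psi(x,t)}{\partial t} &= H\,\psi(x,t),
    \qquad \psi(x,t) = \psi(x)\,T(t) \\
  i\,\frac{h}{2\pi}\,\frac{1}{T(t)}\,\frac{dT}{dt} &= \frac{H\,\psi(x)}{\psi(x)} = E \\
  T(t) &= e^{-2\pi i E t / h}, \qquad H\,\psi(x) = E\,\psi(x) \\
  |\psi(x,t)|^2 &= \psi^{*}(x)\,\psi(x) \quad \text{since } |T(t)|^2 = 1
\end{align}
```

Because $V$ does not depend on time, $H$ acts only on $x$, so each side of the second line must equal a constant, which is the energy $E$.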

  21. Schrödinger Equation • Partial differential eigenvalue equation • Where are the electrons and nuclei of a molecule in space? • Under a given set of conditions, what are their energies? • Difficult to solve exactly as the number of particles grows (electron-electron interactions, etc.) • Approximate methods • Ab initio • Semiempirical • 3D structures • Reaction mechanisms, rates
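
As a toy illustration of treating the Schrödinger equation as an eigenvalue problem, the Python sketch below finds the ground-state energy of a single particle in a one-dimensional box by a shooting method. Everything specific here is an assumption made for illustration: units are chosen so that h²/(8π²m) = 1, the potential is zero inside a box of length 1 with infinite walls, and the function names are hypothetical. It is not one of the ab initio or semiempirical methods named on the slide.

```python
import math


def psi_at_wall(energy, n_steps=1000, length=1.0):
    """Integrate psi'' = -energy * psi from x = 0 and return psi(length).

    Uses the central-difference recurrence
    psi[i+1] = 2*psi[i] - psi[i-1] - h^2 * energy * psi[i]
    with psi(0) = 0 and unit initial slope.
    """
    h = length / n_steps
    prev, curr = 0.0, h  # psi(0) = 0, psi(h) ~ h
    for _ in range(n_steps - 1):
        prev, curr = curr, 2.0 * curr - prev - h * h * energy * curr
    return curr


def ground_state_energy(lo=5.0, hi=15.0, tol=1e-10):
    """Bisect on the energy until psi vanishes at the far wall.

    The bracket [lo, hi] is assumed to contain only the lowest eigenvalue.
    """
    f_lo = psi_at_wall(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = psi_at_wall(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


energy = ground_state_energy()
# In these units the exact ground-state energy is pi^2.
print(energy, math.pi ** 2)
```

For one particle the numerical answer agrees closely with the analytic value; the difficulty mentioned on the slide arises because the cost of such direct approaches grows rapidly once many interacting particles are involved.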

  22. Ab Initio • Limited to tens of atoms and best performed using a cluster or supercomputer • Can be applied to organics, organo-metallics, and molecular fragments (e.g. catalytic components of an enzyme) • Vacuum or implicit solvent environment • Can be used to study ground, transition, and excited states (certain methods) • Specific implementations include: GAMESS, GAUSSIAN, etc.

  23. Semiempirical Methods • Semiempirical methods use parameters that compensate for neglecting some of the time-consuming mathematical terms in Schrödinger's equation, whereas ab initio methods include all such terms. • The parameters used by semiempirical methods can be derived from experimental measurements or by performing ab initio calculations on model systems. • Limited to hundreds of atoms • Can be applied to organics, organo-metallics, and small oligomers (peptide, nucleotide, saccharide) • Can be used to study ground, transition, and excited states (certain methods). • Specific implementations include: AMPAC, MOPAC, and ZINDO.

  24. Molecular Mechanics • Force field approximation • Ignore electrons • Calculate energy of a system as a function of nuclear positions

  25. Molecular Mechanics • Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interactions Energy

  26. Stretching Energy

  27. Bending Energy

  28. Torsion Energy

  29. Non-Bonded Energy
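
The four molecular mechanics terms (stretching, bending, torsion, non-bonded) can be sketched in Python as a sum of simple functions of nuclear geometry. The functional forms below (harmonic stretch and bend, cosine torsion, Lennard-Jones plus Coulomb) are common textbook choices and are an assumption here, since the formulas on these slides did not survive in the transcript. All function names and parameter values are hypothetical and in arbitrary units.

```python
import math


def stretch_energy(r, r0, k):
    """Harmonic bond stretch: k * (r - r0)^2 (assumed form)."""
    return k * (r - r0) ** 2


def bend_energy(theta, theta0, k):
    """Harmonic angle bend: k * (theta - theta0)^2, angles in radians (assumed form)."""
    return k * (theta - theta0) ** 2


def torsion_energy(tau, amplitude, periodicity, phase):
    """Cosine torsion: A * (1 + cos(n * tau - phase)) (assumed form)."""
    return amplitude * (1.0 + math.cos(periodicity * tau - phase))


def nonbonded_energy(r, epsilon, sigma, qi, qj, coulomb_const=1.0):
    """Lennard-Jones 12-6 term plus Coulomb term for one atom pair (assumed form)."""
    sr6 = (sigma / r) ** 6
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return lennard_jones + coulomb_const * qi * qj / r


def total_energy(bonds, angles, torsions, pairs):
    """Energy = stretching + bending + torsion + non-bonded, summed over all terms."""
    return (
        sum(stretch_energy(*b) for b in bonds)
        + sum(bend_energy(*a) for a in angles)
        + sum(torsion_energy(*t) for t in torsions)
        + sum(nonbonded_energy(*p) for p in pairs)
    )


# Hypothetical geometry and parameters, one term of each kind.
bonds = [(1.1, 1.0, 300.0)]                       # (r, r0, k)
angles = [(math.radians(109.5), math.radians(109.5), 50.0)]  # at equilibrium
torsions = [(math.pi, 1.0, 3, 0.0)]               # (tau, A, n, phase)
pairs = [(2 ** (1 / 6) * 3.4, 0.2, 3.4, 0.0, 0.0)]  # at the Lennard-Jones minimum, no charges

print(total_energy(bonds, angles, torsions, pairs))
```

Note that electrons never appear explicitly: every term depends only on distances and angles between nuclei, which is what makes the approach cheap compared with quantum mechanical methods.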

  30. Statistical/Machine Learning Methods • NNs and recursive NNs • GA • SGs • Graphical Models • Kernels • … • Representations are essential. Must either (1) deal with non-standard data structures of variable size; or (2) represent the data in a standard vector format.
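
Option (2), mapping variable-size molecules to a standard vector format, can be illustrated with a small Python sketch. It assumes a molecule is given as a list of atom symbols plus a list of bonded index pairs, enumerates short atom paths, and hashes each path into a fixed-length bit vector. This path-hashing scheme, the function name, and the parameters `n_bits` and `max_atoms` are illustrative assumptions and are not taken from the slides; bond orders and hydrogens are ignored for brevity.

```python
import zlib


def path_fingerprint(atoms, bonds, n_bits=64, max_atoms=3):
    """Hash all simple atom paths of up to max_atoms atoms into n_bits bits."""
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    bits = [0] * n_bits

    def extend(path):
        labels = [atoms[i] for i in path]
        # Use a direction-independent key so C-O and O-C set the same bit.
        key = min("-".join(labels), "-".join(reversed(labels)))
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        bits[zlib.crc32(key.encode()) % n_bits] = 1
        if len(path) < max_atoms:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return bits


# Two molecules of different size map to vectors of the same length.
ethanol = path_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
benzene_ring = path_fingerprint(["C"] * 6, [(i, (i + 1) % 6) for i in range(6)])
print(len(ethanol), len(benzene_ring))
```

A fixed-length vector of this kind can be fed to standard learners, whereas option (1) would instead require methods that operate directly on the molecular graph.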
