
ChEBI

ChEBI. Kirill Degtyarenko, EMBL-EBI / EPO. The team. Rafael Alcántara Michael Ashburner * Volker Ast * Michael Darsow * Paula de Matos Marcus Ennis Janna Hastings Alan McNaught * Inma Spiteri Christoph Steinbeck Martin Zbinden *. ChEBI: What is it?.


Presentation Transcript


  1. ChEBI Kirill Degtyarenko, EMBL-EBI / EPO

  2. The team • Rafael Alcántara • Michael Ashburner * • Volker Ast * • Michael Darsow * • Paula de Matos • Marcus Ennis • Janna Hastings • Alan McNaught * • Inma Spiteri • Christoph Steinbeck • Martin Zbinden *

  3. ChEBI: What is it? Chemical Entities of Biological Interest – an EBI database/dictionary of ‘biochemical compounds’

  4. What are the ‘biochemical compounds’? Can be defined as consisting of “molecules not directly encoded by the genome ... that are either the products of nature or are synthetic products used ... to intervene in the processes of living organisms” [Michael Ashburner]

  5. Molecular entity “Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity” [IUPAC “Gold Book”]

  6. In fact, ChEBI contains • Molecular entities • trans-vaccenic acid • Groups • trans-vaccenoyl group • Classes • fatty acids

  7. ‘Small molecules’? Yes, but big molecules as well! • alumina • amylose • metaborate • poly(vinyl alcohol)

  8. Current status (17.12.08)

  9. 1-D ChEBI • Numeric ID • Carefully checked terminology • Unambiguous ChEBI name • IUPAC names • Cross-references to free resources

  10. Unambiguous ChEBI name CHEBI:28918 L-adrenaline, not just ‘adrenaline’

  11. Systematic Name (IUPAC) 2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid [structure diagram with ring atoms numbered 1 to 6 on each benzene ring]

  12. Common Name • flufenamic acid (INN English) • acide flufénamique (INN French) • ácido flufenámico (INN Spanish) • acidum flufenamicum (INN Latin) • Flufenaminsäure (German)

  13. The Unpronounceables CHEBI:48935 (E)-roxithromycin IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one

  14. CHEBI:48935 (E)-roxithromycin INN: roxithromycin CHEBI:32109 (Z)-roxithromycin What is the common name of roxithromycin?

  15. CHEBI:48844 roxithromycin Roxithromycin (2) (Z)-roxithromycin (E)-roxithromycin

  16. CHEBI:18385 thiamine(1+) aka thiamine CHEBI:33283 thiamine(1+) chloride INN: thiamine CHEBI:49105 thiamine(2+) dichloride aka thiamine chloride hydrochloride aka thiamine hydrochloride What is thiamine?
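
The thiamine slide can be read as a small synonym index in which one common name points at more than one entity. The sketch below is only an illustration of that point, not ChEBI's data model: the dictionary layout and the helpers `build_index` and `ids_for_name` are hypothetical, and the only data used are the identifiers and names shown on the slide.

```python
from collections import defaultdict

# Entries copied from the slide: ChEBI ID -> (ChEBI name, other names).
ENTRIES = {
    "CHEBI:18385": ("thiamine(1+)", ["thiamine"]),
    "CHEBI:33283": ("thiamine(1+) chloride", ["thiamine"]),  # INN: thiamine
    "CHEBI:49105": (
        "thiamine(2+) dichloride",
        ["thiamine chloride hydrochloride", "thiamine hydrochloride"],
    ),
}


def build_index(entries):
    """Map every lower-cased name or synonym to the set of IDs that carry it."""
    index = defaultdict(set)
    for chebi_id, (name, synonyms) in entries.items():
        for label in [name, *synonyms]:
            index[label.lower()].add(chebi_id)
    return index


INDEX = build_index(ENTRIES)


def ids_for_name(name):
    """Return the sorted list of IDs known under this name (hypothetical helper)."""
    return sorted(INDEX.get(name.lower(), set()))


# The common name is ambiguous; the ChEBI name is not.
print(ids_for_name("thiamine"))
print(ids_for_name("thiamine(1+)"))
```

Looking up the common name returns two identifiers, while the ChEBI name returns exactly one, which is the reason for keeping an unambiguous name alongside the synonyms.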

  17. Need for 2-D • “Better to see the face than to hear the name” (Zen proverb) • Structures and identifiers based on structures offer new ways of crosslinking to other databases • Structure search

  18. Connection table
      ChEBI
        9 10  0  0  0  0            999 V2000
         11.8219   -7.2713    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
         11.8219   -8.0922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
         12.6074   -7.0165    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
         11.1072   -6.8574    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
         12.6039   -8.3505    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
         11.1072   -8.5027    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
         13.0886   -7.6818    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
         10.3923   -7.2713    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
         10.3888   -8.0922    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        1  2  2  0  0  0  0
        1  3  1  0  0  0  0
        1  4  1  0  0  0  0
        2  5  1  0  0  0  0
        2  6  1  0  0  0  0
        3  7  1  0  0  0  0
        4  8  2  0  0  0  0
        6  9  2  0  0  0  0
        5  7  2  0  0  0  0
        8  9  1  0  0  0  0
      M  END
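
To make the layout of the connection table concrete, here is a minimal sketch that reads the counts line, atom block and bond block from the slide. It assumes whitespace-separated fields, which is a simplification (real V2000 molfiles use fixed-width columns), and the function name `parse_v2000` is made up for this example.

```python
from collections import Counter

# Counts line, atom block and bond block exactly as listed on the slide.
MOL_BODY = """\
9 10 0 0 0 0 999 V2000
11.8219 -7.2713 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.8219 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6074 -7.0165 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -6.8574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6039 -8.3505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -8.5027 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
13.0886 -7.6818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.3923 -7.2713 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
10.3888 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 0 0
1 3 1 0 0 0 0
1 4 1 0 0 0 0
2 5 1 0 0 0 0
2 6 1 0 0 0 0
3 7 1 0 0 0 0
4 8 2 0 0 0 0
6 9 2 0 0 0 0
5 7 2 0 0 0 0
8 9 1 0 0 0 0
M END
"""


def parse_v2000(body):
    """Return (atoms, bonds) from a whitespace-separated V2000 body.

    atoms: list of (x, y, z, element symbol)
    bonds: list of (first atom index, second atom index, bond order), 1-based
    """
    rows = [line.split() for line in body.strip().splitlines()]
    n_atoms, n_bonds = int(rows[0][0]), int(rows[0][1])
    atom_rows = rows[1:1 + n_atoms]
    bond_rows = rows[1 + n_atoms:1 + n_atoms + n_bonds]
    atoms = [(float(x), float(y), float(z), sym) for x, y, z, sym, *_ in atom_rows]
    bonds = [(int(a), int(b), int(order)) for a, b, order, *_ in bond_rows]
    return atoms, bonds


atoms, bonds = parse_v2000(MOL_BODY)
elements = Counter(symbol for *_, symbol in atoms)
double_bonds = [(a, b) for a, b, order in bonds if order == 2]
print(elements, double_bonds)
```

The table lists heavy atoms only; hydrogens are implicit, which is why the element count here has no H entry.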

  19. 2-D ChEBI • One or more 2-D (or 3-D) connection tables • One is default • Autogenerated images (PNG) • Default diagrams should be unambiguous

  20. The Fine Art of chemical drawing

  21. Linear forms of monosaccharides

  22. Pyranose forms of monosaccharides

  23. Fused systems (R)-camphor ambiguous unambiguous

  24. Square planar geometry cisplatin transplatin

  25. From 2-D back to 1-D • SMILES • InChI

  26. SMILES (1) • Simplified Molecular Input Line Entry Specification • Developed by David Weininger in 1988 • Extended by others (e.g. Daylight) • String of standard ASCII characters • A number of valid SMILES can be produced for the same molecule

  27. SMILES (2) • N1C=NC2=C1C=NC=N2 • c1ncc2ncnc2n1 • C=1N\C=N/C\2=N/C=N\C=1/2 • c1ncnc2/N=C\Nc12 • n1cc2c(nc1)ncn2 • [H]c1nc([H])c2n([H])c([H])nc2n1
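
The six strings on this slide look very different, yet they describe the same heavy-atom skeleton. The sketch below is a rough sanity check under stated assumptions, not a canonicalisation: it only counts atoms with a regular expression, handles just the simple bracket atoms and organic-subset symbols that occur here, and the name `heavy_atom_counts` is hypothetical. Establishing true equivalence would need a cheminformatics toolkit.

```python
import re
from collections import Counter

# Raw strings keep the backslashes used for double-bond geometry intact.
SMILES = [
    r"N1C=NC2=C1C=NC=N2",
    r"c1ncc2ncnc2n1",
    r"C=1N\C=N/C\2=N/C=N\C=1/2",
    r"c1ncnc2/N=C\Nc12",
    r"n1cc2c(nc1)ncn2",
    r"[H]c1nc([H])c2n([H])c([H])nc2n1",
]

# Simple bracket atoms such as [H], two-letter halogens, then one-letter
# organic-subset atoms (upper case = aliphatic, lower case = aromatic).
ATOM = re.compile(r"\[([A-Z][a-z]?)[^\]]*\]|(Cl|Br)|([BCNOPSFI])|([bcnops])")


def heavy_atom_counts(smiles):
    """Count non-hydrogen atoms per element in a SMILES string."""
    counts = Counter()
    for match in ATOM.finditer(smiles):
        symbol = next(g for g in match.groups() if g).capitalize()
        if symbol != "H":
            counts[symbol] += 1
    return counts


for s in SMILES:
    print(s, dict(heavy_atom_counts(s)))
```

Every string yields five carbons and four nitrogens, consistent with the formula C5H4N4 in the InChI shown later in the deck.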

  28. InChI (1) • IUPAC International Chemical Identifier or InChI • Open source • Developed by Stein, Heller, Tchekhovskoi and McNaught • Used by NIST, PubChem, CML… and ChEBI

  29. InChI (2) InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H InChIKey=KDCGOANMDULRCW-QDQILVOLCG
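
An InChI is a sequence of layers separated by `/`, so the string on this slide can be taken apart with plain string handling. The following is a minimal sketch, assuming the version and formula always come first; the helper names `split_inchi` and `formula_counts` are invented for the example and are not part of the InChI software.

```python
import re

INCHI = "InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H"


def split_inchi(inchi):
    """Split an InChI string into (version, formula, remaining layers)."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI":
        raise ValueError("not an InChI string")
    version, formula, *layers = body.split("/")
    return version, formula, layers


def formula_counts(formula):
    """Turn a simple formula such as 'C5H4N4' into {'C': 5, 'H': 4, 'N': 4}."""
    return {
        element: int(count or 1)
        for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    }


version, formula, layers = split_inchi(INCHI)
print(version, formula_counts(formula))
for layer in layers:
    # The first character names the layer: c = connections, h = hydrogens,
    # f = start of the fixed-hydrogen part.
    print(layer[0], layer[1:])
```

Here the `c` layer carries the connectivity, the first `h` layer the hydrogens (including the mobile one), and the `/f/h7H` tail fixes that mobile hydrogen on atom 7.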

  30. Limitations (1) • Stereochemistry other than sp3 tetrahedral and sp2 trigonal planar • Polymers • Conformers • Radicals/different spin state • Topological isomers • Mixtures • Markush structures

  31. Limitations (2) cisplatin transplatin InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2

  32. 3-D ChEBI cisplatin

  33. Uncertainty and ambiguity in chemistry • Compositional uncertainty • Positional uncertainty • Configurational uncertainty • Conformational uncertainty

  34. Compositional uncertainty Examples • an alkali metal cation • vanadate(V) anion • [2H]ethanol

  35. Positional uncertainty Examples • L-bromohistidine residue • pteroic acid (several tautomers)

  36. Configurational uncertainty Examples • androstane • rel-(2R,3R)-2-amino-3-methylpentanoic acid • tetradec-11-enoic acid

  37. Conformational uncertainty Examples • cyclohexane: chair, boat, twist • protein secondary structure: α, β, …

  38. ChEBI ontology • Molecular structure ontology • Subatomic particle ontology • Role ontology • Biological role • Application

  39. L-adrenaline Molecular structure ontology • catecholamines Biological role • hormone Application • antiglaucoma • bronchodilator • cardiostimulant

  40. The family relations L-cystein-S-yl L-cysteine(•) L-cysteine zwitterion cysteine D-cysteine L-cysteino L-cysteine L-cysteinium L-cysteinyl L-cysteinate(1–) L-cysteine residue L-cysteinate(2–) L-cysteinate residue

  41. Relationships in ChEBI

  42. Is A relationship ∆ L-cysteine is a cysteine

  43. ∆ Is Enantiomer Of  L-cysteine is enantiomer of D-cysteine

  44. Has Part ⋄ L-cysteinium is part of L-cysteine hydrochloride

  45. ♯ ♯ Is Conjugate Acid Of L-cysteinium L-cysteinate(2–) L-cysteine is conjugate acid of L-cysteinate(1–)

  46. ♭ ♭ Is Conjugate Base Of L-cysteinium L-cysteinate(2–) L-cysteine L-cysteinate(1–)

  47. ♯ ♯ ♭ Acid/base relationships L-cysteinium L-cysteinate(2–) ♯ ♭ L-cysteine L-cysteinate(1–)

  48. Is Tautomer Of: L-cysteine is tautomer of L-cysteine zwitterion

  49. Is Tautomer Of 1H-pyrrole 2H-pyrrole 3H-pyrrole

  50. Has Parent Hydride ℋ salutaridinol has parent hydride morphinan; morphinan is parent hydride of salutaridinol
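
The relationship slides can be summarised as subject, relationship, object triples. The sketch below is one possible way to hold them in memory, assuming a plain list of tuples; the `SYMMETRIC` and `INVERSE` tables and the function `with_inverses` are hypothetical and only encode what the slides state (enantiomer and tautomer read the same in both directions, while conjugate acid/base, has part/is part of, and has parent hydride/is parent hydride of come in pairs).

```python
# (subject, relationship, object) triples taken from the slides.
TRIPLES = [
    ("L-cysteine", "is a", "cysteine"),
    ("L-cysteine", "is enantiomer of", "D-cysteine"),
    ("L-cysteinium", "is part of", "L-cysteine hydrochloride"),
    ("L-cysteine", "is conjugate acid of", "L-cysteinate(1–)"),
    ("L-cysteine", "is tautomer of", "L-cysteine zwitterion"),
    ("salutaridinol", "has parent hydride", "morphinan"),
]

# Relationships that hold in both directions under the same name.
SYMMETRIC = {"is enantiomer of", "is tautomer of"}

# Relationships whose reverse direction has a different name.
INVERSE = {
    "is conjugate acid of": "is conjugate base of",
    "is conjugate base of": "is conjugate acid of",
    "is part of": "has part",
    "has part": "is part of",
    "has parent hydride": "is parent hydride of",
    "is parent hydride of": "has parent hydride",
}


def with_inverses(triples):
    """Return the triples plus the reverse statements they imply."""
    result = set(triples)
    for subject, relation, obj in triples:
        if relation in SYMMETRIC:
            result.add((obj, relation, subject))
        if relation in INVERSE:
            result.add((obj, INVERSE[relation], subject))
    return result


CLOSED = with_inverses(TRIPLES)
for triple in sorted(CLOSED):
    print(" ".join(triple))
```

Note that "is a" is deliberately left out of both tables: it is directional and has no named inverse on the slides.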
