smiles n.
SMILES PowerPoint Presentation
Presentation Transcript

  1. SMILES • Simplified Molecular Input Line Entry System (SMILES) • Widely used AND computationally efficient • Uses atomic symbols and a set of intuitive rules • Uses hydrogen-suppressed molecular graphs (HSMG)

  2. SMILES Bonds • Single: - (can be omitted) • Double: = • Triple: # • Aromatic: : (can be omitted)

  3. Butanols (structure drawings) • 2-Butanol • iso-Butanol • tert-Butanol

  4. SMILES Branches • Represented by enclosure in parentheses • Can be nested or stacked • Examples: • CC(O)CC is 2-Butanol • OCC(C)C is iso-Butanol • OC(C)(C)C is tert-Butanol
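The parenthesis rule for branches can be checked mechanically. The following is a minimal sketch, not a full SMILES parser; the function name `branch_depth` is a hypothetical helper introduced here for illustration. It reports how deeply branches are nested and rejects unbalanced parentheses.

```python
def branch_depth(smiles):
    """Return the deepest level of branch nesting in a SMILES string.

    Stacked branches such as (C)(C) stay at depth 1; nested branches
    such as (C(C)O) reach depth 2. Raises ValueError when the
    parentheses do not balance.
    """
    depth = 0
    max_depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')' in " + smiles)
    if depth != 0:
        raise ValueError("unmatched '(' in " + smiles)
    return max_depth


if __name__ == "__main__":
    for example in ("CC(O)CC", "OCC(C)C", "OC(C)(C)C"):
        print(example, branch_depth(example))
```

All three butanol examples use branches at depth 1; tert-butanol simply stacks two of them on the same atom.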

  5. SMILES Bonds • Ethene: C=C • Chloroethene: ClC=C • 1,1-Dichloroethene: ClC(Cl)=C • cis-1,2-Dichloroethene: ClC=CCl • Trichloroethene: ClC(Cl)=CCl • Perchloroethene: ClC(Cl)=C(Cl)Cl
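As a rough illustration of the bond symbols, the sketch below counts the explicit bond symbols in a SMILES string. The helper name `count_bond_symbols` is hypothetical, and mapping the aromatic symbol to an order of 1.5 is a common convention assumed here, not something the slides state. Characters inside square brackets are skipped, because there "-" denotes a charge rather than a bond.

```python
# Bond symbols from the slides mapped to a bond order.
# Assumption: aromatic bonds are given order 1.5 purely for illustration.
BOND_ORDER = {"-": 1, "=": 2, "#": 3, ":": 1.5}


def count_bond_symbols(smiles):
    """Count explicit bond symbols, ignoring anything inside [...]."""
    counts = {symbol: 0 for symbol in BOND_ORDER}
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch in counts and not in_bracket:
            counts[ch] += 1
    return counts


if __name__ == "__main__":
    print(count_bond_symbols("ClC(Cl)=C(Cl)Cl"))
```

Single bonds rarely show up in the counts because the "-" symbol is normally omitted.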

  6. SMILES Atoms • Use normal chemical symbols • Add punctuation symbols if necessary • No super- or subscripts

  7. SMILES Symbols • String of alphanumeric characters and certain punctuation symbols • Terminates at the first space encountered when read left to right • The ORGANIC SUBSET: B, C, N, O, P, S, F, Cl, Br, I

  8. Other SMILES Atoms • Aliphatic or nonaromatic carbon: C • Atom in aromatic ring: lowercase letter • Designate ring closure with pairs of matching digits, e.g. c1ccccc1 (or C1=CC=CC=C1) is Benzene, whereas C1CCCCC1 is Cyclohexane
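The atom rules above (organic subset, lowercase for aromatic atoms, everything else in square brackets) can be sketched as a small tokenizer. This is an illustrative regular expression, not a complete SMILES grammar; `ATOM_PATTERN` and `atom_tokens` are hypothetical names, and the set of lowercase aromatic symbols is an assumption.

```python
import re

# Bracket atoms first, then two-letter halogens (so "Cl" is not read as
# "C" followed by "l"), then one-letter organic-subset atoms, then
# lowercase aromatic atoms (assumed set: b, c, n, o, p, s).
ATOM_PATTERN = re.compile(r"\[[^\]]*\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def atom_tokens(smiles):
    """Return the atom tokens of a SMILES string, left to right."""
    return ATOM_PATTERN.findall(smiles)


if __name__ == "__main__":
    print(atom_tokens("c1ccccc1"))   # six aromatic carbons
    print(atom_tokens("C1CCCCC1"))   # six aliphatic carbons
```

Because hydrogens are suppressed, the token count equals the number of non-hydrogen atoms for strings that use only these forms.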

  9. SMILES Charges • Specify attached hydrogens and charges in square brackets • Number of attached hydrogens is the symbol H followed by optional digit

  10. SMILES Charges • [H+]: proton • [OH-]: hydroxyl anion • [OH3+]: hydronium cation • [Fe++]: iron(II) cation • [NH4+]: ammonium cation
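The bracket notation for hydrogens and charges can be read with a short parser. The sketch below handles only the simple forms shown on the slide (element symbol, optional H count, optional charge); `parse_bracket_atom` is a hypothetical helper, and isotopes or chirality marks are deliberately not supported.

```python
import re

# Element symbol, optional "H" with optional digits, optional charge
# written as repeated signs ("++") or a sign plus digits ("+2").
BRACKET_ATOM = re.compile(
    r"\[(?P<symbol>[A-Z][a-z]?)(?P<h>H\d*)?(?P<charge>\++\d*|-+\d*)?\]"
)


def parse_bracket_atom(token):
    """Return (symbol, attached_hydrogens, charge) for a bracket atom."""
    match = BRACKET_ATOM.fullmatch(token)
    if match is None:
        raise ValueError("unsupported bracket atom: " + token)
    h_text = match.group("h")
    if h_text is None:
        hydrogens = 0
    elif h_text == "H":
        hydrogens = 1            # "H" with no digit means one hydrogen
    else:
        hydrogens = int(h_text[1:])
    charge_text = match.group("charge") or ""
    if not charge_text:
        charge = 0
    else:
        sign = 1 if charge_text[0] == "+" else -1
        digits = charge_text.lstrip("+-")
        magnitude = int(digits) if digits else len(charge_text)
        charge = sign * magnitude
    return match.group("symbol"), hydrogens, charge


if __name__ == "__main__":
    for token in ("[H+]", "[OH-]", "[OH3+]", "[Fe++]", "[NH4+]"):
        print(token, parse_bracket_atom(token))
```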

  11. SMILES Cyclic Structures • Break one single or one aromatic bond in each ring • Number in any order • Designate ring-breaking atoms by the same digit following the atomic symbol

  12. Cyclic Structures • Numbers indicate start and stop of ring • Same number indicates start and end of the ring, entered immediately following the start/end atoms • Only numbers 1 – 9 are used • A number should appear only twice • An atom can be associated with 2 consecutive numbers, e.g., Naphthalene: c12ccccc1cccc2

  13. c12ccccc1cccc2 Naphthalene
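The pairing rule for ring-closure digits can be sketched as follows: the first occurrence of a digit opens a ring and the second closes it. `count_rings` is a hypothetical helper that assumes single-digit closures only and ignores digits inside square brackets (where they are hydrogen counts or charges).

```python
def count_rings(smiles):
    """Count ring closures by pairing matching digits.

    Raises ValueError if a ring-closure digit is left unpaired.
    """
    open_digits = set()
    rings = 0
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch in "0123456789" and not in_bracket:
            if ch in open_digits:
                open_digits.remove(ch)   # second occurrence closes the ring
                rings += 1
            else:
                open_digits.add(ch)      # first occurrence opens the ring
    if open_digits:
        raise ValueError("unclosed ring digit(s): " + ", ".join(sorted(open_digits)))
    return rings


if __name__ == "__main__":
    print(count_rings("c1ccccc1"))        # benzene
    print(count_rings("c12ccccc1cccc2"))  # naphthalene
```

In the naphthalene string the first atom carries both digits 1 and 2, which is how one atom takes part in two ring closures.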

  14. SMILES Conventions • Avoid two consecutive left parentheses if possible • Strive for the fewest number of possible branches • Tautomeric bonds are not designated; enter the appropriate form

  15. Further Restrictions • A branch cannot begin a SMILES notation • A branch cannot immediately follow a double- or triple-bond symbol • Example: C=(CC)C is invalid, but C(=CC)C and C(CC)=C are valid SMILES
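The two restrictions above are simple enough to test directly on the string. This is a minimal sketch that checks only these two rules and nothing else about validity; `branch_rule_problems` is a hypothetical name.

```python
def branch_rule_problems(smiles):
    """List violations of the two branch restrictions in a SMILES string."""
    problems = []
    if smiles.startswith("("):
        problems.append("a branch begins the notation")
    for index in range(1, len(smiles)):
        # An opening parenthesis right after "=" or "#" is not allowed.
        if smiles[index] == "(" and smiles[index - 1] in "=#":
            problems.append(
                "a branch follows '%s' at position %d" % (smiles[index - 1], index)
            )
    return problems


if __name__ == "__main__":
    for example in ("C=(CC)C", "C(=CC)C", "C(CC)=C"):
        print(example, branch_rule_problems(example) or "ok")
```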

  16. SMILES Fragments • Nitro: N(=O)(=O) • Nitrate: ON(=O)(=O) • Nitrite: ON(=O) • Sulfonic acid: S(=O)(=O)O • Cyanide/Nitrile: C#N • Azide: N=N#N • Azido: [N+]=[N-]

  17. SMILES Metals [Al] [As] [Au] [Be] [Bi] [Cd] [Ca] [Fe] [Hg] [K] [Li] [Mg] [Na] [Ni] [Pt] [Sb] [Sn] [Zn] [Zr]

  18. Disconnected Structures • Indicated by a dot • Tetramethylammonium bromide: C[N+](C)(C)C.[Br-]
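Splitting on the dot recovers the separate components of a disconnected structure. The sketch below also sums the bracketed charges in each part; `component_charges` is a hypothetical helper, and the sodium chloride input is an illustrative example that does not come from the slides.

```python
import re

# A bracket atom that ends in a charge: repeated signs or a sign plus digits.
CHARGED_ATOM = re.compile(r"\[[^\]]*?(\++|-+)(\d*)\]")


def component_charges(smiles):
    """Return a list of (component, net_charge) for a dot-separated SMILES."""
    result = []
    for part in smiles.split("."):
        total = 0
        for signs, digits in CHARGED_ATOM.findall(part):
            magnitude = int(digits) if digits else len(signs)
            total += magnitude if signs[0] == "+" else -magnitude
        result.append((part, total))
    return result


if __name__ == "__main__":
    # Illustrative input (not from the slides): sodium chloride.
    print(component_charges("[Na+].[Cl-]"))
```

For a salt, the component charges should sum to zero across the whole string.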

  19. Isomeric and Chiral SMILES • Isomeric configuration indicated by forward and backward slashes: / \ • Examples: • trans-1,2-dibromoethene: Br/C=C/Br (the direction of the slash continues) • cis-1,2-dibromoethene: Br/C=C\Br (the direction of the slash reverses) • Chirality indicated by the “@” symbol
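For the simple linear form used in these examples (substituent, slash, C=C, slash, substituent), the slash rule reduces to comparing the two slash characters. The sketch below is valid only for that form and is not a general stereochemistry reader; `double_bond_geometry` is a hypothetical helper.

```python
import re

# A slash before the first double-bond atom and a slash after the second,
# with no other slash or "=" in between.
SLASH_PAIR = re.compile(r"([/\\])[^/\\=]+=[^/\\=]+([/\\])")


def double_bond_geometry(smiles):
    """Return 'trans', 'cis', or None for strings shaped like X/C=C/Y."""
    match = SLASH_PAIR.search(smiles)
    if match is None:
        return None  # no geometry specified
    return "trans" if match.group(1) == match.group(2) else "cis"


if __name__ == "__main__":
    print(double_bond_geometry("Br/C=C/Br"))
    print(double_bond_geometry(r"Br/C=C\Br"))  # raw string keeps the backslash
```

In Python source a SMILES string containing a backslash is best written as a raw string, as in the second call.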

  20. Some Applications • JMDraw/SMILESViewer (Christoph Steinbeck) • JME Molecular Editor (Peter Ertl) • STN Express (SMILES as output) • Tripos (dbtranslate: SMILES to MOL) • Marvin (Ferenc Csizmadia) http://chemaxon.com/marvin/ • CACTVS http://www2.ccc.uni-erlangen.de/cactvs/

  21. Another Application • SMILESCAS Database http://www.syrres.com/esc/smilecas.htm Over 103,000 SMILES notations • Input CAS Registry Number • Leads to SMILES and thence to a structure search