This document outlines the rules and conventions for writing SMILES (Simplified Molecular Input Line Entry System) notations, which express chemical structures as text strings. It covers cyclic structures, restrictions on branches, isomeric configuration, and the use of numbers to mark where a ring starts and stops. Ring numbers are limited to 1 through 9, and special cases such as disconnected ions and metals are covered. Software applications that process SMILES, such as JMDraw and Marvin, are also listed as tools for molecular visualization and database searches.
Cyclic Structures
• Numbers indicate the start and stop of a ring
• The same number marks the start and the end of a ring, entered immediately after the start and end atoms
• Only the numbers 1 to 9 are used
• A number should appear only twice
• An atom can be associated with two consecutive numbers, e.g., naphthalene: c12ccccc1cccc2
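The ring-number rules above can be checked mechanically. The following is a minimal sketch, not part of any SMILES toolkit: the function names `ring_closure_counts` and `follows_ring_rules` are hypothetical, and the check only enforces the conventions stated on this slide (digits 1 to 9, each used exactly twice). Digits inside square brackets are skipped on the assumption that they are isotopes, hydrogen counts, or charges rather than ring numbers.

```python
from collections import Counter


def ring_closure_counts(smiles: str) -> Counter:
    """Count ring-closure digits, ignoring digits inside [...] bracket atoms."""
    counts = Counter()
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch.isdigit() and not in_bracket:
            counts[ch] += 1
    return counts


def follows_ring_rules(smiles: str) -> bool:
    """True if every ring digit is 1-9 and appears exactly twice."""
    counts = ring_closure_counts(smiles)
    return all(d in "123456789" and n == 2 for d, n in counts.items())


print(follows_ring_rules("c12ccccc1cccc2"))  # naphthalene -> True
print(follows_ring_rules("C1CCCCC"))         # ring opened, never closed -> False
```

In the naphthalene string the first atom carries both 1 and 2, so each digit still appears exactly twice and the check passes.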
SMILES Conventions
• Avoid two consecutive left parentheses if possible
• Strive for the fewest possible branches
• Tautomeric bonds are not designated; enter the appropriate form
Further Restrictions
• A branch cannot begin a SMILES notation
• A branch cannot immediately follow a double- or triple-bond symbol
• Example: C=(CC)C is invalid, but C(=CC)C and C(CC)=C are valid SMILES
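The branch restrictions lend themselves to a simple character-level check. The sketch below is illustrative only: `branch_rule_violations` is a hypothetical helper, it looks only at parentheses and the bond symbols = and #, and it does not validate the rest of the notation. It also reports two consecutive left parentheses, which the conventions discourage rather than forbid.

```python
def branch_rule_violations(smiles: str) -> list:
    """Return a list of messages for branch rules the string breaks."""
    problems = []
    # Rule: a branch cannot begin a SMILES notation.
    if smiles.startswith("("):
        problems.append("branch begins the notation")
    # Walk over adjacent character pairs looking at each "(".
    for prev, cur in zip(smiles, smiles[1:]):
        if cur != "(":
            continue
        # Rule: a branch cannot directly follow "=" or "#".
        if prev in "=#":
            problems.append(f"branch immediately follows '{prev}'")
        # Convention: avoid "((" where possible (discouraged, not invalid).
        if prev == "(":
            problems.append("two consecutive left parentheses")
    return problems


print(branch_rule_violations("C=(CC)C"))  # ["branch immediately follows '='"]
print(branch_rule_violations("C(=CC)C"))  # []
print(branch_rule_violations("C(CC)=C"))  # []
```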
SMILES Fragments
• Nitro: N(=O)(=O)
• Nitrate: ON(=O)(=O)
• Nitrite: ON(=O)
• Sulfonic acid: S(=O)(=O)O
• Cyanide/Nitrile: C#N
• Azide: N=N#N
• Azido: N+=N-
SMILES Metals [Al] [As] [Au] [Be] [Bi] [Cd] [Ca] [Fe] [Hg] [K] [Li] [Mg] [Na] [Ni] [Pt] [Sb] [Sn] [Zn] [Zr]
Disconnected Structures
• Tetramethylammonium bromide: C[N+](C)(C)C.[Br-]
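In a disconnected structure, a period separates the components and charges are written inside square brackets. The sketch below shows one way to split such a string and total the charges. It is an assumption-laden illustration: `split_components` and `net_charge` are hypothetical names, the regular expression only recognises charges written as +, -, ++, -- or with a trailing number such as +2, and sodium chloride is used as a stand-in example.

```python
import re

# Matches the charge part of a bracket atom, e.g. [N+], [Br-], [Fe+2], [O--].
_CHARGE = re.compile(r"\[[^\]+-]*([+-]+)(\d*)\]")


def split_components(smiles: str) -> list:
    """Split a SMILES string into its dot-separated components."""
    return smiles.split(".")


def net_charge(component: str) -> int:
    """Sum the formal charges written in bracket atoms of one component."""
    total = 0
    for signs, digits in _CHARGE.findall(component):
        sign = 1 if signs[0] == "+" else -1
        # "+2" means magnitude 2; "++" means magnitude 2; "+" means 1.
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


salt = "[Na+].[Cl-]"  # sodium chloride, used here as a stand-in example
parts = split_components(salt)
print(parts)                           # ['[Na+]', '[Cl-]']
print([net_charge(p) for p in parts])  # [1, -1]
```

For a neutral salt the component charges should sum to zero, which gives a quick sanity check on a hand-written notation.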
Isomeric and Chiral SMILES
• Isomeric configuration is indicated by forward and backward slashes: / \
• Examples:
• trans-1,2-dibromoethene: Br/C=C/Br
• cis-1,2-dibromoethene: Br/C=C\Br
• Chirality is indicated by the “@” symbol
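The slash convention in the two dibromoethene examples can be read off programmatically: matching slashes on either side of the double bond mean trans, opposite slashes mean cis. The following sketch assumes the simple linear form used above (one slash before the first double-bond atom and one after the second). The helper name `double_bond_geometry` is hypothetical, and the function does not handle slashes placed inside branches, where the interpretation differs.

```python
import re

# One slash before "=", one slash after it, with no other slash or "=" between.
_PATTERN = re.compile(r"([/\\])[^/\\=]*=[^/\\=]*([/\\])")


def double_bond_geometry(smiles: str):
    """Return 'trans', 'cis', or None for a simple A/C=C/B style notation."""
    match = _PATTERN.search(smiles)
    if match is None:
        return None  # no slash pair found around a double bond
    # Same direction on both sides -> trans; opposite directions -> cis.
    return "trans" if match.group(1) == match.group(2) else "cis"


print(double_bond_geometry("Br/C=C/Br"))   # trans
print(double_bond_geometry(r"Br/C=C\Br"))  # cis
print(double_bond_geometry("BrC=CBr"))     # None (configuration unspecified)
```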
Some Applications
• JMDraw/SMILESViewer (Christoph Steinbeck) http://jmdraw.sourceforge.net
• JME Molecular Editor (Peter Ertl)
• STN Express (SMILES as output)
• Tripos (dbtranslate: SMILES to MOL)
• Marvin (Ferenc Csizmadia) http://chemaxon.com/marvin/
• CACTVS http://www2.ccc.uni-erlangen.de/cactvs/
Another Application
• SMILESCAS Database http://esc.syrres.com/interkow/smilecas.htm
• Over 103,000 SMILES notations
• Input: CAS Registry Number
• Leads to a SMILES notation and thence to a structure search