Standardizer
Molecular Cosmetics for Chemoinformatics
György Pirok, Nóra Máté, István Cseh, Szilárd Dóránt, Péter Kovács, Szabolcs Csepregi, Ferenc Csizmadia
Why standardize structures?
• Canonicalisation
  Making structures uniform without changing their chemical content, so that duplicates and functional groups can be recognized (aromatization, mesomers, tautomers, ...)
• Beautification
  Making the structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...)
• Modification
  Converting structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...)
It is often difficult to assign a standardization action to just one of these categories.
Canonicalisation
• Hydrogens
  making hydrogens explicit
  making hydrogens implicit
• Tautomers
  converting to canonical tautomer form
  transforming to user defined tautomer form
• Resonant structures
  aromatizing Kekulé rings
  converting to canonical mesomer form
  transforming to user defined mesomer form
• Other
  removing small fragments
  removing user defined fragments
  expanding stoichiometry
  setting the chiral flag
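As a rough illustration of the "removing small fragments" action listed above, the following sketch keeps only the largest component of a dot-separated SMILES string. It is a toy that works on SMILES text, not the Standardizer implementation: the class name `FragmentSketch`, the method `keepLargestFragment`, and the use of string length as a stand-in for fragment size are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Toy illustration only: operates on SMILES text, not on a parsed molecule. */
class FragmentSketch {

    /**
     * Keeps the longest dot-separated component of a SMILES string.
     * String length is a crude proxy for fragment size (an assumption of
     * this sketch); a real tool would compare atom counts or masses.
     */
    static String keepLargestFragment(String smiles) {
        return Arrays.stream(smiles.split("\\."))
                .max(Comparator.comparingInt(String::length))
                .orElse(smiles);
    }

    public static void main(String[] args) {
        // Sodium acetate written as two disconnected components.
        System.out.println(keepLargestFragment("CC(=O)[O-].[Na+]")); // prints CC(=O)[O-]
    }
}
```

Dropping the counter-ion this way changes the record's content, which is one reason the same action can count as canonicalisation in one workflow and as modification in another.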
Beautification
• Hydrogens
  making hydrogens implicit
• Cleaning
  calculating 2D coordinates
  reallocating wedge bonds
  template based cleaning
  3D geometry optimization
• Resonant structures
  converting aromatic rings to Kekulé format
• Groups
  contracting/expanding/ungrouping abbreviated and multiple groups
Template-based Cleaning
2D-coordinate calculation of macrocycles or bridged systems
Template-based Cleaning
aligning search results to the query
Canonicalization During Database Import
[Diagram: the client sends input structures to the server, where JChem Base / Cartridge runs the Standardizer with the canonicalization configuration; the canonicalized structures and the original structures are stored in the relational database.]
Sending Query to the Database
[Diagram: the client sends a query structure to the server, where JChem Base / Cartridge runs the Standardizer with the canonicalization configuration; the canonicalized query is compared to the canonicalized structures in the relational database.]
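The import and query steps can be sketched with a small in-memory stand-in for the database. This is not JChem code: `CanonicalIndexSketch`, its methods, and the text-replacement canonicalizer are hypothetical names and simplifications chosen for illustration, and the lookup is an exact match on the canonical string rather than a real structure search. The point it shows is that the same canonicalization is applied to stored structures and to the query, while the originals are kept for display.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical in-memory stand-in for the structure table; not JChem code. */
class CanonicalIndexSketch {
    private final UnaryOperator<String> canonicalizer;
    private final Map<String, List<String>> originalsByCanonical = new HashMap<>();

    CanonicalIndexSketch(UnaryOperator<String> canonicalizer) {
        this.canonicalizer = canonicalizer;
    }

    /** Import: store the original structure under its canonicalized form. */
    void importStructure(String original) {
        originalsByCanonical
                .computeIfAbsent(canonicalizer.apply(original), k -> new ArrayList<>())
                .add(original);
    }

    /** Query: canonicalize with the same configuration, then compare. */
    List<String> findDuplicates(String query) {
        return originalsByCanonical.getOrDefault(
                canonicalizer.apply(query), Collections.emptyList());
    }

    /** Toy canonicalizer: rewrites a charge-separated nitro group as text. */
    static String toyCanonicalize(String smiles) {
        return smiles.replace("[N+](=O)[O-]", "N(=O)=O");
    }

    public static void main(String[] args) {
        CanonicalIndexSketch index =
                new CanonicalIndexSketch(CanonicalIndexSketch::toyCanonicalize);
        index.importStructure("C[N+](=O)[O-]"); // nitromethane, charge-separated form
        // The query uses the other nitro representation but still matches.
        System.out.println(index.findDuplicates("CN(=O)=O")); // prints [C[N+](=O)[O-]]
    }
}
```

Keeping the originals next to the canonical key means a different configuration, such as a beautification one, can be applied to them later without re-importing.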
Displaying Result Structures
[Diagram: on the server, JChem Base / Cartridge passes the original structures from the relational database through the Standardizer with the beautification configuration; the beautified structures are returned to the client.]
Modification + custom transformations
API and command line interface
• Java API
    Standardizer st = new Standardizer(new File("standardize.xml"));
    st.standardize(mol);
• Command line
    standardize input.sdf -c config.xml -o output.smiles
How ChemAxon Can Help
• Free for non-commercial websites
• Free for academic teaching and research ("Academic Package")
• Free Academic Package to be extended to cover academic networks (campus-wide rollout)
Acknowledgments
• Ferenc Csizmadia
• Nóra Máté
• István Cseh
• Szabó Attila
• Szilárd Dóránt
• Péter Kovács
• Szabolcs Csepregi