InstantJChem: a flexible chemical database system

Presentation Transcript

  1. InstantJChem: a flexible chemical database system. G. Marcou, D. Horvath, Laboratoire d’infochimie, Université de Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg

  2. Introduction • The goal is to present InstantJChem for the storage and manipulation of chemical information • General presentation • Database search • Creation of a database from scratch

  3. What is a database? • A database stores data on a precise subject in an ordered form. • A relational database stores information in tables that reference one another. • A relational database management system (RDBMS) is software that manages relational databases. • InstantJChem is neither a database nor an RDBMS.
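As a minimal sketch of what "tables that reference one another" means, the following uses Python's built-in `sqlite3` module as a stand-in RDBMS. The table names, column names, and the solubility label are hypothetical and chosen only for illustration; they are not taken from InstantJChem or from the SC100 data.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, smiles TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE solubility ("
    " id INTEGER PRIMARY KEY,"
    " compound_id INTEGER NOT NULL REFERENCES compound(id),"
    " label TEXT NOT NULL)"
)
conn.execute("INSERT INTO compound VALUES (1, 'c1ccccc1')")  # benzene
conn.execute("INSERT INTO compound VALUES (2, 'c1ccncc1')")  # pyridine
conn.execute("INSERT INTO solubility (compound_id, label) VALUES (2, 'Good')")

# The join follows the reference from the solubility table to the compound table.
rows = conn.execute(
    "SELECT c.smiles, s.label FROM compound c"
    " JOIN solubility s ON s.compound_id = c.id"
).fetchall()
print(rows)
```

The `REFERENCES` clause is the inter-table reference: each solubility row points at exactly one compound row, and the RDBMS is the component that enforces and resolves it.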

  4. What is InstantJChem? • InstantJChem is a user-friendly interface between an RDBMS, chemical information and the user. (Diagram: User, RDBMS, Chemical Information)

  5. Key concepts of InstantJChem • Projects • Schema • Databases and Tables • Entities • Data Trees • Views

  6. Exercise 1 Create a new project named IJCExercises…

  7. Key concept: Project A project contains resources and connections to one or more databases.

  8. Exercise 1 …and import the file SC100.SDF into it…

  9. Key concept: Schema (Schema/Database) A schema contains the connection to a database and special tables (JChemProperties).

  10. Key concept: Database and Tables Databases and tables are managed by the RDBMS. They are what actually store the information.

  11. What can be stored

  12. Key concept: Entities An entity is a representation of data. It is a single interface to conceptually different types of tables (Standard, Chemical, SQL, Extractions, etc.).

  13. Key concept: Data Trees A data tree is a collection of entities and views. It organizes information as a hierarchy (parent-child relationships between entities).

  14. Exercise 1 …Customize a browser for it.

  15. Key concept: Views A view is an interface to data. For simple data, a spreadsheet view is relevant. For complex relational data, a form is mandatory.

  16. Exercise 2 In the SC100 database, search for molecules containing fluorobenzene and for molecules containing pyridine. Use Substructure or Similarity search.

  17. Exercise 2 In the SC100 database, search for molecules containing fluorobenzene and for molecules containing pyridine. Use Substructure or Similarity search. Results, in slide order: Substructure search: 20 hits; Similarity search: 0 hits. Substructure search: 14 hits; Similarity search: 0 hits. Similarity search uses the Chemical Hashed Fingerprints defined at database creation.
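One plausible reading of the 0 similarity hits reported above: similarity compares whole fingerprints, so a small fragment query shares only a small fraction of its bits with a larger molecule, even when the molecule contains the fragment. The sketch below illustrates this with the Tanimoto coefficient on toy bit patterns. The bit patterns are made up, and this is not a description of how InstantJChem scores or thresholds similarity.

```python
def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient between two fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


def may_contain(query: int, target: int) -> bool:
    """Substructure screening: every bit set by the query must be set in the target."""
    return query & target == query


# Toy fingerprints (assumed for illustration, not computed from real molecules).
query = 0b0000_0000_0111    # small fragment: 3 bits set
target = 0b1111_1011_0111   # larger molecule that contains the fragment: 10 bits set

print(may_contain(query, target))  # the target passes the substructure screen
print(tanimoto(query, target))     # but its whole-molecule similarity is low
```

Here the target passes the substructure screen while its Tanimoto score is only 3 shared bits out of 10, which would fall below any reasonably strict similarity cutoff.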

  18. Chemical Hashed Fingerprints (CHF) • Pattern Length: number of bonds of a pattern • Fingerprint Length: total number of bits used to store the fingerprint • Bits per pattern: number of bits a pattern sets • Efficient annotation to accelerate structure search (Source: www.chemaxon.com)
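The three parameters above can be illustrated with a toy hashed fingerprint. This is a simplified sketch, not ChemAxon's algorithm: molecules are reduced to a linear chain of atom symbols (no branches, rings or bond orders), SHA-256 is used as an arbitrary hash, and the names `linear_patterns`, `hashed_fingerprint`, `max_bonds`, `fp_length` and `bits_per_pattern` are hypothetical.

```python
import hashlib


def linear_patterns(atoms, max_bonds):
    """Enumerate paths of 0..max_bonds bonds along a linear chain of atom symbols."""
    patterns = set()
    for bonds in range(max_bonds + 1):
        for i in range(len(atoms) - bonds):
            path = atoms[i:i + bonds + 1]
            # Store each path in a direction-independent form.
            patterns.add("-".join(min(path, path[::-1])))
    return patterns


def hashed_fingerprint(patterns, fp_length=512, bits_per_pattern=2):
    """Set `bits_per_pattern` bits in a `fp_length`-bit fingerprint for each pattern."""
    fp = 0
    for pattern in patterns:
        for k in range(bits_per_pattern):
            digest = hashlib.sha256(f"{pattern}#{k}".encode()).digest()
            position = int.from_bytes(digest[:8], "big") % fp_length
            fp |= 1 << position
    return fp


target_patterns = linear_patterns(["C", "C", "O"], max_bonds=2)
query_patterns = linear_patterns(["C", "O"], max_bonds=2)
target_fp = hashed_fingerprint(target_patterns)
query_fp = hashed_fingerprint(query_patterns)

# Every pattern of the query also occurs in the target, so its bits are a subset.
print(query_fp & target_fp == query_fp)
```

Because a substructure's patterns are a subset of the full molecule's patterns, a database row whose fingerprint lacks any query bit can be discarded without running the expensive atom-by-atom match. A longer fingerprint reduces bit collisions at the cost of storage.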

  19. Exercise 3 Combine molecules 25 and 89 into a pseudo-molecule to perform a superstructure query.

  20. Exercise 4 Use compound 46 as a Full and Full fragment query to search the database. Repeat after removing the bromide from the query.

  21. Structure Searches www.chemaxon.com

  22. Exercise 5 Search for benzene-containing compounds whose name contains “pyrimidin” and that are annotated as “Good” for aqueous solubility.

  23. Exercise 6 Search for compounds with at least one aromatic ring containing at least one nitrogen atom.

  24. Exercise 7 Search for compounds with MolWeight > 200 that do not contain a benzene ring.

  25. Exercise 8 Search for compounds with MolWeight > 200, then for compounds without a benzene ring, and take the union of the two hit lists.
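Exercises 7 and 8 use the same two conditions but combine them differently, which can be sketched with Python sets. The compound IDs below are hypothetical and are not results from the SC100 database.

```python
# Hypothetical hit lists, given as sets of compound IDs.
heavy = {2, 3, 5, 8}        # compounds with MolWeight > 200
no_benzene = {1, 2, 8, 9}   # compounds without a benzene ring

both = heavy & no_benzene     # Exercise 7: both conditions in a single query (AND)
either = heavy | no_benzene   # Exercise 8: union of the two hit lists (OR)

print(sorted(both))
print(sorted(either))
```

The single combined query keeps only compounds satisfying both conditions, whereas the union of hit lists keeps every compound satisfying at least one, so the second result is always at least as large as the first.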

  26. Exercise 9 Search for compounds possessing more than 4 microspecies at pH = 4.0…

  27. Exercise 9 … Export your hit list.

  28. Exercise 10 Import the file ISICCRsm.RDF into your project…

  29. Exercise 10 … Create a Browser for this database

  30. Exercise 11 Search for reactions with an imidazole ring in their reactants, then in their products.

  31. Exercise 12 Add to your Schema a new data tree and structure entity named AlkanBoilingPoint…

  32. Exercise 12 … and add a floating point value field named BoilingPoint.

  33. Exercise 13 Add to the AlkanBoilingPoint entity the following data.

  34. Exercise 14 Add to the AlkanBoilingPoint entity a new date field named Date and fill it.
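At the RDBMS level, Exercises 12 to 14 amount roughly to creating a table, inserting rows, and adding a column. The sketch below shows that idea with Python's built-in `sqlite3`; it is not the SQL that InstantJChem issues. The `structure` column (plain SMILES text) is an assumed stand-in for a real structure field, and the rows are commonly cited approximate boiling points in °C, not the data table from the slide.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")

# Exercise 12: a table with a structure column and a floating point field.
conn.execute(
    "CREATE TABLE AlkanBoilingPoint ("
    " id INTEGER PRIMARY KEY,"
    " structure TEXT NOT NULL,"  # SMILES text as a stand-in for a structure field
    " BoilingPoint REAL)"
)

# Exercise 13: add data (illustrative values, not the slide's table).
conn.executemany(
    "INSERT INTO AlkanBoilingPoint (structure, BoilingPoint) VALUES (?, ?)",
    [("C", -161.5), ("CC", -88.6), ("CCC", -42.1), ("CCCC", -0.5)],
)

# Exercise 14: add a date field and fill it.
conn.execute('ALTER TABLE AlkanBoilingPoint ADD COLUMN "Date" TEXT')
conn.execute('UPDATE AlkanBoilingPoint SET "Date" = ?', (date.today().isoformat(),))

result = [
    row[0]
    for row in conn.execute(
        "SELECT structure FROM AlkanBoilingPoint"
        " WHERE BoilingPoint > -50 ORDER BY BoilingPoint"
    )
]
print(result)
```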

  35. Exercise 15 Add to the AlkanBoilingPoint entity a calculated LogP value using a Chemical Terms field.

  36. Summary • Create a project and schema • Import data • Search by substructure, superstructure, similarity, and exact match • Search by keyword • Combine queries and result lists • Export query results • Create a new database

  37. Conclusion • InstantJChem is a chemoinformatics layer on top of a standard DBMS. • It provides many more chemoinformatics services (database overlap, QSPR modeling, plots, enumeration, scripting). (Diagram: DBMS, InstantJChem)