slide1

Interactive Datamining of Large-Scale Screening Datasets

Frank Oellien, Wolf D. Ihlenfeldt

Computer-Chemie-Centrum, University of Erlangen-Nuremberg

Klaus Engel, Thomas Ertl

Visualization and Interactive Systems Group, University of Stuttgart

slide2

Overview

  • Multi-variate and multi-dimensional datasets
  • Motivation
  • Information Visualization Techniques
  • Examples (ChemCodes Inc., NCI)
  • Demo
Chemical data

[Chart: current datasets, vertical axis from 0 to 18,000,000. Databases shown: Merck Katalog, Synopsys PG, ACX, NCI DTP, ChemInform, Spresi, Beilstein, CAS.]

Multi-Variate and Multi-Dimensional Numeric Datasets Today
  • Change in chemical synthesis technology
  • New technologies (high-throughput screening (HTS), combinatorial synthesis)
    • Experiments generate terabytes of data per year
  • Development of data mining and visualization tools has not kept pace
  • This is the most critical bottleneck in R&D today
  • Tools for interactive mining and information visualization are needed
Tools for Interactive Visualization of Multi-Variate and Multi-Dimensional Data
  • Standard applications
    • bar charts, 2D and pseudo-3D scatter plots, molecular spreadsheets
    • limited to small subsets
    • platform-dependent
  • Our goal: applications that
    • are simple to use
    • allow straightforward interpretation of results
    • offer generalized access to tabular numeric data
    • are platform-independent
slide7

Overview

  • Multi-variate and multi-dimensional datasets
  • Motivation
  • Information Visualization Techniques
  • Examples (ChemCodes Inc., NCI)
  • Demo
3D Tools for Interactive Information Visualization
  • Information visualization applications that use the 3D capabilities of modern clients
  • Glyph-based InfVis approaches
  • Volume-based InfVis approaches
Glyph-based InfVis Tools
  • 3 orthogonal axes
  • color
  • shape
  • size
  • transparency
  • surface effects
  • animation
  • up to ~100 Glyphs
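
Each glyph attribute listed above can carry one data dimension. The following is a minimal sketch of such a mapping, not the applet's actual code: the column names (`time`, `temperature`, `stoichiometry`, `yield`, `purity`, `functional_group`), the shape vocabulary, and the linear scaling are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical shape vocabulary for one categorical dimension.
SHAPES = {"ester": "sphere", "amide": "cube", "ketone": "cone"}


@dataclass
class Glyph:
    x: float      # position on the three orthogonal axes, each in 0..1
    y: float
    z: float
    color: float  # 0..1 position on a color ramp
    size: float   # 0..1 relative size
    shape: str


def scale(value, lo, hi):
    """Linearly rescale value from [lo, hi] to [0, 1]."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def to_glyph(record, ranges):
    """Map one tabular record onto glyph attributes (assumed assignment)."""
    s = {key: scale(record[key], lo, hi) for key, (lo, hi) in ranges.items()}
    return Glyph(
        x=s["time"],
        y=s["temperature"],
        z=s["stoichiometry"],
        color=s["yield"],
        size=s["purity"],
        shape=SHAPES.get(record["functional_group"], "sphere"),
    )


# Illustrative value ranges and a single made-up record.
ranges = {
    "time": (0, 24),
    "temperature": (0, 100),
    "stoichiometry": (1, 3),
    "yield": (0, 100),
    "purity": (0, 100),
}
record = {
    "time": 12, "temperature": 25, "stoichiometry": 2,
    "yield": 95, "purity": 50, "functional_group": "amide",
}
glyph = to_glyph(record, ranges)
```

With three axes plus color, size, and shape, this sketch encodes six dimensions per glyph; transparency, surface effects, and animation would extend it in the same way.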
Java/Java3D InfVis Applet
  • Java3DCanvas
  • Tool Panel (filters, selection tools, details)
  • ControlPanel
slide12

Java/Java3D InfVis Applet: 3D Tool Panel

  • Dynamic Filter Tools
  • Selection Tools
  • Detail Tools

Advantages of Volume-based InfVis Tools
  • Databases with millions of data points
    • Glyph-based InfVis approaches
      • produce millions of geometric primitives
      • interactive visualization not possible
    • Volume-based InfVis approaches
      • can handle a large number of data points
      • interactive visualization using low-cost graphics hardware is possible
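
One way to see why a volume scales better than glyphs is to bin the data points into a fixed voxel grid: the grid size is set by the chosen resolution, not by the number of points. The sketch below is a generic density-binning example under that assumption; the slides do not state how the volumes are actually built, and `voxelize`, `bounds`, and `resolution` are hypothetical names.

```python
from collections import Counter


def voxelize(points, bounds, resolution):
    """Count how many 3D points fall into each cell of a resolution^3 grid.

    points:     iterable of (x, y, z) tuples
    bounds:     [(lo, hi), (lo, hi), (lo, hi)], one pair per axis
    resolution: number of voxels along each axis
    """
    counts = Counter()
    for point in points:
        index = []
        for value, (lo, hi) in zip(point, bounds):
            t = (value - lo) / (hi - lo)
            # Clamp so that values on or beyond the bounds stay inside the grid.
            i = min(int(t * resolution), resolution - 1)
            index.append(max(i, 0))
        counts[tuple(index)] += 1
    return counts


# Four made-up points collapse into two occupied voxels of a 4x4x4 grid.
points = [(0.1, 0.1, 0.1), (0.15, 0.2, 0.05), (0.9, 0.9, 0.9), (1.0, 1.0, 1.0)]
density = voxelize(points, [(0.0, 1.0)] * 3, resolution=4)
```

However many points are added, the renderer only ever sees at most `resolution ** 3` voxels, whereas a glyph view needs at least one geometric primitive per point.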
slide15

Overview

  • Multi-variate and multi-dimensional datasets
  • Motivation
  • Information Visualization Techniques
  • Examples (ChemCodes Inc., NCI)
  • Demo
ChemCodes Reaction Database
  • The 100 most important functional groups (FGs) cover ~75% of chemistry
  • 100 standard reactions
  • Limits of standard reactions
  • Functional Group Compatibility
  • Generating Rules
  • Goal: Analysis of the reaction space
ChemCodes - Reaction Optimization I
  • Goal: reaction optimization to > 95% yield
  • 7 dimensions: reagent, solvent, time, temperature, stoichiometry, reagent order, FG compatibility
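
The optimization goal above amounts to a dynamic filter over the reaction dimensions. The following is a minimal sketch, assuming the experiments are available as tabular records; the field names, the `passes` helper, and the three records are invented for illustration and are not ChemCodes data.

```python
def passes(record, numeric_ranges, allowed_values):
    """Return True if the record satisfies every active filter."""
    for key, (lo, hi) in numeric_ranges.items():
        if not lo <= record[key] <= hi:
            return False
    for key, values in allowed_values.items():
        if record[key] not in values:
            return False
    return True


# Made-up records; reagent order and FG compatibility are omitted for brevity.
experiments = [
    {"reagent": "A", "solvent": "THF", "time_h": 2, "temperature_c": 25,
     "stoichiometry": 1.0, "yield_pct": 97},
    {"reagent": "A", "solvent": "DMF", "time_h": 6, "temperature_c": 80,
     "stoichiometry": 1.5, "yield_pct": 96},
    {"reagent": "B", "solvent": "THF", "time_h": 2, "temperature_c": 25,
     "stoichiometry": 1.0, "yield_pct": 60},
]

# Filter state as a slider or checkbox panel might produce it.
numeric_ranges = {"yield_pct": (95.01, 100), "temperature_c": (0, 50)}
allowed_values = {"solvent": {"THF", "DMF"}}

hits = [r for r in experiments if passes(r, numeric_ranges, allowed_values)]
```

Because each filter is only a range or a set of allowed values, changing one of them means re-evaluating a cheap predicate per record, which is what makes this style of filtering suitable for interactive use.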
ChemCodes - Reaction Planning
  • Functional group compatibility check
Example 2: NCI Anti-tumor / Anti-viral Database
  • Initiated in April 1990 (modified 1994)
  • ~250,000 compounds
  • ~30,000 with anti-tumor screening data
  • Enhanced NCI Database Browser
  • > 30 different molecular properties
  • up to 23 3D conformers per compound
slide23

Overview

  • Multi-variate and multi-dimensional datasets
  • Motivation
  • Information Visualization Techniques
  • Examples (ChemCodes Inc., NCI)
  • Demo
Acknowledgment
  • Prof. Johann Gasteiger, Computer-Chemie-Centrum, University of Erlangen-Nuremberg
  • Prof. Thomas Ertl, Dipl. Inf. Klaus Engel, Visualization and Interactive Systems Group, University of Stuttgart
  • Dr. Patrick Kiser, Dr. Gary Eichenbaum, ChemCodes Inc.
  • Marc Nicklaus, Laboratory of Medicinal Chemistry, NCI, NIH
  • Deutsche Forschungsgemeinschaft