Proof of Concept Studies & Consortia Building Networks

The Basic Technology Research Programme


Background

  • Cross research council endeavour

    • administered by EPSRC

  • Funding for research to create a new technology

  • Change the way we do science

  • Underpin the future industrial base


Background

  • 15 research projects funded up to April 2003

  • Total funding for this period - £41M

  • To support large, long term, high risk, high impact research consortia

  • Encourage investigation of speculative ideas


Background

  • Two levels of funding

    • One year start up

    • Full grant up to five years

  • Two types of start up funding

    • Proof of concept

    • Consortia building networking


Proof of Concept Studies

  • One year funding up to £100K

  • Research to investigate feasibility of developing the new technology

  • Output – a business case for the next step of investigation to be submitted in May 2004

    • Basic Technology Programme

    • Existing Research Council initiatives

    • DTI programmes


Consortia Building Networks

  • Involvement of the users of the new technology at a very early stage

  • Funding to form networks & hold workshops


ParaSurf – in silico Screening Technology

  • Basic Technology Funding for October 2003 to September 2004

    • Proof of concept

    • Consortia building networking

  • Academic partners

    • University of Portsmouth

    • University of Erlangen

    • University of Southampton

    • University of Oxford

    • University of Aberdeen


ParaSurf – Proof of Concept Research Programme

  • Development of techniques to describe irregular solids & surfaces

  • Development of projection & pattern recognition techniques for non-planar colour-coded surfaces

    • spherical harmonics, molecular topology

  • Conformational analysis

  • Rigid body dynamics incorporating surface features

    • rigid parts of molecule treated as anisotropic solids linked by rotatable bonds

  • Investigate how best to generate prediction models using surface properties that define a low dimensional chemical space

    • QSAR, pattern recognition, artificial intelligence, analysis of surfaces

  • Benchmarking using Grid computing


Potential applications of the in silico screening technology

  • High throughput virtual docking

  • Physical property mapping

  • ADMET prediction

  • Long time-period simulation techniques

  • Crystallisation and solubility

  • Prediction of tautomers

  • Chemical reactivity and metabolism


Letchworth, 16th March 2004

ParaSurf Progress Report


Main Areas

  • Molecular Surfaces and Property Calculation

  • RGB Encoding & Pattern Recognition

  • Conformational Analysis

  • Rigid Body Molecular Dynamics

  • Analysis of Variables & QSAR models

  • Grid Computing

  • Consortium Building


Datasets

  • Small

    • Consensus set of 74 drug molecules (diverse)

    • QSAR set (31 CoMFA steroids)

  • Medium

    • WDI subset (2,400 compounds)

    • Harvard ChemBank dataset (2,000 compounds)

  • Large

    • WDI (50,000)

    • Maybridge (50,000)


Example Molecule

Allopurinol


Surface Definition & Local Property Calculation


Calculations

3D co-ordinates from CORINA

QM calculations with VAMP

Local Properties and surfaces from ParaSurf


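ParaSurf v1.0

  • Surfaces

    • Isodensity surfaces

    • Shrink wrap

    • Marching cube

    • Surfaces fit to spherical harmonics

  • Properties

    • MEP (molecular electrostatic potential), LIE (local ionization energy), LEA (local electron affinity) and LP (local polarizability)

    • Encoded at points on the surface

    • Encoded as spherical harmonic expansions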


Small molecule


RGB Encoding & Pattern Recognition


RGB Encoding

Each Local Property encoded as a colour

LIE encoded on Red channel

LEA encoded on Green Channel

LP encoded on Blue Channel
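
As a minimal sketch of the channel assignment above, assuming each property is linearly rescaled to the 0-255 range of its channel (the slides do not state the scaling actually used), the encoding could look like this in Python. The function names and the sample values are hypothetical.

```python
def scale_to_byte(values):
    """Linearly map a list of floats onto integers in 0..255."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant property carries no contrast; map it to 0.
        return [0 for _ in values]
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


def rgb_encode(lie, lea, lp):
    """Return one (R, G, B) tuple per surface point: LIE -> R, LEA -> G, LP -> B."""
    return list(zip(scale_to_byte(lie), scale_to_byte(lea), scale_to_byte(lp)))


# Hypothetical property values at three surface points (illustrative only).
lie = [10.0, 15.0, 11.0]
lea = [-2.0, 3.0, 0.0]
lp = [5.0, 0.0, 4.0]
colours = rgb_encode(lie, lea, lp)
print(colours)
```

Each surface point then carries a single colour, so three scalar fields can be compared with image-style pattern recognition.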


Allopurinol RGB Surface


RGB Encoding

Alternative Encoding

LIE

LEA

Absolute value of MEP


Allopurinol RGB Surface


Conformational Analysis

Efficient All Atom MD analysis (DASH)

Treated as time series (not Cluster Analysis)

Scales linearly with simulation length

No need for arbitrary choice of number of clusters

Can be analysed using Markov Chain methodology
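
To illustrate the Markov chain view, here is a small sketch that estimates transition probabilities from a time series of conformational state labels in a single pass, so the cost grows linearly with trajectory length. This is a generic estimator, not the DASH algorithm itself; the state labels and the function name are hypothetical.

```python
from collections import Counter


def transition_matrix(states):
    """Estimate row-normalised transition probabilities from a state sequence."""
    labels = sorted(set(states))
    # Count every consecutive (from, to) pair in one pass over the series.
    counts = Counter(zip(states, states[1:]))
    matrix = {}
    for a in labels:
        total = sum(counts[(a, b)] for b in labels)
        matrix[a] = {b: (counts[(a, b)] / total if total else 0.0) for b in labels}
    return matrix


# Hypothetical trajectory: each entry is the conformational state at one frame.
trajectory = ["A", "A", "B", "A", "B", "B", "A", "A"]
P = transition_matrix(trajectory)
print(P)
```

No cluster count has to be chosen in advance here: the set of states is whatever appears in the series.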


MD studies of Rosiglitazone


Rigid body molecular dynamics

Well-founded methodology, e.g. CNS / X-PLOR (Axel T. Brunger, Stanford University)

Idea is to use rigid groups to model flexibility:

In the ligand

and the protein binding site.

Allows time-steps of 10 fs to 20 fs.


QSAR models


Distribution of Properties


Correlation Matrix

        LIE     LEA     LP      MEP
LIE     1       0.44    0.26    0.39
LEA     0.44    1       0.58    0.47
LP      0.26    0.58    1       -0.1
MEP     0.39    0.47    -0.1    1


Descriptors

34 descriptors based on Normal Distribution

Principal Components

Spherical Harmonic Coefficients


Descriptors for LIE

  • Maximum value of the local ionization energy

  • Minimum value of the local ionization energy

  • Mean value of the local ionization energy

  • Range of the local ionization energy

  • Variance in the local ionization energy


Other Descriptors

Moments

Order 1 – Mean

Order 2 – Variance

Order 3 – Skewness

Order 4 – Kurtosis

Overlapping Gaussians

Derived from previous work on MD analysis
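
The four moments listed above can be computed from the property values sampled at the surface points. The sketch below assumes population statistics (division by n) and standardised, non-excess kurtosis; the slides do not say which conventions were used, and the sample values are hypothetical.

```python
import math


def moments(values):
    """Return (mean, variance, skewness, kurtosis) of a list of floats."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # population variance
    std = math.sqrt(var)
    if std == 0.0:
        # Skewness and kurtosis are undefined for a constant property.
        return mean, var, 0.0, 0.0
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n  # 3.0 for a normal distribution
    return mean, var, skew, kurt


# Hypothetical property values at five surface points.
mean, var, skew, kurt = moments([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, var, skew, kurt)
```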


QSAR models

Models derived from Local Properties

Surface Integral Model for Solvation Energy

RMS Error ~ 0.75 kcal

Drug Likeness

SOMs trained on WDI (drugs) & Maybridge (general)

Parameters from PC of Local Property Descriptors

Medium sized datasets superimposed on SOMs


GRID Computing

ParaSurf compiled on

SGI IRIX

Windows

Linux (SUSE)

IBM AIX

Future Platforms

SUN Solaris

GRID enabling at Portsmouth (Mark Baker), Southampton and Oxford.


Provisional Timings

SGI R10k, 256MB

VAMP ~ 30s/compound

ParaSurf ~ 10s/compound

Intel 1.8 Xeon/ AMD Athlon XP-2000+

ParaSurf ~ 2s/compound

SGI FUEL Workstation R14K

ParaSurf ~ 2s/compound


Conclusions

  • Properties can be calculated

  • Properties can be RGB encoded

  • Properties are local

  • Properties can be used for QSAR models


Computer vision methods for comparing molecular surfaces

  • Comparing and recognising 3D objects is an active research area in robotics and AI.

  • Fast methods have been developed for database indexing.

  • Rotationally invariant descriptors of 3D objects are possible.


Pattern matching on molecular surfaces

  • Can we recognise similar surfaces?

  • Can we recognise similar surface fragments?

  • Can we identify the most similar surface to our target?

  • How do we compare field descriptors on the molecular surface?


Rotationally invariant 3D object descriptors

  • Internal coordinates e.g. a distance matrix.

  • Energy distributions based on the spherical harmonics.

  • The spherical harmonic coefficients.

  • Radial integration, radial scanning, and invariant moments.
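
The first item, internal coordinates, is the easiest to check numerically. The sketch below builds a distance matrix for a few hypothetical points, rotates the points about the z axis, and confirms that the matrix is unchanged to within rounding. The helper names are illustrative, not taken from the slides.

```python
import math


def distance_matrix(points):
    """All pairwise Euclidean distances between 3D points."""
    return [[math.dist(p, q) for q in points] for p in points]


def rotate_z(points, angle):
    """Rotate every point about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


# Hypothetical surface points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
d_before = distance_matrix(points)
d_after = distance_matrix(rotate_z(points, 0.7))

# Largest change in any entry caused by the rotation.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(d_before, d_after)
    for a, b in zip(row_a, row_b)
)
print(max_diff)
```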


Surface comparison

Two different approaches:

  • Using spherical harmonic molecular surfaces [J. Comp. Chem. 20(4) 383-395; Ritchie and Kemp 1999; University of Aberdeen].

  • Partial molecular alignment via local structure analysis [J. Chem. Inf. Comput. Sci. 40(2) 503-512; Robinson, Lyne and Richards 2000; University of Oxford].


An example grid of surface points

A grid is placed on a ParaSurf surface in order to reduce the number of surface points from 4038 to 55.
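
One common way to thin a point set with a grid is to keep a single representative per grid cell. The sketch below does this with cubic cells and keeps the first point seen in each cell. Whether the slides used this rule is not stated, so the cell size, the selection rule and the function name are all assumptions.

```python
import math


def grid_reduce(points, cell):
    """Keep the first surface point that falls in each cubic grid cell of edge `cell`."""
    kept = {}
    for p in points:
        # Integer cell index along each axis.
        key = tuple(math.floor(c / cell) for c in p)
        kept.setdefault(key, p)
    return list(kept.values())


# Hypothetical surface points; five points fall into three cells.
points = [
    (0.1, 0.1, 0.1),
    (0.2, 0.3, 0.4),
    (1.5, 0.2, 0.1),
    (1.7, 0.9, 0.3),
    (-0.4, 0.1, 0.2),
]
reduced = grid_reduce(points, cell=1.0)
print(len(points), "->", len(reduced))
```

A larger cell gives fewer points and therefore fewer candidate alignments later, at the cost of surface detail.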


Partial molecular alignment

  • We do not know which points on the two surfaces need to be aligned with each other.

  • The essential approach is:

    all surface points on one surface are compared with all points on the other.

  • For two surfaces with M and N points, there are M × N possible alignments:

    • we want to reduce this large search space!


Voting pairs are possible alignments

The voting pairs can have a critical effect on the quality of the surface alignment.


The voting table

  • A voting table may list all matching pairs of surface points (i.e. all possible alignments).

  • Smart editing of the votes in the voting table can improve both speed and accuracy.

    • We want to only consider alignments between similar local features on the surfaces.

    • The more false votes we have in the voting table the harder it is to find the optimum alignment.


A distance matrix can be used to describe local surface features

The internal distance matrix can be used to distinguish between surface points.

By comparing rows and columns from distance matrices of different surfaces we can detect similar surface features.

[Figure: surface points P1, P2 and P3]


Selecting the voting pairs

Similar local features, or interest points, on the molecular surface can be identified using a distance matrix.

For a point on each surface:

  • Arrays of internal surface point distances are calculated for both points i.e. dist1[], dist2[].

  • After a crude alignment, the absolute difference of dist1[] and dist2[] indicates the similarity of this pair of points.
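
The selection step can be sketched as follows. The slides do not say how the "crude alignment" of dist1[] and dist2[] is done; this sketch assumes it is simply a sort of each array, and it keeps a pair when the summed absolute difference falls below a threshold. The threshold, the function names and the coordinates are hypothetical.

```python
import math


def distance_profile(points, index):
    """Sorted distances from points[index] to every other point on the same surface."""
    origin = points[index]
    return sorted(math.dist(origin, p) for i, p in enumerate(points) if i != index)


def profile_difference(dist1, dist2):
    """Sum of absolute differences over the shared length; smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(dist1, dist2))


def select_voting_pairs(surface_a, surface_b, threshold):
    """Return index pairs (i, j) whose distance profiles differ by at most `threshold`."""
    pairs = []
    for i in range(len(surface_a)):
        dist1 = distance_profile(surface_a, i)
        for j in range(len(surface_b)):
            dist2 = distance_profile(surface_b, j)
            if profile_difference(dist1, dist2) <= threshold:
                pairs.append((i, j))
    return pairs


# Hypothetical surfaces: surface_b is surface_a translated by (5, 5, 5).
surface_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
surface_b = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 7.0, 5.0)]
voting_pairs = select_voting_pairs(surface_a, surface_b, threshold=1e-9)
print(voting_pairs)
```

Tightening the threshold removes false votes from the voting table, and loosening it tolerates surfaces that are only approximately alike.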


Scoring the possible alignments

The optimum alignment is composed of a rotation R and a translation T.

  • Apply the current rotation r:

    • Score the translation vectors t = p - q of all voting pairs (p, q) using a gravitational potential.

    • High potentials identify clusters of similar translation vectors.

    • The vector with the highest potential is the optimum translation T.

  • Scoring all r gives R and T.
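
For one fixed rotation, the translation scoring might be sketched as below. The exact form of the potential is not preserved in the transcript, so this assumes a softened inverse-distance sum over the other translation vectors. The `softening` parameter, the function name and the sample pairs are hypothetical.

```python
import math


def best_translation(pairs, softening=0.1):
    """pairs: list of (p, q) 3D points. Returns (translation, potential) with the highest potential."""
    # One candidate translation t = p - q per voting pair.
    vectors = [tuple(a - b for a, b in zip(p, q)) for p, q in pairs]
    best, best_score = None, -math.inf
    for i, t in enumerate(vectors):
        # Vectors that lie close to many others collect a high potential.
        score = sum(
            1.0 / (math.dist(t, u) + softening)
            for j, u in enumerate(vectors)
            if j != i
        )
        if score > best_score:
            best, best_score = t, score
    return best, best_score


# Hypothetical voting pairs: three agree on roughly (1, 0, 0), one is an outlier.
pairs = [
    ((1.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
    ((2.1, 1.0, 0.0), (1.0, 1.0, 0.0)),
    ((0.9, 2.0, 0.0), (0.0, 2.0, 0.0)),
    ((5.0, 5.0, 5.0), (0.0, 0.0, 0.0)),
]
T, potential = best_translation(pairs)
print(T, potential)
```

Repeating this for every trial rotation r and keeping the rotation whose best translation has the highest potential would give the pair R and T described above.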


Scoring with a gravitational potential

[Figure: translation vectors (x, y coordinates plotted) for some voting pairs at example rotations]


Can we use the potential to compare aligned structures?


Can we get better alignments with more voting pairs?


Example alignments

[Figure: four example alignments, numbered 1 to 4]


Example 1: RMSD = 0.75

A

B


Example 2: RMSD = 1.05

A

B


Example 3: RMSD = 1.20

A

B


Example 4: RMSD = 1.89

A

B


Matching with the surface field descriptors: example 1

  • Surfaces are aligned (using a quick search method; e.g. 45° rotations).

  • Best N alignments are selected.

  • Each alignment is gently perturbed and optimised using the field descriptors.


Matching with the surface field descriptors: example 2

  • Align using the field descriptors’ values to identify suitable voting pairs:

    • only match on similar field descriptors.

  • Filtering can be achieved by aligning the fields separately.

  • More accurate alignments can be generated by combining field values.


Parameterisation

  • Voting pairs:

    • The distance between points in surface grid.

    • The number of voting pairs.

    • Identifying and selecting local features.

    • How to represent the fields at interest points.

  • Scoring:

    • Scoring function to identify the correct rotation and translation (e.g. gravitational potential).

    • Target function to compare different surface alignments (e.g. RMSD).

  • Optimising the alignments.


Molecular Surface Property Graphs

Characterize the behaviour of a property

f : S → ℝ

on a molecular surface S, in terms of a directed graph G on S derived from the gradient vector field

dx/dt = grad f(x)

Vertices(G) = fixed points of grad f (= critical points of f).

Edges(G) = stable and unstable manifolds of the saddle points.


Gradient Flow

  • minima

  • saddles

  • maxima


Molecular Surface Property Graph


Applications

  • Similarity

    • Pattern recognition methods

    • Maximal common subgraphs

  • Complementarity

    • Compare ligand graph with graph induced on ligand by receptor

  • QSAR

    • Topological indices


Example

  • S = Connolly surface

  • f(x) = electrostatic potential = ∑ q(i) / d(x,i)

  • Method

    • Locate critical points of f (Newton-Raphson).

    • Linearize at saddles, find eigenvectors of Hessian(f).

    • Integrate the gradient vector field forward in time from 2 points on the unstable eigenvector, and backward in time from 2 points on the stable eigenvector (Runge-Kutta).

    • Integrate to the boundary of the Connolly surface patch, then continue on the adjacent patch until reaching another critical point.
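
The property f and its gradient are the two ingredients every step of the method needs. The sketch below evaluates f(x) = ∑ q(i) / d(x,i) and its analytic gradient for point charges, in arbitrary units. It does not implement the Newton-Raphson search or the Runge-Kutta integration on the surface, and the charges, positions and function names are hypothetical.

```python
import math


def potential(x, charges):
    """charges: list of (q, position). Returns the sum of q / distance (arbitrary units)."""
    return sum(q / math.dist(x, pos) for q, pos in charges)


def gradient(x, charges):
    """Analytic gradient of the potential: sum of -q * (x - pos) / distance**3."""
    g = [0.0, 0.0, 0.0]
    for q, pos in charges:
        d = math.dist(x, pos)
        for k in range(3):
            g[k] -= q * (x[k] - pos[k]) / d ** 3
    return tuple(g)


# Hypothetical pair of opposite unit charges and a probe point equidistant from both.
charges = [(1.0, (0.0, 0.0, 0.0)), (-1.0, (1.0, 0.0, 0.0))]
x = (0.5, 1.0, 0.0)
phi = potential(x, charges)
g = gradient(x, charges)
print(phi, g)
```

On a surface, critical points are where the component of this gradient tangent to S vanishes, which is the condition a Newton-Raphson search would solve.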


Allopurinol

8 maxima

7 minima

13 saddles

#maxima – #saddles + #minima = χ(S) = 2
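
As a worked check of the counts above, assuming S is a closed surface with the topology of a sphere (so its Euler characteristic is 2):

```latex
\#\text{maxima} - \#\text{saddles} + \#\text{minima} = 8 - 13 + 7 = 2 = \chi(S)
```

A mismatch in this sum would indicate that the search had missed a critical point.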


Work in Progress …

  • Implementation for

    S = spherical harmonic surface

    f = MEP, LIE, LEA and LP

    • Use images of triangulation points as starting points for Newton-Raphson search for critical points.

    • Automatic differentiation.


Summary

  • Molecular surfaces

  • QM properties presented on surface

  • Spherical harmonic representation (Dave Ritchie)

  • Critical features (Dave Whitley)

  • Pattern matching on surfaces (Martin Swain)

  • Data reduction and QSAR (Brian Hudson)

  • Compound screening


Future directions

  • High-throughput ligand docking

    • Superimposition of ligand and a “negative” of the receptor

  • Use of the fields to drive simulation

    • Use of the fields to derive intermolecular forces

    • Rigid-body motions – long time-step MD

    • Free energy calculations


A hierarchy of methods

  • Rapid screening using computationally fast approaches

    • 3D fields – Andy Vinter

  • On reduced set:

    • Semi-empirical property calculations and alignments

  • On most interesting molecules:

    • Density-functional or ab-initio calculations and alignment

  • More accurate molecular representations are used as appropriate, as resources allow

