Cheminformatics in drug discovery and chemical genomics research

UKY Seminar Weifan Zheng, Ph.D.

Cheminformatics in Drug Discovery and Chemical Genomics Research

Weifan Zheng, Ph.D.

Associate Professor

Department of Pharmaceutical Sciences

BRITE Institute, NC Central University

Adjunct Associate Professor

Department of Medicinal Chemistry

University of North Carolina at Chapel Hill



Topics to Be Covered

Biotech/Pharma

Orphan Disease

Chemical Genomics

Computational Needs

Compound Collection

Docking Scoring

Data Analytics

CECCR Cheminformatics Center



Drug Discovery & Development Pipeline



Phases and Costs of Drug Discovery



Drug Discovery Process and the Roles of CADD

  • GR: Genetic Research; DR: Discovery Research; DD: Drug Discovery

  • CADD: computer-assisted drug discovery

  • ADMET: Absorption, distribution, metabolism, elimination, toxicity

[Diagram: pipeline stages GR, DR, DD, Preclin, IND, and clinical trials (Phases I, II, III). Sub-stages T2H, H2L, and LO and milestones T, H, L, and C are also shown, with CADD alongside.]

Human Genome Project Success

“Genome announcement 'technological triumph'

Milestone in genetics ushers in new era of discovery, responsibility”

CNN, June 26, 2000



Chemogenomics/Chemical Genomics

F. Collins

Chris Austin



Chemical Genomics

  • Chemogenomics

    • 69,000 Google hits (Oct. 16, 2006)

  • Chemical genomics

    • 113,000 Google hits (Oct. 16, 2006)

  • Chemical biology

    • 4,210,000 Google hits (Oct. 16, 2006)

  • Chemical genetics

    • 104,000 Google hits (Oct. 16, 2006)


Chemical genetics is a research method that uses small molecules to change the way proteins work, directly in real time rather than indirectly by manipulating their genes. It is used to identify which proteins regulate different biological processes, to understand in molecular detail how proteins perform their biological functions, and to identify small molecules that may be of medical value.

to create a national resource in chemical probe development. The center uses the latest industrial-scale technologies to collect data that is useful for defining the cross-section between chemical space and biological activity (and to do so on a genomic scale).


NIH Molecular Library Initiative (MLI)

  • Chemical Synthesis Centers: CombiChem, parallel synthesis, DOS (4 centers + DPI)

  • MLSCN (9+1): 9 centers and 1 NIH intramural; 20 x 10 = 200 assays

  • ECCR (6): Exploratory Centers

  • PubChem (NLM)

  • SAR matrix: 100K – 1M compounds x 200 assays



Biological Assay Data

  • Biochemical assays

  • Cell-based functional assays

  • Phenotypic assays

  • Databases

    • PubChem (http://pubchem.ncbi.nlm.nih.gov/)

    • ChemBank (http://chembank.broad.harvard.edu/)

    • WOMBAT (http://sunsetmolecular.com/index.php)

    • Jubilant (http://www.jubilantbiosys.com/)

    • Gvk/Bio (http://www.gvkbio.com/)


High Throughput Chemistry and Screening: Informatics

[Diagram components: Rules; Virtual Libraries; Diverse Lib Design; Targeted Lib Design; Combinatorial Synthesis; Real Libraries; HTS; SAR Data; KDD (QSAR, P.R.); Scientific Logistics; Drug Discovery; Chemical Genomics]




Challenges in Combinatorial Chemistry

[Figure: three variable positions R1, R2, and R3, each labeled (3000)]

3,000^3 / 1,000 per week = ~0.5 million years!!!

  • Library Design: rational selection of a subset of building blocks to obtain a maximum amount of information
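
The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the building-block counts and the weekly throughput come from the slide, while the 52-week production year is an assumption added here.

```python
# Back-of-the-envelope check of the full-enumeration estimate.
BUILDING_BLOCKS_PER_SITE = 3_000   # from the slide: R1, R2, R3 each have 3,000 choices
SITES = 3
COMPOUNDS_PER_WEEK = 1_000         # throughput quoted on the slide
WEEKS_PER_YEAR = 52                # assumption: production runs all year

library_size = BUILDING_BLOCKS_PER_SITE ** SITES
weeks_needed = library_size / COMPOUNDS_PER_WEEK
years_needed = weeks_needed / WEEKS_PER_YEAR

print(f"{library_size:,} compounds -> {years_needed:,.0f} years")
# prints: 27,000,000,000 compounds -> 519,231 years
```

The result, roughly half a million years, is why a designed subset of building blocks is needed instead of full enumeration.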



Design for Activity: Similarity

  • If we know a compound is active, and we want to design a set of compounds that may be active against the same target, we may select

    • A set of compounds that are similar to the active compound

  • The similarity principle: similar compounds should have similar biological activity
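
One common way to put the similarity principle into practice is to rank candidates by the Tanimoto coefficient between fingerprints. The slides do not name a metric, so the choice of Tanimoto is an assumption, and the feature sets and compound names below are hypothetical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    if not fp_a and not fp_b:
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def most_similar(query: set, candidates: dict, top_n: int = 2) -> list:
    """Return the names of the top_n candidates most similar to the query."""
    ranked = sorted(candidates.items(),
                    key=lambda item: tanimoto(query, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_n]]


# Hypothetical substructure-key fingerprints (illustration only).
active = {"aromatic_ring", "amide", "hydroxyl", "halogen"}
library = {
    "cmpd_A": {"aromatic_ring", "amide", "hydroxyl"},
    "cmpd_B": {"aromatic_ring", "halogen", "nitro", "ester"},
    "cmpd_C": {"ester", "alkene"},
}
selected = most_similar(active, library, top_n=2)
```

Here `cmpd_A` shares three of four features with the active compound and ranks first, while `cmpd_C` shares none and is left out.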


Molecular Identity and Molecular Similarity

           X1   X2   X3   ...   X20
Str. 1      2    5    1   ...     4
Str. 2      4    7    9   ...     7
Str. 3      1    6    8   ...     6
...
Str. 100    0    3    5   ...     1

[Plot: points labeled 1, 2, and 3 on X1 and X2 axes]

Design for General Application: Diversity



Similarity and Diversity

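- MaxiMin: maximize the minimum distance between selected compounds

- Minimize Sum 1/(Dij*Dij), summed over all pairs of selected compounds i and j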
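
Both criteria can be sketched in a few lines. The greedy MaxMin picker below is a standard heuristic and not necessarily the procedure used in the talk; the descriptor points, the Euclidean metric, and the seed choice are assumptions for illustration.

```python
import math


def maxmin_select(points, k, seed_index=0):
    """Greedy MaxMin: repeatedly add the point farthest from the current picks."""
    selected = [seed_index]
    while len(selected) < k:
        best = max(
            (i for i in range(len(points)) if i not in selected),
            key=lambda i: min(math.dist(points[i], points[j]) for j in selected),
        )
        selected.append(best)
    return selected


def inverse_square_energy(points, subset):
    """Sum of 1/Dij^2 over all pairs in the subset (lower means more diverse)."""
    total = 0.0
    for pos, i in enumerate(subset):
        for j in subset[pos + 1:]:
            total += 1.0 / math.dist(points[i], points[j]) ** 2
    return total


# Hypothetical 2D descriptor coordinates for six compounds.
points = [(0, 0), (0.1, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
diverse = maxmin_select(points, k=3)
clustered = [0, 1, 5]
```

The diverse picks spread across the corners of the space and give a much lower 1/Dij^2 sum than the clustered subset, which contains two nearly identical points.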



Cluster Hits Obtained by SAGE and Random Sampling



Drug Discovery & Development Failures

6%

21%

39%

29%

Venkatesh & Lipper, J. Pharm. Sci. 89, 145-154 (2000)



Multi-Factorial Design



Total Score is the Weighted Sum of Individual Terms
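
A weighted-sum score of this kind might look like the sketch below. The term names follow the penalty scores named in the talk (P450 activity, Lipinski properties, diversity); the weights and the per-library penalty values are hypothetical.

```python
def total_score(term_scores: dict, weights: dict) -> float:
    """Total penalty as the weighted sum of the individual penalty terms."""
    return sum(weights[name] * term_scores[name] for name in weights)


# Hypothetical weights and penalty values (lower is better).
weights = {"p450": 1.0, "lipinski": 2.0, "diversity": 0.5}
initial_library = {"p450": 0.6, "lipinski": 0.4, "diversity": 0.5}
candidate_library = {"p450": 0.3, "lipinski": 0.2, "diversity": 0.6}

initial_total = total_score(initial_library, weights)
candidate_total = total_score(candidate_library, weights)

# In an iterative design loop, a candidate would be kept when it lowers the total.
keep_candidate = candidate_total < initial_total
```

The candidate is slightly worse on diversity but better on the two more heavily weighted terms, so its total penalty is lower and it would replace the initial library.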


[Diagram: iterative optimization over building-block sets R1 and R2, from the Initial Library through a Better Library to the Optimal Library. Penalty scores evaluated at each iteration: P450 Activity, Lipinski Properties, Diversity.]


Designed Library Has a Better MW-clogP Distribution

[Plots with a clogP axis: the initial ten solutions (undesigned) and the final ten solutions (well designed)]



SPE Algorithm (Agrafiotis)

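  • Iterative random sampling of pairs of points a and b

  • D(a,b): distance between a and b in the original space

  • D’(a,b): distance between a and b in the embedding space (2D)

  • If D’ > D, move a and b closer

  • If D’ < D, move a and b apart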
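
The pairwise update rule above can be sketched as follows. This is a simplified illustration of stochastic proximity embedding, not the published implementation: the multiplicative learning-rate decay, the parameter defaults, and the three-point target distance table are all assumptions.

```python
import math
import random


def spe_embed(dist, n_points, dim=2, cycles=50, steps_per_cycle=500,
              learning_rate=1.0, decay=0.95, seed=0):
    """Embed n_points in `dim` dimensions so that embedded distances D'
    approach the original distances D returned by dist(i, j)."""
    rng = random.Random(seed)
    coords = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    eps = 1e-9  # guards against division by zero for coincident points
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            a, b = rng.sample(range(n_points), 2)      # random pair
            d_orig = dist(a, b)                        # D(a, b)
            d_emb = math.dist(coords[a], coords[b])    # D'(a, b)
            # Negative when D' > D (pull closer), positive when D' < D (push apart).
            shift = learning_rate * 0.5 * (d_orig - d_emb) / (d_emb + eps)
            for k in range(dim):
                delta = shift * (coords[a][k] - coords[b][k])
                coords[a][k] += delta
                coords[b][k] -= delta
        learning_rate *= decay  # assumption: simple multiplicative schedule
    return coords


# Hypothetical original-space distances: a 3-4-5 triangle.
TARGET = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
coords = spe_embed(lambda i, j: TARGET[i][j], n_points=3)
```

Only pairwise distances are needed, never the full original coordinates, which is what makes this kind of method attractive for large compound collections.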



Chemical Space - Compound Collection Comparison





SPE Embedding of ChemSpace




Quantitative Structure-Activity Relationship (QSAR)

[Plots: actual vs. predicted activity, annotated q2 = 0.8 and R2 = 0.75]

Methods: multiple linear regression (MLR); partial least squares (PLS); artificial neural nets; k-nearest neighbor (kNN)



Basic Assumptions of KNN-QSAR Method

  • Structurally similar compounds should have similar biological activities

  • Biological similarities are often due to similarities of substructures (pharmacophore)

  • Biological activities can be estimated from molecular similarities, which are calculated with pharmacophore-specific descriptors
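
These assumptions translate into a very small predictor: estimate a compound's activity from its nearest neighbors in descriptor space, and judge the model by leave-one-out cross-validation. The sketch below is deliberately simplified (unweighted averaging, no descriptor selection), and the descriptor values and activities are hypothetical.

```python
import math


def knn_predict(query, train_x, train_y, k=2):
    """Predict activity as the mean activity of the k nearest training compounds."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(query, train_x[i]))[:k]
    return sum(train_y[i] for i in nearest) / k


def loo_q2(xs, ys, k=2):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(knn_predict(xs[i], rest_x, rest_y, k))
    mean_y = sum(ys) / len(ys)
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical descriptors and activities: two clusters of similar compounds.
xs = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
ys = [5.0, 5.2, 5.1, 8.0, 8.1, 7.9]

q2 = loo_q2(xs, ys, k=2)
new_prediction = knn_predict((1.0, 0.95), xs, ys, k=2)
```

Because similar compounds in this toy set do have similar activities, q2 comes out close to 1; on data that violates the similarity assumption it would drop sharply.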



Comparison of CoMFA, GA-PLS, and KNN-QSAR



QSAR Based Virtual Screening for GPCR Ligand Design




Docking and Scoring

  • In the early 1980s, I. D. Kuntz developed the first computerized molecular docking program, DOCK

  • Other docking programs: GOLD, FRED, GLIDE, FLEXX, AutoDock, ICM

[Figure: X-ray structure]

Our Approach to Derive DT-SCORE

1. Use Delaunay tessellation to derive geometrical chemical descriptors of the protein-ligand interface

2. Establish a correlation between the geometrical chemical descriptors and protein-ligand binding affinity using the perceptron learning algorithm


Flowchart to Derive DT-SCORE

1. Receptor-ligand complexes

2. Tessellation of the receptor-ligand interface

3. Descriptor generation

4. Perceptron learning algorithm (input: binding constant)

5. Model generation and prediction

6. DT-SCORE

Delaunay Tessellation in 2D

  • Rigorous definition of nearest neighbors in 2D and 3D space: Delaunay tessellation

  • Nearest neighbors are unambiguously defined in sets of three (in 2D) and in sets of four (in 3D)
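
In 2D, the defining property is that the circumcircle of every Delaunay triangle contains no other point. The brute-force sketch below applies that test directly to every triple. It is meant only to illustrate the definition on a handful of points (it scales poorly and assumes no four points lie on a common circle); the example coordinates are hypothetical.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None for collinear points."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_triangles(points):
    """Keep every triple whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        is_empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-12
            for m, (px, py) in enumerate(points) if m not in (i, j, k)
        )
        if is_empty:
            triangles.append((i, j, k))
    return triangles


# Four hypothetical points; each returned triple is one set of nearest neighbors.
points = [(0, 0), (2, 0), (0, 2), (3, 3)]
triangles = delaunay_triangles(points)
```

The 3D case works the same way with tetrahedra and circumscribed spheres, giving nearest neighbors in sets of four.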



Delaunay Tessellation of the Receptor-Ligand Interface


A Detailed View of Active Site Tessellation

[Figure: tessellated active site with receptor atoms (R) and a ligand atom (L)]

An atom is shared by several tetrahedra

3 Types of Tetrahedra at the Receptor-Ligand Interface

  • RLLL: formed by 1 receptor atom and 3 ligand atoms

  • RRLL: formed by 2 receptor atoms and 2 ligand atoms

  • RRRL: formed by 3 receptor atoms and 1 ligand atom

Each of the above tetrahedron types is further discriminated by the atom types on the vertices
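
Counting interface tetrahedra by type and vertex composition could be sketched as below. The input format (four `(origin, element)` pairs per tetrahedron) and the canonical ordering of the composition label (receptor atoms first, each group sorted by element) are assumptions made for this example, not details taken from the talk.

```python
from collections import Counter


def classify_tetrahedron(vertices):
    """vertices: four (origin, element) pairs, origin 'R' (receptor) or 'L' (ligand).
    Returns 'RLLL', 'RRLL', 'RRRL', or None when the tetrahedron is not at the interface."""
    n_receptor = sum(1 for origin, _ in vertices if origin == "R")
    if n_receptor in (0, 4):
        return None
    return "R" * n_receptor + "L" * (4 - n_receptor)


def interface_descriptors(tetrahedra):
    """Count tetrahedra per (type, vertex atom composition)."""
    counts = Counter()
    for vertices in tetrahedra:
        kind = classify_tetrahedron(vertices)
        if kind is None:
            continue
        # Assumed canonical order: receptor atoms first, then ligand atoms,
        # each group sorted alphabetically by element symbol.
        ordered = sorted(vertices, key=lambda v: (v[0] != "R", v[1]))
        composition = "".join(element for _, element in ordered)
        counts[(kind, composition)] += 1
    return counts


# Hypothetical tetrahedra from a tessellated interface.
tetrahedra = [
    [("R", "N"), ("R", "O"), ("L", "C"), ("L", "S")],
    [("L", "C"), ("R", "O"), ("L", "S"), ("R", "N")],
    [("R", "C"), ("L", "N"), ("L", "O"), ("L", "O")],
    [("R", "C"), ("R", "C"), ("R", "C"), ("R", "C")],  # receptor only: skipped
]
descriptors = interface_descriptors(tetrahedra)
```

Each `(type, composition)` count becomes one column of the descriptor table for a given receptor-ligand complex.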


Geometrical Descriptors According to Tetrahedron Types

[Table: for each tetrahedron type (RRLL, RRRL, RLLL), counts of vertex atom-type combinations such as NOCS, COSC, CNOO, NCNO, OSXN, ONOS, ...; example counts shown: 4, 0, ..., 2, 8, 5, 3]


R·L Interaction Pattern – Binding Affinity Relationship Table (“QSAR” Input Table)


Single-Layer Perceptron Network

[Diagram: input layer with inputs x1, x2, x3, ..., xN, connected through weights w1, w2, w3, ..., wN to a single output y in the output layer]

  • xi = input of neuron i

  • wi = weight associated with the input xi

  • fn(.) = activation function of the output neuron
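
A single-layer network of this shape computes y = fn(w1*x1 + ... + wN*xN). The sketch below trains such a network with the delta rule and an identity activation, which suits a continuous target such as binding affinity. The training rule, learning rate, and toy data are assumptions; the talk does not give these details.

```python
def perceptron_output(weights, inputs, activation=lambda s: s):
    """y = fn(sum_i w_i * x_i); the default activation is the identity."""
    return activation(sum(w * x for w, x in zip(weights, inputs)))


def train_perceptron(samples, targets, learning_rate=0.05, epochs=500):
    """Delta-rule training: nudge each weight in proportion to the output error."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            error = y - perceptron_output(weights, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
    return weights


# Hypothetical descriptor counts and affinities generated by y = 2*x1 + 1*x2.
samples = [[1, 0], [0, 1], [1, 1], [2, 1]]
targets = [2.0, 1.0, 3.0, 5.0]
weights = train_perceptron(samples, targets)
```

With consistent toy data the learned weights recover the generating coefficients; with real descriptor tables the weights would instead indicate how much each interaction pattern contributes to the predicted affinity.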



Training Vs. Test Set Selection and Validation

  • Entire dataset: 264 complexes

  • Training set: 80% (214 complexes), used for model development (q2)

  • Test set: 20% (50 complexes), used for prediction of the test set (R2)
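
A split of this kind, and the test-set statistic, might be computed as in the sketch below. The random shuffling, the fixed seed, and the definition of R2 as the coefficient of determination are assumptions; the talk may have used a different selection scheme or the squared correlation coefficient.

```python
import random


def split_dataset(items, n_test, seed=0):
    """Randomly hold out n_test items; return (training_set, test_set)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# 264 placeholder complex IDs, split 214 / 50 as on the slide.
complex_ids = list(range(264))
training_set, test_set = split_dataset(complex_ids, n_test=50)
```

The test set is never seen during model development, so R2 on those 50 complexes is an estimate of how the model behaves on new data.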



Model Stability

  • Average value from multiple (ca. 80) models



Actual vs. Predicted Binding Affinity for the Training Set

214 complexes: q2 = 0.73



Actual vs. Predicted Binding Affinity for the Test Set

50 complexes: R2 = 0.61



Acknowledgements

  • NCCU and UNC

    • Jerry Ebalunode, Ph.D., BRITE

    • Min Shen, Ph.D., Lexicon

    • Alex Tropsha, Ph.D., Chair of MedChem, UNC-Chapel Hill

  • Funding

    • NIH P20HG003898

    • NIH R21GM076059

  • GSK

    • Sunny Hung (GSK)

    • George Seibel (JNJ)

    • Ken Kopple (retired)

    • Jeff Wiseman (Locus)

  • Lilly

    • Minmin Wang

    • Greg Durst

    • Jim Wikel (retired)

