Genetic algorithms select protein features most predictive of enzyme function



Andrew Kernytsky, Burkhard Rost

Columbia University


Enzyme function prediction


Given a protein sequence, predict its Enzyme Commission (EC) number

Ligases

Isomerases

Oxidoreductases

Lyases

Transferases

Hydrolases

NC-IUBMB (1992) Recommendations of the International Union of Biochemistry on the Nomenclature and Classification of Enzymes. In: Enzyme Nomenclature. Academic Press, New York.

EC Wheel Figure: Porter CT, Bartlett GJ, Thornton JM. Nucleic Acids Res. 2004 January 1; 32: D129–D133.
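
To make the prediction target concrete, here is a minimal Python sketch of how an EC number breaks into four hierarchical levels, with the first level naming one of the six classes listed above. The helper names `parse_ec` and `ec_prefix` are hypothetical and are not taken from the presentation; the class numbering follows the standard EC scheme.

```python
from typing import List, Optional

# Top-level EC classes (first digit of the EC number), as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec: str) -> List[Optional[int]]:
    """Split an EC number such as 'EC 3.4.21.4' into its four levels.

    A '-' placeholder (partial annotation, e.g. '1.1.-.-') becomes None.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {ec!r}")
    return [None if part == "-" else int(part) for part in parts]


def ec_prefix(ec: str, level: int) -> str:
    """Return the label used when predicting only the first `level` levels."""
    levels = parse_ec(ec)[:level]
    return ".".join("-" if x is None else str(x) for x in levels)


levels = parse_ec("EC 3.4.21.4")
top_class = EC_CLASSES[levels[0]]
```

Under this view, predicting at EC level 1 is a six-way classification, and each deeper level refines the label, for example `ec_prefix("3.4.21.4", 2)` gives `"3.4"`.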


Intersection properties capture local information

[Figure labels: "Limited local information", "All Global", "All Intersection"; scale marks: 1%, 0.1%, 0.01%]

Per-residue feature tracks (row labels in the figure: AA, Acc, Cons, Feat 4, Feat 5, Feat 6):

TAGHCVNYDYGAGCQSGSPV

bbbbbieeeiibbieeeeee

..|....|......||....

HHHEEEEELLEEEEELLLLL

iiibbbbbbboooobbbbbb

36788842100000000123

Significant risk of overfitting during training

10^3+ features > 10^2 positive samples
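
The difference between global and intersection features, and why intersections inflate the feature count, can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' feature pipeline: it assumes that an intersection feature is the fraction of residues showing a given pair of per-residue states (for example a residue type together with a secondary-structure state), and it assumes that the first track on the slide is the amino-acid sequence and the H/E/L track is three-state secondary structure. The function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

# Two per-residue tracks copied from the slide (assumed: sequence and
# three-state secondary structure, H = helix, E = strand, L = loop).
aa = "TAGHCVNYDYGAGCQSGSPV"
sec = "HHHEEEEELLEEEEELLLLL"


def global_composition(track: str) -> Dict[str, float]:
    """Fraction of positions in each state, ignoring where they occur."""
    counts = Counter(track)
    return {state: n / len(track) for state, n in counts.items()}


def intersection_composition(a: str, b: str) -> Dict[Tuple[str, str], float]:
    """Fraction of positions showing each (state_a, state_b) pair."""
    if len(a) != len(b):
        raise ValueError("tracks must be aligned position by position")
    counts = Counter(zip(a, b))
    return {pair: n / len(a) for pair, n in counts.items()}


aa_global = global_composition(aa)            # e.g. fraction of glycine
aa_x_sec = intersection_composition(aa, sec)  # e.g. fraction of glycine in strands

# Feature counts for these two tracks: global classes add, intersections multiply.
n_global = 20 + 3        # 20 residue types + 3 secondary-structure states
n_intersection = 20 * 3  # every residue type paired with every state
```

With only two tracks the intersection class already has 60 features against 23 global ones; crossing more tracks multiplies the count further, which is how the number of features can exceed the number of positive samples.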



Algorithm overview

Input: protein sequence (example in the figure: MSNLLKDFEVAQC)

Feature classes: all intersection and global feature classes (examples in the figure: AA, sec, AA×sec)

Genomes: all possible combinations of feature classes (AA; sec; AA×sec; AA sec; AA AA×sec; sec AA×sec; AA sec AA×sec)

Inner learning algorithm: Neural Network OR SVM

Fitness assessed for each genome (values shown in the figure: 0.635, 0.688, 0.677)

GA evolution: Selection, Crossover, Mutation

Generation populations: 1st, 2nd, 3rd, 4th genome population
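
The loop outlined on this slide can be sketched as a small genetic algorithm over feature-class subsets. This is a minimal sketch under stated assumptions, not the authors' implementation: a genome is assumed to be one on/off flag per feature class, `toy_fitness` is a made-up stand-in for the score the inner learner (a neural network or SVM on the slide) would return, and tournament selection, single-point crossover, elitism, and all parameter values are illustrative choices.

```python
import random
from typing import Callable, List, Tuple

Genome = Tuple[bool, ...]  # one flag per feature class: True = class is used

FEATURE_CLASSES = ["AA", "sec", "AA×sec"]


def toy_fitness(genome: Genome) -> float:
    """Hypothetical stand-in for training the inner learner and scoring it."""
    gains = [0.30, 0.25, 0.35]  # made-up per-class contributions
    score = sum(g for g, used in zip(gains, genome) if used)
    return score - 0.05 * sum(genome)  # small penalty per class used


def evolve(
    fitness: Callable[[Genome], float],
    n_classes: int,
    pop_size: int = 8,
    generations: int = 4,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Tuple[Genome, List[float]]:
    """Return the best genome and the best fitness seen in each generation."""
    rng = random.Random(seed)
    pop = [
        tuple(rng.random() < 0.5 for _ in range(n_classes))
        for _ in range(pop_size)
    ]
    history = [max(fitness(g) for g in pop)]
    for _ in range(generations - 1):
        # Elitism: carry the current best genome over unchanged.
        next_pop = [max(pop, key=fitness)]
        while len(next_pop) < pop_size:
            # Selection: two tournaments of size two.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: single cut point (needs n_classes >= 2).
            cut = rng.randrange(1, n_classes)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each flag with a small probability.
            child = tuple(
                (not flag) if rng.random() < mutation_rate else flag
                for flag in child
            )
            next_pop.append(child)
        pop = next_pop
        history.append(max(fitness(g) for g in pop))
    return max(pop, key=fitness), history


best, history = evolve(toy_fitness, len(FEATURE_CLASSES))
selected = [name for name, used in zip(FEATURE_CLASSES, best) if used]
```

In a real run the fitness call would train and validate the inner learner on the features selected by the genome, which makes each evaluation expensive; caching scores per genome would be a natural addition.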



GA improves performance

[Figure: performance by EC level]


Balance between intersection and global features gives best performance


