Optimization of SVM parameters in caspase cleavage sites prediction using grid-computing Lawrence Wee

Presentation Transcript
slide1

Optimization of SVM parameters in caspase cleavage sites prediction using grid-computing

Lawrence Wee

slide2

What are caspases?

Caspases are downstream effectors in apoptosis [1].

[Figure: extrinsic and intrinsic apoptosis pathways]

As the final effectors of apoptosis, caspases cleave many protein substrates.

1. Hengartner MO. The biochemistry of apoptosis. Nature. 2000 Oct 12;407(6805):770-6.

slide3

Caspases are proteases

Caspase Cleavage of Substrates [1]

Caspases are cysteine proteases.

They recognize a tetrapeptide sequence on substrates (P4-P3-P2-P1).

P4   P3   P2   P1   ---   P1'   P2'
D    E    V    D    ---   T     Y

They cleave after the canonical Asp (D) residue at the P1 position.

  • 1. Fuentes-Prior et al. Biochem J. 2004 Dec 1;384(Pt 2):201-32.
  • 2. Thornberry et al. J Biol Chem. 1997 Jul 18;272(29):17907-11.
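The P4 to P2' numbering above can be illustrated with a minimal Python sketch. It lists every Asp (D) that has enough flanking residues as a candidate P1 position. The function name and the example sequence are made up for illustration; an Asp at P1 is only a candidate, and the sketch makes no claim about which sites are actually cleaved.

```python
def candidate_sites(sequence, p1_residue="D"):
    """Return (P1 position, P4-P1 tetrapeptide, P1'-P2' residues) per candidate site."""
    sites = []
    for i, residue in enumerate(sequence):
        # Three residues are needed upstream (P4..P2) and two downstream (P1', P2').
        if residue == p1_residue and i >= 3 and i + 2 < len(sequence):
            tetrapeptide = sequence[i - 3:i + 1]  # P4-P3-P2-P1
            prime_side = sequence[i + 1:i + 3]    # P1'-P2'
            sites.append((i + 1, tetrapeptide, prime_side))  # 1-based P1 position
    return sites


# Hypothetical sequence that embeds the DEVD---TY example from the slide.
example = "MKADEVDTYLG"
for position, tetrapeptide, prime_side in candidate_sites(example):
    print(position, tetrapeptide, prime_side)
```

Each candidate window could then be passed to a classifier that decides whether the site is likely to be cleaved.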
slide4

Caspases are proteases

The Enormous Range of Caspase Substrates [1]

[Figure: caspases and the classes of caspase substrates]

  • Apoptotic regulators
  • Cytoskeletal proteins
  • Organelle proteins
  • DNA-associated proteins
  • RNA-associated proteins
  • Cell signaling proteins
  • Cell cycle proteins
  • Viral proteins
  • Other proteins (?)

More than 400 caspase substrates experimentally determined to date [1]. Many more await discovery.

1. Wee LJ, Tong JC, Tan TW, Ranganathan S. A multi-factor model for caspase degradome prediction. BMC Genomics. 2009, 10:S6.

slide5

Computational prediction of caspase cleavage sites

  • Identification of caspase substrates is important for elucidating biological function of caspases.
  • Refine our understanding of apoptotic and other caspase-dependent signaling pathways.
  • Wet-laboratory efforts can be laborious.
  • Consider computational prediction of caspase cleavage sites?
slide7

Support Vector Machines (SVM)

  • A type of machine learning algorithm
  • Works very well for several biological problems
  • Can be computationally expensive when the data has many dimensions or when several parameters must be optimized.
slide8

Prediction of caspase cleavage sites

Support Vector Machines: A Brief Introduction [1]

Data-points belonging to 2 distinct classes are represented as vectors.

A set of “learning” or “training” data-points belong to 2 classes (green and orange).

Each data-point has a unique set of attributes represented by vectors.

1. Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995;20(3):273-297.
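As an illustration of how a data-point can be represented as a vector, the sketch below one-hot encodes a peptide window. The slides do not state which encoding was used, so the encoding, the alphabet ordering, and the function name are all assumptions chosen for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, in an arbitrary fixed order


def one_hot(window):
    """Encode a peptide window as a flat vector with 20 entries per residue."""
    vector = []
    for residue in window:
        # Exactly one entry per residue block is 1.0; the rest are 0.0.
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


x = one_hot("DEVDTY")  # P4..P2' window from the earlier slide
print(len(x))          # 6 residues x 20 entries each
```

Under this assumed encoding, each training example would pair such a vector with a class label, for example +1 for a cleaved site and -1 for a non-cleaved one.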

slide9

Prediction of caspase cleavage sites

Support Vector Machines: A Brief Introduction [1]

The SVM algorithm constructs a “classifier” to discriminate the two classes.

The classifier is a maximal margin hyperplane that separates the two classes (green and orange).

[Figure labels: maximal margin hyperplane; support vectors]
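For reference, the maximal margin hyperplane is usually written as the solution of the soft-margin problem below. This is the standard textbook form rather than something shown on the slide; the symbols w, b, ξ_i, and the labels y_i ∈ {−1, +1} are notation introduced here. The margin width is 2/‖w‖, so minimizing ‖w‖ maximizes the margin, and C sets the penalty for points that violate it.

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\qquad\text{subject to}\qquad
y_i\,(w\cdot x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n .
```

The support vectors are the training points for which the constraint is active, that is, those lying on or inside the margin.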


slide10

Prediction of caspase cleavage sites

SVM: A Brief Introduction [1]

The SVM algorithm classifies new, unseen data into one of two classes.

The classifier assigns a new data-point to one of the two classes according to which side of the hyperplane it falls on.

[Figure: new data-point assigned to the “orange” class]
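The assignment rule can be sketched for a linear hyperplane w·x + b = 0 as follows. The weights, the offset, and the choice of which side counts as "orange" are hypothetical values picked for illustration, not taken from the slides.

```python
def decision_value(w, b, x):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    # Assumption: the non-negative side is "orange", the negative side is "green".
    return "orange" if decision_value(w, b, x) >= 0 else "green"


w, b = [1.0, -1.0], 0.5  # hypothetical trained hyperplane
print(classify(w, b, [2.0, 1.0]))
print(classify(w, b, [0.0, 2.0]))
```

Only the sign of the decision value determines the class; its magnitude reflects how far the point lies from the hyperplane.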


slide11

Prediction of caspase cleavage sites

SVM: A Brief Introduction [1]

SVM Decision Function with RBF kernel:

2 Parameters: C and gamma
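The formula itself is not preserved in this transcript. The standard form of an SVM decision function with an RBF kernel is given below as a reference; the symbols α_i (learned coefficients), y_i (training labels in {−1, +1}), and b (offset) are notation introduced here.

```latex
f(x) = \operatorname{sign}\!\left(\sum_{i=1}^{n} \alpha_i\, y_i\, K(x_i, x) + b\right),
\qquad
K(x_i, x) = \exp\!\left(-\gamma\,\lVert x_i - x\rVert^{2}\right),
\qquad
0 \le \alpha_i \le C .
```

In this form gamma controls the width of the kernel, while C bounds the coefficients α_i and so controls how strongly training errors are penalized.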


slide12

Prediction of caspase cleavage sites

Computational issues

[Figure: Training dataset (390 sequences) → Leave-one-out cross-validation → SVM Classifier]

slide13

Predicting caspase cleavage sites

Computational issues

Leave-one-out cross-validation for a set of C and gamma values:

[Figure: Training set (5 sequences): Seq 1, Seq 2, Seq 3, Seq 4, Seq 5; Set 1, Set 2, Set 3, Set 4, Set 5; Trained classifier]
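The five-sequence example can be sketched in Python as follows. Each set holds out one sequence for testing and trains on the remaining four. Which sequence is held out in which numbered set is an assumption, since the figure is not preserved here.

```python
def leave_one_out_splits(items):
    """Yield (training_items, held_out_item) once per item."""
    for i, held_out in enumerate(items):
        yield items[:i] + items[i + 1:], held_out


sequences = ["Seq 1", "Seq 2", "Seq 3", "Seq 4", "Seq 5"]
for n, (train, test) in enumerate(leave_one_out_splits(sequences), start=1):
    print(f"Set {n}: train on {train}, test on {test}")
```

With n sequences this produces n train/test rounds, which is why the cost grows with the size of the training dataset.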

slide14

Prediction of caspase cleavage sites

Computational issues

[Figure: Training dataset (390 sequences) → Leave-one-out cross-validation → SVM Classifier]

For C = 0.1, gamma = 0.1: accuracy = 70%
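A leave-one-out accuracy such as the 70% above is the fraction of held-out examples that are predicted correctly. The sketch below shows the bookkeeping only. To stay self-contained it plugs in a 1-nearest-neighbour classifier as a stand-in, which is not the SVM used in the talk, and the toy data are made up.

```python
def loocv_accuracy(samples, labels, train):
    """Fraction of samples predicted correctly when each is held out in turn."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predict = train(train_x, train_y)  # retrain without sample i
        if predict(samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def train_nearest_neighbour(train_x, train_y):
    """Stand-in classifier (NOT an SVM): label of the closest training point."""
    def predict(x):
        distances = [sum((a - b) ** 2 for a, b in zip(x, t)) for t in train_x]
        return train_y[distances.index(min(distances))]
    return predict


# Made-up one-dimensional data with labels -1 and +1.
samples = [[0.0], [0.1], [0.4], [1.0], [1.2], [0.5]]
labels = [-1, -1, -1, 1, 1, 1]
print(loocv_accuracy(samples, labels, train_nearest_neighbour))
```

In the setting of the talk, `train` would instead fit an SVM with one fixed pair of C and gamma values, so each accuracy figure costs 390 training runs.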

slide15

Prediction of caspase cleavage sites

Grid-based (brute force) optimization of SVM parameters
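Brute-force optimization evaluates every combination of candidate C and gamma values and keeps the best one. The sketch below assumes a 10 × 10 grid of powers of two, which is a common convention but not something the slides specify, and it uses a placeholder scoring function where a real run would compute the leave-one-out accuracy.

```python
import math
from itertools import product


def grid_search(c_values, gamma_values, score):
    """Return (best_score, best_c, best_gamma) over all combinations."""
    best = None
    for c, gamma in product(c_values, gamma_values):
        accuracy = score(c, gamma)
        if best is None or accuracy > best[0]:
            best = (accuracy, c, gamma)
    return best


# Assumed grid: 10 values each, spaced as powers of two (100 combinations).
c_values = [2.0 ** k for k in range(-5, 5)]
gamma_values = [2.0 ** k for k in range(-10, 0)]


def toy_score(c, gamma):
    # Placeholder surface that peaks at C = 1, gamma = 2**-3; not real accuracy data.
    return 1.0 / (1.0 + abs(math.log2(c)) + abs(math.log2(gamma) + 3))


print(grid_search(c_values, gamma_values, toy_score))
```

Because each grid point is scored independently of the others, the evaluations can be distributed across machines without any communication between them.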

slide16

Two Computational Issues

1. Leave-one-out cross-validation is computationally expensive.

With a dataset of 390 training examples, leave-one-out cross-validation for a single pair of parameter values (C and gamma) takes ~12 s on an Intel 2.66 GHz Core 2 Duo processor with 4 GB RAM.

Challenge: How fast will grid computers complete the same computation?

slide17

Two Computational Issues

2. Brute-force optimization is computationally expensive.

Challenge: How fast will grid computers complete the same computation when it is repeated 100 times, each time with a different set of C and gamma values?
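A back-of-the-envelope estimate puts the challenge in numbers. It assumes every grid point costs the same ~12 s measured above and that the grid points are spread evenly over the workers with no scheduling or data-transfer overhead. The worker count of 25 is a hypothetical figure, and the result is an idealized bound, not a measurement.

```python
import math


def grid_search_seconds(n_points, seconds_per_point, n_workers=1):
    """Idealized wall-clock time when grid points are spread evenly over workers."""
    rounds = math.ceil(n_points / n_workers)  # points each worker handles in sequence
    return rounds * seconds_per_point


serial = grid_search_seconds(100, 12)                  # one desktop processor
parallel = grid_search_seconds(100, 12, n_workers=25)  # hypothetical 25 grid nodes
print(f"serial: {serial} s ({serial / 60:.0f} min), 25 workers: {parallel} s")
```

On a single processor the 100 evaluations take about 20 minutes under these assumptions. Any real grid run would add job-submission and transfer overhead on top of the idealized parallel figure.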