

Introduction 5736274

Previous Work

Backup

Method

Reference

Background

Introduction

Acknowledgments

Investigating mRNAs of intrinsically disordered proteins

Harini Gopalakrishnan

Advisor: Dr. Predrag Radivojac



  • Basic Facts - mRNA

  • 1. mRNA stands for messenger ribonucleic acid

  • 2. RNA is a nucleic acid polymer built from nucleotide monomers carrying the bases adenine, guanine, cytosine and uracil

  • 3. Three important types of RNA

  • rRNA (ribosomal RNA)

  • tRNA (transfer RNA)

  • mRNA (messenger RNA)



Basic Facts - mRNA (contd)

mRNA encodes genetic information and carries it from DNA to the site of protein synthesis

http://en.wikipedia.org/wiki/Image:Mature_mRNA.png



  • Basic Facts - mRNA (contd)

  • What is the significance of mRNA folding?

  • Secondary structures have been used to explain

  • Translational control

  • Regulatory function in the cell, especially for the non-coding mRNA

  • What are the different approaches to folding prediction?

  • Energy minimization

  • Base pair maximization

  • Covariation

  • E.g., Mfold, Vienna Package
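To make the "base pair maximization" idea concrete, here is a minimal sketch of a Nussinov-style dynamic program in Python. It is not what Mfold or the Vienna Package do (those minimize free energy). The function names and the `min_loop` parameter are assumptions chosen for illustration.

```python
def can_pair(a, b):
    """Return True for Watson-Crick pairs and the G-U wobble pair."""
    return (a, b) in {("A", "U"), ("U", "A"), ("G", "C"),
                      ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    min_loop is the minimum number of unpaired bases enclosed by a hairpin.
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: base j stays unpaired
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j]):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `max_base_pairs("GGGAAAUCC")` counts the three pairs of a small hairpin whose loop is the `AAA` stretch.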



  • Basic Facts - Disordered Proteins

  • What is a disordered protein?

  • Lacks a well-defined three-dimensional structure

  • Conserved between species in composition and sequence

  • Shows low sequence complexity

  • Has an amino acid compositional bias away from bulky hydrophobic residues

  • What is the significance of disordered proteins?

  • They take part in the regulation of transcription and translation, cellular signal transduction, protein phosphorylation, the storage of small molecules, and the regulation of the self-assembly of large multiprotein complexes such as the bacterial flagellum and the ribosome
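One common way to put a number on "low sequence complexity" is the Shannon entropy of the residue distribution, computed over the whole sequence or over a sliding window. The sketch below is illustrative only; the function names and the default window length are assumptions, not part of the presentation.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol distribution in seq."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(seq, window=20):
    """Entropy of every window of the given length along seq."""
    return [shannon_entropy(seq[i:i + window])
            for i in range(len(seq) - window + 1)]
```

A homopolymer stretch scores 0 bits, while a window that uses many different residues evenly scores high, so low-complexity regions show up as dips in the windowed profile.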



Basic Facts - Disordered Proteins (contd)

What is their role in disease?

Famous (or infamous?) disordered proteins in disease

alpha-synuclein, p53, and HPV proteins linked to cervical cancer

Which disorder predictors are used?

(all based on amino acid sequence inputs)

VL2, VSL2, PONDR, VLXT

Image Courtesy: http://www.disprot.org



  • Snapshot from Previous Studies

  • Third codon position and stability

  • Speed of translation and protein secondary structure

  • Alpha helices and beta sheets

  • The three bases in the codon

  • 1st base: biosynthetic pathway

  • 2nd base: residue hydrophobicity

  • 3rd base: helix- or beta-strand-forming potential of the amino acid



  • In a Nutshell

  • Check whether mRNA nucleotide composition differs depending on whether the encoded protein is ordered or disordered

  • Check whether the stability of the RNA fold helps to distinguish between the two categories of proteins

  • This work differs from earlier work in that no previous study has linked protein disorder to mRNA composition and stability

  • Establishing such a correlation would also open new avenues for studying how protein structure can be inferred directly from its precursor, the mRNA



  • Hypothesis

  • Codon usage should differ between the mRNA sequences of ordered and disordered proteins

  • Folding energy (stability) should differ between the mRNAs of ordered and disordered proteins

  • The age of codons correlates with protein disorder

Central dogma



  • Method

  • Data Collection

  • Implementation

  • Analysis

  • Future Work



Predicted Dataset (From disorder predictors)

True Dataset (Experimentally Verified)


  • Data Collection

  • This is one of the most important phases, because the significance of the analysis rests on the quality of the datasets selected for both categories of proteins

  • After experimenting with various other databases, proteins were finally taken from unigene90, DisProt and PDB

  • Disorder was predicted using VSL2B

Dataset



  • Data Collection (contd)

  • Once we have the proteins of interest, we use UniProt to web-mine each protein and its corresponding mRNA based on their UniGene ID

  • Problem!

  • Introns

  • Poly-A tails, which need to be removed

  • We need a clean dataset in order to study codon usage and nucleotide composition
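As a rough illustration of the cleaning problem described above, the Python sketch below trims a trailing poly-A run and checks whether a sequence looks like a clean coding region. It is not the pipeline used in the presentation. The function names and the `min_tail` threshold are hypothetical, and a simple trailing-A trim can also remove genuine A residues at the 3' end.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def trim_poly_a(seq, min_tail=10):
    """Remove a trailing run of A's if it is at least min_tail long."""
    seq = seq.upper()
    stripped = seq.rstrip("A")
    tail_len = len(seq) - len(stripped)
    return stripped if tail_len >= min_tail else seq


def is_clean_cds(seq):
    """Check start codon, stop codon, reading frame and internal stops."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style letters too
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    has_internal_stop = any(c in STOP_CODONS for c in codons[1:-1])
    return (codons[0] == "AUG" and codons[-1] in STOP_CODONS
            and not has_internal_stop)
```

A sequence that fails `is_clean_cds` (for instance because an unspliced intron shifts the frame) would need alignment against the protein before codon statistics are computed.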



  • Solution - Alignment

  • BLAST

  • Worked well when aligning the ordered proteins

  • Performed very poorly when aligning protein vs. mRNA for the disordered set of proteins

  • Disordered proteins have more low-complexity regions

  • WISE

  • Software from EMBL for aligning protein against nucleotide data

  • Uses Markov chain methods to make gene predictions and hence identifies introns

  • Worked very well and provided high-quality datasets



Data Collection - Final Input Statistics



  • Method - Overview

  • Two main characteristics of mRNA were analyzed

  • Nucleotide composition of mRNA

  • Codon usage

  • Nucleotide composition

  • RNA folding energy and base pair analysis using Mfold

  • Number of base pairs formed

  • total minimum free energy per RNA fold between
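The two composition measures listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the Perl/MATLAB code used in the study, and the function names are assumptions.

```python
from collections import Counter


def nucleotide_composition(seq):
    """Fraction of A, C, G and U in seq (T is treated as U)."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGU")
    if total == 0:
        return {b: 0.0 for b in "ACGU"}
    return {b: counts[b] / total for b in "ACGU"}


def codon_usage(cds):
    """Count each codon in a coding sequence, reading in frame from base 0."""
    cds = cds.upper().replace("T", "U")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))
```

Comparing these counts between the ordered and disordered sets (after normalizing by sequence length) is one way to look for the bias the hypothesis predicts.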



Methods

Mfold Snapshot



Mfold - Overview

What is Mfold?

An mRNA secondary structure prediction program by M. Zuker and N. Markham

How does it work?

It is based on nearest-neighbor thermodynamic rules, in which free energies are assigned to loops rather than to base pairs. It predicts the optimal structure by minimizing the overall free energy of the fold, taking the coaxial stacking of helices into account.

What does it output?

It produces several output files for each optimal and suboptimal fold within the allowed energy range. The energy dot plot (on the right) is one important component of this output.
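The loop-based energy model described above can be written compactly. This is a schematic statement of the general nearest-neighbor idea, not Mfold's full parameter set; the symbols $S$, $\mathrm{loops}(S)$ and $\Delta G(L)$ are notation introduced here for illustration.

```latex
% Total free energy of a secondary structure S as a sum over its loops
% (stacked pairs, hairpins, bulges, interior loops and multiloops),
% and the predicted structure S* as the minimizer over feasible structures.
\Delta G_{\mathrm{total}}(S) = \sum_{L \in \mathrm{loops}(S)} \Delta G(L),
\qquad
S^{*} = \arg\min_{S} \, \Delta G_{\mathrm{total}}(S)
```

Suboptimal folds are then the structures whose total free energy lies within a chosen range above $\Delta G_{\mathrm{total}}(S^{*})$.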



Method

  • Tools Employed

  • Web mining and parsing were done in Perl

  • Analysis and graphs were done in MATLAB

  • Reporting and graphs were done in Excel

  • Disorder prediction from mRNA inputs was done in MATLAB using an SVM



Results



Nucleotide Composition

Figure panels: True Dataset, Predicted Dataset



Analysis based on the Composition of mRNA

Analysis of Codon Age

Table columns: amino acid, old codon, new codon

For 14 of 18 amino acids, the disorder-promoting codon is the older one

2 amino acids (M and W) are neutral, as they have only one codon each



Base Composition

Preferential selection of codons with “g” or “c” as the third base

Figure panels: Predicted Dataset, Statistical Verification
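The preference for "g" or "c" at the third codon position is usually summarized as a GC3 fraction. The following Python sketch shows one way to compute it for an in-frame coding sequence; the function name is an assumption and this is not the code used to produce the plots.

```python
def gc3_fraction(cds):
    """Fraction of complete codons whose third base is G or C."""
    cds = cds.upper()
    third_bases = cds[2::3]  # indices 2, 5, 8, ... are third codon positions
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)
```

For the sequence `ATGGCCGCATAA` the third bases are G, C, A and A, so the function returns 0.5.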



Energy of Folding and Base Pair

Energy of Folding

Predicted Dataset



Energy of Folding and Base Pair

Base Pair Analysis




Energy of Folding and Base Pair

Sequence Entropy Plot



Future Work

Predictions

Aim: to predict disorder from mRNA using all of the above information

Using support vector machines (SVMs)

  • Based on codon composition

  • Age of codons

  • Base composition

Preliminary accuracies have been promising
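A classifier such as an SVM needs a fixed-length numeric input. One plausible encoding of "codon composition" is a 64-dimensional vector of codon frequencies, sketched below in Python. The feature layout and names are assumptions made for illustration; the presentation does not specify how its features were built, and the SVM training step itself is not shown.

```python
from itertools import product

# All 64 codons in a fixed order, so every sequence maps to the same layout.
CODONS = ["".join(p) for p in product("ACGU", repeat=3)]


def codon_feature_vector(cds):
    """Return 64 codon frequencies for an in-frame coding sequence."""
    cds = cds.upper().replace("T", "U")
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous symbols
            counts[codon] += 1
            total += 1
    return [counts[c] / total if total else 0.0 for c in CODONS]
```

Each mRNA then becomes one row of a feature matrix, labeled as ordered or disordered, which can be passed to any standard SVM implementation.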



Acknowledgments

Dr. Predrag Radivojac

Dr. Haixu Tang

Dr. Vladimir Uversky

Amrita Mohan

Linda Hostetter

Informatics faculty and staff

My various Course Professors

Friends and Fellow Students



References

1. http://helix.nih.gov/docs/online/mfold/node3.html

2. Jan C. Biro. Nucleic acid chaperons: a theory of an RNA-assisted protein folding. Theoretical Biology and Medical Modeling 2005, 2:35.

3. T. A. Thanaraj and P. Argos. Protein secondary structural types are differentially coded on messenger RNA. Protein Sci. 1996, 5:1973-1983.

4. Taylor FJR, Coates D. 1989. The code within codons. Biosystems 22:177-187.

5. Brunak S, Engelbrecht J, Kesmir C. 1994. Correlation between protein secondary structure and the mRNA nucleotide sequence. Protein Structure by Distance Analysis. Amsterdam: IOS Press. pp 327-334.

6. H. Jane Dyson and Peter E. Wright. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005 Mar; 6(3):197-208.

7. Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M., and Obradovic, Z. Intrinsic disorder and protein function.

8. Tompa P. Intrinsically unstructured proteins evolve by repeat expansion. Bioessays 2003 Sep; 25(9):847-55.

9. Svetlana A. Shabalina, Aleksey Y. Ogurtsov, and Nikolay A. Spiridonov. A periodic pattern of mRNA secondary structure created by the genetic code. Nucleic Acids Res. 2006; 34(8):2428-2437.

10. Edward N. Trifonov. Theory of Early Molecular Evolution. Landes Bioscience 2006.

11. E. N. Trifonov. Consensus temporal order of amino acids and evolution of the triplet code. Gene 2000; 261:139-151.

12. Predrag Radivojac, Zoran Obradovic, David K. Smith, Guang Zhu, Slobodan Vucetic, Celeste J. Brown, J. David Lawson and A. Keith Dunker. Protein flexibility and intrinsic disorder. Protein Science (2004), 13:71-80.

13. N. R. Markham & M. Zuker. UNAFold: software for nucleic acid folding and hybridizing. Methods in Molecular Biology: Bioinformatics. Totowa, NJ: Humana Press, in press.

14. Peng K., Radivojac P., Vucetic S., Dunker A.K., and Obradovic Z. Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics 7:208, 2006.

15. Gene Ontology: tool for the unification of biology. Nature Genet. (2000) 25:25-29.

16. Brooks D, Singh M, Fresco J R. Selection influences the proteomic usage of a majority of amino acid

17. Vucetic S, Obradovic Z, Vacic V, Radivojac P, Peng K, Iakoucheva LM, Cortese MS, Lawson JD, Brown CJ, Sikes JG, Newton CD, and Dunker AK. 2005. DisProt: a database of protein disorder. Bioinformatics 21:137-140.

