In silico study of cancer-related genes and microRNAs



In silico study of cancer-related genes and microRNAs: using microarrays to screen for cancer genes and to explore their upstream regulatory microRNAs

Ka-Lok Ng (吳家樂)

Department of Biomedical Informatics

Asia University


Contents

Motivation

  • Predict cancer genes based on microarray mRNA expression levels

  • A microRNA (miRNA) can act as an oncogene (OCG) or a tumor suppressor gene (TSG)

  • Identify cancer-related miRNAs, their target genes, and downstream protein-protein interactions (predicting novel cancer-related proteins)

(1) Introduction – microarray, cancer, microRNA

(2) Methods – input data

(3) Results

    (a) cancer gene prediction (Bioconductor), e.g. prostate/breast cancer

    (b) correlation study of miRNA and mRNA expression levels

    (c) ncRNAppi – a platform for studying microRNAs and their target genes' protein-protein interactions

(4) Summary


Central dogma of molecular biology

Post-transcriptional regulation – microRNA targets mRNA

transcriptome


Introduction – types of RNAs


Cancer formation and a summary of the ten leading causes of cancer death in Taiwan in 2008 (ROC year 97)


Microarray – overview

[Figure: probe genes on the array hybridize with target cDNA labeled with Cy5 (red) and Cy3 (green). By Hanne Jarmer, BioCentrum-DTU, Technical University of Denmark]


cDNA microarrays

  • Microarrays are used to measure gene expression levels in two different conditions, with a green label for the control sample and a red label for the experimental sample.

  • Hybridization is DNA-cDNA or DNA-mRNA.

  • The hybridized microarray is excited by a laser and scanned at the appropriate wavelengths for the red and green dyes.

  • The amount of fluorescence emitted (intensity) upon laser excitation is approximately proportional to the amount of mRNA bound to each spot.

  • If the transcript is more abundant in the control sample, the spot appears green; if it is more abundant in the experimental sample, the spot appears red. The color therefore indicates the relative amount of transcript for the mRNA (EST) in the two samples.

  • If both are equally abundant, the spot appears yellow.

  • If neither is present, the spot appears black.
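The color rule above can be sketched in a few lines of Python. This is only an illustration: the function name `spot_color`, the background cutoff and the two-fold ratio threshold are assumptions chosen for the example, not values taken from the presentation.

```python
import math

def spot_color(red, green, background=100.0, fold=2.0):
    """Classify one spot from its Cy5 (red) and Cy3 (green) intensities.

    background and fold are illustrative thresholds (assumptions).
    """
    if red <= background and green <= background:
        return "black"      # neither sample shows signal above background
    # Floor both intensities at the background level so the ratio is always defined.
    log_ratio = math.log2(max(red, background) / max(green, background))
    if log_ratio >= math.log2(fold):
        return "red"        # transcript more abundant in the experimental sample
    if log_ratio <= -math.log2(fold):
        return "green"      # transcript more abundant in the control sample
    return "yellow"         # comparable abundance in both samples

# Made-up intensity pairs (red, green).
for red, green in [(5000, 400), (300, 4000), (2500, 2400), (40, 60)]:
    print(red, green, spot_color(red, green))
```

Working on the log2 ratio treats a two-fold increase and a two-fold decrease symmetrically, which is why it is a common choice for two-color data.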


Microarray data generation, processing and analysis

Image analysis

Information processing

  • Image quantitation – locating the spots and measuring their fluorescence intensities

  • Data normalization and integration – construction of the gene expression matrix from sets of spot measurements

  • Gene expression data analysis and mining – finding differentially expressed genes (DEGs) or clusters of similarly expressed genes

  • Generation of new hypotheses – these analyses suggest new hypotheses about the underlying biological processes, which in turn should be tested in follow-up experiments

Data analysis

clustering

http://www.mathworks.com/company/pressroom/image_library/biotech.html


Introduction – biogenesis of microRNA

miRNA gene

→ pri-miRNA (stem-loop structure), processed by Drosha

→ pre-miRNA (65~90 nt), carried by Exportin 5 to the cytoplasm

→ mature miRNA (20~25 nt), generated by the RNase III-type enzyme Dicer

→ directed by RISC to the miRNA target

→ cleaves the mRNA or impedes its translation into protein


Introduction – miRNAs can play the role of an OCG or a TSG

  • When a miRNA plays an oncogenic role, it targets TSGs or genes that control cell differentiation or apoptosis, and so leads to tumor formation.

  • When a miRNA plays a tumor suppressor role, it targets OCGs or genes that control cell differentiation or apoptosis, and so can suppress tumor formation.

  • A negative correlation between the miRNA and target mRNA expression profiles is therefore expected.

  • Integrate the human miRNA-targeted (or siRNA-targeted) mRNA data, protein-protein interaction (PPI) records, tissues, pathways, and disease information to establish a disease-related miRNA (or siRNA) pathway database.


Introduction – cancer-related miRNAs


A platform for studying miRNAs and cancerous target genes

[Figure: overview of the platform]

  • miRNA annotation: miR2Disease (disease-related miRNAs), chromosomal fragile sites, miRNA cluster information, CpG island-proximal miRNAs

  • TarBase data: experimentally verified miRNA-mRNA pairs

  • NCI-60 cancer data: expression profiles of miRNA and mRNA, used to find miRNA-mRNA anti-correlation pairs

  • mRNA annotation: TAG → known OCG, TSG or CRG; OMIM → disease genes; KEGG → cancer pathways

[Table: number of cell lines for the nine cancer types in the NCI-60 data sets]


miRNA, target gene, protein-protein interaction (PPI)

[Figure: a miRNA or siRNA suppresses the mRNA of its target protein; the target (TG) and its successive PPI partners L1 and L2, which may include transcription factors (TF), form a pathway. TG, L1 and L2 carry x, y and z BP/MF annotations respectively, with n1 overlapping annotations between TG and L1 and n2 between L1 and L2.]

  • Tissue-specific miRNA or siRNA targets, and their PPI partners up to the second level

  • If the upstream miRNA (or siRNA) is defective, its effect could be amplified downstream.

  • As an illustration, suppose a miRNA (or siRNA) targets gene TG, which has two successive PPI partners, proteins L1 and L2. If genes TG and L2 are involved in the same disease, then it is highly probable that gene L1 is also related to that disease. This is quantified by enrichment analysis.
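The TG – L1 – L2 reasoning can be sketched as a two-level walk over a PPI network. The function name, the dictionary layout and the toy network below are hypothetical and serve only to illustrate the idea; they do not describe how ncRNAppi is implemented.

```python
def candidate_l1_genes(target, ppi, disease_genes):
    """Return L1 proteins on a TG - L1 - L2 path whose two ends share a disease.

    ppi maps each protein to the set of its direct interaction partners;
    disease_genes is the set of genes already linked to the disease of interest.
    """
    candidates = set()
    if target not in disease_genes:
        return candidates
    for l1 in ppi.get(target, set()):
        for l2 in ppi.get(l1, set()):
            if l2 in (target, l1):
                continue                  # skip back-steps to TG and self-loops
            if l2 in disease_genes and l1 not in disease_genes:
                candidates.add(l1)        # TG and L2 are disease genes, L1 is not yet
    return candidates

# Toy, made-up network: TG interacts with A and B; A with C; B with D.
ppi = {
    "TG": {"A", "B"},
    "A": {"TG", "C"},
    "B": {"TG", "D"},
    "C": {"A"},
    "D": {"B"},
}
disease_genes = {"TG", "C"}
print(candidate_l1_genes("TG", ppi, disease_genes))   # {'A'}
```

Here A is flagged because it sits between two disease genes (TG and C), whereas B is not because its second-level partner D has no disease record.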


Input data and methods

Databases:

  • ArrayExpress

    • raw data files for 64 prostate cancer tissue samples and 18 normal prostate tissue samples, measured on the U95Av2 array

  • TAG (Tumor Associated Gene)

  • NCI-60 – miRNA and mRNA gene expression profiles for 9 cancer types

  • TarBase – miRNA targets (experimentally verified)

  • miR2Disease

    • a comprehensive resource of miRNA deregulation in various human diseases

  • OMIM – human disease information

  • KEGG – cancer pathways information

  • ncRNAppi

    • a useful tool for identifying ncRNA target pathways

  • PPI data (BioIR) – Seven databases are integrated: HPRD, DIP, BIND, IntAct, MIPS, MINT and BioGRID

  • Gene Ontology (GO) – Biological Process and Molecular Function annotations

  • Tool: Bioconductor


Research protocol


Predict DEGs using R and Bioconductor commands


Results – DEGs predicted by Bioconductor

  • The top 100 DEGs (either up- or down-regulated) are selected.

  • After eliminating duplicated genes, the total number of predicted DEGs is 85, and the adjusted p-values of all DEGs are less than 1.9 × 10^-5.

  • TAG ∩ DEGs: 14 known cancer genes are found among the 85 predicted DEGs (16.5%)
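The de-duplication and overlap steps can be sketched as follows. This is a Python illustration under stated assumptions, not the R/Bioconductor commands used in the study: the probe identifiers, gene symbols and helper names are all made up for the example.

```python
def unique_genes(ranked_probes, probe_to_gene):
    """Collapse a ranked probe list to unique gene symbols, keeping first occurrences."""
    seen = []
    for probe in ranked_probes:
        gene = probe_to_gene.get(probe)
        if gene is not None and gene not in seen:
            seen.append(gene)
    return seen

def overlap_with_known(genes, known_cancer_genes):
    """Return the genes that are also in the known set, and their percentage."""
    hits = [g for g in genes if g in known_cancer_genes]
    return hits, 100.0 * len(hits) / len(genes)

# Made-up probes and gene symbols; two probes map to the same gene.
ranked_probes = ["p1", "p2", "p3", "p4", "p5"]
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_B", "p3": "GENE_A",
                 "p4": "GENE_C", "p5": "GENE_D"}
known = {"GENE_B", "GENE_X"}

genes = unique_genes(ranked_probes, probe_to_gene)
hits, pct = overlap_with_known(genes, known)
print(genes, hits, pct)

# The figures quoted on the slide: 14 known cancer genes among 85 DEGs.
print(round(100.0 * 14 / 85, 1))   # 16.5
```

Several probes on an array can map to the same gene, which is how a top-100 probe list can shrink to 85 distinct genes.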


Results – miRNAs, DEGs and cancer types

Other DEGs


Results – the relationship among miR-20a, TGFBR2 and human prostate cancer

16461460

http://ppi.bioinfo.asia.edu.tw/R_cancer/



A platform for studying miRNAs and cancerous target genes

For a given cancer tissue type, we calculated both the Pearson correlation coefficient (PCC) and the Spearman rank correlation coefficient (SRC) between the expression profiles of a miRNA and its target gene. The PCC, r, is given by

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} $$

where $x_i$ and $y_i$ denote the expression intensities of the miRNA and the miRNA's target gene respectively, and $\bar{x}$ and $\bar{y}$ are their means.

One weakness of quantifying the strength of correlation by PCC is that it is easily skewed by outliers. A single outlying data point can make two genes appear correlated even when all the other data points are not. SRC is a non-parametric statistical method that is robust to outliers.

The PCC and SRC are calculated for:

  • Three Affymetrix chips: U95(A-E), U133A, U133B

  • Normalization methods: GCRMA, MAS5, RMA
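The effect of an outlier on the two coefficients can be seen in a small pure-Python sketch. The expression values are invented for illustration, and the helper names (`pearson`, `ranks`, `spearman`) are not from the presentation; SRC is computed here as the PCC of the rank-transformed data, which is its standard definition.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: the PCC of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up profiles: the first five points are weakly positively related,
# the sixth is a single extreme outlier.
mirna = [1, 2, 3, 4, 5, 100]
target = [2, 1, 4, 3, 5, -100]

print(pearson(mirna, target))    # close to -1: dominated by the outlier
print(spearman(mirna, target))   # close to 0: the ranks barely change
```

With these numbers the PCC alone would suggest strong anti-correlation, while the SRC does not, which is why requiring both coefficients to pass a cutoff is a more conservative filter than using either one.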


Test of hypothesis of PCC and SRC

The Pearson product-moment table of critical values is used to test the significance of a PCC result. The hypothesis test is one-tailed. A different table is applied to the SRC results.

[Table: critical values for the one-tailed test using Pearson and Spearman correlation at significance levels α = 0.05 and α = 0.10]
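For reference, tabulated critical values of the PCC follow from Student's t distribution: under the null hypothesis of no correlation, and assuming approximately bivariate normal data, the statistic t below has n − 2 degrees of freedom. The sample size n = 9 in the worked line is an assumption for illustration only, not a figure from the slides.

```latex
t = r\sqrt{\frac{n-2}{1-r^{2}}}, \qquad
r_{\mathrm{crit}} = \frac{t_{\alpha,\,n-2}}{\sqrt{t_{\alpha,\,n-2}^{2} + n - 2}}

% Worked example: n = 9, one-tailed \alpha = 0.05, t_{0.05,\,7} \approx 1.895
r_{\mathrm{crit}} \approx \frac{1.895}{\sqrt{1.895^{2} + 7}} \approx 0.582
```

For a one-tailed test of anti-correlation, an observed r at or below −r_crit would be judged significant at that level.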


Results – hsa-miR-1:AXL, PCC and SRC calculations

Cases where both PCC and SRC are less than or equal to -0.5.


Results – hsa-miR-10b:HOXD10

Other examples:

hsa-miR-21:PTEN (TSG)

hsa-miR-15b:BCL2 (TSG)

hsa-miR-16:BCL2 (TSG)

miR2Disease – diseases related to hsa-miR-10b, e.g. leukemia and breast, colon, ovarian and prostate cancers.


Extension – work in progress

  • Validate how good the correlation-based prediction is

  • Add further information

    • CpG islands: miRNAs located around CpG islands (e.g. miR-34b, miR-137, miR-193a, and miR-203) are silenced by DNA hypermethylation in oral cancer

    • miRNA clusters, fragile sites

  • Positively correlated miRNA:mRNA pairs may involve TFs


ncRNAppi – miRNA, target genes, PPI, and the protocol of enrichment analysis

[Figure: a miRNA or siRNA suppresses the mRNA of its target protein; the target and its PPI partners, including a TF, form the pathway, with segment scores JC(TG,L1) and JC(L1,L2)]

Two directly interacting proteins tend to participate in the same biological process or share the same molecular function. Let a miRNA targeting pathway be denoted by miRNA – TG – L1 – L2. We propose to rank pathways according to the number of overlapping biological processes (or molecular functions) between TG and L1, and between L1 and L2. The Jaccard coefficient (JC) is used to rank the significance of a pathway.

The JC of sets A and B is defined by

$$ JC(A,B) = \frac{|A \cap B|}{|A \cup B|} $$

where $|A \cap B|$ and $|A \cup B|$ denote the cardinality of $A \cap B$ and $A \cup B$ respectively.
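A minimal Python sketch of the segment scores JC(TG,L1) and JC(L1,L2) follows. The GO term sets are made up, and returning 0 for two empty annotation sets is a convention assumed here rather than something stated in the slides.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections of GO terms."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0      # assumed convention when both sets are empty
    return len(a & b) / len(union)

# Made-up biological-process annotations for the three nodes of one pathway.
bp = {
    "TG": {"apoptosis", "cell cycle", "signal transduction"},
    "L1": {"apoptosis", "cell cycle", "transcription"},
    "L2": {"transcription", "cell differentiation"},
}
print(jaccard(bp["TG"], bp["L1"]))   # 0.5
print(jaccard(bp["L1"], bp["L2"]))   # 0.25
```

The coefficient ranges from 0 (no shared annotation) to 1 (identical annotation sets), so segments can be compared even when the proteins carry different numbers of GO terms.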


ncRNAppi – The protocol of enrichment analysis

The biological process (BP) and molecular function (MF) annotations are taken from Gene Ontology and are used to characterize the path TG – L1 – L2. The JC for the pathway is given by

[equation not recovered]

where the two symbols in the equation denote the JC score of the biological process annotations for the segment TG – L1 and for the TG – L1 – L2 pathway respectively.


ncRNAppi – The protocol of enrichment analysis, p-value

We assign a p-value to every JC calculation, which provides a measure of its statistical significance. The p-value is estimated as follows. Let N be the total number of BP terms found in GO. Assume that TG, L1 and L2 have x, y and z BP annotations respectively. Also, let n1 and n2 be the number of identical BP terms for TG – L1 and L1 – L2 respectively. Let p1 and p2 be the probabilities that TG – L1 and L1 – L2 have n1 and n2 common BP (or MF) terms respectively, which are defined as

[equations for p1 and p2 not recovered]

[Figure: Venn diagram within the N GO terms, showing x − n1 terms annotated to TG only, n1 terms shared by TG and L1, and y − n1 terms annotated to L1 only]
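One common way to assign such a probability, assumed here because the slide's own formula is not available, is the hypergeometric distribution: if L1's y terms were drawn at random from the N GO terms, the chance of sharing exactly n of TG's x terms is C(x, n) C(N − x, y − n) / C(N, y). The sketch below uses that assumption; the function names and the numbers are hypothetical.

```python
from math import comb

def overlap_probability(N, x, y, n):
    """Probability that random annotation sets of sizes x and y, drawn from
    N terms, share exactly n terms (hypergeometric distribution)."""
    if n < 0 or n > min(x, y) or y - n > N - x:
        return 0.0
    return comb(x, n) * comb(N - x, y - n) / comb(N, y)

def overlap_p_value(N, x, y, n):
    """Probability of sharing n or more terms by chance (upper tail)."""
    return sum(overlap_probability(N, x, y, k) for k in range(n, min(x, y) + 1))

# Made-up numbers: N GO terms in total, TG has x terms, L1 has y terms, n1 shared.
N, x, y, n1 = 1000, 10, 12, 3
p1 = overlap_probability(N, x, y, n1)
print(p1, overlap_p_value(N, x, y, n1))
```

The same call with (N, y, z, n2) would give the corresponding value for the L1 – L2 segment; whether the exact-overlap probability or the upper tail matches the original definition cannot be determined from the transcript.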


ncRNAppi – Extension of TarBase targets

Limitations of miRNA target prediction tools

Many tools are available for miRNA target gene prediction, such as miRanda, TargetScan, and RNAhybrid.

A major problem of miRNA target gene prediction is that the prediction accuracy remains uncertain. One report indicated that the false positive rate could be as high as 24-39% for miRanda and 22-31% for TargetScan.

If the miRNA:mRNA targeting step is uncertain, then the ‘Level 1’ and ‘Level 2’ protein-protein interaction pathways derived from the PPI database are also doubtful.


ncRNAppi – Extension of TarBase targets

  • miRNA target prediction tool – miRanda

  • Mature human miRNA FASTA sequences are downloaded from miRBase (the latest version is 13).

  • Then, we predict the possibility of each miRNA binding to OCGs and TSGs.

  • The target prediction tool, miRanda, allows fine tuning of certain parameters, i.e. MFE threshold, score, shuffle statistics, gap open and gap extension scores.

  • We set the MFE threshold to -25 kcal/mol and the shuffle statistics to ON.

  • The rest of the parameters are set to their default values.

  • Once the binding lists of OCGs and TSGs are obtained, their PPI pathways can be retrieved from the BioIR database.


Results – ncRNAppi

  • ncRNAppi provides web-based data access and allows disease assignment for a specific node along miRNA (siRNA) targeting pathways. For example:

  • Select the miRNA ID hsa-let-7

  • Check the ‘OMIM Disease type for individual node’ boxes labeled ‘Target’ and ‘Level-2’

  • Choose the item ‘lung tumor’ under the ‘TUMOR TYPE’ pull-down menu (OMIM)

  • Select ‘Yes’ under ‘Common expression of target, Level-1 and level-2 nodes in KEGG’

  • Pathways are ranked according to the Jaccard index and p-value for BP or MF

Example

hsa-let-7

Unigene: liver

Target, L1 and L2 are OCG

submit


Summary

R and Bioconductor are used to predict DEGs from prostate cancer microarray data. By integrating the Tumor Associated Gene (TAG), ncRNAppi and miR2Disease databases, it is found that certain DEGs are regulated by microRNAs.

A platform for studying miRNAs and cancer target genes

(1) PCC and SRC results are used to quantify the correlation between the expression profiles of a miRNA and its target. The predicted results are annotated with reference to the TAG, OMIM, miR2Disease and KEGG data sets.

(2) The main advantage of the two platforms is that all of their miRNA-mRNA targeting information and disease records are experimentally verified.

ncRNAppi platform

ncRNAppi provides a powerful tool for identifying cancer-related miRNAs or siRNAs. For instance, it allows novel cancer genes to be predicted through tissue-specific or disease-specific searches. The platform is useful for investigating the regulatory roles of miRNAs and siRNAs in cancer studies.


Acknowledgement

National Science Foundation

Professor S.C. Lee (李尚熾) – Chung Shan Medical University

Mr. Liu Hsueh-Chuan (劉學銓) – former graduate student at Asia University

Mr. C.W. Weng (翁嘉偉) – former graduate student at Asia University

Mr. Kevin Lo (羅琮傑) – MSc graduate student at Asia University


Thank you for your attention.

