Daniel Rico, PhD. drico@cnio.es

Course on Functional Analysis ::: Introduction to Functional Analysis. Daniel Rico, PhD. drico@cnio.es. Bioinformatics Unit, CNIO. ::: Schedule. Biological (Functional) Databases. Threshold-based and threshold-free methods. Threshold-based example: FatiGO.

Presentation Transcript


  1. Course on Functional Analysis ::: Introduction to Functional Analysis. Daniel Rico, PhD. drico@cnio.es. Bioinformatics Unit, CNIO

  2. ::: Schedule.
     • Biological (Functional) Databases
     • Threshold-based and threshold-free methods
     • Threshold-based example: FatiGO.
     • Threshold-free example 1: FatiScan.

  3. ACKNOWLEDGEMENTS Many of these slides have been taken and adapted from original slides by Fatima Al-Shahrour from Joaquin Dopazo’s group (Babelomics team). We are grateful for the material and for the great tools they have developed!!!!

  4. Biological databases
     Functional annotations:
     • Regulatory elements: miRNA, CisRed, Transcription Factor Binding Sites
     • KEGG pathways
     • Gene Ontology: Biological Process, Molecular Function, Cellular Component
     • Keywords Swissprot
     • Biocarta pathways
     • Gene Expression in tissues
     • Bioentities from literature: Diseases terms, Chemical terms
     • InterPro Motifs
     Species: Homo sapiens, Mus musculus, Rattus norvegicus, Gallus gallus, Danio rerio, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana
     Genes IDs: UniProt/Swiss-Prot, UniProtKB/TrEMBL, Ensembl IDs, EntrezGene, Affymetrix, Agilent, HGNC symbol, EMBL acc, RefSeq, PDB, Protein Id, IPI….

  5. Gene Ontology CONSORTIUM http://www.geneontology.org
     • The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products.
     • These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them.
     • The controlled vocabularies of terms are structured

  6. GO structure
     The three categories of GO:
     • Molecular Function: the tasks performed by individual gene products; examples are transcription factor and DNA helicase.
     • Biological Process: broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions.
     • Cellular Component: subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and origin recognition complex.
     GO tree structure [figure: IS_A relation, PART_OF relation]
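Because GO terms are linked by IS_A and PART_OF relations, a gene product annotated to a specific term is implicitly annotated to every ancestor of that term. The following is a minimal Python sketch of that idea. The term names, the edges and the `ancestors` helper are assumptions made for illustration; they are not read from the real GO files.

```python
# Toy fragment of a GO-like graph: child term -> list of (relation, parent term).
# The terms and edges are illustrative assumptions, not real GO records.
PARENTS = {
    "purine metabolism": [("is_a", "metabolism")],
    "metabolism": [("is_a", "biological process")],
    "telomere": [("part_of", "chromosome")],
    "chromosome": [("is_a", "cellular component")],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` through is_a/part_of edges."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


if __name__ == "__main__":
    print(sorted(ancestors("purine metabolism")))
```

Enrichment tools typically propagate annotations upward in this way before counting genes per term, which is one reason general terms collect many more genes than specific ones.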

  7. http://www.genome.ad.jp/kegg/pathway.html

  8. http://www.biocarta.com/genes/index.asp

  9. http://www.reactome.org/

  10. http://www.pathwaycommons.org

  11. http://www.whichgenes.org/

  12. http://www.cisred.org/

  13. ::: Schedule.
      • Biological (Functional) Databases
      • Threshold-based and threshold-free methods
      • Threshold-based example: FatiGO.
      • Threshold-free example 1: FatiScan.

  14. Threshold-based functional analysis
      Study the enrichment in functional terms in groups of genes defined by the experimental value.
      FatiGO, GOminer, DAVID, Marmite
      The two-step approach:
      • Genes of interest are selected using the experimental value.
      • Selected genes are compared to the background.
      Threshold-free functional analysis
      Select genes taking into account their functional properties.
      FatiScan, GSEA, MarmiteScan
      • Under a systems biology perspective.
      • Detect blocks of functionally related genes.
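The two-step approach can be sketched in a few lines of Python: first cut the gene list at a threshold, then count how a functional term is distributed between the selected genes and the background. This is only an illustration under stated assumptions. The function name `two_step_table`, the input dictionary of adjusted p-values and the 0.05 cut-off are hypothetical choices, not the interface of FatiGO or any other tool.

```python
def two_step_table(adjusted_p, annotated, cutoff=0.05):
    """Build a 2x2 contingency table for one functional term.

    adjusted_p: dict mapping gene id -> adjusted p-value from the experiment.
    annotated:  set of gene ids annotated with the term under study.
    Rows are (term, no term); columns are (selected, background).
    """
    # Step 1: genes of interest are selected using the experimental value.
    selected = {gene for gene, p in adjusted_p.items() if p < cutoff}
    # Step 2: the selected genes are compared to the background (the rest).
    background = set(adjusted_p) - selected

    a = len(selected & annotated)      # selected, with the term
    b = len(background & annotated)    # background, with the term
    c = len(selected) - a              # selected, without the term
    d = len(background) - b            # background, without the term
    return [[a, b], [c, d]]


if __name__ == "__main__":
    p_values = {"g1": 0.01, "g2": 0.02, "g3": 0.20, "g4": 0.90, "g5": 0.03}
    print(two_step_table(p_values, annotated={"g1", "g3", "g5"}))
```

The resulting table is what a test such as Fisher's exact test would take as input, once per functional term.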

  15. Threshold-based functional analysis
      [Figure: Class1 vs Class2, t-test cut-off at FDR<0.05 at each end. Biological meaning?]

  16. Threshold-free functional analysis
      [Figure: Class1 vs Class2 with t-test cut-off; Gene Set 1, Gene Set 2 and Gene Set 3 placed along the ES/NES statistic axis (- to +). Gene set 2 enriched in Class 1; Gene set 3 enriched in Class 2.]
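The ES statistic mentioned on this slide can be illustrated with a running sum over the whole ranked gene list, with no cut-off applied. The sketch below is the unweighted, Kolmogorov-Smirnov-style version of the idea behind GSEA. It is an assumption-laden simplification: it does not compute NES or permutation p-values, and it is not the procedure used by FatiScan. The function name `enrichment_score` is hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: gene ids ordered by the experimental statistic.
    gene_set:     set of gene ids sharing a functional annotation.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up on a gene in the set, step down otherwise.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    ranking = ["a", "b", "c", "d", "e", "f"]
    print(enrichment_score(ranking, {"a", "b"}))  # set concentrated at the top
    print(enrichment_score(ranking, {"e", "f"}))  # set concentrated at the bottom
```

A gene set concentrated at one end of the ranking gives a score of large magnitude, and the sign tells which end of the ranking it sits at.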

  17. ::: Schedule.
      • Biological (Functional) Databases
      • Threshold-based and threshold-free methods
      • Threshold-based example: FatiGO.
      • Threshold-free example 1: FatiScan.

  18. http://babelomics.bioinfo.cipf.es/

  19. ::: How the functional profiling should never be done It is not uncommon to find the following assertion in papers and talks: “then we examined our set of genes selected in this way (whatever) and we discover that 65% of them were related to metabolism, so we can conclude that our experiment activates metabolism genes”. Annotation is not a functional result!!!

  20. ::: Exercise 1: FatiGO SEARCH
      1. Select “FatiGO Search” and “H. sapiens”.
      2. Upload FatiGO_example.txt file.
      3. Select “KEGG pathways” and click “Run”.

  21. ::: Exercise 1: FatiGO SEARCH
      1. Select “FatiGO Search” and “H. sapiens”.
      2. Upload FatiGO_example.txt file.
      3. Select “KEGG pathways” and click “Run”.
      FatiGO-Search annotations

  22. Testing the distribution of GO terms among two groups of genes (remember, we have to test hundreds of GOs)
      Are these two groups of genes carrying out different biological roles?

                          A    B
      Biosynthesis        6    2
      No biosynthesis     4    8

      Group A: Biosynthesis 60%, Sporulation 20%
      Group B: Biosynthesis 20%, Sporulation 20%
      Genes in group A have significantly more to do with biosynthesis, but not with sporulation.
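The 2x2 table of biosynthesis counts (6 and 2 annotated genes, 4 and 8 not annotated) is the kind of input Fisher's exact test takes. Below is a minimal pure-Python sketch of the two-sided test, written with exact integer arithmetic so it needs no external library; the function name is hypothetical. For this toy table the two-sided p-value works out to roughly 0.17, so with only ten genes per group the counts show the layout of the test rather than a result that would pass a 0.05 cut-off.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1 = a + b   # genes with the term
    col1 = a + c   # genes in group A
    col2 = b + d   # genes in group B
    total = comb(col1 + col2, row1)

    def weight(k):
        # Number of ways to place k of the annotated genes in group A.
        return comb(col1, k) * comb(col2, row1 - k)

    observed = weight(a)
    k_min = max(0, row1 - col2)
    k_max = min(row1, col1)
    extreme = sum(
        weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed
    )
    return extreme / total


if __name__ == "__main__":
    # Rows: Biosynthesis / No biosynthesis; columns: group A / group B.
    print(round(fisher_exact_two_sided(6, 2, 4, 8), 4))
```

In a real analysis this test is repeated for every functional term, which is why the resulting p-values must then be corrected for multiple testing.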

  23. Using FatiGO: comparing groups of genes
      • List1: genes of interest (they are significantly over- or under-expressed when two classes of experiments are compared, co-located in the chromosomes, etc.)
      • List2: the background (typically the rest of genes).
      • Select suitable database, Run...
      [Workflow figure]
      List1, List2
      → Remove genes repeated in list1; remove genes repeated in list2; remove genes repeated between both lists
      → “clean” List1, “clean” List2
      → Extract functional terms (BABELOMICS: GO, KEGG, Interpro, KW, Bioentities, Gene Expression, TF, Cisred)
      → Matrix of functional terms (011000101010101001 ...)
      → Fisher's test
      → Adjust p-value by FDR
      → Significant functional terms
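Two of the workflow steps, cleaning the lists and adjusting p-values by FDR, are easy to show in Python. The sketch below makes explicit assumptions: genes found in both lists are dropped from both (the slide only says they are removed), and the FDR adjustment is the Benjamini-Hochberg procedure, which the slide does not name. Both function names are hypothetical, and the per-term p-values are assumed to come from Fisher's test.

```python
def clean_lists(list1, list2):
    """Remove duplicates within each list, then genes shared by both lists."""
    unique1 = list(dict.fromkeys(list1))   # keeps first occurrence, preserves order
    unique2 = list(dict.fromkeys(list2))
    shared = set(unique1) & set(unique2)
    # Assumption: a gene present in both lists is removed from both.
    clean1 = [gene for gene in unique1 if gene not in shared]
    clean2 = [gene for gene in unique2 if gene not in shared]
    return clean1, clean2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    print(clean_lists(["g1", "g2", "g2", "g3"], ["g3", "g4", "g4"]))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Terms whose adjusted p-value falls below the chosen FDR level (0.05 on these slides) would be reported as significant.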

  24. [Figure: Class1 vs Class2, t-test cut-off at FDR<0.05 at each end; List 1; List 2 (background); List 1b / List 2b]

  25. ::: Exercise 2: FatiGO COMPARE 1. Select “FatiGO Compare” and “H. sapiens”. 2. Upload FatiGO_example.txt file 3. Select “Rest of Genome” as background. 4. Select “KEGG pathways” and click “Run”

  26. ::: Exercise 2: FatiGO COMPARE 1. Select “FatiGO Compare” and “H. sapiens”. 2. Upload FatiGO_example.txt file 3. Select “Rest of Genome” as background. 4. Select “KEGG pathways” and click “Run” Only “Apoptosis” is significant
