
Annotation for Gene Expression Analysis with Reactome.db Package

Annotation for Gene Expression Analysis with Reactome.db Package. Utah State University – Spring 2012 STAT 6570 : Statistical Bioinformatics Cody Tramp. References. Ligtenberg W. 2011. Reactome.db : How to use the reactome.db package. www.reactome.org. Reactome.db Overview.


Presentation Transcript


  1. Annotation for Gene Expression Analysis with Reactome.db Package Utah State University – Spring 2012 STAT 6570: Statistical Bioinformatics Cody Tramp

  2. References • Ligtenberg W. 2011. Reactome.db: How to use the reactome.db package. • www.reactome.org

  3. Reactome.db Overview • “Open source, open access, manually curated, and peer-reviewed pathway database” – www.reactome.org • Reactome.db is an R interface that allows queries to the SQL database containing pathway information • Contains functions for converting between annotation IDs and names for GO, Entrez, and Reactome

  4. Getting Help on Specific Reactome.db Functions

     # Load the reactome.db package
     library(reactome.db)

     # Check for the main manual page
     ?reactome.db    # This won't get the actual manual

     # List all reactome.db objects
     ls("package:reactome.db")
     #  [1] "reactome"              "reactome_dbconn"       "reactome_dbfile"
     #  [4] "reactome_dbInfo"       "reactome_dbschema"     "reactomeEXTID2PATHID"
     #  [7] "reactomeGO2REACTOMEID" "reactomeMAPCOUNTS"     "reactomePATHID2EXTID"
     # [10] "reactomePATHID2NAME"   "reactomePATHNAME2ID"   "reactomeREACTOMEID2GO"

     # Look up the specific manual page for an object
     ?reactome_dbInfo    # Still not very useful: poor documentation

  5. How IDs and names are stored in Reactome.db • The reactome.db package links to an SQL database • Functions are interfaces to the database • SQL databases are relational databases (think of Excel spreadsheets, but better) • Data is stored as key:value pairs

  6. Reactome.db Function Uses (NOTE: all return a key:value list)

     Converting Between Entrez and Reactome
     reactomeEXTID2PATHID = Entrez ID to Reactome.db ID
     reactomePATHID2EXTID = Reactome.db ID to Entrez ID

     > xx <- toTable(reactomeEXTID2PATHID)
     > head(xx)
       reactome_id gene_id
     1      168253   10898
     2      168254   10898
     3      168253    8106
     4      168254    8106
     5      168253    5610
     6      168254    5610

     Use toTable() instead of the as.list() shown in the manuals
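The difference between the table form and the list form can be sketched in base R without the package installed. The data frame below is a toy stand-in for the toTable() result, using the six rows and the column names printed on the slide; the object names `entrez2path` and `path2entrez` are made up for illustration and are not part of reactome.db.

```r
# Toy stand-in for toTable(reactomeEXTID2PATHID); values copied from the slide output
xx <- data.frame(
  reactome_id = c("168253", "168254", "168253", "168254", "168253", "168254"),
  gene_id     = c("10898", "10898", "8106", "8106", "5610", "5610"),
  stringsAsFactors = FALSE
)

# Table form: one row per (gene, pathway) pair, easy to filter
xx[xx$gene_id == "8106", ]

# List form (roughly what as.list() gives): one element per key
entrez2path <- split(xx$reactome_id, xx$gene_id)   # Entrez ID -> Reactome IDs
path2entrez <- split(xx$gene_id, xx$reactome_id)   # Reactome ID -> Entrez IDs

entrez2path[["8106"]]      # Reactome IDs stored under Entrez ID 8106
path2entrez[["168253"]]    # Entrez IDs stored under Reactome ID 168253
```

The same set of pairs serves both directions, which is presumably why the package offers the two mapping objects as mirror images of each other.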

  7. Reactome.db Function Uses (NOTE: all return a key:value list)

     Converting Between GO ID and Reactome ID
     reactomeREACTOMEID2GO = Reactome.db ID to GO IDs
     reactomeGO2REACTOMEID = GO ID to Reactome.db ID

     > xx <- toTable(reactomeGO2REACTOMEID)
     > head(xx)
       reactome_id      go_id
     1      168276 GO:0019054
     2      168276 GO:0019048
     3      168276 GO:0044068
     4      168276 GO:0022415
     5      168276 GO:0051701
     6      168276 GO:0044003

  8. Reactome.db Function Uses (NOTE: all return a key:value list)

     Retrieving Pathway Names from Reactome IDs
     reactomePATHNAME2ID = Reactome.db Name to Reactome.db ID
     reactomePATHID2NAME = Reactome.db ID to Reactome.db Name

     > xx <- toTable(reactomePATHID2NAME)
     > head(xx)
       reactome_id path_name
     1       15869 Homo sapiens: Metabolism of nucleotides
     2       68616 Homo sapiens: Assembly of the ORC complex at the origin of replication
     3       68689 Homo sapiens: CDC6 association with the ORC:origin complex
     4       68827 Homo sapiens: CDT1 association with the CDC6:ORC:origin complex
     5       68867 Homo sapiens: Assembly of the pre-replicative complex
     6       68874 Homo sapiens: M/G1 Transition
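Once the ID-to-name table is available as a data frame, attaching names to a vector of Reactome IDs is a plain lookup. The sketch below assumes a table shaped like the slide output (columns `reactome_id` and `path_name`) and uses three of the rows shown there as toy data; `ids`, `path_names`, and `named` are illustrative names only.

```r
# Toy stand-in for toTable(reactomePATHID2NAME); rows copied from the slide output
path_names <- data.frame(
  reactome_id = c("15869", "68616", "68689"),
  path_name   = c("Homo sapiens: Metabolism of nucleotides",
                  "Homo sapiens: Assembly of the ORC complex at the origin of replication",
                  "Homo sapiens: CDC6 association with the ORC:origin complex"),
  stringsAsFactors = FALSE
)

# IDs we want names for (any order)
ids <- c("68616", "15869")

# match() keeps the order of `ids` and gives NA for IDs that have no name
named <- data.frame(
  reactome_id = ids,
  path_name   = path_names$path_name[match(ids, path_names$reactome_id)],
  stringsAsFactors = FALSE
)
named
```

Using match() rather than merge() is a design choice here: it preserves the input order, which is convenient when the IDs come from a ranked gene list.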

  9. Reactome.db Function Uses (NOTE: all return a key:value list)

     reactomeMAPCOUNTS = shows the number of rows in each mapping's relational table (not very useful except for error checking)

     > xx <- as.list(reactomeMAPCOUNTS)
     > xx
     $reactomeEXTID2PATHID
     [1] 28363
     $reactomeGO2REACTOMEID
     [1] 3217
     $reactomePATHID2EXTID
     [1] 8320
     $reactomePATHID2NAME
     [1] 13778
     $reactomePATHNAME2ID
     [1] 13876
     $reactomeREACTOMEID2GO
     [1] 47575

  10. Ex: Find apoptosis induction-related ID (compare to Notes 6.1 slide 10)

      # Get data.frame summarizing all reactome.db pathways including a certain string
      xx <- toTable(reactomePATHNAME2ID)
      all.pathways <- xx$path_name            # get name of each reactome.db pathway
      t <- grep('apoptosis', all.pathways)    # get index where pathway name includes the string
      # use agrep() for approximate term searching
      reactome.Term <- unlist(all.pathways[t])
      reactome.IDs <- unlist(xx$reactome_id[t])
      reactome.frame <- data.frame(reactome.ID=reactome.IDs, reactome.Term=reactome.Term)
      rownames(reactome.frame) <- 1:length(reactome.IDs)
      reactome.frame    # 13 terms

  11. Ex: Find apoptosis induction-related ID (compare to Notes 6.1 slide 10)

  12. Ex. Pathway Term Search Function

      ## Define function to search for pathways with a given key word
      ## agrep.bool indicates whether to use agrep (TRUE) or grep (FALSE)
      searchPathways2REACTOMEID <- function(term, agrep.bool) {
        xx <- toTable(reactomePATHNAME2ID)
        all.pathways <- xx$path_name    # get name of each reactome.db pathway
        # get index where term is found
        if (agrep.bool == FALSE) {
          t <- grep(term, all.pathways)
        } else {
          t <- agrep(term, all.pathways)
        }
        unlist(xx$reactome_id[t])
      }

      apop.IDs <- searchPathways2REACTOMEID("apoptosis", FALSE)
      length(apop.IDs)    # 13 pathways matched
      apop.IDs <- searchPathways2REACTOMEID("apoptosis", TRUE)
      length(apop.IDs)    # 85 pathways matched
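One thing worth keeping in mind with the search above is that grep() is case-sensitive by default, so "apoptosis" does not match a name that begins with "Apoptosis". The sketch below shows the effect of `ignore.case = TRUE` on a few pathway names that are made up for illustration (they are not taken from reactome.db output); whether this accounts for any of the gap between the 13 and 85 matches on the slide is not something the slides state.

```r
# Hypothetical pathway names, made up for illustration only
all.pathways <- c("Homo sapiens: Apoptosis",
                  "Homo sapiens: Regulation of apoptosis",
                  "Homo sapiens: M/G1 Transition")

# Default grep(): exact-case match only
hits_cs <- grep("apoptosis", all.pathways)

# Case-insensitive match: also catches "Apoptosis"
hits_ci <- grep("apoptosis", all.pathways, ignore.case = TRUE)

hits_cs
hits_ci
```

agrep() also accepts `ignore.case`, so the same switch could be passed through in the search function if approximate matching is kept.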

  13. Getting GO Terms from single Reactome ID

      ## Get list of GO terms from a Reactome ID
      xx <- toTable(reactomeGO2REACTOMEID)
      t <- xx$reactome_id == "15869"
      GOTerms <- xx$go_id[t]

      > GOTerms
       [1] "GO:0055086" "GO:0006139" "GO:0044281"
       [4] "GO:0034641" "GO:0044238" "GO:0008152"
       [7] "GO:0006807" "GO:0044237" "GO:0008150"
      [10] "GO:0009987"

      > xx <- toTable(reactomeGO2REACTOMEID)
      > head(xx)
        reactome_id      go_id
      1      168276 GO:0019054
      2      168276 GO:0019048
      3      168276 GO:0044068
      4      168276 GO:0022415
      5      168276 GO:0051701
      6      168276 GO:0044003

  14. Getting GO Terms from list of Reactome IDs

      ## Define function to get all GO terms for all Reactome IDs in a list
      getGOTerms <- function(list_reactome) {
        listGO <- list()
        xx <- toTable(reactomeGO2REACTOMEID)
        for (i in seq_along(list_reactome)) {
          t <- xx$reactome_id == list_reactome[i]
          temp_list <- xx$go_id[t]
          listGO <- c(listGO, temp_list)
        }
        unlist(listGO)
      }

      GOTerms.all <- getGOTerms(apop.IDs)    # apop.IDs from the grep search on slide 12 (agrep.bool = FALSE)
      length(GOTerms.all)    # 136 GO terms from 13 apop.IDs

      Should have yielded 169 terms (Notes 4.1 slide 10); reactome.db might not be complete
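The loop above can also be written as a single vectorized filter. The sketch below is an alternative under stated assumptions, not the slide author's function: `getGOTermsVec` is a made-up name, the table is passed in as an argument so the example runs without the package, and the toy rows reuse ID pairs printed on slide 13. It differs from the loop in two ways: results come back in table order rather than in the order of the input IDs, and a Reactome ID repeated in the input does not repeat its GO terms.

```r
# Toy stand-in for toTable(reactomeGO2REACTOMEID); pairs copied from slide 13 output
xx <- data.frame(
  reactome_id = c("168276", "168276", "15869", "15869"),
  go_id       = c("GO:0019054", "GO:0019048", "GO:0055086", "GO:0006139"),
  stringsAsFactors = FALSE
)

# Hypothetical vectorized variant: keep rows whose Reactome ID is in the query set
getGOTermsVec <- function(list_reactome, tbl) {
  tbl$go_id[tbl$reactome_id %in% list_reactome]
}

# "99999" is a placeholder ID with no rows in the toy table; it is silently ignored
go.all    <- getGOTermsVec(c("15869", "99999"), xx)
go.unique <- unique(go.all)    # distinct GO terms, if duplicates should not be counted

go.all
length(go.unique)
```

Whether the 136 terms reported on the slide are distinct is not stated, so comparing `length(go.all)` with `length(go.unique)` may be a useful check before comparing against the 169 from the other notes.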

  15. Reactome.org Online Tools

  16. Pathway Viewer on reactome.org http://www.reactome.org/userguide/Usersguide.html#Introduction

  17. Pathway Viewer on reactome.org • Details Panel

  18. Pathway Viewer on reactome.org http://www.reactome.org/entitylevelview/PathwayBrowser.html#DB=gk_current&FOCUS_SPECIES_ID=48887&FOCUS_PATHWAY_ID=71387&ID=76213&VID=3422142

  19. Reactome Pathway Symbols (figure labels: Upregulation and participating proteins; Inhibition) http://www.reactome.org/entitylevelview/PathwayBrowser.html#DB=gk_current&FOCUS_SPECIES_ID=48887&FOCUS_PATHWAY_ID=71387&ID=76213&VID=3422142

  20. Reactome Database Assignment Method • Genes seem to be assigned to pathways in a manner similar to the GO database • If a gene is up-regulated, it is included • Genes that are down-regulated in a condition are NOT mapped to the condition/pathway • No official response has been received from reactome.org, but from general browsing this seems to be the case

  21. Pathway Analysis Tool http://www.reactome.org/ReactomeGWT/entrypoint.html#PathwayAnalysisDataUploadPage

  22. Pathway Analysis Tool http://www.reactome.org/ReactomeGWT/entrypoint.html#PathwayAnalysisDataUploadPage

  23. Expression Set Data Analysis

  24. Expression Set Data Analysis

  25. Summary • Reactome.db provides an interface to the SQL database containing IDs • Functions for converting between ID types • No functionality for gene testing through R • Online tools include pathway maps and ID lookup tables • Some limited expression testing (with unknown statistical methods)

  26. Questions?
