
GxDb: a universal tool to collect, analyse, manage and visualize transcriptomic data

Wolfgang Raffelsberger, Raymond Ripp and Laetitia Poidevin. BingGi Days, January 2010.


Presentation Transcript


  1. GxDb: a universal tool to collect, analyse, manage and visualize transcriptomic data. Wolfgang Raffelsberger, Raymond Ripp and Laetitia Poidevin. BingGi Days, January 2010

  2. Introduction
     • What is transcriptomics? A high-throughput analysis of gene expression by measuring the amount of mRNA.
     • What are the techniques? DNA microarrays, SAGE, Differential Display, …
     • Result: large quantities of data.
     • GxDb: an integrative tool to collect, treat, analyze, manage and visualize these data.

  3. GxDb is a website and a database

  4. Organization of data in GxDb
     • Arraytype (ex: Mouse430_2)
     • Sample (ex: mouse wt aged 9 days), described by: Organism, Individual (name, age, description), SampleCondition, Genotype, Tissue, Treatment
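The entities named on this slide can be pictured as plain records. The following is a minimal sketch, not the actual GxDb schema: the class layout, field types and the example tissue value are assumptions made for illustration; only the entity names and the `Mouse430_2` / "wt aged 9 days" examples come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Individual:
    name: str
    age: str                 # free text, e.g. "9 days"
    description: str = ""


@dataclass(frozen=True)
class Sample:
    name: str
    organism: str
    individual: Individual
    genotype: str
    tissue: str
    sample_condition: str = ""
    treatment: Optional[str] = None   # None when the sample is untreated


@dataclass(frozen=True)
class Arraytype:
    name: str                # e.g. "Mouse430_2"


# Illustrative values only; "retina" is an assumed tissue.
arraytype = Arraytype("Mouse430_2")
wt_d9 = Sample(
    name="wt_d9",
    organism="mouse",
    individual=Individual(name="mouse wt", age="9 days"),
    genotype="wt",
    tissue="retina",
)
```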

  5. Organization of data in GxDb [Diagram: RealExp 1 to 4, each associating an Arraytype (ex: Mouse430_2) and a Sample (ex: wt_d9, wt_d11, wt_d13, wt_d15) with replicate CEL files (r1 to r11)]

  6. Organization of data in GxDb [Diagram: an Experiment groups several RealExp (each with Arraytype, Sample and CEL files); the Treatment and Analysis protocol produces Signal Intensity, Ratio, Cluster, differentially expressed genes and Quality]

  7. Treatment and Analysis protocol
     1) Normalization. 6 methods: RMA, gcRMA, dChip, MAS5.0, plier, vsn => signal intensity
     2) Calculate average (between replicates) and ratio
     3) Filtering
        - Eliminate probesets that are not expressed in any array of one experiment, based on distribution or call (according to the normalization method)
        - Eliminate probesets with very low changes between condition and reference: based on fold change; based on standard deviation
     4) Statistical analysis
        - Method: t-test combined with empirical Bayes for shrinkage
        - Estimation of FDR (false discovery rate)
        - Tag probesets with differential expression (automatic threshold finding)
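The filtering and FDR steps above can be sketched as follows. This is an illustration under stated assumptions, not the GxDb pipeline: the thresholds and probeset names are hypothetical, the Benjamini-Hochberg procedure is used here as one common way to estimate FDR (the slide does not name the method), and the moderated t-test itself is not reproduced.

```python
def drop_never_expressed(signals, min_signal):
    """Keep probesets whose signal reaches min_signal in at least one array."""
    return {p: v for p, v in signals.items() if max(v) >= min_signal}


def drop_low_change(log2_ratios, min_abs_log2):
    """Keep probesets whose absolute log2 ratio reaches the threshold."""
    return {p: r for p, r in log2_ratios.items() if abs(r) >= min_abs_log2}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical log2-scale signals for two probesets across three arrays.
signals = {"ps_1": [3.1, 3.0, 2.9], "ps_2": [9.5, 10.1, 9.8]}
expressed = drop_never_expressed(signals, min_signal=5.0)
changed = drop_low_change({"ps_2": -2.0, "ps_3": 0.1}, min_abs_log2=1.0)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Probesets would then be tagged as differentially expressed when their adjusted p-value falls below a chosen cut-off; how GxDb finds that threshold automatically is not described on the slide.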

  8. Treatment and Analysis protocol
     1) Normalization
     2) Calculate average (replicates) and ratio
     3) Filtering
     4) Statistical analysis
     5) Clustering. Tool: Cluspack. Methods: k-means (DPC), mixture models (AIC and BIC) => clusters
     6) Quality Control Report. Tool: RReportGenerator for automatic statistical analysis, to estimate the quality of arrays
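To make the clustering step concrete, here is a minimal plain k-means sketch over expression profiles. It is not Cluspack and does not implement the DPC variant or the mixture models named on the slide; the naive initialisation (first k profiles) and the example profiles are assumptions chosen to keep the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=20):
    """Plain k-means; centroids start at the first k profiles (naive choice)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every profile.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical ratio profiles over three time points: two rising, two falling.
profiles = [
    [0.0, 1.0, 2.0],
    [5.0, 4.0, 3.0],
    [0.1, 1.1, 2.1],
    [5.2, 4.1, 2.9],
]
labels, centroids = kmeans(profiles, k=2)
```

A real run would also need a way to choose k; for mixture models the slide points to AIC and BIC for that purpose.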

  9. Upload form

  10. Upload form Step 1: Selection of Arraytype and Experiment

  11. Upload form Step 1: Create your new experiment

  12. Upload form Step 1: Create your new samples (Organism, Genotype, SampleCondition, Individual, TreatmentType, Treatment, Tissue, Sample)

  13. Upload form Step 1: Selection of Arraytype and Experiment

  14. Upload form Step 2: Upload of .cel files

  15. Upload form Step 3: Select the corresponding sample for each .cel file

  16. Upload form Step 4: Select the comparisons of interest for the ratio calculation. Ratio: condition / reference. Example: C3H_rd1_d10 / C3H_wt_d10
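The condition / reference ratio can be illustrated with a short sketch. The slide only states the ratio definition and the sample names; averaging replicates on the linear scale, reporting a log2 ratio alongside it, and the probeset name and signal values are assumptions for illustration.

```python
import math
from statistics import mean


def ratio(condition_reps, reference_reps):
    """Ratio of the replicate means: condition / reference (linear scale)."""
    return mean(condition_reps) / mean(reference_reps)


# Hypothetical linear-scale signal intensities, three replicates per sample.
signals = {
    "C3H_rd1_d10": {"probeset_A": [200.0, 220.0, 180.0]},
    "C3H_wt_d10": {"probeset_A": [800.0, 760.0, 840.0]},
}

r = ratio(
    signals["C3H_rd1_d10"]["probeset_A"],
    signals["C3H_wt_d10"]["probeset_A"],
)
log2_r = math.log2(r)   # negative when the condition is below the reference
```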

  17. Upload form Step 5: Launch Treatment and Analysis protocol

  18. Upload form Step 5: Clustering, quality analysis and loading into the database

  19. Organization of data in GxDb [Diagram: Arraytype-Probeset, Cel file, Sample, RealExp, Experiment, Signal Intensity, Ratio, differentially expressed genes, Clustering, Quality]

  20. Query GxDb

  21. Query GxDb: Experiment, Sample, Probeset, RealExp, Signal Intensity, Ratio, Cluster
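A query of this kind could look like the sketch below, which selects probesets of one experiment whose ratio lies within a given range. The table layout is hypothetical and uses an in-memory SQLite database purely for illustration; the real GxDb SQL schema is not shown in the slides. Only the column names `probeset_id` and `genename` are taken from the probeset field list elsewhere in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE probeset (probeset_id TEXT PRIMARY KEY, genename TEXT);
    CREATE TABLE ratio (
        probeset_id TEXT REFERENCES probeset(probeset_id),
        experiment  TEXT,
        ratio       REAL
    );
    """
)
# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO probeset VALUES (?, ?)",
    [("ps_1", "GeneA"), ("ps_2", "GeneB")],
)
conn.executemany(
    "INSERT INTO ratio VALUES (?, ?, ?)",
    [("ps_1", "exp_demo", 0.25), ("ps_2", "exp_demo", 1.1)],
)


def probesets_in_ratio_range(conn, experiment, ratio_min, ratio_max):
    """Return (probeset_id, genename, ratio) rows within [ratio_min, ratio_max]."""
    cursor = conn.execute(
        """
        SELECT p.probeset_id, p.genename, r.ratio
        FROM ratio AS r
        JOIN probeset AS p ON p.probeset_id = r.probeset_id
        WHERE r.experiment = ? AND r.ratio BETWEEN ? AND ?
        ORDER BY r.ratio
        """,
        (experiment, ratio_min, ratio_max),
    )
    return cursor.fetchall()


down_regulated = probesets_in_ratio_range(conn, "exp_demo", 0.0, 0.5)
```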

  22. Visualization in GxDb: time-course of retinal development

  23. GxDb resources
      Labels shown on the slide: Star3, Star4, Star5, Star6, Star7, Star8, alnitak
      Languages used:
      - PHP (HTML): Upload, PipeWork, RadarGenerator, Fed
      - R: Treatment and analysis protocol, RReportGenerator
      - SQL
      - Tcl: Gx (~ Gscope), Probeset loading
      - C: Cluspack
      Components: /GxData, GxDb SQL database, Web Services, Café des sciences, QSub scheduler
      GxDb Website (http://gx.igbmc.fr): Upload, Querying, Display

  24. Conclusion and Prospects
      • Automated raw-data upload, storage, treatment and analysis
        • multiple treatment protocols
        • multiple clustering methods
        • multiple human and automatic expert analyses
        => Comparisons
        => Analyse the strengths and weaknesses of the different protocols
      • Improvement of the website
        • More user-friendly
        • Visualization of clusters and ratios
        • Tools for meta-analysis
        • Possibility to upload data directly from GEO
        • Diagnostic report to make data analysis easier
        • Links to other databases and tools: STRING, GSEA, ...

  25. Ratio PipeWork [Form fields: Organism, Normalization, Ratio minimum, Ratio maximum]

  26. Advantages of GxDb
      • Integration and storage in a unifying format
      • Automated raw-data upload, storage, treatment and analysis
        • multiple treatment protocols
        • multiple clustering methods
        • multiple human and automatic expert analyses
        => Comparisons
        => Analyse the strengths and weaknesses of the different protocols
      • Facilitated querying and data visualization

  27. GxDb transcriptomics [Diagram: RealExp 1 to 4, each with Arraytype, Sample and CEL files (r1 to r11)]

  28. [Diagram: an Experiment groups RealExp 1 to 4, each with Arraytype, Sample and CEL files, and yields Signal Intensity, Ratio and Cluster]
      • Arraytype: PROBESET records (the slide shows the figure 45000), each with the fields: probeset_id, genename, genedescription, species, speciessymbol, representpublicid, refseqtranscriptid, gscope_id, swissprot, unigene_id, entrezgene, ensembl, mgi, cytoband, chromoloc, omim, tissuespecificity, linkeddiseases, go_biologicalprocess, go_cellularcomponent, go_molecularfunction, pathway, interpro, transmembrane
      • Sample: Individual (name, age, description), Tissue, SampleCondition, Treatment, Organism, Genotype

  29. GxDb protocol from upload to display
      • Arraytype: already exists? If not, create a new Arraytype
      • Sample: already exists? If not, create a new Sample with an existing or new Individual, Organism, Tissue, Genotype and Treatment
      • Upload your .CEL files
      • Enter their association to Arraytypes and Samples
      • Define couples of RealExps for the ratio calculation
      • Fill in the other information for the Experiment
      • Run Automatic Analysis: Signal Intensity, Ratio, Differentially Expressed Genes, Cluster, Quality Report
      • Query and Display Results
