Hunting strategy of the bigcat

Presentation Transcript


  1. BiGCaT Bioinformatics Hunting strategy of the bigcat

  2. BiGCaT, bridge between two universities. TU/e: Ideas & Experience in Data Handling. Universiteit Maastricht: Patients, Experiments, Arrays and Loads of Data.

  3. Major Research Fields: Nutritional & Environmental Research, Cardiovascular Research.

  4. What are we looking for?

  5. What are we looking for? Different conditions show different levels of gene expression for specific genes

  6. Differences in gene expression? • Between e.g.: • healthy and sick • different stages of disease progression • different stages of healing • failed and successful treatment • more and less vulnerable individuals • Shows: • important pathways and receptors • which then can be influenced

  7. The transfer of information from DNA to protein. From: Alberts et al. Molecular Biology of the Cell, 3rd edn.

  8. Eukaryotic genes in somewhat more detail

  9. Gene expression measurement DNA → mRNA → protein. Functional genomics/transcriptomics: Changes in mRNA • Gene expression microarrays • Suppression subtraction libraries Proteomics: Changes in protein levels • 2D gel electrophoresis • Antibody arrays

  10. Gene expression arrays Macroarrays: absolute radioactive signal. Validation. Microarrays: relative fluorescence signals. Identification.

  11. Layout of a microarray experiment • Get the cells • Isolate RNA • Make fluorescent cDNA • Hybridize • Laser read out • Analyze image

  12. The cat and its prey: the data Comprises: • Known cDNA sequences (not known genes!) on the array = reporters • Data sets typically contain 20,000 image spot intensity values in 2 colors • One experiment often contains multiple data points for every reporter (e.g. times or treatments) • Each data point can (should) consist of multiple arrays Bioinformatics should translate this into useful biological information

  13. Hunting Comprises: • Analyze reporters • Data pretreatment • Finding patterns in expression • Evaluate biological significance of those patterns

  14. Reporter analysis • Reporter sequence must be known (can be sequenced using digest electrophoresis). • Look up sequence in genome databases (e.g. Genbank/Embl or Swissprot) • Will often find other RNA experiments (ESTs) or just chromosome location.

  15. Blast reporters against what? • Nucleotide databases (EMBL/Genbank) Disadvantages: many hits, best hit on clone, we actually want function (i.e. protein) • Nucleotide clusters (Unigene) Disadvantage: still no function • Protein databases (Swissprot+trEMBL) Disadvantages: non-coding sequence not found, frameshifts in clones

  16. Two implemented solutions • Start with Unigene (from Blastn or platform provider), mine using SRS (direct, through PDB, through PIR) -> Swissprot/trEMBL • Use dedicated EMBL-Swissprot X-linked DB (Blast against EMBL subset to get Swissprot/trEMBL)

  18. Scotland - Holland: 1-0? Check Affymetrix reporter sequences. • Each reporter has 16 25-mer probes. • Blast against ENSEMBL genes (takes 1 month on UK grid). • Use for cross-species analysis • Adapt RMA statistical analysis in Bioconductor

  19. Next slide shows data of one single actual microarray • Normalized expression shown for both channels. • Each reporter is shown as a single dot. • Red dots are controls. • Note the GEM barcode (QC). • Note the slight error in linear normalization (genes with low expression are higher in the Cy5 channel).
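The normalization remark on this slide can be illustrated with a small sketch. It assumes "linear normalization" means scaling one channel by a single global factor (here chosen so that the channel medians match); the slides do not say which factor was actually used, and the intensity values below are hypothetical. After scaling, the log2 ratios of the bright spots are centered on zero while the dim spots still lean toward Cy5, which is the kind of intensity-dependent bias the slide points out.

```python
import math
from statistics import median


def linear_normalize(cy3, cy5):
    """Scale Cy5 by one global factor so that both channel medians match.

    Assumption: median matching stands in for whatever linear
    normalization was actually applied to the array on the slide.
    """
    factor = median(cy3) / median(cy5)
    return [value * factor for value in cy5]


def log_ratios(cy3, cy5):
    """log2(Cy5/Cy3) for every spot; 0 means no change between channels."""
    return [math.log2(red / green) for green, red in zip(cy3, cy5)]


# Hypothetical spot intensities, ordered from dim to bright.
cy3 = [100.0, 200.0, 400.0, 800.0, 1600.0]
cy5 = [300.0, 500.0, 800.0, 1600.0, 3200.0]

cy5_norm = linear_normalize(cy3, cy5)
ratios = log_ratios(cy3, cy5_norm)

# Dim spots keep a positive log ratio after scaling; bright spots do not.
for green, ratio in zip(cy3, ratios):
    print(f"Cy3 intensity {green:7.1f}  log2(Cy5/Cy3) = {ratio:+.3f}")
```

A single scale factor cannot remove a bias that depends on intensity, which is why the residual shows up only at the low end.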

  20. Next slide shows same data after processing • Controls removed • Bad spots (<40% average area) removed • Low signals (<2.5 Signal/Background) removed • All reporters with <1.7 fold change removed (only changing spots shown)
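The four filters on this slide can be sketched as below. The thresholds (40% of average area, signal/background of 2.5, 1.7-fold change) come from the slide; everything else is an assumption made for illustration: the `Spot` record and its field names are hypothetical, the average area is taken over all spots on the array, each spot carries a single signal and a positive background value, and fold change is derived from a normalized Cy5/Cy3 ratio so that up- and down-regulation are treated symmetrically.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Spot:
    """Hypothetical per-spot record; field names are illustrative only."""
    reporter: str
    is_control: bool
    area: float        # spot area in pixels
    signal: float      # spot intensity (one value per spot, a simplification)
    background: float  # local background intensity, assumed > 0
    ratio: float       # normalized Cy5/Cy3 ratio


def fold_change(ratio):
    """Fold change as a magnitude: 2.0 for both a ratio of 2.0 and of 0.5."""
    return ratio if ratio >= 1.0 else 1.0 / ratio


def filter_spots(spots, min_area_fraction=0.40,
                 min_signal_to_background=2.5, min_fold=1.7):
    """Apply the slide's four filters in order and return the surviving spots."""
    average_area = mean(spot.area for spot in spots)
    kept = []
    for spot in spots:
        if spot.is_control:
            continue  # controls removed
        if spot.area < min_area_fraction * average_area:
            continue  # bad spot: too small
        if spot.signal / spot.background < min_signal_to_background:
            continue  # low signal relative to background
        if fold_change(spot.ratio) < min_fold:
            continue  # not changing enough
        kept.append(spot)
    return kept


# Hypothetical spots, one per filter plus two that pass.
spots = [
    Spot("ctrl_1", True, 100, 5000, 100, 1.0),
    Spot("rep_small", False, 20, 5000, 100, 3.0),
    Spot("rep_dim", False, 100, 200, 100, 3.0),
    Spot("rep_flat", False, 100, 5000, 100, 1.2),
    Spot("rep_up", False, 100, 5000, 100, 2.0),
    Spot("rep_down", False, 100, 5000, 100, 0.5),
]
kept_names = [spot.reporter for spot in filter_spots(spots)]
print(kept_names)
```

Keeping the thresholds as keyword arguments makes it easy to see how many reporters survive under stricter or looser cut-offs.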

  21. Final slide shows information for one single reporter • This signifies one single spot • It is a known gene: a UDP glucuronyltransferase • Raw data and fold change are shown

  22. Secondary Analyses • Gene clustering (find genes that behave equally) • Cluster evaluation (what do we see in clusters …) • Physiological evaluation (for arrays, proteomics, clusters) • Understand the regulation

  23. Clustering: find genes with same pattern. [Figure: left panel plots expression level against time; right panel plots T2 signal against T1 signal.] Left hand picture shows expression patterns for 2 genes (these should probably end up in the same cluster). Right hand picture shows the expression vector for one gene for the first 2 dimensions. Can be normalized by amplitude (circle) or relatively (square).
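A minimal sketch of the two ideas on this slide, under stated assumptions. The slide does not define the two normalizations, so the reading used here is a guess: "by amplitude (circle)" is taken as dividing the expression vector by its Euclidean length, which puts it on the unit circle, and "relatively (square)" as dividing by its largest absolute component, which puts it on the edge of the unit square. Pearson correlation is used as one common way to decide that two genes share a pattern; the slides do not name the similarity measure actually used, and the profiles are hypothetical.

```python
import math


def normalize_to_circle(vector):
    """Divide by Euclidean length (assumed meaning of 'by amplitude')."""
    length = math.sqrt(sum(v * v for v in vector))
    return [v / length for v in vector]


def normalize_to_square(vector):
    """Divide by the largest absolute component (assumed meaning of 'relatively')."""
    largest = max(abs(v) for v in vector)
    return [v / largest for v in vector]


def pearson(a, b):
    """Pearson correlation between two equally long expression profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return covariance / (spread_a * spread_b)


# Hypothetical time courses: same shape, different amplitude.
gene_a = [1.0, 2.0, 4.0, 2.0]
gene_b = [2.0, 4.0, 8.0, 4.0]
similarity = pearson(gene_a, gene_b)

# Expression vector of one gene in the first two dimensions (T1, T2).
on_circle = normalize_to_circle([3.0, 4.0])
on_square = normalize_to_square([3.0, 4.0])
print(similarity, on_circle, on_square)
```

Both normalizations remove the overall amplitude, so two genes that differ only in scale, like the pair above, end up at the same point and in the same cluster.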

  24. Cluster evaluation • Group genes (function, pathway, regulation etc.) • Find groups in patterns using visualization tools and automatic detection. • Should lead to results like: “This experiment shows that a large number of apoptosis genes are up-regulated during the early stage after treatment, probably meaning that cells are dying.”
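One way the "automatic detection" step could work is an over-representation test: given a cluster, ask whether a functional group such as apoptosis genes appears in it more often than chance would explain. The sketch below uses a one-sided hypergeometric test. This is an assumption, not the method named in the slides, and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, group_genes, cluster_size, group_in_cluster):
    """Probability of seeing at least `group_in_cluster` group members
    in a cluster of `cluster_size` genes drawn at random from
    `total_genes`, of which `group_genes` belong to the group.
    """
    upper = min(group_genes, cluster_size)
    favourable = sum(
        comb(group_genes, k) * comb(total_genes - group_genes, cluster_size - k)
        for k in range(group_in_cluster, upper + 1)
    )
    return favourable / comb(total_genes, cluster_size)


# Hypothetical numbers: 1000 reporters on the array, 50 annotated as
# apoptosis genes, and a cluster of 40 up-regulated reporters that
# contains 10 of them.
p_value = enrichment_p_value(1000, 50, 40, 10)
print(f"p = {p_value:.3g}")
```

A small p-value would support a statement like the one quoted on the slide, though with many functional groups tested a multiple-testing correction would be needed before drawing conclusions.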

  25. Example of GenMAPP results: Manual lookup on a MAPP

  26. Understanding regulation The main idea: co-regulated genes could have common regulatory pathways. The basic approach: annotate transcription factor binding sites using Transfac and use for supervised clustering. The problem: each gene has hundreds of tfb’s. Solution? Use syntenic regions using rVista (work in progress with Rick Dixon)

  27. Understanding QTLs Get blood pressure QTLs: from ENSEMBL/cfg Welcome group. Look up functional pathways and GO annotations using GenMAPP: virtual experiment, assume all genes in QTL are changing. Create a new blood pressure MAPP: confront this with real blood pressure/heart failure microarray data. Work in progress, TU/e MDP3 group.
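The "virtual experiment" can be sketched as a simple tally: treat every gene in the QTL as if it were changing and count how many of them fall on each pathway. This is only an illustration of the idea, not GenMAPP's own procedure; the function name, gene names, and pathway names below are all hypothetical placeholders.

```python
from collections import Counter


def virtual_experiment(qtl_genes, pathway_membership):
    """Count, per pathway, how many QTL genes it contains, assuming
    every gene in the QTL is changing. Genes without any pathway
    annotation are ignored.
    """
    counts = Counter()
    for gene in qtl_genes:
        for pathway in pathway_membership.get(gene, ()):
            counts[pathway] += 1
    return counts.most_common()  # pathways with the most QTL genes first


# Hypothetical annotation: gene -> pathways it is drawn on.
pathway_membership = {
    "gene_A": ["pathway_1"],
    "gene_B": ["pathway_1", "pathway_2"],
}
qtl_genes = ["gene_A", "gene_B", "gene_C"]  # gene_C has no annotation

ranking = virtual_experiment(qtl_genes, pathway_membership)
print(ranking)
```

Pathways at the top of such a ranking are candidates for the new MAPP, which can then be compared against real microarray data as the slide describes.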

  28. People involved BiGCaT Maastricht: Rachel van Haaften (IOP), Edwin ter Voert (BMT), Joris Korbeeck (BMT/UM), Willem Ligtenberg (IOP), Stan Gaj (tUL), Chris Evelo. TU/e: Peter Hilbers, Huub ten Eijkelder, Patrick van Brakel, lots of students. CARIM: Yigal Pinto, Umesh Sharma, Blanche Schroen, Matthijs Blankesteijn, Jos Smits, Jo de Mey, Danielle Curfs, Kitty Cleutjens, Natasja Kisters, Esther Lutgens, Birgit Faber, Petra Eurlings, Ann-Pascalle Bijnens, Mat Daemen, Frank Stassen, Marc van Bilssen, Marten Hoffker. NUTRIM: Wim Saris, Freddy Troost, Johan Renes, Simone van Breda. GROW: Daisy vd Schaft, Chamindie Puyandeera. IOP Nutrigenomics: Milka Sokolovic, Theo Hackvoort, Meike Bunger, Guido Hooiveld, Michael Müller, Lisa Gilhuis-Pedersen, Antoine van Kampen, Edwin Mariman, Wout Lamers, Nicole Franssen, Jaap Keijer. Cfg Welcome group: Neil Hanlon (Glasgow), Gontran Zepeda (Edinburgh), Rick Dixon (Leicester), Sheetal Patel (London). Paris leptin group: Soraya Taleb, Rafaelle Cancello, Nathalie Courtin, Carine Clement. Organon: Jan Klomp, Rene van Schaik. BioAsp: Marc Laarhoven.