
Bioinformatics Dr. Víctor Treviño vtrevino@itesm.mx



Presentation Transcript


  1. Reading and Pre-Processing Microarrays. Bioinformatics. Dr. Víctor Treviño, vtrevino@itesm.mx

  2. Data processing of Placental Microarrays
     • Dr. Hugo A. Barrera Saldaña
     • Paper in Mol. Med. 2007.
     • Exercise: Search PubMed for Trevino V

  3. Example 1: Differential Expression
     • Samples: Reference Pool, Placenta 1, Placenta 2
     • mRNA Extraction
     • Labelling: Red / Green; Green / Red (controls)
     • Microarray Hybridization (by duplicates)
     • Within Normalization (per array)
     • Between Normalization (all arrays)
     • Image Analysis; Scanning & Data Processing
     • Detection of Differentially Expressed Genes: t-test, H0: µ = 0; p-value correction: False Discovery Rate
     • Validation and Analysis: Comparison With Known Tissue Specific Genes (Dr. Hugo Barrera)

  4. [Figure: panels a, b, c: Placenta/Reference; panel d: Control/Control]

  5. [Figure, two panels]
     • (a) Microarray Experiment: Ratio (log2), axis from 10 to -6. Arrays: 51, 52, 56, 54. Labels: Placenta 2 Replicate 2, Placenta 1 Replicate 2, Placenta 2 Replicate 1, Placenta 1 Replicate 1
     • (b) T1dbase: T1 score (0 to 1) per tissue: Lung, Thalamus, Amygdala, Spinal Cord, Testis, Kidney, Liver, Pituitary, Thyroid, Cerebellum, Hypothalamus, Caudate Nucleus, Exocrine Pancreas, Lymph Node, Frontal Cortex, Stomach, Breast, Bone Marrow, Pancreatic Islets, Uterus, Ovary, Skin, Heart, Skeletal Muscle, Prostate, Thymus, Salivary Gland, Trachea, Placenta

  6. Data downloaded from URL: http://chipskipper.embl.de/iner-embo-course/index.htm
     • 2 dyes, 2 slides per assay (each containing different probes, same sample in both slides; oligo or cDNA arrays?). 48 grids, 24x24 spots
     • .grd files contain the "initial grid" specification for the slides
     • .adf files contain the "annotations" of the genes.
     • Files: 51, 52, 53, 54, 55, 56. 5xa is slide 1 and 5xb is slide 2 of each assay.
     • Some assays use the same RNA sample (technical replicates). See table in next slide.
     • One dye is Placental RNA and the other is a reference pool of RNA from different organs
     • GOALS:
       • Detect Differentially Expressed Genes
       • Focus on Placental Specific Genes (growth hormone family?)
     • Contact:
       • Dr. Hugo A. Barrera Saldana
       • (81) 83294050 ext. 2871, 2872, 2587
       • (81) 81238249 (private), 0448110778789 (mobile)
       • Secretario de Investigacion, Regulacion y Vinculacion
       • hbarrera@fm.uanl.mx

  7. Pending Questions:
     • Slides from group 1 and 2 should be 52 and 51; which is which?
     • Are the slides from Group 5 and 6 Control vs Control?
       • In that case we have only 2 independent samples
     • Group 5 should be slide 55, A and B, shouldn't it?

  8. Image Analysis: Download and use SpotFinder from the TM4 Suite
     • http://www.tm4.org
     • Download Images (51.zip or 55.zip from http://bioinformatica.mty.itesm.mx/?q=node/68)
     • Read BOTH Images together using SpotFinder
       • Mark file 1 as "Cy3" = Green
       • Mark file 2 as "Cy5" = Red
     • Create Grid
       • Metarows = 12, Metacolumns = 4
       • Rows = 24, Columns = 24
       • Pixels = 450 (of the 24 x 24 spots)
       • Spacing = 18 (between metacolumns and metarows)
     • Adjust each of the 48 Grids (12 metarows x 4 metacolumns) to correct positions
       • Right mouse button in a grid
       • Right mouse button in a blank section to move all grids
     • Save the grid

  9. Image Analysis: Use Gridding and Processing
     • Adjust (save grid first; on Mac, adjust doesn't work well)
     • Process
     • Copy images
       • 1 from the grid adjust
       • 1 from the RI plot
       • 1 from the data (figure)
       • 2 from the QC view (A and B)
       • What do they represent?
     • Export to .mev file
     • Open .mev file in Excel
     • Remove comment lines
     • Compute signal:
       • Signal A = Cy3 Green = MNA - MedBkgA = mean of spot A - median of background A
       • Signal B = Cy5 Red = MNB - MedBkgB = mean of spot B - median of background B
     • Plot Signal A vs Signal B
     • Copy image into a Word file
     • DO NOT SAVE THE modified .MEV FILE
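The signal computation described in slide 9 can also be done outside Excel. The following is a minimal Python sketch, assuming the .mev export is tab-separated, that its comment lines start with "#", and that the first remaining line is the column header; the sample rows are invented for illustration and are not taken from the course data.

```python
import csv
import io


def read_mev_signals(text):
    """Return (UID, signal A, signal B) tuples from .mev-style text.

    Assumptions (not confirmed by the slides): comment lines start
    with '#', and the remaining lines are tab-separated with a header.
    """
    # Drop comment and blank lines, as the slide does by hand in Excel.
    data_lines = [ln for ln in text.splitlines()
                  if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)),
                            delimiter="\t")
    signals = []
    for row in reader:
        # Signal = spot mean minus median background, per channel.
        signal_a = float(row["MNA"]) - float(row["MedBkgA"])  # Cy3, green
        signal_b = float(row["MNB"]) - float(row["MedBkgB"])  # Cy5, red
        signals.append((row["UID"], signal_a, signal_b))
    return signals


# Hypothetical two-spot example, not real data.
example = (
    "# hypothetical comment line\n"
    "UID\tMNA\tMNB\tMedBkgA\tMedBkgB\n"
    "1\t1200\t2400\t200\t400\n"
    "2\t500\t450\t100\t150\n"
)
signals = read_mev_signals(example)
for uid, sig_a, sig_b in signals:
    print(uid, sig_a, sig_b)
```

Because the function only reads the text, the original .mev file is left unmodified, in line with the last bullet of the slide.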

  10. Results
     • Upload the .mev file to Google Groups, identifying the slide name and team
     • Next week, we will process all your uploaded data

  11. Columns within .MEV File • UID • IA • IB • R • C • MR (used for Print-tip Normalization) • MC (used for Print-tip Normalization) • SR • SC • FlagA • FlagB • SA • SF • QC • QCA • QCB • BkgA • BkgB • SDA • SDB • SDBkgA • SDBkgB • MedA • MedB • MNA (Signal Ch. A = Cy3 [Green]) • MNB (Signal Ch. B = Cy5 [Red]) • MedBkgA (Background Ch. A) • MedBkgB (Background Ch. B) • X • Y • PValueA • PValueB
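To illustrate why MR and MC are flagged for print-tip normalization: each (MR, MC) pair identifies one grid, printed by one tip, so a per-grid correction can be applied to the log ratios. The sketch below is a simplified stand-in that only median-centres the log ratio within each grid; the usual print-tip method fits a LOWESS curve per grid instead. The dictionary layout and the key "M" are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def center_by_grid(spots):
    """Median-centre log ratios within each (MR, MC) grid.

    spots: list of dicts with keys 'MR', 'MC' and 'M' (log ratio).
    This is a simplified stand-in for print-tip LOWESS normalization.
    """
    by_grid = defaultdict(list)
    for spot in spots:
        by_grid[(spot["MR"], spot["MC"])].append(spot["M"])
    grid_median = {grid: median(vals) for grid, vals in by_grid.items()}
    # Return new dicts so the input data is left untouched.
    return [dict(spot, M=spot["M"] - grid_median[(spot["MR"], spot["MC"])])
            for spot in spots]


# Hypothetical spots from two grids, not real data.
spots = [
    {"MR": 1, "MC": 1, "M": 1.0},
    {"MR": 1, "MC": 1, "M": 2.0},
    {"MR": 1, "MC": 1, "M": 3.0},
    {"MR": 1, "MC": 2, "M": 0.5},
    {"MR": 1, "MC": 2, "M": 1.5},
]
centered = center_by_grid(spots)
print([s["M"] for s in centered])
```

Whether such a correction is needed depends on the data; it only makes sense when a per-grid bias is actually visible.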

  12. Columns within GenePix .GPR File • Block (used for Print-tip Normalization) • Column • Row • Name • ID • X • Y • Dia. • F635 Median • F635 Mean • F635 SD • B635 Median • B635 Mean • B635 SD • % > B635 + 1 SD • % > B635 + 2 SD • F635 % Sat. • F532 Median • F532 Mean • F532 SD • B532 Median • B532 Mean • B532 SD • % > B532 + 1 SD • % > B532 + 2 SD • F532 % Sat. • Ratio of Medians • Ratio of Means • Median of Ratios • Mean of Ratios • Ratios SD • Rgn Ratio • Rgn R² • F Pixels • B Pixels • Sum of Medians • Sum of Means • Log Ratio • Flags • Normalize • F1 Median - B1 • F2 Median - B2 • F1 Mean - B1 (Signal - Background) • F2 Mean - B2 (Signal - Background) • SNR 1 • F1 Total Intensity • Index • "User Defined"
     Source: http://www.moleculardevices.com/pages/software/gn_genepix_file_formats.html#gpr

  13. Normalization – "Easy" Options • www.gepas.org • MIDAS TM4

  14. MIDAS TM4 (http://www.tm4.org/midas.html)
     • Project → New
     • Read Data → Single Data File
     • Specify your .mev file
     • Oper → Normalization
     • LOWESS
     • Write Output
     • No virtual
     • Execution
     • Reports → PDF

  15. MIDAS TM4

  16. MIDAS - Preview Results: click, right-button, plot

  17. MIDAS - Problems
     • Only ~9,000 data points generated for 54a
     • Output is different: ChipSkipper + R (Bioconductor) vs. SpotFinder + MIDAS
     • This problem shows that the right software with the right parameters is needed for each experiment (ChipSkipper was designed by the microarray slide provider).

  18. 51a.txt

  19. 51b.txt

  20. 56a.txt

  21. 56b.txt

  22. 52a.txt

  23. 52b.txt Same Sample?? Same Image?? Same Scan??

  24. 55A.txt controls

  25. 55B.txt controls

  26. 53A.txt controls

  27. 53B.txt controls

  28. 54a.txt

  29. 54b.txt

  30. Summary
     • 2 independent samples
     • 51a+52a, 54a+56a
     • 51b, 54b+56b (52b has problems)
     • It seems that no bias is present per subgrid (not shown)
     • Raw values will be used (not normalized)

  31. g51a is a bit different from g52a; g52a seems to be more "noisy". 54a and 56a look more correlated in both g and r. (This was computed by normalizing each channel independently.)

  32. Averages = [Log(Cy3) + Log(Cy5)] / 2

  33. M (ratios) = Log("Cy5" / "Cy3") = Log(Sample/Reference)
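The two quantities defined in slides 32 and 33 (average intensity and ratio) can be computed per spot as in the sketch below. It assumes base-2 logarithms, since the slides write "Log" without a base but label the ratio axis as log2 elsewhere, and it assumes that spots with a non-positive signal in either channel are skipped; both choices are assumptions for this example.

```python
import math


def ma_values(cy3, cy5):
    """Return (M, A) for one spot, or None if a signal is not positive.

    M = log2(Cy5 / Cy3)             (sample over reference)
    A = (log2(Cy3) + log2(Cy5)) / 2 (average log intensity)
    Base 2 is an assumption; the slides only say "Log".
    """
    if cy3 <= 0 or cy5 <= 0:
        # The logarithm is undefined here (e.g. background above signal).
        return None
    m = math.log2(cy5 / cy3)
    a = (math.log2(cy3) + math.log2(cy5)) / 2
    return m, a


# Hypothetical spot: Cy5 is four times Cy3, so M is 2 on the log2 scale.
print(ma_values(256, 1024))
print(ma_values(0, 1024))
```

Plotting M against A for all spots of a slide gives the familiar MA plot used to inspect intensity-dependent bias.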

  34. Genes selected, slides A: t-test vs. mean = 0, FDR <= 10%, fold >= 2
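The selection rule in slide 34 can be sketched as follows. The code computes the one-sample t statistic against a mean of 0, applies a Benjamini-Hochberg adjustment as one common way to control the False Discovery Rate (the slides do not say which FDR procedure was used), and keeps genes with adjusted p-value <= 0.10 and at least a 2-fold change in either direction. The p-values are taken as given inputs here (they could come from a t distribution, for instance via scipy.stats.ttest_1samp); the gene names, numbers, and the two-sided reading of "fold >= 2" are assumptions.

```python
import math
from statistics import mean, stdev


def t_statistic(log_ratios):
    """One-sample t statistic for H0: mean log ratio = 0."""
    n = len(log_ratios)
    return mean(log_ratios) / (stdev(log_ratios) / math.sqrt(n))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, fdr=0.10, min_fold=2.0):
    """genes: list of (name, mean log2 ratio, p-value) tuples."""
    qvalues = bh_adjust([p for _, _, p in genes])
    cutoff = math.log2(min_fold)  # 2-fold equals 1 on the log2 scale
    return [name for (name, m, _), q in zip(genes, qvalues)
            if q <= fdr and abs(m) >= cutoff]


# Hypothetical genes and p-values, not real results.
genes = [("g1", 2.5, 0.001), ("g2", 0.3, 0.004),
         ("g3", -1.8, 0.03), ("g4", 1.2, 0.5)]
selected = select_genes(genes)
print(selected)
print(round(t_statistic([1.0, 2.0, 3.0]), 3))
```

In this invented example g2 passes the FDR threshold but fails the fold filter, and g4 passes the fold filter but fails the FDR threshold, which shows why the two criteria are applied together.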

  35. Next Session: Mon 21, 6-9 pm; Thu 24

  36. Paper for Next Session: Maru "AND" Perla
