
Develop mathematical, statistical, and computational methods

Develop mathematical, statistical, and computational methods to analyse biologically or technologically novel experiments in order to understand disease-relevant regulatory and genetic interaction networks.


Presentation Transcript


  1. Develop mathematical, statistical, and computational methods to analyse biologically or technologically novel experiments in order to understand disease-relevant regulatory and genetic interaction networks

  2. What we do
     o High-density microarrays for RNA transcription and protein-DNA binding
       - Regulatory networks in heart development: Jörn Tödling (with Silke Sperling, MPI Molecular Genetics)
       - Fundamentals of transcription and genetics in yeast: Matt Ritchie (with Lars Steinmetz, EMBL)
     o High-throughput RNAi assays, high-content automated microscopy, genetic interaction networks: Oleg Sklyar, Ligia Bras, Thomas Horn (with Michael Boutros, DKFZ; Robert Gentleman, FHCRC Seattle; Amy Kiger, UCSD)
     o Bioconductor

  3. Finding differentially expressed genes with microarrays

  4. log-ratio which genes are differentially transcribed? same-same tumor-normal

  5. Statistics 101: bias ↔ accuracy, precision ↔ variance

  6. Basic dogma of data analysis: You can always increase sensitivity at the cost of specificity, or vice versa; the art is to find the best trade-off.
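The trade-off can be illustrated with a small sketch. The scores and labels below are made up for illustration (they are not data from the talk), and the function name is hypothetical: a gene is called "changed" when its score reaches a threshold, and moving the threshold trades sensitivity against specificity.

```python
# Hypothetical scores (e.g. absolute log-ratios) and ground-truth labels.
scores = [0.1, 0.3, 0.4, 0.8, 0.9, 1.2, 1.5, 2.0]
truly_changed = [False, False, True, False, True, True, True, True]


def sensitivity_specificity(scores, labels, threshold):
    """Call a gene 'changed' if score >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l)
    tn = sum(1 for s, l in zip(scores, labels) if s < threshold and not l)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and not l)
    return tp / (tp + fn), tn / (tn + fp)


# A stricter threshold lowers sensitivity and raises specificity.
for threshold in (0.2, 0.5, 1.0):
    sens, spec = sensitivity_specificity(scores, truly_changed, threshold)
    print(f"threshold={threshold:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Sweeping the threshold over all values and plotting sensitivity against 1 - specificity gives a ROC curve.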

  7. ratios and fold changes: Fold changes are useful to describe continuous changes in expression. But what if the gene is “off” (below detection limit) in one condition? [Figure: expression levels in conditions A, B and C, annotated with fold changes x1.5, x3 and ?]

  8. ratios and fold changes
     The idea of the log-ratio (base 2):
     0: no change
     +1: up by factor of 2^1 = 2
     +2: up by factor of 2^2 = 4
     -1: down by factor of 2^-1 = 1/2
     -2: down by factor of 2^-2 = 1/4
     A unit for measuring changes in expression: assumes that a change from 1000 to 2000 units has a similar biological meaning to one from 5000 to 10000.
     What about a change from 0 to 500?
     - conceptually
     - noise, measurement precision
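A minimal sketch of the base-2 log-ratio, using the intensities mentioned on the slide. The function name is hypothetical, and raising an error for non-positive intensities is one possible design choice; it makes the "0 to 500" problem explicit instead of returning infinity.

```python
import math


def log2_ratio(before, after):
    """Base-2 log-ratio of `after` relative to `before`; both must be positive."""
    if before <= 0 or after <= 0:
        raise ValueError("log-ratio is undefined for non-positive intensities")
    return math.log2(after / before)


print(log2_ratio(1000, 2000))   # doubling at low intensity
print(log2_ratio(5000, 10000))  # doubling at high intensity: same log-ratio
print(log2_ratio(2000, 500))    # down by a factor of 4

try:
    log2_ratio(0, 500)          # gene "off" in one condition
except ValueError as err:
    print("0 -> 500:", err)
```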

  9. ratio compression. Yue et al. (Incyte Genomics), NAR (2001) 29, e41

  10. Sources of variation
      o amount of RNA in the biopsy
      o efficiencies of
        - RNA extraction
        - reverse transcription
        - labeling
        - fluorescent detection
      o probe purity and length distribution
      o spotting efficiency, spot size
      o cross-/unspecific hybridization
      o stray signal
      Systematic (addressed by calibration)
      o similar effect on many measurements
      o corrections can be estimated from data
      Stochastic (addressed by an error model)
      o too random to be explicitly accounted for
      o remain as “noise”

  11. Modeling ansatz: measured intensity = offset + gain × true abundance
      o a_i: per-sample offset
      o ε_ik ~ N(0, b_i^2 s_1^2): “additive noise”
      o b_i: per-sample normalization factor
      o b_k: sequence-wise probe efficiency
      o η_ik ~ N(0, s_2^2): “multiplicative noise”

  12. The two-component model: “multiplicative” noise and “additive” noise. [Figure: panels on raw scale and log scale] B. Durbin, D. Rocke, JCB 2001
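The behaviour of the two noise components can be sketched with a small simulation. This is an illustration under stated assumptions, not the authors' code: the intensity is taken to be y = ε + exp(η) · x with offset 0 and gain 1, and the noise levels `S_ADD` and `S_MULT` are made-up values.

```python
import math
import random
import statistics

S_ADD = 20.0   # assumed sd of the additive noise (s_1, with b_i = 1)
S_MULT = 0.1   # assumed sd of the multiplicative noise (s_2)


def simulate(x, n, rng):
    """Draw n intensities y = eps + exp(eta) * x (assumed offset 0, gain 1)."""
    return [rng.gauss(0.0, S_ADD) + math.exp(rng.gauss(0.0, S_MULT)) * x
            for _ in range(n)]


rng = random.Random(0)               # fixed seed for reproducibility
low = simulate(0.0, 5000, rng)       # gene that is "off"
high = simulate(10000.0, 5000, rng)  # strongly expressed gene

# Raw scale: the spread grows with the signal (multiplicative noise dominates).
sd_raw_low = statistics.stdev(low)
sd_raw_high = statistics.stdev(high)

# Log scale: fine at high intensity, but undefined for non-positive values.
frac_nonpositive_low = sum(1 for y in low if y <= 0) / len(low)
sd_log_high = statistics.stdev([math.log(y) for y in high])

print(f"raw sd at x=0:      {sd_raw_low:.1f}")
print(f"raw sd at x=10000:  {sd_raw_high:.1f}")
print(f"log sd at x=10000:  {sd_log_high:.3f}")
print(f"fraction of y <= 0 at x=0: {frac_nonpositive_low:.2f}")
```

Under these assumptions the model implies a raw-scale standard deviation near `S_ADD` for the "off" gene and near `S_MULT` times the signal for the strongly expressed one, while on the log scale roughly half of the "off" measurements cannot be transformed at all.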

  13. variance stabilizing transformations: X_u is a family of random variables with E(X_u) = u, Var(X_u) = v(u). Define a transformation f such that Var f(X_u) is approximately independent of u. Derivation: linear approximation.
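The derivation by linear approximation is the standard delta-method argument, sketched here. The example variance function at the end is an assumption chosen to match an additive-plus-multiplicative noise model with standard deviations s_1 and s_2; it is not stated on this slide.

```latex
% Linear (first-order Taylor) approximation around the mean u:
f(X_u) \approx f(u) + f'(u)\,(X_u - u)
\quad\Longrightarrow\quad
\operatorname{Var} f(X_u) \approx f'(u)^2\, v(u).

% Requiring the right-hand side to be constant (here 1) gives
f'(u) = \frac{1}{\sqrt{v(u)}}
\quad\Longrightarrow\quad
f(x) = \int^{x} \frac{du}{\sqrt{v(u)}}.

% Example (assumed variance function):
v(u) = s_1^2 + s_2^2 u^2
\quad\Longrightarrow\quad
f(x) = \frac{1}{s_2}\,\operatorname{asinh}\!\left(\frac{s_2\,x}{s_1}\right).
```

For a purely proportional standard deviation, v(u) = s_2^2 u^2, the same recipe yields the logarithm, which is why the log transformation works well at high intensities.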

  14. variance stabilizing transformations [Figure: plot of the transformation f(x) against x]

  15. the “glog” transformation: f(x) = log(x) (dashed line) and h_s(x) = asinh(x/s) (solid line). P. Munson, 2001; D. Rocke & B. Durbin, ISMB 2002
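A short sketch of how h_s(x) = asinh(x/s) relates to the logarithm. It relies on the identity asinh(z) = log(z + sqrt(z^2 + 1)), so that for x much larger than s the transformation approaches log(2x/s), while near zero it is approximately linear and stays defined for zero and negative values. The scale `s = 100` is a hypothetical value.

```python
import math


def glog(x, s=1.0):
    """h_s(x) = asinh(x / s) = log(x/s + sqrt((x/s)^2 + 1))."""
    return math.asinh(x / s)


s = 100.0  # hypothetical scale parameter
for x in (-50.0, 0.0, 50.0, 1e3, 1e5):
    # The plain logarithm only exists for positive x.
    shifted_log = math.log(2 * x / s) if x > 0 else float("nan")
    print(f"x={x:>9.1f}  glog={glog(x, s):8.4f}  log(2x/s)={shifted_log:8.4f}")
```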

  16. generalized log-ratio [Figure: comparison of difference, log-ratio and generalized log-ratio; variance split into a constant part and a proportional part; panels: raw scale, log, glog]

  17. “usual” log-ratio versus 'glog' (generalized log-ratio): c1, c2 are experiment-specific parameters (~level of background noise)
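The formulas on this slide did not survive in the transcript. The sketch below assumes the common form of the generalized log-ratio, log((x1 + sqrt(x1^2 + c1^2)) / (x2 + sqrt(x2^2 + c2^2))), which equals asinh(x1/c) - asinh(x2/c) when c1 = c2 = c. The function names and the value c = 100 are hypothetical.

```python
import math


def log_ratio(x1, x2):
    """Usual log-ratio (natural log); requires x1 > 0 and x2 > 0."""
    return math.log(x1 / x2)


def glog_ratio(x1, x2, c1, c2):
    """Assumed form of the generalized log-ratio with parameters c1, c2 > 0."""
    return math.log((x1 + math.sqrt(x1 * x1 + c1 * c1))
                    / (x2 + math.sqrt(x2 * x2 + c2 * c2)))


c = 100.0  # hypothetical background-noise level, same for both channels

# High intensities: both measures agree closely.
print(log_ratio(20000, 10000), glog_ratio(20000, 10000, c, c))

# Low intensities: the generalized log-ratio is pulled towards zero.
print(log_ratio(40, 20), glog_ratio(40, 20, c, c))

# Gene "off" in one condition: the usual log-ratio is undefined,
# the generalized log-ratio stays finite.
print(glog_ratio(20, 0, c, c))
```

Because the derivative of asinh(x/c) is 1/sqrt(x^2 + c^2), which is smaller than 1/x, the generalized log-ratio of two positive intensities is never larger in magnitude than the usual log-ratio.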

  18. Variance-bias trade-off [Figure: estimated log-fold-change against signal intensity, for log and glog]

  19. Variance-bias trade-off and shrinkage estimators
      Shrinkage estimators: pay a small price in bias for a large decrease of variance, so overall the mean-squared-error (MSE) is reduced. Particularly useful if you have few replicates.
      Generalized log-ratio = a shrinkage estimator for fold change.
      There are many possible choices, one is stabilization of variance:
      + interpretability even in cases where a gene is off in some conditions
      + can subsequently use standard statistical methods (hypothesis testing, ANOVA, clustering, classification…) with fewer worries about uneven variances
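The variance-stabilizing choice can be checked with a small simulation. As before, this is a sketch under assumptions: intensities follow y = ε + exp(η) · x with made-up noise levels, and the glog parameter is set to c = `S_ADD` / `S_MULT`, the value the delta-method argument suggests for this model.

```python
import math
import random
import statistics

S_ADD = 20.0          # assumed sd of the additive noise
S_MULT = 0.1          # assumed sd of the multiplicative noise
C = S_ADD / S_MULT    # glog parameter suggested by these noise levels


def simulate(x, n, rng):
    """Draw n intensities y = eps + exp(eta) * x (assumed offset 0, gain 1)."""
    return [rng.gauss(0.0, S_ADD) + math.exp(rng.gauss(0.0, S_MULT)) * x
            for _ in range(n)]


def glog(y, c=C):
    """Generalized log transformation asinh(y / c)."""
    return math.asinh(y / c)


rng = random.Random(1)  # fixed seed for reproducibility
sd_raw = {}
sd_glog = {}
for x in (0.0, 200.0, 1000.0, 10000.0):
    y = simulate(x, 5000, rng)
    sd_raw[x] = statistics.stdev(y)
    sd_glog[x] = statistics.stdev([glog(v) for v in y])
    print(f"x={x:>8.0f}  raw sd={sd_raw[x]:8.1f}  glog sd={sd_glog[x]:.3f}")
```

On the raw scale the spread grows strongly with the signal, whereas after the transformation it should stay close to `S_MULT` across the whole intensity range, which is what allows standard methods that assume even variances.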

  20. evaluation: a benchmark for Affymetrix genechip expression measures
      o Data:
        - Spike-in series: from Affymetrix, 59 x HGU95A, 16 genes, 14 concentrations, complex background
        - Dilution series: from GeneLogic, 60 x HGU95Av2, liver & CNS cRNA in different proportions and amounts
      o Benchmark: 15 quality measures regarding
        - reproducibility
        - sensitivity
        - specificity
      Put together by Rafael Irizarry (Johns Hopkins), http://affycomp.biostat.jhsph.edu

  21. good affycomp results (28 Sep 2003) bad

  22.  ROC curves

  23. Availability
      o implementation in R
      o open source package vsn on www.bioconductor.org
      o the next step: sequence-specific normalization (till here, just array-specific)
      o ideas of shrinkage and variance stabilization are now used in mainstream preprocessing programs for Affymetrix data (PLIER, GCRMA)

  24. Acknowledgements
      o MPI Molekulare Genetik: Anja von Heydebreck, Martin Vingron
      o Uni Heidelberg: Günther Sawitzki
      o DFCI Harvard: Robert Gentleman
      o UMC Leiden: Judith Boer
      o RZPD: Anke Schroth, Bernd Korn
      o DKFZ Heidelberg, Molecular Genome Analysis: Annemarie Poustka, Holger Sültmann, Andreas Buneß, Markus Ruschhaupt, Katharina Finis, Jörg Schneider, Klaus Steiner, Stefan Wiemann, Dorit Arlt
      ...and many more!
