ECS 289A Presentation
Jimin Ding

  • Problem & Motivation

  • Two-component Model

  • Estimation of parameters in the above model

  • Defining low- and high-level gene expression

  • Comparing expression levels

  • Limitations of the model and method

  • Other possible solutions

  • References

A Model for Measurement Error for Gene Expression Arrays

David Rocke & Blythe Durbin

Journal of Computational Biology, Nov. 2001

Problem & Motivation

  • Standard statistical inference assumes normally distributed data with constant variance

    --- So hypothesis tests for the difference between control and treatment need equal variances (not depending on the mean of the data);

  • Measurement error for gene expression rises proportionately to the expression level

    --- So linear regression fails, and log transformation has been tried;

  • However, for genes whose expression level is low or that are entirely unexpressed, the measurement error doesn’t go down proportionately

    --- So log transformation fails by inflating the variance of observations near background, and the two-component model is introduced.

Example: Mice (from Bartosiewicz et al., 2000)

Two-Component Model

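  • Model: $Y = \alpha + \mu e^{\eta} + \epsilon$

  • $Y$ is the intensity measurement

  • $\mu$ is the expression level in arbitrary units

  • $\alpha$ is the mean intensity of unexpressed genes

  • Error terms: $\eta \sim N(0, \sigma_\eta^2)$ (multiplicative), $\epsilon \sim N(0, \sigma_\epsilon^2)$ (additive)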
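
To see how the two error components behave, here is a minimal simulation sketch. It assumes the Rocke and Durbin form $Y = \alpha + \mu e^{\eta} + \epsilon$ with normal $\eta$ and $\epsilon$; the function names and parameter values are illustrative, not taken from the slides. The variance check uses the standard lognormal variance, $\mathrm{Var}(Y) = \mu^2 e^{\sigma_\eta^2}(e^{\sigma_\eta^2} - 1) + \sigma_\epsilon^2$.

```python
import math
import random
import statistics


def simulate_intensity(mu, alpha, sigma_eta, sigma_eps, n, rng):
    """Draw n intensities Y = alpha + mu * exp(eta) + eps (assumed model form)."""
    return [
        alpha + mu * math.exp(rng.gauss(0.0, sigma_eta)) + rng.gauss(0.0, sigma_eps)
        for _ in range(n)
    ]


def model_variance(mu, sigma_eta, sigma_eps):
    """Theoretical Var(Y): multiplicative (lognormal) part plus additive part."""
    v = math.exp(sigma_eta ** 2)
    return mu ** 2 * v * (v - 1.0) + sigma_eps ** 2


# Illustrative values only: background 100, 20% multiplicative error, additive SD 10.
rng = random.Random(0)
for mu in (0.0, 100.0, 10000.0):
    y = simulate_intensity(mu, alpha=100.0, sigma_eta=0.2, sigma_eps=10.0, n=5000, rng=rng)
    empirical_sd = statistics.stdev(y)
    theoretical_sd = math.sqrt(model_variance(mu, 0.2, 10.0))
    print(f"mu={mu:8.0f}  empirical SD={empirical_sd:9.1f}  model SD={theoretical_sd:9.1f}")
```

Near background the SD stays close to the additive SD, while at high expression it grows in proportion to the expression level.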

Estimation of background ($\alpha$, $\sigma_\epsilon$)

  • Estimation of background using negative controls

  • Estimation of background with replicate measurements

  • Estimation of background without replicates

Estimation of background with replicate measurements

  • Begin with a small subset of genes with low intensity (10%)

  • Define a new subset consisting of genes whose intensity values are in

  • Repeat the first and second steps until the set of genes does not change.
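
The iteration above can be sketched as follows. The acceptance interval is not recoverable from the transcript, so this sketch assumes a gene is kept when its mean intensity lies within `width` standard deviations of the current background mean (`width=2.0` is an assumption, as are the function and parameter names). Each gene is assumed to have at least two replicates.

```python
import math
import random
import statistics


def estimate_background(replicates, start_fraction=0.10, width=2.0, max_iter=100):
    """Iteratively estimate the background mean and additive SD.

    replicates: list of per-gene lists of replicate intensities (>= 2 each).
    width: half-width of the acceptance interval in SD units (assumed rule).
    """
    gene_means = [statistics.fmean(r) for r in replicates]
    order = sorted(range(len(replicates)), key=gene_means.__getitem__)
    # Step 1: start from the lowest-intensity genes (10% by default).
    subset = set(order[: max(2, int(start_fraction * len(order)))])
    alpha = sigma_eps = float("nan")
    for _ in range(max_iter):
        alpha = statistics.fmean([gene_means[i] for i in subset])
        # Pooled within-gene SD over the current subset.
        ss = sum((y - gene_means[i]) ** 2 for i in subset for y in replicates[i])
        df = sum(len(replicates[i]) - 1 for i in subset)
        sigma_eps = math.sqrt(ss / df)
        # Step 2: redefine the subset from the current estimates.
        new_subset = {
            i for i, m in enumerate(gene_means) if abs(m - alpha) <= width * sigma_eps
        }
        # Step 3: stop once the set of genes no longer changes.
        if not new_subset or new_subset == subset:
            break
        subset = new_subset
    return alpha, sigma_eps


# Illustrative data: 200 unexpressed genes plus 50 expressed genes, 3 replicates each.
rng = random.Random(0)
genes = [[100 + rng.gauss(0, 10) for _ in range(3)] for _ in range(200)]
genes += [
    [100 + mu * math.exp(rng.gauss(0, 0.2)) + rng.gauss(0, 10) for _ in range(3)]
    for mu in range(500, 5500, 100)
]
print(estimate_background(genes))
```

As the limitations slide notes, convergence of this kind of iteration depends on the data and on the initial subset, so `max_iter` guards against cycling.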

Estimation of the High-level RSD

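  • The variance of intensity in the two-component model: $\mathrm{Var}(Y) = \mu^2 S_\eta^2 + \sigma_\epsilon^2$, where $S_\eta^2 = e^{\sigma_\eta^2}(e^{\sigma_\eta^2} - 1)$

  • At high expression levels, only the multiplicative error term is noticeable, so the ratio of the standard deviation to the mean is approximately constant, i.e. $\mathrm{RSD} \approx S_\eta$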

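  • For each replicated gene that is at high level, compute the mean of the $\ln(y - \hat{\alpha})$ and the standard deviation $s_i$ of the $\ln(y - \hat{\alpha})$

  • Then use the pooled standard deviation to estimate $\sigma_\eta$: $\hat{\sigma}_\eta^2 = \sum_i (n_i - 1) s_i^2 / \sum_i (n_i - 1)$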
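
A minimal sketch of the pooled estimate, assuming the per-gene standard deviations are taken on the log of background-corrected intensities, $\ln(y - \hat{\alpha})$, and that the high-level RSD is $S_\eta = \sqrt{e^{\sigma_\eta^2}(e^{\sigma_\eta^2} - 1)}$. The function names and simulated values are illustrative.

```python
import math
import random
import statistics


def estimate_sigma_eta(replicates, alpha):
    """Pooled SD of ln(y - alpha) across high-level genes (assumed estimator)."""
    weighted_var = 0.0
    df = 0
    for reps in replicates:
        if len(reps) < 2:
            continue  # a single measurement carries no within-gene variance
        logs = [math.log(y - alpha) for y in reps]
        weighted_var += (len(logs) - 1) * statistics.variance(logs)
        df += len(logs) - 1
    return math.sqrt(weighted_var / df)


def high_level_rsd(sigma_eta):
    """S_eta = sqrt(e^{s^2} (e^{s^2} - 1)), the relative SD at high expression."""
    v = math.exp(sigma_eta ** 2)
    return math.sqrt(v * (v - 1.0))


# Illustrative data: 500 highly expressed genes, 3 replicates, true sigma_eta = 0.2.
rng = random.Random(0)
high_genes = [
    [100 + 5000 * math.exp(rng.gauss(0, 0.2)) + rng.gauss(0, 10) for _ in range(3)]
    for _ in range(500)
]
sigma_eta_hat = estimate_sigma_eta(high_genes, alpha=100.0)
print(sigma_eta_hat, high_level_rsd(sigma_eta_hat))
```

For small $\sigma_\eta$ the two quantities are close, which is why the log-scale SD is a convenient stand-in for the RSD.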

Define “high” and “low”

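  • Low expression level:

    Most of the variance is due to the additive error component. 95% CI for $\mu$: $y - \hat{\alpha} \pm 1.96\,\hat{\sigma}_\epsilon$

  • High expression level:

    Most of the variance is due to the multiplicative error component. 95% CI for $\mu$: $\left((y - \hat{\alpha})\,e^{-1.96\,\hat{\sigma}_\eta},\ (y - \hat{\alpha})\,e^{1.96\,\hat{\sigma}_\eta}\right)$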
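
A small sketch of how such intervals could be computed. It assumes an additive-scale interval at low level, a log-scale interval at high level, and a cutoff where the two variance components are equal ($\mu = \sigma_\epsilon / S_\eta$). The cutoff rule, the plug-in of $y - \alpha$ for $\mu$, and the function names are assumptions for illustration.

```python
import math


def s_eta(sigma_eta):
    """Relative SD implied by the multiplicative error."""
    v = math.exp(sigma_eta ** 2)
    return math.sqrt(v * (v - 1.0))


def expression_interval(y, alpha, sigma_eps, sigma_eta, z=1.96):
    """Return (level, lower, upper): an approximate 95% interval for mu."""
    # Assumed cutoff: additive and multiplicative variances are equal here.
    cutoff = sigma_eps / s_eta(sigma_eta)
    net = y - alpha  # background-corrected intensity, used as a plug-in for mu
    if net < cutoff:
        # Additive error dominates: symmetric interval on the raw scale.
        return "low", net - z * sigma_eps, net + z * sigma_eps
    # Multiplicative error dominates: symmetric interval on the log scale.
    return "high", net * math.exp(-z * sigma_eta), net * math.exp(z * sigma_eta)


print(expression_interval(110.0, alpha=100.0, sigma_eps=10.0, sigma_eta=0.2))
print(expression_interval(1100.0, alpha=100.0, sigma_eps=10.0, sigma_eta=0.2))
```

The low-level interval can include negative values, which simply reflects that the gene may be unexpressed.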

Comparing Expression Levels

  • Common method: a standard t-test on the ratio of expression between treatment and control (low level), or on its logarithm (high level).

  • Problem:

    Less effective when a gene is expressed at a low level in one condition and at a high level in the other.

Solution: consider treatment and control as correlated

  • Model:

  • Variation:

    Background: High-level RSD:

Hypothesis testing (Comparison)

  • Assume the data have been adjusted:

  • Testing: $H_0$: the gene has the same expression level under control and treatment

  • Then use the following approximate variance to carry out a standard t-test on the log ratio of the raw data:

Limitations

  • No theoretical results (consistency, asymptotic distribution) are available for the above estimators.

  • The cutoff point between high and low levels is fairly artificial.

  • Convergence of the background estimation depends heavily on the data and on the initial subset selected.

Literature & Other Possible Solutions for Measurement Error

  • Chen et al. (1997): measurement error is normally distributed with constant coefficient of variation (CV), in accord with experience

  • Ideker et al. (2000) introduce a multiplicative error component (normal)

  • Newton et al. (2001) propose a gamma model for measurement error.

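  • Durbin et al. (2002) suggest the transformation

    $h(y) = \ln\left(y - \alpha + \sqrt{(y - \alpha)^2 + c}\right)$, where $c = \sigma_\epsilon^2 / S_\eta^2$

  • Huber et al. (2002) introduce the transformation $h(y) = \operatorname{arsinh}(a + b y)$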
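
The last two transformations are closely related. The sketch below assumes the generalized-log form $\ln(y - \alpha + \sqrt{(y - \alpha)^2 + c})$ for Durbin et al. and $\operatorname{arsinh}(a + b y)$ for Huber et al.; with $a = -\alpha/\sqrt{c}$ and $b = 1/\sqrt{c}$ the two differ only by the constant $\ln\sqrt{c}$. The function names and numbers are illustrative.

```python
import math


def glog(y, alpha, c):
    """Generalized log: ln(y - alpha + sqrt((y - alpha)^2 + c))."""
    z = y - alpha
    return math.log(z + math.sqrt(z * z + c))


def arsinh_transform(y, a, b):
    """Inverse hyperbolic sine of an affine function of y."""
    return math.asinh(a + b * y)


# Illustrative parameters: background 100 and c = 2500, i.e. sqrt(c) = 50.
alpha, c = 100.0, 2500.0
root = math.sqrt(c)
for y in (50.0, 100.0, 1000.0, 100000.0):
    g = glog(y, alpha, c)
    h = arsinh_transform(y, -alpha / root, 1.0 / root) + math.log(root)
    print(f"y={y:9.1f}  glog={g:.6f}  arsinh+const={h:.6f}")
```

Both behave like a linear function near background and like a logarithm at high intensity, which is what stabilizes the variance across the range.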

References

  • Blythe Durbin, Johanna Hardin, Douglas Hawkins, and David Rocke. “A variance-stabilizing transformation for gene-expression microarray data”, Bioinformatics, ISMB, 2002.

  • Chen, Y., Dougherty, E.R. and Bittner, M.L. (1997) “Ratio-based decisions and the quantitative analysis of cDNA microarray images”, J. Biomed. Opt., 2, 364-374

  • Wolfgang Huber, Anja von Heydebreck, Martin Vingron (Dec. 2002) “Analysis of microarray gene expression data”, Preprint

  • Wolfgang Huber, Anja von Heydebreck, Holger Sültmann, Annemarie Poustka, and Martin Vingron. “Variance stabilization applied to microarray data calibration and to the quantification of differential expression”, Bioinformatics, 18 Suppl. 1:S96–S104, 2002. ISMB 2002.