# Bayesian Models for Gene Expression With DNA Microarray Data

##### Presentation Transcript

1. Bayesian Models for Gene Expression With DNA Microarray Data. Joseph G. Ibrahim, Ming-Hui Chen, and Robert J. Gray. Presented by Yong Zhang.

2. Goals: • To build a model that compares normal and tumor tissues and finds the genes that best distinguish between the tissue types. • To develop model assessment techniques for assessing the fit of a class of competing models.

3. Outline: • General model • Gene selection algorithm • Prior distributions • L measure (model assessment) • Example

4. Data structure • $x$: the expression level for a given gene • $c_0$: the threshold value at which a gene is considered not expressed • Let $p = P(x = c_0)$; then $x = c_0$ with probability $p$ and $x = c_0 + y$ with probability $1 - p$, where $y$ is the continuous part of $x$.
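
The mixture structure on this slide can be illustrated by simulating from it. The sketch below assumes the form "x equals c0 with probability p, otherwise c0 plus a positive continuous part y", and draws y from a lognormal distribution (the choice named later in the talk). The function name `simulate_expression` and all numeric values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_expression(n, c0, p, mu, sigma, rng):
    """Draw n expression levels from the assumed point-mass/lognormal mixture.

    With probability p the gene is "not expressed" and x = c0;
    otherwise x = c0 + y with log(y) ~ N(mu, sigma^2).
    """
    not_expressed = rng.random(n) < p
    y = rng.lognormal(mean=mu, sigma=sigma, size=n)
    return np.where(not_expressed, c0, c0 + y)

# Illustrative values only (hypothetical threshold and parameters).
rng = np.random.default_rng(0)
x = simulate_expression(n=1000, c0=20.0, p=0.3, mu=4.0, sigma=1.0, rng=rng)
frac_at_threshold = np.mean(x == 20.0)  # empirical estimate of p
```

The fraction of draws sitting exactly at the threshold estimates p, which is why the model tracks the point mass separately from the continuous part.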

5. • $j = 1, 2$ indexes the tissue type (normal vs. tumor) • $i = 1, 2, \ldots, n_j$ indexes the ith individual • $g = 1, \ldots, G$ indexes the gth gene • $x_{jig}$: the gene expression mixture random variable for the jth tissue type, the ith individual, and the gth gene.

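6. The General Model • Assume the continuous part $y_{jig}$ of $x_{jig}$ is lognormal: $\log y_{jig} \mid \mu_{jg}, \sigma_{jg}^2 \sim N(\mu_{jg}, \sigma_{jg}^2)$ • $\delta_{jig} = 1(x_{jig} = c_0)$ • $p_{jg} = P(x_{jig} = c_0) = P(\delta_{jig} = 1)$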
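
To make the model concrete, here is a minimal sketch of the log-likelihood contribution of one gene in one tissue type. It assumes a point mass at c0 with probability p and a lognormal continuous part for x − c0 otherwise, with 0 < p < 1. The function name `log_likelihood` and the sample values are hypothetical.

```python
import numpy as np

def log_likelihood(x, c0, p, mu, sigma2):
    """Log-likelihood of observations x for a single (tissue, gene) pair.

    Assumes x == c0 with probability p (0 < p < 1); otherwise
    y = x - c0 > 0 with log(y) ~ N(mu, sigma2).
    """
    x = np.asarray(x, dtype=float)
    delta = x == c0                      # indicator of "not expressed"
    n0 = delta.sum()
    # Bernoulli part: p^delta * (1 - p)^(1 - delta)
    ll_point = n0 * np.log(p) + (x.size - n0) * np.log1p(-p)
    # Lognormal density of the continuous part, on the log scale
    log_y = np.log(x[~delta] - c0)
    ll_cont = np.sum(
        -0.5 * np.log(2.0 * np.pi * sigma2)
        - log_y
        - (log_y - mu) ** 2 / (2.0 * sigma2)
    )
    return ll_point + ll_cont

# Illustrative data: two observations at the threshold, two above it.
x_obs = np.array([20.0, 20.0, 20.0 + np.exp(1.0), 20.0 + np.exp(2.0)])
ll = log_likelihood(x_obs, c0=20.0, p=0.5, mu=1.5, sigma2=1.0)
```

Summing this quantity over tissue types and genes would give the full log-likelihood under the assumption that observations are conditionally independent.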

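7. • $\theta = (\mu, \sigma^2, p)$ • Data $D = (x_{111}, \ldots, x_{2,n_2,G}, \delta)$ • Likelihood function for $\theta$: $L(\theta \mid D) = \prod_{j=1}^{2} \prod_{i=1}^{n_j} \prod_{g=1}^{G} p_{jg}^{\delta_{jig}} (1 - p_{jg})^{1 - \delta_{jig}} \left[ f(x_{jig} - c_0 \mid \mu_{jg}, \sigma_{jg}^2) \right]^{1 - \delta_{jig}}$, where $f$ is the lognormal density of the continuous part. • To find which genes best discriminate between the normal and tumor tissues, let $\xi_g = \dfrac{c_0 + (1 - p_{2g}) \exp(\mu_{2g} + \sigma_{2g}^2 / 2)}{c_0 + (1 - p_{1g}) \exp(\mu_{1g} + \sigma_{1g}^2 / 2)}$, the ratio of the mean expression levels of the two tissue types.

8. Then we set $\gamma_g = P(\xi_g > 1 \mid D)$, so that $\gamma_g$ can be used to judge whether gene g differs between the tissue types.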
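
A per-gene discrimination score of this kind can be estimated from posterior draws. The sketch below assumes the score is the posterior probability that the ratio of mean expression levels (tissue type 2 over tissue type 1) exceeds 1, where the mean under the mixture model is c0 + (1 − p)·exp(μ + σ²/2). The function names and the stand-in "posterior draws" are hypothetical; real draws would come from an MCMC sampler.

```python
import numpy as np

def mean_expression(c0, p, mu, sigma2):
    """Mean of the mixture: c0 + (1 - p) * E[y], with y lognormal."""
    return c0 + (1.0 - p) * np.exp(mu + 0.5 * sigma2)

def gamma_from_draws(c0, draws_type1, draws_type2):
    """Return (xi, gamma) from posterior draws.

    Each draws_* is a dict with arrays 'p', 'mu', 'sigma2' of shape
    (n_draws, G). xi has shape (n_draws, G); gamma has shape (G,).
    """
    xi = mean_expression(c0, **draws_type2) / mean_expression(c0, **draws_type1)
    return xi, np.mean(xi > 1.0, axis=0)

# Stand-in draws (NOT real MCMC output), for three hypothetical genes.
rng = np.random.default_rng(1)
n_draws, G = 2000, 3
type1 = {"p": np.full((n_draws, G), 0.3),
         "mu": rng.normal(4.0, 0.05, size=(n_draws, G)),
         "sigma2": np.full((n_draws, G), 1.0)}
type2 = {"p": np.full((n_draws, G), 0.3),
         "mu": rng.normal([4.0, 6.0, 2.0], 0.05, size=(n_draws, G)),
         "sigma2": np.full((n_draws, G), 1.0)}
xi, gamma = gamma_from_draws(20.0, type1, type2)
```

Values of the score near 1 or near 0 indicate a gene whose mean expression clearly differs between tissue types; values near 0.5 indicate little evidence of a difference.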

9. Prior Distributions • $\sigma_{jg}^2 \sim \text{Inverse Gamma}(a_{j0}, b_{j0})$ • $\mu_{j0} \sim N(m_{j0}, v_{j0}^2)$, $j = 1, 2$

10. • $b_{j0} \sim \text{Gamma}(q_{j0}, t_{j0})$ • $e_{jg} \sim N(u_{j0}, k_{j0} w_{j0}^2)$

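11. Gene Selection Algorithm 1) For each gene, compute $\xi_g$ and $\gamma_g$. 2) Select a “threshold” value, say $r_0$, to decide which genes are different: if $\gamma_g \ge r_0$ or $\gamma_g \le 1 - r_0$, the gth gene is declared different. 3) Once the gth gene is declared different, set $\theta_{1g} \ne \theta_{2g}$; otherwise set $\theta_{1g} = \theta_{2g} \equiv \theta_g$, where $\theta_g$ is treated as unknown.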

12. Gene Selection Algorithm (continued) 4) Create several submodels using several values of $r_0$. 5) Use the L measure to decide which submodel is best (the one with the smallest L measure).
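
The selection steps above can be sketched as follows. The decision rule used here (declare a gene different when its posterior score is at least r0 or at most 1 − r0) is an assumption, since the exact condition is not legible in the transcript. The function names, the scores, and the L measure values are hypothetical placeholders.

```python
import numpy as np

def select_genes(gamma, r0):
    """Boolean mask of genes declared different at threshold r0 (assumed rule)."""
    gamma = np.asarray(gamma, dtype=float)
    return (gamma >= r0) | (gamma <= 1.0 - r0)

def build_submodels(gamma, thresholds):
    """One candidate submodel (a gene mask) per threshold value."""
    return {r0: select_genes(gamma, r0) for r0 in thresholds}

def best_submodel(l_measures):
    """Pick the threshold whose submodel has the smallest L measure."""
    return min(l_measures, key=l_measures.get)

# Hypothetical posterior scores for four genes.
gamma = [0.99, 0.60, 0.02, 0.85]
submodels = build_submodels(gamma, thresholds=[0.8, 0.9, 0.95])

# Placeholder L measures; in practice each would be computed by
# refitting the corresponding submodel.
l_measures = {0.8: 12.3, 0.9: 11.7, 0.95: 11.9}
chosen_r0 = best_submodel(l_measures)
```

Genes outside the mask would share one parameter set across tissue types in the refitted submodel, which is what makes the submodels differ in fit.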

13. Properties of this approach • Models the gene expression level as a mixture random variable. • Uses a lognormal model for the continuous part of the mixture. • Uses the L measure statistic to evaluate models.

14. L measure for model assessment • It relies on the notion of an imaginary replicate experiment. • Let $z = (z_{111}, \ldots, z_{2,n_2,G})$ denote future values of a replicate experiment.

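15. The L measure is the expected squared Euclidean distance between $x$ and $z$: $L = E[(z - x)'(z - x) \mid D] = \sum_{j,i,g} \left\{ \mathrm{Var}(z_{jig} \mid D) + (E(z_{jig} \mid D) - x_{jig})^2 \right\}$. A more general version is $L(\nu) = \sum_{j,i,g} \mathrm{Var}(z_{jig} \mid D) + \nu \sum_{j,i,g} (E(z_{jig} \mid D) - x_{jig})^2$, for a weight $\nu$ between 0 and 1. The right-hand side of the last formula can be computed by MCMC.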
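
Given posterior predictive draws of the replicate data, the L measure can be approximated by Monte Carlo. The sketch below assumes the common decomposition into a predictive variance term plus a weighted squared-bias term, with weight `nu`; setting `nu=1` gives the expected squared Euclidean distance. The function name `l_measure` and the toy arrays are hypothetical.

```python
import numpy as np

def l_measure(x, z_draws, nu=1.0):
    """Monte Carlo estimate of the L measure.

    x       : observed data, any shape
    z_draws : replicate draws, shape (n_draws, *x.shape)
    nu      : weight on the squared-bias term (assumed form)
    """
    x = np.asarray(x, dtype=float).ravel()
    z = np.asarray(z_draws, dtype=float)
    z = z.reshape(z.shape[0], -1)        # flatten each draw
    pred_mean = z.mean(axis=0)           # E(z | D), elementwise
    pred_var = z.var(axis=0)             # Var(z | D), elementwise
    return pred_var.sum() + nu * np.sum((pred_mean - x) ** 2)

# Toy example: two replicate draws of three observations.
z_draws = np.array([[0.0, 2.0, 4.0],
                    [2.0, 2.0, 2.0]])
l_fit = l_measure([1.0, 2.0, 3.0], z_draws)          # predictive mean matches x
l_off = l_measure([0.0, 0.0, 0.0], z_draws, nu=0.5)  # biased, half weight on bias
```

A smaller value indicates that replicate data generated under the fitted model sit closer to the observed data, which is the basis for comparing submodels.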

16. • For 1–4 and 6, the generation is straightforward. • For 5, we can use the adaptive rejection algorithm (Gilks and Wild, 1992), because the corresponding conditional posterior densities are log-concave.

17. Discussion • The model development and prior distributions in this paper can easily be extended to handle three or more tissue types. • More general classes of priors • The gene selection criteria