## Bayesian Models for Gene Expression With DNA Microarray Data

Joseph G. Ibrahim, Ming-Hui Chen, and Robert J. Gray
Presented by Yong Zhang

**Goals:**
• To build a model that compares normal and tumor tissues and finds the genes that best distinguish between the tissue types.
• To develop model assessment techniques for assessing the fit of a class of competing models.

**Outline**
• General model
• Gene selection algorithm
• Prior distributions
• L measure (model assessment)
• Example

**Data structure**
• $x$: the expression level for a given gene.
• $c_0$: the threshold value at which a gene is considered not expressed.
• Let $p = P(x = c_0)$; then $x = c_0$ with probability $p$ and $x = c_0 + y$ with probability $1 - p$, where $y$ is the continuous part of $x$.
• $j = 1, 2$ indexes the tissue type (normal vs. tumor).
• $i = 1, 2, \ldots, n_j$ indexes the $i$th individual.
• $g = 1, \ldots, G$ indexes the $g$th gene.
• $x_{jig}$: the gene expression mixture random variable for the $j$th tissue type, the $i$th individual, and the $g$th gene.

**The General Model**
• Assume $\log(y_{jig}) \mid \mu_{jg}, \sigma_{jg}^2 \sim N(\mu_{jg}, \sigma_{jg}^2)$, that is, a lognormal model for the continuous part.
• $\delta_{jig} = 1(x_{jig} = c_0)$
• $p_{jg} = P(x_{jig} = c_0) = P(\delta_{jig} = 1)$
• $\theta = (\mu, \sigma^2, p)$
• Data: $D = (x_{111}, \ldots, x_{2,n_2,G})$
• Likelihood function for $\theta$, where $f$ denotes the lognormal density of $y_{jig}$:

$$L(\theta \mid D) = \prod_{j=1}^{2} \prod_{i=1}^{n_j} \prod_{g=1}^{G} p_{jg}^{\delta_{jig}} (1 - p_{jg})^{1 - \delta_{jig}} \left[ f(y_{jig} \mid \mu_{jg}, \sigma_{jg}^2) \right]^{1 - \delta_{jig}}$$

• To find which genes best discriminate between the normal and tumor tissues, let [equation not preserved in the source]. Then we set [equation not preserved in the source], so that the resulting quantity for gene $g$ can be used to judge them.

**Prior Distributions**
• $\sigma_{jg}^2 \sim \text{Inverse Gamma}(a_{j0}, b_{j0})$
• $\mu_{j0} \sim N(m_{j0}, v_{j0}^2)$, $j = 1, 2$
• $b_{j0} \sim \text{gamma}(q_{j0}, t_{j0})$
• $e_{jg} \sim N(u_{j0}, k_{j0} w_{j0}^2)$

**Gene Selection Algo.**
1) For each gene $g$, compute the gene-level quantities [symbols not preserved in the source].
2) Select a "threshold" value, say $r_0$, to decide which genes are different. If [condition not preserved in the source], the gene is declared different.
3) Once the $g$th gene is declared different, set $\mu_{1g} \neq \mu_{2g}$; otherwise set $\mu_{1g} = \mu_{2g} \equiv \mu_g$, where $\mu_g$ is treated as unknown.
4) Create several submodels using several values of $r_0$.
5) Use the L measure to decide which submodel is best (the one with the smallest L measure).

**The properties of this approach**
• Models the gene expression level as a mixture random variable.
• Uses a lognormal model for the continuous part of the mixture.
• Uses the L measure statistic for evaluating models.

**L measure for model assessment**
• It relies on the notion of an imaginary replicate experiment.
• Let $z = (z_{111}, \ldots, z_{2,n_2,G})$ denote future values of a replicate experiment.
• The L measure is the expected squared Euclidean distance between $x$ and $z$: $L = E[(z - x)'(z - x) \mid D]$.
• A more general version is [equation not preserved in the source]. The right-hand side of the last formula can be computed by MCMC.

• For conditionals 1–4 and 6, the generation is straightforward.
• For 5, we can use an adaptive rejection algorithm (Gilks and Wild, 1992), because the corresponding conditional posterior densities are log-concave.

**Discussion**
• The model development and prior distributions in this paper can easily be extended to handle three or more tissue types.
• More general classes of priors
• The gene selection criteria
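
To make the L measure slides more concrete, here is a minimal Python sketch of how such a criterion might be estimated from predictive draws of the replicate experiment $z$. Several things are assumptions rather than content from the slides: the generalized form used here (sum of predictive variances plus a weight `nu` times the squared bias, which reduces to the expected squared Euclidean distance when `nu = 1`), the synthetic Gaussian draws standing in for real MCMC output, the toy observed values `x`, and the submodel labels `r0=0.90` and `r0=0.95`.

```python
import math
import random
from statistics import fmean, pvariance


def l_measure(x, z_draws, nu=1.0):
    """Monte Carlo estimate of sum_i Var(z_i | D) + nu * sum_i (E[z_i | D] - x_i)^2.

    x       : observed values, a list of length n
    z_draws : list of predictive draws of z, each a list of length n
    nu      : weight on the squared-bias term (ASSUMED form, not from the slides)
    """
    total = 0.0
    for i in range(len(x)):
        column = [z[i] for z in z_draws]          # draws of the i-th component
        mean_i = fmean(column)                    # estimate of E[z_i | D]
        var_i = pvariance(column, mu=mean_i)      # estimate of Var(z_i | D)
        total += var_i + nu * (mean_i - x[i]) ** 2
    return total


def expected_sq_distance(x, z_draws):
    """Direct Monte Carlo estimate of E[(z - x)'(z - x) | D]."""
    return fmean(sum((zi - xi) ** 2 for zi, xi in zip(z, x)) for z in z_draws)


# --- Toy illustration; everything below is synthetic and hypothetical ---
rng = random.Random(0)
x = [2.0, 3.5, 1.0, 4.2]  # made-up observed expression levels


def fake_predictive_draws(centers, spread, n_draws=2000):
    """Stand-in for MCMC output: independent Gaussian draws around `centers`."""
    return [[rng.gauss(c, spread) for c in centers] for _ in range(n_draws)]


# Two hypothetical submodels, e.g. obtained from two threshold values r0.
draws_by_submodel = {
    "r0=0.90": fake_predictive_draws([2.1, 3.4, 1.2, 4.0], 0.5),
    "r0=0.95": fake_predictive_draws([2.8, 2.8, 2.8, 2.8], 0.5),
}

scores = {name: l_measure(x, d, nu=0.5) for name, d in draws_by_submodel.items()}
best = min(scores, key=scores.get)  # smallest L measure wins

# With nu = 1 the variance-plus-bias form equals the expected squared distance.
d0 = draws_by_submodel["r0=0.90"]
print(math.isclose(l_measure(x, d0, nu=1.0), expected_sq_distance(x, d0)))
print(best)
```

In this toy setup the submodel whose predictive draws sit closer to `x` gets the smaller score, mirroring step 5 of the gene selection algorithm. In a real analysis the draws would come from the posterior predictive distribution of each fitted submodel.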