# Part II – with interactions of genes in mind, Min-Te Chao, 2002/10/28


Part II – with interactions of genes in mind

Min-Te Chao

2002/10/28

• So far, all methods are one-gene-at-a-time

• At first these methods are simple and intuitive, but they soon become complicated.

• E.g., Efron has to use a tricky logistic regression to estimate the prior density, which is not easy.

In a setup similar to regression, the “design matrix” is never of full column rank.

Y = X \beta + error

X is n by p, with n<100, p>1000.

I have seen a case with n=7, but p>6000.
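The rank problem can be checked numerically. The following is a minimal sketch, assuming a randomly generated matrix and toy sizes (n = 7, p = 60, far smaller than the p > 6000 mentioned above); the helper `matrix_rank` is a hypothetical name written here for illustration only.

```python
import random

def matrix_rank(rows, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting (illustrative, not optimized)."""
    a = [list(row) for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Choose the largest entry in this column among the rows not yet used as pivots.
        pivot = max(range(rank, n_rows), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < tol:
            continue  # no usable pivot in this column
        a[rank], a[pivot] = a[pivot], a[rank]
        for i in range(rank + 1, n_rows):
            factor = a[i][col] / a[rank][col]
            for j in range(col, n_cols):
                a[i][j] -= factor * a[rank][j]
        rank += 1
    return rank

random.seed(0)
n, p = 7, 60  # toy sizes (assumption); the point is only that n < p
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
rank_X = matrix_rank(X)
print("rank:", rank_X, "n:", n, "p:", p)
```

Since the rank can never exceed n, X'X is a p by p matrix of rank at most n and cannot be inverted, so the usual least-squares formula does not apply with all p columns.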

• Let us say there is a way to “do the statistical problem” (say, with traditional methods) with a smaller p, say p = p_1 = 3 or 30, depending on the value of n we have.

• Let us assume a model with the first p_1 parameters only (the other betas are all 0, say).

• With our traditional method, we may find the likelihood function, with n observations and p_1 parameters.

• And we go through the textbook method to do inference about the selected p_1 parameters.

• And obtain an estimator of the p_1-dim parameter (together with a standard deviation or p-value).
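The steps above can be sketched for the simplest traditional method, ordinary least squares with normal errors. This is only an illustration under stated assumptions: the data are simulated, the true coefficients are made up, and `invert` and `fit_submodel` are hypothetical helper names, not anything from the talk.

```python
import math
import random

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(m)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for i in range(k):
            if i != col:
                f = aug[i][col]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[k:] for row in aug]

def fit_submodel(X, y, p1):
    """Least squares on the first p1 columns of X; all other betas are taken to be 0."""
    n = len(X)
    X1 = [row[:p1] for row in X]
    xtx = [[sum(r[a] * r[b] for r in X1) for b in range(p1)] for a in range(p1)]
    xty = [sum(r[a] * yi for r, yi in zip(X1, y)) for a in range(p1)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(p1)) for a in range(p1)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X1, y)]
    sigma2 = sum(e * e for e in resid) / (n - p1)  # residual variance estimate
    se = [math.sqrt(sigma2 * xtx_inv[a][a]) for a in range(p1)]
    return beta, se

random.seed(1)
n, p, p1 = 30, 200, 3
true_beta = [2.0, -1.0, 0.5]  # made-up values for the simulation
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0, 0.1) for row in X]
beta_hat, se = fit_submodel(X, y, p1)
for b, s in zip(beta_hat, se):
    print(f"estimate={b:.3f}  se={s:.3f}  t={b / s:.1f}")
```

With p_1 much smaller than n the p_1 by p_1 matrix X_1'X_1 is invertible, which is what makes the textbook inference possible. A p-value would follow from comparing each t ratio with a t distribution on n - p_1 degrees of freedom.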

• Instead of genes, they use markers: the transmission association algorithm, a fast multi-marker screening method.

• p markers, n patients

• For each patient, we have data from father and mother

• So we have n pieces of parents–child data.

• They pick out r markers at a time, with r << p.

• A statistic T(r) is constructed, which measures the “amount of information” in an n-patient, r-marker subproblem.

• Markers in this subproblem are deleted one by one, the least important one first, until all markers left are important.
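The deletion loop can be sketched as follows. The slides do not give the form of the statistic T or the stopping rule, so everything specific here is an assumption: `score` is a placeholder for T, a marker's importance is taken to be the score lost when that marker alone is removed, and `toy_score` is an invented function in which only markers 3 and 7 carry information.

```python
import random

def backward_screen(subset, score, min_gain=0.0):
    """Drop markers one at a time, least important first.

    `score` stands in for the statistic T (assumption). A marker counts as
    important if removing it lowers the score by more than `min_gain`.
    """
    kept = list(subset)
    while len(kept) > 1:
        current = score(kept)
        # Importance of each marker = score lost by deleting that marker alone.
        losses = {m: current - score([k for k in kept if k != m]) for m in kept}
        weakest = min(losses, key=losses.get)
        if losses[weakest] > min_gain:
            break  # every remaining marker is important
        kept.remove(weakest)
    return kept

# Invented stand-in for T: markers 3 and 7 are informative, each extra marker costs a little.
INFORMATIVE = {3, 7}

def toy_score(markers):
    return len(INFORMATIVE & set(markers)) - 0.01 * len(markers)

random.seed(2)
p, r = 1000, 10
others = [m for m in range(p) if m not in INFORMATIVE]
draw = random.sample(others, r - 2) + [3, 7]  # one draw of r markers containing both
print(backward_screen(draw, toy_score))
```

In practice the draw of r markers would be repeated many times, and markers that survive the deletion often would be the ones retained; that repetition is left out of this sketch.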

• It is the generality of the setup that is important.

• Because it considers r markers at a time, the likelihood function is with respect to the r selected markers. If there is any interaction between 2 or 3 markers, this process has the potential to pick it up.
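A toy example shows why looking at several markers jointly can reveal what one-at-a-time analysis misses. The data below are constructed, not real: two binary markers, with the outcome equal to 1 exactly when the markers differ. The `spread` function is a crude association measure chosen for illustration, not the statistic used in the method described.

```python
from itertools import product

# Balanced toy data (assumption): outcome is 1 exactly when the two markers differ.
patients = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2) for _ in range(25)]

def case_rates(rows, marker_idx):
    """Case rate within each combination of values of the chosen markers."""
    groups = {}
    for row in rows:
        key = tuple(row[i] for i in marker_idx)
        groups.setdefault(key, []).append(row[2])
    return {k: sum(v) / len(v) for k, v in groups.items()}

def spread(rates):
    """Crude association measure: max minus min case rate across groups."""
    return max(rates.values()) - min(rates.values())

one_at_a_time = [spread(case_rates(patients, (i,))) for i in (0, 1)]
jointly = spread(case_rates(patients, (0, 1)))
print(one_at_a_time, jointly)  # prints [0.0, 0.0] 1.0
```

Each marker on its own shows no association with the outcome, because the case rate is 0.5 for either value, while the pair determines the outcome completely.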

• All known methods, data mining or not, for the analysis of microarray-type data are ad hoc and rather primitive.

• The amount of theory is limited.

• These methods will tend to become statistical in nature eventually, because an assessment of risk is still a very important factor in scientific work.