


Part II

– with interactions of genes in mind

Min-Te Chao

2002/10/28


  • So far, all methods are one-gene-at-a-time.

  • At first these methods are simple and intuitive, but they soon become complicated.

  • E.g., Efron has to use a tricky logistic regression to estimate the prior density, which is not easy.



In a setup similar to regression, the “design matrix” is never of full column rank, since it has far more columns than rows.

Y = X \beta + error

X is n by p, with n<100, p>1000.

I have seen a case with n=7, but p>6000.
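To make the rank problem concrete, here is a minimal sketch with simulated data. The Gaussian entries, the seed, and all variable names are assumptions for illustration only; just the sizes n=7 and p=6000 come from the text. It shows that two different coefficient vectors can reproduce the same Y exactly, so \beta cannot be identified from the data alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 7, 6000                      # sizes quoted in the text
X = rng.normal(size=(n, p))         # simulated design matrix (assumption)
y = rng.normal(size=n)              # simulated response (assumption)

# The rank can be at most min(n, p) = n, far below the p columns.
rank = np.linalg.matrix_rank(X)

# Minimum-norm solution of y = X beta via the pseudoinverse.
X_pinv = np.linalg.pinv(X)
beta_min_norm = X_pinv @ y

# Build a direction that X maps to zero: remove from a random vector
# its component in the row space of X.
v = rng.normal(size=p)
null_dir = v - X_pinv @ (X @ v)

# A second, different coefficient vector with exactly the same fit.
beta_other = beta_min_norm + null_dir
```

Both `beta_min_norm` and `beta_other` fit the seven observations equally well, which is why some reduction to a smaller p is needed before traditional inference can start.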


  • Let us say there is a way to “do the statistical problem” (say, with traditional methods) with a smaller p, say p = p_1 = 3 or 30, depending on the value of n we have.

  • Let us assume a model with the first p_1 parameters only (the other betas are all 0, say).


  • With our traditional method, we can write down the likelihood function for n observations and p_1 parameters.

  • We then go through the textbook method to do inference about the selected p_1 parameters.

  • This gives an estimator of the p_1-dimensional parameter, together with a standard deviation or p-value.
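As one possible reading of the "textbook method", the sketch below assumes Gaussian errors, so that maximizing the likelihood of the reduced model is the same as least squares on the first p_1 columns. The simulated data, the true coefficients, the seed, and all names are illustrative assumptions, not part of the original slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, p1 = 30, 1000, 3              # p1 is the reduced number of parameters

# Simulated data (assumption): only the first p1 betas are non-zero.
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:p1] = [2.0, -1.0, 0.5]
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# Reduced model: keep the first p1 columns, which now have full column rank.
X1 = X[:, :p1]
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Textbook inference: residual variance, standard errors, t-statistics.
resid = y - X1 @ beta_hat
sigma2_hat = resid @ resid / (n - p1)
cov_beta = sigma2_hat * np.linalg.inv(X1.T @ X1)
se = np.sqrt(np.diag(cov_beta))
t_stat = beta_hat / se
```

Under these assumptions, p-values would follow from comparing each entry of `t_stat` with a t distribution on n - p_1 degrees of freedom.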





  • Instead of genes, they use markers: the transmission association algorithm, a fast multi-marker screening method.

  • p markers, n patients

  • For each patient, we have data from father and mother

  • So we have n pieces of parents-child data.



  • They pick out r markers at a time, with r << p.

  • A statistic T(r) is constructed, which measures the “amount of information” in an n-patient, r-marker subproblem.

  • Markers in this subproblem are deleted one by one, the least important one first, until all markers left are important.
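The deletion loop described above can be sketched as follows. This is not the authors' statistic: `info` is a hypothetical stand-in for T(r) (the share of patients classified correctly by a majority vote within each joint marker pattern), and the toy data, in which the outcome depends only on the interaction of markers 0 and 1, is an assumption chosen so the result is easy to check by hand.

```python
from itertools import combinations, product

# Toy data (assumption): every 0/1 pattern of 5 markers appears once, and
# the outcome depends only on the interaction of markers 0 and 1.
P = 5
patients = list(product([0, 1], repeat=P))
outcome = [row[0] ^ row[1] for row in patients]


def info(markers):
    """Hypothetical T(r): share of patients classified correctly by a
    majority vote within each joint pattern of the selected markers."""
    cells = {}
    for row, y in zip(patients, outcome):
        key = tuple(row[m] for m in markers)
        counts = cells.setdefault(key, [0, 0])
        counts[y] += 1
    return sum(max(c) for c in cells.values()) / len(patients)


def backward_eliminate(markers, tol=1e-12):
    """Delete the least important marker first; stop once removing any
    remaining marker would lower the score."""
    kept = list(markers)
    while kept:
        current = info(kept)
        # Score of the subproblem after dropping each marker in turn.
        after = {m: info([k for k in kept if k != m]) for m in kept}
        least_important = max(after, key=after.get)
        if after[least_important] < current - tol:
            break  # every remaining marker matters
        kept.remove(least_important)
    return kept


# Screen every r-marker subset and count how often each marker survives.
r = 3
survival = [0] * P
for subset in combinations(range(P), r):
    for m in backward_eliminate(subset):
        survival[m] += 1
```

With p in the thousands one would draw r-marker subsets at random rather than enumerate them all; the exhaustive loop is used here only because the toy problem is tiny.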





  • It is the generality of the setup that is important.

  • Because it considers r markers at a time, the likelihood function is with respect to the r selected markers. If there is any interaction between 2 or 3 markers, this process has the potential to pick it up.
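A small constructed example illustrates why looking at several markers jointly can reveal what one-at-a-time methods miss. The -1/+1 coding, the balanced design, and the pure-interaction response are all assumptions made for illustration; with this coding every column has mean 0 and variance 1, so the mean product equals the correlation.

```python
from itertools import product

# Toy data (assumption): 3 markers coded -1/+1, every pattern once; the
# response is the pure interaction of markers 0 and 1.
rows = list(product([-1, 1], repeat=3))
y = [a * b for a, b, _ in rows]


def mean_product(u, v):
    """Average of element-wise products (the correlation under this coding)."""
    return sum(ui * vi for ui, vi in zip(u, v)) / len(u)


columns = [[row[j] for row in rows] for j in range(3)]

# One marker at a time: association of each single marker with the response.
marginal = [mean_product(col, y) for col in columns]

# Two markers at a time: association of the product of markers 0 and 1.
pair_01 = [a * b for a, b in zip(columns[0], columns[1])]
joint = mean_product(pair_01, y)
```

Here every entry of `marginal` is zero while `joint` is one: each marker looks uninformative on its own, yet the pair determines the response completely.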



  • All known methods, data mining or not, for the analysis of microarray-type data are ad hoc and rather primitive.

  • The amount of theory is limited.

  • These methods will tend to become statistical in nature eventually, because an assessment of risk is still a very important factor in scientific work.


