
MCMC Model Composition Search Strategy in a Hierarchical Model

by Susan J. Simmons

University of North Carolina Wilmington



Variable selection

  • In many hierarchical settings, it is of interest to identify the important features or variables

  • We propose using an MC3 (MCMC model composition) search strategy to identify these important features

  • We denote the response variable by yᵢⱼ, where i = 1, …, L (the number of clusters or groups) and j = 1, …, nᵢ (the number of observations within group i).

    yᵢⱼ ~ N(θᵢ, σᵢ²) and θᵢ ~ N(Xᵢ′β, τ²)

    where X is a P × L matrix of explanatory variables (P = number of variables, L = number of groups) and Xᵢ is its i-th column.


Hyperparameters

Hyperparameters

  • We set prior distributions on each of the parameters:

    βⱼ ~ N(0, 100), τ² ~ Inv-χ²(1)

    σᵢ² ~ Inv-χ²(1)

    Combining the data with these priors gives an implicit joint posterior distribution, but the full conditional posterior distributions have a nice form.


[Slide: the full conditional posterior distributions, shown as equation images in the original slides.]



Samples from Full Posterior distribution

  • With nice parametric full conditional distributions, we can use the Gibbs sampler to draw samples from the joint posterior distribution p(θ, σ², τ², β | y).

  • Using these samples, we can estimate [quantities shown as equation images in the original slides]
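The Gibbs sampler described above can be sketched as follows. Since the slides show the full conditionals only as images, the conjugate updates below are assumptions derived from the stated model (yᵢⱼ ~ N(θᵢ, σᵢ²), θᵢ ~ N(Xᵢ′β, τ²), βⱼ ~ N(0, 100), Inv-χ²(1) priors, treated as Inv-Gamma(1/2, 1/2)); the function `gibbs` and its interface are illustrative, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs(y, X, n_iter=2000):
    """Gibbs sampler for y_ij ~ N(theta_i, sigma_i^2), theta_i ~ N(X_i' beta, tau^2).

    y : list of 1-D arrays, one per group (length L)
    X : P x L matrix of explanatory variables (column i belongs to group i)
    """
    P, L = X.shape
    A = X.T                                   # L x P design matrix for the group means
    theta = np.array([yi.mean() for yi in y])
    beta = np.zeros(P)
    sig2 = np.ones(L)
    tau2 = 1.0
    draws = {"beta": [], "tau2": []}
    for _ in range(n_iter):
        # theta_i | rest: conjugate normal update combining group data and X_i' beta
        for i in range(L):
            ni = len(y[i])
            prec = ni / sig2[i] + 1.0 / tau2
            mean = (y[i].sum() / sig2[i] + A[i] @ beta / tau2) / prec
            theta[i] = rng.normal(mean, np.sqrt(1.0 / prec))
        # beta | rest: normal update under the N(0, 100) prior on each beta_j
        V_inv = A.T @ A / tau2 + np.eye(P) / 100.0
        V = np.linalg.inv(V_inv)
        m = V @ (A.T @ theta / tau2)
        beta = rng.multivariate_normal(m, V)
        # sigma_i^2 | rest: Inv-chi2(1) prior == Inv-Gamma(1/2, 1/2)
        for i in range(L):
            ss = np.sum((y[i] - theta[i]) ** 2)
            sig2[i] = 1.0 / rng.gamma(0.5 + len(y[i]) / 2, 1.0 / (0.5 + ss / 2))
        # tau^2 | rest
        ss = np.sum((theta - A @ beta) ** 2)
        tau2 = 1.0 / rng.gamma(0.5 + L / 2, 1.0 / (0.5 + ss / 2))
        draws["beta"].append(beta.copy())
        draws["tau2"].append(tau2)
    return {k: np.array(v) for k, v in draws.items()}
```

Posterior summaries (means, activation of individual βⱼ) can then be read off the returned draws.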



The Search (MC3)

  • The model selection vector is a binary vector of length P that indicates which features are in the model. For example, with 5 variables, one possibility is [1 0 0 1 0].

    • (0) The search begins by randomly selecting the first model selection vector and calculating P(D|M(1)) under this model.

    • (1) A position along the model selection vector is randomly chosen and switched (0 to 1, or 1 to 0).

    • (2) The probability of the data given this model, P(D|M(2)), is calculated.

    • (3) A transition probability αᵢⱼ is calculated as min(1, P(D|M(j)) / P(D|M(i))).

    • (4) This transition probability is used as the success probability of a Bernoulli trial.

    • (5) If the value of the Bernoulli trial is 1, the chain moves to model j and this becomes the new model selection vector. If the value is 0, the old model selection vector is kept.

      Repeat (1)–(5).
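Steps (0)–(5) can be sketched in a few lines. The marginal likelihood P(D|M) is passed in as a caller-supplied `log_evidence` function (in the talk it comes from the hierarchical model; the name and interface here are illustrative assumptions), and the Bernoulli accept step is implemented on the log scale.

```python
import numpy as np

rng = np.random.default_rng(1)

def mc3_search(log_evidence, P, n_steps=2000):
    """MC3 over binary model-selection vectors of length P.

    log_evidence(gamma) must return log P(D | M) for inclusion vector gamma.
    Returns a dict mapping each visited model (as a tuple) to its log evidence.
    """
    gamma = rng.integers(0, 2, size=P)          # step (0): random starting vector
    log_ev = log_evidence(gamma)
    visited = {}
    for _ in range(n_steps):
        prop = gamma.copy()
        k = rng.integers(P)                     # step (1): flip one random position
        prop[k] = 1 - prop[k]
        log_ev_prop = log_evidence(prop)        # step (2): P(D | M_j)
        # steps (3)-(5): accept with probability min(1, P(D|M_j) / P(D|M_i))
        if np.log(rng.uniform()) < log_ev_prop - log_ev:
            gamma, log_ev = prop, log_ev_prop
        visited[tuple(gamma)] = log_ev
    return visited
```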



More on the search algorithm

  • To ensure that the chain does not get stuck in a local mode, we generated 10 chains (each of length 2,000), for a total of 20,000 models investigated.

  • Once all the chains have completed, we can calculate the activation probability for each feature as

    P(βⱼ ≠ 0 | D) = ∑ₖ P(βⱼ ≠ 0 | D, M(k)) P(M(k) | D)

    where P(βⱼ ≠ 0 | D, M(k)) is 0 if feature j is not in model M(k) and 1 if it is, and P(M(k) | D) is calculated by normalizing P(D | M(k)) over the models visited by the chains. [The normalization formula is shown as an equation image in the original slide.]
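Given the visited models and their marginal likelihoods (represented here as a dict mapping model tuples to log P(D|M) — an assumed interface, not the authors'), the activation probabilities can be computed with a log-sum-exp normalization:

```python
import numpy as np

def activation_probs(visited):
    """P(beta_j != 0 | D) from MC3 output: {model tuple: log P(D|M)}.

    P(M_k | D) is taken as P(D|M_k) normalized over the visited models
    (uniform model prior); the max is subtracted for numerical stability.
    """
    models = np.array(list(visited.keys()), dtype=float)   # K x P inclusion matrix
    log_ev = np.array(list(visited.values()))
    w = np.exp(log_ev - log_ev.max())
    post = w / w.sum()                                     # P(M_k | D)
    # each feature's activation probability is the posterior mass of models containing it
    return models.T @ post
```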



An example using a plant QTL experiment (Bay-0 × Sha)



Plant QTL experiments

[Diagram: Line 1, Line 2, …, each line represented by multiple clones.]

There are 165 different lines (or clusters), and in this simulation nᵢ = 10 for i = 1, …, 165. We generated 60 different simulation scenarios.



Simulations

  • The 60 different simulation scenarios combined effect sizes (1 through 9) with two different error distributions, Gamma(0.5, 1) and Gamma(3, 1). In addition, we simulated models with between 1 and 6 QTL.

  • The model used to simulate the data was yᵢⱼ = μ + ∑ⱼ αⱼxᵢⱼ + εᵢⱼ

    where αⱼ is the effect of marker j, μ is the overall mean, and εᵢⱼ is the (gamma-distributed) error.
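One scenario from this design might be generated as follows. The number of markers, the ±1 genotype coding, the specific QTL positions, and the overall mean are illustrative assumptions; the 165 lines, nᵢ = 10 clones per line, and gamma errors follow the slides.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_qtl(n_lines=165, n_clones=10, n_markers=38, qtl=(3, 17),
                 effect=2.0, err_shape=0.5):
    """Simulate y_ij = mu + sum_j alpha_j x_ij + eps_ij with Gamma(err_shape, 1) errors.

    Markers are coded +/-1 per line; clones within a line share the genotype.
    Returns the P x L genotype matrix X and a list of per-line response arrays.
    """
    X = rng.choice([-1.0, 1.0], size=(n_markers, n_lines))  # P x L genotype matrix
    alpha = np.zeros(n_markers)
    alpha[list(qtl)] = effect                 # the simulated QTL effects
    mu = 10.0                                 # overall mean (illustrative value)
    line_means = mu + X.T @ alpha             # mu + sum_j alpha_j x_ij, per line
    y = [line_means[i] + rng.gamma(err_shape, 1.0, size=n_clones)
         for i in range(n_lines)]
    return X, y
```

`X` and `y` are in the shape expected by the hierarchical model above (one cluster per line).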



Some results



Conclusions

  • Results so far are promising.

  • Try more complex models.

  • Need to make the model flexible enough to deal with p ≫ n (restricted parameter search).

