Sampling from a MVN Distribution

BMTRY 726 1/17/2014


Sample Mean Vector

  • We can estimate a sample mean for $X_1, X_2, \ldots, X_n$: $\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j$

Sample Mean Vector

  • Now we can estimate the mean of our sample

  • But what about the properties of $\bar{X}$?

    • It is an unbiased estimate of the mean

    • It is a sufficient statistic

    • Also, the sampling distribution is: $\bar{X} \sim N_p\left(\mu, \tfrac{1}{n}\Sigma\right)$
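A short simulation can illustrate these properties of the sample mean: across repeated samples, the average of the sample means should sit near the population mean, and their covariance should be near the population covariance divided by n. This is a minimal sketch assuming NumPy; `mu`, `Sigma`, `n`, and `reps` are hypothetical values chosen for illustration, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population parameters (illustration only).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
n, reps = 25, 20000

# Draw `reps` independent samples of size n; array shape is (reps, n, p).
samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))

# One sample mean vector per replicate; shape (reps, p).
xbars = samples.mean(axis=1)

mean_of_xbars = xbars.mean(axis=0)             # should be close to mu
cov_of_xbars = np.cov(xbars, rowvar=False)     # should be close to Sigma / n

print(mean_of_xbars)
print(n * cov_of_xbars)
```

Rescaling the covariance of the simulated means by n makes it directly comparable to `Sigma`.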

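Sample Covariance

  • And the sample covariance for $X_1, X_2, \ldots, X_n$: $S = \frac{1}{n-1}\sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X})'$

  • Sample variance: $s_{ii} = \frac{1}{n-1}\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)^2$

  • Sample covariance: $s_{ik} = \frac{1}{n-1}\sum_{j=1}^{n} (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k)$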
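The sample mean vector and sample covariance matrix can be computed directly from a data matrix. The sketch below assumes NumPy and uses a hypothetical data matrix `X` with one observation per row; it uses the n − 1 divisor, which is what makes S unbiased, and checks the result against `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n = 50 observations (rows) on p = 3 traits (columns).
X = rng.normal(size=(50, 3))
n, p = X.shape

xbar = X.sum(axis=0) / n               # sample mean vector (length p)
centered = X - xbar                    # subtract the mean from every row
S = centered.T @ centered / (n - 1)    # p x p sample covariance, divisor n - 1

# np.cov also uses the n - 1 divisor by default; rowvar=False means columns are traits.
print(np.allclose(S, np.cov(X, rowvar=False)))
```

The diagonal of `S` holds the sample variances and the off-diagonal entries hold the sample covariances.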

Sample Mean Vector

  • So we can also estimate the variance of our sample

  • And like $\bar{X}$, S also has some nice properties

    • It is an unbiased estimate of the variance

    • It is also a sufficient statistic

    • It is also independent of $\bar{X}$

  • But what about the sampling distribution of S?

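Wishart Distribution

Given $Z_1, Z_2, \ldots, Z_n \overset{iid}{\sim} N_p(0, \Sigma)$, the distribution of $\sum_{j=1}^{n} Z_j Z_j'$ is called a Wishart distribution with n degrees of freedom.

$(n-1)S$ has a Wishart distribution with n − 1 degrees of freedom.

The density function is

$$w_{n-1}(A \mid \Sigma) = \frac{|A|^{(n-p-2)/2}\, e^{-\operatorname{tr}(A\Sigma^{-1})/2}}{2^{p(n-1)/2}\, \pi^{p(p-1)/4}\, |\Sigma|^{(n-1)/2} \prod_{i=1}^{p} \Gamma\left(\tfrac{1}{2}(n-i)\right)}$$

where A and $\Sigma$ are positive definite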

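Wishart cont’d

  • The Wishart distribution is the multivariate analog of the central chi-squared distribution.

    • If $A_1 \sim W_{m_1}(\Sigma)$ and $A_2 \sim W_{m_2}(\Sigma)$ are independent (Wishart with $m_1$ and $m_2$ degrees of freedom), then $A_1 + A_2 \sim W_{m_1 + m_2}(\Sigma)$

    • If $A \sim W_m(\Sigma)$ then $CAC'$ is distributed $W_m(C\Sigma C')$

    • If $A \sim W_m(\Sigma)$, the distribution of the (i, i) element of A is $\sigma_{ii}\chi^2_m$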
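The chi-squared analogy can be checked by simulation: build Wishart matrices as sums of outer products of mean-zero normal vectors, then compare a scaled diagonal element with the mean (m) and variance (2m) of a chi-squared variable with m degrees of freedom. This is a rough sketch assuming NumPy; `Sigma`, `m`, and `reps` are hypothetical illustration values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scale matrix and degrees of freedom (illustration only).
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
m, reps = 10, 20000

# Z has shape (reps, m, p): each replicate holds m independent N_p(0, Sigma) rows.
Z = rng.multivariate_normal(np.zeros(2), Sigma, size=(reps, m))

# A[r] = sum over j of Z[r, j] Z[r, j]' : one Wishart draw per replicate.
A = np.einsum('rmi,rmj->rij', Z, Z)

mean_A = A.mean(axis=0)                 # should be close to m * Sigma
scaled = A[:, 0, 0] / Sigma[0, 0]       # a_11 / sigma_11, chi-squared with m d.f.

print(mean_A / m)
print(scaled.mean(), scaled.var())      # should be near m and 2m
```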

Large Sample Behavior

  • Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and covariance $\Sigma$ (not necessarily normally distributed)

    Then $\bar{X}$ and S are consistent estimators for $\mu$ and $\Sigma$. This means $\bar{X} \xrightarrow{p} \mu$ and $S \xrightarrow{p} \Sigma$ as $n \to \infty$

Large Sample Behavior

  • If we have a random sample $X_1, X_2, \ldots, X_n$ from a population with mean $\mu$ and covariance $\Sigma$, we can apply the multivariate central limit theorem as well

  • The multivariate CLT says that $\sqrt{n}\,(\bar{X} - \mu)$ is approximately $N_p(0, \Sigma)$ for large n
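One consequence of the multivariate CLT is that the quadratic form $n(\bar{X} - \mu)'\Sigma^{-1}(\bar{X} - \mu)$ is approximately chi-squared with p degrees of freedom for large n, even when the data are not normal. The sketch below checks this for a skewed, correlated bivariate population built from exponential variables. It assumes NumPy; the population, `n`, and `reps` are hypothetical choices, and the closed-form critical value applies only to 2 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 10000

# Non-normal population: X = (E1, E1 + E2) with E1, E2 independent Exponential(1).
E = rng.exponential(1.0, size=(reps, n, 2))
X = np.stack([E[..., 0], E[..., 0] + E[..., 1]], axis=-1)   # shape (reps, n, 2)

# Known mean and covariance of this constructed population.
mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 1.0],
                  [1.0, 2.0]])

dev = X.mean(axis=1) - mu                                   # (reps, 2)
stat = n * np.einsum('ri,ij,rj->r', dev, np.linalg.inv(Sigma), dev)

# Upper 5% point of chi-squared with 2 d.f.; closed form valid for 2 d.f. only.
crit = -2.0 * np.log(0.05)
tail_fraction = (stat > crit).mean()

print(tail_fraction)   # should be in the neighborhood of 0.05
```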

Checking Normality Assumptions

  • Check univariate normality for each component of X

    • Normal probability plots (i.e. Q-Q plots)

    • Tests:

      • Shapiro-Wilk

      • Correlation

      • EDF

  • Check bivariate (and higher-dimensional) normality

    • Bivariate scatter plots

    • Chi-square probability plots

Univariate Methods

  • If $X_1, X_2, \ldots, X_n$ are a random sample from a p-dimensional normal population, then the data for the ith trait are a random sample from a univariate normal distribution (from result 4.2)

  • Q-Q plot

    • Order the data: $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$

    • Compute the quantiles according to $\Phi(q_{(j)}) = \frac{j - 1/2}{n}$, where $\Phi$ is the standard normal CDF

    • Plot the pairs of observations $(q_{(j)}, x_{(j)})$
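The three Q-Q plot steps can be sketched with the Python standard library alone. This assumes the common probability levels (j − 1/2)/n; other conventions exist. The function name `qq_pairs` and the data values are hypothetical, for illustration only.

```python
from statistics import NormalDist

def qq_pairs(x):
    """Return (normal quantile, ordered observation) pairs for a Q-Q plot.

    Uses probability levels (j - 1/2)/n for j = 1, ..., n.
    """
    xs = sorted(x)                       # step 1: order the data
    n = len(xs)
    std_normal = NormalDist()            # standard normal, mean 0 and sd 1
    return [
        (std_normal.inv_cdf((j - 0.5) / n), xs[j - 1])   # step 2: quantiles
        for j in range(1, n + 1)
    ]

# Hypothetical observations on a single trait.
data = [2.1, 3.4, 1.9, 5.0, 2.8, 3.7, 4.2, 3.1]

# Step 3: these are the points one would plot.
for q, x_ordered in qq_pairs(data):
    print(f"{q:7.3f}  {x_ordered:5.2f}")
```

If the trait is approximately normal, the printed pairs should fall close to a straight line when plotted.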

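Correlation Tests

  • Shapiro-Wilk test

  • An alternative is a modified version of the Shapiro-Wilk test

  • Uses the correlation coefficient from the Q-Q plot: $r_Q = \frac{\sum_{j=1}^{n} (x_{(j)} - \bar{x})(q_{(j)} - \bar{q})}{\sqrt{\sum_{j=1}^{n} (x_{(j)} - \bar{x})^2}\,\sqrt{\sum_{j=1}^{n} (q_{(j)} - \bar{q})^2}}$

  • Reject normality if $r_Q$ is too small (values in Table 4.2)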
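The Q-Q correlation coefficient is simply the Pearson correlation between the ordered data and the matching standard normal quantiles. A minimal sketch, assuming NumPy and the (j − 1/2)/n probability levels; the helper name `r_q` and the simulated samples are hypothetical, and critical values still have to come from a table such as the one cited above.

```python
import numpy as np
from statistics import NormalDist

def r_q(x):
    """Correlation between ordered data and standard normal quantiles."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    probs = (np.arange(1, n + 1) - 0.5) / n
    q = np.array([NormalDist().inv_cdf(pr) for pr in probs])
    return np.corrcoef(xs, q)[0, 1]

rng = np.random.default_rng(4)
r_normal = r_q(rng.normal(size=200))        # data that really are normal
r_skewed = r_q(rng.exponential(size=200))   # strongly right-skewed data

print(r_normal, r_skewed)   # the skewed sample should give the smaller value
```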

Empirical Distribution Tests

  • Anderson-Darling and Kolmogorov-Smirnov statistics measure how much the empirical distribution function (EDF), $F_n(x) = \frac{\#\{x_j \le x\}}{n}$, differs from the hypothesized distribution $F(x)$

  • For a univariate normal distribution, $F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$

  • Large values for either statistic suggest that the observed data were not sampled from the hypothesized distribution
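Both statistics can be computed from the ordered data and the hypothesized normal CDF. The sketch below uses the standard textbook forms of the Kolmogorov-Smirnov distance and the Anderson-Darling statistic for a fully specified N(mu, sigma²). It assumes NumPy; `edf_statistics` and the simulated data are hypothetical. When mu and sigma are estimated from the same data, the usual critical values no longer apply directly, so the numbers here are for comparison only.

```python
import numpy as np
from statistics import NormalDist

def edf_statistics(x, mu, sigma):
    """Return (Kolmogorov-Smirnov D, Anderson-Darling A^2) against N(mu, sigma^2)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    dist = NormalDist(mu, sigma)
    F = np.array([dist.cdf(v) for v in xs])     # hypothesized CDF at ordered data
    i = np.arange(1, n + 1)

    # KS: largest gap between the EDF step function and F, checked on both
    # sides of each jump.
    ks = max(np.max(i / n - F), np.max(F - (i - 1) / n))

    # AD: A^2 = -n - (1/n) * sum (2i - 1) [ln F(x_(i)) + ln(1 - F(x_(n+1-i)))]
    ad = -n - np.mean((2 * i - 1) * (np.log(F) + np.log(1.0 - F[::-1])))
    return ks, ad

rng = np.random.default_rng(5)
x = rng.normal(size=200)

ks_ok, ad_ok = edf_statistics(x, 0.0, 1.0)          # correct hypothesis
ks_bad, ad_bad = edf_statistics(x + 1.0, 0.0, 1.0)  # data shifted by one sd

print(ks_ok, ad_ok)
print(ks_bad, ad_bad)   # both should be clearly larger
```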

Multivariate Methods

  • You can generate bivariate plots of all pairs of traits and look for unusual observations

  • A chi-square plot checks for normality in p > 2 dimensions

    • For each observation compute the squared generalized distance $d_j^2 = (x_j - \bar{x})' S^{-1} (x_j - \bar{x})$

    • Order these values from smallest to largest: $d_{(1)}^2 \le d_{(2)}^2 \le \cdots \le d_{(n)}^2$

    • Calculate quantiles for the chi-squared distribution with p d.f.: $q_{c,p}\left(\frac{j - 1/2}{n}\right)$, j = 1, …, n

Multivariate Methods

  • Plot the pairs $\left(q_{c,p}\left(\frac{j - 1/2}{n}\right),\ d_{(j)}^2\right)$

    Do the points deviate too much from a straight line?
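The chi-square plot steps can be sketched as follows. To keep the example free of extra dependencies it uses p = 2, where the chi-squared quantile has the closed form −2 ln(1 − u); for general p one would typically use `scipy.stats.chi2.ppf(probs, df=p)` instead. It assumes NumPy and the (j − 1/2)/n probability levels, and the simulated data matrix `X` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical bivariate normal data: n = 200 observations, p = 2 traits.
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 2.0]], size=200)
n, p = X.shape

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
dev = X - xbar

# Squared generalized distance of each observation from the sample mean.
d2 = np.einsum('ni,ij,nj->n', dev, np.linalg.inv(S), dev)
d2_sorted = np.sort(d2)                      # order smallest to largest

probs = (np.arange(1, n + 1) - 0.5) / n
q = -2.0 * np.log(1.0 - probs)               # chi-squared quantiles, 2 d.f. ONLY

# The plotted points are (q[j], d2_sorted[j]); a straight-line pattern
# supports bivariate normality.
straightness = np.corrcoef(q, d2_sorted)[0, 1]
print(straightness)
```

A useful sanity check on the distances is that they always sum to (n − 1)p when S uses the n − 1 divisor.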

Things to Do with non-MVN Data

Apply normal-based procedures anyway

Hope for the best...

Resampling procedures

Try to identify a more appropriate multivariate distribution

Nonparametric methods


Check for outliers


  • The idea of transformations is to re-express the data to make it more normal looking

  • Choosing a suitable transformation can be guided by

    • Theoretical considerations

      • Count data can often be made to look more normal by using a square root transformation

    • The data themselves

      • If the choice is not particularly clear, consider power transformations

Power Transformations

  • Commonly used, but note that they are defined only for positive variables

  • Defined by a parameter $\lambda$ as follows: $x^{\lambda}$, with $\lambda = 0$ taken to mean $\ln x$

  • So what do we use?

    • For right-skewed data, consider $\lambda < 1$ (fractions, 0, negative numbers…)

    • For left-skewed data, consider $\lambda > 1$

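Power Transformations

  • Box-Cox transformations are a popular modification of power transformations where

    $$x^{(\lambda)} = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda} & \lambda \ne 0 \\ \ln x & \lambda = 0 \end{cases}$$

  • Box-Cox transformations determine the best $\lambda$ by maximizing:

    $$\ell(\lambda) = -\frac{n}{2}\ln\left[\frac{1}{n}\sum_{j=1}^{n}\left(x_j^{(\lambda)} - \overline{x^{(\lambda)}}\right)^2\right] + (\lambda - 1)\sum_{j=1}^{n}\ln x_j, \qquad \overline{x^{(\lambda)}} = \frac{1}{n}\sum_{j=1}^{n} x_j^{(\lambda)}$$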


  • Note, in the multivariate setting, this would be considered for every trait

  • However… normality of each individual trait does not guarantee joint normality

  • We could iteratively try to search for the best transformations for joint and marginal normality

    • May not really improve our results substantially

    • And often univariate transformations are good enough in practice

  • Be very cautious about rejecting normality
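For a single trait, the Box-Cox search for the best power can be sketched as a grid search over the profile log-likelihood, written here in its standard textbook form. The code assumes NumPy; the function names, the grid, and the simulated lognormal data are hypothetical. Lognormal data are used because their logarithm is exactly normal, so the selected power should land near 0.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform: (x**lam - 1)/lam for lam != 0, log(x) for lam == 0."""
    x = np.asarray(x, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(x)
    return (x**lam - 1.0) / lam

def box_cox_loglik(x, lam):
    """Profile log-likelihood (up to a constant) for a given power lam."""
    x = np.asarray(x, dtype=float)
    y = box_cox(x, lam)
    n = x.size
    # -(n/2) ln[(1/n) sum (y - ybar)^2] + (lam - 1) sum ln x
    return -0.5 * n * np.log(np.mean((y - y.mean()) ** 2)) + (lam - 1.0) * np.log(x).sum()

rng = np.random.default_rng(7)
x = rng.lognormal(mean=0.0, sigma=1.0, size=500)   # positive, right-skewed data

grid = np.linspace(-2.0, 2.0, 81)                  # candidate powers, step 0.05
loglik = np.array([box_cox_loglik(x, lam) for lam in grid])
best_lambda = grid[np.argmax(loglik)]

print(best_lambda)   # should be close to 0 for lognormal data
```

In the multivariate setting this search would be repeated for each trait separately, which addresses marginal normality only.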

Next Time

  • Examples of normality checks in SAS and R

  • Begin our discussion of statistical inference for MV vectors