Inference


# Inference



### Inference

Mary M. Whiteside, Ph.D.

Nonparametric Statistics

Two Sides of Inference
• Parametric
• Interval estimation, xbar
• Hypothesis testing, μ0
• Nonparametric
• Interval estimates, EDF
• Hypothesis testing, P(X<Y) > P(X>Y)
Meaning of Nonparametric
• Methods for non-normal distributions
• Methods for ordinal data
• Data Scales
• Nominal, categorical, qualitative
• Ordinal
• Interval
• Ratio - natural zero
Random Sample - Type 1
• Random sample from a finite population
• Simple
• Stratified
• Cluster
• Inferences are about the finite population
• Audit comprised of a sample from a population of invoices
• Public opinion polls
• QC samples of delivered goods
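The three Type 1 designs listed above can be sketched in a few lines. This is only an illustration: the invoice IDs, the "small"/"large" strata, and the batches used as clusters are all hypothetical, and the function names are made up for this sketch.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units without replacement; every subset of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of fixed size inside every stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(0)
invoices = list(range(1, 101))  # hypothetical finite population: invoice IDs 1..100

srs = simple_random_sample(invoices, 10, rng)

# Hypothetical strata: the first 60 invoices are "small", the rest "large".
strata = {"small": invoices[:60], "large": invoices[60:]}
strat = stratified_sample(strata, 5, rng)

# Hypothetical clusters: ten batches of ten consecutive invoices.
clusters = {f"batch{b}": invoices[b * 10:(b + 1) * 10] for b in range(10)}
clus = cluster_sample(clusters, 2, rng)
```

In each case the inference target is the finite population of 100 invoices, not a probability model.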
Random Sample - Type 2
• Observations of (iid) random variables
• Inferences are about the probability distributions of the random variables
• Weekly average miles per gallon for your new Lexus
• Chi square tests of independence in medical treatment offered men and women
• Effect of female literacy on infant mortality worldwide
Transition from data sets to distributions
• All random variables, by definition, have probability functions (pmf or pdf) and cumulative probability distributions
• Random variables defined on a random sample (Type 1 or 2) are called statistics; their probability distributions are called sampling distributions
Sampling Distributions
• Statistics support both sides of inference
• Estimators - random variables used to create interval estimates
• Test statistics - random variables used to test hypotheses
Consider Xbar - a parametric statistic
• Type 1 sample - subset of invoices where X = sales tax paid on an invoice randomly selected from a finite population
• Xbar is the average sales tax of n randomly selected invoices
• Xbar is an estimator of μ, the average sales tax paid for the population of invoices (with standard deviation σ)
• Xbar is a test statistic for testing hypotheses

H0: μ = μ0

• Xbar is a random variable with sampling distribution asymptotically normal as n increases with mean μ and standard deviation σ/√n
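A small simulation can make the sampling distribution of Xbar concrete. The sketch below assumes a hypothetical, skewed population of 500 invoice sales-tax amounts and draws with replacement, so that the standard deviation of Xbar is σ/√n without a finite population correction. The population values, n, and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical finite population: sales tax on N = 500 invoices (right-skewed).
population = [round(rng.expovariate(1 / 40.0), 2) for _ in range(500)]
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 50       # sample size
reps = 4000  # number of repeated samples

# Each repetition draws n invoices with replacement and records Xbar.
xbars = [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

mean_of_xbars = statistics.fmean(xbars)  # should be close to mu
sd_of_xbars = statistics.stdev(xbars)    # should be close to sigma / sqrt(n)
theoretical_se = sigma / math.sqrt(n)
```

A histogram of `xbars` would look roughly bell-shaped even though the population itself is skewed.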
Consider Xbar - a parametric statistic
• Type 2 sample - the complete set of miles per gallon observations made by you since buying your Lexus where X = mpg for your Lexus in a given week
• Xbar is the average mpg for n observations of X
• Xbar is an estimator of the expected value (μX) of the RV X
• Xbar is a test statistic for testing hypotheses

H0: μX = μ0

• Xbar is a random variable with sampling distribution asymptotically normal as n increases with mean μX and standard deviation σX/√n
X in the Type 1 sample
• If X from a Type 1 sample is regarded as a random variable, then it has the discrete uniform distribution
• Prob [X = x] = 1/N for all x in the population (where the N values of x are assumed to be unique)
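As a quick check of the discrete uniform claim, the sketch below draws one unit at random, many times, from a hypothetical population of N = 5 unique values and compares the relative frequencies with 1/N.

```python
import random
from collections import Counter

rng = random.Random(2)

population = [12.5, 7.0, 3.25, 9.75, 15.0]  # hypothetical population, N = 5 unique values
N = len(population)

draws = 50_000
# Each draw selects one population unit with equal probability 1/N.
counts = Counter(rng.choice(population) for _ in range(draws))
rel_freq = {x: counts[x] / draws for x in population}
```

Every entry of `rel_freq` should be near 1/N = 0.2.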
Order statistics of rank k - a nonparametric statistic
• the kth order statistic is the kth smallest observation
• the first order statistic is the smallest observation in a sample
• the nth order statistic is the largest
• Large body of literature on sampling distributions of order statistics
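A minimal sketch of order statistics follows, with a hypothetical five-observation sample. The second half illustrates one standard result from that literature: for n iid observations with cdf F, the largest order statistic has cdf F(x)^n, so for five Uniform(0, 1) draws P(X(5) <= 0.8) = 0.8^5.

```python
import random

def order_statistic(sample, k):
    """Return the kth smallest observation, with rank k running from 1 to n."""
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    return sorted(sample)[k - 1]

sample = [23.1, 19.4, 25.0, 21.7, 20.3]  # hypothetical observations
smallest = order_statistic(sample, 1)            # first order statistic
middle = order_statistic(sample, 3)              # third order statistic
largest = order_statistic(sample, len(sample))   # nth order statistic

# Simulated sampling distribution of the largest of 5 iid Uniform(0, 1) draws.
rng = random.Random(3)
reps = 20_000
hits = sum(
    order_statistic([rng.random() for _ in range(5)], 5) <= 0.8
    for _ in range(reps)
)
empirical = hits / reps  # simulated P(X(5) <= 0.8)
exact = 0.8 ** 5         # theoretical value F(x)**n
```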
Estimation
• Definitions
• EDF
• pth sample quantile
• sample mean, variance, and standard deviation
• unbiased estimators (S² and s²)
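The definitions above can be sketched as follows. Two assumptions are made here: the pth sample quantile is taken as the smallest observation whose EDF value is at least p (textbooks differ on the convention), and the two variance estimators are taken to be the divisor-n and divisor-(n-1) versions. The data are hypothetical.

```python
import math
import statistics

def edf(sample, x):
    """Empirical distribution function: fraction of observations <= x."""
    return sum(1 for xi in sample if xi <= x) / len(sample)

def sample_quantile(sample, p):
    """Smallest observation whose EDF value is at least p (one convention of several)."""
    ordered = sorted(sample)
    k = max(1, math.ceil(p * len(ordered)))
    return ordered[k - 1]

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # hypothetical data
n = len(sample)

xbar = statistics.fmean(sample)
var_div_n = statistics.pvariance(sample)        # divisor n (biased as an estimator of the variance)
var_div_n_minus_1 = statistics.variance(sample) # divisor n - 1 (unbiased)
sd = statistics.stdev(sample)                   # square root of the divisor n - 1 variance

median_estimate = sample_quantile(sample, 0.5)
edf_at_4 = edf(sample, 4.0)
```

The two variance estimators differ only by the factor n/(n-1), which matters for small n and vanishes as n grows.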
Intervals for parameter estimation
• (point estimate - q*standard error of the estimator, point estimate - r*standard error of the estimator) where r is the α/2 quantile and q is the (1-α/2) quantile from the sampling distribution of the standardized estimator, (point estimate - parameter)/standard error
• r equals -q in symmetric distributions with mean 0 (z = +/- 1.96 or t = +/-2.02581)
• r does not equal -q in skewed distributions such as Chi squared and F
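For the symmetric case, where r = -q and the interval is point estimate ± q*standard error, a minimal sketch using the standard normal quantiles might look like this. It assumes n is large enough for the normal approximation; the data are hypothetical and `z_interval` is a name made up for this sketch.

```python
import math
import statistics
from statistics import NormalDist

def z_interval(sample, alpha=0.05):
    """Normal-based interval for the mean: estimate +/- q * standard error."""
    n = len(sample)
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    q = NormalDist().inv_cdf(1 - alpha / 2)  # (1 - alpha/2) quantile, about 1.96 for alpha = 0.05
    return estimate - q * std_error, estimate + q * std_error

r = NormalDist().inv_cdf(0.025)  # alpha/2 quantile of the standard normal
q = NormalDist().inv_cdf(0.975)  # (1 - alpha/2) quantile; equals -r by symmetry

# Hypothetical weekly mpg observations.
mpg = [24.1, 22.8, 25.3, 23.9, 24.7, 23.2, 25.0, 24.4, 22.9, 24.8,
       23.6, 24.0, 25.1, 23.3, 24.5, 23.8, 24.9, 23.1, 24.2, 24.6]
lower, upper = z_interval(mpg)
```

For skewed sampling distributions such as chi squared, the two quantiles would have to be looked up separately because r is not -q.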
Sampling distribution of the estimator
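• Parametric procedures - sampling distribution assumed normal, or approximately normal by the Central Limit Theorem when the sample size is large
• Xbar is approximately normal if n is large
• (Xbar - μ)/(s/√n), where s is the sample standard deviation, has a t distribution with n-1 degrees of freedom if X is normal and σ is unknown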
• Xbar’s distribution is unknown if X’s distribution is unknown and n is small
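Written out, the first two cases refer to standardized versions of Xbar. The notation below is a sketch that assumes iid observations with mean μ and finite standard deviation σ, and takes s to be the divisor n-1 sample standard deviation.

```latex
\begin{align*}
% Large n, any distribution with finite variance (Central Limit Theorem)
\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} &\xrightarrow{\;d\;} N(0, 1)
  \quad \text{as } n \to \infty \\[4pt]
% X normal, sigma unknown and replaced by the sample standard deviation s
T = \frac{\bar{X} - \mu}{s / \sqrt{n}} &\sim t_{n-1},
  \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2
\end{align*}
```

In the third case neither statement applies, which is where distribution-free methods become attractive.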
Sampling distribution of the estimator
• Nonparametric distribution-free procedures, i.e., the sampling distribution of the statistic (estimator or test statistic) is “free” from the distribution of X
• rank order statistics
• bootstrapped distributions - α/2 and 1-α/2 quantiles
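One way to read the bootstrap bullet is the percentile interval: resample the data with replacement many times, recompute the statistic each time, and take the α/2 and 1-α/2 quantiles of the results. The sketch below assumes that reading; the data, the number of resamples, and the function name are hypothetical.

```python
import random
import statistics

def bootstrap_percentile_interval(sample, stat, alpha=0.05, reps=2000, seed=0):
    """Percentile bootstrap interval for stat, using reps resamples of the data."""
    rng = random.Random(seed)
    n = len(sample)
    # Recompute the statistic on each resample (drawn with replacement), then sort.
    boot = sorted(stat(rng.choices(sample, k=n)) for _ in range(reps))
    lo_idx = round((alpha / 2) * reps)          # position of the alpha/2 quantile
    hi_idx = round((1 - alpha / 2) * reps) - 1  # position of the 1 - alpha/2 quantile
    return boot[lo_idx], boot[hi_idx]

# Hypothetical skewed sample; the statistic of interest is the median.
data = [3.1, 4.7, 2.2, 9.8, 5.0, 3.9, 14.2, 4.4, 6.1, 2.9, 7.3, 5.5, 3.3, 8.0, 4.1]
lower, upper = bootstrap_percentile_interval(data, statistics.median)
```

No assumption about the distribution of X enters the calculation; only the observed sample is used.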
Parametric vs nonparametric sampling distributions
• Exact distributions with approximate models
• Exact distributions with exact models (but usually small samples)

or

• Asymptotic distributions with exact models