Statistics [1/2,3/2]

Statistics [1/2,3/2] • The Essential Mathematics

Standard Error • What standard deviation is to an individual observation (relative to the population mean), standard error is to a sample mean (relative to the population mean) • Standard error of the mean = standard deviation / sqrt(n), where n is the sample size • Every estimated parameter has a standard error associated with it...we use them to “normalize” statistical tests

Short Exercise • What is the mean of {1,2,3,4,5}? • Now, let’s take all possible triplets: • {1,2,3}, {1,2,4}, {1,2,5}, {1,3,4}, {1,3,5}, {1,4,5}, {2,3,4}, {2,3,5}, {2,4,5}, {3,4,5}

Short Exercise • Mean of {1,2,3,4,5} = 3 • Std. Dev (sample) = 1.58114 • Now, let’s take the means of all possible triplets: • 2, 7/3, 8/3, 8/3, 3, 10/3, 3, 10/3, 11/3, 4 • Mean = 3, Std. Dev (sample) = .60858 • Maximum offset from the mean: 1 (was originally 2) • Message: averaging over a group reduces the Std. Dev, hence we have standard error
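The exercise above can be checked by brute force. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
from itertools import combinations
from statistics import mean, stdev

population = [1, 2, 3, 4, 5]

# Mean of every way to choose 3 distinct values (10 triplets in total).
triplet_means = [sum(t) / 3 for t in combinations(population, 3)]

pop_mean = mean(population)
pop_sd = stdev(population)          # sample standard deviation (divides by n - 1)
means_mean = mean(triplet_means)
means_sd = stdev(triplet_means)     # sample standard deviation of the 10 means

# Largest distance of any value from the mean, before and after grouping.
max_offset_individual = max(abs(x - pop_mean) for x in population)
max_offset_triplet = max(abs(m - pop_mean) for m in triplet_means)

print(f"individuals:   mean={pop_mean}, sd={pop_sd:.5f}, max offset={max_offset_individual}")
print(f"triplet means: mean={means_mean:.5f}, sd={means_sd:.5f}, max offset={max_offset_triplet:.5f}")
```

The spread of the triplet means is smaller than the spread of the individual values, which is the point of the slide.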

Statistical Tests • Null hypothesis: A hypothesis of no change • Alternate hypothesis: A hypothesis of change • All stats tests assume “no change from something”...the goal is to find enough evidence to reject that assumption...

Common Tests • Skewness and Kurtosis • Z-Test / T-Test • ANOVA / F-Test • Correlation Test

Standard Error Skewing

Who’s Skewed?

Standard Error Skewing

Who’s Kurtic?

Central Limit Theorem • Population distribution X (with finite variance) • Draw many random samples, each of size n (large), and compute the mean of each sample • The distribution of these sample means will be approximately Gaussian, whatever the shape of X; this is the distribution we call Normal
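A small simulation can make the theorem concrete. This sketch assumes an exponential population (strongly skewed, mean 1, standard deviation 1); the population choice, sample size, number of trials, and seed are all illustrative assumptions, not values from the slides.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)      # fixed seed so the run is repeatable
n, trials = 50, 2000        # sample size and number of samples (assumed values)

# Draw `trials` samples of size n from an exponential population and keep each mean.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(trials)
]

center = mean(sample_means)   # expected to sit near the population mean, 1.0
spread = stdev(sample_means)  # expected to sit near 1 / sqrt(n), about 0.141

# For a Gaussian, roughly 68% of values fall within one standard deviation.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / trials

print(f"center={center:.3f}, spread={spread:.3f}, within one sd={within_one_sd:.2%}")
```

Even though the individual draws are far from normal, the sample means cluster symmetrically around the population mean with a spread close to the standard error.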

Normal Distribution

Z-Test • Assumes normality • Either you know it should be normal, or you have enough of a sample size to use the Central Limit Theorem • (observed - mean)/(std. dev / sqrt(n)) • This equation is generalized for sample means of sample size n (individual is n = 1)

Example • A group of 9 people takes an IQ test. The population is known to follow a normal distribution with average score of 100 on the same test with a standard deviation of 15. The group of 9 averaged a score of 105. Should we assume that this group differs from the population of test takers?

Calculation • (sample mean - population mean) = 5 • (std. dev)/sqrt(9) = 5 • z = 5/5 = 1 • What does this 1 mean?
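The same calculation as a short sketch; the function name `z_score` is hypothetical and chosen only for illustration.

```python
from math import sqrt

def z_score(observed_mean, population_mean, population_sd, n=1):
    """z statistic for a sample mean of size n (n = 1 is a single individual)."""
    standard_error = population_sd / sqrt(n)
    return (observed_mean - population_mean) / standard_error

# IQ example from the slides: 9 people, group mean 105, population mean 100, sd 15.
z = z_score(105, 100, 15, n=9)
print(z)  # 1.0
```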

Generalization • The z-score maps an arbitrary Gaussian distribution down to the Gaussian distribution with mean 0 and standard deviation 1 • It’s a value that helps us find another value

p-value • Every statistical test has a p-value • The probability, assuming the null hypothesis holds, of seeing an observation at least as extreme as the one obtained • In other words, how extreme the observation is relative to others of its kind • z = 1 links to a cumulative probability of .8413 (observations below it), or a tail probability of .1587 (observations above it) • Not something very extreme
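The lookup from a z value to a probability can be sketched with `statistics.NormalDist` from the Python standard library. Whether a one-sided or two-sided p-value is appropriate depends on the alternate hypothesis; the slides do not say which is intended, so all three numbers are shown.

```python
from statistics import NormalDist

z = 1.0
standard_normal = NormalDist()            # mean 0, standard deviation 1

lower_tail = standard_normal.cdf(z)       # P(Z <= z)
upper_tail = 1 - lower_tail               # P(Z > z), one-sided p-value
two_sided = 2 * min(lower_tail, upper_tail)

print(round(lower_tail, 4), round(upper_tail, 4), round(two_sided, 4))
# 0.8413 0.1587 0.3173
```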

α-level • Every statistical test has an alpha level • The threshold at which you reject the null hypothesis in favor of the alternate hypothesis: reject when the p-value falls below alpha • This defines how you handle the p-value • Otherwise known as the Type 1 Error rate (the probability of falsely rejecting a true null hypothesis)
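Putting the pieces together for the IQ example, a minimal sketch of the decision rule might look like this. The alpha of 0.05 and the one-sided alternative (the group scores higher than the population) are assumptions made for illustration; the slides do not fix either.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(observed_mean, population_mean, population_sd, n, alpha=0.05):
    """Return (z, p_value, reject) for the alternative 'mean is greater'."""
    z = (observed_mean - population_mean) / (population_sd / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)     # upper-tail probability
    return z, p_value, p_value < alpha    # reject H0 only if p is below alpha

z, p_value, reject = one_sided_z_test(105, 100, 15, 9)
print(f"z={z:.2f}, p={p_value:.4f}, reject H0: {reject}")
# z=1.00, p=0.1587, reject H0: False
```

With these assumptions the group of 9 does not give enough evidence to say it differs from the population.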

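T-test • A test for when the population standard deviation is unknown and must be estimated from the sample (typically a small one); the data are still assumed to be roughly normal • Behaves just like a z-test, but has a different distribution to work from (Student’s t) • That distribution depends on the degrees of freedom (n - 1 for a single sample)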
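A one-sample t statistic can be sketched the same way as the z statistic, with the sample standard deviation standing in for the population value. The nine scores below are hypothetical data invented for illustration (chosen so their mean is 105); they are not from the slides.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)   # sd estimated from the sample itself
    t = (mean(sample) - hypothesized_mean) / standard_error
    return t, n - 1

scores = [96, 104, 110, 99, 112, 107, 101, 109, 107]  # hypothetical IQ scores
t, df = one_sample_t(scores, 100)
print(f"t={t:.3f} with {df} degrees of freedom")
```

The t value is then compared against a t distribution with the reported degrees of freedom rather than against the standard normal. The standard library has no t distribution, so the p-value lookup is left to a table or a statistics package.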

ANOVA • A way to test whether or not there is a difference based upon some factor in a study • Partitions variance into sources (between groups and within groups) and uses the ratio of the two, the F statistic, as the determining factor
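The partitioning of variance can be sketched for the one-way case. The three groups below are hypothetical numbers chosen so the arithmetic is easy to follow by hand; the function name is also illustrative.

```python
from statistics import mean

def one_way_anova(groups):
    """Partition the total sum of squares and return the pieces plus F."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean (the factor's effect).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean (noise).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical data, 3 levels of one factor
ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(groups)
print(f"SS between={ss_b:.1f}, SS within={ss_w:.1f}, F({df_b},{df_w})={f_stat:.1f}")
```

A large F means the variation between groups is large relative to the variation within them; the p-value comes from the F distribution with the two degrees of freedom shown.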

One-Way ANOVA

Two-Way ANOVA

Example ANOVA

Example • It has always been said that a hitter who bats from the opposite side to the pitcher’s throwing hand will succeed at a higher rate • Does this claim hold water?

Example • Managers frequently set their lineups on the principle that they do not want left-handed hitters back-to-back because a left-handed specialist (almost always an LHP) can be used to get consecutive outs, yet righties are frequently stacked without concern. • Are these managers paranoid, or is there some merit to this?

Sample Set • 30 of the top 75 qualifying hitters for MLB batting titles in 2012 were selected • Top 10 right-handed hitters • Top 10 left-handed hitters • Top 10 switch hitters (both left and right) • Batting average against LHP and batting average against RHP were recorded for each of these 30 hitters

Let’s check it out!

Correlation Test • I got an r-value from a regression that I performed • What does it tell me? • Long story short, it depends on the sample size

Correlation Test Statistic • H0: correlation (r) = ρ • HA: correlation is < or > ρ
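The slide does not preserve the formula for the test statistic, so the sketch below uses one common choice, the Fisher z-transformation, as an assumption rather than as the presenter’s method. The function name, the observed r of 0.3, and the sample sizes are all hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def correlation_z_test(r, n, rho0=0.0):
    """Fisher z-test of H0: correlation = rho0. Returns (z, two-sided p-value)."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = (atanh(r) - atanh(rho0)) * sqrt(n - 3)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# The same observed correlation tested against rho0 = 0 at two sample sizes.
for n in (13, 100):
    z, p = correlation_z_test(0.3, n)
    print(f"n={n}: z={z:.3f}, p={p:.4f}")
```

The same r gives very different p-values at the two sample sizes, which is why the interpretation of an r-value depends on how many observations produced it.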

Interesting Picture • [Figure; region labels: Positively Correlated, but could be perfect positive model • Positively Correlated • Not Correlated, but could be perfect positive model • Not Correlated • No Clue • Not Correlated, but could be perfect negative model • Negatively Correlated • Negatively Correlated, but could be perfect negative model]

What did we learn? • When dealing with correlation studies, make sure you have at least 13 observations • You can disassociate no correlation from the possibility of a perfect model at this sample size (at 95% confidence) • With more confidence, you will need more observations to achieve this • A little correlation goes a long way in large samples • With small samples, more correlation is required to make a claim

Assignment • Given the definition of outliers for a population: • Below 25th percentile - 1.5(IQR) • Above 75th percentile + 1.5(IQR) • Determine what the z-scores of the minimum outliers on either side would be • I will send you an ANOVA table: • Tell me the factorial environment • A has a levels • B has b levels • How many subjects per block n
