Learn the concept of confidence intervals in sampling and inference through bootstrapping methods. Find out how to make statistical claims confidently. Watch informational videos and engage in resampling activities to enhance your understanding.
Taking a sample from a population. Taking 1000 samples.
1000 SAMPLES OF SIZE 10 FROM THE ENTIRE POPULATION. The true distribution of your population: when sampling, you don't know this. Otherwise, why would we need to take a sample when we have a census? Every time a new sample is drawn from the above population and a mean is found for that sample, we put a blue vertical line here. In this case we have taken 1000 samples and drawn 1000 blue lines, one for every mean found. When sampling, you don't know this either: who has the time or money to take 1000 samples? When sampling, you don't know this: here there is a dot for every mean found instead of a blue line. This is good because it gives you a better idea of which mean heights came up more frequently. The only thing you have is ONE box plot from ONE random sample.
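The "1000 samples of size 10" picture can be imitated on a computer. The sketch below is only an illustration: the population of heights is made up (normally distributed around 170 cm), and the function name `sample_means` is hypothetical. In real sampling you never get to see this population, which is exactly the point of the slide.

```python
import random
import statistics

def sample_means(population, n_samples=1000, sample_size=10, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population of heights in cm (invented for illustration only).
pop_rng = random.Random(1)
population = [pop_rng.gauss(170, 10) for _ in range(5000)]

means = sample_means(population)

# Each entry in `means` plays the role of one blue line (or one dot).
print("number of sample means:", len(means))
print("population mean:", round(statistics.mean(population), 1))
print("mean of the sample means:", round(statistics.mean(means), 1))
print("spread of population:", round(statistics.stdev(population), 1))
print("spread of sample means:", round(statistics.stdev(means), 1))
```

The sample means cluster around the population mean and are much less spread out than the individual heights, which is what the pile of blue lines or dots shows.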
PROBLEM: I wonder if there is a difference between the mean height of male workers at CHEDDERCO and the mean height of female workers at CHEDDERCO.
LEARNING TO RESAMPLE. Watch the description of the activity at https://youtu.be/mq3HAe_f0eo first. You will need data cards: double-click the icon on the right or download them from the website. You will also need this Excel spreadsheet:
What you have just done is called re-sampling, and every mean you found is a resampled mean. The computer resamples 1000 times, taking note of the difference in means (or medians) every time: https://youtu.be/DnNNd9lw7eo
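The re-sampling that the computer repeats 1000 times can be sketched roughly as follows. This is not the software used in the video. It is a minimal illustration that assumes re-sampling means drawing with replacement from your one sample, and that the interval is read off the middle 95% of the resampled differences. The heights, the names `bootstrap_diff_means` and `percentile_interval`, and the group sizes are all invented for the example.

```python
import random
import statistics

def bootstrap_diff_means(group_a, group_b, n_resamples=1000, seed=0):
    """Resample each group with replacement and record the difference in means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    return diffs

def percentile_interval(values, level=0.95):
    """Return the middle `level` share of the values as (low, high)."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    low = ordered[round(tail * len(ordered))]
    high = ordered[round((1 - tail) * len(ordered)) - 1]
    return low, high

# Hypothetical sample heights in cm (made up, not real CHEDDERCO data).
male = [178, 172, 181, 169, 175, 183, 177, 170, 174, 180]
female = [165, 160, 171, 158, 167, 163, 169, 162, 166, 161]

diffs = bootstrap_diff_means(male, female)
low, high = percentile_interval(diffs)
print("resampled differences recorded:", len(diffs))
print("bootstrap interval for male minus female mean height:",
      round(low, 1), "to", round(high, 1))
```

If the whole interval sits above zero, the resampled differences all point the same way, which is the kind of evidence used to make a call about the population.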
OUR INFERENCE IS MADE BASED ON THE BOOTSTRAPPED CONFIDENCE INTERVAL
CAVEAT…you CANNOT say there is a 95% chance the population mean is in this interval
• This is a misunderstanding.
• For example, let's say we were estimating the heights of Chida High students.
• Our 95% CI comes out as an interval between 120 and 140 cm.
• If we say we are 95% sure the mean height of Chida High students is in the interval, we are basically saying:
• P(mean of all Chida High students is between 120 and 140 cm) = 0.95
• This doesn't make sense, because the population mean is a fixed number, not a random one: it either is or isn't in this interval.
CAVEAT…you CANNOT say there is a 95% chance the population mean is in this interval
• Remember, every interval is different. If I took a different sample I would get A NEW SAMPLE MEAN (and therefore a new centre for the interval) and A NEW SAMPLE STANDARD DEVIATION (and therefore a slightly different width for the interval).
• So what we can say is about the method, not about any one interval: each sample gives a confidence interval with its own centre, depending on its sample mean, and its own width, depending on its variation.
• If many samples were taken, and a confidence interval were created each time, about 95% of those intervals would include the POPULATION MEAN.
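The "95% of intervals" idea can be checked with a small simulation. This sketch makes several assumptions that are not in the slides: a made-up population with mean 170 cm and standard deviation 10 cm, samples of size 100, and the common approximate interval of sample mean plus or minus 1.96 standard errors rather than a bootstrapped one. The function name `coverage_rate` is hypothetical.

```python
import math
import random
import statistics

def coverage_rate(true_mean=170.0, true_sd=10.0, sample_size=100,
                  n_intervals=2000, seed=0):
    """Share of approximate 95% intervals that contain the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_intervals):
        # A new sample gives a new centre and a new width every time.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
        centre = statistics.mean(sample)
        half_width = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / n_intervals

rate = coverage_rate()
print("share of intervals containing the population mean:", rate)
```

The population mean never moves. What changes from run to run is the interval, and roughly 95 out of every 100 intervals end up catching it.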
13 STATS When talking about confidence intervals, we like to start our sentence with "I can make the call that…". NCEA considers this expression to show a lot of confidence in the answer, but not to the point of absolute certainty. If you really want to include 95%, the only thing you can say is, "With 95% confidence, I can claim…"