
Review for Exam Three



  1. Review for Exam Three • In the first section of the course, we studied descriptive statistics, such as frequency distributions and measures of central tendency, as tools for describing the characteristics of the sample in our data set. In the second section, we studied descriptive statistics for measuring and describing relationships between variables. • In this section of the course, we focused on topics that support the transition from descriptive statistics to inferential statistics. Inferential statistics are designed to draw conclusions about the population rather than simply describing the sample.

  2. Topics from the Text - 1 • The Normal Distribution • Probability and the area under the normal curve • Computing and interpreting z-scores • Computing and interpreting percentile ranks • Sampling and Sampling Distributions • Probability sampling methods • Concept of sampling distributions • The Central Limit Theorem • Estimation • Point and interval estimation • Factors affecting the width of confidence intervals (continued)

  3. Topics from the Text - 2 • Testing Hypotheses • Assumptions in hypothesis testing • Research and null hypotheses • One-tailed and two-tailed tests • Alpha, or level of significance • Test statistic and sampling distribution • P value versus alpha • Decision about null hypothesis and interpretation of the research hypothesis

  4. The normal distribution • The normal distribution is a bell-shaped, symmetrical distribution whose shape is defined by the mean and standard deviation of the distribution. • To compute probabilities for raw scores, we convert the raw scores to standard scores, or z-scores. • A z-score is the difference between an individual raw score and the mean, divided by the standard deviation; it is the number of standard deviation units above or below the mean. • Z-scores correspond to known normal curve probabilities. • Benchmark z-scores: • The mean has a z-score of 0 • About 2/3 (68%) of cases fall between z-scores of -1 and +1 • About 95% of cases fall between z-scores of -2 and +2 • About 99.7% of cases fall between z-scores of -3 and +3
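The z-score computation described above can be sketched in Python; the scores below are hypothetical examples, not taken from the slides:

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units a raw score lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

# A raw score of 85 in a distribution with mean 70 and standard deviation 10
print(z_score(85, 70, 10))  # 1.5
print(z_score(70, 70, 10))  # 0.0  (the mean always has a z-score of 0)
```

A positive z-score can then be looked up in a normal-curve table to get the probability of a score at least that extreme.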

  5. Using the normal distribution • The normal distribution will be used to determine the probability of obtaining a specific statistical result (e.g. a sample mean) from the population represented by the sample in the data set. • If the population variable is normally distributed, the distribution of all possible sample means will be normally distributed, and our probabilities will be accurate. • To test whether a variable is normally distributed, we check that the skewness and kurtosis of the distribution are between -1.0 and +1.0. • If the variable is not normally distributed by this criterion, we can see if the Central Limit Theorem applies.
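The skewness/kurtosis screening rule can be sketched as follows. These are the simple population-moment formulas; statistical packages such as SPSS use slightly different sample-adjusted formulas, so treat this as an illustration of the rule rather than a reproduction of any package's output:

```python
import statistics

def skewness(xs):
    """Third standardized moment: 0 for a symmetric distribution."""
    m, s, n = statistics.fmean(xs), statistics.pstdev(xs), len(xs)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3: 0 for a normal distribution."""
    m, s, n = statistics.fmean(xs), statistics.pstdev(xs), len(xs)
    return sum((x - m) ** 4 for x in xs) / (n * s ** 4) - 3

def roughly_normal(xs):
    """Screening rule from the slides: both statistics between -1.0 and +1.0."""
    return -1.0 <= skewness(xs) <= 1.0 and -1.0 <= excess_kurtosis(xs) <= 1.0

data = [2, 3, 3, 4, 4, 4, 5, 5, 6]  # symmetric, mound-shaped toy data
print(roughly_normal(data))  # True
```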

  6. The Central Limit Theorem • The Central Limit Theorem tells us that with a large enough sample (50 or more), the sampling distribution of means will be normal, regardless of the shape of the population distribution.
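The theorem can be illustrated by simulation: draw repeated samples of size 50 from a strongly skewed population and look at the distribution of their means. The population here is simulated exponential-like data, chosen only because it is far from normal:

```python
import random
import statistics

random.seed(42)
# A strongly right-skewed "population" of 10,000 exponential-like values
population = [random.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.fmean(population)

# Means of 2,000 samples of size 50: this collection is approximately normal
sample_means = [statistics.fmean(random.sample(population, 50)) for _ in range(2_000)]

# The sampling distribution of the mean centers on the population mean
print(abs(statistics.fmean(sample_means) - pop_mean) < 0.05)  # True
```

Plotting `sample_means` as a histogram would show the familiar bell shape even though the population itself is heavily skewed.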

  7. Sampling • If we are to use our sample data to make statements about the population from which our sample is drawn, the sample must be representative of the population. To be representative, each case in the population must have a known probability of being included in the sample. • Probability sampling strategies • Simple random sampling • Systematic random sampling • Stratified random sampling
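The three probability-sampling strategies listed above can be sketched as follows. The population of ID numbers and the stratum split are hypothetical; `random.sample` implements simple random sampling directly:

```python
import random

def simple_random_sample(population, n):
    """Every case has an equal chance of being selected."""
    return random.sample(population, n)

def systematic_sample(population, n):
    """After a random start, take every k-th case in the list."""
    k = len(population) // n
    start = random.randrange(k)
    return population[start::k][:n]

def stratified_sample(strata, n_per_stratum):
    """Draw a simple random sample independently within each stratum."""
    return [x for stratum in strata for x in random.sample(stratum, n_per_stratum)]

random.seed(7)
ids = list(range(100))
print(len(simple_random_sample(ids, 10)))                # 10
print(len(systematic_sample(ids, 10)))                   # 10
print(len(stratified_sample([ids[:50], ids[50:]], 5)))   # 10
```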

  8. Using Sample Data to Draw Conclusions about a Population • Inferential statistics use sample data to make statements and draw conclusions about the population represented by the sample. • If we had data for the entire population, we could be absolutely 100% certain that our statements about the population are correct. • Since we don’t have data for the entire population, there is always a chance, or probability, that our statements based on sample data will be incorrect.

  9. One Population – Many Possible Samples • From any given population, it is possible to draw many, many representative samples. • Each sample will include people with different scores for the variable. • The statistics calculated for each sample will have different values, e.g. one sample may have a mean GPA of 2.75, a second a mean GPA of 2.72, and a third a mean GPA of 2.79.

  10. Sampling Distributions • In reality, we draw only one sample from a population. We use the statistic calculated for this sample as our estimate of the population parameter. • We know that if we had drawn a different sample, our estimate for the population would be slightly different. What we need is a method or tool for comparing the accuracy of the statistic for any possible sample to the population from which it was drawn so that we can distinguish good estimates from poor estimates. • The tool or method that supports comparisons between sample statistics and populations is the sampling distribution. • Statisticians have found that sampling distributions for statistical measures, such as means, follow known probability distributions, such as the normal curve.

  11. Using Sampling Distributions • Using a sampling distribution that follows a normal curve, we can determine the probability that we could obtain any particular statistical measure from a sampling distribution with specified values for central tendency and dispersion. • We do this in the same way that we used a z-score to compare the score of any single subject to the distribution of scores for the entire sample. • Good estimates of population parameters have a high probability of being correct.
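The comparison works exactly like an individual z-score, except that the unit is the standard error of the mean rather than the standard deviation. A minimal sketch, using the hypothetical GPA numbers from the earlier slide:

```python
import math

def z_for_sample_mean(sample_mean, pop_mean, pop_sd, n):
    """How many standard errors the sample mean lies from the population mean."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# A sample of 100 students with mean GPA 2.79, compared to a population
# with mean 2.75 and standard deviation 0.5 (all values hypothetical)
print(round(z_for_sample_mean(2.79, 2.75, 0.5, 100), 2))  # 0.8
```

A z of 0.8 is well inside the bulk of the sampling distribution, so a sample mean of 2.79 would be quite likely from this population.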

  12. Using Sampling Distributions • If the difference between the sample statistic and the population parameter is small (relative to the differences we tend to find among statistical values for different samples), a sample with this value has a high probability of occurring. This high probability of occurring is evidence that the sample statistic and the population parameter belong to the same distribution, and that the sample statistic is a reasonably good estimate of the population value. • If the difference between the sample statistic and the population parameter is large (relative to the differences we tend to find among statistical values for different samples), the sample statistic has a low probability of being drawn from that population. This low probability of occurring is evidence that the sample statistic and the population parameter belong to different distributions. The sample statistic is not a good estimate of the population value.

  13. Using the Probability Relationship between Samples and Populations • The probability that a sample value and a population value belong to the same distribution is used in two ways: • First, we use this information to estimate population values, or parameters, based on sample data. • Second, we use this information to test hypotheses about population values based on sample data. • In both instances, our goal is to minimize the probability that we are making an incorrect statement about the value in the population.

  14. Parameter Estimation • There are two types of estimates of population parameters: • 1) point estimates, or our best guess of the exact value in the population • 2) interval estimates, or a range of scores which we are willing to claim contains the true population value.

  15. Point Estimates • The point estimate of the population value is the sample value. • Though we can’t compute the probability that the population value would be exactly equal to the sample value, intuitively we know that the chance that we are exactly right is low. • We would have a better chance of being correct if we guessed that the population value fell within some range of scores, rather than being exactly equal to a specific score.

  16. Interval Estimates • An interval estimate consists of all the values that have a high probability of belonging to the sampling distribution associated with a particular measure of central tendency and dispersion. • An interval estimate of 2.25 to 3.25 means that we believe the true population value is 2.25 or 2.26 or 2.27, etc. up to 3.25. We don’t know exactly which number it is, just that it falls between the numbers 2.25 and 3.25.

  17. Computing an Interval Estimate • To compute an interval estimate, we use the sample mean, the sample standard deviation, and normal curve probabilities to obtain our estimate of the range of scores we believe contains the true population mean. • The width of a confidence interval varies according to the probability we want to associate with our estimated range. The more certain we want to be that our interval contains the true population mean, e.g. 99% rather than 95% or 90%, the wider the range of scores in the interval estimate will be.
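A large-sample confidence interval for the mean can be sketched as follows. The z critical values are the standard normal-curve cutoffs (1.645 for 90%, 1.96 for 95%, 2.576 for 99%), and the GPA numbers are hypothetical:

```python
import math

def confidence_interval(sample_mean, sample_sd, n, z_crit=1.96):
    """Interval estimate: sample mean +/- z * standard error (default 95%)."""
    margin = z_crit * sample_sd / math.sqrt(n)
    return (sample_mean - margin, sample_mean + margin)

lo95, hi95 = confidence_interval(2.75, 0.5, 100)           # 95% interval
lo99, hi99 = confidence_interval(2.75, 0.5, 100, 2.576)    # 99% interval
print(round(lo95, 3), round(hi95, 3))   # 2.652 2.848
# Higher confidence means a wider interval
print((hi99 - lo99) > (hi95 - lo95))    # True
```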

  18. Interpreting an Interval Estimate • We can interpret the interval estimate as our confidence that the range of scores of the interval estimate contains the correct value for the population mean. • If we drew repeated samples from a population with a specific measure of central tendency and variability and computed the confidence interval for each sample, the proportion of samples which contain the true population mean is equal to the confidence level. • The probability applies to the researcher's belief or confidence that the population estimate based on the available sample does include the population mean.
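The repeated-sampling interpretation can be checked by simulation: build a 95% interval from each of many samples and count how often the interval contains the true mean. The population parameters below are made up for illustration:

```python
import math
import random
import statistics

random.seed(1)
TRUE_MEAN, TRUE_SD, N = 100.0, 15.0, 50

def ci_contains_truth():
    """Draw one sample, build its 95% CI, and check whether it covers the true mean."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    margin = 1.96 * statistics.stdev(sample) / math.sqrt(N)
    m = statistics.fmean(sample)
    return m - margin <= TRUE_MEAN <= m + margin

# Proportion of 2,000 intervals that contain the true population mean
coverage = sum(ci_contains_truth() for _ in range(2_000)) / 2_000
print(0.92 < coverage < 0.98)  # True -> coverage is close to the 95% confidence level
```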

  19. Increasing the precision of an interval estimate • If the range of scores in the confidence interval is too large to be useful, we can improve our estimate by examining the factors that influence the width of the confidence interval: • Larger sample sizes reduce the width of the interval • Smaller standard deviations reduce the width of the interval • Reducing the level of confidence narrows the width of the confidence interval

  20. Testing Hypotheses • A hypothesis is a statement that compares a value computed for our sample to a specific value for the population. • A hypothesis test uses our sample data to test the correctness of a statement about a specific population value. For example, could we have gotten our sample statistic from a population that had a specific parameter value? • Based on our sample data, the research hypothesis states that the actual population value is greater than, less than, or not equal to the specified value in the population. • We test this proposed statement against its opposite, the null hypothesis, which states that the sample value is equal to the specified value in the population; i.e., there is no real difference between the two values.

  21. Five-step model for Testing Hypotheses • The five-step model used for all hypothesis tests contains the following steps: • Step 1. Evaluating assumptions. • Step 2. Stating the research hypothesis and the null hypothesis and setting alpha • Step 3. Selecting the sampling distribution and the test statistic • Step 4. Computing the test statistic • Step 5. Making a decision about hypotheses
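Steps 3-5 of the model, for the case of a one-sample z-test against a normal sampling distribution, can be sketched as follows; the GPA numbers are hypothetical:

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n, two_tailed=True):
    """Steps 3-4: compute the z test statistic and its p-value from the normal curve."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    if two_tailed:
        p = 2 * (1 - NormalDist().cdf(abs(z)))
    else:
        p = 1 - NormalDist().cdf(z)
    return z, p

# Step 5: compare the p-value to alpha and decide about the null hypothesis
z, p = one_sample_z_test(2.90, 2.75, 0.5, 100)
print(round(z, 2), p <= 0.05)  # 3.0 True  -> reject the null hypothesis
```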

  22. Alpha • Alpha is the probability that we will make a mistake when we reject the null hypothesis. It specifies the risk we are willing to take of stating an erroneous conclusion. • The traditional value for alpha is 0.05 (risking 1 mistake in every 20 decisions, or being correct 19 times out of 20), but it can also be set at 0.10 or 0.01. The level of risk is chosen in relation to the consequences of making a mistake. • If making a mistake would have serious consequences, e.g. life-threatening or irreversible ones, we set alpha lower (0.01 or 0.001).

  23. P-value and decision-making • Every test statistic computed in a hypothesis test has a probability associated with the sampling distribution that the test statistic follows (e.g. the normal distribution). This is called the p-value (SPSS labels it Sig., for statistical significance). • The decision about rejecting the null hypothesis is made by comparing the p-value to alpha. • If the p-value <= alpha, reject the null hypothesis. • If the p-value > alpha, fail to reject the null hypothesis. • If we reject the null hypothesis, we conclude that the analysis supports the research hypothesis. If we fail to reject the null hypothesis, we conclude that the analysis does not support the research hypothesis. Since there is always a chance of error, we never say that we have proved or disproved a hypothesis.
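The decision rule above can be sketched as:

```python
def decide(p_value, alpha=0.05):
    """Compare the p-value to alpha; note that p equal to alpha still rejects."""
    return "reject null" if p_value <= alpha else "fail to reject null"

print(decide(0.03))  # reject null
print(decide(0.20))  # fail to reject null
print(decide(0.05))  # reject null
```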
