
1. Homework #2 2. Inferential Statistics 3. Review for Exam


Presentation Transcript


  1. 1. Homework #2 2. Inferential Statistics 3. Review for Exam

  2. HOMEWORK #2: Part A • Sanitation Eng.: Z = .53; area = .2019 + .50 = .7019 • F.C.: Z = .67; area = .2486 + .50 = .7486 • 5 GPAs: which are in the top 10%? • GPAs of 3.0 and 3.20 are not: • Z = (3.0-2.78)/.33 = .67 • Area beyond = .2514 (25.14%) • Z = (3.20-2.78)/.33 = 1.27 • Corresponds to .8980 (.3980+.5000) • Area beyond = .1020 (10.2%) • By contrast, for 3.21… • Z = (3.21-2.78)/.33 = 1.30 • Corresponds to .9032 (.4032+.5000) • Area beyond = .0968 (9.68%), so a GPA of 3.21 is in the top 10%
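
The GPA calculations on this slide can be checked with a short script. This is only a sketch: the mean (2.78) and standard deviation (.33) come from the slide, the function names `z_score`, `area_beyond`, and `in_top_10_percent` are made up for illustration, and Python's `statistics.NormalDist` stands in for the printed Z-score table. Z is rounded to two decimals first, as one would do before a table lookup.

```python
from statistics import NormalDist

MEAN_GPA = 2.78  # mean GPA, from the slide
SD_GPA = 0.33    # standard deviation, from the slide


def z_score(x, mean, sd):
    """Z = (score - mean) / standard deviation."""
    return (x - mean) / sd


def area_beyond(z):
    """Area under the standard normal curve above z (the upper tail)."""
    return 1.0 - NormalDist().cdf(z)


def in_top_10_percent(gpa):
    """True when less than 10% of the curve lies above this GPA."""
    z = round(z_score(gpa, MEAN_GPA, SD_GPA), 2)  # round as for a table lookup
    return area_beyond(z) < 0.10


if __name__ == "__main__":
    for gpa in (3.0, 3.20, 3.21):
        z = round(z_score(gpa, MEAN_GPA, SD_GPA), 2)
        print(f"GPA {gpa:.2f}: Z = {z:.2f}, area beyond = {area_beyond(z):.4f}")
```

Because the computed areas are not rounded the way a printed table is, they may differ from the table values in the fourth decimal place.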

  3. HOMEWORK #2: Part B • Question 1 • a. Mean=18.87; median=15; mode=4 • b. The mean is higher than the median because the distribution is positively skewed (several large cities with high percents) • c. When you remove NYC, the mean drops from 18.87 to 16.43 and the median goes from 15 to 14.5. Removing NYC’s high value from the distribution reduces the skew. • The mean decreases more than the median because the value of the mean is influenced by outlying values; the median is not, since it only moves one case over.
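
The point about outliers can be illustrated with a small sketch. The numbers below are hypothetical and are NOT the homework data; the last value simply plays the role of a high outlier such as NYC.

```python
from statistics import mean, median

# Hypothetical percentages (not the homework data); 60 is the outlier.
values = [4, 4, 10, 14, 15, 16, 20, 25, 60]
without_outlier = values[:-1]  # drop the outlying case

# How far each measure moves when the outlier is removed.
mean_shift = mean(values) - mean(without_outlier)
median_shift = median(values) - median(without_outlier)

print(f"mean:   {mean(values):.2f} -> {mean(without_outlier):.2f}")
print(f"median: {median(values)} -> {median(without_outlier)}")
print(f"mean shift = {mean_shift:.2f}, median shift = {median_shift:.2f}")
```

In this made-up data the mean falls by several points while the median moves by only half a point, which is the same pattern the homework answer describes.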

  4. HOMEWORK #2: Part B • Question 2 • For this problem, there are two measures of central tendency (indicating the “typical” score). • The mean per-student expenditure was almost $2,000 higher in 2003 ($9,009) than in 1993 ($7,050). • The median also increased, but not nearly as much (from $7,215 to $7,516). • The spread of the scores, as indicated by the standard deviation, was more than twice as large in 2003 (1,960) as in 1993 (804). • Shape • For 1993, the distribution of scores has a slight negative skew; this distribution is essentially normal (bell-shaped), as the mean ($7,050) and median ($7,215) are similar. By contrast, for 2003, the mean is much greater than the median; this distribution has a strong positive skew.
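
The shape reasoning on this slide compares the mean with the median. A minimal sketch of that rule of thumb follows; the function name `skew_direction` is hypothetical, and the dollar figures are the ones quoted on the slide.

```python
def skew_direction(mean, median):
    """Rule of thumb: the mean is pulled toward the long tail of the distribution."""
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "none"


# Mean and median per-student expenditure, as given on the slide.
expenditures = {1993: (7050, 7215), 2003: (9009, 7516)}

for year, (mean, median) in expenditures.items():
    gap = mean - median  # a larger gap suggests a stronger skew
    print(f"{year}: mean - median = {gap:+d}, skew = {skew_direction(mean, median)}")
```

The rule only gives the direction of the skew; how large a gap counts as "strong" is a judgment call relative to the spread of the scores.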

  5. HOMEWORK #2: Part B • Q3 • a. 53.28% • Opposite sides of mean, add 2 areas together • b. 6.38% • Both scores on right side of mean, subtract areas • c. 10.56% • “Column C” area for Z=1.25 is .1056 • d. 69.15% • “Column B” area for Z= -0.5 is .1915 + .5000 (for other half of normal curve) • e. 99.38% • Z=2.5; Column B (for area between 2.5 & 0) = .4938 + .5000 (for other half of normal curve) • f. 6.68% • Z = -1.5; Column C for area beyond -1.5 =.0668
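
The "Column B" and "Column C" lookups used in these answers can be reproduced in code. This is a sketch under the assumption that Column B is the area between the mean and Z and Column C is the area beyond Z; the function names are invented for illustration, and `statistics.NormalDist` replaces the printed table.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def column_b(z):
    """Area between the mean and z (same for +z and -z)."""
    return abs(STD_NORMAL.cdf(z) - 0.5)


def column_c(z):
    """Area beyond z, in the tail on z's side of the mean."""
    return 0.5 - column_b(z)


def area_between(z1, z2):
    """Area between two Z scores.

    Covers both table rules: add the Column B areas when the scores are on
    opposite sides of the mean, subtract them when they are on the same side.
    """
    return abs(STD_NORMAL.cdf(z2) - STD_NORMAL.cdf(z1))


print(f"c. area beyond Z = 1.25:  {column_c(1.25):.4f}")
print(f"d. area above Z = -0.5:   {column_b(-0.5) + 0.5:.4f}")
print(f"e. area below Z = 2.5:    {column_b(2.5) + 0.5:.4f}")
print(f"f. area beyond Z = -1.5:  {column_c(-1.5):.4f}")
```

Parts a and b are left out because the slide does not give their Z scores; `area_between` would handle them once those are known.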

  6. HOMEWORK #2: Part B • Q4 • a. .9953 • Column B area (.4953) + .5000 (for other half of normal curve) • b. .5000 • 50% of area on either side of mean (47) • c. .6826 • “Column B” for both – .3413 + .3413 • d. .9997 • Column B area (.4997) + .5000 (for other half of normal curve) • e. .0548 • “Column C” area for Z=1.6 • f. .3811 • Scores on opposite sides of mean → add “Col. B” areas

  7. HOMEWORK #2: Part C • SPSS: • All the info needed to answer these questions is contained in this output
     Statistics: HOURS PER DAY WATCHING TV
     N Valid            1426
     N Missing          618
     Mean               3.03
     Median             2.00
     Mode               2
     Std. Deviation     2.766
     Percentiles   10   1.00
                   20   1.00
                   25   1.00
                   30   2.00
                   40   2.00
                   50   2.00
                   60   3.00
                   70   3.00
                   75   4.00
                   80   4.00
                   90   6.00

  8. Distribution (Histogram) for TV Hours

  9. Sibs Distribution

  10. College Science Credits

  11. Sampling Terminology • Element: the unit of which a population is comprised and which is selected in the sample • Population: the theoretically specified aggregation of the elements in the study (e.g., all elements) • Parameter: Description of a variable in the population • σ = standard deviation, µ = mean • Sample: The aggregate of all elements taken from the population • Statistic: Description of a variable in the sample (estimate of parameter) • X̄ = mean, s = standard deviation

  12. Non-probability Sampling • Elements have unknown odds of selection • Examples • Snowballing, available subjects… • Limits/problems • Cannot generalize to population of interest (sample doesn’t adequately represent the population, i.e., bias) • Have no idea how biased your sample is, or how close you are to the population of interest

  13. Probability Sampling • Definition: • Elements in the population have a known (usually equal) probability of selection • Benefits of Probability Sampling • Avoid bias • Both conscious and unconscious • More representative of population • Use probability theory to: • Estimate sampling error • Calculate confidence intervals

  14. Sampling Distributions • Link between sample and population • DEFINITION 1 • IF a large (infinite) number of independent, random samples are drawn from a population, and a statistic is plotted from each sample… • DEFINITION 2 • The theoretical, probabilistic distribution of a statistic for all possible samples of a certain size (N)

  15. The Central Limit Theorem I • IF REPEATED random samples are drawn from the population, the sampling distribution of sample means will be approximately normal • As long as N is sufficiently large (>100) • The mean of the sampling distribution will equal the mean of the population • WHY? Because the most common sample mean will be the population mean • Other common sample means will cluster around the population mean (near misses) and so forth • Some “weird” sample findings, though rare

  16. The Central Limit Theorem II • Again, WITH REPEATED RANDOM SAMPLES, the Standard Deviation of the Sampling Distribution = σ / √N • This Critter (the population standard deviation divided by the square root of N) is “The Standard Error” • How far the “typical” sample statistic falls from the true population parameter
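
The standard error formula (population standard deviation divided by the square root of N) is simple enough to sketch directly. The value σ = 15 and the sample sizes below are hypothetical, chosen only to show how the standard error shrinks as N grows.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)


SIGMA = 15  # hypothetical population standard deviation

for n in (25, 100, 400):
    print(f"N = {n:3d}: standard error = {standard_error(SIGMA, n):.2f}")
```

Note that quadrupling the sample size only halves the standard error, because N enters the formula under a square root.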

  17. The KICKER • Because the sampling distribution is normally distributed… probability theory dictates the percentage of sample statistics that will fall within a given number of standard errors • 1 standard error on one side of the mean = 34%, or +/- 1 standard error = 68% • +/- 1.96 standard errors = 95% • +/- 2.58 standard errors = 99%

  18. The REAL KICKER • From what happens (probability theory) with an infinite # of samples… • To making a judgment about the accuracy of statistics generated from a single sample • Any statistic generated from a single random sample has a 68% chance of falling within one standard error of the population parameter • OR roughly a 95% CHANCE OF FALLING WITHIN 2 STANDARD ERRORS
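
The jump from "many samples" to "one sample" can be explored with a simulation. This is a sketch under stated assumptions, not part of the course material: the population is a hypothetical skewed (exponential) one with mean 1 and standard deviation 1, and the function name, sample size, number of repetitions, and seed are all arbitrary choices.

```python
import math
import random
from statistics import mean, pstdev


def simulate_sampling_distribution(n=100, reps=2000, seed=0):
    """Draw `reps` random samples of size `n` and summarize the sample means."""
    rng = random.Random(seed)
    pop_mean, pop_sd = 1.0, 1.0          # exact values for expovariate(1.0)
    std_error = pop_sd / math.sqrt(n)    # sigma / sqrt(N)

    # One sample mean per repetition.
    sample_means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
    ]

    def share_within(k):
        """Share of sample means within k standard errors of the population mean."""
        hits = sum(abs(m - pop_mean) <= k * std_error for m in sample_means)
        return hits / reps

    return {
        "std_error": std_error,
        "mean_of_means": mean(sample_means),
        "sd_of_means": pstdev(sample_means),
        "within_1_se": share_within(1.0),
        "within_1_96_se": share_within(1.96),
    }


if __name__ == "__main__":
    for name, value in simulate_sampling_distribution().items():
        print(f"{name}: {value:.3f}")
```

If the theory holds, the mean of the sample means should land near 1, their standard deviation near the standard error of 0.1, and the two shares near 68% and 95%, even though the population itself is not normal.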

  19. EXAM • Closed book • BRING CALCULATOR • You will have the full class period to complete it • Format: • Output interpretation • Z-score calculation problems • Memorize Z formula • Z-score area table provided • Short Answer/Scenarios • Multiple choice

  20. Review for Exam • Variables vs. values/attributes/scores • variable – trait that can change values from case to case • example: GPA • score (attribute) – an individual case’s value for a given variable • Concepts → Operationalize → Variables

  21. Review for Exam • Short-answer questions, examples: • What is a strength of the standard deviation over other measures of dispersion? • Multiple choice question examples: • Professor Pinhead has an ordinal measure of a variable called “religiousness.” He wants to describe how the typical survey respondent scored on this variable. He should report the ____. • a. median • b. mean • c. mode • d. standard deviation • On all normal curves the area between the mean and +/- 2 standard deviations will be • a. about 50% of the total area • b. about 68% of the total area • c. about 95% of the total area • d. more than 99% of the total area

  22. EXAM • Covers chapters 1 through (part of) 6: • Chapter 1 • Levels of measurement (nominal, ordinal, I-R) • Any I-R variable could be transformed into an ordinal or nominal-level variable • Don’t worry about discrete-continuous distinction • Chapter 2 • Percentages, proportions, rates & ratios • Review HW’s to make sure you’re comfortable interpreting tables

  23. EXAM • Chapter 3: Central tendency • ID-ing the “typical” case in a distribution • Mean, median, mode • Appropriate for which levels of measurement? • Identifying skew/direction of skew • Skew vs. outliers • Chapter 4: Spread of a distribution • R & Q • s² – variance (mean of squared deviations) • s • Uses every score in the distribution • Gives the typical deviation of the scores • DON’T need to know IQV (section 4.2)
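
The definition of the variance as the "mean of squared deviations" can be worked through in a few lines. The scores below are hypothetical, and the functions `variance` and `std_dev` are illustrative names; they divide by N, matching the definition on the slide, and are compared against Python's `statistics.pvariance` and `statistics.pstdev`.

```python
import math
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, not course data


def variance(xs):
    """Mean of the squared deviations from the mean (divides by N)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def std_dev(xs):
    """Square root of the variance: the typical deviation of the scores."""
    return math.sqrt(variance(xs))


score_range = max(scores) - min(scores)  # R uses only the two extreme scores

print(f"R = {score_range}")
print(f"variance = {variance(scores)}, library check = {pvariance(scores)}")
print(f"s = {std_dev(scores)}, library check = {pstdev(scores)}")
```

Unlike the range, both the variance and the standard deviation use every score. Formulas that divide by N - 1 instead of N give slightly larger values, so results from statistical software may not match this sketch exactly.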

  24. Keep in mind… • All measures of central tendency try to describe the “typical case” • Preference is given to statistics that use the most information • For interval-ratio variables, unless you have a highly skewed distribution, the mean is the most appropriate • For ordinal, the median is preferred • If the mean is not appropriate, neither is “s” • s = how far cases typically fall from the mean

  25. EXAM • Chapter 5 • Characteristics of the normal curve • Know areas under the curve (Figure 5.3) • KNOW Z score formula • Be able to apply Z scores • Finding areas under curve • Z scores & probability • Frequency tables & probability

  26. EXAM • Chapter 6 • Reasons for sampling • Advantages of probability sampling • What does it mean for a sample to be representative? • Definition of probability (random) sampling • Sampling error • Plus… • Types of nonprobability sampling

  27. Interpret • Number of cases used to calculate mean? • Most common IQ score? • Distribution skewed? Direction? • Q? • Range? • Is standard deviation appropriate to use here?

  28. Scenario • Professor Scully believes income is a good predictor of the size of a person’s house • IV? • DV? • Operationalize DV so that it is measured at all three levels (nominal, ordinal, IR) • Repeat for IV

  29. Express the answer in the proper format • Percent • Proportion • Ratio • Probability
