
Outline



Presentation Transcript


  1. Stat 350 Lab Session • GSI: Yizao Wang • Section 016: Mon 2:30-4pm, MH 444-D • Section 043: Wed 2:30-4pm, MH 444-B

  2. Outline • Two main types of studies • Time plots • Module 3 Activity 1 • QQ plots • Module 3 Activity 2 • Qwizdom questions

  3. Two Main Types of Studies • Observational studies: simple observation • Experiments: designed studies that measure the effect of a manipulation on the output of interest, with comparison

  4. Time Plots & QQ Plots • In statistical inference, we first need to check some assumptions about the data before applying inference procedures. • Ex1: We need a random sample. • Ex2: We need the population from which the sample is taken to be normally distributed.

  5. Time Plots & QQ Plots • Time plots: check randomness of quantitative data collected over time (identically distributed) • QQ plots: check normality of population

  6. Time/Sequence Plots

  7. Time/Sequence Plots • Horizontal axis: time (sequence number) • Vertical axis: response or measured variable • Distinguish time plots from histograms.

  8. Time/Sequence Plots • Purpose: check the identically distributed aspect of a random sample • How: look for evidence of stability (stability = no patterns) • If the data are not stable, histograms and numerical summaries may not be meaningful.

  9. Time/Sequence Plots Overall patterns: • Trend: a persistent, long-term rise or fall • Seasonal variation: a pattern that repeats itself at regular intervals of time • Changing variation: a persistent, long-term increase (or decrease) in the variation of the observations
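The three patterns on slide 9 can be illustrated with a small numerical sketch. The series below are made up for illustration (they are not the lab's data), and comparing the first and second halves of a series is only a rough stand-in for actually looking at the time plot.

```python
import math
import statistics

def half_summaries(series):
    """Return ((mean, sd) of first half, (mean, sd) of second half)."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return ((statistics.mean(first), statistics.stdev(first)),
            (statistics.mean(second), statistics.stdev(second)))

n = 48  # hypothetical: 48 monthly observations

# Trend: the level rises steadily over time.
trend = [10 + 0.5 * t for t in range(n)]
# Seasonal variation: a pattern that repeats every 12 observations, no trend.
seasonal = [10 + 3 * math.sin(2 * math.pi * t / 12) for t in range(n)]
# Increasing variation: the swings grow larger over time.
growing = [10 + 0.1 * t * math.sin(2 * math.pi * t / 12) for t in range(n)]

for name, s in [("trend", trend), ("seasonal", seasonal), ("growing variation", growing)]:
    (m1, s1), (m2, s2) = half_summaries(s)
    print(f"{name:18s} mean {m1:6.2f} -> {m2:6.2f}   sd {s1:5.2f} -> {s2:5.2f}")
```

In this sketch the trend series shows a shift in the mean between halves, the purely seasonal series shows neither a shift in mean nor in spread (even though it is clearly patterned), and the third series shows a larger spread in the second half. The seasonal case is a reminder that a summary comparison can miss a pattern that the plot itself makes obvious.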

  10. Time/Sequence Plots Increasing trend

  11. Time/Sequence Plots Seasonal variation

  12. Time/Sequence Plots Increasing trend, seasonal variation and increase in the variation

  13. Module 3 Activity 1 • Background 1: death rate (number of deaths per 100 million miles driven) from 1960 to 2004 p14 or deathrate.sav • Task: What do you see? Would it make sense to make a histogram of the death rates?

  14. Module 3 Activity 1 • Background 1: death rate (number of deaths per 100 million miles driven) from 1960 to 2004 p14 or deathrate.sav • Task: What do you see? Would it make sense to make a histogram of the death rates? • From the sequence plot we see that, overall, the death rate appears to be decreasing over time. It would not make sense to make a histogram of the death rates, because the decreasing trend means the time series is not stationary.

  15. Module 3 Activity 1 • Background 2: observed passenger flow (number of passengers flying a given flight) of a certain airline company over many years airline.sav • Task: What do you see? Is there any pattern?

  16. Module 3 Activity 1 • Background 2: observed passenger flow (number of passengers flying a given flight) of a certain airline company over many years airline.sav • Task: What do you see? Is there any pattern? • We see seasonal variation here. We also see that both the overall mean and the overall variation appear to be increasing over time. One possible reason for the seasonal variation is that many more people go on vacation during the summer than in any other season of the year.

  17. Module 3 Activity 1 • Interpretations: the wording should not be too strong. • Not: "The mean is increasing." Better: "There is evidence of an increasing mean, based on the data." • Not: "The variance decreased." Better: "The variability in the response appears to be decreasing over time." • Time plots do not tell the shape of a distribution.

  18. QQ Plots • You need to know how to check normality using a QQ plot. • You don't need to know how to produce a QQ plot by yourself. (It is a plot of the percentiles of a standard normal distribution against the corresponding percentiles of the observed data.)
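Although producing a QQ plot by hand is not required, a minimal sketch of the definition on slide 18 may make it more concrete. The sample values are made up, and the plotting position `(i - 0.5) / n` is one common convention assumed here; statistical software may use a slightly different one.

```python
from statistics import NormalDist

def qq_points(data):
    """Pair each sorted observation with a standard normal percentile."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed convention: the i-th smallest value sits at probability (i - 0.5) / n.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Made-up sample for illustration only.
sample = [98, 105, 110, 102, 120, 95, 108, 113, 101, 107]

for z, x in qq_points(sample):
    print(f"normal percentile {z:6.2f}   observed {x}")
```

Plotting these pairs gives the QQ plot: if the points fall close to a straight line, the observed percentiles match the normal percentiles up to a shift and a scale.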

  19. QQ Plots

  20. QQ Plots • Purpose: a QQ plot is a graphical tool to check normality. (Why/when do we need normality?) • How to check: look at how the points fall… Approximately normal: the points fall approximately along a straight line with a positive slope. Not normal: the points deviate from the line.
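The "straight line" criterion on slide 20 is a visual judgment. As a rough numerical companion (not the method used in the lab), one can compute the correlation between the sorted data and the normal percentiles; values very close to 1 correspond to points lying near a line. Both samples below are artificial: one is spaced exactly like normal percentiles and the other is a right-skewed transformation of it.

```python
import math
from statistics import NormalDist, mean

def qq_correlation(data):
    """Correlation between sorted data and standard normal percentiles."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    # Assumed plotting position: (i - 0.5) / n.
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    z_bar, x_bar = mean(z), mean(ordered)
    sxy = sum((a - z_bar) * (b - x_bar) for a, b in zip(z, ordered))
    sxx = sum((a - z_bar) ** 2 for a in z)
    syy = sum((b - x_bar) ** 2 for b in ordered)
    return sxy / math.sqrt(sxx * syy)

n = 200
grid = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
bell_shaped = [100 + 15 * z for z in grid]    # idealized normal-shaped sample
right_skewed = [math.exp(z) for z in grid]    # idealized right-skewed sample

r_bell = qq_correlation(bell_shaped)
r_skew = qq_correlation(right_skewed)
print(f"bell-shaped:  r = {r_bell:.3f}")
print(f"right-skewed: r = {r_skew:.3f}")
```

The correlation is only a summary; the plot remains the primary tool, because it also shows where and how the points depart from the line.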

  21. QQ Plots

  22. Module 3 Activity 2 • Task 1: Use the iq.sav dataset and examine the distribution of iq by creating a histogram and QQ plot. Describe the shape of the IQ distribution.

  23. Module 3 Activity 2 • Task 1: Use the iq.sav dataset and examine the distribution of iq by creating a histogram and QQ plot. Describe the shape of the IQ distribution. • From the histogram, we see that the distribution of IQ is unimodal, centered around 105-110. It appears to be very slightly skewed left, but is basically symmetric. • With the Q-Q plot, we do see points falling roughly along a straight line with a positive slope, which would indicate that the bell-curve normal distribution is a reasonable model.

  24. Module 3 Activity 2 • Task 2: Use the employee data.sav dataset and create the histogram for the variable salary (current salary). Then create a QQ plot for salary. Comments?

  25. Module 3 Activity 2 • Task 2: Use the employee data.sav dataset and create the histogram for the variable salary (current salary). Then create a QQ plot for salary. Comments?

  26. Module 3 Activity 2 • The histogram reveals that the distribution of current salary is definitely not symmetric. It is strongly skewed right. However, it is unimodal, with a peak at around 25000. The mean will be much higher than the median due to the right skewness. • The QQ plot clearly shows that the distribution of current salary cannot be considered normal, since the points do not follow a straight line.
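The claim that right skewness pulls the mean above the median can be checked on a tiny example. The salaries below are invented for illustration and are not taken from employee data.sav.

```python
import statistics

# Made-up, right-skewed salaries: most values are near 25000, a few are much larger.
salaries = [22000, 24000, 25000, 25000, 26000, 27000, 30000, 35000, 60000, 110000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)

print(f"mean   = {mean_salary}")
print(f"median = {median_salary}")
print(f"mean exceeds median: {mean_salary > median_salary}")
```

The two largest values raise the mean substantially but leave the median, which depends only on the middle of the ordered data, essentially unchanged.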

  27. Review of lab 2 • Experiments and observational studies • Time/sequence plots: randomness and stability • QQ plots: normality of population • Any questions?

  28. Before we finish… Qwizdom questions
