S1 Representing data

S1 Representing data. Skewness and choice of data analysis. Skewness. The first distribution shown has a positive skew. This means that it has a long tail in the positive direction. The distribution below it has a negative skew since it has a long tail in the negative direction.

jace

Presentation Transcript


  1. S1 Representing data: Skewness and choice of data analysis

  2. Skewness
     The first distribution shown has a positive skew. This means that it has a long tail in the positive direction. The distribution below it has a negative skew since it has a long tail in the negative direction. Finally, the third distribution is symmetric and has no skew. Distributions with positive skew are sometimes called "skewed to the right" whereas distributions with negative skew are called "skewed to the left."

  3. Skewness – visuals and calculations
     For each data set below, calculate Q1, Q2, Q3, the mode, the mean and the standard deviation. Draw all 3 boxplots on one piece of graph paper.
     Data set 1: 1, 3, 5, 5, 5, 7, 10
     Data set 2: 2, 7, 7, 8, 12, 14, 20
     Data set 3: 3, 6, 7, 9, 10, 10, 11
     • For each data set, find a relationship between the mode, median and mean using the symbols =, > and <
     • For each data set, find a relationship between Q2 - Q1 and Q3 - Q2
     • Work out 3(mean - median) / standard deviation
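The calculations asked for on this slide can be checked with a short Python sketch. It relies on two assumptions that the slides do not state: quartiles are taken with the "exclusive" method of the standard `statistics` module (for seven values this picks the 2nd, 4th and 6th ordered values), and the standard deviation is the population form (dividing by n). The names `data_sets` and `summarise` are illustrative only.

```python
import statistics

# The three data sets from the slide.
data_sets = {
    "Data set 1": [1, 3, 5, 5, 5, 7, 10],
    "Data set 2": [2, 7, 7, 8, 12, 14, 20],
    "Data set 3": [3, 6, 7, 9, 10, 10, 11],
}


def summarise(values):
    """Return quartiles, mode, mean and standard deviation for one data set."""
    # Assumption: default "exclusive" quartile method; other conventions
    # can give different quartiles for other sample sizes.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "mode": statistics.mode(values),
        "mean": statistics.mean(values),
        # Assumption: population standard deviation (divide by n).
        "sd": statistics.pstdev(values),
    }


for name, values in data_sets.items():
    print(name, summarise(values))
```

If your course uses a different quartile convention or the sample standard deviation, swap in that method before comparing answers.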

  4. Skewness – Using the Quartiles
     • Symmetric: Q2 - Q1 = Q3 - Q2
     • Positive skew: Q2 - Q1 < Q3 - Q2
     • Negative skew: Q2 - Q1 > Q3 - Q2

  5. Skewness – Using mode, median, mean
     • Symmetric: Q2 - Q1 = Q3 - Q2 and mode = median = mean
     • Positive skew: Q2 - Q1 < Q3 - Q2 and mode < median < mean
     • Negative skew: Q2 - Q1 > Q3 - Q2 and mode > median > mean
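The quartile rule and the mode/median/mean rule can be sketched as two small Python functions. This is only an illustration: the function names are made up, the quartiles use the `statistics` module's default "exclusive" method (an assumption), and the "inconclusive" label for data that fit none of the three orderings is an addition for this sketch, not something from the slides.

```python
import statistics


def skew_from_quartiles(values):
    """Classify skew by comparing Q2 - Q1 with Q3 - Q2."""
    # Assumption: default "exclusive" quartile method.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    lower_gap, upper_gap = q2 - q1, q3 - q2
    if lower_gap < upper_gap:
        return "positive"
    if lower_gap > upper_gap:
        return "negative"
    return "symmetric"


def skew_from_averages(values):
    """Classify skew by the ordering of mode, median and mean."""
    mode = statistics.mode(values)
    median = statistics.median(values)
    mean = statistics.mean(values)
    if mode == median == mean:
        return "symmetric"
    if mode < median < mean:
        return "positive"
    if mode > median > mean:
        return "negative"
    # Hypothetical label for orderings the slide does not cover.
    return "inconclusive"


data_set_2 = [2, 7, 7, 8, 12, 14, 20]
data_set_3 = [3, 6, 7, 9, 10, 10, 11]
print(skew_from_quartiles(data_set_2), skew_from_averages(data_set_2))
print(skew_from_quartiles(data_set_3), skew_from_averages(data_set_3))
```

The two rules need not agree on small samples. For data set 1 (1, 3, 5, 5, 5, 7, 10) the quartile gaps are equal, but the mean is slightly above the mode and median, so the averages rule fits none of the three patterns exactly.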

  6. Skewness calculations
     You can calculate 3(mean - median) / standard deviation. This gives you a value that tells you how skewed the data are.
     • The closer the value is to zero, the more symmetrical the data.
     • A negative value means the data have a negative skew, and a positive value means a positive skew.
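As a minimal sketch, the value 3(mean - median) / standard deviation can be computed in Python as follows. The function name `skewness_coefficient` is illustrative, and the population standard deviation is an assumption; using the sample form would change the size of the result slightly but not its sign.

```python
import statistics


def skewness_coefficient(values):
    """Return 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Assumption: population standard deviation (divide by n).
    sd = statistics.pstdev(values)
    return 3 * (mean - median) / sd


data_sets = {
    "Data set 1": [1, 3, 5, 5, 5, 7, 10],
    "Data set 2": [2, 7, 7, 8, 12, 14, 20],
    "Data set 3": [3, 6, 7, 9, 10, 10, 11],
}

for name, values in data_sets.items():
    print(name, round(skewness_coefficient(values), 3))
```

For the three data sets from the earlier slide, the value is close to zero for data set 1, positive for data set 2 and negative for data set 3.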

  7. Comparing data sets
     • You should always compare data sets using:
       • a measure of location (mean, median, mode)
       • a measure of spread (range, IQR, standard deviation)
       • skewness
     • Range gives a rough idea of spread, but is affected by extreme values.
       • Generally only used with small data groups
     • IQR is not affected by extreme values.
       • Tells you the spread of the middle 50%
       • Often used in conjunction with the median
     • Mean and standard deviation are generally used when:
       • the data are fairly symmetrical
       • the data size is reasonably large
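To illustrate why the range is sensitive to extreme values while the IQR is not, the sketch below compares data set 2 with a hypothetical copy in which the largest value has been replaced by 200. The modified data set, the function name `spread` and the quartile convention (the `statistics` module's default "exclusive" method) are assumptions made for this example.

```python
import statistics


def spread(values):
    """Return the range, IQR and standard deviation of a data set."""
    # Assumption: default "exclusive" quartile method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "IQR": q3 - q1,
        # Assumption: population standard deviation (divide by n).
        "sd": statistics.pstdev(values),
    }


original = [2, 7, 7, 8, 12, 14, 20]       # data set 2 from the slides
with_outlier = [2, 7, 7, 8, 12, 14, 200]  # hypothetical: 20 replaced by 200

print("original:    ", spread(original))
print("with outlier:", spread(with_outlier))
```

The range and the standard deviation grow sharply once the extreme value is present, while the IQR stays the same because the middle 50% of the data has not moved.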
