
# Statistics Chapter 2 Organizing Data


##### Presentation Transcript

1. Statistics Chapter 2 Organizing Data

2. Quick Talk • Think of a situation where you need to organize data (any kind of data). • What can you do after you have collected and organized the data?

3. Answer • You can graph it, calculate the range and midpoint, find the frequencies, and then analyze the data.

4. Frequency Table • A frequency table partitions data into intervals and shows how many data values are in each interval. The intervals are constructed so that each data value falls into exactly one interval. • Note: intervals are known as classes. The book uses the word “classes”, but I use “intervals” because it makes more sense.

5. How do you create a frequency table? • Consider this situation: You are recording how many minutes each student studies for a particular class. You interviewed 50 students, and here is the chart.

6. How do you create a frequency table? • 1) Determine how many intervals you want. • Between 5 and 15 is usually preferred. • With fewer than 5, you risk losing information. • With more than 15, the data might not be sufficiently analyzed. • Let’s use 6 intervals for this case. (Remember, you can use any number between 5 and 15.) • With this, you can find the width of each interval.

7. Finding the width of the interval • $\text{Interval width} = \dfrac{\text{largest data value} - \text{smallest data value}}{\text{number of intervals}}$ • So in our case, the result rounds up to 8. • Note: You always round up to the next whole number, even if the number is 2.3; 2.3 would become 3. • So each interval covers 8 numbers, and this tells you the limits of each interval.
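The width calculation can be sketched in a few lines of Python. This is only an illustration: the function name `interval_width` and the sample values are hypothetical (the class data set is not reproduced here), and it assumes the width is the range divided by the number of intervals, rounded up.

```python
import math

def interval_width(data, num_intervals):
    """Range divided by the number of intervals, rounded up to a whole number."""
    raw = (max(data) - min(data)) / num_intervals
    return math.ceil(raw)

# Hypothetical study times in minutes (NOT the class data set).
sample = [1, 5, 12, 18, 23, 30, 36, 47]
width = interval_width(sample, 6)
print(width)  # (47 - 1) / 6 is about 7.67, which rounds up to 8
```

Note that `math.ceil` leaves a quotient that is already a whole number unchanged; some textbooks still move up to the next whole number in that case, so adjust if your course follows that rule.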

8. The lower interval limit is the lowest data value that can fit in an interval. • The upper interval limit is the highest data value that can fit in an interval. • The interval width is the difference between the lower interval limit of one interval and the lower interval limit of the next interval. • In our case, our lowest number is 1, so 1+8=9; therefore, 9 is the start of the next interval. (Remember, we will have 6 intervals total.)

9. Activity • Find the starting number of each interval

10. Answer • Start of 1st interval=1 • Start of 2nd interval=9 • Start of 3rd interval=17 • Start of 4th interval=25 • Start of 5th interval=33 • Start of 6th interval=41 • (A 7th interval would start at 49, so the 6th interval ends at 48.)

11. Therefore, the interval limits are: 1 to 8, 9 to 16, 17 to 24, 25 to 32, 33 to 40, and 41 to 48.
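As a quick check, the interval limits for the class example (lowest value 1, width 8, 6 intervals) can be generated with a short Python sketch. The helper name `interval_limits` is made up for illustration, and it assumes whole-number data, so each upper limit is one less than the next interval's start.

```python
def interval_limits(lowest, width, num_intervals):
    """Return (lower limit, upper limit) pairs for whole-number data."""
    limits = []
    for i in range(num_intervals):
        lower = lowest + i * width
        upper = lower + width - 1  # one less than the start of the next interval
        limits.append((lower, upper))
    return limits

limits = interval_limits(lowest=1, width=8, num_intervals=6)
print(limits)
# [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]
```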

12. Now tally all the numbers that fall in each interval • A tally is a mark used to count how many numbers lie in each interval. • Frequency (represented by $f$) is the number of tally marks corresponding to that interval.
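Tallying by hand amounts to checking, for each data value, which interval it falls in and adding one mark there. A minimal Python sketch of that idea follows; the function name `tally` and the sample values are hypothetical (not the class data set), and the limits are the ones from the class example.

```python
def tally(data, limits):
    """Count how many data values fall in each (lower, upper) interval."""
    frequencies = [0] * len(limits)
    for value in data:
        for i, (lower, upper) in enumerate(limits):
            if lower <= value <= upper:
                frequencies[i] += 1
                break  # each value belongs to exactly one interval
    return frequencies

limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]
# Hypothetical study times in minutes (NOT the class data set).
sample = [1, 5, 8, 9, 12, 18, 23, 30, 36, 47]
freqs = tally(sample, limits)
print(freqs)  # [3, 2, 2, 1, 1, 1]
```

The frequencies always add up to the number of data values, which is a handy check on a hand tally.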

13. Activity • Now tally up all the numbers that fall in each interval. Also find the frequency.

15. Midpoint (within the interval) • $\text{Midpoint} = \dfrac{\text{lower interval limit} + \text{upper interval limit}}{2}$
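Assuming the usual definition (the midpoint is the average of an interval's lower and upper limits), the midpoints for the class example's intervals can be sketched in Python. The function name `midpoint` is illustrative only.

```python
def midpoint(lower, upper):
    """Average of the lower and upper interval limits."""
    return (lower + upper) / 2

# Interval limits from the class example (lowest value 1, width 8).
limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]
midpoints = [midpoint(lower, upper) for lower, upper in limits]
print(midpoints)  # [4.5, 12.5, 20.5, 28.5, 36.5, 44.5]
```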

16. Activity • Find the midpoint of each interval

18. Finding interval boundaries • For upper interval boundaries, add 0.5 to the upper interval limits. • For lower interval boundaries, subtract 0.5 from the lower interval limits.
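The boundary rule is simple enough to sketch in Python. The helper name `interval_boundaries` is hypothetical, and the 0.5 adjustment assumes whole-number data, as in the class example.

```python
def interval_boundaries(limits):
    """Subtract 0.5 from each lower limit and add 0.5 to each upper limit."""
    return [(lower - 0.5, upper + 0.5) for lower, upper in limits]

# Interval limits from the class example (lowest value 1, width 8).
limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]
boundaries = interval_boundaries(limits)
print(boundaries)
# [(0.5, 8.5), (8.5, 16.5), (16.5, 24.5), (24.5, 32.5), (32.5, 40.5), (40.5, 48.5)]
```

Each upper boundary equals the next interval's lower boundary, so the intervals meet with no gaps.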

19. Activity • Find the interval boundaries for all intervals.

21. Relative Frequency • Relative frequency shows the proportion of data values that fall in each interval. • $\text{Relative frequency} = \dfrac{f}{n} = \dfrac{\text{interval frequency}}{\text{total number of data values}}$
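Assuming the usual definition (each interval's frequency divided by the total number of data values), relative frequencies can be sketched in Python as follows. The frequencies here are hypothetical, not the class results.

```python
def relative_frequencies(frequencies):
    """Divide each interval's frequency by the total number of data values."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

# Hypothetical frequencies for 6 intervals (NOT the class results).
freqs = [3, 2, 2, 1, 1, 1]
rel = relative_frequencies(freqs)
print(rel)  # [0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
```

The relative frequencies should add up to 1 (apart from rounding), which is a useful check on the table.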

22. Activity • Find the relative frequency of each interval

23. Review • How to create a frequency table: • Determine how many intervals you want • Find the interval width • Determine the lower/upper interval limits for each interval • Determine the lower/upper interval boundaries for each interval • Do the tally and find the frequency (they are the same number) • Find the midpoint • Find the relative frequency

24. Group activity: Now try to do this by yourself or with a partner. These data represent glucose blood levels after a 12-hour fast for a random sample of 70 women. Use 6 intervals (classes).

26. Homework practice • Pg 46-47 #1-4 all, 5-10 (only do frequency table) (Will start in class if time permits)

27. Before we talk about how to graph a histogram, let’s talk about different shapes of a distribution

28. Different distribution shapes

29. Distribution definitions • Mound-shaped symmetrical: This term refers to a histogram in which both sides are the same when the graph is folded vertically down the middle. (Normal curve) • Uniform or rectangular: These terms refer to a histogram in which every interval has equal frequency. From one point of view, a uniform distribution is symmetrical with the added property that the bars are of the same height. • Skewed left or skewed right: These terms refer to a histogram in which one tail is stretched out longer than the other. • Bimodal: This term refers to a histogram in which the two intervals with the largest frequencies are separated by at least one interval. The top two frequencies may have slightly different values.

30. Graphing a histogram • You use the frequency table to graph a histogram (use the example we did together in class about study minutes with 50 students). • You use the lower/upper interval boundaries for the x axis because you don’t want any gaps. • Let’s graph both a frequency histogram and a relative-frequency histogram.

31. This is what a frequency histogram looks like

32. This is what a relative-frequency histogram looks like

33. Activity • Compare the two graphs. What do you guys notice? What can you say about the distribution of data?

34. Quick talk • If we were to construct a normal distribution curve or mound-shaped symmetrical histogram for IQ, Newton and Einstein would be considered “outliers”. What do you guys think “outlier” means?

35. What is an outlier? • Outliers are data values that are very different from other measurements in the data set. • Two types: or

36. Cumulative Frequency • Cumulative Frequency for an interval is the sum of the frequencies for that interval and all the previous intervals. Example: Let’s take a look at the class example again.
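Cumulative frequency is a running total of the frequencies, which Python's standard `itertools.accumulate` computes directly. In this sketch the frequencies are hypothetical, not the class results.

```python
from itertools import accumulate

# Hypothetical frequencies for 6 intervals (NOT the class results).
freqs = [3, 2, 2, 1, 1, 1]

# Each entry is that interval's frequency plus all the previous ones.
cumulative = list(accumulate(freqs))
print(cumulative)  # [3, 5, 7, 8, 9, 10]
```

The last cumulative frequency always equals the total number of data values.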

37. Ogive Graph • An ogive is a graph that displays cumulative frequencies.

38. Ogive graph of the example

39. So then what does this graph tell us? • Example: I can say that 31 students studied no more than 16 minutes, because the frequencies are cumulative.

40. Activity • Find the cumulative frequencies and draw an ogive graph.