Topic 1 (1.1.1-1.1.6)

Topic 1 (1.1.1-1.1.6). Statistical Analysis. 1.1.1. State that Error bars are a graphical representation of the variability of data. To answer an IB question involving 1.1.1 simply state that Error bars are a graphical representation of the variability of data.

leigh-duffy

Presentation Transcript


  1. Topic 1 (1.1.1-1.1.6) Statistical Analysis

  2. 1.1.1 • State that error bars are a graphical representation of the variability of data. • To answer an IB question involving 1.1.1, simply state that error bars are a graphical representation of the variability of data. • The variability of data refers to how close to or far from the mean most data values are. A high standard deviation indicates high variability and a low standard deviation indicates low variability.

  3. 1.1.1 The following is an example of error bars.

  4. 1.1.1 Another example

  5. 1.1.1 Key vocabulary list • Error bars • Variability • Data • Graphical

  6. 1.1.2 • Calculate the mean and standard deviation of a set of values. • (Students should specify the standard deviation (s), not the population standard deviation. Students will not be expected to know the formulas for calculating these statistics. They will be expected to use the standard deviation function of a graphic display or scientific calculator. • Aim 7: Students could also be taught how to calculate standard deviation using a spreadsheet computer program.) • The mean is the average data value: the sum of the values divided by the number of values. • The sample standard deviation, written “s”, measures how far the values in a sample typically lie from the sample mean. It is calculated with n − 1 in the denominator and is the appropriate statistic whenever the data are a sample rather than a whole population, whatever the sample size. • A statistic is a characteristic or measure obtained by using the data values from a sample, as opposed to a parameter, which is a characteristic or measure obtained by using all the data values from a specific population. • A set of values consists of data from multiple subjects and is either a sample or a population.

  7. 1.1.2 The following is an example of how to calculate the mean.
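
As a minimal sketch of the calculation (the data values below are made up for illustration and do not come from the original slide), the mean is the sum of the values divided by the number of values:

```python
# Hypothetical data set (not from the original slide): five leaf lengths in cm.
values = [4.0, 5.0, 6.0, 7.0, 8.0]

def mean(data):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(data) / len(data)

sample_mean = mean(values)
print(sample_mean)  # 6.0
```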

  8. 1.1.2 The following is an example of how to calculate the sample standard deviation, which is used whenever the data are a sample drawn from a larger population. The sample size, n, is the number of data values in the data set. The formula for the sample standard deviation is $s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$, where $\bar{x}$ is the sample mean.
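
The sample standard deviation calculation can be sketched in a few lines of Python. The data values are hypothetical (they are not the ones on the original slide), and the helper name `sample_sd` is chosen only for this illustration. The standard library function `statistics.stdev` is used as a cross-check because it also divides by n − 1.

```python
import math
import statistics

# Hypothetical data set (not from the original slide).
values = [4.0, 5.0, 6.0, 7.0, 8.0]

def sample_sd(data):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

s = sample_sd(values)
print(round(s, 3))                          # 1.581
print(round(statistics.stdev(values), 3))   # 1.581 (same n - 1 definition)
```

Dividing by n instead of n − 1 would give the population standard deviation, which is not the value the syllabus asks for.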

  9. 1.1.2 How to find the sample mean and the sample standard deviation using a calculator, which is the method that the IBO expects to be used on the IB Biology SL or HL test • On a TI calculator, press “STAT” and choose “1:Edit” from the “EDIT” menu. Type the data values into L1 (list 1), pressing “ENTER” after each value. When all of the data values have been entered, press “STAT” again, press the right arrow button to move to the “CALC” menu, and choose “1:1-Var Stats” by pressing “ENTER”. This takes you to the home screen; press “ENTER” one last time to run the calculation. • The results appear on the home screen. The sample mean is shown as x̄ (an x with a line over it) and the sample standard deviation is shown as Sx. The two values next to those symbols are the sample mean and the sample standard deviation.

  10. 1.1.2 Key vocabulary list • Mean • Standard deviation • Values • Sample mean • Sample standard deviation • Population mean • Statistic • Sample size

  11. 1.1.3 • State that the term standard deviation is used to summarize the spread of the values around the mean and that 68% of the values fall within one standard deviation of the mean. • (For normally distributed data, about 68% of all values lie within ±1 standard deviation (s or σ) of the mean. This rises to about 95% for ±2 standard deviations.) • To answer an IB question involving this, simply write that the term standard deviation is used to summarize the spread of the values around the mean and that 68% of the values fall within one standard deviation of the mean.

  12. 1.1.3 • 1.1.3 refers to the empirical rule, which states that for normally distributed data about 68% of all data values lie within 1 standard deviation of the mean, about 95% lie within 2 standard deviations of the mean, and about 99.7% lie within 3 standard deviations of the mean. • In normally distributed data the mean, median, and mode are practically all the same and the distribution is unimodal, symmetric, and bell-shaped. Having equal mean, median, and mode is not enough on its own to show that data are normally distributed. • The empirical rule does not apply to non-normally distributed data. To find how many data values lie within ±1, ±2, or ±3 standard deviations of the mean for such data, one must use other methods that require calculations. Such methods will not be discussed because the IBO will not ask you to do anything that involves them. • A standard normal distribution has a mean of 0 and a standard deviation of 1. • The spread of values about the mean refers to the typical amount by which the data values in a set differ from the mean.
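
The percentages in the empirical rule can be checked numerically. The sketch below is only an illustration, not something the IB requires: it uses the standard result that, for a normal distribution, the fraction of values within k standard deviations of the mean is erf(k/√2). The function name `fraction_within` is chosen just for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```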

  13. 1.1.3 An example

  14. 1.1.3 Standard normal distribution curve (bell curve)

  15. 1.1.3 Key vocabulary list • Mean • Standard deviation • Spread • Normally distributed • Normal distribution curve (bell curve)

  16. 1.1.4 • Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples. • (A small standard deviation indicates that the data is clustered closely around the mean value. Conversely, a large standard deviation indicates a wider spread around the mean.)

  17. 1.1.4 • If one sample of data has a large standard deviation and another sample has a small standard deviation, then the sample with the larger standard deviation is more variable than the sample with the smaller standard deviation. • The standard deviation is a measure of the typical spread of the values about the mean. • The variance is the square of the standard deviation.

  18. 1.1.4 An Example
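
A small numerical illustration, using made-up plant heights rather than the data on the original slide: the two samples below have the same mean, but their standard deviations show that one is clustered tightly around the mean while the other is widely spread.

```python
import statistics

# Hypothetical plant heights in cm (not from the original slide).
sample_a = [18, 19, 20, 21, 22]   # values clustered near the mean
sample_b = [10, 15, 20, 25, 30]   # values spread widely around the mean

summary = {}
for name, data in (("A", sample_a), ("B", sample_b)):
    mean = sum(data) / len(data)
    sd = statistics.stdev(data)   # sample standard deviation (n - 1)
    summary[name] = (mean, sd)
    print(name, mean, round(sd, 2))
# A 20.0 1.58
# B 20.0 7.91
```

Comparing only the means would suggest the two samples are alike; the standard deviations show that they are not.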

  19. 1.1.4 Key vocabulary list • Standard deviation • Spread • Sample • Clustered • Around

  20. 1.1.5 • Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. • (For a t-test to be applied, the data must have a normal distribution and a sample size of at least 10. The t-test can be used to compare two sets of data and measure the amount of overlap. Students will not be expected to calculate values of t. Only a two-tailed, unpaired t-test is expected. Aim 7: While students are not expected to calculate a value for the t-test, students could be shown how to calculate such values using a spreadsheet program or the graphic display calculator. TOK: The scientific community defines an objective standard by which claims about data can be made.)

  21. 1.1.5 • If knowledge of degrees of freedom is needed to answer an IB Biology SL or HL test question, all that one needs to know is that degrees of freedom is abbreviated d.f. For a confidence interval based on a single sample, d.f. equals the sample size minus 1. For the unpaired t-test comparing two samples, d.f. equals the sum of the two sample sizes minus 2. • Two-tailed means that the test allows for a difference in either direction: the quantity being tested may be greater than or less than the value it is presumed to have. The presumption is referred to as the null hypothesis, H0. In a two-tailed test or confidence interval the level of significance, which is denoted by the Greek letter alpha (α), is divided by 2 so that half lies in each tail. In a one-tailed test the level of significance is unchanged. • A t-test is used when the data are normally distributed and the population standard deviation is unknown; the syllabus also requires a sample size of at least 10. A z-test is used instead when the population standard deviation is known, but the IBO will not ask you to do a z-test. • To deduce the significance, compare the calculated value of t with the critical value from a t-table at the chosen level of significance (usually 0.05) and the correct degrees of freedom. If the calculated t is larger than the critical value, the difference between the two sets of data is significant. Because students are not expected to calculate t themselves, an IB question involving 1.1.5 should supply the calculated value of t.
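
Although the syllabus does not expect students to calculate t, seeing the calculation once can make the table lookup easier to follow. The sketch below is an illustration under stated assumptions: the two samples are made up, the function name `unpaired_t` is chosen only for this example, and it uses the pooled-variance form of the unpaired t-test, which assumes the two samples have similar variances. The critical value 2.101 is the standard t-table entry for 18 degrees of freedom at p = 0.05 (two-tailed).

```python
import math
import statistics

# Hypothetical measurements for two groups of 10 (not from the slides).
sample_1 = [10, 11, 12, 13, 14, 10, 11, 12, 13, 14]
sample_2 = [15, 16, 17, 18, 19, 15, 16, 17, 18, 19]

def unpaired_t(a, b):
    """Return (t, degrees of freedom) for an unpaired t-test using the
    pooled variance (assumes similar variances in both samples)."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

t, df = unpaired_t(sample_1, sample_2)

# Critical value read from a standard t-table: d.f. = 18, p = 0.05, two-tailed.
T_CRITICAL = 2.101
significant = abs(t) > T_CRITICAL
print(round(t, 2), df, significant)  # -7.5 18 True
```

The sign of t only reflects which sample was listed first; it is the size of t that is compared with the table value.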

  22. 1.1.5 Example confidence interval calculation. Confidence intervals are usually 90%, 95%, or 99% confidence intervals.
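
As a rough sketch of such a calculation (the data are hypothetical and are not the values shown on the original slide), a 95% confidence interval for a mean is the sample mean plus or minus the critical t-value times s/√n. The critical value 2.262 is the standard t-table entry for 9 degrees of freedom at p = 0.05 (two-tailed).

```python
import math
import statistics

# Hypothetical sample of 10 measurements (not from the original slide).
data = [10, 11, 12, 13, 14, 10, 11, 12, 13, 14]

n = len(data)
x_bar = sum(data) / n            # sample mean
s = statistics.stdev(data)       # sample standard deviation (n - 1)
df = n - 1                       # degrees of freedom for a single sample

# Critical t for a 95% two-tailed interval with 9 d.f., from a standard t-table.
T_CRITICAL = 2.262

margin = T_CRITICAL * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))  # 10.93 13.07
```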

  23. 1.1.5 Visual of SHOW confidence interval data

  24. 1.1.5 A VIDEO example

  25. 1.1.5 A table of critical t-values like this one may need to be used on an IB test to answer a question that involves 1.1.5. The table gives the critical value of t for a given number of degrees of freedom and a given probability that chance alone could produce the difference. That probability is 1 minus the confidence level in decimal form (for example, 1 − 0.95 = 0.05).

  26. 1.1.5 Key vocabulary list • Confidence interval • Level of significance (alpha) • Degrees of freedom • T-distribution • Sample standard deviation • Sample mean • Probability • Population mean • Two-tailed • Unpaired(in reference to t-tests) • Normal distribution

  27. 1.1.6 • Explain that the existence of a correlation does not establish that there is a causal relationship between two variables. • (Aim 7: While calculations of such values are not expected, students who want to use r and r² values in their practical work could be shown how to determine such values using a spreadsheet program.)

  28. 1.1.6 • When a mathematical correlation test is used, the values of r range from -1 to 1. An r-value of 1 implies that there is a completely positive correlation. An r-value of -1 implies that there is a completely negative correlation. An r-value of 0 implies that there is no linear correlation. • If the r-values show that there is a correlation between the two variables, an experiment needs to be performed in order to know if there is a causal relationship between the two variables. • A variable is a characteristic or attribute that can assume different values. • In a question that involves 1.1.6, an example needs to be mentioned to support the points made by the IB Biology SL or HL student. On the next slide is an excellent example that could be used.
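
For students who want to see where r comes from, the sketch below computes the Pearson correlation coefficient by hand for three made-up data sets. The function name `pearson_r` and all data values are chosen only for this illustration. The third data set has r = 0 even though y depends exactly on x (y = x²), which shows that r measures only linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # 1.0  (completely positive)
print(pearson_r(x, [10, 8, 6, 4, 2]))   # -1.0 (completely negative)

# y = x squared: a perfect relationship, but not a linear one.
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))   # 0.0
```

None of these values says anything about cause: r describes how the numbers vary together, not why.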

  29. 1.1.6 An excellent example: Africanized honey bees. “The story of Africanized honey bees (AHBs) invading the USA includes an interesting correlation. In 1990, a honey bee swarm was found outside a small town in southern Texas. They were identified as AHBs. These bees were brought from Africa to Brazil in the 1950s, in the hope of breeding a bee adapted to the South American tropical climate. But by 1990, they had spread to the southern US. Scientists predicted that AHBs would invade all the southern states of the US, but this hasn’t happened. Look at Figure 1.5: the bees have remained in the southwest states (area shaded in yellow) and have not travelled to the south-eastern states. The edge of the areas shaded in yellow coincides with the point at which there is an annual rainfall of 137.5cm (55 inches) spread evenly throughout the year. This level of year-round wetness seems to be a barrier to the movement of the bees and they do not move into such areas.” This example shows that the existence of a correlation, here between annual rainfall and the edge of the bees' range, does not by itself prove that there is a causal relationship between the two variables, which is the most important aspect of 1.1.6.

  30. 1.1.6 Key vocabulary list • Correlation • Causal relationship • Completely positive correlation • Completely negative correlation • Experiment • Africanized honey bees • Variable

  31-37. 1.1.1-1.1.6 (sample topic 1 IB questions)

  38. 1.1.1-1.1.6 Spreadsheet programs • The IBO makes reference to the use of spreadsheet programs in the topic 1 detailed syllabus. Good programs for this are Microsoft Excel and Minitab. Minitab is a statistical program that can be downloaded online. There is a free 30-day trial for it. It is much better than Microsoft Excel. If you ever require use of Minitab as a spreadsheet program, go to http://www.minitab.com/Downloads/ • If you are unsure about how to use Minitab, you can use its help feature, which is very detailed. One of the examples in this PowerPoint presentation was created by using Minitab, which is the SHOW confidence interval data example for 1.1.5. By using the help feature for Minitab you should be able to do anything that you need to do for IB Biology SL or HL that involves the use of a spreadsheet. The help feature for Microsoft Excel can also be utilized, but Minitab is much better software than Microsoft Excel. • Microsoft Excel free trials can be downloaded at http://us1.trymicrosoftoffice.com/default.aspx?WT.srch=1&WT.mc_id=78C4B07A-6906-484D-B4DD-47E2084740A6
