
Statistical Analysis

Statistical Analysis. Topic 1. Statistics. 1.1.1 State that error bars are a graphical representation of the variability of data. 1.1.2 Calculate the mean and standard deviation of a set of values.

etracy

Presentation Transcript


  1. Statistical Analysis Topic 1

  2. Statistics • 1.1.1 State that error bars are a graphical representation of the variability of data. • 1.1.2 Calculate the mean and standard deviation of a set of values. • 1.1.3 State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of values fall within one standard deviation of the mean.

  3. 1.1.4 Explain how the standard deviation is useful for comparing the means and spread of data between two or more samples. • 1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables. • 1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.

  4. What is data? “Information, in the form of facts or figures obtained from experiments or surveys, used as a basis for making calculations or drawing conclusions” (Encarta dictionary)

  5. 2 types of Data • Qualitative • Quantitative

  6. Statistics in Science • Data can be collected about a population (surveys) • Data can be collected about a process/mechanism (experimentation)

  7. Qualitative Data • Information that relates to characteristics or description (observable qualities) • Information is often grouped by descriptive category • Examples • Species of plant • Type of insect • Shades of color • Rank of flavor in taste testing • Remember: qualitative data can be “scored” and evaluated numerically

  8. Qualitative data, manipulated numerically • Survey results, teens and need for environmental action • Data presented in proportion or % form:

  9. Quantitative data • Quantitative – measured using a naturally occurring numerical scale • Examples • Chemical concentration • Temperature • Length • Weight, etc.

  10. Quantification • Measurements are often displayed graphically

  11. Quantitation = Measurement • In data collection for Biology, data must be measured carefully, using laboratory equipment (e.g., timers, metersticks, pH meters, balances, pipettes, etc.) • The limits of the equipment used add some uncertainty to the data collected. All equipment has a certain magnitude of uncertainty. For example, is a ruler that is mass-produced a good measure of 1 cm? 1 mm? 0.1 mm? • For quantitative testing, you must indicate the level of uncertainty of the tool that you are using for measurement.

  12. Finding the level of uncertainty • As a “rule-of-thumb”, if not specified, use ± 1/2 of the smallest measurement unit (e.g., a metric ruler is lined to 1 mm, so the limit of uncertainty of the ruler is ± 0.5 mm.) • If the room temperature is read as 25 ºC, with a thermometer that is scored at 1-degree intervals, what is the range of possible temperatures for the room? • Answer: 25 ± 0.5 ºC • If you read 15 ºC, the true value may be between 14.5 and 15.5 ºC
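
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name `reading_range` is made up for this example, and it assumes the uncertainty is exactly half of the smallest scale division.

```python
def reading_range(reading, smallest_division):
    """Return (low, high) for a reading, assuming +/- half the smallest division."""
    half = smallest_division / 2
    return (reading - half, reading + half)


# Thermometer scored at 1-degree intervals, reading 25 ºC
low, high = reading_range(25, 1)
print(f"The room is between {low} and {high} ºC")
```

The same call with a ruler lined to 1 mm, `reading_range(length_mm, 1)`, gives the ± 0.5 mm limit mentioned on the slide.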

  13. Definition of Statistics • Branch of mathematics which allows us to characterize large populations of data by randomly sampling small portions of data from the whole. • Samples come from habitats, communities, biological populations, or experimental investigations, and enable us to draw conclusions about the larger population. • Statistics measure the differences and relationships between sets of data • Nothing is 100% certain in science!

  14. Randomization • Valid conclusions about populations can only be reached when samples are drawn randomly. • Each member of the population must have an equal and independent chance of being sampled. • How might you ensure that populations are randomly sampled?
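
One possible answer to the question above is to number every member of the population and let a random number generator pick the sample. A minimal sketch, assuming a hypothetical population of 100 numbered plants (the helper name `draw_random_sample` is invented for this example):

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members; every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, sample_size)


plants = list(range(1, 101))  # hypothetical population: plants numbered 1 to 100
sample = draw_random_sample(plants, 10, seed=1)
print(sample)
```

In the field the same idea is often applied with random coordinates for quadrats rather than numbered individuals.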

  15. Sample Size • The greater the number of samples drawn from a population, the more representative the sample is of that population. • Replication refers to repeatedly measuring a treatment in an experiment to account for variation.

  16. Factor: Amount of water per day • Treatments: 0.1 L, 0.5 L, 1.0 L • Number of replicates: 3 per treatment • [Figure: replicates numbered 1 to 3 for each treatment]
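
The design on this slide (3 treatments, 3 replicates each) needs nine experimental units. The sketch below assigns nine hypothetical pots to the treatments at random. Random assignment is an assumption carried over from the randomization slide, and `assign_replicates` is a name invented for this example.

```python
import random


def assign_replicates(units, treatments, replicates, seed=None):
    """Randomly assign units to treatments, with `replicates` units per treatment."""
    if len(units) != len(treatments) * replicates:
        raise ValueError("need exactly len(treatments) * replicates units")
    shuffled = list(units)
    random.Random(seed).shuffle(shuffled)
    # Slice the shuffled list into consecutive groups, one group per treatment
    return {
        treatment: sorted(shuffled[i * replicates:(i + 1) * replicates])
        for i, treatment in enumerate(treatments)
    }


pots = list(range(1, 10))  # nine hypothetical pots, numbered 1 to 9
layout = assign_replicates(pots, ["0.1 L", "0.5 L", "1.0 L"], replicates=3, seed=0)
for treatment, assigned in layout.items():
    print(treatment, assigned)
```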

  17. Mean • An average of data points • Central tendency of the data • Find the mean of the given data: • Answer: 12999.4

  18. Range • A measure of the spread of data • Difference between the largest and the smallest observed values • Find the range of the given data: • Answer: 22969 • If one data point were unusually large or unusually small, it would have a great effect on the range. Such points are called outliers.
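
The data set used for the answers above is not reproduced in this transcript, so the sketch below uses made-up heights to show how the mean and range are found and how a single outlier changes the range. `data_range` is a helper written for this example.

```python
from statistics import mean


def data_range(values):
    """Difference between the largest and the smallest observed values."""
    return max(values) - min(values)


heights = [12, 15, 11, 14, 13]  # hypothetical measurements in cm
print("mean:", mean(heights), "range:", data_range(heights))

# Adding one unusually large value (an outlier) stretches the range
with_outlier = heights + [40]
print("mean:", mean(with_outlier), "range:", data_range(with_outlier))
```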

  19. Looking at Data • How accurate are the data? (How close are the data to the “real” results?) A systematic departure from the real value is known as bias. • How precise are the data? (All test systems have some uncertainty, due to limits of measurement.) Estimation of the limits of the experimental uncertainty is essential.

  20. the mean. (=Replication!)

  21. Comparing Averages • Now plot means together on a graph to visualize the relationship between the two groups.

  22. The size of our error bars depends on how spread out the data is around the mean

  23. Drawing error bars • The simplest way to draw an error bar is to use the mean as the central point, and to extend the bar above and below the mean by the distance to the measurement that is farthest from the mean

  24. [Figure labels: value farthest from average; calculated distance; average value]
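
The simple error bar described above can be worked out as follows. This is a sketch with made-up values, and `farthest_value_error_bar` is a name chosen for this example.

```python
from statistics import mean


def farthest_value_error_bar(values):
    """Return (centre, half_length): the mean and the distance to the farthest value."""
    centre = mean(values)
    half_length = max(abs(v - centre) for v in values)
    return centre, half_length


measurements = [10, 12, 17]  # hypothetical replicate measurements
centre, half_length = farthest_value_error_bar(measurements)
print(f"bar runs from {centre - half_length} to {centre + half_length}, centred on {centre}")
```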

  25. What do error bars suggest? • If the bars show extensive overlap, it is likely that there is not a significant difference between those values

  26. Error bars • Graphical representation of the variability of data • Can be used to show either the range of data or the standard deviation on a graph
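
When error bars show the standard deviation, a quick check for overlap might look like the sketch below. The groups are hypothetical, and the helper names are invented for this example. The check reports any overlap at all and is only a visual hint, not a statistical test.

```python
from statistics import mean, stdev


def sd_interval(values):
    """Return (mean - SD, mean + SD), the span of an SD error bar."""
    m, s = mean(values), stdev(values)
    return (m - s, m + s)


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


group_a = [10, 12, 14]  # hypothetical data
group_b = [13, 15, 17]
group_c = [20, 22, 24]

print("A vs B overlap:", intervals_overlap(sd_interval(group_a), sd_interval(group_b)))
print("A vs C overlap:", intervals_overlap(sd_interval(group_a), sd_interval(group_c)))
```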

  27. Standard deviation • A measure of how the individual observations of a data set are dispersed or spread out around the mean. • Determined by a mathematical formula which is programmed into your calculator. • In a normal distribution, about 68% of all values lie within ±1 standard deviation of the mean. This rises to about 95% for ±2 standard deviations from the mean.

  28. How is Standard Deviation calculated? With this formula!
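
The formula image from this slide did not survive in the transcript. Assuming the slide showed the sample ("n-1") form, which is the one the Excel STDEV function on the next slide uses, it would read as below, where the x_i are the individual values, x-bar is their mean, and n is the number of values:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}{n - 1}}
```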

  29. How to calculate SD • TI-86 http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-86.html • TI-83 and 84 http://www.saintmarys.edu/~cpeltier/calcforstat/StatTI-83.html • In Microsoft Excel, type the following code into the cell where you want the Standard Deviation result, using the "unbiased," or "n-1" method: =STDEV(A1:A30) (substitute the cell name of the first value in your dataset for A1, and the cell name of the last value for A30.)
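
For readers without a calculator or Excel, the "n-1" standard deviation can also be computed in Python. The sketch below uses made-up data, calculates the value step by step in a helper called `sample_sd` (a name chosen for this example), and compares it with `statistics.stdev` from the standard library, which also uses the n-1 method.

```python
from math import sqrt
from statistics import mean, stdev


def sample_sd(values):
    """Standard deviation using the 'n-1' (sample) method."""
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (len(values) - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
print(f"step by step: {sample_sd(data):.3f}")
print(f"statistics.stdev: {stdev(data):.3f}")
```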

  30. Comparing the means and standard deviation between two or more samples Mean: 1300/10 = 130.0 cm

  31. Answers • SD for sunlight data: 17.68 cm • SD for shade data: 47.02 cm • Wide variation makes us question experimental design • Means alone are not sufficient in determining whether two groups differ statistically from one another.

  32. A typical normal distribution curve

  33. According to this curve: • One standard deviation away from the mean in either direction on the horizontal axis (the red area on the preceding graph) accounts for approximately 68 percent of the data in this group. • Two standard deviations away from the mean (the red and green areas) account for roughly 95 percent of the data.

  34. Three Standard Deviations? • Three standard deviations (the red, green and blue areas) account for about 99.7 percent of the data • [Axis labels: -3 SD, -2 SD, ±1 SD, +2 SD, +3 SD]
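
The percentages quoted for one, two and three standard deviations can be checked against an ideal normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `fraction_within` is invented for this example):

```python
from statistics import NormalDist


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k} SD: {fraction_within(k):.1%}")
```

Real data sets are only approximately normal, so observed percentages will differ somewhat from these ideal values.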

  35. Significant difference between two data sets using the t-test • The t-test compares the means of two sets of data to judge whether the difference between them could have arisen by chance alone • Scientists like to be at least 95% certain of their findings before drawing conclusions (a significance level of p = 0.05 in the t tables) • Mean, SD, and sample size are used to calculate the value of t • Degrees of freedom = sum of the sample sizes of the two groups minus 2
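
The slide does not give the formula for t. The sketch below assumes the pooled-variance (equal variance) form of the unpaired t-test, because its degrees of freedom match the rule stated above. The data are made up, `pooled_t` is a name chosen for this example, and the critical value 2.306 is the t table entry for 8 degrees of freedom at p = 0.05 (two-tailed).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an unpaired t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (each uses the n-1 method)
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


shade = [10, 12, 14, 16, 18]     # hypothetical heights in cm
sunlight = [20, 22, 24, 26, 28]  # hypothetical heights in cm

t, df = pooled_t(shade, sunlight)
critical_value = 2.306  # from a t table: df = 8, p = 0.05, two-tailed
print(f"t = {t:.2f}, df = {df}")
print("significant at 95%:", abs(t) > critical_value)
```

If the calculated t (ignoring its sign) is larger than the table value, the difference between the means is treated as significant at the 95% level.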
