§ 13.4 - 14.1 Terminology, Clinical Studies, Graphical Representations of Data
Terminology
• A statistic is a piece of numerical information taken from a sample.
• A parameter is a piece of numerical information about the population being studied.
• In other words, a statistic is an estimate for a parameter.
• Sampling error is the difference between a parameter and the statistic used to estimate it. The causes of this error are:
  1. Error due to chance or sampling variability.
  2. A poorly chosen sample--sample bias.
• If we have a sample of size n from a population of size N, then the sampling rate is the ratio n/N.
The Capture-Recapture Method
• Step 1: Capture (choose) a sample of size n1, tag every animal/object/person in it, and release them.
• Step 2: After some amount of time, capture a new sample of size n2 and count the tagged individuals in it. Call this number k.
• If the second sample is representative, the tagged fraction in it, k/n2, should be close to the tagged fraction in the whole population, n1/N. Solving for N gives the estimate N ≈ (n1)(n2)/k.
Example: The N-value of the Monarch Butterfly
• Suppose 150 monarchs are caught, tagged and released.
• A few days later 200 more monarchs are caught, of which only 2 are found to be tagged.
• Estimate the N-value of the local monarch population.
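The monarch example can be worked through with a short sketch of the capture-recapture estimate. The function name `capture_recapture_estimate` is made up for illustration, and the sketch assumes all 150 monarchs in the first sample were tagged and that the second sample is representative.

```python
def capture_recapture_estimate(n1: int, n2: int, k: int) -> float:
    """Estimate the population size N as (n1 * n2) / k.

    n1: size of the first (tagged) sample
    n2: size of the second sample
    k:  number of tagged individuals found in the second sample
    """
    if k <= 0:
        # With no recaptured tags the ratio is undefined.
        raise ValueError("at least one tagged individual must be recaptured")
    return n1 * n2 / k


# Monarch example: 150 tagged, 200 recaptured, 2 of them tagged.
estimate = capture_recapture_estimate(n1=150, n2=200, k=2)
print(estimate)  # 15000.0
```

With only 2 tagged butterflies in the second sample, the estimate is very sensitive to chance: finding 3 tags instead of 2 would lower it to 10,000.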
Clinical Studies
• Clinical studies are concerned with determining whether a single variable causes a certain effect.
• The goal is to limit confounding variables--other possible causes.
• In a controlled study the subjects are divided into two groups: the treatment group and the control group.
• If the subjects are assigned to the two groups randomly, then the study is a randomized controlled study.
Clinical Studies
• If the control group is given a placebo, then the study is a controlled placebo study.
• If neither group of subjects knows whether they are receiving treatment or a placebo, then the study is said to be blind.
• If neither the subjects nor the scientists know who is receiving treatment and who is receiving a placebo, then the study is referred to as double-blind.
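Random assignment in a randomized controlled study can be illustrated with a minimal sketch. The function `randomize_groups` and the subject labels are hypothetical, and the even split into two halves is an assumption made for illustration, not a requirement stated in the slides.

```python
import random


def randomize_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject labels, used only for illustration.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize_groups(subjects, seed=0)
print(len(treatment), len(control))  # 10 10
```

Because chance alone decides who lands in which group, a confounding variable is unlikely to be concentrated in one group.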
Graphical Representations of Data
• A data set is a collection of individual data points. Below is a data set consisting of test scores:
48 40 32 44 72 64 44 28 72 36 44 36 44 44 44
96 72 44 32 72 36 40 76 36 32 40 40 24 36 32
76 72 44 48 40 32 60 72 72 28 48 44 40 72 40
48 36 36 48 36 44 76 44 40 40 40 40 40 40 32
44 48 36 76 60 40 48 36 56 44 4 40 48 48 40
Frequency Table
• One way we might summarize the data is in the form of a Frequency Table.
• The number below each exam score is the number of students getting that score.
Score      4  24  28  32  36  40  44  48  56  60  64  72  76  96
Frequency  1   1   2   6  10  16  13   9   1   2   1   8   4   1
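The frequency table can be reproduced from the raw scores with a few lines of code. This is a minimal sketch using Python's standard `collections.Counter`; the variable names are chosen for illustration.

```python
from collections import Counter

# The 75 test scores from the data set above.
scores = [int(s) for s in """
48 40 32 44 72 64 44 28 72 36 44 36 44 44 44
96 72 44 32 72 36 40 76 36 32 40 40 24 36 32
76 72 44 48 40 32 60 72 72 28 48 44 40 72 40
48 36 36 48 36 44 76 44 40 40 40 40 40 40 32
44 48 36 76 60 40 48 36 56 44 4 40 48 48 40
""".split()]

# Count how many students received each score.
frequency = Counter(scores)

for score in sorted(frequency):
    print(f"{score:>3}  {frequency[score]:>2}")
```

The frequencies add up to 75, the number of scores in the data set, which is a quick check that no score was dropped or counted twice.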
Bar Graphs
• Another convenient way to summarize the test scores is in the form of a bar graph:
Variables: Quantitative v. Qualitative
• A variable is any value or characteristic that varies with members of a population.
• In the previous example, test scores would be considered a variable.
• A variable is said to be quantitative if it represents a measurable quantity.
• A variable that cannot be measured is called qualitative.
Variables: Continuous v. Discrete
• If the possible values of a variable are ‘countable’--or if there is some smallest increment we can use--the variable is said to be discrete.
• If the difference between values of a variable can be arbitrarily small, then the variable is called continuous.
Example: Blood Types
Forty people recently donated blood and their types are listed below:
O O A B A O A A A O
B O B O O A O O A A
A A AB A B A A O O A
O O A A A O A O O AB
Example: Blood Types
While this data is qualitative, it is still possible to make both a frequency table and a bar graph to represent it:
Example: Blood Types
Another way to present the information is in the form of a pie chart.
• What differentiates this from the previous tables and graphs is that it shows the percentage, or relative frequency, of each blood type in the sample.
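The relative frequencies that a pie chart displays can be computed directly from the forty blood types listed earlier. This is a minimal sketch; the variable names are chosen for illustration.

```python
from collections import Counter

# The forty donors' blood types from the example.
blood_types = """
O O A B A O A A A O
B O B O O A O O A A
A A AB A B A A O O A
O O A A A O A O O AB
""".split()

counts = Counter(blood_types)
total = len(blood_types)

# Relative frequency = count / sample size.
relative = {blood_type: count / total for blood_type, count in counts.items()}

for blood_type, count in counts.most_common():
    print(f"{blood_type:<2}  {count:>2}  {relative[blood_type]:.0%}")
```

Each slice of the pie chart is proportional to one of these relative frequencies, and together they add up to 100%.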
Let’s return for a moment to our test score example. . .
• Suppose the instructor decided to allocate grades as follows:
  A  80 - 100
  B  50 - 79
  C  30 - 49
  D  0 - 29
• This is an example of using what are called class intervals.
• When there are too many different values or categories to display our data nicely, we will use these kinds of intervals to simplify the situation.
The test scores, when sorted into class intervals (in this case the letter grades), can be graphed like this:
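Sorting the scores into the letter-grade class intervals amounts to adding up the frequencies that fall inside each interval. This is a minimal sketch that starts from the frequency table given earlier; the function name `letter_grade` is chosen for illustration.

```python
# Frequency table of the test scores: score -> number of students.
frequency = {4: 1, 24: 1, 28: 2, 32: 6, 36: 10, 40: 16, 44: 13,
             48: 9, 56: 1, 60: 2, 64: 1, 72: 8, 76: 4, 96: 1}


def letter_grade(score):
    """Map a score to its class interval: A 80-100, B 50-79, C 30-49, D 0-29."""
    if score >= 80:
        return "A"
    if score >= 50:
        return "B"
    if score >= 30:
        return "C"
    return "D"


# Add each score's frequency to the count for its letter grade.
grade_counts = {"A": 0, "B": 0, "C": 0, "D": 0}
for score, count in frequency.items():
    grade_counts[letter_grade(score)] += count

print(grade_counts)  # {'A': 1, 'B': 16, 'C': 54, 'D': 4}
```

Fourteen distinct scores collapse into four classes, which is the simplification that class intervals are meant to provide.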
Histograms
• You may have noticed that in every chart or graph we have given so far, the variable used was discrete.
• How can we graphically display continuous variables?
• We can use a variation on the bar graph called a histogram.
Example: Age at first marriage.
Based on a survey, the frequency table below was obtained for the age of groom at first marriage in the state of Wisconsin. Using class intervals of length 10 (years), draw a histogram for the given data.
Age Interval*   # of Grooms
20 - 25         11,768
25 - 30          9,796
30 - 35          3,300
35 - 40            840
40 - 45            404
45 - 50             83
Example: Age at first marriage.
Using the same frequency table, now draw a histogram with intervals that are five years in length.
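The bar heights for the two histograms can be worked out before drawing anything. The five-year histogram uses the table as given, while the ten-year histogram needs adjacent classes merged. This is a minimal sketch of that merging step; it assumes the intervals are keyed by their lower and upper endpoints, and the variable names are chosen for illustration.

```python
# Frequency table from the example: (low, high) age interval -> number of grooms.
five_year = {
    (20, 25): 11768,
    (25, 30): 9796,
    (30, 35): 3300,
    (35, 40): 840,
    (40, 45): 404,
    (45, 50): 83,
}

# Merge each pair of adjacent five-year classes into one ten-year class.
ten_year = {}
for (low, high), count in five_year.items():
    start = low - (low - 20) % 10   # 20 and 25 -> 20, 30 and 35 -> 30, ...
    key = (start, start + 10)
    ten_year[key] = ten_year.get(key, 0) + count

for (low, high), count in ten_year.items():
    print(f"{low} - {high}: {count:,}")
```

The ten-year classes contain 21,564, 4,140 and 487 grooms. Widening the intervals hides the drop between the 20 - 25 and 25 - 30 classes that the five-year histogram shows.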