Practical Applications of Statistical Methods in the Clinical Laboratory. Roger L. Bertholf, Ph.D., DABCC Associate Professor of Pathology Director of Clinical Chemistry & Toxicology UF Health Science Center/Jacksonville.
“[Statistics are] the only tools by which an opening can be cut through the formidable thicket of difficulties that bars the path of those who pursue the Science of Man.”
[Sir] Francis Galton (1822-1911)
“There are three kinds of lies: Lies, damned lies, and statistics”
Benjamin Disraeli (1804-1881)
“Do not worry about your difficulties in mathematics, I assure you that mine are greater”
Albert Einstein (1879-1955)
“I don't believe in mathematics”
Albert Einstein
The mean is a measure of the centrality of a set of data.
The geometric mean is primarily used to average ratios or rates of change.
Suppose you spend $6 on pills costing 30 cents per dozen, and $6 on pills costing 20 cents per dozen. What was the average price of the pills you bought?
You spent $12 on 50 dozen pills, so the average cost is 12/50=0.24, or 24 cents.
This also happens to be the harmonic mean of 20 and 30:
$\frac{2}{\frac{1}{20} + \frac{1}{30}} = 24$
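The pill example can be checked numerically. This is a minimal sketch; the variable names are illustrative, and prices are expressed in cents per dozen.

```python
from statistics import harmonic_mean

# Prices in cents per dozen; $6 (600 cents) is spent at each price.
prices = [30, 20]
spend_cents = 600

# 600/30 = 20 dozen plus 600/20 = 30 dozen gives 50 dozen in total.
dozens_bought = sum(spend_cents / p for p in prices)

# Average price = total cents spent / total dozens bought.
average_price = len(prices) * spend_cents / dozens_bought

# With equal spending at each price, this equals the harmonic mean.
hm = harmonic_mean(prices)
```

Both `average_price` and `hm` come out to 24 cents per dozen, whereas the arithmetic mean of 20 and 30 would be 25.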
For the data set:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10:
The mode is the value that occurs most often
The midrange is the mean of the highest and lowest values
The median is the value for which half of the remaining values are above and half are below it. I.e., in an ordered array of 15 values, the 8th value is the median. If the array has 16 values, the median is the mean of the 8th and 9th values.
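For the data set 1 through 10, these measures can be computed with Python's standard library. This is only a sketch; `multimode` is used because every value occurs exactly once, so no single mode exists.

```python
from statistics import mean, median, multimode

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

data_mean = mean(data)                        # arithmetic mean
data_median = median(data)                    # even count: mean of the 5th and 6th values
data_midrange = (min(data) + max(data)) / 2   # mean of the lowest and highest values
data_modes = multimode(data)                  # all values tie, each occurring once
```

For this symmetric data set the mean, median, and midrange coincide at 5.5.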
Suppose you’re thinking about building a house in a certain neighborhood, and the real estate agent tells you that the average (mean) size house in that area is 2,500 sq. ft. Astutely, you ask “What’s the median size?” The agent replies “1,800 sq. ft.”
What does this tell you about the sizes of the houses in the neighborhood?
Two sets of data may have similar means, but otherwise be very dissimilar. For example, males and females have similar baseline LH concentrations, but there is much wider variation in females.
How do we express quantitatively the amount of variation in a data set?
The variance is the mean of the squared differences between individual data points and the mean of the array.
Or, after simplifying, the mean of the squares minus the squared mean.
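The two forms of the variance can be compared directly. The sketch below assumes the population form (dividing by N) and uses the data set 1 through 10 as an illustration.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n = len(data)
mean = sum(data) / n

# Definition: mean of the squared differences from the mean.
var_definition = sum((x - mean) ** 2 for x in data) / n

# Shortcut: mean of the squares minus the squared mean.
var_shortcut = sum(x * x for x in data) / n - mean ** 2
```

Both expressions give 8.25 for this data set.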
In what units is the variance?
Is that a problem?
The standard deviation is the square root of the variance. Standard deviation is not the mean difference between individual data points and the mean of the array.
In what units is the standard deviation?
Is that a problem?
*Sometimes called the Relative Standard Deviation (RSD or %RSD)
The standard deviation of an average decreases by the reciprocal of the square root of the number of data points used to calculate the average.
How many measurements must we average to improve our precision by a factor of 2?
To improve precision by a factor of 2:
To improve precision by a factor of 10:
Improvement in CV by running duplicates:
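Since the standard deviation of an average falls as 1/sqrt(N), the number of replicates needed grows as the square of the desired improvement. A minimal sketch, with hypothetical function names:

```python
import math

def sd_of_mean(sd_single, n):
    """Standard deviation of the average of n measurements."""
    return sd_single / math.sqrt(n)

def replicates_needed(improvement_factor):
    """Smallest n for which sd_of_mean is reduced by the given factor."""
    return math.ceil(improvement_factor ** 2)

n_for_factor_2 = replicates_needed(2)     # 4 measurements
n_for_factor_10 = replicates_needed(10)   # 100 measurements

# Running duplicates multiplies the SD (and the CV) by 1/sqrt(2).
duplicate_ratio = sd_of_mean(1.0, 2)
```

Running duplicates therefore reduces the CV to about 71% of its single-measurement value, an improvement of roughly 29%.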
“Sir, I have found you an argument. I am not obliged to find you an understanding.”
Samuel Johnson (1709-1784)
The binomial distribution applies to events that have two possible outcomes. The probability of r successes in n attempts, when the probability of success in any individual attempt is p, is given by:
$P(r) = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r}$
What is the probability that 10 of the 12 babies born one busy evening in your hospital will be girls?
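The question can be answered with the binomial formula. The sketch below assumes that each birth is independent and that the probability of a girl is exactly 0.5; neither assumption is stated in the original.

```python
from math import comb

def binomial_probability(r, n, p):
    """Probability of exactly r successes in n independent attempts."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Assumption: p = 0.5 for a girl at each birth.
p_ten_girls = binomial_probability(10, 12, 0.5)
```

Under these assumptions the probability is 66/4096, or about 1.6%.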
“God does arithmetic”
Karl Friedrich Gauss (1777-1855)
What is the Gaussian distribution?
63, 81, 36, 12, 28, 7, 79, 52, 96, 17, 22, 4, 61, 85, etc.
63 + 22 = 85
81 + 73 = 154
36 + 54 = 90
12 + 33 = 45
28 + 99 = 127
7 + 5 = 12
79 + 61 = 140
52 + 28 = 80
96 + 58 = 154
17 + 24 = 41
22 + 16 = 38
4 + 77 = 81
61 + 43 = 104
85 + 8 = 93
. . . etc.
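[Figure: Gaussian probability curve; probability versus x]
The probability of x in a Gaussian distribution with mean µ and standard deviation σ is given by:
$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$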
“Like the ski resort full of girls hunting for husbands and husbands hunting for girls, the situation is not as symmetrical as it might seem.”
Alan Lindsay Mackay (1926- )
[Figure: Gaussian probability curve; x-axis marked µ−3σ, µ−2σ, µ−σ, µ, µ+σ, µ+2σ, µ+3σ; areas labeled .67 and .95]
Range       Probability   Odds
+/- 1.00    68.3%         1 in 3
+/- 1.64    90.0%         1 in 10
+/- 1.96    95.0%         1 in 20
+/- 2.58    99.0%         1 in 100
[On the Gaussian curve] “Experimentalists think that it is a mathematical theorem while the mathematicians believe it to be an experimental fact.”
Gabriel Lippmann (1845-1921)
"Life is good for only two things, discovering mathematics and teaching mathematics"
Siméon Poisson (1781-1840)
The Poisson distribution predicts the frequency of r events occurring randomly in time, when the expected frequency is λ:
$P(r) = \frac{\lambda^r e^{-\lambda}}{r!}$
How many counts must be collected in an RIA in order to ensure an analytical CV of 5% or less?
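For Poisson-distributed counts the variance equals the mean, so N counts have a standard deviation of sqrt(N) and a CV of 1/sqrt(N). The sketch below uses a hypothetical helper name and considers counting error only, ignoring every other source of analytical variation.

```python
import math

def counts_for_cv(cv_percent):
    """Minimum number of counts so that the counting CV, 100/sqrt(N), is at most cv_percent."""
    return math.ceil((100 / cv_percent) ** 2)

counts_needed = counts_for_cv(5)
```

A counting CV of 5% or less requires at least 400 counts; tightening the target to 1% would require 10,000.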
When a small sample is selected from a large population, we sometimes have to make certain assumptions in order to apply statistical methods
Recall that the Gaussian distribution is defined by the probability function:
$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$
Note that the exponential factor contains both µ and σ, both population parameters. The factor is often simplified by making the substitution:
$z = \frac{x - \mu}{\sigma}$
The variable z in the equation:
$P(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2 / 2}$
is distributed according to a unit Gaussian, since it has a mean of zero and a standard deviation of 1.
[Figure: unit Gaussian probability curve; z-axis marked from −3 to 3; areas labeled .67 and .95]
But if we use the sample mean and standard deviation instead, we get:
$t = \frac{x - \bar{x}}{s}$
and we’ve defined a new quantity, t, which is not distributed according to the unit Gaussian. It is distributed according to the Student’s t distribution.
The Student’s t statistic can also be used to analyze differences between the sample mean and the population mean:
$t = \frac{\bar{x} - \mu}{s / \sqrt{N}}$
Note that, for a sufficiently large N (>30), t can be replaced with z, and a Gaussian distribution can be assumed
The mean age of the 20 participants in one workshop is 27 years, with a standard deviation of 4 years. Next door, another workshop has 16 participants with a mean age of 29 years and standard deviation of 6 years.
Is the second workshop attracting older technologists?
First, calculate the t statistic for the two means:
Next, determine the degrees of freedom:
Since 1.16 is less than 1.64 (the t value corresponding to 90% confidence limit), the difference between the mean ages for the participants in the two workshops is not significant
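The workshop comparison can be reproduced numerically. The formulas on the original slides did not survive, so the sketch below computes three common variants of the two-sample t statistic. That the first variant (pooled variance weighted by N) matches the quoted 1.16 is an observation, not a confirmed account of the author's method.

```python
import math

n1, mean1, sd1 = 20, 27.0, 4.0   # first workshop
n2, mean2, sd2 = 16, 29.0, 6.0   # second workshop

diff = mean2 - mean1
df = n1 + n2 - 2                 # degrees of freedom for the pooled test

# Variant 1 (assumption): pooled variance weighting each s^2 by N.
pooled_var_n = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / df
t_pooled_n = diff / math.sqrt(pooled_var_n * (1 / n1 + 1 / n2))

# Variant 2: conventional pooled variance weighting each s^2 by N - 1.
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
t_pooled = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Variant 3: unequal-variance (Welch) form.
t_welch = diff / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
```

The three variants give roughly 1.16, 1.20, and 1.15. All fall below 1.64, so the conclusion does not depend on which one is used.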
Suppose we are comparing two sets of data in which each value in one set has a corresponding value in the other. Instead of calculating the difference between the means of the two sets, we can calculate the mean difference between data pairs.
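Instead of:
$\bar{x}_1 - \bar{x}_2$
we use:
$\bar{d} = \frac{1}{N} \sum_i (x_{1i} - x_{2i})$
to calculate t:
$t = \frac{\bar{d}}{s_d / \sqrt{N}}$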
If the type of data permit paired analysis, the paired t test is much more sensitive than the unpaired t.
Why?
There is a general formula that relates actual measurements to their predicted values
A special (and very useful) application of the χ² distribution is to frequency data
In your hospital, you have had 83 cases of iatrogenic strep infection in your last 725 patients. St. Elsewhere, across town, reports 35 cases of strep in their last 416 patients.
Do you need to review your infection control policies?
If your infection control policy is roughly as effective as St. Elsewhere’s, we would expect that the rates of strep infection for the two hospitals would be similar. The expected frequency, then, would be the average
First, calculate the expected frequencies at your hospital (f1) and St. Elsewhere (f2)
Next, we sum the squared differences between actual and expected frequencies, each divided by the expected frequency
In general, when comparing k sample proportions, the degrees of freedom for χ² analysis are k − 1. Hence, for our problem, there is 1 degree of freedom.
A table of χ² values lists 3.841 as the χ² corresponding to a probability of 0.05.
So the variation (χ²) between strep infection rates at the two hospitals is within statistically predicted limits, and therefore is not significant.
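The strep comparison can be worked through in a few lines. This is a sketch under stated assumptions: the expected counts use the pooled infection rate across both hospitals, and the first statistic sums over infected cases only, which appears to be the approach described here. The second statistic is the conventional 2×2 version that also includes the uninfected patients, shown for comparison.

```python
cases = [83, 35]        # your hospital, St. Elsewhere
patients = [725, 416]

# Pooled ("average") infection rate across both hospitals.
pooled_rate = sum(cases) / sum(patients)

# Expected number of cases at each hospital if the rates were equal.
expected = [pooled_rate * n for n in patients]

# Chi-square over the infected counts only.
chi2_cases = sum((o - e) ** 2 / e for o, e in zip(cases, expected))

# Conventional 2x2 chi-square: add the uninfected cells as well.
uninfected = [n - c for n, c in zip(patients, cases)]
expected_uninfected = [n - e for n, e in zip(patients, expected)]
chi2_full = chi2_cases + sum(
    (o - e) ** 2 / e for o, e in zip(uninfected, expected_uninfected)
)

CRITICAL_1DF_P05 = 3.841  # tabulated value for 1 degree of freedom, p = 0.05
```

Both versions (about 2.35 and 2.63) fall below 3.841, so either way the difference is not significant at p = 0.05.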
The F statistic is simply the ratio of two variances
$F = \frac{V_1}{V_2}$
(by convention, the larger V is the numerator)
There are several ways the F distribution can be used. Applications of the F statistic are part of a more general type of statistical analysis called analysis of variance (ANOVA). We’ll see more about ANOVA later.
You’re asked to do a “quick and dirty” correlation between three whole blood glucose analyzers. You prick your finger and measure your blood glucose four times on each of the analyzers.
Are the results equivalent?
The mean glucose concentrations for the three analyzers are 70, 85, and 76.
If the three analyzers are equivalent, then we can assume that all of the results are drawn from an overall population with mean µ and variance σ².
Approximate µ by calculating the mean of the means:
Calculate the variance of the means:
But what we really want is the variance of the population. Recall that:
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{N}$
Since we just calculated $\sigma_{\bar{x}}^2$, we can solve for $\sigma^2$
So we now have an estimate of the population variance, which we’d like to compare to the real variance to see whether they differ. But what is the real variance?
We don’t know, but we can calculate the variance based on our individual measurements.
If all the data were drawn from a larger population, we can assume that the variances are the same, and we can simply average the variances for the three data sets.
Now calculate the F statistic:
A table of F values indicates that 4.26 is the limit for the F statistic at a 95% confidence level (when the appropriate degrees of freedom are selected). Our value of 10.6 exceeds that, so we conclude that there is significant variation between the analyzers.
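The between-analyzer half of this calculation can be reproduced from the three means. The sketch below assumes the variance of the means is computed with an n − 1 denominator, and that N is the four readings per analyzer. The individual readings are not given in the text, so the within-analyzer variance is only back-calculated from the quoted F of 10.6, not measured.

```python
from statistics import mean, variance

group_means = [70, 85, 76]   # mean glucose for each analyzer
n_per_group = 4              # four finger-stick readings per analyzer

grand_mean = mean(group_means)            # estimate of the population mean
var_of_means = variance(group_means)      # sample variance (n - 1 denominator)

# Population variance estimated from the spread of the means:
# variance of a mean = population variance / N.
between_estimate = n_per_group * var_of_means

def f_statistic(between, within_variances):
    """Ratio of the between-group estimate to the average within-group variance."""
    return between / mean(within_variances)

# Back-calculated, not measured: the average within-analyzer variance
# that would yield the quoted F of 10.6.
implied_within = between_estimate / 10.6
```

With these assumptions the mean of the means is 77, the variance of the means is 57, and the between-analyzer estimate of the population variance is 228.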
[Figure: two probability distributions, one plotted against log x and one against x]
The concentrations of most clinical analytes are not usually distributed in a Gaussian manner. Why?
How do we determine the reference range (limits of expected values) for these analytes?
What happens when we want to compare one reference range with another? This is precisely what CLIA ‘88 requires us to do.
How do we do this?
“Everything should be made as simple as possible, but not simpler.”
Albert Einstein
Suppose we just do a small internal reference range study, and compare our results to the manufacturer’s range.
How do we compare them?
Is this a valid approach?
Rank normal values (x1,x2,x3...xn) and the reference population (y1,y2,y3...yn):
x1, y1, x2, x3,y2, y3 ... xn, yn
Count the number of y values that follow each x, and call the sum Ux. Calculate Uy also.
*Also called the U test, rank sum test, or Wilcoxon’s test.
It should be obvious that: Ux + Uy = NxNy
If the two distributions are the same, then:
Ux = Uy = 1/2NxNy
Large differences between Ux and Uy indicate that the distributions are not equivalent
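The counting procedure can be sketched as follows. The function name and the example values are hypothetical, and the sketch assumes there are no tied values between the two sets.

```python
def u_counts(xs, ys):
    """Return (Ux, Uy): Ux counts, for each x, the y values that follow it in rank order."""
    ux = sum(1 for x in xs for y in ys if y > x)
    uy = sum(1 for y in ys for x in xs if x > y)
    return ux, uy

# Hypothetical values whose rank order is x1, y1, x2, x3, y2, y3.
xs = [1, 3, 4]
ys = [2, 5, 6]
ux, uy = u_counts(xs, ys)
```

Here Ux = 7 and Uy = 2, which sum to NxNy = 9. For identical distributions each would be expected to be near 4.5.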
“‘Obvious’ is the most dangerous word in mathematics.”
Eric Temple Bell (18831960)
In the run test, order the values in the two distributions as before:
x1, y1, x2, x3, y2, y3 ... xn, yn
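Add up the number of runs (consecutive values from the same distribution). If the two data sets are randomly selected from one population, the values will be well intermixed and there will be many short runs; a small number of long runs suggests that the distributions differ.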
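Counting runs is straightforward once the pooled values are sorted. This is a minimal sketch with hypothetical example values; it assumes no ties and counts a single isolated value as a run of length one.

```python
from itertools import groupby

def count_runs(xs, ys):
    """Number of runs of same-source values in the pooled, sorted data."""
    labeled = sorted([(v, "x") for v in xs] + [(v, "y") for v in ys])
    labels = [source for _, source in labeled]
    # groupby collapses each stretch of identical labels into one group.
    return sum(1 for _ in groupby(labels))

mixed_runs = count_runs([1, 3, 4], [2, 5, 6])       # order: x y x x y y
separated_runs = count_runs([1, 2, 3], [4, 5, 6])   # order: x x x y y y
```

The intermixed example gives 4 runs, while the completely separated sets give only 2, the minimum possible.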
Sometimes, when we don’t know anything about a distribution, the best thing to do is independently test its characteristics.
[Figure: repeated random samples of size N drawn from the reference population, each summarized by its mean and SD]
With the Monte Carlo method, we have simulated the test we wish to apply; that is, we have randomly selected samples from the parent distribution, and determined whether our in-house data are in agreement with the randomly selected samples.
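One way to sketch this idea in code is to draw many random samples of the in-house sample size from the reference population and ask how often a random sample's mean lies at least as far from the reference mean as the in-house mean does. The function name, the choice of the mean as the only compared characteristic, and the example data are all assumptions made for illustration. A fuller version would compare the SD as well.

```python
import random
from statistics import mean

def monte_carlo_mean_test(reference, in_house, trials=10_000, seed=0):
    """Fraction of random samples whose mean deviates at least as much as the in-house mean."""
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    n = len(in_house)
    ref_mean = mean(reference)
    observed = abs(mean(in_house) - ref_mean)
    extreme = 0
    for _ in range(trials):
        sample = rng.sample(reference, n)   # draw without replacement
        if abs(mean(sample) - ref_mean) >= observed:
            extreme += 1
    return extreme / trials

# Hypothetical reference population and two hypothetical in-house data sets.
reference = list(range(1, 101))
typical = monte_carlo_mean_test(reference, [10, 30, 50, 70, 90], trials=2000)
atypical = monte_carlo_mean_test(reference, [96, 97, 98, 99, 100], trials=2000)
```

A fraction near 1 means the in-house data look like an ordinary random sample from the reference population, and a fraction near 0 means they do not.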
[Figure: scatter plot; both axes scaled from 0 to 50]
Linear regression analysis generates an equation for a straight line
y = mx + b
where m is the slope of the line and b is the value of y when x = 0 (the y-intercept).
The calculated equation minimizes the sum of the squared differences between actual y values and the linear regression line.
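The least-squares slope and intercept have closed-form solutions, sketched below. The function name and the example data are hypothetical; the data were chosen to lie exactly on a line so the result is easy to check.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x   # the fitted line passes through (mean_x, mean_y)
    return m, b

# Hypothetical data lying exactly on y = 2x + 1.
m, b = fit_line([0, 10, 20, 30, 40], [1, 21, 41, 61, 81])
```

This form of regression assumes that all of the error is in the y values.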
[Figure: scatter plot with regression line y = 1.031x − 0.024; both axes scaled from 0 to 50]
Do x and y values vary in concert, or randomly?
It is clear that the greater the covariance, the stronger the relationship between x and y.
But . . . what about units?
e.g., if you measure glucose in mg/dL, and I measure it in mmol/L, who’s likely to have the highest covariance?
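The units question can be illustrated numerically. The sketch below uses hypothetical paired glucose results and an approximate conversion factor of 18 mg/dL per mmol/L. It shows that the covariance changes with the units, while the covariance divided by the product of the two standard deviations (the correlation coefficient) does not.

```python
import math

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance scaled by both standard deviations, which makes it unitless."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

MG_DL_PER_MMOL_L = 18.0  # approximate conversion factor for glucose

# Hypothetical paired results from two methods, in mg/dL.
method_a_mg = [70, 90, 110, 150, 200]
method_b_mg = [72, 88, 113, 148, 205]
method_a_mmol = [v / MG_DL_PER_MMOL_L for v in method_a_mg]
method_b_mmol = [v / MG_DL_PER_MMOL_L for v in method_b_mg]

cov_mg = covariance(method_a_mg, method_b_mg)
cov_mmol = covariance(method_a_mmol, method_b_mmol)
r_mg = correlation(method_a_mg, method_b_mg)
r_mmol = correlation(method_a_mmol, method_b_mmol)
```

The covariance in mg/dL is 18² = 324 times larger than in mmol/L for the same data, but the correlation coefficient is identical in both units.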
[Figure: scatter plot with regression line y = 1.031x − 0.024, correlation coefficient = 0.9986; both axes scaled from 0 to 50]
[Figure: scatter plot with regression line y = 1.031x − 0.024, correlation coefficient = 0.9894; both axes scaled from 0 to 50]
The linear regression equation gives us a way to calculate an “estimated” y for any given x value, given the symbol ŷ (y-hat):
$\hat{y} = mx + b$
Now what we are interested in is the average difference between the measured y and its estimate, ŷ :
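The spread of the measured y values about the regression line is usually summarized as a root-mean-square residual. The sketch below assumes the common convention of dividing by N − 2 (two parameters, slope and intercept, are estimated from the data). The original formula is not preserved here, and the example data are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Fit y = m*x + b by least squares, then return the RMS residual with N - 2 in the denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    # Residual = measured y minus its estimate, y-hat.
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / (n - 2))

# Hypothetical data: the fitted line is y = 0.6x + 1.
s_yx = standard_error_of_estimate([1, 2, 3, 4], [2, 1, 4, 3])
```

For these four points the residuals are 0.4, −1.2, 1.2, and −0.4, giving a value of about 1.26.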
[Figure: scatter plot with regression line y = 1.031x − 0.024, correlation coefficient = 0.9986, sy/x = 1.83; both axes scaled from 0 to 50]
[Figure: scatter plot with regression line y = 1.031x − 0.024, correlation coefficient = 0.9894, sy/x = 5.32; both axes scaled from 0 to 50]
If we assume that the errors in the y measurements are Gaussian (is that a safe assumption?), then the standard error of the estimate gives us the boundaries within which about 68% of the y values will fall.
±2sy/x defines the 95% boundaries.
[Figure: signal versus time, with the signal/noise threshold marked]
At an S/N ratio of 5, what is the minimum CV of the measurement?
If the S/N is 5, 20% of the measured signal is noise, which is random. Therefore, the CV must be at least 20%.
[Figure: signal versus concentration]
We can eliminate any point that differs from the next highest value by more than 0.765 (p=0.05, for a set of four values) times the spread between the highest and lowest values (Dixon test).
Example: 4, 5, 6, 13
(13 − 4) × 0.765 = 6.89
Since 13 − 6 = 7 is greater than 6.89, the value 13 can be eliminated.
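The Dixon check for a suspiciously high value can be sketched as below. The function name is hypothetical, and the critical value 0.765 is the one quoted for this four-value example. Other sample sizes need a different tabulated value.

```python
def dixon_rejects_highest(values, critical=0.765):
    """True if the highest value exceeds its neighbor by more than critical * total spread."""
    ordered = sorted(values)
    gap = ordered[-1] - ordered[-2]      # distance to the next highest value
    spread = ordered[-1] - ordered[0]    # highest minus lowest
    return gap > critical * spread

reject_13 = dixon_rejects_highest([4, 5, 6, 13])   # gap 7 vs. threshold 6.885
reject_8 = dixon_rejects_highest([4, 5, 6, 8])     # gap 2 vs. threshold 3.06
```

In the first set the gap of 7 exceeds the threshold, so 13 is flagged as an outlier; in the second set 8 is retained.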
If the analytical method has a high variance (CV), it is likely that small deviations from linearity will not be detected due to the high standard error of the estimate.
[Figure: signal versus concentration]
Recall that, for linear data, the relationship between x and y can be expressed as
y = f(x) = a + bx
A curve is described by the quadratic equation:
y = f(x) = a + bx + cx²
which is identical to the linear equation except for the addition of the cx² term.
It should be clear that the smaller the x² coefficient, c, the closer the data are to linear (since the equation reduces to the linear form when c approaches 0).
What is the drawback to this approach?
[Figure: signal versus concentration]
The ANOVA technique requires that method variance is constant at all concentrations. Cochran’s test is used to test whether this is the case.
“If your experiment needs statistics, you ought to have done a better experiment.”
Ernest Rutherford (18711937)
If TP is the number of “true positives”, and FN is the number of “false negatives”, the sensitivity is defined as:
$\text{Sensitivity} = \frac{TP}{TP + FN}$
Of 25 admitted cocaine abusers, 23 tested positive for urinary benzoylecgonine and 2 tested negative. What is the sensitivity of the urine screen?
If TN is the number of “true negative” results, and FP is the number of falsely positive results, the specificity is defined as:
$\text{Specificity} = \frac{TN}{TN + FP}$
What would you guess is the specificity of any particular clinical laboratory test? (Choose any one you want)
Since reference ranges are customarily set to include the central 95% of values in healthy subjects, we expect 5% of values from healthy people to be “abnormal”; this is the false positive rate.
Hence, the specificity of most clinical tests is no better than 95%.
[Figure: distributions of marker concentration in subjects without (−) and with (+) disease]
[Figure: ROC curve; true positive rate (sensitivity) versus false positive rate (1 − specificity)]
The predictive value of a clinical laboratory test takes into account the prevalence of a certain disease, to quantify the probability that a positive test is associated with the disease in a randomly selected individual, or alternatively, that a negative test is associated with health.
The predictive value is the % of all positives that are true positives:
$PV_{+} = \frac{TP}{TP + FP} \times 100$
Predictive value describes the usefulness of a clinical laboratory test in the real world.
Or does it?
We can combine the PV+ and PV− to give a quantity called the efficiency:
$\text{Efficiency} = \frac{TP + TN}{TP + FP + TN + FN} \times 100$
The efficiency is the percentage of all patients that are classified correctly by the test result.
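These definitions translate directly into code. In the sketch below the function names are illustrative. The sensitivity uses the cocaine screen counts (23 positive, 2 negative among 25 abusers), the 95% specificity is the reference-range figure discussed above, and the 1% prevalence is a hypothetical value chosen only to show how strongly prevalence affects the predictive value.

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

def positive_predictive_value(sens, spec, prevalence):
    """Fraction of positive results that are true positives at a given prevalence."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def efficiency(tp, tn, fp, fn):
    """Fraction of all patients classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

screen_sensitivity = sensitivity(tp=23, fn=2)

# Hypothetical prevalence of 1%; specificity of 95% as discussed above.
pv_positive = positive_predictive_value(screen_sensitivity, 0.95, 0.01)
```

The sensitivity is 92%, yet at 1% prevalence only about 16% of positive results would be true positives.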
“To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of.”
Ronald Aylmer Fisher (1890-1962)
“He uses statistics as a drunken man uses lamp posts - for support rather than illumination.”
Andrew Lang (1844-1912)
1_2s: 1 in 20
1_3s: 1 in 300
2_2s: 1 in 400
R_4s: 1 in 800
4_1s: 1 in 600
10_x: 1 in 1000
[Figures: four control charts, each with the y-axis marked +3sd, +2sd, +1sd, mean, −1sd, −2sd, −3sd]
“In science one tries to tell people, in such a way as to be understood by everyone, something that no one ever knew before. But in poetry, it's the exact opposite.”
Paul Adrien Maurice Dirac (1902-1984)