Characterizing Variability and Comparing Patterns from Data. “Statistics” Module 3. Outline: random samples; notion of a statistic; estimating the mean (sample average); assessing the impact of variation on estimates (sampling distribution)
Characterizing Variability and Comparing Patterns from Data
“Statistics”
Module 3
J. McLellan
Scenario 
A random sample of size “n” of a population random variable X is a collection of random variables X1, …, Xn such that each Xi has the same distribution as X:
$$F_{X_i}(x) = F_X(x)$$
Each Xi represents a snapshot of the process. The Xi’s are referred to as sample random variables.
What do we do with these sample values?
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Lower case is used to denote observed values of the sample random variables and average:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Definition
A statistic is a function of sample random variables that is used to estimate a value of a parameter, and does not depend on any unknown parameters.
Example: the sample average
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
A statistic is a random variable, with its own probability distribution
Mean
$$E\{\bar{X}\} = E\left\{\frac{1}{n}\sum_{i=1}^{n} X_i\right\} = \frac{1}{n}\,E\left\{\sum_{i=1}^{n} X_i\right\}$$
$$= \frac{1}{n}\sum_{i=1}^{n} E\{X_i\} = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{n\mu}{n} = \mu$$
The value expected on average of the sample average is the true mean of the process, so the sample average is an UNBIASED estimator for the mean.
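Variance
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)$$
$$= \frac{1}{n^2}\,\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i)$$
$$= \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
The variance of the sum equals the sum of the variances because of independence of sample random variables.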
If we have a sum of independent random variables, X and Y, with “a” and “b” constants, then
$$\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)$$
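The mean and variance results for the sample average can be checked numerically. The following is a minimal sketch, assuming a Normally distributed process with hypothetical values mu = 10, sigma = 2 and n = 5 (none of these numbers come from the slides). It draws many random samples and compares the mean and variance of the resulting sample averages with the theoretical values mu and sigma^2/n.

```python
import random
import statistics


def simulate_sample_averages(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` random samples of size n; return their sample averages."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]


# Hypothetical process values, chosen only for illustration.
mu, sigma, n = 10.0, 2.0, 5
averages = simulate_sample_averages(mu, sigma, n, repeats=20000)

mean_of_averages = statistics.fmean(averages)
var_of_averages = statistics.pvariance(averages)

print(f"mean of sample averages:     {mean_of_averages:.3f} (theory: {mu})")
print(f"variance of sample averages: {var_of_averages:.3f} (theory: {sigma**2 / n})")
```

With a fixed seed the run is repeatable; the simulated values should sit close to, but not exactly at, the theoretical ones.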
Interpretation
… is estimated using the following statistic:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$
Observed value:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
Mean of the sample variance:
$$E\{s^2\} = \sigma^2$$
Sample variance is an UNBIASED estimator of variance.
… is simply the square root of the sample variance
BUT
$$E\{s\} \neq \sigma$$
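The contrast between the unbiased sample variance and the biased sample standard deviation can be seen in a small simulation. This is a sketch under assumed conditions: Normal data with a hypothetical sigma = 2 and sample size n = 5, neither taken from the slides.

```python
import random
import statistics

rng = random.Random(1)
sigma, n, repeats = 2.0, 5, 20000   # hypothetical values for illustration

variances, std_devs = [], []
for _ in range(repeats):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)   # sample variance, divides by n - 1
    variances.append(s2)
    std_devs.append(s2 ** 0.5)         # sample standard deviation

mean_s2 = statistics.fmean(variances)
mean_s = statistics.fmean(std_devs)

print(f"average of s^2: {mean_s2:.3f} (sigma^2 = {sigma**2})")
print(f"average of s:   {mean_s:.3f} (sigma   = {sigma})")
```

The average of s^2 should land near sigma^2, while the average of s should fall noticeably below sigma for a sample this small.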
Consider the sample average
$$\bar{X} \sim N(\mu_X,\ \sigma_X^2/n)$$
where “~” reads “is distributed as” and “N( , )” reads “Normally distributed with mean and variance”.
We can standardize this to have zero mean and unit variance:
$$Z = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$$
Distribution for standard normal:
Start with
$$P(-1.96 < Z < 1.96) = 0.95$$
and consider Z:
$$P\left(-1.96 < \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} < 1.96\right) = 0.95$$
$$\Leftrightarrow\quad P\left(\mu_X - 1.96\,\sigma_X/\sqrt{n} < \bar{X} < \mu_X + 1.96\,\sigma_X/\sqrt{n}\right) = 0.95$$
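Rearrange this last statement to obtain:
$$P\left(\bar{X} - 1.96\,\sigma_X/\sqrt{n} < \mu_X < \bar{X} + 1.96\,\sigma_X/\sqrt{n}\right) = 0.95$$
Interpretation: the interval endpoints $\bar{X} \pm 1.96\,\sigma_X/\sqrt{n}$ are RANDOM; the mean $\mu_X$ is NOT random.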
Picture: sequence of intervals associated with repeated experimentation
(figure: intervals plotted against the true value of mean)
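The picture of repeated experimentation can be imitated in code. The sketch below assumes a Normal process with hypothetical values mu = 50, sigma = 3 and n = 10 (not from the slides), builds the 95% interval for each simulated experiment, and counts how often the random interval covers the fixed true mean.

```python
import random
import statistics


def coverage(mu, sigma, n, repeats, z=1.96, seed=2):
    """Fraction of simulated known-variance intervals that contain the true mean."""
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The endpoints change from experiment to experiment; mu does not.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / repeats


# Hypothetical process values, chosen only for illustration.
fraction = coverage(mu=50.0, sigma=3.0, n=10, repeats=10000)
print(f"fraction of intervals containing the true mean: {fraction:.3f}")
```

The fraction should come out close to 0.95, which is the sense in which the interval is a 95% confidence interval.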
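General result for mean: 100(1−α)% confidence interval given by:
$$\bar{X} - z_{\alpha/2}\,\sigma_X/\sqrt{n} < \mu_X < \bar{X} + z_{\alpha/2}\,\sigma_X/\sqrt{n}$$
where $z_{\alpha/2}$ is the value of the standard Normal random variable with upper tail area α/2 (1.96 when α = 0.05).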
General Approach
$$Z = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$$
$$P\left(-1.96 < \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} < 1.96\right) = 0.95$$
$$P\left(\bar{X} - 1.96\,\sigma_X/\sqrt{n} < \mu_X < \bar{X} + 1.96\,\sigma_X/\sqrt{n}\right) = 0.95$$
Known variance
When population variance is “known”, the 100(1−α)% confidence interval is
$$\bar{X} - z_{\alpha/2}\,\sigma_X/\sqrt{n} < \mu_X < \bar{X} + z_{\alpha/2}\,\sigma_X/\sqrt{n}$$
What if variance is unknown?
Solution: follow the previous approach by forming the standardized quantity, with the sample standard deviation $s_X$ in place of $\sigma_X$:
$$\frac{\bar{X} - \mu_X}{s_X/\sqrt{n}}$$
When the data are Normally distributed,
$$\frac{\bar{X} - \mu_X}{s_X/\sqrt{n}}$$
follows a Student’s t distribution with n − 1 degrees of freedom.
… has a shape similar to that of the Normal distribution
(figure: Student’s t density with 3 degrees of freedom)
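Although the shapes are similar, the t distribution puts more probability in the tails than the standard Normal when the degrees of freedom are few. The sketch below is a simulation under assumed conditions (standard Normal data, n = 4, so 3 degrees of freedom). It forms the standardized quantity with s in place of sigma and counts how often it falls outside ±1.96, which would be about 5% of the time for a standard Normal.

```python
import random
import statistics


def t_statistics(n, repeats, seed=3):
    """Simulate (xbar - mu) / (s / sqrt(n)) for Normal data with true mean 0."""
    rng = random.Random(seed)
    values = []
    for _ in range(repeats):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)            # sample standard deviation
        values.append(xbar / (s / n ** 0.5))    # mu = 0 by construction
    return values


t_values = t_statistics(n=4, repeats=20000)     # 3 degrees of freedom
tail_fraction = sum(abs(t) > 1.96 for t in t_values) / len(t_values)
print(f"fraction outside +/-1.96: {tail_fraction:.3f} (standard Normal: 0.05)")
```

The simulated fraction should be well above 0.05, which is why the wider t critical value is needed when the variance is estimated.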
Variance Unknown
$$\bar{X} - t_{\nu,\alpha/2}\,s_X/\sqrt{n} < \mu_X < \bar{X} + t_{\nu,\alpha/2}\,s_X/\sqrt{n}$$
where ν = n − 1 is the degrees of freedom.
Conversion in a chemical reactor using new catalyst preparation
$$76.1 - (1.96)(2.1)/\sqrt{10} < \mu < 76.1 + (1.96)(2.1)/\sqrt{10}$$
$$\Rightarrow\quad 74.8 < \mu < 77.4$$
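The arithmetic of this known-variance interval can be reproduced in a few lines. This is a minimal sketch: the function name `z_interval` is hypothetical, and the inputs (average 76.1, sigma 2.1, n = 10) are read off the interval expression on the slide. The critical value is computed with `statistics.NormalDist` from the Python standard library rather than taken from a table.

```python
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for the mean, variance known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper tail area alpha/2
    half_width = z * sigma / n ** 0.5
    return xbar - half_width, xbar + half_width


# Values read from the slide's interval expression.
lower, upper = z_interval(xbar=76.1, sigma=2.1, n=10)
print(f"{lower:.1f} < mu < {upper:.1f}")
```

Rounded to one decimal place, the endpoints agree with the interval stated on the slide.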
Conversion in a chemical reactor using new catalyst preparation
$$76.1 - (2.262)(2.3)/\sqrt{10} < \mu < 76.1 + (2.262)(2.3)/\sqrt{10}$$
$$\Rightarrow\quad 74.5 < \mu < 77.7$$
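The same check works for the unknown-variance case. In this sketch the function name `t_interval` is hypothetical, and the t critical value 2.262 (9 degrees of freedom, upper tail area 0.025) is passed in exactly as it appears on the slide, since the Python standard library has no t quantile function.

```python
def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when the variance is estimated by s^2.

    t_crit is the Student's t critical value for n - 1 degrees of freedom,
    supplied by the caller (for example, from a table).
    """
    half_width = t_crit * s / n ** 0.5
    return xbar - half_width, xbar + half_width


# Values read from the slide's interval expression.
lower, upper = t_interval(xbar=76.1, s=2.3, n=10, t_crit=2.262)
print(f"{lower:.1f} < mu < {upper:.1f}")
```

The interval is wider than the known-variance one, both because s is slightly larger here and because 2.262 exceeds 1.96.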
First, we need to know the sampling distribution of the sample variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$$
$$Z^2 \sim \chi^2_1$$
$$Z_1^2 + Z_2^2 + Z_3^2 \sim \chi^2_3$$
(figure: Chi-squared density with 3 degrees of freedom)
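The construction of a Chi-squared random variable as a sum of squared independent standard Normals can be simulated directly. The sketch below uses 3 terms, matching the 3 degrees of freedom above, and compares the simulated mean and variance with the known Chi-squared values (mean k, variance 2k for k degrees of freedom). The repeat count and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(4)
k, repeats = 3, 20000   # k = degrees of freedom; repeats is arbitrary

# Each entry is Z1^2 + Z2^2 + Z3^2 for independent standard Normal Z's.
sums = [
    sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
    for _ in range(repeats)
]

mean_sum = statistics.fmean(sums)
var_sum = statistics.pvariance(sums)

print(f"simulated mean:     {mean_sum:.2f} (theory: {k})")
print(f"simulated variance: {var_sum:.2f} (theory: {2 * k})")
```

Every simulated value is positive and the distribution is skewed to the right, which is why intervals built from it need not be symmetric.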
Sample variance
$$\frac{s^2}{\sigma^2} \sim \frac{\chi^2_{n-1}}{n-1}$$
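$$P\left(\chi^2_{n-1,\,1-\alpha/2} < \frac{(n-1)\,s^2}{\sigma^2} < \chi^2_{n-1,\,\alpha/2}\right) = 1 - \alpha$$
$$P\left(\frac{(n-1)\,s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)\,s^2}{\chi^2_{n-1,\,1-\alpha/2}}\right) = 1 - \alpha$$
$$\frac{(n-1)\,s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)\,s^2}{\chi^2_{n-1,\,1-\alpha/2}}$$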
Notes
1) the tail areas are equal; however, the interval can be asymmetric
2) $\chi^2_{n-1,\,1-\alpha/2}$ is the value of the Chi-squared random variable with upper tail area of 1 − α/2 and n − 1 degrees of freedom
(figure: Chi-squared density with equal tail areas)
Temperature controller has been implemented on a polymer reactor 
Use confidence interval for variance
$$\chi^2_{9,\,1-0.025} = 2.7$$
$$\chi^2_{9,\,0.025} = 19.0$$
$$1.52 < \sigma^2 < 10.67$$
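The interval for the variance can be reproduced from the two Chi-squared values on the slide. Note the assumption: the sample variance itself does not survive in this text, so the sketch uses s^2 = 3.2, a value back-calculated from the reported interval rather than quoted from the slides. The sample size n = 10 follows from the 9 degrees of freedom, and the function name `variance_interval` is hypothetical.

```python
def variance_interval(s2, n, chi2_low, chi2_high):
    """Confidence interval for the variance.

    chi2_low  = Chi-squared value with upper tail area 1 - alpha/2
    chi2_high = Chi-squared value with upper tail area alpha/2
    Both are for n - 1 degrees of freedom and supplied by the caller.
    """
    lower = (n - 1) * s2 / chi2_high   # larger critical value gives lower bound
    upper = (n - 1) * s2 / chi2_low    # smaller critical value gives upper bound
    return lower, upper


# s2 = 3.2 is an ASSUMPTION inferred from the reported interval, not slide data.
lower, upper = variance_interval(s2=3.2, n=10, chi2_low=2.7, chi2_high=19.0)
print(f"{lower:.2f} < sigma^2 < {upper:.2f}")
```

The upper endpoint lies much further from s^2 than the lower one, which illustrates the asymmetry of the interval.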
Comment
Conclusion still doesn’t change, however.
$$2.04 < \sigma^2 < 5.71$$
$$1.52 < \sigma^2 < 10.67$$