Regression Analysis Defense Resources Management Institute
Unscheduled Maintenance Issue: • 36 flight squadrons • Each experiences unscheduled maintenance actions (UMAs) • Each UMA costs $1000 to repair, on average.
You’ve got the Data… Now What? Unscheduled Maintenance Actions (UMAs)
What do you want to know? • How many UMAs will there be next month? • What is the average number of UMAs?
UMAs Next Month 95% Confidence Interval
Average UMAs 95% Confidence Interval
Model: Cost of UMAs for one squadron If the cost per UMA = $1000, the Expected cost for one squadron = $60,000
Model: Total Cost of UMAs Expected Cost for all squadrons = 60 * $1000 * 36 = $2,160,000 How confident are we about this estimate?
~ 95% mean (= 60) standard error = s/√n = 12/√36 = 2
~56 ~58 60 ~62 ~64 (1 standard unit = 2) ~ 95%
95% Confidence Interval on our estimate of UMAs and costs • 60 ± 2(2) = [56, 64] • low cost: 56 * $1000 * 36 = $2,016,000 • high cost: 64 * $1000 * 36 = $2,304,000
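The interval and cost bounds above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the standard deviation of 12 is an assumption read off the slide's standard-error figure, and the variable names are illustrative only.

```python
import math

n_squadrons = 36
mean_umas = 60.0       # average UMAs per squadron (from the slides)
std_dev_umas = 12.0    # assumption: the "12" in the standard-error slide
cost_per_uma = 1000.0  # dollars per UMA (from the slides)

# Standard error of the mean: s / sqrt(n)
standard_error = std_dev_umas / math.sqrt(n_squadrons)

# Approximate 95% interval: mean +/- 2 standard errors
margin = 2 * standard_error
low, high = mean_umas - margin, mean_umas + margin

# Translate the interval on average UMAs into total cost for all squadrons
low_cost = low * cost_per_uma * n_squadrons
high_cost = high * cost_per_uma * n_squadrons

print(f"standard error: {standard_error}")
print(f"95% interval on mean UMAs: [{low}, {high}]")
print(f"total cost range: ${low_cost:,.0f} to ${high_cost:,.0f}")
```

The multiplier 2 is the slide's rounded value; the exact normal quantile is about 1.96, which would give a slightly narrower interval.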
What do you want to know? • How many UMAs will there be next month? • What is the average number of UMAs? • Is there a relationship between UMAs and some other variable that may be used to predict UMAs? • What is that relationship?
Relationships • What might be related to UMAs? • Pilot Experience ? • Flight hours ? • Sorties flown ? • Mean time to failure (for specific parts) ? • Number of landings / takeoffs ?
Regression: • To estimate the expected or mean value of UMAs for next month: • look for a linear relationship between UMAs and a “predictive” variable • If a linear relationship exists, use regression analysis
Regression analysis: describes and evaluates relationships between one variable (dependent or explained variable), and one or more other variables (called the independent or explanatory variables).
What is a good estimating variable for UMAs? • quantifiable • predictable • logical relationship with dependent variable • must be a linear relationship: Y = a + bX
Describing the Relationship • Is there a relationship? • Do the two variables (UMAs and sorties or experience) move together? • Do they move in the same direction or in opposite directions? • How strong is the relationship? • How closely do they move together?
Correlation Coefficient • Statistical measure of how closely two variables are moving together in a coordinated fashion • Measures strength and direction • Value ranges from -1.0 to +1.0 • +1.0 indicates “perfect” positive linear relation • -1.0 indicates “perfect” negative linear relation • 0 indicates no linear relation between the two variables
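As an illustration of how the correlation coefficient is computed, here is a minimal sketch of the Pearson formula in plain Python. The sorties and UMA figures are hypothetical, made up for this example; they are not the data behind the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of x and y around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each variable around its own mean
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: sorties flown and UMAs for four squadrons
sorties = [100, 110, 120, 130]
umas = [50, 62, 69, 81]

r = pearson_r(sorties, umas)
print(f"r = {r:.4f}")
```

A value of r near +1 means the two variables rise together almost along a straight line; a value near 0 means there is little linear co-movement.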
Sorties vs. UMAs r = .9788
Experience vs. UMAs r = .1896
A Word of Caution... • Correlation does NOT imply causation • It simply measures the coordinated movement of two variables • Variation in two variables may be due to a third common variable • The observed relationship may be due to chance alone
What is the Relationship? • In order to use the correlation information to help describe the relationship between two variables we need a model • The simplest one is a linear model: Y = a + bX
One Possibility Sum of errors = 0
Another Possibility Sum of errors = 0
Which is Better? • Both have sum of errors = 0 • Compare sum of absolute errors:
One Possibility Sum of absolute errors = 6
Another Possibility Sum of absolute errors = 6
Which is Better? • Sums of absolute errors are equal • Compare sum of errors squared:
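The comparison can be sketched numerically. The four data points and the two candidate lines below are hypothetical, chosen so that both lines have a sum of errors of 0 and a sum of absolute errors of 6, as on the slides; only the sum of squared errors tells them apart.

```python
# Hypothetical data points (not the data from the slides)
xs = [1, 2, 3, 4]
ys = [13.5, 12.5, 14.5, 19.5]

def errors(a, b):
    """Errors (actual minus predicted) for the line Y = a + b*X."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def summarize(errs):
    """Return (sum of errors, sum of absolute errors, sum of squared errors)."""
    return (
        sum(errs),
        sum(abs(e) for e in errs),
        sum(e * e for e in errs),
    )

line_a = summarize(errors(10.0, 2.0))  # candidate line Y = 10 + 2X
line_b = summarize(errors(7.5, 3.0))   # candidate line Y = 7.5 + 3X

print("line A:", line_a)
print("line B:", line_b)
```

Line B has one large miss (an error of 3), and squaring makes that miss count for more, so its sum of squared errors is larger even though the first two criteria cannot separate the lines.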
The Correct Relationship: Y = a + bX + U • a + bX is the systematic part • U is the random part [Figure: plot of Y (50 to 100) against X (100 to 130)]
Least-Squares Method • Penalizes large absolute errors • Y-intercept: a = mean(Y) − b · mean(X) • Slope: b = Σ(Xi − mean(X))(Yi − mean(Y)) / Σ(Xi − mean(X))²
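The least-squares line has a closed form: the slope is the sum of cross-deviations divided by the sum of squared X deviations, and the intercept makes the line pass through the point of means. The sketch below applies those standard formulas to hypothetical sorties and UMA data; the function name and the data are assumptions for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: line passes through the means
    return a, b

# Hypothetical data: sorties flown and UMAs for four squadrons
sorties = [100, 110, 120, 130]
umas = [50, 62, 69, 81]

a, b = least_squares(sorties, umas)
predicted = a + b * 125  # estimated UMAs for a squadron flying 125 sorties

print(f"UMAs = {a} + {b} * sorties")
print(f"predicted UMAs at 125 sorties: {predicted}")
```

With these made-up numbers, each additional sortie is associated with one additional UMA; real data would of course give a different slope.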
Assumptions • Linear relationship: Y = a + bX + U • Errors are random and normally distributed with mean = 0 and variance = σ² • Supported by Central Limit Theorem
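One rough way to look at the error assumptions is to fit the line and inspect the residuals. The sketch below, on hypothetical data, checks that the residuals sum to zero (which a least-squares fit with an intercept guarantees) and estimates the error variance as the sum of squared residuals divided by n − 2. It does not test normality, which would need more data and a plot or a formal test.

```python
# Hypothetical data: sorties flown and UMAs for four squadrons
xs = [100, 110, 120, 130]
ys = [50, 62, 69, 81]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x

# Residuals: the observed counterpart of the random term U
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Estimated error variance; n - 2 because two parameters were fitted
sse = sum(e * e for e in residuals)
variance_estimate = sse / (n - 2)

print("residuals:", residuals)
print("sum of residuals:", sum(residuals))
print("estimated error variance:", variance_estimate)
```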