2. Outline Objectives
Background
Tecolote’s position
Future study items noted in Mr. Covert’s paper (see Reference 1)
How to apply the correlation formula
Ground Rules for Developing USCM7 CERs
Using CER Data Points to Compute Pearson’s r
Multiplicative Error Model (MUPE) and Error Forms
Pearson’s Correlation Coefficient
Definition and example
Property
Revisited High Correlation Items from Reference 1
USCM8 Sample Correlation Coefficients
Conclusions
3. Objectives Derive correlations between the USCM CER uncertainties using an analytic method
Note: Correlation matters in cost risk analysis because it impacts uncertainty.
4. Tecolote’s Position Cost correlation is not the same as “CER noise correlation”
With CERs as cost estimating methodologies, most of the correlations are captured through the functional relationships specified in the WBS
Do any correlations exist for the remaining noise terms?
“Cost correlation” is not the same as “noise correlation” when CERs are considered. Strong correlations between cost elements in a database should not be mistaken as evidence that the residuals or percentage errors of estimating methodologies derived from the same database are correlated.
5. Future Study Items Noted in Reference 1 High correlation coefficients between USCM7 CER uncertainties in “Correlation Coefficients for Spacecraft Subsystems from the USCM7 Database”
Note: These correlation numbers seemed extraordinarily high, especially those approaching one, such as 0.98 and 0.97. We wondered if there were any good engineering reasons to believe that the remaining noise for the apogee kick motor (AKM) T1 CER was almost perfectly correlated with the noise for the attitude determination and control system (ADCS) nonrecurring CER. Similarly, could we conclude the existence of high correlation for the remaining uncertainties between program-level and communication nonrecurring costs?
6. How should we apply the correlation formula to the data points? Reference 1 used 26 satellites from the entire USCM7 database to compute correlation coefficients for USCM7 CERs
“Outliers” not eliminated
Population not homogeneous
We should not use the entire database to compute correlation coefficients
Data point selection
Error form consideration
7. Ground Rules for Developing USCM7 CERs ATSF deleted due to incomplete cost data
Programs with no costs identified were not used
AE, CRRES, P78-1, P78-2, P72-2, OSO, S3, DMSP 5-D1, DMSP 5-D2, and DMSP 5-D3 did not have a communication payload
DSCS, DMSP, DSP, AE, OSO, and SMS did not have an AKM
GPS 9-11 and CRRES AKMs were GFE…
Follow-on production programs: DSCS 4-7, DSCS 8-14, DMSP 5-D2, DSP 5-12, DSP 18-22, FLTSATCOM 6-8, GPS 9-11, and GPS 13-40 not used in the nonrecurring CERs
DSCS A (a development program) not used in the T1 CER
Data points displaying program peculiarity were not used in subsystem CER development
Note: The Combined Release and Radiation Effects Satellite (CRRES) was deleted from the TT&C nonrecurring cost CER because the costs did not represent a full design effort.
8. Ground Rules for Developing USCM7 CERs (2) P78-1, P78-2, P72-2, and S3 were identified as Space Test Programs (STPs)
A smaller physical size, maximum reuse of existing HW
Shorter design life (6–18 months)
Not a full-up design effort for nonrecurring
Not a full-up manufacturing effort for recurring
AE, OSO, and CRRES were considered experimental satellites
Developed a separate CER for estimating STPs and experimental programs if appropriate
Using primary equation to predict STPs would be incorrect
Note: We have tried to use dummy variables to include STPs and experimental programs in primary equations, if suitable.
9. Using CER Data Points to Compute Pearson’s r Even worse: calculating the corresponding correlation coefficient when the primary equation is used to predict STPs
If a satellite doesn’t have a particular subsystem, do not include it in computing the correlation coefficient for the corresponding subsystem-level CER
Percentage errors could be 100% using any CER
Do not use data points with program peculiarity to compute Pearson’s r if they are excluded from the CER
Refit the CER with previously excluded outliers if necessary
Homogeneous data set is essential
Note: Using the primary equation to predict STPs would give inaccurate and misleading results if STPs were not included in its data set.
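The data point selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `paired_errors`, the dictionary layout, and the satellite labels are hypothetical and do not come from USCM. The idea is that each CER carries error terms only for the programs actually used to fit it, and a correlation is computed only over the programs common to both CERs.

```python
def paired_errors(errors_a, errors_b):
    """Pair the error terms of two CERs over the programs used in BOTH fits.

    errors_a, errors_b: dicts mapping a program label to that program's
    error term (percentage error or residual) under each CER.
    Programs missing from either CER (no such subsystem, or excluded
    for program peculiarity) are dropped from the pairing.
    """
    common = sorted(set(errors_a) & set(errors_b))
    xs = [errors_a[p] for p in common]
    ys = [errors_b[p] for p in common]
    return common, xs, ys


# Hypothetical percentage errors keyed by made-up satellite labels.
cer_a_errors = {"SAT_1": 0.12, "SAT_2": -0.08, "SAT_3": 0.05}
cer_b_errors = {"SAT_2": 0.03, "SAT_3": -0.10, "SAT_4": 0.20}

programs, xs, ys = paired_errors(cer_a_errors, cer_b_errors)
print(programs)  # ['SAT_2', 'SAT_3']
```

Only the pairs `(xs, ys)` returned here would be passed on to a correlation calculation; SAT_1 and SAT_4 never enter it.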
10. Multiplicative Error Model – MUPE Definition for cost variation:
Y = f(X)*ε
where E(ε) = 1 and V(ε) = σ²
Error in cost is proportional to cost.
11. Candidate Error Forms MUPE models use percentage errors:
Note: Residuals are weighted by the reciprocal of the predicted value
Additive models use residuals:
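As a small sketch of the two error forms, assuming the percentage error is the residual divided by the predicted value (which is what "weighted by the reciprocal of the predicted value" suggests), the following Python computes both. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def percentage_errors(actuals, predictions):
    """MUPE error form: (actual - predicted) / predicted for each data point."""
    return [(y - f) / f for y, f in zip(actuals, predictions)]


def residuals(actuals, predictions):
    """Additive error form: actual - predicted for each data point."""
    return [y - f for y, f in zip(actuals, predictions)]


# Hypothetical actual costs and CER-predicted costs (same units).
actuals = [120.0, 80.0, 210.0]
predictions = [100.0, 100.0, 200.0]

pct = percentage_errors(actuals, predictions)  # about [0.20, -0.20, 0.05]
res = residuals(actuals, predictions)          # [20.0, -20.0, 10.0]
print(pct, res)
```

Note how the second and third points have residuals of different size (-20 versus 10) but the percentage form rescales each by its own predicted cost, which is the behavior expected when error is proportional to cost.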
12. Pearson’s Correlation Coefficient Pearson’s correlation coefficient measures the linear association between two sets of pairs {xi} and {yi}
{xi} and {yi} are the paired percentage errors for multiplicative models
{xi} and {yi} are the paired residuals for additive models
The sample means of {xi} and {yi} should both be zero
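A minimal sketch of the sample calculation, using the standard textbook form of Pearson's r (the slide's own formula is not reproduced here, so this form is an assumption). The function names and the paired error values are hypothetical. The helper `mean` is included so the means of both error sets can be inspected alongside r.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Sample Pearson correlation between paired values xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired percentage errors from two CERs (made-up numbers).
xs = [0.10, -0.05, 0.20, -0.25]
ys = [0.05, -0.10, 0.15, -0.10]

print(mean(xs), mean(ys))  # both should be close to zero for well-matched error terms
print(pearson_r(xs, ys))
```

If either printed mean is far from zero, that is worth investigating before the correlation value itself is interpreted.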
13. Reference 1: Deriving Correlation Coefficients We usually do not know the true value of ρxy, so we approximate it by the sample correlation rxy
Example calculation using randomly generated numbers
Note: Neither the mean of the xi’s nor the mean of the yi’s is zero. This is a warning flag that there is a mismatch between the CERs and their error terms.
14. Pearson’s r Preserved through Linear Transformation Given the following:
T = X + Y
X = f(W)*ε
Y = g(W)*η
(Note: f and g are USCM7 weight-based CERs; ε and η are error terms)
The correlation between X and Y at a given weight is the same as the correlation between ε and η, i.e., ρ(X, Y) = ρ(ε, η)
Total cost variance at a given weight, wt, is given by
We should consider the correlations between percentage errors instead of residuals
Note: Here a total project cost T is composed of two elements, X and Y, which are modeled by the USCM7 weight-based CERs f and g, respectively: T = X + Y, X = f(W)*ε, and Y = g(W)*η.
15. Pearson’s r Preserved Through Linear Transformation (2) General total cost variance:
Where:
σk, σm, and ρkm are the standard deviations of the noise terms for the WBS elements k and m, respectively, and the correlation between them.
fk and fm are the CER estimated values for the WBS elements k and m, respectively.
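A hedged sketch of the total cost variance calculation follows. It assumes the usual variance-of-a-sum form implied by the definitions above, Var(T) = Σk fk²σk² + 2 Σk<m ρkm fk fm σk σm, with each CER estimate fk treated as fixed and each σk taken as the standard deviation of a multiplicative noise term. The function name and the numbers are hypothetical.

```python
def total_cost_variance(estimates, sigmas, corr):
    """Variance of the total cost under multiplicative noise.

    estimates: CER estimated values f_k for each WBS element
    sigmas:    standard deviations sigma_k of the noise terms
    corr:      full correlation matrix rho_km (ones on the diagonal)
    Summing rho_km * f_k * f_m * sigma_k * sigma_m over all (k, m)
    yields the f_k^2 * sigma_k^2 terms plus twice the k < m cross terms.
    """
    n = len(estimates)
    variance = 0.0
    for k in range(n):
        for m in range(n):
            variance += corr[k][m] * estimates[k] * estimates[m] * sigmas[k] * sigmas[m]
    return variance


# Hypothetical two-element example: f = (100, 50), sigma = (0.3, 0.2).
estimates = [100.0, 50.0]
sigmas = [0.3, 0.2]

uncorrelated = total_cost_variance(estimates, sigmas, [[1.0, 0.0], [0.0, 1.0]])
correlated = total_cost_variance(estimates, sigmas, [[1.0, 0.5], [0.5, 1.0]])
print(uncorrelated, correlated)  # about 1000 and 1300
```

In this made-up case a noise correlation of 0.5 raises the total variance by about 30 percent, which illustrates why the correlation between the noise terms matters for the total.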
16. Revisited High Correlation Items in Previous Study High correlation coefficients listed in Reference 1 not found with the revised approach
Note: The correlation of 0.8 between the EPS NR and COMM NR CER noise terms is not significant, as the sample size is only 6. The corresponding 95% CI for r is from -0.033 to 0.977. The CI for z’ is 0.5*ln[(1+r)/(1-r)] ± (zα/2)(σz’)
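The interval quoted in the note can be checked with a short Python sketch of the Fisher z’ transformation. It assumes the standard error of z’ is 1/sqrt(n - 3), the usual large-sample approximation, and a two-sided 95% normal critical value of about 1.96; the function name `fisher_ci` is hypothetical.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate confidence interval for a correlation via Fisher's z'.

    r:      sample correlation coefficient (-1 < r < 1)
    n:      number of paired data points (must exceed 3)
    z_crit: normal critical value (1.959964 for a two-sided 95% interval)
    """
    z = math.atanh(r)               # same as 0.5 * ln[(1 + r) / (1 - r)]
    se = 1.0 / math.sqrt(n - 3)     # assumed standard error of z'
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.8, 6)
print(round(low, 3), round(high, 3))  # -0.033 0.977
```

Because the interval contains zero, a sample correlation of 0.8 from only six data points cannot be distinguished from no correlation at the 5% level.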
17. USCM8 Sample Correlation Coefficients Range: (-0.925, 0.913), Mean = 0.04, Median = 0.02, Skew = -0.02
1st quartile = -0.32, 3rd quartile = 0.44, sd = 0.44
73% of the correlation coefficients are from –0.5 to 0.5
Three sample correlations with absolute values > 0.85: 0.90, 0.91, -0.93
The sample correlation coefficients range from -0.925 to 0.913 with an average of 0.04, median of 0.02, and standard deviation of 0.44. There are only three sample correlations with absolute values greater than 0.85. They are 0.90, 0.91, and -0.93 (shown in red on the backup chart). The sample correlation of 0.9 is significant, but the other two numbers are not, due to the sample size.
The shape of the histogram is very different from the one listed in Reference 1. See the graph on next page for comparison.
18. Reference 1: USCM7 Correlation Coefficients This graph is from Mr. Covert’s correlation analysis paper.
19. Correlations between Structure/Thermal and SEPM Nonrecurring CERs For non-communication satellites: 0.90
For communication satellites: -0.54
For all satellites: 0.73
This result indicates that the noise of the SEPM nonrecurring CER for non-communication satellites might be correlated with the noise of the combined structure and thermal nonrecurring CER. The data points in this category are STPs and experimental programs. Another interesting point is that the SEPM noise term is moderately correlated with the combined structure and thermal nonrecurring CER noise term for communication satellites, with a negative correlation of -0.54. But the overall sample correlation coefficient is 0.73 if communication and non-communication satellites are combined.
20. Conclusions Sample correlation coefficient is sensitive to the computing method
Use CER data points to compute Pearson’s r to avoid heteroscedasticity
In cost risk analysis, consider the correlations between
percentage errors instead of residuals for multiplicative CERs and
residuals instead of percentage errors for additive CERs
Means of the errors should be zero when computing Pearson’s r
With the revised approach, high correlations from previous study for USCM7 CERs are not found
We have found no discernible sample correlations for the USCM8 subsystem-level CERs using the revised method:
Mean = 0.04, Median = 0.02, Skew = -0.02
73% of them are between -0.5 and 0.5.
Three sample correlations with absolute values greater than 0.85: 0.90, 0.91, and -0.93 (0.90 is significant, but not the other two)
Cost correlation is not the same as “CER noise correlation”
Use this analytic method as a cross-check
Suggestion: Use this analytic method as a cross-check to see (1) if CERs are developed properly and (2) if we need to check with the program office about the development process for certain cost elements.
21. References Covert, Raymond P., “Correlation Coefficients for Spacecraft Subsystems from the USCM7 Database,” Third Joint Annual ISPA/SCEA International Conference, Vienna, VA, 12-15 June 2001.
Garvey, Paul R., “Do Not Use Rank Correlation in Cost Risk Analysis,” 32nd Annual DoD Cost Analysis Symposium, Williamsburg, VA, 2-5 February 1999.
Nguyen, P., et al., “Unmanned Spacecraft Cost Model, Seventh Edition,” U.S. Air Force Space and Missile Systems Center (SMC/FMC), Los Angeles AFB, CA, August 1994.
Nguyen, P., et al., “Unmanned Spacecraft Cost Model, Eighth Edition,” U.S. Air Force Space and Missile Systems Center (SMC/FMC), Los Angeles AFB, CA, October 2001.
Tecolote Research, Inc., “RI$K in ACE User’s Manual,” GM 075, August 1999.
23. USCM8 Sample Correlation Coefficients Note: The above table contains correlation coefficients for the uncertainties between USCM8 Subsystem-Level MUPE CERs. PGM_T1C denotes the SEPM T1 CER for communication satellites, while PGM_T1NC is the SEPM T1 CER for non-communication satellites. The noise correlation between the spacecraft nonrecurring cost and the SEPM nonrecurring cost for non-communication satellites is -0.23, which is not displayed in the above table.
There are only three sample correlations with absolute values greater than 0.85. They are 0.90, 0.91, and -0.93 (shown in red in the table above). The sample correlation of 0.9 is significant, but the other two numbers are not, due to the sample size.