
Schedule and Cost Growth

R. L. Coleman, J. R. Summerville, M. E. Dameron

35th ADoDCAS

PMI 2002 National Conference

November, 2002

Outline
  • Descriptive Statistics
  • Investigating the Hypothesis
  • Is There a Curve?
  • Normalizing for Dollar Size
  • Correction Factors and Their Use
    • Correcting EACs and Risk Models
  • Analysis Conclusions
  • Modeling Schedule Duration in Networks
  • How Networks Operate
    • Some Toy Problems
Background
  • At the MDA* Risk Working Group of 29/30 May 01, Schedule Risk was a major topic
  • Action Item:
    • Investigate Schedule Risk
      • Content variation
      • Cost risk*
      • PERT
      • Time and budget constraints

* The subject of this paper

This work was conducted for and funded by the IC CAIG and MDA

The Hypothesis
  • Many people believe¹ in a graph of cost growth vs. schedule growth as illustrated below:

[Figure: Cost Growth Factor vs. Schedule Growth Factor, with 1.0 marked on each axis]

¹ E.g., Cost Risk Schedule – CEAC, Dr. M. Anvari, First BMDO Cost Symposium, 4 October 2001

The Data
  • We analyzed data from the RAND Cost Growth Database with both of the following characteristics:
    • Programs with E&MD only
      • Because growth is different for those with and without PDRR
    • Programs with schedule data in the requisite fields
  • There were 59 points. The analysis follows.
Descriptive Statistics for Schedule Growth
  • We will look at these descriptive statistics in the following slides
    • Distribution shape
    • Scatter plots
    • Dollar weighting
Schedule Growth Distribution

PDF for Schedule Growth

The distribution is highly skewed

CDF for Schedule Growth

Note this region

These two graphs look much like CGF graphs, but the PDF is tighter here, and the CDF is steeper.

Basic Statistics of Schedule Change (analyzed data only)

  • Mean 1.29
  • Standard Deviation 0.54
  • CV 42%
  • 75th %-ile 1.46
  • 61st %-ile 1.29
  • 50th %-ile 1.11
  • 25th %-ile 1.00
  • Shrinkers 9/59 15.3%
  • Steady 12/59 20.3%
  • Stretchers 38/59 64.4%

Observations
  • There is some dispersion and a tendency to extremes
  • The distribution is highly skewed, as was seen in the histogram
  • But many programs have little-to-no growth
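
The derived figures on this slide follow from the reported ones. A minimal sketch that recomputes the CV and the regime shares, using only the summary values quoted above (the underlying 59 data points are not reproduced here, and the variable names are illustrative):

```python
# Summary values quoted on the slide; the raw 59 points are not available here.
mean_sgf = 1.29
std_sgf = 0.54
n_programs = 59
counts = {"Shrinkers": 9, "Steady": 12, "Stretchers": 38}

# Coefficient of variation = standard deviation / mean.
cv = std_sgf / mean_sgf
print(f"CV = {cv:.0%}")

# Share of programs in each schedule-growth regime.
shares = {name: count / n_programs for name, count in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```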

Basic Scatterplots – SGF & Sked vs. Dollar Size
  • We see the usual size effect, analogous to that in CGF graphs
    • Bigger programs have less schedule growth
The “1/x Pattern”

The 1/x pattern is virtually universal.

CGF and SGF vs. Cost Size
  • The pattern is similar, but CGF is generally more extreme:
    • Higher highs
    • Lower lows*

* See later plot
Basic Scatterplots – Dollar Size vs. Length

At Phase 2 start, there is a vague connection between length and size

At end, there is no connection

We would not say that longer programs are costlier

Basic Scatterplots – Length vs. $ Size

At Phase 2 start, there is a vague connection between size and length

At end, there is no connection

We would not say that costlier programs are longer

Basic Scatterplots – Cost Growth

There is no obvious connection between CGF and SGF

Basic Scatterplots - Length

There is a slight tendency for longer programs to grow less

Weighting by Length- and Dollar-Size

Dollar Weighting shows a more severe effect

Schedule growth is less than cost growth

Weighting by Length- and Dollar-Size both reinforce size effects

Sorted Graphs

This graph is a zoom-in

Sorted CGF shows more growth than sorted SGF

(To the left and right of the x-intercept, Pink y-values are more extreme)

Correlation and Other Joint Effects Between Schedule Growth and Cost Growth
  • We will look for correlation
    • Parametric
    • Non-parametric
    • Trends in sorted data
  • We will investigate the hypothesis for schedule growth vs. cost growth
    • We will normalize by dollar size to eliminate any inadvertent distortion
Correlation - Parametric

There is no linear parametric correlation

Correlation – Non-Parametric
  • Test
    • The Cox-Stuart test for trend gives a test statistic of 18, which is within the critical values of 8.41 and 18.59
      • The non-parametric test cannot reject “no correlation”
      • Used CGF sort because CGF had fewer ties, thus less ambiguity
    • The previous parametric test cannot reject “no correlation”
    • Moving averages of CGF do not show a rise
  • Conclusion: Cannot reject “no correlation”
  • Visual presentations follow
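
For readers unfamiliar with the test, here is a minimal sketch of a Cox-Stuart trend test. The function name and the sample series are hypothetical. The two-sided bounds use the normal approximation to the binomial, which is an assumption about how the critical values above were obtained: it yields 8.41 and 18.59 when 27 untied pairs remain, but the slide does not state the pair count.

```python
import math

def cox_stuart(values, z=1.96):
    """Cox-Stuart sign test for trend (hypothetical helper).

    Pairs each point in the first half with its counterpart in the second
    half, drops tied pairs, and counts the pairs that increased.
    Returns (count of increases, lower bound, upper bound), where the
    bounds come from the normal approximation to Binomial(m, 0.5).
    """
    n = len(values)
    half = n // 2
    first = values[:half]
    second = values[n - half:]          # skips the middle point when n is odd
    diffs = [b - a for a, b in zip(first, second) if b != a]
    m = len(diffs)                      # number of untied pairs
    increases = sum(1 for d in diffs if d > 0)
    center = m / 2
    spread = z * math.sqrt(m) / 2
    return increases, center - spread, center + spread

# Hypothetical series, standing in for one factor sorted on the other.
series = [1.0, 1.3, 0.9, 1.1, 1.6, 1.2, 1.0, 1.4, 1.5, 1.1]
stat, lower, upper = cox_stuart(series)
print(stat, round(lower, 2), round(upper, 2))
print("cannot reject 'no trend'" if lower < stat < upper else "trend detected")
```

In the slide's terms, a statistic of 18 falls inside the interval from 8.41 to 18.59, so “no correlation” cannot be rejected.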
Patterns in SGF and CGF

The gentle rise here conforms with the near-critical test statistic

There is no strong rising pattern in either CGF or SGF after sorting on the other

CGF by Regime

Larger CGFs, but Some small n’s

Largest CGF

Smallest CGF

Programs divided into SGF Regimes show a marked pattern, like the hypothesis suggested

CGF by Regime

Programs divided into SGF regimes look somewhat like the hypothesis suggested they would

Is There a Curve?

[Figure: CGF plotted against SGF]

  • There is no pattern on either side of the data

Is There a Curve?

[Figure: CGF plotted against SGF]

There is no reasonable grouping of the stretchers that will produce a curve.

Any grouping of points has the same average.

Size Normalization
  • We know there is a size effect in CGF
  • We think there is a size effect in SGF
  • We must investigate schedule effects free from size effects
    • First we will look at a scatter plot
    • Then we will normalize¹ all programs for dollar size, and compare to actuals
      • If there is a pattern in any regime, we will worry
      • If there is no regime pattern, we can conclude there is no dollar-size distortion
  • We chose to correct for dollar size because its effect is stronger, and because we were worried that a correlation between length and SGF would cause mischief if we tried to correct for length

¹ See backup for norming algorithm

Is there a Dollar-Size Bias?

“Steady” programs are probably attenuated vertically (growth bias)

“Shrink” programs may be attenuated horizontally (size bias)

“Growth” programs span the full range horizontally and vertically

Programs in the 3 regimes show no clear size bias, but a clear growth bias

Normed vs Actual CGFs by Regime

Averages for size-normed programs show the same patterns, so there is no size distortion

Note: Corrected 20 Apr 02. Minor differences

Normed vs Actual CGFs by Regime

Both sets of bars look like the hypothesis suggested they would

Hypothesis – The Answer
  • The Hypothesis was about right
    • The below is all we can say for sure
    • Some liberties have been taken with the graph

[Figure: Cost Growth Factor (CGF) vs. Schedule Growth Factor (SGF), with CGF levels marked at 1.43, 1.24, 1.12, and 1.0, and SGF marked at 1.0]

NB 1: Nominal has growth

NB 2: The curve is not validated, just the 3 regimes

Correction Factors and Their Use
  • We must correct for schedule growth, if we can predict it. The form of the correction is unclear:

We might use these factors to correct a risk model’s nominal growth factors

These factors describe what happens if schedules change.

We might use these factors to adjust an EAC if a schedule changed.

Conclusions
  • Schedule growth is less extreme than cost growth
    • But patterns are the same
  • There is a cost-size and length effect, just as for cost growth
    • Dollar-larger programs lengthen less
    • Longer programs lengthen less
  • Neither cost nor length predicts the other
  • There is a difference in cost growth by schedule-growth regime

    Regime                        CGF    Relative to No Change   Relative to Average
    Programs that shorten         1.42   1.25                    1.14
    Programs that stay the same   1.13   1.00                    0.91
    Programs that lengthen        1.24   1.09                    1.00
  • We now have tools to correct EACs and risk analyses

The hypothesis was essentially true

But there is no curve in evidence
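
The two “relative to” columns in the regime table appear to be each regime's CGF divided by a baseline: the CGF of programs that stay the same, or the average CGF (which the table implies is about 1.24, since lengthening programs map to 1.00). A minimal sketch under that assumption. Because it starts from the rounded CGFs shown above, its results can differ from the table in the last digit. The function name is hypothetical.

```python
# Rounded CGFs by schedule-growth regime, as quoted in the conclusions.
cgf_by_regime = {"shorten": 1.42, "stay the same": 1.13, "lengthen": 1.24}

def relative_factors(cgfs, baseline):
    """Divide every regime's CGF by a baseline CGF (assumed construction)."""
    return {regime: cgf / baseline for regime, cgf in cgfs.items()}

relative_to_no_change = relative_factors(cgf_by_regime, cgf_by_regime["stay the same"])
# Assumption: the average CGF is about 1.24, since "lengthen" maps to 1.00.
relative_to_average = relative_factors(cgf_by_regime, 1.24)

for regime in cgf_by_regime:
    print(f"{regime:14s} {relative_to_no_change[regime]:.2f} "
          f"{relative_to_average[regime]:.2f}")
```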

Schedule Growth Distributions
  • For schedule network models, a distribution is useful to model durations
  • We will provide a distribution for program-level network schedule growth
    • Useable for confidence intervals and predictions for single programs
    • Useable for systems of systems, to simulate component systems as single entities
  • This section will provide a detailed analysis for fitting the schedule growth data to a distribution
    • Lognormal and Extreme Value distributions show the most promise
    • Extreme Value is the most theoretically compelling
      • Extreme value distributions are used to model the largest of a set of random variables, and networks complete when the last event is finished
Best Fits vs. Empirical Data

Note the disproportionate number of 1.0’s

  • Extreme Value Distribution is what we expect theoretically
  • Extreme Value is more peaked and appears to represent the data better than Lognormal
  • But we will see that the number of 1.0’s in the database (schedules finishing “on time”) creates problems in the fit statistics
Why Are Values of 1 More Common? And Who Cares?
  • There is intense pressure to complete on time, and late finishes are easily discerned
  • The consequence of an early finish is to “ship” a flawed system
    • Flaws can be fixed after testing
  • There is a temptation to drag out work if you are done early
  • Perhaps the implication is that the customer should put less emphasis on finish time and more on test results?
  • In any event, it is altogether likely that there would be cosmetic 1.0 SGFs, and the data would seem to reflect that
  • We will find a way to deal with this in the analysis, and recommend a modeling approach
Extreme Value Distribution Fit
  • The CDF of the data is oddly shaped due to a large number of 1.0’s and fails a Kolmogorov-Smirnov test for the Extreme Value Distribution
  • We believe the disproportionate number of 1.0’s is politically motivated and not a natural occurrence
    • This causes a “gap” between the empirical and fitted distributions
  • We will next examine a hypothetical distribution with the 1.0’s redistributed along the “gap” area (using the Ext Val fit)

[Figure: Empirical Schedule Growth CDF vs. Fitted Extreme Value. Callout: note the “gap” caused by the 1.0’s]

K-S stat = 0.161
95% Critical Value (n=59) = 0.113¹

¹ Lilliefors methodology applied to Extreme Value distribution to generate critical value with Monte Carlo simulation
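
The Lilliefors idea is to simulate the K-S statistic with the distribution's parameters re-estimated on every simulated sample, then read the critical value off the simulated statistics. A minimal sketch of that idea, not the authors' procedure. It assumes the Extreme Value distribution is the Gumbel (largest extreme value) form and fits it by the method of moments, since the slides do not name the estimator. All function names are hypothetical, and the simulated value need not equal the 0.113 quoted above.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def gumbel_cdf(x, mode, scale):
    """CDF of the Gumbel (largest extreme value) distribution."""
    return math.exp(-math.exp(-(x - mode) / scale))

def fit_gumbel_moments(data):
    """Method-of-moments fit (an assumption; the slides name no estimator)."""
    scale = statistics.stdev(data) * math.sqrt(6) / math.pi
    mode = statistics.mean(data) - EULER_GAMMA * scale
    return mode, scale

def ks_statistic(data, mode, scale):
    """Largest gap between the empirical CDF and the fitted CDF."""
    ordered = sorted(data)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        fitted = gumbel_cdf(x, mode, scale)
        gap = max(gap, (i + 1) / n - fitted, fitted - i / n)
    return gap

def standard_gumbel(rng):
    """Inverse-CDF draw from a standard Gumbel (mode 0, scale 1)."""
    u = rng.random()
    while u == 0.0:                     # guard the logarithm
        u = rng.random()
    return -math.log(-math.log(u))

def lilliefors_critical_value(n, trials=2000, level=0.95, seed=0):
    """Simulate the K-S statistic with parameters re-fitted on every sample."""
    rng = random.Random(seed)
    simulated = []
    for _ in range(trials):
        sample = [standard_gumbel(rng) for _ in range(n)]
        simulated.append(ks_statistic(sample, *fit_gumbel_moments(sample)))
    simulated.sort()
    return simulated[int(level * trials) - 1]

critical = lilliefors_critical_value(59)
print(f"simulated 95% critical value for n=59: {critical:.3f}")
```

Simulating from a standard Gumbel is sufficient here because the fitted statistic does not depend on the true mode and scale when the estimator shifts and scales with the data.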

The Hypothetical “Natural” CDF

The 1.0’s redistributed along the “gap” area (in red) better represent what we believe to be the “natural” distribution

[Figure: Revised Empirical and Extreme Value Fit. Callouts: 12 points at 1.0; 12 points respread]

Extreme Value: m = 1.12, b = 0.28
K-S stat = 0.093
95% Critical Value (n=59) = 0.113

The revised empirical produces an Extreme Value fit with K-S stat below the critical value. This suggests Extreme Value is a good representation of the natural SGF distribution

What the Test Shows, and What It Doesn’t Show
  • The redistributed data pass a K-S test
  • But, the test cannot take the redistribution of data into account
    • This is analogous to loss of degrees of freedom, but the literature provides no remedy
  • We fully realize that this is not a “valid statistical test”
    • But it strongly suggests that the underlying distribution is the Extreme Value distribution
Hybrid Distribution Alternative
  • The hypothetical natural (re-distributed) distribution is reasonable for use
    • But, if you wish to capture the effects of too many programs appearing to finish “on schedule” then a hybrid distribution should be examined
  • To do this we must consider the probability of 1.0 vs. the rest of the outcomes as discrete cases
    • P(1.0) = 12/59 = 20.3%
    • P(Extreme Value) = 79.7%
  • The Extreme Value parameters would then be estimated from the data with the 1.0’s removed

[Figure: Hybrid Schedule Growth PDF with Histogram (original SGF data). Callouts: 20.3% (i.e., 12/59) probability of 1.0; 79.7% probability of Extreme Value Distribution (fitted w/o 1.0’s)]

Hybrid Distribution Alternative

Extreme Value fit to data without 1.0’s:

Extreme Value: m = 1.16, b = 0.32
K-S stat = 0.087
95% Critical Value (n=47) = 0.126¹

The K-S stat is less than the critical value. The Extreme Value is a good representation of this data.

[Figure: Results of simulation combining this distribution with a discrete 20.3% probability of a 1.0]

¹ Lilliefors methodology applied to Extreme Value distribution to generate critical value with Monte Carlo simulation
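
A minimal sketch of how one might draw schedule growth factors from the hybrid distribution described above. It assumes that m and b are the mode and scale of a Gumbel (largest extreme value) distribution, which the slides do not spell out. The function name, seed, and sample size are illustrative.

```python
import math
import random

P_ON_TIME = 12 / 59          # discrete probability of SGF = 1.0 (from the slide)
MODE, SCALE = 1.16, 0.32     # Extreme Value m and b fitted without the 1.0's

def sample_hybrid_sgf(rng):
    """Draw one schedule growth factor from the hybrid distribution.

    Assumes m and b are the mode and scale of a Gumbel (largest extreme
    value) distribution; the slides do not state the parameterization.
    """
    if rng.random() < P_ON_TIME:
        return 1.0
    u = rng.random()
    while u == 0.0:              # guard the logarithm below
        u = rng.random()
    return MODE - SCALE * math.log(-math.log(u))

rng = random.Random(2002)
draws = [sample_hybrid_sgf(rng) for _ in range(20000)]
share_on_time = sum(1 for d in draws if d == 1.0) / len(draws)
print(f"mean SGF = {sum(draws) / len(draws):.2f}, share at 1.0 = {share_on_time:.1%}")
```

Under this reading the hybrid has a theoretical mean of about 1.27, close to the sample mean of 1.29 reported earlier, which lends some support to the mode-and-scale interpretation.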

Distribution Conclusions
  • We have shown that the Extreme Value distribution is well supported as the natural distribution
  • We have shown that the pieces of the hybrid distribution fit the data
    • And, the hybrid reproduces the actuals well
  • We recommend using the hybrid
    • But if “political” or “cosmetic” effects are absent, we recommend using the hypothetical natural distribution
Independent Tasks
  • Tasks 1 and 2 begin at the same time and are independent
  • Both tasks must be complete before the system is ready
  • Duration is modeled as a uniform distribution ranging from Estimated ± 20%
    • Note that it is symmetric!
  • What is the Expected Duration?

[Diagram: Start → Task 1 (Duration 9) and Task 2 (Duration 10) in parallel → End]

Independent Tasks

[Diagram: Start → Task 1 (Duration 9) and Task 2 (Duration 10) in parallel → End]

Each task is uniformly distributed from –20% to +20% of the expected duration

The “shorter” Task 1 is the critical path 20% of the time!

The average system duration is 10.91 months … longer than the estimated duration of either component task
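
A minimal Monte Carlo sketch of this two-task problem, read literally: two independent durations, each uniform within ±20% of its estimate, with the system finishing when the later task ends. It illustrates the qualitative effect only and is not a reconstruction of the authors' simulation, so its numbers need not match the figures quoted on the slide. The function name, seed, and iteration count are assumptions.

```python
import random

def simulate_two_parallel_tasks(est_1=9.0, est_2=10.0, spread=0.20,
                                iterations=5000, seed=1):
    """Monte Carlo for two independent tasks that must both finish.

    Each duration is uniform on estimate * (1 ± spread). Returns the mean
    system duration and the share of runs in which task 1 finished last.
    """
    rng = random.Random(seed)
    total = 0.0
    task_1_critical = 0
    for _ in range(iterations):
        d1 = rng.uniform(est_1 * (1 - spread), est_1 * (1 + spread))
        d2 = rng.uniform(est_2 * (1 - spread), est_2 * (1 + spread))
        total += max(d1, d2)          # the system is ready when the later task ends
        task_1_critical += d1 > d2
    return total / iterations, task_1_critical / iterations

mean_duration, share_task_1 = simulate_two_parallel_tasks()
print(f"mean system duration = {mean_duration:.2f}")
print(f"task 1 on the critical path in {share_task_1:.0%} of runs")
```

The mean exceeds the longer estimate because the maximum of the two durations can never fall below the longer task's draw, while the shorter task occasionally finishes last and pushes the result up.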

Comparisons with Constant Critical Path

These all have Critical Path = 10

[Diagrams: five networks running from Start (S) to End (E), built from tasks with durations of 1, 4, 5, 9, or 10]

Comparisons with Constant CP

These all have CP = 10 … but their probabilistic durations are all different

[Diagrams: five networks running from Start (S) to End (E), built from tasks with durations of 1, 4, 5, 9, or 10, annotated “Serial is good”, “Parallel is bad”, “Serial is good”, and “Cross links are bad”]

Durations were modeled as uniform distributions spanning ±20% of the estimate. 5000 iterations were run.

Network Schedule Growth As a Function of Network Complexity … Parallel-Task Toy Problem
  • This is another toy problem, to see what happens to a network as identical parallel tasks are added

Increasing the number of tasks increases the schedule stretch
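
A minimal sketch of the parallel-task effect, assuming identical independent tasks with a 10-unit estimate and durations uniform within ±20% (the task size, seed, and function name are assumptions, not taken from the slide). The simulated mean is printed next to the closed-form mean of the maximum of n uniform draws.

```python
import random

def mean_duration_of_parallel_tasks(n_tasks, estimate=10.0, spread=0.20,
                                    iterations=5000, seed=7):
    """Mean finish time of n identical, independent parallel tasks."""
    rng = random.Random(seed)
    low, high = estimate * (1 - spread), estimate * (1 + spread)
    total = 0.0
    for _ in range(iterations):
        total += max(rng.uniform(low, high) for _ in range(n_tasks))
    return total / iterations

for n in (1, 2, 4, 8):
    simulated = mean_duration_of_parallel_tasks(n)
    # Closed form for the mean of the maximum of n uniforms on [8, 12]:
    # low + (high - low) * n / (n + 1).
    exact = 8.0 + 4.0 * n / (n + 1)
    print(f"{n} task(s): simulated {simulated:.2f}, exact {exact:.2f}")
```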

Network Schedule Growth As a Function of Task Variance … Changing-CV Toy Problem
  • This is a real network, with changing variance, to see what happens as variance grows

Increasing the variance of tasks increases the schedule stretch
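
The network behind this slide is not reproduced in the transcript, so the sketch below uses a stand-in: two identical, independent parallel tasks whose durations are uniform within a chosen spread of the estimate. The spread acts as a proxy for the CV (for a uniform distribution, CV = spread / √3). The function name and the 10-unit estimate are assumptions.

```python
def expected_max_of_two_uniform_tasks(estimate, spread):
    """Exact mean of max(D1, D2) for two independent tasks, each uniform
    on estimate * (1 ± spread). For two uniforms on [a, b], the mean of
    the maximum is a + 2 * (b - a) / 3."""
    low = estimate * (1 - spread)
    high = estimate * (1 + spread)
    return low + 2.0 * (high - low) / 3.0

for spread in (0.0, 0.1, 0.2, 0.4):
    stretch = expected_max_of_two_uniform_tasks(10.0, spread) / 10.0
    print(f"spread ±{spread:.0%}: expected duration is {stretch:.3f} x the estimate")
```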

“Toy Problem” Conclusions
  • The duration of a network will be longer than any of the component legs
  • Parallel tasks lengthen the average duration
    • Independent tasks that must finish at the same time should make you worry about schedule
    • The more parallel tasks, the more you stretch
  • Serial tasks decrease the average duration
    • Serial tasks should make you feel a bit better about schedule
    • However, breaking a single task into smaller pieces will not improve your schedule
  • Interdependencies (cross links) increase the average duration
    • Tasks that depend on two or more other tasks should make you worry about schedule
  • Greater variability of the tasks will make the schedule duration grow
Prediction Equation - RAND RDT&E

SSE = 72.56

Note that data is sparse on the right (large programs)

RDT&E Predicted CGF = 1.8 * (MSII Baseline FY96$M)^(-0.3) + 1.1

Prediction Equation - RAND RDT&E

RDT&E Predicted CGF = 1.8 * (MSII Baseline FY96$M)^(-0.3) + 1.1
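
A minimal sketch that evaluates the prediction equation, assuming the “-0.3” on the slide is an exponent applied to the baseline (a flattened superscript). The function name and the sample program sizes are hypothetical.

```python
def predicted_rdte_cgf(baseline_fy96_musd):
    """Predicted RDT&E cost growth factor from the MSII baseline in FY96 $M.

    Assumes the slide's "-0.3" is an exponent on the baseline.
    """
    if baseline_fy96_musd <= 0:
        raise ValueError("baseline must be positive")
    return 1.8 * baseline_fy96_musd ** -0.3 + 1.1

for baseline in (10, 100, 1000, 10000):   # hypothetical program sizes
    print(f"{baseline:>6} $M -> predicted CGF {predicted_rdte_cgf(baseline):.2f}")
```

Under this reading the predicted CGF falls toward 1.1 as the baseline grows, in line with the size effect described earlier.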

Dispersion – Bounds

This graph shows the actual data, the CGF prediction line, and the Bounds. The next slide will zoom-in.

Dispersion – Bounds

Note that the Upper and Lower bounds are not symmetric. Also, dispersion is higher for smaller projects … an effect that is captured by the bounds.

Basic Statistics of Schedule Change: All Available Schedule Data Compared to Analyzed Data

    Statistic            Analyzed   All
    Mean                 1.29       1.25
    Standard Deviation   0.54       0.51
    CV                   42%        41%
    n                    59         98
    75th %-ile           1.46       1.365
    %-ile of the mean    61%        63%
    50th %-ile           1.11       1.03
    25th %-ile           1.00       1.00
    Shrinkers            15.3%      20.4%
    Steady               20.3%      22.4%
    Stretchers           64.4%      57.1%

Observations
  • The two data sets are quite similar, but use the smaller one as your basis
  • The larger data set is somewhat less skewed
  • The larger data set has slightly less dispersion