By **denis**

Cost Risk Analysis: how to adjust your estimate for historical cost growth.


Unit Index

Unit I – Cost Estimating

Unit II – Cost Analysis Techniques

Unit III – Analytical Methods

- Basic Data Analysis Principles
- Learning Curves
- Regression Analysis
- Cost Risk Analysis
- Probability and Statistics
Unit IV – Specialized Costing

Unit V – Management Applications

Unit III - Module 9

Outline

- Introduction to Risk
- Model Architecture
- Historical Data Analysis
- Model Example
- Summary
- Resources

Overview

- Risk is a significant part of cost estimation and is used to allow for cost growth due to anticipatable and un-anticipatable causes
- There are several approaches to risk estimation
- Incorrect treatment of risk, while better than ignoring it, creates a false sense of security
- Risk is perhaps best understood through a detailed examination of an example method

Definitions

- Cost Growth:
- Increase in cost of a system from inception to completion

- Cost Risk:
- Predicted Cost Growth.

In other words:

Cost Growth = actuals

Cost Risk = projections

Types of Risk

- Cost Growth = Cost Estimating Growth + Sked/Tech Growth + Requirements Growth + Threat Growth
- Cost Risk = Cost Estimating Risk + Sked/Technical Risk + Requirements Risk + Threat Risk
- Cost Estimating Risk: Risk due to cost estimating errors, and the statistical uncertainty in the estimate
- Schedule/Technical Risk: Risk due to inability to conquer problems posed by the intended design in the current CARD or System Specifications
- Requirements Risk: Risk resulting from an as-yet-unseen design shift from the current CARD or System Specifications, arising from shortfalls in the documents
  - Due to the inability of the intended design to perform the (unchanged) intended mission
  - We didn’t understand the solution
- Threat Risk: Risk due to an unrevealed threat, e.g., a shift from the current STAR or threat assessment
  - The problem changed

Often implicit or omitted

Basic Flow of the Risk Process

- Inputs: from the cost analyst and technical experts
  - The CARD
  - Expert rating/scoring
  - Point Estimate
- Structure & Execution: includes the organization, the mathematical assumptions, and how the model runs
- Outputs: to the decision maker and the cost analyst
  - Means
  - Standard Deviations
  - Risk by CWBS

Inputs and outputs, although outside the purview of the risk analyst, are determined by the structure and execution of the risk model

Engineers’ and Cost Analysts’ View of Risk

Engineers

- Work in physical materials, with
  - Physics-based responses
  - Physical connections
- Typically examine or discuss a specific outcome
  - System Parameters
  - Designs
- Typically seek to know:
  - Given this solution, what will go wrong?
  - Are my design margins enough?

Cost Analysts

- Work in dollars and parameters, with
  - Statistical relationships
  - Correlation
- Typically examine or discuss a general outcome set
  - Probability distribution
  - Statistical parameters such as mean and standard deviation
- Typically seek to know:
  - Given this relationship, what is the range of possibilities?
  - Are my cost margins enough?

General Model Architecture

- Inputs
  - Dollar Basis: Historical, Domain Experts, Conceptual
  - Scoring: Interval w/ objective criteria, Interval, Ordinal, None
- Structure
  - Organization
    - Coverage & Partition: Cost Estimating, Schedule / Technical, Requirements, Threat
    - Assigning Cost to Risk: CERs, Direct Assessment of Distribution Parameters, Factors, Rates
    - Below-the-Line: Yes, No
  - Probability Model
    - Distribution: Normal, Log Normal, Triangular, Beta, Bernoulli
    - Correlation: Functional, Relational, Injected, None
- Execution
  - Computation: Monte Carlo, Method of Moments, Deterministic
  - Cross Checks: Means, CVs, Inputs

Tip: Higher is better except in Cross Checks

Inputs – Scoring

- Interval with objective criteria
  - Set scoring based on objective criteria, and for which the distance (interval) between scores has meaning. (Note: the example below is also Ratio, because it passes through the origin.)
    - A schedule slip of 1 week gets a score of 1, a slip of 2 weeks gets a score of 2, a slip of 4 weeks gets a score of 4, a slip of 5 weeks gets a score of 5, etc.
    - The difference between a score of 1 and 2 is as big as the difference between a score of 4 and 5
  - A scale is interval if it acts interval under examination\*

\* “Nominal, ordinal, interval, and ratio typologies are misleading,” P. F. Velleman and L. Wilkinson, The American Statistician, 1993, 47(1), 65-72

Inputs – Scoring

- Interval
  - Set scoring for which the distance (interval) between scores has meaning
    - Low risk is assigned a 1, medium risk is assigned a 5, and high risk is assigned a 10
    - Note that it is not immediately clear that the scale is interval, but it is surely not subject to objective criteria.
- Ordinal
  - Score is relative to the measurement
    - e.g., difficulty in achieving schedule is high, medium, or low
- None

Inputs – Dollar Basis

- Historical
  - Actual costs of similar programs or components of programs are used to predict costs
- Domain Experts
  - Persons with expertise regarding similar programs or program components assess the cost based on their experience
- Conceptual
  - An arbitrary impact is assigned
    - Any scale without a historical basis or expert assessment is conceptual

Org – Coverage & Partition

- How the four types of risk are covered and partitioned
- Cost Estimating
- Schedule/Technical
- Requirements
- Threat

These risk types may be covered implicitly or explicitly in any combination.

Org – Assigning Cost to Risk

- Risk CERs: Equations are developed that reflect the relationship between an interval risk score and the cost impact of the risk (this might also be termed a Risk Estimating Relationship (RER))
- These equations amount to the same thing as CERs used in the cost estimate
- e.g., Risk Amount = 0.12 * Risk Score

- Direct Assessment of Distribution Parameters: Costs are captured in shifts of parameters of the risk, e.g., shifted end points for triangulars, shifted end points or means for betas, etc.
- Note: Scoring is completely eliminated from this mapping method
- e.g., triangles assessed by domain experts

Org – Assigning Cost to Risk

- Factors: Fractions or percentages are used in conjunction with the scores and the cost of the component or program
  - e.g., a score of 2 increases the cost of the component by 8%
    - Antenna Risk Score = 2
    - Cost of Antenna = $4090K
    - Risk Amount = 0.08 * $4090K = $327.2K
- Rates: Predetermined costs are associated with the scores
  - e.g., a score of 2 has a cost of $100K
    - Antenna Risk Score = 2
    - Cost of Antenna = $4090K
    - Risk Amount = $100K

Warning: Rates are independent of the element’s cost.
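
The three score-based mappings described on the "Org – Assigning Cost to Risk" slides (risk CERs, factors, and rates) can be compared in a minimal Python sketch. The function names and lookup tables are hypothetical; only the 0.12 slope, the 8% factor, the $100K rate, and the $4090K antenna cost come from the slides.

```python
# Illustrative sketch of three ways to assign cost to a risk score.
# Function names and lookup tables are hypothetical, not from a real tool.

def risk_from_cer(score, slope=0.12):
    """Risk CER (RER): risk amount as a linear function of an interval score."""
    return slope * score


def risk_from_factor(score, element_cost, factors):
    """Factor: a fraction of the element's cost, selected by the score."""
    return factors[score] * element_cost


def risk_from_rate(score, rates):
    """Rate: a predetermined amount per score, independent of element cost."""
    return rates[score]


FACTORS = {2: 0.08}    # a score of 2 adds 8% of the element's cost
RATES_K = {2: 100.0}   # a score of 2 adds a flat $100K

antenna_cost_k = 4090.0
print("CER risk amount:", risk_from_cer(2))
print("Factor risk ($K):", risk_from_factor(2, antenna_cost_k, FACTORS))
print("Rate risk ($K):", risk_from_rate(2, RATES_K))
```

The rate result does not change if `antenna_cost_k` changes, which is the point of the warning above.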

Org – Below-the-Line

- Below-the-Line Elements
- Elements that are driven by hardware, software, and the like
- Below-the-Line Elements include:
- Systems Engineering/Program Management (SE/PM)
- System Test and Evaluation (ST&E)

- Not all models account for this cost growth
- Functional Correlation is another approach to address the risk in these elements

Probability Model – Distribution

- Normal
  - Best behavior, most iconic
  - Theoretically (although not practically) allows negative costs, which spook some users
  - Symmetric, needs mean shift to reflect propensity for positive growth
- Lognormal
  - A natural result in non-linear CERs
  - Indistinguishable from Normal at CVs below 25%
  - Skewed

Probability Model – Distribution

- Triangular
- Most common
- Easy to use, easy to understand
- Modes, medians do not add
- Skewed

- Beta
- Rare now, but formerly popular
- Solves negative cost and duration issues
- Many parameters – simplifications like PERT Beta are possible
- Skewed

- Bernoulli
- Probability is only assigned to two possible outcomes, success and failure (p and 1-p)
- Simplest of all discrete distributions
- Mean = p
- Variance = p*(1-p)
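
The Triangular bullet "Modes, medians do not add" can be checked with a small simulation. The sketch below is an illustration under assumed parameters (a right-skewed triangle on 0 to 10 with mode 1, chosen only for the demonstration): it sums two independent draws and compares the mean and median of the total with the sums of the element means and medians.

```python
import random
import statistics


def triangular_median(low, high, mode):
    """Closed-form median of a triangular distribution."""
    midpoint = (low + high) / 2
    if mode >= midpoint:
        return low + ((high - low) * (mode - low) / 2) ** 0.5
    return high - ((high - low) * (high - mode) / 2) ** 0.5


def simulate_sum(low, high, mode, n_trials=100_000, seed=1):
    """Simulate the total of two independent, identical triangular elements."""
    rng = random.Random(seed)
    return [
        rng.triangular(low, high, mode) + rng.triangular(low, high, mode)
        for _ in range(n_trials)
    ]


LOW, HIGH, MODE = 0.0, 10.0, 1.0   # hypothetical, right-skewed element
totals = simulate_sum(LOW, HIGH, MODE)

sum_of_means = 2 * (LOW + HIGH + MODE) / 3
sum_of_medians = 2 * triangular_median(LOW, HIGH, MODE)

print("sum of element means:  ", round(sum_of_means, 3))
print("mean of total:         ", round(statistics.fmean(totals), 3))
print("sum of element medians:", round(sum_of_medians, 3))
print("median of total:       ", round(statistics.median(totals), 3))
```

Means roll up by simple addition, while the median of the total sits noticeably above the sum of the element medians for this skewed case. This is one reason a roll-up of "most likely" values tends to understate the total.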

Probability Model – Correlation

- Functional: Arises between source and derivative variables as a result of functional dependency. The lines of the Monte Carlo are cell-referenced wherever relationships are known.
- CERs are entered as equations
- Cell references are left in the spreadsheet
- When the Monte Carlo runs, input variables fluctuate, and outputs of CERs reflect this

Correlation is a measure of the relation between two or more variables/WBS elements

An Overview of Correlation and Functional Dependencies in Cost Risk and Uncertainty Analysis, R. L. Coleman and S. S. Gupta, DoDCAS, 1994

Functional Correlation

- Not Correlated (old): no functional correlation; the simulation is run with WBS items entered as values
- Correlated (new): the simulation is run with functional dependencies entered as they are in the cost model

Note the shift of the mean and the increased variability.
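
A minimal sketch of that comparison, assuming one hardware element and one below-the-line element (SE/PM) estimated as a factor of hardware. All numbers are hypothetical. In the "old" run SE/PM is frozen at the value implied by the hardware point estimate; in the "new" run it is left as a function of each hardware draw.

```python
import random
import statistics

HW_POINT_K = 100.0     # hypothetical hardware point estimate ($K)
SEPM_FACTOR = 0.30     # hypothetical SE/PM factor applied to hardware cost


def run_simulation(functional, n_trials=50_000, seed=7):
    """Return (mean, standard deviation) of total cost across trials."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # Hardware cost draw, skewed toward growth above the point estimate.
        hardware = rng.triangular(90.0, 150.0, HW_POINT_K)
        if functional:
            sepm = SEPM_FACTOR * hardware      # dependency kept, as in the cost model
        else:
            sepm = SEPM_FACTOR * HW_POINT_K    # entered as a value
        totals.append(hardware + sepm)
    return statistics.fmean(totals), statistics.stdev(totals)


old_mean, old_sd = run_simulation(functional=False)
new_mean, new_sd = run_simulation(functional=True)
print("not correlated: mean", round(old_mean, 1), "sd", round(old_sd, 1))
print("correlated:     mean", round(new_mean, 1), "sd", round(new_sd, 1))
```

With the dependency kept, hardware growth flows into SE/PM, so both the mean and the spread of the total increase, which matches the note on the slide.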

Probability Model – Correlation

- Relational: Introduces the geometry of correlation, provides a substantial improvement over injected correlations, and fills a gap in Functional Correlation (FC)
  - Relational Correlation provides insight into
    - the tilt of the data, i.e., the regression line,
    - and the variance around the regression line

Relational Correlation: What to do when Functional Correlation is Impossible, R. L. Coleman, J. R. Summerville, M. E. Dameron, C. L. Pullen, S. S. Gupta, ISPA/SCEA Joint International Conference, 2001

Probability Model - Correlation

- Injected: Imposed by setting the correlation directly between variables without having a functional relationship.
- None: No relationship exists among the variables. The lines of the Monte Carlo are self contained.

Shortcomings of Injected Correlation

- Correlations are very hard to estimate
- No check of the functional implications of the correlations is done
- This is troublesome because of the regression line that arises when we insert a correlation.
- Simply injecting arbitrary correlations of 0.2 - 0.3 to achieve dispersion is unsatisfactory as well.
- Unless the injected correlations are among elements that are actually correlated

- If correlations are actually known, no harm is done.

Execution – Computation

- Monte Carlo: A widely accepted method, used on a broad range of risk assessments for many years. It produces cost distributions. The cost distributions give decision makers insight into the range of possible costs and their associated probabilities.
- Method of Moments: The mean and standard deviation of lower-level WBS lines are known, and are rolled up assuming independence to provide higher-level distributions.
- Only provides an analysis of distribution at a top level
- Easy to calculate
- Its ease-of-calculation advantage has been negated by the rapid advances in microcomputer technology
- Only works for independent elements, unless covariances are allowed for, which is difficult.

- Deterministic: Only point values are used. No shifts or other probabilistic effects are taken into account.
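
The Method of Moments roll-up can be sketched in a few lines. Under independence, means add and variances add. The optional `correlation` argument is an assumption added here to show what "allowing for covariances" involves, and the WBS values are hypothetical.

```python
import math


def method_of_moments(elements, correlation=None):
    """Roll up (mean, standard deviation) pairs to a total mean and sd.

    `correlation` optionally maps index pairs (i, j) with i < j to a
    correlation coefficient; pairs not listed are treated as independent.
    """
    total_mean = sum(mean for mean, _ in elements)
    variance = sum(sd * sd for _, sd in elements)
    if correlation:
        for (i, j), rho in correlation.items():
            # Each correlated pair contributes 2 * covariance to the total variance.
            variance += 2 * rho * elements[i][1] * elements[j][1]
    return total_mean, math.sqrt(variance)


wbs = [(100.0, 10.0), (50.0, 5.0), (25.0, 5.0)]   # hypothetical (mean, sd) in $K

print(method_of_moments(wbs))
print(method_of_moments(wbs, correlation={(0, 1): 0.5}))
```

The result is only a mean and a standard deviation for the total; a distribution shape still has to be assumed at the top level, which is the limitation the first sub-bullet points to.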

Risk Assessment Techniques

- Add a Risk Factor/Percentage (Minutes)
- Low accuracy, no intervals

- Bottom Line Monte Carlo/Bottom Line Range/Method of Moments (Hours)
- Moderate accuracy, provides intervals

- Historically based Detailed Monte Carlo (Months of non-recurring work, but recurring in days)
- Time consuming non-recurring work, but with recurring implementation being easier, accurate if done right. Provides intervals.

- Expert Opinion-Based Probability and Consequence (Pf*Cf) or Expert Opinion-Based Detailed Monte Carlo (Months)
- Time consuming with no gains in recurring effort, but accurate if done right. Provides intervals.

- Detailed Network and Risk Assessment (Months)
- Time consuming with no gains in recurring effort, but accurate if done right. Provides intervals.

Execution – Cross Checks

- Means: The mean cost growth factor for WBS items can be compared to history as a way to cross check results
- CVs: The CV of the cost growth factors for WBS items can be compared to history as a way to cross check results
- Inputs: Checks are performed on inputs or other parameters to see if historical values are in line with program assumptions
  - Example: Historical risk scores can be compared to program risk scores to see if risk assessors are being realistic, and to see if the underlying database is representative of the program.
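
A means-and-CVs cross check amounts to computing the same two statistics for the model output and for the historical database and comparing them. The sketch below uses made-up cost growth factors and an arbitrary tolerance; what counts as "close enough" is an analyst's judgment that the slides do not specify.

```python
import statistics


def mean_and_cv(cost_growth_factors):
    """Return the mean and coefficient of variation (sd / mean) of a sample."""
    mean = statistics.fmean(cost_growth_factors)
    cv = statistics.stdev(cost_growth_factors) / mean
    return mean, cv


def within_tolerance(model_value, historical_value, tolerance):
    """True when the model value is within an absolute tolerance of history."""
    return abs(model_value - historical_value) <= tolerance


historical_cgfs = [1.0, 1.2, 1.4]        # hypothetical historical CGFs
model_cgfs = [1.05, 1.25, 1.30, 1.40]    # hypothetical model output by WBS item

hist_mean, hist_cv = mean_and_cv(historical_cgfs)
model_mean, model_cv = mean_and_cv(model_cgfs)

print("means agree:", within_tolerance(model_mean, hist_mean, tolerance=0.10))
print("CVs agree:  ", within_tolerance(model_cv, hist_cv, tolerance=0.10))
```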

Intro to SARs – Sample

A Selected Acquisition Report (SAR) is submitted for each year of a program’s acquisition cycle. The most recent SAR is used to determine cost growth.

Sample Program: XXX, December 31, 19XX

To calculate the CGF, adjust the current estimate for quantity changes, then divide by the baseline estimate
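
As a worked sketch of that calculation: the slide does not say how the quantity adjustment is made, so the function below assumes it is a dollar amount of cost variance attributed to quantity changes that is subtracted from the current estimate. The parameter names and figures are hypothetical.

```python
def cost_growth_factor(current_estimate, quantity_change_cost, baseline_estimate):
    """Cost growth factor (CGF) with quantity changes removed.

    Assumes the quantity adjustment is the cost variance attributed to
    quantity changes, expressed in the same dollars as the estimates.
    """
    if baseline_estimate <= 0:
        raise ValueError("baseline estimate must be positive")
    adjusted_current = current_estimate - quantity_change_cost
    return adjusted_current / baseline_estimate


# Hypothetical program: baseline $100M, current $130M, of which $10M
# is due to buying more units than the baseline assumed.
print(cost_growth_factor(130.0, 10.0, 100.0))
```

A CGF above 1 indicates growth over the baseline, and a CGF below 1 indicates a decrease.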

Contract Data

- Hard to use – problems with changing baselines, lack of reasons for variances, and access to data
- Preliminary comparative analysis suggests Contract Data mimics patterns in SAR data
- Shape of distribution
- Trends in tolerance for cost growth

- K-S tests find no statistically significant difference between Contract data and SAR data for programs <$1B in RDT&E
- Failed to reject the null hypothesis of identical distributions

- Descriptive statistics indicate amount of Contract Data growth and dispersion is more extreme than previously found in SAR studies
- SAR data remains the best choice for analysis and predictive modeling

NAVAIR Cost Growth Study: A Cohorted Study of The Effects of Era, Size, Acquisition Phase, Phase Correlation and Cost Drivers, R. L. Coleman, J. R. Summerville, M. E. Dameron, C. L. Pullen, D. M. Snead, DoDCAS, 2001 and ISPA/SCEA International Conference, 2001

Contract Data Exploratory Analysis

[Figure: CGF vs. IPE for Contract Data and SAR Data (RDT&E), with a zoomed-in view on a common scale]

Contract Data blends well: it continues the trend that tolerance for growth increases as program size decreases

Common Problems

- Most historically-based methods rely on SARs
- Adjusting for quantity – important to remove quantity changes from cost growth
- Beginning points – the richest data source is found by beginning with EMD
- Cohorting must be introduced to avoid distortions

- EVM data is also potentially useable, but re-baselined programs are a severe complication.
- “Applicability” and “currency” are the most common criticisms

Applicability and Currency

- Applicability: “Why did you include that in your database?”
- Virtually all studies of risk have failed to find a difference among platforms (some exceptions)
- If there is no discoverable platform effect, more data is better

- Currency: “But your data is so old!”
- Previous studies have found that post-1986 data is preferable
- Data accumulation is expensive

Example Model Architecture

- Inputs
  - Dollar Basis: Historical, Domain Experts, Conceptual
  - Scoring: Interval w/ objective criteria, Interval, Ordinal, None
- Structure
  - Organization
    - Coverage & Partition: Cost Estimating, Schedule / Technical, Requirements, Threat
    - Assigning Cost to Risk: CERs, Direct Assessment of Distribution Parameters, Factors, Rates
    - Below-the-Line: Yes, No
  - Probability Model
    - Distribution: Normal, Log Normal, Triangular, Beta, Bernoulli
    - Correlation: Functional, Relational, Injected, None
- Execution
  - Computation: Monte Carlo, Method of Moments, Deterministic
  - Cross Checks: Means, CVs, Inputs

Tip: Higher is better except in cross checks

Assessment Approach

- Develop a cost estimating risk distribution for each CWBS element
- Develop a schedule/technical risk distribution for each WBS entry for:
- Hardware
- Software
- Note that “Below-the-line” WBS elements get risk from Above-the-line WBS elements via Functional Correlation

- Combine these risk distributions and the point estimate using a Monte Carlo simulation
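
The assessment approach above can be sketched as a small Monte Carlo. This is an illustration under stated assumptions, not the model from the slides: cost estimating (CE) risk is drawn from a normal distribution with a bias and standard deviation, sked/tech (S/T) risk from a triangular growth factor, the two growth terms are added to the point estimate, and a below-the-line element (SE/PM) is a fixed factor of the above-the-line total. The S/T triangles reuse two MIN/AVG/MAX triples from the "Sked/Tech Score Mapping" slide; every other number is hypothetical.

```python
import random
import statistics

# Hypothetical WBS inputs: name, point estimate ($K), CE bias, CE standard
# deviation, and S/T growth-factor triangle (min, mode, max).
ELEMENTS = [
    ("Hardware", 100.0, 0.05, 0.10, (0.77, 1.17, 1.58)),
    ("Software", 60.0, 0.05, 0.15, (0.61, 1.28, 1.96)),
]
SEPM_FACTOR = 0.25  # hypothetical below-the-line factor on above-the-line cost


def simulate_total_cost(n_trials=20_000, seed=42):
    """Return simulated total costs including CE risk, S/T risk, and SE/PM."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        above_the_line = 0.0
        for _name, point, bias, sd, (low, mode, high) in ELEMENTS:
            ce_growth = rng.gauss(bias, sd)                    # cost estimating risk
            st_growth = rng.triangular(low, high, mode) - 1.0  # sked/tech risk
            above_the_line += point * (1.0 + ce_growth + st_growth)
        # Functional correlation: SE/PM follows each trial's above-the-line cost.
        totals.append(above_the_line * (1.0 + SEPM_FACTOR))
    return totals


totals = simulate_total_cost()
point_total = sum(element[1] for element in ELEMENTS) * (1.0 + SEPM_FACTOR)
mean_total = statistics.fmean(totals)
p80 = sorted(totals)[int(0.80 * len(totals))]

print("point estimate ($K):      ", round(point_total, 1))
print("mean with risk ($K):      ", round(mean_total, 1))
print("80th percentile ($K):     ", round(p80, 1))
```

Adding the two growth terms is a design choice made for simplicity; a model could equally combine them multiplicatively.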

Example Model in Blocks

[Figure: block diagram of the example model. Blocks: IPE, CARD, Cost Estimating Risk (Standard Errors & SEEs), Sked/Tech Risk (Risk Scoring, Mapping), Functional Correlation, Monte Carlo, Risk Report]

Cost Risk Analysis of the Ballistic Missile Defense (BMD) System, An Overview of New Initiatives Included in the BMDO Risk Methodology, R. L. Coleman, J. R. Summerville, D. M. Snead, S. S. Gupta, G. E. Hartigan, N. L. St. Louis, DoDCAS, 1998 (Outstanding Contributed Paper), and ISPA/SCEA International Conference, 1998

Cost Estimating Risk Assessment

- Consists of a standard deviation and a bias associated with the costing methodologies
- Standard deviation comes from the CERs and factors
- Bias is a correction for underestimating

Sked/Tech Risk Assessment

- Technical risk is decomposed into categories and each category into subcategories
  - Hardware subcategories:
    - Technology Advancement, Engineering Development, Reliability, Producibility, Alternative Item, and Schedule
  - Software subcategories:
    - Technology Approach, Design Engineering, Coding, Integrated Software, Testing, Alternatives, and Schedule

Hardware Risk Scoring Matrix

Calculating Sked/Tech Risk Endpoints

- Technical experts score each of the categories from 0 (no risk) to 10 (high risk)
- Each category is weighted depending on the relevancy of the category
- Weights are allowed, but rarely used

- Weighted average risk scores are mapped to a cost growth distribution
- This distribution is based on a database of cost growth factors of major weapon systems collected from SARs. These programs range from those which experienced tremendous cost growth due to technical problems to those which were well managed and under budget.
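
The scoring step (category scores from 0 to 10, optional weights, weighted average) can be sketched as follows. The category names are the hardware subcategories listed earlier; the scores and weights are made up for illustration.

```python
def weighted_risk_score(scores, weights=None):
    """Weighted average of 0-10 category scores; equal weights by default."""
    for category, score in scores.items():
        if not 0 <= score <= 10:
            raise ValueError(f"score for {category} must be between 0 and 10")
    if weights is None:
        weights = {category: 1.0 for category in scores}
    total_weight = sum(weights[category] for category in scores)
    weighted_sum = sum(scores[category] * weights[category] for category in scores)
    return weighted_sum / total_weight


# Hypothetical expert scores for one hardware WBS element.
hardware_scores = {
    "Technology Advancement": 6,
    "Engineering Development": 4,
    "Reliability": 2,
    "Producibility": 3,
    "Alternative Item": 5,
    "Schedule": 4,
}

print(weighted_risk_score(hardware_scores))

# Same scores with Technology Advancement counted twice as heavily.
weights = {category: 1.0 for category in hardware_scores}
weights["Technology Advancement"] = 2.0
print(weighted_risk_score(hardware_scores, weights))
```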

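Sked/Tech Score Mapping

Cost growth factors for three risk scores, ordered from lower to higher risk:

| Curve | MIN (minimum possible cost growth) | AVG (average cost growth) | MAX |
|---|---|---|---|
| 1 | 0.77 | 1.17 | 1.58 |
| 2 | 0.61 | 1.28 | 1.96 |
| 3 | 0.46 | 1.40 | 2.34 |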
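
The MIN, AVG, and MAX values on this slide are enough to compute the moments of each mapped distribution, if one assumes the distributions are triangular with the AVG value as the mode. That assumption is not stated on the slide; it is made here only to illustrate the arithmetic with the standard triangular formulas.

```python
import math


def triangular_moments(minimum, mode, maximum):
    """Mean and standard deviation of a triangular distribution."""
    a, c, b = minimum, mode, maximum
    mean = (a + b + c) / 3
    variance = (a * a + b * b + c * c - a * b - a * c - b * c) / 18
    return mean, math.sqrt(variance)


# (MIN, AVG, MAX) triples from the slide, with AVG used as the mode (assumption).
MAPPINGS = [
    (0.77, 1.17, 1.58),
    (0.61, 1.28, 1.96),
    (0.46, 1.40, 2.34),
]

for minimum, mode, maximum in MAPPINGS:
    mean, sd = triangular_moments(minimum, mode, maximum)
    print(f"min={minimum} mode={mode} max={maximum} "
          f"mean={mean:.3f} sd={sd:.3f} cv={sd / mean:.3f}")
```

Because each triple is close to symmetric, the computed mean lands close to the AVG value, and the standard deviation grows with the width of the base.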

Sked/Tech Risk Distribution

[Figure: bar chart of the frequency of occurrence of each S/T risk score (axis ticks at 1, 3, 5, 7, 9, 10), overlaid with the model PDFs for three risk scores (MIN/AVG/MAX values as on the Sked/Tech Score Mapping slide) and the composite PDF for all SARs]

- Bars are the frequency of occurrence of each risk score
- The three PDFs are those for the 3 risk scores above: more risk has a higher mode and a wider base, and all are symmetric
- The composite curve is the PDF for all SARs

Cost Growth Database

[Figure: histogram of cost growth in the database; the horizontal axis runs from -30% (cost decrease) through 0% (no change) to 400% (cost increase); bar counts, in the order they appear in the transcript: 28, 20, 15, 9, 8, 7, 5, 4, 4, 3, 2, 1, 1]

Risk appears skewed, perhaps Triangular or Lognormal

This distribution, found in databases, is the result of a blending of a family of distributions as shown on the previous slide.

Example Cost Estimate with Risk – R&D

[Figure: bar chart in dollars (axis 0 to 150) that starts from the initial point estimate, adds cost estimating (CE) risk, then adds sched/tech (S/T) risk; the labeled increments are 8.7% and 25%, for a total of 33.7%]

Note: These are means – there is an associated confidence interval, not portrayed.

Summary

- Why include risk?
- Risk adjusts the cost estimate so that it more closely represents what historical data and experts know to be true … it predicts cost growth

- How to treat risk?
- We have seen an overview of the many different options in terms of inputs, the structure of the risk model, and how to execute the risk model
- The choices are varied, but it is important that the model “fits together” and that it predicts well.

- Closing thought: Always include cross checks to support the accuracy of the model and the specific results for a program
- The model may seem right, but will it (did it) predict accurate results?

Risk Resources – Books

- Against the Gods: The Remarkable Story of Risk, Peter L. Bernstein, August 31, 1998, John Wiley & Sons
- Living Dangerously! Navigating the Risks of Everyday Life, John F. Ross, 1999, Perseus Publishing
- Probability Methods for Cost Uncertainty Analysis: A Systems Engineering Perspective, Paul Garvey, 2000, Marcel Dekker
- Introduction to Simulation and Risk Analysis, James R. Evans, David Louis Olson, 1998, Prentice Hall
- Risk Analysis: A Quantitative Guide, David Vose, 2000, John Wiley & Sons

Risk Resources – Web

- Decisioneering
- Makers of Crystal Ball for Monte Carlo simulation
- http://www.decisioneering.com

- Palisade
- Makers of @Risk for Monte Carlo simulation
- http://www.palisade.com

Risk Resources – Papers

- Approximating the Probability Distribution of Total System Cost, Paul Garvey, DoDCAS 1999
- Why Cost Analysts should use Pearson Correlation, rather than Rank Correlation, Paul Garvey, DoDCAS 1999
- Why Correlation Matters in Cost Estimating, Stephen Book, DoDCAS 1999
- General-Error Regression in Deriving Cost-Estimating Relationships, Stephen A. Book and Mr. Philip H. Young, DoDCAS 1998
- Specifying Probability Distributions From Partial Information on their Ranges of Values, Paul R. Garvey, DoDCAS 1998
- Don't Sum EVM WBS Element Estimates at Completion, Stephen Book, Joint ISPA/SCEA 2001
- Only Numbers in the Interval –1.0000 to +0.9314… Can Be Values of the Correlation Between Oppositely-Skewed Right-Triangular Distributions, Stephen Book, Joint ISPA/SCEA 1999

- An Overview of Correlation and Functional Dependencies in Cost Risk and Uncertainty Analysis, R. L. Coleman, S. S. Gupta, DoDCAS, 1994
- Weapon System Cost Growth As a Function of Maturity, K. J. Allison, R. L. Coleman, DoDCAS 1996
- Cost Risk Estimates Incorporating Functional Correlation, Acquisition Phase Relationships, and Realized Risk, R. L. Coleman, S. S. Gupta, J. R. Summerville, G. E. Hartigan, SCEA National Conference, 1997
- Cost Risk Analysis of the Ballistic Missile Defense (BMD) System, An Overview of New Initiatives Included in the BMDO Risk Methodology, R. L. Coleman, J. R. Summerville, D. M. Snead, S. S. Gupta, G. E. Hartigan, N. L. St. Louis, DoDCAS, 1998 (Outstanding Contributed Paper), and ISPA/SCEA International Conference, 1998

- Risk Analysis of a Major Government Information Production System, Expert-Opinion-Based Software Cost Risk Analysis Methodology, N. L. St. Louis, F. K. Blackburn, R. L. Coleman, DoDCAS, 1998 (Outstanding Contributed Paper), and ISPA/SCEA International Conference, 1998 (Overall Best Paper Award)
- Analysis and Implementation of Cost Estimating Risk in the Ballistic Missile Defense Organization (BMDO) Risk Model, A Study of Distribution, J. R. Summerville, H. F. Chelson, R. L. Coleman, D. M. Snead, Joint ISPA/SCEA International Conference 1999
- Risk in Cost Estimating - General Introduction & The BMDO Approach, R. L. Coleman, J. R. Summerville, M. DuBois, B. Myers, DoDCAS, 2000
- Cost Risk in Operations and Support Estimates, J. R. Summerville, R. L. Coleman, M. E. Dameron, SCEA National Conference, 2000

- Cost Risk in a System of Systems, R.L. Coleman, J.R. Summerville, V. Reisenleiter, D. M. Snead, M. E. Dameron, J. A. Mentecki, L. M. Naef, SCEA National Conference, 2000
- NAVAIR Cost Growth Study: A Cohorted Study of the Effects of Era, Size, Acquisition Phase, Phase Correlation and Cost Drivers, R. L. Coleman, J. R. Summerville, M. E. Dameron, C. L. Pullen, D. M. Snead, ISPA/SCEA Joint International Conference, 2001
- Probability Distributions of Work Breakdown Structures, R. L. Coleman, J. R. Summerville, M. E. Dameron, N. L. St. Louis, ISPA/SCEA Joint International Conference, 2001
- Relational Correlation: What to do when Functional Correlation is Impossible, R. L. Coleman, J. R. Summerville, M. E. Dameron, C. L. Pullen, S. S. Gupta, ISPA/SCEA Joint International Conference, 2001
- The Relationship Between Cost Growth and Schedule Growth, R. L. Coleman, J. R. Summerville, DoDCAS, 2002
- The Manual for Intelligence Community CAIG Independent Cost Risk Estimates, R. L. Coleman, J. R. Summerville, S. S. Gupta, DoDCAS, 2002

Geometry of Bivariate Normal Random Variables

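- The dispersion and axis tilt of the “data cloud” is a function of correlation:
  - less correlation, more dispersion about the axis
  - more correlation, more axis tilt

[Figure: bivariate normal data clouds in the x-y plane centered at $(\mu_x, \mu_y)$, drawn for $\rho = 0$ and $\rho = 0.75$, with $\sigma_x$ and $\sigma_y$ marked and the tilt of the $\rho = 0.75$ cloud indicated]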

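Implications for Regression Line

- One line is drawn with perfect correlation: it has the slope that would be true if $\rho = 1$
- The other line has correlation added ($\rho = 0.75$ in the figure):

$$y = \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x) + \mu_y$$

with intercept

$$b = \mu_y - \rho \frac{\sigma_y}{\sigma_x} \mu_x$$

[Figure: data cloud centered at $(\mu_x, \mu_y)$ with extents $2\sigma_x$ and $2\sigma_y$ marked, showing the $\rho = 1$ line, the regression line for $\rho = 0.75$, and its intercept $b$]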

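Geometry of Regression Line

The regression line of y on x depends on their means, their standard deviations and their correlation:

$$y = \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x) + \mu_y, \qquad b = \mu_y - \rho \frac{\sigma_y}{\sigma_x} \mu_x$$

- Slope $m$ varies with $\rho$, $\sigma_x$, $\sigma_y$
- Intercept $b$ varies with $\rho$, $\sigma_x$, $\sigma_y$, $\mu_x$, and $\mu_y$
- Dispersion varies with $\rho$

[Figure: regression lines through $(\mu_x, \mu_y)$ for $\rho = 1$, $\rho = 0$, and $\rho = -1$, showing the range of slopes and the range of intercepts, with $2\sigma_x$ and $2\sigma_y$ marked]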

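Geometry of r squared

$r^2$ is the percent reduction between these two variances: $\sigma_y^2$ and $\sigma_{y|x}^2$, or $\sigma_x^2$ and $\sigma_{x|y}^2$

$$\sigma_{y|x}^2 = (1 - \rho^2)\,\sigma_y^2$$

[Figure: data clouds for $r^2 = 0.75$ and $r^2 = 0$, with $\sigma_y$ and $\sigma_{y|x}$ marked about the regression line and the intercept $b$ indicated]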
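
The relationships on the last three slides can be collected into one small function. The sketch below computes the slope, intercept, and conditional variance implied by a pair of means, standard deviations, and a correlation, which is also a quick way to see what regression line an injected correlation implies. The input values are hypothetical.

```python
def regression_from_moments(mu_x, mu_y, sd_x, sd_y, rho):
    """Slope, intercept, and conditional variance of the regression of y on x."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be between -1 and 1")
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    slope = rho * sd_y / sd_x
    intercept = mu_y - slope * mu_x
    conditional_variance = (1.0 - rho ** 2) * sd_y ** 2
    return slope, intercept, conditional_variance


# Hypothetical pair of WBS elements (costs in $K) with an injected correlation.
slope, intercept, cond_var = regression_from_moments(
    mu_x=10.0, mu_y=20.0, sd_x=2.0, sd_y=4.0, rho=0.75
)
r_squared = 0.75 ** 2

print("slope:", slope)
print("intercept:", intercept)
print("variance of y given x:", cond_var)
print("fraction of variance of y removed (r squared):", r_squared)
```

If the implied slope or intercept makes no sense for the two elements in question, that is a sign the injected correlation is not one the analyst would defend.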
