Regression and Correlation

BUSA 2100, Sect. 12.0 - 12.2, 3.5

- Forecasts (predictions) are often based on the relationship between 2 or more variables.
- Example 1: Advertising expenditures and sales.
- Example 2: Daily high temperature and demand for electricity.
- X = independent variable, the variable being used to make a forecast; Y = dependent variable, the variable being forecasted.
- Identify X and Y in Examples 1 and 2.
- Y depends on X.

- A regression line can be used to show mathematically how variables are related.

- To determine the equation of a line, all we need are the slope and Y-intercept.
- Example: Pizza House builds restaurants near college campuses.
- Before building another one, it plans to use X = student enrollment (1000s) to estimate Y = quarterly sales ($1000s).
- A sample of 6 existing restaurants is chosen.

- Resulting data pairs are shown below.
   X     Y
   4    95
   6   155
   9   140
  11   210
  12   250
  15   260

- Draw a scatter diagram on the board.
- Use a hiatus (a break in the axis) so that the X and Y axes don’t have to begin at zero. Within each axis, all units must be the same size.
- By trial and error, draw some lines through the data. The regression line is the one line that fits the data best. (Also called the line of best fit.)
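- The trial-and-error idea can be sketched numerically. The sketch below assumes that "fits the data best" means the smallest sum of squared vertical deviations (the least-squares criterion, which the slides do not state explicitly). The three trial lines and the helper name `sse` are hypothetical, chosen only for illustration.

```python
# Pizza House data: X = enrollment (1000s), Y = quarterly sales ($1000s).
x = [4, 6, 9, 11, 12, 15]
y = [95, 155, 140, 210, 250, 260]

def sse(b0, b1):
    """Sum of squared vertical deviations between each Y and the line b0 + b1*X."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical trial lines (intercept, slope), as if sketched by eye.
trial_lines = [(20, 17), (50, 14), (40, 15)]

for b0, b1 in trial_lines:
    print(f"YF = {b0} + {b1}X  ->  SSE = {sse(b0, b1)}")

# Pick the trial line with the smallest total squared deviation.
best = min(trial_lines, key=lambda line: sse(*line))
print("Best of these trial lines:", best)
```

- Of these three, YF = 40 + 15X has the smallest total. Under the least-squares assumption, the regression line is the line that makes this total as small as possible over all possible lines.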

- As indicated earlier, YF is a forecasted value (on the regression line). Y is an actual value (one of the dots).

- Based on calculus, the equation of a regression line (line of best fit) can be found using these formulas.
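- The formulas themselves did not survive in this text. A common computational form of the least-squares formulas is given here as an assumption; it uses only the column sums ΣX, ΣY, ΣXY, and ΣX². The symbol n (number of data pairs) is also assumed.

```latex
b_1 = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2},
\qquad
b_0 = \frac{\sum Y - b_1 \sum X}{n}
```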

- Carry the numerical coefficients (b1 and b0) out to 3 or 4 decimal places; then round to 2 or fewer places at the end.
- Substitute the numbers into the regression equation: YF = b0 + b1X.
- We will complete the restaurant problem, using a table to organize the data.

          X      Y      XY    X^2      Y^2
          4     95     380     16     9025
          6    155     930     36    24025
          9    140    1260     81    19600
         11    210    2310    121    44100
         12    250    3000    144    62500
         15    260    3900    225    67600
  SUM    57   1110   11780    623   226850
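
- As a check on the arithmetic, the column sums can be plugged into the slope and intercept formulas. This is a minimal sketch that assumes the standard least-squares computational formulas, since the slide's own formulas are not shown in the text.

```python
# Column sums taken from the restaurant table (n = 6 restaurants).
n = 6
sum_x, sum_y = 57, 1110
sum_xy, sum_x2 = 11780, 623

# Slope and intercept from the standard least-squares computational formulas
# (an assumption: the slide's own formulas did not survive in the text).
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = (sum_y - b1 * sum_x) / n

# Carry 4 decimal places while working, then round to 2 at the end.
print(f"b1 = {b1:.4f}")                  # b1 = 15.1534
print(f"b0 = {b0:.4f}")                  # b0 = 41.0429
print(f"YF = {b0:.2f} + {b1:.2f}X")      # YF = 41.04 + 15.15X
```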

- Example: Vidalia State University has an enrollment of 9,800. Forecast pizza sales for a restaurant near the campus.
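- A sketch of this forecast, assuming the least-squares coefficients computed from the table's column sums. The function name `forecast_sales` is hypothetical. Because X is measured in 1000s of students, an enrollment of 9,800 enters as X = 9.8.

```python
# Regression coefficients for the Pizza House data, from the column sums
# n = 6, sum X = 57, sum Y = 1110, sum XY = 11780, sum X^2 = 623.
b1 = (6 * 11780 - 57 * 1110) / (6 * 623 - 57 ** 2)
b0 = (1110 - b1 * 57) / 6

def forecast_sales(enrollment_thousands):
    """Return forecasted quarterly sales in $1000s (hypothetical helper)."""
    return b0 + b1 * enrollment_thousands

# Enrollment of 9,800 students is X = 9.8 (in 1000s).
yf = forecast_sales(9.8)
print(f"YF = {yf:.1f} ($1000s)")
```

- The forecast is about 189.5, i.e., quarterly sales of roughly $189,500.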

- The accuracy of forecasts depends on how closely the points in a scatter diagram fit the regression line.
- If the linear relationship is too weak (the deviations are too large), there are large forecast errors and there may be no need to pursue use of a regression line.

- It is best to have an estimate of forecast accuracy before using a regression line.
- 3 ways to estimate forecast accuracy:

- Def.: The coefficient of correlation (r) is a numerical measure of the strength of the linear relationship between 2 variables.
- Values of r are always between -1 and 1; i.e., between 0 and 1 in absolute value.
- r = 0 means no linear correlation; r = ±1 means perfect correlation; both are rare.
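- The slides do not show a formula for r. The sketch below assumes the standard computational formula for the Pearson correlation coefficient and applies it to the Pizza House data pairs.

```python
from math import sqrt

# Pizza House data: X = enrollment (1000s), Y = quarterly sales ($1000s).
x = [4, 6, 9, 11, 12, 15]
y = [95, 155, 140, 210, 250, 260]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Pearson correlation, computational form (assumed; not given in the slides).
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(f"r = {r:.3f}")  # r = 0.933
```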

- Definition: Two variables X, Y have a positive correlation if large values of X tend to be associated with large values of Y, and small values of X with small values of Y.
- X, Y must be measurable quantitatively.
- Example of positive correlation:

- Definition: Two variables X, Y have a negative correlation if large values of X tend to be associated with small values of Y, and vice versa.
- Example of negative correlation:
- Graph positive and negative correlation.

- General guidelines:
  Degree of Correlation    Forecast Accuracy
  very high                very good
  high                     good
  moderate                 medium
  low                      fair
  very low                 poor

- Use regression for forecasts only if r is .70 or larger, in absolute value.

- Steps in regression analysis:
- (1) Collect data pairs, using 2 related variables.
- (2) Calculate the correlation, r.
- (3) (a) If r >= .70, in absolute value, find the regression equation and use it for forecasting.
- (b) If r < .70, in absolute value, don’t use regression.
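- The steps above can be sketched as one function. The function name `simple_regression` and its return convention are hypothetical. The formulas for r, b1, and b0 are the standard computational ones, which is an assumption since the slides do not show them.

```python
from math import sqrt

def simple_regression(x, y, min_abs_r=0.70):
    """Compute r; fit the line only if |r| >= min_abs_r.

    Returns (r, None) when the correlation is too weak,
    otherwise (r, (b0, b1)) for the line YF = b0 + b1*X.
    """
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sx2 = sum(a * a for a in x)
    sy2 = sum(b * b for b in y)
    sxx = n * sx2 - sx ** 2
    syy = n * sy2 - sy ** 2

    # Step 2: calculate the correlation.
    r = (n * sxy - sx * sy) / sqrt(sxx * syy)

    # Step 3(b): correlation too weak, so do not use regression.
    if abs(r) < min_abs_r:
        return r, None

    # Step 3(a): find the regression equation.
    b1 = (n * sxy - sx * sy) / sxx
    b0 = (sy - b1 * sx) / n
    return r, (b0, b1)

# Step 1: data pairs (here, the Pizza House sample).
r, line = simple_regression([4, 6, 9, 11, 12, 15], [95, 155, 140, 210, 250, 260])
print(f"r = {r:.3f}, line = {line}")
```

- For the Pizza House data, r is about 0.93, so the function returns a fitted line; for weakly related data it returns None in place of a line.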

- Regression analysis with one independent variable (X) is called simple regression.
- Regression analysis with 2 or more independent variables (X1, X2, etc.) is called multiple regression.

- State the multiple regression equation.
- A regression equation is also called the line of average relationship. Explain in terms of GPA example.
- Correlation does not necessarily imply cause and effect. Illustrate with example.
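- For reference, the multiple regression equation is usually written as an extension of YF = b0 + b1X, with one coefficient per independent variable. The symbol k (the number of independent variables) is an assumption, not notation from the slides.

```latex
Y_F = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k
```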