- 69 Views
- Uploaded on
- Presentation posted in: General

Linear Regression Models

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Linear Regression Models

Andy Wang

CIS 5930-03

Computer Systems

Performance Analysis

- What is a (good) model?
- Estimating model parameters
- Allocating variation
- Confidence intervals for regressions
- Verifying assumptions visually

- For correlated data, model predicts response given an input
- Model should be equation that fits data
- Standard definition of “fits” is least-squares
- Minimize squared error
- Keep mean error zero
- Minimizes variance of errors

- If then error in estimate for xi is
- Minimize Sum of Squared Errors (SSE)
- Subject to the constraint

- Best regression parameters are
where

- Note error in book!

- Execution time of a script for various loop counts:
- = 6.8, = 2.32, xy = 88.7, x2 = 264
- b0 = 2.32 (0.30)(6.8) = 0.28

- If no regression, best guess of y is
- Observed values of y differ from , giving rise to errors (variance)
- Regression gives better guess, but there are still errors
- We can evaluate quality of regression by allocating sources of errors

- Without regression, squared error is

- Recall that regression error is
- Error without regression is SST
- So regression explains SSR = SST - SSE
- Regression quality measured by coefficient of determination

- Compute
- Compute
- Compute
- where R = R(x,y) = correlation(x,y)

- For previous regression example
- y = 11.60, y2 = 29.88, xy = 88.7,
- SSE = 29.88-(0.28)(11.60)-(0.30)(88.7) = 0.028
- SST = 29.88-26.9 = 2.97
- SSR = 2.97-.03 = 2.94
- R2 = (2.97-0.03)/2.97 = 0.99

- Variance of errors is SSE divided by degrees of freedom
- DOF is n2 because we’ve calculated 2 regression parameters from the data
- So variance (mean squared error, MSE) is SSE/(n2)

- Standard dev. of errors is square root:(minor error in book)

- Degrees of freedom always equate:
- SS0 has 1 (computed from )
- SST has n1 (computed from data and , which uses up 1)
- SSE has n2 (needs 2 regression parameters)
- So

- For regression example, SSE was 0.03, so MSE is 0.03/3 = 0.01 and se = 0.10
- Note high quality of our regression:
- R2 = 0.99
- se = 0.10

- Regression is done from a single population sample (size n)
- Different sample might give different results
- True model is y = 0 + 1x
- Parameters b0 and b1 are really means taken from a population sample

- Standard deviations of parameters:
- Confidence intervals arewhere t has n - 2 degrees of freedom
- Not divided by sqrt(n)

- Recall se = 0.13, n = 5, x2 = 264, = 6.8
- So
- Using 90% confidence level, t0.95;3 = 2.353

- Thus, b0 interval is
- Not significant at 90%

- And b1 is
- Significant at 90% (and would survive even 99.9% test)

- Previous confidence intervals are for parameters
- How certain can we be that the parameters are correct?

- Purpose of regression is prediction
- How accurate are the predictions?
- Regression gives mean of predicted response, based on sample we took

- Standard deviation for mean of future sample of m observations at xpis
- Note deviation drops as m
- Variance minimal at x =
- Use t-quantiles with n–2 DOF for interval

- Using previous equation, what is predicted time for a single run of 8 loops?
- Time = 0.28 + 0.30(8) = 2.68
- Standard deviation of errors se = 0.10
- 90% interval is then

- Regressions are based on assumptions:
- Linear relationship between response y and predictor x
- Or nonlinear relationship used in fitting

- Predictor x nonstochastic and error-free
- Model errors statistically independent
- With distribution N(0,c) for constant c

- Linear relationship between response y and predictor x
- If assumptions violated, model misleading or invalid

- Scatter plot x vs. y to see basic curve type

Linear

Piecewise Linear

Outlier

Nonlinear (Power)

- Scatter-plot i versus
- Should be no visible trend
- Example from our curve fit:

- May be useful to plot error residuals versus experiment number
- In previous example, this gives same plot except for x scaling

- No foolproof tests
- “Independence” test really disproves particular dependence
- Maybe next test will show different dependence

- Prepare quantile-quantile plot of errors
- Example for our regression:

- Tongue-twister: homoscedasticity
- Return to independence plot
- Look for trend in spread
- Example:

- Regression throws away some information about the data
- To allow more compact summarization

- Sometimes vital characteristics are thrown away
- Often, looking at data plots can tell you whether you will have a problem

IIIIIIIV

xyxyxyxy

108.04109.14107.4686.58

86.9588.1486.7785.76

137.58138.741312.7487.71

98.8198.7797.1188.84

118.33119.26117.8188.47

149.96148.10148.8487.04

67.2466.1366.0885.25

44.2643.1045.391912.50

1210.84129.13128.1585.56

74.8277.2676.4287.91

55.6854.7455.7386.89

- Exactly the same thing for each!
- N = 11
- Mean of y = 7.5
- Y = 3 + .5 X
- Standard error of regression is 0.118
- All the sums of squares are the same
- Correlation coefficient = .82
- R2 = .67

I

II

III

IV