# Design and Analysis of Experiments (5) Fitting Regression Models

## Design and Analysis of Experiments (5) Fitting Regression Models

Kyung-Ho Park

• In many problems two or more variables are related, and it is of interest to model and explore this relationship.

• The model can be used for prediction, process optimization, or process control.

Modelling

Interpolation Method

• Interpolation is a method of constructing new data points within the range of a discrete set of known data points.

• In engineering and science one often has a number of data points, as obtained by sampling or experiment, and tries to construct a function which closely fits those data points.

• This is called curve fitting or regression analysis. Interpolation is a specific case of curve fitting, in which the function must go exactly through the data points.

Piecewise constant interpolation

The simplest interpolation method is to locate the nearest data value and assign the same value. In one dimension there are seldom good reasons to choose this over linear interpolation, which is almost as cheap, but in higher-dimensional (multivariate) interpolation it can be a favourable choice for its speed and simplicity.
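As a minimal sketch of piecewise constant (nearest-neighbour) interpolation, the snippet below returns the value of the closest known point. The function name `nearest_interp` and the sample table are illustrative assumptions: only f(2) = 0.9093 and f(3) = 0.1411 appear in these slides, and the other values are assumed to be sin(x) rounded to four decimals.

```python
def nearest_interp(xs, ys, x):
    """Piecewise constant interpolation: return y of the nearest known x."""
    # Index of the sample whose x-coordinate is closest to the query point.
    # On an exact tie, min() keeps the first (leftmost) candidate.
    nearest = min(range(len(xs)), key=lambda k: abs(xs[k] - x))
    return ys[nearest]


# Assumed sample table (hypothetical except for f(2) and f(3)).
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [0.0, 0.8415, 0.9093, 0.1411, -0.7568, -0.9589, -0.2794]

print(nearest_interp(xs, ys, 2.2))  # nearest known point is x = 2
print(nearest_interp(xs, ys, 2.8))  # nearest known point is x = 3
```

The result jumps abruptly halfway between neighbouring points, which is why linear interpolation is usually preferred in one dimension.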

Linear interpolation

One of the simplest methods is linear interpolation (sometimes known as lerp). Consider the example of determining f(2.5) from a table of known values of f. Since 2.5 is midway between 2 and 3, it is reasonable to take f(2.5) midway between f(2) = 0.9093 and f(3) = 0.1411, which yields 0.5252.

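Generally, linear interpolation takes two data points, say $(x_a, y_a)$ and $(x_b, y_b)$, and the interpolant at a point $x$ between them is $y = y_a + (y_b - y_a)\,\dfrac{x - x_a}{x_b - x_a}$.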

Linear interpolation is quick and easy, but it is not very precise. Another disadvantage is that the interpolant is not differentiable at the data points $x_k$.
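The midpoint calculation above can be sketched in a few lines. The function name `lerp` and its argument order are illustrative choices, not something defined in the slides; the two data points are the ones quoted in the text.

```python
def lerp(xa, ya, xb, yb, x):
    """Linear interpolation between (xa, ya) and (xb, yb), evaluated at x."""
    # Fraction of the way from xa to xb (0 at xa, 1 at xb).
    fraction = (x - xa) / (xb - xa)
    return ya + (yb - ya) * fraction


# The two known points from the text: f(2) = 0.9093 and f(3) = 0.1411.
estimate = lerp(2, 0.9093, 3, 0.1411, 2.5)
print(round(estimate, 4))  # 0.5252
```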

Polynomial interpolation

• Polynomial interpolation is a generalization of linear interpolation. Note that the linear interpolant is a linear function. We now replace this interpolant by a polynomial of higher degree.

• Consider again the problem given above. The following sixth-degree polynomial goes through all seven data points:

• $f(x) = -0.0001521x^6 - 0.003130x^5 + 0.07321x^4 - 0.3577x^3 + 0.2255x^2 + 0.9038x$.

• Substituting x = 2.5, we find that f(2.5) = 0.5965.
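To check the substitution, the polynomial can be evaluated with Horner's method, as in the sketch below. The helper name `horner` is an assumption. Because the coefficients printed on the slide are rounded, the computed value agrees with the quoted 0.5965 only to about three decimal places.

```python
def horner(coefficients, x):
    """Evaluate a polynomial at x; coefficients are ordered highest degree first."""
    result = 0.0
    for c in coefficients:
        result = result * x + c
    return result


# Coefficients of the sixth-degree interpolating polynomial, x^6 down to x^0.
# The constant term is zero because the polynomial has no constant term.
coefficients = [-0.0001521, -0.003130, 0.07321, -0.3577, 0.2255, 0.9038, 0.0]

value = horner(coefficients, 2.5)
print(round(value, 3))

# The polynomial should also reproduce the known data point f(2) = 0.9093
# up to the rounding of the coefficients.
print(round(horner(coefficients, 2.0), 3))
```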

Spline interpolation

Recall that linear interpolation uses a linear function on each of the intervals $[x_k, x_{k+1}]$. Spline interpolation uses low-degree polynomials on each of the intervals and chooses the polynomial pieces so that they fit smoothly together. The resulting function is called a spline.
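One common choice is the natural cubic spline, which uses a cubic on each interval and sets the second derivative to zero at both ends. The sketch below is one possible pure-Python implementation, not the method used in the slides. The data are an assumption: seven points of sin(x) at x = 0, ..., 6, chosen because the quoted values f(2) = 0.9093 and f(3) = 0.1411 match sin(2) and sin(3).

```python
import math
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1                      # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)                  # second derivatives at the knots

    if n >= 2:
        # Tridiagonal system for the interior second derivatives m[1..n-1].
        sub = [h[i - 1] for i in range(1, n)]
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        sup = [h[i] for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        # Thomas algorithm: forward elimination, then back substitution.
        for k in range(1, n - 1):
            w = sub[k] / diag[k - 1]
            diag[k] -= w * sup[k - 1]
            rhs[k] -= w * rhs[k - 1]
        sol = [0.0] * (n - 1)
        sol[-1] = rhs[-1] / diag[-1]
        for k in range(n - 3, -1, -1):
            sol[k] = (rhs[k] - sup[k] * sol[k + 1]) / diag[k]
        m[1:n] = sol

    def evaluate(x):
        # Locate the interval [xs[i], xs[i+1]] containing x (clamped to the ends).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b, hi = xs[i + 1] - x, x - xs[i], h[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * hi)
                + (ys[i] / hi - m[i] * hi / 6.0) * a
                + (ys[i + 1] / hi - m[i + 1] * hi / 6.0) * b)

    return evaluate


# Assumed data: sin(x) sampled at the integers 0..6.
xs = [float(k) for k in range(7)]
ys = [math.sin(x) for x in xs]
spline = natural_cubic_spline(xs, ys)
print(round(spline(2.5), 4))
```

Unlike the piecewise linear interpolant, this function has continuous first and second derivatives at the data points.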

Extrapolation Method

• A sound choice of which extrapolation method to apply relies on prior knowledge of the process that created the existing data points.

• Crucial questions are, for example, whether the data can be assumed to be continuous, smooth, possibly periodic, etc.

• Linear extrapolation

• Polynomial extrapolation

• Conic extrapolation

• A conic section can be created using five points near the end of the known data. If the conic section created is an ellipse or circle, it will loop back and rejoin itself. A parabolic or hyperbolic curve will not rejoin itself, but may curve back relative to the X-axis. This type of extrapolation could be done with a conic sections template (on paper) or with a computer.
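Of the methods listed, linear extrapolation is the simplest to sketch: extend the straight line through the last two known points beyond the data range. The function name `linear_extrapolate` and the data values below are hypothetical, chosen only for illustration.

```python
def linear_extrapolate(xs, ys, x):
    """Extend the straight line through the last two known points to x."""
    x1, y1 = xs[-2], ys[-2]
    x2, y2 = xs[-1], ys[-1]
    slope = (y2 - y1) / (x2 - x1)
    return y2 + slope * (x - x2)


# Hypothetical measurements; the query point x = 4.0 lies beyond the data.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]
print(round(linear_extrapolate(xs, ys, 4.0), 3))  # 7.7
```

The estimate depends only on the final two points, so it is reliable only when the underlying process is close to linear near the end of the data.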

Simple Linear Regression and Correlation

(Empirical Models)

Example 6-1

y : the purity of oxygen produced in a chemical distillation process

x : the percentage of hydrocarbons that are present in the main condenser of the distillation unit

Stat > Regression > Fitted Line Plot

Regression Analysis: Purity y(%) versus Hydrocarbon level x(%)

The regression equation is

Purity y(%) = 74.28 + 14.95 Hydrocarbon level x(%)

S = 1.08653, R-Sq = 87.7%, R-Sq(adj) = 87.1%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 1 | 152.127 | 152.127 | 128.86 | 0.000 |
| Error | 18 | 21.250 | 1.181 | | |
| Total | 19 | 173.377 | | | |
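The summary statistics in this output follow from the sums of squares and degrees of freedom in the analysis of variance table. The sketch below recomputes them using the standard definitions; the variable names are illustrative.

```python
import math

# Values taken from the analysis of variance table.
ss_regression, df_regression = 152.127, 1
ss_error, df_error = 21.250, 18

ss_total = ss_regression + ss_error
df_total = df_regression + df_error

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_stat = ms_regression / ms_error                        # F = MS(regression) / MS(error)
s = math.sqrt(ms_error)                                  # S, the residual standard deviation
r_sq = ss_regression / ss_total                          # R-Sq
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)  # R-Sq(adj)

print(f"F = {f_stat:.2f}, S = {s:.5f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```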

Predicted Values for New Observations

| New Obs | Fit | SE Fit | 95% CI | 95% PI |
|---|---|---|---|---|
| 1 | 89.081 | 0.364 | (88.316, 89.846) | (86.674, 91.489) |
| 2 | 89.530 | 0.336 | (88.824, 90.235) | (87.141, 91.919) |
| 3 | 91.473 | 0.250 | (90.947, 91.999) | (89.130, 93.815) |
| 4 | 93.566 | 0.273 | (92.993, 94.138) | (91.212, 95.919) |
| 5 | 96.107 | 0.424 | (95.216, 96.998) | (93.656, 98.557) |
| 6 | 94.612 | 0.325 | (93.929, 95.295) | (92.229, 96.995) |
| 7 | 87.288 | 0.493 | (86.251, 88.324) | (84.781, 89.795) |
| 8 | 92.669 | 0.247 | (92.150, 93.188) | (90.328, 95.010) |
| 9 | 97.452 | 0.526 | (96.348, 98.556) | (94.916, 99.988) |
| 10 | 95.210 | 0.362 | (94.449, 95.971) | (92.804, 97.616) |
| 11 | 92.071 | 0.243 | (91.560, 92.582) | (89.732, 94.410) |
| 12 | 91.473 | 0.250 | (90.947, 91.999) | (89.130, 93.815) |
| 13 | 88.932 | 0.374 | (88.146, 89.718) | (86.518, 91.346) |
| 14 | 89.380 | 0.345 | (88.655, 90.105) | (86.985, 91.775) |
| 15 | 90.875 | 0.268 | (90.312, 91.438) | (88.524, 93.226) |
| 16 | 92.220 | 0.243 | (91.710, 92.731) | (89.881, 94.559) |
| 17 | 93.117 | 0.257 | (92.577, 93.657) | (90.771, 95.463) |
| 18 | 94.014 | 0.293 | (93.399, 94.629) | (91.650, 96.378) |
| 19 | 95.658 | 0.392 | (94.834, 96.483) | (93.231, 98.085) |
| 20 | 88.483 | 0.405 | (87.633, 89.334) | (86.047, 90.919) |
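The two intervals in each row can be reproduced from Fit, SE Fit and S. The sketch below uses the standard formulas Fit ± t · SE Fit for the confidence interval and Fit ± t · sqrt(S² + SE Fit²) for the prediction interval; these formulas are not stated in the slides. The critical value 2.101 is the tabulated t value for 18 error degrees of freedom at the 95% level, entered by hand as an assumption.

```python
import math

T_CRIT = 2.101   # assumed table value: t(0.025, 18 degrees of freedom)
S = 1.08653      # residual standard deviation from the regression output


def intervals(fit, se_fit, s=S, t=T_CRIT):
    """Return (confidence interval, prediction interval) for one fitted value."""
    ci_half = t * se_fit
    # A new observation adds its own error variance s^2 to that of the fit.
    pi_half = t * math.sqrt(s ** 2 + se_fit ** 2)
    return (fit - ci_half, fit + ci_half), (fit - pi_half, fit + pi_half)


# First row of the table: Fit = 89.081, SE Fit = 0.364.
ci, pi = intervals(89.081, 0.364)
print("95% CI:", tuple(round(v, 3) for v in ci))
print("95% PI:", tuple(round(v, 3) for v in pi))
```

The prediction interval is always wider than the confidence interval because it covers a single future observation rather than the mean response.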

| Obs | Hydrocarbon level x(%) | Purity y(%) | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 1 | 0.99 | 90.010 | 89.081 | 0.364 | 0.929 | 0.91 |
| 2 | 1.02 | 89.050 | 89.530 | 0.336 | -0.480 | -0.46 |
| 3 | 1.15 | 91.430 | 91.473 | 0.250 | -0.043 | -0.04 |
| 4 | 1.29 | 93.740 | 93.566 | 0.273 | 0.174 | 0.17 |
| 5 | 1.46 | 96.730 | 96.107 | 0.424 | 0.623 | 0.62 |
| 6 | 1.36 | 94.450 | 94.612 | 0.325 | -0.162 | -0.16 |
| 7 | 0.87 | 87.590 | 87.288 | 0.493 | 0.302 | 0.31 |
| 8 | 1.23 | 91.770 | 92.669 | 0.247 | -0.899 | -0.85 |
| 9 | 1.55 | 99.420 | 97.452 | 0.526 | 1.968 | 2.07R |
| 10 | 1.40 | 93.650 | 95.210 | 0.362 | -1.560 | -1.52 |
| 11 | 1.19 | 93.540 | 92.071 | 0.243 | 1.469 | 1.39 |
| 12 | 1.15 | 92.520 | 91.473 | 0.250 | 1.047 | 0.99 |
| 13 | 0.98 | 90.560 | 88.932 | 0.374 | 1.628 | 1.60 |
| 14 | 1.01 | 89.540 | 89.380 | 0.345 | 0.160 | 0.16 |
| 15 | 1.11 | 89.850 | 90.875 | 0.268 | -1.025 | -0.97 |
| 16 | 1.20 | 90.390 | 92.220 | 0.243 | -1.830 | -1.73 |
| 17 | 1.26 | 93.250 | 93.117 | 0.257 | 0.133 | 0.13 |
| 18 | 1.32 | 93.410 | 94.014 | 0.293 | -0.604 | -0.58 |
| 19 | 1.43 | 94.980 | 95.658 | 0.392 | -0.678 | -0.67 |
| 20 | 0.95 | 87.330 | 88.483 | 0.405 | -1.153 | -1.14 |
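The fitted line and the residual columns can be recomputed directly from the 20 observations. The sketch below applies the usual least-squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope · x̄) and standardizes each residual by s · sqrt(1 − h), where h is the leverage of the observation. The formulas and variable names are supplied here for illustration and are not shown in the slides.

```python
import math

# Hydrocarbon level x(%) and purity y(%) for the 20 observations in the table.
x = [0.99, 1.02, 1.15, 1.29, 1.46, 1.36, 0.87, 1.23, 1.55, 1.40,
     1.19, 1.15, 0.98, 1.01, 1.11, 1.20, 1.26, 1.32, 1.43, 0.95]
y = [90.01, 89.05, 91.43, 93.74, 96.73, 94.45, 87.59, 91.77, 99.42, 93.65,
     93.54, 92.52, 90.56, 89.54, 89.85, 90.39, 93.25, 93.41, 94.98, 87.33]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar

fits = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fits)]

# Residual standard deviation with n - 2 degrees of freedom.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Leverage of each point, then standardized residuals.
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
st_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

print(f"Purity = {intercept:.2f} + {slope:.2f} * Hydrocarbon level")
print(f"S = {s:.5f}")
print(f"Obs 9: residual = {residuals[8]:.3f}, st. resid = {st_resid[8]:.2f}")
```

Observation 9 has the largest standardized residual, which is consistent with the `R` flag next to its value in the table.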

Multiple Linear Regression Model

Stat > Regression > Regression

Regression Analysis: Viscosity versus Temp, Rate

The regression equation is

Viscosity = 1566 + 7.62 Temp + 8.58 Rate

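| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 1566.08 | 61.59 | 25.43 | 0.000 |
| Temp | 7.6213 | 0.6184 | 12.32 | 0.000 |
| Rate | 8.585 | 2.439 | 3.52 | 0.004 |

S = 16.3586, R-Sq = 92.7%, R-Sq(adj) = 91.6%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 2 | 44157 | 22079 | 82.50 | 0.000 |
| Residual Error | 13 | 3479 | 268 | | |
| Total | 15 | 47636 | | | |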
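Several entries in this output can be cross-checked against each other. The sketch below recomputes each T statistic as Coef / SE Coef, recomputes F and the two R-Sq values from the analysis of variance table, and uses the fitted equation for a prediction. The function name `predict_viscosity` and the inputs Temp = 80, Rate = 8 are hypothetical; the underlying data set is not reproduced in the slides.

```python
# Coefficient estimates and standard errors from the predictor table.
coefficients = {
    "Constant": (1566.08, 61.59),
    "Temp": (7.6213, 0.6184),
    "Rate": (8.585, 2.439),
}

# Each T statistic is the estimate divided by its standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# Sums of squares and degrees of freedom from the analysis of variance table.
ss_regression, df_regression = 44157, 2
ss_error, df_error = 3479, 13
ss_total = ss_regression + ss_error
df_total = df_regression + df_error

f_stat = (ss_regression / df_regression) / (ss_error / df_error)
r_sq = ss_regression / ss_total
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)


def predict_viscosity(temp, rate):
    """Point prediction from the fitted equation (hypothetical helper)."""
    return (coefficients["Constant"][0]
            + coefficients["Temp"][0] * temp
            + coefficients["Rate"][0] * rate)


for name, t in t_stats.items():
    print(f"{name}: T = {t:.2f}")
print(f"F = {f_stat:.2f}, R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"Predicted viscosity at Temp=80, Rate=8: {predict_viscosity(80, 8):.1f}")
```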
