VI. Evaluation of Model Fit Part 2: Graphical Analysis of Model Fit and Related Statistics

• Graphs of weighted residuals versus unweighted simulated values, and minimum, maximum, and average weighted residuals
• Graphs of weighted observations versus weighted simulated values and the correlation coefficient R
• Graphs using independent variables and the runs statistic
• Normal probability plots and the correlation coefficient RN2
• Determining acceptable deviations from independent normal weighted residuals
• Use GW_Chart for plotting

Weighted Residuals vs. Simulated Values (_ws in UCODE_2005) (Book, p. 100-104)
• Two of the requirements for a valid regression are that the weighted residuals are random and have a mean of zero.
• Use a graph of the weighted residuals versus simulated values to evaluate the weighted residuals.
• The weighted residuals should be evenly distributed about zero for all weighted simulated values, and should display no trends with the simulated values.
• Trends or unequal variance indicate model bias.
• Examples of randomly and non-randomly distributed weighted residuals: Figures 6-1 and 6-2 of Hill and Tiedeman (pages 102-103).
• Figure 6-7A (page 116) of Hill and Tiedeman shows the graph for the steady-state problem.

Change to the _ws file from the book and previous codes
• The book uses WEIGHTED simulated values on the horizontal axis of these graphs.
• There is a statistical reason for using weighted simulated values, but in practice it is more confusing and doesn’t add much.
• The _ws data-exchange file now lists weighted residuals and unweighted simulated values. This is a change from the _ws file in the previous codes UCODE, MODFLOW-2000, and MODFLOWP.
• If there is a very wide range in the simulated values, you can make the horizontal axis log-transformed, or use the weighted simulated values from the _ww file.
• The weighted residuals also can be plotted against observed values.

Weighted Residuals vs. Simulated Values
Figure 6-1 (page 102) and Figure 6-2 (page 103) of Hill and Tiedeman
• These figures use weighted simulated values, but the results are the same.
• Define the scale of the weighted-residual axis using the standard error of the regression.

1. Minimum, Maximum, and Average Weighted Residual (Book, p. 100)
• Minimum and maximum weighted residuals display the range of weighted residuals. Examination of these values can help reveal:
  • Areas where the fit to the observed data is especially poor,
  • Areas where data have been incorrectly interpreted, and
  • Data input errors.
• The average weighted residual in nonlinear regression should be close to zero. In linear regression, the average residual is always exactly zero.
• Caution: the average weighted residual in nonlinear regression can be close to zero even if there are other problems with the model or regression.
• DO EXERCISE 6.2a: Evaluate graph of weighted residuals versus weighted simulated values. Evaluate the minimum, maximum, and average weighted residuals.
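A minimal sketch of the summary statistics above, in Python. It assumes the usual definition of a weighted residual as (observed - simulated) divided by the standard deviation of the observation error, which this slide does not state. The function names and all data values are hypothetical and do not come from the exercise files.

```python
from statistics import mean


def weighted_residuals(observed, simulated, std_devs):
    """Weighted residual = (observed - simulated) / standard deviation (assumed definition)."""
    return [(o - s) / sd for o, s, sd in zip(observed, simulated, std_devs)]


def summarize(wres):
    """Minimum, maximum, and average weighted residual."""
    return {"min": min(wres), "max": max(wres), "mean": mean(wres)}


# Hypothetical observations, simulated values, and observation-error standard deviations.
observed = [101.2, 98.7, 95.1, 90.4, 85.9]
simulated = [100.0, 99.5, 94.0, 91.0, 86.5]
std_devs = [1.0, 1.0, 0.5, 0.5, 1.0]

wres = weighted_residuals(observed, simulated, std_devs)
summary = summarize(wres)
print(summary)
```

In this made-up data set the largest weighted residual belongs to the third observation, which is the one to check first for poor fit, misinterpreted data, or an input error.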

1. Weighted Residuals vs. Weighted Simulated Values (can use unweighted simulated values)
_ws data-exchange file
Figure 6-7A of Hill and Tiedeman (page 116)

2. Graphs of Weighted Observed vs. Weighted Simulated Values (Book, p. 105)
• Values on the graph of weighted observed versus weighted simulated values should plot close to a line with slope = 1.0.
• Generally, for assessing model bias or desired properties of weighted residuals, these graphs are not as useful as graphs of weighted residuals vs. weighted simulated values.
• This is partly because a large range in magnitudes of weighted observations can obscure trends in the differences between the weighted observations and the weighted simulated values.
• See Figure 6-3 (page 105) of Hill and Tiedeman.

2. Weighted Observed vs. Weighted Simulated Values
Figure 6-3 of Hill and Tiedeman (page 105): figures with weighted observations (_ww) and figures with weighted residuals (_ws)
• The data have the same problems as in the graphs with weighted residuals (repeated here), but these graphs do not reveal the problems as clearly!
• In UCODE_2005, _ws has simulated instead of weighted simulated values.

2. Correlation Coefficient R (Book, p. 106)
• R is the correlation between the weighted simulated values and the weighted observations.
• This summary statistic reflects how well the trends in the weighted simulated values match the trends in the weighted observations.
• A value of R greater than 0.90 generally indicates a good match of the trends. However, R is not very useful for assessing model fit, because it has the same drawbacks as the graph of weighted observed vs. weighted simulated values.
• DO EXERCISE 6.2b: Graph weighted observations versus weighted simulated values and examine the correlation coefficient R.
• Figures 6-7b and c (page 116) of Hill and Tiedeman show graphs of weighted and unweighted simulated versus observed values for the steady-state problem.
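A small sketch of why R can look good while the fit is biased. It computes R as the ordinary Pearson correlation of the weighted observations and weighted simulated values, which assumes a diagonal weight matrix. The data are hypothetical: every simulated value is 2 units too high.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical weighted observations and weighted simulated values.
weighted_observed = [10.0, 20.0, 30.0, 40.0, 50.0]
weighted_simulated = [12.0, 22.0, 32.0, 42.0, 52.0]

r = pearson_r(weighted_observed, weighted_simulated)
wres = [o - s for o, s in zip(weighted_observed, weighted_simulated)]
print(r, wres)
```

Here R is at its maximum even though every weighted residual is negative. A graph of the weighted residuals shows the bias immediately, while R and the observed-versus-simulated graph do not.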

Weighted Observed vs. Weighted Simulated Values
_ws, _ww
Figure 6-7b of Hill and Tiedeman (page 116)

3. Graphs Using Independent Variables (Book, p. 106)
• Graphs using independent variables include plots of weighted residuals on maps of the model area or versus time.
• These plots should appear random and show no obvious patterns.
• Lack of randomness can be indicative of model error, for example if weighted residuals are all positive in a particular region of the model.
• Graph weighted residuals on maps of the model layers. (Exercise 6.2c)
• Figure 6-9 (page 117) of Hill and Tiedeman shows these graphs for the steady-state problem.
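The outline for this part pairs these graphs with the runs statistic, which puts a number on "too few sign changes" in an ordered sequence of weighted residuals. The sketch below uses the standard Wald-Wolfowitz runs test without a continuity correction. That is an assumption: the form printed by UCODE or MODFLOW-2000 may differ in detail. The function name and data are hypothetical.

```python
from math import sqrt


def runs_statistic(ordered_residuals):
    """Return (runs, expected runs, z) for the signs of an ordered residual sequence."""
    signs = [r > 0 for r in ordered_residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    if n_pos == 0 or n_neg == 0 or n < 3:
        raise ValueError("need at least 3 residuals with both signs present")
    # A new run starts wherever the sign changes from one residual to the next.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1.0))
    z = (runs - expected) / sqrt(variance)
    return runs, expected, z


# Hypothetical weighted residuals ordered in time: positives first, then negatives.
clustered = [0.5, 1.2, 0.8, 0.3, -0.4, -1.1, -0.7, -0.2]
runs, expected, z = runs_statistic(clustered)
print(runs, expected, round(z, 2))
```

A strongly negative z means far fewer runs than a random sequence would give, which matches the visual impression of residuals clustered by sign.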

3. Graphs Using Independent Variables
Plotted using _w and .xyzt file or other source of location information.
Figure 6-9 of Hill and Tiedeman (page 117)

4. Normal Probability Graphs (Book, p. 108-111)
• Normality of the true errors, and therefore the weighted residuals, is not an assumption required for the regression to be valid.
• However, computation of some inferential statistics, such as parameter confidence intervals, does require that the weighted residuals be normally distributed.
• If weighted residuals are independent and normally distributed, they should plot on a straight line on a normal probability graph.

4. Correlation Coefficient RN2 (Book, p. 110)
• The summary statistic used to test for independence and normality of weighted residuals is RN2, the correlation coefficient between ordered weighted residuals and order statistics from a standard normal probability distribution function.
• The hypothesis tested with RN2 is that the weighted residuals are independent and normally distributed. Critical values of RN2 for significance levels of 0.05 and 0.10 are used to test this hypothesis. The significance level is the probability of rejecting the hypothesis when it is actually true.
• For example, we choose a significance level of 0.05, and its critical value:
  • If RN2 > critical value, we accept the hypothesis that the weighted residuals are independent and normally distributed.
  • If RN2 < critical value, we reject the hypothesis that the weighted residuals are independent and normally distributed, and there is a 5 percent probability that we are wrong.

4. Correlation Coefficient RN2
• The critical values for significance levels of 0.05 and 0.10 are printed in the UCODE and MF2K output files. Because this is a strict test for independence and normality, it is generally adequate to use a significance level of 0.05.
• RN2 is calculated in two ways:
  • using only the weighted residuals for the dependent-variable observations.
  • using the dependent-variable and the prior information weighted residuals.
• Large differences in RN2 between these two data sets indicate that the data sets are not statistically similar.
• For commonly used sample sizes, this test is more powerful than other statistics used to test normality, such as Kolmogorov-Smirnov.
• DO EXERCISE 6.2d: Prepare normal probability graphs and evaluate the correlation coefficient RN2.
• Figure 6-11 (page 119) of Hill and Tiedeman shows the graph for the steady-state problem.
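A hedged sketch of how a statistic like RN2 can be computed: the squared correlation between the ordered weighted residuals (with their average removed) and standard normal order statistics. The plotting positions (i - 0.5)/n used for the order statistics are an assumption here, so values may differ slightly from those printed by UCODE or MF2K. Compare against the critical values from the program output, not against numbers from this sketch.

```python
from statistics import NormalDist, mean


def rn2(weighted_residuals):
    """Squared correlation between ordered residuals and normal order statistics."""
    e = sorted(weighted_residuals)
    n = len(e)
    m = mean(e)
    # Approximate standard normal order statistics (assumed plotting positions).
    tau = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    numerator = sum((ei - m) * ti for ei, ti in zip(e, tau)) ** 2
    denominator = sum((ei - m) ** 2 for ei in e) * sum(ti ** 2 for ti in tau)
    return numerator / denominator


# Hypothetical residuals that fall exactly on a straight line in a normal probability graph.
n = 8
perfect = [3.0 + 2.0 * NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# Hypothetical residuals dominated by one large value.
skewed = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 5.0]

rn2_perfect = rn2(perfect)
rn2_skewed = rn2(skewed)
print(rn2_perfect, rn2_skewed)
```

Residuals that plot on a straight line give a value at or very near 1.0, and the single outlier in the second set pulls the statistic well below that.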

4. Normal Probability Graph _nm Figure 6-11 of Hill and Tiedeman (page 119)

5. Deviations from Independence and Normality (Book, p. 111-113)
• It is possible that the weighted residuals might fail the tests for independence and normality because
  • There are too few residuals, or
  • The weighted residuals are normal, but they are correlated due to the fitting process of the regression, instead of being independent.
• The regression methodology can result in the weighted residuals being correlated, because the regression fits the errors in the data. This correlation becomes more significant if the number of observations is small compared to the number of parameters.
• Cooley and Naff (1990) developed a method for testing whether too few residuals or correlations between residuals might be the cause of the failure of the tests for independence and normality.

5. Deviations from Independence and Normality
• Steps of the method developed by Cooley and Naff are:
  • Generate normally distributed random numbers with and without the regression-induced correlation.
  • Compare the normal probability graph of the weighted residuals with those of the generated independent normally distributed random numbers (d’s). If similar deviations from a straight line are detected in the two types of graphs, then the deviations could result from too few residuals.
  • Compare the normal probability graph of the weighted residuals with those of the generated correlated normally distributed random numbers (g’s). If similar deviations from a straight line are detected in the two types of graphs, then the deviations could result from regression-induced correlation.
• UCODE and RESAN-2000 can be used to generate the independent and correlated normally distributed random numbers.
• Assess deviations from independence and normality (EXERCISE 6.2e)
• Figure 6-13 (p. 121) of Hill and Tiedeman shows graphs of generated numbers versus weighted simulated values; Figure 6-14 (p. 122) shows normal probability graphs of the generated numbers.
Exer\ex6.2e-residual-analysis.bat
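A rough sketch of the first step above, generating one set of d's and the matching g's. It assumes the regression-induced correlation is imposed by removing from d its least-squares fit to the weighted sensitivity matrix, which is equivalent to multiplying d by (I - R) with R built from the sensitivities and weights. This is an illustration under that assumption, not the code used in UCODE or RESAN-2000. The sensitivity matrix, weights, and function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def correlated_from_independent(d, sensitivities, weights):
    """Return g = d minus the least-squares fit of d to the weighted sensitivities."""
    a = [[w ** 0.5 * s for s in row] for row, w in zip(sensitivities, weights)]
    p = len(a[0])
    ata = [[sum(row[j] * row[k] for row in a) for k in range(p)] for j in range(p)]
    atd = [sum(row[j] * di for row, di in zip(a, d)) for j in range(p)]
    coef = solve(ata, atd)
    return [di - sum(row[j] * coef[j] for j in range(p)) for row, di in zip(a, d)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical problem: 6 observations, 2 parameters (intercept and slope), unit weights.
sensitivities = [[1.0, float(i)] for i in range(6)]
weights = [1.0] * 6

d = [rng.gauss(0.0, 1.0) for _ in sensitivities]           # independent normal numbers
g = correlated_from_independent(d, sensitivities, weights)  # correlated normal numbers
print(d)
print(g)
```

In practice several sets of d's and g's are generated and each is plotted on a normal probability graph alongside the weighted residuals. With few observations relative to parameters, the g's differ visibly from the d's.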

5. Deviations in normal probability graphs
Figure 6-11 (_nm): weighted residuals for steady-state model
Figures 6-14A and 6-14B: generated random numbers (_rd and _rg)
• Run residual_analysis to get the _rd and _rg files.
• Plot with gw_chart. (gw_chart reverses the axes compared to the graphs here.)
