Materials for Lecture 11


elkan


  1. Materials for Lecture 11 • Chapters 3 and 6 • Chapter 16, Sections 4.0 and 5.0 • Lecture 11 Pseudo Random LHC.xls • Lecture 11 Validation Tests.xls • The next 4 slides were added because right about now most students are confused about PDF parameters and what functions to use

  2. Parameter Estimation • Parameters for a distribution define its shape and position on the number scale • Uniform(Min, Max) • Norm(Mean, Std Dev) • Mean (Ỹ or Ῡ) and risk as Empirical(Si, P(Si)) • Shape can be skewed right or left, can be tall or squatty (kurtosis) • Parameters reflect amount of variability in the stochastic variable • Must validate random variables against their parameters • We use the parameters to simulate the distributions

  3. Same Mean Different Std Dev

  4. Review Steps for Parameter Estimation • Step 1: Check for presence of a trend, cycle or structural pattern • If trend or structural model, work with the residuals (ẽt) • If no trend use actual data (X’s) • Step 2: Estimate parameters for several assumed distributions using the X’s or the residuals (ẽt) • Step 3: Simulate the different distributions • Step 4: Pick the best match based on • Mean, Variability -- use validation tests • Minimum and Maximum • Shape of the CDF vs. historical series • Penalty function CDFDEV() to quantify differences
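Steps 1 and 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `history` values are made up, the trend is assumed to be linear and is fitted by ordinary least squares, and the helper names `linear_trend_residuals` and `estimate_parameters` are hypothetical (they are not Simetar functions). A real Step 1 would also test whether the slope is statistically significant before deciding to work with residuals.

```python
import statistics

def linear_trend_residuals(y):
    """Fit y_t = a + b*t by ordinary least squares; return (a, b, residuals)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = statistics.fmean(t)
    y_bar = statistics.fmean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b = sxy / sxx
    a = y_bar - b * t_bar
    residuals = [yi - (a + b * ti) for ti, yi in zip(t, y)]
    return a, b, residuals

def estimate_parameters(x):
    """Step 2: estimate parameters for two candidate distributions."""
    return {
        "normal": {"mean": statistics.fmean(x), "std_dev": statistics.stdev(x)},
        "uniform": {"min": min(x), "max": max(x)},
    }

# Hypothetical historical series with an upward trend (illustration only).
history = [10.2, 11.1, 11.9, 13.4, 13.8, 15.1, 15.7, 17.2]

a, b, resid = linear_trend_residuals(history)
params = estimate_parameters(resid)  # trend present, so use the residuals
print("intercept, slope:", a, b)
print("candidate parameters:", params)
```

If no trend were present, the same `estimate_parameters` call would be applied to the actual data instead of the residuals.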

  5. Univariate Parameter Estimation • When do you use UPES? • When there is no trend in the data • When you want to use the historical mean as your forecasted y-hat • Test an unknown random variable for its shape • Or use residuals

  6. Univariate Parameter Estimation • Empirical distribution fits your data best because it lets the data define the shape • Prefer to use the EMP with deviations as a percent or fraction from Y-hat • If there is a trend, then account for it with deviations from trend • Else use deviations from mean • EMP allows us to model low probability events • Test with =CDFDEV(original data, sim data)
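The idea of an empirical distribution built from fractional deviations can be sketched as follows. This is a minimal illustration, not Simetar's EMP implementation: the data are made up, the cumulative probabilities are assumed to be equally spaced from 0 to 1 over the sorted deviates, and the function names are hypothetical. Each simulated value is Y-hat * (1 + deviate), with the deviate found by interpolating the sorted deviates at a uniform draw.

```python
import bisect
import random
import statistics

def fractional_deviates(history):
    """Return (y_hat, sorted fractional deviates S_i, cumulative probabilities P(S_i))."""
    y_hat = statistics.fmean(history)
    devs = sorted((x - y_hat) / y_hat for x in history)
    n = len(devs)
    probs = [i / (n - 1) for i in range(n)]  # assumption: equal spacing on [0, 1]
    return y_hat, devs, probs

def simulate_empirical(y_hat, devs, probs, u):
    """Inverse transform: interpolate the sorted deviates at probability u in [0, 1]."""
    j = bisect.bisect_right(probs, u)
    if j >= len(probs):            # u == 1.0 maps to the largest deviate
        dev = devs[-1]
    else:
        lo, hi = j - 1, j
        w = (u - probs[lo]) / (probs[hi] - probs[lo])
        dev = devs[lo] + w * (devs[hi] - devs[lo])
    return y_hat * (1.0 + dev)

# Hypothetical history with no trend, so deviations are taken from the mean.
history = [42.0, 55.0, 48.0, 61.0, 39.0, 52.0, 47.0, 58.0]
y_hat, devs, probs = fractional_deviates(history)

rng = random.Random(31517)
sims = [simulate_empirical(y_hat, devs, probs, rng.random()) for _ in range(500)]
print("simulated min and max:", min(sims), max(sims))
```

Because the deviates come straight from the data, the simulated values stay between the historical minimum and maximum, which is what lets the data define the shape.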

  7. Model Validation • Do the simulated values for the random variables reproduce their parameters? • Does the model accurately forecast the system? • Do the results conform to theoretical expectations? • Do the results conform to expectations of experts? • Turing Test of simulation model results • Show the results to experts, using alternative assumptions about the input values

  8. Four P’s for Validation • Planning – in the initial model preparation mode, the developer should plan how to validate the model • Personal – it’s the developer’s responsibility to verify every equation, coefficient, and random variable, and to check that results are theoretically correct • Peers – use experts in the field to review model results using a Turing Test; use sensitivity testing of the model • Prospective Clients – do the results conform to their expectations? Are the results useful to the client?

  9. Model Verification • Check all equations for arithmetic accuracy • Use Excel’s “Trace Precedents” and “Trace Dependents” functions • Check linkage of variables coming into each equation • Check model in “Expected Value” and “Stochastic” mode • Ensure that the variables in each equation are theoretically correct • Make sure the model contains all of the necessary equations to calculate the KOVs

  10. Model Validation • Use statistical tests of each random variable to ensure that it: • reproduces the historical distribution • reproduces the historical correlation matrix among random variables • Statistical Tests • Student t test • F test • Chi Square test

  11. Statistical Tests for Validation • Test the means of the random variables against their historical values • Statistically equal at the 95% level based on a t-test? • Test the variances against historical values • Statistically equal at the 95% level based on an F-test? • Check the historical vs. simulated coefficient of variation • Needs to be constant over time • Check the minimum and maximum • For a Normal distribution, are they reasonable? Should be: Min ≈ Mean - 3 * StdDev and Max ≈ Mean + 3 * StdDev • For an Empirical distribution, compare the simulated min and max to the values the model “should” simulate: Xmin = Y-hat * (1 + Minimum Fractional Deviate) and Xmax = Y-hat * (1 + Maximum Fractional Deviate) • Check the correlation matrix for the simulated variables vs. the historical correlation matrix using t-tests
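A rough sketch of the mean, variance, and min/max checks, using only the Python standard library. Several things here are assumptions: the `history` values are made up, the helper names are hypothetical, the mean comparison uses the Welch form of the two-sample t statistic, and the critical value is taken from the standard Normal as a large-sample stand-in. Exact Student-t and F critical values would come from tables or a statistics package, so the F ratio is only computed here, not tested.

```python
import random
import statistics
from statistics import NormalDist

def welch_t_statistic(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / se

def variance_ratio(a, b):
    """F ratio of the two sample variances (compare to an F critical value)."""
    return statistics.variance(a) / statistics.variance(b)

def normal_min_max(mean, std_dev):
    """Approximate min and max a Normal should simulate: mean -/+ 3 std devs."""
    return mean - 3.0 * std_dev, mean + 3.0 * std_dev

# Hypothetical history; simulated values are drawn from a Normal
# parameterized with the historical mean and standard deviation.
history = [9.1, 12.4, 7.8, 10.6, 13.2, 8.5, 11.0, 9.9, 10.3, 12.1]
mean_h = statistics.fmean(history)
sd_h = statistics.stdev(history)
rng = random.Random(31517)
simulated = [rng.gauss(mean_h, sd_h) for _ in range(500)]

t_stat = welch_t_statistic(history, simulated)
# Approximation: with only 10 historical points the exact t critical
# value would be somewhat larger than this Normal quantile.
z_crit = NormalDist().inv_cdf(0.975)
means_look_equal = abs(t_stat) < z_crit
f_ratio = variance_ratio(history, simulated)
expected_min, expected_max = normal_min_max(mean_h, sd_h)

print("t statistic:", t_stat, "means look equal:", means_look_equal)
print("F ratio:", f_ratio)
print("expected range:", expected_min, expected_max)
print("simulated range:", min(simulated), max(simulated))
```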

  12. Validation Tests in Simetar • Verification/Validation tests in Simetar • Hypothesis tests icon • Compare Two Series Historical Data vs. Simulated Values • 1st Data Series is history • 2nd Data Series is simulated • Test means and variances for two series, i.e., are they statistically equal • Test works for a pair of variables and for comparing two multivariate distributions (matrices)

  13. Statistical Tests for Validation • Compare Two Series Historical Data vs. Simulated Values • 1st Data Series is history • 2nd Data Series is simulated

  14. Validation Tests in Simetar • Compare mean and standard deviation of simulated data to the user’s specified values • “Data Series” is the simulated values • Type in the mean or cell • Specify the Std Dev as a value or a cell location • The test is used when • Only mean and std dev are known, i.e., there is no history for the variable • Mean is a projected value which is different from the history

  15. Validation Tests in Simetar • Compare mean and standard deviation of simulated data to the user’s specified values • The test is used when only the mean and std dev are known, i.e., there is no history for the variable, or when the mean is a projected value different from history • Note the Given Values are Mean = 10 and Std Dev = 3

  16. Validation Tests in Simetar • Test simulated values for Multivariate Distributions (MVE and MVN) to test if the historical correlation matrix is reproduced in the simulation • Data Series is the simulated values for all random variables in the MV distribution, a matrix of variables in SimData • The original correlation matrix used to simulate the MVE or MVN distribution • OK, if the majority of correlation coefficients are statistically the same as the historical correlation matrix
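To illustrate what "statistically the same" can mean for one correlation coefficient, the sketch below simulates a pair of correlated Normal variables and compares the simulated correlation to the historical one. It is not Simetar's test: it swaps in the Fisher z-transform, a standard large-sample approach, and the historical correlation of 0.6, the sample size, and the function names are all hypothetical. A full check would repeat this for every off-diagonal element of the correlation matrix.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_statistic(r_sim, rho_hist, n):
    """Approximate z statistic for H0: simulated correlation equals rho_hist.

    Valid only for correlations strictly between -1 and 1 and n > 3.
    """
    return (math.atanh(r_sim) - math.atanh(rho_hist)) * math.sqrt(n - 3)

rho_hist = 0.6          # hypothetical historical correlation
n = 500                 # hypothetical number of iterations
rng = random.Random(31517)

# Build two standard Normal series with target correlation rho_hist.
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x1 = z1
x2 = [rho_hist * a + math.sqrt(1.0 - rho_hist ** 2) * b for a, b in zip(z1, z2)]

r_sim = pearson(x1, x2)
z_stat = fisher_z_statistic(r_sim, rho_hist, n)
reproduced = abs(z_stat) < 1.96   # two-sided check at roughly the 95% level
print("simulated correlation:", r_sim, "z:", z_stat, "reproduced:", reproduced)
```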

  17. Charts for Validation • Test simulated values for Multivariate Distributions (MVE and MVN) to test if the historical correlation matrix is reproduced in the simulation

  18. Using Charts for Visual Validation • Use a CDF to compare historical series to simulated series, tests the min and max • Use a PDF to compare historical series to simulated series, tests the shape • Use a Box Plot to compare historical series to simulated series, checks the variability • Use a Probability graph to compare historical series to simulated series, P(x) vs. F(x) • Use a Fan graph to show the range of the risk and level of the mean over time, visual test of CV constant over time

  19. How Simetar Simulates Random Numbers • A pseudo random number generator is used so we can reproduce the simulation results from day to day with the same inputs • Pseudo random number generator uses a seed to start the sampling sequence • The default seed in Simetar is 31517 • Change the seed if you like • If you do not use a pseudo random number generator then every time you simulate the model you get different answers, even if the input has not changed
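The role of the seed can be shown with Python's own pseudo random number generator. This is only an analogy: Python's generator is not Simetar's, so the same seed will not give the same numbers as Simetar, and `draw_uniforms` is a hypothetical helper. The point is that a fixed seed reproduces the same sequence on every run, while a different seed gives a different sequence.

```python
import random

DEFAULT_SEED = 31517  # the default seed quoted above; used here only as an example

def draw_uniforms(n, seed=DEFAULT_SEED):
    """Draw n uniform(0, 1) values from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

run_1 = draw_uniforms(5)
run_2 = draw_uniforms(5)               # same seed, same inputs
run_3 = draw_uniforms(5, seed=12345)   # changed seed

print("same seed reproduces results:", run_1 == run_2)
print("different seed changes results:", run_1 != run_3)
```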

  20. Latin Hyper Cube vs. Monte Carlo Simulated Numbers • Monte Carlo simulation procedure samples randomly from the full range of the possible values for a random variable • Requires large number of iterations for adequate coverage over possible range of a variable • For small number of iterations does not sample adequately

  21. Latin Hyper Cube vs. Monte Carlo Simulated Numbers • Latin Hyper Cube systematically samples all segments of the distribution for a random variable • If 100 iterations are to be simulated, LHC samples one value randomly from each of 100 intervals of equal length on the 0 to 1 USD (uniform standard deviate) scale • Ensures all segments of the distribution are sampled, even with a small number of iterations • With LHC, you get “adequate” sampling coverage of a distribution with fewer iterations

  22. Latin Hyper Cube vs. Monte Carlo • The CDF of a Uniform distribution defined as U(0,1) is a straight line at a 45° angle out of the origin • A perfect sample would lie on the straight line • Use the following USDs • Excel’s =RAND() • Simetar’s =UNIFORM() • Simulate these two USDs • Draw a CDF with the two random variables. Which one lies on the straight line between 0 and 1? • [Figure: CDF axes, F(x) from 0.0 to 1.0 against X from 0.0 to 1.0]
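The comparison can be tried outside Excel with a small sketch. The assumptions are: Python's generator stands in for both =RAND() and =UNIFORM(), the function names are hypothetical, and the Latin Hyper Cube draw is the basic one-dimensional version (one random value from each of n equal-length intervals on 0 to 1, then shuffled). The distance from the straight line is measured as the largest vertical gap between the sample's CDF and the U(0,1) line.

```python
import random

def monte_carlo_usd(n, rng):
    """Monte Carlo: n independent draws from the full 0 to 1 range."""
    return [rng.random() for _ in range(n)]

def latin_hypercube_usd(n, rng):
    """Latin Hyper Cube: one random draw from each of n equal-length intervals."""
    draws = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(draws)  # randomize the order in which intervals are used
    return draws

def max_cdf_gap(sample):
    """Largest vertical distance between the sample CDF and the U(0,1) line."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

n = 100
rng = random.Random(31517)
mc = monte_carlo_usd(n, rng)
lhc = latin_hypercube_usd(n, rng)

print("Monte Carlo max gap from the line:", max_cdf_gap(mc))
print("Latin Hyper Cube max gap from the line:", max_cdf_gap(lhc))
```

By construction the Latin Hyper Cube gap cannot exceed 1/n, because every interval of length 1/n contains exactly one draw; the Monte Carlo gap has no such bound at a fixed number of iterations.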

  23. Example of Latin Hyper Cube vs. Monte Carlo Simulation of USD
