Analyzing the Results of a Simulation and Estimating Errors


  1. Analyzing the Results of a Simulation and Estimating Errors
     Jason Cooper

  2. Types of Error
     • Big and obvious errors
     • Systematic error
     • Statistical (random) error

  3. Big, Obvious Errors
     • Arise from gross error, often in the particle configuration.
     • Examine intermediate conformations (MD or MC) for obvious problems, regardless of the focus of the study.
     • Conformations typically stored every 5-25 steps.

  4. Systematic Error: Characterization
     • Results in a constant bias or skew from the expected result.
     [Figure panels: Expected distribution, Biased distribution, Skewed distribution]

  5. Systematic Error: Characterization
     • Calculated values for simple thermodynamic properties should be normally distributed:
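
For reference, the normal (Gaussian) density in its standard form is written below in LaTeX. The symbols are assumptions chosen for this sketch and may differ from the slide's own notation: A is the calculated property, ⟨A⟩ its mean, and σ² its variance.

```latex
p(A) = \frac{1}{\sigma \sqrt{2\pi}}
       \exp\!\left[ -\frac{\left( A - \langle A \rangle \right)^2}{2\sigma^2} \right]
```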

  6. Systematic Error: Characterization
     1. Sort data into bins of approximately equal number. Expected number is given by:
     2. Calculate chi-squared statistic:
        (χ² > 1 indicates a poor match)
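
The binning and chi-squared steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the presenter's code: the function name `chi_squared_normality`, the default of ten equal-probability bins, and the reduced statistic (chi-squared divided by an assumed `n_bins - 3` degrees of freedom) are all choices made for the example.

```python
import bisect
from statistics import NormalDist, fmean, stdev


def chi_squared_normality(values, n_bins=10):
    """Return (chi2, chi2 / dof) for `values` against a fitted normal.

    Hypothetical helper: bins are chosen so that each one holds the same
    expected number of values under the fitted normal distribution.
    """
    fitted = NormalDist(fmean(values), stdev(values))
    # n_bins - 1 cut points that split the fitted normal into
    # intervals of equal probability.
    edges = fitted.quantiles(n=n_bins)
    observed = [0] * n_bins
    for v in values:
        observed[bisect.bisect_right(edges, v)] += 1
    expected = len(values) / n_bins  # same expected count in every bin
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    # Assumed degrees of freedom: bins - 1, minus the two fitted
    # parameters (mean and standard deviation).
    dof = n_bins - 3
    return chi2, chi2 / dof
```

With normally distributed input the reduced value should sit near 1, while a biased or skewed sample pushes it well above 1.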

  7. Systematic Error: Sources
     • Four main sources of systematic error:
       • The model (limitations of the basis set, functional, etc.)
       • The algorithms used (drift in Euler integration of a DE)
       • Numerical precision (round-off and quantization error)
       • Implementation (programming error)
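
The algorithmic drift mentioned for Euler integration can be illustrated with a small Python sketch. The test system (a unit-mass, unit-frequency harmonic oscillator) and the comparison against velocity Verlet are assumptions added for illustration; neither appears in the slides.

```python
def energy(x, v):
    """Total energy of a harmonic oscillator with m = 1 and omega = 1."""
    return 0.5 * (x * x + v * v)


def euler(x, v, dt, n_steps):
    """Explicit Euler integration of x' = v, v' = -x."""
    for _ in range(n_steps):
        # Both updates use the old values of x and v.
        x, v = x + v * dt, v - x * dt
    return x, v


def velocity_verlet(x, v, dt, n_steps):
    """Velocity Verlet integration of the same system, for comparison."""
    for _ in range(n_steps):
        v_half = v - 0.5 * dt * x      # half-step velocity, force = -x
        x = x + dt * v_half            # full-step position
        v = v_half - 0.5 * dt * x      # second half-step with new force
    return x, v


if __name__ == "__main__":
    dt, n_steps = 0.01, 10_000
    print("initial energy:", energy(1.0, 0.0))
    print("Euler energy:  ", energy(*euler(1.0, 0.0, dt, n_steps)))
    print("Verlet energy: ", energy(*velocity_verlet(1.0, 0.0, dt, n_steps)))
```

For this particular oscillator each explicit Euler step multiplies the energy by exactly `1 + dt**2`, so the error is a steady upward drift and not random scatter, which is what makes it a systematic error.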

  8. Systematic Error: The Fix
     • Systematic errors are most easily isolated when several algorithms are applied:
       • to several different chemical systems,
       • on several different computers,
       • using several different compilers,
       • etc.

  9. Statistical Error: Characterization
     • Characteristic normal distribution of values about the set average:
     • M is the number of independent data values
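
A relation commonly used at this point is the standard error of the mean for M independent values, written below in LaTeX. It is offered as a likely reading, not a reproduction of the slide's equation; σ(A) for the standard deviation of the individual values and σ(⟨A⟩) for that of the average are assumed symbols.

```latex
\sigma\!\left( \langle A \rangle \right) = \frac{\sigma(A)}{\sqrt{M}}
```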

  10. Statistical Error: Relaxation Time and Statistical Inefficiency
      • Successive data values are strongly correlated and therefore not independent.
      • To find the effective M, we need to know the statistical inefficiency of the system.

  11. Statistical Error: Relaxation Time and Statistical Inefficiency
      • We begin by dividing our M sequential configurations into b blocks, each containing n_b values of the property A:

  12. Statistical Error: Relaxation Time and Statistical Inefficiency
      • The variance of the block averages is then given by:
      • Where A_i is the average for the i-th block and A_total is the average calculated only over those values covered in the blocks.
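
One common way to write the block averages and their variance is sketched below in LaTeX. The notation is an assumption and may not match the slide's own: ⟨A⟩_i stands for the block average the slide calls A_i, ⟨A⟩_total for A_total, A_j for the individual sampled values, and the variance is normalized by the number of blocks b.

```latex
\langle A \rangle_i = \frac{1}{n_b} \sum_{j=(i-1)n_b+1}^{i\,n_b} A_j ,
\qquad
\sigma^2\!\left( \langle A \rangle_b \right)
  = \frac{1}{b} \sum_{i=1}^{b}
    \left( \langle A \rangle_i - \langle A \rangle_{\mathrm{total}} \right)^2
```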

  13. Statistical Error: Relaxation Time and Statistical Inefficiency
      • For large n_b, the A_i become uncorrelated and:
      • Next, define the statistical inefficiency s: and, finally... so that
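
A standard form of the large-block argument is written below in LaTeX. It is a sketch in assumed notation (σ²(A) for the variance of the individual values, σ²(⟨A⟩_b) for the variance of the block averages, ⟨A⟩_total for the average over all M values) and may not follow the slide's equations line for line.

```latex
% Uncorrelated block averages: their variance falls off as 1/n_b
\sigma^2\!\left( \langle A \rangle_b \right) \propto \frac{1}{n_b}
\quad (n_b \to \infty)

% Statistical inefficiency as the limiting ratio
s = \lim_{n_b \to \infty}
    \frac{n_b \, \sigma^2\!\left( \langle A \rangle_b \right)}{\sigma^2(A)}

% Resulting variance of the average over all M = b n_b values
\sigma^2\!\left( \langle A \rangle_{\mathrm{total}} \right)
  \approx \frac{s \, \sigma^2(A)}{M}
```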

  14. Statistical Error: Relaxation Time and Statistical Inefficiency
      • We solve for s:
      • Where s can be interpreted in two ways:
        • The factor by which the variance exceeds a naïve estimate (statistical inefficiency); or
        • The number of steps per block required to give uncorrelated block averages (relaxation time).

  15. Statistical Error: Relaxation Time and Statistical Inefficiency
      • In practice, s is calculated from a plot similar to the following:
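
The data behind such a plot can be generated with a short Python sketch: estimate s for a range of block sizes and look for the plateau. Everything here is an assumption made for illustration, including the function names, the estimator `n_b * var(block averages) / var(values)`, and the synthetic AR(1) series used in place of real simulation output.

```python
import random
from statistics import fmean


def variance(xs):
    """Population variance of a sequence of floats."""
    mean = fmean(xs)
    return fmean([(x - mean) ** 2 for x in xs])


def ar1_series(n, phi, seed=0):
    """Synthetic correlated data: x[t+1] = phi * x[t] + Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def statistical_inefficiency(values, block_size):
    """Estimate s at one block size as n_b * var(block means) / var(values)."""
    n_blocks = len(values) // block_size
    # Only values covered by complete blocks enter either variance.
    used = values[: n_blocks * block_size]
    block_means = [
        fmean(used[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return block_size * variance(block_means) / variance(used)


if __name__ == "__main__":
    series = ar1_series(100_000, phi=0.9)
    for n_b in (1, 5, 20, 100, 500, 2000):
        print(n_b, round(statistical_inefficiency(series, n_b), 2))
```

For an AR(1) series with coefficient `phi` the limiting value is `(1 + phi) / (1 - phi)`, which is 19 for `phi = 0.9`, so the printed estimates should rise with block size and level off near that value. At very large block sizes only a few blocks remain and the estimate becomes noisy.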

  16. Statistical Error: Relaxation Time and Statistical Inefficiency
      • Care must be taken to avoid boundary effects:

  17. Statistical Error: Application of Statistical Inefficiency: Sampling
      • Simulation is divided into blocks of size n_b ≥ s
      • Blocks may be sampled in one of three ways:
        • Stratified systematic sampling
        • Stratified random sampling
        • Coarse graining
      • Coarse graining is most commonly applied for scalar properties; sampling is applied otherwise.
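
The three schemes can be sketched in Python as follows. The slides do not define them, so the readings used here are assumptions based on the usual meaning of the terms: systematic sampling takes the value at a fixed position in every block, random sampling takes one randomly chosen value per block, and coarse graining replaces each block by its average. All function names are hypothetical.

```python
import random
from statistics import fmean


def blocks(values, block_size):
    """Split `values` into complete blocks, dropping any leftover tail."""
    n_blocks = len(values) // block_size
    return [values[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]


def stratified_systematic(values, block_size, offset=0):
    """Take the value at the same position `offset` in every block."""
    return [block[offset] for block in blocks(values, block_size)]


def stratified_random(values, block_size, seed=0):
    """Take one randomly chosen value from every block."""
    rng = random.Random(seed)
    return [rng.choice(block) for block in blocks(values, block_size)]


def coarse_grain(values, block_size):
    """Replace every block by the average of its values."""
    return [fmean(block) for block in blocks(values, block_size)]


if __name__ == "__main__":
    data = [float(i) for i in range(12)]
    print(stratified_systematic(data, 4))
    print(stratified_random(data, 4))
    print(coarse_grain(data, 4))
```

Coarse graining only makes sense when the block average is itself meaningful, as it is for a scalar property; the two sampling schemes keep individual configurations intact.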

  18. Statistical Error: Sources
      • Arises from the finite nature of the simulation:
        • Finite number of atoms or molecules considered
        • Finite number of sequential values taken
        • Finite precision retained in intermediate values

  19. Statistical Error: The Fix
      • Three main approaches:
        • Increase the number of atoms or molecules considered in the simulation;
        • Increase the duration of the simulation (number of samples taken); or
        • Reduce the statistical inefficiency of the algorithms used.
