
SENSIBLE VALIDATION FOR IW SIMULATIONS



Presentation Transcript


  1. SENSIBLE VALIDATION FOR IW SIMULATIONS ...with special attention paid to models displaying emergent behavior, and those used to support analysis. Michael P. Bailey Operations Analysis Division, USMC

  2. “All models are wrong, some are useful.” George Box Wartime Statistician

  3. WORKABLE DEFINITIONS • Conceptual Model – Description of the system in some abstracted/symbolic formalism, usually mathematics. • Verification – The simulation executable faithfully reflects the Conceptual Model • Validation – The degree to which the system described in the conceptual model is appropriate in supporting the intended use. • Accreditation – The judgement, made by someone responsible for the outcome, that the simulation is adequately verified and valid for the intended use.

  4. COMMENTS • Conceptual models are always incomplete. • Verification of a simulation is a scientific endeavor if the conceptual model is complete. • A simulation is never “Valid.” • Analytical intended uses are difficult to deal with… • Repetition is very rare. • Analysts have no way to scientifically express the intended use.

  5. IDEALIZED DEVELOPMENT PROCESS [diagram: natural system, conceptual model, executable code, ideal sim; transitions labeled "formal transitions T(M)" and "implementation and design"]

  6. IDEALIZED DEVELOPMENT PROCESS [same diagram, with its edges labeled: abstraction, modeling, mapping to sim design pattern, software design, coding and testing]

  7. IDEALIZED DEVELOPMENT PROCESS [same diagram; annotation on the abstraction step: "Driven by analytic task. More later..."]

  8. REALITY FOR BIG-IRON SIMS [diagram: natural system, executable code, ideal sim; edges labeled abstraction and data development; no explicit conceptual model]

  9. [idealized development process diagram: natural system, conceptual model, executable code, ideal sim, formal transitions T(M), implementation and design; annotations: "FOR A GIVEN ANALYSIS" near the ideal sim, "FOCUS FOR ANOTHER DAY" near the natural system]

  10. ANALYSIS • Predict the response (absolute) • Predict the response (as compared to a baseline) • Predict the functional form of the response for a set of independent variables • Predict the sign of the gradient (set of 1st derivatives) • Is there any response? • Predict the min/max of the response over a high-dimensional domain • Predict xi in [Li, Ui] such that response > c • Characterize the probabilistic nature of the response
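As an illustration of the second question type (predicting the response relative to a baseline), here is a minimal sketch using only the Python standard library. The function name and the normal-approximation confidence interval are my own assumptions for illustration, not part of the original briefing.

```python
import statistics

def baseline_excursion_delta(baseline_reps, excursion_reps):
    """Estimate the mean response difference between an excursion and a
    baseline across independent replications, with a rough ~95%
    normal-approximation interval (a sketch, not a full experimental design)."""
    d = statistics.mean(excursion_reps) - statistics.mean(baseline_reps)
    # standard error of the difference of two independent sample means
    se = (statistics.variance(baseline_reps) / len(baseline_reps)
          + statistics.variance(excursion_reps) / len(excursion_reps)) ** 0.5
    return d, (d - 1.96 * se, d + 1.96 * se)
```

If the interval excludes zero, the excursion's response is distinguishable from the baseline at roughly the 95% level; otherwise more replications are needed before answering the analysis question.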

  11. ISAAC: A Screenshot. ISAAC is an agent-based simulation of ground combat, developed by CNA for Project Albert.

  12. DISCUSSION EXAMPLE: ISAAC • Sensor range – maximum range at which the ISAACA can sense other ISAACAs • Fire range – area within which the ISAACA can engage enemy ISAACAs in combat (also allows for fratricide) • Threshold range – area within which an ISAACA computes the numbers of friendly and enemy ISAACAs that play a role in determining what move to make on a given time step • Movement Range – area defining where an ISAACA can select a move from on a given time step (1 or 2) • Communications Range – boxed area within which friendly ISAACAs can communicate the information of their local sensor fields to the ISAACA
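A minimal sketch of how these range parameters might be represented. All names are illustrative, not ISAAC's actual source; the slide specifies only that communications uses a boxed area, so the Euclidean check for sensing is my assumption.

```python
import math
from dataclasses import dataclass

@dataclass
class IsaacaRanges:
    """Illustrative range parameters for a single ISAACA."""
    sensor: float     # max range at which other ISAACAs are sensed
    fire: float       # range within which enemy ISAACAs can be engaged
    threshold: float  # range used when counting friends/enemies for move logic
    movement: int     # squares reachable on one time step (1 or 2)
    comms: float      # half-width of the boxed communications area

def in_sensor_range(ranges, dx, dy):
    """Euclidean check for sensing (an assumption, not ISAAC's documented rule)."""
    return math.hypot(dx, dy) <= ranges.sensor

def in_comms_range(ranges, dx, dy):
    """Communications uses a boxed (Chebyshev) area per the slide."""
    return max(abs(dx), abs(dy)) <= ranges.comms
```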

  13. Move selection For every possible move an ISAACA can make, a penalty is calculated using the personality weight vector. ISAACA moves to the square that best satisfies personality-driven “desire” to get closer to (or farther away from) friendly and enemy ISAACAs and enemy (or own) goal.
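The move-selection step above can be sketched as a penalty minimization. This is a toy rendering: the weight-vector layout, the Manhattan distance, and the tie-breaking are my assumptions, not ISAAC's exact formulation.

```python
import itertools

def select_move(pos, w, friends, enemies, own_goal, enemy_goal, movement_range=1):
    """Pick the reachable square minimizing a personality-weighted penalty.
    w = (w_friend, w_enemy, w_own_goal, w_enemy_goal); a positive weight
    penalizes distance, i.e. attracts the ISAACA toward that entity."""
    def dist(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])  # grid (Manhattan) distance

    def penalty(square):
        return (w[0] * sum(dist(square, f) for f in friends)
                + w[1] * sum(dist(square, e) for e in enemies)
                + w[2] * dist(square, own_goal)
                + w[3] * dist(square, enemy_goal))

    r = movement_range
    candidates = [(pos[0] + dx, pos[1] + dy)
                  for dx, dy in itertools.product(range(-r, r + 1), repeat=2)]
    return min(candidates, key=penalty)
```

For example, an ISAACA whose only nonzero weight is attraction to the enemy goal steps directly toward it.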

  14. ISAACA Combat • Each ISAACA given opportunity to fire at all enemy ISAACAs within range, unless constraint in place* • If an ISAACA is shot, its current state is degraded from alive to injured or from injured to dead • Probability that a given enemy ISAACA is shot is fixed by user specified single-shot probabilities • Probability for an injured ISAACA is equal to one half of its single-shot probability when it is alive
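The state-degradation rule on this slide can be sketched as follows; the function and state names are illustrative, and the injectable random source is just a convenience for testing.

```python
import random

ALIVE, INJURED, DEAD = "alive", "injured", "dead"

def resolve_shot(state, p_single_shot, rng=random):
    """Resolve one shot at a target ISAACA. Per the slide, a hit degrades
    alive -> injured -> dead, and an injured target is hit with half the
    single-shot probability it had when alive."""
    if state == DEAD:
        return DEAD
    p = p_single_shot if state == ALIVE else p_single_shot / 2.0
    if rng.random() < p:
        return INJURED if state == ALIVE else DEAD
    return state
```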

  15. [screenshot: s1] Courtesy of Project Albert

  16. [screenshot: s1] Courtesy of Project Albert

  17. [screenshot: s1] Courtesy of Project Albert

  18. [screenshot: s2] Courtesy of Project Albert

  19. SOME TERMS WE USE • Distillation: Representing a physical process using a simple, controllable model • e.g. Range rings for radar detections, the ISAACA movement (direction and speed) model • Emergent Behavior: Group behavior of entities that appears to be organized, but is generated from distributed individual logic • e.g. Encirclement

  20. [idealized development process diagram: natural system, conceptual model, executable code, ideal sim, formal transitions T(M), implementation and design] • Simulations displaying emergent behavior are difficult to validate because it is difficult to predict their behavior from the Conceptual Model • Therefore there is greater pressure to use results validation.

  21. "The more complex the model, the harder it is to distinguish unusual emergent behavior from programming bugs." Douglas Samuelson Renowned Operations Research Analyst

  22. TECHNICAL SENSITIVITIES • Activation pattern • Simulation language • More? The response should NOT be sensitive to these.
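One way to check the activation-pattern sensitivity this slide warns about: re-run a toy model under shuffled agent activation orders and look at the spread of the summary response. Everything here (function names, the toy step rules in the usage note) is illustrative, not part of the briefing.

```python
import random

def run_once(step_fn, agents, order, steps=10):
    """Apply step_fn sequentially to each agent in the given activation order."""
    agents = list(agents)  # work on a copy so trials are independent
    for _ in range(steps):
        for i in order:
            agents[i] = step_fn(agents, i)
    return sum(agents)

def activation_spread(step_fn, agents, trials=20, seed=1):
    """Re-run under randomly shuffled activation orders and return the spread
    (max - min) of the response; a nonzero spread flags order sensitivity."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        order = list(range(len(agents)))
        rng.shuffle(order)
        results.append(run_once(step_fn, agents, order))
    return max(results) - min(results)
```

A rule where each agent updates only from its own state (e.g. `lambda a, i: a[i] + 1`) yields zero spread, while a rule that reads global state as it is being updated (e.g. `lambda a, i: sum(a)`) does not; per the slide, a validated response should behave like the former.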

  23. Schism • Agent-based simulations use modular rules and local reasoning to produce realistic and/or interesting emergent aggregate behavior. • Surprise is good** • Successful simulation testing (core to face/results validation) based on demonstrating credibility across the range of potential input. • Surprise not good** ** Refined later in this talk

  24. GOAL: STOP BEING SURPRISED [diagram: Surprise, Explore, Explain, Accept/reject, Production Runs; goal state: in control, no more surprises; question: how do we tell about this experience?] • “Unnatural acts” reflect negatively on a sim • Once we achieve top-down control, is there still emergent behavior?

  25. TERMS • Spreadsheet model • Object-oriented • Agent-based These are terms pertaining to software design, and are irrelevant to the utility of the resulting software in supporting analysis.

  26. SIMULATION-SUPPORTED ANALYSIS • Baseline/Excursion or Factorial Experiment • Driven to answer Analysis Questions • Key Elements of Analysis • Constraints, Limitations, and Assumptions

  27. ELEMENTS: RELEVANT DYNAMICS + REQUIRED DATA = ELEMENT • SIMULATION DYNAMICS (GOODNESS scale): based on accepted physical laws; based on accepted social dynamics; based on common sense; distillation; simple model relic required to facilitate actions; simple model relic required to maintain consistency; top-down human intervention • DATA (GOODNESS scale): authoritative value; measured; witnessed; argued by logic; sensible range; guess/arbitrary; dimensionless • e.g. underwater detection using observed detection range data in a cookie-cutter model

  28. RELEVANT ELEMENTS • CLA: Constraints, Limitations, and Assumptions necessary to give context, scope the analysis, and interpret the results [1] • NECESSARY ELEMENTS: necessary to support the model, give the analysis context • INTERESTING ELEMENTS: have impact on the results, but are not elements of analysis • CORE ELEMENTS: drive the results of your experiment, align with the key elements of analysis [1] An excellent FOUO document exists on this subject. Seeking unlimited release. Watch orsagouge.pbwiki.com.

  29. CORE • Drive the results, dominate the effect of other elements • Pertain directly to the goals of the analysis • Are the key elements in the experiment

  30. INTERESTING [diagram: nested rings, core inside interesting] • Influential elements not directly connected to goals of the analysis • Sources of variability in context • May comprise cases within a baseline or excursion

  31. NECESSARY [diagram: nested rings, core inside interesting inside necessary] • Required to facilitate the simulation • Can provide a variety of contexts weighted by their plausible frequency • Believed not influential, often fixed • Give the outcome meaning, interpretability

  32. CLA [diagram: nested rings, core inside interesting inside necessary, all within constraints, limitations, and assumptions] • Important context, scoping, or caveats that must accompany the analysis results • Believed to be valid for the purposes of the analysis • Knowledge gained can be extrapolated beyond the CLA with caution • Stated a priori, and agreed to by the analysis sponsor prior to commencement

  33. EXAMPLE • ANALYSIS GOAL: Determine the relative effectiveness of USN fleet sizes for aviation-capable amphibious ships • MOEs • JFEO – time to deliver Assault Echelon • SASO – % deck space and aircraft available for non-routine tasking • CORE: number of ships in the fleet (implied – closure times and embarked aircraft) • INTERESTING: MAGTF ACE composition, temperature, posture of units ashore • NECESSARY: terrain at LZs, order of lift, serial loading, tasking • CLA: scenario, Joint TACAIR support

  34. IMPACT ON ANALYSIS • Agent-based design is reputed to enable fast and easy construction of interesting and necessary elements • Simulation environments usually provide pre-programmed interesting and necessary elements, and plug-in capability for core elements • Necessary and Interesting elements can display emergent behavior to add variability • Emergent behavior is often not predictable/controllable • Big-iron simulations often have parametric (knob) control over necessary elements • impossible to promote these to interesting or core elements • should NOT be elements of analysis • Ideally, analysts should have the most faith in their core elements • should have high-quality data (high on the GOODNESS scale) • should have well-studied dynamics (high on the GOODNESS scale) • must not display emergent behavior • Core elements should be results-proven to be highly influential (see Scientific Method of Choosing Model Fidelity) • Limitations on the Core = Limitations of the simulation for analytical purposes • Core and Interesting elements should be results-proven to be consistent with SME judgment (explainable 1st derivative)

  35. NEGATIVE INFORMATION for IN-VALIDATION • Elements not data-driven • Elements not controllable • Element displays undesired emergent behavior • Element displays unexplainable 1st-order influence (results schism unexplainable) • Element not in the anticipated layer • level of influence is more/less than anticipated by the analyst • dynamics or data are... • too low on the GOODNESS scale • mismatched vis-à-vis the GOODNESS scale

  36. NEGATIVE INFORMATION = IN-VALIDATION? [diagram: spectrum from “no concern” to “show stopper”] • Negative information about an element may or may not be a show-stopper, depending on: • the influence (Core-ness) of the element vis-à-vis the analysis • the harshness of the negative information

  37. “Computer programs should be verified, Models should be validated, and Analysts should be accredited.” Alfred G. Brandstein Renowned Military Operations Research Analyst Founder of Project Albert

  38. THE ANALYST • Prior to any experience with the simulation, can the Analyst... • Pose analytic questions mathematically? • Describe the experiment? • Identify core & interesting elements? • Specify CLA elements? • Evaluate core elements on the GOODNESS scale? • Disclose all outcomes the analyst anticipates matching with the simulation (Test Cases)? • Once experience has been gained, can the Analyst... • Explain changes to anticipated core/interesting/necessary classification? • Describe all experimentation on distilled elements? • Describe all testing and tuning required? • Quantify the level of influence of each core & interesting element statistically? • Explain the impact of each CLA on the results? • Statistically determine the level of agreement of the simulation outcomes with the Test Cases? • The resulting analysis should be peer-reviewed

  39. Bottom line -- if the analyst cannot describe his application and predict certain things, the model/application match is doomed from the start.
