Verification and Validation as Applied Epistemology
Or, How I Learned to Stop Worrying and Love [the DOE’s approach to verifying and validating models of] The Bomb (SAND 2007 2628C)
Laura A. McNamara, Exploratory Simulation Technologies
Timothy G. Trucano, Optimization and Uncertainty Quantification
George Backus, Exploratory Simulation Technologies
Sandia National Laboratories
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
The Pitch
• Since 1998, the Department of Energy/NNSA National Laboratories have invested millions in strategies for assessing the credibility of computational science and engineering (CSE) models used in high-consequence decision making.
• The answer? There is no answer. There’s a process - and a lot of politics.
• The importance of model evaluation (verification, validation, uncertainty quantification, and assessment) increases in direct proportion to the significance of the model as input to a decision.
• There are clear limitations on what models can do.
• Other fields, including computational social science, can learn from the experience of the national laboratories.
• Some implications for evaluating ‘low cognition agents’
STARRING…
• Computational physicists and engineers at LANL, Sandia, and LLNL
• Advanced Simulation and Computing (née ASCI) V&V Program; also QMU
  • LANL: Hemez, Rider, Brock, Kamm, Doebling, Lucero; V&V and UQ program teams
  • Sandia: Peery, Trucano, Oberkampf, Pilch; V&V and UQ/QMU program team
  • LLNL: Logan, Nitta; V&V and UQ program teams
• Predictive Science Panel (DOE expert advisory panel)
• Probability/information/decision theorists in academia
The purpose of computing is not insight.
• The purpose of computing is to provide “high-performance, full-system, high-fidelity-physics predictive codes to support weapon assessments, renewal process analyses, accident analyses, and certification.” (DOE/DP-99-000010592)
What are we talking about?
• Verification
  • The process of determining that a computational software implementation correctly represents a model of a physical process
  • The process of determining that the equations are solved correctly
• Validation
  • The process of assessing the degree to which a computer model is an accurate representation of the real world from the perspective of the model’s intended applications
  • The process of determining that we are using the correct equations
*Pilch, Trucano, Moya, Froehlich, Hodges, Peercy 2001
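The distinction between the two definitions can be made concrete with a toy problem. The sketch below is illustrative only and is not taken from the Sandia guidelines: it assumes a simple decay model dy/dt = -k*y, a forward-Euler solver, and a small set of made-up measurements with assumed uncertainties. Verification asks whether the code solves the equation correctly (here, by checking the observed order of convergence against the known exact solution). Validation asks whether the equation is the right one for the intended use (here, by comparing predictions to data in units of measurement uncertainty). All function names and numbers are hypothetical.

```python
import math


def euler_decay(k, y0, t_end, n_steps):
    """Forward-Euler solution of dy/dt = -k*y, evaluated at t_end."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y += dt * (-k * y)
    return y


def observed_order(k=1.0, y0=1.0, t_end=1.0, n_steps=100):
    """Verification: estimate the convergence order from the errors at dt and dt/2.

    The exact solution y0*exp(-k*t) is known, so the error can be measured
    directly. Forward Euler is first order, so the result should be near 1.
    """
    exact = y0 * math.exp(-k * t_end)
    err_coarse = abs(euler_decay(k, y0, t_end, n_steps) - exact)
    err_fine = abs(euler_decay(k, y0, t_end, 2 * n_steps) - exact)
    return math.log2(err_coarse / err_fine)


def validation_discrepancies(k, y0, measurements, n_steps=1000):
    """Validation: model-vs-data mismatch, expressed in units of sigma."""
    return [
        abs(euler_decay(k, y0, t, n_steps) - value) / sigma
        for t, value, sigma in measurements
    ]


# Hypothetical (time, measured value, one-sigma uncertainty) triples,
# invented purely for illustration.
MEASUREMENTS = [(0.5, 0.61, 0.02), (1.0, 0.36, 0.02), (2.0, 0.15, 0.02)]

if __name__ == "__main__":
    print("observed order of convergence:", observed_order())
    print("discrepancies (sigma):", validation_discrepancies(1.0, 1.0, MEASUREMENTS))
```

Note that the two checks can fail independently: a solver can converge at the expected rate to the solution of the wrong equation, and a buggy solver can happen to match a handful of data points.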
“If we regard theories as descriptions… of reality produced by the human imagination, it is clear that there must be some account of the constraints upon that imagination, for the human imaginative faculty is well-known for its capacity to generate mere fantasy: and yet, it is plain that the conceptions of reality which scientists have drawn upon from time to time are not fantasies, though at the end some have been abandoned as unrealistic.”
V&V is a methodology of constraint
Putting Epistemology into Practice
[Figure: the Sargent model. Nodes: ‘Reality’, Conceptual Model, Computer Simulation. Activities: observation and analysis, hypothesis formulation, data collection; implementation of conceptual model in code; experimental design, simulation predictions. Checks: Conceptual Validation, Code Verification, Simulation Validation. Annotations: “Are data valid?”, “Mathematics works right?”]
The Sargent model, from Ang, Trucano, Luginbuhl 1998
An ideal world, a dysfunctional family, a dramatic tension-filled triangle … and a car wreck.
The Ideal (V&V/UQ) World
[Figure: flow diagram of the V&V/UQ process. Numbered boxes, as far as they can be recovered: 1 DP Application; 2 Planning; 3 Code Verification and Calculation Verification; 4 Validation Experiments (Experiment Design, Execution & Analysis); 5 Metrics; 6 Assessment; 7 Prediction & Credibility; 8 Document. Stage labels: Requirements and planning, Verification, Validation Metrics, Credibility, Permanence.]
The Dysfunctional Family
The bigger the modeling and simulation effort, the more complicated the distribution of expertise
• Where does V&V reside? Who owns V&V methodologies, and who champions them?
• Who decides when enough is enough?
• If prediction is the goal, and V&V and UQ are necessary for establishing prediction… then does that require a focus on V&V and UQ?
• What does this mean for data collection? Who pays for it?
• V&V means ongoing negotiation of investments and sufficiency, within the organization and with decision makers
• V&V and UQ as boundary work within the organization
• V&V and UQ as communicative vehicles to demonstrate credibility by delineating ‘how we know what we know.’
Harder… Slower… More Expensive
The Tragic Tension: I only get two?
• Fidelity to Data
• Confidence in predictions (“looseness”)
• Robustness to uncertainty
Hemez, F. 2004. “The Myth of Science Based Predictive Modeling”. Los Alamos, NM: LANL LAUR-04-6829
Putting Epistemology into Practice
[Figure, repeated: the Sargent model linking ‘Reality’, Conceptual Model, and Computer Model/Simulation through Conceptual Validation, Code Verification, and Simulation Validation, with “Are data valid?” as an annotation.]
Ang, Trucano, Luginbuhl 1998
REFERENCES
• Ang, J., Trucano, T., Luginbuhl, D. 1998. Confidence in ASCI Scientific Simulations. Albuquerque, NM: Sandia National Laboratories. SAND 98-1525c.
• Axelrod, R. 2003. Advancing the Art of Simulation in the Social Sciences. Japanese Journal for Information Management Systems 12(3).
• Goldstein, H. 2006. Modeling terrorists: New simulators could help intelligence analysts think like the enemy. IEEE Spectrum September: 34-43.
• Hemez, F. 2004. “The Myth of Science Based Predictive Modeling”. Los Alamos, NM: LANL LAUR-04-6829.
• Harre, H.R. 2003. Modeling: Gateway to the Unknown. Amsterdam, NL: Elsevier Press.
• Marks, Robert E. 2003. ‘Coffee, Segregation, Energy and the Law: Validating Simulation Models.’ GET FULL CITATION
• McNamara, L. 2005. “Where are the anthropologists?” Anthropology News.
• McNamara, Laura and Trucano, Timothy. 2004. So Why DO You Trust That Model? Some Thoughts on Modeling, Simulation, Social Science and Decision Making. Albuquerque, NM: SAND
• McNamara, Laura and Trucano, Timothy. 2006. Modeling and Simulation for National Security Decision Making: Notes Towards a Practical Epistemology Scientific Computing (And What That Means for Intelligence). Albuquerque, NM: Sandia National Laboratories. SAND 2006-6340c.
• Trucano, T., Garasi, C., Mehlhorn, T. 2005. ALEGRA-HEDP Validation Strategy. Albuquerque, NM: Sandia National Laboratories (SAND 2005-6890).
• Oberkampf, W.L., Trucano, T. 2007. Verification and Validation Benchmarks. Albuquerque, NM: Sandia National Laboratories (SAND 2007-0853).
• Pilch, M., Trucano, T., Moya, J., Froehlich, G., Hodges, A., Peercy, D. 2000. “Guidelines for Sandia ASCI Verification and Validation Plans – Content and Format: Version 2.0.” Albuquerque, NM: Sandia National Laboratories, SAND 2000-3101.
• Smith, T. J. 2007. “Predictive Network Centric Intelligence: Toward a Total Systems Transformation of Analysis and Assessment.” Washington, DC: Director of National Intelligence.
CSS vs CSE
CSE:
• High-consensus ‘laws,’ rules, theories exist
• Implemented mathematically with (relative) ease
• Theory is explanatory and predictive
CSS:
• Multiple theories explain the same set of phenomena
• Theories expressed in narrative form
• Theories are explanatory and descriptive
So where is the computational social science community?
• Axelrod: Does the program correctly implement the model? (internal validity)
• Carley: Processes and techniques for addressing comparability between the simulated world and the ‘real’ world … (external validation)
• Marks: How successfully the model’s output exhibits the historical behaviors of the real-world target system (‘output validation’, cf. Manson 2002)
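As a rough illustration of how two of these questions might look in code, the sketch below uses a deliberately trivial agent model. It is not drawn from Axelrod, Carley, or Marks: the adoption model, the function names `simulate_adoption` and `rmse`, and the “historical” series are all hypothetical and invented for illustration. The internal-validity checks ask whether the program behaves as the model says it must in cases where the answer is known. The output-validation step compares a simulated trajectory with an observed one using root-mean-square error, which is only one of many possible metrics.

```python
import random


def simulate_adoption(n_agents, p_adopt, n_steps, seed):
    """Toy agent model: each non-adopter adopts with probability p_adopt per step.

    Returns the fraction of agents that have adopted after each step.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    adopted = [False] * n_agents
    history = []
    for _ in range(n_steps):
        for i in range(n_agents):
            if not adopted[i] and rng.random() < p_adopt:
                adopted[i] = True
        history.append(sum(adopted) / n_agents)
    return history


def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return (sum(squared) / len(squared)) ** 0.5


# Internal validity: cases where the model's behavior is known in advance.
assert simulate_adoption(50, 0.0, 4, seed=1) == [0.0] * 4   # nobody ever adopts
assert simulate_adoption(50, 1.0, 4, seed=1) == [1.0] * 4   # everyone adopts at once
assert simulate_adoption(50, 0.3, 4, seed=7) == simulate_adoption(50, 0.3, 4, seed=7)

# Output validation: compare against a hypothetical "historical" series.
HYPOTHETICAL_HISTORY = [0.20, 0.36, 0.49, 0.59, 0.67]
simulated = simulate_adoption(2000, 0.2, 5, seed=42)
mismatch = rmse(simulated, HYPOTHETICAL_HISTORY)
```

A small mismatch here says only that the output resembles the series it was compared with; it does not show that the model's mechanisms are the ones operating in the target system.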
Q: Why do we create models?
• Kinds of models
  • To highlight features of a phenomenon we have observed (to describe, explain, predict)
    • As our observations mature, so can our conceptual models
      • The Ptolemaic universe
      • The Copernican universe
  • To represent a conception of a phenomenon we have not yet observed
    • Superstrings in cosmological physics
• Roles that models play
  • Models fix a mental representation, collective or otherwise, of a phenomenon occurring in the world around us
  • Models are frameworks for organizing inquiry
  • Models enable knowledge to evolve
A: Because we can’t do science without them.
Uses of agent-based simulations
• Explain a phenomenon, explore a phenomenon, understand interactions that produce a phenomenon (Marks 2003)
• Insight into system control, make predictions, derive general principles (Haefner)
• Prediction, performance, training (flight simulators), entertainment, education (SimCity), existence proofs, discovery, gedankenexperiment (Axelrod 2003)
Terminology
• CSE: Computational Science and Engineering
• CSS: Computational Social Science (to include agent-based models)
• V&V: Verification and Validation
• UQ: Uncertainty Quantification