SEED Center for Data Farming Overview

Tom Lucas and Susan Sanchez, Operations Research Department, Naval Postgraduate School, Monterey, CA
http://harvest.nps.edu
Mission: Advance the collaborative development and use of simulation experiments and efficient designs to provide decision makers with timely insights on complex systems and operations.

Simulation studies underpin many DoD decisions
DoD uses complex, high-dimensional simulation models as an important tool in its decision-making process. Simulation is:
• Used when it is too difficult or costly to experiment on “real systems”
• Needed for future systems: we shouldn’t wait until they’re operational to decide on appropriate capabilities and operational tactics, or to evaluate their potential performance
• Used to investigate the impact of randomness and other uncertainties
Many complex simulations involve hundreds or thousands of “factors” that can be set to different levels.

The new view
Appropriate goals are:
(i) Developing a basic understanding of a particular model or system:
• seeking insights into high-dimensional space
• identifying significant factors and interactions
• finding regions, ranges, and thresholds where interesting things happen
(ii) Finding robust decisions, tactics, or strategies
(iii) Comparing the merits of various decisions or policies
-- Kleijnen, Sanchez, Lucas & Cioppa (2005)
“Models are for thinking” -- Sir Maurice Kendall
Once you have invested the effort to build (and perhaps verify, validate & accredit) a simulation model, it’s time to let the model work for you!

An environment for exploration requires…
• Flexible models or tools to build them
• High-performance computing
• Experimental design
• Data analysis and visualization

These goals mean fewer assumptions...
Traditional DOE assumptions | Assumptions for defense & homeland security simulations
Small/moderate # of factors | Large # of factors
Univariate response | Many output measures of interest
Homogeneous error | Heterogeneous error
Linear | Non-linear
Sparse effects | Many significant effects
Higher order interactions negligible | Significant higher order interactions
Normal errors | Varied error structure
Black box model | Substantial expertise exists
“The idea behind [Monte Carlo simulation]…is to [replace] theory by experiment whenever the former falters” -- Hammersley and Handscomb
“We use simulations to avoid making Type III errors: working on the wrong model” -- W. David Kelton

...that, in turn, call for different designs
• Factorial (gridded) designs are most familiar
• Efficient resolution V fractional factorials (R5 FF) and central composite designs (CCD)
• We have focused on Latin hypercubes and sequential approaches

Why use experimental designs?
More informative…
• consider dozens or hundreds of factors, rather than a handful
• explore a broader range of possibilities
Faster…
• e.g., 33 “scenarios” or “design points” vs. 10 billion
More powerful…
• get more information from a limited data set
Bottom line -- experimental design should be part of EVERY simulation study!
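The "33 vs. 10 billion" comparison can be reproduced with simple arithmetic. The slide does not say which gridded design gives 10 billion points; the sketch below assumes 10 factors at 10 levels each, which is one combination that yields exactly that count. The function name is illustrative only.

```python
def full_factorial_points(num_factors: int, levels_per_factor: int) -> int:
    """Number of design points in a full (gridded) factorial design."""
    return levels_per_factor ** num_factors


# Assumption: 10 factors at 10 levels each (not stated in the slides).
grid_points = full_factorial_points(num_factors=10, levels_per_factor=10)

# 33 is the design-point count quoted on the slide.
designed_points = 33

print(f"Gridded design: {grid_points:,} design points")
print(f"Efficient design: {designed_points} design points")
print(f"Reduction factor: about {grid_points // designed_points:,}x")
```

Under this assumption, even one replication per grid point is out of reach for most simulations, while 33 design points leave room for many replications of each.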

Interpreting the results
[Figures: one-way analysis and effect tests; labels: Mean(Alleg1Cas(blue)), Log twd concealment, Each Pair Student’s t 0.05, Alternate Tactical 3]
• Standard statistical graphics tools (regression trees, 3-D scatter plots, contour plots, plots of average results for a single factor, interaction profiles) can be used to gain insights from the data
• Step-wise regression and regression trees identify important factors, interactions, and thresholds

Questions?
SEED Center for Data Farming
Mission: Advance the collaborative development and use of simulation experiments and efficient designs to provide decision makers with timely insights on complex systems and operations.
Primary sponsors and international collaborators: [logos not reproduced]
Applications include: peacekeeping operations, convoy protection, networked future forces, unmanned vehicles, anti-terror emergency response, urban operations, humanitarian relief, and more
Products include: new downloadable experimental designs, plus over 40 student theses and a dozen articles
http://harvest.nps.edu

The “traditional” view
Philosophy: “The three primary objectives of computer experiments are: (i) Predicting the response at untried inputs, (ii) Optimizing a function of the input factors, or (iii) Calibrating the computer code to physical data.” -- Sacks, Welch, Mitchell, and Wynn (1989)
For many (military) applications, these can be problematic!
Approach:
• Limit yourself to just a few factors or scenario alternatives
• “Fix” all other factors in the simulation to specified values
• At each design point, run the experiment a small number of times (once for deterministic simulations)
“The purpose of computing is insight, not numbers” -- Hamming

Efficiency
How many runs will you need? A few comparisons…
In contrast, structured nearly orthogonal Latin hypercube (NOLH) designs require
• 17 runs for 2-7 factors (up to 17 levels/factor)
• 33 runs for 8-11 factors (up to 33 levels/factor)
• 65 runs for 12-16 factors (up to 65 levels/factor)
• 129 runs for 17-22 factors (up to 129 levels/factor)
• 257 runs for 23-29 factors (up to 257 levels/factor)
Random LH designs can be generated for arbitrary combinations of # factors (k) and # runs (n) as long as n >= k.
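The NOLH run counts listed above can be captured in a small lookup. This is a minimal sketch: the table values come straight from the slide, while the function name and the comparison against a two-level full factorial (2^k runs) are illustrative additions, not part of the original material.

```python
# (largest number of factors covered, runs required), as listed on the slide.
NOLH_CATALOG = [(7, 17), (11, 33), (16, 65), (22, 129), (29, 257)]


def nolh_runs(num_factors: int) -> int:
    """Return the run count of the smallest listed NOLH design that fits."""
    if num_factors < 2:
        raise ValueError("the listed designs start at 2 factors")
    for max_factors, runs in NOLH_CATALOG:
        if num_factors <= max_factors:
            return runs
    raise ValueError("the listed designs cover at most 29 factors")


for k in (5, 10, 20, 29):
    # A two-level full factorial is shown only as a familiar reference point.
    print(f"{k:2d} factors: NOLH {nolh_runs(k):3d} runs, 2^k grid {2 ** k:,} runs")
```

Note that each NOLH design also samples every factor at many levels (up to the number of runs), whereas the 2^k grid in the comparison uses only two levels per factor.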

So, what is a Latin hypercube?
• In its basic form, each column in an n-run, k-factor LH is a permutation of the integers 1,2,…,n
• The n integers correspond to levels across the range of the factor
• For exploratory purposes, we use a uniform spread over the range (but may round to integer values)
• Slightly different designs arise if you force sampling at the low and high values
[Figure: pairwise projection of a 6-run, 2-factor design; both factors range from Low = 0 to High = 60. Factor 1 axis ticks: 0, 12, 24, 36, 48, 60. Factor 2 axis ticks: 5, 15, 25, 35, 45, 55]
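The basic construction described above (one random permutation of 1..n per factor, then a uniform spread over each factor's range) can be sketched as follows. This generates a random LH, not the structured nearly orthogonal designs discussed elsewhere in the deck; the function names and the two scaling conventions are assumptions chosen to match the 0 to 60 range shown on the slide.

```python
import random


def random_latin_hypercube(n_runs, n_factors, seed=None):
    """Return n_runs design points; each factor's column is a permutation of 1..n_runs."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        column = list(range(1, n_runs + 1))
        rng.shuffle(column)
        columns.append(column)
    # Transpose the columns into rows (one row per run).
    return [tuple(column[i] for column in columns) for i in range(n_runs)]


def scale_level(level, n_runs, low, high, include_endpoints):
    """Map an integer level 1..n_runs to a value spread uniformly over [low, high]."""
    if n_runs < 2:
        raise ValueError("need at least 2 runs")
    if include_endpoints:
        # Forces sampling at the low and high values.
        return low + (level - 1) * (high - low) / (n_runs - 1)
    # Otherwise use the midpoint of each of the n_runs equal-width bins.
    return low + (level - 0.5) * (high - low) / n_runs


design = random_latin_hypercube(n_runs=6, n_factors=2, seed=1)
for level_1, level_2 in design:
    x1 = scale_level(level_1, 6, 0, 60, include_endpoints=True)
    x2 = scale_level(level_2, 6, 0, 60, include_endpoints=False)
    print(level_1, level_2, x1, x2)
```

With 6 runs on a 0 to 60 range, forcing the endpoints gives the levels 0, 12, 24, 36, 48, 60, while bin midpoints give 5, 15, 25, 35, 45, 55. These are the two sets of values that survive from the slide's figure.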

Nearly orthogonal and space-filling Latin hypercubes
[Figure: scatterplot matrix of pairwise projections for factors A through G, each axis scaled from -1.0 to 1.0]
The pairwise projections for a 17-run, 7-factor orthogonal LH show
• orthogonality (no pairwise correlations)
• space-filling behavior (points fill the sub-plots)
• 17 total runs!
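The orthogonality property can be checked numerically by computing the largest absolute pairwise correlation between design columns. The sketch below is an illustration under stated assumptions: it does not reproduce the 17-run NOLH from the slide. It uses a tiny hand-built 4-run, 2-factor design whose columns happen to be uncorrelated, plus a random LH for contrast, and all function names are hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_abs_pairwise_correlation(design):
    """Largest |correlation| over all pairs of columns (0.0 if fewer than 2 columns)."""
    columns = list(zip(*design))
    pairs = itertools.combinations(columns, 2)
    return max((abs(pearson(a, b)) for a, b in pairs), default=0.0)


# A small Latin hypercube whose two columns are exactly uncorrelated.
orthogonal_example = [(1, 2), (2, 4), (3, 1), (4, 3)]
print(max_abs_pairwise_correlation(orthogonal_example))

# A random 17-run, 7-factor LH for comparison (assumption: seed chosen arbitrarily).
rng = random.Random(0)
columns = [rng.sample(range(1, 18), 17) for _ in range(7)]
random_lh = list(zip(*columns))
print(max_abs_pairwise_correlation(random_lh))
```

A random LH usually shows some non-zero pairwise correlations; keeping those correlations near zero is the property that the nearly orthogonal designs are built to provide.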

Other possibilities
[Figures: three-dimensional design plots on axes x1, x2, x3, each scaled from -1.0 to 1.0]
• Very large resolution V fractional factorials and central composite designs
  • Standard DOE literature: 2^(11-3)
  • New: an easy way to catalogue and generate up to 2^(443-423)
• Two-phase adaptive sequential procedure for factor screening
  • New procedure that requires fewer assumptions, improves efficiency
• Frequency domain experiments
  • Naturally samples factors at coarser/finer levels
• Crossed/combined designs to identify robust decision factor settings

Our portfolio of designs
• Kleijnen, J. P. C., S. M. Sanchez, T. W. Lucas, and T. M. Cioppa, “A User’s Guide to the Brave New World of Designing Simulation Experiments,” INFORMS Journal on Computing, Vol. 17, No. 3, 2005, pp. 263-289.
• Cioppa, T. M. and T. W. Lucas, “Efficient Nearly Orthogonal and Space-filling Latin Hypercubes,” Technometrics, Vol. 49, No. 1, pp. 45-55.
• Sanchez, S. M. and P. J. Sanchez, “Very Large Fractional Factorials and Central Composite Designs,” ACM Transactions on Modeling and Computer Simulation, Vol. 15, No. 4, 2005, pp. 362-377.
• Sanchez, S. M., H. Wan, and T. W. Lucas, “A Two-phase Screening Procedure for Simulation Experiments,” Invited paper (under review), ACM Transactions on Modeling and Computer Simulation.
• Sanchez, S. M., F. Moeeni, and P. J. Sanchez, “So Many Factors, So Little Time…Simulation Experiments in the Frequency Domain,” International Journal of Production Economics, Vol. 103, 2006, pp. 149-165.

Other publications
• Sanchez, Lucas, “Agent-based Simulations: Simple Models, Complex Analyses,” Invited paper, Proc. 2002 Winter Simulation Conference, 116-126.
• Lucas, Sanchez, Brown, Vinyard, “Better Designs for High-Dimensional Explorations of Distillations,” Maneuver Warfare Science 2002, Marine Corps Combat Development Command, 2002, 17-46.
• Vinyard, Lucas, “Exploring Combat Models for Non-monotonicities and Remedies,” PHALANX, 35, No. 1, March 2002, 19, 36-38.
• Lucas, McGunnigle, “When is Model Complexity Too Much? Illustrating the Benefits of Simple Models with Hughes’ Salvo Equations,” Naval Research Logistics, Vol. 50, April 2003, 197-217.
• Lucas, Sanchez, Cioppa, Ipekci, “Generating Hypotheses on Fighting the Global War on Terrorism,” Maneuver Warfare Science 2003, Marine Corps Combat Development Command, 2003, 117-137.
• Lucas, Sanchez, “Smart Experimental Designs Provide Military Decision-Makers With New Insights From Agent-Based Simulations,” Naval Postgraduate School RESEARCH, 13,2, 20-21, 57-59, 63.
• Lucas, Sanchez, “NPS Hosts the Marine Corps Warfighting Laboratory’s Sixth Project Albert International Workshop,” Naval Postgraduate School RESEARCH, 13,2, 45-46.
• Sanchez, Wu, “Frequency-Based Designs for Terminating Simulation Experiments: A Peace-enforcement Example,” Proc. 2003 Winter Simulation Conference, 952-959.
• Brown, Cioppa, “Objective Force Urban Operations Agent Based Simulation Experiment,” Technical Report TRAC-M-TR-03-021, Monterey, CA, June 2003.
• Cioppa, Brown, Jackson, Muller, Allison, “Military Operations in Urban Terrain Excursions and Analysis With Agent-Based Models,” Maneuver Warfare Science 2003, Quantico, VA, 2003.
• Cioppa, “Advanced Experimental Designs for Military Simulations,” Technical Report TRAC-M-TR-03-011, Monterey, CA, February 2003.
• Brown, Cioppa, Lucas, “Agent-based Simulation Supporting Military Analysis,” PHALANX, Vol. 37, No. 3, Sept 2004.
• Cioppa, Lucas, Sanchez, “Military Applications of Agent-Based Simulations,” Proceedings of the 2004 Winter Simulation Conference, 171-179.
• Allen, Buss, Sanchez, “Assessing Obstacle Location Accuracy in the REMUS Unmanned Underwater Vehicle,” Proceedings of the 2004 Winter Simulation Conference, 940-948.
• Cioppa, “An Efficient Screening Methodology For a Priori Assessed Non-Influential Factors,” Proc. 2004 Winter Simulation Conference, 171-180.
• Sanchez, “Work Smarter, Not Harder: Guidelines for Designing Simulation Experiments,” Proc. of the 2005 Winter Simulation Conference, forthcoming.
• Wolf, Sanchez, Goerger, Brown, “Using Agents to Model Logistics,” under revision for Military Operations Research.
• Baird, Paulo, Sanchez, Crowder, “Measuring Information Gain in the Objective Force,” under revision for Military Operations Research.

Student theses…note the breadth of applications
• 2000 Brown (Captain, USMC): Human Dimension of Combat
• 2001 Vinyard (Major, USMC): Reducing Non-monotonicities in Combat Models, MORS/Tisdale Winner, MORS Walker Award
• 2002 Erlenbruch (Captain, German Army): German Peacekeeping Operations, MORS/Tisdale Finalist
• 2002 Pee (Singapore DSTA): Information Superiority and Battle Outcomes, MORS/Tisdale Finalist
• 2002 Wan (Major, Singapore Army): Effects of Human Factors on Combat Outcomes
• Dickie (Major, Australian Army): Swarming Unmanned Vehicles, MORS/Tisdale Finalist
• 2002 Ipekci (1st Lieutenant, Turkish Army): Guerrilla Warfare, MORS/Tisdale Winner
• 2002 Wu (Lieutenant, USN): Spectral Analysis and Sonification of Simulation Data
• 2002 Cioppa (Lieutenant Colonel, US Army, PhD): Experimental Designs for High-dimensional Complex Models, ASA 3rd Annual Prize for Best Student Paper Applying Stat. to Defense
• 2003 Efimbe (Lieutenant, US Navy): Littoral Combat Ships Protecting Expeditionary Strike Groups
• 2003 Wolf (Captain, USMC): Urban, Humanitarian Assistance/Disaster Relief Operations, MORSS Best Presentation Award, MORS Barchi Prize Finalist
• 2004 Milton (Lieutenant Commander, US Navy): Logistical Chain of the Seabase, MORS/Tisdale Finalist
• 2004 Allen (Lieutenant, US Navy): Navigational Accuracy of REMUS Unmanned Underwater Vehicle, MORS/Tisdale Finalist
• 2004 Steele (Ensign, US Navy): Unmanned Surface Vehicles
• 2004 Hakola (Captain, USMC): Convoy Protection
• 2004 Lindquist (Captain, US Army): Degraded Communication in the Future Force, MORS/Tisdale Winner
• 2004 Aydin (1st Lieutenant, Turkish Army): Village Search Operations
• 2004 Raffetto (Captain, USMC): UAVs in Support of IPB in a Sea-Viking Scenario, MORS/Tisdale Finalist
• 2004 Cason (Captain, USMC): UAVs in Support of Urban Operations
• 2004 Berner (LCDR, US Navy): Multiple UAVs in Maritime Search and Control
• 2004 Tan (Singapore S&T): Checkpoint Security
• 2005 Babilot (USMC): DO versus Traditional Force in Urban Terrain
• 2005 Bain (USMC): Logistics Support for Distributed Ops, MORS/Tisdale Finalist
• 2005 Gun (Turkish Army): Sunni Participation in Iraqi Elections
• 2005 McMindes (USMC): UAV Survivability
• 2005 Sanders (USMC): Marine Expeditionary Rifle Squad
• 2005 Ang (Singapore Technologies Engineering): Increasing Participation and Decreasing Escalation in Elections
• 2005 Chang (Singapore DSTA): Edge vs. Hierarchical Organizations for Collaborative Tasks
• 2005 Liang (Singapore DSTA): Cooperative Sensing of the Battlefield
• 2005 Martinez-Tiburcio (Mexican Navy): Protecting Mexico’s Oil Well Infrastructure
• 2005 Sulewski (USA): UAVs in Army’s FCS Family of Systems
• 2006 Lehmann (Major, German Army): A Discrete, Event-driven Simulation of Peacekeeping Operations
• 2006 Roginski (Major, US Army): Emergency Response to a Terrorist Attack
• 2006 Alt (Major, US Army): TTPs for a Future Force Warrior Small Combat Unit
• 2006 Wittwer (Major, US Army): Non-Lethal Weapons in a Future Force Warrior Small Combat Unit
• 2006 Nannini (Major, US Army): Dynamic Scheduling of FCS UAVs
• 2006 Vaughan (Captain, USMC): Force Size Transitions in Stability Operations
• 2006 Michel (Major, USMC): Evaluating the Marine Corps’ Artillery Triad in STOM Operations
• 2006 Richardson (Captain, USMC): Distributed Capabilities in a Future Force Warrior Small Combat Unit
• 2006 Sickinger (Lieutenant, US Navy): Non-Lethal Weapons in a Maritime Environment, MORS/Tisdale Finalist
Coming soon… many more

Decision Tree: Time and routing (Raffetto, 2004)
MOE: Proportion of enemy classified
[Figure: decision tree with scenario labels JTF Swashbuckler, Caliphate, JTF Sea Viking]
• Most Important Factor
• Needs to fly over 7 hours
• Think like the enemy! Rt. 2 planned with intel
• In either case, throw more forces/capabilities at it next… if available

Example: Regression analysis (Raffetto, 2004)
Preferred model for one UAV: 7 terms
• Across the noise factors, the regression models produce R-Square values from .906 to .921 with seven to nine terms for 1-3 UAVs
• Provides a means to compare expected effects of different configurations
• Parameter estimates are put into a simple Excel spreadsheet GUI to allow decision makers to view relative effects of configurations within this scenario

Example: Interactions (Steele, 2004)
Camera range and speed
• At low speeds, camera range is unimportant
• At higher speeds, camera range has big impact
• One of several technological challenges for systems design
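An interaction of this kind is what a product term in a regression model represents: the effect of one factor depends on the level of the other. The sketch below uses made-up coefficients on factors coded from 0 (low) to 1 (high). They are purely hypothetical and are not estimates from Steele's thesis; they only illustrate how a speed-by-range term produces the pattern described.

```python
def predicted_response(speed, camera_range, coefficients):
    """Second-order model with a speed x camera-range interaction term."""
    intercept, b_speed, b_range, b_interaction = coefficients
    return (intercept
            + b_speed * speed
            + b_range * camera_range
            + b_interaction * speed * camera_range)


def camera_range_effect(speed, coefficients):
    """Change in response when camera range goes from low (0) to high (1) at a fixed speed."""
    high = predicted_response(speed, 1.0, coefficients)
    low = predicted_response(speed, 0.0, coefficients)
    return high - low


# Hypothetical coefficients: no main effect of camera range, positive interaction.
hypothetical = (0.25, 0.125, 0.0, 0.5)

for speed in (0.0, 0.5, 1.0):
    effect = camera_range_effect(speed, hypothetical)
    print(f"speed={speed}: camera range effect={effect}")
```

With these illustrative numbers the camera range effect is zero at the lowest speed and grows linearly with speed, which is why a one-factor-at-a-time study run at low speed would miss it.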

Example: One-way analysis (Hakola, 2004)
[Figure: one-way analysis; labels: Mean(Alleg1Cas(blue)), Log twd concealment, Each Pair Student’s t 0.05]

Example: MART (Ipekci, 2002)
[Figures: Relative Variable Importance for Blue Casualties; Relative Variable Importance for Red Casualties]

Example: Contour plot (Allen, 2004)

Resources: SEED Center for Data Farming
http://harvest.nps.edu
Check here for:
• lists of student theses (available online)
• spreadsheets & software
• pdf files for several of our publications, publication info for the rest
• links to other resources
• updates
“All models are wrong, but some are useful” -- George Box
