An overview of Parallel Discrete Event Simulation (PDES) techniques developed at ORNL for application areas including emergency response, network simulation, social dynamics, and sensor networks, covering the unique mixed-mode µsik micro-kernel, its scalability results, and PDES-based applications such as SensorNet and SCATTER.
Parallel Discrete Event Simulation (PDES) at ORNL
Kalyan S. Perumalla, Ph.D.
Modeling & Simulation Group
Computational Sciences & Engineering
PDES: Selected application areas
• Network simulation
  • Internet protocols, security, P2P designs, …
• Traffic simulation
  • Emergency planning/response, environmental policy analysis, urban planning, …
• Social dynamics simulation
  • Operations planning, foreign policy, marketing, …
• Sensor simulations
  • Wide-area monitoring, situational awareness, border surveillance, …
• Organization simulations
  • Command and control, business processes, …
[Slide graphic: application domains span global and local events/emergencies, protection and awareness systems, and current and future defense systems.]
High-performance PDES kernel requirements
• Global time synchronization
  • Total time-stamped ordering of events
  • Paramount for accuracy
• Fast synchronization
  • Scalable, application-independent time-advance mechanisms
  • Critical for real-time and as-fast-as-possible execution
• Support for fine-grained events
  • Minimal overhead relative to event processing times
  • Application computation is typically only 5 µs to 50 µs per event
• Conservative, optimistic, and mixed modes
  • Need support for the principal synchronization approaches
  • Useful to choose the mode on a per-entity basis at initialization
  • Desirable to vary the mode dynamically during simulation
• General-purpose API (a minimal interface sketch follows this list)
  • Reusable across multiple applications
  • Accommodates multiple techniques: lookahead, state saving, reverse computation, multicast, etc.
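To make the "general-purpose API" requirement concrete, here is a minimal sketch, assuming a hypothetical C++ interface (the class and method names are illustrative, not µsik's actual API), of a logical-process abstraction whose synchronization mode and lookahead are fixed per entity at initialization and which exposes an undo hook for optimistic execution.

```cpp
// Minimal sketch (not the actual µsik API) of a general-purpose PDES
// logical-process interface; all names here are illustrative assumptions.
#include <cstdio>

using SimTime = double;                            // virtual time stamp type
enum class SyncMode { Conservative, Optimistic };  // per-LP mode, chosen at init

struct Event {
    SimTime ts;        // time stamp: events are committed in this order
    int     dest_lp;   // destination logical process
};

class LogicalProcess {
public:
    LogicalProcess(SyncMode m, SimTime lookahead)
        : mode_(m), lookahead_(lookahead) {}
    virtual ~LogicalProcess() = default;

    virtual void execute(const Event& e) = 0;   // forward event handler
    virtual void undo(const Event& e) {}        // rollback hook (optimistic LPs)

    SyncMode mode() const      { return mode_; }
    SimTime  lookahead() const { return lookahead_; }  // conservative bound

private:
    SyncMode mode_;
    SimTime  lookahead_;
};

// Trivial LP used only to show the shape of an application model.
class CounterLP : public LogicalProcess {
public:
    using LogicalProcess::LogicalProcess;
    void execute(const Event&) override { ++count_; }
    void undo(const Event&) override    { --count_; }
private:
    long count_ = 0;
};

int main() {
    CounterLP lp(SyncMode::Optimistic, /*lookahead=*/0.5);
    lp.execute({1.0, 0});
    lp.undo({1.0, 0});   // a rollback would invoke this
    std::printf("mode chosen at init, lookahead = %g\n", lp.lookahead());
    return 0;
}
```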
µsik: a unique PDES "micro-kernel"
• Unique mixed-mode kernel
  • The only scalable mixed-mode kernel in the world
  • Supports conservative, optimistic, and mixed modes in a single kernel
• Used in a variety of applications
  • DES-based vehicular traffic models
  • DES-based plasma physics models
  • DES-based neurological models
  • Largest Internet simulations
• Some recent results of the fine-grained PDES benchmark (PHOLD, outlined below)
  • Among the largest/fastest scalability results in parallel discrete event simulation
[Charts: aggregate events/s (millions) and µs per event vs. number of processors (128–1024), for the configurations LP = 1 thousand/MSG = 1 million, LP = 1 million/MSG = 100 million, and LP = 1 million/MSG = 1 billion.]
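The PHOLD benchmark cited above circulates a fixed population of messages ("pucks") among logical processes: each event schedules exactly one new event to a randomly chosen LP at the current time plus a lookahead plus a random increment. The sketch below shows that core loop run sequentially, with scaled-down, assumed constants (num_lps, init_msgs, end_time); the reported runs distribute the LPs across hundreds of processors.

```cpp
// Sketch of the PHOLD benchmark kernel loop, shown sequentially for clarity.
// Constants and names are illustrative, not taken from µsik.
#include <cstdint>
#include <queue>
#include <random>
#include <vector>

struct Event { double ts; uint64_t dest; };
struct Later {
    bool operator()(const Event& a, const Event& b) const { return a.ts > b.ts; }
};

int main() {
    const uint64_t num_lps   = 1000;     // "LP = 1 thousand" configuration
    const uint64_t init_msgs = 10000;    // scaled-down message population
    const double   lookahead = 1.0;      // minimum timestamp increment
    const double   end_time  = 100.0;

    std::mt19937_64 rng(12345);
    std::exponential_distribution<double> delay(1.0);
    std::uniform_int_distribution<uint64_t> pick(0, num_lps - 1);

    // Future event list: always pop the lowest time stamp first.
    std::priority_queue<Event, std::vector<Event>, Later> fel;
    for (uint64_t i = 0; i < init_msgs; ++i)
        fel.push({delay(rng), pick(rng)});

    uint64_t processed = 0;
    while (!fel.empty() && fel.top().ts < end_time) {
        Event e = fel.top(); fel.pop();
        ++processed;
        // PHOLD "work": forward the puck to a random LP at a later time.
        fel.push({e.ts + lookahead + delay(rng), pick(rng)});
    }
    return processed > 0 ? 0 : 1;
}
```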
µsik scaled to more than 10⁴ processors
• Some recent results of the fine-grained PDES benchmark
  • On Blue Gene Watson (BGW) at the IBM TJ Watson Research Center
  • Well-known PHOLD benchmark, with 1 million logical processes and 10 million pucks
  • The largest and fastest scalability results in PDES recorded to date
[Charts: speedup, µs per event, parallel efficiency, and events/s (millions) vs. number of processors (up to 16,384), comparing conservative, optimistic, and mixed modes.]
µsik micro-kernel internals
[Diagram: user LPs, each with a processed event list (PEL), future event list (FEL), and local virtual time (LVT), feed three kernel-level priority queues (Pc, Pp, Pe) that order the kernel processes (KPs) by ECTS (committable), EPTS (processable), and EETS (emittable).]
• When are the kernel queues updated?
  • A new LP is added or deleted
  • An LP executes an event
  • An LP receives an event
• Abbreviations: LP = logical process, KP = kernel process, ECTS = earliest committable time stamp, EPTS = earliest processable time stamp, EETS = earliest emittable time stamp, PEL = processed event list, FEL = future event list, LVT = local virtual time
(A simplified sketch of these queues follows.)
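As a sketch only: each kernel process advertises its earliest committable, processable, and emittable time stamps, and the micro-kernel keeps the KPs ordered so it can pick the next one to advance, re-sorting a KP whenever one of the three update triggers listed above fires. The class layout and method names below are assumptions for illustration, not µsik internals.

```cpp
// Simplified view of the micro-kernel's KP ordering; only the EPTS queue is
// shown, the ECTS and EETS queues are analogous.  Illustrative names only.
#include <limits>
#include <set>

using SimTime = double;
constexpr SimTime kInf = std::numeric_limits<SimTime>::infinity();

struct KernelProcess {
    int     id;
    SimTime ects = kInf;   // earliest event safe to commit
    SimTime epts = kInf;   // earliest event ready to process
    SimTime eets = kInf;   // earliest event ready to emit to another KP
};

class MicroKernel {
public:
    void add_kp(KernelProcess* kp) { by_epts_.insert(kp); }

    // Called on the three triggers: LP added/deleted, LP executed an event,
    // LP received an event.  The KP's time stamps are recomputed and the
    // kernel's ordering of KPs is refreshed.
    void on_update(KernelProcess* kp, SimTime ects, SimTime epts, SimTime eets) {
        by_epts_.erase(kp);
        kp->ects = ects; kp->epts = epts; kp->eets = eets;
        by_epts_.insert(kp);
    }

    // KP whose earliest processable event should run next.
    KernelProcess* next_to_process() const {
        return by_epts_.empty() ? nullptr : *by_epts_.begin();
    }

private:
    struct ByEpts {
        bool operator()(const KernelProcess* a, const KernelProcess* b) const {
            return a->epts != b->epts ? a->epts < b->epts : a->id < b->id;
        }
    };
    std::set<KernelProcess*, ByEpts> by_epts_;
};
```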
libSynk: µsik's synchronization core
[Layer diagram: µsik processes run on top of libSynk, whose time-management (TM: TM-Red, TM-Null), reduction (RM: RM-Bar), and messaging (FM: FM-Myr, FM-TCP, FM-ShM, FM-MPI) modules sit above the OS/hardware and network; an arrow from X to Y means X uses Y.]
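One service a time-management layer such as libSynk's TM modules provides is a reduction that computes a global lower bound on event time stamps, so every process knows how far it may safely advance. The sketch below illustrates that idea over plain MPI; the helper local_time_bound and its inputs are assumptions, and libSynk's actual interfaces are not shown in this deck.

```cpp
// Illustrative reduction-based time advance over MPI (not libSynk's API).
#include <mpi.h>
#include <algorithm>
#include <cstdio>

// Hypothetical per-process inputs; in a real kernel these come from the
// event lists and from the messaging layer's transient-message accounting.
double local_time_bound(double lvt, double lookahead, double min_in_flight) {
    return std::min(lvt + lookahead, min_in_flight);
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = local_time_bound(/*lvt=*/10.0 + rank, /*lookahead=*/1.0,
                                    /*min_in_flight=*/15.0);
    double global = 0.0;
    // Global minimum of all local bounds becomes the new safe time.
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("new global time lower bound = %g\n", global);

    MPI_Finalize();
    return 0;
}
```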
µsik micro-kernel capabilities
• µsik currently supports the following:
  • Lookahead-based conservative and/or optimistic execution
  • Reverse computation-based optimistic execution (sketched below)
  • Checkpointing-based optimistic execution
  • Resilient optimistic execution (zero rollbacks)
  • Constrained, out-of-order execution
  • Preemptive event processing
  • Any combination of the above
  • Automated, network-throttled flow control
  • User-level event retraction
  • Process-specific limits to optimism
  • Dynamic process addition/deletion
  • Shared and/or distributed memory execution
  • Process-oriented views
• It accommodates the addition of the following:
  • Synchronized multicast
  • Optimistic dynamic memory allocation
  • Automated load balancing
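To illustrate the reverse-computation capability listed above, the toy event handler below is written so that its forward effects can be undone exactly on rollback, retaining only one bit of the control path per event instead of a saved state copy. The state layout and function names are hypothetical.

```cpp
// Minimal sketch of reverse computation for optimistic execution.
#include <cassert>
#include <cstdint>

struct ServerState {
    uint64_t queued   = 0;   // jobs waiting
    uint64_t serviced = 0;   // jobs completed
};

// Forward execution of an "arrival" event.
inline void arrival_forward(ServerState& s, bool& served_immediately) {
    served_immediately = (s.queued == 0);
    if (served_immediately) ++s.serviced; else ++s.queued;
}

// Exact inverse; 'served_immediately' is the one bit kept with the event
// so the branch taken forward can be reversed.
inline void arrival_reverse(ServerState& s, bool served_immediately) {
    if (served_immediately) --s.serviced; else --s.queued;
}

int main() {
    ServerState s;
    bool bit = false;
    arrival_forward(s, bit);   // optimistic execution...
    arrival_reverse(s, bit);   // ...rolled back on a straggler event
    assert(s.queued == 0 && s.serviced == 0);
    return 0;
}
```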
SensorNet: parallel simulation/immersive test-bed
[Diagram: CB sensor network testbed framework (current and envisioned) coupling a SimDriver, SimComm, TinyViz, a script interpreter (Tython/Python), a reflected simulation script, and an external event buffer with SAF, weather models (SCIPUFF weather simulator), TOSSIM, communication models, plume models, a sensor-net simulator, an operations simulator, live devices, and an environmental-model proxy.]
• Seamless integrated testbed to incorporate a variety of important simulations, stimulations, and live devices
• Achieves unified capabilities and significant fidelity for test and evaluation of CB sensor device-based designs, concepts, and operations
SensorNet: simulation-based analysis for plume tracking
• Environmental phenomenon exhibits high variability.
• Phenomenon drives the sensor network's computation and communication.
• Trace of the sensed phenomenon gathered at the base station reflects high variability.
• Communication effects induce unpredictable gaps in the series.
• Accurate, integrated simulation of phenomenon and communication captures complex interdependencies.
[Figures: sensor network layout (a grid of 26 nodes with a base station and a source-release location); mean concentration (10⁻⁶ kg/m³) vs. time (300–600 s) for no-wind, windy, and turbulent conditions; and surface dosage (CHEM at T = 17.0 h) maps over latitude and longitude for no winds, light winds, and turbulent winds.]
SCATTER: Ultra-scale PDES-based mobility simulations
Our approach: SCATTER
• DES models (vs. time-stepped); a minimal event-scheduling sketch follows this list
• Parallel execution (vs. sequential)
• Scalability to high-performance computing: 10²–10³ CPUs
• Important behaviors: kinetic + non-kinetic
• Scalable tool for transportation and energy/event/emergency research
  • Regional scale: multiple states, 10⁶–10⁷ intersections
  • Current tool capabilities: at most 10⁴ intersections
  • Faster than real time is very useful
[Chart: speed vs. fidelity positioning of traffic tools, from network flow methods and OREMS (faster, lower accuracy; good for rough estimates of evacuation delay, …) to CORSIM, MITSIM, and TRANSIMS (slower, higher accuracy; good for emissions estimates, traffic analysis, …); SCATTER targets the desirable fast, high-fidelity region.]
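The DES-versus-time-stepped distinction above is the key to this kind of scalability: a vehicle's traversal of a road link becomes a single future event at its predicted arrival time rather than an update at every tick. The toy sketch below illustrates that scheduling pattern; the link length, speed, and identifiers are made-up values, not SCATTER's model.

```cpp
// Toy illustration of DES-style vehicle movement: one "reaches next
// intersection" event replaces many per-tick position updates.
#include <cstdio>
#include <queue>
#include <vector>

struct LinkExit { double ts; int vehicle; int intersection; };
struct Later {
    bool operator()(const LinkExit& a, const LinkExit& b) const { return a.ts > b.ts; }
};

int main() {
    std::priority_queue<LinkExit, std::vector<LinkExit>, Later> fel;

    // Vehicle 7 enters a 500 m link at t = 0 s travelling 12.5 m/s:
    // one event at t = 40 s instead of 40 one-second time steps.
    const double length_m = 500.0, speed_mps = 12.5, now_s = 0.0;
    fel.push({now_s + length_m / speed_mps, /*vehicle=*/7, /*intersection=*/42});

    while (!fel.empty()) {
        LinkExit e = fel.top(); fel.pop();
        std::printf("t=%gs: vehicle %d reaches intersection %d\n",
                    e.ts, e.vehicle, e.intersection);
        // A full model would re-schedule the vehicle on its next link here.
    }
    return 0;
}
```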
SCATTER: Benchmark performance
• Significantly faster than real time with 1 million vehicles!
• Significant speedup with parallel execution!
• Very low event processing overhead!
[Charts: speedup over real time (real time/simulated time) vs. number of vehicles (10⁴–10⁶) for Nodes = 9 and Nodes = 1089; parallel speedup over a sequential run vs. number of processors (1–4); and event processing speed (µs/event) vs. number of vehicles.]
Contact
Kalyan S. Perumalla
Modeling & Simulation Group
Computational Sciences & Engineering
(865) 241-1315
perumallaks@ornl.gov