CPE 619 Introduction To Simulation

CPE 619 Introduction To Simulation Aleksandar Milenković The LaCASA Laboratory Electrical and Computer Engineering Department The University of Alabama in Huntsville http://www.ece.uah.edu/~milenka http://www.ece.uah.edu/~lacasa

Overview • Simulation: Key Questions • Introduction to Simulation • Common Mistakes in Simulation • Other Causes of Simulation Analysis Failure • Checklist for Simulations • Terminology • Types of Models

Simulation: Key Questions • What are the common mistakes in simulation and why do most simulations fail? • What language should be used for developing a simulation model? • What are the different types of simulations? • How to schedule events in a simulation? • How to verify and validate a model? • How to determine that the simulation has reached a steady state? • How long to run a simulation?

Simulation: Key Questions (cont’d) • How to generate uniform random numbers? • How to verify that a given random number generator is good? • How to select seeds for random number generators? • How to generate random variables with a given distribution? • What distributions should be used and when?

Introduction to Simulation The best advice to those about to embark on a very large simulation is often the same as Punch's famous advice to those about to marry: Don't! - Bratley, Fox, and Schrage (1987)

Common Mistakes in Simulation 1. Inappropriate Level of Detail: More detail ⇒ More time ⇒ More bugs ⇒ More CPU ⇒ More parameters ≠ More accurate 2. Improper Language: General purpose ⇒ More portable, More efficient, More time 3. Unverified Models: Bugs 4. Invalid Models: Model vs. reality 5. Improperly Handled Initial Conditions 6. Too Short Simulations: Need confidence intervals 7. Poor Random Number Generators: Safer to use a well-known generator 8. Improper Selection of Seeds: Zero seeds, Same seeds for all streams

Other Causes of Simulation Analysis Failure 1. Inadequate Time Estimate 2. No Achievable Goal 3. Incomplete Mix of Essential Skills (a) Project Leadership (b) Modeling and (c) Programming (d) Knowledge of the Modeled System 4. Inadequate Level of User Participation 5. Obsolete or Nonexistent Documentation 6. Inability to Manage the Development of a Large Complex Computer Program: Need software engineering tools 7. Mysterious Results

Checklist for Simulations 1. Checks before developing a simulation: (a) Is the goal of the simulation properly specified? (b) Is the level of detail in the model appropriate for the goal? (c) Does the simulation team include personnel with project leadership, modeling, programming, and computer systems backgrounds? (d) Has sufficient time been planned for the project? 2. Checks during development: (a) Has the random number generator used in the simulation been tested for uniformity and independence? (b) Is the model reviewed regularly with the end user? (c) Is the model documented?

Checklist for Simulations (cont’d) 3. Checks after the simulation is running: (a) Is the simulation length appropriate? (b) Are the initial transients removed before computation? (c) Has the model been verified thoroughly? (d) Has the model been validated before using its results? (e) If there are any surprising results, have they been validated? (f) Are all seeds such that the random number streams will not overlap?

Terminology • Introduce terms using an example of simulating CPU scheduling • Study various scheduling techniques given job characteristics, ignoring disks, display… • State Variables: Define the state of the system • Can restart simulation from state variables • E.g., length of the job queue. • Event: Change in the system state • E.g., arrival, beginning of a new execution, departure

Terminology: Types of Models • Continuous Time Model • State is defined at all times • Discrete Time Models • State is defined only at some instants

Terminology: Types of Models (cont’d) • Continuous State Model • State variables are continuous • Discrete State Models • State variables are discrete

Terminology: Types of Models (cont’d) • Discrete state = Discrete event model • Continuous state = Continuous event model • Continuity of time ≠ Continuity of state • Four possible combinations • 1. discrete state/discrete time • 2. discrete state/continuous time • 3. continuous state/discrete time • 4. continuous state/continuous time

Terminology: Types of Models (cont’d) • Deterministic and Probabilistic Models • Deterministic - If output predicted with certainty • Probabilistic - If output different for different repetitions

Terminology: Types of Models (cont’d) • Static and Dynamic Models • Static - Time is not a variable • Dynamic - If changes with time • E.g.: CPU scheduler is dynamic, while matter-to-energy model E = mc² is static • Linear and nonlinear models • Linear - Output is linear combination of input • Nonlinear - Otherwise • [Figure: Output vs. Input curves, (Linear) and (Non-Linear)]

Terminology: Types of Models (cont’d) • Open and closed models • Open - Input is external and independent • Closed - Model has no external inputs • Ex: if same jobs leave and re-enter queue then closed, while if new jobs enter system then open • [Figure: open and closed CPU queue models]

Terminology: Types of Models (cont’d) • Stable and unstable • Stable - Model output settles down • Unstable - Model output always changes

Computer System Models • Continuous time • Discrete state • Probabilistic • Dynamic • Nonlinear • Open or closed • Stable or unstable

Selecting a Language for Simulation • Four choices • 1. Simulation language • 2. General purpose • 3. Extension of a general purpose language • 4. Simulation package

Selecting a Language for Simulation (cont’d) • Simulation language – built-in facilities for time steps, event scheduling, data collection, reporting • General-purpose – known to developer, available on more systems, flexible • The major difference is the cost tradeoff (SL vs. GPL) • SL+: saves development time (if you know it), more time for system-specific issues, more readable code • SL-: requires startup time to learn • GPL+: Analyst's familiarity, availability, quick startup • GPL-: may require more time to add simulation facilities; other considerations: portability, flexibility • Recommendation: all analysts should learn one simulation language so that they understand those “costs” and can compare

Selecting a Language for Simulation • Extension of general-purpose – collection of routines and tasks commonly used. Often, base language with extra libraries that can be called • Simulation packages – allow definition of model in interactive fashion. Get results in one day • Tradeoff is in flexibility, where packages can only do what developer envisioned, but if that is what is needed then is quicker to do so • Examples: GASP (for FORTRAN) • Collection of routines to handle simulation tasks • Compromise for efficiency, flexibility, and portability. • Examples: QNET4 and RESQ • Input dialog • Library of data structures, routines, and algorithms • Big time savings • Inflexible ⇒ Simplification

Types of Simulation Languages • Continuous Simulation Languages • CSMP, DYNAMO • Differential equations • Used in chemical engineering • Discrete-event Simulation Languages • SIMULA and GPSS • Combined • SIMSCRIPT and GASP • Allow discrete, continuous, as well as combined simulations.

Types of Simulations 1. Emulation: Using hardware or firmware 2. Monte Carlo Simulation 3. Trace-Driven Simulation 4. Discrete Event Simulation

Types of Simulations (cont’d) • Emulation • Simulation that runs on a computer to make it appear to be something else • Examples: JVM, NIST Net • [Figure: layers, top to bottom: Java program, Java VM, Process, Operating System, Hardware]

Types of Simulation (cont’d) Monte Carlo method [Origin: after Count Montgomery de Carlo, Italian gambler and random-number generator (1792-1838).] A method of jazzing up the action in certain statistical and number-analytic environments by setting up a book and inviting bets on the outcome of a computation. - The Devil's DP Dictionary McGraw Hill (1981)

Monte Carlo Simulation • A static simulation has no time parameter • Runs until some equilibrium state is reached • Used to model physical phenomena, evaluate probabilistic systems, numerically estimate complex mathematical expressions • Driven by a random number generator • Hence “Monte Carlo” (after the casinos) simulation • Example: consider numerically determining the value of π • Area of circle = πr² = π for radius 1

Monte Carlo Simulation (cont’d) • Imagine throwing darts at a square • Random x in (0,1) • Random y in (0,1) • Count if inside • sqrt(x² + y²) < 1 • Compute ratio R • in / (in + out) • Can repeat as many times as needed to get arbitrary precision • Unit square has area 1 • Ratio of area in quarter circle to area in square = R • π = 4R
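
The dart-throwing procedure above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `estimate_pi` and the fixed seed are assumptions made here so that runs are reproducible.

```python
import random

def estimate_pi(n_darts, seed=12345):
    """Estimate pi by throwing n_darts uniformly random darts at the unit square."""
    rng = random.Random(seed)  # seed value is an arbitrary choice (assumption)
    inside = 0
    for _ in range(n_darts):
        x = rng.random()  # uniform in [0, 1)
        y = rng.random()
        # Same test as sqrt(x^2 + y^2) < 1, without the square root.
        if x * x + y * y < 1.0:
            inside += 1
    ratio = inside / n_darts  # R = in / (in + out)
    return 4.0 * ratio        # quarter circle has area pi/4, so pi = 4R

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, estimate_pi(n))
```

The statistical error of the estimate shrinks roughly as 1/sqrt(n), so each extra decimal digit costs about 100 times more darts.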

Monte Carlo Simulation (cont’d) • Evaluate the following integral • 1. Generate uniformly distributed x ~ Uniform(0,2) • 2. Density function f(x) = 1/2 iff 0 ≤ x ≤ 2 • 3. Compute:

Monte Carlo Simulation (cont’d) • Expected value for y:
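
The integrand itself did not survive in the slide text, so the sketch below assumes a stand-in integrand g(x) = exp(-x²) purely for illustration. The recipe follows the steps listed above: draw x ~ Uniform(0,2), whose density is f(x) = 1/2, compute y = g(x)/f(x) = 2·g(x), and use the sample mean of y as the estimate of the integral of g over (0,2). The names `mc_integral` and `g` are hypothetical.

```python
import math
import random

def g(x):
    """Stand-in integrand (assumption; the slide's own integrand is not available)."""
    return math.exp(-x * x)

def mc_integral(func, a, b, n, seed=2024):
    """Estimate the integral of func over (a, b) as the mean of y = func(x) / f(x)."""
    rng = random.Random(seed)  # arbitrary seed for reproducibility
    width = b - a              # 1 / f(x) for the Uniform(a, b) density
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += width * func(x)  # y = func(x) / f(x)
    return total / n              # sample mean estimates E[y], i.e. the integral

if __name__ == "__main__":
    exact = math.sqrt(math.pi) / 2.0 * math.erf(2.0)  # closed form for this g only
    print(mc_integral(g, 0.0, 2.0, 100_000), exact)
```

Because E[y] equals the integral, the same routine works for any integrand on a finite interval; only the variance of y, and hence the number of samples needed, changes.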

Trace-Driven Simulation • Uses a time-ordered record of events on a real system as input • Example: to compare memory management schemes, use a trace of page reference patterns as input to model and simulate page replacement algorithms • Note: the trace must be independent of the system under study • Example: a trace of disk events could not be used to study page replacement, since those events depend on the page replacement algorithm in use when the trace was recorded
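
As a rough illustration of the page-replacement example, the sketch below feeds one page reference trace to two replacement policies and counts page faults. The trace is a short made-up reference string, not a measured one, and the function names `fifo_faults` and `lru_faults` are assumptions of this sketch.

```python
from collections import OrderedDict, deque

def fifo_faults(trace, frames):
    """Count page faults under FIFO replacement with `frames` page frames."""
    resident = set()
    order = deque()  # arrival order of resident pages
    faults = 0
    for page in trace:
        if page in resident:
            continue
        faults += 1
        if len(resident) == frames:
            resident.remove(order.popleft())  # evict the oldest arrival
        resident.add(page)
        order.append(page)
    return faults

def lru_faults(trace, frames):
    """Count page faults under LRU replacement with `frames` page frames."""
    resident = OrderedDict()  # least recently used page is first
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if len(resident) == frames:
            resident.popitem(last=False)  # evict the least recently used
        resident[page] = True
    return faults

if __name__ == "__main__":
    trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]  # hypothetical reference string
    for frames in (3, 4):
        print(frames, fifo_faults(trace, frames), lru_faults(trace, frames))
```

Both policies see exactly the same input, which is what makes the comparison fair. The page reference trace does not depend on which replacement policy was running when it was recorded, so it satisfies the independence requirement noted above.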

Advantages of Trace-Driven Simulations 1. Credibility 2. Easy Validation: Compare simulation with measured 3. Accurate Workload: Models correlation and interference 4. Detailed Trade-Offs: Detailed workload ⇒ Can study small changes in algorithms 5. Less Randomness: Trace ⇒ deterministic input ⇒ Fewer repetitions 6. Fair Comparison: Better than random input 7. Similarity to the Actual Implementation: Trace-driven model is similar to the system ⇒ Can understand complexity of implementation

Disadvantages of Trace-Driven Simulations 1. Complexity: More detailed 2. Representativeness: Workload changes with time, equipment 3. Finiteness: Few minutes fill up a disk 4. Single Point of Validation: One trace = one point 5. Detail 6. Trade-Off: Difficult to change workload

Discrete Event Simulations • A simulation using a discrete state model of the system is a discrete event simulation • Continuous-event simulations – the state of the system takes continuous values • Typical components: • Event scheduler • Simulation Clock and a Time Advancing Mechanism • System State Variables • Event Routines • Input Routines • Report Generator • Initialization Routines • Trace Routines • Dynamic Memory Management • Main Program

Components of Discrete Event Simulations • Event scheduler – linked list of events waiting • Schedule event X at time T • Hold event X for interval dt • Cancel previously scheduled event X • Hold event X indefinitely until scheduled by other event • Schedule an indefinitely scheduled event • Note, event scheduler executed often, so has significant impact on performance • Simulation Clock and a Time Advancing Mechanism • Global variable representing simulated time (maintained by the scheduler) • Two approaches • Unit-time approach: increment time and check for events • Event-driven approach: move to the next event in queue
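
A minimal sketch of the scheduler operations and the event-driven time advance described above. The class and method names (`Scheduler`, `schedule`, `hold`, `cancel`, `next_event`) are assumptions of this sketch, and a sorted Python list stands in for the linked list. The "hold indefinitely" operation is not covered.

```python
import bisect
import itertools

class Scheduler:
    """Toy event scheduler: a time-ordered event list plus the simulation clock."""

    def __init__(self):
        self.now = 0.0                 # simulation clock, maintained by the scheduler
        self._events = []              # sorted list of (time, seq, name)
        self._seq = itertools.count()  # tie-breaker: FIFO order at equal times
        self._cancelled = set()

    def schedule(self, name, at):
        """Schedule event `name` at absolute time `at`; returns a handle."""
        if at < self.now:
            raise ValueError("cannot schedule an event in the past")
        entry = (at, next(self._seq), name)
        bisect.insort(self._events, entry)
        return entry

    def hold(self, name, dt):
        """Schedule event `name` after an interval `dt` from the current time."""
        return self.schedule(name, self.now + dt)

    def cancel(self, entry):
        """Cancel a previously scheduled event (it is skipped when reached)."""
        self._cancelled.add(entry)

    def next_event(self):
        """Event-driven advance: jump the clock straight to the next event."""
        while self._events:
            entry = self._events.pop(0)
            if entry in self._cancelled:
                self._cancelled.discard(entry)
                continue
            self.now = entry[0]
            return entry[2]
        return None

if __name__ == "__main__":
    sched = Scheduler()
    sched.schedule("arrival", at=2.0)
    timeout = sched.hold("timeout", dt=5.0)
    sched.schedule("departure", at=3.5)
    sched.cancel(timeout)
    while (name := sched.next_event()) is not None:
        print(sched.now, name)
```

A unit-time scheduler would instead increment `now` by a fixed step and check for due events at every step, which wastes work whenever events are sparse.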

Components of Discrete Events Sims (cont’d) • System State Variables • Global variables describing the state of the system (e.g., the number of jobs in a CPU scheduling simulation) • Local variables (e.g., CPU time required for a job is placed in the data structure for that particular job) • Event Routines -- one per event; update state variables and schedule other events • E.g., job arrivals, job scheduling, and job departure • Input Routines • Get model parameters (e.g., mean CPU time per job) from the user • Vary parameters in a range

Components of Discrete Events Sims (cont’d) • Report Generator • Output routines run at the end of the simulation • Initialization Routines • Set the initial state of the system state variables. Initialize seeds. • Trace Routines • Print out intermediate variables as the simulation proceeds • On/off feature • Dynamic Memory Management • New entities are created and old ones are destroyed • Periodic garbage collection • Main Program • Tie everything together
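
To show how the components fit together, here is a small single-server queue simulation. It is only a sketch: exponential interarrival and service times are an assumption made here (the slides do not fix a workload), Python's `heapq` holds the event set, and the name `simulate_queue` and its parameters are hypothetical.

```python
import heapq
import random

def simulate_queue(mean_interarrival, mean_service, end_time, seed=1, trace=False):
    # Initialization routine: seed, clock, state variables, first event.
    rng = random.Random(seed)
    clock = 0.0
    in_system = 0            # global state variable: jobs waiting or in service
    arrivals = departures = 0
    area = 0.0               # time integral of in_system, used by the report
    events = [(rng.expovariate(1.0 / mean_interarrival), "arrival")]

    # Main program: take the next event and run its event routine.
    while events and events[0][0] <= end_time:
        time, kind = heapq.heappop(events)
        area += in_system * (time - clock)
        clock = time         # event-driven time advance
        if kind == "arrival":
            arrivals += 1
            in_system += 1
            nxt = clock + rng.expovariate(1.0 / mean_interarrival)
            heapq.heappush(events, (nxt, "arrival"))
            if in_system == 1:   # server was idle: start service immediately
                done = clock + rng.expovariate(1.0 / mean_service)
                heapq.heappush(events, (done, "departure"))
        else:                    # departure
            departures += 1
            in_system -= 1
            if in_system > 0:    # start serving the next waiting job
                done = clock + rng.expovariate(1.0 / mean_service)
                heapq.heappush(events, (done, "departure"))
        if trace:                # trace routine with an on/off switch
            print(f"t={clock:.3f} {kind:9s} in_system={in_system}")

    # Report generator: runs once, at the end of the simulation.
    area += in_system * (end_time - clock)
    return {"arrivals": arrivals, "departures": departures,
            "left": in_system, "mean_in_system": area / end_time}

if __name__ == "__main__":
    print(simulate_queue(2.0, 1.0, 10_000.0))
```

Python's garbage collector plays the role of dynamic memory management here; in a language without one, event notices would have to be allocated and freed explicitly.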

Event-Set Algorithms • Event Set = Ordered linked list of future event notices • Insert vs. Execute next • 1. Ordered Linked List: SIMULA, GPSS, and GASP IV • Search from left or from right • [Figure: doubly linked list from Head to Tail; each event notice (Event 1, Event 2, ..., Event n) has Next and Previous pointers and refers to the code for that event]
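
A minimal sketch of an ordered doubly linked event list with insertion that searches from the tail. The class names `Notice` and `LinkedEventSet` are assumptions of this sketch and are not taken from SIMULA, GPSS, or GASP IV.

```python
class Notice:
    """One future event notice in the doubly linked list."""

    def __init__(self, time, name):
        self.time = time
        self.name = name
        self.prev = None
        self.next = None

class LinkedEventSet:
    def __init__(self):
        # Sentinel head and tail nodes simplify insertion and removal.
        self.head = Notice(float("-inf"), "HEAD")
        self.tail = Notice(float("inf"), "TAIL")
        self.head.next = self.tail
        self.tail.prev = self.head

    def insert(self, time, name):
        """Search from the tail (right) for the last notice with time <= `time`."""
        node = self.tail.prev
        while node is not self.head and node.time > time:
            node = node.prev
        new = Notice(time, name)
        new.prev, new.next = node, node.next
        node.next.prev = new
        node.next = new

    def pop_next(self):
        """Remove and return the earliest notice as (time, name), or None if empty."""
        first = self.head.next
        if first is self.tail:
            return None
        self.head.next = first.next
        first.next.prev = self.head
        return first.time, first.name

if __name__ == "__main__":
    es = LinkedEventSet()
    for t, n in [(5.0, "c"), (1.0, "a"), (3.0, "b")]:
        es.insert(t, n)
    print(es.pop_next(), es.pop_next(), es.pop_next())
```

Executing the next event is O(1), while an insert may walk the whole list. Searching from the tail tends to pay off when newly scheduled events usually lie later than most pending ones.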

Event-Set Algorithms (cont’d) • 2. Indexed Linear List • Array of indexes ⇒ No search to find the sub-list • Fixed or variable Δt. Only the first list is kept sorted • [Figure: index array with entries t, t+Δt, ..., t+nΔt, each pointing to the Head and Tail of its own sub-list]
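
One possible reading of the indexed linear list, sketched with a fixed Δt. Events are filed by time window into an array of sub-lists, only the first sub-list is kept sorted, and later windows are sorted when they become current. The class name `IndexedEventSet`, the overflow list for events beyond the last window, and the start of the time axis at 0 are all assumptions of this sketch.

```python
import bisect
import math

class IndexedEventSet:
    def __init__(self, dt, n_buckets):
        self.dt = dt
        self.first = 0   # number of the time window held in buckets[0]
        self.buckets = [[] for _ in range(n_buckets)]
        self.overflow = []   # events later than the last indexed window

    def insert(self, time, name):
        # Direct index computation: no search is needed to find the sub-list.
        idx = math.floor(time / self.dt) - self.first
        if idx < 0:
            raise ValueError("event lies before the current window")
        if idx >= len(self.buckets):
            self.overflow.append((time, name))
        elif idx == 0:
            bisect.insort(self.buckets[0], (time, name))  # first list stays sorted
        else:
            self.buckets[idx].append((time, name))        # later lists are unsorted

    def pop_next(self):
        """Remove and return the earliest event as (time, name), or None if empty."""
        while not self.buckets[0]:
            if not any(self.buckets) and not self.overflow:
                return None
            self._advance()
        return self.buckets[0].pop(0)

    def _advance(self):
        # Slide the index by one window and sort the list that becomes first.
        self.buckets.pop(0)
        self.buckets.append([])
        self.first += 1
        self.buckets[0].sort()
        pending, self.overflow = self.overflow, []
        for time, name in pending:   # re-file events that may now be in range
            self.insert(time, name)

if __name__ == "__main__":
    es = IndexedEventSet(dt=1.0, n_buckets=4)
    for t in (2.5, 0.7, 0.2, 9.1, 2.1, 3.9):
        es.insert(t, f"event@{t}")
    while (ev := es.pop_next()) is not None:
        print(ev)
```

Most inserts are a constant-time append, and the sorting cost is deferred until a window becomes current.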

Event-Set Algorithms (cont’d) • 3. Tree Structures: Binary tree ⇒ log₂ n • Special case: Heap: Event is a node in binary tree • [Figure: (a) Tree representation of a heap.]
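
A heap-based event set can be sketched with Python's standard `heapq` module, which keeps a binary heap in a plain list. The wrapper class `HeapEventSet` and the sequence-number tie-breaker are assumptions of this sketch.

```python
import heapq
import itertools

class HeapEventSet:
    """Event set stored as a binary min-heap keyed on event time."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO order at equal times

    def insert(self, time, name):
        # O(log n): the new notice sifts up from a leaf toward the root.
        heapq.heappush(self._heap, (time, next(self._seq), name))

    def pop_next(self):
        """Remove and return the earliest event as (time, name), or None if empty."""
        if not self._heap:
            return None
        # O(log n): the last leaf replaces the root and sifts down.
        time, _, name = heapq.heappop(self._heap)
        return time, name

    def __len__(self):
        return len(self._heap)

if __name__ == "__main__":
    es = HeapEventSet()
    for t, n in [(28, "c"), (15, "a"), (19, "b")]:
        es.insert(t, n)
    print(es.pop_next(), es.pop_next(), es.pop_next())
```

Both insert and execute-next cost about log₂ n steps, so neither operation degrades to a linear scan as the number of pending events grows.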

Summary • Common Mistakes: Detail, Invalid, Short • Discrete Event, Continuous time, nonlinear models • Monte Carlo Simulation: Static models • Trace driven simulation: Credibility, difficult trade-offs • Event-Set Algorithms: Linked list, indexed linear list, heaps

Analysis of Simulation Results

Overview • Analysis of Simulation Results • Model Verification Techniques • Model Validation Techniques • Transient Removal • Terminating Simulations • Stopping Criteria: Variance Estimation • Variance Reduction

Model Verification vs. Validation • The model output should be close to that of real system • Make assumptions about behavior of real systems • 1st step, test if assumptions are reasonable • Validation, or representativeness of assumptions • 2nd step, test whether model implements assumptions • Verification, or correctness • Four Possibilities • Unverified, Invalid • Unverified, Valid • Verified, Invalid • Verified, Valid

Model Verification Techniques • Top Down Modular Design • Anti-bugging • Structured Walk-Through • Deterministic Models • Run Simplified Cases • Trace • On-Line Graphic Displays • Continuity Test • Degeneracy Tests • Consistency Tests • Seed Independence

Top Down Modular Design • Divide and Conquer • Modules = Subroutines, Subprograms, Procedures • Modules have well defined interfaces • Can be independently developed, debugged, and maintained • Top-down design ⇒ Hierarchical structure ⇒ Modules and sub-modules

Top Down Modular Design (cont’d) Computer Network Simulator for Congestion Control studies

Top Down Modular Design (cont’d)

Verification Techniques • Anti-bugging: Include self-checks • Σ Probabilities = 1 • Jobs left = Generated - Serviced • Structured Walk-Through • Explain the code to another person or group • Works even if the person is sleeping • Deterministic Models: Use constant values • Run Simplified Cases • Only one packet • Only one source • Only one intermediate node
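
The two anti-bugging checks named above can be written as small self-check routines that a simulator calls at convenient points. This is a sketch only; the function names and the tolerance are assumptions.

```python
import math

def check_probabilities(probs, tol=1e-9):
    """Self-check: every probability lies in [0, 1] and they sum to 1."""
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise AssertionError(f"probability out of range: {probs}")
    total = math.fsum(probs)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise AssertionError(f"probabilities sum to {total}, expected 1")

def check_job_balance(generated, serviced, left):
    """Self-check: jobs left in the system = generated - serviced."""
    if left != generated - serviced:
        raise AssertionError(
            f"job balance violated: {left} != {generated} - {serviced}")

if __name__ == "__main__":
    check_probabilities([0.2, 0.3, 0.5])  # passes silently
    check_job_balance(generated=100, serviced=97, left=3)
    print("self-checks passed")
```

Checks like these cost little at run time and tend to catch bookkeeping bugs close to where they occur, rather than as unexplained results at the end of a run.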

Verification Techniques (cont’d) • Trace = Time-ordered list of events and variables • Several levels of detail • Events trace • Procedure trace • Variables trace • User selects the detail • Include on and off
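
A trace facility with user-selectable detail and an on/off switch might look like the following sketch. The class name `Tracer`, the three level names, and the output format are assumptions chosen to mirror the list above.

```python
class Tracer:
    """Time-ordered trace output with per-level on/off switches."""

    LEVELS = ("event", "procedure", "variable")

    def __init__(self, enabled=(), sink=print):
        self.enabled = set()
        self.sink = sink   # where trace lines go; print by default
        for level in enabled:
            self.on(level)

    def on(self, level):
        if level not in self.LEVELS:
            raise ValueError(f"unknown trace level: {level}")
        self.enabled.add(level)

    def off(self, level):
        self.enabled.discard(level)

    def emit(self, level, clock, message):
        """Write one trace line if tracing for `level` is switched on."""
        if level in self.enabled:
            self.sink(f"{clock:10.3f} [{level}] {message}")

if __name__ == "__main__":
    tracer = Tracer(enabled=["event"])
    tracer.emit("event", 1.5, "arrival of job 1")
    tracer.emit("variable", 1.5, "queue_length=1")   # suppressed: level is off
    tracer.on("variable")
    tracer.emit("variable", 2.0, "queue_length=0")
```

With every level switched off, each `emit` call reduces to a single set lookup, so the trace calls can stay in the code during production runs.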

Verification Techniques (cont’d) • On-Line Graphic Displays • Make simulation interesting • Help sell the results • More comprehensive than trace
