nextmv Docs

Simulation Best Practices

"There is a difference between knowing the path and walking the path."

-- Morpheus

Dash is a discrete event simulator inspired by the event-oriented dynamics of many modern software systems. Increasingly, complex systems are composed of small, decoupled components that communicate by publishing and subscribing to events. Dash makes these systems easy to model, so simulations behave more like the environments they are intended to represent.

Dash has a lot in common with Hop, from the way it reads and writes data to its environment variables and runners. Because simulation differs from optimization in some key ways, a few best practices are specific to Dash.

Terminating Simulations

By default, a Dash simulation will run until it has no more actors scheduled. There are some circumstances, such as the single-server queue example, where it makes sense to have an actor run forever. These simulations can be terminated using a duration limit for simulated time. Do this using either the DASH_SIMULATOR_LIMITS_DURATION environment variable or the -dash.simulator.limits.duration command-line flag.
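To illustrate why a duration limit is needed for an actor that always asks to run again, here is a minimal sketch of a toy single-actor loop. It is not Dash's scheduler: the ticker type and the runUntil and simulate functions are hypothetical, and the sketch assumes the time returned by Run is the next time the actor wants to run.

```go
package main

import (
	"fmt"
	"time"
)

// actor mirrors the Run signature used in this guide. Treating the returned
// time as the actor's next run time is an assumption of this sketch.
type actor interface {
	Run(now time.Time) (time.Time, bool)
}

// ticker is a hypothetical actor that always asks to run again.
type ticker struct {
	interval time.Duration
	runs     int
}

func (t *ticker) Run(now time.Time) (time.Time, bool) {
	t.runs++
	return now.Add(t.interval), true
}

// runUntil is a toy loop, not Dash's scheduler. It stops when the actor
// declines to run again or when its next run would fall after the limit.
// It returns the simulated time of the last run.
func runUntil(a actor, start time.Time, limit time.Duration) time.Time {
	now, end := start, start.Add(limit)
	for {
		next, again := a.Run(now)
		if !again || next.After(end) {
			return now
		}
		now = next
	}
}

// simulate runs a 30-minute ticker under a 2-hour limit and reports the
// number of runs and the simulated time that elapsed.
func simulate() (int, time.Duration) {
	start := time.Date(2020, 1, 1, 9, 0, 0, 0, time.UTC)
	tk := &ticker{interval: 30 * time.Minute}
	last := runUntil(tk, start, 2*time.Hour)
	return tk.runs, last.Sub(start)
}

func main() {
	runs, elapsed := simulate()
	fmt.Println(runs, elapsed)
}
```

Without the limit check, the loop would never return, because the ticker never reports false.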

Updating Actor State

An actor in Dash is any type that implements a Run method. If Run returns a boolean true value, then Dash schedules it to run again in the simulation.

Actors typically maintain their own internal states using struct attributes. Thus, they are often loaded from JSON input and referred to in method receivers as pointers. This allows them to mutate their state during calls to Run and in response to events.

For example, the customer actors of the single-server queue example are unmarshaled from JSON directly into pointers by the CLI runner.

func main() {
	cli.Run(
		func(customers []*customer, opt sim.Options) (sim.Simulator, error) {
			// Customers can mutate their state in the simulation.
		},
	)
}
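The effect of decoding into pointers can be sketched with the standard library alone. The example below assumes a hypothetical customer type with made-up field names and uses encoding/json directly rather than the CLI runner, whose decoding details are not shown in this guide.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// customer is a hypothetical input type; the field names are assumptions.
type customer struct {
	ID     int  `json:"id"`
	Served bool `json:"served"`
}

// load decodes a JSON array into a slice of customer pointers.
func load(data []byte) ([]*customer, error) {
	var customers []*customer
	if err := json.Unmarshal(data, &customers); err != nil {
		return nil, err
	}
	return customers, nil
}

func main() {
	customers, err := load([]byte(`[{"id": 1}, {"id": 2}]`))
	if err != nil {
		panic(err)
	}
	// Mutating through the pointer updates the customer the slice refers to.
	c := customers[0]
	c.Served = true
	fmt.Println(customers[0].Served, customers[1].Served)
}
```

Had the slice held customer values instead of pointers, the assignment through c would have changed only a copy.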

Similarly, their methods are defined with pointer receivers, so state can be updated and stored in the simulation.

func (c *customer) Run(now time.Time) (time.Time, bool) {
	// Run changes customer state.
}
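The difference between the two receiver kinds is a general Go behavior and can be seen in isolation. This is a minimal sketch with a hypothetical counter type; it does not use any Dash types.

```go
package main

import "fmt"

// counter is a hypothetical type used only to compare receiver kinds.
type counter struct{ runs int }

// byValue operates on a copy, so the caller's counter is unchanged.
func (c counter) byValue() { c.runs++ }

// byPointer operates on the original, so the change persists.
func (c *counter) byPointer() { c.runs++ }

func main() {
	c := counter{}
	c.byValue()
	fmt.Println(c.runs) // still 0: only the copy was incremented
	c.byPointer()
	fmt.Println(c.runs) // now 1: the original was incremented
}
```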

Randomizing Data

Introducing randomness into a simulation is a good way to bound estimates of important measures, as well as stress test your models. Dash makes it easy to set an arbitrary random seed to use for creating random values while running a simulation, via the -dash.simulator.random.seed command-line flag.

To ease the task further, bash and zsh provide a $RANDOM variable, which expands to a new random integer between 0 and 32767 each time it is referenced. We can use it to introduce some randomness into a Dash simulation as follows:

./dash-sim -dash.runner.input.path input.json \
    -dash.simulator.random.seed $RANDOM

Using this method, a new random seed will be used each time the simulation is run. Note that we will still need to encode randomness into our simulation using Go's math/rand package. The random seed will be recorded in the options section of Dash's output.
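As a sketch of what encoding randomness with math/rand can look like, the example below draws service times from a seeded generator. The draws function and the idea of service times in minutes are hypothetical, and how the seed value reaches your code from Dash's options is not shown here; the sketch simply takes it as an int64 argument.

```go
package main

import (
	"fmt"
	"math/rand"
)

// draws returns n pseudo-random service times in minutes, each in [1, 10],
// generated from the given seed.
func draws(seed int64, n int) []int {
	rng := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = 1 + rng.Intn(10)
	}
	return out
}

func main() {
	fmt.Println(draws(42, 5))
	fmt.Println(draws(42, 5)) // same seed, same sequence
	fmt.Println(draws(7, 5))  // a different seed starts a different sequence
}
```

Using a dedicated generator built from the seed, rather than the package-level functions, keeps a run reproducible from the seed recorded in the output.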

Event & Measure Levels

Dash uses one event ledger for publishing and subscribing to actor events, and another for recording measures. For many use cases, the Publish and Subscribe methods provided by Dash's event ledger work quite well. However, when simulations produce many events or measures, they can become too verbose.

Events and measures can also be ascribed a level, similar to the levels of many popular logging systems. To use these levels, merely substitute PublishLevel and SubscribeLevel for calls to Publish and Subscribe. The dash/sim/log package provides the following levels:

  • All
  • Trace
  • Debug
  • Info
  • Warn

Levels further down this list (which have higher values) are more important. As in other log leveling systems, subscribing at a level (e.g. Info) means you receive every message that is at least as important as that level (Info, Warn).

The Publish and Subscribe ledger methods use level All.
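The filtering rule can be sketched independently of Dash. The level type, its numeric values, and the filter function below are hypothetical stand-ins, not the dash/sim/log package; they only assume that more important levels have higher values.

```go
package main

import "fmt"

// level is a stand-in for a log level. The numeric values are assumptions,
// chosen so that more important levels have higher values.
type level int

const (
	all level = iota
	trace
	debug
	info
	warn
)

// message is a hypothetical leveled event.
type message struct {
	level level
	text  string
}

// filter keeps the messages that are at least as important as threshold.
func filter(msgs []message, threshold level) []string {
	var out []string
	for _, m := range msgs {
		if m.level >= threshold {
			out = append(out, m.text)
		}
	}
	return out
}

func main() {
	msgs := []message{
		{all, "service started"},
		{info, "customer arrived"},
		{warn, "queue is long"},
	}
	fmt.Println(filter(msgs, info)) // info and warn messages only
	fmt.Println(filter(msgs, all))  // everything
}
```

Under this rule, a message published at level All is visible only to subscribers that also use level All.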

A command-line flag and its associated environment variable restrict the output to measures or events at a certain level. For instance, in the customer.go file of the queue example, we can change the Run method to use:

if c.arrivalTime == nil {
	c.arrivalTime = &now
	c.events.PublishLevel(log.Info, arrival(*c))
}

Recompiling the example and running the command below will print nothing, because no event is published at the Warn level:

./queue -dash.runner.input.path input.json \
    -dash.runner.output.events \
    -dash.simulator.limits.duration 2h \
    -dash.runner.output.level warn | \
    jq .events

Lowering the output level to trace will print only the arrival events, since the remaining events are published at level All:

./queue -dash.runner.input.path input.json \
    -dash.runner.output.events \
    -dash.simulator.limits.duration 2h \
    -dash.runner.output.level trace | \
    jq .events

The same functionality can be applied to measures using the corresponding environment variable or command-line flag.

Warmup Periods

Events and measures logged at the very beginning of a simulation may not always represent reality, particularly if the system has not reached a steady state. For example, initializing all actors to "available" at simulation start may produce overly optimistic events and measures for a mid-day simulation. In other words, actors may accomplish tasks faster in simulated time than they would in reality. For these scenarios, we recommend specifying a warmup duration and excluding warmup messages from the output. This allows the simulation to ramp up to a steady state without impacting the fidelity of the events and measures.
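Excluding warmup messages amounts to dropping anything stamped before the end of the warmup period. The sketch below shows that idea on plain data; the record type and the afterWarmup and sample functions are hypothetical, and it does not show how Dash itself is configured with a warmup duration.

```go
package main

import (
	"fmt"
	"time"
)

// record is a hypothetical timestamped measure.
type record struct {
	when  time.Time
	value float64
}

// afterWarmup drops the records stamped before start+warmup.
func afterWarmup(recs []record, start time.Time, warmup time.Duration) []record {
	cutoff := start.Add(warmup)
	var kept []record
	for _, r := range recs {
		if !r.when.Before(cutoff) {
			kept = append(kept, r)
		}
	}
	return kept
}

// sample builds three records at 0, 20, and 40 minutes after the start.
func sample() ([]record, time.Time) {
	start := time.Date(2020, 1, 1, 12, 0, 0, 0, time.UTC)
	recs := []record{
		{start, 1.0},
		{start.Add(20 * time.Minute), 4.0},
		{start.Add(40 * time.Minute), 5.0},
	}
	return recs, start
}

func main() {
	recs, start := sample()
	kept := afterWarmup(recs, start, 30*time.Minute)
	fmt.Println(len(kept), "of", len(recs), "records kept")
}
```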