Experiments are used to evaluate changes to your models by running one or more executable binaries (i.e., different model versions) and comparing their results. Experimentation is a key part of developing a good model, and Nextmv’s goal is to make experiments easy to run so you can focus on improving your model.
Nextmv Platform provides a suite of products for creating and managing different types of experiments. There are five types of experiments: scenario, batch, acceptance, shadow, and switchback. In addition, input sets and managed inputs organize the data used for experiments. Experiments are always created and managed in the context of an application; that is, each application has its own set of experiments (that you have created). See the Apps core concepts page for more information about applications.
Experiments and input sets can be created and managed with the Nextmv CLI, Nextmv Console, or the HTTP API endpoints. Created experiments are saved and can be accessed at any time. After an experiment has started, its results are aggregated and can be retrieved with the same tools. When viewing the results of an experiment, Console provides a visual interpretation, while the API and Nextmv CLI return the raw JSON.
The different types of experiments and input sets are summarized below.
Types of experiments
Scenario
Scenario tests compare the output from one or more scenarios. A scenario is composed of a model version, a collection of inputs, and any specific configuration that should be applied to the runs for that scenario. You can also configure repetitions to test for variability in the results.
You can use scenario tests as a way to explore impacts to business metrics (KPIs) based on model updates, different conditions (e.g. low demand vs. high demand), parameter tuning, and more. You can also use scenario tests as a way to validate that a model is ready for further testing and likely to make an intended business impact.
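As a concrete illustration, a scenario definition could be sketched as a small data structure that bundles a model version, a set of inputs, run options, and a repetition count. The field names below are assumptions for illustration, not Nextmv's official scenario test schema.

```python
# Illustrative sketch only: field names are assumptions, not the
# official Nextmv scenario test schema.
import json

def make_scenario(scenario_id, version_id, input_ids, options=None, repetitions=0):
    """Bundle a model version, inputs, and run options into one scenario."""
    return {
        "scenario_id": scenario_id,
        "version_id": version_id,
        "input_ids": list(input_ids),
        "options": options or {},    # run configuration for this scenario
        "repetitions": repetitions,  # extra runs to measure variability
    }

# Compare a low-demand and a high-demand scenario for the same version,
# with repetitions on the high-demand case to probe result variability.
scenarios = [
    make_scenario("low-demand", "v1.2.0", ["input-low-1", "input-low-2"]),
    make_scenario(
        "high-demand", "v1.2.0", ["input-high-1"],
        options={"solve.duration": "30s"}, repetitions=2,
    ),
]
print(json.dumps(scenarios, indent=2))
```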
Batch
Batch experiments are used to analyze the output from one or more decision models. They are generally used as an exploratory test to understand the impacts to business metrics (or KPIs) when updating a model with a new feature, such as an additional constraint. They can also be used to validate that a model is ready for further testing — and likely to make an intended business impact.
See the batch experiment reference guide for more information on batch experiments.
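A batch experiment request could look roughly like the payload below: an identifier, a description, an input set, and the instances (models) to compare. This is a hypothetical payload shape; the real HTTP API schema may differ, so consult the batch experiment reference guide for the authoritative fields.

```python
# Hypothetical payload shape for creating a batch experiment; the real
# Nextmv HTTP API schema may differ -- see the batch experiment reference.
import json

payload = {
    "id": "constraint-compare",
    "name": "Compare new capacity constraint",
    "description": "Baseline vs. version with the capacity constraint",
    "input_set_id": "weekly-inputs",            # inputs every model runs against
    "instance_ids": ["baseline", "candidate"],  # the models being compared
}
body = json.dumps(payload)
```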
Acceptance
Acceptance tests build on the core concept of a batch test, focusing on evaluating the differences between exactly two models and assigning a pass/fail label based on predefined thresholds. They are used to verify whether business or operational requirements (e.g., KPIs and OKRs) are being met. An acceptance test runs an existing production model and a new, updated model against a set of test data; you then review the results and determine whether the new model is acceptable based on criteria identified beforehand.
See the acceptance tests reference guide for more information on acceptance tests.
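The pass/fail idea can be sketched as a small function that compares two models' metrics against predefined criteria. The metric names and comparison rules here are illustrative, not Nextmv's acceptance test schema.

```python
# Sketch of the acceptance-test idea: compare a baseline and a candidate
# on predefined metrics and assign an overall pass/fail. Metric names and
# rules are illustrative, not Nextmv's schema.

def evaluate_acceptance(baseline, candidate, criteria):
    """criteria maps metric name -> "le" or "ge": the candidate must be
    <= or >= the baseline on that metric to pass it."""
    results = {}
    for metric, rule in criteria.items():
        b, c = baseline[metric], candidate[metric]
        results[metric] = (c <= b) if rule == "le" else (c >= b)
    return all(results.values()), results

baseline = {"total_cost": 1000.0, "on_time_rate": 0.92}
candidate = {"total_cost": 950.0, "on_time_rate": 0.95}
passed, detail = evaluate_acceptance(
    baseline, candidate, {"total_cost": "le", "on_time_rate": "ge"}
)
print(passed)  # True: candidate is cheaper and more on time
```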
Shadow
A shadow test is an experiment that runs in the background and compares the results of a baseline instance against a candidate instance. When the shadow test has started, any run made on the baseline instance will trigger a run on the candidate instance using the same input and options. The results of the shadow test are often used to determine if a new version of a model is ready to be promoted to production.
Shadow tests can be created using the Nextmv CLI, Nextmv Console, or the HTTP API. See the shadow test reference guide for more information on shadow tests.
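The mirroring mechanic can be sketched as follows: every baseline run triggers a candidate run with the same input and options, while production still serves the baseline result. The `run_model` function and the logged tuple shape are stand-ins, not a real Nextmv API.

```python
# Minimal sketch of the shadow-test mechanic: each baseline run is
# mirrored on the candidate with identical input and options. The
# run_model function below is a placeholder, not a real solver call.

def run_model(instance, input_data, options):
    # Placeholder for an actual model run; returns a fake result.
    return {"instance": instance, "input": input_data, "options": options}

shadow_log = []  # collected (baseline, candidate) result pairs

def run_with_shadow(input_data, options):
    baseline_result = run_model("baseline", input_data, options)
    # Mirror the exact same input and options onto the candidate.
    candidate_result = run_model("candidate", input_data, options)
    shadow_log.append((baseline_result, candidate_result))
    return baseline_result  # production still serves the baseline answer

result = run_with_shadow({"stops": 12}, {"solve.duration": "10s"})
```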
Switchback
Switchback tests for decision models let algorithm teams compare the performance of a candidate model against a baseline model using production data and conditions. While the models continue to make operational decisions, the candidate treatment is randomized over units of time.
Switchback tests are related to general A/B tests, but they are not the same. Switchback tests allow you to account for network effects, whereas A/B tests do not.
Switchback tests can be created using the Nextmv Console or the HTTP API. See the switchback test reference guide for more information on switchback tests.
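The time-unit randomization can be illustrated with a toy plan: time is divided into fixed units, and each unit is wholly assigned to either the baseline or the candidate, so network effects within a unit affect only one treatment. The unit length and assignment method below are illustrative, not Nextmv's implementation.

```python
# Toy illustration of switchback randomization: each time unit is
# randomly assigned one treatment. This is not Nextmv's implementation,
# just a sketch of the idea described above.
import random

def assign_switchback_plan(n_units, seed=42):
    """Randomly assign each time unit to baseline or candidate."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [rng.choice(["baseline", "candidate"]) for _ in range(n_units)]

# A day split into 2-hour units -> 12 units, each wholly one treatment.
plan = assign_switchback_plan(12)
print(plan)
```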
Managed inputs
Managed inputs are input data that you upload and manage directly on the platform. You can create a managed input from an uploaded file or by referencing a previous run. Managed inputs can be created and managed with Nextmv CLI, the Python SDK, Nextmv Console, or the HTTP API endpoints.
Input sets
Input sets are defined sets of inputs to use for an experiment. You can create input sets with Nextmv CLI, the Python SDK, Nextmv Console, or the HTTP API endpoints.
An input set can be composed of managed inputs, inputs from prior runs (by referencing run IDs), or inputs gathered from a date range and instance ID. Note that the maximum number of inputs allowed in an input set is 20.
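Assembling an input set from run IDs, with the 20-input maximum enforced, might look like the sketch below. The field names are illustrative; only the limit of 20 comes from the description above.

```python
# Sketch of building an input set from run IDs, enforcing the 20-input
# maximum noted above. Field names are illustrative, not Nextmv's schema.

MAX_INPUTS = 20  # maximum number of inputs allowed in an input set

def make_input_set(set_id, run_ids):
    if len(run_ids) > MAX_INPUTS:
        raise ValueError(f"input set may contain at most {MAX_INPUTS} inputs")
    return {"id": set_id, "run_ids": list(run_ids)}

input_set = make_input_set("week-14", [f"run-{i}" for i in range(20)])
```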
Custom metrics (statistics convention)
It is often useful to define custom metrics to evaluate the results of an experiment. Custom metrics are defined as part of the run output in the metrics or statistics field.
The metrics field is a flexible JSON object and can be fully customized to your needs. For more information on custom metrics, see the metrics reference guide.
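A run output carrying custom metrics might be shaped as below. The nesting under `statistics` follows the general convention described above, but the exact field names (such as `result` and `custom`) are assumptions here; see the metrics reference guide for the authoritative schema.

```python
# Hedged example of emitting custom metrics in the run output. The exact
# field names under "statistics" are assumptions -- consult the metrics
# reference guide for the authoritative schema.
import json

output = {
    "solution": {},  # the decision itself
    "statistics": {
        "result": {
            "value": 1234.5,  # headline objective value
            "custom": {       # free-form metrics you define yourself
                "unassigned_stops": 3,
                "max_route_duration_s": 5400,
            },
        },
    },
}
print(json.dumps(output))
```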
Review the results
After running an experiment from the CLI, navigate to the Nextmv Console to view the results of your experiment comparing the models. Note that large experiments may take some time to complete, so you may need to check back later to view results.
Within the Nextmv Console, you'll find your experiment under the Experiments section.