
Batch experiments

A tutorial for creating and managing batch experiments on Nextmv Cloud.

Batch experiments are offline tests used to analyze the output from one or more decision models on a fixed input set. They are generally used as an exploratory test to understand the impact on business metrics (or KPIs) when updating a model with a new feature, such as an additional constraint. They can also be used to validate that a model is ready for further testing and is likely to make the intended business impact.

In short, a batch experiment takes the inputs from the input set and the specified instances and performs a run for each input-instance pair. The results are then aggregated and analyzed. The decision model must produce an output that contains a statistics block following the statistics convention.
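
As a point of reference, the sketch below (in Python, with made-up numbers) shows the general shape of an output containing a statistics block. The nested field names mirror the indicator keys referenced later on this page (result.value, result.elapsed, result.custom.*); the exact layout and the schema value are assumptions, so consult the statistics convention reference for the authoritative format.

import json

# Sketch of a model output with a top-level statistics block.
# Values and the "schema" identifier are assumptions for illustration.
output = {
    "solution": {},  # the model's solution payload goes here
    "statistics": {
        "schema": "v1",  # assumed schema identifier
        "result": {
            "value": 2175064.0,  # objective value, surfaced as result.value
            "elapsed": 4.9,  # solve time in seconds, surfaced as result.elapsed
            "custom": {"used_vehicles": 12},  # custom KPIs, e.g. result.custom.used_vehicles
        },
    },
}

print(json.dumps(output, indent=2))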

Batch experiments are designed to be visualized in the Console web interface: go to your app and open the Experiments > Batch tab.

Batch experiment

A batch experiment needs an input set to run.

There are several interfaces for creating batch experiments: Console, Nextmv CLI, Python SDK, and Cloud API. Each is described below.

In addition to an input_set_id and instance_ids, a batch experiment can also specify runs and option_sets.

  • runs. A list of objects.

    "runs": [
      {
        "instance_id": "string",
        "version_id": "string",
        "option_set": "string",
        "input_id": "string"
      }
    ]
    

    Without runs, the batch experiment will run all of the inputs in the input set on all of the instances. With runs, the batch experiment will only run the specified inputs on the specified instances. The runs list can also be used to run the same input multiple times on specific instances or versions (see the sketch after this list).

    A run can specify an option_set to be used for that run.

  • option_sets. A dictionary of objects.

    "option_sets": {
      "option_set_id": {
        "string": "string"
      }
    }
    

    option_sets is a dictionary of options that will be used for a run. An option set can be referenced in the runs list via its option_set_id. This can be useful, for example, if:

    • You want to compare the results of different solvers that have conflicting options. E.g. one uses solve.duration for the maximum runtime while the other uses duration.
    • You want to run specific inputs with different options within the same experiment.
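
To make the points above concrete, the sketch below (a Python dict mirroring the JSON payload, with a made-up input ID) runs the same input twice on one instance with different option sets and once on a second instance.

# Hypothetical runs/option_sets fragment; IDs are made up for illustration.
experiment_fragment = {
    "option_sets": {
        "long_running": {"solve.duration": "10s"},
        "short_running": {"solve.duration": "5s"},
    },
    "runs": [
        # Same input, same instance, different options.
        {"instance_id": "latest", "input_id": "input-abc123", "option_set": "long_running"},
        {"instance_id": "latest", "input_id": "input-abc123", "option_set": "short_running"},
        # Same input on a second instance.
        {"instance_id": "latest-2", "input_id": "input-abc123", "option_set": "short_running"},
    ],
}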

Console

Go to the Console web interface, and open your app. Go to the Experiments > Batch tab. Click on New Experiment. Fill in the fields.


Nextmv CLI

Define the desired batch experiment ID and name, then create the batch experiment.

nextmv experiment batch start \
    --app-id $APP_ID \
    --experiment-id $EXPERIMENT_ID \
    --name $EXPERIMENT_ID \
    --input-set-id $INPUT_SET_ID \
    --instance-ids "latest,latest-2" \
    --description "An optional description" \
    --confirm

Alternatively, you can use a JSON file that contains all of the information needed to start a batch experiment. The file must be valid JSON and match the payload of the POST request used with the Cloud API.

Using the definition file, you can fine-tune the batch experiment by specifying runs and option_sets, for example.

Consider this example using a JSON file named batch-experiment.json.

{
  "id": "experiment-1",
  "name": "experiment-1",
  "description": "An optional description",
  "input_set_id": "input-set-1",
  "option_sets": {
    "long_running": {
      "solve.duration": "10s"
    },
    "short_running": {
      "solve.duration": "5s"
    }
  },
  "runs": [
    {
      "instance_id": "latest",
      "input_id": "latest-RNBs7AKSg",
      "option_set": "long_running"
    },
    {
      "instance_id": "latest-2",
      "input_id": "latest-RNBs7AKSg",
      "option_set": "short_running"
    },
    {
      "instance_id": "latest",
      "input_id": "latest-fK_wZ0FSg",
      "option_set": "long_running"
    },
    {
      "instance_id": "latest-2",
      "input_id": "latest-fK_wZ0FSg",
      "option_set": "short_running"
    }
  ]
}


Python SDK

Define the desired batch experiment ID and name, then create the batch experiment.

import os

from nextmv.cloud import Application, Client

client = Client(api_key=os.getenv("NEXTMV_API_KEY"))
app = Application(client=client, id=os.getenv("APP_ID"))
batch_experiment_id = app.new_batch_experiment(
    id=os.getenv("BATCH_EXPERIMENT_ID"),
    name=os.getenv("BATCH_EXPERIMENT_ID"),
    input_set_id=os.getenv("INPUT_SET_ID"),
    instance_ids=["latest", "latest-2"],
    description="An optional description",
)
print(batch_experiment_id)

Defining runs and option_sets is supported as well.
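
The following sketch assumes new_batch_experiment accepts option_sets and runs keyword arguments shaped like the JSON payload shown earlier; verify the exact parameter names against your SDK version.

import os

from nextmv.cloud import Application, Client

client = Client(api_key=os.getenv("NEXTMV_API_KEY"))
app = Application(client=client, id=os.getenv("APP_ID"))
# Assumption: option_sets and runs mirror the JSON payload used by the Cloud API.
batch_experiment_id = app.new_batch_experiment(
    id=os.getenv("BATCH_EXPERIMENT_ID"),
    name=os.getenv("BATCH_EXPERIMENT_ID"),
    input_set_id=os.getenv("INPUT_SET_ID"),
    description="An optional description",
    option_sets={
        "long_running": {"solve.duration": "10s"},
        "short_running": {"solve.duration": "5s"},
    },
    runs=[
        {"instance_id": "latest", "input_id": "latest-RNBs7AKSg", "option_set": "long_running"},
        {"instance_id": "latest-2", "input_id": "latest-RNBs7AKSg", "option_set": "short_running"},
    ],
)
print(batch_experiment_id)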

Cloud API

POST https://api.cloud.nextmv.io/v1/applications/{application_id}/experiments/batch

Create and start batch experiment.

Define the desired batch experiment ID and name, then create the batch experiment.

curl -sS -L -X POST \
    "https://api.cloud.nextmv.io/v1/applications/$APP_ID/experiments/batch" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $NEXTMV_API_KEY" \
    -d "{
      \"id\": \"$EXPERIMENT_ID\",
      \"name\": \"YOUR_EXPERIMENT_NAME\",
      \"input_set_id\": \"$INPUT_SET_ID\",
      \"instance_ids\": [\"$INSTANCE_ID\", \"$INSTANCE_ID\"],
      \"description\": \"An optional description\"
    }" | jq

Result of a batch experiment - Console

Nextmv Console displays the results grouped by indicator keys and for each one includes:

  • a summary table
  • the percentile values
  • a box plot chart displaying summary values from the instances compared

Result of a batch experiment - CLI

The results can also be retrieved with the Nextmv CLI, which returns the raw data in JSON format and is useful if you would like to perform your own operations on the experiment results.

nextmv experiment batch result \
    --app-id $APP_ID \
    --experiment-id $EXPERIMENT_ID

The result of a batch experiment is returned as JSON. If you’re viewing the result in Console, the JSON is parsed and displayed in a web view; Nextmv CLI returns the complete JSON response. The response contains the batch experiment metadata and the results in the form of grouped distributional summaries.

Top-level properties

The table below summarizes the top-level properties in the response.

id: The batch experiment ID that was specified.
name: The batch experiment name that was specified.
description: The batch experiment description that was specified. If no description was specified, this field will not exist.
status: The status of the experiment. The status can be: started, completed, or failed.
created_at: The date the experiment was created and started.
input_set_id: The input set ID specified for the batch experiment.
instance_ids: An array of the instance IDs that were specified for the batch experiment.
grouped_distributional_summaries: An array that contains the results of the batch experiment. It is a collection of calculated results from the individual runs, grouped by certain keys.

Grouped distributional summaries

There are three types of summaries included in the grouped_distributional_summaries array (there could be more in the future):

  1. Version (instance)
  2. Version (instance) + input
  3. Input

Each type is included for every experiment. However, note that if you are viewing the experiment in Console, ONLY the version summaries are displayed. In the future, Console will display all types of summaries.

No matter the type, each grouped distributional summary includes the following:

group_keys: Describes the type of grouped distributional summary, which can be one of three options:
  • instanceID, versionID (the version summary)
  • inputID, instanceID, versionID (the version & input summary)
  • inputID (the input summary)
group_values: The values that correspond to the group_keys. For example, if the group keys are instanceID and versionID, the group_values will be the ID of the instance and the ID of the version.

indicator_keys

These are the statistics that are being evaluated by the batch experiment.

If you’re using the nextroute template, six statistics are automatically added for evaluation:

  • result.value
  • result.elapsed
  • metadata.duration
  • result.custom.routing.stops.unassigned
  • result.custom.routing.stops.assigned
  • result.custom.used_vehicles

Custom statistics can be specified by adding them to the output JSON’s top-level statistics block. See the Add custom statistics section of the Statistics convention reference.

indicator_distributions: An object that contains all of the values from the analysis for that particular indicator. If there are six indicator keys, for example, indicator_distributions will contain six properties; each property key corresponds to a value in the indicator_keys array, and each property value is an object with matching data (see the Indicator distributions section below).
number_of_runs_total: The number of runs that were analyzed for this particular summary. For example, if you ran an experiment with two instances and an input set with three inputs, the version summary (instanceID + versionID) will have a run total of three because it runs all three input files on that particular version. The version & input summary (instanceID + versionID + inputID) will have one run because it ran that one input file on that particular version. And the input summary (inputID) will have two runs because it ran that input file on the two instances.
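
For example, after saving the CLI result to a file, a short script like the sketch below (file name and printed fields are illustrative) can pull out the version summaries and report the mean of each indicator.

import json

# Sketch: parse a result saved from `nextmv experiment batch result ... > result.json`.
# The file name, key casing, and printed fields are illustrative.
with open("result.json") as f:
    result = json.load(f)

for summary in result["grouped_distributional_summaries"]:
    # Keep only the version summaries (grouped by instanceID and versionID).
    if sorted(summary["group_keys"]) != ["instanceID", "versionID"]:
        continue
    group = dict(zip(summary["group_keys"], summary["group_values"]))
    print(f"instance={group['instanceID']} version={group['versionID']}")
    for key in summary["indicator_keys"]:
        dist = summary["indicator_distributions"][key]
        # Values may also be the strings "nan", "+inf", or "-inf".
        print(f"  {key}: mean={dist['mean']} over {summary['number_of_runs_total']} runs")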

Indicator distributions

Each object property value in the indicator_distributions contains the values in the table below. Note that for some runs, certain values may be missing (a custom statistic for example). If you’re viewing the results in Console and a grouped distributional summary is missing values, a warning message will appear. If you are analyzing the results from the returned JSON, you must handle this check in your own systems.

When the runs are being evaluated, the final value is taken from the last solution found before the run has been terminated. The run duration can be set as an option on the experiment or will be set by the executable binary used for the run.

All values in the indicator distributions are either numbers or strings. If they are strings, they are one of three string values: nan, +inf, or -inf.

min: The minimum of the values returned from the runs for the statistic being evaluated. For example, if you are viewing the result.custom.used_vehicles indicator distribution, and there were three runs with one input file returning 50, another 40, and the other 60, the min value would be 40.
max: The maximum of the values returned from the runs for the statistic being evaluated.
count: The number of successful runs that have the specific indicator in their statistics output.
mean: The average of the values returned from the runs for the statistic being evaluated.
std: The standard deviation of the values returned from the runs for the statistic being evaluated. Uses a denominator of n−1 (see Corrected sample standard deviation).
shifted_geometric_mean: The shifted geometric mean of the values returned from the runs for the statistic being evaluated. The shift parameter is equal to 10 (see the sketch after this table).
percentiles: An object that contains the percentiles of the values returned from the runs for the statistic being evaluated. There are nine values that give the following percentiles: 1%, 5%, 10%, 25%, 50%, 75%, 90%, 95%, and 99%.
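
For reference on the shifted_geometric_mean field, the sketch below applies the standard shifted geometric mean formula with a shift of 10. It follows the textbook definition, so treat it as illustrative rather than a specification of Nextmv's exact implementation.

import math


# Shifted geometric mean with shift s: exp(mean(log(x_i + s))) - s.
# A shift of 10 matches the value stated in the table above.
def shifted_geometric_mean(values, shift=10.0):
    logs = [math.log(v + shift) for v in values]
    return math.exp(sum(logs) / len(logs)) - shift


elapsed = [4.9, 5.1, 60.0]  # e.g. result.elapsed values from three runs
print(shifted_geometric_mean(elapsed))  # less sensitive to the 60.0 outlier than the plain mean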

Experiment runs

Runs made for the experiment can be retrieved with this endpoint.

GET https://api.cloud.nextmv.io/v1/applications/{application_id}/experiments/batch/{batch_id}/runs

List batch experiment runs.

List runs used for batch experiment specified by application and batch ID.

This will return the runs of the experiment in the following format:

{
  "runs": [
    {
      "id": "staging-dehnyogSg",
      "input_id": "staging-wZ8xcIQ4R",
      "created_at": "2023-08-21T21:09:41.48243386Z",
      "application_id": "pizza-delivery",
      "experiment_id": "pizza-delivery-region-compare",
      "application_instance_id": "nyc-pizza-region",
      "application_version_id": "nyc-v1.2.13",
      "status": "succeeded",
      "statistics": {
        "status": "succeeded",
        "indicators": [
          {
            "name": "result.value",
            "value": 2175064,
          },
          ...
        ]
      }
    },
    ...
  ]
}

Where runs is an array that contains the runs made for the experiment. Each run object includes the run metadata plus any summary statistics that were specified in the executable binaries used for the experiment (see table below).
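
If you prefer to fetch this run list programmatically, the endpoint above can be called directly. The sketch below uses the requests library and the same bearer-token header as the earlier curl example; the environment variable names, and the assumption that the batch ID is the experiment ID, are illustrative.

import os

import requests

# Sketch: call the documented runs endpoint. Assumes {batch_id} is the
# experiment ID used when the experiment was created.
url = (
    "https://api.cloud.nextmv.io/v1/applications/"
    f"{os.getenv('APP_ID')}/experiments/batch/{os.getenv('EXPERIMENT_ID')}/runs"
)
response = requests.get(
    url,
    headers={"Authorization": f"Bearer {os.getenv('NEXTMV_API_KEY')}"},
    timeout=30,
)
response.raise_for_status()

for run in response.json()["runs"]:
    print(run["id"], run["application_instance_id"], run["status"])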

Experiment runs must be retrieved separately from the batch experiment details. However, in Console this run history table can be viewed at the bottom of the experiment details. Each run history item can be clicked to view the details of the run. If the app is a routing app (using either the routing or nextroute template), the run details will also include a visualization of the results.

The run history data can also be downloaded as a CSV file in Console. Click the Download CSV link in the upper right area of the experiment run history table.

Note that experiment runs are not shown in your app’s run history.

Experiment run summary data

Run history objects can include the items specified in the table below.

id: The ID for the run. This is generated automatically and cannot be changed.
created_at: The date and time the run was created.
application_id: The ID of the application in which the run was made.
application_instance_id: The ID of the instance used for the run.
application_version_id: The ID of the version used for the run.
experiment_id: The ID of the experiment for which the run was made.
input_id: The ID of the input used for the run.
status: The status of the run. This can be running, succeeded, or failed. If the status is failed, no error will be shown in the run history data, but the details of the run will contain relevant error messages. The run details for an experiment run can be viewed in Console by clicking on the run ID in the experiment run history table, or retrieved using the /applications/{application_id}/runs/{run_id} endpoint.

statistics

The statistics evaluated for the experiment and their values. The statistics field contains the following fields:

  • status
  • error
  • indicators

The status of the statistics can be none, pending, succeeded, or failed. If it's failed, then indicators is not present and error contains the error message. If the status is succeeded, then indicators contains the values and error is not present. If the status is none or pending, then both error and indicators are not present.

The indicators field contains an array of the statistic indicator evaluated for the batch experiment in the format of { name: <indicator_name>, value: <indicator_value> }.
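
As a small convenience, the name/value pairs in indicators can be flattened into a dictionary. The sketch below also shows one way to act on the statuses described above; the handling choices are illustrative.

# Sketch: flatten indicators into {name: value} and handle the statistics
# statuses described above. Handling choices are illustrative.
def indicators_as_dict(statistics: dict) -> dict:
    status = statistics.get("status")
    if status == "failed":
        raise RuntimeError(statistics.get("error", "statistics failed"))
    if status in (None, "none", "pending"):
        return {}  # nothing to report yet
    # status == "succeeded": indicators is present.
    return {ind["name"]: ind["value"] for ind in statistics["indicators"]}


stats = {"status": "succeeded", "indicators": [{"name": "result.value", "value": 2175064}]}
print(indicators_as_dict(stats))  # {'result.value': 2175064}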

In Console, the indicator values are shown in the experiment run history table only if they are present. No errors are displayed.
