nextmv Docs

Input And Output Schema

Input And Output Schema

Traditional modeling and optimization tools require translating decision data (i.e. inputs) and business logic to some other form before calling some optimization algorithm. In contrast, Nextmv's tools are designed to operate directly on business data and produce decisions that are actionable by software systems. This makes decisions more interpretable and easier to test. It also makes integration with data warehouses and business intelligence platforms significantly easier.

In the modeling sections, we saw how Hop and Dash runners automatically convert input JSON into Go types. They do this using Go's standard rules for JSON unmarshaling. The same applies to marshaling Hop and Dash's output states in the solver and simulator outputs, respectively.

Hop

Say we have the following (mostly randomly constructed) JSON data we need to make a decision about. It has a driver's name, their capacity and location, and a vector of pickup and delivery requests to route for them. This is a slightly simplified version of the input to the pedicab routing model we discuss elsewhere.

{
"Driver": "Mooney",
"Capacity": 3,
"Location": [-122.066889, 37.386274],
"Requests": [
[[-122.080240, 37.393735], [-122.094032, 37.393879]],
[[-122.060621, 37.399051], [-122.066038, 37.396881]],
[[-122.058902, 37.394507], [-122.055139, 37.390682]]
]
}

We define a Go struct to read this into a Hop model.

type input struct {
Driver string
Capacity int
Location [2]float64
Requests [][2][2]float64
}

Now we need to define a decision, or system, state. In this case it probably makes sense to keep the current path as a slice of integers to refer to the locations by index.

type state struct {
Path []int
input input
}

There may be other data we store on our state, such as the final location of the driver or the cost of the route, but we don't show that here. What you store on your model state is up to you.

We add Feasible, Next and other methods in the example app. Since Path is exported, Hop will automatically serialize it as the model output. All we must do is hook up the input type to a runner, construct a state, and return a solver operating on that state.

Once we go build our model into a binary, it will read input structures from standard input and write state structures to standard output. An alternative to using standard input and output is to use the -hop.runner.input.path and -hop.runner.output.path command-line flags.

Statistics

In the output file statistics object you'll notice a similar structure.

{
"statistics": {
"bounds": {
"lower": 0,
"upper": 15281
},
"search": {
"generated": 51257,
"filtered": 28076,
"expanded": 23181,
"reduced": 0,
"restricted": 18838,
"deferred": 4343,
"explored": 22979,
"solutions": 9
},
"time": {
"elapsed": "64.30884ms",
"start": "2021-03-02T22:29:11.853297-05:00"
},
"value": 15281
}
}

The first field under the statistics are the upper and lower bounds. In cases where our solver is maximizing or minimizing an objective, Hop continously updates the upper and lower bounds after generating a new state.

If the bounds are the same in value that means Hop has mathematically proven optimality of its best solution. When those are not the same value, Hop terminated early while searching the state space proving optimality. In that case, Hop will return the best found to date, but hasn't explored enough states to prove that it is the mathematical optimal. You can read more about the Bounds method here.

In the search object, you'll notice that there are values for generated, expanded, deferred, reduced, explored, and filtered.

Hop's search mechanism is a state exploration. Hop projects states forward from the root state to determine what decisions you should make. Every transition in that search produces a new state which Hop then explores.

Generated states are all the states Hop has created.

Filtered states are those that have been removed from the search because they have been bounded out. Hop has determined that it cannot, from a particular set of decisions, produce a solution that will be better than the best one it has found so far.

Expanded states are all the states that are not Filtered.

Reduced states are states that have been removed from the search because a Reducer was able to determine that there are states that strictly dominate them.

Deferred states are states saved for later exploration if there's time. This is how Hop manages explosion of the state space. Hop won't explore every state sequentially. It will try to explore the most fruitful ones first and defer the others for later exploration if there's time.

Restricted states are the states that have not been deferred, but instead will be investigated immediately.

Explored states are all states Hop has fully explored. Those states are not able to produce more child states.

Each of these categories contain both feasible and infeasible states.

Dash

Let's look at an example in a Dash simulation. The following JSON is an excerpt of an example file named input.json located in the dash/examples/queue directory:

[
{ "Number": 0, "Arrival": 0, "Service": 3 },
{ "Number": 1, "Arrival": 1, "Service": 10 },
{ "Number": 2, "Arrival": 1, "Service": 7 }
]

Here, we have a JSON array representing a queue of three customers waiting for some kind of service. Each has a Number, Arrival, and Service attribute. Number serves as a unique identifier in the simulation, Arrival is the number of minutes since the beginning of the simulation that the customer enters the queue and Service is the number of minutes required to service that customer.

We define a Go struct so we can read (or unmarshal) this data into our simulation:

type customer struct {
Number int
Arrival int
Service int
}

To use a customer struct as an actor in our simulation, it must have a Run method which takes a time.Time value as an argument and returns a new time.Time value and a boolean: true if the Run was successful, and false if it was not. The specifics for what happens in the Run method are up to you. Refer to the Single-Server Queue example for inspiration.

Customizing JSON Formatting

Say we really don't like the uppercase names Go structures use by default for map keys. We can change the way they are read in and written out using json annotations on our structs.

type input struct {
Driver string `json:"driver`
Capacity int `json:"capacity"`
Location [2]float64 `json:"location"`
Requests [][2][2]float64 `json:"requests"`
}

Our model can now read data that looks like this:

{
"driver": "Mooney",
"capacity": 3,
"location": [-122.066889, 37.386274],
"requests": [
[[-122.080240, 37.393735], [-122.094032, 37.393879]],
[[-122.060621, 37.399051], [-122.066038, 37.396881]],
[[-122.058902, 37.394507], [-122.055139, 37.390682]]
]
}

If you need finer-grained control over how data are read in and decisions are written, you can provide UnmarshalJSON and MarshalJSON methods for the types as shown in the Go docs.

Let's also change the way our state type is written when solutions are found. Say we implement a duration method for state that estimates the time to complete a path and returns a time.Duration. We add that to our model output with a MarshalJSON method. We also add the request list from the input.

func (s state) MarshalJSON() ([]byte, error) {
m := map[string]interface{}{
"path": s.Path,
"requests": s.input.requests,
"time": s.duration().String(),
}
}

Distance / Duration Matrix

By default our models calculate relative distances using the Haversine formula. While this works well for development, production models generally rely on data provided by a mapping service. In many cases users opt for Open Source Routing Machine (OSRM), but Hop can be used with the mapping service of your choosing. We have customers using OSRM, Google Maps API, GraphHopper and proprietary internal mapping services. Hop is agnostic to provider and can interface with any via json input.

Of note when using OSRM: OSRM can be hosted as a REST service within your infrastructure using a docker image. This will expose two endpoints:

  • /table/v1/ provides a matrix containing the distances and durations between all points within a set of coordinates.
  • /route/v1 finds the fastest route for a set of coordinates, while preserving the order.

The matrix provdied by /table/v1/ can be stripped from the response and added to the input file. When the model calculates distances it will reference the matrix.

There is a known issue where hitting each of these endpoints will return slightly different values. OSRM uses different distance calculation algorithms for /table/v1/ and /route/v1. Our recommendation is that users apply the continue_straight=false option when making requests to /route/v1. More information on this issue can be found here.