Writing Computational Essays Based on Simulation Models

Using the BPTK-Py Python package to write computational essays based on System Dynamics models.

Oliver Grasl
Thursday, August 23, 2018

An overview of how to use Jupyter and Python to rapidly build interactive stories based on System Dynamics simulations

If you want to use the insights gained from simulation models to transform your enterprise, just having the model and the insights alone are often not enough – you need to develop a good story that persuades your stakeholders to make decisions and take action.

Our BPTK_PY library allows you to convert System Dynamics models into Python code and run them in the powerful Jupyter notebook environment, which is a wonderful place for writing interactive, data-driven papers and reports.

Writing such interactive stories is quite challenging, even if your simulation model is already fairly complete because you need to craft the storyline while testing different scenarios with the simulation model.

We realized early on that we needed an “interactive writing environment” that would help us to prototype a complex storyline while still building the model:

Build an initial simulation model using a visual modeling tool.
Start writing a story around the model using the interactive writing environment.
Use the new questions and insights that arise during the writing process to drive experiments with the simulation model – this may lead to changes in both the model and the storyline.

In 2014 we started writing such “computational essays” using Wolfram Research‘ Mathematica® and Wolfram Language.

Mathematica is a wonderful and very powerful tool to support this process – it is a complete environment that supports writing, computation, presentation and very sophisticated interactive dashboards.

Mathematica does not contain native support for System Dynamics, but we crafted a small library that allows us to import System Dynamics models based on the XMILE standard and to run them in Mathematica.

But since 2014 a lot has happened in the world of computational modeling… Python has become the premier programming language for data science and its use has thus become even more ubiquitous and with it the Jupyter notebook environment (using the IPython kernel).

Both Mathematica and Jupyter are fantastic environments for creating computational essays and “story-driven simulations”:

Working with Mathematica and the Wolfram Language is incredibly productive thanks to the symbolic nature of the language and the way it integrates into the highly sophisticated Mathematica notebook environment.
Currently, Jupyter notebooks still lack the sophistication of Mathematica notebooks. But what sets the Jupyter/IPython ecosystem apart is that it is open source (and thus free to use for everyone), highly extensible and very widely used in the data science/computational modeling community – it is thus much easier to find books, training materials and skilled resources.

For a more in-depth comparison of Jupyter and Mathematica see Paul Romer’s very interesting blog post on this topic. Also be sure to read the Atlantic Article referenced by Paul Romer.

Because Jupyter and Python are free to use, we decided to port our Business Prototyping Toolkit to the Jupyter/Python universe and also provide it free of charge – which means you can now create simulation models using System Dynamics and let them run in Python.

We are also still actively maintaing and extending the Wolfram Language Version of the library.

Our approach is very powerful and liberating because it turns our models into computable objects – we can now use our simulation models in new ways, quite independently of the modeling environment in which we create the model.

Some “everyday” examples of such uses are:

Writing up a report (or paper) based on a System Dynamics model using Jupyter and plotting all related graphs directly in Jupyter.
Creating and managing a comprehensive set of scenarios pertaining to that model.
Creating an interactive dashboard for a model.
Sharing models, dashboards and reports with people who do not have access to the original model environment.
Comparing multiple versions of a model to each other.

We provide examples for all these points later in this document.

More advanced examples for using models as computable objects are:

training machine learning algorithms using System Dynamics models.
combining system dynamics models with other computational model techniques, such as agent-based modeling.

We are providing the BPTK-Py framework as an open source framework under the MIT license, which means you are free to use the framework in your own modeling projects. You can download the Jupyter Notebook and System Dynamics model corresponding to this blog post along with installation instructions and a more detailed tutorial here.

To demonstrate our computational essay approach using our BPTK-Py framework we will use a very simple System Dynamics model. The System Dynamics model itself was built using the dynamic modeling and simulation software Stella® from isee systems.

Stella saves models using the XMILE format, which is an open XML protocol for sharing interoperable system dynamics models and simulations. The XMILE standard is governed by the OASIS standards consortium.

To illustrate how our framework works, this post uses the model from our Introduction to System Dynamics.

We have included the stock and flow diagram of the entire model below – you do not need to understand how the model works to follow this post, but knowing the stock and flow structure will be useful.

The following sections illustrated various aspects of our framework and of how to use it.

Importing the framework
Setting up scenarios and scenario managers
Plotting scenario results
Creating interactive simulation dashboards
Accessing the underlying system dynamics model
Accessing model information
Checking and comparing models

Importing The Framework

The Business Prototyping Toolkit for Python comes with a model transpiler, which automatically converts SD models created with Stella into Python Code, a model simulator that lets you run those models in a Python environment (such as a Jupyter notebook), a simple format for defining scenarios, and some methods to plot simulation results – these methods form the BPTK API.

For most users, this will be enough initially: you create the model using Stella and experiment with it by defining scenarios and plotting graphs in Jupyter notebooks. Whenever you change the model in Stella, the model is automatically transpiled to Python in a background process. Hence you can work on your model (in Stella) and write your computational essay (in Jupyter) in parallel.

You only need very limited Python skills to do this – the following sections show all the code that is needed to get up and running, our downloadable tutorial provides detailed installation instructions. we have found that even modelers who are new to Jupyter and Python can get productive with our toolkit within a few hours.

More advanced users can use the full power of Jupyter and Python to access and manipulate the underlying simulation model and simulation results.

from BPTK_Py.bptk import bptk
bptk = bptk()

Setting Up Scenarios And Scenario Managers

Scenarios are just particular settings for the constants and graphical functions in your model and scenario managers are a simple way of grouping scenarios.

You can create scenarios directly in Python, but the easiest way to maintain them is to keep them in separate files – you can define as many scenario managers and scenarios in a file as you would like and use as many files as you would like. Each scenario manager references the model it pertains to. So you can run multiple simulation models in one notebook.

All the scenario definition files are kept in the

scenarios/

folder. The BPTK_Py framework will automatically scan this folder and load the scenarios – including the underlying simulation models – into memory.

The following code lists all available scenario managers and scenarios:

print("Available Scenario Managers and Scenarios:")

managers = bptk
               .scenario_manager_factory
               .get_scenario_managers(
                   scenario_managers_to_filter=[])

for key, manager in managers.items():
   print("")
   print("*** {} ***".format(key))

for name in manager.get_scenario_names():
   print("\t {}".format(name))

This produces the following output:

Available Scenario Managers and Scenarios: *** anotherSimpleProjectManagement *** scenario100 scenario80 scenario120 *** smSimpleProjectManagement *** base scenario100 scenario80 scenario120 *** smInteractive *** scenario100 *** smSimpleProjectManagementV0 *** base scenario100 scenario80 scenario120

As the filename suggests, scenarios are defined using the JSON format:

"smSimpleProjectManagement":{
   "source":"simulation_models/sd_simple_project.itmx",
   "model":"simulation_models/sd_simple_project",
   "scenarios":{
      "scenario100":{
         "constants": {
            "deadline": 100,
            "effortPerTask": 1,
            "initialOpenTasks": 100,
            "initialStaff": 1
         }
      }
   }
}

The outer bracket defines the scenario manager. Next to containing a number of scenarios, the scenario manager also defines which model the scenarios apply to (the

source

and also the

model

file which contains the model’s transpired Python code.

Each scenario gets a name (

scenario100

in this example) and a list of constants that define the scenario settings.

We can also take a look at how the scenario is stored in our Python Scenario Manager:

bptk
    .scenario_manager_factory
    .get_scenario("smSimpleProjectManagement","scenario100")
    .dictionary

Leading to

{'constants': { 'deadline': 100, 'effortPerTask': 1, 'initialOpenTasks': 100, 'initialStaff': 1 } }

The only difference in

scenario80

is that we have set the number of initial open tasks to 80 instead of 100.

bptk
    .scenario_manager_factory
    .get_scenario("smSimpleProjectManagement","scenario80")
    .dictionary

produces

{ 'constants': { 'initialOpenTasks': 80, 'deadline': 100, 'effortPerTask': 1, 'initialStaff': 1 } }

Plotting Scenario Results

Example 1: Multiple Equations For One Scenario

Let us assume that we would like to plot multiple equations for the same scenario.

All plotting is done using the

bptk.plot_scenarios

method. We just need to pass the name of the scenario (

scenario100

) and the list of equations (

openTasks

and

closedTasks

bptk.plot_scenarios(
        scenario_managers=["smSimpleProjectManagement"],
        scenarios=["scenario100"],
        equations=['openTasks','closedTasks'],
        title="Tasks remaining vs. Tasks closed",
        freq="D",
        start_date="1/1/2018",
        x_label="Time",
        y_label="Tasks"
 )

Writing Computational Essays Based on Simulation Models Graph 1

As you can see we can easily change the name of the diagram, the axes labels and the time scale. A legend showing the names of the plotted equations is displayed automatically.

The colors for the plots are set in a configuration file – you can learn more about how to do this in the advanced documentation contained in our tutorial.

Renaming Equations

The equation names (such as

openTasks

) are often not the kind of names you want to show to the reader – in this case for instance we would much rather use the phrase

Open Tasks

Fortunately, we can use the

series_name

the parameter to rename them.

bptk.plot_scenarios(
        scenario_managers=["smSimpleProjectManagement"],
        scenarios=["scenario100"],
        equations=['openTasks','closedTasks'],
        title="Renaming Equations",
        freq="D",
        start_date="1/1/2018",
        x_label="Time",
        y_label="Tasks",
        series_names={
           "openTasks" : "Open Tasks",
           "closedTasks" : "Closed Tasks"
        }
 )

Writing Computational Essays Based on Simulation Models Graph 2

Example 2: Plot One Equation For Multiple Scenarios

Now let us change the perspective. In the above example, we assumed one scenario for which we simulate multiple equations. Now we plot one equation for multiple scenarios.

This is useful whenever we want to compare the behavior of the same model element between different scenarios.

To achieve this, we supply just one equation and a list of scenarios to the

plot_scenarios

method.

bptk.plot_scenarios(
   scenario_managers=["smSimpleProjectManagement"],
   scenarios=["scenario80","scenario100","scenario120"],
   equations=["openTasks"],
   title="One Equation for Multiple Scenarios",
   freq="D",
   start_date="1/1/2018",
   x_label="Time",
   y_label="Tasks",
   series_names={
      "scenario80_openTasks":"80 initial open tasks",
      "scenario100_openTasks":"100 initial open tasks",
      "scenario120_openTasks":"120 initial open tasks"
   }
)

Writing Computational Essays Based on Simulation Models Graph 3

bptk.plot_scenarios(
   scenario_managers=["smSimpleProjectManagement"],
   scenarios=["scenario80","scenario100","scenario120"],
   equations=["productivity"],
   title="One Equation for Multiple Scenarios",
   freq="D",
   start_date="1/1/2018",
   x_label="Time",
   y_label="Producitivty %",
   series_names={
      "scenario80_productivity":"80 initial open tasks",
      "scenario100_productivity":"100 initial open tasks",
      "scenario120_productivity":"120 initial open tasks"
    }
)

Writing Computational Essays Based on Simulation Models Graph 4

What if I Want Another Kind of Graph?

The default output format for our plots shows each plot with a shaded area. This can easily be changed using the

kind

parameter. Currently, we also support the

line

and

stacked

setting.

bptk.plot_scenarios(
   scenario_managers=["smSimpleProjectManagement"],
   scenarios=["scenario80","scenario100","scenario120"],
   equations=["openTasks"],
   kind="line",
   title="One Equation for Multiple Scenarios",
   freq="D",
   start_date="1/1/2018",
   x_label="Time",
   y_label="Tasks",
   series_names={
      "scenario80_openTasks":"80 initial open tasks",
      "scenario100_openTasks":"100 initial open tasks",
      "scenario120_openTasks":"120 initial open tasks"}
)

Writing Computational Essays Based on Simulation Models Graph 5

What if I Need The Underlying Data?

Sometimes you may not be interested in plotting a graph but would rather use the underlying data directly. This is easy – you can use use the

return_df=True

setting to return a data frame containing all of the data in the scenario. The data frame is provided as a pandas data frame.

data=bptk.plot_scenarios(
        scenario_managers=["smSimpleProjectManagement"],
        scenarios=["scenario80"],
        freq="D",
        start_date="1/1/2018",
        equations=[
           "openTasks",
           "closedTasks",
           "completionRate",
           "remainingTime",
           "schedulePressure",
           "productivity"],
        return_df=True
 )

The data frame is a table with a row for each day and a column for each of the scenarios. Using the

pandas.DataFrame.head

the function we can show the first five rows of the data frame:

data.head()

outputs

Writing Computational Essays Based on Simulation Models Graph 6

Once you have the data frame, you can use the full power of Python and the pandas library to access the data, e.g. the

iloc

indexer to access a particular row:

data.iloc[4]

produces

openTasks 77.286885 closedTasks 2.713115 completionRate 0.683436 remainingTime 96.000000 schedulePressure 0.805072 productivity 0.683436 Name: 2018-01-05 00:00:00, dtype: float64

Creating Interactive Simulation Dashboards

Jupyter notebooks are not just a great environment for interactive writing and plotting, they also allow you to build interactive user interfaces.

bptk.dashboard(
   scenario_managers=["smInteractive"],
   scenarios=["scenario100"],
   kind="area",
   equations=["openTasks"],
   series_names={"openTasks":"Open Tasks"},
   freq="D",
   start_date="",
   title="Interactive Simulation Dashboard",
   x_label="Date",
   y_label="€",
   constants=[
      ("slider",'initialOpenTasks',50.0,100.0),
      ("slider","deadline",0,150),
      ("timerange")]
)

produces this dashboard:

Writing Computational Essays Based on Simulation Models Graph 7

Accessing The Underlying System Dynamics Model

The underlying System Dynamics model is automatically converted from the XMILE format into Python code – currently, our transpiler only supports conversion of models in XMILE format, but we are planning to create a compiler for models in other formats.

We keep both the Stella/XMILE models and the compiled models in the `/MODELS` directory.

The actual equations underlying the simulation are stored in a dictionary of Python lambda functions within the class, e.g. here are the equations for closed tasks, open tasks and the completion rate:

'closedTasks': lambda t : 0 if t <= self.starttime else self.memoize('closedTasks',t-self.dt)
 + self.dt * ( self.memoize('completionRate',t-self.dt) ),

'openTasks': lambda t : self.memoize('initialOpenTasks', t) if t <= self.starttime else
 self.memoize('openTasks',t-self.dt) +
 self.dt * ( -1 * ( self.memoize('completionRate',t-self.dt) ) ),

'staff': lambda t : self.memoize('initialStaff', t) if t <= self.starttime
 else self.memoize('staff',t-self.dt)
 + self.dt * 0,

'completionRate': lambda t : max( 0, min( self.memoize('openTasks', t), self.memoize('staff', t) *
 self.memoize('productivity', t) / self.memoize('effortPerTask', t) ) ),

Because of the recursive nature of System Dynamics equations we use memoization to remember previous values – this makes the equations more cumbersome to read (and write should you want to create your own), but it dramatically speeds up calculations.

You can easily access (and change) these lambda functions if you want to, they are stored in a dictionary within the model, which itself is associated with the scenario. Associating the model with the scenario is essential because it ensures that the scenario settings are automatically applied to the model.

bptk
     .scenario_manager_factory
     .get_scenario("smSimpleProjectManagement","scenario80")
     .model
     .equations

outputs

{ 'closedTasks': .(t)>, 'openTasks': .(t)>, 'staff': .(t)>, 'completionRate': .(t)>, 'currentDate': .(t)>, 'remainingTime': .(t)>, 'schedulePressure': .(t)>, 'productivity': .(t)>, 'deadline': (t)>, 'effortPerTask': (t)>, 'initialOpenTasks': (t)>, 'initialStaff': (t)>}

You can run these equations to access individual values:

bptk
    .scenario_manager_factory
    .get_scenario("smSimpleProjectManagement","scenario100")
    .model
    .equations["closedTasks"](10)

leads to

10.0

Accessing Model Information

Having the model in a python class is useful for other purposes too – for instance, you can access the list of stocks, flows and converters to check which elements are in your model:

bptk
    .scenario_manager_factory
    .get_scenarios()["scenario100"]
    .model
    .stocks

outputs

[ 'closedTasks', 'openTasks', 'staff']

bptk
    .scenario_manager_factory
    .get_scenarios()["scenario100"]
    .model
    .constants

outputs

[ 'deadline', 'effortPerTask', 'initialOpenTasks', 'initialStaff' ]

Of course, we can then use the power of Jupyter to pretty print this using HTML:

from functools import reduce
from IPython.display import HTML
HTML("<ul><li>"+   
reduce(
       lambda x, y: x+"</li><li>"+y,  
       bptk
           .scenario_manager_factory
           .get_scenario("smSimpleProjectManagement","scenario100")
           .model
           .stocks
)     
+"</li></ul>")

outputs

closedTasks
openTasks
staff

We can also easily output the settings for the constants in the model (for any of the defined scenarios):

for constant in 
bptk
     .scenario_manager_factory
     .get_scenario("smSimpleProjectManagement","scenario100")
     .model
     .constants:

print (constant+": "+str(
bptk
    .scenario_manager_factory
    .get_scenario("smSimpleProjectManagement","scenario100")
    .model
    .equations[constant](0)
))

leads to

deadline: 100 effortPerTask: 1 initialOpenTasks: 100 initialStaff: 1

And

for constant in 
bptk
    .scenario_manager_factory 
    .get_scenario("smSimpleProjectManagement","scenario80")
    .model
    .constants:

print (constant+": "+str(
bptk
    .scenario_manager_factory
    .get_scenario("smSimpleProjectManagement","scenario80")
    .model
    .equations[constant](0)
))

leads to

deadline: 100 effortPerTask: 1 initialOpenTasks: 80 initialStaff: 1

Checking and Comparing Models

Simulation models can become very large and they often are often in use over many years. This means that you end up having many versions of a model, and it often becomes difficult to ensure that model versions are consistent or that changes to a later model do not invalidate results obtained earlier.

Thanks to the BPTK framework, models are now computable and we can easily create checks to ensure that models behave as we expect:

bptk.model_check(data=bptk.plot_scenarios(
scenario_managers=["smSimpleProjectManagement"],
scenarios=["scenario100"],
equations=['openTasks'],
return_df=True
),
check=(lambda data: data.iloc[-1,-1]>10),
message="Model not behaving as expected")

leads to

[ERROR] Model Checking failed with message: "Model not behaving as expected"

The

bptk.model_check

is very simple – it expects some data and a check function that takes the data as input and returns true or false depending on whether the data passes the check or not.

In this example, we used the data frame return from a scenario to provide the data and a Python lambda function to check whether the number of

openTasks

is greater than 10 in the last timestep. It isn’t, and hence an error is generated.

But in this case, the model is actually correct but the check function is wrong – if we correct this the `model_check` function returns with a success message:

bptk.model_check(data=bptk.plot_scenarios(
scenario_managers=["smSimpleProjectManagement"],
scenarios=["scenario100"],
equations=['openTasks'],
return_df=True
),
check=(lambda data: data.iloc[-1,-1]<10),
message="Model not behaving as expected")

results in

[SUCCESS] Model check successful!

We can also easily compare two different versions of a model to each other – for instance, let us assume we have two versions of our project management model, which say contain different settings for the graphical function we use to model the effect of schedule pressure on productivity and also the number of initial open tasks (as defined in the underlying model, not in the scenarios).

Let us compare the results between the models:

bptk.plot_scenarios(
     scenario_managers=["smSimpleProjectManagement","smSimpleProjectManagementV0"],
     scenarios=["scenario80"], 
     equations=['openTasks'],
     title="Compare Two Models",
     series_names={
        "smSimpleProjectManagement_scenario80_openTasks":"Current Model", 
        "smSimpleProjectManagementV0_scenario80_openTasks":"Model v0"}
)

Writing Computational Essays Based on Simulation Models Graph 8

Conclusion

This post detailed our approach to building interactive stories based on simulation models using Jupyter, Python and the BPTK-Py framework. Here are the important points you should remember:

If you want to use the insights gained from simulation models to have an impact in the real world, just having the model and the insights is often not enough – you need to develop a good story that persuades your stakeholders to make decisions and take action.
Jupyter and Python provide a great environment for rapidly building such interactive stories that rely on data and on simulations. Using the BPTK_PY framework, creating such documents is actually quite easy. You can concentrate on the story and let the framework do the heavy lifting.
Building complex simulations using System Dynamics is best done using visual modeling environments such as Stella. Thanks to the XMILE standard and our BPTK_PY framework, we can import System Dynamics models created with Stella straight into Jupyter and run them there.
Once you have finished the interactive story you can publish it to a wide audience by sharing the Jupyter notebook – this is made particularly easy because Jupyter, Python and BPTK_Py are free.

To see interactive stories and games written using this approach, take a look at our blog posts on Prototyping Business Models And Market Strategies, on Make Your PSF Grow or check our version of the Play The Beer Distribution Game.

Infrastructure Simulation Overview A Simple Python Library For System Dynamics