Semi-automated Modular Program Constructor for physiological modeling: Building cell and organ models

The Modular Program Constructor (MPC) is an open-source, Java-based modeling utility built upon JSim's Mathematical Modeling Language (MML) (http://www.physiome.org/jsim/). It uses directives embedded in model code to construct larger, more complicated models quickly and with fewer errors than manually combining models. A major obstacle in writing complex models for physiological processes is the large amount of time it takes to model the myriad processes taking place simultaneously in cells, tissues, and organs. MPC replaces this task with code-generating algorithms that take model code from several different existing models and produce model code for a new JSim model. This is particularly useful during multi-scale model development, where many variants are to be configured and tested against data. MPC encodes and preserves information about how a model is built from its simpler model modules, allowing the researcher to quickly substitute or update modules for hypothesis testing. MPC is implemented in Java and requires JSim to use its output. MPC source code and documentation are available at http://www.physiome.org/software/MPC/.

We describe our modeling utility as semi-automated modular programming construction. It is simple and not conceptually novel, but it is easy to learn and use. For developing a series of models of increasing complexity, Modular Program Constructor (MPC) can serve well as the primary basis for coding new model components and for incorporating modules of previously developed modeling code. The perspective is to take a modular approach; this means that one builds from simple modeling elements initially and then uses multimodular constructs as modules in higher level models.
Modular model creation and construction rely, to varying degrees, on metadata to assist in reusing and merging previous models into a new one. Antimony (Smith et al., 2009) is the simplest approach. It requires the user to be familiar with the model and simply to specify that it should be imported into the new model; it relies on the user to resolve discrepancies between models. SemanticSBML (Krause et al., 2010), SemGen (Gennari et al., 2011; Neal et al., 2015), and Phy-Sim (Erson & Cavuşoğlu, 2012) make use of standard semantic and ontological descriptions of a biological model to allow large models to be broken down easily, without much user guidance, into biologically meaningful components linked to their mathematical description. Semantic and ontological metadata assist the construction of new models by providing suggested connections or relationships between models. This approach requires the user to invest time in complete annotation of models with standardized metadata. The payoff is models that can be constructed and merged together using biological rather than mathematical terms. ProMot (Mirschel et al., 2009) enforces an object-oriented approach to modeling (defining external interfaces for each object) and attempts to use network theory to describe biological systems through specifying elements and coupling elements. MPC relies on the user to modularize a model, using directives to specify the modules.

3. Directives, the third component, comprise the set of instructions used by the MPC model utility to select processes and gather the code from existing modules, renaming parameters and variables to reflect the new purposes for which they will function, and automatically combining the mathematical structures into new structures. The directives control the identification, fetching, and relabeling of variables and parameters, and the assembly and recombination of model code into new equations.

Selecting and arranging components using directives: A simple example
The MPC input file guides the construction of a model made of previously existing model modules. It combines MML with "directives" embedded as comments and uses code from other JSim model files that have been annotated so that they can be read by MPC without interfering with their operability. MPC may also combine models with other models or with modules of preconstructed code from model code libraries. These modules are specified within a library with the START and END directives. A "library" with a few elementary operators from which we will build a model in our next step is illustrated below:
The MPC file defines the domain, parameters, variables, and initial conditions first. Using the directives listed in 'Example.mpc', model code is extracted from the file 'CodeLibrary.mod' shown above. Values and variable names needing replacement throughout the final model are specified by the REPLACE directive along with the '%symbol%' placeholder. The REPLACE, GET, COLLECT, INSERTSTART and INSERTEND directives are used in Example.mpc, shown below:
The GET directive warrants further explanation: it identifies a model code library file and a module name within the library to insert into the model, and changes old names (names of parameters and variables in the module) to new model names. From the example above, //%GET %CL% reactionCalc ("A=A2","B=B2", "V=V2","G=Ga2b") gets the module named 'reactionCalc' in the file 'CodeLibrary.mod' and replaces the module's variable names with the new model names ("A=A2", etc.).
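As a rough illustration of what the GET directive described above does, the following Python sketch (not MPC's Java implementation) extracts a module delimited by START and END markers and renames its identifiers according to the GET arguments. Only the GET line itself is taken from the text. The library contents, the `//%START name` and `//%END name` line syntax, the equations inside the module, the placeholder table, and the helper names `extract_module`, `parse_get` and `rename` are assumptions made for illustration.

```python
import re

# Hypothetical library text. The marker syntax and the two equations are
# assumptions for illustration, not the contents of the real CodeLibrary.mod.
LIBRARY = """\
//%START reactionCalc
A:t = -(G/V)*A;
B:t = (G/V)*A;
//%END reactionCalc
"""

# Assumed result of a REPLACE directive: the %CL% placeholder names the library file.
PLACEHOLDERS = {"%CL%": "CodeLibrary.mod"}


def extract_module(library_text, name):
    """Return the code lines between the START and END markers for `name`."""
    pattern = re.compile(
        r"^//%START\s+" + re.escape(name) + r"\s*\n(.*?)"
        r"^//%END\s+" + re.escape(name) + r"\s*$",
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(library_text)
    if match is None:
        raise KeyError(f"module {name!r} not found")
    return match.group(1)


def parse_get(directive):
    """Split a GET directive into (library placeholder, module name, rename map)."""
    head = re.match(r"//%GET\s+(\S+)\s+(\w+)\s*\((.*)\)\s*$", directive)
    if head is None:
        raise ValueError("not a GET directive")
    pairs = re.findall(r'"\s*(\w+)\s*=\s*(\w+)\s*"', head.group(3))
    return head.group(1), head.group(2), dict(pairs)


def rename(code, mapping):
    """Rename whole identifiers only, in one pass, so renames cannot chain."""
    if not mapping:
        return code
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], code)


directive = '//%GET %CL% reactionCalc ("A=A2","B=B2", "V=V2","G=Ga2b")'
library_ref, module_name, mapping = parse_get(directive)
library_file = PLACEHOLDERS[library_ref]
generated = rename(extract_module(LIBRARY, module_name), mapping)
print(f"// from {library_file}, module {module_name}")
print(generated)
```

Renaming whole identifiers in a single pass is a deliberate choice in this sketch: plain text substitution would also alter longer names that merely contain a module variable, and applying the renames one after another could rename an already renamed variable a second time.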
The MPC directives control the identification, fetching, and relabeling of variables and parameters, and the assembly and recombination of code into new equations. The directives extract equations from files, change the names of the module variables to application-specific names, and assemble the code into combined equations. The model code resulting from these instructions provides a complete program (Example.mod); in the following MPC output file (Example.mod), some redundant comments have been removed and other explanatory comments have been added. The MPC-generated program is ready to use with no further intervention on the part of the user except to adjust parameters or the solution time step length, and to set up graphics in JSim to display solutions, as shown in Figure 2.
The process above is hardly worthwhile for small models but is highly efficient for larger models where flexibility in structure is desired. In the example above, converting the ODEs to PDEs requires a three-line change.

Discussion
A prerequisite to using MPC is semantic consistency throughout the libraries and modules. Automated systems using ontologies will help craft models (Gennari et al., 2011), but the great efficiency of MPC for model construction begins to show when there are many model modules as in biochemical networks and circulatory or airway models. The VVUQ process (Johnstone et al., 2016) provides key steps toward reproducibility (VVUQ = Verification, Validation, Uncertainty Quantification, the latter defining predictive accuracy). Though an MPC-generated model is checked for syntax and unit balance through JSim, further verification is required: analytical solutions can be written into the code to match specific limiting cases, but otherwise one depends on testing for mass, charge, or energy balances. Validation requires testing against data, independent of the construction method; model solutions should not be in contradiction to the data. Quantification of the uncertainty is needed for making predictions from the model: UQ includes uncertainty in parameters, handled by JSim's Monte Carlo analysis, and in inputs/environment and model structure. Structural uncertainty, a major challenge, defines a major role for MPC: inserting different choices from amongst similar but differently functioning modules, into a large, multi-modular model, and solving the system many times with the variant constituents illustrating uncertainty in the projected outcomes.
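One way to carry out the mass-balance test mentioned above is sketched below, in Python rather than MML. It assumes a closed two-compartment system with passive exchange and no inflow or outflow, which is a simplification and not the model built in the example. The function names, parameter values, integration scheme and tolerance are all hypothetical.

```python
def simulate_closed_exchange(c1, c2, v1, v2, ps, dt, steps):
    """Forward-Euler integration of passive exchange between two closed compartments.

    v1 * dC1/dt = -ps * (C1 - C2)
    v2 * dC2/dt = +ps * (C1 - C2)
    Returns the list of (C1, C2) states, including the initial state.
    """
    states = [(c1, c2)]
    for _ in range(steps):
        flux = ps * (c1 - c2)      # amount per unit time moving from 1 to 2
        c1 -= dt * flux / v1
        c2 += dt * flux / v2
        states.append((c1, c2))
    return states


def max_mass_drift(states, v1, v2):
    """Largest deviation of total amount (v1*C1 + v2*C2) from its initial value."""
    initial = v1 * states[0][0] + v2 * states[0][1]
    return max(abs(v1 * a + v2 * b - initial) for a, b in states)


# Hypothetical volumes and exchange coefficient, chosen only for the demonstration.
V1, V2, PS = 0.05, 0.15, 1.0
states = simulate_closed_exchange(1.0, 0.0, V1, V2, PS, dt=0.001, steps=5000)
drift = max_mass_drift(states, V1, V2)
print("maximum mass drift:", drift)
if drift > 1e-9:
    raise RuntimeError("mass balance violated")
```

The same kind of check can be applied whenever one module is substituted for another: if the total amount is no longer conserved in a closed configuration, the exchange terms of the substituted module were probably not wired symmetrically.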

Summary
A limited set of directives in MPC, our Modular Program Constructor, allows us to build complex models from small models of simple physiological processes. MPC encodes and preserves information about how a complex model is built from its simpler model modules, allowing the researcher to quickly substitute or update modules to validate a hypothesis. The amount of actual model code a user needs to write is reduced, especially for more complicated models.
Future updates will improve collection and insertion of model code, better identify external model module 'connections' for easier incorporation into larger models, and reconcile similar code between modules more intelligently. The long-term strategy is to integrate MPC within JSim, allowing the user to take advantage of JSim's MML compiler and graphical user interface to quickly merge code with less user intervention.

Software availability
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the University of Washington nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

Data availability
The two-compartment MPC-built model demonstrated here is available at www.physiome.org (TwoCompExampMPC, Model # 0345). As it is an ODE model, it could be translated to SBML or CellML, allowing researchers whose simulation systems support one of these markup languages to run this model. However, for this presentation we have provided only the MPC annotation in order to retain its simplicity.

Author contributions
All authors contributed to the design and organization of the paper and its writing and editing. Gary Raymond developed MPC. Bart Jardine currently maintains MPC source code and James Bassingthwaighte provides guidance and requirements for MPC development.

Competing interests
The authors declared no competing interests.

Thank you for the latest revision. Please consider changing the last sentence of the first paragraph in the introduction to: "and then use" --> "and then uses".

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
No competing interests were disclosed.

I strongly believe one of the roadblocks to a more widespread use of markup languages and the reusability of code is the perceived challenge by modellers/users of the task(s) involved in making their models 'shareable' and 'reusable'. A contrast between two examples, or simply a better description of the effort it would concretely entail, would make their case much clearer.
This isn't absolutely necessary for the article but since the authors are trying to make a point and for other researchers to use the tools they have developed and use for their own research, I believe this would be useful.

Dagmar Waltemath, Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
Thank you for revising the manuscript. It was nice reading it. I only have a few minor things to note:
Last sentence of first paragraph in the introduction: I suggest to generalise the sentence and cut "for multi-scale modeling". Also, I think it should be "uses" instead of "use".
Last sentence of first paragraph in Methods: "MPC currently is executed as command line utility": I would write "as a command line utility".
In Methods, in the listing of the three components, I suggest to remove "not procedural like Fortran or Matlab", as it is hard to follow the sentence structure and the information is not essential. Also, is MML really designed to solve equations? Or rather to provide the information to solve them?
In Methods, second point in the above listing: You mention several models ("For example, there have been a variety of models..."). Would it be possible to link to these specific works, e.g., using citations?
Last sentence on page 3: "A "library" with a few...": I think it should read "in our next step" (instead of "in out next step").
End of first paragraph on page 4: "Species A enters, with flow F, a compartment...". I had problems following the sentence, specifically because of "passive exchange between"... Can you simplify the sentence structure or use two sentences instead?
First sentence in Discussion: Should it be "Though an MPC-generated..." (instead of "a")?
Same paragraph: "These are key steps towards reproducibility and the VVUQ process." I would have found it helpful to get a reference to the VVUQ process. Can you add one?
Grant information, last sentence: Please use a capital letter to start the sentence, and add two "the": "The grants supported the whole group."
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Competing Interests: No competing interests were disclosed.

Author Response 16 Jun 2016
Bartholomew Jardine, University of Washington, USA

Responses reflected in version 3 of manuscript.
Last sentence of first paragraph in the introduction: I suggest to generalise the sentence and cut "for multi-scale modeling". Also, I think it should be "uses" instead of "use".
Author: Changed last two sentences of first paragraph (page 3) to generalize: "For developing a series of models of increasing complexity, Modular Program Constructor (MPC) can serve well as the primary basis for coding new model components and for incorporating modules of previously developed modeling code. The perspective is to take a modular approach; this means that one builds from simple modeling elements initially and [...]
[...] provides key steps toward reproducibility (VVUQ = Verification, Validation, Uncertainty Quantification, the latter defining predictive accuracy). Though an MPC-generated model is checked for syntax and unit balance through JSim, further verification is required: analytical solutions can be written into the code to match specific limiting cases, but otherwise one depends on testing for mass, charge, or energy balances. Validation requires testing against data, independent of the construction method; model solutions should not be in contradiction to the data. Quantification of the uncertainty is needed for making predictions from the model: UQ includes uncertainty in parameters, handled by JSim's Monte Carlo analysis, and in inputs/environment and model structure. Structural uncertainty, a major challenge, defines a major role for MPC: inserting different choices from amongst similar but differently functioning modules, into a large, multi-modular model, and solving the system many times with the variant constituents illustrating uncertainty in the projected outcomes.
Grant information, last sentence: Please use a capital letter to start the sentence, and add two "the": "The grants supported the whole group."
Author: Updated sentence as suggested.

Version 1
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Competing Interests: No competing interests were disclosed.

Author Response 06 Apr 2016
Bartholomew Jardine, University of Washington, USA
Our responses to Referee Dagmar Waltemath's review:
My suggestions for improvements are mainly on the terminology used throughout the manuscript, and on the discussion of related work. Unifying terms: In the abstract alone you speak about programs, utilities, code; about models, processes, model code and modules. Maybe you could, not only in the abstract but throughout the manuscript, unify your wording a little bit more to make the text more comprehensive.
Author Response: Yes, we updated the abstract and paper as a whole to try to use consistent and unifying wording when discussing model code, processes, modules, etc. These changes are most notable in the abstract and introduction.
Related work: I missed a discussion of related systems, e.g. the model merge tool for SBML, semanticSBML, or the semantic-based system (there was a new publication just recently1). While you mention them in the beginning of your introduction, I did not see a discussion of these systems, and how they differ from your approach. I, as a reader, would be interested to know which system is best to use when.
Author Response: Added a paragraph in the Introduction that briefly discusses other tools in relation to MPC: "Modular model creation and construction rely, to varying degrees, on meta-data to assist in reusing and merging previous models into a new one. Antimony (Smith 2009) is the simplest approach. It requires the user to be familiar with the model and just specify that you want to import it into the new model. It relies on the user to resolve discrepancies between models. SemanticSBML (Krause 2010), SemGen (Gennari 2011, Neal 2015), and Phy-Sim (Erson 2012) make use of standard semantic and ontological descriptions of a biological model to allow large models to be broken down easily, without much user guidance, into biologically meaningful components linked to their mathematical description. Semantic and ontological metadata assists the construction of new models by providing suggested connections or relationships between models. This approach requires the user to invest time in complete annotation of models with standardized meta-data. The payoff is models that can be constructed and merged together using biological rather than mathematical terms. ProMot (Mirschel 2009) enforces an object-oriented approach to modeling (defining external interfaces for each object) and attempts to use network theory to describe biological systems through specifying elements and coupling elements (Mirschel 2009). MPC relies on the user to modularize a model using directives to specify them. MPC then requires the