10 Interoperability Guide
10.1 On interoperability
A key goal of the DDMoRe project is to have an interoperability framework in which models are written in a consistent language, translated to PharmML and from there converted to target software code. Before the DDMoRe project, no language standard existed across the target software used in pharmacometrics modelling. While the underlying models could be expressed consistently in mathematical and statistical terms, the implementation of any given model varied by tool and by user, according to the user's experience with a given target software tool.
There is some flexibility within MDL around how the user can express the mathematical and statistical models. Having flexibility allows the user to encode models quickly in a common language (MDL) which can then be shared with others and mutually understood. This flexibility also facilitates encoding in a given target when that language construct does not have a parallel in other tools. However, we STRONGLY encourage the user to encode the majority of models in a way that will facilitate interoperability. There are MDL constructs that facilitate interoperability – these generally appear as built-in functions which translate to specific constructs in PharmML and the target software. These constructs cover many typical models and are designed to allow the user to generate code quickly and have high confidence that it will be interoperable across tools.
The Model Description Language Interactive Development Environment (MDL-IDE) should assist the user in ensuring that the models encoded are valid MDL (and as a consequence, also valid PharmML). Not all models will result in code which can be readily converted to all target tools.
These interoperability constructs will be highlighted in the subsequent sections, but users should pay particular attention to sections on the use of GROUP_VARIABLES, INDIVIDUAL_VARIABLES and the MODEL_PREDICTION.
10.2 Dataset conventions
There are a number of conventions in preparing data for use in MDL and for target software.
• It is assumed that the SOURCE data file will be present as an ASCII comma-delimited text file (.csv).
• The data file should have a header row with names matching those in the DATA_INPUT_VARIABLES block.
• Data values should be numeric.
• Data columns with string or date:time values should have use is ignore for MDL.
• Null or missing values should be denoted by a single period (.).
Generally speaking, MDL follows NONMEM dataset conventions.
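The basic conventions above can be checked before a dataset is handed to any target tool. The following is a minimal sketch in Python, not part of MDL or the MDL-IDE: the function name check_basic_conventions and the declared_uses dictionary (a stand-in for the DATA_INPUT_VARIABLES block) are assumptions made for illustration.

```python
import csv
import io


def check_basic_conventions(csv_text, declared_uses):
    """Return a list of problems found in a comma-delimited dataset.

    declared_uses is a hypothetical stand-in for DATA_INPUT_VARIABLES:
    it maps each declared name to its `use`, e.g. {"ID": "id"}.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]

    # Header names must match the declared names, including case.
    if set(header) != set(declared_uses):
        return ["header names do not match DATA_INPUT_VARIABLES"]

    problems = []
    for line_no, record in enumerate(records, start=2):
        if len(record) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields")
            continue
        for name, value in zip(header, record):
            value = value.strip()
            # Ignored columns and "." (missing) need not be numeric.
            if declared_uses[name] == "ignore" or value == ".":
                continue
            try:
                float(value)
            except ValueError:
                problems.append(
                    f"line {line_no}: {name}={value!r} is not numeric"
                )
    return problems


example = "ID,TIME,DV,SEX\n1,0,.,M\n1,1.5,10.2,M\n2,0,NA,F\n"
uses = {"ID": "id", "TIME": "idv", "DV": "dv", "SEX": "ignore"}
print(check_basic_conventions(example, uses))
```

In this example the string column SEX is accepted because it is declared with use is ignore, while the value NA in the DV column is reported because missing values should be written as a period.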
In addition the following restrictions should be observed for interoperability reasons:
• A column with use is id is mandatory. Values should be positive, non-zero integers, unique and contiguous.
• A column with use is idv is mandatory. Values should be positive, real. When the model is expressed using a DEQ or COMPARTMENTS block, the values must be monotonically increasing within ID. This constraint does not apply to analytical models. The first idv value is taken as the initial time for the model. The initial value does not need to be the same for all individuals, but it must not be lower than that of the first individual. The date:time format is not supported for time.
• A column with use is dv is mandatory. This column can hold any real value. It must have a null value for dosing records.
• When modelling multiple outcomes, a column with use is dvid is required. Values should be positive integers. Values should not be null for OBSERVATION records.
• A column with use is mdv is optional. Valid values are 0 (observed) and 1 (missing). When the OBSERVATION is null or missing, this column should have the value 1. It can also take the value 1 when an OBSERVATION is present but is to be ignored.
• A column with use is evid is optional. Valid values are 0 (OBSERVATION record), 1 (dosing record) and 4 (reset and dose record).
• A column with use is amt is optional. For dosing records this column must have a positive, real value.
• A column with use is rate is optional. This column must have positive, real values. This column can only be used in combination with a column with use is amt. If the value is zero then a bolus dose is assumed.
• A column with use is addl is optional. This column must have positive, real values. This column can only be used in combination with columns with use is amt and use is ii.
• A column with use is ii is optional. This column must have positive, real values. This column can only be used in combination with columns with use is amt and use is addl or use is ss.
• A column with use is ss is optional. Valid values are 0 (not at steady state) and 1 (at steady state). This column can only be used in combination with columns with use is amt and use is ii.
• A column with use is cmt is optional unless there is more than one route of administration. This column must have positive, integer values. Values in this column should start at 1 and correspond to the order of ODEs specified in the DEQ block of the Model Object.
• Columns with use is covariate, use is catCov and use is variable are optional. These columns must not have missing values. Columns with use is catCov must have integer values. Covariate names in the DATA_INPUT_VARIABLES block must match the header names in the source .csv file exactly, including case. Please see sections 2.2.5, 4.3 and 4.9 for details on the use of these variables. If a column has use is covariate or use is catCov but the variable is not declared in the COVARIATES block of the Model Object, then it will be dropped and ignored.
• Columns with use is catCov can only have the values 0 and 1. This implies that categorical COVARIATES with k values should be converted to k-1 indicator variables.
• A column with use is varLevel is optional. This column should not have missing values. Columns of this type should not have an underscore in the column name. A change in the value of this variable denotes when to sample new values of the random variable.
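Two of the restrictions above lend themselves to a short illustration: the ordering rules for the id and idv columns, and the conversion of a k-level categorical covariate into k-1 indicator variables. The Python sketch below is illustrative only. The function names are hypothetical, "contiguous" is interpreted here as "all records of an individual form a single block of rows", and "monotonic increasing" is interpreted as non-decreasing (tied times are accepted); both interpretations are assumptions rather than statements of the MDL specification.

```python
def check_id_and_idv(records):
    """records: list of (id, idv) tuples in dataset row order.

    Assumption: "contiguous" means the rows of one individual are not
    interleaved with rows of another individual.
    """
    problems = []
    finished_ids = set()  # individuals whose block of rows has ended
    current_id = None
    previous_idv = None
    for row_no, (subject, idv) in enumerate(records, start=1):
        if not (isinstance(subject, int) and subject > 0):
            problems.append(f"row {row_no}: id must be a positive integer")
        if subject != current_id:
            if subject in finished_ids:
                problems.append(f"row {row_no}: id {subject} is not contiguous")
            if current_id is not None:
                finished_ids.add(current_id)
            current_id, previous_idv = subject, None
        # Assumption: ties are allowed, only a decrease is reported.
        if previous_idv is not None and idv < previous_idv:
            problems.append(f"row {row_no}: idv decreases within id {subject}")
        previous_idv = idv
    return problems


def to_indicators(values, reference):
    """Convert a categorical covariate with k levels into k-1 columns
    of 0/1 values, one per non-reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


rows = [(1, 0.0), (1, 1.0), (2, 0.0), (2, 2.0), (2, 1.0), (1, 3.0)]
print(check_id_and_idv(rows))
print(to_indicators([1, 2, 3, 2], reference=1))
```

The choice of reference level in to_indicators is left to the user; the level that is dropped becomes the category for which all indicator columns are 0.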
10.3 Multiple uses of dataset columns
Dataset columns cannot have multiple uses defined in DATA_INPUT_VARIABLES. The DATA_DERIVED_VARIABLES block can be used to specify additional uses for dataset variables. In the current MDL, scope for using DATA_DERIVED_VARIABLES is limited.
For example, if the user wants to specify different outcomes / OBSERVATIONs conditional on a dataset variable such as CMT (i.e. using CMT as DVID), they will need to create a separate dataset variable DVID whose values map onto the CMT values appropriately.
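One way to prepare such a column is to add it to the source file before it is referenced in the Data Object. The sketch below, in Python, is a hypothetical pre-processing step and not an MDL feature: the function add_dvid, the column names and the CMT-to-DVID mapping are all assumptions chosen for illustration. Rows whose CMT value is not in the mapping (for instance dosing records) receive the missing-value symbol.

```python
import csv
import io


def add_dvid(csv_text, cmt_to_dvid, cmt_column="CMT", dvid_column="DVID"):
    """Return csv_text with an extra DVID column derived from CMT.

    cmt_to_dvid maps CMT values (as strings) to DVID values (as strings).
    Rows with an unmapped CMT value get "." (missing).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [dvid_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[dvid_column] = cmt_to_dvid.get(row[cmt_column], ".")
        writer.writerow(row)
    return out.getvalue()


source = "ID,TIME,DV,CMT\n1,0,.,1\n1,1,5.2,2\n1,1,30,3\n"
# Hypothetical mapping: CMT 2 holds the first outcome, CMT 3 the second.
print(add_dvid(source, {"2": "1", "3": "2"}))
```

The new DVID column can then be declared with use is dvid in DATA_INPUT_VARIABLES, while CMT keeps its single use.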
10.4 Defining constants in the model
For interoperability, constant values in the model should be defined as STRUCTURAL_PARAMETERS and fixed to a value in the Parameter Object.
For models expressed as systems of differential equations (DEQ block), model parameters can be set to constant values in the MODEL_PREDICTION block, but this may be inefficient in the target software translation.
In the current SEE, to ensure interoperability, model parameters should not be assigned a constant value in the GROUP_VARIABLES or INDIVIDUAL_VARIABLES block.
10.5 Interoperability in the MODEL_PREDICTION block
To ensure Monolix interoperability, any variable used in the MODEL_PREDICTION block must be either:
• the independent variable
• defined in MODEL_PREDICTION
• declared in INDIVIDUAL_VARIABLES using {type is linear, … }
• defined as use is variable in the DATA_INPUT_VARIABLES block of the Data Object
This implies in particular that STRUCTURAL_PARAMETERS, VARIABILITY_PARAMETERS, GROUP_VARIABLES and random variables defined in RANDOM_VARIABLE_DEFINITION cannot be used in MODEL_PREDICTION.
10.6 Interoperability in the OBSERVATION block
For Monolix interoperability, different OBSERVATIONs / outcomes must not share VARIABILITY_PARAMETERS or random variables defined in RANDOM_VARIABLE_DEFINITION.
For interoperability with Monolix, the residual error(s) \(\sigma_{i,j}^{2}\) must be Normal(0,1) random variables.