4 Model Object
The Model Object within the MDL is intended to describe the mathematical and statistical properties of the model. MDL defines language elements that allow the user to code a wide variety of models and in a variety of ways. The Model Object is intended to specify the model independent of the target software which will be used for the task (estimation or simulation). The same model should be able to be used for a variety of tasks – estimation, simulation or optimal design without recoding. The Model Object should also be independent of the data – where possible we use enumerated types for categorical covariates and outcomes so that definition of the model is clear to any user regardless of the data used in a given task.
It should be noted however that MDL does not guide the user about whether the model that is defined is suitable for a given purpose or for any target software. The user is free to define any model, however they must also be aware that the specified model may not be useable with all target software.
As stated in the Introduction, the Model Object is intended to convey the mathematical and statistical definitions required to completely define the model. MDL used in defining the model is intended more as a descriptive language rather than a programmatic one. The Model Object is used in tasks by combining it with Data, Parameter and Task Properties objects, and defining tasks within the R script.
Currently defined blocks are IDV
, COVARIATES
, POPULATION_PARAMETERS
, FUNCTIONS
, VARIABILITY_LEVELS
, STRUCTURAL_PARAMETERS
, VARIABILITY_PARAMETERS
, GROUP_VARIABLES
, RANDOM_VARIABLE_DEFINITION
, INDIVIDUAL_VARIABLES
, MODEL_PREDICTION
, DEQ
, COMPARTMENT
, OBSERVATION
.
Which blocks within the Model Object are used for a particular model depends on the structure of that model. Blocks should not be left empty (although this is not a syntax error). It is good practice to structure and write the model to facilitate readability and understanding of the model. Simple statements that are clear and unambiguous are preferred to statements combining many actions into one line of code. Use of the MODEL_PREDICTION
block is encouraged to make it clear what the final prediction is from the model prior to use in generation of the observation level.
In the current version of MDL, the variable names and parameter names in the STRUCTURAL_PARAMETERS
and VARIABILITY_PARAMETERS
blocks of the Model Object must be matched to those in the Data and Parameter objects. The MOG Object brings together the Data, Parameter, Model and Task Properties Objects to perform tasks, and at this stage it is assumed that the variable names match across objects.
The independence of the Data Object from the model means that the data referenced by the Data Object may be easily used with a different model without modification of the Data Object. Similarly, the independence of the Parameter Object from the model means that all the parameters related to modelling project e.g. describing a particular drug, may be stored in one place. Note that parameters defined in the Parameter Object must have unique names.
Unlike other MDL Objects, the Model Object does not use a DECLARED_VARIABLES
block. Instead variables are declared when they are used within the IDV
, COVARIATES
, STRUCTURAL_PARAMETERS
, VARIABILITY_PARAMETERS
blocks. Particular care may be required for models defined using analytic equations (rather than via differential equations, compartments). In these models it may be necessary to declare inputs such as DOSE within the MODEL_PREDICTION
block.
4.1 On interoperability
The primary focus of MDL in this release is translation to valid PharmML, rather than conversion to target software. The previous Public Release was primarily concerned with demonstrating interoperability across key software targets. In this version of MDL there may be features supported which are not supported by certain target software, but which are valid for model description and which generate valid PharmML. The aim is to widen the scope of models which can be encoded in MDL and generate PharmML, since the latter is required for uploading models to the DDMoRe repository. Translation of these models to target software will follow with updates to the interoperability framework converters.
The MDL-IDE should assist the user in ensuring that the models encoded are valid MDL (and as a consequence, also valid PharmML).
Models in MDL may be expressed in a number of ways, which may be influenced by a number of factors including which languages the user is familiar with for encoding models. Flexibility allows the user to encode models quickly in a common language (MDL) which can then be shared with others and mutually understood. This flexibility also facilitates encoding in a given target when that language construct does not have a parallel in other tools. However, we STRONGLY encourage the user to encode the majority of models in a way that will facilitate interoperability. Interoperability allows the user of the model to choose the best tool for the job, or at least the tools that they have available to them.
If the user follows certain conventions for coding then it will increase the chance that a given model is interoperable between target tools. These conventions will be highlighted in the subsequent sections, but users should pay particular attention to sections 4.7, 4.9 and 4.10 on definition of GROUP_VARIABLES
defining fixed effects, INDIVIDUAL_VARIABLES
defining the relationship between covariates (or GROUP_VARIABLES
defined variables) and random effects and MODEL_PREDICTION
using these parameters to calculate predictions for given inputs.
4.2 IDV
The IDV
block defines the independent variable within the model. Typically this is TIME (or T for differential equations). An IDV
block must be present in the Model Object.
The syntax is a simple variable declaration:
IDV{ <Independent variable name> }
4.3 COVARIATES
The COVARIATES
block declares and defines covariates to be used in the GROUP_VARIABLES,
INDIVIDUAL_VARIABLES
and MODEL_PREDICTION
blocks (see discussion of regressors below for use of covariates in the MODEL_PREDICTION
block). Covariates listed in the COVARIATES
block must be specified as use is covariate
or use is catCov
in the DATA_INPUT_VARIABLES
block in the Data Object or defined in the POPULATION
block of the Design Object. Covariate transformations may be specified within this block.
COVARIATES{
< Covariate name >
< Categorical covariate name > withCategories {< category1 >, < category2 >, … , < category_k >}
< Covariate name > = <simple transformation equation>
}
For categorical covariates, the categories defined in the COVARIATES
block must match those specified for DATA_INPUT_VARIABLES
with use is catCov
– see the definition of the SEX covariate above.
An example COVARIATES
block is shown below:
COVARIATES{
WT
SEX withCategories {female, male}
logtWT = ln(WT/70)
}
In this example the withCategories prefix to the list of category names will be used to link to the values associated with these names in the Data Object.
The logtWT variable in the COVARIATES
block may be used as the value for the cov attribute in the linear function in the INDIVIDUAL_VARIABLES
block if WT
follows the above rules for covariates.
Please also read about the specification of covariate models in sections 4.7 and 4.9.
The definition of covariates above assumes that the covariates are constant within individuals or vary only at occasion levels. It is also possible to define covariates that vary with the independent variable (typically time):
COVARIATES(type is idvDependent){
WT
logtWT = ln(WT/70)
}
Covariates which are defined as idvDependent
should NOT be used in the type is linear
definition of INDIVIDUAL_PARAMETERS
.
4.4 STRUCTURAL_PARAMETERS
This block declares fixed effect parameters that define the structure of the model. There is no separator character in between variable names. The variable names do not need to be on separate lines, but it may be easier to read if they are presented in this way and it allows comments to be added to help communication
STRUCTURAL_PARAMETERS{
< Variable name(s) of structural parameters >
}
For example:
STRUCTURAL_PARAMETERS {
POP_CL
POP_V
POP_KA
POP_TLAG
BETA_CL_WT
BETA_V_WT
} # end STRUCTURAL_PARAMETERS
4.5 VARIABILITY_PARAMETERS
Similar to the STRUCTURAL_PARAMETERS
block, this block declares all the variability (including covariance, correlation and residual error) parameters (population parameter variability and other variability level parameters) used in the model. The variable names do not need to be on separate lines, but it may be easier to read if they are presented in this way and allows comments to be added to help communication
VARIABILITY_PARAMETERS{
<Variable name(s) of variability parameters>
}
For example:
VARIABILITY_PARAMETERS {
PPV_CL
PPV_V
CORR_CL_V
PPV_KA
PPV_TLAG
RUV_ADD
RUV_PROP
} # end VARIABILITY_PARAMETERS
4.5.1 Residual Unexplained Variability
Residual variability is typically defined as a standard Normal distribution ~N(mean=0, var=1). The standard residual error models (see section 4.12.1) then define the parameters of that model e.g. additive and proportional which multiply the random N(0,1) variable. They can also define expressions involving parameters which define the residual error model. These parameters should be declared as VARIABILITY_PARAMETERS
.
4.6 VARIABILITY_LEVELS
The VARIABILITY_LEVELS
block defines the model hierarchy. Each variable should have attributes defining its level
in the model hierarchy and variability type
which is one of parameter
or observation
. DATA_INPUT_VARIABLES
with use is dv
and use is id
are automatically identified as describing variability levels. Additional variables can be used to define variability levels by defining these as use is varLevel
in DATA_INPUT_VARIABLES
.
Typically, level = 1 is the level of each sample / experimental unit / observation. Additional levels of the hierarchy built on top of this. Typically in population models there is at least one additional level of variability – that of the individual (the experimental unit). Occasionally if modelling summary level data in a model-based meta-analysis, treatment arm may be used as the experimental unit and labelled in the DATA_INPUT_VARIABLES
block as use is id
.
The syntax is:
VARIABILITY_LEVELS(reference = < ID | varLevel >){
<Variable name> : { level = <number>, type is <parameter | observation>}
... # Additional levels of variability specified as above
}
For example:
VARIABILITY_LEVELS(reference=ID){
ID : { level=2, type is parameter }
DV : { level=1, type is observation }
}
If between occasion variability is required in the model then this should be specified here as a variability level between the observation and individual levels. In NONMEM occasion is typically specified as an additional layer of inter-individual variability which is defined conditionally on an occasion variable in the dataset. In MDL this is explicitly treated as a distinct level of variability.
VARIABILITY_LEVELS(reference=ID){
ID : { level=3, type is parameter }
OCC : { level=2, type is parameter }
DV : { level=1, type is observation }
}
Additional levels of variability are easily implemented by incrementing level = <number>
with an associated DATA_INPUT_VARIABLE with use is varLevel
. This facilitates definition of levels such as between trial random variability.
The distinction between type is observation
and type is parameter
will be used further in future versions of MDL to describe models where there are additional levels of hierarchy. For example, when describing differences between trials as a random effect or modelling population level differences; and at the observation level e.g. replicates of PD measurements at each time point, or multiple assays of a single sample. Specifying variability type allows extensibility for the future while retaining backwards compatibility.
When estimating model parameters using observed data, the DATA_INPUT_VARIABLE
with use is id
is the default “reference level of the model hierarchy (Gelman 2006). When using a Design Object, the reference level of the model hierarchy is likely to be implicit (subjects in a study arm) rather than explicit. Thus the need to specify reference = ID
.
4.7 GROUP_VARIABLES
The GROUP_VARIABLES
block can be used to specify group specific variables using parameters and fixed effect relationships between parameters and covariates. The INDIVIDUAL_VARIABLES
block can then use these values in definition of the individual parameters by incorporating the random between individual variabilities defined in the RANDOM_VARIABLE_DEFINITION
block(s).
Using the GROUP_VARIABLES
block to define covariate relationships is not supported for parameter estimation in some target software since the equations defined in the GROUP_VARIABLES
block are user defined. The MDL-IDE is not equipped to determine whether the defined relationships conform to linear relationships (after transformation) that have been shown to allow interoperability between software.
For this reason we suggest that definition of covariate dependent GROUP_VARIABLES
is used only in cases where a reformulation to “linear or “linear after transformation relationships with covariates as defined in section 4.9 is not possible.
The GROUP_VARIABLES
block is essential for defining relationships between structural parameters and covariates which are non-linear, even after transformation. For example to describe clearance across both adults and children a maturation model may be required. For example:
GROUP_VARIABLES{
FSIZE = (WT/70)^0.75
FAGE = if(AGE >= 20) then exp(BETA_CL_AGE*(AGE-20))
else 1
FMAT = 1/(1+(PCA/TM50)^(-HILL))
GRP_CL = POP_CL * FSIZE * FAGE * FMAT
}
GRP_CL can then be used in the definition of individual variables within the INDIVIDUAL_VARIABLES
block.
4.7.1 Defining model constants
Model constants (model variables with constant values) may be defined in MDL within the GROUP_VARIABLES
block.
However, to ensure interoperability within the current SEE, constant values in the model should be defined as STUCTURAL_PARAMETERS
and fixed to a value in the Parameter Object.
For models expressed as systems of differential equations (DEQ
block), model variables can be set to constant values in the MODEL_PREDICTION
block, but this may be computationally inefficient in the target software implementation.
4.8 RANDOM_VARIABLE_DEFINITION
The RANDOM_VARIABLE_DEFINITION
block defines the distribution of the random effects to be used in construction of mixed effects models. The RANDOM_VARIABLE_DEFINITION
block defines random variables in terms of parametric distributions.
It is assumed that all variables within the same block are defined for the same level of the model hierarchy. Separate RANDOM_VARIABLE_DEFINITION
blocks should be used for each layer of the model hierarchy.The user specifies which level through the (level = <name of variable associated with this level> )
syntax following the RANDOM_VARIABLE_DEFINITION
block name.
The following syntax is used to define random variables:
RANDOM_VARIABLE_DEFINITION( level = <VARIABILITY_LEVEL variable> ){
<VARIABLE NAME > ~ <Distribution with arguments>
}
The RANDOM_VARIABLE_DEFINITION
block supports probability distributions as specified in the ProbOnto knowledge base (M. J. Swat, Grenon, and Wimalaratne 2016). Typically for definition of structural parameter and residual error random variability, Normal distributions will be used. The MDL distribution “Normal(…) maps to either ProbOnto Normal1 or Normal2 depending on the parameterisation:
MDL Name | Argument name | Argument Types | ProbOnto distribution |
---|---|---|---|
Normal | mean | Real | Normal1 |
sd | Real | ||
Normal | mean | Real | Normal2 |
var | Real |
This means that the user does not need to remember which ProbOnto distribution uses which parameterisation for this frequently used distribution.
An example of RANDOM_VARIABLE_DEFINTION
for individual random effects is given below:
RANDOM_VARIABLE_DEFINITION( level = ID ) {
ETA_CL ~ Normal(mean = 0, sd = PPV_CL)
ETA_V ~ Normal(mean = 0, sd = PPV_V)
ETA_KA ~ Normal(mean = 0, sd = PPV_KA)
ETA_TLAG ~ Normal(mean = 0, sd = PPV_TLAG)
} # end RANDOM_VARIABLE_DEFINITION
In the code above, ETA_CL
, ETA_V
, ETA_KA
and ETA_TLAG
vary with each new value of ID. These variables are normally distributed with mean = 0 and standard deviation defined by the variability parameters. The distribution can also be defined using variances.
In the example above, all random variability parameters are independent. To specify correlation or covariance between parameters, the user should specify either pairwise correlation or covariance between random variables or use a multivariate distribution. In contrast to the previous version of MDL where correlations and covariances were defined only in the Parameter Object, this version requires the user to specify correlations and covariances in the RANDOM_VARIABLE_DEFINITION
block. This is to allow the Prior Object to define priors on parameters used in the Model Object.
The following syntax is used to define correlation or covariance:
:: {type is <correlation / covariance>,
rv1 = <RANDOM_VARIABLE_DEFINITION variable>,
rv2 = <RANDOM_VARIABLE_DEFINITION variable>,
variable = <VARIABILITY_PARAMETERS parameter> }
Note the use of the “anonymous list using double colon “:: . This is used since we are assigning additional information to the variable defined in the VARIABILITY_PARAMETERS
block.
For example (UseCase1):
:: {type is correlation, rv1=ETA_CL, rv2=ETA_V,
value=CORR_CL_V}
Alternatively, the user can specify multivariate distribution(s) for parameters to specify the joint distribution of multiple random variability parameters. To do this, the user must specify the type of parameters in the STRUCTURAL_PARAMETERS
(if required) and VARIABILITY_PARAMETERS
blocks. Typically for multivariate distributions, there may be a vector of mean values, and a matrix of correlations or covariances. We can then use these to define the multivariate distribution using ProbOnto definitions.
So if we assume that the random effects for CL, V and KA (ETA_CL, ETA_V and ETA_KA in the univariate case) come from a multivariate distribution then the distribution of the vector ETA_CL_V_KA is given as:
\[ETA\_ CL\_ V_{\text{KA}} = \begin{bmatrix} ETA\_ CL \\ ETA\_ V \\ ETA\_ KA \\ \end{bmatrix}\sim\ MultivariateNormal1\left( mean = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \end{bmatrix},covariance = PPV\_ CL\_ V\_ KA \right)\]
where PPV_CL_V_KA is a covariance matrix defining the variances and covariances of the random effects:
\[PPV\_ CL\_ V\_ KA\ = \begin{pmatrix} PPV\_ CL & COV\_ CL\_ V & COV\_ CL\_ KA \\ COV\_ CL\_ V & PPV\_ V & COV\_ V\_ KA \\ COV\_ CL\_ KA & COV\_ V\_ KA & PPV\_ KA \\ \end{pmatrix}\]
Note that the ProbOnto distribution MultivariateNormal1 uses mean and covariance, while MultivariateNormal2 uses mean and correlation.
For example (UseCase6_2):
VARIABILITY_PARAMETERS {
PPV_CL_V_KA::matrix
PPV_TLAG
RUV_PROP
RUV_ADD
} # end VARIABILITY_PARAMETERS
RANDOM_VARIABLE_DEFINITION(level=ID) {
ETA_CL_V_KA ~ MultivariateNormal1(mean = [0,0,0],
covarianceMatrix = PPV_CL_V_KA)
ETA_TLAG ~ Normal(mean = 0, var = PPV_TLAG)
} # end RANDOM_VARIABLE_DEFINITION
Similarly for the residual unexplained variability with mean 0 and a fixed variance of 1, we might have a RANDOM_VARIABLE_DEFINITION
block as follows:
RANDOM_VARIABLE_DEFINITION(level=DV){
EPS_Y ~ Normal(mean = 0, var = 1)
}
To define between occasion variability we might have a RANDOM_VARIABLE_DEFINITION
block as follows:
RANDOM_VARIABLE_DEFINITION(level=OCC){
eta_BOV_CL ~ Normal(mean=0, var=BOV_CL)
eta_BOV_V ~ Normal(mean=0, var=BOV_V)
eta_BOV_KA ~ Normal(mean=0, var=BOV_KA)
eta_BOV_TLAG ~ Normal(mean=0, var=BOV_TLAG)
}# end RANDOM_VARIABLE_DEFINITION
Note that in the above example blocks, ID, DV and OCC are declared as valid identifiers for the variability hierarchy through the VARIABILITY_LEVELS
block assuming appropriate specification within the Data Object of DATA_INPUT_VARIABLES
with use is id
, use is varLevel
and use is dv
for ID, OCC and DV (respectively).
VARIABILITY_LEVELS{
ID : { level=3, type is parameter }
OCC : { level=2, type is parameter }
DV : { level=1, type is observation }
}
The RANDOM_VARIABLE_DEFINITION
block can also be used to explicitly define the distribution of individual variables, however if this method is used, then an anonymous list must be specified in the INDVIDUAL_VARIABLES
block to refer to random variable defined in this way. Parameters defined in this way cannot be used with type is linear
or type is general
in the INDIVIDUAL_VARIABLES
block.
For example:
RANDOM_VARIABLE_DEFINITION(level=ID) {
CL ~ LogNormal3(median = POP_CL, stdevLog = SD_LNCL)
…
}
INDIVIDUAL_VARIABLES{
:: {type is rv, variable=CL}
…
}
Covariates can be included via definition in the GROUP_VARIABLES
block, but this may limit interoperability and translation of the model to target software.
For example:
(See GROUP_VARIABLES
block example above for definition of GRP_CL)
RANDOM_VARIABLE_DEFINITION(level=ID) {
CL ~ LogNormal3(median = GRP_CL, stdevLog = SD_LNCL)
…
}
The RANDOM_VARIABLE_DEFINITION
block is also used to specify non-continuous outcome variables i.e. count, binary, categorical. See section 4.12.2.
4.9 INDIVIDUAL_VARIABLES
The INDIVIDUAL_VARIABLES
block is used to express how the fixed effect variables (population parameters, covariates with their associated fixed effect parameters) and random effects (defined in the RANDOM_VARIABLE_DEFINITION
block) combine to define the individual variables which will be used in the MODEL_PREDICTION
block to calculate predictions for given inputs. If this is not a population model or if variables are completely defined through the GROUP_VARIABLES
block then this block is not required. However, that might break interoperability with some tools like Monolix, which require the definition of individual parameters
There are three principle ways of defining INDIVIDUAL_VARIABLES
and these will be described below.
The only way of defining INDIVIDUAL_VARIABLES
that is currently supported for parameter estimation across target software is the “linear after transformation method described in section 4.9.1 below.
4.9.1 Mixed effect model with linear fixed effects and normally distributed
random effects
In some cases it is possible to express the fixed effects of covariates for a population parameter as a linear model with normally distributed random effects, sometimes employing a simple transformation (log, logit etc.) to achieve this.
We refer to this as a linear covariate model and this equates to the following mathematical definition:
\[{h(\psi}_{i}) = h\left( \left. \ \psi_{\mathrm{\text{pop}}} \right.\ \right) + \ \beta C_{i} + \eta_{i}\]
\(\psi_{i}\) – Individual parameter
\(\psi_{\mathrm{\text{pop}}}\) – Typical or population mean parameter
\(\beta\) – Fixed effects
\(C_{i}\) – Covariates
\(\eta_{i}\) – Random effect
h – Transformation function – typically log, logit, probit etc.
The MDL syntax for this form of specification is:
<Individual parameter>
: {type is linear, trans is <h>,
pop = <Population STRUCTURAL parameter>,
fixEff = [ {coeff = <Fixed Effect STRUCTURAL parameter for covariate>,
cov = <Covariate in COVARIATES block conforming to rules below>}
,
… <Additional coefficient and covariate pairs as above> ],
ranEff = [ RANDOM_VARIABLE_DEFINITION parameter(s) ] )
The fixEff
and trans
arguments are optional.
Note that the syntax allows conditional assignment of list types to the parameter. So if the model for a parameter varies according to a covariate or variable value, then this can be reflected in the condition applied.
Note that left hand side transformations of the individual parameters are no longer allowed. The transformation specified in the trans
argument applies to both the left hand side and right hand side of the equation. The ranEff argument expects a vector of random variables. If there is only one random variable the square brackets are not required.
For example (UseCase1):
CL : {type is linear, trans is ln, pop = POP_CL,
fixEff = [{coeff=BETA_CL_WT, cov=logtWT}] ,
ranEff = ETA_CL }
Using this construct for individual variables equates to the MU referencing approach in NONMEM and the standard definition of individual parameters in Monolix.
As discussed in the documentation of the Data Object defining covariates (section 3.3.5.1) certain constraints are placed on the type of covariate used in this form of specification. When covariates are defined within the Model Object COVARIATES
block and used in the specification of INDIVIDUAL_PARAMETERS
using the linear( …, fixEff=[{coeff=<coefficient>, cov = <covariate>}] )
construct, they must have particular properties:
They must be constant within an individual or constant within an occasion.
They will only be allowed simple transformations within the model e.g. centering on a median / mean and/or log transformation, logit transformation.
The transformation cannot depend on another covariate.
Statistical models (random effects) on covariates are not supported.
If a categorical covariate is used, then the catCov argument in fixEff should refer to the appropriate category of the covariate. For example (UseCase5):
CL : {type is linear, trans is ln,
pop = POP_CL,
fixEff = [
{coeff = BETA_CL_WT, cov = logtWT},
{coeff = POP_FCL_FEM, catCov = SEX.female },
{coeff = BETA_CL_AGE, cov = tAGE}
],
ranEff = ETA_CL }
If the categorical covariate has more than k>2 categories then the user needs to specify k-1 dichotomous “dummy covariates to specify the factor levels and appropriate contrasts (between level k and an appropriate comparison value). For example when adding GENOTYPE as a categorical covariate, the user may want to compare each category of GENOTYPE to a suitable reference value of the covariate. In this case all of the “dummy dichotomous comparison covariates should be included in the model in one step.
If between occasion variability is specified in a RANDOM_VARIABLE_DEFINITION
block then the associated random effects can be specified in a vector form of the ranEff attribute. These will be added into the linear equation. For example (UseCase8):
CL : {type is linear, trans is ln, pop = POP_CL,
fixEff = [{coeff=BETA_CL_WT, cov=logtWT}] ,
ranEff = [eta_BSV_CL, eta_BOV_CL ]}
4.9.2 General mixed effect model with Gaussian random effects.
The second formulation for the INDVIDUAL_PARAMETERS
block uses variables defined in the GROUP_VARIABLES
block and assumes that the random effect is additive i.e. is Gaussian (Normally distributed) or Gaussian after transformation.
We refer to this as a “general or Gaussian after transformation model and the associated mathematical representation is:
\[{h(\psi}_{i}) = H\left( \beta,C_{i} \right) + \ \eta_{i}\]
\(\psi_{i}\) – Individual parameter
\(\psi_{\mathrm{\text{pop}}}\) – Typical or population mean parameter
\(\beta\) – Fixed effects
\(C_{i}\) – Covariates
\(\eta_{i}\) – Random effect
H – Arbitrary function
h – Transformation function – log, logit, probit.
Where \(H\left( \beta,C_{i} \right)\) is defined in the GROUP_VARIABLES
block.
The MDL syntax for this form of specification is:
<Individual parameter> : {type is general,
grp = <GROUP_VARIABLES defined variable >,
trans is <ln / logit / probit>,
ranEff = [ RANDOM_VARIABLE_DEFINITION parameter(s) ] )
The trans argument is optional.
If the trans argument is used, then it is assumed that appropriate transformations have been made in the GROUP_VARIABLES
block or in the assigned value for the grp attribute to ensure that the fixed effect and random effect are additive and on the correct scale given the transformation.
For example, for the GROUP_VARIABLES defined in section 4.7 above:
CL : {type is general, grp = ln(GRP_CL),
trans is ln,
ranEff = ETA_CL)
This corresponds to the following equation for CL:
\(\ln\left( \text{CL} \right) = \ln\left( \text{GR}P_{\text{CL}} \right) + ETA_{\text{CL}}\)
which, after back-transformation is equivalent to:
\(CL = GRP_{\text{CL}}*exp(ETA_{\text{CL}}\))
4.9.3 Mixed effect model defined by equations
The individual variables can also be defined using expressions by combining parameters with variables defined in GROUP_VARIABLES
and random effects .
For example:
CL = POP_CL * exp(ETA_CL)
Or (using a variable GRP_CL defined in the GROUP_VARIABLES
block as defined above)
CL = GRP_CL * exp(ETA_CL)
It is also possible to define fixed and random effect expressions as follows:
CL=POP_CL * (WT/70)^0.75 * exp(eta_PPV_CL)
Which can be log transformed into a linear form of mixed effect model like this:
CL=exp(ln(POP_CL)+ 0.75 * ln(WT/70) + eta_PPV_CL)
However, note that while it is possible for the user to “see that the equation above is linear in the fixed and random effects, it is not possible for the MDL-IDE to determine this. To specify linear models we must explicitly do so using the {type is linear, … }
construct described in section 4.9.1.
4.9.4 INDIVIDUAL_VARIABLES without inter-individual variability
For interoperability reasons, parameters defined in the STRUCTURAL_PARAMETERS
block without associated variability i.e. where the individual value is the same as the parameter value, should be defined within the INDIVIDUAL_VARIABLES
block. If the model parameter is constrained to be positive, then an appropriate transformation should be used to ensure a positive value.
4.9.5 INDIVIDUAL_VARIABLES
where the variable is defined in the
RANDOM_VARIABLE_DEFINITION
block.
As discussed in section 0 above, if the individual variable is defined completely via a distribution in the RANDOM_VARIABLE_DEFINITION
block, then an anonymous list must be used to declare the variable within the INDIVIDUAL_VARIABLES
block.
The syntax for the anonymous list is:
:: {type is rv, variable = <RANDOM_VARIABLE_DEFINITION
variable>}
4.9.6 Conditional assignment of INDIVIDUAL_VARIABLES
It is possible to apply conditional handling to the assignment of INDIVIDUAL_VARIABLES
. Note that the conditioning occurs on the RIGHT HAND SIDE of the expression ONLY. The conditioning statement should follow the conventions described in section 9.1.4.4.
The syntax is as follows:
<Individual variable>
: if(condition1) then { INDIVIDUAL_VARIABLE list 1 }
elseif(condition2) then { INDIVIDUAL_VARIABLE list 2 }
else { INDIVIDUAL_VARIABLE list 3 }
Note the use of an “else statement to ensure that
For example:
CL : if(RF==RF.normal) {type is linear, trans is ln,
pop = POP_CL, fixEff = {coeff = BETA_CL_WT, cov = logtWT},
ranEff = ETA_CL }
else {type is general, trans is ln,
grp = ln(GRP_CL),
ranEff = ETA_CL }
In the above, the expression for individual Clearance is conditional on whether the subject’s renal function (RF) is “normal or not. The user would need to have defined a suitable model for GRP_CL in the GROUP_VARIABLES
block. (RF is a categorical data variable that has a symbolic value of “normal and other categories which are not used in this example).
Please also see sections 9.1.4.4 and 9.1.4.5 for further information on handling of conditional statements.
4.9.7 INDIVIDUAL_VARIABLES
definitions in practice.
As has been discussed above, to facilitate interoperability we strongly suggest that users try to formulate their models using the {type is linear, … } form shown in section 4.9.1 with the caveat included about the forms of covariate relationships that can be used within this construct.
In some cases, users may have to consider how their model is constructed more carefully. For example, in a pharmacodynamic model:
PD = PD_BASELINE + PD_BETA\*CP + ETA_PD
It may be tempting to try to write this as a {type is linear, … }
relationship with CP as a covariate, but recall that covariates may not be time-varying, and CP would almost certainly break this rule.
If we encode GRP_PD = POP_BASELINE + POP_BETA\*CP
as a GROUP_VARIABLE
and then add ETA_PD in INDIVIDUAL_VARIABLES
using the {type is general, … }
form then the GRP_PD is also time-varying.
INDIV_PD : {type is linear, pop = GRP_PD, ranEff = [ETA_PD])
However if we break the above model into components, then we can use {type is linear, … }
to express an individual baseline
INDIV_BASE : {type is linear, pop = POP_BASELINE, ranEff = ETA_BASE)
INDIV_BETA : {type is linear, pop = POP_BETA, ranEff = ETA_BETA)
We can then move the linear relationship with CP to the MODEL_PREDICTION
block
MODEL_PREDICTION{
PD = INDIV_BASE + INDIV_BETA\*CP
}
Using the INDIVIDUAL_VARIABLES
block to define individual parameters which are then used in MODEL_PREDICTION
should allow most models to be interoperable.
4.10 MODEL_PREDICTION
The MODEL_PREDICTION
block is where the structural model predictions are defined. Calculations use mathematical expressions that may involve the population parameters (structural) as well as group and individual variables (parameters).
If a MODEL_PREDICTION
block is not supplied this is not an error but requires that any prediction referred to in the OBSERVATION
blocks has been defined using variables in a GROUP_VARIABLES
or INDIVIDUAL_VARIABLE block.
For example below we present the MODEL_PREDICTION
block using variables DOSE, V, CL, V and TIME.
MODEL_PREDICTION{
DOSE::dosingVar # recall that DOSE must be declared before use in
analytical models.
CONC=DOSE/V*exp(-CL/V*TIME)
}
If a DEQ sub-block is specified then variables calculated within the DEQ sub-block can be referred to outside of this block to calculate the model prediction. An example is given below.
To ensure interoperability, any variable used in the MODEL_PREDICTION
block must be either:
- the independent variable
- defined in
MODEL_PREDICTION
- declared in INDIVIDUAL_VARIABLES using {type is linear, … }
- defined as “use is variable in the DATA_INPUT_VARIABLES block of the Data Object
This implies in particular that STRUCTURAL_PARAMETERS
, VARIABILITY_PARAMETERS
, GROUP_VARIABLES
and random variables defined in RANDOM_VARIABLES_DEFINITION
cannot be used in MODEL_PREDICTION
.
4.10.1 DEQ
Use of a DEQ
sub-block is optional – differential equations may be used anywhere in the MODEL_PREDICTION
block – but it is encouraged to use this sub-block for clarity and readability of the resulting code.
The DEQ
sub-block specifies the structural model through differential equations. The general form is
<VARIABLE> : { deriv = <expression>, init = <Real number>,
x0 = <Real number> }
The DEQ
sub-block combines equations and differential equations and the resulting system of equations is integrated across the independent variable, usually time.
init = <Real number>
is the initial value of the differential equation
x0 = <Real number>
is the starting value of the integrator. For most systems involving time, this is zero.
By default, init = 0
and x0 = 0
. If the default is to be used, these arguments can be dropped from specification of the differential equation.
For example:
MODEL_PREDICTION {
DEQ{
RATEIN = if(T >= TLAG) then GUT \* KA
else 0
GUT : { deriv =(- RATEIN), init = 0, x0 = 0 }
CENTRAL : { deriv =(RATEIN - CL \* CENTRAL / V) }
}
CC = CENTRAL / V
} # end MODEL_PREDICTION
4.10.2 On Tlag and Bioavailability
Since MDL has no reserved variable names, there is no mechanism for target software to identify lagtime and bioavailability. . Models with lag times and bioavailability must use the COMPARTMENT
sub-block to specify these input attributes.
4.10.3 COMPARTMENT
The COMPARTMENT
sub-block is intended to provide the user with a modular approach to describe PK processes through definition of the drug input, distribution, and elimination processes. The functions defined are influenced by the PK macros approach in Monolix. The table below shows how the Compartment definitions in MDL correspond to PK Macros as defined in Monolix.
MDL Compartment | Monolix PK Macros |
---|---|
direct | iv |
depot | absorption |
elimination | elimination |
distribution | peripheral |
effect | effect |
transfer | transfer |
compartment | compartment |
The major differences over the implementation in Monolix are that in MDL, the “from and “to attributes define the links between compartments and processes and that there are no reserved names for compartments or variables e.g. using “K12 and “K21 as variable names confers no special meaning to the use of these variables.
COMPARTMENT
definitions are translated into PK Macros in Monolix and where possible they are mapped to ADVAN closed-form solutions in NONMEM.
COMPARTMENT
sub-block processes are specified as lists with attributes depending on the processes being described.
4.10.3.1 Input & absorption
There are two COMPARTMENT
block processes describing input to the system (typically drug input). These are direct and depot. direct defines bolus or zero-order input processes, while depot describes first-order, zero-order or transit chain drug input processes.
Input format is of the form:
<VARIABLE NAME> : { type is <depot / direct>,
to = <VARIABLE>,
<other arguments> }
The other arguments depend on the process being described. The table below describes the possible combinations of attributes for different input and absorption processes.
Compartment Type | Attribute Combination |
---|---|
Direct | to, modelDur(O), tlag(O), finput(O) |
Depot | to, ka, tlag(O), finput(O) |
to, modelDur, tlag(O), finput(O) | |
to, ka, ktr, mtt | |
to, modelDur, ktr, mtt |
( O ) = Optional attribute
INPUT_KA : {type is depot, to=CENTRAL, ka=KA,
tlag=ALAG1, finput=F1}
4.10.3.2 Distribution processes
MDL defines drug distribution (movement of drug between compartments) through COMPARTMENT
block definitions with type compartment, distribution.
type is compartment
is used to define PK compartments which have an associated input and elimination process. Peripheral compartments with type is distribution
will not have type is input
or type is elimination
processes associated with them.
The syntax is:
<VARIABLE NAME> : { type is compartment }
The modelCmt argument is not used.
For example:
CENTRAL : { type is compartment }
Compartments where the transfer of drug in and out is defined through model variables have type is distribution
.
<VARIABLE NAME> : {type is distribution, from = <VARIABLE NAME>,
kin = <VARIABLE NAME>, kout = <VARIABLE NAME> }
For example, to specify a peripheral compartment in a two compartment PK model:
PERIPHERAL : {type is distribution, from=CENTRAL, kin=Q/V2, kout=Q/V3}
A “type is effect process provides a means to describe the transfer of amounts from a given compartment to an effect compartment e.g. for use with PD models.
<VARIABLE NAME> : {type is effect, from = <VARIABLE> NAME>, keq = <VARIABLE NAME>}
Compartment Type | Attribute Combination |
---|---|
distribution | from, kin, kout |
compartment | |
effect | from, keq |
4.10.3.3 Elimination and transfer processes
Elimination is defined via a list with “type is elimination and specification of the compartment from which drug is eliminated along with variable names for the volume of distribution in the compartment from which drug is eliminated and the micro constant or apparent clearance from the compartment.
If the amount of eliminated drug is not of interest, it is not necessary to name this process. If this is the case, an anonymous list must be used:
:: { type is elimination, from = <VARIABLE NAME>, v = <VARIABLE NAME>,
<k / cl> = <VARIABLE NAME> }
For example in the one compartment model:
:: {type is elimination, from=CENTRAL, v=V, cl=CL}
Note that the v argument refers to the volume of distribution in the compartment defined by the from argument.
A “type is transfer process has also been provided which defines the one-way transfer of drug amounts from one compartment to another.
<VARIABLE NAME> : {type is transfer, from = <VARIABLE NAME>,
to = <VARIABLE NAME>, kt = <VARIABLE NAME> }
For example: :: {type is transfer, from=LATENT, to=CENTRAL, kt=K23}
Compartment Type | Attribute Combination |
---|---|
elimination | from, v, k |
from, v, cl | |
from, vm, km | |
transfer | from, to, kt |
4.11 Combining COMPARTMENT
and DEQ
blocks
It is possible to use COMPARTMENT
to describe the input processes for differential equations in the DEQ
block. Using the type is depot
or type is direct
compartments allows the user to specify lag time and bioavailability (tlag and finput) which will translate to appropriate terms in target software e.g. ALAGn and Fn in NM-TRAN. If the COMPARTMENT
specification is not used then model parameters are treated in a very general way and there is no way of mapping these to target tools to implement these input attributes.
If the COMPARTMENT
sub-block is not used then delay absorption processes and/or bioavailability needs to be explicitly encoded.
For example in UseCase4 differential equations are used to describe IV and oral administration. The time of dosing DT is passed either from the data or via DATA_DERIVED_VARIABLES
(see section 2.5):
MODEL_PREDICTION {
DT
DEQ{
RATEIN = if(T-DT >= TLAG) then GUT * KA
else 0
GUT : { deriv =(- RATEIN), init = 0, x0 = 0 }
CENTRAL : { deriv =(RATEIN * FORAL - CL * CENTRAL / V), init = 0, x0 = 0 }
}
CC = CENTRAL / V
} # end MODEL_PREDICTION
In the above example, note that there is a discontinuity in RATEIN which is not well handled with differential equations. Note also that the model does not handle multiple doses since then DT (time of dose) and hence RATEIN would be reset to zero for each new dose.
To alleviate this problem, we can use COMPARTMENT
with DEQ
to handle the input processes correctly. The code below shows how to include lag time and bioavailability in a model using differential equations (UseCase4_2):
MODEL_PREDICTION {
COMPARTMENT{
INPUT_KA : {type is depot, to=CENTRAL, ka=KA, finput=FORAL, tlag=TLAG}
INPUT_CENTRAL : {type is direct, to = CENTRAL}
}
DEQ{
CENTRAL : { deriv =( - CL \* CENTRAL / V), init = 0, x0 = 0 }
}
CC = CENTRAL / V
} # end MODEL_PREDICTION
(Note that this model is not exactly equivalent to UseCase4 which does not allow for superposition of dosing). Note the first-order input to the CENTRAL compartment (INPUT_KA) is type is depot
while the bolus or zero-order rate input (INPUT_CENTRAL) is type is direct
.
The model corresponding to UseCase4 is shown below (UseCase4_3). In this case there is an explicit differential equation for the depot compartment (GUT)
MODEL_PREDICTION {
COMPARTMENT{
INPUT_KA : {type is direct, to = GUT, finput=FORAL,
tlag=TLAG}
INPUT_CENTRAL : {type is direct, to = CENTRAL}
}
DEQ{
GUT : { deriv =(- GUT \* KA), init = 0, x0 = 0 }
CENTRAL : { deriv =(GUT \* KA - CL \* CENTRAL / V), init = 0,
x0 = 0 }
}
CC = CENTRAL / V
} # end MODEL_PREDICTION
In the code above, note that both administrations use type is direct
, but the oral administration (INPUT_KA) has finput and tlag specified. In contrast to the above, the direct input would exactly correspond to the DEQ specification in UseCase4, however the TLAG in UseCase4 is treated as a general parameter, while the TLAG in UseCase4_3 will be translated to ALAGn in NMTRAN.
4.12 OBSERVATION
The OBSERVATION
block provides the definition of the outcome variable using the prediction from the MODEL_PREDICTION
block and/or RANDOM_VARIABLE_DEFINITION
at the observation level. Calculations or equations needed for this definition can be defined in the OBSERVATION
block e.g. calculation of weights for type is userDefined
error models.
If more than one outcome is specified then we specify each outcome separately within the OBSERVATION
block. Conditional assignment to outcome definitions is possible if the outcome depends on a covariate or calculated variable. However it is not necessary to conform multiple outcomes to a single observation variable name e.g. Y.
4.12.1 Continuous outcomes
For continuous outcomes the OBSERVATION
block defines how a variable from the MODEL_PREDICTION
block and RANDOM_VARIABLE_DEFINITION
providing the residual unexplained random variables are combined in a function to define the outcome.
The mathematical representation of the outcome variable is (after Lavielle, 2014)
\[h\left( y_{\text{ij}} \right) = h\left( f\left( x_{\text{ij}},\psi_{i} \right) \right) + g\left( f\left( x_{\text{ij}}{,\psi}_{i} \right),\ \xi \right)\varepsilon_{\text{ij}}\]
Where
\[y_{\text{ij}} = \text{jth}\ \mathrm{\text{observation\ for\ subject}}\ i\]
h = Transformation of the outcome to ensure that the resulting function is an additive function of f and g. Specfied in MDL error models described below using “trans is
f = structural model prediction from the MODEL_PREDICTION
block. Specified in the MDL error models described below through the “prediction argument.
g = functional definition of the residual error model. Specified in the MDL error models through the type is additiveError | proportionalError | combinedError1 | combinedError2
\(\psi_{i}\) = individual parameters defined in the INDIVIDUAL_VARIABLES block
\(x_{i,j}\) = covariates and regression variables e.g. time, concentration etc.
\(\xi\) = parameters of the residual error model defined in the VARIABILITY_PARAMETERS block and referred to in the appropriate additive
and proportional
arguments.
\(\varepsilon_{\text{ij}}\)= residual error defined in RANDOM_VARIABLE_DEFINITION
block and referred to in the MDL error models described below through the eps
argument.
The syntax for definition of continuous outcome variables is:
< OUTCOME VARIABLE NAME> : {type is additiveError |
proportionalError | combinedError1 | combinedError2 | userDefined,
<additional arguments defined in table below> }
The following residual error model functions are defined, as described in MDL Language Reference section 2.21 and reiterated here.
Name | Return Type | Argument name | Argument Types |
---|---|---|---|
additiveError | Real | trans (Optional) | Builtin |
lhsTrans | Boolean | ||
additive | Real | ||
prediction | Real | ||
eps | Real | ||
proportionalError | Real | trans (Optional) | Builtin |
lhsTrans | Boolean | ||
proportional | Real | ||
prediction | Real | ||
eps | Real | ||
combinedError1 | Real | trans (Optional) | Builtin |
lhsTrans | Boolean | ||
additive | Real | ||
proportional | Real | ||
prediction | Real | ||
eps | Real | ||
combinedError2 | Real | trans (Optional) | Builtin |
lhsTrans | Boolean | ||
additive | Real | ||
proportional | Real | ||
prediction | Real | ||
eps | Real | ||
userDefined | Real | value | Real |
weight | Real | ||
prediction | Real |
combinedError1
defines the following model:
\[h\left( y_{\text{ij}} \right) = h\left( f\left( x_{\text{ij}},\psi_{i} \right) \right) + \left( a + bf\left( x_{\text{ij}}{,\psi}_{i} \right) \right)\varepsilon_{\text{ij}}\]
combinedError2
defines the following model:
\[h\left( y_{\text{ij}} \right) = h\left( f\left( x_{\text{ij}},\psi_{i} \right) \right) + \sqrt{a^{2} + b^{2}f\left( x_{\text{ij}}{,\psi}_{i} \right)}\varepsilon_{\text{ij}}\]
the lhsTrans Boolean argument allows the user to apply transformations to the DV without having to transform the data column prior to analysis.
As with the type is linear
and type is general
definitions in the INDIVIDUAL_VARIABLES
block, we use defined types here to make explicit the relationships between predictions and residual random variable terms to facilitate interoperability between target software.
Use type is userDefined
to specify an arbitrary relationship between prediction, the residual error random variable which is typically Normal(0,1), and the with associated function g(.) defined above. Using this form ensures that correct calculation of the weighted residuals can be calculated.
The current version of MDL does not support definition of outcomes with arbitrary functions of variables and random variables. Any equations written in the OBSERVATION
block are treated as variables to be used in definition of the observation through a list definition as described above (including UserDefined). If a list definition is not used, then the observation equation may not be translated correctly to the target software tool.
4.12.2 Discrete data
Discrete data outcomes are described by referencing a suitable distribution for the outcome. In this version of MDL we assume that the parameters of the relevant distributions are supplied either in the data, for example the number of trials, N, in a binomial distribution, or are defined in the MODEL_PREDICTION
block.
In this version of MDL we assume an identity link for all models – that is the parameter supplied to the distribution must be on the appropriate scale for that distribution – the Poisson rate parameter must have a positive value, probabilities for binary and categorical distributions must be on the scale (0,1).
Count, discrete, categorical outcomes must be specified within the RANDOM_VARIABLE_DEFINITION
block for the dv level via a suitable ProbOnto distribution (see also section 0). The random variable is then declared in the OBSERVATION
block via an anonymous list, as has been seen previously when defining individual variables via the RANDOM_VARIABLE_DEFINITION
block. Since continuous outcomes have a residual error specified at the DV level, it is inferred that the outcome defined in the OBSERVATION
block is at the DV level of variability. However for other types of data, it is less clear that this is the case. Thus we must use RANDOM_VARIABLE_DEFINITION
to define these outcomes using ProbOnto definitions and then provide additional information in the OBSERVATION
block to assign additional attributes.
The syntax is as follows:
RANDOM_VARIABLE_DEFINITION(level=DV){
<outcome variable> ~ <ProbOnto distribution>
}
OBSERVATION{
:: {type is <count/discrete/categorical>, variable = <outcome variable>}
}# end ESTIMATION
See below for examples pertaining to specific outcome types.
4.12.2.1 Count data
For count data, we have the following syntax:
RANDOM_VARIABLE_DEFINITION(level=DV){
<variable> ~ <ProbOnto distribution for count data e.g. Poisson1>
}
OBSERVATION{
:: {type is count, variable = <variable>}
}# end ESTIMATION
For example (also showing the appropriate INDIVIDUAL_VARIABLES
, MODEL_PREDICTION
and OBSERVATION
blocks) (UseCase11)
INDIVIDUAL_VARIABLES{
BASECOUNT : {type is linear, trans is ln,
pop = POP_BASECOUNT, ranEff = eta_PPV_EVENT }
BETA = POP_BETA
}# end INDIVIDUAL_VARIABLES
MODEL_PREDICTION{
lnLAMBDA=ln(BASECOUNT) + BETA\*CP
LAMBDA = exp(lnLAMBDA)
}
RANDOM_VARIABLE_DEFINITION(level=DV){
Y ~ Poisson1(rate=LAMBDA)
}
OBSERVATION{
:: {type is count, variable = Y}
}# end ESTIMATION
Note in the above example that the BASECOUNT variable is specified using the linear function and a natural log transformation on both sides to ensure that BASECOUNT is positive. The linear relationship with CP (plasma concentration) is defined within the MODEL_PREDICTION
block. We cannot use CP as a covariate in the linear(…)
function as CP varies with time and so is regarded as a regressor rather than a covariate. In UseCase11 since there is no model for the pharmacokinetics we use CP as the independent variable (IDV
) in the model and use is idv
in the DATA_INPUT_VARIABLES
block. We also take exponential of lnLAMBDA to ensure that the variable LAMBDA is on the positive scale before using this in the Poisson distribution.
This is an example where a little consideration of the random effects and model prediction can facilitate interoperability. Writing an equation for the INDVIDUAL_VARIABLES we may have defined
INDIVIDUAL_VARIABLES{
lnLAMBDA = ln(POP_BASECOUNT) + BETA\*CP + eta_PPV_EVENT
}
Using this formulation of the model though would not guarantee interoperability with some target software for estimation since the equation for lnLAMBDA is user-defined.
There are many distributions defined in ProbOnto which will describe count data. The diagram below (Figure 4.1) illustrates a few of these and relationships between them as described in the ProbOnto Knowledge Base (www.probonto.org)
4.12.2.2 Binary data
Similar to count data above, we define the binary outcome and its distribution in a RANDOM_VARIABLE_DEFINITION
block at the observation level of variability. Note how we define the names of the categories on the left hand side, and then the probability distribution defines the likelihood of the second category.
RANDOM_VARIABLE_DEFINITION(level=DV){
Y withCategories {<category1>,<category2>}
~ <ProbOnto distribution e.g. Bernoulli | Binomial>
}
For example (again, showing the INDIVIDUAL_VARIABLES
, MODEL_PREDICTION
and OBSERVATION
blocks to show the model construction):
RANDOM_VARIABLE_DEFINITION(level=ID){
eta_PPV_EVENT ~ Normal(mean=0, var=PPV_EVENT )
}# end RANDOM_VARIABLE_DEFINITION
INDIVIDUAL_VARIABLES{
indiv_BASE : {type is linear, pop= POP_BASEP,
ranEff=[eta_PPV_EVENT], trans is logit}
}# end INDIVIDUAL_VARIABLES
MODEL_PREDICTION{
LP = logit(indiv_BASE) + POP_BETA\*CP
P1 = invLogit(LP)
}# end MODEL_PREDICTION
RANDOM_VARIABLE_DEFINITION(level=DV){
Y withCategories {none, event} ~ Bernoulli1(probability=P1)
}
OBSERVATION{
:: {type is discrete, variable = Y }
}# end ESTIMATION
Note that the INDIVIDUAL_VARIABLES
block defines the individual baseline by combining the population parameter and the random effect. Note also that this specification uses a logit transformation to ensure that the individual baseline indiv_BASE variable is on the (0,1) probability scale. Then, the linear regression with plasma concentration (CP) is defined in the MODEL_PREDICTION
– Note that CP is not a covariate. Finally LP is back-transformed to the probability scale to give variable P1 which is the probability of an event to be used in the Bernoulli distribution. By defining how the 0,1 in the data correspond to named events {none, event} it is easier to understand exactly what category is being modelled.
An alternative distribution for the same model is the Binomial distribution with one trial:
RANDOM_VARIABLE_DEFINITION(level=DV){
Y withCategories {none, event} ~ Binomial1(numberOfTrials=1, probability=P1)
}
OBSERVATION{
:: {type is discrete, variable = Y }
} # end of OBSERVATION
4.12.2.3 Categorical data
Again, similar to count and binary data, the syntax for Categorical data outcomes is:
RANDOM_VARIABLE_DEFINITION(level=DV){
<variable> withCategories{ <category1>, …, <categoryk> }
~ <ProbOnto distribution for categorical data
e.g. CategoricalNonordered1 | CategoricalOrdered1>
}
OBSERVATION{
:: {type is categorical, variable=<variable>}
}
For example (UseCase13_1):
GROUP_VARIABLES{
B0 = Lgt0
B1 = B0 + Lgt1
B2 = B1 + Lgt2
}
INDIVIDUAL_VARIABLES{
indiv_B0 : {type is general, grp = B0, ranEff = eta_PPV_EVENT}
indiv_B1 : {type is general, grp = B1, ranEff = eta_PPV_EVENT}
indiv_B2 : {type is general, grp = B2, ranEff = eta_PPV_EVENT}
}# end INDIVIDUAL_VARIABLES
MODEL_PREDICTION{
EDRUG = Beta * CP
A0 = indiv_B0 + EDRUG
A1 = indiv_B1 + EDRUG
A2 = indiv_B2 + EDRUG
P0 = invLogit(A0)
P1 = invLogit(A1)
P2 = invLogit(A2)
Prob0 = P0
Prob1 = P1 - P0
Prob2 = P2 - P1
Prob3 = 1 - P2
} # end MODEL_PREDICTION
RANDOM_VARIABLE_DEFINITION(level=DV){
Y withCategories{ none, mild, moderate, severe }
~ CategoricalOrdered1(categoryProb=[Prob0, Prob1, Prob2, Prob3])
}
OBSERVATION{
:: {type is categorical, variable=Y}
}
In the above code, the cutpoints between categories are defined in the GROUP_VARIABLES
block (B0, B1, B2) and individual values for these are defined in the INDIVIDUAL_VARIABLES
block. The linear effect of CP (plasma concentration) is defined in the MODEL_PREDICTION
block and this is added to the individualised cutpoints (A0, A1, A2). These are then back-transformed to the probability scale (P0, P1, P2) and the ordered categorical model is defined by calculating the probability of each category as the difference from the previous category – Prob0, Prob1, Prob2, Prob3.
4.12.3 Time to event data
Time to event (TTE) models are modelled by specifying the hazard function. The PharmML to target software tool converters handle the translation of the hazard specification to target tool implementation. For some software this involves calculation of the survival function and associated likelihood.
For an arbitrary hazard function \[\lambda\left(t\right)\]:
Hazard function | \[\lambda\left( t \right)\] |
---|---|
Cumulative hazard function | \[\Lambda\left( a,\ b \right) = \ \int_{a}^{b}{\lambda\left( t \right)\text{dt}}\] |
Survival function | \[P\left( T > t \right) = e^{- \Lambda\left( t_{0},t \right)}\] |
Probability density function | \[p\left( t \right) = \lambda\left( t \right)e^{- \Lambda\left( t_{0},t \right)}\] |
Cumulative distribution function | \[P\left( T < t \right) = \int_{0}^{t}{p\left( s \right)\text{ds}}\] |
For an introduction to TTE models see (Holford 2013) and for a tutorial in implementation in NONMEM and Monolix see (N. H. M. Lavielle 2011).
The MDL syntax for time to event outcomes is :
<OUTCOME VARIABLE NAME> : { type is tte, hazard = <VARIABLE> }
For example (UseCase14.mdl):
INDIVIDUAL_VARIABLES{
BTATRT = POP_BTATRT
H_BASE = POP_HBASE
}
MODEL_PREDICTION{
HBASE=H_BASE/365
HAZTRT=BTATRT * TRT
HAZ = HBASE * (1+HAZTRT)
} # end MODEL_PREDICTION
OBSERVATION{
Y : {type is tte, hazard = HAZ }
} # end ESTIMATION
In the above case the hazard and the effect of treatment on the hazard is calculated in the GROUP_VARIABLES
block. This is then used in the MODEL_PREDICTION
block to calculate the hazard for the event. The model outcome variable Y is then defined as having type tte and the hazard calculated in the MODEL_PREDICTION
block is passed in as an argument. Specification of the model is then very simple for the user – no calculation of Survival functions nor likelihood is necessary.
In the current MDL, TTE models are able to be handled equally by NONMEM and Monolix. To facilitate this, we impose certain constraints on dataset conventions. There must be a data record at the start of the interval during which the hazard will be integrated. We use DV = 0
to denote right censoring and DV = 1
to denote an event.
Currently only exact time of event and right censoring is supported in MDL. Future versions will support interval censoring and repeated time to event.
The convention in NONMEM datasets of using MDV to identify the start of the observation period for assessing TTE cannot be used.
4.13 FUNCTIONS
This block allows users to define their own functions, for example for use with interpolation.
The syntax is as follows:
<function name> :: function(<argument1> :: <argument1 type>,
<argument2> :: <argument2 type>,
<argumentk> :: <argumentk type>) :: <function result type>
is
<expression using argument1… argumentk>
Specifying the types of each argument and the type of the function result allows validation of the function inputs and outputs.
To call the function, the user types &
before the function name.
For example:
FUNCTIONS{ myInterp::function(t::real, x0::real, t0::real, x1::real,
t1::real)::real
is
x0
}
DATA_INPUT_VARIABLES {
ID : { use is id }
TIME : { use is idv }
WT : { use is covariate, interp=&constInterp }
AGE : { use is covariate, interp=&myInterp }
}
4.14 POPULATION_PARAMETERS
The POPULATION_PARAMETERS
block allows the user to define models and model parameters that exist at the population level. This may be used in hierarchical models to define population parameters when there are additional levels of hieararchy above the individual.
In most “population approach models, the individuals are assumed to be drawn from a population, the characterstics of which are described through fixed and random effect models. Inferences are then made on the “population parameters which we take to be representative of the population from which the individuals are drawn.
In meta-analysis across many studies, regions, demographic populations it may be useful to characterise additional levels of hierarchy characterising how individuals within each study, region or demographic differ systematically (through fixed effect models) and randomly (through variability models). These higher level models can be expressed in the POPULATION_PARAMETERS
block.
In the current MDL, population models are defined through combination of RANDOM_VARIABLE_DEFINITION
and POPULATION_PARAMETERS
definitions. In the current MDL expressions (equations) are NOT allowed in the POPULATION_PARAMETERS
block.
The POPULATION_PARAMETERS
block uses random variables defined in the RANDOM_VARIABLE_DEFINITION
block. The syntax for the POPULATION_PARAMETERS
block is as follows:
POPULATION_PARAMETERS{
:: {type is < continuous | categorical >,
variable = < RANDOM_VARIABLE_DEFINITION variable > }
}
For example (/FourModels/Hierarchical_Model.mdl):
RANDOM_VARIABLE_DEFINITION(level=POP){
w_pop ~ Normal(mean = ws, sd = gw)
V_pop ~ Normal(mean = Vs, sd = gV)
} # end RANDOM_VARIABLE_DEFINITION
POPULATION_PARAMETERS{
:: {type is continuous, variable=w_pop}
:: {type is continuous, variable=V_pop}
}
RANDOM_VARIABLE_DEFINITION(level=ID) {
ETA_BSV_V ~ Normal(mean = 0, sd = omega_V)
} # end RANDOM_VARIABLE_DEFINITION
INDIVIDUAL_VARIABLES {
V : {type is linear, trans is ln, pop=V_pop,
fixEff={coeff=BETA_WT, cov=WT}, ranEff=ETA_BSV_V}
} # end INDIVIDUAL_VARIABLES
MODEL_PREDICTION {
D
f = D/V * exp(-k*T)
} # end MODEL_PREDICTION
In the model above, the mean weight for each population is drawn from a Normal random variable. The mean Volume of distribution also varies for each population and is similarly drawn from a Normal random variable. The parameters for inference would be Vs (“global mean of Volume of distribution), gV (between population variability), omega_V (between individual variability), BETA_WT (fixed effect of Weight). V_pop gives the population prediction of Volume of distribution for each population observed in the data, while V gives the individual predicted Volume of distribution.
References
Gelman, Andrew. 2006. “Prior Distributions for Variance Parameters in Hierarchical Models(Comment on Article by Browne and Draper).” Bayesian Analysis 1. Institute of Mathematical Statistics: 515–34. doi:10.1214/06-ba117a.
Swat, Maciej J., Pierre Grenon, and Sarala Wimalaratne. 2016. “ProbOnto: Ontology and Knowledge Base of Probability Distributions.” Bioinformatics 32 (17). Oxford University Press (OUP): 2719–21. doi:10.1093/bioinformatics/btw170.
Holford, Nick. 2013. “A Time to Event Tutorial for Pharmacometricians.” CPT: Pharmacometrics & Systems Pharmacology 2 (5). Wiley-Blackwell: e43. doi:10.1038/psp.2013.18.
Lavielle, Nick Holford; Marc. 2011. “A Tutorial on Time to Event Analysis for Mixed Effect Modellers.” http://www.page-meeting.org/pdf_assets/2573-time-to-event-tutorial.pdf.