Mads notebook: Model diagnostics

MADS is an integrated high-performance computational framework for data/model/decision analyses.

MADS

MADS can be applied to perform:

Here, it is demonstrated how MADS can be applied to solve a general model diagnostic problem.

Most of the tasks listed above are demonstrated below.

Problem setup

Import Mads (if MADS is not installed, first execute in the Julia REPL: import Pkg; Pkg.add("Mads")):

Setup the working directory (in this case, the working directory is the location where this notebook is located):

Create a problem dictionary (the dictionary is applied to store all the information about the model applied to demonstrate the solution of the model diagnostic problem):

Setup model parameters:

There are 4 model parameters (a, b, c, and n). The initial values and the prior distributions (based on prior knowledge of the parameter uncertainty) are defined for each parameter.

Setup model observations:

There are 6 observations (o1, o2, o3, ... and o6). The calibration targets, observation weights (i.e., the inverse of measurement standard deviations), and acceptable ranges are defined for each observation.

Setup the model:

A function (called polynomial) is defined to compute the 6 observations given the 4 model parameters as an input:

The polynomial function is set up now in the md dictionary as a model that will be applied to perform the simulations:

The analyzed model captured in the problem dictionary can be:

The model can also be a reduced-order model developed using machine learning.

Set a default name for MADS input / output files:

Now, the problem dictionary md is fully defined:

And the model diagnostic problem is set up!

Forward model simulation

A single forward model run based on the initial model parameter values can be executed as follows:

The forward model run can also be executed using the following command:

The runs above produce outputs representing model predictions at the six observations over time.

The forward simulations are based on the initial guesses for the model parameters.

The initial model predictions can be plotted:

The figure above shows that the true observations are not well reproduced by the model using the initial model parameter guesses.

Model calibration (inversion)

The calibration (inversion) of the developed model is achieved using the following command:

The code returns 2 objects.

calib_param is a dictionary of the calibrated model parameters.

calib_information contains calibration information.

The obtained model predictions can be plotted:

Initial values of the model parameters are:

Estimated values of the model parameters based on the model calibration (inversion) are:

Model calibration (inversion) for a set of random initial guesses

The model inversion can also be performed for a set of random initial guesses for model parameters.

The final parameter estimates from the 100 random-initial-guess inverse runs are collected into a matrix below:

Plot the final predictions of the 100 random-initial-guess inverse runs:

The figure above demonstrates that there are several different global minima.

There are three important groups of results with different n values:

The code below identifies and plots solutions associated with these 3 distinct groups:

Analysis of predictive sensitivities and uncertainties

Local sensitivity and uncertainty quantification

localsa["stddev"] defines the estimated posterior uncertainties in the estimated model parameters.

This estimate is based on the Jacobian / Hessian matrix estimates of the parameter space curvature in the vicinity of the estimated (inverted) optimal parameters.

The uncertainties are assumed to be Gaussian with standard deviations defined by localsa["stddev"].

Based on these results, c is well constrained. n is also well defined. In contrast, a and b are less certain.

However, because of the local nature of the estimates, these results are not very accurate and differ with the global sensitivity and uncertainty analyses presented below.

The plots below show a series of graphical representations of the localsa results. These plots are generated automatically by the code.

A plot of the Jacobian representing the relationships between model parameters and estimated observations:

A plot of the eigen matrix of the Hessian (the Hessian is approximately computed based on the Jacobian above):

A plot of the eigen values of the Hessian:

The eigen analysis presented above suggests that a and b are correlated (this is expected based on the mathematical form of the solved model in the function polynomial). Both parameters are represented by the first and last (4th) eigen vectors.

The parameters n and c are uncorrelated and also independent of a and b.

Global sensitivity and uncertainty quantification

Affine Invariant MCMC

Our module AffineInvariantMCMC.jl (aka EMCEE) is applied to perform global sensitivity and uncertainty quantification:

The results above capture 10,000 equally likely parameter combinations. The parameter combintations represent the global sensitivity and uncertainty of the model parameters and associated predictions. A forward run based on this set (chain) is executed below:

The figure above compares the 10,000 model predictions with the actual measurements (red dots).

The figure below shows the histograms of the posterior model uncertainties (along the diagonal) and the cross-plots between the parameters (off-diagonal plots; the cross-plots above and below the diagonal are similar):

The figure above shows that the optimal (most probable) estimates are:

c i the most constrained (varying between -0.2 and 0.2).

There are plausible solutions for any value of a, b and n within the prior uncertainty range.

The parameters a and b are strongly inversely correlated by their respective cross-plots.

Based on the cross-plots, the plausible values for n can be within the entire prior uncertainty range if (1) a is equal to 0 and (2) b is equal to 1.

The plausible values for n are close to 1 if (1) a is very different from 0 and (2) b is very different from 1.

Saltelli (Sobol) and EFAST global sensitivity analyses

Both Saltelli (Sobol) and EFAST methods are producing similar results. Both methods are designed to perform global sensitivity analyses. EFAST is computationally more efficient.

The Saltelli (Sobol) results are obtained as follows:

The EFAST results are obtained as follows:

The differences in the total and main effect plots suggest correlations in the model parameters (which is also demonstrated by the AffineInvariantMCMC analyses above).

The figures also demonstrate that the parameter sensitivity to observations changes over time.

Based on the total effect, parameter a and n sensitivities generally increase with time. Parameter b and b sensitivities generally decrease with time.

Decision Analysis using Information-Gap Decision Theory

Define the Information-Gap Decision Theory horizons of uncertainty h:

Define the polynomial models to be explored:

Execute the infogap analyses, collect the obtained results, and produce a figure summarizing the results:

The figure above compares the model opportuneness (dashed lines) vs model robustness (solid lines) for different infogap horizons of uncertainty h and different models (different colors).

The model opportuneness defines that the things might get better than expected (i.e., observation at dimensionless time 5 o5 can get lower than expected).

The model robustness defines that things might get worse than expected (i.e., observation at dimensionless time 5 o5 can get higher than expected).

Based on both the model opportuneness and model robustness, the last model is the most complex and can bring the most surprises. The first model is the simplest and produces the lower level of surprises.

In terms of model selection, the simplest model is the best. However, the alternative models (if they capture all the conceptual model uncertainties) represent how much things can get worse/better within the horizon of uncertainty.