5 CSIEM: MARVL
5.1 Overview
The subsequent chapters detail the model setup and focus on specific areas of model assessment. In general, the approach to assessing the model loosely follows the CSPS framework of Hipsey et al. (2020). The framework considers:
- Level 0: conceptual evaluation; the conceptual basis for each of the hydrodynamic regimes, water quality variables, biogeochemical reactions, and habitat models as based on scientific review and data inspection;
- Level 1: traditional assessment of simulated state variables; a range of metrics is used for a large number of predicted variables and different sites;
- Level 2: evaluation of process rates; and
- Level 3: benchmarking system-level patterns and emergent properties; this is evaluated by assessments related to hydrodynamic regimes, nutrient budget analysis, nutrient cycling pathway analysis, and assessment of the relationship between the areas of habitat and water conditions.
The above assessments are supported by an analytics toolbox that coordinates the above scripts and the necessary data and workflows. In particular, csiem-marvl refers to the Cockburn Sound Integrated Ecosystem Model - Model Assessment, Reporting and Visualisation Library, a collection of scripts and tools for helping users visualise model outputs and observational datasets, and for evaluating the model’s performance. csiem-marvl uses the more generic aed-marvl package for core plotting and model-evaluation functions, in addition to a suite of custom Python and MATLAB scripts developed for specific types of analyses. The MARVL “toolkit” can be operated locally, or via the SEAF-CS cloud-based “databricks” platform.
The specific data available for validation and the assessment metrics are extensive and are summarised below. The level of model uncertainty is discussed in terms of how much confidence can be placed in the current generation of model outputs for the purposes of defining model reliability.
5.2 The Model Assessment, Reporting and Visualisation Library (MARVL)
The publicly available GitHub repository called csiem-marvl contains a wide variety of scripts and functions that are used to post-process and visualise model output. Scripts that have been specifically developed for this project are contained within the csiem-marvl repository. Plotting and model processing types include:
- Time-series plotting;
- Transect plotting;
- Model animation creation;
- Error assessment;
- Wave model plotting;
- Habitat mapping (e.g., Seagrass & Fish HSI processing and mapping);
- Scenario comparison and “DelMap” plotting;
- Nutrient budget assessments.
In particular, AEDmarvl_plot_timeseries and AEDmarvl_plot_transect are the two most frequently used functions. The AEDmarvl_plot_timeseries function uses data and GIS files stored in the csiem-marvl repository, in addition to model output, to create time-series plots of the model (averaged within a polygon region) compared against field data. The plotting function also automatically calculates a range of error statistics based on the model output and field measurements.
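The polygon-averaging step behind these time-series plots can be sketched in Python. This is a minimal illustration only; the function name and inputs are assumptions for this sketch, not the csiem-marvl API (which is implemented in MATLAB):

```python
import numpy as np

def polygon_average_timeseries(cell_values, cell_in_polygon, cell_area):
    """Area-weighted average of a model variable over the cells inside a polygon.

    cell_values:     2-D array (time, cell) of model output
    cell_in_polygon: boolean mask (cell,) flagging cells inside the polygon
    cell_area:       array (cell,) of cell areas used as weights
    Returns a 1-D array (time,): one averaged value per output timestep.
    """
    mask = np.asarray(cell_in_polygon, dtype=bool)
    values = np.asarray(cell_values, dtype=float)[:, mask]
    weights = np.asarray(cell_area, dtype=float)[mask]
    return (values * weights).sum(axis=1) / weights.sum()
```

The resulting series can then be plotted against field samples collected within the same polygon, which is the comparison shown in Figure 5.1.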

Figure 5.1: Example output from plottfv_polygon with error matrix
AEDmarvl_plot_transect can also be found in the csiem-marvl repository. It plots model data extracted along a transect line during a specified plotting period, and compares it against the range of field data found within that period.
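A transect extraction of this kind can be sketched as a nearest-cell lookup along the line. The following is a minimal illustration under that assumption, not the AEDmarvl_plot_transect implementation:

```python
import numpy as np

def extract_transect(cell_xy, cell_values, line_start, line_end, n_points=50):
    """Sample a model field along a straight transect by nearest-cell lookup.

    cell_xy:     (cell, 2) array of cell-centre coordinates
    cell_values: (cell,) array of the model variable at one time (or a mean)
    Returns (distance_along_line, sampled_values).
    """
    cell_xy = np.asarray(cell_xy, dtype=float)
    start, end = np.asarray(line_start, float), np.asarray(line_end, float)
    frac = np.linspace(0.0, 1.0, n_points)[:, None]
    pts = start + frac * (end - start)                    # points along the line
    # squared distance from every transect point to every cell centre
    d2 = ((pts[:, None, :] - cell_xy[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)                           # closest cell per point
    dist = np.linalg.norm(pts - start, axis=1)
    return dist, np.asarray(cell_values, float)[nearest]
```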

Figure 5.2: Example output from plottfv_transect with distance from Goolwa Barrage (km) along the x-axis
The csiem-marvl analysis library also houses scripts and functions for:
- Nutrient Budgeting;
- Stacked Area Transect;
- Curtain plotting;
- Mesh manipulation tools;
- Data exports;
- Sheet plotting and animation tools.
Some example plots from previous research projects using similar MARVL scripts are shown below, illustrating their capability and presentation. The csiem-marvl analysis library is under active development to meet evolving needs as data collection and modelling progress, for example sediment profiling and habitat index analysis.

Figure 5.3: Example output from plottfv_transect_StackedArea with distance along the x-axis

Figure 5.4: Example Nutrient Budgeting output of a polygon region.
5.3 Model assessment summary
5.3.1 Summary of validation data-set
The field observation data available for model validation and assessment include a diversity of historical data (collected pre-2021), and a large volume of data generated by recent monitoring and WAMSI-Westport research projects. Relevant data for validation include:
- In situ water quality sensors; high frequency measurements at fixed locations.
- Water quality grab samples.
- Biotic surveys.
- Strategic experimental data.
All data relevant to model calibration and validation are included in the CSIEM Data Catalogue and detailed in Appendix A; see also Chapter 3. The data span a wide range of locations and time periods; however, the primary model assessment generally focuses on the most intense period of monitoring, between 2022 and 2024. Long-term assessments are also undertaken for different versions of the model by comparing against the long-term monitoring data set.
5.3.2 Performance assessment metrics
The modelling results are compared against historical data collected within Cockburn Sound (where available), using both traditional statistical metrics of model error, and other metrics relevant to model performance. The approach is applied to each model generation with the aim to identify areas where the model is accurate, and areas for further improvement and ongoing calibration effort.
Error metrics : Initially, the model’s performance in predicting a range of relevant variables, including salinity, temperature, nitrogen, phosphorus and total chlorophyll-a, is assessed with a set of statistical metrics. The statistical metrics were calculated for each observation site where the number of field observations was >10 in the assessment period.
The core statistical metrics considered consist of:
- \(r\): correlation coefficient. Varies between -1 and 1, with a score of 1 indicating the model varies perfectly with the observations and a negative score indicating the model varies inversely with the observations. A consistent bias may be present even when a high r score is obtained.
- \(BIAS\): the difference between the average prediction and the average observation over the assessment period. This metric gives the magnitude and sign of the discrepancy between the model results and the observational data.
- \(MAE\): mean absolute error. Similar to RMS except the absolute value of each error is used, which reduces the bias towards large events. Values near zero indicate good model skill.
- \(RMS\): root mean squared error. Measures the mean magnitude, but not the direction, of the difference between model results and observations. Values near zero are desirable. This metric is not affected by cancellation of negative and positive errors, but squaring the errors may bias it towards large events.
- \(nash\): the Nash-Sutcliffe efficiency (also called \(NSE\) or \(MEF\)), a measure of modelling efficiency. This metric compares the performance of the model to that of a model using only the mean of the observed data. A value of 1 indicates a perfect model, while a value of zero indicates performance no better than simply using the mean of the observed data; negative values indicate worse performance than the observed mean.
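For reference, the five metrics above can be computed from paired model/observation series as in the following minimal NumPy sketch. The function name and NaN handling are assumptions for this illustration; the >10-observation rule follows the assessment protocol described above:

```python
import numpy as np

def error_metrics(model, obs):
    """Compute the core error metrics for paired model/observation arrays.

    Pairs containing NaN are dropped; metrics are only reported where more
    than 10 observations remain, mirroring the site-selection rule above.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ok = ~(np.isnan(model) | np.isnan(obs))
    model, obs = model[ok], obs[ok]
    if model.size <= 10:
        return None  # too few observations for a robust score
    resid = model - obs
    return {
        "r":    np.corrcoef(model, obs)[0, 1],        # correlation coefficient
        "BIAS": resid.mean(),                          # mean model-minus-observation
        "MAE":  np.abs(resid).mean(),                  # mean absolute error
        "RMS":  np.sqrt((resid ** 2).mean()),          # root mean squared error
        "NSE":  1.0 - (resid ** 2).sum() / ((obs - obs.mean()) ** 2).sum(),
    }
```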
Seasonality : The model results are assessed in terms of the degree of seasonal fluctuation, as seen in the field data. Whilst this is captured in the error metrics (e.g., r), the visual assessment can identify timing issues related to seasonal peaks.
Transects : The model results are assessed in terms of the seasonal mean along the length of the domain (longitudinal transect). The transect analysis allows a system-wide assessment of conditions that smooths out noise and local variability in the field data and model predictions.
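The seasonal aggregation used in these assessments can be sketched as follows. The grouping of calendar months into austral seasons is an assumption of this illustration, not a statement of the csiem-marvl convention:

```python
import numpy as np

# Assumed austral season grouping (illustrative only)
SEASONS = {"summer": (12, 1, 2), "autumn": (3, 4, 5),
           "winter": (6, 7, 8), "spring": (9, 10, 11)}

def seasonal_means(months, values):
    """Mean of a time series within each (austral) season.

    months: array of calendar months (1-12), one per sample
    values: matching array of the variable being assessed
    Returns {season: mean} for seasons with at least one sample.
    """
    months = np.asarray(months)
    values = np.asarray(values, dtype=float)
    out = {}
    for season, season_months in SEASONS.items():
        mask = np.isin(months, season_months)
        if mask.any():
            out[season] = values[mask].mean()
    return out
```

Applied per cell along a transect, this yields the seasonal-mean longitudinal profiles described above.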
Advanced measures : The model results were finally assessed considering the partitioning of nutrients into inorganic vs organic forms, and other derived measures.
5.3.3 Assessment periods
- Historical period: 1970-2010: initial assessment prior to availability of WAMSI research project data;
- Recent period: 2011-2020: initial assessment prior to availability of WAMSI research project data;
- Focus period: 2021-2023: initially calibrated against the intensive field sampling and observations obtained from different components of the WAMSI research project;
- Long-term performance: 2013 – 2024: calibrated against the long-term water quality data collected from the routine measurements, as well as from the WAMSI research project.
5.3.4 Model confidence reporting
Based on the above assessments, we evaluate confidence in the model by assigning each variable to one of the following categories:
- Good
- Acceptable, and
- Caution.
This confidence evaluation considers:
- Quality of observed data, which is influenced by field and laboratory data limitations, methodologies, processes and protocols.
- Error metric scores relative to what is typically reported in the literature for water quality models (e.g., Arhonditsis and Brett, 2004).
- Ability of the CSIEM to capture the mean of an indicator and its spatial gradient and seasonality.
- Partitioning of water quality constituents within different ecosystem pools.
- Natural variability of the indicator at different temporal scales (i.e. sub-daily to seasonal).
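A category assignment of this kind can be sketched as a simple rule over the error-metric scores. The thresholds below are purely illustrative placeholders, not the values used in the CSIEM assessment, which also weighs data quality, partitioning and variability as listed above:

```python
def confidence_category(nse, r):
    """Assign a confidence category from two error-metric scores.

    nse: Nash-Sutcliffe efficiency; r: correlation coefficient.
    Thresholds are hypothetical placeholders for illustration only.
    """
    if nse >= 0.5 and r >= 0.7:
        return "Good"
    if nse >= 0.2:
        return "Acceptable"
    return "Caution"
```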
5.4 MARVL-VIEWER
The model assessment considered ~70 assessment polygons, over 6 years, for multiple (>20) variables. To assist with reviewing these plots, readers are encouraged to browse them via the MARVL-VIEWER web-app.