A hybrid variational-ensemble data assimilation scheme to estimate the vertical and horizontal parts of the background error covariance matrix for an ocean variational data assimilation system is presented and tested in a limited-area ocean model implemented in the western Mediterranean Sea. An extensive data set collected during the Recognized Environmental Picture Experiments conducted in June 2014 by the Centre for Maritime Research and Experimentation has been used for assimilation and validation. The hybrid scheme is used both to correct the systematic error introduced into the system by the external forcing (initialisation, lateral and surface open boundary conditions) and model parameterisation, and to improve the representation of small-scale errors in the background error covariance matrix. An ensemble system, generated through perturbation of the assimilated observations, is run offline for further use in the hybrid scheme. Results of four different experiments have been compared. The reference experiment uses the classical stationary formulation of the background error covariance matrix and has no systematic error correction. The other three experiments account, or not, for the systematic error correction and for a hybrid background error covariance matrix combining the static and the ensemble-derived errors of the day. Results show that the hybrid scheme, when used in conjunction with the systematic error correction, reduces the mean absolute error of the temperature and salinity misfits by 55 and 42 % respectively, compared with statistics arising from standard climatological covariances without systematic error correction.

The study and characterisation of the ocean constitute a complex discipline involving different aspects of modern science. In order to obtain a coherent and time-evolving three-dimensional (3-D) picture of the ocean from historical and present-day observations, as well as to be able to predict the future evolution of the environment, we need to solve both theoretical and technical issues.

It is not feasible to observe all variables of interest with adequate spatial and temporal scales. Modern technologies, like satellite remote sensing and autonomous vehicles, have significantly increased our capability to observe the environment in general and the ocean in particular. However, the huge number of degrees of freedom characterising the ocean state still prevents sampling at the desired resolution. In order to fill the observational gaps and expand the temporal horizon covered by the observations (both in the past and in the future), oceanographers combine direct observations with theoretical studies by means of models and data assimilation.

A numerical hydrodynamic model is basically the discretised version of the primitive equations: it is an approximation of nature. Moving from the continuous to the discrete space, additional approximations are introduced and should be accounted for when analysing model results. These approximations affect the model solutions in terms of quality and accuracy and, more importantly, differences between the numerical solution and the true state amplify over time, due to the chaotic component of the ocean dynamics.

In order to minimise these differences and improve the quality and accuracy of model results, data assimilation techniques have been developed during the past decades. Data assimilation is a technique that corrects the model solution based on statistical and physical constraints derived from observations and model simulations.

Although different kinds of data assimilation techniques exist, most of them rely on the same basic principle: the combination of physically based and statistical approaches to maximise the conditional probability of the model state given the observations.

Data assimilation schemes developed for oceanographic studies can be classified into two categories. The first is the Kalman filter (KF) type of algorithm, with background error covariance (BEC) matrices usually derived from ensemble statistics (Evensen, 2003). The second type employs stationary BECs derived from long-term model integrations (Yin et al., 2011; Weaver and Courtier, 2001; Pannekoucke and Massart, 2008). A key avenue to improving data assimilation is the accurate specification of the error statistics for the background forecast, also known as the prior or first guess (Schlatter et al., 1999).

The ensemble Kalman filter (EnKF) (Evensen, 1994) consists of a set of short-term forecasts and data assimilation cycles. In the EnKF, the BECs are estimated from an ensemble of model simulations. The presumed benefit of utilising these ensemble-based techniques is their ability to provide a flow-dependent estimate of the BECs. The traditional EnKF incorporates probabilistic information on analysis errors in the generation of the ensemble by imposing a set of perturbations for each ensemble member, generating the individual numerical forecasts from different sets of initial conditions implied by the different sets of observations and/or different numerical model configurations. The EnKF is related to the classic Kalman filter, which provides the optimal analysis in the case that the forecast dynamics are linear and both background and observation errors have normal distributions. The main difference is that the KF explicitly forecasts the evolution of the complete forecast error covariance matrix using linear dynamics, while the EnKF estimates this matrix from a sample ensemble of fully non-linear forecasts. The EnKF also addresses the computational difficulty of propagating or even storing the forecast error covariance matrix. Using ensemble simulations also implies that the EnKF does not assume the covariances to propagate linearly.
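The core idea of estimating the BECs from a sample of forecasts can be sketched in a few lines. The following Python/NumPy snippet is illustrative only (it is not part of the paper's system): it forms the sample covariance from the member perturbations about the ensemble mean.

```python
import numpy as np

def ensemble_covariance(X):
    """Sample background error covariance from an ensemble.

    X : (n_state, n_members) array of ensemble forecast states.
    Returns B = A A^T / (m - 1), where A holds the member
    perturbations about the ensemble mean (unbiased normalisation).
    """
    m = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)  # perturbations about the mean
    return A @ A.T / (m - 1)
```

For a small ensemble this matrix is rank deficient (rank at most m - 1), which motivates the localisation and hybridisation techniques discussed below.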

On the other hand, many current and past operational data assimilation methods use long time series of previous forecasts to develop stationary, and often also spatially homogeneous, approximations to the BECs. Schemes that use such statistics include optimum interpolation and three-dimensional variational data assimilation (3DVAR), and have the advantage of being less computationally demanding, thereby allowing for higher resolution. In reality, BECs may vary substantially depending on the flow and on the error of the day. A four-dimensional variational data assimilation (4DVAR) system implicitly includes a time-evolving covariance model through the evolution of initial errors under tangent-linear dynamics (Lorenc, 2003) within the assimilation time window. However, the time-evolving covariance model may still be limited by the use of a stationary covariance model at the beginning of each 4DVAR cycle. Furthermore, like the EnKF, 4DVAR is computationally intensive, requiring multiple integrations of the tangent-linear and adjoint versions of the forecast model. The specification of flow-dependent statistics is per se a demanding task, due to the difficulty of retrieving information on errors in model space.

The EnKF provides an alternative to variational data assimilation systems. Under assumptions of linearity of error growth and normality of observation and forecast errors, it has been proved that the EnKF scheme produces the correct BECs as the ensemble size increases (Burgers et al., 1998). However, for smaller ensembles, the EnKF is rank deficient and its BEC estimates suffer from a variety of sampling errors, including spurious correlations between widely separated locations, which need to be removed by means of specific techniques (e.g. covariance filtering or localisation).
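Localisation is typically implemented as a Schur (element-wise) product of the ensemble covariance with a distance-dependent taper. As a minimal sketch (not the paper's implementation), the snippet below uses a simple Gaussian taper in place of the compactly supported Gaspari-Cohn function usually employed, for a 1-D set of grid-point coordinates:

```python
import numpy as np

def localise(B_ens, coords, L):
    """Schur-product localisation of an ensemble covariance.

    Distant, likely spurious covariances are damped by an
    element-wise product with a correlation taper. A Gaussian
    taper with length scale L is used here for brevity; operational
    systems usually prefer the compactly supported Gaspari-Cohn
    function, which sets covariances to exactly zero beyond a cut-off.
    """
    d = np.abs(coords[:, None] - coords[None, :])  # pairwise 1-D distances
    taper = np.exp(-0.5 * (d / L) ** 2)
    return B_ens * taper
```

The taper equals 1 at zero separation, so variances (the diagonal) are left untouched while remote covariances are strongly damped.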

Assimilation methods using a static type of BEC have recently gained considerable attention because of their flexibility (Lorenc, 2003). Furthermore, present computational resources limit the number of ensemble members affordable in operational EnKF implementations. Thus, it is appealing to have an algorithm that can work with smaller ensembles and that can benefit from whatever flow-dependent information such a smaller ensemble provides.

Recent encouraging results suggest that if ensemble information is used in the variational data assimilation framework to augment the static BEC, the analyses can be improved. Hereinafter, we call this method a "hybrid" scheme. Development of hybrid schemes has been an area of active research in atmospheric data assimilation: studies by Hamill and Snyder (2000), Etherton and Bishop (2004) and Wang et al. (2007) used simple models and simulated observations to demonstrate the effectiveness of incorporating ensembles in 3DVAR to improve the analyses. Although the recourse to hybrid covariances, and the choice of the relative weights given to them, remains empirical in practice, it has been shown in particular that hybrid schemes tend to be more robust than conventional ensemble-based data assimilation schemes, especially when the model errors are larger than the observational ones (Wang et al., 2007, 2008, 2009). This feature is attractive for regional assimilation problems in oceanography, where information on the background state is often scant and incomplete. A promising application of the hybrid scheme in a global oceanographic exercise has recently been provided by Penny et al. (2015). They compared hybrid, classical 3DVAR and EnKF schemes in an observing system simulation experiment and with real data, showing that the hybrid scheme reduces errors for all prognostic model variables and eliminates the growth in biases present in the EnKF and 3DVAR.
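The essential operation of a hybrid scheme is a weighted combination of the static and ensemble-derived covariances. A minimal sketch (the weight alpha is an illustrative parameter; its value remains an empirical choice, as noted above):

```python
import numpy as np

def hybrid_covariance(B_static, B_ens, alpha):
    """Linear blend of static and ensemble ('errors of the day') covariances.

    alpha in [0, 1] is the weight given to the stationary covariance;
    alpha = 1 recovers pure 3DVAR statistics, alpha = 0 a purely
    ensemble-based estimate. The choice of alpha is empirical.
    """
    assert 0.0 <= alpha <= 1.0
    return alpha * B_static + (1.0 - alpha) * B_ens
```

In practice the blend is applied through the covariance operators of the variational scheme rather than by forming the full matrices explicitly, but the algebra is the same.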

Recent work has also started addressing the issue of multi-scale data assimilation, where the analyses are a combination of corrections with different spatial-scale signals, assuming that spatial scales are to some extent separable and that observations may naturally bear information across several spatial scales. Examples of these schemes range from multi-scale 3DVAR systems (MS-VAR) and sequential applications of horizontal operators with different correlation length scales (Mirouze et al., 2016) to the inclusion of a large-scale analysis as an additional constraint in the analysis formulation (Guidard and Fischer, 2008). A possible simplification is to assume that systematic errors are characterised by long length scales, as is often the case to some extent (Dee, 2005).

In this study, we develop a hybrid data assimilation system for the REP14-MED (Mediterranean Recognized Environmental Picture, 2014) NEMO model implementation, based on the existing 3DVAR system. Section 2 describes the adopted hybrid variational data assimilation scheme, accounting for systematic error corrections. In Sect. 3 details on the experiment set-up are provided. In Sect. 4 the results are presented and discussed. Finally, Sect. 5 offers the summary and conclusions.

A 3DVAR algorithm has been used to implement and test our hybrid assimilation
scheme. 3DVAR is relatively easy to implement and to extend: it can easily
accommodate different estimates of the BEC, its core is independent
of the primitive equation model core, and it is portable. The cost function
in 3DVAR is defined as

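As a reminder of the general form, the textbook 3DVAR cost function penalises the departure of the state from the background (weighted by B) and from the observations (weighted by R). The sketch below is a generic numerical illustration with dense matrices, not a transcription of the specific operators used in this scheme:

```python
import numpy as np

def j3dvar(x, xb, B, y, H, R):
    """Textbook 3DVAR quadratic cost function:

        J(x) = 1/2 (x - xb)^T B^{-1} (x - xb)
             + 1/2 (H x - y)^T R^{-1} (H x - y)

    x  : candidate state,  xb : background state
    B  : background error covariance,  R : observation error covariance
    H  : (linear) observation operator,  y : observation vector
    """
    db = x - xb        # background departure
    do = H @ x - y     # observation departure
    return 0.5 * db @ np.linalg.solve(B, db) + 0.5 * do @ np.linalg.solve(R, do)
```

Operational implementations never invert B explicitly; the minimisation is performed in a preconditioned control space, as described in the following.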
Following Dobricic and Pinardi (2008), the present 3DVAR scheme assumes that
the

Basically, the background error covariance matrix is modelled as a linear
sequence of several

In our static formulation of the 3DVAR, the vertical transformation operator

The proposed approach introduces the flow-dependent

By defining the control vector

Ensemble statistics may also provide flow-dependent ("of the day") estimates
of the horizontal correlation radii to be used in

According to Belo Pereira and Berre (2006), for any simulated error

Though most data assimilation methods assume that the model forecast (i.e. the background) is unbiased, this is rarely the case. Model bias can systematically cause the model to drift away from the truth, eventually propagating into the analyses. In limited-area models (LAMs) integrated for relatively short times, the systematic errors (biases) may derive from inadequate model physics and parameterisations as well as from inaccurate initialisation and open boundary conditions, including the atmospheric forcing. An adequate treatment is strictly necessary, since systematic error in the large-scale forcing fields can prevent the right small-scale dynamics from developing properly and thus can strongly reduce the potential benefits deriving from the increased resolution and/or improved physics.

Here, we assume that systematic errors are associated with large-scale errors. This idea is consistent with the high-resolution model presented in Sect. 3 and with the experimental set-up, where the large-scale uncertainties (initialisation, boundary conditions and surface forcing) are not accounted for in the generation of the ensemble members.

Further expanding the decomposition introduced in Eq. (7) and following
recent studies suggesting the possibility of treating multiple-scale errors
during the analysis step (Li et al., 2015), we reformulate the analysis
increments as

Top-left panel: western Mediterranean Sea. Bottom-right panel: model domain and collected data during REP14-MED experiment. Green dots indicate sea level anomaly measurements from satellite (from 3 to 30 June 2014), red dots CTD positions (from 7 to 24 June 2014), blue dots glider trajectories (surfacing points, from 8 to 23 June 2014). The magenta lines indicate the box used to compute ensemble statistics. Bathymetric lines are also shown (m).

The availability of an ensemble simulation allows us to retrieve estimates of
the model bias or systematic error. Recalling that

In our formulation we assume that the scales in

Such a scheme thus requires a fairly dense observational network to estimate the bias, whose availability may in general depend on the simulation area and period. The method is potentially affected by systematic observational errors and is thus sensitive to the design of the observational networks. On the other hand, the analysis of the systematic error can provide useful insight into the error not represented in the ensemble space and thus help in the definition of the ensemble generation procedure.

Depending on data availability and ensemble size, the bias estimator can be constant, or spatially or temporally dependent.

In June 2014, a REP14-MED sea trial off the west coast of Sardinia was
conducted by CMRE (Centre for Maritime Research and Experimentation),
coordinating efforts of 20 partners from six different nations. Two research
vessels collected a massive amount of data in an area of approximately

Model configuration details.

NEMO (Nucleus for European Modelling of the Ocean; Madec, 2008) has been
implemented as the primitive equations dynamical model component of the data
assimilation system. The ocean engine of NEMO is adapted to regional and
global ocean circulation problems. Prognostic variables are the meridional
and zonal velocities, sea surface height, temperature and salinity. In the
horizontal direction, the model uses a curvilinear orthogonal grid and in the
vertical direction, a full or partial step

Number of observations per day. The colour coding is according to
Fig. 1. The

An ensemble data assimilation system with 14 independent
members and daily assimilation cycles has been run to generate the
ensemble statistics. All simulation/assimilation experiments presented
hereafter started on 1 June and ended on 30 June 2014, the REP14-MED period.
All the experiments are initialised and forced at the lateral open boundaries
using the Mercator-Ocean (Drévillon et al., 2008) product in the
Mediterranean Sea, while surface fluxes are computed by means of bulk
formulae using hourly atmospheric data with 7.0 km horizontal resolution
provided by the Italian Meteorological Centre and based on the COSMO-ME model, an
implementation of the COnsortium for Small-scale MOdelling (COSMO). The
ensemble members have been generated simultaneously assimilating perturbed
observations varying the corresponding observational error, and assuming
different horizontal correlation radii in

For the observation perturbation, either weak or strong criteria for retaining observations are used among the ensemble members. Conservative quality-check procedures require good quality flags in both temperature and salinity and reduce the total number of assimilated observations. Filters have been applied horizontally and vertically to reduce the higher spatial sampling of the observations with respect to the model grid. Within the ensemble members, different vertical cut-off scales have been used in the low-pass filter, resulting in differently smoothed profiles. Horizontal data binning has been applied to the observations falling within 1 or 2 model grid cells, while keeping the original vertical resolution. When the filtering or binning procedures are applied, the standard deviation of the corresponding full-resolution profile has been used as an estimate of the observational error. Similar procedures have been applied to CTD and glider data.
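The vertical smoothing step can be sketched as follows. The exact filter is not specified in the text, so this illustration assumes a Gaussian kernel whose width plays the role of the cut-off scale; as described above, the local spread of the full-resolution profile around the smoothed value is returned as the observation-error estimate.

```python
import numpy as np

def smooth_profile(z, v, cutoff):
    """Low-pass filter a vertical profile (illustrative Gaussian kernel).

    z, v    : depths and values of the full-resolution profile
    cutoff  : vertical cut-off scale; different values among ensemble
              members yield differently smoothed profiles.
    Returns the smoothed profile and, per level, the weighted standard
    deviation of the full-resolution data about the smoothed value,
    used as an observation-error estimate.
    """
    smoothed = np.empty_like(v, dtype=float)
    err = np.empty_like(v, dtype=float)
    for i, zi in enumerate(z):
        w = np.exp(-0.5 * ((z - zi) / cutoff) ** 2)
        w /= w.sum()                       # normalised kernel weights
        smoothed[i] = np.sum(w * v)
        err[i] = np.sqrt(np.sum(w * (v - smoothed[i]) ** 2))
    return smoothed, err
```

Larger cut-off scales remove more vertical structure and produce larger error estimates wherever the profile has fine-scale variability.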

The default horizontal correlation radii (

The ensemble generation method spans the uncertainty linked with the
observational sampling and assimilation formulation, implicitly acting on the
background ensemble spread. This method clearly connects the growth of the
ensemble spread to the function used to perturb the observations and
simultaneously links the ensemble spread to observation availability. For the
time being, the perturbation of surface and lateral boundary conditions is
not considered, assuming that the flow-dependent component of

All the ensemble members use a static and homogeneous

Example of perturbed CTD vertical profile with different quality
check procedure and filtering applied. The solid black line indicates the
full resolution CTD profile while horizontal lines are the associated
observational error. The other colours indicate the perturbed profile. In the
middle panel the three tested couples of horizontal correlation length scales
(

In the hybrid variational assimilation system, the generated ensemble
information has been projected into

Experiments.

The ensemble statistics have also been used to estimate the model systematic
error and a large-scale systematic error correction has been applied. For
every simulated day,

To test bias correction and the impact of the ensemble-based EOFs, results
from four different experiments are compared. Exp-ref uses climatological,
spatially homogeneous

The quality of the ensemble has been evaluated on the basis of ensemble spread values and distributions. The ensemble spread is defined as the standard deviation across the ensemble members.
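This definition translates directly into code. A minimal sketch (the normalisation, here the unbiased ddof=1 form, is an assumption; the text does not specify it):

```python
import numpy as np

def ensemble_spread(ens):
    """Ensemble spread: standard deviation across members.

    ens : (n_members, ...) array, one leading axis for the members;
          the remaining axes index grid points / variables.
    Uses the unbiased (ddof=1) normalisation, an assumption here.
    """
    return ens.std(axis=0, ddof=1)
```

Applying this to each model level and averaging horizontally over the observational box yields the depth-time sections discussed below.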

In Fig. 4, the time evolution of the temperature and salinity standard deviations computed from the ensemble members is shown from the surface to 1000 m depth. At all depths, and for both temperature and salinity, the ensemble spread reaches a stable value on 10–12 June, after 2–3 days of assimilating the CTD and glider data from the first cruise leg (Fig. 2). The small spread during the first days is mostly confined to the surface layers and is due to the SLA assimilation. Between 13 and 22 June, the ensemble spread is nearly constant at all depths, probably constrained by the dense observational network, meaning that only a few days are needed to spin up our ensemble system. Later in the simulated period, the data density decreases and the temperature and salinity ensemble spread behaviours differ significantly. The salinity ensemble spread remains nearly constant, whereas the standard deviation of surface temperature decreases. We can speculate that the decreased ensemble variability in temperature is due to the surface and lateral forcing shared among all the ensemble members, which rapidly constrains the temperature within the model domain when no observations are assimilated. Salinity, on the other hand, reacts more slowly to surface forcing. Thus, the methodology used to generate the ensemble could be improved to also account for errors in the external forcing (surface and lateral open boundary conditions) and model parameterisations.

Ensemble standard deviation computed within the observational box (magenta) shown in Fig. 1. The ordinate indicates ocean depth in metres, the abscissa time.

Horizontal maps of ensemble standard deviation for temperature (left) and salinity (right) at 0.5 m on 12, 18, 24 and 30 June 2014.

The horizontal distribution of the near-surface (0.5 m depth) ensemble standard deviations for temperature and salinity valid for 12, 18, 24 and 30 June are shown in Fig. 5.

During the simulated period the two state variables show different behaviours. Temperature standard deviation maxima are mostly confined within the observational space and have well-defined small- to medium-size structures. On 30 June, when no more in situ observations are available, a large-scale maximum is evident close to the north-west open boundaries of the domain, partially due to the SLA assimilation and to the different structures and dynamics developed by the individual ensemble members approaching the open boundaries. The salinity spread horizontal distributions, on the other hand, are significantly different.

During the entire simulated period, maxima of the salinity ensemble spread are evident outside the area sampled by the observational campaign and structures are generally larger than in temperature. These are probably due to errors in the salinity content of water masses forcing the simulations at the lateral open boundaries and conflicting with in situ observations, thus generating fronts and instabilities. The adopted method to generate the ensemble members does not account for uncertainties in the forcing (surface or lateral) or initialisation; further work is necessary in order to assess the impact of the forcing perturbation. The present work focuses on the potential benefits of a hybrid approach, rather than on evaluating the ensemble generation itself.

The top panel shows the static and spatially homogeneous vertical
error correlation matrix, the bottom panel the ensemble estimate on 22 June
2014 at lat 7.0

Figure 6 shows an example of how the ensemble method changes the estimates of
the salinity and temperature error vertical correlations and
cross-correlations. On 22 June 2014, the ensemble estimates exhibit
correlations of background temperature and salinity significantly different
from the climatological estimate. Clearly, the ensemble method has added
information to the climatological estimates from the variability generated by
the ensemble simulations on particular days. An interesting feature of
temperature and salinity vertical error correlations on 22 June 2014 is the
presence of several local maxima and minima. The similarities between static
and ensemble-based correlations reflect the error in the large-scale
dynamical processes, introduced into our system by the lateral open boundary
conditions. Salinity correlations (top right corner of both Fig. 6 panels)
show the largest differences between climatological/steady and the daily
estimates. While the salinity climatological correlation field is
characterised by generally positive values, in the daily estimate a clear
anti-correlation pattern is observed starting at level 50 and persisting
toward the bottom. This clearly indicates a more complex vertical error
structure, probably due to the presence of an intermediate water mass (the
Modified Levantine Intermediate Waters) and to deficiencies of the model in
simulating it correctly. Similar patterns, even if less pronounced, are also
observed in the temperature correlation and temperature–salinity
cross-correlations. Furthermore, vertical scales of the correlations differ
significantly. For instance, salinity vertical correlations are longer at the
ocean bottom in case of the ensemble

Left panels:

The temperature and salinity corrections due to the systematic error are shown in Fig. 7. The panels on the left show how the vertical structure of the systematic error, averaged over the entire domain, evolves during the simulated period. The right panels show maps of the systematic error correction averaged between 12 and 28 June at 100, 350 and 1000 m depth. During the first 4 days, the number of in situ observations increases and the spatial coverage improves. The systematic error computation, and thus the corresponding correction, is strongly affected by this observation sampling error. The sampling error is particularly evident in the surface and near-surface corrections (between the surface and 300 m depth), where the scale of horizontal variability is small and the corrections oscillate between positive and negative values. In the deeper layers, the amplitude of this oscillation is significantly smaller. However, the overall effect of the correction after 4 days is to decrease the warm bias present in the deep-temperature initial conditions and to increase the salinity content at intermediate depths. At the end of the first cruise leg, on 11 June, the systematic error stabilises. After the initial shock due to the correction of the initial state, the systematic error correction corrects errors due to the surface forcing, the lateral open boundary conditions and the inadequate model parameterisations.

The combined analysis of vertical structures and horizontal maps supports some
inferences. A thin layer with negative temperature correction is present
between 5 and 15

The surface salinity field reacts slowly to surface forcing. Simulation
errors are mostly due to advection/diffusion processes. The salinity
corrections in the first 100

The impacts of the daily ensemble-based

In order to fully assess the performance of each experiment, the mean squared
error (MSE) is decomposed following Oke et al. (2002) and the individual
components are analysed:
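A sketch of this decomposition is given below. It follows the standard Murphy (1988)-type split used by Oke et al. (2002), MSE = MB^2 + SDE^2 with SDE^2 = sig_m^2 + sig_o^2 - 2 sig_m sig_o CC; the exact normalisations used in the paper may differ, so treat this as illustrative:

```python
import numpy as np

def error_components(model, obs):
    """Decompose the model-observation mean squared error.

    MB  : model bias, mean of (model - obs)
    SDE : standard deviation error, the bias-removed (centred) RMSE
    CC  : cross-correlation between model and observations
    SS  : skill score, 1 - MSE / var(obs)
    Population (ddof=0) statistics are assumed throughout.
    """
    d = model - obs
    mb = d.mean()
    sde = np.sqrt(((d - mb) ** 2).mean())   # centred RMSE
    cc = np.corrcoef(model, obs)[0, 1]
    ss = 1.0 - (d ** 2).mean() / obs.var()
    return mb, sde, cc, ss
```

With these definitions MB^2 + SDE^2 equals the MSE exactly, so the bias and variability contributions to the total error can be reported separately, as in Table 3.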

Model bias (MB), standard deviation error (SDE), cross-correlation (CC) and skill score (SS) for the different experiments, integrated between different layers. For each quantity the best-performing model is highlighted in bold.

The analysis of the single components of the model error allows us to
identify the effect of the bias correction procedure and the impact of the
daily, ensemble-based estimate of vertical covariance. The two simulations
without the bias correction (Exp-ref and Exp-Hy1) are characterised by a
similar vertical structure and values of temperature and salinity RMSE
(Figs. 8a and 9a for temperature and salinity respectively). Both
simulations show a large temperature RMSE below 500 m depth
and a maximum in salinity RMSE at about 400

Vertical profiles of temperature error components and skill score
for the different experiments: black lines indicate the Exp-ref results, red
lines indicate Exp-Hy1 results, blue lines indicate Exp-Cl1 results and green
lines indicate Exp-Hy2 results.

As Fig. 8 but for salinity.

The standard deviation error indicates the capability of our system to
correctly reproduce the amplitude of the observed spatial/temporal
variability. Differences between climatological and daily estimates of the
background error covariance are evident. The usage of daily hybrid

The differences introduced by the daily, ensemble-based estimates of the
background vertical error covariance are evident analysing the
cross-correlation (Figs. 8d and 9d for temperature and salinity respectively)
and the skill scores (Figs. 8e, 9e). The Exp-Hy2 with systematic error
correction and daily estimate of the vertical error background covariance has
a temperature cross-correlation generally higher than the other experiments.
These differences are at a maximum between 20 and 80

The overall experiment statistics are listed in Table 3. Exp-Hy2 vertically
integrated temperature skill score is 55 %; 47 % is due to the
systematic error correction (Exp-Cl1 SS is 0.47), whereas the remaining part
is due to the introduction of the daily ensemble-based estimates of

The vertically integrated salinity skill scores of Exp-Hy2 and Exp-Cl1 are both 42 %, although they have different vertical distributions. Exp-Hy2 strongly outperforms Exp-Cl1 in the surface layers (0–50 m depth), while it does not significantly improve the model solution between 110 and 215 m depth. In the other layers the two systems perform similarly. Both the improvement and the degradation are due to the cross-correlation between observed and modelled salinities. This may be a consequence of the relatively small ensemble size, which has not adequately sampled the model error.

During June 2014 an extensive sea-trial
(Recognized Environmental Picture, REP14-MED) off the west coast of Sardinia
was conducted by CMRE (Centre for Maritime Research and Experimentation). Two
research vessels and a glider fleet collected a massive amount of data in an
area of approximately 10 000

A Nucleus for European Modelling of the Ocean (NEMO; Madec, 2008)-based model
has been implemented in the area with a horizontal resolution of
approximately 1

In order to address the data assimilation issues characterising ocean
limited-area models with dense observational networks, a 3DVAR assimilation
scheme was implemented and coupled with the NEMO-based code. Following
Dobricic and Pinardi (2008) the present variational scheme decomposes the
background error covariance matrix (

To correct the systematic error, the ensemble members' misfit statistics have been used. For every simulated day, an estimate of the systematic error has been obtained by averaging the misfit over the ensemble members, assuming that the observational error and the random model error both have zero mean. The results have been mapped onto the model grid using a univariate objective analysis (Barnes, 1994) and superimposed on the ensemble daily mean. At each assimilation step, the differences between the corrected ensemble mean and the corresponding last available daily averaged fields have been filtered with a low-pass filter with a 75 km length scale, and the results superimposed on the 3DVAR corrections.
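The first step of this procedure, the daily bias estimate at the observation points, reduces to an average over the member misfits. A minimal sketch (illustrative only; the subsequent objective-analysis mapping and 75 km low-pass filtering are not reproduced here):

```python
import numpy as np

def systematic_error_estimate(misfits):
    """Daily systematic-error estimate at the observation points.

    misfits : (n_members, n_obs) array of observation-minus-background
              misfits, one row per ensemble member.
    Averaging over the members isolates the common (systematic) part,
    under the assumption that observational and random model errors
    both have zero mean.
    """
    return misfits.mean(axis=0)
```

The resulting point estimates would then be mapped onto the model grid and low-pass filtered to retain only the large-scale component, consistent with the assumption that systematic errors have long length scales.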

The implementation of our strategy suffers from the need to empirically choose the parameters associated with the combination of stationary and ensemble-derived covariances and with the scales for the large-scale bias estimation, which can both benefit from further tuning in the future. However, these experiments represent a proof-of-concept for including flow-dependent and large-scale aspects in a variational assimilation framework.

In order to test the validity of our hypothesis and to quantitatively
estimate the differences introduced with the hybrid-variational scheme
designed, the results of four different experiments have been compared.
Exp-ref uses the standard 3DVAR scheme with static and homogeneous

There are several possible future improvements of the hybrid variational
scheme presented here for estimating background error covariances.
Ménétrier and Auligné (2015) suggested a theoretical framework
where hybrid weights and parameters for the localisation of ensemble-derived
covariances are jointly optimised as a function of the ensemble size. An
alternative possibility may be to include the

Data used in this publication are the property of the North
Atlantic Treaty Organization, which owns all rights, including intellectual
property rights. Data originated during the REP14-MED experiment by the NATO STO
Centre for Maritime Research and Experimentation (CMRE,

We start from the cost function:

To minimise Eq. (A4),

Furthermore, defining the background and analysis perturbations around the
true state

CMRE activities have been supported by the NATO Allied Command Transformation
(ACT) through the contract SAC000404. The authors wish to thank masters and
crews of NR/V