The ECMWF operational ensemble reanalysis-analysis system for ocean and sea-ice: a description of the system and assessment

The ECMWF OCEAN5 system is a global ocean and sea-ice ensemble of reanalysis and real-time analysis. This manuscript gives a full description of the OCEAN5 system, with the focus on upgrades of system components with respect to its predecessors ORAS4 and ORAP5. An important novelty in OCEAN5 is the ensemble generation strategy, which includes perturbation of initial conditions and a generic perturbation scheme for observations and forcing fields. Other upgrades include revisions to the a-priori bias correction scheme, observation quality control and the assimilation method for sea-level anomaly. The OCEAN5 historical reconstruction of the ocean and sea-ice state is the ORAS5 reanalysis, which includes 5 ensemble members and covers the period from 1979 onwards. Updated versions of observation data sets are used in ORAS5 production, with special attention devoted to the consistency of sea surface temperature (SST) and sea-ice observations. Assessment of ORAS5 through sensitivity experiments suggests that all system components contribute to an improved fit to observations in the reanalyses, with the most prominent contribution coming from direct assimilation of ocean in-situ observations. Results of observing system experiments further suggest that Argo floats are the most influential observation type in our data assimilation system. Assessment of ORAS5 has also been carried out for several key ocean state variables, verified against reference climate data sets from the ESA CCI project. With respect to ORAS4, ORAS5 has an improved ocean climate state and variability in terms of SST and sea level, mostly due to increased model resolution and updates in the assimilated observation data sets. In spite of the improvements, ORAS5 still underestimates the temporal variance of sea level, and continues to exhibit large SST biases in the Gulf Stream and its extension regions, which are possibly associated with misrepresentation of front positions.
Overall, the SST and sea-ice uncertainties estimated using five ORAS5 ensemble members have spatial patterns consistent with those of the analysis error. The ensemble spread of sea-ice is commensurate with the sea-ice analysis error. In contrast, the ensemble spread is under-dispersive for SST.

The aim of this document is to describe ORAS5 as the ocean reanalysis component of the OCEAN5 system. Details of system upgrades after ORAP5 are discussed. These include updates in the surface forcing and initialization (Section 2.2), updates in in-situ observations and their assimilation (Section 2.3), updates in altimeter observations and their assimilation (Section 2.4), and generation of the ensemble perturbations (Section 2.5). The OCEAN5-RT analysis is presented in Section 3. Sensitivity experiments and assessment of ORAS5 system components can be found in Section 4. Section 5 presents evaluation results with selected ocean Essential Climate Variables.

The ORAS5 system
ORAS5 is a global eddy-permitting ocean and sea-ice ensemble reanalysis produced via the OCEAN5 system in its BRT stream. ORAS5 provides historical ocean and sea-ice conditions from 1979 onwards. A spin-up period from 1958 to 1978 is also provided (INI1 in Table 2), which can be treated as a backward extension by users who are interested in a longer reanalysis period. Here we give a brief overview of the model and methods used, with emphasis on the differences between ORAS5 and its predecessor ORAP5. These include different observation data sets of SST, SIC, and in-situ observations; updates in bias estimation and observation quality controls; and a new method for ensemble generation and initialization. Impacts of these updates have been assessed with data assimilation experiments, normally at a reduced resolution in order to reduce computing cost. It is worth pointing out that the improvements from the individual updates presented in this section may not add up to a cumulative "sum" of improvements in ORAS5, and an optimal configuration is not guaranteed when it is based on results from a low-resolution system. However, this is the standard and only feasible procedure to test many components in a complex system such as ORAS5.
2.1 Ocean-sea ice model and data assimilation

ORAS5 uses the same ocean model and spatial configuration as ORAP5 (Table 1). The NEMO ocean model version 3.4.1 (Madec, 2008) has been used for ORAS5 in a global configuration ORCA025.L75 (Barnier et al., 2006), a tripolar grid which allows eddies to be represented approximately between 50°S and 50°N (Penduff et al., 2010). Model horizontal resolution is approximately 25 km in the tropics, and increases to 9 km in the Arctic. There are 75 vertical levels, with level spacing increasing from 1 m at the surface to 200 m in the deep ocean. NEMO is coupled to the Louvain-la-Neuve sea-ice model version 2 (LIM2; see Fichefet and Maqueda, 1997) implemented with the viscous-plastic (VP) rheology. The wave effects introduced since ORAP5 (Breivik et al., 2015) were also implemented in ORAS5, with updated ocean mixing terms for wind.
Given that the wave field is not defined under sea-ice, the wave impact in the Turbulent Kinetic Energy (TKE) scheme is not used under sea-ice. Instead, a constant value of 20 is used under sea-ice as the coefficient of the surface input of TKE in ORAS5.
The reanalysis is conducted with NEMOVAR (Weaver et al., 2005; Mogensen et al., 2012) in its 3D-Var FGAT (First-Guess at Appropriate Time) configuration. NEMOVAR is used to assimilate subsurface temperature, salinity, sea-ice concentration [...] tional information is also used via an adaptive bias correction scheme (Balmaseda et al., 2013a), which will be explained in Section 2.3. [...] function to produce the assimilation increment. The increment is applied during a second forward integration of the model (the second outer loop) using the incremental analysis updates method with constant weights (IAU; Bloom et al., 1996). Both SIC and other observations are assimilated using a 5-day assimilation cycle in ORAS5 and share the outer loop model integrations.

Table 1 (fragment; only part of the table is recoverable. Entries are listed as ORAP5 / ORAS5.)
SST: [...] (Donlon et al., 2012)
T/S profiles: EN3 with XBT/MBT correction (Wijffels et al., 2008) / EN4 with XBT/MBT correction (Gouretski and Reseghetti, 2010) + NRT
SLA: AVISO DT2010 (Dibarboure et al., 2011) / AVISO DT2014 (Pujol et al., 2016) + NRT
Sea-ice: same as SST / as ORAP5

As in ORAP5, assimilation of SIC data is also included in ORAS5. The background state of ocean and sea ice is produced from a coupled NEMO-LIM2 run, but the minimization of the SIC cost function is separated from the minimization of the cost function for all other ocean state variables. The separation of the sea-ice minimization assumes that there are no covariances between SIC and other variables. Variables which are physically related are divided into balanced and unbalanced components.
The balanced components are linearly dependent (related by the multivariate relationships), while the unbalanced components are independent and uncorrelated with other variables. The ORAS5 balance relations are the same as for ORAS4 (Mogensen et al., 2012) and ORAP5. The observation and background error specifications are the same as in ORAP5, except for sea level (see Section 2.4).

2.2 Model initialization and forcing fields

Initialization
As for the previous ocean reanalysis system ORAS4, perturbing the ocean initial conditions at the beginning of the reanalysis period is considered paramount. In ORAS4, different initial states in 1958 were obtained by sampling a 20-year ocean integration.
ORAS5 had a longer spin-up using reanalyses for the period 1958-1979, conducted using either ERA40 (Uppala et al., 2005) or ERA20C (Poli et al., 2016) forcing and assimilating in-situ data. ORAS5 starts in 1979, so it is in principle possible to have initial conditions representative of that date. A series of ocean reanalyses assimilating in-situ profiles using different surface forcing, data sets and parameters was conducted for the period 1958 to 1975 (Table 2), as an attempt to account for the uncertainty of the ocean state at a given point in time. This approach gives a set of 5 initial conditions (INI1-5) to start each of the ensemble members of ORAS5, thus generating the ORAS5 initial perturbations. The control member of ORAS5 was initialized from INI1 with a configuration similar to ORAP5, and is unperturbed: neither the forcing nor the observation perturbations are applied (see Section 2.5 for details). A second spin-up from 1975 to 1979 was then conducted with the same settings as used for ORAS5, and the integrations were then continued after 1979. The impact of the initial perturbations is illustrated in Fig. 2, which shows the evolution of the global ocean heat content (OHC) from the 5 spin-up ocean reanalyses listed in Table 2, and ORAS5 with its 5 ensemble members. The initial uncertainty of ORAS5 OHC is illustrated by the OHC spread (defined here as the maximum minus the minimum OHC value across all ORAS5 ensemble members at a given time) in Fig. 2 [...] the other components of the perturbation scheme (see Section 2.5).
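The spread definition used for Fig. 2 (maximum minus minimum across members at each time) can be written down in a few lines. The sketch below is only illustrative: the function name `ensemble_spread` and the OHC values are hypothetical and are not taken from ORAS5.

```python
def ensemble_spread(members):
    """Max-minus-min spread across ensemble members at each time step.

    `members` is a list of equal-length time series, one per member.
    """
    n_times = len(members[0])
    if any(len(m) != n_times for m in members):
        raise ValueError("all members must share the same time axis")
    # zip(*members) iterates over time steps, yielding one value per member.
    return [max(values) - min(values) for values in zip(*members)]


# Hypothetical OHC values (arbitrary units) for 5 members at 3 times.
ohc = [
    [10.0, 10.4, 10.9],
    [10.2, 10.5, 10.8],
    [9.9, 10.3, 10.9],
    [10.1, 10.6, 11.0],
    [10.3, 10.4, 10.7],
]
spread = ensemble_spread(ohc)
```

With only five members, this range-based measure is sensitive to a single outlying member, which is worth keeping in mind when comparing it with a standard deviation.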

Forcing, SST and SIC
Forcing fields for ORAS5 are derived from the atmospheric reanalysis ERA-Interim (Dee et al., 2011) until 2015, and from the ECMWF operational NWP thereafter (see Fig. 3), using revised CORE bulk formulas (Large and Yeager, 2009) that include the impact of surface waves on the exchange of momentum and turbulent kinetic energy (Breivik et al., 2015). Compared to ORAP5, the wind-enhanced mixing due to surface waves is updated with a revised spatial distribution scheme. In addition, ob- [...] are used to modify the surface fluxes of heat and freshwater. Readers, however, should note that ORAS5 will be re-processed with ERA-Interim forcing and reprocessed observation data sets (e.g. EN4; Good et al., 2013) from 2015 onwards. This reprocessed ORAS5 product will be extended annually with consistent forcing and observation data sets whenever possible. This should produce consistent time series that are suitable for climate monitoring applications beyond 2014. The reprocessed ORAS5 will be available as part of the ensemble of global reanalyses distributed by the Copernicus Marine Environment Monitoring Service (CMEMS).

Figure 2. Time series of global ocean heat content (in 10$^{10}$ J m$^{-2}$) integrated over the whole water column, from 5 spin-up runs (INI1-5, 1958-1974) and ORAS5 from 1975 onwards. The shaded areas encompass the spread of all ORAS5 ensemble members. A 12-month running mean has been applied.
SST is assimilated in ORAS5 by modifying the surface non-solar total heat flux using the product of a globally uniform restoration term of $-200$ W m$^{-2}$ K$^{-1}$ and the difference between modelled and observed SST (see Haney, 1971). The effect of this restoration can be illustrated as follows: assuming a constant mismatch to observations of 1 K within a well-mixed upper 50 m layer of water, the relaxation term will restore the water temperature in this mixed layer by 1 K in about 12 days. The numerical value is unchanged from previous ECMWF ocean reanalyses (ORAS4); the original choice was motivated by the aim of keeping SST errors within 0.2 K in the global ocean. The same value is used in other ocean reanalysis systems with horizontal resolution similar to that of ORAS5. However, given that ORAS5 has finer vertical resolution, this term may need revision. Besides, it has also been found that the ocean circulation in climate models is sensitive to the strength of SST restoration (Servonnat et al., 2014). More discussion of SST nudging and its impact on the ocean state can be found in Section 5.1.

Note (table footnote, partially recovered): [...] and 7 in Zuo et al. (2015). All spin-ups are carried out in the ORCA025.L75 configuration.
A similar globally uniform SSS restoration term of -33.3 mm/day towards climatology has been applied by adding a term to the surface freshwater flux equation. This is equivalent to a restoration time scale of about 1 year for a well-mixed upper 10 m layer of water with a mean model surface salinity of 35 psu.
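The two restoration time scales quoted above can be checked with simple arithmetic. The sketch below assumes a seawater density of 1025 kg m$^{-3}$ and a specific heat of 3990 J kg$^{-1}$ K$^{-1}$ (typical values, not stated in the text), and interprets the SSS coefficient as a piston velocity acting on the mixed layer; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def sst_restoring_timescale_days(gamma, depth, rho=1025.0, cp=3990.0):
    """Time scale (days) for a heat-flux feedback `gamma` (W m^-2 K^-1)
    acting on a well-mixed layer of thickness `depth` (m).

    rho and cp are assumed typical seawater values.
    """
    heat_capacity_per_area = rho * cp * depth  # J m^-2 K^-1
    return heat_capacity_per_area / abs(gamma) / SECONDS_PER_DAY


def sss_restoring_timescale_days(piston_velocity_mm_per_day, depth):
    """Time scale (days) for a restoring term expressed as a piston
    velocity (mm/day) acting on a mixed layer of thickness `depth` (m)."""
    return depth / (abs(piston_velocity_mm_per_day) * 1.0e-3)


tau_sst = sst_restoring_timescale_days(-200.0, 50.0)  # roughly 12 days
tau_sss = sss_restoring_timescale_days(-33.3, 10.0)   # roughly 300 days
```

Under these assumptions the SST value reproduces the "about 12 days" quoted in the text, and the SSS value comes out near 300 days, that is, of the order of one year.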

Temporal consistency in the SST analysis product employed is important for both ocean and atmospheric reanalysis. Hirahara et al. (2016) found that the OSTIA SST reanalysis product has a noticeably different global mean with respect to its homonymous real-time product; they recommended using SST from Titchner and Rayner (2014) in combination with the real-time OSTIA for production of the atmospheric reanalysis ERA5. HadISST2.1 is a new pentad SST product with a spatial resolution of 0.25° resulting from the EU FP7 project ERA-CLIM2. The bias correction and data homogenization in this product are superior to those of its predecessor HadSST3 (Kennedy et al., 2011a, b), and more importantly, the resulting SSTs are consistent with those delivered operationally by OSTIA (Donlon et al., 2012). ORAS5 has adopted the same SST as ERA5. Therefore, SST in ORAS5 prior to 2008 comes from HadISST2.1, and from operational OSTIA thereafter.
The SIC data assimilated in ORAS5 come from the OSTIA reanalysis before 2008. This is the same as in ORAP5. Sea-ice data in HadISST2.1 include both re-processed sea-ice concentration data from the EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI-SAF) and polar ice chart data from the National Ice Center (NIC). SIC in HadISST2.1 is calibrated against NIC sea-ice charts in order to ensure consistency with chart analyses prior to the satellite era. However, sea-ice concentration in sea-ice charts itself has large uncertainties (Karvonen et al., 2015). Moreover, some sea-ice charts are biased towards high SIC. As a result, sea-ice concentration in the HadISST2.1 data is substantially higher than in the OSI-SAF data (Titchner and Rayner, 2014) and the OSTIA analysis.

Note (footnote, partially recovered): ERA40 and Reynolds OIv2d data were used before 1985, when the OSTIA product is not available.

In order to assess the impact of assimilating different SST and SIC products in our system, sensitivity experiments have been carried out at ORCA1.L42 resolution (approximately 1° in the tropics with 42 vertical levels) with an ORAP5-equivalent Low-Resolution configuration (hereafter referred to as OP5-LR). SST and SIC data used in these experiments are listed in Table 3, together with the experiment names. Global mean SST from these experiments is shown in Fig. 4, together with the SST analysis products that were assimilated. For verification, the latest European Space Agency Sea Surface Temperature Climate Change Initiative (ESA SST CCI) multi-year SST record (version 1.1; Merchant et al., 2014) is also included here as a reference. This data set is generated from satellite observations only and is independent of in-situ observations.
Despite the discrepancy in the early period, HadISST2 and OSTIA SST analyses are very similar after 2008, suggesting that HadISST2 is more consistent with the operational OSTIA SST product than the OSTIA reanalysis SST itself, as already pointed out by Hirahara et al. (2016). OSTIA reanalysis SST is systematically colder than both HadISST2 and ESA CCI SST [...] observations agree better with HadISST2 SST than with OSTIA SST, and thus pull the analysed SST towards the warmer side. This lack of consistency between near-surface in-situ observations and the OSTIA reanalysis, and between operational OSTIA SST and the OSTIA reanalysis, determined the final choice of SST product for ORAS5.
The above experiments were also used to inform the choice of the SIC data set. Departure of sea-ice thickness (SIT) from the three sensitivity experiments (Table 3) [...] HadI as verified against ICESat observations. This is mainly due to assimilation of HadISST2 SIC, which is in general higher than that of the Reynolds/OSTIA data. In fact, assimilation of HadISST2 SIC during 1979-1984 implies strong positive sea-ice volume increments with respect to ERA40/Reynolds data, which are equivalent to adding approximately 3 meters of SIT per year in most of the Arctic basin during this period (not shown). This effect has also been discussed by Tietsche et al. (2013) in their sea-ice assimilation experiments. As a result, ASM-HadI exhibits unrealistic sea-ice conditions in both the Arctic and the Antarctic (not shown). Therefore, we chose to use the OSTIA reanalysis SIC in ORAS5 until 2008, together with SST observations from HadISST2.

The in-situ temperature and salinity (T/S) profiles in ORAS5 come from the recently released quality-controlled data set EN4 (Good et al., 2013) with Expendable Bathythermograph (XBT) and Mechanical Bathythermograph (MBT) depth corrections from Gouretski and Reseghetti (2010) until May 2015. EN4 is a re-processed observational data set with globally quality-controlled ocean T/S profiles. It includes all conventional oceanic observations (Argo, XBT/MBT, Conductivity-Temperature-Depth (CTD), moored buoys, ship and mammal-based measurements). Data from the Arctic Synoptic Basin Wide Oceanography (ASBO) project were also included in EN4, which improves data coverage in the Arctic. Compared to its predecessor EN3 (used in ORAS4 and ORAP5), EN4 has increased vertical resolution, improved QC and duplication checks, and extends farther back in time. For the latest years, EN4 also contains a more complete and cleaner record of the Argo data, with bias-corrected data whenever possible.
After May 2015, ORAS5 starts using the operational data from the Global Telecommunications System (GTS), which consists of data received in near-real-time at ECMWF. The same quality control procedures as described in Section 2.3.3 are applied to all GTS data, to ensure that only good quality observations similar to EN4 data are assimilated in ORAS5.
The new EN4 data set has been evaluated against the EN3 data set using twin experiments carried out in the OP5-LR configuration at ORCA1.L42 resolution. The twin experiments comprise a reference run EXP3 that assimilates EN3 data, and another run EXP4 that assimilates EN4 data but is otherwise identical. For verification purposes, a group of CTD mooring arrays in the Barents Sea was withheld from data assimilation in both EXP3 and EXP4. Mean bias and root-mean-square departure of the model background with respect to these CTD moorings are shown in Fig. 6 for both experiments. EXP4 has reduced temperature and salinity RMS errors in the Barents Sea. This better estimation of the mean ocean state in EXP4 can be attributed to the improved observation coverage of EN4. After 2005, the number of Arctic Ocean observations almost doubled in EN4 with respect to EN3. As a result, EXP4 also shows freshening (up to 0.2 psu) near the Greenland coast, at the edge of the East Siberian Sea and across Baffin Bay, which is directly related to discrepancies between the EN3 and EN4 data sets (not shown).
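The two verification statistics used for the withheld moorings (mean bias and root-mean-square departure of the background) can be sketched as below. The sign convention (background minus observation), the function name and the sample values are assumptions made for illustration only.

```python
import math


def mean_bias_and_rmsd(background, observations):
    """Mean bias and root-mean-square departure of background values
    against co-located observations (assumed convention: model minus obs)."""
    if len(background) != len(observations) or not background:
        raise ValueError("inputs must be non-empty and of equal length")
    departures = [b - o for b, o in zip(background, observations)]
    n = len(departures)
    bias = sum(departures) / n
    rmsd = math.sqrt(sum(d * d for d in departures) / n)
    return bias, rmsd


# Hypothetical temperature values (degrees C) at three mooring points.
bias, rmsd = mean_bias_and_rmsd([1.0, 2.0, 3.0], [0.0, 2.0, 5.0])
```

Because the moorings are withheld from both runs, these statistics act as an independent check rather than a measure of how closely the analysis fits assimilated data.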

Observing system experiments
Observing System Experiments (OSEs) are widely used as a method to evaluate the impact of existing observations, and are routinely carried out at ECMWF for assessment of previous operational ocean reanalysis systems and seasonal forecasts (Balmaseda and Anderson, 2009). To understand the impact of individual observation types in EN4, a series of OSEs has been carried out in which individual observation types are switched off. First, a reference experiment (ORA-ALL) has been carried out by assimilating all in-situ observations from the quality-controlled EN4 data set. Four OSE-ORA experiments were then carried out based on ORA-ALL, by withdrawing individual in-situ observation types from the global data assimilation system: 1) NoArgo, removing Argo floats; 2) NoMooring, removing moored buoy data; 3) NoShip, removing XBT, MBT and CTD data; 4) NoInsitu, removing all in-situ observations. All OSE-ORAs have been driven by the same forcing from ERA-Interim. [...] Research Moored Array for African-Asian-Australian Monsoon Analysis and Prediction (RAMA). The degradation resulting from the removal of PIRATA is slightly larger than that coming from TAO/TRITON and RAMA. This can be attributed to a more realistic ocean state in the tropical Pacific and Indian oceans constrained by surface observations (SST) and forcings (winds and surface fluxes) in our system, but is also likely to be associated with the drastic reduction in the observation number [...]

Removal of all ocean in-situ observations (Fig. 7d) gives an estimate of the total impact of GOOS, which is not a simple linear combination of the impacts of individual observation types. Note that in the Southern Ocean the RMSD is sometimes larger in NoArgo than in NoInsitu, which indicates some inadequacy of the data assimilation process. Overall, the weak impact of the removal of observations in the Indian Ocean is possibly related to the comparatively sparse observing system in that region.
Generally, the tropical Atlantic seems to be more sensitive to the removal of in-situ observations than the other tropical ocean basins.

All input observations are subject to global quality control procedures similar to those employed in EN4. Among these are checks on duplication, background, stability and bathymetry, and the use of the Argo grey list (from ftp://ftp.ifremer.fr/ifremer/argo/etc/ar_grey-list/). In addition, a new temperature-salinity pair check has been introduced in ORAS5, in which a salinity observation is rejected whenever the corresponding temperature observation at the same location is not available. This pair check has been designed to avoid assimilating salinity observations alone, considering that temperature is the primary variable in the multivariate balance operator (Weaver et al., 2005).
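The temperature-salinity pair check described above amounts to a simple mask on each profile. The following is a minimal sketch, assuming a profile is stored as two equal-length lists with `None` marking a missing or rejected value; this data layout and the function name `pair_check` are hypothetical and do not reflect the NEMOVAR observation format.

```python
def pair_check(temperature, salinity):
    """Reject salinity values whose co-located temperature is missing.

    Both inputs are lists over the same observation levels, with None
    marking a missing value. Returns the screened salinity list.
    """
    if len(temperature) != len(salinity):
        raise ValueError("temperature and salinity must be co-located")
    return [
        s if (t is not None and s is not None) else None
        for t, s in zip(temperature, salinity)
    ]


temp = [12.1, None, 8.4, 6.0]
salt = [35.2, 35.1, None, 34.9]
checked = pair_check(temp, salt)
```

Temperature values are left untouched, so a temperature-only level is still available for assimilation while a salinity-only level is not.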

Bias correction scheme
Model bias correction is essential for the ocean data assimilation system, especially for dealing with irregular and inhomogeneous ocean observations. A multi-scale bias correction scheme similar to that described in Balmaseda et al. (2013a) has been implemented in ORAS5 to correct temperature/salinity biases in the extra-tropical regions. A pressure correction for the tropical regions has been implemented as well in this bias correction scheme. This is an important method for mitigating spurious climate signals that could be introduced by the assimilation of an evolving observation network. Compared to ORAP5, the ORAS5 bias correction scheme includes two major upgrades. First, the a-priori bias term (offline bias) in ORAS5 has been estimated using an ensemble of five realizations of assimilation runs (only temperature and salinity) during the Argo era (2003-2012) with different forcing and model parameters (see Table 4). The sampling period starts a few years after the introduction of Argo floats, when a relatively homogeneous global ocean observing network becomes available. The equivalent term in ORAP5 was estimated from a single realization of reanalysis over a shorter period (2000-2009). The ensemble approach allows uncertainties of model errors to be estimated, and could provide, in some regions, a more robust estimation of the systematic model error. In ORAS5 only the ensemble mean of the a-priori biases estimated from these five realizations (BIAS1-5) was used, in order to account for seasonal variations of the model/forcing errors.
To help readers understand the relative contributions of offline bias correction in different systems, Fig. 9 shows the mean vertical profiles of the a-priori bias correction applied to temperature and salinity in ORAS5 and two previous ECMWF ocean reanalyses (ORAS4 and ORAP5). It is worth noting that the value shown in Fig. 9 has been added in the reanalysis system to correct model background errors, and is therefore opposite in sign to the model biases. In general, the two high-resolution reanalyses (ORAP5 and ORAS5) have temperature biases that are opposite in sign to, and weaker than, those of ORAS4. Considering that all three reanalyses use the same ERA-Interim forcing, the different sign of the bias terms is likely a result of model physics/resolution rather than forcing.
However, both the SST observational data set and the surface flux formulation have changed substantially between ORAS4 and ORAS5, and therefore the effect of surface fluxes and SST cannot be neglected. Compared to ORAP5, ORAS5 has a slightly increased cold bias around 100 m, but a reduced cold bias below 200 m. All three reanalyses show fresh biases in salinity for the upper 100 m, with the ORAS5 bias lying between those of ORAP5 and ORAS4. Maps of the same offline bias correction terms are shown in Fig. 10 for ORAP5 and ORAS5. ORAP5 and ORAS5 show very similar spatial patterns in temperature and salinity biases, suggesting common model or forcing errors. However, the temperature bias in ORAS5 is clearly weaker than in ORAP5 between 300 and 700 m, especially in the Tropics. In contrast, the upper 100 m salinity bias in ORAS5 is larger than in ORAP5 almost everywhere. This bias term is the systematic model/forcing error estimated using in-situ observations, and therefore the result is subject to the temporal and spatial coverage of the global ocean observing system. The differences between ORAS5 and ORAP5 seen in Fig. 9 and Fig. 10 result from (a) improved temporal and spatial coverage in the new EN4 data set with increased vertical resolution; (b) a different climatological period used for the ORAS5 bias estimation; and (c) the ensemble bias estimation method used in ORAS5.

Furthermore, a stability check was introduced in the ORAS5 bias correction that caps the minimum value of the salinity bias correction term to prevent static instability. We define a minimum value for the squared buoyancy frequency as $N^2_{\min}$. In every model grid cell where $N^2$, as defined by the model background potential density profile ($\rho_\sigma$), is close to static instability ($N^2 \le N^2_{\min}$, with $N^2_{\min} = 10^{-10}$), we modify the salinity bias to ensure that the change $\delta N^2$ due to the total bias (both temperature and salinity) is 0.
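One way to picture the capping rule is with a linear equation of state, for which $N^2$ is a linear function of the vertical temperature and salinity differences. The sketch below is an illustration under that assumption and is not the ORAS5 implementation: the constant expansion coefficients `ALPHA` and `BETA`, the top-down adjustment of the lower-level salinity bias and all function names are hypothetical choices.

```python
G = 9.81        # gravity (m s^-2)
ALPHA = 2.0e-4  # assumed constant thermal expansion coefficient (K^-1)
BETA = 7.6e-4   # assumed constant haline contraction coefficient (psu^-1)


def n2(temp, salt, dz):
    """Squared buoyancy frequency at the interfaces between consecutive
    levels (index 0 is the top level), for a linear equation of state.
    dz[k] is the distance (m) between the centres of levels k and k+1."""
    return [
        G * (ALPHA * (temp[k] - temp[k + 1]) - BETA * (salt[k] - salt[k + 1])) / dz[k]
        for k in range(len(temp) - 1)
    ]


def cap_salinity_bias(temp_bg, salt_bg, bias_t, bias_s, dz, n2_min=1.0e-10):
    """Where the background N^2 is at or below n2_min, reset the salinity
    bias of the lower level so that the total bias leaves N^2 unchanged."""
    capped = list(bias_s)
    for k, value in enumerate(n2(temp_bg, salt_bg, dz)):
        if value <= n2_min:
            # delta N^2 = 0  <=>  BETA * dS_bias = ALPHA * dT_bias
            capped[k + 1] = capped[k] - (ALPHA / BETA) * (bias_t[k] - bias_t[k + 1])
    return capped


# Hypothetical three-level column: neutral upper interface, stable lower one.
temp_bg, salt_bg, dz = [10.0, 10.0, 5.0], [35.0, 35.0, 35.0], [10.0, 10.0]
bias_t, bias_s = [0.1, 0.0, 0.0], [0.0, -0.2, 0.0]
capped = cap_salinity_bias(temp_bg, salt_bg, bias_t, bias_s, dz)
delta_n2 = n2(bias_t, capped, dz)  # N^2 change caused by the total bias
```

In this example only the neutrally stratified upper interface is modified; the salinity bias at the stably stratified lower level is left as estimated.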
With this capping, the salinity bias correction is prevented from introducing instability in the water column, which could otherwise induce spurious vertical convection, thought to be the cause of large reanalysis biases in regions around the Mediterranean outflow waters in the North Atlantic Ocean (Zuo et al., 2017b). Results of model fit-to-observation errors from a set of twin assimilation experiments testing the impact of the bias capping can be found in Fig. 11. The twin experiments were set up in the OP5-LR configuration, but assimilating the EN4 data set instead of EN3. The reference run (NoCap) does not activate salinity bias capping, while the other run (CP10) adds salinity bias capping and otherwise has exactly the same configuration. Both temperature and salinity RMSE profiles of NoCap show a local maximum at 1000 m, which is associated with the spurious convection between 1000 and 2000 m due to the warm and salty Mediterranean outflow. The new salinity bias capping in CP10 successfully reduces bias and RMSE for both temperature and salinity in this depth range. As a result, CP10 also exhibits improved sea level correlation with altimeter data compared to NoCap (not shown). Further assessment of this bias correction method with respect to in-situ observations can be found in Section 4.2.

The sea-level anomaly (SLA) observations produced by AVISO (Archiving Validation and Interpretation of Satellite Oceanographic data) DUACS (Data Unification and Altimeter Combination System) have been updated to the latest version DT2014 (Pujol et al., 2016) in ORAS5 for both filtered along-track and gridded SLA data.
Compared to the previous version DT2010 (Dibarboure et al., 2011) that was used in the ORAS4 and ORAP5 reanalyses, the DT2014 data set has received a series of major upgrades, including a new 20-year altimeter reference period (1993-2012) and increased spatial resolution (14 km in low latitudes), among others. Another important change in ORAS5 w.r.t. ORAS4/ORAP5 is that SLA thinning is now done by stratified random sampling (Zuo et al., 2017a) instead of by creating SLA superobservations, as a method to account for observation representativeness errors in along-track SLA data. As a result, ORAS5 ingests SLA observations with increased local variability but reduced observation error standard deviations (OBE STD). Compared to ORAS4, the SLA OBE STD in ORAS5 is reduced by approximately 20% in the Tropics due to the increased spatial resolution of the DT2014 data set. ORAS5 also assimilates more along-track SLA data from the additional satellite missions included in DT2014 (i.e. GeoSat Follow-On, HaiYang-2A, Topex New, Jason-1 Geodetic, Jason-1 New, Saral/AltiKa). Other parts of the scheme, e.g. the reduced-grid construction (typically 1° by 1° in latitude/longitude) and the method for diagnosing OBE STD (Mogensen et al., 2012), remain unchanged. SLA observations are not assimilated in ORAS5 outside the latitudinal band from 50°S to 50°N, nor in regions shallower than 500 m. Assessment of this change in SLA assimilation can be found in Section 5.2.

A reference mean dynamic topography (MDT) is required in order to assimilate SLA along-track data in an ocean general circulation model. This is necessary because the altimeter measurement and the state variable in the ocean model are defined with respect to different reference surfaces. There are several approaches to tackle this problem.
One approach consists of using an external MDT (Rio et al., 2014), which is further corrected by using cumulative SLA innovation terms (Lea et al., 2008). This is the approach followed in the Met Office's global Forecasting Ocean Assimilation Model (FOAM; Waters et al., 2015) and in the CMEMS global ocean monitoring and forecasting system (Lellouche et al., 2018). A different approach is used at ECMWF, which consists of estimating the MDT from a multi-year pre-reanalysis run assimilating T/S observations; this is the so-called model MDT approach, described in Balmaseda et al. (2013a). The MDT in ORAS5 follows this model MDT approach, except that the pre-reanalysis run, which assimilates only in-situ observations and includes bias correction, was produced using two parallel streams instead of one sequential integration, in order to accelerate the process of computing the MDT. The MDT was then constructed by averaging the resulting sea-surface height over a reference period 1996-2012, with an additional correction term to account for the different averaging period w.r.t. the DT2014 data set, as done in ORAP5. In this way, the assimilation of SLA constrains the temporal variability of the reanalysis without affecting the reanalysis mean state. However, it also means that the assimilation of SLA will not further correct the model mean state. The difference between the MDT used by ORAS5 and that used by the FOAM system is shown in Fig. 12. Large differences can be found in regions with strong meso-scale eddy activity (e.g. along the Western Boundary Currents and the ACC), and along the Antarctic coasts. A dipole of positive-negative MDT departures along the Gulf Stream and its extensions is of particular interest. This is consistent with the estimated a-priori temperature and salinity biases in ORAS5 (Fig. 10), suggesting some model/forcing errors in these regions.
Prior to 1993, the mass variation that contributes to the change of Global Mean Sea Level (GMSL) in ORAS5 was constrained using the GRACE-derived climatology. The total GMSL was then constrained by assimilating altimeter-derived GMSL after 1993. This is the same as in ORAP5. The GMSL was derived from altimeter observations, first using reprocessed DT2014 gridded SLA data up to 2014, then using AVISO NRT gridded SLA from 2015 onwards. A systematic offset of GMSL between these two data sets is expected, due to slightly different data processing methods (e.g. multi-mission and mapping method). This offset is corrected for, in order to avoid introducing spurious GMSL discontinuities in the system.
Assuming that the sources of error do not change over time, this GMSL offset between the delayed and NRT gridded SLA products can be derived from the GMSL difference averaged over their overlapping period. At the time of ORAS5 production, this period covered May 2014 to November 2014. The resulting value was then added as a bias correction to the GMSL derived from NRT data from 2015 onwards.
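The offset correction described above can be sketched as follows. This is a minimal illustration, not the operational code: the function names are hypothetical, and the sign convention (delayed minus NRT, so that the offset is added to the NRT series) is an assumption chosen to match the statement that the value is added to the NRT-derived GMSL.

```python
import numpy as np

def gmsl_offset(gmsl_delayed, gmsl_nrt):
    """Mean GMSL difference (delayed minus NRT) over the overlap period.

    Both inputs are GMSL time series sampled on the same overlap dates.
    """
    delayed = np.asarray(gmsl_delayed, dtype=float)
    nrt = np.asarray(gmsl_nrt, dtype=float)
    if delayed.shape != nrt.shape:
        raise ValueError("both series must cover the same overlap dates")
    return float(np.mean(delayed - nrt))

def correct_nrt_gmsl(gmsl_nrt, offset):
    """Add the constant offset to NRT-derived GMSL (assumed sign convention)."""
    return np.asarray(gmsl_nrt, dtype=float) + offset
```

A constant offset is only appropriate under the stated assumption that the error sources do not change over time.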

Ensemble generation
A new generic ensemble generation scheme, which perturbs both observations and surface forcings, has been implemented in ORAS5. Here, we give a brief summary of the scheme, together with a preliminary assessment of the ORAS5 temperature and salinity ensemble spread. The reader is referred to Zuo et al. (2017a) for details of this ensemble generation scheme.
ORAS5 employs a stratified random sampling method for pre-processing both surface and sub-surface observations. As a result, the different members of the ensemble see different observations. This also makes better use of the available observations, since more observations are used across the ensemble. The in-situ observation profiles are perturbed in ORAS5 in two ways: horizontally, by perturbing the longitude/latitude locations, and vertically, by applying stratified random thinning. The latitude/longitude locations of ocean in-situ profiles are perturbed so that the resulting locations are uniformly distributed within a circle of radius 50 km around the original location. This radius is chosen primarily considering the observation representativeness error with respect to the model horizontal resolution. The vertical thinning assumes a uniform distribution of possible observation locations within any given vertical range, and a maximum of 2 observations within each model level, if available, are then randomly selected for data assimilation. A similar stratified random thinning method is also applied to perturb the ORAS5 surface observations (SIC and SLA). In all cases, pre-defined reduced grids are constructed to carry out the thinning, and observations within a given stencil of the reduced grid are randomly selected. As a result, each ensemble member assimilates slightly different observations. For SIC observations, this reduced grid is constructed with a length scale of approximately 30 km in the Arctic region. For SLA observations, it is constructed with a length scale of approximately 100 km in the Tropics. These values were chosen to ensure a reasonable sample size within the reduced grid. Altimeter observations from different satellite missions are treated separately. This method ensures that the number of observations assimilated in each of the perturbed ORAS5 members is comparable to that in the unperturbed member.
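The two in-situ perturbations described above can be illustrated with a short sketch. It is not the ORAS5 pre-processing code: the function names are hypothetical, the horizontal displacement uses a local flat-Earth approximation (an assumption that breaks down near the poles), and model levels are represented simply by their depth edges.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def perturb_location(lon, lat, rng, radius_km=50.0):
    """Draw a location uniformly within a circle around (lon, lat).

    Taking the square root of the uniform draw makes the points uniform
    in area rather than clustered near the centre.
    """
    r = radius_km * np.sqrt(rng.uniform())
    theta = rng.uniform(0.0, 2.0 * np.pi)
    dlat = np.degrees(r * np.sin(theta) / EARTH_RADIUS_KM)
    dlon = np.degrees(
        r * np.cos(theta) / (EARTH_RADIUS_KM * np.cos(np.radians(lat)))
    )
    return lon + dlon, lat + dlat

def thin_profile(obs_depths, level_edges, rng, max_per_level=2):
    """Return sorted indices of observations kept after vertical thinning.

    At most `max_per_level` observations are selected at random within
    each model level, where level k spans level_edges[k-1]..level_edges[k].
    """
    obs_depths = np.asarray(obs_depths, dtype=float)
    level = np.digitize(obs_depths, level_edges)
    keep = []
    for k in np.unique(level):
        idx = np.flatnonzero(level == k)
        n = min(max_per_level, idx.size)
        keep.extend(rng.choice(idx, size=n, replace=False))
    return np.sort(np.array(keep, dtype=int))
```

Passing a differently seeded `numpy.random.default_rng` generator for each ensemble member would give each member a different, but equally sized, subset of observations.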
A new method has also been developed to perturb the surface forcing fields used to drive ORAS5. This method preserves the multivariate relationship between different surface flux components, and has been used to perturb SST, SIC, wind stress, net precipitation (precipitation minus evaporation) and solar radiation. The ORAS5 forcing perturbation takes into account both structural errors, which are derived from differences between separate analysis data sets (e.g. wind stress differences between NCEP and ERA-40), and analysis errors, which are derived from differences between ensemble members within the same ensemble analysis (e.g. the 10 ensemble members of ERA20C (Poli et al., 2016)). The forcing in the ORAS5 control member remains unperturbed.
Assessment of the ORAS5 temperature and salinity ensemble spreads has been carried out with respect to the specified model BackGround Error (BGE) standard deviation ($\sigma_b^s$) and the BGE standard deviation diagnosed with the Desroziers method ($\sigma_b^d$), following the same procedure described in Zuo et al. (2017a). Readers are reminded that the salinity $\sigma_b^s$ shown here is for the unbalanced component only. Fig. 13 shows spatial maps of these diagnosed values at 100 m depth, after binning and averaging in 5° × 5° lon/lat boxes. Here, the ORAS5 temperature ensemble spread (Fig. 13a) shows a spatial pattern that is very similar to the value diagnosed using the Desroziers method (Fig. 13e), except that its amplitude is weaker, especially in the Tropics. The salinity ensemble spread in ORAS5 (Fig. 13b) is in general under-dispersive when verified against the diagnosed $\sigma_b^d$ (Fig. 13f). The spatial patterns of the salinity ensemble spread and the diagnosed $\sigma_b^d$ are reasonably consistent. By contrast, the specified $\sigma_b^s$ in ORAS5 is clearly overestimated almost everywhere for both temperature and salinity (Fig. 13c,d), suggesting that the current [...]

[...] (Buizza et al., 2018). Work is ongoing at ECMWF on coupling the lower boundary conditions of the atmospheric analysis system to the OCEAN5-RT analysis through SST and SIC (Browne et al., 2018). The OCEAN5 system is now a major component of ECMWF's Earth system approach, with an ever stronger coupling between the atmosphere, land, waves, ocean and sea-ice components. Fig. 15 shows schematically how the OCEAN5 suite, with its BRT and RT components, is implemented at ECMWF. The OCEAN5-BRT uses a 5-day assimilation window and is updated every 5 days with a delay D of 7 to 11 days. A minimum delay period of 7 days has been chosen in order to avoid a large degradation of the sea level analysis caused by delays in receiving NRT altimeter observations from CMEMS.
The OCEAN5-RT analysis is updated daily using a variable assimilation window of 8 to 12 days (equal to D + 1): starting from the last BRT analysis, it brings the RT analysis forward to current conditions, to produce ocean states suitable for initialising the coupled forecast. This RT extension contains 2 assimilation cycles (chunks), with a variable second assimilation window. The RT extension is always initialized from the last day of the BRT analysis and switches synchronously to the new initialization whenever the BRT analysis updates, hence the variable assimilation window.
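The relation between the BRT delay and the RT window length can be written as a small helper. This is only a sketch of the arithmetic stated above: the function name is hypothetical, and it assumes that the delay D grows by one day for each day elapsed since the last BRT update, which is an interpretation of the 5-day update cycle rather than a statement from the text.

```python
BRT_CYCLE_DAYS = 5   # BRT analysis is updated every 5 days
MIN_DELAY_DAYS = 7   # minimum BRT delay D

def rt_window_length(days_since_brt_update):
    """RT assimilation window length in days, equal to D + 1.

    `days_since_brt_update` counts days since the BRT analysis last
    advanced (0 on the update day, up to 4 before the next update).
    """
    if not 0 <= days_since_brt_update < BRT_CYCLE_DAYS:
        raise ValueError("expected a value between 0 and 4")
    delay = MIN_DELAY_DAYS + days_since_brt_update  # D, from 7 to 11 days
    return delay + 1
```

Over one BRT cycle this reproduces the stated range of 8 to 12 days.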
Denoting the current model day by YMD (Year-Month-Day format), in the example of Fig. 15 the RT assimilation window length for YMD is 10 days, and the RT analysis is initialized from the YMD-10 BRT analysis. In practice, the OCEAN5 RT analysis is launched every day at 14Z (same as ORTS4) to produce a daily analysis valid at 0Z of the following day (YMD+1). Unlike the historical ocean reanalysis, which is driven by atmospheric reanalysis forcing (e.g. ERA-Interim) and assimilates re-processed observation data sets whenever possible, the OCEAN5-RT component relies on ECMWF NWP forcing and NRT observation data input. The surface forcing fields that drive the OCEAN5-RT component come from the ECMWF operational atmospheric analysis, except for the last day (YMD), when the forcing is provided by the ECMWF operational long forecast. Observations assimilated in the OCEAN5-RT analysis come from the GTS (ocean in-situ observations), the CMEMS operational service (NRT sea-level anomalies) and the OSTIA operational analysis (daily-mean SST and SIC data). However, these may be different from those used in the BRT. In the case of in-situ observations, not all observations will be available at the start time or during the run time of the RT stream. SST and SIC data for the last day (YMD) are persisted from the previous day (YMD-1), since they are not available by the time the RT analysis is produced.

Assessment of ORAS5 system components

Sensitivity experiments
Additional experiments have been conducted within the ORAS5 framework to help assess the different system components. These include sensitivities to SST nudging, bias correction, and the assimilation of in-situ and satellite altimeter data. Studies of other system parameters, e.g. sensitivity to the OBE STD specification, have been carried out but are not discussed here for the sake of conciseness. A summary of the system configurations of these sensitivity experiments can be found in Table 5.

Verification in observation space
Assessment of ORAS5 performance in observation space is carried out using model background errors with respect to all assimilated observations. We compute the model RMSE based on the discrepancy between model background and observations for ORAS5 and all sensitivity experiments in Table 5. This approach allows us to assess the contributions from different system components and the performance of ORAS5 as an integrated reanalysis system. The reader should note that error statistics in [...] error characteristics within some small vertical depth range (e.g. within 100 m), then this comparison between control runs and assimilation runs is still valid. Time series of global mean RMSE in temperature and salinity from the different sensitivity experiments are shown in Fig. 16, together with the total number of assimilated observations of various types (right y-axis). Mean vertical profiles of these model RMSEs can be found in Fig. 17, after temporal averaging over a period (2005-2014) with near-homogeneous global Argo distribution.
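As an illustration of the metric used here, the sketch below computes the RMSE of observation-minus-background departures. The function name is hypothetical, and the model background is assumed to have already been interpolated to the observation locations.

```python
import numpy as np

def background_rmse(background_at_obs, observations):
    """RMSE of observation-minus-background departures.

    Non-finite departures (e.g. missing values stored as NaN) are ignored.
    """
    departures = (np.asarray(observations, dtype=float)
                  - np.asarray(background_at_obs, dtype=float))
    departures = departures[np.isfinite(departures)]
    if departures.size == 0:
        raise ValueError("no valid departures")
    return float(np.sqrt(np.mean(departures ** 2)))
```

Applying the same function to departures grouped by time or by depth range would give time series and vertical profiles of the kind shown in Fig. 16 and Fig. 17.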

Overall, all components of the ocean reanalysis system (SST nudging, bias correction, assimilation of in-situ observation and altimeter data) contribute to reducing the model error, both in temperature (Fig. 16(top)) and salinity (Fig. 16(bottom)).
However, by construction, some components have a more profound impact on the improvement of the ocean state, e.g. the assimilation of in-situ observations. The magnitude of the RMSE reduction due to direct T/S assimilation can be derived from the departure between O5-NoBias (red lines) and CTL-HadIS (green lines). The error reduction due to assimilation of in-situ data varies over time and is loosely proportional to the total number of observations assimilated. Over the Argo period 2005-2014, assimilation of in-situ data accounts for 65% of the total RMSE reduction in temperature, and for nearly 90% of the total RMSE reduction in salinity. These values are normalized against the total RMSE reductions derived from the departures between ORAS5 (black lines) and CTL-NoSST (blue lines). Note that CTL-NoSST also shows a declining trend in its fit-to-observation errors, especially following the introduction of the Argo floats (Fig. 16). It is important to point out that this trend in CTL-NoSST does [...] After 2015, a noticeable drop in the available Argo observations is due to switching from the re-processed EN4 to the NRT GTS data stream, leading to a small rise in the ORAS5 temperature and salinity RMSEs in Fig. 16. A disruption in the TAO/TRITON mooring array between 2012 and 2014 is also visible in Fig. 16(top), which caused a slightly increased ORAS5 RMSE in the Tropics during this period (not shown).
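The normalization described above can be written explicitly. The symbol $C_{\text{in-situ}}$ is notation introduced here for illustration only; the experiment names are those of Table 5.

```latex
C_{\text{in-situ}} =
  \frac{\mathrm{RMSE}_{\text{CTL-HadIS}} - \mathrm{RMSE}_{\text{O5-NoBias}}}
       {\mathrm{RMSE}_{\text{CTL-NoSST}} - \mathrm{RMSE}_{\text{ORAS5}}}
  \times 100\%
```

The numerator is the RMSE reduction attributed to direct T/S assimilation and the denominator is the total RMSE reduction of the full system; evaluated over 2005-2014, this ratio gives the 65% (temperature) and nearly 90% (salinity) quoted in the text.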
Differences between CTL-NoSST and CTL-HadIS in Fig. 16 and Fig. 17 give an estimate of the contribution of SST nudging. This component contributes about 18% to the global temperature error reduction (Fig. 16(top)). However, it leads to an increase in salinity errors between 1985 and 2005 (Fig. 16(bottom)). This deterioration can be as large as 10% in the mid 1990s. SST nudging is the dominant term in temperature error reduction for the upper 200 m in the northern extra-tropics (Fig. 17b), but it also leads to slightly increased temperature errors in the Tropics below 300 m (Fig. 17a). This degradation may be linked to an inappropriate partition of surface non-solar heat fluxes above and below the tropical thermocline, which is normally shallower than 200 m. During the Argo period, SST nudging also reduces the salinity RMSE for the upper 1000 m in the southern extra-tropics (Fig. 17f). For the upper 200 m of the southern extra-tropics, SST nudging accounts for nearly 40% [...]

The contribution of the multi-scale bias correction implemented in ORAS5 can be derived from the differences between O5-NoBias and ORAS5. This component plays an important role in correcting model errors, especially in the extra-tropical regions where the online bias term is applied as a direct correction to the T/S fields (Fig. 17b,c,e,f). In the global ocean, this bias correction contributes about 14% of the total RMSE reduction for temperature and about 10% for salinity, averaged over the upper 1000 m. The bias correction contribution is also relatively stable over time, and less susceptible to the evolving GOOS (Fig. 16).

Other system components, like the assimilation of altimeter data, lead to marginal improvements in global temperature (ca. 3%), and have a mostly neutral impact on the model salinity errors (not shown). One possible reason for this relatively small impact is that, by construction, the assimilation of SLA does not correct mean model biases but only affects the temporal variability of the reanalysis. In addition, the altimeter data in the ECMWF reanalyses are perhaps given a weak weight compared with meso-scale applications of ocean data assimilation, so as to avoid spurious circulations and degradation of the deep ocean (Zuo et al., 2017b). This result is very similar to that of ORAP5, which indicates that the new SLA thinning scheme in ORAS5 is as effective as the superobbing scheme in accounting for observation representativeness error.
Overall, we conclude that all components of the ORAS5 ocean data assimilation contribute to an improved ocean analysis state when verified against in-situ observations.

Assessment of ORAS5 ocean essential climate variables
Ocean essential climate variables (ECVs) are ocean variables commonly used for monitoring the ocean state and climate signals on decadal or longer time scales. SST, SLA and SIC are three of the key ocean ECVs defined by the Global Climate Observing System (GCOS), and they have been selected here for an assessment of ORAS5 for climate applications. The ESA CCI project has developed suitable climate data records of these ECVs, which are generally derived from a combination of satellite and in-situ observations. Here, the latest versions of these ESA CCI climate data records for SST, SLA and SIC were chosen as reference climate data sets to verify ORAS5 and some relevant sensitivity experiments. These observation-only analyses are produced with different production systems (e.g. different satellite missions) and/or processing chains (e.g. bias correction method) compared to the observational data sets that were assimilated in ORAS5. All statistics are computed using monthly-mean fields from ORAS5 and the ESA CCI observation data sets, interpolated to a common 1° × 1° latitude-longitude grid.

Sea surface temperature
The ESA SST CCI (SST_cci) long-term analysis provides daily surface temperature of the global ocean over the period 1992 to 2010. Unlike the HadISST2 and OSTIA SST analyses, both of which are bias-corrected against in-situ observations (e.g. drifting buoys), ESA SST_cci only uses satellite observations (AVHRR and ATSR). Therefore, it provides a reference SST data set of a quality that is suitable for climate research. The latest version 1.1 of the ESA SST_cci data set (Merchant et al., 2016), referred to as SST_cci1.1 hereafter, has been used here to verify the performance of ORAS5 at the sea surface.
The SST_cci1.1 data set is an update of version 1.0 described by Merchant et al. (2014). For intercomparison, results from ORAS4 and two other sensitivity experiments are also included here. Compared to ORAS4 (Fig. 18a), ORAS5 SST has a reduced warm bias in the extra-tropics, especially in the northern North Pacific, the Norwegian Sea, the Southern Ocean and the Brazil/Malvinas current regions. A dipole of positive-negative bias patterns in the Gulf Stream and its extension is still visible in ORAS5, though with reduced magnitude compared to ORAS4. This suggests that the pathway of the Gulf Stream extension may be misrepresented in ORAS5. Spatial patterns of SST bias and RMSE in ORAS5 (Fig. 18c,d) are consistent with those derived from the difference between HadISST2 and SST_cci1.1 (Fig. 20a,b), with large RMSE normally found in regions with strong eddy kinetic energy (EKE). These are also regions where ORAS5 SST has a large ensemble spread (>0.5 K in Fig. 19). In general, the SST RMSE in ORAS5 is reduced w.r.t. ORAS4 (Fig. 18b), e.g. in the South Indian Ocean, the South and western North Pacific, and the southern South Atlantic. Readers are reminded that mean differences between the ocean syntheses and SST_cci1.1 have been removed before computing the RMSE in Fig. 18b,d,f,h. Compared to ORAS4, the globally averaged RMSE is reduced by about 10% (30% if the mean difference is taken into account) in ORAS5.

[Figure caption fragment: ... with respect to the SST_cci1.1 data set. SST bias is computed using monthly-mean SST data and averaged over the 1993-2010 period. The RMSE is computed using monthly SST anomalies after removal of the seasonal cycle, and then normalized against the temporal standard deviation of the SST_cci1.1 data (also without seasonal cycle) over the same period. RMSE values smaller than 0.4 are shown as white.]

It is worth pointing out that different SST data sets were used for constraining SST in these two ocean syntheses before 2008: ORAS4 used OSTIA, and ORAS5 uses HadISST2.
However, this improvement in ORAS5 SST cannot be attributed to the new HadISST2 data set. On the contrary, w.r.t. SST_cci1.1, SST in HadISST2 has a higher RMSE (by about 5%) and a larger warm bias than OSTIA in the extra-tropics (Fig. 20). Therefore, the improvements in ORAS5 SST should be attributed to the increased model resolution and to the assimilation of updated EN4 in-situ data with improved vertical resolution.
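The normalized RMSE used for the SST maps (monthly anomalies with the seasonal cycle removed, normalized by the temporal standard deviation of the reference) can be sketched as below. The function names are hypothetical, and the sketch assumes that each data set is converted to anomalies relative to its own monthly climatology, which also removes the mean difference between the two data sets.

```python
import numpy as np

def monthly_anomalies(x):
    """Remove the mean seasonal cycle from a monthly series.

    `x` has time as its first axis, with a whole number of years
    (length divisible by 12) starting in the same calendar month.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if n % 12 != 0:
        raise ValueError("expected a whole number of years")
    stacked = x.reshape(n // 12, 12, *x.shape[1:])
    return (stacked - stacked.mean(axis=0)).reshape(x.shape)

def normalized_rmse(model, reference):
    """RMSE of anomalies divided by the reference anomaly std, per grid point."""
    a_model = monthly_anomalies(model)
    a_ref = monthly_anomalies(reference)
    rmse = np.sqrt(np.mean((a_model - a_ref) ** 2, axis=0))
    return rmse / np.std(a_ref, axis=0)
```

A value of 1 means the anomaly error is as large as the reference variability itself; grid points where the reference has zero variance would need masking before the division.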

Differences between ORAS5 (Fig. 18c,d) and CTL-HadIS (Fig. 18e,f) are non-trivial, with largely reduced mean biases in ORAS5, especially in the Labrador Sea and east of Japan. These regions also have large SST RMSE in CTL-HadIS due to misrepresentation of the mixed layer depth, but are slightly improved in ORAS5 by assimilating in-situ observations. As expected, CTL-NoSST (Fig. 18g,h) has the largest SST biases w.r.t. SST_cci1.1. These biases are associated with systematic model and/or forcing errors, e.g. underestimated upwelling west of South America and South Africa, or misrepresentation of mixing in the Southern Ocean, among others. The difference between CTL-NoSST and CTL-HadIS highlights the fact that the SST nudging method is very effective in keeping SST close to observations in the reanalysis system. Further investigation of the poor performance in the Gulf Stream and its extension is ongoing.

Sea level
The ESA sea-level CCI (SL_cci) project provides long-term along-track and gridded sea-level products from satellites for climate applications. Here, we use the latest reprocessed version 2.0 data from SL_cci (hereafter called SL_cci2) for validation of the sea level in ocean syntheses. The SL_cci2 sea-level data set is an update of version 1.1 (Ablain et al., 2015) and includes data from additional altimeter missions (SARAL/AltiKa and CryoSat-2). Unlike the AVISO DT2014 product, which is dedicated to the best possible retrieval of meso-scale signals, SL_cci2 focuses on the homogeneity and stability of the sea-level record. It has been produced using a different processing chain, and it also uses new altimeter standards, including a new orbit solution, atmospheric corrections, wet troposphere corrections, and a new mean sea-surface and ocean-tide model (see Quartly et al. (2017)). Therefore, it can be used here as a reference climate data set for validation of ocean syntheses at climate scales.
In order to evaluate the temporal variability of regional sea level in the ocean syntheses, the temporal correlation between ORAS5 SLA and SL_cci2 gridded SLA data has been computed over the 2004-2013 period, with the result shown in Fig. 21c. In general, sea level variation is well reproduced by ORAS5 in the tropics, with a temporal correlation normally higher than 0.9. Reduced correlation is visible along the North Equatorial Countercurrent in the Pacific, and is related to the discrepancy between the DT2014 and SL_cci2 data sets (Fig. 22a). Poor performance near the coast and in the extra-tropics could be attributed partly to the absence of SLA assimilation in these regions. The picture is similar for ORAS4 (Fig. 21a), except that the ORAS4 sea level correlation is lower than that of ORAS5 almost everywhere, and especially in the tropical Indian Ocean, the tropical Atlantic and the Norwegian Sea. This difference can in large part be attributed to the eddy-permitting model resolution of ORAS5, which accounts for most of the improvement in the extra-tropics, and to the assimilation of the new AVISO DT2014 data set, which accounts for most of the improvement in the tropics.
As expected, removal of altimeter SLA data significantly degrades system performance, as demonstrated by the correlation difference between O5-NoAlt (Fig. 21e) and ORAS5. In addition, assimilation of ocean in-situ observations further improves the representation of sea level in the reanalysis, owing to a better representation of meso-scale dynamics. This improvement is relatively [...]

For reference, the same diagnostics have been carried out for AVISO DT2014 data with respect to SL_cci2 (Fig. 22). In general, the temporal correlation between DT2014 and SL_cci2 is very high, indicating excellent agreement of temporal variations between the two data sets. Regions with lower correlation are visible though, e.g. along the North Equatorial Countercurrent in the Pacific between 180° W and 100° W (Fig. 22a). This is likely associated with differences in the production chains of DT2014 and SL_cci2, which include different altimeter-mission-dependent orbit solutions and geophysical corrections, and different filtering methods in processing along-track SLA. This discrepancy between the observational data sets is also responsible for the low correlation between ORAS5 and SL_cci2 SLA in the same region. Discrepancies in polar sea-level variances between DT2014 and SL_cci2 are likely associated with the new pole tide model (Desai et al., 2015) used in SL_cci2.
In order to evaluate the magnitude of temporal SLA variance in ORAS5, we compute the ratio of SLA variance between the ocean syntheses and SL_cci2 for the 2004-2013 period, with results shown in Fig. 21 [...] that has its origin in the forward ocean model. The CTL-HadIS experiment clearly exhibits this underestimation (Fig. 21h); it is likely related to the 0.25° resolution still being insufficient. Some of this underestimation is also attributed to the assimilated DT2014 data set, which has about 10% less variance than SL_cci2 in the average grid cell (see Fig. 22b). This difference between SL_cci2 and DT2014 is mostly due to the different geophysical corrections used in production (Jean-François Legeais, personal communication). Removal of altimeter data (O5-NoAlt, Fig. 21f) and of in-situ data (CTL-HadIS, Fig. 21h) from the assimilation system further reduces the simulated SLA variances, by approximately 3% and 5%, respectively. There are regions where ORAS5 has larger SLA variance though, e.g. in Baffin Bay, Hudson Bay, and most areas of the Southern Ocean. The reader is referred to Legeais et al. (2018) for a detailed evaluation of the ORAS5 sea level trend and its decomposition with respect to AVISO DT2014 and other ESA Sea Level CCI products.
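The two sea-level diagnostics used in this section, temporal correlation and variance ratio, can be sketched per grid point as follows. The function names are hypothetical, and both inputs are assumed to be SLA arrays on a common grid with time as the first axis.

```python
import numpy as np

def temporal_correlation(model, reference):
    """Pearson correlation in time between two SLA fields, per grid point."""
    m = np.asarray(model, dtype=float)
    r = np.asarray(reference, dtype=float)
    m = m - m.mean(axis=0)
    r = r - r.mean(axis=0)
    return (m * r).sum(axis=0) / np.sqrt(
        (m ** 2).sum(axis=0) * (r ** 2).sum(axis=0)
    )

def variance_ratio(model, reference):
    """Ratio of temporal variances (model over reference), per grid point."""
    return np.var(model, axis=0) / np.var(reference, axis=0)
```

The two measures are complementary: a series can be perfectly correlated with the reference and still have a variance ratio well below 1, which is the kind of underestimation discussed above.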

Sea-ice concentration
The ESA Sea-Ice CCI (SI_cci) project has produced a long-term SIC data set based on satellite passive microwave radiances.

The latest version 1.1 SIC data from SI_cci (hereafter SI_cci1.1) was produced using a sea-ice concentration algorithm and methodology developed by the EUMETSAT Ocean and Sea Ice Satellite Application Facility (Sørensen and Lavergne, 2017). [...] ability to represent Arctic sea-ice has been documented to be reasonably good (Tietsche et al., 2015; Chevallier et al., 2017; Uotila et al., 2018). Therefore, ORAP5 has been retained here as a reference data set. Overall, ORAS5 SIC (Fig. 23c,d) has the same error characteristics as ORAP5 (Fig. 23a,b), which have already been well documented in Tietsche et al. (2015). The averaged SIC RMSE is normally less than 5% in the Arctic, again comparable with ORAP5. The largest ORAS5 SIC RMSE (up to 20%) appears in the Labrador Sea in Arctic winter (Fig. 23c), and is caused by a mean positive (negative) SIC bias in the western (eastern) part of the Labrador Sea. High SIC RMSE is also visible along the east coast of Greenland in both Arctic winter and summer for ORAS5 (Fig. 23d), and is caused by a mean positive (negative) SIC bias in the East Greenland Current north (south) of Iceland. These are also regions identified with large model/forcing errors, as shown in CTL-HadIS (Fig. 23e).
As in ORAP5, the visible SIC errors along the Arctic coastlines and in the Baltic Sea in ORAS5 can be attributed to observation errors in the OSTIA SIC reanalysis. Assimilation of OSTIA SIC has greatly improved sea-ice performance in ORAS5. Compared to CTL-HadIS (Fig. 23e,f), ORAS5 has reduced SIC RMSE almost everywhere in Arctic summer (Fig. 23d). The largest improvement in Arctic winter is located east of Greenland, along the southern edge of the Arctic sea-ice outflow extension, and is associated with model errors in ocean currents and/or sea-ice velocity. The SST nudging scheme also contributes to the reduction of SIC RMSE in the system (not shown). These improvements are mostly due to the correction of thermodynamic errors in the model, which are common in Arctic summer for Arctic surface water, but also occur in Arctic winter and in the Barents Sea. For reference, the ensemble spreads of ORAS5 SIC are shown in Fig. 24; they are estimated using the same monthly-mean SIC conditions from the five ensemble members of ORAS5. It is encouraging to see that the spatial patterns of ORAS5 SIC uncertainty match those of the RMSE reasonably well, even though ORAS5 is over-confident in the Labrador Sea and along the east coast of Greenland. ORAS5 sea-ice uncertainty has been tested by Richter et al. (2018) in two radiative transfer models to generate atmosphere brightness temperatures. In addition, an evaluation of ORAS5 sea-ice thickness in the Arctic has been carried out by Tietsche et al. (2018), with a focus on thin sea ice, with respect to a data set derived from L-band radiances from the SMOS satellite. The interested reader is also referred to Zuo et al. (2018) for a case study of the extreme sea-ice conditions derived from ORAS5 in 2016, and their possible causes, for both the Arctic and Antarctic.

ORAS5 is a state-of-the-art 0.25° resolution ocean and sea-ice ensemble reanalysis system that covers the period from 1979 to present. ORAS5 and its real-time extension constitute OCEAN5, the fifth generation of ECMWF's ensemble reanalysis-analysis system. Major improvements of ORAS5 w.r.t. ORAS4 are the inclusion of a sea-ice reanalysis, increased resolution in the ocean, improved and up-to-date observational data sets, and improved methods for ensemble generation. ORAS5 also includes a series of system updates w.r.t. ORAP5, a pilot system. These include (a) improved observation pre-processing and quality-control methods; (b) a revised bias correction scheme with a stability check to prevent static instability; (c) a faster method to estimate the MDT for SLA assimilation. Particular attention is devoted to the consistency of surface observations, e.g. using HadISST2 SST together with OSTIA operational SST, and to an ensemble strategy that includes perturbation of initial conditions, bias correction, observations and forcing. These system updates are described in detail in this manuscript, together with an evaluation of system performance in the context of data assimilation.

The OCEAN5 RT analysis is produced daily, and is essential for the timely initialization of the ECMWF coupled forecasts.
Initialized from the latest ORAS5 conditions, the RT extension is produced by assimilating all available observational data into the ocean model driven by NWP forcings. Differences to ORAS5 are the variable assimilation window length, the smaller number of observations used, and the atmospheric forcing.
A series of sensitivity experiments has been carried out in order to assess ORAS5. It was found that all system components (SST nudging, assimilation of in-situ observations and/or SLA data, bias correction) contribute to an improved ocean state by reducing fit-to-observation errors in the ocean syntheses. Among them, direct assimilation of in-situ observations accounts for most of the improvement in both temperature (65%) and salinity (90%). This result suggests that different observation types (multiple altimeters, satellite SST and SIC observations, ocean in-situ) can be effectively assimilated in the ocean and sea-ice model and efficiently constrain the ocean and sea-ice states. The impact of different in-situ observation types in the current global ocean observing system was tested with global OSEs. Various metrics showed a non-linear degradation of the analysed ocean state for all observation types, with Argo showing the strongest impact. Region-wise, the degradation of the ocean state in the Atlantic was more severe than in the other main ocean basins, indicating a strong need for a dense in-situ network in this region.
The climate quality of ORAS5 has been evaluated using three ECVs (SST, SLA and SIC) against reference climate records from the ESA CCI project. Results suggest that ORAS5 has an improved ocean state w.r.t. ORAS4 in terms of reconstructed SST and sea level, with much reduced warm biases in the extra-tropics and better regional sea-level variance between 50° S and 50° N. The performance of SIC in ORAS5 is similar to that of its predecessor ORAP5. In addition, the ORAS5 ensemble of SIC appears to provide a reliable measure of uncertainty in the estimation, being comparable to the RMSE between ORAS5 and the ESA CCI SIC observations. It also allows for uncertainty estimation of climate signals, which however is beyond the scope of this document and will be investigated elsewhere (Zuo et al., in preparation). Evaluations of ORAS5 have also been carried out within the framework of the ESA SL_cci (Legeais et al., 2018), ESA-SMOS and CMEMS projects.
The large SST biases in the Gulf Stream and its extension have improved in ORAS5 compared to ORAS4, as a consequence of increased spatial resolution. However, the bias remains large, and is associated with a fundamental misrepresentation of front positions and an overshoot of the northward transport along the coast after Cape Hatteras. The impact of high resolution in ORAS5 is more visible in the area of the sub-polar gyre. Other issues identified in ORAS5 that need improving include the usage of observations in high latitudes, near the coast and on the continental shelf, especially given the recent development of the new ESA CCI sea-level product (Quartly et al., 2017; Legeais et al., 2018), which has reduced uncertainties in these regions. The underestimated SLA variance is thought to be associated with sub-optimal parameter specifications in observation errors and data sampling.
Two clear priorities for the development of the ocean data assimilation system emerge from the experience with ORAS5. One is the treatment of SST observational constraints. The other is the assimilation of altimeter-derived sea level. The current relaxation method to constrain the SST has several shortcomings: i) it lacks the capability to project the SST information directly into the subsurface, relying on the ocean model mixing processes to achieve that; ii) the strength of the relaxation at high latitudes can have strong impacts on the ocean circulation, introducing process imbalances that damage the coupled forecast. The latter is the subject of a more detailed study (in preparation). It would be possible to optimize the strength of the SST nudging, but a longer-term solution requires investing in the proper assimilation of SST, using an appropriate vertical and horizontal correlation structure function and multivariate relationships. The assimilation of altimeter-derived sea level should also be improved. The current practice of assimilating sea level anomalies (SLA) requires a pre-computed mean dynamic topography (MDT), which is expensive, or even unaffordable in coupled data assimilation, and is prone to errors. Better solutions should be sought in terms of an online computation of the MDT (Lea et al., 2008) or, preferably, by making direct use of sea surface height and geoid information. The use of altimeter observations should also be optimized by further development of the multivariate background error covariance formulation in NEMOVAR, so as to include constraints between sea surface height and barotropic stream function. This should have a large impact in constraining the position of the Gulf Stream and other oceanic fronts, which should benefit the NWP forecasting activities.
Development of the next generation of ocean reanalysis system also requires: a) a better quality atmospheric forcing with increased temporal and spatial resolutions; b) an improved perturbation strategy with stochastic model perturbations; c) a flow-dependent BGE covariance matrix in NEMOVAR; and d) revised parameterizations for both OBE and BGE covariance matrices.
Data availability. The full ORAS5 data set can be downloaded from the Integrated Climate Data Center portal at http://icdc.cen.uni-hamburg.