Articles | Volume 16, issue 4
Research article
23 Jul 2020
Research article |  | 23 Jul 2020

Assessing the role and consistency of satellite observation products in global physical–biogeochemical ocean reanalysis

David Andrew Ford

As part of the European Space Agency's Climate Change Initiative, new sets of satellite observation products have been produced for essential climate variables including ocean colour, sea surface temperature, sea level, and sea ice. These new products have been assimilated into a global physical–biogeochemical ocean model to create a set of 13-year reanalyses at 1 resolution and 3-year reanalyses at 1∕4 resolution. In a series of experiments, the variables were assimilated individually and in combination in order to assess their consistency from a data assimilation perspective. The satellite products, and the reanalyses assimilating them, were found to be consistent in their representation of spatial features such as fronts, sea ice extent, and bloom activity. Assimilating multiple variables together often resulted in larger mean increments for a variable than assimilating it individually, providing information about model biases and compensating errors which could be addressed in the future development of the model and assimilation scheme. Sea surface fugacity of carbon dioxide had lower errors against independent observations in the higher-resolution simulations and was improved by assimilating ocean colour or sea ice concentration, but it was degraded by assimilating sea surface temperature or sea level anomaly. Phytoplankton biomass correlated more strongly with net air–sea heat fluxes in the reanalyses than chlorophyll concentration did, and the correlation was weakened by assimilating ocean colour data, suggesting that studies of phytoplankton bloom initiation based solely on chlorophyll data may not provide a full understanding of the underlying processes.

1 Introduction

In order to understand and monitor the Earth's climate system, it must be routinely and systematically observed. Satellite remote sensing is particularly valuable in providing daily global data, with satellite records for some variables dating back to the 1970s. To allow for the detection of climate variability and change, long-term continuous time series of such data need to be consistently processed, stable, and free from artificial trends. Products which match these requirements can be referred to as climate data records (CDRs) (NRC2004). To address this requirement for a set of essential climate variables (ECVs) (GCOS2011) that can be observed from space, the European Space Agency (ESA) initiated a programme called the Climate Change Initiative (CCI) (Plummer et al.2017). For ECVs including ocean colour, sea surface temperature, sea level, and sea ice, sets of satellite observation products have been developed and made available, with the aim that they can be used as CDRs. A further aim is for the CCI products to be consistent between ECVs, allowing for an integrated assessment of the climate system.

Such observation products are vital for understanding climate variability and change but are insufficient on their own. Coverage is incomplete, not all variables of interest are routinely observed, and there is no predictive capability. Models are required to address these aspects and in conjunction with observations can provide a much fuller understanding of the Earth system. Recognising this need, the CCI programme includes a Climate Modelling User Group (CMUG) to assess CCI data products from a modelling perspective.

A powerful way to combine these sources of information is through data assimilation, which can be used to create reanalyses (Storto et al.2019). As with CDRs, reanalyses are of most benefit to climate studies if they are stable and consistent, both throughout time and between model variables. In these cases, reanalyses can be used to provide valuable insights into the Earth system, which may not be available from observations or models alone (Jackson et al.2016). However, models have limitations, and even with the aid of data assimilation they cannot accurately simulate all aspects of the climate system. Nevertheless, the process of creating and assessing reanalyses itself can aid understanding, both of the real world and of the underlying models. This can lead to improvements in these models, in turn leading to improvements in the next generation of reanalyses and climate projections.

Physical ocean reanalyses are increasingly used for climate studies (Balmaseda et al.2015; Storto et al.2019), and physical–biogeochemical reanalyses assimilating ocean colour data on its own are starting to be developed (Ciavatta et al.2016; Ford and Barciela2017; Fennel et al.2019). It is not yet routine though for a single ocean reanalysis to include the assimilation of both physical and biogeochemical data. A major reason for this is that, especially in global models, assimilating physical observations has been widely found to degrade biogeochemical fields (While et al.2010; Raghukumar et al.2015; Park et al.2018) through the creation of spurious vertical mixing. This is an outstanding issue for the ocean data assimilation community and has not been solved in this study. However, whilst this currently limits the ability to produce stable long-term physical–biogeochemical ocean reanalyses, consistency of variability and features can still be explored, especially if the biogeochemistry is constrained by assimilation of ocean colour data.

This paper presents a set of physical–biogeochemical ocean model runs, assimilating different combinations of satellite data products, as well as in situ temperature and salinity observations. These were performed as part of the CCI CMUG. The aims of the study were the following:

  • to assess whether the four marine ECVs developed as part of phase one of CCI are mutually consistent from a data assimilation perspective;

  • to assess the consistency of physical–biogeochemical relationships in reanalyses assimilating different combinations of these ECVs; and

  • to assess the impact of assimilating the ECVs on the marine carbon cycle.

To address these aims, the model runs have been assessed in a variety of ways. Results are presented here evaluating the assimilation increments, frontal features in the Agulhas Current and Gulf Stream regions, sea ice extent and phytoplankton blooms in the Arctic, validation against in situ carbon observations, variability in the tropical Pacific, and the relationship between phytoplankton variability and air–sea heat fluxes.

2 Model and assimilation

The model and assimilation setup used in this study, and the couplings between different components, is shown schematically in Fig. 1. An overview of the components is given below; for further details the reader is referred to the references provided.

Figure 1Model and assimilation setup used. Arrows denote the direction of coupling.


The physical modelling system was a reanalysis version of v13.1 of the global Forecasting Ocean and Assimilation Model (FOAM), which is run operationally at the Met Office (Blockley et al.2014). FOAM v13.1 includes the GO5.0 configuration (Megann et al.2014) of v3.4 of the Nucleus for European Modelling of the Ocean (NEMO) (Madec2008) hydrodynamic model, the GSI6.0 configuration (Rae et al.2015) of the Los Alamos sea ice model (CICE) (Hunke and Lipscomb2010), and a 3D-Var configuration of the NEMOVAR v3 data assimilation scheme (Waters et al.2015). FOAM v13.1 is a development of the FOAM v12.0 system described by Blockley et al. (2014), and the differences between the two are detailed by Jackson et al. (2016). It is the same configuration that was used to produce the GloSea5 physical ocean reanalysis (MacLachlan et al.2015; Jackson et al.2016), with atmospheric forcing provided by the ERA-Interim reanalysis (Dee et al.2011).

The biogeochemical model used was the Hadley Centre Ocean Carbon Cycle model (HadOCC) (Palmer and Totterdell2001). HadOCC is a relatively simple nutrient–phytoplankton–zooplankton–detritus (NPZD) model, with six state variables, a fully coupled carbon cycle, and a variable carbon-to-chlorophyll ratio. FOAM-HadOCC has been used for previous assimilation studies (Ford et al.2012; While et al.2012; Ford and Barciela2017). The HadOCC model settings used in this study were the same as used by Ford and Barciela (2017), except that atmospheric CO2 concentrations were taken from the National Oceanic and Atmospheric Administration Earth System Research Laboratory (NOAA/ESRL) global monthly mean observation product (Dlugokencky and Tans,, last access: 31 October 2014). The coupling between FOAM and HadOCC was one-way, with no feedback from biogeochemistry to physics.

Assimilation of physics data used the first guess at appropriate time (FGAT) 3D-Var NEMOVAR implementation of Waters et al. (2015), with the added option to use multiple correlation length scales developed by Mirouze et al. (2016). The assimilation of ocean colour data was the same as the approach taken by Ford et al. (2012) and Ford and Barciela (2017), except that the analysis correction scheme of Martin et al. (2007) was replaced by the 3D-Var NEMOVAR implementation of Waters et al. (2015). NEMOVAR was used to create an analysis of surface log 10(chlorophyll), and the nitrogen balancing scheme of Hemmings et al. (2008) was used to create a set of 3D increments to all the biogeochemical state variables, which were applied to the model. The nitrogen balancing scheme is designed to conserve mass and balances nitrogen between variables based on whether phytoplankton growth or loss errors dominate (Hemmings et al.2008). Specific details regarding the assimilation of each ECV are given in the following sections.

3 Observations

This study used various sets of satellite and in situ observation products for both assimilation and validation, and these are described in turn below. Product versions used were the most recent official releases available at the time the study began.

Satellite ocean colour (OC) data were taken from the v2 CCI product (Sathyendranath et al.2017). This study used level three (L3) daily average chlorophyll data, merging information from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) on board SeaStar, the Moderate Resolution Imaging Spectroradiometer (MODIS) on board Aqua, and the Medium Resolution Imaging Spectrometer (MERIS) on board Envisat. During the product generation MERIS and MODIS remote sensing reflectances were bias-corrected against SeaWiFS in order to minimise inter-sensor differences. The product also includes per-pixel uncertainty estimates, following the approach of Moore et al. (2009).

Satellite sea surface temperature (SST) data were taken from the v1.1 CCI product (Merchant et al.2014). This study used level two pre-processed (L2P) SST data from the Advanced Very High Resolution Radiometer (AVHRR) series of sensors and level three uncollated (L3U) SST data from the Along-Track Scanning Radiometer (ATSR) series of sensors. Individual sensors were processed and provided separately, with AVHRR bias-corrected against ATSR. Per-pixel uncertainty estimates are included, as described by Bulgin et al. (2016a, b).

Satellite sea level anomaly (SLA) data were taken from the v1.1 CCI product (Ablain et al.2015;, last access: 8 February 2016). This study used the Fundamental Climate Data Record (FCDR) product, which provides level 2 (L2) single-sensor along-track SLA data for all available sensors. SLA was calculated using the DTU10 (Andersen and Knudsen2009; Andersen2010) mean sea surface as a reference.

Satellite sea ice concentration (SIC) data were taken from the v1.2 reprocessed global level four (L4) product (Tonboe et al.2016) of the EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSI SAF) based on the Special Sensor Microwave Imager/Sounder (SSMIS), the Special Sensor Microwave/Imager (SSM/I), and the Scanning Multichannel Microwave Radiometer (SMMR). An OSI SAF product was used rather than CCI because the CCI SIC project began later than the other ECVs, and an appropriate CCI product for these experiments was not yet available when the study began. However, the OSI SAF and CCI SIC projects are led by the same group, with shared research and development capabilities. Furthermore, OSI SAF products provide a consistent time series from 1979, whereas CCI products use sensors which only provide a consistent time series from 2002, meaning the OSI SAF products are more appropriate for many reanalysis users.

In situ SST data were taken from v2.5 of the International Comprehensive Ocean–Atmosphere Data Set (ICOADS) (Woodruff et al.2011) for the period 1998–2007 and from the Global Telecommunications System (GTS) from 2008 onwards. They include measurements from moored buoys, drifting buoys, and ships (Blockley et al.2014; Jackson et al.2016).

In situ temperature and salinity profile (T&S) data were taken from the EN4.1.1 product (Good et al.2013) with Gouretski and Reseghetti (2010) corrections.

In situ surface fugacity of carbon dioxide (fCO2) data were taken from the v2 Surface Ocean CO2 Atlas (SOCAT) product (Bakker et al.2014). The capability to assimilate these data exists (While et al.2012), but in this study fCO2 observations were only used for validation.

4 Experiments

The aims of this study included assessing both spatial features and interannual variability. The former requires a high enough model resolution to represent such features, and the latter requires multi-year model runs. Due to the computational expense of the assimilative coupled physical–biogeochemical model, multiple long runs were unable to be performed at eddy-permitting resolution. Therefore, two sets of model runs were performed at different resolutions. To assess interannual variability, a set of 13-year runs was performed at 1 horizontal resolution, covering the period 1 January 1998 to 31 December 2010, which are the full years for which there are data from all four satellite ECV products used in this study. To assess spatial features, a set of 3-year runs was performed at 1∕4 horizontal resolution, covering the period 1 January 2008 to 31 December 2010. The same vertical resolution was used in both cases, with 75 levels and a 1 m surface layer.

The same CICE and HadOCC settings were used for both resolutions, but some NEMO settings needed to be changed for running at 1 resolution compared with the 1∕4 configuration described by Megann et al. (2014). For the 1 runs, the eddy-induced velocity parameterisation of Gent and McWilliams (1990) was turned on, and Laplacian rather than bi-Laplacian lateral iso-level diffusion was used on momentum, with associated mixing coefficients varying in 3D rather than 2D. Furthermore, the special treatment of tidal mixing in the Indonesian throughflow, as developed by Koch-Larrouy et al. (2008), was only used at 1∕4 resolution.

Prior to assimilation, observations of physical variables were quality-controlled in the same manner as for the operational FOAM system and GloSea5 reanalysis (Blockley et al.2014; Storkey et al.2010). OC observations were quality-controlled as described for CCI products by Ford and Barciela (2017). All assimilated observations were median-averaged to a resolution of 13 km for the 1∕4 runs and 100 km for the 1 runs. For physics observations, the error variances used by NEMOVAR were those described by Blockley et al. (2014) for the 1∕4 runs, interpolated to 1 for the 1 runs. For OC observations, the error variances were those described for CCI products by Ford and Barciela (2017) for the 1 runs, interpolated to 1∕4 for the 1∕4 runs. For the 1∕4 runs, two correlation length scales were used by NEMOVAR for physical ocean variables, as described by Mirouze et al. (2016). For OC and SIC at 1∕4 resolution, and all variables at 1 resolution, a single correlation length scale was used based on the first baroclinic Rossby radius, as described by Waters et al. (2015). Unlike the operational FOAM system and GloSea5 reanalysis, no bias correction of SST observations was performed in order to assess whether the bias correction and processing performed during the creation of the SST CDR were sufficient to allow for a stable reanalysis. However, SLA bias correction (Lea et al.2008; Blockley et al.2014) was still performed using the DTU10 (Andersen and Knudsen2009; Andersen2010) mean dynamic topography (MDT), as this corrects for errors in the MDT.

Table 1Summary of model runs performed. SST refers to satellite data only; in situ SST and profile data are included with T&S. HIGH and LOW are intended as relative terms.

Download Print Version | Download XLSX

For the 1∕4 runs, spun-up physical and biogeochemical model fields were taken from a previous project, and used as initial conditions with no further spin-up. For the 1 runs, a dedicated spin-up was performed without data assimilation, covering the period 1 January 1980 to 31 December 1997. The initial conditions were the same as for the spin-up of Ford and Barciela (2017), except for temperature and salinity, which in this study were taken from the EN4.1.1 objective analysis for January 1980 (Good et al.2013; Gouretski and Reseghetti2010), and dissolved inorganic carbon (DIC) and alkalinity, which were taken from the Global Data Analysis Project (GLODAP) climatology (Key et al.2004) and converted from per unit mass to per unit volume to match the units used by HadOCC. At the end of the spin-up, a uniform constant was added to the NEMO sea surface height (SSH) fields to give a global mean SSH of zero, as the global mean SSH had drifted and would have caused a large initialisation shock when SLA assimilation began. This procedure maintained all SSH gradients and features and had no impact on model dynamics. These fields were then used as initial conditions for the main 1 runs.

The model runs performed are summarised in Table 1. At each resolution there were eight runs: a free run with no data assimilation, four runs assimilating each satellite ECV individually, a run assimilating all four satellite ECVs together, a run assimilating all four satellite ECVs plus in situ SST and temperature and salinity profiles (T&S), and a run assimilating satellite SST, OC, and SIC. Unique identifiers for each run are detailed in Table 1, prefixed with HIGH for 1∕4 and LOW for 1, which are simply intended as relative terms. The identifiers include the variables assimilated (e.g. HIGH_OC_SST_SIC_SLA_T&S for the 1∕4 run assimilating all data), with in situ SST and profile data included with T&S.

In most cases, assimilation increments were applied at all model grid points. However, for model stability the following exceptions were required.

  • No increments were applied in the Baltic Sea in the 1 runs, as it is treated as an enclosed sea at this resolution.

  • No biogeochemical increments were applied in grid boxes with SIC greater than 0.01, in a relaxation of the conditions imposed by Ford et al. (2012) and Ford and Barciela (2017).

  • Phytoplankton nitrogen increments were limited in magnitude to 1.0 mmol m−3 in a region surrounding the Amazon River outflow, prior to running the Hemmings et al. (2008) nitrogen balancing scheme. This was to avoid spuriously large DIC increments at very low chlorophyll concentrations in the region of freshwater influence.

  • No assimilation was performed on 18 January 2000 in 1 runs including SLA assimilation, as a few anomalously large SLA observations resulted in unrealistic increments.

  • Near the Antarctic coast during February and March, sparse SLA and T&S observations located in melt ponds occasionally led to unrealistically large increments being generated in LOW_SLA and LOW_OC_SST_SIC_SLA_T&S. In these cases, no increments were applied for a short period in the surrounding region until the ice had melted further.

  • SLA assimilation is designed to be performed in combination with T&S assimilation (Lea et al.2014), and assimilating SLA data on their own can sometimes result in adverse changes to subsurface density structure in energetic regions. This occasionally led to a model instability in the Malvinas Current region in HIGH_SLA, and so to prevent this no increments were applied in this region on 12 dates during the run. This was also required on one date during HIGH_OC_SST_SIC_SLA_T&S.

5 Results

The model runs have been assessed through a series of case studies, presented in turn below. These are intended to explore physical–biogeochemical relationships in the model and observations, and the impact of data assimilation on these, rather than simply validating the accuracy of the reanalyses. For validation of the underlying system, the reader is referred to Blockley et al. (2014) for the physical model and assimilation, Ford and Barciela (2017) for the biogeochemical model and assimilation, and Lea et al. (2014) and King et al. (2020) for data withholding experiments performed with the physics-only system.

5.1 Assimilation increments

A measure of how hard the data assimilation is working is the long-term mean and standard deviation of the assimilation increments. This can also provide important information about model biases. The larger the increments, the larger the corrections being applied to the model to try to keep it close to the observations. In theory, if the observation products are providing consistent information and the model and assimilation scheme are performing as intended, then assimilating multiple ECVs should result in smaller mean increments for a given ECV compared with assimilating that ECV alone. To test that theory, maps of the mean increments for 1998–2010 are plotted for OC (Fig. 2) and SST (Fig. 3) from each of the 1 runs assimilating those ECVs. For runs assimilating multiple ECVs, the difference in the absolute mean increments from the run assimilating the single ECV is also plotted. Note that there is no feedback from the biogeochemistry to the physics, so assimilating OC has no impact on the physics variables. Mean increments from the 1∕4 runs (not shown) showed generally consistent patterns with those from the 1 runs, apart from when assimilating SLA, as discussed below.

Figure 2Mean surface log 10(chlorophyll) assimilation increments for 1998–2010.

Figure 3Mean SST assimilation increments for 1998–2010.

When OC was assimilated individually, the mean surface log 10(chlorophyll) increments were negative across most of the ocean (Fig. 2a), indicating a persistent model bias which the assimilation was continually trying to correct. This is consistent with the validation presented in Ford and Barciela (2017). When SST was also assimilated (Fig. 2b, c), the magnitude of the mean increments was slightly increased, particularly around the edges of the subtropical gyres. In LOW_FREE the global mean SSH drifted over time due to a freshwater imbalance in the model (Blockley et al.2014). Assimilating SST without also assimilating T&S is known to generate spurious heat beneath the mixed layer, which served to reduce the drift in SSH, but by introducing a compensating error. This resulted in the spatial extent of the oligotrophic gyres spuriously shrinking, leading to excess primary production which the OC assimilation was continually trying to reduce. Globally, the magnitude of the log 10(chlorophyll) increments was then increased considerably when SLA was also assimilated (Fig. 2d, e) and remained increased when T&S was assimilated (Fig. 2f, g). The reason for this is the issue mentioned in the Introduction, and explored by While et al. (2010) and Park et al. (2018), that assimilating SLA and T&S results in spurious vertical mixing, especially in equatorial regions, bringing excess nutrients to the surface and fuelling primary production. Counterintuitively, despite the biggest impact on mixing being at the Equator, the biggest impact on the log 10(chlorophyll) increments was away from the Equator. This is because phytoplankton growth in these runs was not generally nutrient-limited around the Equator, so an increased nutrient supply had little impact on chlorophyll concentration. However, over the 13 years of the runs these nutrients were advected away from the Equator into surrounding nutrient-limited regions, resulting in excessive primary production and an increase in the mean increments as the assimilation tried to correct this.

When SST was assimilated individually, the mean SST increments were also generally negative (Fig. 3a), indicating a persistent warm bias in the model, consistent with the findings of Blockley et al. (2014). When SLA assimilation was also introduced (Fig. 3b, c), the mean increments increased in magnitude, apparently due to the subsurface density corrections applied by the SLA assimilation not being entirely consistent with the SST, meaning larger SST increments were required to correct the bias. It has previously been shown by Lea et al. (2014) that SLA assimilation is best performed in combination with T&S assimilation, as this allows subsurface density errors in the model to be corrected. Adding T&S assimilation (Fig. 3d, e) reduced the mean increments, but they remained larger than when assimilating SST alone.

When SLA was assimilated individually (not shown), the SLA increments were largest in energetic regions such as around the Equator and western boundary currents, as well as the Southern Ocean. There was no clear global bias due to the SLA bias correction scheme (Lea et al.2008). In the 1 model, adding SST assimilation had little impact on the increments, whilst adding T&S assimilation resulted in patchy increases in mean SLA increments. In the 1∕4 model, adding SST assimilation served to reduce the mean SLA increments, and adding T&S assimilation greatly reduced them further. This latter result is consistent with Lea et al. (2014) and highlights the complementarity of SLA and T&S data. The contrasting findings in the 1 model are likely due to the coarse resolution, which is unable to resolve eddies, and the fact that the error covariances were not tuned for the lower resolution.

Assimilating SIC and SST in combination reduced both the mean SIC and SST increments applied in the Antarctic, compared with assimilating SIC and SST individually, but not in the Arctic, where the impact was minimal (not shown).

The finding that assimilating multiple ECVs often results in larger mean increments for a given ECV compared with assimilating that ECV alone implies that either the observation products are providing inconsistent information or that the model and assimilation are not performing entirely as intended. Analysis suggests the latter to be the case, meaning that looking at mean increments does not provide evidence either way about whether the CCI products are mutually consistent. It does, though, highlight issues with the multivariate assimilation which can be addressed during future development work. It should also be noted that the physics data assimilation is designed to work best when all data types are available, as these provide complementary information (Lea et al.2014).

5.2 Fronts and eddies

An important way in which CDRs and reanalyses should be consistent is in the representation of spatial features, such as the positioning of fronts and eddies and details of major currents such as the Agulhas and Gulf Stream. This can be assessed by analysing the horizontal gradients of different fields. Following the method of Martin (2016), horizontal gradients of SST, SLA, and surface log 10(chlorophyll) were calculated by bilinearly interpolating model fields to observation locations, binning the resulting model and observation values onto a regular grid over a month, and calculating their spatial derivative. Gradients have been plotted for the final month, December 2010, in the Agulhas Current (Fig. 4) and Gulf Stream (Fig. 5) regions for the CCI products and the 1∕4 free run and runs assimilating SST, SLA, and OC individually and in combination. SST and surface log 10(chlorophyll) were binned onto a 1∕4 grid, as this gave the clearest resolution of features for these variables, whilst SLA was binned onto a 1∕2 grid due to the lower observational coverage. These regions were chosen due to their complex variability and physical–biogeochemical interactions. Similar conclusions have been found from looking at other regions (not shown).

Figure 4Observed and modelled gradients in the Agulhas Current region for December 2010.

Figure 5Observed and modelled gradients in the Gulf Stream region for December 2010.

In the Agulhas Current region, a good correspondence can be seen between features in each variable in the observation fields (Fig. 4a–c). While the position of gradients is not expected to be identical in all fields, SST fronts can generally be found around the eddies identified in the SLA fields, with the log 10(chlorophyll) gradients showing bloom activity along these fronts, relating to the advection of nutrients (Machu et al.2005). This suggests that the CCI products are giving a suitably complementary view of such features. In HIGH_FREE (Fig. 4d–f), SST and SLA gradients are found in similar locations to the observations, but not quite as sharp, and the ocean is less energetic, as expected from an eddy-permitting-resolution model. High primary production associated with the Agulhas retroflection is captured well, but otherwise the log 10(chlorophyll) gradients are a poor match for the observed fields. Gradients associated with SST fronts are too weak, and Southern Ocean primary production is too high and overly homogeneous, leading to a spuriously strong gradient at the boundary of the Indian Ocean and Southern Ocean. In HIGH_SST (Fig. 4g–i) improvements are seen in the positions of both SST and SLA gradients. Some small strengthening of log 10(chlorophyll) gradients is seen. In HIGH_SLA (Fig. 4j–l) the ocean becomes more energetic and the match with observed SLA gradients is improved, but the impact on SST and log 10(chlorophyll) gradients is mixed. In HIGH_OC (Fig. 4m–o) there is no impact on physical fields due to the one-way coupling in the model, but log 10(chlorophyll) gradients are greatly improved although still weaker than in the observations. In HIGH_OC_SST_SIC_SLA (Fig. 4p–r), the best match for observed gradients of both SST and SLA can be seen. The log 10(chlorophyll) gradients are similar in magnitude but overly noisy compared with HIGH_OC, likely due to excessive vertical mixing resulting from the assimilation of SLA data, but the spurious gradient in production between the Southern Ocean and Indian Ocean is finally removed.

In the Gulf Stream region (Fig. 5), similar results were found. In the observation fields SST and log 10(chlorophyll) fronts are largely collocated and situated around eddies identified in the SLA products. In HIGH_FREE the SST gradients are broadly similar to the observed fields, but some specific features are lacking. SLA and log 10(chlorophyll) gradients are found in corresponding locations but are too weak in magnitude compared to the observations. In HIGH_SST the position and magnitude of gradients are improved in all three fields. In HIGH_SLA the SLA gradients are improved, with some improvement to SST and log 10(chlorophyll) gradients, but there is also increased noise. In HIGH_OC the locations of log 10(chlorophyll) gradients match those in the observed fields, but the magnitude remains too weak. In HIGH_OC_SST_SLA the best combined representation of gradients in the three fields is seen.

5.3 Sea ice extent

Consistency is expected between SST and SIC in satellite products and analyses derived from them (Roberts-Jones et al.2012). Consistency should also be expected between OC and SIC products. Simply in terms of observational coverage, OC cannot be observed under ice cover, and nearby ice may affect satellite retrievals (Bélanger et al.2007), so OC data should not be expected at locations where the OC retrieval would be contaminated by sea ice (Wang and Shi2009). Furthermore, phytoplankton blooms often occur around the ice edge, where freshly melted ice reveals nutrient-rich stratified waters (Perrette et al.2011). These would be expected to be identified in observed and modelled chlorophyll fields in locations commensurate with the ice edge in SIC products.

Maps of SIC and surface chlorophyll in the Arctic are plotted from the observation products and a selection of 1∕4 model runs for the 8 d period of 17–24 May 2010 (Fig. 6). This period was chosen as it is during the spring bloom and ice melt in the Arctic, has good observational coverage, and contains interesting and representative features. Daily ocean colour products have insufficient spatial coverage, but the 8 d composite product for this period has near-complete coverage while still representing a snapshot of conditions.

Figure 6SIC (left column) and surface chlorophyll (right column) for 17–24 May 2010 from observed (a, b) and modelled (c–l) fields. In (b) chlorophyll blooms in the Bering Sea and Barents Sea are marked with Be and Ba, respectively.

In the observation products (Fig. 6a, b), the limit of OC coverage mostly matches the ice edge, as defined by the 0.15 contour (Parkinson and Cavalieri2008) and overlaid in red on the plots. In a few locations there are OC observations collocated with sea ice, but only where the SIC is low. Furthermore, the 8 d OC product is a composite, while the mean of the L4 SIC products over the period has been plotted, so this may be due to variability in SIC over the 8 d. Overall, OC coverage can be considered consistent with the SIC fields. Chlorophyll blooms can also be seen near to the ice edge in the Barents Sea and Bering Sea.

In HIGH_FREE (Fig. 6c, d), SIC is a reasonable qualitative match for the observations, but the concentrations are generally too low, and the ice edge is too far south. There is a chlorophyll bloom in the Barents Sea, near the ice edge in the model, but not in the Bering Sea. Within the ice-covered region, high chlorophyll concentrations are found in areas with moderate SIC. Observational studies have confirmed the presence of chlorophyll blooms under sea ice (Arrigo et al.2012), but no observations are available for model validation.

In HIGH_SST, the position of the ice edge is a much better match for the observations than in HIGH_FREE. This suggests that the CCI SST products are consistent with the SIC products and that assimilating SST therefore improves SIC. Chlorophyll fields remain similar, except that the bloom in the Barents Sea extends further north up to the altered ice edge.

HIGH_SIC has a further improved ice edge position and higher SIC within the ice region, better matching the observations. This increase in SIC has the effect of reducing the chlorophyll concentration in these areas. The lack of in situ observations means this change cannot be validated, but lower chlorophyll associated with greater ice cover is a consistent response due to the decrease in light availability.

HIGH_OC improves chlorophyll near and away from the ice edge, including introducing bloom activity in the Bering Sea. Since SIC is unconstrained, though, the assimilative model cannot capture exact details around the ice edge. Nor is any change seen within the ice region, where there are no OC observations.

In HIGH_OC_SST_SIC, the best match for the SIC and OC observations is seen, as is reduced bloom activity in the ice region.

5.4 Carbon cycle validation

A key aim of global marine biogeochemical reanalysis is to study the carbon cycle and provide the best possible estimates of surface fCO2 and air–sea CO2 flux. This can inform the study of the global carbon budget (e.g. Le Quéré et al.2018).

Figure 7Taylor plot showing global fCO2 statistics against SOCAT v2 for each run.


To assess the impact of assimilating different CCI products on the model's ability to reproduce surface fCO2, each run has been validated against observations from the SOCAT v2 database (Bakker et al.2014). The observations were passed to the model while it was running, and an observation operator was used which bilinearly interpolated the model fields to the observation locations at the nearest model time step to the observation time. From these match-ups, validation statistics have been calculated and displayed as a Taylor plot (Taylor2001) in Fig. 7. As in Ford and Barciela (2017), observations in shelf seas, defined as waters where the bottom depth is <200 m (Simpson and Sharples2012), have been excluded from the validation. This is because a relatively coarse global model without tides is unable to represent complex shelf sea processes. In Fig. 7 a point is plotted for each model run in Table 1, comparing to all off-shelf fCO2 observations during the length of that run (1998–2010 for the 1 runs, 2008–2010 for the 1∕4 runs). Furthermore, a point has been plotted for each of the 1 runs just assessing the years 2008–2010, allowing for a direct comparison between the 1 and 1∕4 runs.

At each resolution, assimilating OC or SIC products resulted in a small improvement in fCO2 statistics, while assimilating SST resulted in a small degradation. Assimilating all three together improved the normalised standard deviation while lowering the correlation. Assimilating SLA products, either individually or in combination with other variables, resulted in a large degradation in fCO2 statistics. This is due to the impact on vertical mixing processes mentioned in the Introduction.

Comparing the 1 and 1∕4 runs for 2008–2010, for all combinations of assimilation the 1∕4 runs show a small but marked improvement in fCO2 statistics. Given that the double-penalty effect (Gilleland et al.2009) often masks improvements due to model resolution, this is a clear suggestion that the higher-resolution model is better able to accurately represent fCO2.

5.5 Temporal variability

A major driver of climate variability is the El Niño–Southern Oscillation (ENSO). One measure of ENSO variability is the Niño 3.4 index (Fig. 8a), calculated as the 5-month running mean of SST anomalies in the Niño 3.4 region (5 N–5 S, 170–120 W) of the tropical Pacific (Trenberth1997). To explore the representation of ENSO variability in SST, vertically integrated primary production (PP) and air–sea CO2 flux, and the impact of SST and OC assimilation and model resolution, 5-month running means of these variables averaged over the Niño 3.4 region are plotted in Fig. 8. For LOW_FREE and HIGH_FREE the absolute values are plotted, and for LOW_SST, HIGH_SST, LOW_OC, HIGH_OC, LOW_OC_SST_SIC, and HIGH_OC_SST_SIC, anomalies from LOW_FREE and HIGH_FREE are plotted.

Figure 8The 5-month running mean time series of variables averaged over the Niño 3.4 region (5 N–5 S, 170–120 W). (a) Observed Niño 3.4 SST index (Trenberth1997) as calculated from HadISST1 data (Rayner et al.2003) and downloaded from (last access: 28 May 2020). (b) SST in free runs, (c) anomaly of SST from free runs, (d) vertically integrated primary production in free runs, (e) anomaly of vertically integrated primary production from free runs, (f) air–sea CO2 flux in free runs, (g) anomaly of air–sea CO2 flux from free runs.


ENSO variability in SST is well reproduced in LOW_FREE (Fig. 8b), aligning with ENSO events seen in the observed Niño 3.4 SST index (Fig. 8a) as calculated from HadISST1 data (Rayner et al.2003) and downloaded from (last access: 28 May 2020). HIGH_FREE shows very similar variability, but with slightly higher SST than LOW_FREE. In the first few years of LOW_SST, the assimilation acted to reduce SST compared to LOW_FREE (Fig. 8c), enhancing the prolonged La Niña (negative Niño 3.4 SST index) conditions of the period. The assimilation also served to enhance the El Niño (positive Niño 3.4 SST index) of 2009/10 but otherwise largely just modulated seasonal variability of SST rather than interannual variability. In HIGH_SST there was a similar impact on variability, but the anomaly from HIGH_FREE is offset in magnitude from that between LOW_SST and LOW_FREE, with the assimilation consistently cooling the model.

Very low PP is seen in LOW_FREE (Fig. 8d) at the beginning of the time series, related to the major El Niño event of 1997/98. Much more limited interannual variability is seen through the rest of the period, but with slightly reduced PP during the 2002/03 and 2009/10 El Niño events. Variability in HIGH_FREE is very similar to that in LOW_FREE but slightly offset in magnitude, as with SST. Assimilating SST individually had a limited impact on PP (Fig. 8e), while assimilating OC individually served to substantially reduce PP and impact seasonal variability. In LOW_OC_SST_SIC the SST assimilation made more difference than in LOW_SST, including changing interannual variability during the 1998–2001 La Niña conditions. The difference between LOW_OC_SST_SIC and HIGH_OC_SST_SIC is frequently greater than the combined difference between LOW_SST and HIGH_SST and between LOW_OC and HIGH_OC.

In air–sea CO2 flux a clear ENSO signal is seen in LOW_FREE (Fig. 8f), similar to that in SST. HIGH_FREE displays the same variability but with a clear offset. The smaller offsets in SST and PP may contribute to this, but it is most likely caused by differences in the initialisation of DIC and alkalinity (Lebehot et al.2019). Assimilation of OC data had little impact (Fig. 8g), while assimilation of SST had an impact on the seasonal cycle and slightly reduced air–sea CO2 flux anomalies during La Niña conditions. SST assimilation also served to increase the differences between LOW_FREE and HIGH_FREE.

5.6 Phytoplankton and air–sea heat flux

One of the most dramatic and important features of the marine ecosystem is the spring bloom, and interannual variability in this can have wide-ranging impacts from carbon storage to fish stocks. Debate continues as to the exact mechanism which causes the bloom to occur (Behrenfeld and Boss2014, 2018), but some studies have suggested a direct link between the timing of the annual increase in phytoplankton and the timing of the net air–sea heat fluxes switching from negative to positive (Taylor and Ferrari2011; Smyth et al.2014). Other studies have reached contrasting (Mahadevan et al.2012) or mixed (Brody et al.2013) conclusions. This may in part be due to some studies looking at chlorophyll concentration and others at phytoplankton biomass (Westberry et al.2016; Behrenfeld and Boss2018). The relationship between phytoplankton and net air–sea heat flux at other stages of the seasonal cycle also remains an open question.

To explore this relationship over a long model time series and throughout the seasonal cycle, as well as the impact on this of assimilating SST and OC data, maps of the correlation between surface chlorophyll concentration and net air–sea heat flux and between surface phytoplankton biomass and net air–sea heat flux are plotted in Fig. 9 for LOW_FREE, LOW_OC, LOW_SST, and LOW_OC_SST_SIC calculated from the daily model outputs for 1998–2010.

In LOW_FREE (Fig. 9a, b), there is a moderate positive correlation between chlorophyll and net air–sea heat flux in the subpolar North Atlantic and North Pacific, the Southern Ocean, and patches in the tropics. Elsewhere, there is generally a small to moderate negative correlation. Between phytoplankton biomass and net air–sea heat flux, there is a moderate to strong positive correlation almost everywhere, except for low correlation in the subtropical gyres and Indian Ocean.

Figure 9Maps of correlation between surface chlorophyll and net air–sea heat flux (left column) and between surface phytoplankton biomass and net air–sea heat flux (right column), covering 1998–2010 for a selection of runs.

In LOW_OC (Fig. 9c, d), the patterns of correlation between chlorophyll and net air–sea heat flux become more coherent at low latitudes, with positive correlation in the eastern tropical Pacific and central Atlantic and negative correlation in the subtropical gyres, western tropical Pacific, and Indian Ocean. These patterns resemble patterns of mean chlorophyll concentration (not shown). The correlation between phytoplankton biomass and net air–sea heat flux is weakened globally, with negative correlations in the subtropical gyres, becoming more like the correlation between chlorophyll and net air–sea heat flux.

In LOW_SST (Fig. 9e, f), both the chlorophyll and phytoplankton biomass correlations with net air–sea heat flux remain largely unaltered from LOW_FREE, though they are slightly elevated in equatorial regions. Correlations in LOW_OC_SST_SIC (Fig. 9g, h) largely resemble those in LOW_OC, with some small regional differences.

The different patterns seen in LOW_FREE for chlorophyll and phytoplankton biomass suggest that seasonal variations in the carbon-to-chlorophyll ratio play an important role and that studies based solely on chlorophyll data may not provide a full understanding of the underlying processes. This would imply that models, which unlike observations can provide a year-round gap-free representation of relevant variables, should be able to provide a valuable contribution to such studies, with data assimilation able to constrain them to match available observations. However, assimilating OC data weakened the correlation between phytoplankton biomass and net air–sea heat flux in the model such that it became more like the correlation between chlorophyll and net air–sea heat flux. This may have been a realistic response, or it may have been an artefact of the data assimilation method. In the scheme used, when chlorophyll derived from OC is assimilated, the phytoplankton biomass is updated so as to maintain the existing carbon-to-chlorophyll ratio in the model. In essence, the assimilation assumes the carbon-to-chlorophyll ratio to be correct and the phytoplankton biomass to be in error. However, it could be the carbon-to-chlorophyll ratio that is in error or likely a mixture of both. Without widespread observations of phytoplankton biomass, this is difficult to assess and therefore to know how to interpret the results. But if, for instance, a clear relationship could be determined between phytoplankton biomass and net air–sea heat flux or another property, then the assimilation could be programmed to update the carbon-to-chlorophyll ratio so as to maintain or enhance this relationship in the model.

6 Summary

A series of experiments was performed to assess the multivariate consistency, from a data assimilation perspective, of ESA CCI satellite observation products for ocean colour (OC), sea surface temperature (SST), sea level anomaly (SLA), and sea ice concentration (SIC). The products were assimilated, individually and in combination, into a global physical–biogeochemical ocean model to create a set of 13-year reanalyses at 1 resolution and 3-year reanalyses at 1∕4 resolution. These have been assessed through a series of case studies examining the assimilation increments, frontal features, sea ice extent and phytoplankton blooms, carbon cycle validation, and the relationship between phytoplankton variability and air–sea heat fluxes.

In the assessment performed, the observation products were found to be consistent in terms of their representation of spatial features, such as fronts and eddies in the Agulhas Current and Gulf Stream regions, and SST, OC, and SIC around the retreating ice edge in the Arctic. This gives confidence that the ESA CCI products can be used in synergy to gain insights into the Earth system. In these cases, this consistency transferred through the data assimilation, resulting in an improved reanalysis when the satellite products were assimilated in combination rather than just individually.

In other cases, assimilating particular variables served to degrade rather than improve, the representation of other non-assimilated variables in the reanalysis. This was found to be due to issues with the model and data assimilation system, rather than any lack of consistency between the observations being assimilated. Much of this was down to known issues with physical data assimilation causing spurious vertical mixing, which is not unique to the system used in this study (While et al.2010; Raghukumar et al.2015; Park et al.2018). But it also revealed complex interactions between the model and assimilation, with the assimilation of individual variables improving some non-assimilated variables while degrading others and correcting some compensating errors while introducing others. The experiments performed in this study highlight ways in which the model and assimilation could each be developed accordingly, but there does not appear to be any one simple fix. Addressing known model biases in variables such as SST and chlorophyll should reduce such issues. Impacts on non-assimilated variables could also be counteracted by assimilating additional observations such as nutrients (Yu et al.2018), but these cannot be observed from space and so observational coverage remains insufficient. The biogeochemical model could potentially benefit from spatially varying parameterisations, with parameter values updated in a combined state–parameter estimation (e.g. Simon et al.2015).

Conclusions about model and assimilation performance, as well as consistency and variability, apply similarly to both the 1 and 1∕4 configurations of the model. The higher-resolution model was better able to simulate surface fCO2 with and without data assimilation. This may be due to improved representation of processes in the 1∕4 configuration or may reflect differences in the initialisation of DIC and alkalinity fields, which model fCO2 has been shown to be sensitive to (Lebehot et al.2019). The two resolutions show comparable temporal variability, with data assimilation having a similar impact. It is likely that conclusions about multivariate consistency are broadly generalisable to other resolutions and potentially regional models, though as all models and configurations have their own particular properties and biases, exact results may vary.

Previous studies have suggested a direct correlation between the timing of the initiation of the spring bloom and that of the annual switch from negative to positive air–sea heat fluxes (Taylor and Ferrari2011; Smyth et al.2014). Other studies have reached contrasting (Mahadevan et al.2012) or mixed (Brody et al.2013) conclusions. This may in part be due to some studies looking at chlorophyll concentration and others at phytoplankton biomass (Westberry et al.2016; Behrenfeld and Boss2018). The reanalyses produced in this study provided an opportunity to look at this relationship in a long model time series and the impact of data assimilation. In the free-running model, there was a strong positive correlation between phytoplankton biomass and net air–sea heat flux across much of the ocean, whereas for chlorophyll concentration the correlation with net air–sea heat flux was weaker and often negative at low latitudes. This suggests that seasonal variations in the carbon-to-chlorophyll ratio play an important role and that studies of phytoplankton bloom initiation based solely on chlorophyll data may not provide a full understanding of the underlying processes. However, the assimilation of OC data weakened the correlation between phytoplankton biomass and net air–sea heat flux. In the absence of widespread in situ observations of phytoplankton biomass, it is difficult to assess if this response was realistic or due to a deficiency in the assimilation methodology. This serves to emphasise the importance of studying multivariate relationships and considering both observational and model data.

Data availability

The nature of the 4D data generated in running the model experiments requires a large tape storage facility. These data are in excess of 100 TB. However, the data can be made available upon request from the author.

Competing interests

The author declares that there is no conflict of interest.


Thank you to Jenny Machinford, Susan Kay, and Matt Martin for useful discussions and comments on the draft paper, as well as to the two anonymous reviewers for their helpful comments in the Ocean Science discussion forum. Surface Ocean CO2 Atlas (SOCAT) data were obtained from (last access: 16 July 2020). SOCAT is an international effort endorsed by the International Ocean Carbon Coordination Project (IOCCP), the Surface Ocean Lower Atmosphere Study (SOLAS), and the Integrated Marine Biosphere Research (IMBeR) programme to deliver a uniformly quality-controlled surface ocean CO2 database. The many researchers and funding agencies responsible for the collection of data and quality control are thanked for their contributions to SOCAT.

Financial support

This research has been supported by the European Space Agency (ESA) through the Climate Modelling User Group (CMUG) project of the ESA Climate Change Initiative (CCI).

Review statement

This paper was edited by Anna Rubio and reviewed by two anonymous referees.


Ablain, M., Cazenave, A., Larnicol, G., Balmaseda, M., Cipollini, P., Faugère, Y., Fernandes, M. J., Henry, O., Johannessen, J. A., Knudsen, P., Andersen, O., Legeais, J., Meyssignac, B., Picot, N., Roca, M., Rudenko, S., Scharffenberg, M. G., Stammer, D., Timms, G., and Benveniste, J.: Improved sea level record over the satellite altimetry era (1993–2010) from the Climate Change Initiative project, Ocean Sci., 11, 67–82,, 2015. a

Andersen, O. B.: The DTU10 Gravity field and Mean sea surface, Second international symposium of the gravity field of the Earth (IGFS2), Fairbanks, Alaska, available at: (last access: 8 February 2016), 2010. a, b

Andersen, O. B. and Knudsen, P.: DNSC08 mean sea surface and mean dynamic topography models, J. Geophys. Res., 114, C11001,, 2009. a, b

Arrigo, K. R., Perovich, D. K., Pickart, R. S., Brown, Z. W., van Dijken, G. L., Lowry, K. E., Mills, M. M., Palmer, M. A., Balch, W. M., Bahr, F., Bates, N. R., Benitez-Nelson, C., Bowler, B., Brownlee, E., Ehn, J. K., Frey, K. E., Garley, R., Laney, S. R., Lubelczyk, L., Mathis, J., Matsuoka, A., Mitchell, B. G., Moore, G. W. K., Ortega-Retuerta, E., Pal, S., Polashenski, C. M., Reynolds, R. A., Schieber, B., Sosik, H. M., Stephens, M., and Swift, J. H.: Massive Phytoplankton Blooms Under Arctic Sea Ice, Science, 336, 1408–1408,, 2012. a

Bakker, D. C. E., Pfeil, B., Smith, K., Hankin, S., Olsen, A., Alin, S. R., Cosca, C., Harasawa, S., Kozyr, A., Nojiri, Y., O'Brien, K. M., Schuster, U., Telszewski, M., Tilbrook, B., Wada, C., Akl, J., Barbero, L., Bates, N. R., Boutin, J., Bozec, Y., Cai, W.-J., Castle, R. D., Chavez, F. P., Chen, L., Chierici, M., Currie, K., de Baar, H. J. W., Evans, W., Feely, R. A., Fransson, A., Gao, Z., Hales, B., Hardman-Mountford, N. J., Hoppema, M., Huang, W.-J., Hunt, C. W., Huss, B., Ichikawa, T., Johannessen, T., Jones, E. M., Jones, S. D., Jutterström, S., Kitidis, V., Körtzinger, A., Landschützer, P., Lauvset, S. K., Lefèvre, N., Manke, A. B., Mathis, J. T., Merlivat, L., Metzl, N., Murata, A., Newberger, T., Omar, A. M., Ono, T., Park, G.-H., Paterson, K., Pierrot, D., Ríos, A. F., Sabine, C. L., Saito, S., Salisbury, J., Sarma, V. V. S. S., Schlitzer, R., Sieger, R., Skjelvan, I., Steinhoff, T., Sullivan, K. F., Sun, H., Sutton, A. J., Suzuki, T., Sweeney, C., Takahashi, T., Tjiputra, J., Tsurushima, N., van Heuven, S. M. A. C., Vandemark, D., Vlahos, P., Wallace, D. W. R., Wanninkhof, R., and Watson, A. J.: An update to the Surface Ocean CO2 Atlas (SOCAT version 2), Earth Syst. Sci. Data, 6, 69–90,, 2014. a, b

Balmaseda, M., Hernandez, F., Storto, A., Palmer, M., Alves, O., Shi, L., Smith, G., Toyoda, T., Valdivieso, M., Barnier, B., Behringer, D., Boyer, T., Chang, Y.-S., Chepurin, G., Ferry, N., Forget, G., Fujii, Y., Good, S., Guinehut, S., Haines, K., Ishikawa, Y., Keeley, S., Köhl, A., Lee, T., Martin, M., Masina, S., Masuda, S., Meyssignac, B., Mogensen, K., Parent, L., Peterson, K., Tang, Y., Yin, Y., Vernieres, G., Wang, X., Waters, J., Wedd, R., Wang, O., Xue, Y., Chevallier, M., Lemieux, J.-F., Dupont, F., Kuragano, T., Kamachi, M., Awaji, T., Caltabiano, A., Wilmer-Becker, K., and Gaillard, F.: The Ocean Reanalyses Intercomparison Project (ORA-IP), J. Oper. Oceanogr., 8, s80–s97,, 2015. a

Behrenfeld, M. J. and Boss, E. S.: Resurrecting the Ecological Underpinnings of Ocean Plankton Blooms, Annu. Rev. Mar. Sci., 6, 167–194,, 2014. a

Behrenfeld, M. J. and Boss, E. S.: Student's tutorial on bloom hypotheses in the context of phytoplankton annual cycles, Glob. Change Biol., 24, 55–77,, 2018. a, b, c

Bélanger, S., Ehn, J. K., and Babin, M.: Impact of sea ice on the retrieval of water-leaving reflectance, chlorophyll a concentration and inherent optical properties from satellite ocean color data, Remote Sens. Environ., 111, 51–68,, 2007. a

Blockley, E. W., Martin, M. J., McLaren, A. J., Ryan, A. G., Waters, J., Lea, D. J., Mirouze, I., Peterson, K. A., Sellar, A., and Storkey, D.: Recent development of the Met Office operational ocean forecasting system: an overview and assessment of the new Global FOAM forecasts, Geosci. Model Dev., 7, 2613–2638,, 2014. a, b, c, d, e, f, g, h, i

Brody, S. R., Lozier, M. S., and Dunne, J. P.: A comparison of methods to determine phytoplankton bloom initiation, J. Geophys. Res.- Oceans, 118, 2345–2357,, 2013. a, b

Bulgin, C., Embury, O., Corlett, G., and Merchant, C.: Independent uncertainty estimates for coefficient based sea surface temperature retrieval from the Along-Track Scanning Radiometer instruments, Remote Sens. Environ., 178, 213–222,, 2016a. a

Bulgin, C., Embury, O., and Merchant, C.: Sampling uncertainty in gridded sea surface temperature products and Advanced Very High Resolution Radiometer (AVHRR) Global Area Coverage (GAC) data, Remote Sens. Environ., 177, 287–294,, 2016b. a

Ciavatta, S., Kay, S., Saux-Picart, S., Butenschön, M., and Allen, J. I.: Decadal reanalysis of biogeochemical indicators and fluxes in the North West European shelf-sea ecosystem, J. Geophys. Res.-Oceans, 121, 1824–1845,, 2016. a

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kâllberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597,, 2011. a

Fennel, K., Gehlen, M., Brasseur, P., Brown, C. W., Ciavatta, S., Cossarini, G., Crise, A., Edwards, C. A., Ford, D., Friedrichs, M. A. M., Gregoire, M., Jones, E., Kim, H.-C., Lamouroux, J., Murtugudde, R., Perruche, C., and the GODAE OceanView Marine Ecosystem Analysis and Prediction Task Team: Advancing Marine Biogeochemical and Ecosystem Reanalyses and Forecasts as Tools for Monitoring and Managing Ecosystem Health, Frontiers in Marine Science, 6, 89,, 2019. a

Ford, D. and Barciela, R.: Global marine biogeochemical reanalyses assimilating two different sets of merged ocean colour products, Remote Sens. Environ., 203, 40–54,, 2017. a, b, c, d, e, f, g, h, i, j, k

Ford, D. A., Edwards, K. P., Lea, D., Barciela, R. M., Martin, M. J., and Demaria, J.: Assimilating GlobColour ocean colour data into a pre-operational physical-biogeochemical model, Ocean Sci., 8, 751–771,, 2012. a, b, c

GCOS: Systematic Observation Requirements for Satellite-Based Data Products for Climate, available at: (last access: 30 August 2019), 2011. a

Gent, P. R. and McWilliams, J. C.: Isopycnal Mixing in Ocean Circulation Models, J. Phys. Oceanogr., 20, 150–155,<0150:IMIOCM>2.0.CO;2, 1990. a

Gilleland, E., Ahijevych, D., Brown, B. G., Casati, B., and Ebert, E. E.: Intercomparison of Spatial Forecast Verification Methods, Weather Forecast., 24, 1416–1430,, 2009. a

Good, S. A., Martin, M. J., and Rayner, N. A.: EN4: Quality controlled ocean temperature and salinity profiles and monthly objective analyses with uncertainty estimates, J. Geophys. Res.-Oceans, 118, 6704–6716,, 2013. a, b

Gouretski, V. and Reseghetti, F.: On depth and temperature biases in bathythermograph data: Development of a new correction scheme based on analysis of a global ocean database, Deep-Sea Res. Pt. I, 57, 812–833,, 2010. a, b

Hemmings, J. C. P., Barciela, R. M., and Bell, M. J.: Ocean color data assimilation with material conservation for improving model estimates of air-sea CO2 flux, J. Mar. Res., 66, 87–126,, 2008. a, b, c

Hunke, E. C. and Lipscomb, W. H.: CICE: the Los Alamos sea ice model documentation and software users' manual, Version 4.1, LA-CC-06-012, Los Alamos National Laboratory, NM, 2010. a

Jackson, L. C., Peterson, K. A., Roberts, C. D., and Wood, R. A.: Recent slowing of Atlantic overturning circulation as a recovery from earlier strengthening, Nat. Geosci., 9, 518–522,, 2016. a, b, c, d

Key, R. M., Kozyr, A., Sabine, C. L., Lee, K., Wanninkhof, R., Bullister, J. L., Feely, R. A., Millero, F. J., Mordy, C., and Peng, T.-H.: A global ocean carbon climatology: Results from Global Data Analysis Project (GLODAP), Global Biogeochem. Cy., 18, GB4031,, 2004. a

King, R. R., Lea, D. J., Martin, M. J., Mirouze, I., and Heming, J.: The impact of Argo observations in a global weakly coupled ocean–atmosphere data assimilation and short-range prediction system, Q. J. Roy. Meteor. Soc., 146, 401–414,, 2020. a

Koch-Larrouy, A., Madec, G., Blanke, B., and Molcard, R.: Water mass transformation along the Indonesian throughflow in an OGCM, Ocean Dynam., 58, 289–309,, 2008. a

Lea, D. J., Drecourt, J.-P., Haines, K., and Martin, M. J.: Ocean altimeter assimilation with observational- and model-bias correction, Q. J. Roy. Meteor. Soc., 134, 1761–1774,, 2008. a, b

Lea, D. J., Martin, M. J., and Oke, P. R.: Demonstrating the complementarity of observations in an operational ocean forecasting system, Q. J. Roy. Meteor. Soc., 140, 2037–2049,, 2014. a, b, c, d, e

Lebehot, A. D., Halloran, P. R., Watson, A. J., McNeall, D., Ford, D. A., Landschützer, P., Lauvset, S. K., and Schuster, U.: Reconciling Observation and Model Trends in North Atlantic Surface CO2, Global Biogeochem. Cy., 33, 1204–1222,, 2019. a, b

Le Quéré, C., Andrew, R. M., Friedlingstein, P., Sitch, S., Hauck, J., Pongratz, J., Pickers, P. A., Korsbakken, J. I., Peters, G. P., Canadell, J. G., Arneth, A., Arora, V. K., Barbero, L., Bastos, A., Bopp, L., Chevallier, F., Chini, L. P., Ciais, P., Doney, S. C., Gkritzalis, T., Goll, D. S., Harris, I., Haverd, V., Hoffman, F. M., Hoppema, M., Houghton, R. A., Hurtt, G., Ilyina, T., Jain, A. K., Johannessen, T., Jones, C. D., Kato, E., Keeling, R. F., Goldewijk, K. K., Landschützer, P., Lefèvre, N., Lienert, S., Liu, Z., Lombardozzi, D., Metzl, N., Munro, D. R., Nabel, J. E. M. S., Nakaoka, S., Neill, C., Olsen, A., Ono, T., Patra, P., Peregon, A., Peters, W., Peylin, P., Pfeil, B., Pierrot, D., Poulter, B., Rehder, G., Resplandy, L., Robertson, E., Rocher, M., Rödenbeck, C., Schuster, U., Schwinger, J., Séférian, R., Skjelvan, I., Steinhoff, T., Sutton, A., Tans, P. P., Tian, H., Tilbrook, B., Tubiello, F. N., van der Laan-Luijkx, I. T., van der Werf, G. R., Viovy, N., Walker, A. P., Wiltshire, A. J., Wright, R., Zaehle, S., and Zheng, B.: Global Carbon Budget 2018, Earth Syst. Sci. Data, 10, 2141–2194,, 2018. a

Machu, E., Biastoch, A., Oschlies, A., Kawamiya, M., Lutjeharms, J., and Garçon, V.: Phytoplankton distribution in the Agulhas system from a coupled physical–biological model, Deep-Sea Res. Pt. I, 52, 1300–1318,, 2005. a

MacLachlan, C., Arribas, A., Peterson, K. A., Maidens, A., Fereday, D., Scaife, A. A., Gordon, M., Vellinga, M., Williams, A., Comer, R. E., Camp, J., Xavier, P., and Madec, G.: Global Seasonal forecast system version 5 (GloSea5): a high-resolution seasonal forecast system, Q. J. Roy. Meteor. Soc., 141, 1072–1084,, 2015. a

Madec, G.: NEMO ocean engine, Note du Pole de modélisation, Institut Pierre-Simon Laplace (IPSL), France, No 27 ISSN No, 1288–1619, 2008. a

Mahadevan, A., D'Asaro, E., Lee, C., and Perry, M. J.: Eddy-Driven Stratification Initiates North Atlantic Spring Phytoplankton Blooms, Science, 337, 54–58,, 2012. a, b

Martin, M., Hines, A., and Bell, M.: Data assimilation in the FOAM operational short-range ocean forecasting system: A description of the scheme and its impact, Q. J. Roy. Meteor. Soc., 133, 981–995,, 2007. a

Martin, M. J.: Suitability of satellite sea surface salinity data for use in assessing and correcting ocean forecasts, Remote Sens. Environ., 180, 305–319,, 2016. a

Megann, A., Storkey, D., Aksenov, Y., Alderson, S., Calvert, D., Graham, T., Hyder, P., Siddorn, J., and Sinha, B.: GO5.0: the joint NERC–Met Office NEMO global ocean model for use in coupled and forced applications, Geosci. Model Dev., 7, 1069–1092,, 2014. a, b

Merchant, C. J., Embury, O., Roberts-Jones, J., Fiedler, E., Bulgin, C. E., Corlett, G. K., Good, S., McLaren, A., Rayner, N., Morak-Bozzo, S., and Donlon, C.: Sea surface temperature datasets for climate applications from Phase 1 of the European Space Agency Climate Change Initiative (SST CCI), Geosci. Data J., 1, 179–191,, 2014. a

Mirouze, I., Blockley, E. W., Lea, D. J., Martin, M. J., and Bell, M. J.: A multiple length scale correlation operator for ocean data assimilation, Tellus A, 68, 29744,, 2016. a, b

Moore, T. S., Campbell, J. W., and Dowell, M. D.: A class-based approach to characterizing and mapping the uncertainty of the MODIS ocean chlorophyll product, Remote Sens. Environ., 113, 2424–2430,, 2009. a

NRC: Climate data records from environmental satellites, National Academy of Sciences, Interim Report, available at: (last access: 30 August 2019), 2004. a

Palmer, J. and Totterdell, I.: Production and export in a global ocean ecosystem model, Deep-Sea Res. Pt. I., 48, 1169–1198,, 2001. a

Park, J.-Y., Stock, C. A., Yang, X., Dunne, J. P., Rosati, A., John, J., and Zhang, S.: Modeling Global Ocean Biogeochemistry With Physical Data Assimilation: A Pragmatic Solution to the Equatorial Instability, J. Adv. Model. Earth Sy., 10, 891–906,, 2018. a, b, c

Parkinson, C. L. and Cavalieri, D. J.: Arctic sea ice variability and trends, 1979–2006, J. Geophys. Res., 113, C07003,, 2008. a

Perrette, M., Yool, A., Quartly, G. D., and Popova, E. E.: Near-ubiquity of ice-edge blooms in the Arctic, Biogeosciences, 8, 515–524,, 2011. a

Plummer, S., Lecomte, P., and Doherty, M.: The ESA Climate Change Initiative (CCI): A European contribution to the generation of the Global Climate Observing System, Remote Sens. Environ., 203, 2–8,, 2017. a

Rae, J. G. L., Hewitt, H. T., Keen, A. B., Ridley, J. K., West, A. E., Harris, C. M., Hunke, E. C., and Walters, D. N.: Development of the Global Sea Ice 6.0 CICE configuration for the Met Office Global Coupled model, Geosci. Model Dev., 8, 2221–2230,, 2015. a

Raghukumar, K., Edwards, C. A., Goebel, N. L., Broquet, G., Veneziani, M., Moore, A. M., and Zehr, J. P.: Impact of assimilating physical oceanographic data on modeled ecosystem dynamics in the California Current System, Prog. Oceanogr., 138, 546–558,, 2015. a, b

Rayner, N. A., Parker, D. E., Horton, E. B., Folland, C. K., Alexander, L. V., Rowell, D. P., Kent, E. C., and Kaplan, A.: Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century, J. Geophys. Res., 108, 4407,, 2003. a, b

Roberts-Jones, J., Fiedler, E. K., and Martin, M. J.: Daily, Global, High-Resolution SST and Sea Ice Reanalysis for 1985–2007 Using the OSTIA System, J. Climate, 25, 6215–6232,, 2012. a

Sathyendranath, S., Brewin, R. J., Jackson, T., Mélin, F., and Platt, T.: Ocean-colour products for climate-change studies: What are their ideal characteristics?, Remote Sens. Environ., 203, 125–138,, 2017. a

Simon, E., Samuelsen, A., Bertino, L., and Mouysset, S.: Experiences in multiyear combined state–parameter estimation with an ecosystem model of the North Atlantic and Arctic Oceans using the Ensemble Kalman Filter, J. Marine Syst., 152, 1–17,, 2015. a

Simpson, J. H. and Sharples, J.: Introduction to the physical and biological oceanography of shelf seas, Cambridge University Press, Cambridge, 2012. a

Smyth, T. J., Allen, I., Atkinson, A., Bruun, J. T., Harmer, R. A., Pingree, R. D., Widdicombe, C. E., and Somerfield, P. J.: Ocean net heat flux influences seasonal to interannual patterns of plankton abundance, PLoS One, 9, e98709,, 2014. a, b

Storkey, D., Blockley, E. W., Furner, R., Guiavarc'h, C., Lea, D., Martin, M. J., Barciela, R. M., Hines, A., Hyder, P., and Siddorn, J. R.: Forecasting the ocean state using NEMO:The new FOAM system, J. Oper. Oceanogr., 3, 3–15,, 2010. a

Storto, A., Alvera-Azcárate, A., Balmaseda, M. A., Barth, A., Chevallier, M., Counillon, F., Domingues, C. M., Drevillon, M., Drillet, Y., Forget, G., Garric, G., Haines, K., Hernandez, F., Iovino, D., Jackson, L. C., Lellouche, J.-M., Masina, S., Mayer, M., Oke, P. R., Penny, S. G., Peterson, K. A., Yang, C., and Zuo, H.: Ocean Reanalyses: Recent Advances and Unsolved Challenges, Frontiers in Marine Science, 6, 418,, 2019. a, b

Taylor, J. R. and Ferrari, R.: Shutdown of turbulent convection as a new criterion for the onset of spring phytoplankton blooms, Limnol. Oceanogr., 56, 2293–2307,, 2011. a, b

Taylor, K. E.: Summarizing multiple aspects of model performance in a single diagram, J. Geophys. Res., 106, 7183–7192,, 2001. a

Tonboe, R. T., Eastwood, S., Lavergne, T., Sørensen, A. M., Rathmann, N., Dybkjær, G., Pedersen, L. T., Høyer, J. L., and Kern, S.: The EUMETSAT sea ice concentration climate data record, The Cryosphere, 10, 2275–2290,, 2016.  a

Trenberth, K. E.: The Definition of El Niño, B. Am. Meteorol. Soc., 78, 2771–2778,<2771:TDOENO>2.0.CO;2, 1997. a, b

Wang, M. and Shi, W.: Detection of Ice and Mixed Ice–Water Pixels for MODIS Ocean Color Data Processing, IEEE T. Geosci. Remote, 47, 2510–2518,, 2009. a

Waters, J., Lea, D. J., Martin, M. J., Mirouze, I., Weaver, A., and While, J.: Implementing a variational data assimilation system in an operational 1∕4 degree global ocean model, Q. J. Roy. Meteor. Soc., 141, 333–349,, 2015. a, b, c, d

Westberry, T. K., Schultz, P., Behrenfeld, M. J., Dunne, J. P., Hiscock, M. R., Maritorena, S., Sarmiento, J. L., and Siegel, D. A.: Annual cycles of phytoplankton biomass in the subarctic Atlantic and Pacific Ocean, Global Biogeochemical Cy., 30, 175–190,, 2016. a, b

While, J., Haines, K., and Smith, G.: A nutrient increment method for reducing bias in global biogeochemical models, J. Geophys. Res., 115, C10036,, 2010. a, b, c

While, J., Totterdell, I., and Martin, M.: Assimilation of pCO2 data into a global coupled physical-biogeochemical ocean model, J. Geophys. Res., 117, C03037,, 2012. a, b

Woodruff, S. D., Worley, S. J., Lubker, S. J., Ji, Z., Eric Freeman, J., Berry, D. I., Brohan, P., Kent, E. C., Reynolds, R. W., Smith, S. R., and Wilkinson, C.: ICOADS Release 2.5: extensions and enhancements to the surface marine meteorological archive, Int. J. Climatol., 31, 951–967,, 2011. a

Yu, L., Fennel, K., Bertino, L., Gharamti, M. E., and Thompson, K. R.: Insights on multivariate updates of physical and biogeochemical ocean variables using an Ensemble Kalman Filter and an idealized model of upwelling, Ocean Model., 126, 13–28,, 2018. a

Short summary
Satellite observations of the ocean were combined with a numerical model to create simulations of the ocean state between 1998 and 2010. Relationships between physical and biogeochemical quantities were assessed to investigate whether observations of different variables are consistent in their representation of the Earth system. Good consistency was found. The results also highlighted ways in which the model could be improved and the respective impacts of using different observations.