Optimal adjustment of the atmospheric forcing parameters of ocean models using sea surface temperature data assimilation



Introduction
Near-surface variables from atmospheric reanalyses (air temperature and humidity, wind speed, downward radiation and precipitation) are commonly used to specify the surface boundary conditions of ocean general circulation models (hereafter OGCMs) required for operational forecasts, ocean reanalyses, or hindcast simulations of the recent ocean variability (the last 50 yr). However, these atmospheric variables are characterized by significant uncertainties at global scale, as shown in several studies, such as Milliff et al. (1999), Wang and McPhaden (2001), Smith et al. (2001) or Sun et al. (2003). For example, using two different databases (e.g. from numerical weather prediction centers like ECMWF or NCEP) to compute the mean ocean-atmosphere net heat flux can lead to discrepancies on the order of at least 10 W m−2, while the signal of global warming of the world ocean corresponds to a value of 0.5 W m−2 (Josey, 2011). It is therefore important to reduce the uncertainties in the atmospheric variables in order to obtain better agreement between the models and the real ocean. Sea surface temperature (SST) is more accurately observed from space than most near-surface atmospheric variables or air-sea fluxes assimilated in atmospheric models to construct the reanalyses. Although SST is used as a boundary condition in these atmospheric models, large errors remain in the produced surface atmospheric parameters because of uncertainties in the bulk formulae and radiative transfer models. In OGCMs, observed values of SST (intrinsically linked to air-sea exchanges) are not used in the surface forcing except when explicitly assimilated. In brief, models do not benefit, in their forcing, from one of the best observed ocean surface variables.
One approach to incorporate SST information into ocean simulations consists in assimilating observed SST products to correct the model state. This method can however lead to some inconsistency between the "assimilated" solution of the model and the "forced" one, and to the rejection of the information contained in the SST. Since the atmospheric forcing, and particularly the atmospheric variables, is an important source of surface errors, an alternative is to use SST data assimilation to constrain the surface atmospheric input variables from the atmospheric reanalyses. The aim is to correct the source of the errors and not their consequence. An improvement of the flux estimation is crucial to perform realistic ocean simulations, and can also help improve the atmospheric reanalyses, since their atmospheric state estimation could benefit from the link with the ocean dynamics via the ocean model. Moreover, this approach offers an alternative to corrections of the atmospheric forcing made on ad hoc grounds (e.g. Large and Yeager, 2004; Brodeau et al., 2010).
This idea has been explored in previous studies using two different types of data assimilation schemes. On the one hand, Stammer et al. (2004) used a four-dimensional variational data assimilation scheme, including the air-sea fluxes in the control vector. Over a ten-year period, they assimilated a large variety of oceanic observations, computed air-sea flux corrections, and carried out an important validation effort to compare the corrected fluxes to other independent estimates. This pioneering work was the first to demonstrate that it is possible to estimate the atmospheric forcing fields by ocean data assimilation in a realistic case, even if the authors also show that the forcing correction may sometimes compensate for other errors in the ocean model. On the other hand, flux corrections can also be computed using sequential assimilation methods. These methods are widely used in ocean data assimilation systems, and are easier to implement than their variational counterpart. Skachko et al. (2009) and Skandrani et al. (2009) explored the capability of such methods to estimate forcing parameter corrections from ocean observations. On the basis of a reduced-order Kalman filter, they carried out idealized experiments that differ in the parameters included in the control vector, in the construction of the synthetic observations to be assimilated, and in the construction of the initial condition. They showed that in an idealized case where errors are assumed to be entirely due to the forcing parameters, it is possible to implement a sequential data assimilation method to estimate objective corrections of these parameters. While Skachko et al. (2009) used temperature and salinity profiles that simulate the horizontal and temporal distribution of Argo floats in twin experiments, Skandrani et al. (2009) assimilated sea surface temperature and sea surface salinity extracted from a Mercator Ocean reanalysis (Ferry et al., 2010). The first study, by Skachko et al. (2009), showed that this procedure leads to accurate estimates of the parameters included in the control vector; following these first results, Skandrani et al. (2009), in a more realistic context, estimated forcing parameter corrections leading to reduced differences between the free simulation and the reanalysis. However, these two studies also suggest that the other sources of model error (due, for example, to the coarse resolution or to the initial condition) can lead to unrealistic forcing parameter corrections and must be considered very carefully. In particular, they point out the importance of a proper determination of the prior probability distribution of the forcing parameters and of their associated error covariance matrix to obtain a good parameter estimation. These results are thus an appropriate starting point for further developments of this approach, investigating its feasibility in a more realistic context, in which the sources of error are more diverse and require particular attention.
The purpose of this research is thus to constrain (within observation-based air-sea flux uncertainties) the surface forcing function of an ocean model (i.e. the surface atmospheric input variables from atmospheric reanalyses) by using a methodology based on advanced statistical assimilation methods (ensemble Kalman filtering) to take into account SST satellite observations. In other words, the objective of this work is to take advantage of an ocean model to correct near-surface atmospheric variables, and to ensure their consistency with the ocean surface dynamics. With respect to the studies briefly described above, the originality of this work is to be carried out over longer timescales, and with real SST observations assimilated in realistic global ocean model simulations.
In the present paper, we take one step further than the past idealized studies and estimate a set of corrections for the atmospheric input data from the ERA-Interim (ERAi hereafter, Dee et al., 2011) reanalysis for the period 1989-2007. Ocean modeling can be used to answer a number of particular questions, and depending on the chosen focus, the forcing optimization will be designed differently. Here we aim to address the problem of long-term trends and biases in the model, with the final objective of improving the mean state of the ocean in long-term hindcast simulations, without modifying its interannual or high-frequency variability. We use a sequential method based on the SEEK filter. For experiments of one-month duration, an observed monthly SST product (Hurrell et al., 2008) and SSS seasonal climatology data (Levitus et al., 1998) are assimilated to obtain monthly corrections of the atmospheric forcing parameters (air temperature and humidity, zonal and meridional wind speed, and downward radiation). The expected outcome of our experiment is thus a comprehensive set of estimated parameters for every month of the period between 1989 and 2007, so that they can be used later in a free model simulation.
We first describe in Sect. 2 the forcing function of the model and the Kalman filter methodology used as a basis for this work. Section 3 presents the principal adjustments necessary to fit our realistic context and the long-term focus of this study. Finally, a selection of results that illustrate the strengths and weaknesses of the method is presented in Sect. 4.

Ocean model and forcing by the atmosphere
The OGCM is the first key ingredient to estimate relevant parameter corrections, and to evaluate their validity and their impact in free (i.e. without data assimilation) model runs. Since the estimation problem requires a large number of model simulations (to perform the ensemble experiments), and since the goal is to explore the problem at a global scale, we chose to use a coarse-resolution OGCM: the 2° resolution configuration of NEMO (Madec, 2008), with 46 vertical levels. In previous studies, Skachko et al. (2009) and Skandrani et al. (2009) chose this coarse-resolution approach to set up their idealized experiments before considering applying it to higher-resolution configurations. Implementing this kind of methodology with high-resolution models would indeed be numerically too expensive to be considered as a first step.
Different methods exist to force an ocean-only model. Here, the model forcing function is computed following the methodology proposed by Large et al. (1997). All fluxes are calculated at every model grid point with classical bulk formulas, which take as input the near-surface atmospheric variables (air temperature and humidity, zonal and meridional wind speed and precipitation), the downward atmospheric fluxes (shortwave and longwave radiation) and the ocean surface variables calculated by the model (SST and surface current velocity).
The components of the forcing function are the freshwater flux (F_w), the momentum flux (τ), and the net heat flux (Q_net), which is computed as the sum of the shortwave solar radiation flux (Q_sw), the longwave radiation flux (Q_lw), the sensible heat flux (Q_sens) and the latent heat flux (Q_lat).
Air-sea flux estimation in the model requires the SST of the model (T_s), the atmospheric state, the downward radiation, and the precipitation. The atmospheric state and the downward radiation are involved in the flux computation as follows:

- The turbulent fluxes involve the knowledge of X_A = (T_a, q_a, U_10), the atmospheric state, with T_a being the air temperature, q_a the specific humidity and U_10 the relative wind speed at ten meters above the surface:

  Q_sens = ρ_a C_p C_H U_10 (T_a − T_s),
  Q_lat = ρ_a L_v C_E U_10 (q_a − q_sat(T_s)),
  τ = ρ_a C_D |U_10| U_10,

  where ρ_a is the air density, considered as a constant of 1.29 kg m−3, C_p the air specific heat (∼ 1000 J kg−1 K−1), L_v the latent heat of vaporization (2.26 × 10^6 J kg−1), and q_sat the saturation specific humidity.

- The radiative fluxes involve the knowledge of X_R = (rad_sw, rad_lw), the downward radiation, with rad_sw being the downward shortwave radiation and rad_lw the downward longwave radiation:

  Q_sw = (1 − α) rad_sw,
  Q_lw = ε (rad_lw − σ T_s^4),

  where α is the ocean surface albedo (∼ 0.066), −ε σ T_s^4 is the infrared radiation flux emitted by the ocean surface, with σ the Stefan-Boltzmann constant (5.67 × 10^−8 W m−2 K−4) and ε the seawater emissivity (0.98).

- The freshwater flux involves the knowledge of the precipitation (precip):

  F_w = precip − E + R,

  where E is the evaporation deduced from the latent heat flux, and R the continental contribution to the freshwater budget.
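As an illustration, the heat-flux components above can be sketched in a few lines of Python. This is a hedged sketch, not the NEMO implementation: the transfer coefficients C_H and C_E are taken as illustrative constants rather than computed with the stability-dependent LY04 parameterization, and `q_sat` uses the simple empirical form of Large and Yeager (2004).

```python
import numpy as np

# Constants as given in the text (illustrative values where hedged).
RHO_A = 1.29        # air density [kg m-3]
CP_A = 1000.0       # air specific heat [J kg-1 K-1]
LV = 2.26e6         # latent heat of vaporization [J kg-1]
SIGMA = 5.67e-8     # Stefan-Boltzmann constant [W m-2 K-4]
ALBEDO = 0.066      # ocean surface albedo
EPS = 0.98          # seawater emissivity
C_H = C_E = 1.2e-3  # illustrative constant transfer coefficients

def q_sat(t_s):
    """Saturation specific humidity over seawater (t_s in K),
    empirical form from Large and Yeager (2004)."""
    return 0.98 * 640380.0 / RHO_A * np.exp(-5107.4 / t_s)

def net_heat_flux(t_s, t_a, q_a, u10, rad_sw, rad_lw):
    """Q_net = Q_sw + Q_lw + Q_sens + Q_lat (positive downward)."""
    q_sw = (1.0 - ALBEDO) * rad_sw                      # absorbed solar
    q_lw = EPS * (rad_lw - SIGMA * t_s**4)              # net longwave
    q_sens = RHO_A * CP_A * C_H * u10 * (t_a - t_s)     # sensible
    q_lat = RHO_A * LV * C_E * u10 * (q_a - q_sat(t_s)) # latent
    return q_sw + q_lw + q_sens + q_lat
```

Such a sketch makes the error propagation explicit: a perturbation of any of the input variables (t_a, q_a, u10, rad_sw, rad_lw) translates directly into a net heat-flux perturbation.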
Near-surface variables (air temperature and humidity, wind speed, downward radiation and precipitation) from an atmospheric reanalysis (here ERAi), used to specify the surface boundary conditions of the model, are characterized by large uncertainties at global scale. Temperature and humidity are instantaneous forecasts at 2 m above the ocean surface, denoted t_2 and q_2, respectively, hereafter. Zonal and meridional wind speeds are instantaneous forecasts at 10 m above the surface, denoted u_10 and v_10. Atmospheric radiation and precipitation fluxes are 12 h integrated forecasts. ERAi outputs were available to us every 6 h. We thus propose here to estimate corrections of t_2, q_2, u_10, v_10, rad_sw, rad_lw and precip.
C_H, C_E and C_D are the bulk transfer coefficients for sensible heat, humidity and momentum, respectively. Several parameterizations exist to compute these coefficients. All of them present some uncertainties, but the one used in our ocean model NEMO is the one described by Large and Yeager (2004) (hereafter LY04). Although the bulk coefficients are also impacted by forcing modifications, we chose to focus here on corrections of the atmospheric variables, instead of following the approach of Skachko et al. (2009) and Skandrani et al. (2009) used in previous work, because, as shown in Fig. 1, the contribution of the atmospheric variable errors to the net heat-flux uncertainty is larger than the contribution of the bulk coefficient uncertainty (Brodeau, 2007).

Kalman filtering for parameter estimation
In the standard current state of the art of ocean data assimilation systems, sequential data assimilation methods are most often used to correct the model state (e.g. Drévillon et al., 2008). However, these traditional implementations do not actually correct the source of the errors but their consequence, and can lead to possible inconsistencies between the assimilated solution and the forcing. The method developed here proposes an alternative to this classical approach by estimating corrections of a source of model errors: the forcing parameters. Here we use the same methodology (based on a Kalman filter analysis) as developed by Skandrani et al. (2009) to estimate atmospheric forcing parameter corrections. With respect to the classical formulation of the Kalman filter (Kalman, 1960), the control vector, initially containing the model state, is extended with the forcing parameters that we want to estimate. The control vector becomes

x̃ = (x, p)^T,   (1)

with x the ocean state control vector containing the SST and the SSS, and p = (t_2, q_2, u_10, v_10, rad_sw, rad_lw, precip) the parameter vector that we aim to correct, where t_2 is the 2 m air temperature, q_2 the 2 m air humidity, u_10 the 10 m zonal wind speed, v_10 the 10 m meridional wind speed, rad_sw the downward shortwave radiation, rad_lw the downward longwave radiation, and precip the precipitation.

Fig. 1. Top: mean difference between net heat-flux climatologies (in W m−2, 1984-2000) estimated from different bulk algorithms (different transfer coefficient parametrizations): COARE3, described by Fairall et al. (2003), and LY04. These climatologies are computed with the same atmospheric forcing field and the same prescribed SST (from Brodeau, 2007). Bottom: mean difference between net heat-flux climatologies (in W m−2, 1989-2007) estimated with air temperature, air humidity and wind speed from different data sets (ERA-Interim and DFS4.3, as described in Brodeau et al., 2010). These climatologies only differ by the atmospheric parameters used and are computed with the LY04 bulk algorithm.
The background control vector xf is the model state from a simulation forced by the first-guess atmospheric parameters over the assimilation window, extended with these atmospheric parameters. We can then obtain the best estimate xa of the control vector by taking into account the available observations (the vector y):

xa = xf + K (y − Ĥ xf),   (2)

with xf the background simulation, Ĥ the observation operator, and K the Kalman gain given by

K = Pf Ĥ^T (Ĥ Pf Ĥ^T + R)^−1,   (3)

with R the observation error covariance matrix, and Pf the forecast error covariance matrix.
The formulation of the Kalman filter applied to an extended state vector requires the knowledge of the forecast error covariance matrix Pf in the augmented space, given by

Pf = M Pa M^T + Q,   (4)

with Pa the background error covariance matrix (or initialization error), M the model operator, and Q the model error covariance.
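The augmented-state analysis of Eqs. (1)-(3) can be demonstrated on a toy problem. In this hedged sketch (all names and numbers are illustrative), the forecast error covariance is estimated from an ensemble, and observing only the ocean state is enough to correct the forcing parameter through the cross-covariance block of Pf:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy augmented control vector: one "SST" state x and one forcing
# parameter p (e.g. a local t2 correction); names are illustrative.
n_ens = 200
p_true = 1.0                                 # unknown parameter error
p_ens = rng.normal(0.0, 1.0, n_ens)          # prior parameter perturbations
x_ens = 0.5 * p_ens + rng.normal(0.0, 0.05, n_ens)  # model response to p

Z = np.vstack([x_ens, p_ens])                # augmented ensemble (2, n_ens)
Zm = Z.mean(axis=1, keepdims=True)
A = Z - Zm
Pf = A @ A.T / (n_ens - 1)                   # forecast error covariance

H = np.array([[1.0, 0.0]])                   # only the ocean state is observed
R = np.array([[0.05**2]])                    # observation error covariance
y = np.array([[0.5 * p_true]])               # synthetic SST observation

K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain (Eq. 3)
za = Zm + K @ (y - H @ Zm)                       # analysis step (Eq. 2)
p_estimate = za[1, 0]                        # parameter correction
```

The parameter is corrected even though it is never observed directly, because the ensemble encodes how parameter perturbations map onto SST perturbations.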

The knowledge of the Pf matrix (and thus of Pa and Q) is crucial in statistical estimation approaches. As already pointed out by Skachko et al. (2009), the impact of the background error on the model results is even more important when the model state correction is not applied. In their more realistic study, Skandrani et al. (2009) also highlight the fact that making appropriate assumptions about the forecast error statistics becomes increasingly important as the problem becomes more realistic.
However, the objective here is different from these two previous studies, as it focuses on a mean long-term correction of the forcing parameters. The aim is not to correct the model state and improve the short-term forecast of the model, but to estimate long-term atmospheric parameter corrections to be used in another, independent free simulation. For this reason, we decided to use a different strategy and to carry out an independent ensemble experiment for each assimilation window, instead of running a classical chain of analysis cycles. This process simplifies the estimation of the forcing-error covariance matrix (described in Sect. 3). Moreover, under the assumption that all the forecast error is due to forcing error and that the forcing correction is ideal, the background error covariance matrix for the next assimilation window is assumed to be very small. In practice, if we consider that the initialization state of each experiment is close enough to the assimilated observations, we can then neglect the initial error covariance and set Pa to zero in the expression of Pf (Eq. 4).

The expression of the Pf matrix then becomes

Pf = Q.   (5)

In this approximation, the estimation of Pf involves only the knowledge of the model error Q generated by the forcing errors during the assimilation window. As this matrix must be representative of the forcing parameter errors only, we have to reduce all the other error sources in the ensemble experiments, to avoid compensation effects in the parameter estimation. In this context, the main difficulty is to preserve the robustness of the methodology previously developed. Contrary to Skachko et al. (2009) and Skandrani et al. (2009), the observation errors need to be taken into account: we know neither the true value of the forcing parameters nor the perfect initial condition that would ensure the efficiency of the assimilation experiments. We also have to distinguish between forcing errors and other potential errors in the system, so as to obtain corrections that are as little model-dependent as possible. Indeed, the solution depends on the model used, and it could be difficult to apply the same parameter corrections to other configurations where the nature of the model error is different. All these matters will be addressed to evaluate the relevance of the method and results in Sect. 4.1.
Finally, two additional refinements of the forecast error statistics will be used in the experiments. First, to avoid spurious long-range influence of the observations, the ensemble covariance matrix will be localized as described in Brankart et al. (2011), using a horizontal cut-off length scale of three grid points (i.e. about 600 km in longitude along the equator). Although somewhat arbitrary, this choice was retained after testing several possibilities: a length scale of 600 km is a good compromise, capturing the impact of a monthly perturbation of the forcing while not introducing artificial large-scale correlations. Second, to avoid excessive and non-physical parameter corrections, the forecast probability distribution will be assumed to be a truncated Gaussian distribution, as developed by Lauvernet et al. (2009), and as already used in Skandrani et al. (2009). In practice, all parameter corrections above 1.5 standard deviations will be truncated.
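Covariance localization of the kind described above is commonly implemented as a Schur (element-wise) product of the ensemble covariance with a compactly supported correlation function. The sketch below uses the Gaspari-Cohn fifth-order function on an illustrative 1-D grid; the half-width of three grid points is taken from the text, everything else (grid, ensemble size) is an assumption:

```python
import numpy as np

def gaspari_cohn(r):
    """Gaspari-Cohn fifth-order compactly supported correlation.
    r = separation / localization half-width; zero beyond r = 2."""
    r = np.abs(r)
    f = np.zeros_like(r)
    m1 = r <= 1.0
    m2 = (r > 1.0) & (r < 2.0)
    x = r[m1]
    f[m1] = 1 - 5/3*x**2 + 5/8*x**3 + 1/2*x**4 - 1/4*x**5
    x = r[m2]
    f[m2] = 4 - 5*x + 5/3*x**2 + 5/8*x**3 - 1/2*x**4 + 1/12*x**5 - 2/(3*x)
    return f

# Illustrative 1-D grid: taper ensemble covariances to zero beyond a
# few grid points, mimicking the local analysis described above.
n = 20
i = np.arange(n)
dist = np.abs(i[:, None] - i[None, :])   # separation in grid points
taper = gaspari_cohn(dist / 3.0)         # half-width of 3 grid points

rng = np.random.default_rng(1)
ens = rng.normal(size=(n, 50))
A = ens - ens.mean(axis=1, keepdims=True)
Pf = A @ A.T / 49                        # raw ensemble covariance
Pf_loc = taper * Pf                      # Schur-product localization
```

The Schur product leaves nearby covariances essentially intact while forcing distant, sampling-noise-dominated covariances to exactly zero.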

Challenge of a realistic application
The first challenge of a realistic application of this method is that available observations of the real ocean are sparse in space and time. We chose to assimilate monthly mean SST data, SST being one of the most accurately observed ocean variables, together with SSS climatological data to constrain the freshwater budget of the system and thus limit unrealistic impacts of the correction on the SSS. We use the Hurrell database (Hurrell et al., 2008), giving monthly mean SST between 1989 and 2007, and a climatology of monthly SSS for the same period from Levitus et al. (1998). Since the available information is now more limited than in the idealized cases, this application involves some crucial adjustments of the methodology (see Sect. 3).
The time resolution of the available data implies considering assimilation windows of one month, and thus monthly parameter corrections. However, we do not have access to the corresponding subsurface mean state to construct the ideal initial condition that would ensure the validity of the assumption of Pa equal to zero. For our application, this is the first adaptation to be made to the initial methodology.
With these observations, we compute corrections to the atmospheric parameters from the ERAi reanalysis available between 1989 and 2007. ERAi is the most recent reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). The atmospheric variables are available every 6 h for X_A and every day for X_R and the precipitation, at the same spatial resolution as the model, but we do not have access to the associated uncertainties. The estimation of the prior probability distribution of the forcing parameters is the second difficulty of a realistic experiment: the monthly uncertainties associated with the parameters have to be characterized. Finally, to run Monte Carlo experiments to estimate the matrix Q, the forcing error in the model has to be isolated to avoid potential compensation effects.
www.ocean-sci.net/9/867/2013/ Ocean Sci., 9, 867-883, 2013

The forecast error covariance matrix Pf reflects the impact of the forcing parameter errors on the model state. We estimate this matrix by running Monte Carlo (ensemble) experiments, as performed by Skachko et al. (2009) and Skandrani et al. (2009). For each month, we run 200 experiments from the same initial condition. These experiments differ only in the atmospheric parameters used to force the model.
Pf is then deduced from a statistical analysis of the ensemble of 200 forecast states. However, to be consistent with the prior assumptions needed to apply the filter methodology, the ensemble experiments have to be representative of the impact of the forcing parameter errors only. We thus need to take care of three constraints, which represent the most important methodological developments necessary for a realistic application and are described in the following sections:

- minimization of the model errors that are not due to the forcing function, by a robust diagnostic approach;

- minimization of the initial condition errors;

- computation of parameter perturbations representative of the forcing uncertainties.
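The ensemble procedure can be illustrated with a toy scalar model. This is a hedged sketch: `toy_model` is merely a stand-in for a one-month OGCM integration, and the point is only that, with a shared initial condition, the ensemble variance of the forecasts estimates the forcing-induced error Q of Eq. (5):

```python
import numpy as np

rng = np.random.default_rng(2)

def toy_model(x0, forcing, n_steps=30):
    """Illustrative stand-in for a one-month model run: the state
    relaxes toward the (perturbed) forcing value."""
    x = x0
    for _ in range(n_steps):
        x = x + 0.1 * (forcing - x)
    return x

# All members start from the SAME initial condition; they differ only
# in the forcing perturbation, so the ensemble spread reflects the
# forcing error alone (the assumption behind Pf = Q).
x0 = 15.0
sigma_forcing = 1.0
members = []
for _ in range(200):
    perturbed_forcing = 15.0 + rng.normal(0.0, sigma_forcing)
    members.append(toy_model(x0, perturbed_forcing))
members = np.array(members)

pf_variance = members.var(ddof=1)   # scalar analogue of Pf
```

If the members were started from different initial conditions, initial-condition error would contaminate `pf_variance`, which is exactly the compensation effect the text seeks to avoid.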

Robust diagnostic ensemble experiments
The ensemble of 200 model states has to be representative of the impact of forcing errors on the ocean dynamics at monthly timescale, and particularly at the surface, where the observations are assimilated. To ensure the consistency of this assumption, the other possible errors have to be minimized while performing the ensemble experiments. Indeed, the coarse-resolution model used here involves parameterizations and approximations to compensate for unresolved processes. It thus contains errors that are not attributable to the forcing, which lead in particular to a mispositioning of the water masses in strong advection regions such as western boundary currents. If this type of error is present in the ensemble runs, it will lead to an inconsistent estimation of Pf, and finally to unphysical parameters compensating for the non-forcing errors. As a consequence, the ensemble runs have to be as close as possible to the real ocean, and be representative of the forcing errors only.
We chose to constrain the model by using a robust diagnostic approach (Sarmiento and Bryan, 1982), consisting in a three-dimensional relaxation of the solution toward temperature and salinity climatologies (Levitus et al., 1998). Relaxation terms are added to the heat, mass and salt conservation equations and create artificial sources and sinks that bring the simulation back toward the corresponding climatologies. The relaxation strength is set through its time constant: a large value corresponds to a light relaxation, and a small one to a strong relaxation. In our case, we adjust the relaxation time constant to allow the model to react to the atmospheric forcing without getting too far from the climatology. Two domains are thus specified, where the relaxation characteristics differ: -The first 400 m from the surface are likely to be impacted by a forcing modification at monthly timescale. The time constant of the relaxation is one year in this domain, to constrain the large-scale features while allowing for a possible impact of the atmospheric fluxes on the dynamics.
-Below 400 m depth, the relaxation is strong, with a one-month time constant. The large-scale features are thus strongly constrained at depth, to ensure good consistency with the observed ocean and limit non-forcing error sources.
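The two-domain relaxation above amounts to adding a nudging term (T_clim − T)/τ to the tracer tendencies, with τ depending on depth. A minimal sketch, in which the grid, fields and function names are illustrative rather than NEMO code:

```python
import numpy as np

SECONDS_PER_DAY = 86400.0

def relaxation_tendency(temp, temp_clim, depth):
    """Return dT/dt [K s-1] due to the robust diagnostic relaxation:
    weak one-year time constant above 400 m, strong one-month
    constant below, as described in the text."""
    tau = np.where(depth <= 400.0,
                   365.0 * SECONDS_PER_DAY,  # light relaxation near surface
                   30.0 * SECONDS_PER_DAY)   # strong relaxation at depth
    return (temp_clim - temp) / tau

# Illustrative column: 5 levels, model vs. climatological temperature.
depth = np.array([10.0, 100.0, 300.0, 500.0, 1000.0])   # [m]
temp = np.array([20.0, 18.0, 12.0, 8.0, 4.0])           # model [deg C]
temp_clim = np.array([19.0, 17.5, 12.5, 7.0, 4.2])      # Levitus-like [deg C]
tend = relaxation_tendency(temp, temp_clim, depth)
```

For the same 1 °C departure from climatology, the deep levels are pulled back roughly twelve times faster than the near-surface levels, which is what leaves the surface free to respond to the perturbed forcing.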
The robust diagnostic approach is used here as a tool to produce a density field consistent with an incomplete set of observations. The purpose is to help the identification of a particular set of atmospheric parameters, not to produce a relevant description of the ocean circulation. In the constrained simulations, the large-scale water mass positioning in the model solution (e.g. the structure of the gyres) is more consistent with the observed ocean (figure not shown). The first use of the method was to evaluate the ocean response to a given atmospheric CO2 field (Sarmiento and Bryan, 1982). This first approach presents some similarities with our objective of identifying the impact of a given atmosphere on the ocean surface.
A robust diagnostic simulation introduces artificial sources and sinks of heat and freshwater in the conservation equations, and can lead to an unrealistic representation of some processes. However, since the relaxation is lighter in the region directly affected by the atmospheric forcing, our approach remains relevant in this zone of interest. This three-dimensional relaxation ensures a limited effect of the non-forcing errors in the ensemble experiments, so that the Pf matrix is only representative of the errors in the atmospheric parameters. The validity of this hypothesis is crucial to estimate consistent parameter corrections and to reduce possible compensation effects.

Initialization procedure
The second assumption of this study is that the initial error for every month (the Pa matrix in Eq. 4) can be neglected.

If Pa is non-negligible, the final expression of Pf (Eq. 5) cannot be considered valid. Strictly speaking, to ensure that Pa is equal to zero we should initialize the ensemble experiments with the true ocean state. However, this true three-dimensional ocean state is not available. To obtain a reasonable approximation of an initial condition with the correct SST, a simulation over the 1989-2007 period, strongly constrained to fit surface observations, is constructed. From this simulation we then extract the initial condition that is used for the ensemble experiments. To carry out this simulation, a strong relaxation is applied at the surface, in addition to the robust diagnostic approach described in the previous section. The surface relaxation is prescribed with a time constant of two hours, leaving the model almost no freedom of evolution at the surface. This is acceptable because the objective is only to construct an initial condition for the ensemble experiments. This particular initial condition construction is the best compromise that we have found to initialize the ensemble experiments, and it is an important additional procedure allowing a realistic application of the methodology developed by Skandrani et al. (2009).
For each month, the corresponding initial condition is extracted from the strongly constrained simulation and used to initialize the ensemble experiments. To build consistent parameter corrections, the one-month background simulation that gives the background control vector xf (Eq. 2) has to be initialized with the same initial condition.

Parameter perturbations computation
The last important point concerning the ensemble experiment setup is to ensure that the perturbations applied to the atmospheric parameters are actually representative of the real forcing uncertainties. It is assumed that the parameter uncertainties are comparable to their intra-seasonal and interannual variability in the ERAi reanalysis. This assumption turns out to be a fair way to identify the parameter uncertainties, but one could envisage other options, such as quantifying the uncertainties from the local differences between two or more forcing data sets. The prior probability distribution of the parameters is defined as Gaussian (strictly speaking, a truncated Gaussian, to avoid extreme parameter corrections, see Sect. 2.2), and its covariance matrix is derived from the intra-seasonal and interannual variability of the ERAi reanalysis between 1989 and 2007. Perturbations are specific to a given month and constant over each monthly assimilation window. For example, to estimate a set of perturbations for April, we consider the reanalysis signal corresponding to all three-month windows (March-April-May) of the reanalysis over the 1989-2007 period (60 states). The Gaussian has a zero mean, and its covariance is defined by the covariance of the parameters around their mean over these three-month periods. The choice of the intra-seasonal and interannual variability, rather than the total variability, is made to avoid including in the perturbations of April, for example, a variability that is characteristic of wintertime. A random sample of 200 perturbations is then constructed for each month.
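The construction of the monthly perturbations can be sketched as follows. This is a hedged illustration: `mam_fields` stands in for the pooled March-April-May reanalysis states of a single parameter at a few grid points (synthetic numbers here), and the 1.5-sigma clipping is a simple stand-in for the truncated-Gaussian sampling of Sect. 2.2:

```python
import numpy as np

rng = np.random.default_rng(3)

# Pool the three-month states of one parameter at a few grid points
# (n_states pooled monthly fields, n_points grid points; synthetic).
n_states, n_points = 60, 4
mam_fields = rng.normal(15.0, 1.0, size=(n_states, n_points))

# Zero-mean anomalies and their spatial covariance define the prior.
anom = mam_fields - mam_fields.mean(axis=0)
cov = anom.T @ anom / (n_states - 1)

# Draw 200 correlated zero-mean samples via a Cholesky factor, then
# truncate each component at 1.5 standard deviations.
L = np.linalg.cholesky(cov + 1e-9 * np.eye(n_points))
std = np.sqrt(np.diag(cov))

perturbations = []
for _ in range(200):
    p = L @ rng.normal(size=n_points)        # spatially correlated sample
    p = np.clip(p, -1.5 * std, 1.5 * std)    # truncate at 1.5 sigma
    perturbations.append(p)
perturbations = np.array(perturbations)
```

Pooling the adjacent months captures intra-seasonal plus interannual spread while excluding the seasonal-cycle variance that a full-year pool would introduce.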
Figure 2 illustrates the standard deviation of the parameter perturbations for the ensemble of a given month (here January). To ensure a relevant Pf estimation, this amplitude has to be representative of the uncertainty of each atmospheric variable. One can see that the standard deviation associated with each parameter corresponds to the monthly parameter uncertainties at the global scale, with values in the range of 0-1.5 °C for the temperature, 10 to 25 W m−2 for the downward shortwave radiation, and 0.5 to 3 m s−1 for the zonal and meridional wind speed. However, at high latitudes, the ensemble dispersion of temperature and shortwave radiation is larger (up to 6 °C for the temperature and 70 W m−2 for the shortwave radiation). These values are clearly excessive as a representation of the real uncertainty, which illustrates the limits of using the intra-seasonal and interannual variability of the parameters to quantify the uncertainties. The problem described here is not related to the ice extent variability in the ocean model, since the ocean model is not involved in the computation of the parameter perturbations. Rather, the monthly variability of the atmospheric variables is much larger at high latitudes, particularly for the temperature and the radiative fluxes, mostly because of the seasonal evolution of the solar insolation. The assumption made to generate the parameter perturbations in the ensemble experiments could thus be irrelevant in high-latitude regions, because the changes from one month to another within the three-month period can be much larger than the uncertainties. As a consequence, the results in these regions must be interpreted cautiously. In practice, the whole procedure is repeated monthly between 1989 and 2007, which amounts to running the model 200 times over the 19 yr period to obtain the set of forcing corrections. Such a large number of simulations could be a critical difficulty with a higher-resolution model.

Results
The objective of this section is to analyze the strengths and weaknesses of the methodology that has been adapted to correct the ERAi variables. The relevance of the specific assumptions described above is evaluated through a selection of diagnostics at global and regional scales. As a first step, we focus on the capability of the method to isolate the forcing impact in the model and to compute realistic parameter corrections. Second, we look at the results in terms of the global-scale heat budget resulting from our corrections. Lastly, we discuss the impact of the computed corrections in the intertropical band, where the method presents the maximum reliability.

Assessment of the method
The first important step of the method validation is to look at the results over a specific month (January 2004), which corresponds to one assimilation window. We compare (Fig. 3, top panel) the SST resulting from the analysis step of the Kalman filter (i.e. the output of Eq. 2), and (Fig. 3, bottom panel) the SST computed by a free model run forced by the newly estimated parameters (hereafter ERAcor). This diagnostic determines whether the use of corrected parameters in a free simulation has the same impact as the correction computed by the analysis scheme. In other words, the objective here is to evaluate whether we have properly determined the Pf matrix, which describes the impact of forcing errors in the model. Figure 3 compares the mean SST increment resulting from the analysis step to the mean SST increment produced in a free model run forced by ERAcor atmospheric variables. In both cases, the resulting SST shows a better agreement with its observed counterpart (not shown). Furthermore, one can see that both increments are very similar in structure and intensity. This result means that the error covariance matrix Pf constructed here is consistent with the actual impact of forcing errors in the model. The second step in evaluating the reliability of our method to improve the forcing for free model simulations is to check that the amplitude of the parameter corrections estimated by the method is consistent with the uncertainties in the atmospheric variables. Since the focus here is to correct potential long-term trends and mean errors, it is of primary importance to assess the regional relevance of the temporal mean of the corrections between 1989 and 2007 (Fig. 4 for the local values, and Fig. 5 for the zonal mean).
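The logic of the analysis step, where an SST misfit is mapped onto parameter corrections through ensemble covariances, can be sketched with a toy ensemble Kalman update. Everything here (dimensions, the linear map A, the error levels) is a hypothetical stand-in, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy dimensions: n_p forcing parameters, n_o SST observations,
# n_e ensemble members (200 in the paper's experiments).
n_p, n_o, n_e = 7, 5, 200

# Hypothetical ensembles: parameter perturbations, and the corresponding
# model-equivalent SST from the ensemble of forced simulations (here a
# made-up linear response A plus noise stands in for the ocean model).
params = rng.normal(size=(n_p, n_e))
A = rng.normal(size=(n_o, n_p))
hx = A @ params + 0.1 * rng.normal(size=(n_o, n_e))

# Ensemble anomalies (deviations from the ensemble mean).
Pa = params - params.mean(axis=1, keepdims=True)
Ha = hx - hx.mean(axis=1, keepdims=True)

# Parameter-SST cross-covariance and SST covariance, both estimated from
# the ensemble (this is the role played by Pf in the text).
P_ph = Pa @ Ha.T / (n_e - 1)
P_hh = Ha @ Ha.T / (n_e - 1)

R = 0.2**2 * np.eye(n_o)            # assumed SST observation-error covariance
K = P_ph @ np.linalg.inv(P_hh + R)  # gain mapping SST misfit to parameters

y = rng.normal(size=n_o)            # hypothetical observed SST
correction = K @ (y - hx.mean(axis=1))  # parameter correction (analysis step)
print(correction.shape)
```

The diagnostic discussed above then amounts to rerunning the free model with `params + correction` and checking that the SST change matches the analysis increment.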
In Figs. 4 and 5, we notice that over most of the world ocean the corrections are consistent with the assumed forcing uncertainties (e.g. −1 to 1 °C for the 2 m air temperature, −1.5 to 1.5 m s−1 for wind speed, or −20 to 20 W m−2 for shortwave downward radiation), which is comparable to the 1989-2006 mean differences between the forcing parameters of ERAi and DFS4.3 (Brodeau et al., 2010) (not shown). Furthermore, the temperature and humidity corrections behave similarly, consistent with the strong correlation between these two variables. The same comment can be made regarding the shortwave and long-wave radiation corrections. On the other hand, the large-scale structure of the corrections is mostly zonal, as exemplified by the mean shortwave radiation (rad_sw) correction. The corrections obtained with our method are thus physically reasonable in terms of large-scale intensity and structure. However, a critical analysis of Fig. 4 allows some persisting problems to be identified. The correction fields present some small-scale features (especially at high latitudes), which are not appropriate in forcing corrections. These structures are particularly obvious in the Southern Ocean in the temperature, shortwave radiation and wind speed corrections, and are likely to result from the local implementation of the analysis. These small-scale structures make it difficult to interpret the corrections obtained in the Southern Ocean, and highlight one limit of the use of local data assimilation. The same problem has already been evidenced by Stammer et al. (2004) in a variational approach.
Small-scale structures in the Southern Ocean lead to strong gradients in the correction and can result in instability propagation when applied to a free model run. However, critical gradients of 20 to 40 W m−2 for the shortwave radiation in this region are not explained by the localized analysis alone. Indeed, we noticed in Sect. 3.3 that the assumption made to compute the parameter perturbations (and thus the parameter prior probability distribution) was not relevant at high latitudes, which can lead to spurious corrections. These spurious corrections have obviously not been totally eliminated by the truncation of the prior probability distribution mentioned in Sect. 2. The combination of this problem with the covariance localization hampers the relevance of the corrections at high latitudes.
Figure 4 also allows the results in strong advection regions to be examined more precisely, such as the Gulf Stream, the Kuroshio, the Agulhas Current, and also the Antarctic Circumpolar Current regions. As mentioned in Sect. 3.1, the water masses and front positioning in these regions are not properly represented in our ocean model. Despite the use of a robust diagnostic approach to compute the ensemble experiments, these non-forcing errors are still present in the model, with the consequence that the estimated corrections in these regions are often excessive (with, for example, orders of magnitude of 4 °C for the temperature, 3 m s−1 for the zonal wind speed, and up to 60 W m−2 for the shortwave radiation). Such values are typical of compensation effects: parameter modifications are overestimated to compensate for other model errors. Considering that larger observation errors are not sufficient to bypass this problem, we cannot be very confident in the realism of the estimates made in mid-latitude strong advection regions.
Finally, we observe that the corrections computed for the precipitation field are negligible (figure not shown). This is consistent with the fact that the assimilated SSS aims to avoid unbalanced corrections, not to actually generate significant precipitation corrections.
In summary, we have identified in this section the limits of the method implemented to compute upgraded atmospheric variables. The conclusion is that, even if the results are rather consistent with the parameter uncertainties over most of the world ocean, we have to be cautious with the regional corrections computed at high latitudes and in strong advection regions. However, these problems remain quite local, and we can still look at the global impact of the correction on the ocean heat budget.

Global results
To study this global impact we have to compute the air-sea fluxes corresponding to this new forcing-parameter data set using the bulk formulae. The objective here is to evaluate the reliability of this new data set for forcing a free model run. The tool used to compute air-sea fluxes offline, using the same formulae as the ocean model and a given atmospheric data set, was developed by Brodeau (2007). Instead of using the model SST, this diagnostic involves a prescribed SST (here the Hurrell SST). Inter-comparison between different atmospheric data sets to quantify the impact of the parameter modifications is thus easier than in the model, where the SST feedback on the flux computation is taken into account. All fluxes presented in the following are computed using this offline approach. Global results concerning the net heat flux are presented first, before focusing on the intertropical band to quantify the distribution of this integrated modification over the different net heat-flux components.
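As an illustration of such an offline computation, here is a minimal bulk-flux sketch with constant exchange coefficients and a simplified saturation-humidity formula. These are assumptions for illustration only; the actual tool uses the ocean model's full bulk formulae with stability-dependent coefficients:

```python
import numpy as np

# Simplified constants (illustrative values, not the model's exact ones).
RHO_AIR = 1.22      # air density [kg m-3]
CP_AIR = 1000.5     # specific heat of air [J kg-1 K-1]
LV = 2.5e6          # latent heat of vaporization [J kg-1]
CH = CE = 1.3e-3    # assumed constant transfer coefficients

def turbulent_fluxes(sst, t2m, q2m, wind):
    """Sensible and latent heat fluxes (positive upward, i.e. ocean heat loss).

    sst, t2m in degC; q2m specific humidity [kg kg-1]; wind speed [m s-1].
    """
    # Saturation specific humidity at the (prescribed) SST, with the usual
    # 0.98 salinity reduction factor.
    q_sat = 0.98 * 640380.0 * np.exp(-5107.4 / (sst + 273.15)) / RHO_AIR
    q_sens = RHO_AIR * CP_AIR * CH * wind * (sst - t2m)
    q_lat = RHO_AIR * LV * CE * wind * (q_sat - q2m)
    return q_sens, q_lat

# Hypothetical tropical values: prescribed SST and ERAi-like parameters.
q_sens, q_lat = turbulent_fluxes(sst=28.0, t2m=26.5, q2m=0.017, wind=6.0)
print(q_sens, q_lat)
```

Because the SST is prescribed rather than taken from the model, changing only the atmospheric parameters changes the fluxes directly, which is what makes the inter-comparison of forcing data sets straightforward.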
The net heat flux integrates all modifications applied to the latent, sensible and radiative fluxes. The diagnostic of the impact of the parameter corrections on this flux thus reflects the global effect of the forcing modifications. The first criterion used to evaluate a forcing data set is the global heat balance over the considered period (here 1989 to 2007): a balanced budget is a sign of quality for the given forcing. Figure 6 shows that the corrections modify the net heat-flux budget from a positive value of almost 15 W m−2 (left) to approximately 2 W m−2 (right). This result means that forcing the model with ERAi atmospheric variables would lead to an excessive heating of the ocean, while our corrections drastically reduce this heat excess and maintain the heat budget close to equilibrium.
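The global balance criterion amounts to an area-weighted average of the net heat-flux field. A sketch on a hypothetical regular 1° grid (the field below is synthetic, with a 15 W m−2 offset standing in for the ERAi excess):

```python
import numpy as np

# Hypothetical net heat-flux field on a regular 1-degree grid [W m-2],
# built with a 15 W m-2 mean offset plus noise.
lat = np.linspace(-89.5, 89.5, 180)
qnet = 15.0 + 5.0 * np.random.default_rng(2).normal(size=(180, 360))

# On a regular lat-lon grid, cell areas are proportional to cos(latitude),
# so the global mean must be cosine-weighted.
w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(qnet)
global_mean = (qnet * w).sum() / w.sum()
print(round(global_mean, 1))
```

An unweighted mean would overweight the polar rows, which is precisely where the corrections are least reliable in this study.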
Furthermore, a clear negative trend is present in the ERAi time series of the net heat flux. This trend is inconsistent with the observed global warming over the last twenty years (Bindoff et al., 2007). The negative trend is no longer present in the time series of the net heat flux calculated with the corrected parameters. This diagnostic yields two crucial results, since the methodology does not involve any explicit constraint to obtain a balanced heat budget or to remove potential unrealistic trends.

Intertropical band
The results discussed in Sect. 4.1 allowed the regions where the reliability of the method is questionable to be identified. To evaluate locally the impact of the computed corrections on each net heat-flux component, we now focus on the region between 20° N and 20° S, called the intertropical band. This region has the advantage of not being affected by the spurious corrections or biases in the model circulation mentioned in Sect. 4.1. Furthermore, the model used here has 0.5° meridional resolution along the equator and thus captures the critical scales of the equatorial current system (Madec and Imbard, 1996). As a consequence, we expect the results there to be the most reliable. It is thus interesting to go beyond the net heat-flux diagnostic and to look at the impact of the parameter corrections on the sensible, latent, long-wave radiation and shortwave radiation fluxes.
Figure 7 illustrates the impact of these modifications on the mean of the different net heat-flux components over the period 1989-2007. As expected, given the reduced global heat budget, we observe a reduction of the net heat flux of about 20 W m−2. This tends to reduce the warm bias in the intertropical band that can be evidenced by comparing the model SST to the observations (see Fig. 11, described later). Furthermore, since we compute corrections separately for every atmospheric variable, we gain valuable information on the relative contribution of each component of the net heat flux to this reduction. From this diagnostic, it appears that, to reduce the heat gain in the intertropical band, the radiative flux (Q_sw + Q_lw) remains almost the same (an increase of about 5 W m−2), while the heat loss through the turbulent fluxes is clearly increased. As a consequence, the net heat-flux decrease is mostly due to a modification of the turbulent fluxes. All variable corrections in the intertropical band contribute together to an increased heat loss by turbulent fluxes (decreased temperature and humidity, and increased wind speed).
Among these contributions, wind speed is identified by our method as the essential mechanism involved in the increase of the turbulent heat loss (Fig. 8).

Long-term free model simulation
The last diagnostic used here to assess the method is the analysis of a free model run forced by the corrected parameters. We showed in the previous sections that our corrections present some unreliable characteristics that could propagate in the model. The parameter corrections are thus post-processed before being used in the model, in order to smooth unrealistic small-scale features and to prevent spurious corrections from being applied in the model.
To smooth small-scale features, an anisotropic boxcar filter is applied to each monthly correction. The box size is 10° in latitude and 20° in longitude, in order to keep the zonal signal structure while limiting local gradients. The large-scale correction characteristics thus persist, while the spurious structures observed, for example, in the Southern Ocean are smoothed out, as shown in Fig. 9. Since spurious corrections are still present at high latitudes, the corrections are set to 0 for latitudes poleward of 60° in both hemispheres, because the error there is considered to be mostly due to processes other than the forcing (essentially the fact that the method does not take into account the ice cover in these regions). The resulting 1989-2007 mean corrections are shown in Fig. 9.
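This post-processing can be sketched as follows on a hypothetical 1° grid; the real implementation may treat land masks and grid geometry differently:

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(3)

# Hypothetical monthly correction field on a 1-degree grid (lat x lon).
lat = np.linspace(-89.5, 89.5, 180)
correction = rng.normal(size=(180, 360))

# Anisotropic boxcar filter: 10 deg in latitude x 20 deg in longitude,
# i.e. a 10 x 20 grid-point window at 1-degree resolution. Longitude is
# treated as periodic ('wrap'), latitude is not ('nearest').
smoothed = uniform_filter(correction, size=(10, 20), mode=("nearest", "wrap"))

# Corrections are set to zero poleward of 60 deg in both hemispheres.
smoothed[np.abs(lat) > 60.0, :] = 0.0
print(smoothed.shape)
```

The elongated zonal window is what preserves the mostly zonal large-scale structure of the corrections while damping the local gradients.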
Although this procedure means that the corrections actually computed by the data assimilation method are not tested directly, the 1989-2007 net heat-flux time series computed from the post-processed parameters and from the original ERAi variables are well correlated (> 0.5 in the intertropical band), which indicates the stability of the corrections (e.g., Fig. 10). Furthermore, we can evaluate the impact of the corrections in the model by focusing on the results in the intertropical band, where the post-processing does not have a drastic impact on the solution since it is the region of maximum correction validity.
In Sect. 4.3 we showed that the integrated impact of our parameter corrections is to reduce the net heat flux in the intertropical band. A reduced SST in the free model run is thus expected when replacing the ERAi forcing by the corrected parameters. In Fig. 11, we first observe that the simulation forced by ERAi variables presents a long-term mean warm bias ranging between 0.5 and 2 °C in the intertropical band (top). By reducing this warm bias to values below 0.5 °C for most of the region (consistent with a reduced net heat flux), our correction leads to a better agreement with the observed SST from a long-term point of view (bottom). Moreover, the standard deviation of the monthly differences between observed and simulated SST is also reduced, by up to 0.2 °C on average in the intertropical band (figure not shown). The strategy developed here to compute independent monthly corrections is thus consistent with our initial objective of improving the long-term mean forcing.
Beyond the consistency between observed and simulated SST in the intertropical band, the interannual variability of the model response is not modified by the forcing corrections, as shown in Fig. 12. This figure shows the interannual variability of the global means of the SST and the sea surface height (Fig. 12, top panel), and of the meridional overturning cell intensity at 30° S (Fig. 12, bottom panel). It is clear that, despite the discrepancies between the mean values, the year-to-year variations are in very good agreement. This result ensures that the applied corrections do not hamper the interannual variability of the initial ERAi forcing.

Conclusions
In this study we explored the feasibility of a sequential data assimilation methodology to estimate monthly corrections of ERAi forcing parameters using real SST observations. Independent monthly data assimilation experiments have been performed over the period 1989-2007 to compute a corrected forcing data set, ERAcor (without correcting the model state), to be used in a free model run. The prior probability distribution of the parameters has been characterized by the intra-seasonal and interannual variability of the ERAi reanalysis, and ensemble experiments of 200 members were performed to estimate the forecast error covariance matrix. Implementing such a methodology in a realistic case is challenging, owing to the care needed in making assumptions and developing the methodology. The initialization procedure of the experiments is crucial to avoid the propagation of potential initial condition errors unrelated to the forcing. For this purpose, an initial condition as consistent as possible with the surface observations has been extracted from a strongly constrained simulation. Moreover, the impact of the forcing on the model state had to be isolated from other model errors in the ensemble experiments using a robust diagnostic approach. Finally, the residual excessive corrections have been limited by a truncation of the prior probability distribution of the parameters, as described in Skandrani et al. (2009).
This paper illustrates the benefit of using an objective method to estimate forcing corrections. The method has been shown to properly identify the monthly forcing impact on the model state:
- For every given month, the computed corrections applied to a free model run have the same impact as the analysis step on the model SST.
- The net heat-flux budget resulting from ERAcor (highly positive in the ERAi database) is balanced at the global scale, and the potentially unrealistic negative trend observable in the time series of this flux in the ERAi database (leading to ocean cooling) over the whole time period has been removed.
- The computed parameter corrections improve the long-term mean state of a free (without data assimilation) simulation with respect to ocean-surface observations, and can guide the typical ad hoc forcing corrections mostly used to adjust forcing parameters (e.g., Brodeau et al., 2010).
We have not evaluated the consistency of the interior ocean dynamics by comparing our results to observations. Indeed, since the consistency of the corrections is mainly assessed for the intertropical band, it would first be necessary to construct a comprehensive forcing data set including these corrections, without introducing any discontinuity in the forcing fields, before looking at the behavior of the global ocean circulation. However, we diagnosed the equatorial undercurrent in the simulation forced by ERAcor, and it turned out that the corrected forcing leads to a better representation (in terms of intensity) of this important equatorial circulation feature than ERA-Interim with respect to TAO observations (not shown). As a further step, one could envisage applying a similar method to present-day operational systems for short-term forecasting. Indeed, the ocean state correction can present some inconsistency with the forcing parameters used, whereas our approach offers an alternative that keeps the interactive ocean-atmosphere link while efficiently correcting the ocean surface state.
The diagnostics of the computed corrections themselves have highlighted some limits of the method. Small-scale features, probably due to the local application of the Kalman filter analysis, appear in the correction fields. We also computed some spurious corrections in strong advection regions and at high latitudes, despite the effort made to reduce non-forcing uncertainties in the ensemble experiments and to define an appropriate parameter prior probability distribution. Improving the identification of the forcing error in the model is essential to expand the validity domain of the corrections from the intertropical band to the whole ocean (sea-ice-covered regions would probably need to be treated separately). In our study, the problem of limiting non-forcing errors has been addressed by using a robust diagnostic approach. Properly quantifying the remaining part of the error that is directly due to a given model would require running exactly the same experiments with another model. We could then identify the part of the correction that depends, for example, on the spatial resolution or on a given formulation used in the model. Unfortunately, this kind of diagnostic would be numerically very expensive, and is not guaranteed to capture properly the part of the correction corresponding to the model in all cases, but only locally in space and time. To go further, one could consider constraining the water masses by the same type of relaxation but using an ocean reanalysis as reference instead of a temperature and salinity climatology. As there are now many atmospheric reanalyses available (ERAi, JRA25, NASA MERRA, NCEP/DOE...), the definition of the parameter prior probability distribution could also be based on the magnitude of the differences observed between different atmospheric reanalyses instead of the intra-seasonal and interannual variability of a single reanalysis (as in Lucas et al., 2008). This could represent a substantial improvement of the method, since it would provide a more realistic description of the parameter uncertainties and would reduce the spurious corrections at high latitudes.
Furthermore, the convergence of the model toward the assimilated observations strongly depends on the initialization procedure of the experiments. The parameters computed via this method are thus not necessarily appropriate to reduce the instantaneous discrepancies between the model state and the observations when applied to a free long-term model simulation. However, our results show that these corrections have a positive impact on the long-term mean values of the fluxes and of the ocean surface state. The heat budget resulting from the corrected parameters is closer to zero than the one computed with the original ERAi variables, with a heat excess reduced by more than 10 W m−2. In addition, our corrections remove the unrealistic negative trend observed in the time series of the global net heat flux computed with the ERAi forcing parameters and the SST of Hurrell et al. (2008). In the intertropical band, where the corrections have maximum validity, the warm bias is reduced, which leads to a better agreement with the surface observations. The diagnostics of the impact on each net heat-flux component allowed the turbulent heat flux to be identified as the main actor in this reduced warming, essentially via the wind speed increase in the intertropics. These results are not explicitly prescribed by the method and only result from the ensemble description of the correlations between the forcing variables and the SST.
Several perspectives can now be proposed to extend this study. On the one hand, the method can be improved to maximize the impact of the observations. In terms of methodology, the first difficulty is that the problem is under-determined: we estimate seven forcing parameters using only two observed variables. This under-determination could be partly reduced by assimilating real SSS observations such as SMOS or AQUARIUS when available, instead of an SSS climatology, as well as other ocean profile observation databases such as ARGO or TAO. All available observations of the surface, the mixed layer or the deep convection regions that are directly influenced by air-sea fluxes would be beneficial to constrain our system. Recent progress in intensive ocean observation gives our methodology strong potential for addressing the problem of the objective computation of forcing corrections.
On the other hand, further work could be developed to take advantage of the results already available from our study. Our method is numerically expensive, since it involves numerous simulation experiments, and could not reasonably be used at present for higher resolution models. Furthermore, even if the quality of the results is inhomogeneous depending on the region considered, the method is certainly able to give valuable insights to guide typical ad hoc forcing adjustments. The results should first be subject to further evaluation by comparison with available atmospheric or flux observations such as TAO/TRITON, PIRATA (Pilot Research Moored Array in the Tropical Atlantic) and RAMA, and with reconstructions based on reanalyses and satellite observations, such as the OAflux (Yu and Weller, 2007) or TROPFlux (Praveen Kumar et al., 2011) databases. VOS-based products such as the NOCS database (Berry and Kent, 2009) can also be useful for further evaluation of our corrections, as long as we consider a region with acceptable sampling error (Gulev et al., 2007). This work would make it possible to identify more precisely the strengths and weaknesses of the method, to take a more critical look at the results, and to identify the benefit of the corrections in each particular application.
A wider perspective concerns the implementation of reanalyses. While, in the construction of an ocean reanalysis, it is certainly useful to extend the ocean state control vector to the forcing parameters in order to avoid the propagation of forcing errors in the system (e.g., Cerovecki et al., 2011), our methodology could also be valuable in providing improved ocean boundary information for the implementation of atmospheric reanalyses. Indeed, our approach is not dedicated solely to the simulation of the ocean state and can be viewed as an objective way of controlling the air-sea interactions. More precisely, one could envisage improving the boundary conditions in atmospheric models by using, instead of the SST directly, the atmospheric parameter corrections produced by the assimilation of the SST in an ocean model.

Fig. 2. January ensemble standard deviation in terms of air temperature (top left), shortwave radiation (top right), zonal wind speed (bottom left), and meridional wind speed (bottom right).

Fig. 3. SST differences between the free model run forced by ERAi and the result of the analysis step (top), and between the free model run forced by ERAi and the free model run forced by ERAcor (bottom) for January 2004.

Fig. 6. 1989-2007 time series of monthly means of the global net heat flux (in red) computed with ERAi variables (left) and ERAcor variables (right). The black lines represent the global mean of the net heat flux over the whole 1989-2007 period.

Fig. 11. 1989-2007 mean differences between the Hurrell SST and the free model run forced by ERAi (top), and between the Hurrell SST and the free model run forced by ERAcor (bottom) in the 20° N-20° S latitude band.

Fig. 12. Comparison between the simulations forced by ERAi (black) and ERAcor (red). Top: time series between 1995 and 2007 of the global mean of the sea surface height in m (left), and of the global mean of the temperature in °C (right). Bottom: time series between 1995 and 2007 of the maximum intensity of the meridional overturning circulation at 30° S in Sv.