The improvements to the regional South China Sea Operational Oceanography Forecasting System

South China Sea Operational Oceanography Forecasting System (SCSOFS) had been built up 10 and operated in National Marine Environmental Forecasting Center of China to provide daily updated hydrodynamic forecasting in SCS for the future 5 days since 2013. This paper presents comprehensive updates had been conducted to the configurations of the physical model and data assimilation scheme in order to improve SCSOFS forecasting skills in recent years. It highlights three of the most sensitive updates, sea surface atmospheric forcing method, tracers advection discrete scheme, and modification of 15 data assimilation scheme. Scientific inter-comparison and accuracy assessment among five versions during the whole upgrading processes are performed by employing Global Ocean Data Assimilation Experiment OceanView Inter-comparison and Validation Task Team Class4 metrics. The results indicate that remarkable improvements have been achieved in SCSOFSv2 with respect to the original version SCSOFSv1. Domain averaged monthly mean root mean square errors decrease from 1.21 °C to 0.52 °C 20 for sea surface temperature, from 21.6cm to 8.5cm for sea level anomaly, respectively.


Introduction
The South China Sea (SCS) is located between 2°30′S～23°30′N and 99°10′E～121°50′E, the largest in area and the deepest in depth, a semi-closed marginal sea in the western Pacific. Its area is about 3.5 million km 2 , and its maximum depth is about 5300 m at the central region. Monsoon (Hellerman and Rosenstein, 1983), showing a cyclonic gyre in winter and an anti-cyclonic gyre in summer (Mao et al., 1999;Chu and Li, 2000). The multi-scale oceanic circulation dynamical processes of the SCS are affected by various factors, i.e. the Kuroshio intrusion through the LUS (Nan et al., 2015;Farris and Wimbush, 1996;Liu et al., 2019), internal waves Li et al., 2015) or internal solitary waves (Zhang et al., 2018;Zhao and Alford, 2006;Cai et al., 2014) generated in the LUS 35 and propagating in the northern SCS, the SCS throughflow as a branch of the Pacific to Indian Ocean throughflow (Wei et al., 2019;Wang et al., 2011), and energetic mesoscale eddy activities (Zu et al., 2019;Xu et al., 2019;Zhang et al., 2016;Zheng et al., 2017;Hwang and Chen, 2000;Wang et al., 2020).
The multi-scale dynamical mechanisms in the SCS are too complex to understand clearly as yet, it has always been a challenge to simulate or reproduce the ocean circulations, not to mention forecast future 40 oceanic status by Operational Oceanography Forecasting System (OOFS).
of the Kalimantan Island, the Banda Sea was connected in the south of Buru Island and Pulau Seram; and opened the Tomini Bay and the Cenderawasih Bay. It is obvious that the land-sea masks changing can generate important effects on the sea water volume and transport in the model domain, then would 120 contribute to simulating more accurate ocean circulations.
The bathymetry of ETOPO1 data set, with 1 arc-minute grid resolution from U.S. National Geophysical Data Center (NGDC) is substituted by the General Bathymetric Chart of the Oceans (GEBCO_08, Grid version 20091120, http://www.gebco.net) global continuous terrain model for ocean and land with 30 arc-second spatial resolution in SCSOFSv2. It is also merged with the measured bathymetry in the area 125 near the coast of China mainland and adjusted with the tidal range. Then it is smoothed by applying a selective filter 8 times to reduce the isolated seamounts on the deep ocean, so that the "slope parameter" r=grad(h)/h is lower than a maximum value r0=0.2 for each grid, in order to supress the computational errors of the pressure-gradient (Shchepetkin and McWilliams, 2003). The maximum depth is set to be 6000m still, but the minimum depth to be changed from 10m in SCSOFSv1 to 5m in SCSOFSv2 (Wang, 130 1996). The final smoothed bathymetry is shown in Fig.1.
For the vertical terrain-following coordinate, it has been increased from 36 s-coordinate layers in SCSOFSv1 to 50 layers in SCSOFSv2. The transformation equation from the original formulation is also changed to the improved solution (Shchepetkin and McWilliams, 2005). The original vertical stretching function (Song and Haidvogel, 1994) is replaced with an improved double stretching function 135 (Shchepetkin and McWilliams, 2005), to make it thinner on the upper 300m in order to resolve the thermocline well.
The new initial temperature and salinity (T/S) fields in SCSOFSv2 are extracted from the Generalized Digital Environmental Model Version 3.0 (GDEMV3, Carnes, 2009)  The net surface heat flux correction is still following Barnier et al. (1995) in SCSOFSv2, but the 155 parameter (dQ/dSST) of kinematic surface net heat flux sensitivity to sea surface temperature (SST) is calculated using SST, sea surface atmospheric temperature, atmospheric density, wind speed and sea level specific humidity, instead of setting a constant number -30 W m 2 K -1 for the whole domain as in SCSOFSv1. So the parameter dQ/dSST varies temporally and spatially. Meanwhile, we also replace the merged satellite's infrared sensors and microwave sensor, and in-situ (buoy and ship) data global daily 160 SST (MGDSST) obtained from the Office of Marine Prediction of the Japan Meteorological Agency (JMA), with the infrared Advanced Very High Resolution Radiometer (AVHRR) satellite data, which is an analysis constructed by combining observations from different platforms on a regular grid via optimum interpolation and provided by National Centers for Environmental Information (NCEI).
The North Equatorial Current (NEC) is an interior Sverdrup steady current in the subtropical NP and 165 located at about 10°N-20°N, and usually bifurcates into two branches after encountering the western boundary along the Philippine coast in the west of 130°E (Qiu and Chen, 2010). However, the NEC is separated into two branches since the model's eastern lateral boundary in SCSOFSv1, its main branch located at about 9.5°N-13°N, the other branch located at 14.5°N-17°N (Fig. 2a), which is clearly not in line with the fact. This is because the Guam Island (Fig. 2  The SCSOFSv2 is run with 5s time step for the external mode, and 150s for the internal mode under all new configurations mentioned above and in Section 3. The modification for the time step is due to the change of the discrete schemes, which would be illustrated in Section 3. Before the operational run, a 26 years climatology run is conducted for spinning-up, and is followed with a hindcast run from 2005 to 185 2018 (Wang et al., 2012). The daily mean of model results is archived and used to validate in the following parts of this paper.

Highlights and sensitive updates and their impacts
Most of bias or errors in the operational systems mainly induced by initial errors and model deficiencies, which can be attributed to some major recurring problems like sea surface atmospheric forcing, intrinsic 190 deficiencies of numerical model (e.g. discrete schemes, parameterization schemes for sub-grid scale), https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License. and the assimilation schemes. In this section, we elaborate solutions, that are not mentioned in Sect.2, to such problems applying in SCSOFSv2.

Sea surface atmospheric forcing
The air-sea interaction is one of essential physical processes that affect vertical mixing and thermal 195 structure of the upper-ocean. The air-sea fluxes mainly include momentum flux, fresh water flux and heat flux. SST is an important indicator of ocean circulation, ocean front, upwelling and sea water mixing, whose variation mainly depending on the air-sea interaction and the ocean thermal and dynamical factors (Bao et al., 2002). Thus, for OOFS and ocean numerical modelling, accurate simulation and forecasting of SST is one of the most important metric to evaluate the modelling and forecasting skill.

200
The accurate input of sea surface atmospheric forcing plays a key role to excel in model simulation of SST. ROMS provides two methods to introduce sea surface atmospheric forcing: the first option is directly forcing ocean model by providing momentum fluxes (wind stress), net fresh water fluxes, net heat fluxes and shortwave radiation fluxes from atmospheric datasets; the second is employing the Fairall et al. (2003) COARE3.0 bulk algorithm to calculate air-sea momentum, fresh water and heat turbulent 205 fluxes using the set of atmospheric variables from atmospheric datasets including wind speed at 10m above sea surface, mean sea level air pressure, air temperature at 2m above sea surface, air relative humidity at 2m above sea surface, downward longwave radiation flux, precipitation rate and shortwave radiation fluxes (Large and Yeager, 2009). Since the SST using in the calculation of those three air-sea fluxes is extracted from ocean model, the increasing of SST thus induces the variations of sensible heat 210 flux, latent heat flux, and longwave radiation then increasing loss of ocean heat, and inhibiting the further increasing of SST, and vice versa. It means that an effective negative feedback mechanism could form between SST and SST-related heat fluxes. In this case, the simulated SST would maintain at a reasonable level. The first one is employed in SCSOFSv1, the second one, bulk algorithm, is employed in SCSOFSv2.

215
In order to evaluate the performances of different sea surface atmospheric forcing methods, we conduct a special experiment by changing the method based on SCSOFSv1, here named the experiment as BulkFormula. In this experiment, we use the merged satellite SST analysis with a multi-scale optimal https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License. interpolation called the Operational SST and Sea Ice Analysis (OSTIA) system, which globally coverage on a daily basis at a horizontal grid resolution of 1/20° (~6 km) and provided by the Met Office (Donlon 220 et al., 2012), to verify the results of SCSOFS.

225
Figure3 shows the distributions of monthly mean SST differences of SCSOFSv1, BulkFormula, SCSOFSv2 minus OSTIA SST in January, April, July and October, 2014 to stand for Winter, Spring, Summer and Autumn, respectively. It is found that the simulated SST are higher than OSTIA SST for all three results in general. The differences are pronouncedly higher for the results from SCSOFSv1 than the results from BulkFormula and SCSOFSv2. The maximum differences mainly occur near coast ( Fig.3 230 upper panels), especially for a few bays embedded into the mainland which is hard to be resolved well with 2-3 horizontal grids at 1/30° resolution and in very shallow water depth in SCSOFSv1. This is because sea surface atmospheric forcing data is not accurate enough near the coast, and provide abnormally more heat to ocean causing the continuously heating up of ocean. Thus, simulated SST is beyond normal level in SCSOFSv1. This phenomenon can be alleviated significantly by introducing the 235 https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License. effective negative feedback mechanism between model's SST and air-sea heat flux by employing the  improved by about 21% only by changing the method of sea surface atmospheric forcing from directly forcing to COARE 3.0 bulk algorithm, due to effective negative feedback mechanism.

250
Spurious diapycnal mixing is one of traditional errors in state-of-the-art atmospheric and oceanic model, especially for the terrain-following coordinate regional models including both the continental slope and deep ocean (Marchesiello et al., 2009;Naughten et al., 2017;Barnier et al., 1998). Marchesiello et al. (2009)

260
For SCSOFSv1, a third-order upstream horizontal advection scheme (hereafter referred to as U3H), a fourth-order centered vertical advection scheme (hereafter referred to as C4V), and the scheme of https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License.  To fix this problem, we tested various model settings and compiling options available in ROMS, such as increasing the number of vertical levels, changing the advection and diffusion schemes, horizontal mixing surfaces for tracers, horizontal mixing schemes. Details of how tested model settings effect on 300 the spurious diapycnal mixing are beyond the scope of this paper, which will be discussed in a separate paper. Based on test results, we conclude that the spurious diapycnal mixing problem can be suppressed significantly by employing AAG scheme combination (Fig.5d, e and Fig.6d, e) in SCSOFSv2.
The monthly mean temperature at 1000 m layer from SCSOFSv2 varies from 3.0-12.0℃ in initial condition to 3.0-11.5℃ (Fig.5d), and the domain averaged monthly mean value increases slightly from temperature is about 0.2℃ and salinity is about 0.03, yet remaining stable after 20 model years (Fig.7).
It is suggested that spurious diapycnal mixing has been suppressed significantly by AAG scheme combination, which can preserve the characteristics of water masses in deep ocean well.
In addition, it is found that the model skill for SST has also been improved significantly while the new 315 AAG scheme employed in SCSOFSv2 ( Fig.3 and Fig.4). The maximum of monthly mean differences between simulated SST by SCSOFSv2 and OSTIA is about 3-4℃, which is smaller than the results from  innovations between the observation and model forecast would be decreased from 7 days to 1.5 hours by using FGAT.

365
In SCSOFSv1, the analysis increments of sea surface height and three-dimensional temperature, salinity, zonal and meridional velocities produced by each analysis of data assimilation are applied to the model initial fields at one time step. It would induce model significant initial shock and spurious high-frequency oscillation due to the imbalance between the increments and the model physics inevitably (Lellouche et al., 2013;Ourmières et al., 2006), and usually causes a rapid growth of forecast error and even model 370 blow up after a few assimilation cycles or one or two-year period after the intermittent assimilation run.
It is a threat to the stability and robustness of OOFS. Therefore, we introduced the incremental analysis update (IAU) method (Bloom et al., 1996;Ourmières et al., 2006) to apply each analysis increment to the model integration as a forcing term in a gradual manner in SCSOFSv2 to diminish the negative impact.
In our case, we get the tendency term by dividing the increments with the total number of time steps 375 within an assimilation cycle as in most IAU methodologies, in order to make sure the time integral of tendency term equals the analysis increment calculated by EnOI.  Once including the FGAT and IAU methods in EnOI scheme, the whole system integral strategy has to be adjusted by adding one more model integration over the assimilation time window (Lellouche et al., 2013). In SCSOFSv1, only one time model integration is needed. It means that once physical ocean model finish 7 days run (does not need to archive snapshot fields) and output a restart field, the EnOI data assimilation module is started to calculate the analysis increments at the restart field time and add it 385 to the restart field, then the physical ocean model makes a hot-start from the updated restart field to run 7 days for next cycle.
However, in SCSOFSv2, two times model integration is needed due to considering the FGAT and IAU methods (Fig.8). It means that physical ocean model needs to be integrated 14 days in each assimilation cycle, to add the tendency term to the model prognostic equations due to the IAU method used during 390 the first 7 days run (referred to as "Analysis Stage"), to output restart field at the end of 7th day for hot starting ocean model in next cycle, and to output 3-hourly snapshots forecast fields during the second 7 days run (referred to as "Forecast Stage") to be used in next cycle by FGAT method. The model outputs from the Analysis Stage are referred to as "Best Estimate", and from the Forecast Stage are referred to as "Forecast". The analysis increments are defined at the 3.5th day, but not at the end of 7th day as in 395 SCSOFSv1, with the observed SLA and Argo T/S vertical profiles data within the 7-day time window and AVHRR SST data on the 4th day used by FGAT method.

Scientific inter-comparison and accuracy assessment
In order to show the improvements of different SCSOFS sub-versions during the upgrading process, the results of scientific inter-comparison and assessment are shown in this section, by using the GOV Inter-400 comparison and Validation Task Team (IV-TT) Class 4 verification framework (Hernandez et al., 2009).
Class 4 metrics are used for inter-comparison and validation among different global or regional OOFSs or assimilation systems originally Divakaran et al., 2015). In this paper, we use Class 4 metrices and select the first four physical variables, SST, SLA and T/S, to inter-compare and assess the accuracy among different sub-versions of SCSOFS (Table 2). Since all the reference measurements data mentioned above have not been used in SCSOFS for those sub-versions without data assimilation, they are independent reference observation from SCSOFS except for  indicated that the accuracy of SST can be benefited from the sea surface atmospheric forcing method, more accurate observed SST data used for sea surface heat flux correction, temperature advection discrete scheme, and SST data assimilation.
https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License.  Fig.10b). We also found that OISST data is closer to OSTIA than MGDSST (figure not shown). Due to benefit from those changes, the maximum and minimum value of SST RMSE have decreased from 1.92℃ and 0.71℃ in v1 to 1.52℃ and 0.60℃ in v1.2 for the whole year 2013, respectively. It is noteworthy that AAG schemes combination not only improves the deep layer 445 temperature, but also contributes to the improvement of SST due to internal baroclinic vertical heat transport. The maximum and minimum value of SST RMSE is 1.21℃ and 0.52℃ in v1.3. For the results with data assimilation in v2, the maximum and minimum value of SST RMSE is only 1.13℃ and 0.32℃, respectively. It is better than the result in v1.3 year-round. For the horizontal distribution of SST RMSE, large values are mainly located at the areas near equator, 450 coast areas and northern lateral boundary, with most of values larger than 1.5℃ and maximum value about 6.67℃ in v1 (Fig.10c). In v1.3, due to the contributions of all the above model updates, the pattern of RMSE is similar with v1 basically without significant variations, but the maximum value decreases to 3.91℃ and most of values are smaller than 1.2℃. After applying MOOAS in v2 (Fig.10d), only a few large RMSE values are located at the eastern coast of Philippine island with the maximum value of 2.09℃ 455 and most of values lower than 0.8℃. It indicates that the performance of SST in SCSOFSv2 has been improved significantly due to all the updates mentioned above.

SLA
For the whole improving process, the accuracy of SLA is also continuously increasing from version v1 to v2, with RMSE decreasing from 21.6cm in v1 to 8.5cm in v2 with PI being 60.6%, for the annual From Fig.11(a (Fig.11b). In v1.3 (Fig.11c)

T/S profiles
For 3D T/S distribution, by comparing model results with climatology T/S profiles, the results from first three versions show poor correlation with observations ( Fig.12a and Fig.13a) and large RMSE (Fig.12b   495 and Fig.13b), i.e. 1.44-1.75℃ for T and 0.13-0.14 for S (Table 2), even if they decrease with model updates. Especially, for the vertical distribution, the RMSE can reach to larger than 3℃ for T and 0.3 for S in thermocline and halocline, respectively, and remained larger than 1℃ for T in deep layer and 0.1 for S in above 700m depth ( Fig.12d and Fig.13d). This may result from spurious diapycnal mixing due to UCI schemes combination employed. Those updates in v1.1 and v1.2 can only slightly improve 3D 500 T/S, and cannot contribute to their intrinsic improvement, neither for surface forcing nor for lateral boundary conditions, except for surface layer of shallower than 100m.
However, once AAG schemes combination employed in v1.3, the improvements to 3D T/S are obvious with respect to the first three versions (Fig.12a,b and Fig.13a,b). The AC increases to 0.38-0.48 for T and 0.30-0.44 for S, and RMSE decreases to 0.96-1.03℃ for T and 0.10-0.11 for S, respectively (Table   505 2). For the vertical distribution, the AC remains around 0.4 for both T and S in the whole water column, and over 0.6 for T in the surface layer ( Fig.12c and Fig.13c), RMSEs significantly decrease to less than 2℃ for T in thermocline and 0.25 for S in halocline, and less than 1℃ for T and 0.1 for S in deep layer ( Fig.12d and Fig.13d).
For the horizontal distribution of 3D T/S RMSE, RMSE of T is more likely being more than 1.5℃ with 510 maximum and minimum values being 4.45℃ and 0.49℃ (Fig.12e), and RMSE of S is larger than 0.1, with https://doi.org/10.5194/os-2020-104 Preprint. Discussion started: 24 November 2020 c Author(s) 2020. CC BY 4.0 License. By employing MOOAS, accuracy of 3D T/S has been improved continuously in v2 compare to v1.3 for all the metrics in 2018 ( Fig.12 and Fig.13). The mean AC has increased from 0.38 to 0.57 for T, and 525 from 0.30 to 0.51 for S. The mean RMSE has decreased from 0.96℃ to 0.67℃ for T, and from 0.11 to 0.08 for S (Table 2). For vertical distribution of AC for T, it's over 0.6 in surface, over 0.4 above 600m, and over 0.3 in deep layer (Fig.12c). RMSE of T is less than 1.5℃ for the whole vertical profile, and the maximum value is located at the thermocline similar with other versions, but the error decreases dramatically (Fig.12d). Unlike T, vertical AC of S does not show significant improvement in v2 with 530 respect to v1.3 below 200m, and it shows a little higher than which in v1.3 (Fig.13c)  RMSE is less than 0.25 for the whole vertical profile, with the maximum value located at surface and decreasing with depth, and decrease to less than 0.05 below 600m (Fig.13d).

535
For the horizontal RMSE distribution in v2, most of T RMSE is larger than 0.8℃ with maximum and minimum values being 1.96℃ and 0.03℃ (Fig.12f); and most of S RMSE is greater than 0.1, with maximum and minimum values being 0.35 and 0.01 (Fig.13f), respectively, in 2018.

Conclusions
This literature illustrates all the updates applied on SCSOFSv1 in both physical model settings, inputs 540 and EnOI data assimilation scheme in recent couple years based on the recommendations of Zhu et al.

580
Although SCSOFSv2 has improved greatly comparing to the previous versions, some biases still exist in surface and subsurface. We need to continue to improve the system in both physical model settings and data assimilation scheme for next step, such as sub-grid parameterization scheme for unresolved physical processes, vertical turbulent mixing scheme to consider wave mixing, more accurate input and forcing data source, and assimilating more or new types of observations (glider or mooring T/S, drifting buoys, 585 in-situ velocity from moorings) into the system. Competing interests. The authors declare that they have no conflict of interest.
41806003. We would like to thank the anonymous reviewers for their careful reading of the manuscript and for providing constructive comments to improve the manuscript.