Seasonal extrema of sea surface temperature in CMIP6 models

CMIP6 model sea surface temperature (SST) seasonal extrema averaged over 1981–2010 are assessed against the World Ocean Atlas (WOA18) observational climatology. We propose a mask to identify and exclude regions of large differences between three commonly used climatologies (WOA18, WOCE-Argo Global Hydrographic climatology (WAGHC) and the Hadley Centre Sea Ice and Sea Surface Temperature data set (HadISST)). The biases in SST seasonal extrema are largely consistent with the annual mean SST biases. However, the amplitude and spatial pattern of SST bias vary seasonally in the 20 CMIP6 models assessed. Large seasonal variations in the SST bias occur in eastern boundary upwelling regions, polar regions, the North Pacific and the eastern equatorial Atlantic. These results demonstrate the importance of evaluating model performance not simply against annual mean properties. Models with greater vertical resolution in their ocean component typically demonstrate better representation of SST extrema, particularly seasonal maximum SST. No significant relationship of SST seasonal extrema with horizontal ocean model resolution is found.

Table 1. The 20 CMIP6 models used in this study; the horizontal resolution of their ocean component; ocean vertical coordinate (z: traditional height coordinate; z*: rescaled height coordinate for more accurate representation of free-surface variations; ρ: isopycnic coordinate; σ: terrain-following sigma coordinate; multiple symbols refer to a hybrid coordinate); total number of ocean vertical levels; thickness of the ocean top grid cell; and references. [Table body not recovered; legible model entries include ACCESS-ESM1-5, AWI-CM-1-1-MR, BCC-CSM2-MR, BCC-ESM1, CESM2 and CanESM5.]

[...] 30° and 70° than at other latitudes. The larger biases, and greater difference between T_max and T_min, at mid-high latitudes (greater than 30° in both hemispheres) may be explained by the large seasonal cycle of mixed layer depth there. Shallower summer mixed layers have smaller heat capacity, so a small error in heat fluxes or mixing processes can result in a large bias in T_max, though this will be modulated by any seasonal biases in mixed layer depth. The larger inter-model biases in T_max than in T_min can likewise be explained by the shallower mixed layer in summer, which amplifies SST biases caused by biases in surface heat flux. The difference between biases in T_max and T_min leads to biases in T_cycle (Fig. 1d). The RMSE of T_cycle at low latitudes is typically 1 °C, whereas at mid-high latitudes it is larger, particularly in the Northern Hemisphere (Fig. 8c).

In the subtropical North Pacific, the SST cold bias is typically 0.5–1 °C smaller in T_max than T_min, which leads to a too large T_cycle (Figs. 1b–d, 2, 3). Zhu et al. (2020) showed a similar seasonal SST cold bias in the CMIP6 multi-model mean, but not in the CMIP5 multi-model mean. Underestimated surface shortwave radiation and too strong westerly winds in the CMIP6 multi-model mean (Lyu et al., 2020; Li et al., 2020) are possible reasons for the year-round cold bias. The shortwave radiation bias is likely related to the bias of low-level cloud in the subtropics (Burls et al., 2017; Li and Xie, 2012), and its associated cold bias is smaller in winter when there is less solar radiation. The westerly winds cool the surface through latent heat flux and southward ocean advection due to Ekman transport. The latent heat loss shows a maximum in summer (Yu, 2007), while the ocean heat advection shows a maximum in winter when meridional SST gradients are greatest.
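
The mixed-layer argument made earlier in this section (a shallow summer mixed layer has a small heat capacity, so a given heat-flux error produces a larger SST bias) can be illustrated with a slab calculation. The following is a minimal sketch, not part of the original analysis: it assumes a well-mixed slab with no damping feedback, typical seawater density and specific heat, and purely illustrative values for the flux error (10 W m^-2), the mixed layer depths (20 m and 200 m) and the duration (30 days).

```python
# Illustrative slab mixed-layer estimate; all numerical inputs are assumptions.
RHO_SEAWATER = 1025.0    # kg m^-3, typical seawater density (assumed)
CP_SEAWATER = 3990.0     # J kg^-1 K^-1, typical specific heat (assumed)
SECONDS_PER_DAY = 86400.0


def sst_drift(flux_error_w_m2, mixed_layer_depth_m, days):
    """Temperature change (K) of a well-mixed slab forced by a constant flux error.

    Uses dT = Q * dt / (rho * cp * h) and ignores any damping feedback.
    """
    heat_capacity = RHO_SEAWATER * CP_SEAWATER * mixed_layer_depth_m  # J m^-2 K^-1
    return flux_error_w_m2 * days * SECONDS_PER_DAY / heat_capacity


# Same hypothetical flux error applied to a shallow and a deep mixed layer.
summer_drift = sst_drift(10.0, 20.0, 30.0)    # shallow (summer-like) layer
winter_drift = sst_drift(10.0, 200.0, 30.0)   # deep (winter-like) layer
print(f"shallow layer: {summer_drift:.2f} K, deep layer: {winter_drift:.3f} K")
```

Because the drift scales as 1/h, the tenfold difference in depth gives a tenfold difference in SST error for the same flux error, which is the sense in which T_max is more sensitive than T_min.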

For the multi-model mean in the northeastern Pacific, there is a warm bias in T_max which exceeds 2 °C and a cold bias in T_min of 0.5–1.5 °C. Similar seasonal biases exist in CMIP5 models and were linked to an easterly wind bias throughout the year there (Song and Zhang, 2020). A coarse atmospheric model resolution smooths out the elevation difference between mountains and oceans, which allows easterly trade winds to cross the mountains, leading to the easterly wind bias (Song and Zhang, 2020). An easterly bias of annual mean wind was found in the CMIP6 multi-model mean (Li et al., 2020; Lyu et al., 2020). If the easterly bias exists throughout the year, it can explain the seasonal SST bias we found. During winter-spring, the northeastern Pacific ITCZ is dominated by easterly winds, so overly strong easterly winds enhance surface evaporation and lead to cold biases. In contrast, during summer-autumn when westerly winds dominate, the simulated wind is too weak, which causes the warm bias. The northeastern Pacific is a region where tropical cyclones and heatwaves occur (Gilford et al., 2017; Frölicher and Laufkötter, 2018), so a warm bias of over 2 °C in T_max may lead to overprediction of tropical cyclones and heatwaves.

The multi-model mean has a cold bias in T_max and a warm bias in T_min over the Northwest Pacific, leading to a too small T_cycle (bias of more than 2 °C) (Figs. 1b–d). The warm bias in winter can be seen in many models, especially in ACCESS-ESM1-5, BCC-ESM1, CanESM5 and INM-CM5-0 (Fig. 3). The winter warm bias east of Japan was also found in a CMIP5 multi-model mean (Wang et al., 2018), but from our results the warm bias extends further east (Fig. 1c).

[...] radiation is negligible at high latitudes in winter, the SST cold bias due to cloud bias is much smaller in winter than in summer, consistent with our results. Deep winter mixed layer depths and SSTs close to freezing likely also contribute to the smaller cold biases in T_min than in T_max at high latitudes.
In most models there is a warm T_mean bias in the Southern Ocean, commonly attributed to excessive shortwave radiation [...]

In eastern boundary upwelling regions (especially the Benguela and Humboldt Currents), most models have a seasonal warm bias that is 1–5 °C smaller in T_max than T_min (Figs. 1b–c, 2, 3). Richter (2015) suggested that underestimation of stratocumulus cloud and insufficient upwelling due to overly weak winds contribute to the warm bias in eastern boundary upwelling regions. The warm bias we found is therefore likely associated with the underestimated surface shortwave radiation and overly weak upwelling-favourable winds in CMIP6 models identified by Li et al. (2020). The warm bias may lead to excessive precipitation in the Atlantic Ocean off Angola and Namibia as shown by Rouault et al. (2003). Letelier et al. (2009) showed that in the Humboldt Current coastal region the cooling effect of upwelling is strongest in austral summer, which is consistent with the peak of upwelling-favourable wind in December and January. A poor simulation of the seasonal cloud and upwelling processes will contribute to the seasonality of SST biases in eastern boundary upwelling regions.

Most models have a seasonal warm SST bias in the eastern equatorial Atlantic (Figs. 1b–c, 2 and 3). The T_min multi-model mean bias can be more than 2 °C larger than the T_max multi-model mean bias. Richter and Tokinaga (2020) showed a similar seasonal warm bias in the CMIP6 multi-model mean, which is about 1–2 °C larger during June-July-August than March-April-May. Richter et al. (2012) argued that the warm SST bias in the eastern equatorial Atlantic during June-July-August is linked to overly deep thermoclines caused by overly weak easterlies during March-April-May. Therefore, the warm bias can be attributed to overly weak easterlies in the CMIP6 multi-model mean (Li et al., 2020; Lyu et al., 2020). GISS-E2-1-G and GISS-E2-1-H have the largest seasonality of SST warm bias in the eastern equatorial Atlantic, with T_min biases up to 5 °C. Richter and Tokinaga (2020) illustrated that warmer than observed SSTs in the equatorial Atlantic lead to excessive precipitation. Roxy (2014) quantified the SST-precipitation relationship: a 1 °C SST increase corresponds to a 2 mm/day precipitation increase. By this scaling, the 5 °C T_min warm bias in GISS-E2-1-G and GISS-E2-1-H could cause a 10 mm/day increase in precipitation.

Although the amplitudes of biases are different in T_max and T_min, the global patterns and signs of T_max and T_min biases are similar to each other in most models (Figs. 2, 3). Wang et al. (2014) indicated that the SST bias of the CMIP5 multi-model mean has a pattern independent of season but did not analyse the seasonality of the bias in individual models. Our results show two exceptions: E3SM-1-0 and IPSL-CM6A-LR, which both have an overall warm bias in T_max but an overall cold bias in T_min (Figs. 2h,t, 3h,t), which tend to cancel out in the annual means. The T_max RMSE is 1.38 °C for E3SM-1-0 and 1.36 °C for IPSL-CM6A-LR, and the T_min RMSE is 1.39 °C for E3SM-1-0 and 1.21 °C for IPSL-CM6A-LR, whereas the T_mean RMSE is only 1.17 °C for E3SM-1-0 and 0.94 °C for IPSL-CM6A-LR. In E3SM-1-0, the global annual average mixed layer depth is generally too shallow (Golaz et al., 2019), which can contribute to the summer SST warm bias and winter SST cold bias, and a similar process may be affecting IPSL-CM6A-LR. These results illustrate the risks involved in assessing only annual means: models may have greater seasonal biases than the annual mean suggests, so tropical cyclone formation, for example, may be overpredicted.

In mid-latitudes the SST seasonal cycle is well represented by an annual sinusoid, whereas in equatorial and polar regions an annual sinusoid explains little of the total SST seasonal variance (Trenberth, 1983; Yashayaev and Zveryaev, 2001). [...] within 1 month (Figs. 4, 5, 6). In subtropical regions, seasonal SST biases are consistent with biases in T_mean. Differences between the T_max and T_min biases are smaller than those in non-sinusoidal regions (Fig. 9). In regions with non-sinusoidal SST seasonal cycles such as the western equatorial Pacific, northwestern Indian Ocean, the Arctic and the Antarctic (where the sinusoidal signal explains 33%, 23%, 58% and 46% of the observed variance, respectively), models tend to have biases in amplitudes or phases of their SST seasonal cycles (Figs. 4, 5, 6, [...]

In the western equatorial Pacific, the SST seasonal cycle in WOA18 is modest (less than 1 °C), whereas in some models such as MPI-ESM1-2-HR, GISS-E2-1-G, GISS-E2-1-H and especially INM-CM5-0 the seasonal cycle is much larger (Fig. 9a). In INM-CM5-0, the T_cycle is about 2 °C and there is a cold SST bias throughout the year, reaching 3 °C during September-October-November (Fig. 9a). Similar to our analysis, Volodin et al. (2017) [...]
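
The statement that an annual sinusoid explains a given fraction of the SST seasonal variance can be made concrete with a single-harmonic fit. The sketch below is one plausible way to compute such a fraction, not necessarily the method used here: it assumes a 12-value monthly climatology, treats months as equally spaced, and the function name `annual_harmonic_fraction` is introduced purely for illustration.

```python
import math


def annual_harmonic_fraction(monthly_sst):
    """Fraction of seasonal variance explained by one 12-month sinusoid.

    Assumes `monthly_sst` holds 12 equally spaced climatological means.
    """
    n = len(monthly_sst)
    mean = sum(monthly_sst) / n
    anomalies = [t - mean for t in monthly_sst]
    # Project the anomalies onto the annual cosine and sine; on evenly spaced
    # samples these are orthogonal, so this equals the least-squares fit.
    a = 2.0 / n * sum(x * math.cos(2 * math.pi * k / n) for k, x in enumerate(anomalies))
    b = 2.0 / n * sum(x * math.sin(2 * math.pi * k / n) for k, x in enumerate(anomalies))
    total_variance = sum(x * x for x in anomalies) / n
    harmonic_variance = 0.5 * (a * a + b * b)
    return harmonic_variance / total_variance if total_variance > 0 else 0.0


# Synthetic examples (not observational data): a pure annual cycle and a
# cycle with equal annual and semi-annual amplitudes.
annual = [math.cos(2 * math.pi * k / 12) for k in range(12)]
mixed = [math.cos(2 * math.pi * k / 12) + math.cos(4 * math.pi * k / 12) for k in range(12)]
print(annual_harmonic_fraction(annual), annual_harmonic_fraction(mixed))
```

A semi-annual cycle such as that described for monsoon regions would score low on this measure even if its amplitude were large, which is why a low explained fraction flags a non-sinusoidal regime rather than a weak seasonal cycle.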

In the northwestern Indian Ocean where the monsoon system prevails, SST has a semi-annual cycle, but most models are unable to reproduce this with the correct amplitude and phase (Figs. 4, 5, 6, 9b). Most CMIP6 models have SST cold biases in this region throughout the year; the biases are generally larger during March-April-May than in other months, and the multi-model mean fails to simulate the primary maximum SST (Fig. 9b). Cold SST biases in the northwestern Indian Ocean lead to a significant reduction of the monsoon rainfall over the Indian subcontinent (Prodhomme et al., 2014; Levine and Turner, 2012). [...] in May. GISS-E2-1-G and GISS-E2-1-H fail to simulate a realistic second minimum SST in August (Fig. 9b), which would lead to overly intense tropical cyclones. SST in the northwestern Indian Ocean determines the onset of the summer monsoon (Sijikumar and Rajeev, 2012; Jiang and Li, 2011). The primary maximum SST is two months later in ACCESS-ESM1-5 than in WOA18 (Fig. 9b), which suggests a delayed summer monsoon onset in projections using that model.

Impact of model characteristics on SST seasonal extrema
We have shown that biases in T_max, T_min and T_cycle are different between models. We now use the diversity in the 20 CMIP6 models to explore the effects of different model characteristics on the magnitude of these biases as quantified by global area-weighted RMSE for T_max, T_min, T_cycle and T_mean.
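
As a rough illustration of the metrics named above, the sketch below derives T_max, T_min, T_cycle and T_mean from a monthly climatology and computes an area-weighted RMSE between two gridded fields. It is a simplified stand-in, not the study's code: it assumes the extrema are taken from a 12-month climatology, that T_mean is the unweighted mean of the 12 months, and that the grid is a regular latitude-longitude grid so that cos(latitude) is an adequate area weight. The function names and the use of `None` for masked cells are illustrative choices.

```python
import math


def seasonal_extrema(monthly_sst):
    """Return (T_max, T_min, T_cycle, T_mean) from 12 monthly climatological means."""
    t_max = max(monthly_sst)
    t_min = min(monthly_sst)
    t_mean = sum(monthly_sst) / len(monthly_sst)
    return t_max, t_min, t_max - t_min, t_mean


def area_weighted_rmse(model, obs, lats_deg):
    """RMSE over a regular lat-lon grid, weighting each cell by cos(latitude).

    `model` and `obs` are lists of rows, one row per latitude in `lats_deg`.
    Cells set to None (land, or cells excluded by a mask) are skipped.
    """
    weighted_sq_error = 0.0
    weight_sum = 0.0
    for model_row, obs_row, lat in zip(model, obs, lats_deg):
        weight = math.cos(math.radians(lat))
        for m, o in zip(model_row, obs_row):
            if m is None or o is None:
                continue
            weighted_sq_error += weight * (m - o) ** 2
            weight_sum += weight
    return math.sqrt(weighted_sq_error / weight_sum)


# Tiny synthetic grid (not real data): two latitudes, two longitudes.
model_field = [[1.0, None], [2.0, 2.0]]
obs_field = [[0.0, 0.0], [2.0, 2.0]]
rmse = area_weighted_rmse(model_field, obs_field, [0.0, 60.0])
print(round(rmse, 4))
```

In practice the same calculation would be applied to the T_max, T_min, T_cycle and T_mean bias fields in turn, after regridding model and observations to a common grid and applying any observational-uncertainty mask.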
No significant correlation was found between the models' seasonal biases and horizontal ocean resolution (supplementary [...]). [...] their representation of global SST. They found that enhanced horizontal resolution does not deliver unambiguous SST bias improvement in all regions for all models, which is consistent with our finding. Nor did we find any correlation of seasonal [...]

The only characteristic yielding a statistically significant relationship was the ocean vertical resolution (Figs. 10, 11). The importance of vertical resolution for reducing seasonal biases is not unexpected: SST is influenced by ocean stratification and ocean vertical mixing processes, whose representation depends upon the vertical resolution. It has been found that high resolution in the upper ocean is important for the representation of diurnal and intraseasonal SST variability in ocean general circulation models (Misra et al., 2008; Xavier et al., 2008; Ge et al., 2017). [...] unambiguously determined for models using an isopycnal or sigma vertical coordinate (6 out of 20 in our study), as their level depths vary with location and time (Bleck, 2002; Shchepetkin and McWilliams, 2005). Excluding the isopycnal and sigma models, the remaining high vertical resolution models are mainly from the Met Office Hadley Centre family, and hence any relationship between SST biases and vertical resolution in the upper ocean might have been overly influenced by that particular family. Hence we use the total number of vertical levels and top grid cell thickness (Table 1) as proxies for the vertical resolution.

Our study emphasises the importance of vertical resolution for simulating seasonal extreme SST and annual mean SST.
For the 20 models, there is a decrease in bias with increasing total number of vertical levels (Fig. 10). We calculated the inter-model correlation between global RMSE and total number of vertical levels following the method of Wang et al. (2014).
The relationship between SST biases and total number of vertical levels is significant for T_max, T_min and T_mean (p-values < 0.05), with the largest correlation of -0.648 for T_max. The higher correlation between global T_max RMSE and ocean vertical [...] thickness (but with a smaller correlation than for total number of vertical levels): models with a smaller top grid cell thickness tend to have smaller biases (Fig. 11).
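
The significance of an inter-model correlation such as the -0.648 quoted above can be checked with a standard t-test on the correlation coefficient. The sketch below assumes a plain Pearson correlation across the 20 models and a two-sided test at the 5% level; whether this matches the method of Wang et al. (2014) in detail is an assumption. The critical value 2.101 is the tabulated two-sided 5% value of Student's t for 18 degrees of freedom.

```python
import math

# Two-sided 5% critical value of Student's t for 18 degrees of freedom
# (n = 20 models); standard table value.
T_CRIT_DF18 = 2.101


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing a correlation r from n samples against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Apply the test to the correlation reported for T_max across 20 models.
t_tmax = t_statistic(-0.648, 20)
is_significant = abs(t_tmax) > T_CRIT_DF18
print(f"t = {t_tmax:.2f}, significant at 5%: {is_significant}")
```

With 20 models, any correlation whose magnitude exceeds about 0.44 passes this test, so the reported value is comfortably significant under these assumptions. `pearson_r` is included so the same check can be run on per-model RMSE values and level counts.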
The impact of ocean vertical resolution on SST biases varies with latitude and season. Ocean vertical resolution is most important for T_max at low latitudes [...]. SST biases decrease with number of vertical levels in the Benguela, Humboldt and California upwelling regions (supplementary Figs. S10–12). Only the Canary upwelling region, which has the smallest SST bias among the four main eastern boundary upwelling regions, does not have a good inter-model correlation between SST biases and ocean vertical resolution (supplementary Fig. S13).