Extreme Sea Levels in the Baltic Sea under Climate Change Scenarios. Part 1: Model Validation and Sensitivity

We analyze extreme sea levels (ESLs) and related uncertainty in an ensemble of regional climate change scenarios for the Baltic Sea. The ERA40 reanalysis and five Coupled Model Intercomparison Project Phase 5 (CMIP5) global general circulation models (GCMs) have been dynamically downscaled with the coupled atmosphere-ice-ocean model RCA4-NEMO. The 100-year return levels along the Swedish coast in the ERA40 hindcast are within the 95% confidence limits of the observational estimates, except those on the west coast. The ensemble mean of the 100-year return levels averaged over the five GCMs shows biases of less than 10 cm. A series of sensitivity studies explores how the choice of different parameterizations, open boundary conditions and atmospheric forcing affects the estimates of 100-year return levels. A small ensemble of different regional climate models (RCMs) forced with ERA40 shows the highest uncertainty in ESLs in the southwestern Baltic Sea and in the northeastern part of the Bothnian Bay. Some regions like the Skagerrak, Gulf of Finland and Gulf of Riga are sensitive to the choice of the RCM. A second ensemble of one RCM forced with different GCMs uncovers a lower sensitivity of ESLs against the variance introduced by different GCMs. The uncertainty in the estimates of 100-year return levels introduced by GCMs ranges from 20 cm to 40 cm at different stations and includes the estimates based on observations. It is of similar size as the 95% confidence limits of 100-year return levels from tide gauge records.

In the past centuries changes in sea levels in the Baltic Sea have forced communities to build new harbors or move settlements (Ekman, 2009). Sea level fell relative to the land due to the glacial isostatic adjustment (GIA). That led to harbors falling dry, with economic impact on local societies. The land is still rising since the last ice age. Today's estimates for the land uplift relative to the geoid range from -0.2 mm/a for the German and Polish coasts up to 9 mm/a at Höga Kusten (Jivall et al., 2016; Ågren and Svensson, 2011).
Since the beginning of industrialization the global warming trend has caused an accelerating global mean sea level (GMSL) rise (Church et al., 2013). Church et al. (2013) give an average GMSL rise of 3.2 mm/a for the period 1993 to 2009. The main contribution to GMSL rise has been the expansion of the warming water in the global oceans. Melt water from glaciers and ice sheets, which increases the amount of water in the global ocean, has contributed another one third to the GMSL rise. To assess possible trajectories of climate change and related GMSL rise the CMIP5 project has coordinated an ensemble of model runs with GCMs. These models take into account, apart from natural forcing, representative concentration pathways (RCPs) that describe how much extra warming is projected at the end of the 21st century (van Vuuren et al., 2011). This ensemble of global climate scenarios is extensively discussed in the fifth assessment report (AR5) of the IPCC. The GMSL rise in the year 2100 relative to the period 1986 to 2005 ranges from 44 cm (RCP2.6) to 74 cm (RCP8.5), according to Church et al. (2013). The uncertainty for those estimates across the RCPs ranges from 28 cm (RCP2.6) to 98 cm (RCP8.5).
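The interplay of land uplift and GMSL rise described above can be illustrated with a small arithmetic sketch. It assumes that the uplift figures quoted in the text are in mm/a and that the local sea level change relative to land is simply the GMSL rate minus the uplift rate, which neglects all other regional contributions. The function name is hypothetical and chosen for illustration only.

```python
def relative_sea_level_rate(gmsl_rate_mm_per_year, land_uplift_mm_per_year):
    """Rate of sea level change relative to the land in mm/a.

    Positive values mean the sea rises relative to the land,
    negative values mean the land outpaces the sea.
    Assumption: only GMSL rise and land uplift are considered.
    """
    return gmsl_rate_mm_per_year - land_uplift_mm_per_year


# GMSL rate for 1993 to 2009 as quoted from Church et al. (2013).
GMSL_RATE = 3.2  # mm/a

# Uplift extremes quoted in the text; the unit mm/a is an assumption.
uplift_by_region = {
    "German and Polish coasts": -0.2,
    "Höga Kusten": 9.0,
}

for region, uplift in uplift_by_region.items():
    rate = relative_sea_level_rate(GMSL_RATE, uplift)
    print(f"{region}: {rate:+.1f} mm/a relative to land")
```

Under these assumptions the southern coasts see a net rise while Höga Kusten still sees a net fall, which is why a zero line of the combined effect exists somewhere in between.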
warming the zero line of the combined effect would shift south-eastwards. With the most recent estimates including high end and extreme scenarios (e.g. Sweet et al., 2017) GMSL rise could also reach 250 cm in the year 2100, in which case there will be sea level rise relative to land all around the Baltic Sea.
Another factor that determines the mean sea level (MSL) of the Baltic Sea is related to the large-scale atmospheric circulation over the North Atlantic. Kauker and Meier (2003) have found a good correlation of the zonal wind component with the sea level at station Landsort. The sea level at Landsort is a good measure for the volume of water (or the averaged mean sea level) in the Baltic Sea (Matthäus and Franck, 1992). For the interannual variations Andersson (2002) has shown that sea level variations in Stockholm correlate significantly with the NAO index. For positive phases of the NAO, which are characterized by a more zonal and stronger atmospheric circulation, the MSL of the Baltic Sea is expected to rise. According to AR5 the NAO is likely to become slightly more positive under projected climate change. That would translate to a possible rise of the MSL of the Baltic Sea. Recently, Karabil et al. (2018) found good correlation of interannual and decadal sea level variability in the Baltic Sea with the BANOS index, which reflects more closely the variability in geostrophic wind in the entrance region of the Baltic Sea.
It has long been known (Ekman, 2009) that the sea level in the Baltic Sea is highest during winter. Samuelsson and Stigebrandt (1996) have shown that on seasonal and shorter time scales sea level variations in the Baltic Sea are caused by large-scale atmospheric circulation patterns. Together with a potential increase in positive NAO phases and a concurrent increase in the strength of low pressure systems (Schneidereit et al., 2007; Pinto et al., 2009) higher ESLs in the Baltic Sea during winter must be anticipated.
However, Meier (2006) has found that ESLs may rise faster than MSL even without significant changes in the wind field in downscaled projections of the Baltic Sea.
Analyses of ESLs by Weisse et al. (2014) at specific locations along the European coast, including the Baltic Sea, have shown an increase in the past 100 years. Their projections show a continuing increase of ESLs with MSL rise being the main contributor. They expect decadal variability to contribute to ESL changes in the near future. Vousdoukas et al. (2016) have projected ESLs for the entire coastline of Europe using the bias corrected output of a shallow water model driven with an ensemble of eight CMIP5 models and two RCP scenarios. Both Vousdoukas et al. (2016) and Wahl et al. (2017) discuss the uncertainty of ESLs introduced by the method used to estimate sea levels with long return periods. Wahl et al. (2017) also relate the uncertainty of the methodology to the uncertainty introduced by sea level rise (SLR) scenarios and conclude that, especially for the near future, the uncertainty from the choice of the method dominates. While Wahl et al. (2017) present a global analysis, Eelsalu et al. (2014) have shown for the Estonian coast that no method for extreme value estimation was capable of accommodating all observed and hindcast extremes and that the spread among different methods can be substantial.
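To make the notion of a return level estimated from block maxima concrete, the following is a minimal sketch using only the Python standard library. It is not the estimator used in this study: instead of a full GEV fit it uses the Gumbel distribution (the GEV with zero shape parameter) fitted by the method of moments, which is a deliberate simplification. The annual maxima are hypothetical numbers for illustration and all function names are assumptions.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Fit Gumbel location and scale by the method of moments.

    Uses std = scale * pi / sqrt(6) and mean = location + gamma * scale.
    """
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_return_level(location, scale, return_period_years):
    """Level exceeded on average once per return period (in years)."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual sea level maxima in cm (one value per year).
annual_maxima_cm = [62, 75, 58, 91, 70, 66, 84, 103, 73, 68, 79, 88, 61, 95, 72]

location, scale = fit_gumbel_moments(annual_maxima_cm)
for period in (20, 100):
    level = gumbel_return_level(location, scale, period)
    print(f"{period}-year return level: {level:.0f} cm")
```

Swapping the distribution or the fitting method in such a sketch changes the 100-year estimate, which is the kind of method-related spread discussed by Wahl et al. (2017) and Eelsalu et al. (2014).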
A number of modeling studies have focused on ESLs in the Baltic Sea. In one of them, two SRES scenarios (Special Report on Emission Scenarios) were downscaled with two different GCMs, and large uncertainties in ESLs were found both from the use of different GCMs and from the use of different SLR scenarios. Kowalewski and Kowalewska-Kalkowska (2017) showed that, in general, modeled sea level variability in the Baltic Sea can be improved by an increase in resolution. Gräwe and Burchard (2012) used a high resolution model for the western Baltic Sea and showed that the increased resolution (∼1 km) allowed the realistic simulation of extremes in the Danish Straits. They also showed that MSL rise causes a nonlinear response in sea level extremes of up to O(10 cm) in shallow and narrow locations in the western and southern Baltic Sea. Hieronymus et al. (2017) investigated the contribution of various forcing mechanisms to the sea level in the North Sea and Baltic Sea and showed that contributions from local wind forcing, atmospheric pressure, as well as remote sea level forcing are important for the Baltic sea levels, and that they interact in a non-linear way to increase the variability. They also showed that the influence of external sea level forcing on periods less than 50 days is damped inside the Baltic Sea.
One advantage of regional models over global models is the higher resolution that can be used to resolve orography and bathymetry. The atmospheric and oceanic dynamics that interact with the regional features give rise to the specific characteristics of the region (e.g. Stein and Alpert, 1993; Feser et al., 2011; Jeworrek et al., 2017). To faithfully model sea level dynamics in the Baltic Sea, Kattegat and Skagerrak, a reasonable representation of the driving agents wind and pressure is a minimum requirement. The atmosphere component RCA4 has been shown to yield a good climate compared to observational data sets (Kjellström et al., 2016; Strandberg et al., 2014).
Wind from an A1B scenario downscaled with RCA4-NEMO has been analyzed and compared to other RCM results by Ganske et al. (2016). They found low wind speeds in RCA4-NEMO in the 99th percentile for the North Sea compared to other RCMs. Dieterich et al. (2013) have shown that the mean wind speed in RCA4-NEMO compares well with observations. Gröger et al. (2015) compared wind speed from RCA4-NEMO with corresponding values from an uncoupled run with RCA4. The largest improvements in wind speed in the coupled model were found in the winter season in regions where the Baltic Sea is covered with sea ice. In uncoupled RCA4 runs the SST is determined by the ocean component of global hindcast simulations that only coarsely resolves the Baltic Sea. This points to an added value of using a coupled model for modeling sea level in the Baltic Sea. That is especially true for ESLs that are caused by storms, predominantly in winter time (Samuelsson and Stigebrandt, 1996), when air-sea interaction is underrepresented.
This study aims at an analysis of ESLs for the Baltic Sea, Kattegat and Skagerrak based on model simulations for the past and future climates. The model used in this study is a regional atmosphere-ice-ocean model that was used to downscale global model simulations. Other aspects of this model ensemble have been discussed previously: major Baltic inflows (Schimanke et al., 2014), air-sea coupling, changes in wind speed and direction (Ganske et al., 2016), snow-bands (Jeworrek et al., 2017), model intercomparison (Pätsch et al., 2017), changes in heat fluxes, and changes in stratification.
A regional model comes at the expense of having to formulate boundary conditions that allow information from the global atmosphere and the global ocean to enter the model domain. The treatment of the open boundaries follows the strategies laid out in Dieterich et al. (2019).
The sea surface height (SSH) along the open boundaries of the ocean component determines the averaged SSH in the regional model domain. Together with the atmospheric forcing, the SSH information on the open boundaries also contributes to the sea surface variability on time-scales from hours (Büchmann et al., 2011) to decades (Karabil et al., 2017).
To represent the tides in the regional model, 11 harmonic constituents from the global tidal model at the Oregon State University (Egbert et al., 2010) are used. The monthly SSH prescribed along the open boundaries is derived from the global solutions of the OGCMs and transfers the information of seasonal, interannual and decadal SSH variability from the global to the regional scale. Details of the procedure are described in Dieterich et al. (2019). The varying SSH of the OGCMs in the northern North Sea represents characteristics of the regional circulation. A high SSH along the European shelf might indicate a weakening North Atlantic Current in the global model (e.g. Saenko et al., 2017). This leads to different averaged SSHs in the regional model, which in turn might interact with sea level dynamics on a more local scale (e.g. Gräwe and Burchard, 2012; Pelling et al., 2013). The water level in the Kattegat and the Danish Straits also has consequences for the ventilation and the ecosystem of the Baltic Sea (e.g. Hordoir et al., 2015; Arneborg, 2016; Meier et al., 2016).
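How a slowly varying boundary SSH and tidal harmonic constituents can combine into one boundary signal may be sketched as follows. This is an illustration only, not the RCA4-NEMO implementation: it assumes a simple linear superposition, uses just two of the 11 constituents (M2 and S2, with their standard periods), and the amplitudes, phases and function name are hypothetical.

```python
import math

# Periods in hours of two major semidiurnal constituents.
CONSTITUENT_PERIOD_HOURS = {"M2": 12.4206012, "S2": 12.0}


def boundary_ssh(t_hours, monthly_mean_ssh_m, constituents):
    """SSH at an open boundary point at time t_hours.

    constituents maps a constituent name to (amplitude_m, phase_rad).
    Assumption: tides are added linearly to the monthly mean SSH.
    """
    ssh = monthly_mean_ssh_m
    for name, (amplitude, phase) in constituents.items():
        omega = 2.0 * math.pi / CONSTITUENT_PERIOD_HOURS[name]
        ssh += amplitude * math.cos(omega * t_hours - phase)
    return ssh


# Hypothetical amplitudes (m) and phases (rad) at one boundary point.
tides = {"M2": (0.40, 0.3), "S2": (0.15, 1.1)}

for hour in range(0, 25, 6):
    print(f"t = {hour:2d} h: SSH = {boundary_ssh(hour, 0.10, tides):+.2f} m")
```

In such a setup the monthly mean carries the seasonal to decadal signal of the global model, while the harmonic sum adds the hourly variability.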

In order to obtain an ensemble of sea level solutions for the present climate, we have downscaled the historical periods of five CMIP5 GCMs. The RCA4-NEMO runs discussed in the next sections are summarized in Table 1. This small ensemble offers a first insight into the uncertainty that is generated by different large scale conditions, represented by the GCMs.
To put the uncertainty that is inherent in the RCA4-NEMO ensemble forced with different GCMs into perspective, a second group of experiments is analyzed that uses one GCM but different RCMs. These experiments are listed in Table 2. The RCMs are not independent of each other, but originate from different model setups that are used at the SMHI. The first five setups, except RCA4-NEMO-alt, use the same ocean component NEMO-Nordic. RCA4-NEMO-alt differs from the standard experiment RCA4-NEMO ERA40 by using a different ocean component. One of the relevant differences is lateral mixing along geopotential surfaces instead of isopycnic ones (Höglund et al., 2017). Also, the alternative NEMO-Nordic uses mixing coefficients according to Smagorinsky (1963). The bottom friction is larger and lateral walls impose a free-slip condition. The model setups RCA4-NEMO-1hr and RCA4-NEMO-50km differ from RCA4-NEMO by a 1-hourly coupling and a 0.44° resolution RCA4, respectively. For more details on NEMO-Nordic 3.6 see Hordoir et al. (2019), and Höglund et al. (2017) for the model setup used in the STORMWINDS project.
Different RCMs have been forced with different reanalysis datasets and Table 3 gives an overview of these sensitivity experiments. Differences between the different RCMs may be larger than differences between different atmospheric forcing datasets.
That should be kept in mind when interpreting the results. NEMO-Nordic ERA-interim uses the ERA-interim reanalysis instead of the ERA40 reanalysis. Usually atmospheric forcing fields like wind and pressure are available every three hours. Either these fields are kept constant or they are linearly interpolated. In the sensitivity run NEMO-Nordic interpolated the forcing fields are linearly interpolated in time and the ocean model experiences a smooth change in the forcing variables. The model setup NEMO-Nordic 3.6 has been driven with two different atmospheric reanalyses. The EURO4M reanalysis uses an atmosphere model with a higher resolution and has been shown to improve on results of the ERA40 reanalysis (Dahlgren et al., 2016). The reference setup RCA4-NEMO ERA40 may be compared with the setup RCA4-NEMO-1hr ERA40, where the atmosphere and ice-ocean components exchange fluxes and surface temperatures every hour. This sensitivity study may answer the question whether a more frequent coupling than 3-hourly is necessary.
Table 3. Sensitivity experiments with different atmospheric forcing for different RCMs. RCA4-NEMO-1hr ERA40 is the same setup as RCA4-NEMO ERA40 but the atmosphere and ice-ocean components are coupled every hour. NEMO-Nordic ERA40 and NEMO-Nordic ERA-interim are two ocean-only setups forced with the output of RCA4 ERA40 and RCA4 ERA-interim, respectively. NEMO-Nordic interpolated is the same setup as NEMO-Nordic ERA40 but with linearly interpolated forcing (see text for more details). Table entry: RCA4-NEMO-1hr ERA40, 1961-2009, standard with 1-hourly coupling (Table 2).
NEMO-Nordic free-slip differs from NEMO-Nordic no-slip by the slip conditions along lateral walls. In NEMO-Nordic MSL the MSL is 58 cm higher compared to NEMO-Nordic interpolated.

Mean Sea Levels
The five different GCMs used for the regional downscaling exhibit different MSLs where the regional model domain has its open boundaries. For this reason, the MSL averaged over the regional model domain varies between -20 cm and 160 cm among different ensemble members. To match observed MSLs in the Baltic Sea the model results need to be adjusted, using as a reference. It has been determined with a linear regression using long-term observations (Hammarklint, 2009 Ekman and Mäkinen (1996). Compared to our model ensemble  apply observed SSH in the Kattegat as boundary condition to their model. We speculate that this is the main reason for the somewhat larger discrepancy between our model results and observational estimates.
ESLs are measured against the mean sea surface and it is therefore important to get the mean sea surface right. One confirmation that the ensemble is representative for sea level estimates is the fact that the mean sea surface of the ensemble mean differs only slightly from the one found in the hindcast simulation. The maximum difference between hindcast and ensemble mean is seen at station Spikarna and is less than 1 cm.
Since the five different GCMs plus the ERA40 reanalysis provide a range of distinct atmospheric conditions, it is unlikely that the atmospheric forcing is responsible for the biases seen in the modeled mean sea surface of the Baltic Sea. On the other hand, a sensitivity study has shown that an increase of 30% in wind speed increases the MSL gradient along the Swedish coast from Smögen to Furuögrund by around 4 cm. That would bring the three northernmost stations in Table 6 closer to the estimated mean sea surface based on the tide gauge network. This could indicate that the model system used here generally produces low wind speeds, at least when inferred from the mean sea surface of the Baltic Sea. In a second sensitivity study the river discharge to the Baltic Sea was increased by 34%. The resulting pattern is different from the one caused by an increased wind speed and would fit with the data presented in Fig. 1 and Table 6, because the Baltic Proper would show a mean sea surface around 1 cm higher than the Kattegat and the Bothnian Bay. Our modeled mean sea surface with positive biases in the Baltic Proper suggests that the model system has a fresh bias, which has been found by Dieterich et al. (2013, 2019).
The third lines in Table 7 show return levels estimated from an ocean-only model that has been driven with a downscaled ERA40 reanalysis. Except for the west coast the model setup NEMO-Nordic 3.6 ERA40 produces lower return levels than the coupled model RCA4-NEMO ERA40.
Generally, the model estimates of the 100-year return levels are lower than those from the observations. The same is true for the 20-year return levels (not shown). In the Bothnian Bay (Furuögrund and Ratan) and in the southern Baltic Sea (Kungsholmsfort and Klagshamn) the model estimates from the coupled model RCA4-NEMO ERA40 are within the confidence limits of the observational estimates.
In the central Baltic Sea (Stockholm and Landsort) all models estimate somewhat lower return levels than the observations would suggest.

Model Sensitivity
The light gray shading in Fig. 2 is the uncertainty generated by different RCMs. Comparing it with the colored shading shows that overall, the RCMs disagree more than the confidence limit of the GEV estimation. This is true however only for the mixed ensemble of coupled and uncoupled RCMs. Clearly, the two groups are clustered and the ensemble is not normally distributed.
The uncertainty within the first three coupled RCMs and the three uncoupled RCMs in Table 2 and Fig. 2 is much smaller.
ESLs in the Baltic Sea are sensitive to details of how physics and dynamics are implemented in the numerical model. It is well known that bottom friction has a major impact on the amplitude and phase of sea level variations (Gräwe and Burchard, 2012). In this section a series of sensitivity runs is presented that is set up to explore how ESLs depend on different aspects of model implementation and forcing.

Decadal Variability
From the relatively small spread (O(20 cm)) among the estimates in Fig. 3  Proper. That might have to do with the ice cover, which is known to be sensitive to decadal variability (Jevrejeva et al., 2003).
During positive phases of NAO the Baltic Sea tends to experience mild winters with less ice cover (Omstedt and Chen, 2001) together with stronger westerlies. This situation promotes the momentum transfer from the atmosphere to the ocean and the generation of storm surges.

Atmospheric Forcing
The influence of atmospheric forcing on ESLs is shown in Fig. 4. ESLs of seven different model runs are compared. However, there are different models involved as well that behave quite differently (cf. Fig. 2). What can be deduced from Fig. 4  In the Bothnian Bay at least a part of this difference needs to be attributed to the higher wind speeds found in the EURO4M data set (Dahlgren et al., 2016) and the coupled experiments. The experiment NEMO-Nordic interpolated confirms that with lower return levels compared to the regular experiment NEMO-Nordic ERA40, where the forcing was not interpolated between timesteps. In the latter case the stepwise forcing includes higher harmonics that generate high frequency gravity waves that add to the sea level extremes. As can be seen from Fig. 4 ESLs at Stockholm are O(40 cm) higher if high

Open Boundary Conditions

Model Parameters
The influence of the water depth, lateral friction and other model parameters on the 100-year return levels is shown in Fig. 6.
How the horizontal viscosity is represented in the model has an effect on ESLs. This can be deduced from the two experiments NEMO-Nordic viscous and NEMO-Nordic no-slip. The former uses harmonic viscosity along the geopotential surfaces and the latter uses viscosity coefficients calculated according to Smagorinsky (1963). In the case NEMO-Nordic no-slip the selective viscosity prevents the degradation of gradients near the coast, where the ESLs are measured, and leads to much higher ESLs.
The influence of MSL on ESLs is shown in Fig. 6 for experiments four and five. The experiment NEMO-Nordic MSL has a higher MSL (58 cm) than NEMO-Nordic interpolated. At none of the stations shown in Fig. 6 is there a marked effect on ESLs due to differing MSLs. This is in agreement with a study by Hieronymus et al. (2018) where the parameters of the GEV used to estimate ESLs do not change with MSL. A small effect can be seen at station Smögen where ESLs are higher with a lower MSL. This is in accordance with theory where shallower regions exhibit higher sea level signals (Pelling et al., 2013).
The last two experiments NEMO-Nordic E-HYPE and NEMO-Nordic discharge in Fig. 6 show the sensitivity of ESLs against the river discharge. NEMO-Nordic E-HYPE, with a freshwater input that is O(1500 m³/a) higher than in NEMO-Nordic discharge, shows a minor increase of ESLs in the Bothnian Bay. The freshwater signal is also visible in the MSL, which is 2 mm lower in the Bothnian Bay and 0.2 mm lower in the Baltic Proper. Since the spatial and temporal changes of the river discharge are different in the two experiments the effect of the higher freshwater input in NEMO-Nordic E-HYPE is masked by other processes.
In this section the GCM uncertainty is compared to the uncertainty that comes with the RCM. By RCM uncertainty we mean the range of solutions for ESLs that arises from the use of different model formulations or choice of parameters as shown in Fig. 2.

100-year Return Levels
In Fig. 7 the 100-year return levels for the five historical periods at nine sea level stations along the Swedish coast are compared to the results from the downscaled hindcast RCA4-NEMO ERA40 and to observational estimates. One argument for using an ensemble of model runs is to gain insight into the spread of possible solutions. Additionally, for a meaningful ensemble the ensemble mean should be better than individual model members. In this figure it becomes apparent that the ensemble mean is closer to the observational estimates. It is therefore not wrong to assume that the ensemble averaged ESLs are from the same distribution as the ones estimated from observations. The confidence limits for the 100-year return levels in Fig. 7 are based on how well the theoretical distributions can be approximated by the sampled ones. In the end it is a matter of how long the available timeseries are to estimate return levels.
Long time series are rare and are usually available from a few stations only. So, natural variability tends to be underrepresented. All estimates are based on the common historical period 1970 to 1999. Fig. 7 shows that 1.96 times the ensemble dispersion is larger than the 95% confidence limits for the GEV estimates for four stations. In the Baltic Proper and on the west coast the two measures are comparable. These are also the stations that are least sensitive to model formulation (Fig. 2) and atmospheric forcing (Fig. 4). At other stations the uncertainty in the ensemble could be reduced by increasing the number of ensemble members.
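The comparison between 1.96 times the ensemble dispersion and the 95% confidence limits of a single estimate can be sketched as below. The sketch assumes that the dispersion is the sample standard deviation across ensemble members and that the confidence limit is expressed as a half-width; both are assumptions, as are the function names. The return levels are hypothetical values, not results from this study.

```python
import statistics


def ensemble_range_95(return_levels_cm):
    """1.96 times the sample standard deviation across ensemble members.

    For normally distributed members this approximates the half-width
    of the interval that contains 95% of the ensemble.
    """
    return 1.96 * statistics.stdev(return_levels_cm)


def dispersion_exceeds_confidence(return_levels_cm, ci_half_width_cm):
    """True if the ensemble measure is larger than the confidence half-width."""
    return ensemble_range_95(return_levels_cm) > ci_half_width_cm


# Hypothetical 100-year return levels (cm) of five members at one station.
members_cm = [112.0, 131.0, 120.0, 143.0, 125.0]
# Hypothetical half-width (cm) of the 95% confidence interval of one fit.
ci_half_width_cm = 25.0

print(f"1.96 x dispersion: {ensemble_range_95(members_cm):.1f} cm")
print("larger than confidence limit:",
      dispersion_exceeds_confidence(members_cm, ci_half_width_cm))
```

With only five members the standard deviation itself is poorly constrained, which is one reason why adding ensemble members can reduce the uncertainty estimate.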

To be able to map the ESLs and their uncertainty for the whole Baltic Sea we turn to somewhat less extreme sea levels than the 100-year return levels. Figure 8 compares the warning levels used by the SMHI (Table 8) with the 99.9 percentile of ESLs. The individual estimates for the mean 99.9 percentile of ESLs (Fig. 8) agree very well with each other. 95% of all values are within no more than O(15 cm). This is the estimate of GCM uncertainty based on the 99.9 percentile. It is much lower than the 20 to 40 cm disagreement among the 100-year return levels in different GCMs (Fig. 7). The uncertainty estimates based on the 99.9 percentile are therefore minimum estimates for the uncertainty attributed to GCMs. Figure 8 indicates that the 99.9 percentile is close to warning level 1 used at the SMHI.
Table 8. Warning levels for the Baltic Sea, Kattegat and Skagerrak used at the SMHI (Schöld et al., 2017). If sea level is predicted to be higher than or equal to a specific warning level a public warning is issued. Warning levels are given relative to the mean sea level.
The 99.9 percentile of ESLs is shown in Fig. 9. It is between 70 cm and 120 cm higher than the mean sea surface.
The most extreme sea levels occur at the eastern end of the Gulf of Finland and the northern end of the Bothnian Bay. In the western part of the Kattegat, the southwestern Baltic Sea and the Gulf of Riga the 99.9 percentile is up to 100 cm higher than the mean sea surface.
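How a 99.9 percentile can be extracted from a sea level series and compared across ensemble members is sketched below. The sketch uses the nearest-rank percentile definition, which is an assumption since the text does not state which definition was used, and the series are synthetic random numbers standing in for model output. Function names are hypothetical.

```python
import math
import random


def percentile_nearest_rank(values, q):
    """Nearest-rank percentile of values for q in (0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(values)
    # Round before taking the ceiling to guard against floating-point
    # noise such as 999.0000000000001.
    rank = max(1, math.ceil(round(q * len(ordered) / 100.0, 9)))
    return ordered[rank - 1]


def ensemble_spread(percentiles_cm):
    """Difference between the highest and lowest member estimate."""
    return max(percentiles_cm) - min(percentiles_cm)


# Synthetic sea level anomalies (cm) for five hypothetical members.
members = {}
for seed in range(5):
    rng = random.Random(seed)
    members[f"member_{seed}"] = [rng.gauss(0.0, 25.0) for _ in range(20000)]

p999 = {name: percentile_nearest_rank(series, 99.9)
        for name, series in members.items()}
for name, value in p999.items():
    print(f"{name}: 99.9 percentile = {value:.1f} cm")
print(f"spread among members: {ensemble_spread(list(p999.values())):.1f} cm")
```

Because a 99.9 percentile of a long series is sampled far more often than a 100-year event, its estimate varies less between members, which is consistent with it giving a minimum estimate of the GCM uncertainty.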
Since the mean sea surface is increasing towards the east and the north the 99.9 percentile relative to bedrock shows a more are the same in these regions the disagreement comes from a higher setup in shallower coastal regions, which is more sensitive
The validation of the modeled ESLs has shown that the ensemble mean of the historical period compares well to observational estimates, except for the Swedish west coast (cf. Table 9). The modeled ESLs are lower than the ones inferred from observations. The results are however compatible with the assumption that the model generates the same distribution of ESLs as the observations. Table 9 also shows that the ensemble mean estimates are closer to the estimates based on observations than those calculated from the single ERA40 hindcast. By means of sensitivity studies model deficiencies could be identified. At station Smögen the model is sensitive to sea level variability that is generated in the North East Atlantic. In the standard model configuration this information is not provided. The model results do improve however, when extra variability from the North Sea is present. At station Stockholm the model shows ESLs that are affected by processes and geography the model does not resolve properly. This can serve to formulate hypotheses for the development of improved model versions.
On the list of model improvements in the RCM is the reduction of the fresh bias in the Baltic Sea that would bring the mean sea surface closer to the observed one and thus reduce the underestimation of ESLs in the Kattegat and the northern Baltic Sea.
Another model deficiency that needs to be addressed is the too low wind speed in the 99th percentile, which is responsible for the generation of ESLs in the Baltic Sea. Some model development however can be envisioned for regional models that potentially improves ESL estimation.
Drying and wetting of adjacent low-lying land can help to more realistically represent the energy budget of storm surges.
A wave model could improve the representation of how momentum is transferred from the atmosphere to the ocean and vice versa. Today, most coupled regional models treat the sea surface between the atmosphere and the ocean as an interface to exchange fluxes of momentum, energy and matter. A wave model can be integrated as an individual component in a coupled
In this study we presented a validation and analysis of simulated ESLs for the Baltic Sea, Kattegat and Skagerrak. To calculate regional sea level scenarios we have downscaled five members of the CMIP5 ensemble for the historical period. A second part of the study discusses the sea level projections for the 21st century according to three different RCPs. The ensemble spread within the regional climate ensemble allows us to assess the uncertainty that is inherent to different GCM solutions.
One source of uncertainty that is missing from our ensemble is how different RCMs influence the uncertainty of ESLs. A rough estimate, based on interdependent RCMs, is about double the uncertainty generated by the GCMs.
The uncertainty estimate based on the RCMs has several weaknesses. First of all the ensemble is small, especially since the models are not independent from each other. That would tend to yield a small uncertainty. On the other hand the bulk of the uncertainty is due to two different clusters of solutions. The coupled models generally behave differently than the uncoupled ones. That increases the RCM uncertainty in a somewhat artificial way. Sea level in the Baltic Sea can be tuned to some extent (Gräwe and Burchard, 2012) and should eventually lead to a smaller uncertainty. Sea level is probably the one variable in Baltic Sea or North Sea and Baltic Sea models that can be used for forecast, even without assimilation of data, at least direct assimilation into the ocean model. The atmospheric forcing is the other crucial ingredient and that is usually derived from the weather forecast. That is the regular procedure in the agencies concerned with sea level forecast around the
Another task concerns the reduction of the uncertainty that stems from the GCMs. In our ensemble we have used five different CMIP5 GCMs that span the parameter space. The addition of well behaved, but independent GCMs into the ensemble of regional projections would be valuable to generate more robust estimates and presumably a smaller uncertainty. In our ensemble the GCM uncertainty of 20 to 50 cm is larger at half of the stations than the confidence limits related to the estimation of the 100-year return levels. Observation based estimates of return levels are known to produce outliers on the Swedish west coast (Fredriksson et al., 2016). It is not clear whether these are among the 5% that are bound to be outside
For planning and management purposes it is important to consider the spread of possible solutions along with the mean or median estimates.
For ESLs along the Swedish coast we see potential for reducing the uncertainty through improvements both in the RCMs and in the representation of the climate by the GCMs. We have considered only one estimator for the 100-year return levels, but Södling and Nerheim (2017) have shown that different approaches yield a range of results, also along the Swedish coast. These uncertainties should be taken into account as well in future investigations.
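To illustrate what a single estimator of the 100-year return level involves, the following minimal sketch evaluates the quantile of a GEV distribution for a given return period. It assumes that the GEV location, scale and shape parameters have already been fitted to block (annual) maxima; the function names and the parameter values in the example are hypothetical and are not taken from this study. The sign convention (positive shape parameter for a heavy upper tail) is also an assumption.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once per `return_period` blocks (e.g. years).

    mu, sigma, xi are the GEV location, scale and shape parameters.
    """
    if sigma <= 0:
        raise ValueError("scale parameter sigma must be positive")
    if return_period <= 1:
        raise ValueError("return period must exceed one block")
    # Non-exceedance probability per block is p = 1 - 1/T; define y = -ln(p).
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV (shape parameter close to zero).
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function, used here only to check the inversion."""
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical parameters in cm (illustration only, not results of this study).
example_level = gev_return_level(mu=80.0, sigma=15.0, xi=-0.1, return_period=100)
```

Different estimators (other distributions, peak-over-threshold methods, other fitting procedures) would replace the fitted parameters or the quantile formula, which is one way the range of results mentioned above can arise.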

Recently,  assessed different sources of uncertainty in projections of biogeochemical cycles in the Baltic Sea. Some of the uncertainties may be reduced by developing better modeling strategies and better boundary and forcing data. Other uncertainties are related to unknown future nutrient input and greenhouse gas emissions. They stress the importance of regular information on current knowledge, which includes the uncertainty in model results that stems from different sources. In the context of high-end climate change scenarios, Capela Lourenço et al. (2018) did not find that climate change uncertainty is perceived as a barrier to the implementation of climate adaptation. Uncertainty estimates are also planned to be included in management tools such as Symphony within the ClimeMarine project.

Conclusions
In this study we have analyzed ESLs in a regional sea level ensemble for the Baltic Sea. The ensemble uses one RCM forced with different GCMs. This allows us to assess the uncertainty of 100-year return levels introduced by the large-scale circulation patterns represented by the GCMs. This uncertainty is one to two times the confidence limits of the observational GEV estimates.
The observational confidence limits express the uncertainty in 100-year return levels from the use of short time series. Another source of uncertainty lies in the use of a specific RCM. We have estimated an upper bound for this uncertainty to be double the size of the GCM uncertainty. With the analysis of sensitivity studies, processes and shortcomings have been identified that will allow model development to reduce this uncertainty below the GCM uncertainty.
The main findings of this study may be summarized as follows:
-The GCM uncertainty of the ensemble mean 100-year return levels is of the same order of magnitude as the 95% confidence limits from the GEV estimates with the block maxima method.

-The bias in ESLs at stations Gothenburg and Smögen needs to be reduced. Sensitivity studies have shown that high frequency variability should be included at the open boundaries of the regional ocean model. Model development should also aim to reduce the RCM uncertainty.
Code and data availability. Data and software used for the analyses in this study can be made available from the authors upon request. The tide gauge data used for validation are publicly available at https://www.smhi.se/klimatdata/oceanografi/havsvattenstand.

Answers to referee #1
Thank you for your comments and suggestions. We think they helped to improve the manuscript.
Page 1 line 20 -What is "Backafloden"? This should be either removed, or explained.
-Backafloden is the Swedish name for the storm surge in November 1872 that flooded a number of cities in the southwestern Baltic Sea with record sea levels. We have described the event in a sentence and avoided the name Backafloden.

Page 4 line 31 -Which storm surge model is being referred to here?
-The name of the storm surge model is NOAMOD. It is used at the SMHI for generating high frequency variability in the northern North Sea for the sea level forecast models. We have added a sentence to the text about NOAMOD. Unfortunately, there is no reference that could be cited.
Page 7 Table 2 and related text -what is "ORAS4"?
-ORAS4 is the previous Ocean Reanalysis System evaluated in e.g. Balmaseda et al., 2013. We have now included the reference in the text and in the caption, together with the explanation of the acronym.
-In this and most other cases we have replaced "WISKI database" or "WISKI network" with "tide gauge network". The name WISKI appears now only together with the reference to SMHI's open data.
Figure 1: The text labels are quite difficult to read here. ...
-We have moved the labels for the stations into free spaces on the map. The caption now mentions that the ensemble mean consists of the historical periods of the scenarios in Table 1.
-Figure 1b now contains labels for the different basins of the Baltic Sea instead of the station names. This is to accommodate a suggestion of reviewer #2.
-We have adapted the caption of Figure 8.

Answers to referee #2
Thank you for your detailed comments and suggestions. They helped to improve different aspects of the manuscript, including readability.

General comments
However, I think, there are some issues needed to be clarified in terms of content and writing style of the paper. ...
-We have reworked the manuscript to streamline it and improve the consistency and style. In many places the reviewer's suggestions have helped to eliminate redundant formulations or to add necessary explanations.
First of all, the novelty of paper is not well written. ...
-We have added a paragraph towards the end of the introduction that specifies more clearly what this study contributes to the ESL research in the Baltic Sea.
-The beginning of the abstract now also highlights the contribution to the uncertainty discussion.
Secondly, the paper is a little bit hard to read. ...
-This is a good point.
P1L5: Sentence reads: 'The ensemble mean 100-year return levels turns out to be the best estimator with biases less than 10 cm.' ...
-The idea here was to mention that the ensemble mean of an ensemble of model runs shows a better agreement with observations than a single model run, even the ERA40 hindcast. We have simplified the sentence.
P1L6: Sentence starts with: 'The ensemble spread..'. This sentence is redundant. It should be removed from the abstract.

-We think this is among the important findings of our study. The 5% to 95% range of the ensemble solutions (the likely range in IPCC parlance) includes the estimates based on observations. This gives us some confidence that the solutions in the ensemble cover what has been observed. This will be important in cases where there are no observations. We have combined this finding with the summary of the GCM uncertainty at the end of the abstract.
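The check described in this answer can be sketched as follows. This is a minimal illustration only: the linear interpolation used for the 5% and 95% percentiles of a five-member ensemble is an assumption, and all numbers in the example are hypothetical and are not results of the study.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a small sample."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    pos = (len(data) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + frac * (data[upper] - data[lower])


def ensemble_summary(member_levels, observed_level):
    """Ensemble mean, bias and 5%-95% range of 100-year return levels."""
    mean = sum(member_levels) / len(member_levels)
    low = percentile(member_levels, 5.0)
    high = percentile(member_levels, 95.0)
    return {
        "mean": mean,
        "bias": mean - observed_level,
        "range": (low, high),
        # True if the observation-based estimate lies inside the range.
        "covers_observation": low <= observed_level <= high,
    }


# Hypothetical 100-year return levels in cm for five downscaled GCMs
# at one station, and a hypothetical observation-based estimate.
summary = ensemble_summary([95.0, 102.0, 110.0, 118.0, 125.0], 108.0)
```

Applied station by station, such a summary gives the ensemble mean bias and indicates whether the ensemble range includes the observation-based estimate.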
P1L10: Please update the sentence starting in this line in this way: 'Some regions like Skagerrak, Gulf of Finland. . .'.
-We have changed the text according to the reviewer's recommendation.
-We have clarified in the text that we mean tide gauge records.
I did not understand the sentence in P1L23:
-Yes, the text mentioned relative sea level rise and fall due to two different processes. We have now removed the example on sea level fall due to GIA to improve the readability. The estimates for GIA now appear after the introduction on GMSL rise.
-We prefer to leave the first paragraph at the beginning of the introduction. It explains why we want to know about SLR and ESL.
-Yes, it is true, we did not explain the return level concept. We have added a short paragraph to Section 3.2 (Extreme Sea Levels) where the return levels are introduced.
P2L7 Please revise 'has coordinated to an ensemble' to 'has coordinated to an ensemble mean.'
-We think that the formulation as it is describes better that the CMIP5 project has coordinated an ensemble of model runs with GCMs. From this ensemble we have downscaled five individual ensemble members (GCMs) with a regional climate model and calculated return periods that we want to discuss.
P2L9 again add 'mean' to next to the word 'ensemble'
-We would like to mention in the text that many aspects of the ensemble of GCMs are discussed in IPCC's AR5. This also includes the ensemble spread (the uncertainty), not only the ensemble mean.
P2L13 Sentence starts at this line does not mention about the region. ...
-The effect of GIA and GMSL on Baltic Sea level depends on the region in the Baltic Sea and on the climate change scenario. We have tried to describe this in the sentences that follow.
Can authors briefly explain the methods that they applied in this study at the end of introduction? (before P4L3).
-The main method that was used to generate the model estimates for the ESLs was the dynamical downscaling of a number of GCMs. This is part of the paragraph (previously) starting at P5L9 in Section 2. We left out the details of how the dynamical downscaling and the coupling between atmosphere and ocean was done. This has been published in e.g.  2019.
-We first followed the suggestion and summarized our approach with the statistical methods in the paragraph on novelty in the introduction (see also next answer). That would have required introducing the return period concept in the introduction. We have decided against it and prefer to explain the return period concept where we show the first return levels for different return periods (Tab. 7, Sect. 3.2). To avoid repetition we have now added a brief outline of our approach at the beginning of Section Extreme Sea Levels, not in the introduction.
P4L3: First two sentences of this paragraph should be placed earlier in the introduction section where the authors explain the novelty of this study.
-We have moved the two sentences further up in the introduction and combined them with the description of the novelty of this study.
P4L10: Which paper? Please put the reference. And please also write the principle conclusion of that paper, if it is needed to understand the scope of this paper.
-We have added a short explanation at the end of the introduction of how this paper is related to the companion paper. We also added the reference. The second part is not needed to understand the results and discussion in this paper. The second part gives an extra motivation for this study.
P5L12 please cite the paper at the end of the sentence.
-We have added the reference for the paper, which is in preparation.
-We chose Landsort because it is usually taken as the reference station in the middle of the Baltic Sea, near the nodal line, that represents the amount of water (or the mean sea level) in the Baltic Sea (e.g. Lisitzin, 1974; Mohrholz et al., 2015). It is also the station with the least spread in ESLs among different ensemble members, cf. Figs. 2 and 7. The tide gauge station in Stockholm is close to the city center, where it can be affected by freshwater from Lake Mälaren (Samuelsson and Stigebrandt, 1996) and where the model resolution is too coarse to properly resolve the Stockholm archipelago.
P8L18: Authors mention that modeled and observed sea surface for the period 1970-1999 is compared in Fig 1. ...
-We validate the model results for MSL along the coast, where observations from tide gauge stations are available. The model results show a reasonable agreement with observations. We implicitly assume that MSLs along the coast are tightly connected to MSLs in adjacent regions in the open Baltic Sea. Those are mainly determined by the shallow water equations including the atmospheric and riverine forcing. These are properly represented in our model, although there is room for improvement. To avoid confusion we have changed the wording in the paragraph and call the observations mean sea level, not mean sea surface. We have not used a mean sea surface based on observations.
-We have updated Figure 1b) to show the names of the different gulfs and basins in the Baltic Sea that are mentioned in the text.
P15L1: What is a,b,c in Figure 5?
-We have now included station labels in Figures 2 to 8 to make it easier to identify individual stations. This was a suggestion of reviewer #1.
-We have eliminated this sentence.
-We have tried to improve the text in many places to make it more readable.
P23L7: It is mentioned that SLR, tides, storm surges and wind waves increase ...
-The idea was to show an example where the interaction of waves and sea level has a large effect. That's why we chose to cite Arns et al., 2017. We agree that the interaction with tides is not significant in the Baltic Sea. We have replaced the sentence with examples from the Baltic Sea that have been described in Weisse and Weidemann, 2017, and Wisniewski and Wolski, 2011.
P23L25. Again the same issue. The sentence 'Similar for the ocean-only models'. ...
-We have changed the text to make the statement clearer.
-We have changed the wording in this sentence to 'Observation based'.

-The reason for mentioning the second part of the study was to provide a clue for the interested reader to the scenario part of the model ensemble in connection with the uncertainty discussion. We have now eliminated the sentence.